gcc10 vs vxworks - General Discussion and Assistance - CHDK Forum

gcc10 vs vxworks

  • 18 Replies
  • 1053 Views
*

Online reyalp

  • ******
  • 13295
gcc10 vs vxworks
« on: 21 / March / 2021, 19:42:34 »
Whim mentioned here that gcc10 produced non-working builds for vxworks. Caefix may also have confirmed this, but I find their posts very difficult to understand.

If gcc10 produces invalid builds, it should be disabled in the makefiles or fixed.

I tested building the trunk for a540, using
https://armkeil.blob.core.windows.net/developer/Files/downloads/gnu-rm/10-2020q4/gcc-arm-none-eabi-10-2020-q4-major-win32.exe
and
https://armkeil.blob.core.windows.net/developer/Files/downloads/gnu-rm/9-2020q2/gcc-arm-none-eabi-9-2020-q2-update-win32.exe
from  https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm/downloads

Both builds booted and ran fine  :-[

Note this only applies to the trunk. 1.5 only supports gcc 3-5, and building it with gcc10 fails with "callfunc.S:67: Error: selected processor does not support `blx R12' in ARM mode" (which is fine).

@whim
Can you check for romlogs after running the bad build, and upload both loader and platform main.bin.dump from builds with working gcc9 and non-working gcc10?

edit:
I verified that building with whim's gcc1021_host920.7z produced a working a540 build, identical to my earlier gcc10 test aside from the build date.
« Last Edit: 21 / March / 2021, 20:22:38 by reyalp »
Don't forget what the H stands for.

*

Offline philmoz

  • *****
  • 3325
    • Photos
Re: gcc10 vs vxworks
« Reply #1 on: 22 / March / 2021, 08:14:19 »
IXUS 700 built with GCC 10 fails - camera will not switch on.


What I've found so far:
- Build everything with GCC 9, delete platform/sub/boot.o, then rebuild boot.o with GCC 10: the build fails.
- Build everything with GCC 10, delete boot.o, then rebuild it with GCC 9: the build works.


So it looks like whatever GCC 10 is doing with the platform/sub/boot.c code causes startup problems.


The only differences I can see so far in the compiled code between 9 & 10 for boot.c are:
- function boot() has push {r4, lr} at the start and pop {r4, lr} just before the jump to h_usrInit
- function boot() calls FW memcpy and memset to initialise the data and bss RAM


I don't understand how either of these would cause the startup to fail  ???


Phil.
CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

*

Offline whim

  • ******
  • 2040
  • A495/590/620/630 ixus70/115/220/230/300/870 S95
Re: gcc10 vs vxworks
« Reply #2 on: 22 / March / 2021, 09:28:38 »
@reyalp

Hi and thanks for looking into this mystery !

Just to make sure I redid all my experiments.

Method:

compilers:  gcc1021_host481, gcc1021_host920, gcc931_host481, gcc931_host920

compile settings: using Windows GUI for gcc toolchain,
checked options: PTP, WARNINGS and DEBUGGING
(initially I used OPT_FIRMWARE_PC24_CALL=1 as well)

CHDK source: pristine trunk 5795
Target camera: ixus70_sd100-101b

Before every test the SD card (Apacer 1 GB, bootable for CHDK) was wiped clean

Result: both gcc 10 binaries produce absolutely no reaction from the camera,
while both gcc 9 binaries run fine & pass the CRC check.
I made and checked the ROMLOG before and after these experiments:
it is dated 20.11.2020, long before this experiment.

Also re-ran a compile on linux (Manjaro rolling, gcc 10.2 / 10.2 host/target)
on r5795, no joy either.
Curiously, when removing the card after this test,
camera screen lit up with 'No memory card' ???


wim

Attached: requested dumps as dumps.7z & ROMLOG.LOG

*

Online reyalp

  • ******
  • 13295
Re: gcc10 vs vxworks
« Reply #3 on: 22 / March / 2021, 13:22:43 »
Quote from: philmoz
The only differences I can see so far in the compiled code between 9 & 10 for boot.c are:
- function boot() has push {r4, lr} at the start and pop {r4, lr} just before the jump to h_usrInit
- function boot() calls FW memcpy and memset to initialise the data and bss RAM

My gcc9 a540 build has the push / pop as well. The gcc10 build also uses memset for CHDK bss in startup.

a540 and ixus70 use the generic/main.c, ixus700 does not.

boot() on a540 is entirely inline asm, so it doesn't use memcpy/memset. boot() on ixus700 and ixus70 is a mix of C and asm. So that maybe points to memset/memcpy but I don't see exactly how.

The a540 boot() code re-initializes sp to 0x1900, but that should only make it a couple words different from how ixus700 would end up.


*

Offline philmoz

  • *****
  • 3325
    • Photos
Re: gcc10 vs vxworks
« Reply #4 on: 22 / March / 2021, 17:39:24 »
I found that adding '-fno-tree-loop-distribute-patterns' to CFLAGS in arm_rules.inc (line 28) stops GCC 10 from replacing the boot code with memcpy/memset calls.


The resulting build now runs on the IXUS 700.
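
For context, that option turns off the GCC pass that recognises simple copy and fill loops and may replace them with calls to memcpy and memset. A minimal sketch of the same idea applied per function rather than globally in CFLAGS, assuming hypothetical names (`NO_LOOP_TO_LIBCALL`, `copy_words`, `clear_words`) that are not taken from the CHDK source. GCC documents the `optimize` attribute as a debugging aid, so this is an illustration rather than a recommended fix:

```c
#include <stddef.h>

/* Assumption: the macro name is illustrative only. The attribute asks GCC
   not to turn loops in the marked function into memcpy/memset calls.
   Other compilers get an empty macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((__optimize__("-fno-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* Word-by-word copy: the shape of loop GCC may rewrite as memcpy(). */
NO_LOOP_TO_LIBCALL
void copy_words(long *dst, const long *src, size_t n_words)
{
    size_t i;
    for (i = 0; i < n_words; i++)
        dst[i] = src[i];
}

/* Word-by-word zero fill: the shape of loop GCC may rewrite as memset(). */
NO_LOOP_TO_LIBCALL
void clear_words(long *dst, size_t n_words)
{
    size_t i;
    for (i = 0; i < n_words; i++)
        dst[i] = 0;
}
```

Disassembling the object file with and without the attribute should show whether the calls to memcpy/memset are still generated.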


As you noted memset was also being called in startup before boot() was called - so it's unlikely this is an issue.
Perhaps the FW function for memcpy is wrong, although it is used in other places in CHDK and if it was wrong would likely cause other issues.


Phil.


*

Online reyalp

  • ******
  • 13295
Re: gcc10 vs vxworks
« Reply #5 on: 22 / March / 2021, 19:00:44 »
Quote from: philmoz
Perhaps the FW function for memcpy is wrong, although it is used in other places in CHDK and if it was wrong would likely cause other issues.

It appears to be correct, it's the memcpy eventproc, same as on a540. The function is rather complicated (I guess so it can be optimized while still handling unaligned addresses and sizes that aren't multiples of 4 bytes), but it doesn't seem like it should depend on the Canon data or bss already being initialized. It does use a bunch of registers and the stack, but I don't see anything in the following code that makes assumptions about those.

The ixus700 boot code skips a bunch of cp15 operations compared to the a540 code, but again it's not obvious how those would interact with memcpy being called or not.

The first function called by h_usrInit appears to do something with the value in r0, but that should come from the bit of inline asm after the memcpy/memset.

We could just disable that optimization for gcc10, but I suspect the failure is a sign we're doing something wrong.

 ???

*

Offline philmoz

  • *****
  • 3325
    • Photos
Re: gcc10 vs vxworks
« Reply #6 on: 22 / March / 2021, 19:31:06 »
Quote from: reyalp
boot() on a540 is entirely inline asm, so it doesn't use memcpy/memset. boot() on ixus700 and ixus70 is a mix of C and asm. So that maybe points to memset/memcpy but I don't see exactly how.


Out of curiosity, what happens if you use memcpy on the A540 in the boot() function?


*

Offline philmoz

  • *****
  • 3325
    • Photos
Re: gcc10 vs vxworks
« Reply #7 on: 22 / March / 2021, 20:07:28 »
This is just getting weird.


If I replace the C initialisation code in boot() with calls to both memcpy and memset the camera fails to boot.
If I just replace the data initialisation with a call to memcpy the build works.
If I just replace the bss initialisation with a call to memset the build works.
If I replace both; but call memset first the build works.


It only fails if both memcpy and memset are used with the call to memcpy done first ??? ???
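
The two orderings can be written out as below. This is a host-side sketch with hypothetical names (`struct regions`, `init_copy_first`, `init_clear_first`) and ordinary arrays standing in for the Canon data and bss areas; it is not the boot() code itself. In plain C both orders leave memory in the same state, which is what makes the difference seen on the camera so odd:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two areas boot() initialises. */
struct regions {
    long data[8];   /* destination of the data copy */
    long bss[8];    /* area that must end up zeroed */
};

/* memcpy first, then memset: the order reported as failing on the camera. */
void init_copy_first(struct regions *r, const long *data_src)
{
    memcpy(r->data, data_src, sizeof r->data);
    memset(r->bss, 0, sizeof r->bss);
}

/* memset first, then memcpy: the order reported as working on the camera. */
void init_clear_first(struct regions *r, const long *data_src)
{
    memset(r->bss, 0, sizeof r->bss);
    memcpy(r->data, data_src, sizeof r->data);
}

/* Returns 1 when both structures hold identical contents. */
int regions_equal(const struct regions *a, const struct regions *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

Since the result is identical on a host, the order can only matter through side effects such as register, stack or cache state around the calls.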


EDIT:
If I add the naked attribute to boot() then GCC 10 does not add the push {r4,lr} / pop {r4,lr} instructions.
The camera then fails to boot if memcpy or memset are used alone or together in either order.
I noticed that the FW code has push {lr} / pop {lr} in the boot() code - adding these fixes the build regardless of how memcpy/memset are used.


In other words if boot() mimics the FW with push {lr} at the start and pop {lr} before the jump to h_usrInit() then GCC 10 works fine.
If, however, boot() has push {r4,lr} and pop {r4,lr} generated by GCC 10 then it fails if memcpy is called before memset.


I have no idea why.

« Last Edit: 22 / March / 2021, 20:39:17 by philmoz »


*

Online reyalp

  • ******
  • 13295
Re: gcc10 vs vxworks
« Reply #8 on: 22 / March / 2021, 20:43:57 »
Quote from: philmoz
Out of curiosity, what happens if you use memcpy on the A540 in the boot() function?

Using the same style of code as the ixus700 reproduces the problem: it fails with gcc10 and works with gcc9.

I tried putting blinking code after each copy, and it hung blinking with gcc 9  :o (edit: or maybe my code was buggy)

The cp15 code enables the data cache, then disables after the copy is done. It doesn't clean or flush before or after. I wonder if there's something funky happening with stack usage in between...
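
To make the concern concrete, here is a toy host-side model of a write-back cache that is switched off without being cleaned. All names (`toy_system`, `toy_write`, `toy_clean` and so on) are hypothetical, and the model is deliberately simplistic: it does not describe the camera's real cache and does not show that this is the cause of the failure. It only illustrates why a value written while the cache is on (a pushed lr, for example) might not be what is read back after the cache is turned off:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEM_WORDS 16

/* Toy model: one cache slot per memory word, write-back policy. */
struct toy_system {
    long mem[MEM_WORDS];    /* backing memory */
    long cache[MEM_WORDS];  /* cached copies */
    int  dirty[MEM_WORDS];  /* 1 = cache holds data not yet in memory */
    int  cache_on;          /* 1 = data cache enabled */
};

void toy_init(struct toy_system *s)
{
    memset(s, 0, sizeof *s);
}

/* Enabling or disabling does NOT write dirty data back. */
void toy_set_cache(struct toy_system *s, int on)
{
    s->cache_on = on;
}

void toy_write(struct toy_system *s, size_t addr, long value)
{
    assert(addr < MEM_WORDS);
    if (s->cache_on) {
        s->cache[addr] = value;     /* stays in the cache only */
        s->dirty[addr] = 1;
    } else {
        s->mem[addr] = value;
    }
}

long toy_read(const struct toy_system *s, size_t addr)
{
    assert(addr < MEM_WORDS);
    if (s->cache_on && s->dirty[addr])
        return s->cache[addr];
    return s->mem[addr];            /* cache off: memory may be stale */
}

/* Write every dirty word back to memory. */
void toy_clean(struct toy_system *s)
{
    size_t i;
    for (i = 0; i < MEM_WORDS; i++) {
        if (s->dirty[i]) {
            s->mem[i] = s->cache[i];
            s->dirty[i] = 0;
        }
    }
}
```

In this model, a word written with the cache on and read with the cache off returns the old memory contents unless `toy_clean` ran in between.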
« Last Edit: 22 / March / 2021, 20:53:34 by reyalp »

*

Online reyalp

  • ******
  • 13295
Re: gcc10 vs vxworks
« Reply #9 on: 22 / March / 2021, 21:10:20 »
Quote from: reyalp
I tried putting blinking code after each copy, and it hung blinking with gcc 9  :o (edit: or maybe my code was buggy)

Nope. It fails if the function is static and works if not. In the static case, the compiler moves the 5 into the function itself, instead of passing it in r0, and uses the stack. In the non-static case, it doesn't use the stack.

Definitely getting cache coherency vibes.
Code:
static void x_blink(int cnt)
{
    volatile long *p=(void*)LED_PR;
    int i;

    for(;cnt>0;cnt--){
        p[0]=0x46;                  // LED on

        for(i=0;i<0x200000;i++){    // busy-wait delay
            asm ("nop\n");
            asm ("nop\n");
        }
        p[0]=0x44;                  // LED off
        for(i=0;i<0x200000;i++){    // busy-wait delay
            asm ("nop\n");
            asm ("nop\n");
        }
    }
}

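...
    asm volatile (
        "MRC     p15, 0, R0,c1,c0\n"    // read cp15 control register
        "ORR     R0, R0, #0x1000\n"     // bit 12: instruction cache on
        "ORR     R0, R0, #4\n"          // bit 2: data cache on
        "ORR     R0, R0, #1\n"          // bit 0: protection unit enable
        "MCR     p15, 0, R0,c1,c0\n"    // write control register back
    :::"r0");

    x_blink(5);

    // copy Canon data
    for(i=0;i<canon_data_len/4;i++)
        canon_data_dst[i]=canon_data_src[i];

    x_blink(5);

    // zero Canon bss
    for(i=0;i<canon_bss_len/4;i++)
        canon_bss_start[i]=0;

    x_blink(5);

    asm volatile (
        "MRC     p15, 0, R0,c1,c0\n"    // read cp15 control register
        "ORR     R0, R0, #0x1000\n"     // bit 12: instruction cache stays on
        "BIC     R0, R0, #4\n"          // bit 2: data cache off, no clean first
        "ORR     R0, R0, #1\n"          // bit 0: protection unit enable
        "MCR     p15, 0, R0,c1,c0\n"    // write control register back
    :::"r0");

    x_blink(5);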
