Shot Histogram Request - page 28 - CHDK Releases - CHDK Forum

Shot Histogram Request

*

Offline lapser

  • *****
  • 1093
Re: Shot Histogram Request
« Reply #270 on: 12 / April / 2013, 12:14:33 »
Quote
I very much doubt the L vs Lt thing is the cause, but then again, I don't have any good explanation so why not :-[
Ha. I have a thousand explanations. Unfortunately all of them are wrong (so far).

My theory (one of them) is that the bug is triggered by an interrupt that occurs while the Lua code is executing a function that isn't thread safe, possibly garbage collection. The bug seems to be triggered by variations in shutter time, which I assume generate an extra interrupt without incrementing the tick count.

If that's true, it might be possible to add a delay function call right before lua_resume() that adds a varying delay from 0 to 10 msec. That way the next tick-count-incrementing interrupt would occur at a different time after the start of each call to lua_resume().
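
As a rough illustration of that idea (a sketch only; next_jitter_ms is a made-up helper name, not existing CHDK code), the varying delay could come from a counter that steps through 0 to 10 msec on successive calls:

```c
/* Hypothetical helper, not part of CHDK: returns a delay that cycles
   0, 1, ..., 10, 0, 1, ... msec on successive calls. */
#define JITTER_MAX_MS 10u

static unsigned jitter_state = 0;

unsigned next_jitter_ms(void)
{
    unsigned d = jitter_state;
    /* wrap back to 0 after reaching JITTER_MAX_MS */
    jitter_state = (jitter_state + 1u) % (JITTER_MAX_MS + 1u);
    return d;
}
```

The script task would sleep for the returned number of msec just before each resume, so the tick interrupt should land at a different point in the Lua code each time. Whether that actually triggers the bug is only a guess.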

The idea is to try to trigger the bug with a CHDK patch and a simple Lua script:

repeat sleep(10) until false

Then you guys could figure it out in no time, I bet.
Quote
I don't use ptpCamGui; but chdkptp works fine with this change.
Thanks for that info. This may be my incentive to learn chdkptp batch mode, which I'm sure works much better once you learn it.

I still wonder why ptpCamGui won't init, though. This may be related to why CHDK starts a second Lt thread. ptpCamGui runs a Lua program on the camera, so maybe this program starts a new Lua thread? Is the Lt thread necessary to do that? As Phil said initially, "there must be some reason."

Also, did anyone look at the LUAASRT.DMP file I posted here?
http://chdk.setepontos.com/index.php?topic=8997.msg99161#msg99161

I'm wondering why it was generated when there was no Lua error?

I'll see if I can trigger the bug again with the 2 thread CHDK and post the new log files from reyalp's new patch.
EOS-M3_120f / SX50_100b / SX260_101a / G1X_100g / D20_100b
https://www.youtube.com/user/DrLapser/videos

*

Offline reyalp

  • ******
  • 14082
Re: Shot Histogram Request
« Reply #271 on: 12 / April / 2013, 12:43:47 »
Quote
My theory (one of them) is that the bug is triggered by an interrupt that occurs while the Lua code is executing a function that isn't thread safe, possibly garbage collection.
I don't understand this. Lua code is just regular C code. It's the OS's responsibility to ensure interrupts don't affect the state of a running task, there's no special "thread safety" required for this. If there was some defect in this area, the odds of it causing exactly the same symptom in one specific place over many different builds appear small.


Quote
I still wonder why ptpCamGui won't init, though.
So why not debug it? ptpCamGui uses regular PTP calls and lua. Some specific thing must be failing...

Quote
Also, did anyone look at the LUAASRT.DMP file I posted here?
It means that an assert was hit in Lua code. The specific assertion is
Code:
ASSERT ldo.c:401 ci == L->base_ci && firstArg > L->base
I haven't looked into what this means yet.
Don't forget what the H stands for.

*

Offline rudi

  • ***
  • 129
  • A590IS_101B, SX260HS_100B
Re: Shot Histogram Request
« Reply #272 on: 12 / April / 2013, 15:35:33 »
Quote
Quote
And why does ptpCamGui fail to init when there's only one thread? Do you think you could test this on your cameras with the Lt=L change and see if you can get ptpCam to init? I would really like to solve that problem because 2 of my SD cards don't have write protect tabs now and ptpCamGui is the only way to load new builds (other than Scotch tape). Thanks.

I don't use ptpCamGui; but chdkptp works fine with this change.
You can also use chdkptp in console mode (with batch files) to simplify downloading updates to the camera.

Phil.

I tested ptpcam/ptpcamGUI and the chdkptp CLI with the "Lt = L" change. None of the applications return luar results.
Notes on the tests:
- the changes are in the Lua module, not in the bin file
- the luar syntax differs between clients:
    ptpcam: luar get_buildinfo()
    chdkptp: luar return get_buildinfo()

rudi

*

Offline reyalp

  • ******
  • 14082
Re: Shot Histogram Request
« Reply #273 on: 12 / April / 2013, 16:12:12 »
Quote
I tested ptpcam/ptpcamGUI and the chdkptp CLI with the "Lt = L" change. None of the applications return luar results.
Notes on the tests:
- the changes are in the Lua module, not in the bin file
- the luar syntax differs between clients:
    ptpcam: luar get_buildinfo()
    chdkptp: luar return get_buildinfo()

This suggests the PTP message interface is broken. I don't see why offhand, but it does use Lt explicitly in a few places. chdkptp relies on this heavily, so if chdkptp works correctly for Phil, maybe he has implemented the change differently.


*

Offline philmoz

  • *****
  • 3450
    • Photos
Re: Shot Histogram Request
« Reply #274 on: 12 / April / 2013, 18:27:00 »
Quote
Quote
I tested ptpcam/ptpcamGUI and the chdkptp CLI with the "Lt = L" change. None of the applications return luar results.
Notes on the tests:
- the changes are in the Lua module, not in the bin file
- the luar syntax differs between clients:
    ptpcam: luar get_buildinfo()
    chdkptp: luar return get_buildinfo()

This suggests the PTP message interface is broken. I don't see why offhand, but it does use Lt explicitly in a few places. chdkptp relies on this heavily, so if chdkptp works correctly for Phil, maybe he has implemented the change differently.

Comment out the line after 'Lt = L;' that contains:
     lua_setfield( L, LUA_REGISTRYINDEX, "Lt" );
This will fix the return results from luar.

(If you don't call lua_newthread to create Lt then there is nothing on the lua stack for lua_setfield to use).
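
To picture why, here is a toy model in C. It is NOT the real Lua API; ToyState, toy_newthread and toy_setfield_lt are names made up for illustration. It only mimics the two facts that matter: lua_newthread pushes the new thread onto the stack of L, and lua_setfield pops the top value and stores it under the given key.

```c
#include <stddef.h>

/* Toy model only, NOT the real Lua API: a tiny value stack plus one
   registry slot, to show why the store needs a value pushed first. */
#define TOY_STACK_MAX 8

typedef struct {
    int stack[TOY_STACK_MAX];
    size_t top;        /* number of values currently on the stack */
    int registry_lt;   /* stands in for registry["Lt"] */
} ToyState;

/* Models lua_newthread: pushes a handle for the new thread. */
int toy_newthread(ToyState *s, int handle)
{
    if (s->top >= TOY_STACK_MAX) return -1;
    s->stack[s->top++] = handle;
    return 0;
}

/* Models lua_setfield on the registry: pops the top value and stores it.
   Returns -1 if the stack is empty; the real API may not check this. */
int toy_setfield_lt(ToyState *s)
{
    if (s->top == 0) return -1;
    s->registry_lt = s->stack[--s->top];
    return 0;
}
```

With a fresh ToyState, calling toy_setfield_lt() first fails because nothing was pushed; calling toy_newthread() and then toy_setfield_lt() stores the handle and leaves the stack empty again. In the toy the empty case is caught, whereas the real call would just use whatever happens to be at the top of the stack.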

Phil.
CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

*

Offline lapser

  • *****
  • 1093
Re: Shot Histogram Request
« Reply #275 on: 12 / April / 2013, 19:08:56 »
Quote
Comment out the line after 'Lt = L;' that contains:
     lua_setfield( L, LUA_REGISTRYINDEX, "Lt" );
This will fix the return results from luar.

(If you don't call lua_newthread to create Lt then there is nothing on the lua stack for lua_setfield to use).

Thanks Phil. I was typing a question about that line when your post appeared. I'll give it a try.

[edit] It worked, thanks again. I deleted the declaration for "L", and changed the references to "L" into "Lt". Since there are a lot of "L" references as function parameters, I just compiled it and fixed the "L" reference errors. I made Lt static, and removed L and Lt from luascript.h. This cleans up the code and clears up the "L" confusion.

I did a build for the sx260 with the original 2-thread CHDK and reyalp's new logging patch for lua. I'll see if I can trigger the bug with the sx260 tonight. It bugged out twice last night with the sx260. I've attached the files reyalp requested for this build (let me know if they're not the files you want).

I'll also repeat the test with the SX50 and the 1 thread modification to see if it triggers the bug, since it worked last night for a long time without the bug happening. It's possible that the 2 thread version might be causing the bug somehow.
« Last Edit: 12 / April / 2013, 20:54:14 by lapser »

*

Offline philmoz

  • *****
  • 3450
    • Photos
Re: Shot Histogram Request
« Reply #276 on: 13 / April / 2013, 00:07:18 »
Had a look at the dbg dump files from this post:
http://chdk.setepontos.com/index.php?topic=8997.msg99135#msg99135

If I'm reading the files correctly then for the SX260 we have
Lres value from lua_resume = 3144556 (0x2FFB6C) - from 04101716.log
Lua module load address = 0x2f80c8 - from modules.log

So the Lres value is in the middle of the Lua code @ 0x7AA4 bytes from the start of the module.

For the SX50:
Lres = 3214364 (0x310C1C)
Lua load address = 0x309178

Again Lres is in the Lua code @ 0x7AA4 bytes from the start of the module.

Can't be a coincidence; but no idea why this is happening.
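
For anyone checking the numbers, the offset is just the Lres value minus the module load address. A small sketch of the arithmetic using the values above (module_offset and the constant names are made up for illustration, not CHDK code):

```c
#include <stdint.h>

/* Illustrative helper (not CHDK code): offset of an address from the
   start of a loaded module. Assumes addr >= load_addr. */
uint32_t module_offset(uint32_t addr, uint32_t load_addr)
{
    return addr - load_addr;
}

/* Values quoted from the logs in this post */
static const uint32_t SX260_LRES = 0x2FFB6C;  /* 3144556 */
static const uint32_t SX260_LOAD = 0x2F80C8;
static const uint32_t SX50_LRES  = 0x310C1C;  /* 3214364 */
static const uint32_t SX50_LOAD  = 0x309178;
```

module_offset(SX260_LRES, SX260_LOAD) and module_offset(SX50_LRES, SX50_LOAD) both come out to 0x7AA4.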

Phil.

*

Offline lapser

  • *****
  • 1093
Re: Shot Histogram Request
« Reply #277 on: 13 / April / 2013, 00:56:21 »
Quote
So the Lres value is in the middle of the Lua code @ 0x7AA4 bytes from the start of the module.
Can you see what's at that address?

Both the sx260(2 lua threads) and sx50(1 lua thread) ran for about 5 hours without triggering an error or writing any of reyalp's error logs. So now I have another 35,000 pictures of the fog behind my house! Maybe they'll be valuable some day.


*

Offline philmoz

  • *****
  • 3450
    • Photos
Re: Shot Histogram Request
« Reply #278 on: 13 / April / 2013, 01:04:29 »
Quote
Quote
So the Lres value is in the middle of the Lua code @ 0x7AA4 bytes from the start of the module.
Can you see what's at that address?


Copy the attached Makefile over your modules/Makefile and rebuild.
You should get a lua.elf.dumpobj file in the modules directory which will include a disassembly of the lua module.

To be accurate the source code needs to match what you used when you created the dump files. Since I was looking at the original dumps you posted you would probably need to remove any other changes (e.g. the Lt changes).

Phil.

*

Offline lapser

  • *****
  • 1093
Re: Shot Histogram Request
« Reply #279 on: 13 / April / 2013, 11:29:42 »
Quote
To be accurate the source code needs to match what you used when you created the dump files.
Hmm, I'm not confident I can do that. I've also added reyalp's new patch since then. However, I do have some new data.

I did another test this morning with the sx260 (2 lua threads) and sx50 (1 lua thread). The sx260 worked normally, but the sx50 crashed (power off, lens extended).

One thing interrupts do is use extra stack space. I added a function call in capt_seq_hook_set_nr() that uses stack space, as well as a function call in raw_savefile(). If an interrupt happened at a spot when CHDK code was deeper into the stack for that task than the camera ever gets without CHDK, that might cause a stack overflow and trash whatever memory is next to the stack for that task. I don't know the internals, so maybe you can think of a better idea.

I created a zip with the romlog from the sx50 and the TLUARES1.DMP file. The time stamp on this file is the same as the last jpg picture before the crash. I had the backlight off with my backlight patch, which put a lot of turnbacklightoff calls into the romlog, so that's not the problem.

I also compiled with your modules makefile and added lua.elf.dumpobj to the zip. That made it too big to attach, so it's posted here:

http://www.adrive.com/public/b7bfgJ/ReyalpLogs_130413.zip


 
