Shot Histogram Request - page 26 - CHDK Releases - CHDK Forum

Shot Histogram Request


reyalp
Re: Shot Histogram Request
« Reply #250 on: 09 / April / 2013, 22:34:22 »
http://chdk.setepontos.com/index.php?topic=9607.msg99095#msg99095
Since we've already done so much discussion of the bug here, it might be clearer to keep all the code-related stuff in one thread.

LUA_MINSTACK is not the Lua stack size. If I understand correctly, it's the minimum number of Lua stack slots that need to be available when Lua calls a C function.

LUAI_MAXCSTACK in luaconf.h is the maximum size of the Lua stack. luaconf.h is where most of the user adjustable knobs live.

If the Lua stack were exhausted, it should generate an error.

edit:
My opinion is still that the corrupted Lua state (as shown by the bogus return value from lua_resume and bogus error message) is the important clue. Narrowing down where it gets corrupted, and which parts are affected, would be the logical next step to me.

edit 2:
If you want to test garbage-collection-related theories, see collectgarbage:
http://www.lua.org/manual/5.1/manual.html#5.1
You could call collectgarbage('collect') in your main loop to force a full collection, or adjust the step values to make it run more frequently.

edit 3:
Going through the code a bit more, the bogus return value from lua_resume doesn't necessarily mean a corrupt lua_State. If there was an error, it will be the return value of luaD_rawrunprotected. This should in theory also be a Lua error code, with the value coming from lj.status as set by luaD_throw.

I went back through my dumps from Alarik, and was reminded just how weird it was: luaD_throw was never called before the error, and lj.status was zero at the end of luaD_rawrunprotected. The dump showed the lua_State status was 1 (=yield) and the rest of the state that I decoded seemed to be sane.
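
For reference when reading the dumps, the status values that lua_resume is expected to return in Lua 5.1 can be tabulated as below. This is only a sketch: the numeric values are the ones defined in Lua 5.1's lua.h, but the helper names lua51_status_name and lua51_status_is_valid are hypothetical and are not part of Lua or CHDK. Anything outside 0..5 would count as a "bogus" return value in the sense discussed above.

```c
#include <stddef.h>

/*
 * Map a Lua 5.1 thread status / lua_resume return value to its name.
 * Values are those from Lua 5.1's lua.h; 0 has no named constant there.
 * Returns NULL for anything that is not a valid Lua 5.1 status.
 * (Hypothetical helper for illustration only.)
 */
const char *lua51_status_name(int status)
{
    switch (status) {
    case 0:  return "0 (finished without error)";
    case 1:  return "LUA_YIELD";
    case 2:  return "LUA_ERRRUN";
    case 3:  return "LUA_ERRSYNTAX";
    case 4:  return "LUA_ERRMEM";
    case 5:  return "LUA_ERRERR";
    default: return NULL;
    }
}

/* Non-zero if `status` is one of the values Lua 5.1 can legitimately return. */
int lua51_status_is_valid(int status)
{
    return lua51_status_name(status) != NULL;
}
```

A logging hook could call lua51_status_is_valid() on the value returned by lua_resume and only write a dump when it fails, which would separate ordinary script errors from the odd case described here.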
« Last Edit: 10 / April / 2013, 01:06:24 by reyalp »
Don't forget what the H stands for.

*

reyalp
Re: Shot Histogram Request
« Reply #251 on: 10 / April / 2013, 02:27:42 »
I've attached an updated version of my debug dump patch. To use it, build with

OPT_DBG_DUMP=1
in your buildconf.inc or localbuildconf.inc

You can also add OPT_DBG_LUA_ASSERT=1

This patch will create various named dump files if Lua hits an error. This includes "normal" errors like a syntax error etc. The files will be overwritten every time the triggering condition is met.

The dump includes a stack trace and some other possibly relevant information.

The dumps can be decoded in chdkptp using
!m=require'extras/dbgdump'
!return m.load('<filename>.DMP')

Module code will just be labeled as "heap" in the stack trace, since the decoder doesn't know where modules are loaded.

Module addresses can be figured out manually if you turn on module logging in the "miscellaneous -> modules" menu.

The stack trace is also fixed size, so it may run off the end of the task's actual stack, probably showing bits of other tasks' stacks.

*

lapser
Re: Shot Histogram Request
« Reply #252 on: 10 / April / 2013, 02:44:42 »
@reyalp
Thanks for the help, and the patch. I'll give it a try tomorrow night.

Both cameras terminated with the same error tonight. I was still home when the SX50 stopped, so I restarted it and it worked until I got back about 2 hours later. I had the G1X pointed at the sunset from my Forest Ridge drive viewpoint, and there was a really bright ISS flyby which I think was just out of view. I'm hoping the SX50 caught it from my house.
EOS-M3_120f / SX50_100b / SX260_101a / G1X_100g / D20_100b
https://www.youtube.com/user/DrLapser/videos

*

lapser
Re: Shot Histogram Request
« Reply #253 on: 10 / April / 2013, 14:09:31 »
Quote
I've attached an updated version of my debug dump patch.
OK, I got it working, including the module logging. I added a test error to my script by calling an unknown function (with an obscene name in German) :) when you press <display>. I ended up with 4 .DMP files, and MODULES.LOG

I'll try to trigger the lua yield() error tonight, and then send you the files.

I also modified luascript.c to use one Lua thread, as Phil described. It works, except that ptpCamGui fails to init after the change. I guess it needs the 2 threads?

I changed "L" and "Lt" to static variables and removed them from luascript.h without any problems. I didn't find any references to luascript.h other than in luascript.c.

*

reyalp
Re: Shot Histogram Request
« Reply #254 on: 10 / April / 2013, 16:50:09 »
Quote
I also modified luascript.c to use one Lua thread, as Phil described. It works, except that ptpCamGui fails to init after the change. I guess it needs the 2 threads?
I doubt it. It shouldn't be much different from a regular script, but PTP-specific code might reference Lt somewhere.

*

lapser
Re: Shot Histogram Request
« Reply #255 on: 10 / April / 2013, 19:37:54 »
Quote
It shouldn't be much different from a regular script, but PTP-specific code might reference Lt somewhere.
First I just replaced the new thread call with Lt=L; that worked, but ptpCamGui wouldn't init. Then I replaced all "L" with "Lt", removed the declaration for L, made Lt static, and removed L and Lt from luascript.h. So there can't be any references to L, because it's no longer there. But I got the same result.

Maybe someone else can check out Lua over PTP with just one Lua thread?
===========

I did catch the ISS flyby last night, but just barely. It shows up briefly at 2:10 in the video at the extreme left. It got brighter and brighter after it went out of frame, and looked like a police helicopter with a searchlight or something.
http://www.youtube.com/watch?v=fHMj0xVPK0M#ws

*

lapser
Logs for reyalp
« Reply #256 on: 11 / April / 2013, 12:17:21 »
I left the SX260 and SX50 running at home while I took the G1X up Spencer's Butte for sunset. The G1X has never triggered the script interrupted error since the Lua update. When I got home, the SX50 display showed the interrupted error, and the SX260 was off with the lens extended.

Both cameras got the error about 10 minutes after I left. Usually, it takes longer. This time, I set a small metering area where clouds were moving by, so the shutter time was varying a lot. My smoothing routine limited the Tv96 changes to plus or minus 1. I suspect that the timing of an extra interrupt that occurs to close the shutter may be involved. That is, an interrupt in a function that isn't thread safe (or something). Perhaps this could be the Lua incremental garbage collector. Maybe it has something to do with the 2 Lua threads also.
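
The "plus or minus 1" limit mentioned above can be sketched as a simple clamp on the per-shot change. This is an illustration in C only, not the actual smoothing routine (which is not shown in this thread); the function name limit_tv96_step and its parameters are hypothetical.

```c
/*
 * Move `current` toward `target`, changing it by at most `max_step`
 * Tv96 units per call. Hypothetical helper for illustration only.
 */
int limit_tv96_step(int current, int target, int max_step)
{
    int delta = target - current;

    if (delta > max_step)
        delta = max_step;       /* cap increases */
    if (delta < -max_step)
        delta = -max_step;      /* cap decreases */

    return current + delta;
}
```

With max_step set to 1, a metered jump from 500 to 520 would take 20 shots to complete, so the shutter time keeps changing by small amounts from shot to shot while the clouds move through the metering area.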

Anyway, one of reyalp's new logs appeared for both cameras. I'm also including the romlog from the SX260, but the crash may not be significant. The time in the romlog is 2013:04:10 09:40:45. I have the camera clock in AM/PM mode, so if the time was 9 p.m. that would make sense. I had an external battery connected that switches itself off automatically, so I suspect the "crash" was just a sudden loss of power. I also included the script log, which shows the error code at the end. Everything looks normal in the script log until the error. The second "Lapser" number is the return value from resume(). I think it's just a pointer to the NULL error message string.

I don't know enough at this point to tell much from the new Lua logs, but every time one of my new "theories" is shot down, I learn a little more. Thanks!

*

reyalp
Re: Shot Histogram Request
« Reply #257 on: 11 / April / 2013, 17:19:03 »
Thanks for the logs. I don't have time to go through them in detail right now, but the fact that both cameras only have LUARES1.DMP suggests it is very similar to the earlier ones from Alarik. I'll try to take a more detailed look later.

If you have the main.bin.dump (or main.bin) from the CHDK build that was running when this was taken, that will help. The module elfs would also help.

I suspect we will need to find a way to trace further back to get a real handle on this. I have some ideas but it will take time to get it into code.
Quote
I'm also including the romlog from the SX260, but the crash may not be significant. The time in the romlog is 2013:04:10 09:40:45. I have the camera clock in AM/PM mode, so if the time was 9 p.m. that would make sense.
I would be surprised if the romlog respected the camera UI 12/24 hour setting, but in any case that romlog looks like an assert that is triggered when you try to have too many files open simultaneously. This could happen if you try to log too soon after a shot.

*

lapser
Re: Shot Histogram Request
« Reply #258 on: 11 / April / 2013, 19:35:29 »
Quote
If you have the main.bin.dump (or main.bin) from the CHDK build that was running when this was taken, that will help. The module elfs would also help.
Where do I find those files?
Quote
that romlog looks like an assert that is triggered when you try to have too many files open
That's probably from earlier then. I crashed the camera that morning when I intentionally test-triggered your patch to write its logs, without the proper compile options.

I'm sure glad that the G1X didn't get the error. The sunset was really nice last night:

http://www.youtube.com/watch?v=yJWqL3krjiw#ws

*

reyalp
Re: Shot Histogram Request
« Reply #259 on: 11 / April / 2013, 22:03:39 »
Quote
Where do I find those files?
main.bin and main.bin.dump are in the core directory after you build.

The module .elf files are in the modules directory.

If you are using chdkshell to build, you may need to turn off the clean after build option.

 
