help: how to debug script interrupting? - page 18 - General Discussion and Assistance - CHDK Forum

help: how to debug script interrupting?

*

Offline reyalp

  • ******
  • 13718
Re: help: how to debug script interrupting?
« Reply #170 on: 03 / May / 2013, 17:11:36 »
Quote
However - I can confirm that this bug appears somehow randomly

I tried with and without photos on SD, with and without garbage collection. I thought that my script crashes only in accurate mode, but now I see that the fast mode fails as well (or as bad rather).

Yes, this is an important point. It's variable enough that it's very hard to say whether a given change affects it. That's why I'm trying to collect statistics; otherwise it's very hard to have confidence whether particular cameras are affected, or whether particular factors impact the frequency.


This is further reinforced by my msgtest experiments: I just ran 250k messages in play mode (=67 minutes), and it triggered the error once, about half way through.
Don't forget what the H stands for.

*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #171 on: 03 / May / 2013, 17:29:03 »
Quote
If you need another pair of eyes, tell me what tests you want to run, I have a couple of cameras to test on, and have been watching the hunt for the cause of this problem for a while.

If you want to, any more data we can get on which cameras are affected would certainly be helpful. However, if you have better things to do with your cameras, I wouldn't say this is urgent.

If you have PTP set up, the message test is easy and doesn't put wear and tear on the camera.

Get the latest chdkptp Lua files from svn: https://www.assembla.com/spaces/chdkptp/trac_subversion_tool

Start chdkptp with: chdkptp -i -c

Then run the commands:

Code: [Select]
!m=require'extras/msgtest'
!m.load()
set cli_time=true
!m.test(100,100000)
The 100000 is how many times you want it to run. It looks like 100k isn't really enough to trigger it reliably, 1 million (should be ~ 5 hours on digic IV) might be a decent sample.

If you hit the bug on a stock build (without my debug patch from this post http://chdk.setepontos.com/index.php?topic=8273.140), it should start spamming "send failed" instead of ok. On the debug patch, the "b" count in misc debug vals will go up, and resN.dmp files will be written each time it's hit.

Alternatively, you can use the script posted here:
http://chdk.setepontos.com/index.php?topic=8273.msg100018#msg100018

with the settings posted here:
http://chdk.setepontos.com/index.php?topic=8273.msg100090#msg100090

Or you can roll your own script to try to trigger it. We know the bug happens in resume, so aside from whatever other unknown factors are involved, more frequent sleeps should trigger it more.  I have a gut feeling that the camera being "busy" is a factor, but that could easily be wrong.

One experiment I've had in mind is a script that does the following:
- press shoot_half
- sleep(10) until get_shooting is true
- record the time it took for get_shooting to go true
- release shoot_half
- sleep(10) for the same amount of time as it took get_shooting

With decent statistics, this could tell us if being in half shoot is required or more likely to trigger the bug. This would require the above mentioned debug patch to keep running when the error happens.
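
A rough sketch of that experiment as a CHDK Lua script might look like the following. It assumes the standard CHDK script functions press, release, get_shooting, sleep and get_tick_count; the function name half_press_cycle and the cycle count of 1000 are placeholders, not something from an existing script. There is no timeout, so it will wait forever if get_shooting never goes true.

```lua
-- Sketch only: intended to run on the camera under CHDK, in a shooting mode.
-- One cycle: half press, poll get_shooting(), release, then idle for equally long.
local function half_press_cycle()
  local t0 = get_tick_count()
  press("shoot_half")
  repeat
    sleep(10)                              -- each sleep is a yield/resume
  until get_shooting()
  local ready_ms = get_tick_count() - t0   -- time for get_shooting to go true
  release("shoot_half")

  -- Sleep in 10 ms steps for the same total time, now outside half press.
  local t1 = get_tick_count()
  while get_tick_count() - t1 < ready_ms do
    sleep(10)
  end
  return ready_ms
end

-- Only start the loop when the CHDK script functions are actually present.
if get_shooting then
  for i = 1, 1000 do                       -- placeholder cycle count
    print(i, half_press_cycle())
  end
end
```

Each cycle spends about the same number of sleeps inside and outside half press, so with enough cycles the debug dumps could be compared between the two phases. The script itself does not record which phase a hit occurred in; that would still have to come from the debug patch output.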


*

Offline philmoz

  • *****
  • 3411
    • Photos
Re: help: how to debug script interrupting?
« Reply #172 on: 03 / May / 2013, 18:01:24 »
Quote
What toolchain are you using? Can you post a copy of the lua.elf from your build?

I'm using arm-elf-gcc 4.6.2.
I can also build with 4.4.0, 4.4.3 or 4.5.1 so I will give them a try.

Lua.elf attached.

Phil.

Edit: Ran the msgtest.lua test on the G12. 250K cycles using GCC 4.4.3 and again with 4.5.1 and no errors. Running 1M cycles now with 4.5.1.

« Last Edit: 03 / May / 2013, 21:32:10 by philmoz »
CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #173 on: 03 / May / 2013, 22:47:17 »
FWIW, I got 3 errors running msgtest 1M on d10. Runtime was ~4.5 hours, so this doesn't appear to be a very efficient way of producing the problem. 
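
To put a rough number on the sample size, here is a back-of-the-envelope calculation in Lua. It assumes every message independently hits the bug with the same probability, which is only an assumption and may well be wrong if other factors are involved. The helper name p_at_least_one is made up for this sketch. It needs floating point, so it is meant for a desktop Lua interpreter rather than the camera.

```lua
-- Assumption: each message independently hits the bug with the same probability.
local errors, messages = 3, 1000000   -- the d10 run reported above
local p = errors / messages           -- crude per-message estimate

-- Chance of seeing at least one error in n messages under that assumption.
local function p_at_least_one(n)
  return 1 - (1 - p) ^ n
end

for _, n in ipairs({100000, 250000, 1000000}) do
  print(string.format("%8d messages: %.0f%%", n, 100 * p_at_least_one(n)))
end
```

Under that assumption this comes out at roughly 26% for 100k, 53% for 250k and 95% for 1M messages, which fits the impression that 100k is not enough to trigger it reliably. With only 3 events the rate estimate itself is very uncertain.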

edit:
All of the testing I've done has been with my debug dump code. I suppose it's possible this affects the frequency. I'll try the autobuild.

edit:
There's a slight difference in the return from luaD_rawrunprotected

philmoz lua.elf
Code: [Select]
    5234: 981b      ldr r0, [sp, #108] ; 0x6c
    5236: b01c      add sp, #112 ; 0x70
    5238: bc10      pop {r4}
    523a: bc02      pop {r1}
    523c: 4708      bx r1
mine (with debug code)
Code: [Select]
  26fd8a: ac03      add r4, sp, #12
...
  26fd9a: b01c      add sp, #112 ; 0x70
  26fd9c: 6e20      ldr r0, [r4, #96] ; 0x60
  26fd9e: bc10      pop {r4}
  26fda0: bc02      pop {r1}
  26fda2: 4708      bx r1

mine (vanilla trunk)
Code: [Select]
    6a8e: ab03      add r3, sp, #12
    6a90: b01c      add sp, #112 ; 0x70
    6a92: 6e18      ldr r0, [r3, #96] ; 0x60
    6a94: bc02      pop {r1}
    6a96: 4708      bx r1
None of this should make any difference to the value in r0...

If you can use the 4.5.1 windows toolchain from chdkshell, that will exactly match what I am using.
« Last Edit: 03 / May / 2013, 23:12:28 by reyalp »


*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #174 on: 04 / May / 2013, 00:06:43 »
Here's a G12 100c build from my tree, with the reyalp-lua-debug-trunk-2735-2.patch applied

If the error is hit, the b value in misc debug values will increase, and resN.DMP will be written to the card. (up to N = 10)

*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #175 on: 04 / May / 2013, 04:33:20 »
I ran the script in no-shoot mode using the latest trunk autobuild. 10k cycles didn't hit the bug...

Running the debug code with the same script settings, it hit the error 8 times in less than 5000 cycles.

I realized after I started it the camera settings were not identical to the previous no-shoot run with the debug code. IS was off, and focus was left on AF.

I'm re-running it with IS on and MF, which I think were what I used with the debug code

Re: help: how to debug script interrupting?
« Reply #176 on: 04 / May / 2013, 07:42:43 »
As far as I can be sure I was always using MF and the focus set to "infinity". Focus was set on the camera. No CHDK overrides were set except the Tv set by the script itself.

I have no idea whether it is important but other settings:
- auto white balance
- M mode
- no flash
- different zooms, but usually the widest
- L size of photos

from CHDK:
- no RAW
- dark frame subtraction set to off

if (2*b || !2*b) {
    cout<<question
}

Compile error: poor Yorick

*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #177 on: 04 / May / 2013, 15:00:44 »
edit: Oops, nuked this when I was trying to quote it. From memory:
Quote
As far as I can be sure I was always using MF and the focus set to "infinity". Focus was set on the camera. No CHDK overrides were set except the Tv set by the script itself.
Do you remember if this was using an autobuild, or something you built yourself?


Quote
- L size of photos
FWIW, the runs I did with shooting were done with 640x480 / low quality.

Last night I ran 20k no-shoot cycles (making 30k with the ones previously mentioned) using the autobuild without hitting the bug. IS was enabled to match the original test runs I did with my debug build.
« Last Edit: 04 / May / 2013, 19:33:11 by reyalp »


*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #178 on: 04 / May / 2013, 16:22:53 »
Quote
However, I got an error in 850 cycles on the ixus110_sd960 (r31, same generation as the d10). Perhaps we should collect the affected and (presumably) not affected models somewhere, maybe there's a pattern.

What source branch and version was this? Did it include my debug patch?

*

Offline reyalp

Re: help: how to debug script interrupting?
« Reply #179 on: 04 / May / 2013, 17:02:26 »
Here's a g10 102a build with my debug patch.

 
