supplierdeeply

A570 ROMLOG

  • 4 Replies
  • 889 Views
*

Offline fb

  • *
  • 2
A570 ROMLOG
« on: 21 / October / 2013, 22:45:53 »
Advertisements
Been having an issue with an A570 camera running CHDK a570-101a-1.2.0-3131-full. Camera suddenly crashes after working without issue for multiple days. I'm using a 32GB card with a 2MB boot partition. Camera is running a custom Lua script for timelapse photography.

I originally brought up the issue on IRC. reyalp and waterwingz offered assistance and suggestions. Original ROMLOG: http://pastebin.com/YNiNzA5L

I've since updated my script to log memory usage every 5 minutes, as the most obvious cause for the crash was an out of memory issue. Camera ran without issue for nearly 4 days until crashing, ROMLOG: http://pastebin.com/8Lmb4iKG

Additional notes:
-I accidentally removed a call to collectgarbage() in the script; script ran without manually collecting garbage.
-Many unique strings are generated while logging, raising Lua memory usage. However, based on the logs, Lua is effective at automatic garbage collection.
-This may still be an out of memory issue, but it's worth noting that memory wasn't low before the most recent crash (below).
-Memory usage ranges:
  -get_meminfo().allocated_size ~2MB
  -get_meminfo().free_size ~85-280K
  -Lua ~120-250K
-Memory gets scarce shortly after startup, but quickly recovers and stays within ranges above.
-Log excerpts below.

Log format:
[timestamp] <shots taken>/<get_jpg_count()>, <script runtime DAYS.HOURS:MINUTES:SECONDS>
<Temperature degrees F optical/CCD/battery>
<Battery voltage>
<SD card stats>
<Lua memory usage: collectgarbage("count") * 1024>
<get_meminfo() result (note: '-' indicates nil was returned): [heap name] <total_size>, <allocated_size>/<allocated_peak>/<allocated_count>, <free_size>/<free_block_max_size>/<free_block_count>

----------
STARTUP
----------
[2013-10-14 12:19:22] 16/9999, runtime 00:05:21
Temp Opt/CCD/Batt: 73/82/69
Batt: 3138 mV
Disk Free/Size: 30148/30190
Lua mem: 182272
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2095352/-/2603, FREE size/block_max_size/block_count 208328/150864/200

----------
LOW MEMORY
----------
[2013-10-14 13:01:22] 142/9999, runtime 00:47:21
Temp Opt/CCD/Batt: 105/107/91
Batt: 3138 mV
Disk Free/Size: 29853/30190
Lua mem: 271360
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2194020/-/3709, FREE size/block_max_size/block_count 109660/107268/127

[2013-10-14 13:22:01] 204/9999, runtime 01:08:01
Temp Opt/CCD/Batt: 109/111/93
Batt: 3138 mV
Disk Free/Size: 29710/30190
Lua mem: 320512
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2249088/-/4254, FREE size/block_max_size/block_count 54592/33404/130

[2013-10-14 13:32:21] 235/9999, runtime 01:18:21
Temp Opt/CCD/Batt: 111/111/95
Batt: 3138 mV
Disk Free/Size: 29639/30190
Lua mem: 294912
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2216476/-/3518, FREE size/block_max_size/block_count 87204/33404/588

[2013-10-14 13:37:41] 251/9999, runtime 01:23:41
Temp Opt/CCD/Batt: 111/111/95
Batt: 3138 mV
Disk Free/Size: 29603/30190
Lua mem: 131072
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2036836/-/2719, FREE size/block_max_size/block_count 266844/50048/370

----------
DAY 2
----------
[2013-10-15 07:20:19] 1101/9999, runtime 19:06:19
Temp Opt/CCD/Batt: 53/60/51
Batt: 3121 mV
Disk Free/Size: 27894/30190
Lua mem: 180224
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2092400/-/3375, FREE size/block_max_size/block_count 211280/69928/276

----------
DAY 3
----------
[2013-10-16 07:21:32] 3078/9999, runtime 1.19:07:32
Temp Opt/CCD/Batt: 55/62/53
Batt: 3116 mV
Disk Free/Size: 23752/30190
Lua mem: 153600
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2086280/-/3077, FREE size/block_max_size/block_count 217400/47236/342

----------
FINAL LOGS BEFORE CRASH
----------
[2013-10-17 18:12:05] 6993/9712, runtime 3.05:58:04
Temp Opt/CCD/Batt: 68/80/64
Batt: 3142 mV
Disk Free/Size: 15479/30190
Lua mem: 224256
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2210660/-/3900, FREE size/block_max_size/block_count 93020/9456/570

---script sleeps until sunrise---

[2013-10-18 07:24:01] 7005/9706, runtime 3.19:10:01
Temp Opt/CCD/Batt: 59/66/59
Batt: 3121 mV
Disk Free/Size: 15469/30190
Lua mem: 207872
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2189016/-/3408, FREE size/block_max_size/block_count 114664/9456/678

[2013-10-18 07:29:02] 7019/9688, runtime 3.19:15:01
Temp Opt/CCD/Batt: 60/69/59
Batt: 3142 mV
Disk Free/Size: 15442/30190
Lua mem: 133120
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2108844/-/2813, FREE size/block_max_size/block_count 194836/12228/395

---CRASH OCCURS AT 2013:10:18 07:30:20---

*

Offline reyalp

  • ******
  • 11086
Re: A570 ROMLOG
« Reply #1 on: 21 / October / 2013, 23:39:27 »
[2013-10-18 07:29:02] 7019/9688, runtime 3.19:15:01
Temp Opt/CCD/Batt: 60/69/59
Batt: 3142 mV
Disk Free/Size: 15442/30190
Lua mem: 133120
Sys mem: [system] TOTAL -, ALLOCATED size/peak/count 2108844/-/2813, FREE size/block_max_size/block_count 194836/12228/395

---CRASH OCCURS AT 2013:10:18 07:30:20---
I still think this is a memory issue. The last bit is relevant
Total free is a relatively healthy ~190k, but the largest free block is ~12k. In other words, memory is highly fragmented, so even though you have quite a bit free, the largest block that can be allocated is small.

A brute force solution for this would be a periodic reboot.

It's possible that there is some usage pattern in the script that is particular bad for fragmentation. For example lua builds up some garbage, a shooting allocates a small amount of image per shot, when the garbage is collected, the per-shot structure is left floating in the middle of the heap. Post the script might give me a better idea if this is likely. If so, it may be possible to adjust the script and garbage collection to minimize this effect.

Restarting the script might help, but there is currently no really good way to do that. You might be able to check "restart lua on error" and trigger an error with error()

Building CHDK with exmem support might help, since the lua heap would be independent of the main camera heap. However, it looks like no one has yet implemented this for A570.
Don't forget what the H stands for.

Re: A570 ROMLOG
« Reply #2 on: 21 / October / 2013, 23:49:08 »
Restarting the script might help, but there is currently no really good way to do that. You might be able to check "restart lua on error" and trigger an error with error()
Not sure if this is related,  but I've noticed on my memory limited A1200 that if I have syntax errors in my Lua script, and I then correct and reload the script via PTP without restarting the camera, then one or two iterations of that will cause my camera to report "not enough memory" and stop.   A power cycle clears the problem but simply restarting Lua does not.
Ported :   A1200    SD940   G10    Powershot N    G16

*

Offline fb

  • *
  • 2
Re: A570 ROMLOG
« Reply #3 on: 23 / October / 2013, 18:38:29 »
Entire memory log: http://pastebin.com/ygpWDrPu
Script: http://pastebin.com/WZmxXGE9

Good catch on the fragmentation issue. block_max_size even dips below 10k at times.

As for restarting the script, I'll see what happens with a daily reboot() and script autostart enabled. I may also implement a watchdog that triggers a reboot when a low memory or high fragmentation threshold is met.

Thanks for the suggestions. Need to do more testing. I'll report back soon.


*

Offline reyalp

  • ******
  • 11086
Re: A570 ROMLOG
« Reply #4 on: 23 / October / 2013, 23:05:19 »
As for restarting the script, I'll see what happens with a daily reboot() and script autostart enabled. I may also implement a watchdog that triggers a reboot when a low memory or high fragmentation threshold is met.
A couple of things to watch out for with reboot()

1) It may not save the current image counter. This is OK if the last shot image is still on the card, because the camera will set it based on the highest image number on the card when it starts up again (the resetting behavior is documented in the canon manual). If your script were deleting images, it could cause issues.

2) It's a good idea to make sure the camera is in play mode and the lens is retracted before you reboot.
Don't forget what the H stands for.

 

Related Topics