I have tested on s2is. No problems so far. DNG saving is ~1s faster than with CHDK trunk 1.2.
Thanks.
Since s2is is a vxworks camera, I would expect it to have the hang problem, where the camera sometimes locks up in DNG saving. Have you run the test script from post:
http://chdk.setepontos.com/index.php?topic=9970.msg100756#msg100756 ?
I'm still quite confused by this problem. Brain dump follows... this is more to organize my thoughts than anything else, not really expecting others to wade through it.
I've confirmed that it happens in a few different ways. Usually it's spytask that locks up, somewhere in free_dng_header. It can happen in the free() or ufree(), but sometimes kbd_task is locked instead. It's unclear what has actually happened when kbd_task is locked, since I am unable to do any sort of debugging in that case.
I added code to kbd_task to make the power LED constantly flicker. This tells me at a glance whether the main loop is still running or not.
Since PTP (sometimes) continues to work when spytask hangs, I wrote some code to dump all the TCBs to a file. I run this using PTP call function, and then download with PTP. Interestingly, sometimes PTP download fails with ptp error 0x2002, apparently due to open failing on any file I try to open. Other (non-lua) PTP functions like rmem and callfunction continue to work. In these cases, after rebooting the file is there and readable.
The TCB includes the task context (registers from the last time it was suspended) and the running state. The PC is always in the vxworks task code, but the LR varies. When a task is "hung" in this situation, it always appears to be in the "ready" state. I have never seen other tasks in that state. As far as I understand, this should mean that it will get scheduled the next time OS scheduling rules permit, but it never appears to change after the hang. Most other tasks are in "suspend", which makes sense because they are typically waiting on some IPC object. kbd_task, where the dump code actually runs, is always 0 status (running?). tClockSave also seems to be status 0, but this appears to be the case when the camera is operating normally too. The same is sometimes true of spytask.
Interestingly, calling taskResume (as identified by the vxworks flirt sigs) on the "hung" task will sometimes allow the shot to complete and the script to continue, although a crash or hard hang usually follows fairly soon after. There's another function identified as "ResumeTask" which I have not yet tried.
I tried moving the free_dng_header into the DNG writer task, directly after the actual header write. This causes the writer task to hang instead of spytask. Otherwise the symptoms seem similar.
In several of the spytask hangs, I noticed the LR of the task context was in this code (a540, 100b):
ROM:FFEE08FC 00 00 A0 03 MOVEQ R0, #0
ROM:FFEE0900 E3 34 00 0B BLEQ f_taskSuspend
ROM:FFEE0904 loc_FFEE0904 ; CODE XREF: memPartFree+F0j
ROM:FFEE0904 F0 06 9F E5 LDR R0, =0x110003
Note this is in memPartFree, which is the underlying code for free() and ufree().
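As a sanity check on the listing above, here is a minimal host-side sketch that decodes the BLEQ from its raw bytes using the standard ARM B/BL encoding. The function name is made up for illustration, and the only inputs are the address and bytes shown in the listing. A BL at FFEE0900 returns to FFEE0904, so an LR at loc_FFEE0904 would be consistent with the task having gone through that taskSuspend call.

```python
def decode_arm_branch(addr, raw_bytes):
    """Decode an ARM-mode B/BL instruction given its address and 4 little-endian bytes.

    Returns (condition code, is_link, branch target).
    """
    word = int.from_bytes(raw_bytes, "little")
    cond = word >> 28                       # 0x0 = EQ, 0xE = AL
    if (word >> 25) & 0b111 != 0b101:
        raise ValueError("not a B/BL instruction")
    is_link = bool(word & (1 << 24))        # bit 24 set means BL (LR is written)
    offset = word & 0x00FFFFFF
    if offset & 0x00800000:                 # sign-extend the 24-bit word offset
        offset -= 1 << 24
    # In ARM state the PC reads as the instruction address + 8
    target = (addr + 8 + (offset << 2)) & 0xFFFFFFFF
    return cond, is_link, target


# Bytes "E3 34 00 0B" at ROM:FFEE0900, copied from the listing
cond, is_link, target = decode_arm_branch(0xFFEE0900, bytes.fromhex("E334000B"))
print(f"cond={cond:#x} link={is_link} target={target:#010x} "
      f"return address={0xFFEE0900 + 4:#010x}")
```

The target printed here is only what the encoding implies; I have not checked it against the firmware dump.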
Another oddity in the tcb dump is the task options, which show as 0x104 for all tasks except tExcTask and tLogTask. According to the vxworks docs I can find 0x100 is VX_NO_STACK_FILL, but 4 is not documented. This is based on the reversed TCB in includes/vxwork.h, which might have errors.
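To make the options question concrete, here is a small sketch that splits an options word into named and unnamed bits. The table deliberately contains only the one bit named above (0x100 as VX_NO_STACK_FILL); the helper name is hypothetical and anything not in the table is reported as unknown rather than guessed.

```python
# Only the bit identified in the post; extend once other bits are confirmed.
KNOWN_OPTION_BITS = {0x100: "VX_NO_STACK_FILL"}


def decode_task_options(options):
    """Return (names of known bits set, mask of set bits with no known name)."""
    names = []
    unknown = 0
    bit = 1
    while bit <= options:
        if options & bit:
            if bit in KNOWN_OPTION_BITS:
                names.append(KNOWN_OPTION_BITS[bit])
            else:
                unknown |= bit
        bit <<= 1
    return names, unknown


names, unknown = decode_task_options(0x104)
print(names, hex(unknown))
```

For 0x104 this leaves 0x4 as the only unexplained bit, assuming the reversed TCB layout puts the options field in the right place.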
Some other things I have tried
- Make the main task function in ARM code. Theory: ending the task in thumb makes the vxworks kernel do something wrong. But I may have messed this up, since the task was created from thumb, so it might still actually be using the automatically generated interworking...
- Move the file open/close into the writer task, just passing the filename to write_dng. Theory: sharing file descriptors between tasks is bad. Vxworks docs say it's OK, but maybe the FsIoNotify stuff could be confused. I think this is a good idea anyway, but it doesn't help.
- Use cached or uncached DNG. Theory: cache cleaning code might be doing something weird.
- Use different SD cards

One thing I'm tempted to test is to make the DNG writer task get re-used instead of exiting. This is how most of the Canon tasks operate. However, unless we want it constantly sitting in a sleep loop, we would need to get some task synchronization object functions (semaphore, message queue, event flag) in the sig finders. Using these could gain a bit of efficiency in the process too, since we wouldn't have to do 10ms sleeps when the various tasks are waiting for each other.
I discovered that msleep(0) does *not* yield, contrary to my comment in the current svn code; it's effectively a no-op on both vxworks and dryos. This is slightly odd, because inserting it did seem to allow dng_writer to get started a bit earlier on dryos.
Attached for future reference are my tcb dumping code and a chdkptp script to parse the results. Note there's some other cruft in the dumping code; I'm sure anyone who needs it can sort it out.
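The re-used writer idea, as a rough host-side analogy only: this is not CHDK or firmware code, and Python's queue stands in for whatever message queue or semaphore functions the firmware turns out to provide. All names here are made up. The point is just that the worker blocks on the queue instead of polling with short sleeps, and stays alive between shots.

```python
import queue
import threading


def writer_task(jobs, results):
    """Long-lived worker: blocks on the queue until a job or a stop request arrives."""
    while True:
        job = jobs.get()          # blocks without polling or sleeping
        if job is None:           # sentinel: stop the task
            break
        results.append(f"wrote {job}")   # stand-in for the real DNG write
        jobs.task_done()          # signal completion to whoever is waiting


def run_demo(filenames):
    jobs = queue.Queue()
    results = []
    worker = threading.Thread(target=writer_task, args=(jobs, results))
    worker.start()
    for name in filenames:
        jobs.put(name)            # hand one "shot" to the existing task
        jobs.join()               # wait for that write to finish, no sleep loop
    jobs.put(None)
    worker.join()
    return results


print(run_demo(["shot1.dng", "shot2.dng"]))
```

On the camera the equivalent would be one task created at startup and a queue or semaphore per direction, which is why the synchronization functions would need to be found first.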
The way to call this over ptp is to find req_task_dump in your main.bin.dump file, and then call it like
!con:call_function(0x94c2d)
Note you need to add 1 to the address in the main.bin.dump listing since it's thumb.
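A trivial sketch of the address fix-up, in case it saves someone a hung camera. The helper name is made up, and 0x94c2c is simply the example address above minus 1, i.e. what the listing would presumably show. Setting bit 0 is the same as adding 1 for the even addresses that appear in the listing, and it is harmless if the bit is already set.

```python
def thumb_call_address(listing_addr):
    """Set bit 0 so the call enters the function in thumb state."""
    return listing_addr | 1


addr = thumb_call_address(0x94c2c)
print(f"!con:call_function({addr:#x})")
```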
The decoding code can be used like this:
!m=require'extras/vxtcbdump'
!d=m.load('tasks.dmp')
!d:list_tasks()
!d.byname.tSpyTask:print()
edit:
I should add that I've done long runs (>50 shots) using only the original single task code without hitting the hang.