Copied from dpreview, per mx32's suggestion.
I just want to continue the discussion of mx32's universal dumper idea. I have to admit that the first time I saw it I thought it was a daydream, but now I think it is theoretically doable.
The technical problems with running the dumper inside the live OS are:
1: How to prevent the live OS from overwriting the memory where the dumper sits.
2: How to regain control while the live OS is running.
3: How to find the file-system-related API entry points in a new live OS.
For 3, it is possible to find the API entry points using function signatures. With enough FW dumps we should be able to solve this problem for most models.
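To make the signature idea concrete, here is a rough sketch of a masked byte-pattern search. Everything in it is my own illustration: the name find_signature, the sample opcodes and the little-endian byte order are assumptions, not taken from any real Canon firmware. Bytes that change between firmware versions (for example the offset inside a BL) are marked as wildcards.

```python
from typing import List, Optional

def find_signature(image: bytes, pattern: List[Optional[int]], step: int = 4) -> List[int]:
    """Return every offset in `image` where `pattern` matches.

    A pattern entry of None is a wildcard byte. `step` is 4 because
    ARM-mode instructions are word aligned.
    """
    hits = []
    last = len(image) - len(pattern)
    for off in range(0, last + 1, step):
        if all(p is None or image[off + i] == p for i, p in enumerate(pattern)):
            hits.append(off)
    return hits

# Hypothetical signature: STMFD SP!,{R4-R8,LR} ; BL <anything> ; LDMFD SP!,{R4-R8,PC}
# (little-endian byte order assumed; the three low bytes of the BL are wildcards)
SIGNATURE = [0xF0, 0x41, 0x2D, 0xE9,
             None, None, None, 0xEB,
             0xF0, 0x81, 0xBD, 0xE8]

# Made-up "firmware" fragment: one padding word, the function, one padding word.
IMAGE = bytes.fromhex("00000000" "f0412de9" "123456eb" "f081bde8" "00000000")

print(find_signature(IMAGE, SIGNATURE))
```

A real signature would need to be longer than three instructions to avoid false hits, and it would have to be built by comparing the same function across several known dumps.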
The hard part is 1 and 2. If we were able to modify the ROM it would be easy, but that is impossible. mx32's proposal is to use the CPU's memory protection unit; I think that is a gamble. I prefer the method that CHDK currently uses.
CHDK has the same requirement: it must hide itself from the live OS. Currently we copy the initialization assembly code from the firmware into CHDK itself. In the initialization code, the Canon FW passes the sizes of its data segment and BSS segment to the init routine. We enlarge the size, call those routines just as the original Canon FW does, and finally jump to "the next point" in the Canon FW, returning control to the FW. In this way CHDK tricks the FW and can hide in the Canon FW's data segment.
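The arithmetic behind "enlarge the size" can be sketched as below. This is only a model of the idea: the function name and all the addresses and sizes are made-up numbers, and the real init code may lay things out differently.

```python
def enlarge_bss(data_start: int, data_size: int, bss_size: int,
                hidden_size: int, align: int = 4):
    """Return (hidden_start, new_bss_size).

    hidden_start is the first address after the original data+BSS area,
    rounded up to `align`. new_bss_size is the BSS size we would pass to
    the init routine so the OS treats the hidden block as already taken.
    """
    data_end = data_start + data_size
    original_end = data_end + bss_size
    hidden_start = (original_end + align - 1) & ~(align - 1)
    new_bss_size = hidden_start + hidden_size - data_end
    return hidden_start, new_bss_size

# Hypothetical layout: data at 0x1900, 0xC000 bytes of data, 0x9000 bytes
# of BSS, and 0x4000 bytes that we want to keep for ourselves.
hidden_start, new_bss_size = enlarge_bss(0x1900, 0xC000, 0x9000, 0x4000)
print(hex(hidden_start), hex(new_bss_size))
```

With these numbers the hidden block starts right where the original BSS ended, and the size reported to the init routine grows by exactly the amount we reserved.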
After some reading of both the VxWorks and DryOS init code, I found they are mostly identical; I can see the data segment and its size at the beginning of the code. So the same trick can work with both VxWorks and DryOS, and problem 1 can be solved if we can use this trick.
How do we regain control while the OS is running? This is pretty easy: since we copy the code into CHDK, we always have control until we decide to return it to the OS. A better way may be to create a task using the CreateTask API provided by the FW (thank god this one is easy to find in the Canon FW) after the task system is initialized, so that we always get CPU time for the whole period the OS is running. This is useful if some IO systems, like the SD card, aren't initialized at the early stage. So problem 2 is solved too. (Maybe even CHDK for DryOS can use this approach.)
So only the hardest problem is left: how can we copy the code, as we do by hand in IDA, without dumping the FW first? Well, that's why I say this idea is "theoretically doable". We could write something like a disassembler: it reads the code from the beginning, tracks the execution path and copies the binary opcodes to RAM, then modifies the copied code to fix all the references to the old data and procedures, and finally applies our fix. This sounds impossible, but it is the major technique used in decent software protectors like StarForce, Themida, ACProtect... Since x86 is a more complex, variable-length instruction set, ARM should be much simpler to handle. Anyway, it requires a lot of coding work and is not something that can be done quickly.
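As one small piece of "fix all the references", here is a sketch of how a copied ARM-mode B/BL instruction could be re-encoded so that it still reaches its original target from the new location. The helper names and the addresses are my own illustration; only the B/BL encoding itself (24-bit signed word offset relative to the instruction address plus 8) is standard ARM. It only handles direct branches, not data references or literal-pool loads.

```python
MASK32 = 0xFFFFFFFF

def is_direct_branch(insn: int) -> bool:
    """True for ARM-mode B/BL with a normal condition code (not 0xF)."""
    return ((insn >> 25) & 0b111) == 0b101 and (insn >> 28) != 0xF

def branch_target(insn: int, addr: int) -> int:
    """Absolute target of a B/BL located at `addr`."""
    imm24 = insn & 0xFFFFFF
    if imm24 & 0x800000:              # sign-extend the 24-bit offset
        imm24 -= 0x1000000
    return (addr + 8 + (imm24 << 2)) & MASK32

def relocate_branch(insn: int, old_addr: int, new_addr: int) -> int:
    """Re-encode a B/BL moved from old_addr to new_addr, keeping its target."""
    if not is_direct_branch(insn):
        raise ValueError("not a direct B/BL instruction")
    target = branch_target(insn, old_addr)
    delta = (target - (new_addr + 8)) & MASK32
    if delta & 0x80000000:            # interpret as signed, addresses wrap at 4 GB
        delta -= 0x100000000
    if delta % 4 or not (-(1 << 25) <= delta < (1 << 25)):
        raise ValueError("target not reachable with a 24-bit branch offset")
    return (insn & 0xFF000000) | ((delta >> 2) & 0xFFFFFF)

# Hypothetical example: a BL in ROM at 0xFF810000 is copied to RAM at 0x00100000.
old_insn = 0xEB000010
new_insn = relocate_branch(old_insn, 0xFF810000, 0x00100000)
print(hex(branch_target(old_insn, 0xFF810000)), hex(branch_target(new_insn, 0x00100000)))
```

Both printed targets should be the same address. If the copy lands too far from the target, the sketch raises an error instead; a real tool would have to emit a longer sequence in that case.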
(In more detail: copying code is not always doable, because sometimes people use LDR PC, reg to jump. But it seems to be OK for the VxWorks/DryOS initialization code. We don't need to copy all the code in the subfunctions; we only need to copy the top-level code, as we do in CHDK. If there were some jump-by-register code there, we wouldn't know how to copy it by hand either.)
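A tracer could at least detect the cases it cannot follow and stop there. Below is a rough sketch that flags a few common ARM-mode ways of writing PC indirectly. The function name is hypothetical and the list is deliberately incomplete (for example LDM with PC in the register list is not covered), so treat it as an illustration rather than a full decoder.

```python
from typing import Optional

def indirect_jump_kind(insn: int) -> Optional[str]:
    """Classify a 32-bit ARM-mode opcode that writes PC indirectly.

    Returns a short label, or None if the opcode is not one of the few
    patterns checked here.
    """
    if (insn >> 28) == 0xF:
        return None                   # unconditional space (PLD etc.), ignored here
    if (insn & 0x0FFFFFF0) == 0x012FFF10:
        return "bx"                   # BX Rm
    if (insn & 0x0FFFFFF0) == 0x01A0F000:
        return "mov pc"               # MOV PC, Rm (no shift)
    is_load = (insn & 0x0C100000) == 0x04100000
    reg_offset_with_bit4 = (insn & 0x02000010) == 0x02000010
    if is_load and not reg_offset_with_bit4 and ((insn >> 12) & 0xF) == 15:
        return "ldr pc"               # LDR PC, [...]
    return None

# Sample opcodes (standard ARM encodings):
for word in (0xE12FFF1E,   # BX LR
             0xE1A0F00E,   # MOV PC, LR
             0xE51FF004,   # LDR PC, [PC, #-4]
             0xE59F0008,   # LDR R0, [PC, #8]
             0xEB000010):  # BL (direct branch)
    print(hex(word), indirect_jump_kind(word))
```

Note that LDR PC, [PC, #imm] loads from a fixed literal slot, so its target could still be resolved by reading that word; the sketch flags it anyway and leaves that decision to the caller.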
Thanks