Updated ImportCHDKStubs.py and added two more scripts:
InitCHDKMemMap.py - Use information from stubs_entry.S to create a memory map in freshly loaded, not analyzed dumps. It should work reasonably well on most digic 2-5, as well as known digic 6 and 7 firmware. (there may be some oddballs that don't work well, s110 with ALT_ROMBASEADDR for example)
The memory map includes
* Copied code and data for the main CPU, with code marked executable
* ROM is split into executable and non-executable regions, for the main firmware code and romstarter where identified.
* Uninitialized regions are created for RAM outside of copied areas, uncached RAM, and known MMIO regions. MMIO regions are marked as volatile (the only effect I can see is decompiler showing things like "read_volatile_4()", but it makes MMIO access stand out a bit)
Creating big uninitialized blocks for RAM, uncached RAM and MMIO seems to have positives and negatives: Positive, because analysis can recognize data references, references are easy to follow, and things outside the defined areas stand out. Negative, because analysis looking for data references can see a lot of values as addresses just because a large portion of the address space is valid, especially on d6 and d7 cams with lots of RAM.
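As a rough illustration of the "RAM outside of copied areas" idea, here is a minimal sketch in plain Python (no Ghidra API) that computes which RAM ranges would still need uninitialized blocks once the copied code and data ranges are known. The function name `ram_gaps` and all addresses are hypothetical and only for illustration; this is not how InitCHDKMemMap.py is necessarily implemented.

```python
def ram_gaps(ram_start, ram_end, copied):
    """Return inclusive (start, end) ranges of RAM not covered by copied blocks.

    copied is a list of inclusive (start, end) ranges, in any order.
    """
    gaps = []
    cursor = ram_start  # lowest address not yet accounted for
    for start, end in sorted(copied):
        if start > cursor:
            # Everything between the cursor and this copied block is a gap
            gaps.append((cursor, min(start - 1, ram_end)))
        cursor = max(cursor, end + 1)
        if cursor > ram_end:
            break
    if cursor <= ram_end:
        gaps.append((cursor, ram_end))
    return gaps


# Hypothetical example values, not taken from any real camera
copied_blocks = [(0x1900, 0x1ffff), (0x400000, 0x4fffff)]
for start, end in ram_gaps(0x0, 0x03ffffff, copied_blocks):
    print("uninitialized RAM: 0x%08x - 0x%08x" % (start, end))
```

Each resulting range would correspond to one uninitialized block in the memory map.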
CleanThumb2BookmarkErrors.py - This attempts to clean up some common issues with autoanalysis on thumb2 firmware. It should be run after initial auto-analysis, or after large chunks of code are analyzed.
Background: Ghidra creates "bookmarks" when the disassembler runs into problems, like invalid instructions or data already defined where it wants to disassemble. You can see them by enabling the bookmarks window. There are other informational bookmarks too; to see just errors, click on the gear in the upper right of the window.
CleanThumb2BookmarkErrors.py iterates over "error" bookmarks and tries to clean up some common issues, like data being defined at +1 of a thumb function, and places where the disassembler started disassembling in ARM when it shouldn't have. It probably doesn't always do the right thing, but it's a significant improvement on the dumps I ran it on. By default, it removes the bookmark if there appears to be valid disassembly at the bookmark address. This can be turned off by changing remove_resolved at the top of the file.
ImportCHDKStubs.py updates
* Now prompts for whether to disassemble, or just create entry points.
* Entry point mode now sets the thumb register, so auto-analysis will know which mode to start in
* Works on non-T processors for digic < 6. Since Canon doesn't appear to use thumb on those cameras, using the non-T variants may avoid the disassembly mistakenly going off in thumb mode.
* Adds labels / entry points on the main firmware start, and some romstarter entries. This is useful to trigger analysis on early parts of the firmware startup.
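To illustrate what "knowing which mode to start in" involves, here is a minimal sketch assuming the usual ARM interworking convention that bit 0 of a code address marks Thumb. The helper name `split_stub_address` and the example addresses are hypothetical; the actual script may derive the mode differently.

```python
def split_stub_address(addr):
    """Split a code address into (entry_address, is_thumb).

    Assumes bit 0 set means Thumb, per the ARM interworking convention.
    """
    is_thumb = bool(addr & 1)
    return addr & ~1, is_thumb


# Hypothetical addresses, for illustration only
for raw in (0xfc123457, 0xff810000):
    entry, thumb = split_stub_address(raw)
    print("0x%08x -> entry 0x%08x, %s" % (raw, entry, "thumb" if thumb else "ARM"))
```

The entry point would be created at the even address, with the thumb register set at that address according to the flag.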
The scripts now have a menupath set, so if you select "in tool" they'll appear under the Ghidra source viewer tools menu.
stubs_loader module updates
* Now parses comments out of stubs_entry.S, and derives a bunch of useful values from them
* Can be used by regular python 2 or 3 outside ghidra (not heavily tested)
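For a sense of what parsing stubs_entry.S outside Ghidra could look like, here is a minimal sketch that pulls `NAME = value` pairs out of comment lines and addresses out of stub macros. The sample text, the regular expressions, and the function name `parse_stubs` are assumptions for illustration; this is not the stubs_loader API, and real files contain more line formats than shown here.

```python
import re

# Illustrative sample only; real stubs_entry.S files have many more lines
SAMPLE = """\
// Values for makefile.inc
//   MAXRAMADDR = 0x03ffffff
//   KEYSYS = d4b
NHSTUB(AllocateUncacheableMemory ,0xff83a0b0) //112
"""

# "//   NAME = value" style comment lines
COMMENT_VALUE = re.compile(r'^//\s*(\w+)\s*=\s*(\S+)')
# NHSTUB(name, 0xaddr), NHSTUB2(...) and DEF(...) style macros
STUB = re.compile(r'^\s*(?:NHSTUB2?|DEF)\(\s*(\w+)\s*,\s*(0x[0-9a-fA-F]+)\s*\)')


def parse_stubs(text):
    """Return (values, stubs): comment values as strings, stub addresses as ints."""
    values, stubs = {}, {}
    for line in text.splitlines():
        m = COMMENT_VALUE.match(line)
        if m:
            values[m.group(1)] = m.group(2)
            continue
        m = STUB.match(line)
        if m:
            stubs[m.group(1)] = int(m.group(2), 16)
    return values, stubs


values, stubs = parse_stubs(SAMPLE)
print(values)
print(stubs)
```

Comment values are kept as strings here because some of them, like the key system name in the sample, are not numeric.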
With these updates, my suggested workflow is
* Load the PRIMARY.BIN at the rom start address
* Open the firmware, cancel auto-analysis
* Run InitCHDKMemMap.py
* Run ImportCHDKStubs.py in entry point mode
* Set auto-analysis options and run
* For thumb2 firmware, run CleanThumb2BookmarkErrors.py after auto-analysis completes
Some suggestions for auto-analysis
* Turn off "embedded media" for the first run, as it seems to misidentify some things as WAV in code. Run it from the one-shot menu afterwards instead
* Turn off "Non-returning functions - discovered". This seems to cause disassembly to stop in a lot of places it shouldn't
* Turn on "Shared return calls". This helps deal with code that does a b ... after a pop lr.
* Turn off "address tables". I'm not sure about this one, but I think it's better run as a one-shot after initial analysis, to avoid creating data from runs of things that could be addresses.
One other important thing: The auto-analysis options don't just apply to the initial, full analysis; they also apply whenever new code is disassembled. So if you turn something off for the initial run, you may want to re-enable it after. The settings are still saved when you cancel.