Ghidra reverse engineering tool - page 4 - General Discussion and Assistance - CHDK Forum

Ghidra reverse engineering tool


Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #30 on: 12 / December / 2020, 00:48:38 »
Ghidra 9.2 was released in November: https://ghidra-sre.org/releaseNotes_9.2.html

There are a lot of changes, but nothing jumps out as being especially important for CHDK development.

It is not compatible with some of my scripts (get_pinsn_at fails, which breaks ImportCHDKStubs and others). I'll look into fixing this, but for the moment upgrading is not recommended.

While both can be installed at the same time, opening programs with the new version will cause them to be upgraded, which may make them not open in the old version.

Don't forget what the H stands for.

*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #31 on: 13 / December / 2020, 23:34:05 »
I checked in fixes for the scripts in r5674. However, I noticed that Ghidra 9.2 incorrectly disassembles some bl instructions (interpreting them as vst4.* with a warning "Instruction pcode is unimplemented") so I'd recommend sticking with 9.1.2 for now.

*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #32 on: 23 / January / 2021, 19:09:24 »
Ghidra 9.2.2 was released on Dec 29 and appears to fix the bl disassembly issue seen in 9.2. The CHDK scripts work correctly, so I'd say this is the recommended version for new installs now. If you're already on 9.1.x, I haven't seen any really compelling reasons to upgrade, though I think the disassembly may be a bit improved.

*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #33 on: 16 / February / 2021, 23:11:44 »
In trunk r5733, I added some files in tools/ghidra_scripts/datatypes that can be used to make Ghidra aware of function prototypes and structure definitions. This can significantly improve decompiler output, both in the identified functions themselves, and in other code that calls them.

The header files are manually created. I thought about trying to make the normal CHDK files usable directly in Ghidra, but it seemed pretty impractical so I just started with a copy of lowlevel.h instead. I've added some additional functions that were used elsewhere, as well as some common ones from the stubs that aren't currently used in CHDK. IMO, even outside the direct benefits to analysis, this is a useful place to document functions we have named and understood.

Basic usage from the README.TXT is below (it has since been added to the wiki as well). Ghidra supports several different workflows, but this was what seemed to work best for me after playing around with it for a while.

The program should already be analyzed.

Go to File -> Parse C Source

* Use the small disk icon with ... under it to copy an existing parse configuration, e.g. clib.prf
* Name your copy something obviously related to CHDK and camera configuration, e.g. chdk-dryos31
* Select all the header file entries, and use the red X button to delete them
* Use the green + button to add chdk source/tools/ghidra_scripts/fw_functions.h
* Adjust the parse options section to match your platform:
  Remove all entries except
   -D__builtin_va_list=void *
  If your camera uses dryos, add the PLATFORMOSVER value from makefile.inc, like
   -DCAM_DRYOS_REL=31
  If your camera uses 3 argument DebugAssert (see platform_camera.h) add
   -DCAM_3ARG_DebugAssert=1
  This applies to some early VxWorks cameras, all Digic 6, and some others with DryOS 52 and later.
* Save your parse configuration with the big floppy icon. Note: Parse configurations are global
  within Ghidra, not specific to a particular project or program.
* Click "Parse to Program", and continue when prompted
* If a prompt about "Use Open Archives" appears, click continue. It may be covered by a dialog titled "Parsing C Files". If so, move the "Parsing C files" dialog out of the way.
* If parsing is unsuccessful, the pre-processed output will appear in your system home directory in a file named CParserPlugin.out
* If parsing succeeds, dismiss the Parse C Source dialog.

In the types manager window, right click on your program, and choose "Apply Function Data Types"

If you update the header files, re-run File -> Parse C Source, select the parse configuration
created earlier, and re-run "Apply Function Data Types".
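
The parse options from the steps above can be worked out per camera with a few lines of Python. This is only a sketch for generating the lines to paste into the Parse C Source dialog. The function name build_parse_options and its parameters are made up for illustration and are not part of the CHDK scripts or of Ghidra.

```python
def build_parse_options(dryos_rel=None, three_arg_assert=False):
    """Return the parse option lines for one camera configuration.

    dryos_rel: the PLATFORMOSVER value from makefile.inc, or None for a
               camera that does not use DryOS.
    three_arg_assert: True if platform_camera.h indicates 3 argument DebugAssert.
    """
    # The one entry the README says to keep.
    options = ["-D__builtin_va_list=void *"]
    if dryos_rel is not None:
        options.append("-DCAM_DRYOS_REL=%d" % dryos_rel)
    if three_arg_assert:
        options.append("-DCAM_3ARG_DebugAssert=1")
    return options


if __name__ == "__main__":
    # Example: a DryOS 31 camera with the 2 argument DebugAssert.
    print("\n".join(build_parse_options(dryos_rel=31)))
```

Keeping one saved parse configuration per distinct combination (for example chdk-dryos31) matches the naming suggestion in the README.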
« Last Edit: 18 / February / 2021, 11:14:31 by reyalp »

*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #34 on: 06 / April / 2021, 02:47:32 »
In trunk r5812, I added a script CommentLeventCalls.py to comment calls to "logical event" functions with the names from levent_table.

edit:
In r5814, I added ListLeventCalls.py, which lists calls referencing events, specified by name or ID.

edit:
A few more scripts:
CleanEmptyFuncs.py - Ghidra auto-analysis seems to sometimes create zero-length functions. This script removes them, and re-creates them if they are in valid code. This solves many cases where the comment / list calls scripts say "not in a function" when it looks like a function is present.

LabelsToFuncs.py - create functions from labels if they look like valid function starts. It might get a few cases wrong (particularly where the original Canon code was doing weird things in ASM) but seems to give good results in the vast majority of cases.

CommentMzrmCreateCalls.py - for thumb2 firmware, add comments with the names of mzrm messages, using the list generated by the tools in https://chdk.setepontos.com/index.php?topic=11316.msg129104#msg129104

ListMzrmCreateCalls.py - as above, but list address where specific messages are created

NameMzrmFunctions.py - As above, but name the calling function with the name of the message, if it only creates one.

There's getting to be enough of these that I'm starting to want one master script to run them all in order. FWIW, my recommended sequence for a new dump is
InitCHDKMemMap.py
ImportCHDKStubs.py (Entry points only)
... auto analyze ...
... load and apply function data types ...
CleanThumb2BookmarkErrors.py (thumb2 only)
CleanFuncBookmarks.py
CleanEmptyFuncs.py
ImportCHDKStubs.py (load and disassemble)
LabelsToFuncs.py
CommentLeventCalls.py
CommentPropCalls.py
CommentMzrmCreateCalls.py (thumb2 only)
NameMzrmFunctions.py (thumb2 only)

Except for InitCHDKMemMap.py, they can safely be run repeatedly, for example if new stubs have been added, or more of the firmware has been analyzed.
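
As a rough idea of what the planning part of a "master script" could look like, here is a sketch in plain Python that only builds the ordered step list. It does not call Ghidra at all. The names SEQUENCE and plan, and the first_run flag, are invented for this example. The steps and their order are taken from the list above, and skipping InitCHDKMemMap.py on later runs follows the note that it is the one script that should not be re-run.

```python
# (step, thumb2_only, manual) in the recommended order from the post.
SEQUENCE = [
    ("InitCHDKMemMap.py", False, False),
    ("ImportCHDKStubs.py (Entry points only)", False, False),
    ("auto analyze", False, True),
    ("load and apply function data types", False, True),
    ("CleanThumb2BookmarkErrors.py", True, False),
    ("CleanFuncBookmarks.py", False, False),
    ("CleanEmptyFuncs.py", False, False),
    ("ImportCHDKStubs.py (load and disassemble)", False, False),
    ("LabelsToFuncs.py", False, False),
    ("CommentLeventCalls.py", False, False),
    ("CommentPropCalls.py", False, False),
    ("CommentMzrmCreateCalls.py", True, False),
    ("NameMzrmFunctions.py", True, False),
]


def plan(thumb2, first_run=True):
    """Return the steps to perform, in order, for one firmware dump."""
    steps = []
    for name, thumb2_only, manual in SEQUENCE:
        if thumb2_only and not thumb2:
            continue  # skip thumb2-only scripts on other firmware
        if name == "InitCHDKMemMap.py" and not first_run:
            continue  # the one script that is not safe to repeat
        steps.append(("manual: " if manual else "script: ") + name)
    return steps


if __name__ == "__main__":
    for step in plan(thumb2=True):
        print(step)
```

The manual entries are kept in the list on purpose, since auto analysis and applying data types have to happen between the scripted steps.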
« Last Edit: 18 / April / 2021, 15:15:00 by reyalp »

*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #35 on: 29 / May / 2021, 01:45:01 »
Ghidra 9.2.4 was released at the end of April. It appears to work OK with the current scripts. I don't see any obvious CHDK relevant changes.

I added a script called ListCallsWithArgs.py a while back (r5878). It searches for calls to a function with specified r0 through r4 arguments. The function can be a name or address. The arguments are given as a list of numbers or - to ignore. So to find all the asserts referring to line 123 (on a 3 arg assert cam), you could do
DebugAssert
- - 123

If multiple values are given, they must all match.

Note that it does not currently handle veneers/thunks, so you'd have to search them individually.
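
To make the matching rule concrete, here is a small standalone sketch of how a pattern like "- - 123" can be compared against the argument values found at a call site. It is not the code from ListCallsWithArgs.py. The function names are hypothetical, and accepting hex input via int(tok, 0) is an assumption of this example.

```python
def parse_pattern(text):
    """Turn '- - 123' into [None, None, 123]; '-' means ignore that argument."""
    return [None if tok == "-" else int(tok, 0) for tok in text.split()]


def call_matches(pattern, arg_values):
    """True if every specified argument matches the value known at the call.

    arg_values holds the values for r0, r1, ... at one call site, with None
    for registers whose value could not be determined.
    """
    for i, want in enumerate(pattern):
        if want is None:
            continue  # '-' in the pattern: this register is not checked
        have = arg_values[i] if i < len(arg_values) else None
        if have != want:
            return False  # all given values must match
    return True


if __name__ == "__main__":
    pattern = parse_pattern("- - 123")
    # Made-up call site values for r0, r1, r2.
    print(call_matches(pattern, [0x1000, 0, 123]))
```

A call where the checked register could not be resolved counts as a non-match in this sketch, which is a design choice of the example rather than a statement about the real script.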


*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #36 on: 26 / June / 2021, 02:23:43 »
Ghidra 10.0 was released recently. Testing CHDK scripts on fresh loads of digic 2, 4, 6 and 7 dumps worked and gave similar results to 9.2.4.

Most of the major changes don't seem relevant to us, but it does seem to analyze a bit faster, and fixes at least one ARM issue related to thumb instructions.

*

Offline reyalp

Re: Ghidra reverse engineering tool
« Reply #37 on: 07 / October / 2024, 00:56:25 »
I played around a bit with the Ghidra "emulator" tool in Ghidra 11.2 (note the debugger / emulator features are under active development, so what's described here may differ in other versions).

The emulator lets you execute code using Ghidra's instruction analysis infrastructure, rather than an external debugger + emulator like qemu (it's actually a special case of the "debugger" functionality which connects to an external debugger like GDB). It obviously is not suitable for emulating the whole firmware, or functions that rely extensively on variable state in RAM or interaction with hardware, but seems potentially quite useful for cases where a function does convoluted bit fiddling or lookups into tables in ROM or where there's complicated math you don't want to figure out by hand.

As a proof of concept, I used it to decode a DISKBOOT.BIN for elph180 (firmware 100c), as described below.

Open the already analyzed elph180 firmware in the "emulator" tool (three gears icon in the toolchest).
The emulator has a bunch of new windows which aren't present in the regular "code browser" tool, and by default doesn't display some (like decompile and strings) which are on by default in the code browser. You can use the window menu to display them if you want.

To get the file to be decoded into memory, I used File -> Add to Program. This adds it as a regular memory region just like if you did it in the code browser. It seems like there should be a way to just override uninitialized RAM for the duration of the emulator session, but I didn't find one.

Since it's a regular memory region, the address must not conflict with existing blocks. I chose 0x0a000000, which is outside RAM on this camera. Note the emulator defaults to putting the stack at 0x08000000.
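
Picking a load address can be sanity checked with a simple range overlap test. The sketch below is plain Python, not a Ghidra script. The function name overlaps is made up, and the block list uses placeholder ranges (including a guessed stack size), not the real elph180 memory map. Substitute the ranges shown in your own Memory Map window.

```python
def overlaps(start, length, blocks):
    """Return the names of existing blocks that a new region would overlap.

    blocks is a list of (name, start, end) with end exclusive.
    """
    end = start + length
    return [name for name, b_start, b_end in blocks
            if start < b_end and b_start < end]


# PLACEHOLDER ranges for illustration only, not taken from any real camera.
EXAMPLE_BLOCKS = [
    ("ROM (placeholder)", 0xff000000, 0x100000000),
    ("emulator stack (placeholder size)", 0x08000000, 0x08004000),
]

if __name__ == "__main__":
    # A 128 KiB file loaded at the address chosen in the post.
    print(overlaps(0x0a000000, 0x20000, EXAMPLE_BLOCKS))
```

An empty result means the candidate region is clear of every block in the list.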

The function that decodes DISKBOOT.BIN is FUN_ff89b0b8, called from ff836f34. I navigated the cursor to ff836f34 in the static listing window (the lower of the two in the default layout) and clicked the "emulate" button in the toolbar (just to the right of the "debug" bug icon). This initializes emulation, but doesn't start running.

To stop once the function is called, I clicked to the next instruction and pressed K to set a breakpoint, accepting the default values.

The parameters to FUN_ff89b0b8 are buffer, size. The file is decoded in place, so there is only one address. To set these, first click on the pencil icon at the top of the registers window, which allows setting register values. Then double click on the value field of r0 to set the address to 0x0a000000 and r1 to set the size to the size of the loaded file.

You can single step emulation using the toolbar buttons, but in this case I just want to run until the breakpoint, so I clicked the green Run arrow.

The emulator is very slow: it took something like ten minutes to decode the diskboot. You can interrupt it with the pause button on the toolbar or cancel on the "emulator running" dialog, and resume with the run button.

After the emulator completed, I selected the memory region where the diskboot was loaded by double clicking on the first column in the "regions" window. This selects the whole memory region. Note this is the emulator-specific "regions" window, not the regular memory map, which looks like it would select the unmodified values.

To get the data out, I used File -> Export Program, with format "raw bytes" and "selection only" checked. Note if "selection only" is greyed out, make sure focus is in the top (dynamic) listing window with the selection.

The resulting binary file was identical to the loader/ixus175_elph180/main.bin which the diskboot was generated from, except that it's 5 bytes longer with garbage at the end. One byte is due to a null being added to the start in the diskboot process, and I believe the remaining 4 are because diskboot is encoded in blocks of 8 bytes.
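
The size difference can be checked with a little arithmetic. The sketch below assumes the explanation given in the post is right (one extra byte, then padding up to a multiple of 8) and is not derived from the actual diskboot encoder. Both function names are invented for this example. Under that assumption, 5 extra bytes would appear exactly when the length of main.bin is 3 modulo 8.

```python
def encoded_length(payload_len, prefix_len=1, block_size=8):
    """Length after adding prefix bytes and rounding up to whole blocks."""
    total = payload_len + prefix_len
    return -(-total // block_size) * block_size  # ceiling division


def trailing_extra(decoded, original):
    """Number of extra trailing bytes if decoded starts with original, else None."""
    if decoded[:len(original)] != original:
        return None  # contents differ, not just trailing garbage
    return len(decoded) - len(original)


if __name__ == "__main__":
    # Hypothetical lengths: any payload with length % 8 == 3 gives 5 extra bytes.
    for n in (3, 11, 19):
        print(n, encoded_length(n) - n)
```

trailing_extra can be run on the exported bytes and main.bin (read in binary mode) to confirm that only the tail differs.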

Other notes:
The top "dynamic" listing window is supposed to show you disassembly, but for arm5t, it appears to always disassemble in thumb mode, regardless of the actual state.

The documentation implies that you should be able to make a "trace" which contains the program state over time, but I didn't see how to do this with emulation. Perhaps it's only available with actual debuggers. The process described above does create a trace, but it appears to only have the initial state.
« Last Edit: 07 / October / 2024, 00:59:06 by reyalp »

 
