Ghidra reverse engineering tool - General Discussion and Assistance - CHDK Forum

Ghidra reverse engineering tool

reyalp
Ghidra reverse engineering tool
« on: 06 / March / 2019, 00:49:02 »
The NSA has open-sourced a reverse engineering tool: https://www.ghidra-sre.org/

Use at your own risk, they pinky swear there's no back door :haha

It looks quite capable and supports a bunch of processors, including ARM of course, though sadly appears not to include Xtensa.

I was able to load a digic 6 dump with minimal fuss.
Don't forget what the H stands for.

Ghidra disassembler
« Reply #1 on: 10 / March / 2019, 15:00:08 »
Wondering if anyone is familiar with Ghidra, a recently open-sourced disassembly tool, that might complement some of the other tools we use?

reyalp
Re: Ghidra disassembler
« Reply #2 on: 10 / March / 2019, 17:12:56 »
edit 12/30/2019 - Current information is in the Wiki https://chdk.fandom.com/wiki/Firmware_analysis_with_Ghidra

----

Wondering if anyone is familiar with Ghidra, a recently open-sourced disassembly tool, that might complement some of the other tools we use?
Merged topics.

I tried it briefly on Windows. It's straightforward to set up and looks quite good. The Java-based UI is ugly as sin. Analyzing a full firmware takes a long time.

Some scripts / extensions would probably be helpful, for example to name functions based on funcs_by_*.csv. I haven't explored this aspect, but it's supposed to be quite extensible. see below

Install / setup:
* Download and unzip file from https://www.ghidra-sre.org/
* Download and unzip current JDK https://jdk.java.net/11/
* Run ghidraRun.bat - it will prompt for the JDK location if needed

Loading a dump
* New project, not shared (shared could be interesting...)
** Note: You need a project, but you can have many firmwares in a project, allowing use of the version control tool and having them open in the code viewer at the same time.
* Pick a directory. It will use significant disk space (e.g. ~500 MB after initial analysis)
* With the project selected, choose File -> Import File and select PRIMARY.BIN. I'm using sx710 here. Creating a folder in the project tree is a good idea if loading multiple firmwares. You can also name the "program" something other than PRIMARY.BIN
* Format - raw binary
* Language ARM v7t 32 bit little endian default (Digic 6 is v7, earlier should be v5t)
* Options - Block name ROM, Base address = ROMBASEADDR (0xfc000000), File offset 0
* Double click on PRIMARY.BIN to open it in the default tool (code browser)
* It will prompt you to analyze. I clicked NO because I want to add additional copied code first.
* File -> Add to Program. PRIMARY.BIN again
* Options - Name RAMCODE, values from stubs entry: Base 0x010e1000, offset 0xd4742c (copied from adr - base), length 158672 (dec!)
* File -> Add to Program. PRIMARY.BIN again
* Options - Name BTCMCODE, values from stubs entry: Base 0xbfe10800, offset 0xd6dffc (copied from adr - base), length 27674 (dec, rounded up)
* File -> Add to Program. PRIMARY.BIN again
* Options - Name RAMDATA, values from stubs entry: Base 0x8000, offset 0xd1e5d4 (copied from adr - base), length 167512 (dec) - This helps with seeing references to initialized RAM variables
* Tools - Window, memory map, uncheck X on RAMDATA
* Save
* Analysis - auto-analyze. I left the options at default, but just disassembling rather than decompiling might be a better initial choice.
* Go get a $beverage (like IDA, you can do stuff while it's analyzing, and it seems to prioritize what you have in view)
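
The offset arithmetic in the "Add to Program" steps above can be sketched in a few lines of Python. This is only an illustration: the base, offset and length values are the sx710 numbers quoted above, while the "copied from" ROM addresses are reconstructed here as ROM base + offset, and the function and variable names are made up for the example.

```python
# Sketch: derive the "Add to Program" file offsets from stubs-entry style values.
# ROM_BASE and the block values are the sx710 numbers quoted in the post.
# The "copied from" addresses are reconstructed (ROM base + offset), not
# taken from an actual stubs file.

ROM_BASE = 0xFC000000

# (block name, RAM base address, ROM address the block is copied from, length in bytes)
BLOCKS = [
    ("RAMCODE",  0x010E1000, 0xFCD4742C, 158672),
    ("BTCMCODE", 0xBFE10800, 0xFCD6DFFC, 27674),
    ("RAMDATA",  0x00008000, 0xFCD1E5D4, 167512),
]


def file_offset(copied_from, rom_base=ROM_BASE):
    """Offset into PRIMARY.BIN = ROM address the block is copied from - ROM base."""
    if copied_from < rom_base:
        raise ValueError("address is below the ROM base")
    return copied_from - rom_base


def describe(blocks):
    """Return (name, base, file offset, length) rows, with hex strings for entry in the dialog."""
    rows = []
    for name, base, copied_from, length in blocks:
        rows.append((name, hex(base), hex(file_offset(copied_from)), length))
    return rows


if __name__ == "__main__":
    for name, base, offset, length in describe(BLOCKS):
        print("%-8s base=%s offset=%s length=%d" % (name, base, offset, length))
```

The length stays in decimal because that is how the values are quoted above; only the base and file offset are shown in hex.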

edit 12 06 2019
As of Ghidra 9.1, you can add file initialized data that uses the original file directly in the memory map, rather than using add to program again.

You can also split memory blocks which have initialized data. This allows you to split the executable and non-executable portions of the ROM code, which significantly speeds up analysis and reduces bad disassembly. The end of main ROM code can be found from the DryOS version string associated with that ROM (referenced from early in the code), but beware there are multiple DryOS instances in the ROM.
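
A rough way to list candidate version strings outside of Ghidra is to search the raw dump for them. The sketch below assumes the string starts with the literal text "DRYOS version", which is an assumption and not something stated above, and it uses a small fake byte string in place of a real PRIMARY.BIN. The helper name is made up.

```python
def find_all(data, needle, base=0):
    """Return the address (base + file offset) of every occurrence of needle in data."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(needle, pos + 1)
    return hits


if __name__ == "__main__":
    ROM_BASE = 0xFC000000  # sx710 value from the post
    # Stand-in for the contents of PRIMARY.BIN; a real run would read the file
    # in binary mode instead. The marker text is an assumed prefix.
    fake_dump = b"\x00" * 16 + b"DRYOS version A" + b"\x00" * 8 + b"DRYOS version B"
    for addr in find_all(fake_dump, b"DRYOS version", ROM_BASE):
        print(hex(addr))
```

More than one hit is expected given the multiple DryOS instances, so each address still has to be checked against the code that references it.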



One particularly interesting, potentially CHDK relevant thing in the docs is the "version tracking" section:

Quote
Version Tracking refers to the process used by reverse engineers to identify matching code or data between different software binaries. One common use case is to version track two different versions of the same binary. Alternatively, version tracking techniques can be used to check for the presence of a particular piece of code within a given binary of interest.

edit:
Screenshot
« Last Edit: 30 / December / 2019, 18:49:10 by reyalp »

reyalp
Re: Ghidra reverse engineering tool
« Reply #3 on: 10 / March / 2019, 18:16:03 »
Here's a ghidra python script to import funcs_by_address.csv.

To add:
Download somewhere.
Window -> Script manager.
Script directories (right click or bullet list icon)
Either copy the script into one of the defined directories, or add wherever you want to keep your scripts
Script should be recognized. You can use the refresh button if you add more scripts
Right click, run. Select the funcs_by_address.csv for your port
After it finishes, the functions from the CSV will be named in Ghidra

Note this is very basic, and my python and Ghidra knowledge is very limited...

edit:
Oops, first version didn't handle the thumb bit in the csv correctly. This one should. There's a commented line that will remove the bad ones if you already ran the old version
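
The attached script is not reproduced here, but the thumb bit handling can be sketched stand-alone. This is not the actual script: it assumes each CSV row is "hex address,name" (the column order is a guess) and that a set low bit marks a Thumb function. The function names in the sample are made up. Written for Python 3; inside Ghidra the cleaned address would still need to go through toAddr before being used with the script API.

```python
import csv
import io


def parse_funcs_csv(text):
    """Parse rows of 'address,name' and split off the Thumb bit.

    Returns a list of (name, address, is_thumb) tuples, where address has
    bit 0 cleared so it points at the real first instruction.
    The column layout is an assumption for this sketch.
    """
    funcs = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        try:
            raw = int(row[0].strip(), 16)
        except ValueError:
            continue  # skip a header line or a non-hex address
        is_thumb = bool(raw & 1)
        funcs.append((row[1].strip(), raw & ~1, is_thumb))
    return funcs


if __name__ == "__main__":
    # Hypothetical sample rows, not taken from a real port
    sample = "0xfc020001,thumb_func\n0x010e1000,arm_func\n"
    for name, addr, is_thumb in parse_funcs_csv(sample):
        print("%s 0x%08x %s" % (name, addr, "thumb" if is_thumb else "arm"))
```

Naming a function at the raw odd address is the kind of mistake the first version of the script made; clearing bit 0 and remembering the mode separately avoids it.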

edit:
Some other useful things for scripting:
Window->python gives you an interactive python interpreter where you can use the script API.
Help->Ghidra API help gives you an API reference (though for Java)
« Last Edit: 10 / March / 2019, 20:06:25 by reyalp »


srsa_4c
Re: Ghidra reverse engineering tool
« Reply #4 on: 11 / March / 2019, 14:55:20 »
Here's a ghidra python script to import funcs_by_address.csv.
Thanks for making that script.

Some pros/cons I found while using ghidra:
+ The decompiler is very useful
+ Function boundaries are recognized
- Could not find a way to customize the label prefix
- If I define a variable that consists of 2 registers, the disassembly becomes bogus (those registers are no longer displayed separately)
- LDR shows the constant's location even though I rarely care about it (the value is displayed on the same line but too far off)

... but the overall impression is positive.

reyalp
Re: Ghidra reverse engineering tool
« Reply #5 on: 20 / April / 2019, 16:02:46 »
FWIW, the source is now available on github: https://github.com/NationalSecurityAgency/ghidra and some updates have been released https://www.ghidra-sre.org/releaseNotes_9.0.2.html

philmoz
Re: Ghidra reverse engineering tool
« Reply #6 on: 06 / May / 2019, 00:41:25 »
The more I use this the more I like it - much better than IDA.


Anyone else notice that Ghidra identifies and displays JPEG images embedded in the firmware :)

CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

reyalp
Re: Ghidra reverse engineering tool
« Reply #7 on: 18 / May / 2019, 20:06:36 »
- If I define a variable that consists of 2 registers, the disassembly becomes bogus (those registers are no longer displayed separately)
Similar to this, if you define a function prototype (edit function, add parameters), by default ALL references to the register are shown with parameter name, which is unhelpful.

It turns out you can turn this off, as described in https://github.com/NationalSecurityAgency/ghidra/issues/309
Unchecking "Markup register variable references" under edit->tool options->listing options->operand markup switches it back to the register name, while the de-compiled view still uses the parameter name.

One thing that would be very useful is a way to set the prototypes of all known functions. This can apparently be done with file->parse C source, but our includes (e.g. lolevel.h) don't map cleanly to stub names, and we often just throw the prototype in the place the function is used.

Anyone else notice that Ghidra identifies and displays JPEG images embedded in the firmware :)
I also noticed it incorrectly identified some code as WAV data  :haha


reyalp
Re: Ghidra reverse engineering tool
« Reply #8 on: 19 / May / 2019, 20:05:07 »
Here's an updated version of stubs import script, which attempts to set the thumb context register and disassemble any functions that weren't already disassembled. I suspect this isn't really the "right" way to do it, but it seems to work well enough.

Running it on a project that it was already run on should be fine. If you've already named functions differently from what appears in the CSV, you may end up with multiple labels at the start of a given function (which is not a problem, AFAIK)

Some other random notes:
* Double clicking an address or name in the script console jumps to it
* When using Python (including in the console through Window->Python) you need to use toAddr for most functions that expect an address, e.g. goTo(toAddr(0xffc00000))
* getSymbol allows you to query named functions etc. Use None for the global namespace, like goTo(getSymbol('free',None)) to jump the code view to free
* The main analysis functions are in FlatProgramAPI (which GhidraScript inherits from). Some other useful stuff is in Program and ProgramContext
* The ghidra_scripts subdirectories in the install tree can provide useful hints, e.g. looking at ghidra_9.0.2/Ghidra/Features/Base/ghidra_scripts/DoThumbDisassemble.java helped me figure out how to set the thumb bit for the script.
« Last Edit: 11 / December / 2019, 18:28:37 by reyalp »

reyalp
Re: Ghidra reverse engineering tool
« Reply #9 on: 29 / May / 2019, 03:38:13 »
I played with the version tracking a little bit.
To use it, load 2 or more firmware dumps into a single project. The firmwares should be analyzed normally, have stub script run etc. before starting the compare. I used A410 and A540 in my test.

To compare versions, select the version tracking tool from the project, start new session. It will ask you for the programs to compare.

Click next through the wizard interface. You will end up with the version tracking screen, and a code disassembly viewer for each firmware.

The magic wand icon on the version tracking tool "Runs several correlators and adds good matches" (this may not be the best option). It took a long time (I canceled part way through), but seems to find a lot of matches. If you click on a function in one disassembly view, it will show you the matches that apply from the other, like "exact instruction match"

There are a lot of "duplicate function" matches for simple functions that just return a variable etc.

I definitely need to spend some time RTFMing, but it seems like it could be very useful.
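
To illustrate why trivial functions produce "duplicate function" matches, here is a toy comparison that pairs functions whose bytes are identical. It is not how Ghidra's correlators are implemented, just a hash-of-bytes stand-in, and the function names and the tiny ARM snippets are made up for the example.

```python
import hashlib
from collections import defaultdict


def group_by_bytes(functions):
    """Map a hash of each function's bytes to the names sharing those bytes."""
    groups = defaultdict(list)
    for name, code in functions.items():
        groups[hashlib.sha1(code).hexdigest()].append(name)
    return groups


def exact_matches(src, dst):
    """Pair functions with identical bytes; report ambiguous pairings separately."""
    src_groups, dst_groups = group_by_bytes(src), group_by_bytes(dst)
    unique, duplicate = [], []
    for digest, src_names in src_groups.items():
        dst_names = dst_groups.get(digest, [])
        if not dst_names:
            continue  # no function with these bytes in the other program
        if len(src_names) == 1 and len(dst_names) == 1:
            unique.append((src_names[0], dst_names[0]))
        else:
            duplicate.append((sorted(src_names), sorted(dst_names)))
    return unique, duplicate


if __name__ == "__main__":
    RET0 = b"\x00\x00\xa0\xe3\x1e\xff\x2f\xe1"  # mov r0,#0 ; bx lr
    RET1 = b"\x01\x00\xa0\xe3\x1e\xff\x2f\xe1"  # mov r0,#1 ; bx lr
    fw_a = {"get_zero_a": RET0, "get_zero_b": RET0, "get_one": RET1}
    fw_b = {"sub_1": RET0, "sub_2": RET0, "sub_3": RET1}
    unique, duplicate = exact_matches(fw_a, fw_b)
    print("unique:", unique)
    print("duplicate:", duplicate)
```

Two identical "return 0" stubs on each side cannot be paired unambiguously from their bytes alone, which is the situation the duplicate matches reflect.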