Ghidra reverse engineering tool - General Discussion and Assistance - CHDK Forum

Ghidra reverse engineering tool

*

Offline reyalp

  • ******
  • 11849
Ghidra reverse engineering tool
« on: 06 / March / 2019, 00:49:02 »
The NSA has open-sourced a reverse engineering tool: https://www.ghidra-sre.org/

Use at your own risk, they pinky swear there's no back door :haha

It looks quite capable and supports a bunch of processors, including ARM of course, though sadly appears not to include Xtensa.

I was able to load a digic 6 dump with minimal fuss.
Don't forget what the H stands for.

Ghidra disassembler
« Reply #1 on: 10 / March / 2019, 15:00:08 »
Wondering if anyone is familiar with Ghidra, a recently opensourced disassembly tool, that might complement some of the other tools we use?

*

Offline reyalp

  • ******
  • 11849
Re: Ghidra disassembler
« Reply #2 on: 10 / March / 2019, 17:12:56 »
Wondering if anyone is familiar with Ghidra, a recently opensourced disassembly tool, that might complement some of the other tools we use?
Merged topics.

I tried it briefly on Windows. It's straightforward to set up and looks quite good. The Java-based UI is ugly as sin. Analyzing a full firmware takes a long time.

Some scripts / extensions would probably be helpful, for example to name functions based on funcs_by_*.csv. I haven't explored this aspect, but it's supposed to be quite extensible. See below.

Install / setup:
* Download and unzip file from https://www.ghidra-sre.org/
* Download and unzip current JDK https://jdk.java.net/11/
* Run ghidraRun.bat -  it will prompt for JDK location if needed

Loading a dump
* New project, not shared (shared could be interesting...)
* Pick a directory. It will use significant disk space (e.g. ~500 MB after initial analysis)
* With the project selected, choose File -> Import File and select PRIMARY.BIN. I'm using sx710 here
* Format - raw binary
* Language ARM v7 32 bit little endian default (Digic 6 is v7, earlier should be v5)
* Options - Block name ROM, Base address = ROMBASEADDR (0xfc000000), File offset 0
* Double click on PRIMARY.BIN to open it in the default tool (code browser)
* It will prompt you to analyze. I clicked NO because I want to add additional copied code first.
* File -> Add to Program. PRIMARY.BIN again
* Options - Name RAMCODE, values from stubs entry: Base  0x010e1000, offset 0xd4742c (copied from adr - base), length 158672 (dec!)
* File -> Add to Program. PRIMARY.BIN again
* Options - Name BTCMCODE, values from stubs entry: Base  0xbfe10800, offset 0xd6dffc (copied from adr - base), length 27674 (dec, rounded up)
* File -> Add to Program. PRIMARY.BIN again
* Options - Name RAMDATA, values from stubs entry: Base  0x8000, offset 0xd1e5d4 (copied from adr - base), length 167512  (dec) (not sure this is useful just trying now)
* Tools - Window, memory map, uncheck X on RAMDATA
* Save
* Analysis - auto-analyze. I left the options at default, but just disassembling rather than decompiling might be a better initial choice.
* Go get a $beverage (like IDA, you can do stuff while it's analyzing, and it seems to prioritize what you have in view)
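
The "offset (copied from adr - base)" values in the steps above are just the ROM source address minus the ROM base. A minimal sketch of that arithmetic in plain Python (run outside Ghidra), assuming the sx710 values from this post. The ROM source addresses in the table are not quoted from the stubs file; they are back-computed here as base + offset purely for illustration, and the names ROM_BASE, BLOCKS and file_offset are made up for this example:

```python
# ROM base used for the ROM block in the steps above (sx710, Digic 6).
ROM_BASE = 0xFC000000


def file_offset(source_adr, rom_base=ROM_BASE):
    """Return the offset into PRIMARY.BIN of data stored at source_adr in ROM."""
    if source_adr < rom_base:
        raise ValueError("source address is below the ROM base")
    return source_adr - rom_base


# (block name, run-time base address, ROM source address, length in bytes)
# Source addresses are assumptions: back-computed as ROM_BASE + offset.
BLOCKS = [
    ("RAMCODE", 0x010E1000, 0xFCD4742C, 158672),
    ("BTCMCODE", 0xBFE10800, 0xFCD6DFFC, 27674),
    ("RAMDATA", 0x00008000, 0xFCD1E5D4, 167512),
]

for name, base, src, length in BLOCKS:
    # Ghidra's Add to Program dialog wants the length in decimal.
    print("%-8s base=0x%08x offset=0x%x length=%d"
          % (name, base, file_offset(src), length))
```

For another camera, substitute the base, source address and length from that port's stubs entry.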



One particularly interesting, potentially CHDK relevant thing in the docs is the "version tracking" section:

Quote
Version Tracking refers to the process used by reverse engineers to identify matching code or data between different software binaries. One common use case is to version track two different versions of the same binary. Alternatively, version tracking techniques can be used to check for the presence of a particular piece of code within a given binary of interest.

edit:
Screenshot
« Last Edit: Yesterday at 19:42:27 by reyalp »

*

Offline reyalp

  • ******
  • 11849
Re: Ghidra reverse engineering tool
« Reply #3 on: 10 / March / 2019, 18:16:03 »
Here's a ghidra python script to import funcs_by_address.csv.

To add:
Download somewhere.
Window -> Script manager.
Script directories (right click or bullet list icon)
Either copy the script into one of the defined directories, or add wherever you want to keep your scripts
Script should be recognized. You can use the refresh button if you add more scripts
Right click, run. Select the funcs_by_address.csv for your port
After it finishes, the functions from the CSV will be named in Ghidra

Note this is very basic, and my python and Ghidra knowledge is very limited...

edit:
Oops, first version didn't handle the thumb bit in the csv correctly. This one should. There's a commented line that will remove the bad ones if you already ran the old version
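
For anyone wondering what "handling the thumb bit" means here: on ARM, an address with bit 0 set denotes a Thumb function, and the code itself starts at the address with that bit cleared. Below is a rough sketch of that step in plain Python, not the attached script. It assumes each CSV row is `address,name` with a hex address, which may not match the real file exactly; parse_funcs and the sample names are hypothetical:

```python
import csv
import io


def parse_funcs(lines):
    """Yield (address, name, is_thumb) for rows of the assumed form 'address,name'."""
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # blank or malformed row
        try:
            raw = int(row[0].strip(), 16)
        except ValueError:
            continue  # header or comment line, not a hex address
        # Bit 0 marks Thumb code; the real entry point has it cleared.
        yield raw & ~1, row[1].strip(), bool(raw & 1)


# Made-up sample rows, for illustration only.
sample = io.StringIO(u"0xfc0204e5,ExampleThumbFunc\n0xfc020500,ExampleArmFunc\n")
funcs = list(parse_funcs(sample))
for adr, name, is_thumb in funcs:
    print("0x%08x %s %s" % (adr, "thumb" if is_thumb else "arm", name))
```

Inside Ghidra, the cleared address is what you would pass to toAddr before creating the label.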

edit:
Some other useful things for scripting:
Window->python gives you an interactive python interpreter where you can use the script API.
Help->Ghidra API help gives you an API reference (though for Java)
« Last Edit: 10 / March / 2019, 20:06:25 by reyalp »


*

Offline srsa_4c

  • ******
  • 3878
Re: Ghidra reverse engineering tool
« Reply #4 on: 11 / March / 2019, 14:55:20 »
Here's a ghidra python script to import funcs_by_address.csv.
Thanks for making that script.

Some pros/cons I found while using ghidra:
+ The decompiler is very useful
+ Function boundaries are recognized
- Could not find a way to customize the label prefix
- If I define a variable that consists of 2 registers, the disassembly becomes bogus (those registers are no longer displayed separately)
- LDR shows the constant's location even though I rarely care about it (the value is displayed on the same line but too far off)

... but the overall impression is positive.

*

Offline reyalp

  • ******
  • 11849
Re: Ghidra reverse engineering tool
« Reply #5 on: 20 / April / 2019, 16:02:46 »
FWIW, the source is now available on github: https://github.com/NationalSecurityAgency/ghidra and some updates have been released https://www.ghidra-sre.org/releaseNotes_9.0.2.html

*

Offline philmoz

  • *****
  • 3102
    • Photos
Re: Ghidra reverse engineering tool
« Reply #6 on: 06 / May / 2019, 00:41:25 »
The more I use this the more I like it - much better than IDA.


Anyone else notice that Ghidra identifies and displays JPEG images embedded in the firmware :)

CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)

*

Offline reyalp

  • ******
  • 11849
Re: Ghidra reverse engineering tool
« Reply #7 on: 18 / May / 2019, 20:06:36 »
- If I define a variable that consists of 2 registers, the disassembly becomes bogus (those registers are no longer displayed separately)
Similar to this, if you define a function prototype (edit function, add parameters), by default ALL references to the register are shown with parameter name, which is unhelpful.

It turns out you can turn this off, as described in https://github.com/NationalSecurityAgency/ghidra/issues/309
Unchecking "Markup register variable references" under edit->tool options->listing options->operand markup switches it back to the register name, while the de-compiled view still uses the parameter name.

One thing that would be very useful is a way to set the prototypes of all known functions. This can apparently be done with file->parse C source, but our includes (e.g. lolevel.h) don't map cleanly to stub names, and we often just throw the prototype in the place the function is used.

Anyone else notice that Ghidra identifies and displays JPEG images embedded in the firmware :)
I also noticed it incorrectly identified some code as WAV data  :haha


*

Offline reyalp

  • ******
  • 11849
Re: Ghidra reverse engineering tool
« Reply #8 on: Yesterday at 20:05:07 »
Here's an updated version of the stubs import script, which attempts to set the thumb context register and disassemble any functions that weren't already disassembled. I suspect this isn't really the "right" way to do it, but it seems to work well enough.

Running it on a project that it was already run on should be fine. If you've already named functions differently from what appears in the CSV, you may end up with multiple labels at the start of a given function (which is not a problem, AFAIK)

Some other random notes:
* Double clicking an address in the script console jumps to it
* When using Python (including in the console through Window->Python) you need to use toAddr for most functions that expect an address, e.g. goTo(toAddr(0xfc000000))
* getSymbol allows you to query named functions etc. Use None for the global namespace, like goTo(getSymbol('free',None)) to jump the code view to free
* The main analysis functions are in FlatProgramAPI (which GhidraScript inherits from). Some other useful stuff is in Program and ProgramContext
* The ghidra_scripts subdirectories in the install tree can provide useful hints, e.g. looking at ghidra_9.0.2/Ghidra/Features/Base/ghidra_scripts/DoThumbDisassemble.java helped me figure out how to set the thumb bit for the script.

 
