This uses Ghidra’s Version Tracking to diff a libpng update and extract the patch changes from it. While it works, it only works because the diffed library had symbols. Ghidra lacks a correlator that matches functions which are similar but contain changes. It is only good at matching near-identical functions, i.e., functions that did not change with the update.

Thus, I currently wouldn’t recommend Ghidra for patch diffing.

Update 1: Now that there is PatchDiffCorrelator I fully recommend Ghidra for patch diffing!


Preparation

In order to diff a patch you need a pre-patch and a post-patch version of some program.

I decided to go for CVE-2015-8126 in libpng. More specifically:

The changelog of libpng-1.5.13-7.el7_2.i686.rpm indicates that it includes a fix for CVE-2015-8126:

Changelog by Petr Hracek (2015-11-28):
- Security fix for CVE-2015-8126
- Changing png_ptr to info_ptf based on upstream
- Related: #1283576

In this article I will use Ghidra’s Version Tracking to extract a diff of the changes introduced by that patch.


First, I load the two libpng libraries into the same Ghidra project:

Ghidra project with the two binaries to be compared loaded.

Before you can do anything you need to analyze the two programs. Auto-analysis is, however, enough.

Once you have analyzed the programs you can start the Version Tracking Tool.

Version Tracking

When starting a Version Tracking Session Ghidra guides you through the process via the Version Tracking Wizard.

First you select the two programs:

Version Tracking Wizard: New Version Tracking Session

Preconditions

In order to ensure quality Version Tracking results Ghidra’s Version Tracking Wizard allows you to check preconditions before starting the actual correlation process.

These precondition checks ensure that the binaries to be compared have been sufficiently analyzed.

Version Tracking Wizard: Precondition Checklist

After the precondition checks the Version Tracking Session can be created:

Version Tracking Wizard: Summary

This will pop up two additional tools, the Source Tool and the Destination Tool, which provide the same functionality as the Code Browser Tool but add linking. When you jump to a function in the Source Tool, the Destination Tool jumps to the same function, provided a matching function was found in the Destination Program.

In order to generate those matches we must run the correlation process.

Correlators

Ghidra uses Correlators to check for similar or exact matching functions and data between the two programs.

Correlators can be individually selected:

Version Tracking Wizard: Select Correlation Algorithm(s)

Selection

If your binary has symbols, it is advisable to turn off the Similar Symbol Name Match correlator. Otherwise png_get_uint32 and png_get_int32 would produce a match because their symbol names are similar. (Matches produced by a correlator can be filtered out afterwards, so running the correlator doesn’t ruin the session.)
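To illustrate why a name-similarity heuristic pairs these two functions, here is a minimal sketch using Python's difflib. This is not Ghidra's Similar Symbol Name algorithm, only a generic string-similarity measure standing in for it; the helper name name_similarity is made up for this example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two symbol names.

    Generic string similarity, used here only as a stand-in for whatever
    Ghidra's Similar Symbol Name Match correlator actually computes.
    """
    return SequenceMatcher(None, a, b).ratio()


# Two different functions whose names differ by a single character.
similar = name_similarity("png_get_uint32", "png_get_int32")

# Two functions with clearly different names, for comparison.
unrelated = name_similarity("png_get_uint32", "png_set_PLTE")

print(f"png_get_uint32 vs png_get_int32: {similar:.3f}")
print(f"png_get_uint32 vs png_set_PLTE:  {unrelated:.3f}")
```

Any threshold loose enough to be useful on stripped or renamed symbols would likely pair the first two names, which is the kind of false positive described above.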

Options

The individual options of the correlators can be adjusted:

Version Tracking Wizard: Correlator Options with Merge Version Tracking Session processing window

It doesn’t really seem to matter what options are changed.

Update 1: This is not true. They actually do make a difference. However, because the analyzed binary had symbols and matched up quite well, meaningful correlator settings weren’t needed.

Correlation runtime

The correlators can take quite some time to process. For a firefox.exe this took several hours. But this is typical for binary diffing.

Automatic Version Tracking

If you do not want to bother with manually selecting the correlators and tweaking their settings you can run Automatic Version Tracking:

Automatic Version Tracking button

According to the manual this

“uses various correlators in a predetermined order to automatically create matches, accept the most likely matches, and apply markup all with one button press.” - Ghidra Help

Automatic Version Tracking is very well suited for quickly porting your analysis markup over to a new version of a binary and then analyzing only the functions that did not match.

Update 1: Automatic Version Tracking is also the recommended way to produce matches for patch diffing. However, it may take longer than running a few selected correlators yourself.

Finding changed functions

You find the Version Tracking Matches in the Version Tracking Matches Window.

Because none of the correlators seem to find partially matching functions - they either match perfectly with a 1.000 Score or are not listed in the Matches table at all - we must use a different way to find functions that have changed.

Apply perfectly matching functions

You can use the Match Table Filters to only list the following algorithms:

Match Table Filters (only Exact Function Bytes, Instruction and Mnemonic Matches)

Next, set the Score Filter to 1.000 to 1.000 to only include the perfect matches. (In my case there were no matches below 1.000.)

Then you can select all entries in the Version Tracking Matches Window and Accept the matches. They will now have a little flag in the Status column:

Accepted Matches

Manually assess changed functions

Now we can enable all match algorithms except Similar Symbol Name Match (as it causes too many false positives) and filter out the already Accepted Matches (and the other matches that those Accepted Matches have Blocked) via:

Match Table Filters (all except Similar Symbol Name Match)

Then you can go through the remaining Function Matches in the Matches Table and assess the differences.
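The two filtering steps above can be sketched outside of Ghidra as plain data manipulation. This is only a model of the triage logic, not Ghidra's API: the Match class, both helper functions, and the sample scores are hypothetical, and the algorithm names are labels chosen for this illustration.

```python
from dataclasses import dataclass

# Labels for the exact function correlators (assumed spelling, illustration only).
EXACT_FUNCTION_ALGORITHMS = {
    "Exact Function Bytes Match",
    "Exact Function Instructions Match",
    "Exact Function Mnemonics Match",
}
# Correlators excluded from manual review because of false positives.
NOISY_ALGORITHMS = {"Similar Symbol Name Match"}


@dataclass
class Match:
    """Hypothetical stand-in for one row of the Matches table."""
    source: str
    destination: str
    algorithm: str
    score: float
    accepted: bool = False


def accept_perfect(matches):
    """Step 1: accept every exact function match with a score of 1.000."""
    for m in matches:
        if m.algorithm in EXACT_FUNCTION_ALGORITHMS and m.score == 1.0:
            m.accepted = True


def remaining_for_review(matches):
    """Step 2: drop accepted, blocked, and noisy matches; keep the rest."""
    taken_src = {m.source for m in matches if m.accepted}
    taken_dst = {m.destination for m in matches if m.accepted}
    return [
        m for m in matches
        if not m.accepted
        and m.algorithm not in NOISY_ALGORITHMS
        # A match is blocked when either side already has an accepted match.
        and m.source not in taken_src
        and m.destination not in taken_dst
    ]


# Made-up sample rows; the 0.9 score is invented for the example.
matches = [
    Match("png_get_uint32", "png_get_uint32", "Exact Function Bytes Match", 1.0),
    Match("png_get_uint32", "png_get_int32", "Similar Symbol Name Match", 0.9),
    Match("png_get_uint32", "png_get_uint32", "Exact Symbol Name Match", 1.0),
    Match("png_set_PLTE", "png_set_PLTE", "Exact Symbol Name Match", 1.0),
]

accept_perfect(matches)
review = remaining_for_review(matches)
print([m.source for m in review])
```

In this toy data set only png_set_PLTE is left for manual review, because it matched by name but not by any exact function correlator.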

Finding the patch

After using Version Tracking to remove all unchanged functions we are left with only 4 functions that we need to check:

4 changed functions

In the diff view of the Version Tracking Markup Items window you can quickly see differences indicated by turquoise color:

Differences indicated by turquoise color

Compilation differences

Looking at the same difference in the decompiler it is clear that this difference likely stems from the compiler not pulling the invariant param_2 + 0x14 out of the loop in the patched version:

Difference likely caused by compiler
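For readers unfamiliar with loop-invariant hoisting, the following sketch shows the idea in Python, standing in for the decompiled C. The function names and the loop body are made up; only the invariant expression base + 0x14 mirrors the param_2 + 0x14 seen in the decompiler. Both variants compute the same result, which is why such a difference is not a semantic change.

```python
def sum_hoisted(values, base):
    """Invariant computed once before the loop (hoisted)."""
    offset = base + 0x14
    total = 0
    for v in values:
        total += v + offset
    return total


def sum_not_hoisted(values, base):
    """Invariant recomputed on every iteration (not hoisted)."""
    total = 0
    for v in values:
        total += v + (base + 0x14)
    return total


data = [1, 2, 3]
# Same inputs, same result: only the generated code shape differs.
print(sum_hoisted(data, 0x100) == sum_not_hoisted(data, 0x100))
```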

The patch changes

Eventually, we find the patch in the png_set_PLTE function:

The patched function png_set_PLTE

… it’s just a simple check to ensure that param_4 is neither negative nor larger than some other value in what appears to be a structure pointed to by param_2.

Conclusion

While Ghidra’s Version Tracking allows you to find matching functions, its inability to find almost matching or merely similar functions makes it unusable for patch diffing.

I could only pull this off here because the library had symbols and thus I could match png_set_PLTE via an Exact Symbol Name Match. If this was not the case you would need to perform n * (n-1) / 2 manual comparisons for every n functions that you could not rule out via perfect matching as outlined in this article.
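To get a feel for how quickly that number grows, here is the formula evaluated for a few values of n. The helper name pairwise_comparisons is made up for this example, and itertools.combinations is used only to cross-check the closed form against an explicit enumeration of pairs.

```python
from itertools import combinations


def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n functions: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Cross-check the closed form by enumerating the pairs for a small n.
assert pairwise_comparisons(4) == len(list(combinations(range(4), 2)))

for n in (4, 100, 1000):
    print(n, pairwise_comparisons(n))
# 4 6
# 100 4950
# 1000 499500
```

With only a handful of leftover functions this is manageable, but it stops being practical once hundreds of functions remain unmatched.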

Future work

A very crude way to implement this would be to calculate hashes of the basic blocks of each function (with the immediate and displacement values masked out) and then use the number of matching basic blocks versus the total number of basic blocks of the two functions as an additional correlator.
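A minimal sketch of that idea in Python, working on plain text disassembly rather than on Ghidra's program model. Everything here is an assumption made for illustration: the function names, the regex used to mask numeric operands, the choice of SHA-1, the scoring formula (twice the number of shared block hashes divided by the total number of blocks in both functions), and the made-up x86 listings.

```python
import hashlib
import re
from collections import Counter

# Matches hex (0x14) and decimal (4) literals, but not register names like r8.
_NUMBER = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")


def mask_instruction(instruction: str) -> str:
    """Replace immediates and displacements with a placeholder."""
    return _NUMBER.sub("IMM", instruction)


def block_hash(block: list[str]) -> str:
    """Hash one basic block after masking its numeric operands."""
    text = "\n".join(mask_instruction(i) for i in block)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def similarity(func_a: list[list[str]], func_b: list[list[str]]) -> float:
    """Score in [0, 1]: shared basic blocks relative to all basic blocks."""
    total = len(func_a) + len(func_b)
    if total == 0:
        return 0.0
    hashes_a = Counter(block_hash(b) for b in func_a)
    hashes_b = Counter(block_hash(b) for b in func_b)
    shared = sum((hashes_a & hashes_b).values())  # multiset intersection
    return 2 * shared / total


# Made-up listings: the "new" version adds a bounds check, which splits a block.
old = [
    ["push ebp", "mov ebp, esp", "mov eax, [ebp+0x8]", "test eax, eax", "jz 0x1040"],
    ["mov ecx, [ebp+0x14]", "mov [eax+0x10], ecx"],
    ["pop ebp", "ret"],
]
new = [
    ["push ebp", "mov ebp, esp", "mov eax, [ebp+0x8]", "test eax, eax", "jz 0x2050"],
    ["mov ecx, [ebp+0x14]", "cmp ecx, 0x100", "ja 0x2060"],
    ["mov [eax+0x10], ecx"],
    ["pop ebp", "ret"],
]

score = similarity(old, new)
print(f"{score:.3f}")
```

Masking makes the first block match even though the jump target moved, so the two versions share two of their seven blocks in total. A score strictly between 0 and 1 is exactly the kind of partial match that the exact correlators cannot express.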

I may implement such a crude correlator, if only to familiarize myself with writing such correlators in preparation for improving Version Tracking more.

Eventually, I hope someone implements a proper graph-based or similarly capable function similarity correlator. Until then I wouldn’t recommend doing patch diffing with Ghidra… Unless of course I am missing something. In that case, please, tell me. Kthxbye.



Appendix: Video

A video of the described patch diffing workflow (or rather workaround) is at https://www.youtube.com/watch?v=w-9K20lILEk.