
prove binary deployment by recompiling and comparing

16 Feb 2011

motivation

‘binary deployment’ seems to be a good and fast solution nowadays (i’m talking about open source here). but what proof do i have that the source code was not modified before it was compiled and signed (say by downstream::debian)?

Note: you can replace debian by any other distribution doing ‘binary deployment’ (it is just an example).

how is binary deployment actually done

this is very much distribution dependent. in general this workflow is used:

  1. download upstream source
  2. arrange a build environment
  3. apply ‘downstream’ patches
  4. install into DESTDIR/PREFIX and create an image from that
  5. finally distribute that image
(1) can be secured by signatures using cryptographic hashes and a sig file. (2) is complicated, as a pure build environment can NOT be guaranteed by most distributions; a notable exception is nix, where the build chain and all packages are pure (pure means that installed components have no side effects on each other). (3) as downstream patches are usually very small, they can be checked manually for security-related issues.
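a minimal sketch of the check for (1), assuming upstream publishes a SHA-256 digest next to the tarball. the function name verify_source, the tarball name and the variable published_digest in the usage comment are made up for illustration. note that a matching digest only shows that the file is intact; that it really comes from upstream still needs the signature check, for example with gpg --verify.

```shell
#!/bin/sh
# verify_source FILE EXPECTED_SHA256
# returns success only if the SHA-256 digest of FILE equals EXPECTED_SHA256.
verify_source() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# usage (placeholder names, not taken from a real release):
#   verify_source some-upstream.tar.bz2 "$published_digest" && echo "source matches"
```

the digest has to come from a channel you trust more than the download mirror, otherwise the comparison proves nothing.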

security problems using binary deployment

downstream could simply add another ‘evil’ patch in step (3) and, once the package has been created, remove the source patch again to hide the modification. this has happened already, see [2]. if the user wants to prevent such a situation, there is only a limited set of options. he could:

.. another option

i’ve been playing with nix lately, and as nix is a ‘purely functional package manager’, the side effects of step (2) are minimized because components don’t interfere. as a result: if you clone the original build chain, you can expect the same outcome from the same input. so i experimented with two components:

the results are very promising as:

Edit: it turns out that there was some research on this topic already, see [3], page 30. I quote it and highlight some passages:

To ascertain how well these measures work in preventing impurities in NixOS, we performed two builds of the Nixpkgs collection on two different NixOS machines. This consisted of building 485 non-fetchurl derivations. The output consisted of 165927 files and directories. Of these, there was only one file name that differed between the two builds, namely in mono-1.1.4: a directory gac/IBM.Data.DB2/1.0.3008.37160 7c307b91aa13d208 versus 1.0.3008.40191 7c307b91aa13d208. The differing number is likely derived from the system time. We then compared the contents of each file. There were differences in 5059 files, or 3.4% of all regular files. We inspected the nature of the differences: almost all were caused by timestamps being encoded in files, such as in Unix object file archives or compiled Python code. 1048 compiled Emacs Lisp files differed because the hostname of the build machines were stored in the output. Filtering out these and other file types that are known to contain timestamps, we were left with 644 files, or 0.4%. However, most of these differences (mostly in executables and libraries) are likely to be due to timestamps as well (such as a build process inserting the build time in a C string). This hypothesis is strongly supported by the fact that of those, only 42 (or 0.03%) had different file sizes. None of these content differences have ever caused an observable difference in behaviour.

how did i do the checks

i used a prefix installation of nix on gentoo. i set the store path to something like ‘~/mynix/store’ so that every program has to be compiled from source (nix limitation/feature: prebuilt binaries are tied to the default store path). afterwards i did:

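# build and install apache from source
nix-env -i apache-httpd
# find the store path of the fresh build
ls store| grep apache-httpd
# keep a copy of the build result for the later comparison
cp -R store/gyp2arhqcglbq6iq1hndclljs7v9n30k-apache-httpd-2.2.17/ apache1
# uninstall and drop old generations so that the store path can be deleted
nix-env -e apache-httpd
nix-env --delete-generations old
nix-store --delete store/gyp2arhqcglbq6iq1hndclljs7v9n30k-apache-httpd-2.2.17/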

then i did the same again, but copied the result to apache2/ instead. after that, the comparison can start.
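the comparison of the two copies could look roughly like the sketch below. this is not the exact procedure i used, just one way to do it with standard tools; the function name compare_builds and the three labels it prints are made up. it walks all regular files of the first copy and reports those that are missing or different in the second copy. files with differing content are split by file size, on the assumption (taken from the quote above) that a difference with identical size is more likely to be just an embedded timestamp.

```shell
#!/bin/sh
# compare_builds DIR_A DIR_B
# prints one line per regular file of DIR_A that is missing or different in DIR_B.
compare_builds() {
    a=$1
    b=$2
    # list every regular file of the first copy, relative to its root
    (cd "$a" && find . -type f) | sort | while IFS= read -r f; do
        if [ ! -f "$b/$f" ]; then
            echo "missing $f"
        elif ! cmp -s "$a/$f" "$b/$f"; then
            # content differs: report whether the size is still the same
            if [ "$(wc -c < "$a/$f")" -eq "$(wc -c < "$b/$f")" ]; then
                echo "differs-same-size $f"
            else
                echo "differs-other-size $f"
            fi
        fi
    done
}

# usage with the two copies made above:
#   compare_builds apache1 apache2
```

files that exist only in the second copy are not reported, so it makes sense to run it a second time with the arguments swapped.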

possible solution to the timestamp problem

as it seems that the timestamps are the only problem, here are some thoughts on how to overcome this:

summary

links