Re: [gnu-prog-discuss] Automake dist reproducibility

Mike Gerwitz
There is ongoing discussion about reproducible builds within GNU.  I'm
having trouble figuring out the best approach for deterministic
distribution archives using Automake.

Here's my original message on gnu-prog-discuss:

> I did read https://reproducible-builds.org/docs/archives/.
>
> Automake-generated Makefiles have many archive options.  I'm assuming
> that my best option is to modify the timestamps and other metadata of
> the files in distdir using `dist-hook`, but that doesn't solve file
> ordering.
>
> What would the GNU recommendation be in this case, and what fits best
> with the spirit of Automake?  Post-processing the tarball is awkward
> since it is part of a pipeline (to whatever compression algorithm is
> chosen for the final archive).  I'm not sure how to modify am__tar to
> include processing as part of that pipeline (e.g. as used in
> dist-gzip)---Automake doesn't provide options to configure its value
> outside of _AM_PROG_TAR, which is rigid.
>
> strip-nondeterminism appears to support ar, gzip, jar, and zip; should I
> just use that?

Ludo had some suggestions:

On Tue, Dec 22, 2015 at 17:23:55 +0100, Ludovic Courtès wrote:

> At the very least, Automake should change the default value of
> ‘GZIP_ENV’ to “--best --no-name” (the latter tells gzip to not add a
> timestamp in its output.)
>
> Ideally ‘make dist’ would also sort files in the archives.  Recent
> versions of GNU tar support ‘--sort=name’ but we’d need a way to do that
> portably (or require GNU tar for ‘make dist’.)
>
> Lastly, archive timestamps could be reset, as per --mtime=@0, but again,
> portability needs to be considered.  In some cases, this feature might
> need to be turned off.
>
> Thoughts?

Is there a [good] way to solve this problem until we can implement any
suggestions in Automake?
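
One possible stopgap, offered as a sketch and not as anything Automake provides: build the archive outside the `am__tar` pipeline from an already-populated distdir. The example mirrors the three suggestions quoted above (sorted member names, a fixed mtime, and no name or timestamp in the gzip header) using Python's standard `tarfile` and `gzip` modules. The function name `deterministic_targz` and the directory layout are assumptions made for illustration.

```python
import gzip
import io
import os
import tarfile


def deterministic_targz(src_dir, out_path, mtime=0):
    """Archive src_dir so the output depends only on names, modes and contents."""
    src_dir = os.path.abspath(src_dir)
    base = os.path.basename(src_dir)

    # Collect every path, then sort: the counterpart of GNU tar's --sort=name.
    paths = [src_dir]
    for root, dirs, files in os.walk(src_dir):
        for name in dirs + files:
            paths.append(os.path.join(root, name))
    paths.sort()

    def normalize(info):
        # Counterpart of --mtime=@0 plus fixed ownership.
        info.mtime = mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in paths:
            rel = os.path.relpath(path, src_dir)
            arcname = os.path.normpath(os.path.join(base, rel))
            tar.add(path, arcname=arcname, recursive=False, filter=normalize)

    # Counterpart of gzip --best --no-name: empty file name, fixed header mtime.
    with open(out_path, "wb") as raw, gzip.GzipFile(
        filename="", mode="wb", fileobj=raw, compresslevel=9, mtime=mtime
    ) as gz:
        gz.write(buf.getvalue())
```

Something like `deterministic_targz("pkg-1.0", "pkg-1.0.tar.gz")` could be run after `make distdir`. File modes are still taken from the filesystem, and the compressed bytes can differ between zlib versions, so this only narrows the sources of variation.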

--
Mike Gerwitz
Free Software Hacker | GNU Maintainer
https://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB

Pádraig Brady
On 22/12/15 17:00, Mike Gerwitz wrote:
> There is ongoing discussion about reproducible builds within GNU.  I'm
> having trouble figuring out the best approach for deterministic
> distribution archives using Automake.

I've not thought much about this, but I'm
wondering about how useful deterministic tarballs are?

The main thrust of reproducible builds is to verify what's
running on the system, and there are so many variables
between the tarball and the build that I'm not sure it's
worth worrying about nondeterminism in the intermediate steps.

Perhaps the main focus for tarballs should just be to
ensure they're properly signed.

cheers,
Pádraig.

p.s. It would be good to give upstream devs more control
over archiving options in Makefile.am etc.

Warren Young-2
On Dec 22, 2015, at 12:16 PM, Pádraig Brady <[hidden email]> wrote:
>
> On 22/12/15 17:00, Mike Gerwitz wrote:
>> There is ongoing discussion about reproducible builds within GNU.
>
> I’m wondering about how useful deterministic tarballs are?

This page gives the “whys” of reproducible builds:

  https://wiki.debian.org/ReproducibleBuilds/About

> Perhaps the main focus for tarballs should just be to
> ensure they're properly signed.

Signing only proves that the package provider possesses the private key, which implies — but does not prove — that the signer is the party you expect the packages to come from.

The security risk is that if someone can steal the private key, they can sign arbitrary packages.

But if you can independently create the same pre-signature tarball from the source package, you can prove conclusively that the source code is the same as was used for creating that binary package.

This does not prove that the source code hasn’t also been compromised, but once you’ve reduced the verification problem to the source level, you can use traditional high-level means of verification: diffing against previous source releases, diffing against the project’s public source repo, auditing the source, etc.
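
The comparison step itself is small once the archive is deterministic. A minimal sketch, assuming the published tarball and an independently rebuilt one are both on disk (the function names are made up for illustration):

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(published_tarball, rebuilt_tarball):
    """True only if the two archives are byte-for-byte identical."""
    return sha256_of(published_tarball) == sha256_of(rebuilt_tarball)
```

A match says the published archive corresponds to the source it was rebuilt from; a mismatch only says something differs, which may be archive metadata and not the code.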

Bob Friesenhahn
In reply to this post by Pádraig Brady
On Tue, 22 Dec 2015, Pádraig Brady wrote:

> On 22/12/15 17:00, Mike Gerwitz wrote:
>> There is ongoing discussion about reproducible builds within GNU.  I'm
>> having trouble figuring out the best approach for deterministic
>> distribution archives using Automake.
>
> I've not thought much about this, but I'm
> wondering about how useful deterministic tarballs are?
>
> The main thrust of reproducible builds is to verify what's
> running on the system, and there are so many variables
> between the tarball and the build that I'm not sure it's
> worth worrying about nondeterminism in the intermediate steps.
>
> Perhaps the main focus for tarballs should just be to
> ensure they're properly signed.

I would agree that it is the extracted binary contents of the tarballs
(ignoring artifacts like file timestamps and user ids) which count.
Attempting to get archiving tools to produce the same results at
different times on different machines is close to impossible.
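
Comparing only the extracted contents, as described above, can be sketched like this. It is an illustrative helper built on Python's standard `tarfile` module, not an existing tool: it ignores timestamps, ownership and member order, and compares names, link targets and file data.

```python
import hashlib
import tarfile


def content_manifest(tar_path):
    """Map each member name to its kind plus a content digest or link target."""
    manifest = {}
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                manifest[member.name] = ("file", hashlib.sha256(data).hexdigest())
            elif member.issym() or member.islnk():
                manifest[member.name] = ("link", member.linkname)
            else:
                # Directories and special files: only the name is compared.
                manifest[member.name] = ("other", "")
    return manifest


def same_contents(tar_a, tar_b):
    """True if both archives hold the same files, whatever their metadata."""
    return content_manifest(tar_a) == content_manifest(tar_b)
```

File modes are deliberately left out of the comparison in this sketch; they could be added to the tuple if they matter.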

Bob
--
Bob Friesenhahn
[hidden email], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

Warren Young-2
On Dec 22, 2015, at 2:51 PM, Bob Friesenhahn <[hidden email]> wrote:
>
> Attempting to get archiving tools to produce the same results at different times on different machines is close to impossible

Fortunately, others have already done much of the hard work:

  https://reproducible-builds.org/


Ludovic Courtès-3
In reply to this post by Pádraig Brady
Pádraig Brady <[hidden email]> skribis:

> On 22/12/15 17:00, Mike Gerwitz wrote:
>> There is ongoing discussion about reproducible builds within GNU.  I'm
>> having trouble figuring out the best approach for deterministic
>> distribution archives using Automake.
>
> I've not thought much about this, but I'm
> wondering about how useful deterministic tarballs are?
>
> The main thrust of reproducible builds is to verify what's
> running on the system, and there are so many variables
> between the tarball and the build that I'm not sure it's
> worth worrying about nondeterminism in the intermediate steps.
>
> Perhaps the main focus for tarballs should just be to
> ensure they're properly signed.

You’re right that deterministic tarballs are not the immediate concern
of reproducible builds; usually, we focus on binaries.

However, if running ‘make dist’ at a given commit of a project leads to
exactly one tarball, then people can verify the tarball against the VCS
commit.  This is especially interesting when people sign commits/tags.
We could authenticate code with much finer grain.

This also reduces incentives to attack the person that runs ‘make dist’
and signs the result since anyone could independently check the tarball.

Basically the same motivation as with reproducible builds, but one level
higher.

Ludo’.
