bug#25629: Automake output non-deterministic when used with later Perls

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

bug#25629: Automake output non-deterministic when used with later Perls

demerphq
Some time back I switched Perl to using a randomly seeded hash
function. (I am a core perl committer, and I do a lot of work on its
hash function.)

A consequence of this is that default output from tools like
Data::Dumper is non-deterministic, meaning output files under source
control, etc, change unnecessarily every time.

At least part of the automake toolset is affected, for instance autom4te.

For autom4te a simple solution is to add:

$Data::Dumper::Sortkeys = 1;

to the script. This seems to fix the problem I was observing, but an
audit of the use of  the "keys" function to ensure that any such use
for output is sorted will eliminate this class of errors.

I would be happy to put together a pull request, or a patch if you
could point me to the relevant repository. (Which does not appear to
be documented in the installed versions of automake, although I may
have overlooked it.)

I am sorry I cannot give a better reproduction description than this:

while hacking the "dieharder" package I observed that after running
"autoreconf -i" that the file ./autom4te.cache/requests file changed
in ways that it should not have. Opening the file I observed it
contains output clearly from the Perl Data::Dumper module, I then made
the patch to autom4te described above, and observed that after running
"autoreconf -i" again the output was sorted, and that after running it
yet again the file was unchanged.

Classic unsorted use of undefined hash order bug that we had to do a
lot of cleanup in the Perl world around the time of Perl 5.10

Thanks,
Yves



--
perl -Mre=debug -e "/just|another|perl|hacker/"



Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

bug#25629: Hrm, actually autom4te isnt part of automake, but rather autoconf.

demerphq
Also I observe that there were previous patches to ensure most uses of
keys() were sorted. However not all of them, including diagnostics,
and various other places where the key order could be exposed to a
user.

Attached is a patch that I believe fixes any remaining uses of unsorted keys.

I took the policy that automake is not performance sensitive, at least
at the level of sorting keys, and that sorting something that is not
strictly necessary does no harm.

I admit I have not been able to properly test all of these changes,
but superficial testing does not reveal any issues.

Anyway, sorry for the misleading original bug-report.

Cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

automake.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

bug#25629: Hrm, actually autom4te isnt part of automake, but rather autoconf.

Mathieu Lirzin
Hello Yves,

demerphq <[hidden email]> writes:

> Also I observe that there were previous patches to ensure most uses of
> keys() were sorted. However not all of them, including diagnostics,
> and various other places where the key order could be exposed to a
> user.
>
> Attached is a patch that I believe fixes any remaining uses of unsorted keys.
>
> I took the policy that automake is not performance sensitive, at least
> at the level of sorting keys, and that sorting something that is not
> strictly necessary does no harm.
>
> I admit I have not been able to properly test all of these changes,
> but superficial testing does not reveal any issues.
>
> Anyway, sorry for the misleading original bug-report.
>
> Cheers,
> Yves

Can you explain what would be the benefit for Automake to have such
deterministic behavior?

Thanks for the report.

--
Mathieu Lirzin
GPG: F2A3 8D7E EB2B 6640 5761  070D 0ADE E100 9460 4D37



Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

bug#25629: Hrm, actually autom4te isnt part of automake, but rather autoconf.

demerphq
On 16 Jul 2017 01:50, "Mathieu Lirzin" <[hidden email]> wrote:
Hello Yves,

demerphq <[hidden email]> writes:

> Also I observe that there were previous patches to ensure most uses of
> keys() were sorted. However not all of them, including diagnostics,
> and various other places where the key order could be exposed to a
> user.
>
> Attached is a patch that I believe fixes any remaining uses of unsorted keys.
>
> I took the policy that automake is not performance sensitive, at least
> at the level of sorting keys, and that sorting something that is not
> strictly necessary does no harm.
>
> I admit I have not been able to properly test all of these changes,
> but superficial testing does not reveal any issues.
>
> Anyway, sorry for the misleading original bug-report.
>
> Cheers,
> Yves

Can you explain what would be the benefit for Automake to have such
deterministic behavior?

Thanks for the report.

The most obvious reason is when debugging with something like diff deterministic behaviour eliminates spurious changes. Also having a deterministic and well understood ordering rule eliminates any concern that order might matter, and that a bug or whatnot was due to a different order (regardless of whether such a concern actually makes sense.) also sorting output lists makes them more readable.  I don't know how well tested Automake is but I would assume deterministic hash order would help there too. 

Also note that Prior to hash randomization the key order would have been consistent between multiple runs. Thus when randomization was introduced the behaviour of Automake changed and effectively regressed. Sorting keys effectively restores the original behaviour. 

Cheers
Yves



Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

bug#25629: Hrm, actually autom4te isnt part of automake, but rather autoconf.

Mathieu Lirzin
demerphq <[hidden email]> writes:

> On 16 Jul 2017 01:50, "Mathieu Lirzin" <[hidden email]> wrote:
>
>  Can you explain what would be the benefit for Automake to have such
>  deterministic behavior?
>
>  Thanks for the report.
>
> The most obvious reason is when debugging with something like diff
> deterministic behaviour eliminates spurious changes. Also having a
> deterministic and well understood ordering rule eliminates any concern
> that order might matter, and that a bug or whatnot was due to a
> different order (regardless of whether such a concern actually makes
> sense.) also sorting output lists makes them more readable.  I don't
> know how well tested Automake is but I would assume deterministic hash
> order would help there too.
>
> Also note that Prior to hash randomization the key order would have
> been consistent between multiple runs. Thus when randomization was
> introduced the behaviour of Automake changed and effectively
> regressed. Sorting keys effectively restores the original behaviour.

IMO if the hash randomization introduces bugs in Automake this is the
sign of bad code on Automake side that need to be fixed.  I think it
would be better to fix those bugs in a case by case manner instead of
preemptively imposing an order on data structure models that are foreign
to the notion of order.

Could you run Automake test suite with those hash randomizations applied
and report the actual failures?

Thanks.

--
Mathieu Lirzin
GPG: F2A3 8D7E EB2B 6640 5761  070D 0ADE E100 9460 4D37



Loading...