Skip to content

BUG: fix strict aliasing UB in MurMur hash implementation.#459

Closed
Romain-Geissler-1A wants to merge 1 commit intoseccomp:mainfrom
Romain-Geissler-1A:main
Closed

BUG: fix strict aliasing UB in MurMur hash implementation.#459
Romain-Geissler-1A wants to merge 1 commit intoseccomp:mainfrom
Romain-Geissler-1A:main

Conversation

@Romain-Geissler-1A
Copy link
Copy Markdown
Contributor

@Romain-Geissler-1A Romain-Geissler-1A commented Feb 18, 2025

This was spotted when trying to upgrade the libseccomp fedora package to version 2.6.0 in fedora rawhide. It comes with gcc 15 and LTO enabled by default. When running the test 61-sim-transactions we get plenty of such errors in valgrind:

==265507== Use of uninitialised value of size 8
==265507==    at 0x4096AD: _hsh_add (gen_bpf.c:599)
==265507==    by 0x40A557: UnknownInlinedFun (gen_bpf.c:2016)
==265507==    by 0x40A557: gen_bpf_generate (gen_bpf.c:2341)
==265507==    by 0x400CDE: UnknownInlinedFun (db.c:2685)
==265507==    by 0x400CDE: UnknownInlinedFun (db.c:2682)
==265507==    by 0x400CDE: UnknownInlinedFun (api.c:756)
==265507==    by 0x400CDE: UnknownInlinedFun (util.c:162)
==265507==    by 0x400CDE: UnknownInlinedFun (util.c:153)
==265507==    by 0x400CDE: main (61-sim-transactions.c:128)
==265507==  Uninitialised value was created by a stack allocation
==265507==    at 0x409590: _hsh_add (gen_bpf.c:573)

[EDITED this part after the review of @thesamesam] Investigating this a bit, it seems that because of LTO the MurMur hash implementation is being inlined in _hsh_add. The two buffers data and blocks to point at the same underlying data, but via incompatible type, which is a strict aliasing violation. Instead, remove the getblock32 function and inline the copy with memcpy.

This is reproducible on a "fedora:rawhide" container (gcc 15) and using:
export CFLAGS='-O2 -flto=auto -ffat-lto-objects -g'

@coveralls
Copy link
Copy Markdown

coveralls commented Feb 18, 2025

Coverage Status

coverage: 90.239% (-0.01%) from 90.252%
when pulling b7d0f04 on Romain-Geissler-1A:main
into 7db46d7 on seccomp:main.

@Romain-Geissler-1A
Copy link
Copy Markdown
Contributor Author

Friendly ping @pcmoore ;)

src/hash.c Outdated
@@ -56,7 +52,7 @@ uint32_t hash(const void *key, size_t length)
/* body */
blocks = (const uint32_t *)(data + nblocks * 4);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cast shouldn't be needed anymore (just inline it to the memcpy args)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented what you suggested (and removed as well another cast after the loop).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@thesamesam
Copy link
Copy Markdown
Contributor

The way we call getblock32 with the explicit cast to const uint32_t* is a strict aliasing violation.

It's accessing via a different (non-compatible) type, it's not about the cast.

This was spotted when trying to upgrade the libseccomp fedora package to
version 2.6.0 in fedora rawhide. It comes with gcc 15 and LTO enabled by
default. When running the test 61-sim-transactions we get plenty of such
errors in valgrind:

==265507== Use of uninitialised value of size 8
==265507==    at 0x4096AD: _hsh_add (gen_bpf.c:599)
==265507==    by 0x40A557: UnknownInlinedFun (gen_bpf.c:2016)
==265507==    by 0x40A557: gen_bpf_generate (gen_bpf.c:2341)
==265507==    by 0x400CDE: UnknownInlinedFun (db.c:2685)
==265507==    by 0x400CDE: UnknownInlinedFun (db.c:2682)
==265507==    by 0x400CDE: UnknownInlinedFun (api.c:756)
==265507==    by 0x400CDE: UnknownInlinedFun (util.c:162)
==265507==    by 0x400CDE: UnknownInlinedFun (util.c:153)
==265507==    by 0x400CDE: main (61-sim-transactions.c:128)
==265507==  Uninitialised value was created by a stack allocation
==265507==    at 0x409590: _hsh_add (gen_bpf.c:573)

Investigating this a bit, it seems that because of LTO the MurMur hash
implementation is being inlined in _hsh_add. The two buffers data and
blocks to point at the same underlying data, but via incompatible type,
which is a strict aliasing violation. Instead, remove the getblock32
function and inline the copy with memcpy.

This is reproducible on a "fedora:rawhide" container (gcc 15) and using:
export CFLAGS='-O2 -flto=auto -ffat-lto-objects -g'

Signed-off-by: Romain Geissler <romain.geissler@amadeus.com>
Copy link
Copy Markdown
Member

@drakenclimber drakenclimber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Acked-by: Tom Hromatka <tom.hromatka@oracle.com>

Copy link
Copy Markdown
Contributor

@thesamesam thesamesam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me now, thanks!

Reviewed-by: Sam James <sam@gentoo.org>

@pcmoore pcmoore changed the title Fix strict aliasing UB in MurMur hash implementation. BUG: fix strict aliasing UB in MurMur hash implementation. Mar 18, 2025
@pcmoore pcmoore added the bug label Mar 18, 2025
@pcmoore pcmoore added this to the v2.6.1 milestone Mar 18, 2025
@pcmoore
Copy link
Copy Markdown
Member

pcmoore commented Mar 18, 2025

Pushed to main via 614530b, keeping open until the backport is complete.

@pcmoore
Copy link
Copy Markdown
Member

pcmoore commented Mar 18, 2025

Merged into release-2.6 via 84005ec, thank you!

@pcmoore pcmoore closed this Mar 18, 2025
@Romain-Geissler-1A
Copy link
Copy Markdown
Contributor Author

Thanks !

Note that technically this may affect releases 2.5.x too, is the branch 2.5 still maintained ? If yes you may backport it there as well.

@pcmoore
Copy link
Copy Markdown
Member

pcmoore commented Mar 18, 2025

Yes, release-2.5 is still currently maintained, but not for much longer and the bar for backporting to it is pretty high.

Unofficially, we tend to support v2.n until v2.(n+1).1 is released, but that isn't a hard rule.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants