Skip to content

implement va_arg for x86_64 systemv #141538

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

folkertdev
Copy link
Contributor

@folkertdev folkertdev commented May 25, 2025

tracking issue: #44930

Turns out LLVM's va_arg is also unreliable for this target.

llvm/llvm-project#141361

So, like clang, we implement our own. I used

We can take a bunch of shortcuts because the return type of va_list must implement VaArgSafe. I also extended some of the tests, because up to 11 floats can be stored in the reg_safe_area for this calling convention.

r? @workingjubilee
@rustbot label +F-c_variadic

try-job: x86_64-apple-1

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 25, 2025
@rustbot
Copy link
Collaborator

rustbot commented May 25, 2025

This PR modifies run-make tests.

cc @jieyouxu

@rustbot rustbot added the F-c_variadic `#![feature(c_variadic)]` label May 25, 2025
@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from 5d64c38 to 52601bf Compare May 25, 2025 19:27
@rust-log-analyzer

This comment has been minimized.

@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from 52601bf to ac051ae Compare May 25, 2025 19:42
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits and general questions

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 25, 2025
@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from ac051ae to 584035c Compare May 25, 2025 21:46
@folkertdev
Copy link
Contributor Author

This now supports structs like

struct B(i32, i32);
struct C(i64, i64, i64);
struct D(f64, i64);
struct E(i64, f64);
struct F(f64, f64);

It however does not support

#[repr(C)]
struct Empty;

impl Copy for Empty {}

#[repr(C)]
struct Cursed([Empty; 8], i32);

Because rust passes this as BackendRepr::Memory, while in C it is treated as a scalar (based on the generated assembly). I don't know why rust passes it that way, or whether that is in fact correct.

@folkertdev
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 26, 2025
@folkertdev
Copy link
Contributor Author

the cause is this optimize_abi value:

let optimize_abi = !repr.inhibit_newtype_abi_optimization();
// Try to make this a Scalar/ScalarPair.
if sized && size.bytes() > 0 {
// We skip *all* ZST here and later check if we are good in terms of alignment.
// This lets us handle some cases involving aligned ZST.
let mut non_zst_fields = fields.iter_enumerated().filter(|&(_, f)| !f.is_zst());
match (non_zst_fields.next(), non_zst_fields.next(), non_zst_fields.next()) {
// We have exactly one non-ZST field.
(Some((i, field)), None, None) => {
layout_of_single_non_zst_field = Some(field);
// Field fills the struct and it has a scalar or scalar pair ABI.
if offsets[i].bytes() == 0 && align.abi == field.align.abi && size == field.size
{
match field.backend_repr {
// For plain scalars, or vectors of them, we can't unpack
// newtypes for `#[repr(C)]`, as that affects C ABIs.
BackendRepr::Scalar(_) | BackendRepr::SimdVector { .. }
if optimize_abi =>
{
abi = field.backend_repr;
}

when I disable that if optimize_abi => that struct does work. Anyway, I think the va_arg code is correct, it may just need some adjusted repr as input.

@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from dcdc645 to 6424fbe Compare May 26, 2025 15:19
@rust-log-analyzer

This comment has been minimized.

@workingjubilee
Copy link
Member

when I disable that if optimize_abi => that struct does work. Anyway, I think the va_arg code is correct, it may just need some adjusted repr as input.

It's fine if it's buggy for non-scalars, it just felt slightly silly to write it in the way that can support at least some structs and then not support any structs.

@folkertdev
Copy link
Contributor Author

I added this bit of logic and now it actually does work correctly for that struct

    // Peel off any newtype wrappers.
    let layout = {
        let mut layout = bx.cx.layout_of(target_ty);

        while let Some((_, inner)) = layout.non_1zst_field(bx.cx) {
            layout = inner;
        }

        layout
    };

@kouhe3
Copy link
Contributor

kouhe3 commented May 27, 2025

This now supports structs like

struct B(i32, i32);
struct C(i64, i64, i64);
struct D(f64, i64);
struct E(i64, f64);
struct F(f64, f64);

Does it support tulple that mix 8bit 32bit 64bit or different length type

@folkertdev
Copy link
Contributor Author

Currently, arguments to va_arg must implement VaArgSafe. Only 32-bit and 64-bit integers implement this trait.

So, the testing with structs is just to verify the implementation, because the standard library currently does not allow va_arg on 8-bit integers, or any sort of struct/tuple/custom type.

The exact size of the fields does not matter, because they all get 8 bytes of space:

    // l->gp_offset = l->gp_offset + num_gp * 8
    if num_gp_registers > 0 {
        let offset = bx.const_u32(num_gp_registers * 8);
        let sum = bx.add(gp_offset_v, offset);
        bx.store(sum, gp_offset_ptr, Align::from_bytes(8).unwrap());
    }

@workingjubilee
Copy link
Member

Yeah, I'm not blocking this on being correct for all structs, I just kinda wanted to see us bother to land something that could be bugfixed, essentially.

@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from 6424fbe to 7daa926 Compare May 27, 2025 19:26
Copy link
Contributor Author

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebased and turned some inbounds_gep into inbounds_ptradd. I think this is good to go?

@rustbot rustbot added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label May 27, 2025
@tgross35
Copy link
Contributor

@bors try
@bors delegate=folkertdev for retries

Maybe varargs should also get asm tests since it involves ABI intricacies?

@bors
Copy link
Collaborator

bors commented May 28, 2025

✌️ @folkertdev, you can now approve this pull request!

If @tgross35 told you to "r=me" after making some further change, please make that change, then do @bors r=@tgross35

@bors
Copy link
Collaborator

bors commented May 28, 2025

⌛ Trying commit fd76915 with merge 61d3c5d...

bors added a commit that referenced this pull request May 28, 2025
implement `va_arg` for x86_64 systemv

tracking issue: #44930

Turns out LLVM's `va_arg` is also unreliable for this target.

llvm/llvm-project#141361

So, like clang, we implement our own. I used

- the spec at https://gitlab.com/x86-psABIs/x86-64-ABI
- the clang implementation at https://github.com/llvm/llvm-project/blob/9a440f84773c56d3803f330774acb2b4f471d5b4/clang/lib/CodeGen/Targets/X86.cpp#L3041

We can take a bunch of shortcuts because the return type of `va_list` must implement `VaArgSafe`. I also extended some of the tests, because up to 11 floats can be stored in the `reg_safe_area` for this calling convention.

r? `@workingjubilee`
`@rustbot` label +F-c_variadic

try-job: x86_64-apple-1
@bors
Copy link
Collaborator

bors commented May 28, 2025

☀️ Try build successful - checks-actions
Build commit: 61d3c5d (61d3c5d4666a9f1e8c58f4b08798221ebd759bcf)

@workingjubilee
Copy link
Member

Seems like we're good.

@folkertdev
Copy link
Contributor Author

Yes, I think so.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while scrutinizing this more closely before the final approval I came up with some questions that may need me to re-reference the layout code to answer...

@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from 47f694a to 11d1c33 Compare May 29, 2025 09:24
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me after review addressed

// Copy into a temporary if the type is more aligned than the register save area.
copy_to_temporary_if_more_aligned(bx, reg_addr, layout)
}
Primitive::Float(_) => bx.inbounds_ptradd(reg_save_area_v, fp_offset_v),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm. should already be aligned to 16 bytes so we should handle f128 fine when that becomes relevant? right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. Similarly SIMD types are stored in the same area, and always with an alignment of 16.

}
Primitive::Float(_) => bx.inbounds_ptradd(reg_save_area_v, fp_offset_v),
},
BackendRepr::ScalarPair(scalar1, scalar2) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine

Turns out LLVM's `va_arg` is also unreliable for this target, so we need
our own implementation.
@folkertdev folkertdev force-pushed the systemv-x86_64-va_arg branch from 11d1c33 to 94cc726 Compare May 29, 2025 20:07
@workingjubilee
Copy link
Member

cool

@bors r+

@bors
Copy link
Collaborator

bors commented May 29, 2025

📌 Commit 94cc726 has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-run-make Area: port run-make Makefiles to rmake.rs F-c_variadic `#![feature(c_variadic)]` S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants