Skip to content

fix(data): vsetvli/vsetivli: correct assembly and fields #838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ThinkOpenly
Copy link
Collaborator

vsetvli and vsetivli accept the components of VTYPEI as
individual parameters, and they are encoded in individual fields.

...with a few additional 0 bits as a result, per the
definition of vtype.

`vsetvli` and `vsetivli` accept the _components_ of VTYPEI as
individual parameters, and they are encoded in individual fields.

...with a few additional `0` bits as a result, per the
definition of vtype.
@ThinkOpenly ThinkOpenly requested a review from dhower-qc as a code owner June 10, 2025 20:21
@ThinkOpenly
Copy link
Collaborator Author

I'm a little confused as to why this is failing in CI.

It's failing within "regress:smoke" in "test:llvm", complaining about a mismatch with the new (correct) operands and LLVM's apparent expectation of a single "vtype" operand. The RISC-V instruction syntax for vsetvli/vsetivli is sort-of defined one way in the ISA chapter 30.6 "Configuration-Setting Instructions (vsetvli/vsetivli/vsetvl):

vsetvli rd, rs1, vtypei   # rd = new vl, rs1 = AVL, vtypei = new vtype setting
vsetivli rd, uimm, vtypei # rd = new vl, uimm = AVL, vtypei = new vtype setting

(Here, apparently using a single "vtypei" operand.)

However, an example later (30.16.5 "Vector Compress Instruction") shows 6 operands:

vsetivli t0, 9, e8, m1, tu, ma

The Sail code shows:

"vsetvli" ^ spc() ^ reg_name(rd) ^ sep() ^ reg_name(rs1) ^ sep() ^ sew_flag(sew) ^ maybe_lmul_flag(lmul) ^ ta_flag(ta) ^ ma_flag(ma)
"vsetivli" ^ spc() ^ reg_name(rd) ^ sep() ^ hex_bits_5(uimm) ^ sep() ^ sew_flag(sew) ^ maybe_lmul_flag(lmul) ^ ta_flag(ta) ^ ma_flag(ma)

(6 operands, with lmul apparently being optional, and defaulting to "m1".)

However, LLVM has:

def VSETVLI : RVInstSetVLi<(outs GPR:$rd), (ins GPR:$rs1, VTypeIOp11:$vtypei),
                           "vsetvli", "$rd, $rs1, $vtypei">,
                           Sched<[WriteVSETVLI, ReadVSETVLI]>;
def VSETIVLI : RVInstSetiVLi<(outs GPR:$rd), (ins uimm5:$uimm, VTypeIOp10:$vtypei),
                             "vsetivli", "$rd, $uimm, $vtypei">,
                             Sched<[WriteVSETIVLI]>;

(Not too familiar with this code, but it seems to show a single "vtypei" operand.)

and the LLVM TableGen we use in the CI step shows:

    "AsmString": "vsetivli\t$rd, $uimm, $vtypei",
[...]
    "AsmString": "vsetvli\t$rd, $rs1, $vtypei",

Is LLVM wrong here, or is our interpretation of it wrong, or something else? @AFOliveira? @lenary?

@jordancarlin
Copy link
Contributor

Some additional discussion over in this Sail PR: riscv/sail-riscv#1010.

For valid values, the 4-component approach is used, but expressing them as a single immediate is also accepted by the assemblers and is the only way to express reserved values (which in this case does not mean that the instruction itself is reserved).

@ThinkOpenly
Copy link
Collaborator Author

For this sequence of instructions:

        vsetvli t0, a0, 192
        vsetvli t0, a0, e8, m1, ta, ma
        vsetvli t0, a0, e8, ta, ma

GCC emits the same encoding for each.
LLVM (clang 20 2b9fc0f1c82e514f8b46c4c7d282edecb0346089) does not respect the optional "lmul" operand:

$ clang -c vsetvli.s
vsetvli.s:21:25: error: operand must be e[8|16|32|64],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu]
        vsetvli t0, a0, e8, ta, ma
                        ^

@lenary
Copy link
Collaborator

lenary commented Jun 11, 2025

What's going on here is that LLVM has a custom parser for the vtype information: https://github.com/llvm/llvm-project/blob/b3873e8aa41be17b6664c918fd6ba30307361d08/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp#L2295

I haven't looked closely at how parseVTypeToken works, but you have an example that GCC accepts and LLVM does not, and you have a reason you think LLVM is wrong, so it would be good to file that against LLVM: http://github.com/llvm/llvm-project/issues (link back to this in case someone wants to see where you found the issue)

I would expect that LLVM will want to continue to bundle all the vsetvli operands into a single immediate, but I could be wrong, so this might be somewhere where we need a little gap between the "asmstring" in LLVM (which is using a more complex parser underneath), and the unified db information, when doing the cross-verification.

@AFOliveira
Copy link
Collaborator

AFOliveira commented Jun 11, 2025

Thankfully, @lenary arrived to the issue earlier than me and clarified much better than what I could have :)

I see two ways of going forward here:

If @ThinkOpenly opens that issue against LLVM:

  • I can skip this test in the llvm tests for now
  • We wait to see if that gets fixed

I don't quite know how fast LLVM is in fixing this if they agree with our view here, but I can really just skip those test for now in a couple minutes if you want me to

@ThinkOpenly
Copy link
Collaborator Author

What's going on here is that LLVM has a custom parser for the vtype information: https://github.com/llvm/llvm-project/blob/b3873e8aa41be17b6664c918fd6ba30307361d08/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp#L2295

I see. Is there anything in the TableGen JSON that indicates that this instruction has a custom parser, or is weird in some way? (I didn't see anything obvious.) I'm just looking for a programmatic way to separate these two instructions from the herd.

file that against LLVM: http://github.com/llvm/llvm-project/issues (link back to this in case someone wants to see where you found the issue)

llvm/llvm-project#143744

I would expect that LLVM will want to continue to bundle all the vsetvli operands into a single immediate, but I could be wrong, so this might be somewhere where we need a little gap between the "asmstring" in LLVM (which is using a more complex parser underneath), and the unified db information, when doing the cross-verification.

UDB is going to need to accommodate all 3 different syntax possibilities somehow. There is no way to do that at present.

Validating those three against the TableGen JSON is perhaps not possible.

@topperc
Copy link

topperc commented Jun 11, 2025

The intent of the vector spec is that the LMUL is not optional. Text saying it was optional was explicitly removed here riscv/riscv-isa-manual@75e857d

@lenary
Copy link
Collaborator

lenary commented Jun 11, 2025

I see. Is there anything in the TableGen JSON that indicates that this instruction has a custom parser, or is weird in some way? (I didn't see anything obvious.) I'm just looking for a programmatic way to separate these two instructions from the herd.

Here's how it works in the tablegen. In the JSON, all "defs" should be available in the JSON, from what I understand.

Starting at the Instruction definition for vsetvli (I will ignore vsetivli for the moment, but following it is equivalent)

def VSETVLI : RVInstSetVLi<(outs GPR:$rd), (ins GPR:$rs1, VTypeIOp11:$vtypei),
                           "vsetvli", "$rd, $rs1, $vtypei">,
                           Sched<[WriteVSETVLI, ReadVSETVLI]>;

To find the information about the operand, we find the def of VTypeIOp11. This should be in the tablegen JSON with this name.

def VTypeIOp11 : VTypeIOp<11>;

Which will have fields from its class, VTypeIOp<int VTypeINum>:

class VTypeIOp<int VTypeINum> : RISCVOp {
  let ParserMatchClass = VTypeIAsmOperand<VTypeINum>;
  // ... more fields
}

In this case, we care about ParserMatchClass. This is an inline/anonymous definition, so expect this to be something like anonymous_NNNN, but you can look up that name in the JSON, to get an instance that is based on the VTypeIAsmOperand<int VTypeINum> class:

class VTypeIAsmOperand<int VTypeINum> : AsmOperandClass {
  let Name = "VTypeI" # VTypeINum;
  let ParserMethod = "parseVTypeI";
  let DiagnosticType = "InvalidVTypeI";
  let RenderMethod = "addVTypeIOperands";
}

It is the presence of the ParserMethod (and the fact it's non-empty) that says this operand type has a custom parser.

I would take the time to look in the LLVM RISC-V tablegen for let ParserMethod, and look at the operands that have them. Most are doing something reasonably complex (maybe symbol support, maybe some kind of optional syntax, maybe some special identifiers (fences)). The thing that's maybe strange is that we also use these for parsing registers for Zfinx, for complicated reasons.

llvm/llvm-project#143744

Thanks

@topperc
Copy link

topperc commented Jun 11, 2025

The intent of the vector spec is that the LMUL is not optional. Text saying it was optional was explicitly removed here riscv/riscv-isa-manual@75e857d

Unfortunately, this removal happened years after ratification. So I guess LLVM might need to support it to be compatible with the ratified spec prior to this change?

But independent of that, the SAIL code should probably reflect what's in the current text. And we should consider it being optional a historical artifact and not encourage its use.

@ThinkOpenly
Copy link
Collaborator Author

If @ThinkOpenly opens that issue against LLVM:

  • I can skip this test in the llvm tests for now
  • We wait to see if that gets fixed

I don't quite know how fast LLVM is in fixing this if they agree with our view here, but I can really just skip those test for now in a couple minutes if you want me to

I don't see that a fix coming from LLVMland that would help, given our current use of the JSON.

I think Sam's awesomely detailed overview of how to interpret the JSON could be used to intelligently skip this class of cases in general:

  1. look up the instruction as is already done
  2. walk through the "InOperandList":"args", for each "def"...
  3. look up each "def" value at the top level, I think
  4. then look up its "ParserMatchClass":"def" (this is the "anonymous" thing") also at top level
  5. see if that has a "ParserMethod" attribute. If so, skip.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants