Description
Background Work
- I searched the mailing list
- I searched prior issues
- I searched the documentation
Setup
- Chipyard Version: 1.10.0
- Chipyard Commit Hash: 00853c
- Gemmini Commit Hash: f13847e
- OS: Ubuntu 22.04.5 LTS (Linux 6.8.0-79-generic, x86_64)
- Toolchain: Default Chipyard setup as per documentation
Issue Description
Running standard Gemmini baremetal tests works (e.g., tiled_matmul_ws-baremetal), but workloads involving more complex operations (e.g., softmax, layernorm) consistently fail in simulation with an assertion in ReservationStation.scala.
This assertion indicates an invalid entry is being accessed in the reservation station:
assert(entries_st(issue_id).valid)
It seems the reservation station is attempting to issue an entry that is not valid, possibly due to scheduling/queueing logic when handling these micro-ops.
Steps to Reproduce
Successful Run
make CONFIG=GemminiRocketConfig run-binary \
BINARY=../../generators/gemmini/software/gemmini-rocc-tests/build/bareMetalC/tiled_matmul_ws-baremetal
Failing Run
make CONFIG=GemminiRocketConfig run-binary \
BINARY=../../generators/gemmini/software/gemmini-rocc-tests/build/bareMetalC/tiled_matmul_ws_softmax-baremetal
Error Log (Failing Case)
/home/mingzhenjia/Desktop/chipyard/sims/vcs/generated-src/chipyard.harness.TestHarness.GemminiRocketConfig/gen-collateral/ReservationStation.sv", 9827:
TestDriver.testHarness.chiptop0.system.tile_prci_domain.tile_reset_domain_tile.gemmini.reservation_station: at time 2877727000 ps
Assertion failed at ReservationStation.scala:479
assert(entries_st(issue_id).valid)
Fatal: .../ReservationStation.sv", 9829:
$finish called at time 2877727000 ps
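For anyone trying to reproduce this, a minimal sketch for pulling the assertion and its timestamp out of a long simulator log is below. It assumes the simulation output has been saved to a file; the function name `find_assertions` and the file name `sim.log` are hypothetical and not part of Chipyard or Gemmini.

```shell
# Hypothetical helper (not part of Chipyard): print every failed assertion
# together with the line before it (the "at time ... ps" line in the log above)
# and the line after it (the assert expression).
find_assertions() {
  log="$1"   # assumed: path to a saved copy of the simulator output
  grep -n -B1 -A1 'Assertion failed' "$log"
}
```

Usage, with the assumed file name: `find_assertions sim.log`. The reported time can then be used as the point to inspect in a waveform dump.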
Log (Successful Case)
Starting gemmini matmul
Cycles taken: 2392
Starting slow CPU matmul
Cycles taken: 3227174
Fatal: ".../TestDriver.v", 147:
$finish called at time 10000000500 ps
Expected Behavior
The simulation should complete normally (print cycles), as with the tiled_matmul_ws-baremetal test.
Request for Help
- What conditions might cause ReservationStation.scala to issue an invalid entry for workloads like softmax/layernorm?
- Any suggestions for signals to trace or configuration/debugging strategies to narrow down the issue?