[rom_ctrl,rtl] Add a flop between rom_ctrl and kmac #27658
base: master
129145e to 86d892c
Thanks @rswarbrick, this looks great! Let's wait for @davidschrammel's feedback.
I think since the flop stage is optional, it's fine to use 2 stages. In some cases ROMs are very big and one may want the initial boot phase to run fast.
This injects a clock edge between data coming out of the ROM and data going into KMAC to be hashed. Inserting the delay is pretty trivial: we just use a prim_fifo_sync.

There *is* an area vs. efficiency question. Because there is no pass-through, a prim_fifo_sync with depth 1 will have half throughput: it has to alternate between taking an item and passing it on. Although there's the bandwidth to do both on the same cycle, breaking the combinatorial path means that we can't pass the KMAC block's "ready" signal all the way back to the ROM. This works perfectly well, but takes twice as long as you might hope.

A lazy workaround is to use Depth = 2. This is a little silly, because it uses twice as much area as necessary. But it's very easy from a coding perspective!

Signed-off-by: Rupert Swarbrick <rswarbrick@lowrisc.org>
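
The throughput argument in the commit message can be checked with a toy cycle-level model. This is only a sketch, not the prim_fifo_sync RTL: the function name `simulate_fifo` is hypothetical, and the model assumes a producer that always has data, a consumer that is always ready, and a FIFO whose "ready" and "valid" outputs depend only on the registered occupancy (that is, no pass-through).

```python
def simulate_fifo(depth: int, cycles: int) -> int:
    """Count items delivered by a toy synchronous FIFO with no pass-through.

    Assumptions (not taken from the RTL): the producer always offers data
    and the consumer always accepts it, so only the FIFO limits throughput.
    """
    count = 0       # registered occupancy, updated at each clock edge
    delivered = 0   # items handed to the consumer so far
    for _ in range(cycles):
        # Both handshake outputs are derived from the registered state only,
        # so a pop in this cycle cannot free a slot for a push in this cycle.
        can_push = count < depth   # "ready" seen by the producer
        can_pop = count > 0        # "valid" seen by the consumer
        if can_pop:
            delivered += 1
        count += int(can_push) - int(can_pop)
    return delivered


if __name__ == "__main__":
    for depth in (1, 2):
        print(f"depth={depth}: {simulate_fifo(depth, 100)} items in 100 cycles")
```

Under these assumptions the depth-1 model delivers 50 items in 100 cycles, because it alternates between filling and draining, while the depth-2 model delivers 99: one cycle of initial latency, then one item per cycle.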
86d892c to a537191
sent to KMAC. This may break long paths in a target chip, at the cost of
adding chip area.
'''
local: "false",
I think it would be better to make it local: "true" so that it becomes a localparam and the top-level HJSON is the single source of truth.