-
Notifications
You must be signed in to change notification settings - Fork 21k
Draft: New EraE implementation #32157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lint is failing, you can check on your machine with make lint
internal/era2/era2.go
Outdated
@@ -0,0 +1,429 @@ | |||
package era2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a bunch of comments here - we still need to figure out how to call this module since era2
isn't very elegant. In the meantime should rename builder2.go
to just builder.go
and era2.go
to era.go
.
internal/era2/builder2.go
Outdated
} | ||
} | ||
|
||
func (b *Builder) Add(header types.Header, body types.Body, receipts types.Receipts, blockhash common.Hash, blocknum uint64, td *big.Int, proof *Proof) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blockhash
and blocknum
are already available via header
, so no need to duplicate them
internal/era2/builder2.go
Outdated
} | ||
} | ||
|
||
func (b *Builder) Add(header types.Header, body types.Body, receipts types.Receipts, blockhash common.Hash, blocknum uint64, td *big.Int, proof *Proof) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would separate this into Add
and AddRLP
internal/era2/builder2.go
Outdated
headersRLP [][]byte | ||
bodiesRLP [][]byte | ||
receiptsRLP [][]byte | ||
proofsRLP [][]byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would remove RLP
suffix since it is pretty explanatory by the type [][]byte
internal/era2/era2.go
Outdated
"github.com/klauspost/compress/snappy" | ||
) | ||
|
||
type meta struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type meta struct { | |
type metadata struct { |
internal/era2/era2.go
Outdated
type meta struct { | ||
start uint64 // start block number | ||
count uint64 // number of blocks in the era | ||
compcount uint64 // number of properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compcount uint64 // number of properties | |
components uint64 // number of properties |
internal/era2/era2.go
Outdated
start uint64 // start block number | ||
count uint64 // number of blocks in the era | ||
compcount uint64 // number of properties | ||
filelen int64 // length of the file in bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filelen int64 // length of the file in bytes | |
length int64 // length of the file in bytes |
internal/era2/era2.go
Outdated
mu *sync.Mutex | ||
headeroff, bodyoff, receiptsoff, tdoff, proofsoff []uint64 // offsets for each entry type | ||
indstart int64 | ||
rootheader uint64 // offset of the root header in the file if present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the use of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when reading each era file I load in the index table into cache to make lookup faster, indstart is where the byte where the index table starts, and the root header is where the accumulator root is present when reading so it can seek there quickly since it is its own e2store object. The mutex should be removed though, forgot to do so very early on when I didn't understand what the file was doing I thought I wouldn't want it to read and write at the same time so put a lock.
internal/era2/era2.go
Outdated
m meta // metadata for the era2 file | ||
mu *sync.Mutex | ||
headeroff, bodyoff, receiptsoff, tdoff, proofsoff []uint64 // offsets for each entry type | ||
indstart int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's this?
Also, please write a description for your PRs and try to fix the CI errors so your code is all green. |
Fixed all issues including extra logic regarding proof types, modularizing some functions and refactoring code for correctness and readability.
internal/era2/era.go
Outdated
|
||
func (*BlockProofHistoricalSummariesDeneb) Variant() proofvar { return proofDeneb } | ||
|
||
func proofVariantOf(p Proof) proofvar { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func proofVariantOf(p Proof) proofvar { | |
func variantOf(p Proof) proofvar { |
internal/era2/builder.go
Outdated
@@ -57,41 +57,33 @@ const ( | |||
|
|||
type proofvar uint16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type proofvar uint16 | |
type variant uint16 |
We know that the variant is for Proofs, should be at most a comment, not part of the variable name, otherwise you end up with the types like mev-boost :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha will make the change :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a lot of style nits, but the main problems we'll need to work through are:
- utilize proof interface more, avoid handling variant values directly
- remove unneeded struct fields
- avoid building entire era file in memory, write incremental work to disk
internal/era2/builder.go
Outdated
buff buffer | ||
off offsets | ||
|
||
prooftype variant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed. If you store the proofs as the interface object you will be able to determine the variant at anytime / alternatively if it makes better sense to store the bytes, then the bytes will already have included the variant type so it won't matter anymore.
internal/era2/builder.go
Outdated
off offsets | ||
|
||
prooftype variant | ||
tdsint []*big.Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed and tds in buffer should be big.Int
s. Serialize them all at once during finalize.
internal/era2/builder.go
Outdated
type Builder struct { | ||
w *e2store.Writer | ||
buf *bytes.Buffer | ||
sn *snappy.Writer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sn *snappy.Writer | |
snappy *snappy.Writer |
internal/era2/era.go
Outdated
} | ||
|
||
// retrieves the raw body frame in bytes of a specific block | ||
func (e *Era) GetRawBodyFrameByNumber(blockNum uint64) ([]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean by "frame" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the raw bytes that are written, without snappy decoding it and rlp decoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should decode the snappy here (and other methods like this) and return the RLP bytes
internal/era2/era.go
Outdated
return io.ReadAll(r) | ||
} | ||
|
||
// retrieves the raw receipts frame in bytes of a specific block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// retrieves the raw receipts frame in bytes of a specific block | |
// GetRawReceiptsFrameByNumber retrieves the raw receipts frame in bytes of a specific block. |
internal/era2/era.go
Outdated
return io.ReadAll(r) | ||
} | ||
|
||
// retrieves the raw proof frame in bytes of a specific block proof |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// retrieves the raw proof frame in bytes of a specific block proof | |
// GetRawProofFrameByNumber retrieves the raw proof frame in bytes of a specific block proof. |
internal/era2/era.go
Outdated
return io.ReadAll(r) | ||
} | ||
|
||
// loads in the index table containing all offsets and caches it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// loads in the index table containing all offsets and caches it | |
// loadIndex loads in the index table containing all offsets and caches it. |
internal/era2/era.go
Outdated
// Getter methods to calculate offset of a specific component in the file. | ||
func (e *Era) headerOff(num uint64) (uint64, error) { return e.indexOffset(num, compHeader) } | ||
func (e *Era) bodyOff(num uint64) (uint64, error) { return e.indexOffset(num, compBody) } | ||
func (e *Era) rcptOff(num uint64) (uint64, error) { return e.indexOffset(num, compReceipts) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (e *Era) rcptOff(num uint64) (uint64, error) { return e.indexOffset(num, compReceipts) } | |
func (e *Era) receiptOff(num uint64) (uint64, error) { return e.indexOffset(num, compReceipts) } |
Really no reason to ever abbreviate by taking out vowels in golang.
cmd/utils/cmd.go
Outdated
@@ -403,8 +410,8 @@ func ExportAppendChain(blockchain *core.BlockChain, fn string, first uint64, las | |||
|
|||
// ExportHistory exports blockchain history into the specified directory, | |||
// following the Era format. | |||
func ExportHistory(bc *core.BlockChain, dir string, first, last, step uint64) error { | |||
log.Info("Exporting blockchain history", "dir", dir) | |||
func ExportHistory(bc *core.BlockChain, dir string, first, last, step uint64, f ExportFormat) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not use an enum to dictate the output format, this should be done via the method naming, e.g.
ExportHistoryEra1(..)
ExportHistoryEraE(..)
cmd/utils/cmd.go
Outdated
if f == Era1 { | ||
filename = era.Filename | ||
newBuilder = func(w io.Writer) any { return era.NewBuilder(w) } | ||
add = func(b any, blk *types.Block, rcpt types.Receipts, td *big.Int) error { | ||
return b.(*era.Builder).Add(blk, rcpt, td) | ||
} | ||
} else { | ||
filename = era2.Filename | ||
newBuilder = func(w io.Writer) any { return era2.NewBuilder(w) } | ||
add = func(b any, blk *types.Block, rcpt types.Receipts, td *big.Int) error { | ||
return b.(*era2.Builder).Add(*blk.Header(), *blk.Body(), rcpt, td, nil) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of impressive, but also is what an Interface
is for :)
If you want different builders with the same methods (like Add
) you can create the interface type and implement the method for both.
receipts := bc.GetReceiptsByHash(block.Hash()) | ||
if receipts == nil { | ||
return fmt.Errorf("export failed on #%d: receipts not found", n) | ||
rcpt := bc.GetReceiptsByHash(blk.Hash()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rcpt := bc.GetReceiptsByHash(blk.Hash()) | |
receipts := bc.GetReceiptsByHash(blk.Hash()) |
|
||
for j := uint64(0); j < step && batch+j <= last; j++ { | ||
n := batch + j | ||
blk := bc.GetBlockByNumber(n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blk := bc.GetBlockByNumber(n) | |
block := bc.GetBlockByNumber(n) |
cmd/utils/cmd.go
Outdated
) | ||
if block == nil { | ||
return fmt.Errorf("export failed on #%d: not found", n) | ||
bldr := newBuilder(f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bldr := newBuilder(f) | |
builder := newBuilder(f) |
internal/era2/builder.go
Outdated
tds []*big.Int | ||
} | ||
|
||
// The offsets holds the offsets of the different block components in the e2store file. Eventually these offsets will be used to write the index table at the end of the file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// The offsets holds the offsets of the different block components in the e2store file. Eventually these offsets will be used to write the index table at the end of the file. | |
// offsets holds the offsets of the different block components in the e2store file. Eventually these offsets will be used to write the index table at the end of the file. |
internal/era2/builder.go
Outdated
buf *bytes.Buffer | ||
|
||
buff buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we still have buf
and buff
, I think we discussed removing buf
since it isn't used until the end?
internal/era2/builder.go
Outdated
// Add writes a block entry, its reciepts, and optionally its proofs as well into the e2store file. | ||
func (b *Builder) Add(header types.Header, body types.Body, receipts types.Receipts, td *big.Int, proof Proof) error { | ||
if len(b.buff.headers) == 0 { // first block determines wether proofs are expected | ||
b.expectsProofs = proof != nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think you need to track this explicitly, just check if b.buff.proofs != nil
and if proof == nil
or vice versa. only special case is first block, which you allow for the b.buff.proofs
to be nil
even if proof is non-nil
.
internal/era2/builder.go
Outdated
if err != nil { | ||
return common.Hash{}, fmt.Errorf("compute accumulator: %w", err) | ||
} | ||
if n, err := b.w.Write(TypeAccumulatorRoot, accRoot[:]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this still needs to be addressed
internal/era2/era.go
Outdated
} | ||
|
||
// retrieves the raw body frame in bytes of a specific block | ||
func (e *Era) GetRawBodyFrameByNumber(blockNum uint64) ([]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should decode the snappy here (and other methods like this) and return the RLP bytes
cmd/utils/era_interface.go
Outdated
type Iterator interface { | ||
Next() bool | ||
Number() uint64 | ||
Block() (*types.Block, error) | ||
Receipts() (types.Receipts, error) | ||
Error() error | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This interface looks right, but it's in the wrong place. We should define it in the era
package. I will take a stab at this.
cmd/utils/era_interface.go
Outdated
func (era1Format) Filename(n string, e int, h common.Hash) string { return era.Filename(n, e, h) } | ||
func (era1Format) NewBuilder(w io.Writer) Builder { return &era1Builder{era.NewBuilder(w)} } | ||
func (era1Format) ReadDir(dir, net string) ([]string, error) { return era.ReadDir(dir, net) } | ||
func (era1Format) NewIterator(f *os.File) (Iterator, error) { | ||
e, err := era.From(f) | ||
if err != nil { | ||
return nil, err | ||
} | ||
return era.NewIterator(e) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldn't need to redefine these methods to satisfy the interface - if the type implements the interface methods then it can be accepted anywhere the interface is accepted
cmd/utils/cmd.go
Outdated
@@ -248,11 +250,11 @@ func readList(filename string) ([]string, error) { | |||
// ImportHistory imports Era1 files containing historical block information, | |||
// starting from genesis. The assumption is held that the provided chain | |||
// segment in Era1 file should all be canonical and verified. | |||
func ImportHistory(chain *core.BlockChain, dir string, network string) error { | |||
func ImportHistory(chain *core.BlockChain, dir string, network string, format Format) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So on this method and ExportHistory we can avoid using this Format
type by just passing in the functions we need. For example, here we need a way to read all the entries and a way to create an iterator from an open file. If we just pass those two functions in, we can create an if statement in the cli handling so that we can reuse the function.
The issue with format is it is kind of a superfluous type.
Here is a draft for the New EraE implementation. The code follows along with the spec listed at this link.