Performance Issue on Large Datasets #12
are there any performance tips when trying to build an excel file with 500k+ rows?
Hi @myoder020, sorry for the super late reply. Currently there isn't a real workaround for this: you have to load the whole dataset in memory to export data. I experimented with the SheetJS stream API a while ago, but it's not easy to implement while keeping the nice DX provided by the lib. Do you still need this? If yes, I can spend some time working on it.
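For context, the in-memory approach described above looks roughly like this with plain SheetJS (a generic SheetJS sketch, not this library's API; the `rows` dataset is made up for illustration):

```ts
import * as XLSX from 'xlsx';

// Hypothetical dataset: the whole thing must exist in memory at once,
// which is what makes very large exports problematic.
const rows = Array.from({ length: 500_000 }, (_, i) => ({ id: i, name: `user-${i}` }));

// SheetJS builds the entire worksheet and workbook in memory...
const worksheet = XLSX.utils.json_to_sheet(rows);
const workbook = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(workbook, worksheet, 'Users');

// ...and only then serializes it to disk, so peak memory scales with row count.
XLSX.writeFile(workbook, 'users.xlsx');
```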
@myoder020 SheetJS offered very poor stream capabilities, which made this nearly impossible to implement while the library was based on SheetJS. That's why I rewrote the internals to use ExcelJS instead, which offers much better stream support. I managed to fully implement a new stream-based builder in addition to the base builder. Here's an example I managed to run:

```ts
const streamBuilder = ExcelStreamBuilder.create({
  filePath,
  chunkMaxSize: 5000, // Process in chunks of 5000 rows
})

const userSheet = streamBuilder.sheet('Users', {
  schema: userSchema,
  title: 'User Report - 1,000,000 Records',
})

const TOTAL_ROWS = 1_000_000
const CHUNK_SIZE = 10_000

for (let chunkIndex = 0; chunkIndex < TOTAL_ROWS / CHUNK_SIZE; chunkIndex++) {
  const userChunk: User[] = Array.from({ length: CHUNK_SIZE }, () => generateUser())
  await userSheet.addChunk(userChunk)
}

await streamBuilder.save()
```

```
Stream Excel Generation: 33.077s
Excel file created at: ./examples/stream-test-100k.xlsx
File size: 60.29 MB
```

The v2 release will happen in the coming week with these new streaming capabilities. It will also support custom write streams:

```js
const fs = require('fs');
const { ExcelStreamBuilder } = require('your-excel-library');

// Create a write stream
const writeStream = fs.createWriteStream('large-report.xlsx');

// Attach stream event handlers before saving, so 'finish' isn't missed
writeStream.on('finish', () => {
  console.log('Excel file has been written successfully');
});

writeStream.on('error', (err) => {
  console.error('Error writing Excel file:', err);
});

// Initialize the ExcelStreamBuilder with a stream instead of filePath
const streamBuilder = ExcelStreamBuilder.create({
  stream: writeStream,
  chunkMaxSize: 5000, // Process in chunks of 5000 rows
});

const userSheet = streamBuilder.sheet('Users', {
  schema: userSchema,
  title: 'User Report - 1,000,000 Records',
});

const TOTAL_ROWS = 1_000_000;
const CHUNK_SIZE = 10_000;

// Generate and add data in chunks
for (let chunkIndex = 0; chunkIndex < TOTAL_ROWS / CHUNK_SIZE; chunkIndex++) {
  const userChunk = Array.from({ length: CHUNK_SIZE }, () => generateUser());
  await userSheet.addChunk(userChunk);
  // Optional: log progress
  console.log(`Processed ${(chunkIndex + 1) * CHUNK_SIZE} rows`);
}

// Finish and close the stream
await streamBuilder.save();
```
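For anyone curious what the ExcelJS rewrite builds on: below is a minimal sketch of ExcelJS's own streaming `WorkbookWriter`, the underlying mechanism that makes constant-memory writes possible. This is the raw ExcelJS API, not this library's actual internals; the column definitions and row data are made up for illustration.

```ts
import * as ExcelJS from 'exceljs';

// The streaming WorkbookWriter flushes rows to disk as they are committed,
// instead of keeping the whole workbook in memory.
const workbook = new ExcelJS.stream.xlsx.WorkbookWriter({
  filename: 'stream-demo.xlsx', // a `stream` option is also accepted instead of `filename`
  useStyles: false,        // skipping styles keeps memory usage lower
  useSharedStrings: false, // inline strings avoid a growing shared-string table
});

const sheet = workbook.addWorksheet('Users');
sheet.columns = [
  { header: 'Id', key: 'id' },
  { header: 'Name', key: 'name' },
];

for (let i = 0; i < 1_000_000; i++) {
  // commit() flushes the row to the stream so it can be garbage-collected
  sheet.addRow({ id: i, name: `user-${i}` }).commit();
}

// Finalize the archive and flush everything to disk
await workbook.commit();
```

Keeping `useSharedStrings` off trades slightly larger files for flat memory usage, which is usually the right call at 500k+ rows.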