Description
The CSV importer does not increment the createdDocuments counter when importing CSV files as documents, causing misleading log output that shows "Total documents: 0" even when documents are successfully imported.
Steps to Reproduce
- Import a CSV file as documents using the Java importer
- Observe the log output after import completes
Expected Behavior
Log should show:
INFO [CSVImporterFormat] Importing of documents from CSV source completed in 0 seconds (279152/sec)
INFO [CSVImporterFormat] - Parsed lines...: 86538
INFO [CSVImporterFormat] - Total documents: 86537
Actual Behavior
Log shows:
INFO [CSVImporterFormat] Importing of documents from CSV source completed in 0 seconds (0/sec)
INFO [CSVImporterFormat] - Parsed lines...: 86538
INFO [CSVImporterFormat] - Total documents: 0
Documents are imported successfully, but the counter and rate calculation show 0.
Root Cause
In CSVImporterFormat.java, the loadDocuments() method increments context.parsed but never increments context.createdDocuments.
Current code (lines 115-135): the document import loop increments the parsed counter but never increments createdDocuments after document.save().
For comparison, the loadVertices() method in the same file correctly increments context.createdVertices (line 261) after creating each vertex.
Proposed Fix
Add context.createdDocuments.incrementAndGet() after document.save() in the loadDocuments() method, similar to how loadVertices() increments context.createdVertices.
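The effect of the missing increment can be illustrated with a minimal, self-contained sketch. This is not the actual ArcadeDB code: the `Context` class, the `importRows` method, and the `applyFix` flag are hypothetical stand-ins, and the use of `AtomicLong` for the counters is an assumption based on the `incrementAndGet()` call above. The header row is counted as a parsed line, matching the log output in this report.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class CounterSketch {
  // Hypothetical stand-in for the importer context; only the two counters matter here.
  static final class Context {
    final AtomicLong parsed = new AtomicLong();
    final AtomicLong createdDocuments = new AtomicLong();
  }

  // Hypothetical stand-in for the document import loop in loadDocuments().
  static Context importRows(List<String> lines, boolean applyFix) {
    Context context = new Context();
    boolean header = true;
    for (String line : lines) {
      context.parsed.incrementAndGet(); // every line, including the header, counts as parsed
      if (header) {
        header = false; // the header row produces no document
        continue;
      }
      // In the real importer, document.save() would run at this point.
      if (applyFix)
        context.createdDocuments.incrementAndGet(); // the increment this issue proposes
    }
    return context;
  }

  public static void main(String[] args) {
    List<String> lines = List.of("id,name", "1,a", "2,b");
    Context before = importRows(lines, false);
    Context after = importRows(lines, true);
    System.out.println("Without fix: parsed=" + before.parsed.get()
        + " documents=" + before.createdDocuments.get());
    System.out.println("With fix:    parsed=" + after.parsed.get()
        + " documents=" + after.createdDocuments.get());
  }
}
```

With the increment in place, the document count ends up one below the parsed line count (the header row), which is the relationship shown under Expected Behavior.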
Impact
- Severity: Low - cosmetic logging issue only
- Data integrity: Not affected - documents are imported correctly
- User confusion: Medium - misleading logs suggest import failed
- Workaround: Check "Parsed lines" count instead of "Total documents"
Environment
- ArcadeDB version: Latest (as of 2025-10-25)
- File: CSVImporterFormat.java
- Method: loadDocuments() (lines ~100-150)
Additional Notes
This affects downstream projects like the Python bindings (arcadedb-embedded) which rely on these counters to report import statistics. The Python wrapper currently works around this by using the parsed count minus 1 (header row) to report the actual document count. Tested on 25.10.1-SNAPSHOT.
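The workaround described above amounts to the arithmetic below, shown in Java for consistency with the importer rather than in the wrapper's own Python. The class and method names are hypothetical, and the sketch assumes exactly one header row and that every other parsed line became a document.

```java
public class ParsedCountWorkaround {
  // Hypothetical helper: derive a document count from the parsed line count.
  static long estimateDocuments(long parsedLines, boolean hasHeader) {
    long documents = hasHeader ? parsedLines - 1 : parsedLines;
    return Math.max(documents, 0); // guard against an empty file
  }

  public static void main(String[] args) {
    // Figures taken from the log output in this report.
    System.out.println(estimateDocuments(86538, true)); // prints 86537
  }
}
```

This estimate is only as reliable as its assumptions: skipped or malformed lines would make it overcount, which is one more reason to fix the counter in the importer itself.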