CSV Importer: createdDocuments counter not incremented for document imports #2700

@tae898

Description

The CSV importer does not increment the createdDocuments counter when importing CSV files as documents, causing misleading log output that shows "Total documents: 0" even when documents are successfully imported.

Steps to Reproduce

  1. Import a CSV file as documents using the Java importer
  2. Observe the log output after import completes

Expected Behavior

Log should show:

INFO [CSVImporterFormat] Importing of documents from CSV source completed in 0 seconds (279152/sec)
INFO [CSVImporterFormat] - Parsed lines...: 86538
INFO [CSVImporterFormat] - Total documents: 86537

Actual Behavior

Log shows:

INFO [CSVImporterFormat] Importing of documents from CSV source completed in 0 seconds (0/sec)
INFO [CSVImporterFormat] - Parsed lines...: 86538
INFO [CSVImporterFormat] - Total documents: 0

Documents are imported successfully, but the counter and rate calculation show 0.

Root Cause

In CSVImporterFormat.java, the loadDocuments() method increments context.parsed but never increments context.createdDocuments.

Current code (lines 115-135):

The document import loop increments the parsed counter but is missing the createdDocuments increment after document.save().

For comparison, the loadVertices() method in the same file correctly increments context.createdVertices (line 261) after creating each vertex.

Proposed Fix

Add context.createdDocuments.incrementAndGet() after document.save() in the loadDocuments() method, similar to how loadVertices() increments context.createdVertices.
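
The proposed change can be sketched in isolation as follows. This is a minimal illustration, not the actual CSVImporterFormat.java source: ImportCounters, LoadDocumentsSketch and save() are hypothetical stand-ins, and the only details taken from this issue are the counter names, the incrementAndGet() call, and the fact that the parsed count includes the header row.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the importer context; only the two counters
// named in this issue are modelled.
final class ImportCounters {
  final AtomicLong parsed = new AtomicLong();
  final AtomicLong createdDocuments = new AtomicLong();
}

final class LoadDocumentsSketch {
  // Simplified import loop: the first row is treated as the header,
  // every following row becomes one document.
  static void loadDocuments(final List<String[]> rows, final ImportCounters context) {
    boolean header = true;
    for (final String[] row : rows) {
      context.parsed.incrementAndGet(); // counts every parsed line, header included
      if (header) {
        header = false;
        continue;
      }
      save(row); // placeholder for document.save()
      context.createdDocuments.incrementAndGet(); // the increment this issue proposes to add
    }
  }

  // No-op placeholder; the real importer persists the document here.
  private static void save(final String[] row) {
  }

  public static void main(final String[] args) {
    final List<String[]> rows = List.of(
        new String[] { "id", "name" },
        new String[] { "1", "a" },
        new String[] { "2", "b" });
    final ImportCounters context = new ImportCounters();
    loadDocuments(rows, context);
    System.out.println("Parsed lines: " + context.parsed.get());              // 3
    System.out.println("Total documents: " + context.createdDocuments.get()); // 2
  }
}
```

Removing the createdDocuments increment from the loop reproduces the symptom reported here: the parsed count is still 3 while the document count stays at 0.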

Impact

  • Severity: Low - cosmetic logging issue only
  • Data integrity: Not affected - documents are imported correctly
  • User confusion: Medium - misleading logs suggest import failed
  • Workaround: Check "Parsed lines" count instead of "Total documents"

Environment

  • ArcadeDB version: Latest (as of 2025-10-25)
  • File: CSVImporterFormat.java
  • Method: loadDocuments() (lines ~100-150)

Additional Notes

This affects downstream projects like the Python bindings (arcadedb-embedded) which rely on these counters to report import statistics. The Python wrapper currently works around this by using the parsed count minus 1 (header row) to report the actual document count. Tested on 25.10.1-SNAPSHOT.
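
For callers that need a usable number before a fix lands, the workaround described above can be sketched like this. It is written in Java for consistency with the importer rather than taken from the Python wrapper; ImportStatsWorkaround and documentCount() are hypothetical names, and preferring the real counter whenever it is non-zero is an assumption added here so the helper keeps working once the counter is fixed.

```java
// Hypothetical helper, not part of ArcadeDB or the Python bindings.
final class ImportStatsWorkaround {
  // Returns the number of imported documents, falling back to the parsed
  // line count (minus the header row, if any) while createdDocuments is 0.
  static long documentCount(final long parsedLines, final long createdDocuments, final boolean hasHeader) {
    if (createdDocuments > 0)
      return createdDocuments; // counter is populated, trust it
    final long fallback = hasHeader ? parsedLines - 1 : parsedLines;
    return Math.max(0, fallback); // never report a negative count
  }

  public static void main(final String[] args) {
    // Values from the log output in this issue: 86538 parsed lines, counter stuck at 0.
    System.out.println(documentCount(86538, 0, true)); // 86537
  }
}
```

One caveat with this approach: an import that parses lines but legitimately creates no documents is indistinguishable from the bug, so the fallback over-reports in that case.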
