GH-40278: [C++] Support casting string to duration in CSV converter (#46035)
### Rationale for this change
Currently, the Arrow C++ CSV converter does not support parsing strings into `duration` types. This limits CSV ingestion capabilities when handling datasets with time-based intervals represented as numeric strings (e.g., `1000`, `2000000`). This PR adds support for parsing such strings into Arrow's `DurationType`.
Note: Human-readable duration formats such as `1s`, `2m`, or `3h` are not supported in this PR. Support for those formats may be considered in a future enhancement.
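For illustration only, here is a minimal standalone sketch of the kind of parsing described above. It does not use Arrow and is not the code added in this PR: `ParseDurationCell`, `ParseStatus`, `ParsedDuration` and `TrimSpaces` are hypothetical names, and the trimming of surrounding spaces and the handling of the null token are assumptions made for the example. The parsed integer is a raw count; its unit (s, ms, µs, ns) would come from the target column type, not from the string.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical result type for this sketch (not an Arrow type).
enum class ParseStatus { kValue, kNull, kError };

struct ParsedDuration {
  ParseStatus status;
  std::int64_t count;  // meaningful only when status == ParseStatus::kValue
};

// Strip leading and trailing spaces/tabs (assumed behavior for this sketch).
inline std::string_view TrimSpaces(std::string_view s) {
  while (!s.empty() && (s.front() == ' ' || s.front() == '\t')) s.remove_prefix(1);
  while (!s.empty() && (s.back() == ' ' || s.back() == '\t')) s.remove_suffix(1);
  return s;
}

// Parse one CSV cell as a signed 64-bit duration count.
// A cell equal to `null_token` is reported as null; anything that is not
// a plain integer (for example "1s" or "abc") is reported as an error.
inline ParsedDuration ParseDurationCell(std::string_view cell,
                                        std::string_view null_token = "") {
  if (cell == null_token) {
    return {ParseStatus::kNull, 0};
  }
  const std::string_view s = TrimSpaces(cell);
  std::int64_t value = 0;
  const char* end = s.data() + s.size();
  const auto [ptr, ec] = std::from_chars(s.data(), end, value);
  // Require that the whole cell was consumed, so trailing unit suffixes fail.
  if (ec != std::errc() || ptr != end) {
    return {ParseStatus::kError, 0};
  }
  return {ParseStatus::kValue, value};
}
```

With this sketch, `ParseDurationCell("1000")` yields a value with count 1000, `ParseDurationCell("N/A", "N/A")` yields a null, and `ParseDurationCell("1s")` yields an error, mirroring the note that human-readable formats are out of scope.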
### What changes are included in this PR?
- Added `DurationValueDecoder` using `StringConverter<DurationType>`
- Registered support in both standard and dictionary converters
- Added unit tests covering:
  - Basic parsing across all time units (s, ms, µs, ns)
  - Null and custom null values
  - Whitespace handling and error cases
### Are these changes tested?
Yes, conversion logic is fully covered by new tests in `converter_test.cc`.
### Are there any user-facing changes?
Yes, users can now convert numeric duration strings in CSV files to Arrow `duration` arrays by specifying a `duration` type, with the desired time unit, for the corresponding column.
* GitHub Issue: #40278
Lead-authored-by: Zihan Qi <zihan.qi@tum.de>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: Antoine Pitrou <antoine@python.org>