-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Description
Is your feature request related to a problem or challenge?
As part of #11060, @devinjdangelo made file format support into a Trait which is good!
However the code to serialize these new (dynamic) structures is not yet implemented
As @devinjdangelo says https://github.com/apache/datafusion/pull/11060/files#r1650268578
Users depending on the ability to serialize COPY plans (e.g. ballista) will need this TODO to be completed before upgrading to any version of datafusion including this change.
It turns out there are no unit tests for them either so no tests failed
Describe the solution you'd like
Implement the named codec for serializing plans and a test for it
Describe alternatives you've considered
The code is here: datafusion/proto/src/logical_plan/file_formats.rs
The test would go here: https://github.com/apache/datafusion/blob/main/datafusion/proto/tests/cases/roundtrip_physical_plan.rs
Note there is already coverage for LogicalPlans here:
datafusion/datafusion/proto/tests/cases/roundtrip_logical_plan.rs
Lines 325 to 346 in d2ff218
async fn roundtrip_logical_plan_copy_to_sql_options() -> Result<()> { | |
let ctx = SessionContext::new(); | |
let input = create_csv_scan(&ctx).await?; | |
let mut table_options = ctx.copied_table_options(); | |
table_options.set_file_format(FileType::CSV); | |
table_options.set("format.delimiter", ";")?; | |
let plan = LogicalPlan::Copy(CopyTo { | |
input: Arc::new(input), | |
output_url: "test.csv".to_string(), | |
partition_by: vec!["a".to_string(), "b".to_string(), "c".to_string()], | |
format_options: FormatOptions::CSV(table_options.csv.clone()), | |
options: Default::default(), | |
}); | |
let bytes = logical_plan_to_bytes(&plan)?; | |
let logical_round_trip = logical_plan_from_bytes(&bytes, &ctx)?; | |
assert_eq!(format!("{plan:?}"), format!("{logical_round_trip:?}")); | |
Ok(()) | |
} |
Additional context
There are several other codecs needed:
JsonLogicalExtensionCodec
ParquetLogicalExtensionCodec
ArrowLogicalExtensionCodec
AvroLogicalExtensionCodec
However, I think we need to get one example done and then we can file tickets to fill out the others
Maybe this is what @lewiszlw was getting at with #11095 / https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/composed_extension_codec.rs