Description
Hi
Similar to existing issues:
- s3.to_parquet() - Changes columns names from DataFrame. Adds an '_' #515
- wr.s3.to_parquet(table=table) create table name different than specified #483
I have Glue tables with '-' in their names, and fields with '.' in theirs, and while I understand these characters are not supported, they do work!
I'm trying to use this package to help write some Parquet files, but the column-name sanitization "feature" is blocking me.
Table
Table I'm trying to add data to:
Dataset
Dataset I'm trying to write:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 p_version 1 non-null object
1 asset 1 non-null object
2 date 1 non-null object
3 meta.format 1 non-null object
...
Sample:
df_flat.head()
Code
import awswrangler as wr

wr.s3.to_parquet(
    df=df_flat,
    path=path,
    dataset=True,
    mode="append",
    database=database,
    table=table,
    sanitize_columns=False,  # ignored!
    partition_cols=partition_cols,
    schema_evolution=False,  # prevent accidental Catalogue updates
)
Error
InvalidArgumentValue: Schema change detected: New column meta_format with type string. Please pass schema_evolution=True to allow new columns behaviour.
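The error looks consistent with the column 'meta.format' being renamed to 'meta_format' before the schema comparison, so the renamed column no longer matches the existing catalog column and is reported as new. Below is a minimal sketch of that reasoning using only the standard library. The `approx_sanitize` helper is a hypothetical stand-in, not the package's actual sanitizer (which may apply further rules), and `catalog_columns` is an assumption that simply mirrors the DataFrame columns listed above.

```python
import re


def approx_sanitize(name: str) -> str:
    """Hypothetical stand-in for the package's sanitizer:
    replace every character outside [A-Za-z0-9_] with '_' and lowercase."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name).lower()


def columns_seen_as_new(df_columns, catalog_columns):
    """Map original name -> sanitized name for every column whose
    sanitized name does not exist in the catalog."""
    catalog = set(catalog_columns)
    return {
        col: approx_sanitize(col)
        for col in df_columns
        if approx_sanitize(col) not in catalog
    }


# First four columns from the DataFrame summary above.
df_columns = ["p_version", "asset", "date", "meta.format"]
# Assumption: the Glue table already has the same (unsanitized) names.
catalog_columns = ["p_version", "asset", "date", "meta.format"]

flagged = columns_seen_as_new(df_columns, catalog_columns)
print(flagged)
```

If the sketch is right, only the dotted column is flagged, which matches the single column named in the error message.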
Seeking advice.