Question about pandera
I don't fully understand when pandera considers a row to contain a null value and drops it during validation. In the example below, why is the row containing None (None being the way to specify nulls in ibis, as clarified in ibis-project/ibis#11602) still present in the final output? From how I read this part of the docs and this part of the docs, I would expect the code to either raise or drop the row. Can someone please clarify this for me? 🙏
import ibis
import pandera.ibis as pa

pa_schema = pa.DataFrameSchema({
    "a": pa.Column(int),
    "b": pa.Column(int),
}, drop_invalid_rows=True)

schema_nullable = ibis.schema(
    {"a": ibis.dtype("int"), "b": ibis.dtype("int")}
)
table = ibis.memtable([{"a": 42, "b": 43}, {"a": 44, "b": None}], schema=schema_nullable)

validated = pa_schema.validate(table, lazy=True)
print("Read with `schema_nullable`, validated against `pa_schema`")
print(validated.execute())

results in
> uv run main.py
Read with `schema_nullable`, validated against `pa_schema`
    a     b
0  42  43.0
1  44   NaN

- pandera: 0.26.1
- ibis: 10.8.0
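
To make the expectation concrete, here is a plain-Python sketch of the semantics I expected from `drop_invalid_rows=True` when the columns are not nullable. It is not pandera's implementation and does not use pandera or ibis at all. `drop_rows_with_nulls` is a hypothetical helper written only for illustration, and it assumes that a column is non-nullable unless stated otherwise.

```python
from typing import Any

Row = dict[str, Any]


def drop_rows_with_nulls(rows: list[Row], non_nullable: list[str]) -> list[Row]:
    """Hypothetical helper: keep only rows whose non-nullable columns are all set."""
    return [
        row
        for row in rows
        # A row is treated as invalid if any non-nullable column is missing or None.
        if all(row.get(col) is not None for col in non_nullable)
    ]


# Same data as in the example above.
rows = [{"a": 42, "b": 43}, {"a": 44, "b": None}]

# Assumption: both "a" and "b" are non-nullable.
expected = drop_rows_with_nulls(rows, non_nullable=["a", "b"])
print(expected)
```

Under these assumptions only the first row survives, which is what I expected the validated table to contain.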