Open
Labels
bug: Incorrect behavior inside of ibis
Description
What happened?
The following script fails with pyarrow.lib.ArrowInvalid on the first table.execute() call:
import ibis
schema = ibis.schema(
{"a": ibis.dtype("int", nullable=False), "b": ibis.dtype("int", nullable=True)}
)
table = ibis.memtable([{"a": 42, "b": 43}, {"a": 44, "b": ibis.null()}], schema=schema)
print("Original schema:")
print(table.schema())
print("Schema after dropping nulls:")
print(table.drop_null().schema())
print(table.execute())
print(table.drop_null().execute())
Output:
> uv run main.py
Original schema:
ibis.Schema {
a !int64
b int64
}
Schema after dropping nulls:
ibis.Schema {
a !int64
b int64
}
Traceback (most recent call last):
File "project/.venv/lib/python3.13/site-packages/ibis/backends/duckdb/__init__.py", line 1716, in _register_in_memory_table
obj = data.to_pyarrow_dataset(schema)
^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'PandasDataFrameProxy' object has no attribute 'to_pyarrow_dataset'. Did you mean: 'to_pyarrow_bytes'?
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "project/main.py", line 13, in <module>
print(table.execute())
~~~~~~~~~~~~~^^
File "project/.venv/lib/python3.13/site-packages/ibis/expr/types/core.py", line 424, in execute
return self._find_backend(use_default=True).execute(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
self, limit=limit, params=params, **kwargs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "project/.venv/lib/python3.13/site-packages/ibis/backends/duckdb/__init__.py", line 1398, in execute
rel = self._to_duckdb_relation(expr, params=params, limit=limit, **kwargs)
File "project/.venv/lib/python3.13/site-packages/ibis/backends/duckdb/__init__.py", line 1316, in _to_duckdb_relation
self._run_pre_execute_hooks(expr)
~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
File "project/.venv/lib/python3.13/site-packages/ibis/backends/duckdb/__init__.py", line 1297, in _run_pre_execute_hooks
super()._run_pre_execute_hooks(expr)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
File "project/.venv/lib/python3.13/site-packages/ibis/backends/__init__.py", line 1321, in _run_pre_execute_hooks
self._register_in_memory_tables(expr)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
File "project/.venv/lib/python3.13/site-packages/ibis/backends/__init__.py", line 1298, in _register_in_memory_tables
self._register_in_memory_table(memtable)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
File "project/.venv/lib/python3.13/site-packages/ibis/backends/duckdb/__init__.py", line 1718, in _register_in_memory_table
obj = data.to_pyarrow(schema)
File "project/.venv/lib/python3.13/site-packages/ibis/formats/pandas.py", line 436, in to_pyarrow
return pa.Table.from_pandas(obj, schema=pyarrow_schema)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pyarrow/table.pxi", line 4795, in pyarrow.lib.Table.from_pandas
File "project/.venv/lib/python3.13/site-packages/pyarrow/pandas_compat.py", line 637, in dataframe_to_arrays
arrays = [convert_column(c, f)
~~~~~~~~~~~~~~^^^^^^
File "project/.venv/lib/python3.13/site-packages/pyarrow/pandas_compat.py", line 625, in convert_column
raise e
File "project/.venv/lib/python3.13/site-packages/pyarrow/pandas_compat.py", line 619, in convert_column
result = pa.array(col, type=type_, from_pandas=True, safe=safe)
File "pyarrow/array.pxi", line 365, in pyarrow.lib.array
File "pyarrow/array.pxi", line 91, in pyarrow.lib._ndarray_to_array
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: ('Could not convert None with type NullScalar: tried to convert to int64', 'Conversion failed for column b with type object')
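A note on reading the traceback: the two stacked exceptions suggest that the backend first tries data.to_pyarrow_dataset(schema) and then falls back to data.to_pyarrow(schema), so the AttributeError is likely incidental and the ArrowInvalid is the actual failure. The sketch below reproduces that chaining pattern in plain Python. FrameProxy, register, and describe_failure are hypothetical names for illustration, and the try/except shape is an assumption inferred from the traceback, not ibis's actual code.

```python
class FrameProxy:
    """Hypothetical stand-in for a data proxy that lacks `to_pyarrow_dataset`."""

    def __init__(self, rows):
        self.rows = rows

    def to_pyarrow(self, schema):
        # Simulate a conversion that rejects anything other than an int or None.
        for row in self.rows:
            for name in schema:
                value = row[name]
                if value is not None and not isinstance(value, int):
                    raise ValueError(f"Conversion failed for column {name}")
        return self.rows


def register(data, schema):
    try:
        # First attempt: a method this proxy does not define.
        return data.to_pyarrow_dataset(schema)
    except AttributeError:
        # Fallback: if this raises too, Python reports both exceptions,
        # joined by "During handling of the above exception, ...".
        return data.to_pyarrow(schema)


def describe_failure(data, schema):
    try:
        register(data, schema)
    except ValueError as exc:
        # __context__ is the exception that was being handled
        # when the second one was raised.
        return type(exc).__name__, type(exc.__context__).__name__
    return None
```

With a row whose b value is an arbitrary object (standing in for ibis.null()), describe_failure reports the conversion error with the AttributeError attached as its context. With plain integers it returns None, because the fallback succeeds.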
One side question on the schemas above: after explicitly dropping nulls, I'd expect the targeted columns to be marked as non-nullable, i.e. I'd expect the schema after dropping nulls to read:
ibis.Schema {
a !int64
b !int64 # mind the !
}
Can you please shed some light on why this is not the case? It's probably expected, but I don't understand why 🤔 Thanks! 🙏
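To make the expectation concrete, here is a minimal sketch of the rule I have in mind, written against a plain dict rather than an ibis schema. It only describes the behavior I expected, not what ibis does. The function name expected_schema_after_drop_null is hypothetical, and the subset and how arguments are assumed to mirror the parameters of drop_null.

```python
def expected_schema_after_drop_null(schema, subset=None, how="any"):
    """Return the schema I would expect after dropping nulls.

    `schema` maps a column name to a (dtype name, nullable) pair.
    """
    targeted = set(schema) if subset is None else set(subset)
    # With how="any", a row is dropped if any targeted column is null, so
    # every targeted column is non-null afterwards. With how="all", that
    # only holds when a single column is targeted.
    guaranteed = how == "any" or len(targeted) == 1
    result = {}
    for name, (dtype, nullable) in schema.items():
        if guaranteed and name in targeted:
            nullable = False
        result[name] = (dtype, nullable)
    return result


def render(schema):
    """Format the schema with a leading '!' on non-nullable types."""
    return "\n".join(
        f"{name} {'' if nullable else '!'}{dtype}"
        for name, (dtype, nullable) in schema.items()
    )


original = {"a": ("int64", False), "b": ("int64", True)}
print(render(expected_schema_after_drop_null(original)))
```

Under this rule both columns come out as !int64 for the default arguments, while how="all" across both columns leaves b nullable, since a row survives as long as at least one targeted value is present.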
What version of ibis are you using?
Python: 3.13
ibis: 10.8.0
What backend(s) are you using, if any?
duckdb, python package version 1.3.2
Projects
Status
backlog