Description
Since the v3.5.0 release and PR #440, the Cursor methods catalogs(), schemas(), tables() and columns() return the new ColumnTable objects when fetching the results with fetchall_arrow(), despite pyarrow being installed:
Since 3.5.0:
from databricks import sql
import os
conn = sql.connect(
    server_hostname=os.getenv("DATABRICKS_HOST"),
    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
    access_token=os.getenv("DATABRICKS_TOKEN"),
)
cursor = conn.cursor()
cursor.catalogs()
type(cursor.fetchall_arrow()) # databricks.sql.utils.ColumnTable
Prior to 3.5.0, databricks-sql-python would respect the user's wish and return a pyarrow table, i.e. the output of the above code is pyarrow.lib.Table.
As far as I understand, the goal of #440 was to return ColumnTable objects only if pyarrow is not installed. But since 3.5.0 the behaviour of fetchall_arrow() when pyarrow is installed is inconsistent: for cursor.execute() followed by cursor.fetchall_arrow() it returns pyarrow tables, but for the catalogs(), schemas(), tables() and columns() methods it now returns ColumnTable objects when it should really return pyarrow tables.
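To make the inconsistency easy to check across versions, here is a minimal sketch of a helper that records the type returned by each fetch path. The helper names (qualified_type_name, fetch_result_types, is_consistent) are my own and not part of databricks-sql-python; the helper only inspects whatever object a callable returns.

```python
from typing import Any, Callable, Dict


def qualified_type_name(obj: Any) -> str:
    """Return the module-qualified class name of obj, e.g. 'builtins.int'."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def fetch_result_types(fetchers: Dict[str, Callable[[], Any]]) -> Dict[str, str]:
    """Call each zero-argument fetcher and record the type of its result.

    `fetchers` maps a label (hypothetical, chosen by the caller) to a
    callable that runs one fetch path and returns the fetched object.
    """
    return {label: qualified_type_name(fetch()) for label, fetch in fetchers.items()}


def is_consistent(result_types: Dict[str, str]) -> bool:
    """True if every fetch path returned an object of the same type."""
    return len(set(result_types.values())) <= 1
```

With the cursor from the snippet above, one fetcher would call cursor.execute("SELECT 1") and return cursor.fetchall_arrow(), and another would call cursor.catalogs() and return cursor.fetchall_arrow(). Based on the behaviour described here, I would expect is_consistent() to be False on 3.5.0 with pyarrow installed and True before 3.5.0.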