TypeError: Cannot cast array data from dtype('O') to dtype('bool') according to the rule 'safe' #13

@jstbryan

Hi,

I was running the recipe/plugin in Dataiku and encountered the above error.
Below is an extract of the traceback.


*************** Recipe code failed **************
[09:50:49] [INFO] [dku.utils] - Begin Python stack
[09:50:49] [INFO] [dku.utils] - Traceback (most recent call last):
[09:50:49] [INFO] [dku.utils] - File "pandas/_libs/parsers.pyx", line 1156, in pandas._libs.parsers.TextReader._convert_tokens
[09:50:49] [INFO] [dku.utils] - TypeError: Cannot cast array data from dtype('O') to dtype('bool') according to the rule 'safe'
[09:50:49] [INFO] [dku.utils] - During handling of the above exception, another exception occurred:
[09:50:49] [INFO] [dku.utils] - Traceback (most recent call last):
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/jobs/compute_Responses_Lemmatize_NP/custom-python-recipe/pyoutAqiJYccVwKv8/python-exec-wrapper.py", line 208, in <module>
[09:50:49] [INFO] [dku.utils] - exec(f.read())
[09:50:49] [INFO] [dku.utils] - File "<string>", line 27, in <module>
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/plugins/installed/nlp-preparation/python-lib/dku_io_utils.py", line 79, in process_dataset_chunks
[09:50:49] [INFO] [dku.utils] - for i, df in tqdm(enumerate(df_iterator), total=len_iterator, unit="chunk", mininterval=1.0):
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/code-envs/python/plugin_nlp-preparation_managed/lib/python3.6/site-packages/tqdm/std.py", line 1178, in __iter__
[09:50:49] [INFO] [dku.utils] - for obj in iterable:
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-9.0.4/python/dataiku/core/dataset.py", line 611, in iter_dataframes
[09:50:49] [INFO] [dku.utils] - for df in df_it:
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/code-envs/python/plugin_nlp-preparation_managed/lib64/python3.6/site-packages/pandas/io/parsers.py", line 1007, in __next__
[09:50:49] [INFO] [dku.utils] - return self.get_chunk()
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/code-envs/python/plugin_nlp-preparation_managed/lib64/python3.6/site-packages/pandas/io/parsers.py", line 1070, in get_chunk
[09:50:49] [INFO] [dku.utils] - return self.read(nrows=size)
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/code-envs/python/plugin_nlp-preparation_managed/lib64/python3.6/site-packages/pandas/io/parsers.py", line 1036, in read
[09:50:49] [INFO] [dku.utils] - ret = self._engine.read(nrows)
[09:50:49] [INFO] [dku.utils] - File "/home/dataiku/dss/code-envs/python/plugin_nlp-preparation_managed/lib64/python3.6/site-packages/pandas/io/parsers.py", line 1848, in read
[09:50:49] [INFO] [dku.utils] - data = self._reader.read(nrows)
[09:50:49] [INFO] [dku.utils] - File "pandas/_libs/parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
[09:50:49] [INFO] [dku.utils] - File "pandas/_libs/parsers.pyx", line 903, in pandas._libs.parsers.TextReader._read_low_memory
[09:50:49] [INFO] [dku.utils] - File "pandas/_libs/parsers.pyx", line 968, in pandas._libs.parsers.TextReader._read_rows
[09:50:49] [INFO] [dku.utils] - File "pandas/_libs/parsers.pyx", line 1094, in pandas._libs.parsers.TextReader._convert_column_data
[09:50:49] [INFO] [dku.utils] - File "pandas/_libs/parsers.pyx", line 1164, in pandas._libs.parsers.TextReader._convert_tokens
[09:50:49] [INFO] [dku.utils] - ValueError: cannot safely convert passed user dtype of bool for object dtyped data in column 32
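
The last frames indicate that pandas was asked to read column 32 as `bool` but parsed it as object (string) data. A minimal sketch of how that kind of failure can arise outside Dataiku is shown here, assuming the cause is a column declared as boolean that contains at least one non-boolean value. The sample data, the `flag` column name, and the `try_read` helper are hypothetical and are not taken from the plugin or from the failing dataset.

```python
import io

import pandas as pd

# Hypothetical sample data: the "flag" column is mostly boolean,
# but one row holds a token that cannot be parsed as a boolean.
CSV_TEXT = "text,flag\nhello,true\nworld,not_a_bool\n"


def try_read(dtype):
    """Read CSV_TEXT with the given dtype mapping; return (frame, error)."""
    try:
        return pd.read_csv(io.StringIO(CSV_TEXT), dtype=dtype), None
    except (TypeError, ValueError) as exc:
        return None, exc


# Forcing bool on a column that holds a non-boolean token is expected to fail.
_, strict_error = try_read({"flag": bool})
print("bool dtype:", type(strict_error).__name__, strict_error)

# Reading the same column as plain objects (strings) skips the bool cast.
loose_frame, loose_error = try_read({"flag": object})
print("object dtype:", loose_frame["flag"].tolist())
```

If the real dataset behaves the same way, inspecting the values stored in the column that pandas reports as column 32, and comparing them with the type declared in the dataset schema, may help narrow down the offending rows.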


This happened for both spell checking and text cleaning.
I hope you can shed some light on this.

Thank you

Labels: bug (Something isn't working)
