CastError: Couldn't cast symbol: string #78

@harsshal

Program:
from datasets import load_dataset

# Load the Yahoo Finance dataset from Hugging Face
dataset = load_dataset("bwzheng2010/yahoo-finance-data")

# Inspect available splits
print(dataset)

# Access the 'train' split (or whichever split is available)
data = dataset["train"]

# Filter rows where the ticker is AAPL
aapl_data = data.filter(lambda x: x["ticker"] == "AAPL")

# Display the first few rows
print(aapl_data[:5])

Error:
Resolving data files: 100%|█████████████████████████████████████████████████████| 19/19 [00:00<00:00, 32329.32it/s]
Downloading data: 100%|█████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 45.78files/s]
Generating train split: 0 examples [00:00, ? examples/s]
Failed to read file '/home/harshal/.cache/huggingface/hub/datasets--bwzheng2010--yahoo-finance-data/snapshots/cb209fa251093af17213d35b6295a306787a8fca/data/exchange_rate.parquet' with error CastError: Couldn't cast
symbol: string
report_date: string
open: decimal128(38, 2)
close: decimal128(38, 2)
high: decimal128(38, 2)
low: decimal128(38, 2)
-- schema metadata --
parquet.avro.schema: '{"type":"record","name":"ExchangeRate","fields":[{"' + 482
writer.model.name: 'avro'
to
{'report_date': Value('string'), 'bc1_month': Value('decimal128(16, 4)'), 'bc2_month': Value('decimal128(16, 4)'), 'bc3_month': Value('decimal128(16, 4)'), 'bc6_month': Value('decimal128(16, 4)'), 'bc1_year': Value('decimal128(16, 4)'), 'bc2_year': Value('decimal128(16, 4)'), 'bc3_year': Value('decimal128(16, 4)'), 'bc5_year': Value('decimal128(16, 4)'), 'bc7_year': Value('decimal128(16, 4)'), 'bc10_year': Value('decimal128(16, 4)'), 'bc30_year': Value('decimal128(16, 4)')}
because column names don't match
Generating train split: 8956 examples [00:00, 532206.32 examples/s]
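
Possible workaround (a sketch, not verified against this repository): the log suggests that the repository holds several Parquet files with different schemas, and that `load_dataset` tries to merge them all into a single `train` split, so the cast fails on the first file whose columns differ. Passing `data_files` to `load_dataset` restricts loading to one file. The only file name taken from the log is `data/exchange_rate.parquet`; the helper names `parquet_path` and `load_table`, and the assumption that other tables follow the same `data/<name>.parquet` layout, are hypothetical.

```python
def parquet_path(table: str) -> str:
    """Build the repo-relative path of one Parquet file.

    Assumes the `data/<table>.parquet` layout seen in the error log
    (data/exchange_rate.parquet); other table names are not confirmed.
    """
    return f"data/{table}.parquet"


def load_table(table: str, repo_id: str = "bwzheng2010/yahoo-finance-data"):
    """Load a single Parquet file from the dataset repo as its own split."""
    # Imported lazily so parquet_path() stays usable without `datasets` installed.
    from datasets import load_dataset

    # data_files limits the builder to one file, so only one schema is involved
    # and no cross-file cast is attempted.
    return load_dataset(repo_id, data_files=parquet_path(table), split="train")
```

Calling `load_table("exchange_rate")` and printing its `column_names` should show which columns that file actually has. Note that the schema in the log for this file lists a `symbol` column and no `ticker` column, so the `x["ticker"]` filter in the program above may also need adjusting once loading succeeds.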
