Description
Program:
from datasets import load_dataset
# Load the Yahoo Finance dataset from Hugging Face
dataset = load_dataset("bwzheng2010/yahoo-finance-data")
# Inspect available splits
print(dataset)
# Access the 'train' split (or whichever split is available)
data = dataset["train"]
# Filter rows where the ticker is AAPL
aapl_data = data.filter(lambda x: x["ticker"] == "AAPL")
# Display the first few rows
print(aapl_data[:5])
Error:
Resolving data files: 100%|█████████████████████████████████████████████████████| 19/19 [00:00<00:00, 32329.32it/s]
Downloading data: 100%|█████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 45.78files/s]
Generating train split: 0 examples [00:00, ? examples/s]Failed to read file '/home/harshal/.cache/huggingface/hub/datasets--bwzheng2010--yahoo-finance-data/snapshots/cb209fa251093af17213d35b6295a306787a8fca/data/exchange_rate.parquet' with error CastError: Couldn't cast
symbol: string
report_date: string
open: decimal128(38, 2)
close: decimal128(38, 2)
high: decimal128(38, 2)
low: decimal128(38, 2)
-- schema metadata --
parquet.avro.schema: '{"type":"record","name":"ExchangeRate","fields":[{"' + 482
writer.model.name: 'avro'
to
{'report_date': Value('string'), 'bc1_month': Value('decimal128(16, 4)'), 'bc2_month': Value('decimal128(16, 4)'), 'bc3_month': Value('decimal128(16, 4)'), 'bc6_month': Value('decimal128(16, 4)'), 'bc1_year': Value('decimal128(16, 4)'), 'bc2_year': Value('decimal128(16, 4)'), 'bc3_year': Value('decimal128(16, 4)'), 'bc5_year': Value('decimal128(16, 4)'), 'bc7_year': Value('decimal128(16, 4)'), 'bc10_year': Value('decimal128(16, 4)'), 'bc30_year': Value('decimal128(16, 4)')}
because column names don't match
Generating train split: 8956 examples [00:00, 532206.32 examples/s]
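
Reading the log, it looks like the features were inferred from one parquet file (the `bc1_month` ... `bc30_year` columns) and `exchange_rate.parquet` could not be cast to them, which suggests the repository holds files with different schemas that are all being loaded into a single `train` split. A minimal sketch of that diagnosis, using only the column names from the error message; `first_file.parquet` is a hypothetical placeholder, since the log does not say which file the features came from:

```python
from collections import defaultdict

# Column names copied from the CastError message in the log.
EXCHANGE_RATE_COLUMNS = ["symbol", "report_date", "open", "close", "high", "low"]
INFERRED_COLUMNS = [
    "report_date", "bc1_month", "bc2_month", "bc3_month", "bc6_month",
    "bc1_year", "bc2_year", "bc3_year", "bc5_year", "bc7_year",
    "bc10_year", "bc30_year",
]


def group_files_by_columns(file_columns):
    """Group file names by their set of column names (order-insensitive)."""
    groups = defaultdict(list)
    for name, columns in file_columns.items():
        groups[frozenset(columns)].append(name)
    return dict(groups)


file_columns = {
    "data/exchange_rate.parquet": EXCHANGE_RATE_COLUMNS,
    # Hypothetical placeholder name: the log does not identify this file.
    "first_file.parquet": INFERRED_COLUMNS,
}

groups = group_files_by_columns(file_columns)
for columns, names in groups.items():
    print(sorted(columns), "->", names)
```

Files that land in different groups cannot share one split. If that is what is happening here, passing only the files of one group through the `data_files` argument of `load_dataset` might avoid the cast error, though I have not confirmed which file holds the per-ticker data.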