S3FileLoader downloads punkt and averaged_perceptron_tagger #12663

IamExperimenting · 2023-10-31T18:09:00Z

Hi Team,

I'm trying to load text files from an S3 bucket in an AWS Lambda function.

Code:

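from langchain.document_loaders import S3FileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

bucketname = "simple_bucket"
documentname = "textfiles/file1.txt"

# load() is the call that triggers the punkt / averaged_perceptron_tagger download
document = S3FileLoader(bucketname, documentname).load()
textsplit = RecursiveCharacterTextSplitter()
docs = textsplit.split_documents(document)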

The line below is the one that downloads punkt and averaged_perceptron_tagger:
document = S3FileLoader(bucketname, documentname).load()

Because I'm running this in Lambda, it throws an error:

OSError: [Errno 30] Read-only file system

When I debugged it, I found that it tries to create a folder in the Lambda container. Since "/home" is read-only, the code cannot create the folder and download those packages.

Usually I would create a folder under "/tmp/" and download there.

But how do I specify the directory in this case?

Can someone guide me?

santoshsg1308 · 2024-02-22T10:44:57Z

I am getting the same error with S3FileLoader. @IamExperimenting, were you able to solve the issue?

I also tried the approach below, where I first download the file to /tmp in the Lambda, which is not read-only. I am trying to load a Word document:
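import boto3
from langchain.document_loaders import UnstructuredWordDocumentLoader

s3 = boto3.client("s3")  # assumed: the original snippet did not show how s3 was created
# Download to /tmp, which is writable in Lambda
s3.download_file(bucket_name, key, f"/tmp/{tmp_file_name}")
# The loader takes a path, so the file does not need to be opened first
loader = UnstructuredWordDocumentLoader(f"/tmp/{tmp_file_name}")
docs = loader.load()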

I still get the error
OSError: [Errno 30] Read-only file system: '/home/sbx_user1051'
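
The path in the error ('/home/sbx_user1051') suggests the write is going to the user's home directory, not to where the document is stored. That is where NLTK puts downloaded data when it finds no writable directory on its search path. One possible workaround, not confirmed anywhere in this thread, is to point NLTK at a writable folder through the NLTK_DATA environment variable before anything imports nltk. A minimal sketch, assuming the loader downloads through NLTK's default downloader; the helper name `use_writable_nltk_dir` and the folder `/tmp/nltk_data` are made up for illustration:

```python
import os


def use_writable_nltk_dir(path="/tmp/nltk_data"):
    """Create `path` and advertise it to NLTK through NLTK_DATA.

    NLTK reads NLTK_DATA when it is first imported, so this has to run
    before langchain / unstructured / nltk are imported.
    """
    os.makedirs(path, exist_ok=True)   # /tmp is the writable area in Lambda
    os.environ["NLTK_DATA"] = path     # hypothetical folder, pick any writable one
    return path


# At the very top of the Lambda handler module, before the loader imports:
use_writable_nltk_dir()
```

The same variable can also be set in the Lambda function's configuration, but the folder still has to exist at run time. If this does not help, the download is probably not going through NLTK's default location, and the assumption above is wrong.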
