What are the files that Haystack downloads to the cache directory, and how can I create a local package? #2066

Hi @asharm0662,
The files downloaded to ~/.cache/huggingface/transformers/ are language models that interpret queries and find an answer in, for example, a text document. They include the corresponding tokenizers, which split arbitrary input strings such as queries into sequences of tokens. When you run reader = TransformersReader(tokenizer="deepset/roberta-base-squad2", use_gpu=-1), the tokenizer is loaded from https://huggingface.co/deepset/roberta-base-squad2
These files are about 2 GB in size, sometimes even larger. The models are cached by the transformers library because it would take a long time to download them on the fly every time you want to run a query. The models won't chang…
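
To see what has actually been cached and how much space it takes, a small standard-library script can walk the cache directory. This is only a sketch: it assumes the default location mentioned above and that the TRANSFORMERS_CACHE environment variable overrides it. The location can differ between transformers versions, and the helper names cache_dir and list_cached_files are made up for this example.

```python
import os
from pathlib import Path


def cache_dir() -> Path:
    """Guess the transformers cache directory.

    Assumption: TRANSFORMERS_CACHE overrides the default location;
    otherwise fall back to ~/.cache/huggingface/transformers.
    Newer library versions may use a different layout.
    """
    default = Path.home() / ".cache" / "huggingface" / "transformers"
    return Path(os.environ.get("TRANSFORMERS_CACHE", default))


def list_cached_files(root: Path):
    """Return (relative path, size in bytes) for each regular file, largest first.

    Symlinks are skipped so that a file reachable through a link
    is not counted twice.
    """
    if not root.is_dir():
        return []
    files = [
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    ]
    return sorted(files, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    root = cache_dir()
    entries = list_cached_files(root)
    total = sum(size for _, size in entries)
    print(f"{root}: {len(entries)} files, {total / 1e9:.2f} GB in total")
    # Show the ten largest files; these are usually the model weights.
    for name, size in entries[:10]:
        print(f"{size / 1e6:10.1f} MB  {name}")
```

Running the script before and after creating a reader shows which files a given model added to the cache.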

Answer selected by asharm0662