Question about loading tokenizer #86

Hi. I am using the official script from the [README](https://github.com/LargeWorldModel/LWM/blob/f45d2b70bda27abfa9cf32e228916b2883801366/README.md?plain=1#L120C3-L120C4):
```
#! /bin/bash

export SCRIPT_DIR="$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
export PROJECT_DIR="$( cd -- "$( dirname -- "$SCRIPT_DIR" )" &> /dev/null && pwd )"
cd $PROJECT_DIR
export PYTHONPATH="$PYTHONPATH:$PROJECT_DIR"

export llama_tokenizer_path="LargeWorldModel/LWM-Chat-1M-Jax"
export vqgan_checkpoint="/data/lei/localmodel/LargeWorldModel/LWM-Chat-1M-Jax/vqgan"
export lwm_checkpoint="/data/lei/localmodel/LargeWorldModel/LWM-Chat-1M-Jax/params"

python3 -u -m lwm.vision_generation \
    --prompt='Fireworks over the city' \
    --output_file='fireworks.mp4' \
    --temperature_image=1.0 \
    --temperature_video=1.0 \
    --top_k_image=8192 \
    --top_k_video=1000 \
    --cfg_scale_image=5.0 \
    --cfg_scale_video=1.0 \
    --vqgan_checkpoint="$vqgan_checkpoint" \
    --n_frames=8 \
    --mesh_dim='!1,1,-1,1' \
    --dtype='fp32' \
    --load_llama_config='7b' \
    --update_llama_config="dict(sample_mode='vision',theta=50000000,max_sequence_length=32768,scan_attention=False,scan_query_chunk_size=128,scan_key_chunk_size=128,scan_mlp=False,scan_mlp_chunk_size=8192,scan_layers=True)" \
    --load_checkpoint="$lwm_checkpoint" \
    --tokenizer="$llama_tokenizer_path"
read
```

But I still get this error:

> Entry Not Found for url: https://huggingface.co/LargeWorldModel/LWM-Chat-1M-Jax/resolve/main/config.json.

How should I modify the line `tokenizer = AutoTokenizer.from_pretrained(FLAGS.tokenizer)` in [this file](https://github.com/LargeWorldModel/LWM/blob/f45d2b70bda27abfa9cf32e228916b2883801366/lwm/vision_generation.py#L59)?

Would something like this work?
```
import sentencepiece as spm

tokenizer = spm.SentencePieceProcessor()
tokenizer.Load("LargeWorldModel/LWM-Chat-1M-Jax/tokenizer.model")
```
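
Alternatively, here is a minimal sketch that keeps a Hugging Face tokenizer object, so the rest of `vision_generation.py` might need fewer changes than with a raw SentencePiece processor. It assumes that `tokenizer.model` is present in the locally downloaded checkpoint directory and that a slow `LlamaTokenizer` built from that file is acceptable to the script. I have not verified either assumption against LWM. `find_tokenizer_model` and `load_llama_tokenizer` are hypothetical helper names, not part of LWM or `transformers`.

```python
from pathlib import Path


def find_tokenizer_model(checkpoint_dir):
    """Return the path to tokenizer.model inside checkpoint_dir.

    Raises FileNotFoundError if the file is missing, so the failure is
    reported locally instead of as a Hugging Face Hub lookup error.
    """
    candidate = Path(checkpoint_dir) / "tokenizer.model"
    if not candidate.is_file():
        raise FileNotFoundError(f"No tokenizer.model found in {checkpoint_dir}")
    return candidate


def load_llama_tokenizer(checkpoint_dir):
    """Build a LlamaTokenizer from a local SentencePiece model file.

    Assumption: the checkpoint ships a LLaMA-style tokenizer.model.
    """
    # Imported lazily so the path check above works without transformers.
    from transformers import LlamaTokenizer

    return LlamaTokenizer(vocab_file=str(find_tokenizer_model(checkpoint_dir)))


# Possible usage, with the local directory from the script above:
# tokenizer = load_llama_tokenizer("/data/lei/localmodel/LargeWorldModel/LWM-Chat-1M-Jax")
```

With this approach, `--tokenizer` would be set to the local directory rather than the Hub repository name, and the `AutoTokenizer.from_pretrained(FLAGS.tokenizer)` call would be replaced by `load_llama_tokenizer(FLAGS.tokenizer)`.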

Thanks for your work.
