Using a larger model #1

Description

@rxk2rxk

@mikecvet Hi Mike:

I tried switching to a larger model (https://huggingface.co/mosaicml/mpt-7b-chat) on a MacBook Pro M2 using

  MODEL_ID = 'mosaicml/mpt-7b-chat'
  TOKENIZER_ID = 'mosaicml/mpt-7b-chat'

and

  from transformers import AutoTokenizer, AutoModelForCausalLM
  tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
  model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

but I am getting the following error:

  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/main.py", line 107, in <module>
    main()
  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/main.py", line 104, in main
    run_beam_search(tokenizer, model, input_ids, device, args.beam, args.temperature or 1.0, args.max_length, args.decay, args.verbose)
  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/main.py", line 43, in run_beam_search
    beam_ids = beam.search(
               ^^^^^^^^^^^^
  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/beam.py", line 69, in search
    gen_seq = torch.tensor(candidate.ids(), device=device).unsqueeze(0)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Could not infer dtype of NoneType

I've seen this with other models too (e.g., EleutherAI/gpt-j-6B). Any idea why this is happening and how to get around it?
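For what it's worth, here is a rough guard I could imagine adding just before the `torch.tensor` call in `beam.py` (a sketch only; I'm assuming the `None` comes from a special-token id, e.g. `tokenizer.bos_token_id`, that these tokenizers leave undefined):

```python
def check_ids(ids):
    """Fail fast if any token id in the candidate sequence is None.

    torch.tensor() cannot infer a dtype from a list containing None
    ("RuntimeError: Could not infer dtype of NoneType"), so checking
    here would point at the actual cause: a special-token id (e.g.
    tokenizer.bos_token_id) that the tokenizer does not define.
    """
    missing = [pos for pos, tok in enumerate(ids) if tok is None]
    if missing:
        raise ValueError(
            f"None token id at positions {missing}; "
            "check the tokenizer's special-token ids"
        )
    return ids
```

Wrapping `candidate.ids()` in something like this before building the tensor would at least turn the opaque dtype error into a pointer at the offending position.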

Otherwise, can you recommend a larger model (e.g., 3B or 7B) that will work with your code? Perhaps a Llama model?

Thanks!
