Using a larger model #1

Description

@rxk2rxk

@mikecvet Hi Mike:

I tried switching to a larger model (https://huggingface.co/mosaicml/mpt-7b-chat) on a MacBook Pro M2 using

  MODEL_ID = 'mosaicml/mpt-7b-chat'
  TOKENIZER_ID = 'mosaicml/mpt-7b-chat'

and

  from transformers import AutoTokenizer, AutoModelForCausalLM
  tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
  model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

but I am getting the following error:

  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/main.py", line 107, in <module>
    main()
  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/main.py", line 104, in main
    run_beam_search(tokenizer, model, input_ids, device, args.beam, args.temperature or 1.0, args.max_length, args.decay, args.verbose)
  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/main.py", line 43, in run_beam_search
    beam_ids = beam.search(
               ^^^^^^^^^^^^
  File "/Users/ron.katriel/Documents/GitHub/llm-beam-search/src/beam.py", line 69, in search
    gen_seq = torch.tensor(candidate.ids(), device=device).unsqueeze(0)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Could not infer dtype of NoneType

I've seen this with other models too (e.g., EleutherAI/gpt-j-6B). Any idea why this is happening and how to get around it?
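For what it's worth, here is a rough guard I could imagine adding just before the `torch.tensor` call in `beam.py` (a sketch only; I'm assuming the `None` comes from a special-token id, e.g. `tokenizer.bos_token_id`, that these tokenizers leave undefined):

```python
def check_ids(ids):
    """Fail fast if any token id in the candidate sequence is None.

    torch.tensor() cannot infer a dtype from a list containing None
    ("RuntimeError: Could not infer dtype of NoneType"), so checking
    here would point at the actual cause: a special-token id (e.g.
    tokenizer.bos_token_id) that the tokenizer does not define.
    """
    missing = [pos for pos, tok in enumerate(ids) if tok is None]
    if missing:
        raise ValueError(
            f"None token id at positions {missing}; "
            "check the tokenizer's special-token ids"
        )
    return ids
```

Wrapping `candidate.ids()` in something like this before building the tensor would at least turn the opaque dtype error into a pointer at the offending position.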

Otherwise, can you recommend a larger model (e.g., 3B or 7B) that will work with your code? Perhaps a Llama model?

Thanks!
