Use of causal models for generation

This is amazing work. I have been working on a project that requires me to evaluate the generated outputs of models like Mistral, using a prompt such as:
`"Fill the [MASK] token in the sentence. Generate a single output."` 

Previously, I would simply instruction fine-tune a Mistral model. Now I would like to explore the possibility of using these models with bi-directional attention instead.
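To make concrete what I mean by the two attention patterns, here is a minimal pure-Python sketch. It is only an illustration of the mask shapes, not the library's code; the helper names `causal_mask` and `bidirectional_mask` and the example sentence are my own and hypothetical.

```python
def causal_mask(seq_len):
    """Row i marks which positions j token i may attend to: only j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def bidirectional_mask(seq_len):
    """Every token may attend to every position, left and right."""
    return [[True] * seq_len for _ in range(seq_len)]


# Hypothetical example sentence for the fill-in task.
tokens = ["The", "[MASK]", "sat", "down"]
mask_pos = tokens.index("[MASK]")

causal = causal_mask(len(tokens))
bidir = bidirectional_mask(len(tokens))

# Tokens visible from the [MASK] position under each pattern.
visible_causal = [t for t, ok in zip(tokens, causal[mask_pos]) if ok]
visible_bidir = [t for t, ok in zip(tokens, bidir[mask_pos]) if ok]

print("causal:", visible_causal)
print("bidirectional:", visible_bidir)
```

Under the causal pattern the `[MASK]` position sees only the tokens up to and including itself, whereas under the bi-directional pattern it also sees the right-hand context, which is what I am hoping to exploit.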

I see that the library lets me access the underlying `backbone` model, but it is not clear to me whether this model uses bi-directional attention. Could you please clarify? If it does, I could simply call `backbone.generate()` for my purpose.

Thanks in advance!

Use of causal models for generation #82