|
2 | 2 |
|
3 | 3 | <img src="./docs/assets/images/logo.png" alt="Outlines-core Logo" width=500></img>
|
4 | 4 |
|
5 |
| -[![Contributors][contributors-badge]][contributors] |
| 5 | +[![Latest Version]][crates.io] [![License]][github] ![MSRV] |
| 6 | + |
| 7 | +[Latest Version]: https://img.shields.io/crates/v/outlines-core.svg |
| 8 | +[crates.io]: https://crates.io/crates/outlines-core |
| 9 | +[License]: https://img.shields.io/github/license/dottxt-ai/outlines-core.svg?color=blue&cachedrop |
| 10 | +[github]: https://github.com/dottxt-ai/outlines-core/blob/main/LICENSE |
| 11 | +[MSRV]: https://img.shields.io/badge/MSRV-1.71.1-brightgreen |
| 12 | + |
| 13 | +<!--- |
| 14 | +Once it is uploaded to crates.io, the badge could be generated like: |
| 15 | + [version]: https://img.shields.io/crates/msrv/outlines-core.svg?label=msrv&color=lightgray |
| 16 | + --> |
6 | 17 |
|
7 | 18 | *Structured generation (in Rust).*
|
| 19 | + |
8 | 20 | </div>
|
9 | 21 |
|
10 |
| -This package provides the core functionality for structured generation, formerly implemented in [Outlines][outlines], with a focus on performance and portability. |
| 22 | +## Outlines-core |
11 | 23 |
|
12 |
| -# Install |
| 24 | +This package provides the core functionality for structured generation, formerly implemented in [Outlines][outlines], |
| 25 | +with a focus on performance and portability. It offers a convenient way to: |
13 | 26 |
|
14 |
| -We provide bindings to the following languages: |
15 |
| -- [Rust][rust-implementation] (Original implementation) |
16 |
| -- [Python][python-bindings] |
| 27 | +- build regular expressions from JSON schemas |
17 | 28 |
|
18 |
| -The latest release of the Python bindings is available on PyPi using `pip`: |
| 29 | +- construct an `Index` object by combining a `Vocabulary` and a regular expression to efficiently map tokens from a given vocabulary to state transitions in a finite-state automaton |
19 | 30 |
|
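To make the second bullet concrete, here is a minimal, stdlib-only sketch of the idea behind such an index. It is not the crate's implementation: the DFA is hand-written for the regex `-?[0-9]+`, and the names `step`, `walk`, `build_index`, and the six-token `VOCABULARY` are hypothetical, chosen only for illustration.

```python
# Toy illustration only: hand-written DFA for the regex `-?[0-9]+`.
# State 0 = start, state 1 = after a leading "-", state 2 = after at least one digit (final).
DIGITS = set("0123456789")


def step(state, char):
    """Character-level transition of the toy DFA; None means the input is rejected."""
    if state == 0 and char == "-":
        return 1
    if char in DIGITS:
        return 2
    return None


def walk(state, token):
    """Feed a whole token (possibly several characters) through the DFA."""
    for char in token:
        state = step(state, char)
        if state is None:
            return None
    return state


def build_index(vocabulary, states=(0, 1, 2)):
    """Precompute, for every state, which token ids are allowed and where they lead."""
    index = {}
    for state in states:
        transitions = {}
        for token, token_id in vocabulary.items():
            target = walk(state, token)
            if target is not None:
                transitions[token_id] = target
        index[state] = transitions
    return index


# Hypothetical vocabulary mapping token strings to token ids.
VOCABULARY = {"-": 0, "1": 1, "12": 2, "-3": 3, "a": 4, "3-": 5}
INDEX = build_index(VOCABULARY)
```

Because the table is computed once up front, looking up the allowed tokens for a state during generation is a dictionary access rather than a regex match over the whole vocabulary.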
20 |
| -``` python |
21 |
| -pip install outlines-core |
22 |
| -``` |
| 31 | +### Example |
23 | 32 |
|
24 |
| -The current development branch of `outlines-core` can be installed from GitHub, also using `pip`: |
| 33 | +A basic example of how it all fits together: |
25 | 34 |
|
26 |
| -``` shell |
27 |
| -pip install git+https://github.com/outlines-dev/outlines-core |
28 |
| -``` |
| 35 | +```rust |
| 36 | +use outlines_core::prelude::*; |
29 | 37 |
|
30 |
| -Or install in a rust project with cargo: |
31 |
| -``` bash |
32 |
| -cargo add outlines-core |
| 38 | +// Define a JSON schema |
| 39 | +let schema = r#"{ |
| 40 | + "type": "object", |
| 41 | + "properties": { |
| 42 | + "name": { "type": "string" }, |
| 43 | + "age": { "type": "integer" } |
| 44 | + }, |
| 45 | + "required": ["name", "age"] |
| 46 | +}"#; |
| 47 | + |
| 48 | +// Generate a regular expression from it |
| 49 | +let regex = json_schema::regex_from_str(&schema, None)?; |
| 50 | + |
| 51 | +// Create a `Vocabulary` from a pretrained large language model (building one manually is also possible) |
| 52 | +let vocabulary = Vocabulary::from_pretrained("openai-community/gpt2", None)?; |
| 53 | + |
| 54 | +// Create new `Index` from regex and a given `Vocabulary` |
| 55 | +let index = Index::new(&regex, &vocabulary)?; |
| 56 | + |
| 57 | +let initial_state = index.initial_state(); |
| 58 | +let allowed_tokens = index.allowed_tokens(&initial_state).expect("Some allowed token ids"); |
| 59 | +let token_id = allowed_tokens.first().expect("First token id"); |
| 60 | +let next_state = index.next_state(&initial_state, token_id); |
| 61 | +let final_states = index.final_states(); |
33 | 62 | ```
|
34 | 63 |
|
35 |
| -**Note:** The Minimum Supported Rust Version (MSRV) for this project is 1.78.0. |
| 64 | +## Python Bindings |
| 65 | + |
| 66 | +Additionally, the project provides interfaces to integrate the crate's functionality with Python. |
| 67 | + |
| 68 | +``` python |
| 69 | +import json |
| 70 | + |
| 71 | +from outlines_core.json_schema import build_regex_from_schema |
| 72 | +from outlines_core.guide import Guide, Index, Vocabulary |
| 73 | + |
| 74 | +schema = { |
| 75 | + "title": "Foo", |
| 76 | + "type": "object", |
| 77 | + "properties": {"date": {"type": "string", "format": "date"}} |
| 78 | +} |
| 79 | +regex = build_regex_from_schema(json.dumps(schema)) |
| 80 | + |
| 81 | +vocabulary = Vocabulary.from_pretrained("openai-community/gpt2") |
| 82 | +index = Index(regex, vocabulary) |
| 83 | +guide = Guide(index) |
| 84 | + |
| 85 | +# Get current state of the Guide: |
| 86 | +current_state = guide.get_state() |
| 87 | + |
| 88 | +# Get allowed tokens for the current state of the Guide: |
| 89 | +allowed_tokens = guide.get_tokens() |
| 90 | + |
| 91 | +# Advance Guide to the next state via some token_id and return allowed tokens for that new state: |
| 92 | +next_allowed_tokens = guide.advance(allowed_tokens[-1]) |
| 93 | + |
| 94 | +# To check if Guide is finished: |
| 95 | +guide.is_finished() |
| 96 | + |
| 97 | +# If it's finished then this assertion holds: |
| 98 | +assert guide.get_tokens() == [vocabulary.get_eos_token_id()] |
| 99 | +``` |
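The example above stops at retrieving allowed token ids; how they are applied to a model's output is not covered here. One common approach is to mask the scores of disallowed tokens before picking the next one. The sketch below assumes plain Python lists as logits, and `mask_logits` and `greedy_pick` are hypothetical helpers, not part of `outlines_core`.

```python
import math


def mask_logits(logits, allowed_token_ids):
    """Return a copy of `logits` with every disallowed position set to -inf."""
    allowed = set(allowed_token_ids)
    return [score if i in allowed else -math.inf for i, score in enumerate(logits)]


def greedy_pick(logits, allowed_token_ids):
    """Pick the highest-scoring token id among the allowed ones."""
    if not allowed_token_ids:
        raise ValueError("no allowed tokens for this state")
    masked = mask_logits(logits, allowed_token_ids)
    return max(range(len(masked)), key=masked.__getitem__)


# Made-up scores for a four-token vocabulary; token 1 scores highest but is not allowed.
example_logits = [0.1, 2.5, 0.3, 1.7]
example_allowed = [0, 3]
chosen = greedy_pick(example_logits, example_allowed)
```

In a real loop, the chosen id would be what gets passed to `guide.advance(...)`, and the list it returns would be the allowed set for the next step.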
36 | 100 |
|
37 |
| -# How to contribute? |
| 101 | +## How to contribute? |
38 | 102 |
|
39 |
| -## Setup |
| 103 | +### Setup |
40 | 104 |
|
41 |
| -First, fork the repository on GitHub and clone the fork locally: |
| 105 | +Fork the repository on GitHub and clone the fork locally: |
42 | 106 |
|
43 | 107 | ```bash
|
44 | 108 | git clone git@github.com:YourUserName/outlines-core.git
|
45 | 109 | cd outlines-core
|
46 | 110 | ```
|
47 | 111 |
|
48 |
| -Create a new virtual environment: |
| 112 | +Create a new virtual environment, install the dependencies in editable mode, and install the pre-commit hooks: |
49 | 113 |
|
50 | 114 | ``` bash
|
51 | 115 | python -m venv .venv
|
52 | 116 | source .venv/bin/activate
|
| 117 | +pip install -e ".[test]" |
| 118 | +pre-commit install |
53 | 119 | ```
|
54 | 120 |
|
55 |
| -Then install the dependencies in editable mode, and install the pre-commit hooks: |
| 121 | +### Before pushing your code |
| 122 | + |
| 123 | +If you are working with the Python bindings, don't forget to build the Rust extension before testing, for example in debug mode: |
| 124 | + |
| 125 | +```bash |
| 126 | +make build-extension-debug |
| 127 | +``` |
| 128 | + |
| 129 | +Run Python tests: |
56 | 130 |
|
57 | 131 | ``` bash
|
58 |
| -pip install -e ".[test]" |
59 |
| -pre-commit install |
| 132 | +pytest |
60 | 133 | ```
|
61 | 134 |
|
62 |
| -## Before pushing your code |
| 135 | +Run Rust tests: |
63 | 136 |
|
64 |
| -Run the tests: |
| 137 | +``` bash |
| 138 | +cargo test |
| 139 | +``` |
65 | 140 |
|
| 141 | +Alternatively, use the Makefile to run both: |
66 | 142 |
|
67 | 143 | ``` bash
|
68 |
| -pytest |
| 144 | +make test |
69 | 145 | ```
|
70 | 146 |
|
71 |
| -And run the code style checks: |
| 147 | +Finally, run the code style checks: |
72 | 148 |
|
73 | 149 | ``` bash
|
74 | 150 | pre-commit run --all-files
|
75 | 151 | ```
|
76 | 152 |
|
| 153 | +Or use the Makefile: |
| 154 | + |
| 155 | +``` bash |
| 156 | +make pcc |
| 157 | +``` |
| 158 | + |
| 159 | +If necessary, you can run the benchmarks locally: |
| 160 | + |
| 161 | +``` bash |
| 162 | +make pybench |
| 163 | +``` |
| 164 | + |
| 165 | +## Join us |
| 166 | + |
| 167 | +- 💡 **Have an idea?** Come chat with us on [Discord][discord] |
| 168 | +- **Found a bug?** Open an [issue](https://github.com/dottxt-ai/outlines-core/issues) |
77 | 169 |
|
78 | 170 | [outlines]: https://github.com/dottxt-ai/outlines
|
79 |
| -[contributors]: https://github.com/outlines-dev/outlines-core/graphs/contributors |
80 |
| -[contributors-badge]: https://img.shields.io/github/contributors/outlines-dev/outlines-core?style=flat-square&logo=github&logoColor=white&color=ECEFF4 |
81 |
| -[rust-implementation]: https://github.com/outlines-dev/outlines-core/tree/readme/src |
82 |
| -[python-bindings]: https://github.com/outlines-dev/outlines-core/tree/readme/python/outlines_core |
| 171 | +[discord]: https://discord.gg/R9DSu34mGd |