Authoring Recipes #10
bjhargrave
started this conversation in
General
General Requirements
For information about contributing to this repo, code of conduct guidelines, etc., see the community CONTRIBUTING and Code of Conduct guides.
Checklist
See the pull request template for a thorough list of checks that should be completed before a PR will be merged.
Cooking Analogy
A "cookbook" is composed of "recipes".
In this version of the Granite Code Cookbook, a "recipe" is a Python notebook.
Under the implied cooking analogy, there are three key defining elements of a
good recipe:
1. State ingredients up-front
The "ingredients" and "tools" of the first point mean the data and software at hand.
2. Be straightforward to reproduce efficiently
The second point is straightforward to map into the technical space:
The reader should be able to run the cells of the notebook sequentially
and get the same result as was published in the original notebook.
Our objective is to make these recipes reproducible in 15 minutes or less.
This may inform the decisions about recipe granularity.
3. Result in something delicious to eat
The third point is more subtle, and is what sets this kind of writing apart.
In a technical sense, "something delicious to eat" means a system
that demonstrates useful functionality in such a way that it sets the
reader on the path to adopting it in their environment.
It clearly articulates the business value achieved by the resulting system.
Style Guide
The basic case is a list of Python cells, each of which has a markdown cell preceding it. Each markdown cell should begin with an imperative phrase (a command), followed by some description of what the following Python code does, how it does it, and how it fits into the overall flow.
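The pairing rule above can be checked mechanically. The following is a minimal sketch, assuming the notebook has been loaded as a plain dictionary in the standard notebook JSON layout (a top-level "cells" list whose entries carry a "cell_type"). The function name cells_missing_intro and the example notebook are hypothetical and are not part of any cookbook tooling.

```python
def cells_missing_intro(notebook: dict) -> list[int]:
    """Return indexes of code cells that are not directly preceded by a markdown cell."""
    missing = []
    previous_type = None
    for index, cell in enumerate(notebook.get("cells", [])):
        cell_type = cell.get("cell_type")
        if cell_type == "code" and previous_type != "markdown":
            missing.append(index)
        previous_type = cell_type
    return missing


# Hypothetical in-memory notebook: the third cell has no markdown cell before it.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["### Load the model\n"]},
        {"cell_type": "code", "source": ["model = 'stand-in'\n"]},
        {"cell_type": "code", "source": ["print(model)\n"]},
    ]
}

print(cells_missing_intro(example))  # prints [2]
```

A real notebook file could be loaded with json.load before being passed to the function.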
Longer docs may need multiple sections. In that case, the section name should be a gerund phrase at an H2 level (##), with commands as imperative phrases at H3 (###).
The folder/file names for a recipe and any accompanying files must use underscore (_) instead of hyphen (-) as a "word" separator when needed. Camel case can also be used if the file name is easily readable.
Clear Outputs
In general, cell outputs should not be checked in with the recipe notebook. You can clear outputs while in Jupyter notebook.
Execution counts (execution_count in the notebook JSON) should be nulled out as well. This can be done by clearing outputs using a fresh (just-restarted) kernel.
Pre-commit hook
Each cookbook repository has a git pre-commit hook called nbstripout that will clear the notebook cell outputs and execution counts for you upon commit. You can activate this for a given repository by running pre-commit install in the root directory of the cloned repo. (You might need to install it first: pip install pre-commit.)
Once activated, the nbstripout hook will clear outputs from each staged notebook and write the modified notebook over your working copy. Your commit will fail if there are staged files with uncleared outputs, but you can git add the modified files, then re-issue the commit, and the hook will pass.
Disable or bypass pre-commit hook
Use pre-commit uninstall to deactivate all pre-commit hooks, or git commit --no-verify to bypass them during commit.
Keep some outputs
To keep outputs on individual cells, set the keep_output tag on the cell metadata. You can edit the cell metadata by selecting "Edit Metadata" on the Cell Toolbar, or by opening the notebook in a text editor and editing the JSON directly. See the documentation here for what the tag should look like.
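To make the clearing rules in this section concrete, here is a minimal sketch of what a cleared notebook looks like, written against the notebook JSON with only the standard library. It is an illustration, not nbstripout's implementation. It assumes the keep_output tag is stored in the cell's metadata under a "tags" list, and the helper name strip_outputs is hypothetical.

```python
import copy


def strip_outputs(notebook: dict) -> dict:
    """Return a copy with outputs and execution counts cleared on code cells.

    Cells whose metadata tags include "keep_output" are left untouched
    (assumed tag location: cell["metadata"]["tags"]).
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        tags = cell.get("metadata", {}).get("tags", [])
        if "keep_output" in tags:
            continue
        cell["outputs"] = []
        cell["execution_count"] = None  # serialized as null in the notebook JSON
    return cleaned


# Hypothetical notebook with one ordinary cell and one tagged cell.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "text": ["hello\n"]}],
            "source": ["print('hello')\n"],
        },
        {
            "cell_type": "code",
            "execution_count": 4,
            "metadata": {"tags": ["keep_output"]},
            "outputs": [{"output_type": "stream", "text": ["kept\n"]}],
            "source": ["print('kept')\n"],
        },
    ]
}

cleaned = strip_outputs(notebook)
print(cleaned["cells"][0]["execution_count"])  # prints None
print(len(cleaned["cells"][1]["outputs"]))     # prints 1
```

In practice the pre-commit hook does this work; the sketch only shows the expected end state of a committed notebook.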
Example Data
Many recipes must provide example data. This can either be data committed alongside the recipe, or downloaded during recipe execution.
Checking In Example Data
Checking in the data has the advantage of eliminating a moving part from the recipe. Not only might a referenced server disappear, but the data might change unexpectedly.
If the data is "small" (under 100 KB), and if the data can be committed to a cookbook repository and made available under the
CDLA Permissive 2.0 license (see Licenses page) then this is a good option.
Also consider the implications of the DCO requirement of commits. As noted below, see the legal section of the community CONTRIBUTING guide.
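The size criterion above is easy to check before committing a file. A minimal sketch follows, assuming "100 KB" is read as 100,000 bytes (the guide does not say whether it means 1000-byte or 1024-byte kilobytes). The constant and function names are hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: "under 100 KB" is interpreted as fewer than 100,000 bytes.
SMALL_DATA_LIMIT_BYTES = 100 * 1000


def is_small_enough(path, limit: int = SMALL_DATA_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is strictly smaller than `limit` bytes."""
    return Path(path).stat().st_size < limit


# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    small = Path(tmp) / "small.csv"
    small.write_text("id,value\n1,2\n", encoding="utf-8")
    large = Path(tmp) / "large.bin"
    large.write_bytes(b"\0" * 200_000)
    small_ok = is_small_enough(small)
    large_ok = is_small_enough(large)

print(small_ok, large_ok)  # prints True False
```

File size is only one of the conditions; the licensing and DCO considerations above still need a human decision.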
Downloading Example Data
In many cases the data is larger than something we'd want to manage in a Cookbook repository, or the data cannot be made
available under the CDLA Permissive 2.0 license in the Cookbook repository.
In those cases, the recipe should obtain (download) the data during the execution of the notebook.
Be careful to state any login requirements at the top of the recipe.
Ideally the data can be obtained from a public source without authentication.
Finally, also consider the testability of the download.
Recipe notebooks are executed automatically as a quality gate on pull requests.
This means that all data downloads should work unassisted/headless.
If this is not possible, consider using a flag (defaulting to true) to indicate to the recipe that it is running
as an automated test.
In that case, the recipe could take an alternate path to use some smaller, stand-in dataset that is committed to the Cookbook repository.
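The flag-based alternate path described above might be sketched as follows. All names here (AUTOMATED_TEST, load_example_data, the stand-in file, and the download callable) are hypothetical; a real recipe would substitute its own download code and the path to its committed stand-in dataset.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Defaults to true, so an unattended run takes the stand-in path.
AUTOMATED_TEST = True


def load_example_data(
    automated_test: bool,
    stand_in_path: Path,
    download: Callable[[], str],
) -> str:
    """Return the small committed dataset during automated tests, else download."""
    if automated_test:
        return Path(stand_in_path).read_text(encoding="utf-8")
    return download()


def fake_download() -> str:
    # Placeholder for a real download that may need a login or network access.
    return "full dataset"


# Demonstration using a temporary file in place of a committed stand-in dataset.
with tempfile.TemporaryDirectory() as tmp:
    stand_in = Path(tmp) / "stand_in.csv"
    stand_in.write_text("id,value\n1,2\n", encoding="utf-8")
    test_data = load_example_data(AUTOMATED_TEST, stand_in, fake_download)
    full_data = load_example_data(False, stand_in, fake_download)

print(test_data.splitlines()[0])  # prints id,value
print(full_data)                  # prints full dataset
```

Passing the download step in as a callable keeps the automated path free of any network code, so the quality gate never depends on an external server.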
Other Guidelines
For recipe authors with strong familiarity with a specific capability or tool,
the first inclination may be to write a recipe oriented around the tool.
Consider alternate ways to phrase the recipe so that the end result is showcased, rather than the tool.
Under the cooking analogy, that would mean writing a great soup recipe rather than one that talks about the features of a food processor. If the soup tastes great and is easy to prepare, the reader will likely want to know more about how it was made.
Recipes will vary in complexity.
Some may be single inference calls.
Others may illustrate useful agentic workflows.
A "cookbook" is not intended to be a comprehensive guide to all
issues that may arise during development with Granite Code.
Recipes will link to helpful external resources on topics including: distributed systems, UI, design, AI/ML theory, metrics, etc.
Example
For a "text to SQL" recipe, for instance, the recipe should: