This project aims to analyze the recipes of the popular YouTube channel Binging with Babish and convert them into beautiful data.
- jklewa.github.io/data-with-babish-example (repo): An interactive episode data viewer
- ibdb.episodes.json (docs): Episodes and related guests, recipes and inspiration
- ibdb.guests.json (docs): Guests and their appearances
- ibdb.recipes.json (docs): Recipes and their origin episode
- ibdb.references.json (docs): TV Shows, Movies, etc. and when they were referenced
- ibdb.shows.json (docs): Babish's Shows and their episode lists
- babish.json (docs): Parsed recipe ingredients, grouped by episode (Deprecated)
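As a rough illustration of how these files can be consumed, the sketch below flattens episode records into one row per parsed ingredient. It assumes the structure documented for ibdb.episodes.json in this README and runs on an inline sample rather than the real file. SAMPLE and iter_ingredients are hypothetical names, and the # annotations in the format listings are assumed to be documentation only, not part of the data.

```python
import json

# Hypothetical sample that mirrors the documented episode structure.
SAMPLE = """
[
  {
    "episode_id": "epid",
    "name": "Episode Name",
    "related": {
      "recipes": [
        {
          "recipe_id": 1,
          "name": "Recipe Name",
          "ingredient_list": [
            [1.0, "tablespoon", "Butter", "1 tablespoon butter"]
          ]
        }
      ]
    }
  }
]
"""


def iter_ingredients(episodes):
    """Yield (episode, recipe, quantity, unit, ingredient) rows."""
    for episode in episodes:
        recipes = episode.get("related", {}).get("recipes", [])
        for recipe in recipes:
            # Each entry is [quantity, unit, name, raw text from recipe].
            for quantity, unit, name, _raw in recipe.get("ingredient_list", []):
                yield (episode["name"], recipe["name"], quantity, unit, name)


rows = list(iter_ingredients(json.loads(SAMPLE)))
print(rows)
```

For a real dataset, json.load on a local copy of the file (for example under datasets/, an assumed location) would take the place of json.loads(SAMPLE).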
Episodes and related guests, recipes and inspiration in the format:
[
  {
    "episode_id": "epid",
    "name": "Episode Name",
    "published_date": "YYYY-MM-DD",
    "youtube_link": "https://www.youtube.com/watch?v=...",
    "official_link": "https://www.bingingwithbabish.com/recipes/...",
    "image_link": "https://preview.image.host/image.png",
    "related": {
      "show": {
        "show_id": 1,
        "name": "Binging with Babish"
      },
      "guests": [
        {
          "guest_id": 1,
          "name": "Guest Name"
        }
      ],
      "inspired_by": [
        {
          "reference_id": 1,
          "type": "tv_show|movie|youtube_channel|video_game|other",
          "name": "Reference Name",
          "description": "A description of the reference.",
          "external_link": "https://link.to.more/"
        }
      ],
      "recipes": [
        {
          "recipe_id": 1,
          "name": "Recipe Name",
          "raw_ingredient_list": "Ingredient 1\nIngredient 2\n...",
          "raw_procedure": "Step 1.\nStep 2.\n...",
          "ingredient_list": [
            [
              1.0,                   # quantity
              "tablespoon",          # unit
              "Butter",              # name
              "1 tablespoon butter"  # raw text from recipe
            ],
            # ...
          ]
        }
      ]
    }
  },
  # ...
]
Guests and their appearances in the format:
[
  {
    "guest_id": 1,
    "name": "Guest Name",
    "appearances": [
      {
        "episode_id": "epid",
        "name": "Episode Name",
        "published_date": "YYYY-MM-DD",
        "youtube_link": "https://www.youtube.com/watch?v=...",
        "official_link": "https://www.bingingwithbabish.com/recipes/...",
        "image_link": "https://preview.image.host/image.png"
      }
    ]
  },
  # ...
]
Recipes and their origin episode in the format:
[
  {
    "recipe_id": 1,
    "name": "Recipe Name",
    "raw_ingredient_list": "Ingredient 1\nIngredient 2\n...",
    "raw_procedure": "Step 1.\nStep 2.\n...",
    "source": {
      "episode_id": "epid",
      "name": "Episode Name",
      "published_date": "YYYY-MM-DD",
      "youtube_link": "https://www.youtube.com/watch?v=...",
      "official_link": "https://www.bingingwithbabish.com/recipes/...",
      "image_link": "https://preview.image.host/image.png"
    },
    "ingredient_list": [
      [
        1.0,                   # quantity
        "tablespoon",          # unit
        "Butter",              # name
        "1 tablespoon butter"  # raw text from recipe
      ],
      # ...
    ]
  },
  # ...
]
References (TV shows, movies, etc.) and when they were referenced, in the format:
[
  {
    "reference_id": 1,
    "type": "tv_show|movie|youtube_channel|video_game|other",
    "name": "Reference Name",
    "description": "A description of the reference.",
    "external_link": "https://link.to.more/",
    "episodes_inspired": [
      {
        "episode_id": "epid",
        "name": "Episode Name",
        "published_date": "YYYY-MM-DD",
        "youtube_link": "https://www.youtube.com/watch?v=...",
        "official_link": "https://www.bingingwithbabish.com/recipes/...",
        "image_link": "https://preview.image.host/image.png"
      }
    ]
  },
  # ...
]
Babish's Shows and their episode lists in the format:
[
  {
    "show_id": 1,
    "name": "Binging with Babish",
    "episodes": [
      {
        "episode_id": "epid",
        "name": "Episode Name",
        "published_date": "YYYY-MM-DD",
        "youtube_link": "https://www.youtube.com/watch?v=...",
        "official_link": "https://www.bingingwithbabish.com/recipes/...",
        "image_link": "https://preview.image.host/image.png"
      }
    ]
  },
  # ...
]
(Deprecated)
Contains ingredients from Binging with Babish (BWB) recipes in the format:
[
  {
    "episode_name": "Episode Name",
    "episode_link": "https://www.bingingwithbabish.com/recipes/...",
    "youtube_link": "https://www.youtube.com/watch?v=...",
    "published": "YYYY-MM-DD",
    "recipes": [
      {
        "method": "Method Name (from Episode Name)",
        "ingredients": [
          [
            1.0,                   # quantity
            "tablespoon",          # unit
            "Butter",              # name
            "1 tablespoon butter"  # raw text from recipe
          ],
          # ...
        ]
      },
      # ...
    ]
  },
  # ...
]
- Regex Samples: https://regexr.com/3p7h8 https://regexr.com/3p6pq
- Handling Unicode Fractions: https://stackoverflow.com/questions/1263796/how-do-i-convert-unicode-characters-to-floats-in-python
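Recipe text often writes quantities with Unicode fraction characters such as ½. The sketch below shows one standard-library way to turn such strings into floats using unicodedata.numeric. It is an illustration only, not the code in recipe_parser.py, and parse_quantity is a hypothetical helper name.

```python
import unicodedata


def parse_quantity(text: str) -> float:
    """Convert strings such as '2', '½', '1½' or '1 ¾' to a float."""
    text = text.strip()
    if not text:
        raise ValueError("empty quantity")
    last = text[-1]
    # Fraction characters like '½' carry a numeric value but are not digits.
    if not last.isdigit() and unicodedata.numeric(last, None) is not None:
        whole = text[:-1].strip()
        return (float(whole) if whole else 0.0) + unicodedata.numeric(last)
    # Plain numbers ('2', '0.5') fall through to float().
    return float(text)


print(parse_quantity("1½"))
```

Anything that is neither a plain number nor a number followed by a single fraction character raises ValueError, which leaves it to the caller to decide how to treat unparseable quantities.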
Required tools: Docker, Docker Compose
- Build: docker-compose build
- Run DB and API: docker-compose up -d
- Browse: http://localhost:5000/
- Update DB and datasets: docker-compose exec ibdb sync update export
- See other commands: docker-compose exec ibdb --help
The update command uses populate_db.py to scrape and upsert episodes into the DB, and export.py to generate new datasets/ from the DB's contents.
You can also explore the original populate_babish_json.py and the Jupyter Notebooks:
- cd notebooks/
- Start Jupyter on http://localhost:8888: jupyter notebook
- Open Babish Recipe Extract.ipynb or Babish Data Analysis.ipynb
NOTE: Be aware that Babish Recipe Extract.ipynb will make LOTS of network calls to the official bingingwithbabish.com website. Calls are cached and rate-limited, but please be very considerate and only run the notebook if absolutely necessary.
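To make "cached and rate limited" concrete, here is a minimal sketch of the general pattern. It is not the notebook's actual implementation: PoliteFetcher, its parameters, and the injected fetch function are all hypothetical, and the network call is passed in so the example runs without touching any website.

```python
import time


class PoliteFetcher:
    """Cache responses by URL and space out calls that miss the cache."""

    def __init__(self, fetch, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch                # function: url -> response body
        self._min_interval = min_interval  # seconds between real calls
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_call = None

    def get(self, url):
        # Cached URLs never trigger a second network call.
        if url in self._cache:
            return self._cache[url]
        # Wait until at least min_interval has passed since the last call.
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        body = self._fetch(url)
        self._cache[url] = body
        return body
```

The cache here lives only in memory; a cache that persists between notebook runs would need to be written to disk, which this sketch does not attempt.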
Tests covering recipe_parser.py are located in the tests/ directory and can be run using pytest.
Required tools: Python 3.8
- Install packages: pip install -r requirements.txt
- Run tests: python -m pytest