dockerfile working with uv; instructions update #48

Merged
merged 1 commit into from
Jan 7, 2025
89 changes: 89 additions & 0 deletions .dockerignore
@@ -0,0 +1,89 @@
# Git
.git
.gitignore
.gitattributes


# CI
.codeclimate.yml
.travis.yml
.taskcluster.yml

# Docker
docker-compose.yml
Dockerfile
.docker
.dockerignore

# Byte-compiled / optimized / DLL files
**/__pycache__/
**/*.py[cod]

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.cache
nosetests.xml
coverage.xml

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Virtual environment
.env
.venv/
venv/

# PyCharm
.idea

# Python mode for VIM
.ropeproject
**/.ropeproject

# Vim swap files
**/*.swp

# VS Code
.vscode/
14 changes: 8 additions & 6 deletions Dockerfile
@@ -8,7 +8,7 @@ WORKDIR /app
# Update Linux package list and install some key libraries, including latex
RUN apt-get -y update && apt-get install -y openssl graphviz \
nano texlive graphviz-dev \
bash build-essential git
bash build-essential git curl

# change default shell from ash to bash
RUN sed -i -e "s/bin\/ash/bin\/bash/" /etc/passwd
@@ -18,14 +18,16 @@ COPY uv.lock .
COPY pyproject.toml .

# Install uv
COPY --from=ghcr.io/astral-sh/uv:0.5.14 /uv /bin/uv
COPY --from=ghcr.io/astral-sh/uv:0.5.15 /uv /bin/uv

# Copy the current directory contents into the container at /app
COPY . /app

WORKDIR "/app"

# Install everything at once:
RUN uv sync --frozen
RUN uv sync

RUN uv pip list

# Copy the current directory contents into the container at /app
COPY . /app

RUN echo "Success building the Python4DS container!"
83 changes: 61 additions & 22 deletions README.md
@@ -1,55 +1,94 @@
# Python for Data Science

## Admin
[![DOI](https://zenodo.org/badge/496994611.svg)](https://zenodo.org/doi/10.5281/zenodo.10518241) ![GitHub Release](https://img.shields.io/github/v/release/aeturrell/python4DS)

### Building the Book
This is the repo for **Python for Data Science**.

To build the book using Jupyter books use
This README is for developers and contributors. If you're here to read the book, head over to [https://aeturrell.github.io/python4DS](https://aeturrell.github.io/python4DS).

## Contributing

We are very keen to encourage contributors! You can contribute by raising issues with the book or by creating pull requests directly. If you are creating a pull request you will need to install the development environment locally and check the book builds after you've made your changes.

Note that we aim to closely follow the content of [**R for Data Science (2e)**](https://r4ds.hadley.nz/).

Before making a pull request you should test that the pre-commit checks pass, including that there are no outputs included, and that the book builds. See below for instructions on how to do these locally.

When you make a pull request, pre-commit and build will run automatically, and fail if there are errors. They are in `.github/workflows/tests.yml`.

## Installing the development environment locally

You will need installations of Python 3.10 and [**uv**](https://docs.astral.sh/uv/). **uv** can be used to install certain distributions of Python through the `uv python install 3.10` command but you can use other Python installations.

Clone this repository.

To install the development environment, run `uv sync` from the project root. This should create a `.venv/` directory with the Python4DS environment in it. You can check that the environment has been installed by running `uv run python -V` in the project root directory.
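
The installation steps above can be sketched as one shell function. This is a minimal sketch, not an official setup script: it assumes you clone over HTTPS from the project's GitHub page, and `setup_dev_env` is a hypothetical helper name that does not exist in the repo.

```bash
# Sketch only: setup_dev_env is a hypothetical helper name, not part of the repo.
# It is wrapped in a function so that pasting the block runs nothing until you call it.
setup_dev_env() {
  # Assumption: cloning over HTTPS from the project's GitHub page.
  git clone https://github.com/aeturrell/python4DS.git &&
    cd python4DS &&
    # Create .venv/ with the Python4DS environment.
    uv sync &&
    # Print the Python version of the interpreter inside the environment.
    uv run python -V
}
```

Call `setup_dev_env` from the directory where you want the clone to live. If you do not have Python 3.10 yet, running `uv python install 3.10` first is one way to get it.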

## Building the book

The book is compiled from source markdown and Jupyter notebook files using the [**jupyter-book**](https://jupyterbook.org/en/stable/) package.

To build the book, run

```bash
jupyter-book build .
uv run jupyter-book build .
```

Once this command is run, you should be able to look at the HTML files for the book locally on your computer.
Once this command is run, you should be able to look at the HTML files for the book locally on your computer. They will be in `_build`. The project is configured to stop the build if any errors are encountered. This is a frequent occurrence! You'll need to look at the logs to work out what might have gone wrong.
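
To check the built pages in a browser, one option is to serve them with Python's built-in HTTP server. The sketch below is an illustration under assumptions: `build_and_preview` is a hypothetical helper name, port 8000 is arbitrary, and the HTML is assumed to be in `_build/html` (the directory used by the upload command in this README).

```bash
# Sketch only: build_and_preview is a hypothetical helper name, and port 8000 is arbitrary.
build_and_preview() {
  # Build the book; the build stops if any errors are encountered.
  uv run jupyter-book build . &&
    # Serve the built pages at http://localhost:8000 (stop with Ctrl+C).
    uv run python -m http.server 8000 --directory _build/html
}
```

Opening `_build/html/index.html` directly in a browser should also work; serving over HTTP just keeps relative links and search behaving as they would on the website.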

## Uploading the book

### Automatic uploads of the book

Note that, due to package conflicts, several pages may not compile when taking this approach. One work around is to manually run troublesome notebooks and, when jupyter-book encounters a problem when executing them to build the book, it will pick up the notebook at the last point it was successfully manually executed. If you do have this problem, it may be that jupyter-book is not picking up the right jupyter kernel. You can look at installed kernels using `jupyter kernelspec list`.
This repo is configured such that new versions automatically build and upload the book to the website. The GitHub Action that does this is in `.github/workflows/release.yml`.

### Uploading Built Files
### Uploading the built book manually

Only upload built files based on a successful commit or merge to the main branch. See [here](https://jupyterbook.org/publish/gh-pages.html) for how to upload revised HTML files, but the key command is
You shouldn't need to upload the book if you are a regular contributor. There are times when you might need to as an admin, but normally the book will be updated automatically upon release of a new version.

See [here](https://jupyterbook.org/publish/gh-pages.html) for how to upload revised HTML files, but the key command is

```bash
ghp-import -n -p -f _build/html
uv run ghp-import -n -p -f _build/html
```

Typically, only maintainers will need to upload built files.
## Code hygiene

### Pre-commit
This book uses pre-commit to strip output from notebooks, lint, format, and check for large files added by mistake.

To perform the pre-commit checks, use

```bash
pre-commit run --all-files
uv run pre-commit run --all-files
```

This runs the hooks on every file in the repository, not only on staged files. Ensure pre-commit reports all checks as having passed before committing.

## Running and Developing in the Docker Container
## Publishing a new version

1. Open a new branch with the version name, eg `v1.0.4`

2. Change the version in `pyproject.toml` (you can run `uv run version_bumper.py`, which has script-level dependencies)

There is a dockerfile associated with this project. Pre-reqs
3. Commit the change with a new version label (eg `v1.0.4`) as the commit message

4. Go to GitHub. Assuming the tests pass, merge into main.

5. The book should automatically build in GitHub actions, and be pushed to the website. A new release will also be created automatically. A new Zenodo entry is also automatically created.
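
Steps 1 to 3 above can be sketched with git commands. This is a hedged illustration, not the maintainers' release script: `is_version_label`, `run`, `release_branch` and the `DRY_RUN` switch are hypothetical helpers, and the sketch assumes that `version_bumper.py` writes the new version into `pyproject.toml` and that the remote is called `origin`.

```bash
# Sketch only: is_version_label, run, release_branch and DRY_RUN are
# hypothetical helpers for illustration; they are not part of the repo.

# Succeed only for labels shaped like v1.0.4.
is_version_label() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Steps 1 to 3: branch, bump, commit. The merge (step 4) still happens on GitHub.
release_branch() {
  local version="$1"
  if ! is_version_label "$version"; then
    echo "unexpected version label: $version" >&2
    return 1
  fi
  run git switch -c "$version" &&
    # Assumption: the script updates the version field in pyproject.toml.
    run uv run version_bumper.py &&
    run git add pyproject.toml &&
    run git commit -m "$version" &&
    # Assumption: the GitHub remote is named origin.
    run git push -u origin "$version"
}
```

Running `DRY_RUN=1 release_branch v1.0.4` prints the five commands without executing them, which is a cheap way to review the plan before creating a branch. Check that the version written to `pyproject.toml` matches the branch name before pushing.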

## Running and developing in a Docker container

There is a Dockerfile associated with this project.
To use it:

1. Pre-reqs: docker installed, VS Code installed, VS Code docker and remote explorer extensions installed.
2. Build the image from the file. Right click on the file in VS Code and select build
3. On the Docker tab of VS Code, right-click on the image and select 'Run Interactive'
4. On the remote explorer tab of VS Code, find the running dev container and select 'attach new window'. This will start up a new VS Code window in the running container
5. Within the new VS Code window, open the folder ("app/")
6. Do any development as required (see the instructions above)
1. Pre-reqs: docker installed, VS Code installed, VS Code docker and Remote Explorer extensions installed.
2. Build the image from the file. Right click on the file in VS Code and select "Build Image".
3. On the Docker tab of VS Code, right-click on the `python4DS:latest` image and select 'Run Interactive'.
4. On the Docker tab again, right-click on the running `python4DS:latest` container and click "Attach Visual Studio Code".
5. Do any development as required (see the instructions above)

If you wish to copy any files (eg the built HTML files) back to your local machine to check them, use

```bash
docker cp CONTAINER:app/_build/html/ temp_dir/
```
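
If you prefer the command line to the VS Code menus, the same workflow can be approximated with the Docker CLI. This is a sketch under assumptions: the image tag `python4ds` and the container name `p4ds-dev` are made up for illustration (Docker requires lowercase image names), and `docker_dev_shell` and `copy_built_html` are hypothetical helper names.

```bash
# Sketch only: the image tag python4ds, the container name p4ds-dev, and both
# function names are assumptions for illustration; they are not part of the repo.
docker_dev_shell() {
  # Build the image from the Dockerfile in the current directory.
  docker build -t python4ds . &&
    # Start an interactive bash session in a named container.
    docker run -it --name p4ds-dev python4ds bash
}

copy_built_html() {
  # Copy the built HTML out of the container into temp_dir/ on the host.
  docker cp p4ds-dev:/app/_build/html/ temp_dir/
}
```

Run `docker_dev_shell` from the project root, build the book inside the container, then call `copy_built_html` from a second terminal on the host. `docker cp` also works after the container has stopped, as long as it has not been removed.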

Note that seaborn is currently using a pre-release version so this is installed directly in the dockerfile.
2 changes: 1 addition & 1 deletion _config.yml
@@ -4,7 +4,7 @@
title: Python for Data Science
author: The Py4DS Community
logo: logo.png
exclude_patterns: [_build, Thumbs.db, .DS_Store, "**.ipynb_checkpoints", ".venv"]
exclude_patterns: [_build, Thumbs.db, .DS_Store, "**.ipynb_checkpoints", ".venv", "README.md"]
# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
2 changes: 1 addition & 1 deletion welcome.md
@@ -10,7 +10,7 @@ To begin your data science journey, head to the next page.

## Contributors to Python4DS

Contributing is very much encouraged. If you're looking for content to implement or tweak, we aim to broadly follow the structure and content of **R for Data Science (2e)** and you can find open [issues here](https://github.com/aeturrell/python4DS/issues). For larger contributions of content, it's probably best to check with other contributors first.
Contributing is very much encouraged. If you're looking for content to implement or tweak, we aim to follow the structure and content of **R for Data Science (2e)** and you can find open [issues here](https://github.com/aeturrell/python4DS/issues). For larger contributions of content, it's probably best to check with other contributors first.

We thank the following contributors:
