Commit a053c68

zhengruifeng authored and dongjoon-hyun committed
[SPARK-52561][PYTHON][INFRA] Upgrade the minimum version of Python to 3.10
### What changes were proposed in this pull request?

Upgrade the minimum version of Python to 3.10.

### Why are the changes needed?

Python 3.9 is reaching its EOL.

### Does this PR introduce _any_ user-facing change?

Yes, doc change.

### How was this patch tested?

PR builder with the upgraded image: https://github.com/zhengruifeng/spark/actions/runs/16064529566/job/45340924656

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51259 from zhengruifeng/py_min_310.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
1 parent c2dd021 · commit a053c68
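As context (not code from this commit): raising the floor to 3.10 implies a runtime guard along these lines. This is a minimal, illustrative sketch, not the actual check PySpark ships.

```python
# Sketch: refuse to run on interpreters below the new minimum version.
import sys

if sys.version_info < (3, 10):
    raise RuntimeError(
        f"Python 3.10 or newer is required, found {sys.version.split()[0]}"
    )
```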

File tree: 3 files changed (+11, −21 lines)

.github/workflows/build_python_minimum.yml
Lines changed: 1 addition & 1 deletion

@@ -38,7 +38,7 @@ jobs:
       envs: >-
         {
           "PYSPARK_IMAGE_TO_TEST": "python-minimum",
-          "PYTHON_TO_TEST": "python3.9"
+          "PYTHON_TO_TEST": "python3.10"
         }
       jobs: >-
         {
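For illustration only: a hypothetical harness consuming the workflow's `PYTHON_TO_TEST` value might resolve the interpreter like this. The fallback default and the printout are assumptions, not part of the workflow.

```python
# Sketch: resolve the interpreter named by PYTHON_TO_TEST from the job env.
import os
import shutil

exe = os.environ.get("PYTHON_TO_TEST", "python3.10")  # assumed default
path = shutil.which(exe)
print(f"{exe} -> {path or 'not found on PATH'}")
```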

dev/spark-test-image/python-minimum/Dockerfile
Lines changed: 8 additions & 18 deletions

@@ -24,11 +24,12 @@ LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For PySpark wi
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""

-ENV FULL_REFRESH_DATE=20250327
+ENV FULL_REFRESH_DATE=20250703

 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true

+# Should keep the installation consistent with https://apache.github.io/spark/api/python/getting_started/install.html
 RUN apt-get update && apt-get install -y \
     build-essential \
     ca-certificates \
@@ -52,30 +53,19 @@ RUN apt-get update && apt-get install -y \
     libxml2-dev \
     openjdk-17-jdk-headless \
     pkg-config \
+    python3.10 \
+    python3-psutil \
     qpdf \
     tzdata \
     software-properties-common \
     wget \
     zlib1g-dev

-
-# Should keep the installation consistent with https://apache.github.io/spark/api/python/getting_started/install.html
-
-# Install Python 3.9
-RUN add-apt-repository ppa:deadsnakes/ppa
-RUN apt-get update && apt-get install -y \
-    python3.9 \
-    python3.9-distutils \
-    && apt-get autoremove --purge -y \
-    && apt-get clean \
-    && rm -rf /var/lib/apt/lists/*
-
-
-ARG BASIC_PIP_PKGS="numpy==1.21 pyarrow==11.0.0 pandas==2.0.0 six==1.16.0 scipy scikit-learn coverage unittest-xml-reporting"
+ARG BASIC_PIP_PKGS="numpy==1.22.4 pyarrow==11.0.0 pandas==2.2.0 six==1.16.0 scipy scikit-learn coverage unittest-xml-reporting"
 # Python deps for Spark Connect
 ARG CONNECT_PIP_PKGS="grpcio==1.67.0 grpcio-status==1.67.0 googleapis-common-protos==1.65.0 graphviz==0.20 protobuf"

 # Install Python 3.9 packages
-RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.9
-RUN python3.9 -m pip install --force $BASIC_PIP_PKGS $CONNECT_PIP_PKGS && \
-    python3.9 -m pip cache purge
+RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
+RUN python3.10 -m pip install --force $BASIC_PIP_PKGS $CONNECT_PIP_PKGS && \
+    python3.10 -m pip cache purge
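As a hedged aside (not part of the commit), one could sanity-check that a built image actually carries the pinned versions from `BASIC_PIP_PKGS`; a sketch covering a subset of the pins:

```python
# Sketch: compare installed versions against a subset of the Dockerfile pins.
from importlib.metadata import version

pins = {"numpy": "1.22.4", "pandas": "2.2.0", "pyarrow": "11.0.0"}
for pkg, want in pins.items():
    have = version(pkg)
    status = "OK" if have == want else f"mismatch (expected {want})"
    print(f"{pkg}: {have} {status}")
```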

python/docs/source/getting_started/install.rst
Lines changed: 2 additions & 2 deletions

@@ -30,7 +30,7 @@ and building from the source.
 Python Versions Supported
 -------------------------

-Python 3.9 and above.
+Python 3.10 and above.


 Using PyPI
@@ -143,7 +143,7 @@ the same session as pyspark (you can install in several steps too).

 .. code-block:: bash

-    conda install -c conda-forge pyspark  # can also add "python=3.9 some_package [etc.]" here
+    conda install -c conda-forge pyspark  # can also add "python=3.10 some_package [etc.]" here

 Note that `PySpark for conda <https://anaconda.org/conda-forge/pyspark>`_ is maintained
 separately by the community; while new versions generally get packaged quickly, the
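A related usage note, not part of the patch: on machines with several interpreters (e.g. a conda environment plus a system Python), PySpark honors the standard `PYSPARK_PYTHON` environment variable, so workers can be pinned to a 3.10+ interpreter explicitly. The executable name below is an assumption.

```python
# Sketch: point PySpark workers at a specific 3.10+ interpreter before
# creating a session. "python3.10" is an assumed executable name.
import os

os.environ["PYSPARK_PYTHON"] = "python3.10"

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
print(spark.version)
spark.stop()
```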
