Investigate multi-platform dependency locking to eliminate unnecessary system packages #1423

@coderabbitai

Problem Description

During the review of PR #1396, we discovered that an ARM64 wheel for h5py 3.14.0 is available on PyPI but is ignored because dependencies are locked for AMD64 only, using --platform=linux/amd64. As a result, ARM64 TensorFlow images build h5py from source and install the hdf5-devel package unnecessarily, when the ARM64 wheel h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl could be used instead.

Root Cause

  • The team locks dependencies in a container run with --platform=linux/amd64
  • pipenv therefore considers only AMD64 wheels and source distributions during lock generation
  • Pipfile.lock contains only 2 hashes for h5py, which confirms that only one platform was considered
  • ARM64 builds cannot use the available ARM64 wheel because its hash is not in the lock file
  • This forces source compilation, which requires installing the hdf5-devel package
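
One way to check the hash observation is to compare the hashes recorded in Pipfile.lock with the per-file digests that PyPI publishes for the release. The sketch below is illustrative only. It assumes the usual Pipfile.lock layout (a `default` section mapping package names to a `hashes` list of `sha256:` strings) and a file list shaped like the `urls` array of PyPI's JSON API. The function name `locked_files`, the sample digests, and the sdist and x86_64 filenames are hypothetical placeholders, not values taken from the repository.

```python
def locked_files(lock, package, release_files, section="default"):
    """Split a release's files into those whose sha256 is in the lock and those that are not."""
    entry = lock.get(section, {}).get(package, {})
    locked_hashes = set(entry.get("hashes", []))
    locked, missing = [], []
    for f in release_files:
        digest = "sha256:" + f["digests"]["sha256"]
        (locked if digest in locked_hashes else missing).append(f["filename"])
    return locked, missing


# Hypothetical digests for illustration; real values come from Pipfile.lock and PyPI.
sample_lock = {
    "default": {"h5py": {"version": "==3.14.0", "hashes": ["sha256:aaa", "sha256:bbb"]}}
}
sample_files = [
    {"filename": "h5py-3.14.0.tar.gz", "digests": {"sha256": "aaa"}},
    {"filename": "h5py-3.14.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
     "digests": {"sha256": "bbb"}},
    {"filename": "h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
     "digests": {"sha256": "ccc"}},
]

locked, missing = locked_files(sample_lock, "h5py", sample_files)
print("locked:", locked)
print("missing from lock:", missing)
```

Any wheel that lands in the `missing` list cannot be installed under hash checking, which matches the behaviour described for the ARM64 wheel.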

Current Impact

  • Unnecessary build complexity and image size increase for ARM64
  • hdf5-devel installation could be eliminated for ARM64 builds
  • Pattern likely affects other packages across the repository

Investigation Areas

1. Immediate Solution: Conditional Package Installation

Replace the current unconditional hdf5-devel installation with architecture-specific logic:

# Replace current hdf5 installation with:
RUN if [ "$(uname -m)" = "x86_64" ]; then \
    dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && \
    dnf install -y hdf5-devel && \
    dnf clean all; \
fi

2. Multi-Platform Locking Ecosystem Analysis

Research into the current state of the tooling (2024) shows:

  • Pipenv: No native multi-platform locking support (Issue #5130)
  • pip-tools: Same limitation - platform-specific only
  • Poetry: Also platform-specific only

3. Future Approaches

Evaluate potential solutions:

  • Platform-specific lock files with merging
  • Migration to tools with better multi-platform support
  • Custom scripts for lock file aggregation
  • Monitor upstream tool development
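
The lock file aggregation idea could look roughly like the following. This is a minimal sketch, assuming two lock files generated separately per platform from the same Pipfile, each with the usual `default` and `develop` sections. The helper name `merge_locks` and the sample digests are hypothetical. It relies on the fact that pip's hash-checking mode accepts a file matching any one of the listed hashes, but whether pipenv accepts a hand-merged lock file in this workflow is an assumption that would need testing.

```python
import copy


def merge_locks(base, other, sections=("default", "develop")):
    """Return a copy of `base` whose package hashes also include those from `other`."""
    merged = copy.deepcopy(base)
    for section in sections:
        for name, entry in other.get(section, {}).items():
            target = merged.setdefault(section, {}).get(name)
            if target is None:
                # Package resolved only on the other platform (e.g. behind a marker).
                merged[section][name] = copy.deepcopy(entry)
                continue
            if target.get("version") != entry.get("version"):
                # Refuse to merge when the platforms resolved different versions.
                raise ValueError(
                    f"{name}: version mismatch "
                    f"{target.get('version')} vs {entry.get('version')}"
                )
            target["hashes"] = sorted(
                set(target.get("hashes", [])) | set(entry.get("hashes", []))
            )
    return merged


# Hypothetical per-platform locks for illustration.
amd64_lock = {"default": {"h5py": {"version": "==3.14.0",
                                   "hashes": ["sha256:aaa", "sha256:bbb"]}}}
arm64_lock = {"default": {"h5py": {"version": "==3.14.0",
                                   "hashes": ["sha256:aaa", "sha256:ccc"]}}}

merged = merge_locks(amd64_lock, arm64_lock)
print(merged["default"]["h5py"]["hashes"])
```

Failing on a version mismatch is a deliberate choice in this sketch: silently picking one version would hide cases where the two platforms resolve differently.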

Affected Areas

Investigate similar patterns across:

  • All TensorFlow CUDA images (runtime and jupyter variants)
  • Other Python packages that may have ARM64 wheels but require system packages for source compilation
  • Dockerfile patterns that install development packages unconditionally
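
To find other packages that publish ARM64 wheels, a first pass can inspect the platform tags encoded in wheel filenames. The sketch below assumes the standard wheel naming scheme, in which the platform field comes last and may hold several tags joined by dots. The helper names `wheel_platform_tags` and `has_arm64_wheel` are hypothetical. It does not check the Python version, the ABI tag, or glibc versus musl compatibility, so a positive result is only a hint.

```python
def wheel_platform_tags(filename):
    """Return the platform tags encoded in a wheel filename, or [] for non-wheels."""
    if not filename.endswith(".whl"):
        return []
    # Layout: name-version[-build]-python-abi-platform.whl
    platform_field = filename[: -len(".whl")].split("-")[-1]
    return platform_field.split(".")


def has_arm64_wheel(filenames):
    """True if any file is a wheel usable on aarch64 (including pure-Python 'any' wheels)."""
    return any(
        tag == "any" or tag.endswith("_aarch64")
        for name in filenames
        for tag in wheel_platform_tags(name)
    )


files = [
    "h5py-3.14.0.tar.gz",
    "h5py-3.14.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
]
print(wheel_platform_tags(files[1]))
print(has_arm64_wheel(files))
```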

Acceptance Criteria

  • Implement conditional hdf5-devel installation for current TensorFlow images
  • Document ARM64 wheel availability analysis process
  • Investigate other packages with similar patterns
  • Evaluate long-term multi-platform locking strategies
  • Create implementation plan for systematic approach

Context
