Commit c8b3ae2

vansangpfiev, kaiyux, Lokiiiiii, megha95, and sangjanai authored
rebase: v0.11.0 (#71)
* TensorRT-LLM v0.10 update
* TensorRT-LLM Release 0.10.0

---------

Co-authored-by: Loki <lokravi@amazon.com>
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>

* TensorRT-LLM v0.11 Update (NVIDIA#1969)
* fix: add formatter
* fix: use executor API
* fix: sync
* fix: remove requests thread
* fix: support unload endpoint for server example, handle release resources properly
* refactor: InferenceState
* fix: new line character for Mistral and Openhermes
* fix: add benchmark script
* Add Dockerfile for runner windows (#69)
* Add Dockerfile for runner windows
* Add Dockerfile for linux
* Change CI agent
* fix: build linux (#70)

Co-authored-by: vansangpfiev <sang@jan.ai>

---------

Co-authored-by: Hien To <tominhhien97@gmail.com>
Co-authored-by: vansangpfiev <vansangpfiev@gmail.com>
Co-authored-by: vansangpfiev <sang@jan.ai>

* fix: default batch_size
* chore: only linux build

---------

Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Loki <lokravi@amazon.com>
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
Co-authored-by: sangjanai <sang@jan.ai>
Co-authored-by: hiento09 <136591877+hiento09@users.noreply.github.com>
Co-authored-by: Hien To <tominhhien97@gmail.com>
1 parent 9311ae8 commit c8b3ae2

1,036 files changed: +2085989 -869436 lines changed

.github/runners/linux/Dockerfile.multi

Lines changed: 20 additions & 7 deletions
@@ -1,6 +1,6 @@
 # Multi-stage Dockerfile
 ARG BASE_IMAGE=nvcr.io/nvidia/pytorch
-ARG BASE_TAG=24.03-py3
+ARG BASE_TAG=24.05-py3
 ARG DEVEL_IMAGE=devel

 FROM ${BASE_IMAGE}:${BASE_TAG} as base
@@ -22,6 +22,10 @@ RUN bash ./install_cmake.sh && rm install_cmake.sh
 COPY docker/common/install_ccache.sh install_ccache.sh
 RUN bash ./install_ccache.sh && rm install_ccache.sh

+# Only take effect when the base image is 12.4.0-devel-centos7.
+COPY docker/common/install_cuda_toolkit.sh install_cuda_toolkit.sh
+RUN bash ./install_cuda_toolkit.sh && rm install_cuda_toolkit.sh
+
 # Download & install internal TRT release
 ARG TRT_VER
 ARG CUDA_VER
@@ -48,9 +52,7 @@ RUN bash ./install_mpi4py.sh && rm install_mpi4py.sh
 # Install PyTorch
 ARG TORCH_INSTALL_TYPE="skip"
 COPY docker/common/install_pytorch.sh install_pytorch.sh
-# Apply PyTorch patch for supporting compiling with CUDA 12.4 from source codes
-COPY docker/common/pytorch_pr_116072.patch /tmp/pytorch_pr_116072.patch
-RUN bash ./install_pytorch.sh $TORCH_INSTALL_TYPE && rm install_pytorch.sh /tmp/pytorch_pr_116072.patch
+RUN bash ./install_pytorch.sh $TORCH_INSTALL_TYPE && rm install_pytorch.sh

 COPY setup.py requirements.txt requirements-dev.txt ./

@@ -108,19 +110,30 @@ COPY tensorrt_llm tensorrt_llm
 COPY 3rdparty 3rdparty
 COPY setup.py requirements.txt requirements-dev.txt ./

+# Create cache directories for pip and ccache
+RUN mkdir -p /root/.cache/pip /root/.cache/ccache
+ENV CCACHE_DIR=/root/.cache/ccache
+# Build the TRT-LLM wheel
 ARG BUILD_WHEEL_ARGS="--clean --trt_root /usr/local/tensorrt --python_bindings --benchmarks"
-RUN python3 scripts/build_wheel.py ${BUILD_WHEEL_ARGS}
+RUN --mount=type=cache,target=/root/.cache/pip --mount=type=cache,target=/root/.cache/ccache \
+    python3 scripts/build_wheel.py ${BUILD_WHEEL_ARGS}

 FROM ${DEVEL_IMAGE} as release

+# Create a cache directory for pip
+RUN mkdir -p /root/.cache/pip
+
 WORKDIR /app/tensorrt_llm
 COPY --from=wheel /src/tensorrt_llm/build/tensorrt_llm*.whl .
-RUN pip install tensorrt_llm*.whl --extra-index-url https://pypi.nvidia.com && \
+RUN --mount=type=cache,target=/root/.cache/pip \
+    pip install tensorrt_llm*.whl --extra-index-url https://pypi.nvidia.com && \
     rm tensorrt_llm*.whl
 COPY README.md ./
 COPY docs docs
 COPY cpp/include include
-RUN ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/libs")') lib && \
+RUN ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/bin")') bin && \
+    test -f bin/executorWorker && \
+    ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/libs")') lib && \
     test -f lib/libnvinfer_plugin_tensorrt_llm.so && \
     ln -sv lib/libnvinfer_plugin_tensorrt_llm.so lib/libnvinfer_plugin_tensorrt_llm.so.9 && \
     echo "/app/tensorrt_llm/lib" > /etc/ld.so.conf.d/tensorrt_llm.conf && \
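
Note on the release-stage symlinks: the two inline `python3 -c` snippets both resolve a directory under the first entry of `site.getsitepackages()`. A minimal Python sketch of that lookup follows. The function name `tensorrt_llm_dirs` and its `site_packages` parameter are hypothetical, introduced here only for illustration; they are not part of the commit.

```python
import site
from pathlib import Path


def tensorrt_llm_dirs(site_packages=None):
    """Return the (bin, libs) directories that the release stage symlinks to.

    Hypothetical helper: it mirrors the Dockerfile's inline snippets, which
    build both paths from the first entry of site.getsitepackages().
    """
    root = Path(site_packages or site.getsitepackages()[0]) / "tensorrt_llm"
    return root / "bin", root / "libs"


if __name__ == "__main__":
    bin_dir, libs_dir = tensorrt_llm_dirs()
    # The Dockerfile checks these two files with `test -f`.
    for path in (bin_dir / "executorWorker",
                 libs_dir / "libnvinfer_plugin_tensorrt_llm.so"):
        print(path, "found" if path.is_file() else "missing")
```

Running it inside the built image would show whether the files the `test -f` checks expect are present; outside the image both will normally be reported as missing.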

.github/runners/windows/Dockerfile

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
+# https://learn.microsoft.com/en-us/visualstudio/install/build-tools-container?view=vs-2022
+
+# Use the Windows Server Core 2019 image.
+FROM mcr.microsoft.com/windows/servercore:ltsc2022 AS devel
+
+SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
+
+# -----------------------------------------------------------------------------
+# Create a working directory
+
+WORKDIR "C:\\\\workspace"
+
+# -----------------------------------------------------------------------------
+# Install runtime dependencies
+
+COPY setup_env.ps1 C:\\workspace\\setup_env.ps1
+# TRT is installed along with build-time dependencies
+RUN C:\workspace\setup_env.ps1 -skipTRT -skipCUDNN
+RUN Remove-Item "C:\workspace\setup_env.ps1" -Force
+# If enabling CUDNN, CUDNN paths are populated in the env variable CUDNN, add it to PATH
+# RUN [Environment]::SetEnvironmentVariable('Path', $Env:Path + ';' + $Env:CUDNN, [EnvironmentVariableTarget]::Machine)
+
+# -----------------------------------------------------------------------------
+# Install build-time dependencies
+
+COPY setup_build_env.ps1 C:\\workspace\\setup_build_env.ps1
+# TRT is installed in workspace
+RUN C:\workspace\setup_build_env.ps1 -TRTPath 'C:\\workspace'
+RUN Remove-Item "C:\workspace\setup_build_env.ps1" -Force
+
+# Add binaries to Path
+RUN [Environment]::SetEnvironmentVariable('Path', $Env:Path + ';C:\Program Files\CMake\bin', [EnvironmentVariableTarget]::Machine)
+
+# -----------------------------------------------------------------------------
+
+# Install Vim (can delete this but it's nice to have)
+# and add binaries to Path
+
+RUN Invoke-WebRequest -Uri https://ftp.nluug.nl/pub/vim/pc/gvim90.exe \
+    -OutFile "install_vim.exe"; \
+    Start-Process install_vim.exe -Wait -ArgumentList '/S'; \
+    Remove-Item install_vim.exe -Force ; \
+    [Environment]::SetEnvironmentVariable('Path', $Env:Path + ';C:\Program Files (x86)\Vim\vim90', [EnvironmentVariableTarget]::Machine)
+# -----------------------------------------------------------------------------
+
+# Install Chocolatey
+# Chocolatey is a package manager for Windows
+
+# If you try to install Chocolatey 2.0.0, it fails on .NET Framework 4.8 installation
+# https://stackoverflow.com/a/76470753
+ENV chocolateyVersion=1.4.0
+
+# https://docs.chocolatey.org/en-us/choco/setup#install-with-powershell.exe
+RUN Set-ExecutionPolicy Bypass -Scope Process -Force; \
+    [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; \
+    iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
+
+# -----------------------------------------------------------------------------
+
+# Install Git via Chocolatey
+RUN choco install git -y
+
+# -----------------------------------------------------------------------------
+# Install CUDA 11.8 NVTX
+RUN Invoke-WebRequest -Uri https://developer.download.nvidia.com/compute/cuda/11.8.0/network_installers/cuda_11.8.0_windows_network.exe \
+    -OutFile cuda_11.8.0_windows_network.exe; \
+    Invoke-WebRequest -Uri https://7-zip.org/a/7zr.exe \
+    -OutFile 7zr.exe
+
+RUN .\7zr.exe e -i!'nsight_nvtx\nsight_nvtx\NVIDIA NVTX Installer.x86_64.Release.v1.21018621.Win64.msi' cuda_11.8.0_windows_network.exe ;
+
+RUN cmd.exe /S /C "msiexec.exe /i 'NVIDIA NVTX Installer.x86_64.Release.v1.21018621.Win64.msi' /norestart /quiet"
+
+RUN Remove-Item 'NVIDIA NVTX Installer.x86_64.Release.v1.21018621.Win64.msi' -Force ; \
+    Remove-Item 7zr.exe -Force ; \
+    Remove-Item cuda_11.8.0_windows_network.exe -Force
+
+# -----------------------------------------------------------------------------
+
+# Define the entry point for the docker container.
+# This entry point launches the 64-bit PowerShell developer shell.
+# We need to launch with amd64 arch otherwise Powershell defaults to x86 32-bit build commands which don't jive with CUDA
+ENTRYPOINT ["C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\Common7\\Tools\\VsDevCmd.bat", "-arch=amd64", "&&", "powershell.exe", "-NoLogo", "-ExecutionPolicy", "Bypass"]
+
+# -----------------------------------------------------------------------------
+# COPY requirements-windows.txt C:\\workspace\\requirements-windows.txt
+# COPY requirements-dev-windows.txt C:\\workspace\\requirements-dev-windows.txt
+# RUN python3 -m pip --no-cache-dir install -r C:\workspace\requirements-dev-windows.txt
+# RUN Remove-Item "C:\workspace\requirements-windows.txt" -Force
+# RUN Remove-Item "C:\workspace\requirements-dev-windows.txt" -Force
+
+ADD ./requirements-dev-windows.txt ./requirements-dev-windows.txt
+ADD ./requirements-windows.txt ./requirements-windows.txt
+
+RUN python3 -m pip install --no-cache-dir -r .\requirements-dev-windows.txt
+
+
+ARG RUNNER_VERSION=2.317.0
+
+# Define the entry point for the docker container.
+# This entry point launches the 64-bit PowerShell developer shell.
+# We need to launch with amd64 arch otherwise Powershell defaults to x86 32-bit build commands which don't jive with CUDA
+# ENTRYPOINT ["C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\Common7\\Tools\\VsDevCmd.bat", "-arch=amd64", "&&", "powershell.exe", "-NoLogo", "-ExecutionPolicy", "Bypass"]
+
+RUN Invoke-WebRequest \
+    -Uri https://github.com/actions/runner/releases/download/v$env:RUNNER_VERSION/actions-runner-win-x64-$env:RUNNER_VERSION.zip \
+    -OutFile runner.zip; \
+    Expand-Archive -Path ./runner.zip -DestinationPath ./actions-runner; \
+    Remove-Item -Path .\runner.zip;
+
+ADD runner.ps1 ./runner.ps1
+
+RUN powershell -Command New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force
+
+CMD ["powershell.exe", "-ExecutionPolicy", "Unrestricted", "-File", ".\\runner.ps1"]
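
Note on the runner download step: the `Invoke-WebRequest` URL is assembled from the `RUNNER_VERSION` build argument, which appears twice in the path. The Python sketch below spells out the resulting URL for the default value in this Dockerfile. The function name `runner_download_url` is hypothetical and used only for illustration.

```python
def runner_download_url(version="2.317.0"):
    """Build the actions/runner release URL requested by the RUN step.

    Hypothetical helper: the default matches ARG RUNNER_VERSION in the
    Dockerfile, and the version is substituted in both places it appears.
    """
    return (
        "https://github.com/actions/runner/releases/download/"
        f"v{version}/actions-runner-win-x64-{version}.zip"
    )


if __name__ == "__main__":
    print(runner_download_url())
```

A different runner version can be selected at build time through the standard `--build-arg RUNNER_VERSION=...` option of `docker build`, since the Dockerfile declares it with `ARG`.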
