@@ -1,256 +1,29 @@
-===============================
-SageMaker TensorFlow Containers
-===============================
+=====================================
+SageMaker TensorFlow Training Toolkit
+=====================================
 
-SageMaker TensorFlow Containers is an open source library for making the
-TensorFlow framework run on `Amazon SageMaker <https://aws.amazon.com/documentation/sagemaker/>`__.
+SageMaker TensorFlow Training Toolkit is an open-source library for using TensorFlow to train models on Amazon SageMaker.
 
-This repository also contains Dockerfiles which install this library, TensorFlow, and dependencies
-for building SageMaker TensorFlow images.
+For inference, see `SageMaker TensorFlow Inference Toolkit <https://github.com/aws/sagemaker-tensorflow-serving-container>`__.
 
-For information on running TensorFlow jobs on SageMaker: `Python
-SDK <https://github.com/aws/sagemaker-python-sdk>`__.
+For the Dockerfiles used for building SageMaker TensorFlow Containers, see `AWS Deep Learning Containers <https://github.com/aws/deep-learning-containers>`__.
+
+For information on running TensorFlow jobs on Amazon SageMaker, please refer to the `SageMaker Python SDK documentation <https://github.com/aws/sagemaker-python-sdk>`__.
 
 For notebook examples: `SageMaker Notebook
 Examples <https://github.com/awslabs/amazon-sagemaker-examples>`__.
 
-Table of Contents
------------------
-
-#. `Getting Started <#getting-started>`__
-#. `Building your Image <#building-your-image>`__
-#. `Running the tests <#running-the-tests>`__
-
-Getting Started
----------------
-
-Prerequisites
-~~~~~~~~~~~~~
-
-Make sure you have installed all of the following prerequisites on your
-development machine:
-
-- `Docker <https://www.docker.com/>`__
-
-For Testing on GPU
-^^^^^^^^^^^^^^^^^^
-
-- `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__
-
-Recommended
-^^^^^^^^^^^
-
-- A Python environment management tool. (e.g.
- `PyEnv <https://github.com/pyenv/pyenv>`__,
- `VirtualEnv <https://virtualenv.pypa.io/en/stable/>`__)
-
-Building your Image
--------------------
-
-`Amazon SageMaker <https://aws.amazon.com/documentation/sagemaker/>`__
-utilizes Docker containers to run all training jobs & inference endpoints.
-
-The Docker images are built from the Dockerfiles specified in
-`Docker/ <https://github.com/aws/sagemaker-tensorflow-containers/tree/master/docker>`__.
-
-The Docker files are grouped based on TensorFlow version and separated
-based on Python version and processor type.
-
-The Docker files for TensorFlow 2.0 are available in the
-`tf-2 <https://github.com/aws/sagemaker-tensorflow-container/tree/tf-2>`__ branch, in
-`docker/2.0.0/ <https://github.com/aws/sagemaker-tensorflow-container/tree/tf-2/docker/2.0.0>`__.
-
-The Docker images, used to run training & inference jobs, are built from
-both corresponding "base" and "final" Dockerfiles.
-
-Base Images
-~~~~~~~~~~~
-
-The "base" Dockerfile encompass the installation of the framework and all of the dependencies
-needed. It is needed before building image for TensorFlow 1.8.0 and before.
-Building a base image is not required for images for TensorFlow 1.9.0 and onwards.
-
-Tagging scheme is based on <tensorflow_version>-<processor>-<python_version>. (e.g. 1.4
-.1-cpu-py2)
-
-All "final" Dockerfiles build images using base images that use the tagging scheme
-above.
-
-Before building these images, you need to have a pip-installable binary of this repository saved locally. To create the SageMaker Tensorflow Container Python package:
-
-::
- # Create the binary
- git clone https://github.com/aws/sagemaker-tensorflow-container.git
- cd sagemaker-tensorflow-container
- python setup.py sdist
- cp dist/sagemaker_tensorflow_training*.tar.gz docker/<tensorflow_version>/sagemaker_tensorflow_training.tar.gz
-
-Once you have copied the tensorflow_training.tar.gz to the desired location [same directory as the Dockerfile], you can then build the image.
-
-If you want to build your "base" Docker image, then use:
-
-::
-
- # All build instructions assume you're building from the same directory as the Dockerfile.
-
- # CPU
- docker build -t tensorflow-base:<tensorflow_version>-cpu-<python_version> -f Dockerfile.cpu .
-
- # GPU
- docker build -t tensorflow-base:<tensorflow_version>-gpu-<python_version> -f Dockerfile.gpu .
-
-::
-
- # Example
-
- # CPU
- docker build -t tensorflow-base:1.4.1-cpu-py2 -f Dockerfile.cpu .
-
- # GPU
- docker build -t tensorflow-base:1.4.1-gpu-py2 -f Dockerfile.gpu .
-
-Final Images
-~~~~~~~~~~~~
-
-The "final" Dockerfiles encompass the installation of the SageMaker specific support code.
-
-For images of TensorFlow 1.8.0 and before, all "final" Dockerfiles use `base images for building <https://github
-.com/aws/sagemaker-tensorflow-containers/blob/master/docker/1.4.1/final/py2/Dockerfile.cpu#L2>`__.
-
-These "base" images are specified with the naming convention of
-tensorflow-base:<tensorflow_version>-<processor>-<python_version>.
-
-Before building "final" images:
-
-Build your "base" image. Make sure it is named and tagged in accordance with your "final"
-Dockerfile. Skip this step if you want to build image of Tensorflow Version 1.9.0 and above.
-
-If you want to build "final" Docker images, for versions 1.6 and above, you will first need to download the appropriate tensorflow pip wheel, then pass in its location as a build argument. These can be obtained from pypi. For example, the files for 1.6.0 are here:
-
-https://pypi.org/project/tensorflow/1.6.0/#files
-https://pypi.org/project/tensorflow-gpu/1.6.0/#files
-
-Note that you need to use the tensorflow-gpu wheel when building the GPU image.
-
-Then run:
-
-::
-
- # All build instructions assumes you're building from the same directory as the Dockerfile.
-
- # CPU
- docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .
-
- # GPU
- docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .
-
-::
-
- # Example
- docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg py_version=2
- --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
-
-The dockerfiles for 1.4 and 1.5 build from source instead, so when building those, you don't need to download the wheel beforehand:
-
-::
-
- # All build instructions assumes you're building from the same directory as the Dockerfile.
-
- # CPU
- docker build -t <image_name>:<tag> -f Dockerfile.cpu .
-
- # GPU
- docker build -t <image_name>:<tag> -f Dockerfile.gpu .
-
-::
-
- # Example
-
- # CPU
- docker build -t preprod-tensorflow:1.4.1-cpu-py2 -f Dockerfile.cpu .
-
- # GPU
- docker build -t preprod-tensorflow:1.4.1-gpu-py2 -f Dockerfile.gpu .
-
-
-Running the tests
------------------
-
-Running the tests requires installation of the SageMaker TensorFlow Container code and its test
-dependencies.
-
-::
-
- git clone https://github.com/aws/sagemaker-tensorflow-containers.git
- cd sagemaker-tensorflow-containers
- pip install -e .[test]
-
-Tests are defined in
-`test/ <https://github.com/aws/sagemaker-tensorflow-containers/tree/master/test>`__
-and include unit, integration and functional tests.
-
-Unit Tests
-~~~~~~~~~~
-
-If you want to run unit tests, then use:
-
-::
-
- # All test instructions should be run from the top level directory
-
- pytest test/unit
-
-Integration Tests
-~~~~~~~~~~~~~~~~~
-
-Running integration tests require `Docker <https://www.docker.com/>`__ and `AWS
-credentials <https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html>`__,
-as the integration tests make calls to a couple AWS services. The integration and functional
-tests require configurations specified within their respective
-`conftest.py <https://github.com/aws/sagemaker-tensorflow-containers/blob/master/test/integration/conftest.py>`__.Make sure to update the account-id and region at a minimum.
-
-Integration tests on GPU require `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__.
-
-Before running integration tests:
-
-#. Build your Docker image.
-#. Pass in the correct pytest arguments to run tests against your Docker image.
-
-If you want to run local integration tests, then use:
-
-::
-
- # Required arguments for integration tests are found in test/integ/conftest.py
-
- pytest test/integration --docker-base-name <your_docker_image> \
- --tag <your_docker_image_tag> \
- --framework-version <tensorflow_version> \
- --processor <cpu_or_gpu>
-
-::
-
- # Example
- pytest test/integration --docker-base-name preprod-tensorflow \
- --tag 1.0 \
- --framework-version 1.4.1 \
- --processor cpu
-
-Functional Tests
-~~~~~~~~~~~~~~~~
-
-Functional tests are removed from the current branch, please see them in older branch `r1.0 <https://github.com/aws/sagemaker-tensorflow-container/tree/r1.0#functional-tests>`__.
-
 Contributing
 ------------
 
 Please read
-`CONTRIBUTING.md <https://github.com/aws/sagemaker-tensorflow-containers/blob/master/CONTRIBUTING.md>`__
+`CONTRIBUTING.md <https://github.com/aws/sagemaker-tensorflow-training-toolkit/blob/master/CONTRIBUTING.md>`__
 for details on our code of conduct, and the process for submitting pull
 requests to us.
 
 License
 -------
 
-SageMaker TensorFlow Containers is licensed under the Apache 2.0 License. It is copyright 2018
+SageMaker TensorFlow Training Toolkit is licensed under the Apache 2.0 License. It is copyright 2018
 Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at:
 http://aws.amazon.com/apache2.0/