This repository was archived by the owner on Apr 28, 2023. It is now read-only.

Commit 2679f46: Merge pull request #5 from facebookresearch/readme-arxiv
adding arXiv link to readme + small improvements to readme
2 parents: 25c8644 + f17252b

1 file changed: README.md (19 additions, 46 deletions)
````diff
@@ -1,6 +1,6 @@
 # ![Tensor Comprehensions](docs/source/_static/img/tc-logo-full-color-with-text-2.png)
 
-Tensor Comprehensions (TC) is a fully-functional C++ library to *automatically* synthesize high-performance machine learning kernels using [Halide](https://github.com/halide/Halide), [ISL](http://isl.gforge.inria.fr/) and NVRTC or LLVM. TC additionally provides basic integration with Caffe2 and pybind11 bindings for use with python.
+Tensor Comprehensions (TC) is a fully-functional C++ library to *automatically* synthesize high-performance machine learning kernels using [Halide](https://github.com/halide/Halide), [ISL](http://isl.gforge.inria.fr/) and NVRTC or LLVM. TC additionally provides basic integration with Caffe2 and pybind11 bindings for use with python. We provide more details in our paper on [arXiv](https://arxiv.org/abs/1802.04730).
 
 This library is designed to be highly portable, machine-learning-framework agnostic and only requires a simple tensor library with memory allocation, offloading and synchronization capabilities.
 
````
````diff
@@ -12,38 +12,29 @@ The following illustrates a short but powerful feature of the library: the capac
 
 ```cpp
 #include <ATen/ATen.h>
-
 #include "tc/aten/aten_compiler.h"
 #include "tc/core/mapping_options.h"
 
-// 1. Define and setup the TC compilation unit with CUDA memory
-// management backed by ATen tensors.
+// 1. Define and setup the TC compilation unit with CUDA memory management backed by ATen.
 std::string tc = R"TC(
-def channel_contraction(float(N, C1, C2, H, W) I0,
-                        float(N, C2, C3, H, W) I1)
--> (O)
-{
-    O(n, c1, c3, h, w) +=! I0(n, c1, c2, h, w) * I1(n, c2, c3, h, w)
-}
-)TC";
-
-tc::ATenCompilationUnit atCompl;
-atCompl.define(tc);
+def TensorDot(float(N, C1, C2, H, W) I0, float(N, C2, C3, H, W) I1) -> (O) {
+    O(n, c1, c3, h, w) +=! I0(n, c1, c2, h, w) * I1(n, c2, c3, h, w)
+})TC";
 
 // 2. Allocate tensors with random data
-std::vector<at::Tensor> outputs;
 at::Tensor I0 = at::CUDA(at::kFloat).rand({32, 512, 8, 28, 28});
-at::Tensor I1 = at::CUDA(at::kFloat).rand({32, 8, 2, 28, 28});;
+at::Tensor I1 = at::CUDA(at::kFloat).rand({32, 8, 2, 28, 28});
+std::vector<at::Tensor> outputs;
 
 // 3. Run autotuning with evolutionary search starting from a naive option
 auto options = tc::MappingOptions::makeNaiveMappingOptions();
-auto bestOption =
-    autotune(cacheFilename, TC, "channel_contraction", {I0, I1}, options, {options});
+auto bestOption = autotune(cacheFilename, tc, "TensorDot", {I0, I1}, options, {options});
 
 // 4. Compile and run the TC with the best option.
-// Outputs get allocated; could also be pre-allocated and passed
-auto handle = atCompl.compile("channel_contraction", {I0, I1}, bestOption);
-atCompl.run("channel_contraction", {I0, I1}, outputs, handle);
+tc::ATenCompilationUnit atCompl;
+atCompl.define(tc);
+auto handle = atCompl.compile("TensorDot", {I0, I1}, bestOption);
+atCompl.run("TensorDot", {I0, I1}, outputs, handle);
 
 // 5. Perform precision checks against an ATen reference implementation
 check({I0, I1}, outputs, [&I0, &I1](){ return ...; });
````
````diff
@@ -55,37 +46,19 @@ After a few generations of autotuning on a 2-GPU P100 system, we see results res
 
 We have not yet characterized the precise fraction of peak performance we obtain but it is not uncommon to obtain 80%+ of peak shared memory bandwidth after autotuning. Solid register-level optimizations are still in the work but TC in its current form already addresses the productivity gap between the needs of research and the needs of production. Which is why we are excited to share it with the entire community and bring this collaborative effort in the open.
 
-# Documentation, Environment and Prerequisites
-We provide pre-built docker images in the docker subdirectory, they can be downloaded from [dockerhub](https://hub.docker.com/u/tensorcomprehensions/). We use and support those images as part of our continuous integration. Note that we can cross-compile CUDA (but not execute) even if the machine has no physical GPUs. In any case the CUDA toolkit and libraries should always be installed, for now.
-
-To get started, see the [docs](master/docs) directory.
-
-# Preparing the source
+# Installation / Documentation
+You can find documentation [here](https://facebookresearch.github.io/TensorComprehensions/) which contains instructions for building TC via docker, conda packages or in non-conda environment.
 
-Once the environment is set up properly you can:
-``` shell
-git clone --recursive git@github.com:facebookresearch/TensorComprehensions.git
-cd TensorComprehensions
-```
-
-# Build and test
+# Communication
 
-```shell
-BUILD_TYPE=Release CLANG_PREFIX=$(llvm-config --prefix) ./build.sh --all && ./test_cpu.sh
-BUILD_TYPE=Release CLANG_PREFIX=$(llvm-config --prefix) ./build.sh --all && ./test.sh
-```
+* **GitHub issues**: bug reports, feature requests, install issues, RFCs, thoughts, etc.
+* **Slack**: For discussion around framework integration, build support, collaboration, etc. join our slack channel https://tensorcomprehensions.slack.com. You may need an invitation to join, contact us by email at tensorcomp@fb.com to get one.
 
-# Build and test with Caffe2
-
-```shell
-BUILD_TYPE=Release WITH_CAFFE2=ON CLANG_PREFIX=$(llvm-config --prefix) ./build.sh --all && ./build/test/test_caffe2
-```
+# Code of Conduct
+See the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) file for more details.
 
 # License
 Tensor Comprehensions is distributed under a permissive Apache v2.0 license, see the [LICENSE](LICENSE) file for more details.
 
-# Code of Conduct
-See the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) file for more details.
-
 # Contributing
 See the [CONTRIBUTING.md](CONTRIBUTING.md) file for more details.
````
