- 🚀 Apple Silicon Optimized - Leverages Metal Performance Shaders and ARM NEON vectorization
- 🐍 NumPy Compatible - Familiar NumPy-style API and semantics for easy Python-to-C++ translation
- 🔄 Einops Support - Intuitive tensor rearrangement with einops-style operations
- 📊 ONNX Ready - Seamless integration with ONNX model inference pipelines
- ⚡ Zero-Copy Operations - Memory-efficient tensor operations where possible
- 🎯 Type Safety - Modern C++20 with compile-time shape checking
- macOS 11.0+ (Big Sur) with Apple Silicon
- Xcode 13+ or Clang 13+
- CMake 3.20+
- Metal Performance Shaders Framework
```bash
# Clone the repository
git clone https://github.com/your-username/axiom.git
cd axiom

# Create and enter the build directory
mkdir build && cd build

# Configure with CMake
cmake .. -DCMAKE_BUILD_TYPE=Release -DAXIOM_BUILD_TESTS=ON

# Build using all available cores
make -j$(sysctl -n hw.ncpu)

# Run the test suite
make test

# Install system-wide
sudo make install
```
```cmake
# Or use in your CMake project
find_package(Axiom REQUIRED)
target_link_libraries(your_target Axiom::axiom)
```
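Putting those two lines in context, a minimal consumer `CMakeLists.txt` could look like the sketch below. The project and target names (`my_app`, `main.cpp`) are placeholders; only the `Axiom::axiom` target comes from the snippet above.

```cmake
cmake_minimum_required(VERSION 3.20)
project(my_app CXX)

# Axiom targets C++20 (see the type-safety feature above)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

find_package(Axiom REQUIRED)

add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE Axiom::axiom)
```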
```cpp
#include <axiom/tensor.hpp>
#include <axiom/ops.hpp>

using namespace axiom;

int main() {
    // Create tensors with NumPy-like syntax
    auto x = tensor::randn({64, 128, 256});
    auto y = tensor::ones({256, 512});

    // Batched matrix multiplication: {64, 128, 256} x {256, 512} -> {64, 128, 512}
    auto result = x.matmul(y);

    // Einops-style rearrangement: {64, 128, 256} -> {64, 32768}
    auto reshaped = x.rearrange("b h w -> b (h w)");

    // Broadcasting and element-wise operations
    auto scaled = (x * 2.0f + 1.0f).relu();

    return 0;
}
```
- Basic tensor class with shape/stride handling
- Memory management and storage abstraction
- CMake build system setup
- Basic arithmetic operations (+, -, *, /)
- Metal backend implementation
- Linear algebra operations (matmul, dot, etc.)
- Reduction operations (sum, mean, max, etc.)
- Broadcasting semantics
- Einops-style rearrangement parser
- NumPy compatibility layer
- Automatic differentiation (optional)
- ONNX integration utilities
- Memory pool allocation
- Kernel fusion optimizations
- Multi-threading for CPU operations
- Benchmarking and profiling tools
- Memory Bandwidth: >90% of theoretical peak on Apple Silicon
- Compute Utilization: >85% GPU utilization for large tensor operations
- Latency: <1ms overhead for tensor creation and basic operations
- Compatibility: 100% pass rate on NumPy compatibility test suite
MIT License - see LICENSE file for details.
```bibtex
@misc{axiom2025,
  title={Axiom: High-Performance Tensor Library for Apple Silicon},
  author={Noah Kay},
  year={2025},
  url={https://github.com/frikallo/axiom}
}
```