-
Notifications
You must be signed in to change notification settings - Fork 687
Add Metal backend core ETMetal runtime. #15020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Stack from ghstack (oldest at bottom): |
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15020
Note: Links to docs will display an error until the docs builds have been completed. ❌ 1 New Failure, 1 Unrelated FailureAs of commit b782bb5 with merge base 6e0c9f6 ( NEW FAILURE - The following job has failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See inline
// Commit buffer and allow immediate reuse for better performance | ||
[commandBuffer_ commit]; | ||
ET_LOG(Debug, "ETMetalStream::commitAndContinue: Committed buffer %p with continue", commandBuffer_); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need to release the buffer after commit?
[commandBuffer_ release];
commandBuffer_ = nil;
if (cps_) [cps_ retain]; | ||
if (func_) [func_ retain]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need these retain
lines? Aren't these already owned by the class?
// Don't release encoder_ here - the stream owns it | ||
// Only clean up our own references | ||
if (cps_) { | ||
[cps_ release]; | ||
cps_ = nil; | ||
} | ||
if (func_) { | ||
[func_ release]; | ||
func_ = nil; | ||
} | ||
|
||
encoder_ = nil; // Clear reference without releasing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just this?
cps_ = nil;
func_ = nil;
encoder_ =nil;
resultsDictionary:results | ||
executionDescriptor:nil]; | ||
|
||
//synchronize(syncType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why commented out?
}; | ||
|
||
// ======================= | ||
// ETMetalShaderLibrary - ExecuTorch Metal shader library management |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a bit more comments. ETMetalShaderLibrary
- Compiles Metal Shading Language (MSL) source to MTLLibrary
- Caches compiled pipeline states for reuse
- Creates ETMetalKernelFunction lazily
- Lifetime: Persistent (application lifetime)
}; | ||
|
||
// ======================= | ||
// ETMetalKernelFunction - ExecuTorch Metal kernel function execution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Encodes kernel dispatches into command buffers
- Lifetime: Per-kernel-invocation
}; | ||
|
||
// ======================= | ||
// ETMetalStream - Metal command buffer and synchronization management |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ETMetalStream
- Manages Metal device, command queue, and command buffers
- Provides synchronization primitives (5 modes)
- Implements kernel coalescing for batch optimization
- Executes MPSGraph operations for AOTI-compiled models
extern "C" { | ||
#endif | ||
|
||
// Memory management functions for Metal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docblock about the memory management aspect of this design, such as buffer lifecycle, thread safety etc.
|
||
// C++ only - expose the Metal buffer mapping | ||
#ifdef __OBJC__ | ||
extern std::unordered_map<void*, MTLBuffer_t> ptr_to_mtl_buffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need a lock and thread safety to access ptr_to_mtl_buffer?
void ETMetalStream::copy(id<MTLBuffer> srcBuffer, id<MTLBuffer> dstBuffer, size_t length, | ||
size_t srcOffset, size_t dstOffset, SyncType syncType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before copying check that offsets are within the buffer bounds
This commit introduces the foundational Metal backend runtime. Key features: - ETMetalStream for managing Metal devices, command queues, buffers, and synchronization. - ETMetalShaderLibrary for compiling Metal shader source and caching pipeline states. - ETMetalKernelFunction for kernel argument binding, dispatching, and synchronization with stream-managed encoders. - Added global buffer management and pointer tracking between host and Metal buffers. - Added global stream management utilities and synchronization helpers This provides the necessary runtime primitives for executing compute shaders and MPSGraph workloads. ghstack-source-id: ea4fbb5 ghstack-comment-id: 3392300041 Pull-Request: pytorch#15020
This commit introduces the foundational Metal backend runtime.
Key features:
This provides the necessary runtime primitives for executing compute shaders and MPSGraph workloads.