This will read `rope_theta` from the JSON file if present and fall back to a default value of `10_000`.
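
For reference, here is a minimal sketch of that decoding pattern (the type and property names are illustrative, not the exact ones from this project):

```swift
struct ModelConfiguration: Codable {
    var ropeTheta: Float

    enum CodingKeys: String, CodingKey {
        case ropeTheta = "rope_theta"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Fall back to 10_000 when "rope_theta" is absent from the JSON
        ropeTheta = try container.decodeIfPresent(Float.self, forKey: .ropeTheta) ?? 10_000
    }
}
```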
### Understanding Property Wrappers
MLX Swift uses property wrappers to handle automatic registration of modules and parameters. This is different from Python MLX, which uses runtime discovery.
#### Python MLX: Automatic Discovery
In Python MLX, you simply assign modules and parameters as attributes, and MLX automatically discovers them:
```python
import mlx.core as mx
import mlx.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Just assign - MLX auto-discovers these
        self.linear = nn.Linear(256, 256)  # sub-module
        self.weight = mx.ones([256])       # parameter
```
#### Swift MLX: Property Wrappers
133
+
134
+
In Swift MLX, you must explicitly declare modules and parameters using property wrappers:
135
+
136
+
```swift
import MLX
import MLXNN

class MyModule: Module {
    @ModuleInfo var linear: Linear       // sub-module
    @ParameterInfo var weight: MLXArray  // parameter

    init() {
        self._linear.wrappedValue = Linear(256, 256)
        self._weight.wrappedValue = MLXArray.ones([256])
        super.init()
    }
}
```
#### When to Use Each
- **`@ModuleInfo`**: For neural network layers (anything with `callAsFunction`)
  - Examples: `Linear`, `Attention`, `MLP`, `RMSNorm`, arrays of layers
- **`@ParameterInfo`**: For raw `MLXArray` tensors you use manually

This explicit registration system provides type safety and ensures all modules and parameters are properly tracked for operations like weight loading, quantization, and gradient computation.
#### Advanced Property Wrapper Patterns
Beyond the basic usage, there are several advanced patterns you'll encounter:
**Optional Modules**
```swift
@ModuleInfo(key: "lm_head") var lmHead: Linear?
@ModuleInfo(key: "text_projection") var textProjection: Linear?
```
Used when modules are conditionally created based on configuration.
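
As a sketch (the configuration flag and dimensions here are assumed), the optional module is only assigned when the configuration calls for it and otherwise stays `nil`:

```swift
init(_ config: Config) {
    // Create the head only when embeddings are not tied (assumed flag)
    if !config.tieWordEmbeddings {
        self._lmHead.wrappedValue = Linear(config.hiddenSize, config.vocabularySize)
    }
    super.init()
}
```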
**Arrays of Modules**
```swift
@ModuleInfo(key: "layers") var layers: [TransformerBlock]
@ModuleInfo(key: "down_blocks") var downBlocks: [EncoderDecoderBlock2D]
```
For dynamic numbers of repeated layers.
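
The array is typically filled from a configuration value in `init`, along these lines (`TransformerBlock` and the config fields are assumed):

```swift
init(_ config: Config) {
    // One block per hidden layer
    self._layers.wrappedValue = (0 ..< config.numHiddenLayers).map { _ in
        TransformerBlock(config)
    }
    super.init()
}
```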
**Complex Module Types**
```swift
@ModuleInfo(key: "mid_blocks") var midBlocks: (ResnetBlock2D, Attention, ResnetBlock2D)
```
Tuples and other composite types are supported.
**Optional Parameters**
```swift
@ParameterInfo var bias: MLXArray?
@ModuleInfo(key: "bias") var bias: MLXArray?
```
For optional parameters that may not exist in all model variants.
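
At call time, an optional parameter is unwrapped before use. A sketch assuming a linear-style layer:

```swift
func callAsFunction(_ x: MLXArray) -> MLXArray {
    var y = matmul(x, weight.T)
    // Apply the bias only if this model variant has one
    if let bias {
        y = y + bias
    }
    return y
}
```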
**Special Case: `@ModuleInfo` with `MLXArray`**
```swift
@ModuleInfo(key: "weight") var weight: MLXArray
@ModuleInfo(key: "scales") var scales: MLXArray
```
In rare cases (like quantized layers), `@ModuleInfo` is used with `MLXArray` instead of `@ParameterInfo`. This typically occurs with specialized quantization or expert layers where the arrays are treated as sub-modules for weight loading purposes.
**Computed vs Loaded Parameters**
```swift
// Parameter loaded from weights - uses @ParameterInfo
@ParameterInfo(key: "correct_output_scale") var correctOutputScale: MLXArray
```
Use private underscore properties when you need to compute values based on configuration or other parameters, but don't want the weight loading to fail because these "parameters" don't exist in the saved weights.
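
A sketch of the computed side of this split (names assumed); the value is derived in `init` instead of being read from the checkpoint:

```swift
// Private underscore property - not registered, so weight loading
// won't look for a matching entry in the safetensors file
private let _outputScale: MLXArray

init(dimensions: Int) {
    self._outputScale = MLXArray(1.0 / Float(dimensions).squareRoot())
    super.init()
}
```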
### Porting Layers without Children
Now we can begin porting the layers (Modules). Here is an example layer with no child layers (e.g. `Linear`) but with parameters (e.g. `MLXArray`):
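
For instance, a minimal sketch of the shape such a layer takes (loosely modeled on `Linear`; the names and the initialization scheme are illustrative, not this guide's exact example):

```swift
// A child-free layer: only @ParameterInfo properties, no sub-modules
class MyLinear: Module, UnaryLayer {
    @ParameterInfo var weight: MLXArray
    @ParameterInfo var bias: MLXArray

    init(_ inputDimensions: Int, _ outputDimensions: Int) {
        // Illustrative random initialization
        let scale = 1.0 / Float(inputDimensions).squareRoot()
        self._weight.wrappedValue = MLXRandom.uniform(
            low: -scale, high: scale, [outputDimensions, inputDimensions])
        self._bias.wrappedValue = MLXArray.zeros([outputDimensions])
        super.init()
    }

    func callAsFunction(_ x: MLXArray) -> MLXArray {
        matmul(x, weight.T) + bias
    }
}
```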