OpenAI-compatible API server powered by Apple’s on-device Foundation Models
Features • Requirements • Build & Install • Usage • License • References
A SwiftUI app that runs an OpenAI-compatible API server using Apple’s on-device Foundation Models, unifying base, deterministic, and creative variants under one endpoint for local use.
| Feature | Description |
|---|---|
| 🔁 OpenAI Compatibility | Drop-in replacement for the OpenAI API with a `/chat/completions` endpoint |
| ⚡ Streaming Support | Real-time responses via the OpenAI streaming format |
| 💻 On-Device Processing | Uses Apple Foundation Models — no external servers, fully local |
| ✅ Availability Check | Automatic Apple Intelligence availability check on startup |
- macOS: 26 or later
- Apple Intelligence: Must be enabled in Settings → Apple Intelligence & Siri
- Xcode: 26 or later (must match the OS version for building)
- Clone the repository
- Open `AppleIntelligenceAPI.xcodeproj` in Xcode
- Select your development team in project settings
- Build and run the project (`⌘+R`)
- The app will launch and start the server
Apple applies different rate-limiting policies to Foundation Models depending on how the app runs:
“An app with UI in the foreground has no rate limit. A macOS CLI tool without UI is rate-limited.”
— Apple DTS Engineer (source)
- Launch the app
- Configure server settings (default: `127.0.0.1:11435`)
- Click Start Server
- All three models will be served under OpenAI-compatible endpoints
- `GET /status` → Model availability & status
- `GET /v1/models` → List models
- `POST /v1/chat/completions` → Chat completions (supports streaming)
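Since the endpoints are OpenAI-compatible, `GET /v1/models` can be consumed with any HTTP client. A minimal sketch, assuming the server mirrors OpenAI's standard models-list shape (`{"object": "list", "data": [{"id": ...}]}`):

```python
import json
from urllib.request import urlopen

def model_ids(models_json):
    """Extract model ids from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_json["data"]]

# Against a running server (uncomment once the app is started):
# with urlopen("http://127.0.0.1:11435/v1/models") as r:
#     print(model_ids(json.load(r)))

# Offline check with a sample payload in the assumed shape:
sample = {"object": "list", "data": [{"id": "apple-fm-base", "object": "model"}]}
print(model_ids(sample))  # → ['apple-fm-base']
```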
```bash
# English
curl -X POST http://127.0.0.1:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "apple-fm-base",
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "temperature": 0.7,
    "stream": false
  }'

# French
curl -X POST http://127.0.0.1:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "apple-fm-base",
    "messages": [{"role": "user", "content": "Bonjour, comment allez-vous?"}],
    "stream": false
  }'

# Italian
curl -X POST http://127.0.0.1:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "apple-fm-base",
    "messages": [{"role": "user", "content": "Ciao, come stai?"}],
    "stream": false
  }'
```
```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:11435/v1", api_key="not-needed")

# --- English (streaming example) ---
print("🔹 English:")
stream = client.chat.completions.create(
    model="apple-fm-base",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    temperature=0.7,
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print("\n")

# --- French (non-streaming example) ---
print("🔹 French:")
resp_fr = client.chat.completions.create(
    model="apple-fm-base",
    messages=[{"role": "user", "content": "Bonjour, comment allez-vous?"}],
    stream=False,
)
print(resp_fr.choices[0].message.content)
print()

# --- Italian (non-streaming example) ---
print("🔹 Italian:")
resp_it = client.chat.completions.create(
    model="apple-fm-base",
    messages=[{"role": "user", "content": "Ciao, come stai?"}],
    stream=False,
)
print(resp_it.choices[0].message.content)
```
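Under the hood, the streaming client above consumes OpenAI-style server-sent events. A minimal sketch of parsing that framing directly, assuming the standard `data: {...}` lines terminated by `data: [DONE]` (the sample chunks below are illustrative, not actual server output):

```python
import json

def collect_stream_text(sse_lines):
    """Concatenate delta content from OpenAI-style SSE lines."""
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        delta = json.loads(payload)["choices"][0]["delta"]
        if delta.get("content"):
            text.append(delta["content"])
    return "".join(text)

# Illustrative chunks in the assumed framing:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # → Hello
```

This is the same accumulation the `for chunk in stream` loop performs via the `openai` client, shown here without the SDK.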
This project is licensed under the MIT License — see LICENSE.
This project is a fork and modification of gety-ai/apple-on-device-openai.
Built with 🍎 + ❤️ by the open-source community