A comprehensive Windows Forms application that demonstrates how to integrate with LM Studio or other compatible AI services through a clean C# implementation. This application supports text generation, vision models, embeddings, and model management.
This application provides a feature-rich user interface to interact with a local LM Studio API or any API compatible with the OpenAI Chat Completions endpoint. It demonstrates streaming and non-streaming modes, multi-modal inputs (text + images), semantic search capabilities, and model discovery.
- Text Generation: Connect to local LM Studio API or compatible endpoints
- Streaming & Non-Streaming: Support for both real-time and complete response modes
- Vision Support: Send images along with text prompts to vision-language models (VLMs)
- Multi-Image Support: Process multiple images in a single request
- Conversation History: Maintains context across multiple exchanges
- Request Cancellation: Cancel ongoing requests at any time
- Embeddings API: Generate vector embeddings for semantic search and similarity comparison
- Model Discovery: List and query all available models (loaded and downloaded)
- Model Filtering: Filter models by type (LLM, VLM, Embeddings)
- Semantic Search: Implement vector-based search using embeddings
- Similarity Calculations: Compare text similarity using cosine similarity
- Clean, well-documented codebase with comprehensive error handling
- Event-based architecture for real-time updates
- Proper resource management with IDisposable implementation
- Polymorphic message content handling
- Thread-safe UI updates
- Configurable timeouts for different operation types
- .NET 9.0 or later
- LM Studio or compatible AI service running locally
- Basic understanding of C# and .NET Windows Forms
- Download and install LM Studio
- Load a compatible model (e.g., Llama, Qwen, Mistral)
- For vision capabilities, load a VLM model (e.g., qwen2-vl, llava)
- For embeddings, load an embedding model (e.g., nomic-embed-text)
- Start the local server (typically runs on http://localhost:1234)
The application is pre-configured to connect to:
- URL: `http://localhost:1234/v1/chat/completions`
- Model: `lfm2-vl-1.6b`
- System prompt: `you are a professional assistant`
You can modify these parameters in the Form1.cs constructor:
```csharp
_aiClient = new LMStudioExample(
    "http://localhost:1234/v1/chat/completions", // API endpoint URL
    "lfm2-vl-1.6b",                              // Model to use
    "you are a professional assistant"           // System instructions
);
_aiClient.initialize();
```

The code is organized into three main components:
`LMStudioExample`: a comprehensive class that handles all communication with the AI API.
Chat methods:
- `SendMessageAsync(string userMessage)`: For streaming text responses
- `SendMessageNonStreamingAsync(string userMessage)`: For non-streaming text responses
- `SendMessageWithImagesAsync(string userMessage, string[] imagePaths)`: For streaming with image support
- `SendMessageWithImagesNonStreamingAsync(string userMessage, string[] imagePaths)`: For non-streaming with images
Embedding methods:
- `GetEmbeddingAsync(string text, string embeddingModel)`: Get the embedding vector for a single text
- `GetEmbeddingsBatchAsync(string[] texts, string embeddingModel)`: Get embeddings for multiple texts
- `CalculateCosineSimilarity(float[] embedding1, float[] embedding2)`: Compare two embeddings (static method)
Model discovery methods:
- `GetAllModelsAsync()`: Retrieve all available models (loaded and downloaded)
- `GetModelInfoAsync(string modelId)`: Get detailed information about a specific model
- `GetLoadedModelsAsync()`: Get only the models currently loaded in memory
- `GetEmbeddingModelsAsync()`: Get all embedding models
- `GetLanguageModelsAsync()`: Get all language models (LLMs)
- `GetVisionModelsAsync()`: Get all vision-language models (VLMs)
Configuration methods:
- `SetTimeout(long timeSpanInSeconds)`: Configure the request timeout
- `initialize(long timeoutInSeconds)`: Initialize the client with conversation history
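A minimal configuration sketch; the exact overloads and how the values map onto streaming versus non-streaming requests may differ in your copy of the class (the 15 s / 200 s figures come from the performance notes later in this document):

```csharp
// Illustrative values only; adjust to your models and hardware.
_aiClient.SetTimeout(200);   // allow long-running (e.g. non-streaming) requests up to 200 s
_aiClient.initialize(15);    // (re)initialize conversation history, here with a 15 s timeout
```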
Events:
- `OnContentReceived`: When content chunks are received (streaming mode)
- `OnComplete`: When a response is complete
- `OnStatusUpdate`: For status changes
- `OnError`: When errors occur
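The UI typically subscribes to these events right after constructing the client. A sketch, assuming simple string-based handlers; `AppendToChat`, `ShowError`, and `statusLabel` are placeholder UI members, not part of the sample:

```csharp
// Hypothetical wiring; the exact delegate signatures may differ in the class itself.
// Handlers that touch controls may need to marshal to the UI thread (Invoke/BeginInvoke).
_aiClient.OnContentReceived += chunk  => AppendToChat(chunk);        // streaming chunks
_aiClient.OnComplete        += _      => statusLabel.Text = "Done";  // response finished
_aiClient.OnStatusUpdate    += status => statusLabel.Text = status;  // status changes
_aiClient.OnError           += error  => ShowError(error);           // error reporting
```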
`Form1`: the Windows Forms UI, which:
- Creates and configures the LMStudioExample instance
- Handles user input and button clicks
- Updates UI elements based on response events
- Manages request cancellation
- Ensures UI updates occur on the correct thread
- Maintains chat history display
The third component is the set of data model and serialization classes. Message content types:
- `IMessageContent`: Interface for polymorphic message content
- `TextContent`: Text-based message content
- `ImageContent`: Image-based message content with base64 encoding
- `ImageUrlData`: Container for image URLs (data URLs with base64)
Message factory methods:
- `Message.CreateSystemMessage(string text)`: Create system instruction messages
- `Message.CreateUserTextMessage(string text)`: Create user text messages
- `Message.CreateAssistantMessage(string text)`: Create assistant response messages
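For example, a conversation history can be seeded with these factories (the surrounding `List<Message>` is illustrative; the client maintains its own history internally):

```csharp
// Requires: using System.Collections.Generic;
// Illustrative only – shows the factory methods, not the client's internal storage.
var history = new List<Message>
{
    Message.CreateSystemMessage("you are a professional assistant"),
    Message.CreateUserTextMessage("Summarize today's meeting notes."),
    Message.CreateAssistantMessage("Here is a brief summary: ...")
};
```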
Custom JSON converters:
- `MessageContentConverter`: Handles polymorphic serialization of message content
- `FlexibleContentConverter`: Handles both string and array content formats
The application includes comprehensive classes for deserializing various API responses:
- Streaming responses: `StreamingResponse`, `StreamingChoice`, `DeltaMessage`
- Non-streaming responses: `AIMessage`, `Choice`, `Message`, `Usage`
- `EmbeddingResponse`: Container for embedding results
- `EmbeddingData`: Individual embedding with vector and index
- `EmbeddingUsage`: Token usage information
- `EmbeddingRequest`: Request format for the embeddings API
- `ModelsListResponse`: Container for the list of models
- `ModelInfo`: Detailed information about a model, including:
  - Model ID, type, publisher, architecture
  - Quantization method, state (loaded/not loaded)
  - Max context length, compatibility type
  - Helper properties: `IsLoaded`, `IsEmbeddingModel`, `IsLanguageModel`, `IsVisionModel`
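These helpers make it easy to narrow down a model list. A sketch using LINQ (`GetAllModelsAsync` is documented above; the exact collection type it returns may vary):

```csharp
// Requires: using System.Linq;
// Find vision models that are already loaded and ready to receive images.
var models = await _aiClient.GetAllModelsAsync();
var readyVisionModels = models.Where(m => m.IsLoaded && m.IsVisionModel).ToList();
```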
The application supports multiple image formats:
- JPEG (.jpg, .jpeg)
- PNG (.png)
- GIF (.gif)
- WebP (.webp)
- BMP (.bmp)
Images are automatically:
- Read from disk
- Converted to base64 encoding
- Embedded in data URLs with proper MIME types
- Included in the message content array
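Conceptually, the conversion is equivalent to the following sketch; the variable names are illustrative and the class performs all of this for you:

```csharp
// Requires: using System; using System.IO;
// Rough equivalent of what the client does for each image path.
byte[] bytes = File.ReadAllBytes(imagePath);
string mimeType = Path.GetExtension(imagePath).ToLowerInvariant() switch
{
    ".jpg" or ".jpeg" => "image/jpeg",
    ".png"            => "image/png",
    ".gif"            => "image/gif",
    ".webp"           => "image/webp",
    ".bmp"            => "image/bmp",
    _                 => "application/octet-stream"
};
string dataUrl = $"data:{mimeType};base64,{Convert.ToBase64String(bytes)}";
```

In practice you only pass file paths; sending images to a vision model looks like this: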
```csharp
// Single image
await _aiClient.SendMessageWithImagesAsync(
"What do you see in this image?",
new[] { @"C:\path\to\image.jpg" }
);
// Multiple images
await _aiClient.SendMessageWithImagesAsync(
"Compare these images",
new[] { @"C:\path\to\image1.jpg", @"C:\path\to\image2.png" }
);
```

```csharp
// Single text embedding
var embedding = await _aiClient.GetEmbeddingAsync(
"Machine learning is fascinating",
"text-embedding-nomic-embed-text-v1.5"
);
// Batch embeddings
var embeddings = await _aiClient.GetEmbeddingsBatchAsync(
new[] { "Text 1", "Text 2", "Text 3" },
"text-embedding-nomic-embed-text-v1.5"
);
```

```csharp
float similarity = LMStudioExample.CalculateCosineSimilarity(embedding1, embedding2);
// Returns value between -1 and 1 (1 = identical, 0 = unrelated)
```
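For reference, cosine similarity is the dot product of the two vectors divided by the product of their magnitudes. A minimal sketch of the math (prefer the built-in static method in practice):

```csharp
// Reference implementation of cosine similarity: dot(a, b) / (|a| * |b|).
static float CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot   += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}
```

A simple semantic search then combines these pieces: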
```csharp
// 1. Get query embedding
var queryEmbedding = await _aiClient.GetEmbeddingAsync(query);
// 2. Get document embeddings
var docEmbeddings = await _aiClient.GetEmbeddingsBatchAsync(documents);
// 3. Calculate similarities and rank
var results = documents
.Select((doc, i) => new {
Document = doc,
Similarity = LMStudioExample.CalculateCosineSimilarity(
queryEmbedding,
docEmbeddings[i]
)
})
.OrderByDescending(r => r.Similarity)
    .ToList();
```

```csharp
var models = await _aiClient.GetAllModelsAsync();
foreach (var model in models)
{
Console.WriteLine($"{model.Id} - {model.Type} ({model.State})");
}
```

```csharp
var loadedModels = await _aiClient.GetLoadedModelsAsync();
if (loadedModels?.Length > 0)
{
// Models are ready to use
}
```

```csharp
var embeddingModels = await _aiClient.GetEmbeddingModelsAsync();
var languageModels = await _aiClient.GetLanguageModelsAsync();
var visionModels = await _aiClient.GetVisionModelsAsync();
```

```csharp
var modelInfo = await _aiClient.GetModelInfoAsync("qwen2-vl-7b-instruct");
Console.WriteLine($"Max Context: {modelInfo.MaxContextLength} tokens");
Console.WriteLine($"Loaded: {modelInfo.IsLoaded}");The application implements comprehensive error handling:
- Validation of user input (empty messages, missing files)
- File existence checks for images
- Proper exception catching and reporting
- Graceful handling of request cancellation
- HTTP status code validation
- JSON parsing error handling
- Clear error messages displayed to the user
- Debug logging for troubleshooting
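One way calling code can cooperate with this (a sketch; the shipped `Form1` may route some of these cases through the `OnError` event instead, and `statusLabel` is a placeholder control):

```csharp
// Requires: using System; using System.Diagnostics; using System.Windows.Forms;
try
{
    await _aiClient.SendMessageNonStreamingAsync(userMessage);
}
catch (OperationCanceledException)
{
    statusLabel.Text = "Request cancelled.";         // cancellation is a normal outcome
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message, "Request failed");   // clear message for the user
    Debug.WriteLine(ex);                             // debug logging for troubleshooting
}
```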
The LMStudioExample class implements IDisposable to ensure proper cleanup of resources, especially the HttpClient instance. Always dispose of the client when done:
```csharp
protected override void OnFormClosing(FormClosingEventArgs e)
{
_cts?.Cancel();
_aiClient.Dispose();
base.OnFormClosing(e);
}
```

- Enter your prompt in the text box
- Click "Send (Non-Streaming)" for a complete response at once
- Click "Send (Streaming)" to see the response arrive in real-time
- Use the Cancel button to stop an ongoing request
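In code, the two send buttons map onto the two methods like this (a sketch; control names such as `btnSendStreaming` and `txtPrompt` are illustrative):

```csharp
private async void btnSendNonStreaming_Click(object sender, EventArgs e)
{
    // The full response arrives at once and is reported via OnComplete.
    await _aiClient.SendMessageNonStreamingAsync(txtPrompt.Text);
}

private async void btnSendStreaming_Click(object sender, EventArgs e)
{
    // Chunks arrive incrementally via OnContentReceived as the model generates them.
    await _aiClient.SendMessageAsync(txtPrompt.Text);
}
```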
- Modify the code to include image paths
- Call `SendMessageWithImagesAsync()` with your text and image paths
- Ensure you're using a VLM model that supports vision
- Load an embedding model in LM Studio
- Call `GetEmbeddingAsync()` with your text
- Use the resulting vectors for similarity comparisons or search
- Call `GetAllModelsAsync()` to see available models
- Use `GetLoadedModelsAsync()` to check what's ready
- Filter by type to find specific model capabilities
You can easily extend this application to:
- Add UI elements for model selection dropdowns
- Implement image upload functionality through file dialogs
- Create a vector database for storing embeddings
- Build a full semantic search interface
- Add support for authentication tokens
- Include advanced parameter controls (temperature, top_p, max_tokens)
- Implement persistent conversation history
- Add support for function calling
- Create batch processing workflows
- Build a RAG (Retrieval Augmented Generation) system
- Chat Completions: `http://localhost:1234/v1/chat/completions`
- Embeddings: `http://localhost:1234/api/v0/embeddings`
- Models List: `http://localhost:1234/api/v0/models`
- Specific Model: `http://localhost:1234/api/v0/models/{modelId}`
Connection errors:
- Ensure LM Studio server is running and accessible
- Check firewall settings
- Verify the endpoint URL is correct
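A quick way to verify the server is reachable from code, assuming the default port (this check is not part of the sample):

```csharp
// Requires: using System; using System.Net.Http;
using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(5) };
try
{
    var response = await http.GetAsync("http://localhost:1234/api/v0/models");
    Console.WriteLine(response.IsSuccessStatusCode
        ? "LM Studio server is reachable."
        : $"Server responded with HTTP {(int)response.StatusCode}.");
}
catch (HttpRequestException ex)
{
    Console.WriteLine($"Could not reach the server: {ex.Message}");
}
```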
Invalid model name:
- Check that the specified model is loaded in LM Studio
- Use `GetAllModelsAsync()` to see available models
- Ensure the model ID matches exactly (case-sensitive)
Empty responses:
- Verify the system prompt and user message are appropriate
- Check if the model is actually loaded (`GetLoadedModelsAsync()`)
- Review timeout settings
Image not found errors:
- Verify image file paths are correct and absolute
- Check file permissions
- Ensure image format is supported
Slow responses:
- Consider using a smaller model
- Optimize LM Studio settings (GPU layers, context size)
- Increase timeout for non-streaming requests
- Use streaming mode for better perceived performance
Embedding dimension mismatch:
- Ensure you're using the same embedding model for all texts being compared
- Different models produce different dimension vectors
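A cheap guard before comparing vectors (illustrative; the variable names are placeholders):

```csharp
// Vectors produced by different embedding models usually differ in length.
if (queryEmbedding.Length != docEmbedding.Length)
    throw new InvalidOperationException(
        $"Embedding dimensions differ ({queryEmbedding.Length} vs {docEmbedding.Length}); " +
        "regenerate both with the same embedding model.");
```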
Model not loaded errors:
- Load the required model type in LM Studio before use
- Check model state with `GetModelInfoAsync()`
- VLMs are required for image processing
- Embedding models are required for embeddings
- Streaming vs Non-Streaming: Use streaming for better user experience on longer responses
- Timeout Settings: Set shorter timeouts (15s) for streaming, longer (200s) for non-streaming
- Batch Embeddings: Process multiple texts together when possible
- Image Size: Larger images increase processing time and memory usage
- Context Length: Monitor token usage against model's max context length
- Model Size: Smaller quantized models (Q4, Q5) are faster but may have lower quality
- Uses standard .NET libraries for HTTP communication and JSON handling
- Compatible with LM Studio and other OpenAI-compatible API services
- Supports OpenAI Chat Completions API format
- Follows LM Studio's API conventions for embeddings and model discovery
This example application is provided for educational purposes. Use appropriate error handling and security measures for production applications. Always validate user input and sanitize file paths before processing.