A powerful web application for filtering Object-Centric Event Logs using natural language processing. Apply complex filters with simple natural language commands and visualize process mining data interactively.
## Table of Contents

- Overview
- Key Features
- Technology Stack
- Getting Started
- Usage Guide
- Filter Types
- API Documentation
- Session Management
- Resource Management
- Evaluation Feedback
- File Format Support
- Troubleshooting
- Contributing
## Overview

The OCEL Log Filtering application bridges the gap between complex process mining data and natural language interaction. It leverages Large Language Models (LLMs) to interpret user queries and automatically apply appropriate filters to extract insights from Object-Centric Event Logs (OCEL).
OCEL is a standard format for storing event logs that capture process-oriented data with complex object interactions. Unlike traditional event logs that focus on a single case notion, OCEL allows for multiple object types and their relationships, enabling more comprehensive process mining analysis.
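To make the difference concrete, here is a minimal sketch in plain Python (hypothetical field names, not the OCEL schema or the PM4Py API) of an event that relates to several objects of different types at once:

```python
# A hypothetical, simplified OCEL-style event: unlike a classic event log,
# where each event belongs to exactly one case, an OCEL event can relate
# to several objects of different types at once.
event = {
    "id": "e1",
    "activity": "create order",
    "timestamp": "2023-01-15T10:30:00",
    # One event, three related objects of two different types:
    "relations": [
        {"object_id": "o1", "object_type": "orders"},
        {"object_id": "i1", "object_type": "items"},
        {"object_id": "i2", "object_type": "items"},
    ],
}

# Grouping the relations by object type shows the multi-object view:
by_type = {}
for rel in event["relations"]:
    by_type.setdefault(rel["object_type"], []).append(rel["object_id"])

print(by_type)  # {'orders': ['o1'], 'items': ['i1', 'i2']}
```

A traditional event log would force this event into a single case; the object-centric view keeps all three relationships.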
## Key Features

- Natural Language Filtering: Apply complex filters by describing them in plain English
- Interactive Visualization: Explore event-object relationships through dynamic, interactive graphs
- Multiple Log Formats: Support for OCEL 2.0 logs in both JSON and XML formats
- Session-Based Storage: Persistent user sessions with automatic resource management
- Filter Evaluation: Collection of user feedback to improve filtering effectiveness
- Hierarchical Log Organization: View relationships between original and filtered logs
- Real-time Statistics: Detailed metrics on events, objects, and relationships
## Technology Stack

### Backend

- Python (^3.9): Core programming language
- FastAPI (^0.109.0): Modern, high-performance web framework
- PM4Py (^2.7.0): Process Mining library with OCEL support
- Pydantic (^2.6.0): Data validation and settings management
- OpenAI API (^1.65.5): For natural language processing of filter requests
- Uvicorn: ASGI server for serving FastAPI applications
- Asyncio: For asynchronous operations and performance
### Frontend

- Astro (^5.5.2): Web framework with view transitions support
- React (^19.0.0): For interactive UI components
- Tailwind CSS (^3.4.17): Utility-first CSS framework
- ReactFlow (^11.11.4): For interactive graph visualizations
- Axios (^1.8.3): Promise-based HTTP client
- SweetAlert2 (^11.6.13): For beautiful, responsive alert dialogs
### Infrastructure

- Docker and Docker Compose: For containerization and orchestration
- Nginx: As a reverse proxy for routing requests
- Environment Variables: For configuration management
## Getting Started

### Prerequisites

- Docker and Docker Compose installed
- OpenAI API key for LLM functionality
- 250MB+ of available storage space
### Installation

- Clone the repository:

```bash
git clone https://github.com/isa-group/llms-ocel.git
cd llms-ocel
```
- Configure the environment variables:

```bash
# Copy the example environment file and customize it
cp .env.example .env
# Edit the .env file with your settings (especially the OpenAI API key)
nano .env
```
- Start the application with Docker Compose:

```bash
docker-compose up -d
```
- Access the application:
  - Frontend: http://localhost
  - API: http://localhost/api
### Configuration

The application is highly configurable through environment variables in the `.env` file. Here are the key configuration options:
#### Server

| Variable | Description | Default |
|---|---|---|
| `HOST` | Host IP to bind the server | `0.0.0.0` |
| `PORT` | Port for the backend server | `8000` |
| `DEBUG` | Enable debug mode | `False` |
| `CORS_ORIGINS` | Comma-separated list of allowed origins | `http://localhost:80,...` |
#### LLM

| Variable | Description | Default |
|---|---|---|
| `LLM_API_KEY` | OpenAI API key | (required) |
| `LLM_MODEL` | LLM model to use | `gpt-3.5-turbo` |
| `LLM_TEMPERATURE` | Temperature for LLM responses (0.0-1.0) | `0.1` |
| `LLM_MAX_TOKENS` | Maximum tokens for LLM responses | `1000` |
#### Cache and Sessions

| Variable | Description | Default |
|---|---|---|
| `CACHE_TTL` | Time-to-live in seconds for each cached log | `900` (15 minutes) |
| `CACHE_MAXSIZE` | Maximum number of logs in the cache | `200` |
| `SESSION_TTL_SECONDS` | Backend session duration in seconds | `86400` (24 hours) |
| `SESSION_SIZE_LIMIT` | Maximum storage per session in bytes | `262144000` (250 MB) |
| `SESSION_LOG_LIMIT` | Maximum number of logs per session | `20` |
#### Export

| Variable | Description | Default |
|---|---|---|
| `EXPORT_TIMEOUT` | Timeout for export operations in seconds | `30` |
#### Frontend

| Variable | Description | Default |
|---|---|---|
| `API_URL` | URL of the backend API (used by the frontend) | `http://backend:8000` |
To apply configuration changes, update the `.env` file and restart the application with `docker-compose down` followed by `docker-compose up -d`.
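For reference, a minimal `.env` might look like the sketch below, using the variables documented above (all values are the documented defaults except the API key, which you must supply yourself):

```shell
# Server
HOST=0.0.0.0
PORT=8000
DEBUG=False

# LLM (the API key is required; the rest fall back to defaults)
LLM_API_KEY=sk-your-key-here
LLM_MODEL=gpt-3.5-turbo
LLM_TEMPERATURE=0.1
LLM_MAX_TOKENS=1000

# Cache and sessions
CACHE_TTL=900
CACHE_MAXSIZE=200
SESSION_TTL_SECONDS=86400
SESSION_SIZE_LIMIT=262144000
SESSION_LOG_LIMIT=20
```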
## Usage Guide

### Uploading a Log

- From the dashboard, click "Upload New Log" or navigate to the upload section
- Select an OCEL file in JSON or XML format
- Click "Upload" to process the file
- Review the log statistics displayed after a successful upload
### Applying Filters

- Select a log from the list or continue from your upload
- Navigate to the "Filter" tab
- Enter a natural language description of the filter you want to apply, for example "Show me orders created in January 2023" or "Filter events with activity equals pay order"
- Click "Apply Filter" to process your request
- Review the filter results and provide feedback on filter accuracy
### Visualizing a Log

- Select any log from your list
- Navigate to the "Visualization" tab
- Interact with the graph representation:
  - Zoom and pan to explore the data
  - Toggle visibility of different object types
  - View relationships between events and objects
  - Adjust the visualization settings for clarity
### Downloading Results

- From the log details view, click the "Download" button
- The file will be downloaded to your device
## Filter Types

The application supports various filter strategies that can be applied through natural language:
Filter Type | Description | Example |
---|---|---|
Activity Filters | Filter events based on their activity values | "Show only events with activity 'create order'" |
Timestamp Filters | Filter events based on date ranges | "Show events between January 1st and March 31st, 2024" |
Object Type Filters | Filter based on object types | "Only include objects of type 'orders'" |
Activity-Type Matching | Filter by specific activity and object type combinations | "Show 'create package' activities only for 'packages' objects" |
Object Count Filters | Filter events based on number of related objects | "Show events related to at least 3 items" |
Lifecycle Start Events | Show only first events in object lifecycles | "Show only the first events for each order object" |
Lifecycle End Events | Show only last events in object lifecycles | "Show only the last events for each order object" |
Chain Filters | Combine multiple filters for complex queries | "Show orders from January 2024 with more than 2 items" |
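As an illustration of what these strategies do, the sketch below applies a timestamp filter and an object count filter over simplified event dicts, then chains them. This is plain Python for intuition only, not the PM4Py calls the backend actually uses:

```python
from datetime import datetime

# Simplified events; the real backend operates on PM4Py OCEL objects.
events = [
    {"id": "e1", "activity": "create order", "timestamp": datetime(2024, 1, 10),
     "objects": ["o1", "i1", "i2", "i3"]},
    {"id": "e2", "activity": "pay order", "timestamp": datetime(2024, 2, 5),
     "objects": ["o1"]},
    {"id": "e3", "activity": "create order", "timestamp": datetime(2024, 1, 20),
     "objects": ["o2", "i4"]},
]

def timestamp_filter(evts, start, end):
    """Keep events whose timestamp falls inside [start, end]."""
    return [e for e in evts if start <= e["timestamp"] <= end]

def object_count_filter(evts, minimum):
    """Keep events related to at least `minimum` objects."""
    return [e for e in evts if len(e["objects"]) >= minimum]

# Chain filter: "Show orders from January 2024 with more than 2 items"
january = timestamp_filter(events, datetime(2024, 1, 1), datetime(2024, 1, 31))
result = object_count_filter(january, 3)
print([e["id"] for e in result])  # ['e1']
```

The LLM translates the natural language request into such filter steps; chaining simply feeds one filter's output into the next.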
## API Documentation

The API documentation is automatically generated and available at:

- Swagger UI (http://localhost/api/docs): interactive documentation to explore and test API endpoints
- ReDoc (http://localhost/api/redoc): a more readable rendering of the same documentation
## Session Management

The application uses a multi-layer, session-based approach to manage user resources:
### Backend Sessions

- Sessions persist in the backend cache for 24 hours by default
- Each session has a storage limit (default: 250 MB)
- Logs automatically expire after 15 minutes of inactivity
- The application provides notifications when logs are about to expire
### Frontend Sessions

- The browser session cookie remains valid for 2 hours
- After this period, you may need to refresh the page to restore the session
- The session ID is maintained across page refreshes within this timeframe
These time limits are designed to balance resource usage with user convenience:
- The 15-minute log expiration ensures efficient resource allocation
- The 2-hour frontend session matches typical working sessions
- The 24-hour backend persistence allows returning to your work the next day
You'll receive notifications before logs expire, giving you time to save important work by downloading results or refreshing log access.
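The expiration behaviour described above can be pictured with a small time-to-live cache sketch (standard library only; the actual backend cache may be implemented differently):

```python
import time

class TTLCache:
    """Minimal time-to-live cache: entries expire `ttl` seconds after
    their last access, mirroring the 15-minute log expiration above."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, last_access_time)

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, last_access = entry
        if time.monotonic() - last_access > self.ttl:
            del self._store[key]  # expired
            return None
        # Accessing a log resets its expiration timer.
        self._store[key] = (value, time.monotonic())
        return value

cache = TTLCache(ttl_seconds=900)  # 15 minutes, as in CACHE_TTL
cache.put("log-1", "ocel data")
print(cache.get("log-1"))  # 'ocel data' (and the timer is reset)
```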
## Resource Management

To ensure optimal performance and fair resource allocation:
- Monitor your session storage usage in the top-right corner
- Delete unused logs to free up session storage space
- Download important results to your local system for long-term storage
- Use filter chains to create efficient workflows without intermediate logs
- Pay attention to expiration notifications for logs you're actively working with
Key resource limits (configurable in .env):
- Maximum logs per session: 20 by default
- Storage limit per session: 250MB by default
- Maximum number of logs in cache: 200 by default
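As an illustration of how these limits interact, here is a sketch of the check a backend might run before accepting an upload (function and variable names are hypothetical, not the application's actual code):

```python
SESSION_SIZE_LIMIT = 262_144_000  # 250 MB, the documented default
SESSION_LOG_LIMIT = 20            # maximum logs per session, default

def can_accept_upload(session_logs, upload_size_bytes):
    """Return (ok, reason): whether a new log fits the session limits.

    `session_logs` maps log id -> size in bytes; names are illustrative.
    """
    if len(session_logs) >= SESSION_LOG_LIMIT:
        return False, "maximum number of logs per session reached"
    used = sum(session_logs.values())
    if used + upload_size_bytes > SESSION_SIZE_LIMIT:
        return False, "session storage limit would be exceeded"
    return True, "ok"

# A 200 MB upload on top of 100 MB already stored exceeds the 250 MB limit:
ok, reason = can_accept_upload({"log-1": 100_000_000}, 200_000_000)
print(ok, reason)  # False session storage limit would be exceeded
```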
## Evaluation Feedback

Your feedback helps improve the filtering mechanism:
- After applying a filter, you'll be prompted to evaluate its effectiveness
- Rate how accurately the filter interpreted your natural language description
- Provide additional comments if needed
- Your feedback is stored and used to enhance future filtering accuracy
## File Format Support

The application supports the following OCEL formats:
- JSON (OCEL 2.0): Default format with full support for all features
- XML (OCEL 2.0): Alternative format with equivalent functionality
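For orientation, here is a heavily abbreviated sketch of the OCEL 2.0 JSON shape, parsed with the standard library. See the OCEL 2.0 standard for the authoritative schema; some fields are omitted here:

```python
import json

# Abbreviated sketch of the OCEL 2.0 JSON shape; the full standard
# defines additional fields beyond those shown.
raw = """
{
  "objectTypes": [{"name": "orders", "attributes": []}],
  "eventTypes": [{"name": "create order", "attributes": []}],
  "objects": [{"id": "o1", "type": "orders", "attributes": []}],
  "events": [{
    "id": "e1",
    "type": "create order",
    "time": "2023-01-15T10:30:00Z",
    "attributes": [],
    "relationships": [{"objectId": "o1", "qualifier": "order"}]
  }]
}
"""
log = json.loads(raw)
print(len(log["events"]), log["events"][0]["relationships"][0]["objectId"])  # 1 o1
```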
## Troubleshooting

### Session Issues

- If you encounter session errors, refresh the page to generate a new session
- Clear your browser cache if persistent session problems occur
- Note that frontend sessions expire after 2 hours, while backend data persists for 24 hours
### Upload Issues

- Ensure your file follows the OCEL 2.0 standard format
- Large files may take longer to process; please be patient
- Check that your file has a valid .json or .xml extension
- Verify that your session hasn't reached the maximum log limit (default: 20)
### Filtering Issues

- Try to be more specific in your filter description
- Avoid complex or ambiguous language
- Review the available filter types and examples for guidance
- Check that your LLM_API_KEY is valid in the .env file
### Visualization Issues

- Large logs will be automatically sampled for visualization
- Adjust the maximum relations setting for better performance
- Toggle visibility of specific object types to reduce complexity
### Log Expiration

- Logs automatically expire after 15 minutes of inactivity
- You'll receive notifications before logs expire
- Access a log to reset its expiration timer
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request