🔍 OCEL Log Filtering Application

A web application for filtering Object-Centric Event Logs (OCEL) with natural language. Describe a filter in plain English, apply it, and explore the resulting process mining data interactively.

📚 Table of Contents

Overview
Key Features
Technology Stack
Getting Started
Usage Guide
Filter Types
API Documentation
Session Management
Resource Management
Evaluation Feedback
File Format Support
Troubleshooting
Contributing

📑 Overview

The OCEL Log Filtering application bridges the gap between complex process mining data and natural language interaction. It leverages Large Language Models (LLMs) to interpret user queries and automatically apply appropriate filters to extract insights from Object-Centric Event Logs (OCEL).

OCEL is a standard format for storing event logs that capture process-oriented data with complex object interactions. Unlike traditional event logs that focus on a single case notion, OCEL allows for multiple object types and their relationships, enabling more comprehensive process mining analysis.
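To make the contrast with single-case logs concrete, here is a minimal sketch in plain Python. The identifiers and the dictionary layout are illustrative assumptions, not the OCEL 2.0 schema and not the application's internal model. It shows one event belonging to the "view" of several object types at once.

```python
from collections import defaultdict

# Hypothetical toy data: object id -> object type.
objects = {
    "o1": "orders",
    "i1": "items",
    "i2": "items",
    "p1": "packages",
}

# Each event references several objects, possibly of different types.
events = [
    {"id": "e1", "activity": "create order", "objects": ["o1", "i1", "i2"]},
    {"id": "e2", "activity": "create package", "objects": ["i1", "i2", "p1"]},
    {"id": "e3", "activity": "pay order", "objects": ["o1"]},
]


def events_per_object_type(events, objects):
    """Group event ids by the type of every object they touch."""
    grouped = defaultdict(list)
    for event in events:
        # An event is listed once per object type, even if it touches
        # several objects of that type.
        for obj_type in sorted({objects[o] for o in event["objects"]}):
            grouped[obj_type].append(event["id"])
    return dict(grouped)


views = events_per_object_type(events, objects)
print(views)
```

A traditional log would force a single case notion (for example, only orders) and lose the link between "create package" and the items it packs.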

🚀 Key Features

Natural Language Filtering: Apply complex filters by describing them in plain English
Interactive Visualization: Explore event-object relationships through dynamic, interactive graphs
Multiple Log Formats: Support for OCEL logs in JSON (OCEL 2.0) and XML (OCEL 2.0) formats
Session-Based Storage: Persistent user sessions with automatic resource management
Filter Evaluation: Collection of user feedback to improve filtering effectiveness
Hierarchical Log Organization: View relationships between original and filtered logs
Real-time Statistics: Detailed metrics on events, objects, and relationships

🛠️ Technology Stack

Backend

Python (^3.9): Core programming language
FastAPI (^0.109.0): Modern, high-performance web framework
PM4Py (^2.7.0): Process Mining library with OCEL support
Pydantic (^2.6.0): Data validation and settings management
OpenAI API (^1.65.5): For natural language processing of filter requests
Uvicorn: ASGI server for serving FastAPI applications
Asyncio: For asynchronous operations and performance

Frontend

Astro (^5.5.2): Web framework with view transitions support
React (^19.0.0): For interactive UI components
Tailwind CSS (^3.4.17): Utility-first CSS framework
ReactFlow (^11.11.4): For interactive graph visualizations
Axios (^1.8.3): Promise-based HTTP client
SweetAlert2 (^11.6.13): For beautiful, responsive alert dialogs

Deployment

Docker and Docker Compose: For containerization and orchestration
Nginx: As a reverse proxy for routing requests
Environment Variables: For configuration management

🚦 Getting Started

Prerequisites

Docker and Docker Compose installed
OpenAI API key for LLM functionality
250MB+ of available storage space

Installation

Clone the repository:

git clone https://github.com/isa-group/llms-ocel.git
cd llms-ocel

Configure the environment variables:

# Copy the example environment file and customize it
cp .env.example .env

# Edit the .env file with your settings (especially the OpenAI API key)
nano .env

Start the application with Docker Compose:

docker-compose up -d

Access the application:
- Frontend: http://localhost
- API: http://localhost/api

Configuration

The application is highly configurable through environment variables in the .env file. Here are the key configuration options:

Server Settings

Variable	Description	Default
HOST	Host IP to bind the server	0.0.0.0
PORT	Port for the backend server	8000
DEBUG	Enable debug mode	False
CORS_ORIGINS	Comma-separated list of allowed origins	http://localhost:80,...

LLM Configuration

Variable	Description	Default
LLM_API_KEY	OpenAI API Key	(required)
LLM_MODEL	LLM model to use	gpt-3.5-turbo
LLM_TEMPERATURE	Temperature for LLM responses (0.0-1.0)	0.1
LLM_MAX_TOKENS	Maximum tokens for LLM responses	1000

Cache and Session Settings

Variable	Description	Default
CACHE_TTL	Time-to-live for each cached log, in seconds	900 (15 minutes)
CACHE_MAXSIZE	Maximum number of logs in cache	200
SESSION_TTL_SECONDS	Backend session duration in seconds	86400 (24 hours)
SESSION_SIZE_LIMIT	Max storage per session in bytes	262144000 (250 MB)
SESSION_LOG_LIMIT	Maximum logs per session	20

Export Settings

Variable	Description	Default
EXPORT_TIMEOUT	Timeout for export operations in seconds	30

Frontend Settings

Variable	Description	Default
API_URL	URL for the backend API (used by frontend)	http://backend:8000

To apply configuration changes, update the .env file and restart the application using docker-compose down followed by docker-compose up -d.
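As a rough illustration of how the variables above might be consumed, the following sketch reads them with the documented defaults. The function names `load_settings` and `missing_required`, the dictionary keys, and the accepted spellings of a true `DEBUG` value are assumptions made for this example; the application's actual settings code may differ.

```python
import os


def load_settings(env=None):
    """Read a few settings, falling back to the defaults listed in the tables."""
    env = os.environ if env is None else env
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
        # Assumption: these spellings count as "true".
        "debug": env.get("DEBUG", "False").strip().lower() in ("1", "true", "yes"),
        "llm_api_key": env.get("LLM_API_KEY"),  # required, no default
        "llm_model": env.get("LLM_MODEL", "gpt-3.5-turbo"),
        "llm_temperature": float(env.get("LLM_TEMPERATURE", "0.1")),
        "llm_max_tokens": int(env.get("LLM_MAX_TOKENS", "1000")),
        "cache_ttl": int(env.get("CACHE_TTL", "900")),
        "cache_maxsize": int(env.get("CACHE_MAXSIZE", "200")),
        "session_ttl_seconds": int(env.get("SESSION_TTL_SECONDS", "86400")),
        "session_size_limit": int(env.get("SESSION_SIZE_LIMIT", "262144000")),
        "session_log_limit": int(env.get("SESSION_LOG_LIMIT", "20")),
        "export_timeout": int(env.get("EXPORT_TIMEOUT", "30")),
    }


def missing_required(settings):
    """Return the names of required settings that are unset or empty."""
    return [name for name in ("llm_api_key",) if not settings.get(name)]


# Pass a plain dict instead of os.environ to try the function in isolation.
settings = load_settings({"PORT": "9000", "DEBUG": "true"})
print(settings["port"], settings["debug"], missing_required(settings))
```

Checking for the missing API key at startup gives a clearer failure than a rejected LLM request later on.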

📋 Usage Guide

Uploading Logs

From the dashboard, click "Upload New Log" or navigate to the upload section
Select an OCEL file in JSON or XML format from your device
Click "Upload" to process the file
Review the log statistics displayed after successful upload

Applying Filters

Select a log from the list or continue from your upload
Navigate to the "Filter" tab
Enter a natural language description of the filter you want to apply, for example "Show me orders created in January 2023" or "Filter events with activity equals pay order"
Click "Apply Filter" to process your request
Review the filter results and provide feedback on filter accuracy

Visualizing Results

Select any log from your list
Navigate to the "Visualization" tab
Interact with the graph representation:
- Zoom and pan to explore the data
- Toggle visibility of different object types
- View relationships between events and objects
- Adjust the visualization settings for clarity

Exporting Results

From the log details view, click the "Download" button
The file will be downloaded to your device

🔧 Filter Types

The application supports various filter strategies that can be applied through natural language:

Filter Type	Description	Example
Activity Filters	Filter events based on their activity values	"Show only events with activity 'create order'"
Timestamp Filters	Filter events based on date ranges	"Show events between January 1st and March 31st, 2024"
Object Type Filters	Filter based on object types	"Only include objects of type 'orders'"
Activity-Type Matching	Filter by specific activity and object type combinations	"Show 'create package' activities only for 'packages' objects"
Object Count Filters	Filter events based on number of related objects	"Show events related to at least 3 items"
Lifecycle Start Events	Show only first events in object lifecycles	"Show only the first events for each order object"
Lifecycle End Events	Show only last events in object lifecycles	"Show only the last events for each order object"
Chain Filters	Combine multiple filters for complex queries	"Show orders from January 2024 with more than 2 items"
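The semantics of several filter types in the table can be sketched on a toy in-memory log. This is an illustration under stated assumptions: the event layout, the function names, and the data are hypothetical, and the application itself relies on PM4Py rather than on code like this.

```python
from datetime import datetime

# Hypothetical toy events; "objects" maps object id -> object type.
EVENTS = [
    {"id": "e1", "activity": "create order", "time": datetime(2024, 1, 5),
     "objects": {"o1": "orders", "i1": "items", "i2": "items", "i3": "items"}},
    {"id": "e2", "activity": "pay order", "time": datetime(2024, 1, 20),
     "objects": {"o1": "orders"}},
    {"id": "e3", "activity": "create order", "time": datetime(2024, 4, 2),
     "objects": {"o2": "orders", "i4": "items"}},
]


def by_activity(events, activities):
    """Activity filter: keep events whose activity is in the given set."""
    return [e for e in events if e["activity"] in activities]


def by_time_range(events, start, end):
    """Timestamp filter: keep events with start <= time <= end."""
    return [e for e in events if start <= e["time"] <= end]


def by_object_count(events, object_type, minimum):
    """Object count filter: keep events related to at least `minimum` objects of a type."""
    return [
        e for e in events
        if sum(1 for t in e["objects"].values() if t == object_type) >= minimum
    ]


def first_events_per_object(events, object_type):
    """Lifecycle start filter: keep the earliest event of each object of a type."""
    first = {}
    for e in sorted(events, key=lambda e: e["time"]):
        for obj_id, t in e["objects"].items():
            if t == object_type:
                first.setdefault(obj_id, e["id"])
    keep = set(first.values())
    return [e for e in events if e["id"] in keep]


def chain(events, *filters):
    """Chain filter: apply each filter to the output of the previous one."""
    for f in filters:
        events = f(events)
    return events


# "Show orders from January 2024 with more than 2 items"
result = chain(
    EVENTS,
    lambda ev: by_time_range(ev, datetime(2024, 1, 1), datetime(2024, 1, 31)),
    lambda ev: by_object_count(ev, "items", 3),
)
print([e["id"] for e in result])
```

Because each filter takes a list of events and returns a list of events, a chain is just function composition, and no intermediate log needs to be stored.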

📚 API Documentation

The API documentation is automatically generated and available at:

Swagger UI: http://localhost/api/docs
- Interactive documentation to explore and test API endpoints
ReDoc: http://localhost/api/redoc
- More readable version of the API documentation

🔐 Session Management

The application uses a multi-layer session-based approach to manage user resources:

Backend Session Storage:

Sessions persist in the backend cache for 24 hours by default
Each session has a storage limit (default: 250MB)
Logs automatically expire after 15 minutes of inactivity
The application provides notifications when logs are about to expire

Frontend Session Management:

The browser session cookie remains valid for 2 hours
After this period, you may need to refresh the page to restore the session
The session ID is maintained across page refreshes within this timeframe

These time limits are designed to balance resource usage with user convenience:

The 15-minute log expiration ensures efficient resource allocation
The 2-hour frontend session matches typical working sessions
The 24-hour backend persistence allows returning to your work the next day

You'll receive notifications before logs expire, giving you time to download results or to access the log again, which resets its expiration timer.
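The inactivity-based expiration described above can be modelled roughly as follows. The class `ExpiringLogStore`, its method names, and the 120-second warning window are assumptions chosen for the example; only the 15-minute default comes from this document.

```python
import time


class ExpiringLogStore:
    """Toy store where each log expires after `ttl` seconds without access."""

    def __init__(self, ttl=900, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._last_access = {}

    def add(self, log_id):
        self._last_access[log_id] = self.clock()

    def seconds_left(self, log_id):
        """Remaining lifetime, or 0 if the log is unknown or already expired."""
        last = self._last_access.get(log_id)
        if last is None:
            return 0
        return max(0, self.ttl - (self.clock() - last))

    def access(self, log_id):
        """Touch a live log to reset its timer; return False if it has expired."""
        if self.seconds_left(log_id) <= 0:
            self._last_access.pop(log_id, None)
            return False
        self._last_access[log_id] = self.clock()
        return True

    def expiring_soon(self, warn_before=120):
        """Ids of live logs with at most `warn_before` seconds left (hypothetical window)."""
        return [
            log_id for log_id in self._last_access
            if 0 < self.seconds_left(log_id) <= warn_before
        ]


# A simulated clock lets the example run instantly.
now = [0]
store = ExpiringLogStore(ttl=900, clock=lambda: now[0])
store.add("log-a")
now[0] = 800                      # close to expiry: a warning is due
print(store.expiring_soon())
print(store.access("log-a"))      # accessing the log resets its timer
now[0] = 1600                     # 800 seconds after the last access
print(store.seconds_left("log-a"))
```

Injecting the clock keeps the expiry logic testable without waiting 15 minutes.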

📊 Resource Management

To ensure optimal performance and fair resource allocation:

Monitor your session storage usage in the top-right corner
Delete unused logs to free up session storage space
Download important results to your local system for long-term storage
Use filter chains to create efficient workflows without intermediate logs
Pay attention to expiration notifications for logs you're actively working with

Key resource limits (configurable in .env):

Maximum logs per session: 20 by default
Storage limit per session: 250MB by default
Maximum number of logs in cache: 200 by default
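A sketch of how the per-session limits could be checked before storing a new log. The `SessionUsage` class and its return values are hypothetical; the two constants mirror the documented defaults for SESSION_SIZE_LIMIT and SESSION_LOG_LIMIT.

```python
from dataclasses import dataclass, field

SESSION_SIZE_LIMIT = 262_144_000  # bytes (250 MB), documented default
SESSION_LOG_LIMIT = 20            # documented default maximum logs per session


@dataclass
class SessionUsage:
    """Tracks the size in bytes of each log stored in one session."""
    logs: dict = field(default_factory=dict)

    @property
    def used_bytes(self):
        return sum(self.logs.values())

    def can_store(self, size_bytes):
        """Return (ok, reason) for adding one more log of the given size."""
        if len(self.logs) >= SESSION_LOG_LIMIT:
            return False, "log limit reached"
        if self.used_bytes + size_bytes > SESSION_SIZE_LIMIT:
            return False, "storage limit reached"
        return True, "ok"


usage = SessionUsage()
usage.logs["original"] = 200 * 1024 * 1024   # 200 MB already stored
print(usage.can_store(40 * 1024 * 1024))     # 240 MB total fits
print(usage.can_store(60 * 1024 * 1024))     # 260 MB total does not
```

Every filtered log counts toward both limits, which is why deleting unused logs frees room for new filters.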

🗣️ Evaluation Feedback

Your feedback helps improve the filtering mechanism:

After applying a filter, you'll be prompted to evaluate its effectiveness
Rate how accurately the filter interpreted your natural language description
Provide additional comments if needed
Your feedback is stored and used to enhance future filtering accuracy

📁 File Format Support

The application supports the following OCEL formats:

JSON (OCEL 2.0): Default format with full support for all features
XML (OCEL 2.0): Alternative format with equivalent functionality
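Before uploading, a file can be sanity-checked locally with a few lines of Python. This is a sketch under assumptions: the helper `precheck_ocel_file` is hypothetical, the four top-level keys are what an OCEL 2.0 JSON document is assumed to contain here, and passing the check does not guarantee that the application will accept the file.

```python
import json
import tempfile
from pathlib import Path

# Top-level keys assumed for an OCEL 2.0 JSON document in this sketch.
EXPECTED_KEYS = {"objectTypes", "eventTypes", "objects", "events"}


def precheck_ocel_file(path):
    """Return a list of problems found; an empty list means the file looks plausible."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in (".json", ".xml"):
        return [f"unsupported extension: {suffix or '(none)'}"]
    if suffix == ".xml":
        return []  # XML content is not inspected in this sketch
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        return [f"cannot read JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    missing = sorted(EXPECTED_KEYS - data.keys())
    return [f"missing key: {key}" for key in missing]


# Try it on a deliberately incomplete file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "log.json"
    sample.write_text(json.dumps({"objectTypes": [], "events": []}), encoding="utf-8")
    problems = precheck_ocel_file(sample)
print(problems)
```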

❓ Troubleshooting

Session Issues

If you encounter session errors, refresh the page to generate a new session
Clear your browser cache if persistent session problems occur
Note that frontend sessions expire after 2 hours, while backend data persists for 24 hours

Upload Problems

Ensure your OCEL file follows the proper standard format
Large files may take longer to process; please be patient
Check that your file has a valid .json or .xml extension
Verify that your session hasn't reached the maximum log limit (default: 20)

Filter Application Errors

Try to be more specific in your filter description
Avoid complex or ambiguous language
Review the available filter types and examples for guidance
Check that your LLM_API_KEY is valid in the .env file

Visualization Performance

Large logs will be automatically sampled for visualization
Adjust the maximum relations setting for better performance
Toggle visibility of specific object types to reduce complexity
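The sampling strategy used by the application is not documented here. As one possible interpretation, the sketch below caps the number of event-object relations drawn by taking a seeded random sample; the function name, the pair layout, and the seed are assumptions.

```python
import random


def sample_relations(relations, max_relations, seed=0):
    """Return at most `max_relations` (event_id, object_id) pairs, chosen at random."""
    if len(relations) <= max_relations:
        return list(relations)
    rng = random.Random(seed)  # fixed seed keeps the picture stable between redraws
    return rng.sample(relations, max_relations)


# Hypothetical relations: 1000 events spread over 7 objects.
relations = [(f"e{i}", f"o{i % 7}") for i in range(1000)]
subset = sample_relations(relations, max_relations=200)
print(len(subset))
```

A lower maximum makes the graph faster to render at the cost of hiding some relations, so filtering the log first is often preferable to heavy sampling.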

Log Expiration

Logs automatically expire after 15 minutes of inactivity
You'll receive notifications before logs expire
Access a log to reset its expiration timer

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
