Skip to content

vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wider range of multimedia objects such as audio, video, images or 3d models.

License

Notifications You must be signed in to change notification settings

vitrivr/vitrivr-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


vitrivr log


vitrivr-engine logo


GitHub Release License issues - vitrivr-engine

vitrivr's next-generation retrieval engine.

Read The Docs

vitrivr-engine

vitrivr-engine is vitrivr's next generation retrieval engine with a flexible, modular architecture. Based on the experiences with its predecessor, Cineast, vitrivr engine's data model, ingestion pipeline and retrieval logic have been reworked from the ground. Essentially, vitrivr-engine enables the analysis (i.e. ingestion) and querying (i.e. retrieval ) of multimedia data.

Built With

Getting Started and Example Usage

This guide shows how to build and run vitrivr-engine either by manually building it or using the Docker setup.

Getting Started

This guide shows how to build and run vitrivr-engine using the provided ZIP distribution and a schema configuration file.

Prerequisites


Build and Extract
  1. Clone the repository and build the distribution:

    git clone https://github.com/vitrivr/vitrivr-engine.git
    cd vitrivr-engine
    ./gradlew distZip
  2. Unzip the distribution:

    unzip -d vitrivr-engine-server/build/distributions/vitrivr-engine-server-*.zip
  3. Prepare a folder structure with your media (e.g. sandbox/media/):

    vitrivr-engine/
    └── vitrivr-engine-server-<version>/
        ├── bin/
        └── lib/
    └── sandbox/
        └── media/
            ├── image1.png
            └── video.mp4
    
  4. Clone and run the external feature extraction service if you want to use CLIP or DINO for feature extraction:

    git clone https://github.com/vitrivr/vitrivr-python-descriptor-server
    cd vitrivr-python-descriptor-server
    python3 -m venv features
    source features/bin/activate
    pip install -r requirements.txt
    python3 main.py

Docker Setup

Alternatively, you can use the Docker image to run vitrivr-engine: With the help of the docker-compose file, you can set up a PostgreSQL database with pgVector, an external instance to extract features and the vitrivr-engine server. Note: The video feature extraction using the docker image is currently not supported when running on Apple Silicon (e.g., M1, M2, M3) due to architecture compatibility issues.

Run the following command to start the Docker containers:

docker compose up --build

This command will start the PostgreSQL database, the feature extraction service, and the vitrivr-engine server. All configuration files are mounted from the vitrivr-engine-server/config directory, so you can easily modify them and restart the containers to apply changes. Also, all data is mounted to the vitrivr-engine-server/data directory, so you can easily access the data files.

Setup Configuration

Create a config-schema.json file to define your schema, fields, and backend. Make sure to adjust the database connection parameters to match your PostgreSQL or CottontailDB setup. The following example uses PostgreSQL with pgVector and defines fields for image and video ingestion, including a CLIP feature extractor and a thumbnail exporter.

If you are using Docker, please review the config files in example-configs/docker.

config-schema.json
{
  "schemas": {
    "sandbox": {
      "connection": {
        "database": "PgVectorConnectionProvider",
        "parameters": {
          "host": "127.0.0.1",
          "port": "5432",
          "database": "postgres",
          "username": "postgres",
          "password": "secret"
        }
      },
      "fields": {
        "averagecolor": {
          "factory": "AverageColor"
        },
        "file": {
          "factory": "FileSourceMetadata"
        },
        "time": {
          "factory": "TemporalMetadata"
        },
        "clip": {
          "factory": "CLIP",
          "parameters": {
            "host": "http://127.0.0.1:8888"
          }
        }
      },
      "resolvers": {
        "disk": {
          "factory": "DiskResolver",
          "parameters": {
            "location": "./sandbox/thumbnails/",
            "mimeType": "JPG"
          }
        }
      },
      "exporters": {
        "thumbnail": {
          "factory": "ThumbnailExporter",
          "parameters": {
            "maxSideResolution": "400",
            "mimeType": "JPG",
            "resolver": "disk"
          }
        }
      },
      "extractionPipelines": {
        "image": {
          "path": "/vitrivr-engine/example-configs/native/image-ingest.json"
        },
        "video": {
          "path": "/vitrivr-engine/example-configs/native/video-ingest.json"
        }
      }
    }
  }
}


Setup Pipelines

Depending on your media type, you can create individual ingestion pipelines for images and videos. The following example shows how to set up pipelines for image and video ingestion: The fields used here are defined in the config-schema.json file, and the pipelines are already defined in the extractionPipelines section of the config-schema.json.

image-ingest.json
{
  "schema": "sandbox",
  "context": {
    "contentFactory": "InMemoryContentFactory",
    "resolvers": [
      "disk"
    ],
    "local": {
      "enumerator": {
        "path": "./media",
        "depth": "1"
      },
      "thumbnail": {
        "maxSideResolution": "400",
        "mimeType": "JPG"
      }
    }
  },
  "operators": {
    "enumerator": {
      "type": "ENUMERATOR",
      "factory": "FileSystemEnumerator",
      "mediaTypes": [
        "IMAGE"
      ]
    },
    "decoder": {
      "type": "DECODER",
      "factory": "ImageDecoder"
    },
    "persister": {
      "type": "TRANSFORMER",
      "factory": "PersistRetrievableTransformer"
    },
    "thumbnail": {
      "type": "EXPORTER",
      "exporterName": "thumbnail"
    },
    "clip": {
      "type": "EXTRACTOR",
      "fieldName": "clip"
    },
    "file": {
      "type": "EXTRACTOR",
      "fieldName": "file"
    },
    "averagecolor": {
      "type": "EXTRACTOR",
      "fieldName": "averagecolor"
    }
  },
  "operations": {
    "enumerator": {
      "operator": "enumerator"
    },
    "decoder": {
      "operator": "decoder",
      "inputs": [
        "enumerator"
      ]
    },
    "persist": {
      "operator": "persister",
      "inputs": [
        "decoder"
      ]
    },
    "thumbnail": {
      "operator": "thumbnail",
      "inputs": [
        "persist"
      ]
    },
    "clip": {
      "operator": "clip",
      "inputs": [
        "thumbnail"
      ]
    },
    "averagecolor": {
      "operator": "averagecolor",
      "inputs": [
        "clip"
      ]
    },
    "file": {
      "operator": "file",
      "inputs": [
        "averagecolor"
      ]
    }
  },
  "output": [
    "file"
  ]
}

video-ingest.json
{
  "schema": "sandbox",
  "context": {
    "contentFactory": "InMemoryContentFactory",
    "resolvers": [
      "disk"
    ],
    "local": {
      "enumerator": {
        "path": "/sandbox/media",
        "depth": "1"
      },
      "thumbnail": {
        "maxSideResolution": "400",
        "mimeType": "JPG"
      },
      "filter": {
        "type": "SOURCE:VIDEO"
      },
      "decoder": {
        "timeWindowMs": "5000"
      }
    }
  },
  "operators": {
    "enumerator": {
      "type": "ENUMERATOR",
      "factory": "FileSystemEnumerator",
      "mediaTypes": [
        "VIDEO"
      ]
    },
    "decoder": {
      "type": "DECODER",
      "factory": "VideoDecoder"
    },
    "persister": {
      "type": "TRANSFORMER",
      "factory": "PersistRetrievableTransformer"
    },
    "aggregator": {
      "type": "TRANSFORMER",
      "factory": "LastContentAggregator"
    },
    "time": {
      "type": "EXTRACTOR",
      "fieldName": "time"
    },
    "thumbnail": {
      "type": "EXPORTER",
      "exporterName": "thumbnail"
    },
    "clip": {
      "type": "EXTRACTOR",
      "fieldName": "clip"
    },
    "file": {
      "type": "EXTRACTOR",
      "fieldName": "file"
    },
    "averagecolor": {
      "type": "EXTRACTOR",
      "fieldName": "averagecolor"
    },
    "filter": {
      "type": "TRANSFORMER",
      "factory": "TypeFilterTransformer"
    }
  },
  "operations": {
    "enumerator": {
      "operator": "enumerator"
    },
    "decoder": {
      "operator": "decoder",
      "inputs": [
        "enumerator"
      ]
    },
    "aggregator": {
      "operator": "aggregator",
      "inputs": [
        "decoder"
      ]
    },
    "persist": {
      "operator": "persister",
      "inputs": [
        "aggregator"
      ]
    },
    "thumbnail": {
      "operator": "thumbnail",
      "inputs": [
        "persist"
      ]
    },
    "clip": {
      "operator": "clip",
      "inputs": [
        "thumbnail"
      ]
    },
    "averagecolor": {
      "operator": "averagecolor",
      "inputs": [
        "clip"
      ]
    },
    "time": {
      "operator": "time",
      "inputs": [
        "averagecolor"
      ]
    },
    "file": {
      "operator": "file",
      "inputs": [
        "time"
      ]
    }
  },
  "output": [
    "file"
  ]
}

Start the Server

Navigate to the distribution:

cd ../vitrivr-engine-server-<version> 
./bin/vitrivr-engine-server config-schema.json

Inside the interactive CLI:

v> sandbox init         # Initializes the schema (sandbox = your schema name)
v> sandbox about         # Check the initialization state the schema (sandbox = your schema name)

If you want to ingest media files, you can use the following commands:

v> sandbox extract -n image # Ingest all images through the specified pipeline
v> sandbox extract -n video # Ingest all videos through the specified pipeline

Or if you are using Docker, you can run the following command:

docker attach vitrivr-engine-vitrivr-engine-1
sandbox init
sandbox extract -n image
sandbox extract -n video

You can now access the REST API at http://localhost:7070 and query the system.


Example Query

Use the OpenAPI Swagger UI or send a query. A simple example query to retrieve the multimedia content based on a text input could look like this. For this, the CLIP feature is used. Inside input-text, you can provide a description of the content you want to retrieve, such as "an orange starfish on the seafloor". The query will return the relevant media files based on the CLIP feature extraction.

```json
{
  "inputs": {
    "input-text": {"type": "TEXT", "data": "an orange starfish on the seafloor"}
  },
  "operations": {
    "clip" : {"type": "RETRIEVER", "field": "clip", "input": "input-text"},
    "relations" : {"type": "TRANSFORMER", "transformerName": "RelationExpander", "input": "clip"},
    "lookup" : {"type": "TRANSFORMER", "transformerName": "FieldLookup", "input": "relations"},
    "aggregator" : {"type": "TRANSFORMER", "transformerName": "ScoreAggregator",  "input": "lookup"},
    "filelookup" : {"type": "TRANSFORMER", "transformerName": "FieldLookup", "input": "aggregator"}
  },
  "context": {
    "global": {
      "limit": "1000"
    },
    "local" : {
      "lookup":{"field": "time", "keys": "start, end"},
      "relations" :{"outgoing": "partOf"},
      "filelookup": {"field": "file", "keys": "path"}
    }
  },
  "output": "filelookup"
}

Project Structure

The project is set up as a multi-module Kotlin project:

Module Description Maven Dependency
vitrivr-engine-core The core library of the project, which provides basic interfaces & classes. Yes
vtirivr-engine-query Query / retrieval related extension to the core library with various retrievers and data manipulation operators. Yes
vitrivr-engine-index Indexing / ingestion related extension to the core library with various decoders and segmenters. Yes
vitrivr-engine-module-cottontaildb The database driver for the CottontailDB database, used for NNNS and other queries. Yes
vitrivr-engine-module-features Extension that contains specific indexing and retrieval implementations such as fulltext, colour, etc. Yes
vitrivr-engine-module-m3d Extension related to 3d model indexing and retrieval. Contains various feature modules and capability to process meshes. Yes
vitrivr-engine-module-fes Extension that can be used to harnes feature extraction provided by an external ML model server. Requires local generation of bindings: ./gradlew :vitrivr-engine-module-fes:generateFESClient Yes
vitrivr-engine-server A Javalin powered server providing an OpenApi documented REST API for both, ingestion and querying and a CLI, essentially the runtime of the project No

Contributing

We welcome contributors. Please fork the repo and open a pull-request with your work. A good starting point are the 'good first issue' issues.

Contributors

  • @ppanopticon
  • @lucaro
  • @sauterl
  • @net-csscience-raphel
  • @rahelarnold98
  • @faberf

Citation

See Citation

About

vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wider range of multimedia objects such as audio, video, images or 3d models.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 8

Languages