The tracerboy API is an experimental event tracking and reporting web service, built as part of an assignment at Narrative.
Since this application is not yet published as a Docker image or packaged in some other way, I'm assuming that eager engineers will have basic tooling installed on their machines, namely SBT and Docker. The developer experience can be further enhanced by using Nix; it is, however, not a mandatory requirement.
This project uses Postgres with the TimescaleDB extension to store all analytical events, together with continuous aggregates backed by hypertables. The application is bundled with embedded Flyway, which automatically detects the state of the database and executes the appropriate migrations when the application boots or reloads.
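To illustrate how hypertables and real-time continuous aggregates fit together, here is a minimal sketch of what such a Flyway migration could look like. The table, view, and column names are hypothetical and not the project's actual schema:

```sql
-- Hypothetical migration sketch; names and columns are illustrative only.
CREATE TABLE events (
  "time" TIMESTAMPTZ NOT NULL,
  "user" TEXT        NOT NULL,
  event  TEXT        NOT NULL
);

-- Turn the plain table into a TimescaleDB hypertable, partitioned by time.
SELECT create_hypertable('events', 'time');

-- A real-time continuous aggregate: hourly event counts per event type.
CREATE MATERIALIZED VIEW events_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', "time") AS bucket,
       event,
       COUNT(*) AS total
FROM events
GROUP BY bucket, event;
```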
Make sure that the application can access a Postgres instance, or use the Docker Compose wrapper script `./bin/tracerboy-dev.sh` to boot one up. This will create a new Docker container with Postgres and the TimescaleDB extension installed and configured. It will also set the username to `tb`, the password to `tb`, and initialise a new empty database named `tb`.
```shell
./bin/tracerboy-dev.sh up pg
```

Boot up the application with `sbt run` and the `DATABASE_URL` environment variable preconfigured:

```shell
# export DATABASE_URL="jdbc:postgresql://localhost:5432/tb?user=tb&password=tb"
sbt run
```

Or build a Docker image and boot up the application with the help of Docker Compose:
```shell
sbt docker:publishLocal
./bin/tracerboy-dev.sh up tracerboy tracerboy-gw
```

If you wish to scale the number of replicas up or down, use something along the following lines:

```shell
./bin/tracerboy-dev.sh up -d
./bin/tracerboy-dev.sh up --scale tracerboy=3 -d
./bin/tracerboy-dev.sh stop # To stop everything
```

When interacting with the service via Docker, please use port 4000, as opposed to port 9090 when running natively on the host operating system.
```shell
curl -D - --request POST \
  127.0.0.1:4000/analytics\?timestamp=1662981405\&user=Oto+Brglez\&event=click
```

The main endpoint for accepting tracking information has the following query parameters:

```
POST /analytics?timestamp={millis_since_epoch}&user={user_id}&event={click|impression}
```
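The tracking request can also be composed in a small shell snippet. This is only a sketch: the host/port and the user/event values are placeholders taken from the curl example, and note that the curl example passes seconds since epoch while the endpoint specification mentions milliseconds, so adjust `date` for your deployment:

```shell
# Sketch only: HOST and the user/event values are placeholders.
HOST="127.0.0.1:4000"

# Seconds since epoch, matching the curl example; if your deployment expects
# milliseconds, use e.g. "date +%s%3N" (GNU date).
TS=$(date +%s)

URL="${HOST}/analytics?timestamp=${TS}&user=Oto+Brglez&event=click"

# Print the command instead of running it; drop the "echo" to send it for real.
echo curl -D - --request POST "${URL}"
```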
The endpoint for reporting/analytics can be accessed on the following path:

```
GET /analytics
```

If you wish to run with reloading in development mode, please consider using sbt-revolver:
```shell
sbt "~service/reStart"
```

The unit test suite bundled with the application can be run with the help of sbt:
```shell
sbt test
```

The integration tests are packaged as a separate module and can be invoked via Gatling:
```shell
sbt integration/GatlingIt/test
```

The Gatling traffic simulation runs against the service on localhost:9090. If this project is to become more serious in the future, I would suggest using Testcontainers and reusing the existing Docker Compose setup and configuration, as per the docs.
- Although the assignment identifies "user" with "username", it would be wiser to use proper UUIDs in a production setup.
- Since the application is "stateless", there is no shared state among running instances, which makes the application easy to scale.
- The `timestamp` query parameter in the `POST /analytics` request should likely be hidden from the outside and set on the server when the data is received and processed.
- In a production use-case, the business-logic handling could be improved with ZIO Prelude / Validation or Scala Cats Validated.
- The analytical aggregation is implemented with the help of TimescaleDB's hypertables and real-time continuous aggregates. In a real-world scenario, each of the aggregates would also need a proper retention policy.
- Why I've chosen TimescaleDB and not some other alternative is neatly captured and explained in this article. Other interesting options would be InfluxDB, MongoDB (time series), or other specialised time-series databases.
- Additional work should be done on logging and monitoring if this application is to be used in the wild.