
GoXStream

License: MIT

GoXStream is an open-source, Flink-inspired stream processing engine written in Go, with a beautiful React dashboard for visual pipeline design, job submission, and history. Build and run real-time data pipelines with file, database, or Kafka sources and sinks.

🚀 Features

  • Modular pipeline engine (in Go): Compose pipelines from map, filter, reduce, window, time-window, and more
  • Dynamic REST API: Submit pipelines and configure sources, sinks, operators via JSON
  • Pluggable sources/sinks: File, Postgres, Kafka (more coming)
  • Windowing: Tumbling, sliding, time-based, with watermark and late event support
  • Stateful operators and checkpointing (coming soon!)
  • React dashboard: Visual DAG pipeline builder (drag/drop), job submission, job history, JSON preview
  • Persistent job history (localStorage and soon, backend)

GoXStream Dashboard (screenshots: visual pipeline builder, job submission, and job history views)

🚀 Quick Start

Prerequisites

  • Go 1.19 or higher

1. Clone and Build

git clone https://github.com/YOUR_GITHUB_USERNAME/goxstream.git
cd goxstream
go mod tidy

2. Prepare Input Data

Place an example input.csv in the project root for the file source:

id,name,city
1,Alice,London
2,Bob,Berlin
3,Charlie,Paris
4,David,Berlin
5,Eve,Paris
6,Frank,Paris
7,Grace,Berlin

3. Run the API Server

go run ./cmd/goxstream/main.go

4. Open the React dashboard

cd goxstream-dashboard
npm install
npm start

5. 🧩 Example: Submit a Pipeline via UI or REST

Simple Map:

{
  "source": { "type": "file", "path": "input.csv" },
  "operators": [
    { "type": "map", "params": { "col": "processed", "val": "yes" } }
  ],
  "sink": { "type": "file", "path": "output.csv" }
}
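Conceptually, the map operator above visits every row and sets one column. Here is a minimal sketch of that behavior in Go; the `Row` type and function name are illustrative, not GoXStream's actual API:

```go
package main

import "fmt"

// Row models one CSV record as column name → value.
type Row map[string]string

// mapSetColumn returns a copy of the row with one extra column set,
// mirroring the {"col": "processed", "val": "yes"} map operator above.
func mapSetColumn(r Row, col, val string) Row {
	out := make(Row, len(r)+1)
	for k, v := range r {
		out[k] = v
	}
	out[col] = val
	return out
}

func main() {
	in := Row{"id": "1", "name": "Alice", "city": "London"}
	fmt.Println(mapSetColumn(in, "processed", "yes"))
}
```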

Windowed Reduce:

{
  "source": { "type": "file", "path": "input.csv" },
  "operators": [
    {
      "type": "time_window",
      "params": {
        "duration": "10s",
        "inner": {
          "type": "reduce",
          "params": { "key": "city", "agg": "count" }
        }
      }
    }
  ],
  "sink": { "type": "file", "path": "output.csv" }
}

With Watermark/Late Event Support:

{
  "source": { "type": "file", "path": "input.csv" },
  "operators": [
    {
      "type": "time_window_watermark",
      "params": {
        "duration": "10s",
        "allowed_lateness": "5s",
        "inner": {
          "type": "reduce",
          "params": { "key": "city", "agg": "count" }
        }
      }
    }
  ],
  "sink": { "type": "file", "path": "output.csv" }
}

🖥️ Visual Pipeline Designer (UI)

  • GoXStream’s UI lets you visually build pipelines (drag/drop), edit operator parameters, and export as JSON to run jobs.

  • Submitted jobs and their configs are saved in a beautiful job history.


6. Submit a Pipeline Job (file → file example) via cURL

curl -X POST http://localhost:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "file", "path": "input.csv" },
    "operators": [{ "type": "map", "params": { "col": "processed", "val": "yes" } }],
    "sink": { "type": "file", "path": "output.csv" }
  }'

7. Submit a Pipeline Job

Use curl or Postman to submit a dynamic pipeline with windowed operators.

Use this input.csv for both the time_window and time_window_watermark examples below:

id,name,city,score,timestamp
1,Alice,London,10,2024-07-04T15:00:00Z
2,Bob,Berlin,15,2024-07-04T15:00:05Z
3,Charlie,Paris,12,2024-07-04T15:00:08Z
4,David,Berlin,8,2024-07-04T15:00:12Z
5,Eve,Paris,14,2024-07-04T15:00:16Z
6,Frank,London,11,2024-07-04T15:00:22Z
7,Grace,Berlin,9,2024-07-04T15:00:29Z
8,Harry,Paris,13,2024-07-04T15:00:35Z
9,Ivy,London,17,2024-07-04T15:00:41Z
10,Jack,Berlin,16,2024-07-04T15:00:43Z

a. Time window with an inner reduce:

curl -X POST http://localhost:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "file", "path": "input.csv" },
    "operators": [
      {
        "type": "time_window",
        "params": {
          "duration": "10s",
          "inner": {
            "type": "reduce",
            "params": { "key": "city", "agg": "count" }
          }
        }
      }
    ],
    "sink": { "type": "file", "path": "output.csv" }
  }'

b. Time window with watermark/late event support:

curl -X POST http://localhost:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "file", "path": "input.csv" },
    "operators": [
      {
        "type": "time_window_watermark",
        "params": {
          "duration": "10s",
          "allowed_lateness": "5s",
          "inner": {
            "type": "reduce",
            "params": { "key": "city", "agg": "count" }
          }
        }
      }
    ],
    "sink": { "type": "file", "path": "output.csv" }
  }'

Your results will be in output.csv with a window_id column.

city,count,window_end,window_id
London,1,2024-07-04T15:00:10Z,1
Berlin,1,2024-07-04T15:00:10Z,1
Paris,1,2024-07-04T15:00:10Z,1
...

πŸ› οΈ Architecture

[Source] --> [Map] --> [Filter] --> [Window/Reduce] --> [Sink]
   |            |         |           |                   |
 [File]   [Add/Transform] [Select] [Tumbling/Sliding]  [File]

Operators: Implemented as Go interfaces, dynamically composed at runtime.

REST API: Accepts JSON job specs, launches pipeline as background Go routines.
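One idiomatic way to model the stage-by-stage flow above is channel-plumbed operators, each running in its own goroutine. This is an illustrative model, not GoXStream's actual interface:

```go
package main

import "fmt"

// Row models one record as column name → value.
type Row map[string]string

// Operator consumes rows from in and emits rows on the returned channel.
type Operator interface {
	Apply(in <-chan Row) <-chan Row
}

// FilterEq keeps rows whose Field equals Value, like the filter operator.
type FilterEq struct{ Field, Value string }

func (f FilterEq) Apply(in <-chan Row) <-chan Row {
	out := make(chan Row)
	go func() { // each operator stage runs as its own goroutine
		defer close(out)
		for r := range in {
			if r[f.Field] == f.Value {
				out <- r
			}
		}
	}()
	return out
}

// compose chains operators: source → op1 → op2 → … → sink.
func compose(src <-chan Row, ops ...Operator) <-chan Row {
	ch := src
	for _, op := range ops {
		ch = op.Apply(ch)
	}
	return ch
}

func main() {
	src := make(chan Row, 3)
	src <- Row{"city": "Berlin"}
	src <- Row{"city": "Paris"}
	src <- Row{"city": "Berlin"}
	close(src)
	for r := range compose(src, FilterEq{"city", "Berlin"}) {
		fmt.Println(r)
	}
}
```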


πŸ“ JSON Job Spec

A pipeline is defined by a simple JSON document:

{
  "source": {"type": "file", "path": "input.csv"},
  "operators": [
    {"type": "map", "params": {"col": "processed", "val": "yes"}},
    {"type": "filter", "params": {"field": "city", "eq": "Berlin"}},
    {
      "type": "sliding_window",
      "params": {
        "size": 3,
        "step": 1,
        "inner": {
          "type": "reduce",
          "params": {"key": "city", "agg": "count"}
        }
      }
    }
  ],
  "sink": {"type": "file", "path": "output.csv"}
}
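For count-based sliding windows, `size` is the number of rows per window and `step` is how far the window advances, so size 3 / step 1 yields overlapping windows over rows 1–3, 2–4, and so on, with the inner reduce applied per window. A minimal sketch of that slicing plus the count-by-key reduce (illustrative):

```go
package main

import "fmt"

// slidingWindows splits rows into overlapping windows of `size` rows,
// advancing `step` rows at a time (partial trailing windows are dropped here).
func slidingWindows(rows []string, size, step int) [][]string {
	var windows [][]string
	for start := 0; start+size <= len(rows); start += step {
		windows = append(windows, rows[start:start+size])
	}
	return windows
}

// countByKey is the inner {"type": "reduce", "agg": "count"} step.
func countByKey(window []string) map[string]int {
	counts := map[string]int{}
	for _, key := range window {
		counts[key]++
	}
	return counts
}

func main() {
	cities := []string{"London", "Berlin", "Paris", "Berlin"}
	for i, w := range slidingWindows(cities, 3, 1) {
		fmt.Printf("window %d: %v\n", i+1, countByKey(w))
	}
}
```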

📚 Operator Types

| Type                    | Description                     | Example Params                           |
| ----------------------- | ------------------------------- | ---------------------------------------- |
| `map`                   | Add or transform columns        | `col`, `val`                             |
| `filter`                | Filter rows by condition        | `field`, `eq`                            |
| `reduce`                | Aggregate/group by field        | `key`, `agg` (`count`; future: `sum`)    |
| `tumbling_window`       | Non-overlapping count windows   | `size`, `inner`                          |
| `sliding_window`        | Overlapping count windows       | `size`, `step`, `inner`                  |
| `time_window`           | Time-based tumbling windows     | `duration`, `inner`                      |
| `time_window_watermark` | Time windows with late events   | `duration`, `allowed_lateness`, `inner`  |

πŸ§‘β€πŸ’» Extending GoXStream

Add New Operators: Implement the Operator interface and add a factory to the operator registry.

Support New Sources/Sinks: Implement a Source or Sink interface in internal/source or internal/sink.
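A common shape for an operator registry is a map from the JSON `type` name to a factory function. A hypothetical custom operator registered that way might look like this; the interface, registry, and `upper` operator here are illustrative sketches, not GoXStream's actual types (see internal/ for the real ones):

```go
package main

import (
	"fmt"
	"strings"
)

// Row models one record as column name → value.
type Row map[string]string

// Operator transforms one row at a time (simplified from a streaming interface).
type Operator interface {
	Process(Row) Row
}

// registry maps the JSON "type" field to a factory that builds the operator
// from its "params" object.
var registry = map[string]func(params map[string]string) Operator{}

// Upper is a hypothetical custom operator that uppercases one column.
type Upper struct{ Field string }

func (u Upper) Process(r Row) Row {
	r[u.Field] = strings.ToUpper(r[u.Field])
	return r
}

func init() {
	// Register the factory so {"type": "upper", "params": {...}} resolves to Upper.
	registry["upper"] = func(p map[string]string) Operator {
		return Upper{Field: p["field"]}
	}
}

func main() {
	op := registry["upper"](map[string]string{"field": "city"})
	fmt.Println(op.Process(Row{"city": "berlin"}))
}
```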

React UI Integration: The dashboard already supports interactive pipeline creation; deeper monitoring integration is planned.


🔜 Roadmap

  • Dynamic operator/source/sink registry

  • File/DB/Kafka connectors

  • REST API & React UI for pipeline design and monitoring

  • Visual DAG editor with drag/drop (reactflow)

  • Windowing and watermark support

  • Persistent job history (localStorage)

  • Checkpoint and fault-tolerance (coming next)

  • Backend job monitoring/status APIs

  • Multi-job/cluster execution

  • More analytics and ML operators


🙌 Contributing

PRs, issues, and ideas are welcome! Fork and submit improvements or new features. Let's build a great Go stream engine together!

GoXStream: Streaming, the Go way! 🚀
