This is a repo with links to everything you'd ever want to learn about data engineering
- Updated Oct 12, 2025
- Jupyter Notebook
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
A curated list of awesome big data frameworks, resources and other awesomeness.
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
🚀 RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platforms such as MinIO and Ceph.
AI-Native Data Warehouse. Blazing analytics, fast search, geo insights, vector AI. Built for multimodal analytics. Open-source Snowflake alternative. https://databend.com
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Upserts, Deletes And Incremental Processing on Big Data.
A Cloud Native Batch System (Project under CNCF)
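100+ sets of eye-catching HTML5 big-screen templates for big data visualization, covering industries such as community, property management, government affairs, transportation, and finance and banking. The newest, largest, most complete, and flashiest collection of big data visualization templates on the web. Updated continuously.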
🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor. The project is simple (heavy on logic, light on pages) and well suited for practice.
GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.