Roadmap and the Future #314

@kettanaito

Below, I will outline the roadmap for Data, talk about some of its design flaws, and how they should be addressed. This document serves as a way for everyone to see where the project is going and align their contribution with that vision.

Design flaws

Flaw 1: Doing too much

Data is unnecessarily complex under the hood, largely because it tries to combine two vast areas:

  • Object design;
  • Data querying.

At the moment of conception, there wasn’t a unified way to describe object schemas in JavaScript. Since the introduction of Standard Schema a few months ago, that is no longer the case. Designing objects with Data was cumbersome because you are not supposed to design objects in some arbitrary testing tool.

Flaw 2: Relationships

Relationships are hands-down the most complex feature of Data. What makes them even more complex is that data relationships often extend beyond data design. If anything, they belong more to the querying side of things than to the designing side.

That complexity is both internal and external. I’ve re-implemented the relationships at least four times now and I’m still not happy with the result. I suspect a big reason for that unhappiness is Flaw 1.

Future direction

  1. Data will not provide any object modeling capabilities. Instead, you should use any Standard Schema-compliant schema library (Zod, Valibot, ArkType, etc). This also means that Data can have better interoperability with the tools you’re already using.
  2. The querying API has to be rethought. Ideally, we should build a standalone ORM-inspired library that can query any data structures using Prisma/Drizzle-like methods (.findFirst(), etc). Again, the querying functionality has nothing to do with Data.
  3. Focus on random values and MSW integration. The biggest value of Data lies in the ability to turn your object schema into actual data and then spin up MSW request handlers to turn that data into anything (test servers, request handlers, etc).
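
To make the first point more concrete, here is a minimal sketch of what accepting "any Standard Schema" could look like. The `~standard` property and its `version`, `vendor`, and `validate` fields follow the Standard Schema v1 interface; everything else (`parseWithSchema`, `userSchema`, the `User` shape) is hypothetical and only serves as an illustration. The interface is copied by hand to keep the example self-contained, and the schema is hand-written where you would normally pass one from Zod, Valibot, or ArkType.

```typescript
// Hand-copied subset of the Standard Schema v1 interface (assumption: in a
// real project you would import the types from "@standard-schema/spec").
type StandardResult<Output> =
  | { readonly value: Output; readonly issues?: undefined }
  | { readonly issues: ReadonlyArray<{ readonly message: string }> }

interface StandardSchemaV1<Output = unknown> {
  readonly '~standard': {
    readonly version: 1
    readonly vendor: string
    readonly validate: (
      value: unknown,
    ) => StandardResult<Output> | Promise<StandardResult<Output>>
  }
}

// Hypothetical helper: validates a record against any Standard Schema.
// It only supports synchronous schemas to keep the sketch short.
function parseWithSchema<Output>(
  schema: StandardSchemaV1<Output>,
  input: unknown,
): Output {
  const result = schema['~standard'].validate(input)

  if (result instanceof Promise) {
    throw new TypeError('This sketch only supports synchronous schemas')
  }

  if (result.issues) {
    throw new Error(result.issues.map((issue) => issue.message).join('; '))
  }

  return result.value
}

// Hypothetical model shape used for illustration only.
interface User {
  id: string
  name: string
}

// A hand-written schema standing in for one produced by a schema library.
const userSchema: StandardSchemaV1<User> = {
  '~standard': {
    version: 1,
    vendor: 'example',
    validate(value) {
      const candidate = value as Partial<User> | null

      if (
        typeof candidate === 'object' &&
        candidate !== null &&
        typeof candidate.id === 'string' &&
        typeof candidate.name === 'string'
      ) {
        return { value: { id: candidate.id, name: candidate.name } }
      }

      return { issues: [{ message: 'Expected { id: string, name: string }' }] }
    },
  },
}

const user = parseWithSchema(userSchema, { id: '1', name: 'Ada' })
console.log(user.name)
```

Because the helper only relies on the `~standard` property, swapping the hand-written schema for one from a compliant library should not require changes to the helper itself.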

Good, now what?

There are two ways we can approach this change:

  1. Leave it up to me. I will get to this when I have the time and the mood for it, which is how I approach all my open source projects. This might take some time, even though I’ve wanted to improve the library for years now.
  2. Collaborative effort. You can help in making the new Data possible. I will do my best to provide active pull request reviews and support in merging the changes related to the mentioned improvements.

Roadmap

Here’s a rough roadmap of the things that need to happen in order to improve Data. None of this is final, but it’s a good starting point.

  • Adopt Standard Schema as the input type for your models. factory() should accept any Standard Schema and use it to generate actual data. No type-first schema design. Create a schema, infer types from the schema. This is the way.
  • Research: See if there’s a Standard Schema version of the querying capabilities. I’ve had a chat with the Drizzle team about this in the past and I recall them mentioning something. This is a good time to check if they’ve built that something. I’d love for us to reuse an existing querying library.
    • If there’s no such library, we should build our own. A thing that allows you to query any data (we can limit it to root-level objects, that’s fine) using an ORM-inspired API.
  • Refactor relationships. Once the querying changes are done, we should research and refactor the relationships at their core, from the terminology and function names to how they’re implemented (I still think the Proxy-based approach is great, it just has to be done right in the context of the data input, which at this stage should be Standard Schema). See what JS ORMs are doing. See what Laravel is doing (they’ve got really powerful ORM capabilities built in).
  • Improve MSW integration. Since the introduction of Data, MSW has shipped Source, which is a library that can create request handlers out of different sources. It sounds like Data might be one more input for Source to support. If we can find a querying library with relationships built in, Data itself would become extremely slim (just Standard Schema -> faker -> ready for Source).
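
For the querying item, a rough sketch of an ORM-inspired API over plain root-level objects might look like the following. This is not an existing library and not a settled design: `createCollection`, the `{ where }` argument shape, and the sample records are all hypothetical, loosely modeled on Prisma-style `findFirst()`/`findMany()` calls. Matching is limited to strict equality on root-level properties, in line with the "root-level objects" limitation mentioned above.

```typescript
// Hypothetical query argument: match records whose root-level properties
// are strictly equal to the given values.
interface Query<T> {
  where: Partial<T>
}

// Returns true if every property listed in `where` equals the record's value.
function matches<T extends object>(record: T, where: Partial<T>): boolean {
  const keys = Object.keys(where) as Array<keyof T>
  return keys.every((key) => record[key] === where[key])
}

// Hypothetical factory that wraps a plain array with ORM-inspired methods.
function createCollection<T extends object>(records: ReadonlyArray<T>) {
  return {
    // Returns the first matching record, or undefined if none match.
    findFirst(query: Query<T>): T | undefined {
      return records.find((record) => matches(record, query.where))
    },
    // Returns all matching records in their original order.
    findMany(query: Query<T>): T[] {
      return records.filter((record) => matches(record, query.where))
    },
  }
}

// Sample data for illustration only.
const users = createCollection([
  { id: '1', name: 'Ada', role: 'admin' },
  { id: '2', name: 'Grace', role: 'user' },
  { id: '3', name: 'Linus', role: 'user' },
])

console.log(users.findFirst({ where: { role: 'user' } }))
console.log(users.findMany({ where: { role: 'user' } }).length)
```

Since the collection only needs an array of objects, it stays independent of how those objects were produced, which matches the idea of keeping querying separate from Data.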

Please raise a discussion before contributing anything. A day of discussion saves a week of work. I’ve already displeased a ton of people who’ve opened pull requests to Data. I will likely displease them even more by closing their pull requests if they don’t contribute to the vision of the library. To prevent that from happening in the future, it’s good to discuss first and use this issue as the grounds for those discussions.
