Skip to content

Support for object, array and enum in ruleset builder #1025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jul 25, 2025

Conversation

sigmaith
Copy link
Member

@sigmaith sigmaith commented Jul 16, 2025

What this PR does / why we need it?

Enhances the schema ruleset builder to support below types.

  • Enum
  • Nested objects (regardless of order)
  • Array types (Array, T[])
  • Recursive, Recursive Array

Introduces a Map-based approach to handle nested object types and type references, allowing for order-independent type definition resolution.
Introduces a visited Set, currentTypeName to prevent infinite recursion.

Any background context you want to provide?

As I worked on it, I thought that the enum should be treated as a literal union.
This is the part that connects to the complex union test case later.
Please let me know if I missed anything.

What are the relevant tickets?

Related yorkie-team/yorkie#971

Checklist

  • Added relevant tests or not required
  • Addressed and resolved all CodeRabbit review comments
  • Didn't break anything

Summary by CodeRabbit

  • New Features

    • Added support for union types, literals, and enum rules in schema definitions.
  • Refactor

    • Enhanced type definition handling with hierarchical construction and improved reference resolution with cycle detection.
    • Reintroduced the 'null' primitive type in schema rules.
  • Tests

    • Activated tests for nested objects, recursive types, array types, and enum type handling.
    • Added new tests for recursive array types and complex union type validation.

Copy link

coderabbitai bot commented Jul 16, 2025

Walkthrough

The RulesetBuilder class in the schema package was refactored to build a hierarchical type definition system using stacks, supporting union types, literals, and enum rules. Rule expansion now occurs recursively from a "Document" type definition with cycle detection. Related tests were updated by activating multiple previously pending test cases, including recursive and enum types, and adding a new union types validation test.

Changes

File(s) Change Summary
packages/schema/src/rulesets.ts Refactored RulesetBuilder to use hierarchical, stack-based type definitions; added union, literal, and enum support; updated type exports and rule expansion logic with cycle detection.
packages/schema/test/ruleset.test.ts Reordered imports; activated multiple previously pending test cases for nested objects, arrays, recursive types, and enums; added new recursive array type test.
packages/schema/test/validator.test.ts Reordered imports; added a new .todo test case for validating complex union types in schemas.

Sequence Diagram(s)

sequenceDiagram
  participant Parser
  participant RulesetBuilder
  participant TypeStack
  participant Ruleset

  Parser->>RulesetBuilder: Parse schema
  loop For each type/property/union/literal
    RulesetBuilder->>TypeStack: Push context (type/property/union)
    TypeStack-->>RulesetBuilder: Current context
    RulesetBuilder->>TypeStack: Pop context on exit
  end
  RulesetBuilder->>RulesetBuilder: build()
  RulesetBuilder->>Ruleset: expandType("Document")
  Ruleset-->>RulesetBuilder: Flat array of rules
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

In the warren where schemas are spun,
Types now stack, and unions run!
Enums and objects, all in a row,
Nested and referenced, ready to go.
With rules expanded, the tests now play—
A rabbit’s delight: it’s schema ballet!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

packages/schema/src/rulesets.ts

Oops! Something went wrong! :(

ESLint: 8.57.1

ESLint couldn't find the plugin "@typescript-eslint/eslint-plugin".

(The package "@typescript-eslint/eslint-plugin" was not found when loaded as a Node module from the directory "".)

It's likely that the plugin isn't installed correctly. Try reinstalling by running the following:

npm install @typescript-eslint/eslint-plugin@latest --save-dev

The plugin "@typescript-eslint/eslint-plugin" was referenced from the config file in ".eslintrc.js".

If you still can't figure out the problem, please stop by https://eslint.org/chat/help to chat with the team.

packages/schema/test/ruleset.test.ts

Oops! Something went wrong! :(

ESLint: 8.57.1

ESLint couldn't find the plugin "@typescript-eslint/eslint-plugin".

(The package "@typescript-eslint/eslint-plugin" was not found when loaded as a Node module from the directory "".)

It's likely that the plugin isn't installed correctly. Try reinstalling by running the following:

npm install @typescript-eslint/eslint-plugin@latest --save-dev

The plugin "@typescript-eslint/eslint-plugin" was referenced from the config file in ".eslintrc.js".

If you still can't figure out the problem, please stop by https://eslint.org/chat/help to chat with the team.

packages/schema/test/validator.test.ts

Oops! Something went wrong! :(

ESLint: 8.57.1

ESLint couldn't find the plugin "@typescript-eslint/eslint-plugin".

(The package "@typescript-eslint/eslint-plugin" was not found when loaded as a Node module from the directory "".)

It's likely that the plugin isn't installed correctly. Try reinstalling by running the following:

npm install @typescript-eslint/eslint-plugin@latest --save-dev

The plugin "@typescript-eslint/eslint-plugin" was referenced from the config file in ".eslintrc.js".

If you still can't figure out the problem, please stop by https://eslint.org/chat/help to chat with the team.

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 983ea7c and 6e5d04f.

📒 Files selected for processing (3)
  • packages/schema/src/rulesets.ts (2 hunks)
  • packages/schema/test/ruleset.test.ts (6 hunks)
  • packages/schema/test/validator.test.ts (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/schema/test/ruleset.test.ts (2)
packages/schema/src/rulesets.ts (1)
  • buildRuleset (492-501)
packages/schema/src/index.ts (1)
  • buildRuleset (18-18)
🔇 Additional comments (22)
packages/schema/test/ruleset.test.ts (7)

17-17: LGTM: Import statement reordering.

The reordering of imports to place expect before it improves consistency with common import ordering conventions.


102-140: LGTM: Activated test for order-independent nested objects.

The test correctly validates that the ruleset builder can handle nested object types regardless of their declaration order, which is essential for the new Map-based approach mentioned in the PR objectives.


172-205: LGTM: Activated test for Array syntax.

The test properly validates array type handling using the generic Array<T> syntax, including the correct generation of array rules with item type information and wildcard paths for array elements.


207-240: LGTM: Activated test for T[] syntax.

The test correctly validates array type handling using the bracket T[] syntax, ensuring both array syntaxes are supported equivalently.


242-279: LGTM: Comprehensive recursive types test.

The test properly validates recursive type handling with a linked list structure, including the generation of recursive path patterns like $.linkedList.next[*].value. This addresses the cycle detection requirements mentioned in past review comments.


281-324: LGTM: Well-structured recursive array types test.

The new test for recursive array types (tree structure) correctly validates the handling of recursive references within arrays, ensuring proper rule generation for nested array structures.


402-420: LGTM: Activated enum types test.

The test correctly validates enum type handling, treating union types with literal values as enums with the expected rule structure containing type: 'enum' and values array.

packages/schema/test/validator.test.ts (2)

17-17: LGTM: Consistent import statement reordering.

The import reordering matches the pattern used in the ruleset test file, maintaining consistency across the test suite.


319-329: LGTM: Comprehensive union types validation test.

The new .todo test provides thorough coverage of complex union type scenarios, including simple unions, arrays with unions, nested unions with objects, and multiple type unions. This aligns with the enhanced union type support mentioned in the PR objectives.

packages/schema/src/rulesets.ts (13)

19-30: LGTM: Enhanced import statements for new functionality.

The additional imports support the new features including array types, literals, property signatures, and type references, which are essential for the enhanced schema capabilities.


35-95: LGTM: Well-designed type system architecture.

The new type definitions provide a solid foundation for the hierarchical approach:

  • Clear separation between Rule types and internal TypeDefinition types
  • EnumRule addition supports union/enum functionality
  • Comprehensive type coverage for primitives, objects, arrays, and unions

97-128: LGTM: Clean internal type definitions.

The TypeDefinition union type and PropertyDefinition structure provide a clear, extensible foundation for the new stack-based parsing approach.


134-142: LGTM: Well-structured class state management.

The new stack-based approach with clear separation of concerns (type definitions, current context, stacks for nested structures) is a significant improvement over the previous flat approach.


147-151: LGTM: Clean type alias declaration entry.

The method properly initializes the context for parsing a new type definition.


156-171: LGTM: Robust array type detection.

The method correctly handles both Array<T> and T[] syntax variants by checking for the appropriate tokens in the parse tree.


187-190: LGTM: Clean primitive type handling.

The method correctly creates primitive type definitions and pushes them to the type stack.


195-198: LGTM: Proper Yorkie type handling.

The method correctly handles Yorkie-specific types by creating appropriate type definitions.


203-206: LGTM: Type reference handling.

The method properly creates reference type definitions for user-defined types.


211-223: LGTM: Stack-based object type management.

The enter/exit methods for object types properly manage the property stack, enabling nested object parsing.


228-248: LGTM: Property signature handling with optional support.

The methods correctly handle property parsing including optional property detection and proper stack management.


253-265: LGTM: Union type context management.

The methods properly manage union type parsing context, enabling the collection of literal values for enum generation.


308-486: LGTM: Excellent implementation of recursive type expansion with cycle detection.

The expandType method successfully addresses the circular reference detection concern from past reviews by implementing a visited set with composite keys. The method handles all type variants comprehensively:

  • Proper cycle detection using visitKey with path and type name
  • Comprehensive handling of all type kinds (primitive, yorkie, array, union, object, reference)
  • Special logic for recursive types and array element expansion
  • Correct rule generation for each type variant

This is a significant improvement that enables support for recursive types, nested objects, and complex schema structures.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch enhance-schema-ruleset-builder

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@CLAassistant
Copy link

CLAassistant commented Jul 16, 2025

CLA assistant check
All committers have signed the CLA.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🔭 Outside diff range comments (1)
packages/schema/src/rulesets.ts (1)

40-47: Missing 'null' primitive type definition

The test file uses a 'null' type at line 51, but it's not included in the PrimitiveType definition. This will cause the parser to fail when encountering null types.

Add 'null' to the PrimitiveType definition:

 export type PrimitiveType =
   | 'boolean'
   | 'integer'
   | 'double'
   | 'long'
   | 'string'
   | 'date'
-  | 'bytes';
+  | 'bytes'
+  | 'null';
🧹 Nitpick comments (2)
packages/schema/src/rulesets.ts (1)

296-302: Consider type consistency between UnionRule and expandType

The UnionRule type definition uses type: 'union', but the expandType method outputs type: 'enum'. While this aligns with treating enums as literal unions per the PR objectives, it creates a mismatch between the type definition and actual output.

Either update the type definition to match the output:

 export type UnionRule = {
-  type: 'union';
+  type: 'enum';
   values: Array<string | number | boolean>;
 } & RuleBase;

Or update the RuleType to reflect this:

 export type RuleType =
   | 'object'
   | 'array'
-  | 'union'
   | PrimitiveType
   | YorkieType
   | 'enum';

Also applies to: 85-88

packages/schema/test/ruleset.test.ts (1)

363-434: Consider adding tests for edge cases

The current tests cover the happy path well, but consider adding tests for:

  • Circular type references
  • Duplicate type definitions
  • Invalid literal values in unions
  • Empty union types

Would you like me to generate additional test cases for these edge cases?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cb65b8f and b7125c4.

📒 Files selected for processing (2)
  • packages/schema/src/rulesets.ts (3 hunks)
  • packages/schema/test/ruleset.test.ts (3 hunks)
🔇 Additional comments (1)
packages/schema/test/ruleset.test.ts (1)

102-140: Good test coverage for new features

The activation of these two test cases properly validates the enhanced functionality:

  1. Nested objects regardless of declaration order - confirms the Map-based approach works correctly
  2. Enum types with string and number literals - validates the union/enum type handling

Also applies to: 363-381

@sigmaith sigmaith marked this pull request as draft July 16, 2025 04:30
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
packages/schema/src/rulesets.ts (2)

145-150: Duplicate type definition validation still missing.

The previous review comment about adding validation for duplicate type definitions has not been addressed. This could lead to silent overwrites of type definitions.

Add validation to prevent duplicate type definitions:

 exitTypeAliasDeclaration() {
   if (this.currentTypeName && this.typeStack.length > 0) {
     const typeDef = this.typeStack.pop()!;
+    if (this.typeDefinitions.has(this.currentTypeName)) {
+      throw new Error(`Duplicate type definition: ${this.currentTypeName}`);
+    }
     this.typeDefinitions.set(this.currentTypeName, typeDef);
   }
   this.currentTypeName = undefined;
 }

276-338: Circular reference detection still missing.

The previous review comment about adding circular reference detection in expandType has not been addressed. This could lead to stack overflow errors with circular type dependencies.

Add circular reference detection as suggested in the previous review:

 private expandType(
   typeDef: TypeDefinition,
   path: string,
   rules: Array<Rule>,
+  visited: Set<string> = new Set(),
 ): void {
+  // For reference types, check circular dependencies
+  if (typeDef.kind === 'reference') {
+    if (visited.has(typeDef.typeName)) {
+      console.warn(`Circular reference detected: ${typeDef.typeName}`);
+      return;
+    }
+    visited.add(typeDef.typeName);
+  }
+
   switch (typeDef.kind) {
     // ... existing cases ...
     case 'reference': {
       const referencedType = this.typeDefinitions.get(typeDef.typeName);
       if (referencedType) {
-        this.expandType(referencedType, path, rules);
+        this.expandType(referencedType, path, rules, visited);
       } else {
         console.warn(`Type reference not found: ${typeDef.typeName}`);
       }
+      visited.delete(typeDef.typeName);
       break;
     }
     // Update recursive calls for properties
     case 'object': {
       // ... existing object handling ...
       for (const property of typeDef.properties) {
         const propertyPath = `${path}.${property.name}`;
-        this.expandType(property.type, propertyPath, rules);
+        this.expandType(property.type, propertyPath, rules, new Set(visited));
       }
       break;
     }
   }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b7125c4 and 983ea7c.

📒 Files selected for processing (1)
  • packages/schema/src/rulesets.ts (3 hunks)
🔇 Additional comments (2)
packages/schema/src/rulesets.ts (2)

34-60: LGTM! Clean type system enhancement.

The addition of EnumRule and the updated RuleType union properly supports the new enum functionality while maintaining type safety.


85-117: Well-designed internal type system.

The TypeDefinition and PropertyDefinition types provide a clean abstraction for building the hierarchical schema representation before expansion.

Copy link

codecov bot commented Jul 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.14%. Comparing base (cb65b8f) to head (6e5d04f).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1025      +/-   ##
==========================================
+ Coverage   78.10%   78.14%   +0.03%     
==========================================
  Files          66       66              
  Lines        5754     5764      +10     
  Branches     1041     1043       +2     
==========================================
+ Hits         4494     4504      +10     
+ Misses        951      950       -1     
- Partials      309      310       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sigmaith sigmaith self-assigned this Jul 16, 2025
@sigmaith sigmaith marked this pull request as ready for review July 25, 2025 02:44
@sigmaith sigmaith changed the title Enhance the schema ruleset builder to support enums and nested objects Enhance the schema ruleset builder Jul 25, 2025
@sigmaith sigmaith requested review from hackerwins and chacha912 July 25, 2025 04:01
Copy link
Member

@hackerwins hackerwins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution.

@hackerwins hackerwins changed the title Enhance the schema ruleset builder Support for nested object and array, enum in schema ruleset builder Jul 25, 2025
@hackerwins hackerwins changed the title Support for nested object and array, enum in schema ruleset builder Support for object, array and enum in schema ruleset builder Jul 25, 2025
@hackerwins hackerwins changed the title Support for object, array and enum in schema ruleset builder Support for object, array and enum in ruleset builder Jul 25, 2025
@hackerwins hackerwins merged commit 7aa0828 into main Jul 25, 2025
2 checks passed
@hackerwins hackerwins deleted the enhance-schema-ruleset-builder branch July 25, 2025 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants