Skip to content

re2: dev runs work without worker groups, fixed some type issues #1756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 5, 2025

Conversation

ericallam
Copy link
Member

@ericallam ericallam commented Mar 5, 2025

Summary by CodeRabbit

  • Refactor

    • Streamlined event processing and task execution to ensure more consistent and reliable operations.
    • Consolidated event logging logic by unifying flag usage for clearer categorization.
  • Chores

    • Enhanced error handling in background services and connections for improved system resilience.
    • Updated internal configurations and tests to support more robust task processing.

…r queue is defined by the environment). Also deprecated the TaskEvent.isDebug column and using TaskEventKind.LOG instead for debug events
Copy link

changeset-bot bot commented Mar 5, 2025

⚠️ No Changeset found

Latest commit: bb6dbcf

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Contributor

coderabbitai bot commented Mar 5, 2025

Walkthrough

This pull request introduces several focused modifications. A TypeScript ignore directive comment is added within the checkpointer to suppress node type errors. In the webapp, event debugging now derives from an event’s kind rather than a direct flag, resulting in updated type definitions, SQL queries, and the removal of the TaskEventSummary type. The TriggerTaskServiceV2 refactors master queue retrieval into a new private method that throws an error when a worker group is missing. Additionally, run-engine trigger parameters are made optional, the Prisma schema deprecates the old debug field, and a Redis error listener is added to swallow connection errors.

Changes

File(s) Change Summary
apps/coordinator/src/checkpointer.ts Adds a // @ts-ignore comment before clearInterval to suppress node type errors.
apps/webapp/app/utils/taskEvent.ts,
apps/webapp/app/v3/eventRepository.server.ts,
apps/webapp/app/v3/taskEventStore.server.ts
Updates event debugging: imports TaskEventKind, replaces direct isDebug usage with checks on event.kind, updates type definitions and SQL queries, and removes TaskEventSummary.
apps/webapp/app/v3/services/triggerTaskV2.server.ts Refactors master queue retrieval into a new private method #getMasterQueueForEnvironment and changes error handling to throw a ServiceValidationError when a worker group is missing.
internal-packages/run-engine/src/engine/tests/batchTriggerAndWait.test.ts,
internal-packages/run-engine/src/engine/types.ts
Adds an organizationId property to the batch trigger method and updates the TriggerParams type to make masterQueue and taskEventStore optional.
internal-packages/database/prisma/schema.prisma Deprecates the isDebug field in the TaskEvent model, signaling a shift to using TaskEventKind.LOG.
internal-packages/testcontainers/src/utils.ts Introduces an error event listener in verifyRedisConnection to swallow Redis connection errors.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant TriggerTaskServiceV2
    participant WorkerGroupService

    Client->>TriggerTaskServiceV2: call()
    TriggerTaskServiceV2->>TriggerTaskServiceV2: #getMasterQueueForEnvironment(environment)
    alt Environment = DEVELOPMENT
        TriggerTaskServiceV2-->>Client: Returns undefined
    else Environment ≠ DEVELOPMENT
        TriggerTaskServiceV2->>WorkerGroupService: getDefaultWorkerGroup(project)
        WorkerGroupService-->>TriggerTaskServiceV2: workerGroup (or error)
        TriggerTaskServiceV2-->>Client: Returns masterQueue
    end
Loading

Possibly related PRs

Suggested reviewers

  • matt-aitken

Poem

I'm a rabbit on the run,
Hopping through lines of refined code,
Debug flags transformed with fun,
Leaving old pitfalls down the road.
Carrots and comments cheer each change—
A joyful hop in our code exchange!

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal-packages/testcontainers/src/utils.ts (1)

90-92: Consider documenting why errors are swallowed

While adding an error handler to swallow Redis connection errors is appropriate for test containers, it would be helpful to add a more descriptive comment explaining why these errors are intentionally ignored.

 redis.on("error", (error) => {
-  // swallow the error
+  // Intentionally swallowing Redis errors during connection verification
+  // to prevent test failures due to transient Redis connection issues
 });
apps/webapp/app/v3/eventRepository.server.ts (1)

849-850: Style icon is still based on isDebug attribute.

While the system is transitioning from isDebug to kind, the UI style icon is still determined by the boolean isDebug flag. Consider updating this to use the kind value for consistency.

-      [SemanticInternalAttributes.STYLE_ICON]: options.attributes.isDebug ? "warn" : "play",
+      [SemanticInternalAttributes.STYLE_ICON]: options.kind === TaskEventKind.LOG ? "warn" : "play",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e97704d and bb6dbcf.

📒 Files selected for processing (9)
  • apps/coordinator/src/checkpointer.ts (1 hunks)
  • apps/webapp/app/utils/taskEvent.ts (2 hunks)
  • apps/webapp/app/v3/eventRepository.server.ts (6 hunks)
  • apps/webapp/app/v3/services/triggerTaskV2.server.ts (3 hunks)
  • apps/webapp/app/v3/taskEventStore.server.ts (3 hunks)
  • internal-packages/database/prisma/schema.prisma (1 hunks)
  • internal-packages/run-engine/src/engine/tests/batchTriggerAndWait.test.ts (1 hunks)
  • internal-packages/run-engine/src/engine/types.ts (1 hunks)
  • internal-packages/testcontainers/src/utils.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • apps/coordinator/src/checkpointer.ts
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: units / 🧪 Unit Tests
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (17)
apps/webapp/app/utils/taskEvent.ts (2)

15-15: Added import for TaskEventKind is appropriate

This import is used as part of the logic change on line 79, aligning with the new approach to determine debug status.


79-79: Logic change to derive isDebug from event kind is aligned with schema changes

Changing from isDebug to checking event.kind === TaskEventKind.LOG matches the schema changes where isDebug is being deprecated. This approach ensures consistency in how debugging events are identified across the application.

internal-packages/database/prisma/schema.prisma (1)

2384-2385: Appropriate deprecation comment for isDebug field

The deprecation notice clearly explains that the field should no longer be used and directs developers to the new approach using TaskEventKind.LOG. This is good practice for managing the transition away from deprecated fields.

internal-packages/run-engine/src/engine/types.ts (1)

68-68:

✅ Verification successful

Verify that all consumers of taskEventStore handle undefined values

Making taskEventStore optional could cause runtime errors if any code assumes this field always has a value. Ensure all code that consumes this field includes appropriate null/undefined checks.


🏁 Script executed:

#!/bin/bash
# Description: Find all uses of taskEventStore from TriggerParams to verify they handle undefined values

# Look for usages of TriggerParams.taskEventStore
rg -A 3 -B 3 "\btaskEventStore\b" --glob "*.ts" | grep -v "taskEventStore\?:"

Length of output: 26448


Verification Confirmed: All Consumers Handle Undefined Safely

After reviewing the usage across the codebase, it appears that every consumer of taskEventStore appropriately handles cases where the value may be undefined. In particular:

  • The utility function getTaskEventStoreTableForRun (in apps/webapp/app/v3/taskEventStore.server.ts) correctly defaults run.taskEventStore to "taskEvent" when it isn’t "taskEventPartitioned", thereby safely handling an undefined input.
  • Other consumers merely pass the optional taskEventStore along to helper functions (or rely on constructed instances) rather than dereferencing it directly, which avoids runtime errors.

No further changes are needed regarding null/undefined checks for this field.

apps/webapp/app/v3/taskEventStore.server.ts (3)

23-23: Type update aligns with event model refactoring.

The addition of the "kind" field to the TraceEvent type is part of a broader refactoring from using isDebug boolean flag to using the more flexible kind enumeration to categorize events.


141-141: SQL query update keeps consistency with type changes.

Adding the "kind" field to the SELECT statement in the findTraceEvents method maintains consistency with the updated TraceEvent type that now includes this field.


171-171: SQL query update maintains consistency across table types.

This parallel change to include "kind" in the SELECT clause for the TaskEvent table ensures both queries return the same fields, making the code more maintainable and consistent regardless of which table is used.

apps/webapp/app/v3/eventRepository.server.ts (5)

23-23: Updated import adds TaskEventKind for enhanced type safety.

The import now includes TaskEventKind which supports the transition from boolean isDebug flags to a more descriptive enumeration, providing better type safety and clarity in event categorization.


132-132: Type definition updated to match model changes.

Replaced isDebug: true with kind: true in the QueriedEvent type to align with the database schema changes, ensuring type consistency throughout the application.


495-495: Updated logic derives debug status from event kind.

Instead of directly using a boolean flag, the code now determines if an event is for debugging by checking if its kind is TaskEventKind.LOG. This approach provides more flexibility for future event categorization.


552-552: Updated field selection in database query.

The getRunEvents method now selects the kind field instead of isDebug, ensuring the returned data includes the information needed to determine the event's type.


864-865:

❓ Verification inconclusive

Event kind and level are derived from isDebug attribute.

The code now sets both level and kind based on the isDebug attribute, serving as a transition point that maps the old boolean flag to the new enum-based categorization.

Since this appears to be a transition strategy, let's verify if there are other places in the codebase that still rely on isDebug:


🏁 Script executed:

#!/bin/bash
# Search for isDebug usages in the codebase

echo "Finding remaining usages of isDebug..."
rg -A 2 -B 2 "isDebug" --type ts

Length of output: 3558


Action Required: Verify Consistent Migration of isDebug Usage

The changes in apps/webapp/app/v3/eventRepository.server.ts now correctly derive both level and kind from the options.attributes.isDebug flag (mapping to "WARN"/TaskEventKind.LOG versus "TRACE"/options.kind), which appears to be part of a controlled transition from a boolean flag to an enum-based categorization.

However, the search for isDebug revealed additional references in the codebase:

  • apps/webapp/app/utils/taskEvent.ts – where isDebug is computed as event.kind === TaskEventKind.LOG.
  • apps/webapp/app/routes/.../route.tsx – where isDebug is used in filtering events (e.g., !event.data.isDebug).

Please review these other usages to ensure they align with the intended migration strategy. It may be necessary to update them to maintain consistency across the application.

apps/webapp/app/v3/services/triggerTaskV2.server.ts (3)

315-315: Refactored master queue retrieval for better organization.

The inline worker group retrieval logic has been extracted into a dedicated method, improving code readability and maintainability.


340-340: Simplified master queue assignment.

The assignment of the masterQueue parameter now directly uses the result of the extracted method, making the code cleaner and easier to follow.


430-449:

❓ Verification inconclusive

Added well-structured private method for master queue retrieval.

The new private method #getMasterQueueForEnvironment effectively encapsulates the worker group retrieval logic, with proper error handling for missing worker groups. This is a good example of the Extract Method refactoring pattern.

However, there's a notable behavior change: now an error is thrown when no worker group is found, whereas previously it would just log an error and continue, which could cause issues downstream.

Let's verify if this change in error handling behavior was intentional and whether there are places that rely on the previous behavior:


🏁 Script executed:

#!/bin/bash
# Check for error handling related to missing worker groups

echo "Checking for calls to this method..."
rg -A 2 -B 2 "getMasterQueueForEnvironment" --type ts

echo "Checking for error handling of missing worker groups..."
rg -A 5 -B 5 "No worker group found" --type ts

Length of output: 2946


Action Required: Confirm Error Handling for Missing Worker Group

The new private method #getMasterQueueForEnvironment cleanly encapsulates the worker group retrieval logic and now throws a ServiceValidationError when no worker group is found. However, this introduces a behavioral change compared to the previous approach in WorkerGroupService (which logged the error and returned without throwing). Please verify the following:

  • Confirm that all consumers of #getMasterQueueForEnvironment (e.g., in triggerTaskV2.server.ts) are designed to handle an exception rather than a silent failure.
  • Review if the inconsistency between throwing an error here and the logging behavior in WorkerGroupService is intentional. If not, consider aligning the error handling across these methods.
  • If the change is deliberate, document the rationale to guide downstream error handling.
internal-packages/run-engine/src/engine/tests/batchTriggerAndWait.test.ts (2)

102-108: LGTM! Added organizationId parameter to support updated method signature.

The addition of the organizationId parameter to the blockRunWithCreatedBatch method call correctly aligns with the updated method signature mentioned in the PR.


199-204:

❓ Verification inconclusive

Consider adding organizationId parameter to unblockRunForCreatedBatch for consistency.

The blockRunWithCreatedBatch method now includes organizationId, but the unblockRunForCreatedBatch method call doesn't have this parameter. For consistency, consider adding the organizationId parameter here as well if the method signature has been updated to include it.


🏁 Script executed:

#!/bin/bash
# Check if unblockRunForCreatedBatch method also accepts organizationId parameter
grep -r "unblockRunForCreatedBatch" --include="*.ts" internal-packages/run-engine/src/

Length of output: 294


Action Required: Align Parameter Consistency in unblockRunForCreatedBatch

  • It appears that the blockRunWithCreatedBatch method now accepts an organizationId parameter. Please verify in the implementation (in internal-packages/run-engine/src/engine/index.ts) whether unblockRunForCreatedBatch has been updated to also accept this parameter.
  • If the method signature now includes organizationId, update the test call in internal-packages/run-engine/src/engine/tests/batchTriggerAndWait.test.ts (lines 199–204) to pass the appropriate organizationId (for example, via authenticatedEnvironment.organizationId).
  • Otherwise, if the parameter is not part of the updated signature, this suggestion can be disregarded.

@@ -58,14 +58,14 @@ export type TriggerParams = {
sdkVersion?: string;
cliVersion?: string;
concurrencyKey?: string;
masterQueue: string;
masterQueue?: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify that all consumers of masterQueue handle undefined values

Making masterQueue optional could cause runtime errors if any code assumes this field always has a value. Ensure all code that consumes this field includes appropriate null/undefined checks.


🏁 Script executed:

#!/bin/bash
# Description: Find all uses of masterQueue from TriggerParams to verify they handle undefined values

# Look for usages of TriggerParams.masterQueue
rg -A 3 -B 3 "\bmasterQueue\b" --glob "*.ts" | grep -v "masterQueue\?:"

Length of output: 63436


Action: Audit and add safeguards for masterQueue usage

The field masterQueue defined as optional in
internal-packages/run-engine/src/engine/types.ts
is referenced in multiple locations without explicit checks for undefined. For example:

  • In internal-packages/run-engine/src/run-queue/keyProducer.ts, a template literal (${this._prefix}*${masterQueue}) concatenates its value directly. If masterQueue is undefined, it will produce an unexpected string (e.g., "prefix*undefined").
  • In internal-packages/run-engine/src/engine/index.ts, masterQueue is used to build arrays and is passed to methods without validating its presence.
  • Similar patterns appear in other consumers (e.g., files in apps/webapp/app/v3/services/triggerTaskV2.server.ts and worker group token service), where no explicit undefined guard is evident.

Although the tests haven’t reported runtime errors so far, these unchecked usages could lead to subtle bugs when masterQueue is not supplied. Please review each consumer:

• Ensure that methods consuming masterQueue either perform an explicit undefined check or utilize a sensible default value.
• Update the affected code (e.g., in keyProducer and engine/index) to guard against undefined values.

@ericallam ericallam merged commit e89fb92 into main Mar 5, 2025
6 of 15 checks passed
@ericallam ericallam deleted the run-engine-dev-worker-group branch March 5, 2025 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants