Copilot AI commented Sep 3, 2025

This PR addresses Issue #1973 by adding comprehensive tests to validate the reported bug where schema comparison dashboards only display 50 rows even when tables have more than 50 columns.

Problem

Users reported that when performing schema comparison on tables with more than 50 columns, the reconciliation dashboard only shows 50 columns in the schema details, despite validation occurring for all columns. This suggests a limitation in either the data persistence layer or dashboard display logic.

Solution

Added three targeted test cases to tests/integration/reconcile/test_schema_compare.py to systematically validate this bug:

Test Cases Added

  1. test_schema_compare_exactly_50_columns - Baseline test with exactly 50 columns to confirm normal operation
  2. test_schema_compare_51_columns_edge_case - Edge case test with 51 columns to identify the exact threshold where the bug manifests
  3. test_schema_compare_large_column_count_bug_validation - Clear reproduction case with 60 columns (see the sketch after this list)
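The following is a minimal sketch of what such a parametrized test could look like; it is not the code added in this PR. The `build_schema_rows` helper and the stubbed persistence round-trip are hypothetical stand-ins for the repository's real comparison pipeline and its `schema_fixture_factory`/`mock_spark` fixtures.

```python
import pytest


def build_schema_rows(column_count: int) -> list[dict]:
    """Hypothetical helper: one schema-comparison result row per column of a wide table."""
    return [
        {"column_name": f"col_{i}", "source_datatype": "int", "databricks_datatype": "int"}
        for i in range(column_count)
    ]


@pytest.mark.parametrize("column_count", [50, 51, 60])
def test_schema_compare_row_count_matches_column_count(column_count):
    rows = build_schema_rows(column_count)
    # In the real tests this is where the schema comparison runs and its persisted
    # output is read back from the recon tables; here that round-trip is stubbed out.
    persisted = list(rows)
    # The bug in issue #1973 would surface here: the persisted/displayed row count
    # should always equal the number of columns, never be capped at 50.
    assert len(persisted) == column_count
```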

Expected Behavior

If the bug exists:

  • 50-column test should pass (✅)
  • 51-column test should fail with descriptive error: "BUG DETECTED: Expected 51 rows but got 50" (❌)
  • 60-column test should fail with error: "BUG CONFIRMED: Expected 60 rows but got 50" (❌)

When the bug is fixed:

  • All tests should pass, returning row counts equal to the number of columns (✅)
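The "BUG DETECTED" / "BUG CONFIRMED" messages quoted above would typically come from an assertion along these lines (a sketch only; the variable names and exact wording in the test file may differ):

```python
expected_rows = 51  # number of columns in the test table (hypothetical value)
actual_rows = 50    # rows actually returned by the persistence/dashboard layer

assert actual_rows == expected_rows, (
    f"BUG DETECTED: Expected {expected_rows} rows but got {actual_rows}"
)
```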

Root Cause Analysis

Investigation points to the data persistence pipeline in recon_capture.py as the likely source of the issue:

  1. Schema results are stored using Spark's collect_list() to aggregate column data
  2. Dashboard queries use explode() to convert aggregated data back to individual rows
  3. There may be a default limit in this aggregation/explode process that caps results at 50 rows (see the sketch below)
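Neither collect_list() nor explode() imposes a row cap on its own, which is what the hypothesis above needs to rule in or out. The following self-contained PySpark sketch (not the actual recon_capture.py code; table and column names are invented) round-trips 60 per-column results through that aggregate-then-explode pattern:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.master("local[1]").appName("explode-roundtrip").getOrCreate()

# Simulate 60 per-column schema-comparison results for a single table.
rows = [("orders", f"col_{i}", "match") for i in range(60)]
df = spark.createDataFrame(rows, ["table_name", "column_name", "status"])

# Step 1: aggregate the per-column results into an array, as the persistence layer is described to do.
aggregated = df.groupBy("table_name").agg(
    F.collect_list(F.struct("column_name", "status")).alias("schema_details")
)

# Step 2: explode the array back into one row per column, as the dashboard query is described to do.
exploded = aggregated.select("table_name", F.explode("schema_details").alias("detail"))

# All 60 rows survive the round-trip, so a 50-row cap would have to come from
# somewhere else in the pipeline (for example a LIMIT in the dashboard query
# or a display setting), not from collect_list()/explode() themselves.
assert exploded.count() == 60
```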

Implementation Details

  • Tests follow existing patterns using schema_fixture_factory and mock_spark fixtures
  • Added comprehensive debug output for troubleshooting (see the sketch after this list)
  • Enhanced error messages clearly identify when the 50-row limit bug is detected
  • No changes to existing production code - only test additions for validation
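As an illustration of the debug-output and error-message bullets above, a failure helper along these lines could report which columns went missing when the cap is hit. This is a hypothetical sketch, assuming the result rows are dicts keyed by column_name as in the earlier sketch; it is not the code added in this PR.

```python
import pytest


def assert_row_count(persisted_rows: list[dict], expected_count: int) -> None:
    """Hypothetical helper: fail with enough context to debug a 50-row cap."""
    actual = len(persisted_rows)
    if actual != expected_count:
        present = {row["column_name"] for row in persisted_rows}
        expected = {f"col_{i}" for i in range(expected_count)}
        missing = sorted(expected - present)
        pytest.fail(
            f"BUG DETECTED: Expected {expected_count} rows but got {actual}; "
            f"first missing columns: {missing[:5]}"
        )
```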

These tests provide a foundation for confirming that the bug exists, validating any future fix, and preventing regressions.



Copilot AI and others added 2 commits September 3, 2025 04:42
…mparison (issue #1973)

Co-authored-by: sundarshankar89 <72757199+sundarshankar89@users.noreply.github.com>
…essages for issue #1973

Co-authored-by: sundarshankar89 <72757199+sundarshankar89@users.noreply.github.com>
Copilot AI changed the title from "[WIP] write a small test to validate the bug in test_schema_compare.py @databrickslabs/lakebridge/issues/1973 Raise only PRs do not apply changes directly to main" to "Add test validation for schema comparison 50-row limit bug (Issue #1973)" on Sep 3, 2025
Copilot finished work on behalf of sundarshankar89 September 3, 2025 04:45

github-actions bot commented Sep 3, 2025

✅ 21/21 passed, 1m7s total

Running from acceptance #2115
