Commit f7f80e8

Release v0.54.0 (#3530)
* Implement disposition field in SQL backend ([#3477](#3477)). In this release, we've added a new `query_statement_disposition` configuration option for the SQL backend used in the `databricks labs ucx` command-line interface. This option allows users to choose the disposition method for running large SQL queries during assessment results export, preventing failures in cases of large workspaces with high volumes of findings. The new option is included in the `config.yml` file and used in the SqlBackend definition. The commit also includes updates to the `workspace_cli.py` file and addresses issue [#3447](#3447). The `disposition` parameter has been added to the `StatementExecutionBackend` method, and the `Disposition` enum from the `databricks.sdk.service.sql` module has been added to the `config.py` file. The changes have been manually tested and are included in the modified `databricks labs install ucx` and `databricks labs ucx export-assessment` commands.
* AWS role issue with external locations pointing to the root of a storage account ([#3510](#3510)). This release includes a modification to enhance AWS role access for external locations pointing to the root of a storage account, addressing issue [#3510](#3510) and closing issue [#3505](#3505). The `aws.py` file in the `src/databricks/labs/ucx/assessment/` directory has been updated to improve S3 bucket ARN pattern matching, now allowing optional trailing slashes for greater flexibility. In the `access.py` file within the `aws` directory of the `databricks/labs/ucx` package, the `_identify_missing_paths` method now checks if the `role.resource_path` is a parent of the external location path or if they match exactly, allowing root-level external locations to be recognized as compatible with AWS roles. A new method, `AWSUCRoleCandidate`, has been added to the `AWSResources` class, and several test cases have been updated or added to ensure proper functionality with UC roles and AWS resources, including handling cases with multiple role creations.
* Added assert to make sure installation is finished before re-installation ([#3546](#3546)). In the latest release, we've addressed an issue (commit 3546) where the reinstallation of a software component was starting before the initial installation was complete, causing a warning message to be suppressed and the test to fail. To rectify this, we have enhanced the integration tests and added an assert to ensure that the installation is finished before attempting reinstallation. A new function called `wait_for_installation_to_finish` has been introduced to manage the waiting process. Furthermore, we have updated the `test_compare_remote_local_install_versions` function to accept `installation_ctx` instead of `ws` as a parameter, ensuring proper configuration and loading of the installation before test execution. These changes guarantee that the test will pass if the installation is finished before the reinstallation is attempted.
* Added dashboards to migration progress dashboard ([#3314](#3314)). The release notes have been updated to reflect the new features and changes in the migration progress dashboard. This commit includes the addition of dashboards to track the migration progress, with linting resources added to ensure code quality. The commit also modifies the existing dashboard "Migration [main]" and updates both unit and integration tests. Specific new files and methods have been added to enhance functionality, including the tracking of dashboard migration, and new fixtures have been introduced to improve testing. The changes depend on several issues and break up others to progress functionality. Overall, this commit enhances the migration progress dashboard's capabilities, making it more efficient and reliable for tracking migration progress.
* Added history log encoder for dashboards ([#3424](#3424)). A history log encoder for dashboards has been added, addressing issues [#3368](#3368) and [#3369](#3369), which modifies the existing `experimental-migration-progress` workflow. This enhancement introduces a `DashboardProgressEncoder` class that encodes Dashboard objects into Historical records, appending inventory snapshots to the history table. The changes include adding new methods for handling object types such as directories, and updating the `is_delta` property of the `Table` class. The commit also includes new tests: manually tested, unit tests added, and integration tests added. Specifically, `test_table_progress_encoder_table_failures` has been updated to include a new parameter, `is_migrated_table`, which, if set to False, adds `Pending migration` to the list of failures. The `is_used_table` parameter has been removed, and its functionality is no longer part of this commit. The changes are tested through manual, unit, and integration testing, ensuring the proper encoding of migration progress and identifying relevant failures.
* Create specific failure for Python syntax error while parsing with Astroid ([#3498](#3498)). In this release, the Python linting-related code has been updated to introduce a specific failure type for syntax errors that occur while parsing code using Astroid. Previously, such errors resulted in a generic `system-error` message, but with this change, a new failure type called `python-parse-error` has been introduced. This new error type includes the start and end line and column numbers of the error and is accompanied by a new issue URL for reporting the error on the UCX GitHub. The `system-error` failure type has been renamed to `python-parse-error` to maintain consistency with the `sql-parse-error` failure type. Additionally, a new method `Tree.maybe_parse()` has been introduced to improve error detection and reporting during Python linting. A unit test has been added to ensure the new failure type is working as intended, and a generic failure is kept for directing users to create GitHub issues for surfacing other issues.
* DBR 16 and later support ([#3481](#3481)). This release adds support for Databricks Runtime (DBR) 16 and later, enabling the optional conversion of Hive Metastore (HMS) tables to external tables within the `migrate-tables` workflow. The change includes a new static method `_get_entity_storage_locations` to check for the presence of the `entityStorageLocations` property on table metadata. The existing `_convert_hms_table_to_external` method has been updated to use this new method and to include the `entityStorageLocations` constructor argument if present. The changes have been manually tested for DBR 16, tested with existing integration tests for DBR 15, and verified on the staging environment with DBR 16. Additionally, the `skip_job_wait=True` parameter has been added to specific test function calls to improve test execution time. This release also resolves an issue with a failed test in DBR16 due to a JDK update.
* Delete stale code: `NotebookLinter._load_source_from_run_cell` ([#3529](#3529)). In this release, we have improved the code linting functionality in the NotebookLinter class of our open-source library by removing the `_load_source_from_run_cell` method in the sources.py file. This method, previously used to load source code from run cells in a notebook, has been identified as stale code and is no longer required. Consequently, this change affects the `databricks labs ucx lint-local-code` command and results in cleaner and more maintainable code. Furthermore, updated and added unit tests have been included in this commit, which have been manually tested to ensure that the changes do not adversely impact existing functionality, thus progressing issue [#3514](#3514).
* Exclude ucx dashboards from Lakeview dashboard crawler ([#3450](#3450)). In this release, the functionality of the `assessment` workflow has been improved to exclude certain dashboard IDs from the Lakeview dashboard crawler. This change has been made to address the issue of false positive dashboards and affects the `_crawl` method in the `dashboards.py` file. The excluded dashboard IDs are now obtained from the `install_state.dashboards` object. Additionally, new methods have been added to the `test_dashboards.py` file in the `unit/assessment` directory to test the exclusion functionality, including a test to ensure that the exclude parameter takes priority over the include parameter. The commit also includes unit tests, manual tests, and screenshots to verify the changes on the staging environment. Overall, this modification enhances the accuracy of the dashboard crawler and simplifies the process of identifying and assessing relevant dashboards.
* Fixed issue in installing UCX on UC enabled workspace ([#3501](#3501)). This pull request introduces changes to the UCX installer to address an issue ([#3420](#3420)) with installing UCX on UC-enabled workspaces. It updates the UCX policy by changing the `spark_version` parameter from `fixed` to `allowlist` with a default value, allowing the cluster definition to take `single_user` and `user_isolation` values instead of `Legacy_Single_User` and `Legacy_Table_ACL`. Additionally, the job definition has been updated to use the default value when not explicitly provided. The changes are implemented in the `test_policy.py` file and impact the `test_job_cluster_policy` and `test_job_cluster_on_uc_enabled_workspace` methods. The pull request also includes updates to unit tests and integration tests to ensure the correct behavior of the updated UCX policy and job definition. The target audience is software engineers adopting this project, with changes involving adjusting policy definitions and testing job cluster behavior under different configurations. Issue [#3501](#3501) is also resolved with these changes.
* Fixed typo in workflow name (in error message) ([#3491](#3491)). This PR includes a fix for a minor typo in the error message of the `validate_groups_permissions` method in the `workflows.py` file. The typo resulted in the incorrect spelling of `group` as `groups` in the workflow name. The fix simply changes `groups` to `group` in the error message, ensuring accurate workflow name display. The functionality of the code remains unaffected by this change, and no new methods have been added. To clarify, the `validate_groups_permissions` method verifies whether group permissions have been migrated correctly, and if not, raises a ValueError with an error message suggesting the use of the `validate-group-permissions` workflow for validation after the API has caught up. This fix resolves the typo issue and maintains the expected behavior of the code.
* Make link to issue template url safe ([#3508](#3508)). In this commit, the `_definitely_failure` function in the `python_ast.py` file has been modified to make the link to the issue template URL safe using Python's `urllib`. This change ensures that any special characters in the source code passed to the function will be properly displayed in the issue template. If the source code cannot be parsed, the function creates a link to the issue template for reporting a bug in the UCX library, including the source code as part of the issue body. With this commit, the source code is now passed through the `urllib.parse.quote_plus` function before being added to the issue body, making it url-safe and improving the robustness and user-friendliness of the library. This change has been introduced in issue [#3498](#3498) and has been manually tested.
* Refactor `PipelineMigrator`'s to add `include_pipeline_ids` ([#3495](#3495)). In this refactoring, the `PipelineMigrator` has been updated to introduce an `include_pipeline_ids` option, replacing the previous `skip_pipeline_ids` flag. This change allows users to specify the list of pipelines to migrate, providing better control over the migration process. The `PipelinesMigrator` constructor, `_get_pipelines_to_migrate`, and `migrate_pipelines` methods have been modified to accommodate this new flag. The `_migrate_pipeline` method now accepts the pipeline ID instead of a `PipelineInfo` object. Additionally, the unit tests have been updated to include the new `include_flag` parameter, which facilitates testing various scenarios with different pipeline lists. Although the commit does not show changes to test files, integration tests should be updated to reflect the new `include-pipeline-ids` flag functionality. This improvement resolves issue [#3492](#3492) and enhances the overall flexibility of the `PipelineMigrator`.
* Rename Python AST's `Tree` methods for clarity ([#3524](#3524)). In this release, the `Tree` class in the Python AST library has been updated for improved code clarity and functionality. The `append_` methods have been renamed to `attach_` for better accuracy, and now include docstrings for increased understanding. These methods have been updated to always return `None`. A new method, `attach_child_tree`, has been added, allowing for traversal from both parent and child and propagating any module references. Several new methods and functionalities have been introduced to improve the class, while extensive unit testing has been conducted to ensure functionality. Additionally, the diff includes test cases for various functionalities, such as inferring values when attaching trees and verifying spark module propagation, as well as tests to ensure that certain operations are not supported. This change, linked to issues [#3514](#3514) and [#3520](#3520), may affect any code that calls these methods and relies on their return values. However, the added docstrings and unit tests will help ensure your code continues to function correctly.
* Schedule the migration progress workflow to run daily ([#3485](#3485)). This PR introduces changes to the UCX installation process to schedule the migration progress workflow to run automatically once a day, with the default schedule set to run at 5 a.m. UTC. It includes refactoring the plumbing used for managing and installing workflows, enabling them to have a Cron-based schedule. The relevant user documentation has been updated, and the existing `migration-progress-experimental` workflow has been modified. Additionally, unit and integration tests have been added/modified to ensure the proper functioning of the updated code, and new functions have been added to verify the workflow's schedule and task detection.
* Scope crawled pipelines in PipelineCrawler ([#3513](#3513)). In this release, the `PipelineCrawler` class in the `pipelines.py` file has been updated to include a new optional argument `include_pipeline_ids` in its constructor. This argument allows users to filter the pipelines that are crawled by specifying a list of pipeline IDs. The `_crawl` method has been modified to check if `include_pipeline_ids` is not `None` and to filter the list of pipelines accordingly. The class now also checks if each pipeline exists before getting its configuration, and logs a warning message if the pipeline is not found. Previously, a `NotFound` exception was raised. Additionally, the code has been updated to use `pipeline.spec.configuration` instead of `pipeline_response.spec.configuration` to get the pipeline configuration. These changes have been tested through new and updated unit tests, including a test for handling creators' user names. Overall, these updates provide improved functionality and flexibility for crawling pipelines.
* Updated databricks-labs-blueprint requirement from <0.10,>=0.9.1 to >=0.9.1,<0.11 ([#3519](#3519)). In this release, we have updated the version requirement of the `databricks-labs-blueprint` package to be greater than or equal to 0.9.1 and less than 0.11. This change allows us to use the latest version of the package and includes bug fixes and dependency updates. The hosted runner has been patched in version 0.10.1 to address issues with publishing artifacts in the release workflow. Release notes for previous versions are also provided in the commit. These updates are intended to improve the overall functionality and stability of the library.
* Updated databricks-sdk requirement from <0.41,>=0.40 to >=0.40,<0.42 ([#3553](#3553)). In this release, the `databricks-sdk` package requirement has been updated to version 0.41.0, which brings new features, improvements, bug fixes, and API changes. Among the new features are the addition of `serving.http_request` for calling external functions, and recovery on download failures in the Files API client. Although the specifics of the functionality added and changed are not detailed, the focus of this release appears to be on bug fixes and internal enhancements. Additionally, the API has undergone changes, including added and altered methods and fields, however, specific information about these changes has not been provided in the release notes.
* Updated sqlglot requirement from <26.1,>=25.5.0 to >=25.5.0,<26.2 ([#3500](#3500)). A critical update has been implemented in this release for the `sqlglot` package, which has been updated to version 25.5.0 or higher, but less than 26.2. This change is essential to leverage the latest version of sqlglot while avoiding any breaking changes introduced in version 26.1. The new version includes several breaking changes, new features, bug fixes, and modifications to various dialects such as hive, postgres, tsql, and sqlite. Moreover, the tokenizer has been updated to accept underscore-separated number literals. However, the specific impact of these changes on the project is not detailed in the commit message, and software engineers should thoroughly test and review the changes to ensure seamless functionality.
* Updated sqlglot requirement from <26.2,>=25.5.0 to >=25.5.0,<26.3 ([#3528](#3528)). In this update, we have modified the version constraint for the `sqlglot` dependency from `>=25.5.0,<26.2` to `>=25.5.0,<26.3` in the `pyproject.toml` file. Sqlglot is a Python-based SQL parser and optimizer, and this change allows us to adopt the latest version of sqlglot within the specified version range. This update addresses potential security vulnerabilities and incorporates performance enhancements and bug fixes, ensuring that our library remains up-to-date and secure.
* Updated table-migration workflows to also capture updated migration progress into the history log ([#3239](#3239)). This pull request updates the table-migration workflows to log not only the tables that still need to be migrated, but also the progress of the migration. The affected workflows include `migrate-tables`, `migrate-external-hiveserde-tables-in-place-experimental`, `migrate-external-tables-ctas`, `scan-tables-in-mounts-experimental`, and `migrate-tables-in-mounts-experimental`. The encoder for table-history has been refactored to improve control over when the `TableMigrationStatus` data is refreshed. The documentation has been updated to reflect the changes in each workflow. Additionally, both unit and integration tests have been added and updated to ensure the changes work as intended and resolve any conflicts. A new `ProgressTrackingInstallation` class has been added to support this functionality. The changes have been manually tested and include modifications to the existing workflows, new methods, and a renamed method. The `mock_workspace_client` function has been replaced, and the `external_locations.resolve_mount` method and other methods have not been called. The `TablesCrawler` object's `snapshot` method has been called once to retrieve the list of tables in the Hive metastore. The migration record workflow run is also updated to include the workflow run information in the `workflow_runs` table. These changes are expected to improve the accuracy and reliability of the table-migration workflows.

Dependency updates:

* Updated sqlglot requirement from <26.1,>=25.5.0 to >=25.5.0,<26.2 ([#3500](#3500)).
* Updated databricks-labs-blueprint requirement from <0.10,>=0.9.1 to >=0.9.1,<0.11 ([#3519](#3519)).
* Updated databricks-sdk requirement from <0.41,>=0.40 to >=0.40,<0.42 ([#3553](#3553)).
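
The URL-safe issue link described for #3508 can be illustrated with a minimal sketch. Only `urllib.parse.quote_plus` comes from the release notes; the helper name `build_issue_url`, the issue title, and the query-string layout are assumptions made for illustration and are not the code in `python_ast.py`.

```python
from urllib.parse import quote_plus, unquote_plus

# Hypothetical new-issue endpoint; the template link UCX actually uses may differ.
NEW_ISSUE_URL = "https://github.com/databrickslabs/ucx/issues/new"


def build_issue_url(source_code: str) -> str:
    """Return a link whose query string carries the unparsable source code."""
    # quote_plus percent-encodes reserved characters (&, =, +, newlines)
    # and turns spaces into '+', so the source cannot break the query string.
    title = quote_plus("Python parse failure")
    body = quote_plus(f"Source code that failed to parse:\n\n{source_code}")
    return f"{NEW_ISSUE_URL}?title={title}&body={body}"


url = build_issue_url("print('a & b')\nx = 1 +")
print(url)
# Decoding the body parameter recovers the original text unchanged.
print(unquote_plus(url.split("&body=")[1]))
```

Without the encoding step, the `&` and `+` in the sample source would be read as query-string syntax and the issue body would be truncated or altered.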
1 parent 68257d4 commit f7f80e8

2 files changed: +25 -1 lines changed

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -1,5 +1,29 @@
# Version changelog

## 0.54.0

* Implement disposition field in SQL backend ([#3477](https://github.com/databrickslabs/ucx/issues/3477)). This commit introduces the `query_statement_disposition` configuration value to handle large SQL queries during assessment results export for workspaces with numerous findings. A new parameter is added to the `config.yml` file, allowing users to specify the disposition method for running large SQL statements. The modification includes changes to the `databricks labs install ucx` and `databricks labs ucx export-assessment` commands and updates to the `SqlBackend` definition. The `Disposition` enum is used to specify the disposition method in tests, and the changes have been manually verified. This feature, developed by Michele Daddetta and Guenia Izquierdo Delgado, resolves issue [#3447](https://github.com/databrickslabs/ucx/issues/3447) and is based on changes from PR [#3455](https://github.com/databrickslabs/ucx/issues/3455).
* AWS role issue with external locations pointing to the root of a storage account ([#3510](https://github.com/databrickslabs/ucx/issues/3510)). In this release, the `AWSResources` class in `aws.py` has been updated to improve S3 bucket ARN pattern matching by modifying the regular expression pattern for matching. The `_identify_missing_paths` function in `access.py` has been enhanced to check for AWS role compatibility with external locations that point to the root of a storage account using `PurePath` class. Additionally, new unit tests have been added to `tests/unit/aws/test_access.py` to ensure the correct creation of all necessary UC roles, including the new external location `s3://BUCKET4` with an appropriate access level. These changes improve the accuracy of ARN pattern matching and enhance compatibility checking and testing for AWS roles and external locations. This release is part of the ongoing development of the AWS assessment tool and addresses issues [#3510](https://github.com/databrickslabs/ucx/issues/3510) and [#3505](https://github.com/databrickslabs/ucx/issues/3505).
* Added dashboards to migration progress dashboard ([#3314](https://github.com/databrickslabs/ucx/issues/3314)). This commit, co-authored by Guenia Izquierdo Delgado, modifies the migration progress dashboard to include linting resources, adds new dashboards, and improves overall functionality and maintainability. The changes include modifying the existing 'Migration [main]' dashboard and updating associated unit and integration tests. New dashboards such as `Dashboards migrated` and `Dashboard pending migration` provide valuable insights into the migration progress, displaying successful migrations and pending migration status by owner. The commit also reorganizes some existing queries and adds new methods to support the new functionality, addressing dependencies from issue [#3424](https://github.com/databrickslabs/ucx/issues/3424) and progressing work on issue [#3045](https://github.com/databrickslabs/ucx/issues/3045), while breaking up issue [#3112](https://github.com/databrickslabs/ucx/issues/3112).
* Added history log encoder for dashboards ([#3424](https://github.com/databrickslabs/ucx/issues/3424)). This commit introduces a history log encoder for dashboards in the context of a larger application, addressing issues [#3368](https://github.com/databrickslabs/ucx/issues/3368) and [#3369](https://github.com/databrickslabs/ucx/issues/3369). The `experimental-migration-progress` workflow has been modified, and new classes, properties, and methods have been added to handle dashboard-related progress encoding. Specifically, the `Dashboard` class, `DashboardOwnership` class, and `DashboardProgressEncoder` class have been introduced, along with several methods for assessing dashboard ownership. These changes are tested through manual testing, unit tests, and integration tests. Additionally, the existing `TableProgressEncoder` class has been updated with new tests for failure scenarios involving tables that have not been migrated. The `WorkspacePathOwnership` method has been added to determine the owner of a given workspace path, and a new unit test has been added to test table creation from historical data.
* Create specific failure for Python syntax error while parsing with Astroid ([#3498](https://github.com/databrickslabs/ucx/issues/3498)). This commit enhances the Python linting functionality in our open-source library by introducing a specific failure message for syntax errors that occur during code parsing with Astroid. Previously, a generic `system-error` message was displayed, which provided limited guidance for users. Now, a new failure type called `python-parse-error` is displayed when a SyntaxError is raised during parsing, with detailed information such as the error message, line, and column numbers. This change aligns the failure type with `sql-parse-error` and adds a default GitHub issue template to report the error. Additionally, the commit renames `system-error` to `python-parse-error` to maintain consistency and updates the README to explain the new failure type. The commit also includes new unit tests to ensure that the new failure type is being handled correctly, and modifies the Python linting-related code to add a new method `Tree.maybe_parse()` to handle syntax errors.
* DBR 16 and later support ([#3481](https://github.com/databrickslabs/ucx/issues/3481)). This pull request introduces support for Databricks Runtime (DBR) 16 in the optional conversion of Hive Metastore (HMS) tables to external tables within the `migrate-tables` workflow. The update includes modifications to the existing `migrate-tables` workflow, such as the addition of a `_get_entity_storage_locations` method to check for the presence of the `entityStorageLocations` property in the table metadata, which is required for the `CatalogTable` constructor in DBR 16.0. The changes have been tested manually on DBR16, passed integration tests on DBR15, and verified on a staging environment using DBR16. Additionally, the `test_running_real_assessment_job` function in `test_workflows.py` has been updated to include the `skip_job_wait=True` parameter when running the `run_workflow` method for the `assessment` workflow, improving testing efficiency. The commit also includes a deprecated test case for converting managed tables to external before migrating, with a note about its failure from DBR 16.0 onwards due to a JDK update. The test case remains unchanged, but the note serves as a reminder for further investigation. The `run_workflow` function in the test cases has been modified to include a `skip_job_wait` parameter, allowing tests to bypass waiting for job completion, reducing overall test runtime and improving the developer experience.
* Exclude ucx dashboards from Lakeview dashboard crawler ([#3450](https://github.com/databrickslabs/ucx/issues/3450)). In this release, we have introduced modifications to the `assessment` workflow, specifically in the `dashboards.py` file, to exclude dashboards from the UCX package in the Lakeview dashboard crawler and prevent false positives. The `lakeview_crawler` method in the `application.py` file has been updated to include a new argument `exclude_dashboard_ids`, set to the list of dashboard IDs in the `install_state.dashboards` object. This ensures that these dashboards are excluded from the crawler. Additionally, two new unit tests have been added to ensure the exclusion functionality works correctly. The first test checks if the crawler skips the dashboard with the ID specified in the `exclude_dashboard_ids` parameter, and the second test ensures that the `exclude_dashboard_ids` parameter takes priority over the `include_dashboard_ids` parameter when both are provided. The changes have been manually tested and verified on the staging environment, and the linked issue [#3441](https://github.com/databrickslabs/ucx/issues/3441) has been resolved.
* Fixed issue in installing UCX on UC enabled workspace ([#3501](https://github.com/databrickslabs/ucx/issues/3501)). In this release, we have updated the UCX policy definition for `spark_version` from a fixed value to an allowlist with a default value. This change resolves an issue where enabling UC on a workspace caused the cluster definition to take on `single_user` and `user_isolation` values instead of `Legacy_Single_User` and `Legacy_Table_ACL`. The policy was found to be overriding these values, and changing `spark_version` from fixed to allowlist resolved the issue. The job definition now sets `apply_policy_default_values` to true, so the policy default is used when no value is provided. This change resolves issue [#3420](https://github.com/databrickslabs/ucx/issues/3420). Apart from a new test, `test_job_cluster_on_uc_enabled_workspace`, which verifies that the correct data security modes are set for the different job clusters when installing on a UC-enabled workspace, no new methods have been added and existing functionality has not been significantly altered. Testing this change involves the updated unit tests, integration tests, and a static installation test. The changes are backward compatible. The modification to the UCX policy ensures that the correct spark version and node type are selected, while also allowing for flexibility in data security modes.
* Fixed typo in workflow name (in error message) ([#3491](https://github.com/databrickslabs/ucx/issues/3491)). This PR fixes a minor typo in an error message that appears when group permissions fail to migrate successfully. The typo, found in the name of the workflow for validating permissions, has been corrected from `validate-group-permissions` to `validate-groups-permissions`, so the message now points users to the right workflow when group permissions need to be re-validated. No new methods have been introduced, and the change does not impact any other parts of the codebase.
* Refactor `PipelineMigrator`'s to add `include_pipeline_ids` ([#3495](https://github.com/databrickslabs/ucx/issues/3495)). In this release, the `PipelineMigrator` class in the `pipelines_migrate.py` file has been refactored to enhance the pipeline migration process. The refactor introduces a new parameter `include_pipeline_ids`, which allows users to specify a list of pipelines to migrate. Previously, users could only skip pipelines that were already migrated or explicitly specified using the `skip_pipeline_ids` parameter. With this refactor, users now have more control over the migration process by being able to explicitly include and exclude pipelines using the `include_pipeline_ids` and `exclude_pipeline_ids` parameters, respectively. Additionally, the implementation of the `PipelineMigrator` class has been simplified, and unit tests and integration tests have been updated to reflect these changes. As a software engineer, it is important to thoroughly test and validate this new behavior to ensure compatibility with existing systems.
* Schedule the migration progress workflow to run daily ([#3485](https://github.com/databrickslabs/ucx/issues/3485)). This PR introduces a daily schedule for the UCX installation's migration progress workflow, refactoring workflow management/installation plumbing to enable Cron-based scheduling and setting the default schedule for the migration progress workflow to run at 5 a.m. UTC. Relevant user documentation has been updated, and the existing `migration-progress-experimental` workflow has been modified. New test methods have been added to check for the presence of workflows and tasks, as well as validate the workflow's schedule and pause status. These changes improve automation and maintainability of the UCX installation process, while ensuring that existing functionalities are working correctly.
* Scope crawled pipelines in PipelineCrawler ([#3513](https://github.com/databrickslabs/ucx/issues/3513)). In this release, the `PipelineCrawler` class in the `databricks/labs/ucx/assessment` directory has been updated with a new optional argument `include_pipeline_ids` in the constructor. This argument is a list of strings that represent the IDs of pipelines to be crawled. If not provided, all pipelines will be crawled. The `_crawl` method has been modified to accept a list of pipeline IDs and now obtains a list of pipeline IDs instead of pipeline objects. For each pipeline ID, the method tries to get the pipeline and extract its configuration, while also checking for any failures. Additionally, assertions have been added to ensure that the `pipeline_id` and `spec.configuration` attributes are not `None`. A new test function `test_include_pipeline_ids()` has been introduced to verify the functionality of this argument. These changes improve the functionality of the `PipelineCrawler` class by allowing users to crawl specific pipelines based on their IDs.
* Updated databricks-labs-blueprint requirement from <0.10,>=0.9.1 to >=0.9.1,<0.11 ([#3519](https://github.com/databrickslabs/ucx/issues/3519)). In this update, the requirement for the `databricks-labs-blueprint` package now accepts versions greater than or equal to 0.9.1 and strictly less than 0.11; previously the upper bound was strictly less than 0.10. This change allows the latest version of the package to be used. Additionally, the commit includes release notes, a changelog, and commit information for the updated package, as well as instructions for Dependabot commands and options. The changes are limited to the `pyproject.toml` file and do not have any impact on other parts of the codebase.
* Updated sqlglot requirement from <26.1,>=25.5.0 to >=25.5.0,<26.2 ([#3500](https://github.com/databrickslabs/ucx/issues/3500)). In this pull request, we have updated the version requirement for the `sqlglot` dependency in the `pyproject.toml` file. The previous version constraint was for a version greater than or equal to 25.5.0 and less than 26.1, but it has been relaxed to permit versions greater than or equal to 25.5.0 and less than 26.2. This change was made to enable the use of the latest version of `sqlglot`, which includes several new features, bug fixes, and breaking changes as detailed in the 26.1.0 changelog. We have also included the commit history for the `sqlglot` repository to provide further context and reference. This update aims to ensure compatibility with the latest version of `sqlglot` while also providing transparency regarding the changes implemented.
* Updated sqlglot requirement from <26.2,>=25.5.0 to >=25.5.0,<26.3 ([#3528](https://github.com/databrickslabs/ucx/issues/3528)). In this release, we have updated the required version constraint of the `sqlglot` library in the `pyproject.toml` file. The previous constraint `>=25.5.0,<26.2` has been updated to `>=25.5.0,<26.3`. This change allows the project to utilize the latest version of `sqlglot` within the newly specified range while maintaining compatibility with the project's existing requirements. Notably, this update does not introduce any new methods to the project; it only affects the version constraint for the `sqlglot` library. Software engineers integrating this project can now benefit from the latest `sqlglot` versions within the specified range.
* Updated table-migration workflows to also capture updated migration progress into the history log ([#3239](https://github.com/databrickslabs/ucx/issues/3239)). This pull request enhances the table-migration workflows by logging updated migration progress in the history log, providing improved visibility into the migration process. The workflows, including `migrate-tables`, `migrate-external-hiveserde-tables-in-place-experimental`, `migrate-external-tables-ctas`, `scan-tables-in-mounts-experimental`, and `migrate-tables-in-mounts-experimental`, have been updated to include this new logging functionality. In addition to these changes, the documentation has been updated to reflect which workflows update which tables, and the `TableMigrationStatus` data initialization behavior has been modified. New and updated unit and integration tests cover the changes, which have also been manually tested. Co-authored by Serge Smertin and Cor Zuurmond.
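
The pipeline scoping described in the `PipelineMigrator` and `PipelineCrawler` entries above can be sketched roughly as follows. This is an illustrative sketch, not the UCX implementation: `select_pipeline_ids` is a hypothetical helper, and only the parameter names `include_pipeline_ids` and `exclude_pipeline_ids` come from the entries above. Letting exclusions win over inclusions is also an assumption.

```python
from __future__ import annotations


def select_pipeline_ids(
    all_pipeline_ids: list[str],
    include_pipeline_ids: list[str] | None = None,
    exclude_pipeline_ids: list[str] | None = None,
) -> list[str]:
    """Hypothetical helper (not part of UCX) that scopes a list of pipeline IDs."""
    # No include list means every known pipeline is a candidate.
    if include_pipeline_ids is None:
        candidates = list(all_pipeline_ids)
    else:
        included = set(include_pipeline_ids)
        candidates = [pid for pid in all_pipeline_ids if pid in included]
    # Assumption: exclusions are applied last, so they win over inclusions.
    excluded = set(exclude_pipeline_ids or [])
    return [pid for pid in candidates if pid not in excluded]
```

Calling the helper without any optional argument returns every ID unchanged, which mirrors the "if not provided, all pipelines will be crawled" behavior described above.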
Dependency updates:
* Updated sqlglot requirement from <26.1,>=25.5.0 to >=25.5.0,<26.2 ([#3500](https://github.com/databrickslabs/ucx/pull/3500)).
* Updated databricks-labs-blueprint requirement from <0.10,>=0.9.1 to >=0.9.1,<0.11 ([#3519](https://github.com/databrickslabs/ucx/pull/3519)).
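
To make the effect of the relaxed upper bounds concrete, the sketch below checks a version against a `>=lower,<upper` range. It is a deliberately simplified numeric comparison, not a full PEP 440 implementation, and the helper names `parse` and `in_range` are hypothetical.

```python
def parse(version: str, width: int = 3) -> tuple[int, ...]:
    """Turn '0.10' into (0, 10, 0); only plain numeric versions are supported."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))  # pad so '0.10' and '0.10.0' compare equal
    return tuple(parts)


def in_range(version: str, lower: str, upper: str) -> bool:
    """Check lower <= version < upper, i.e. a '>=lower,<upper' requirement."""
    return parse(lower) <= parse(version) < parse(upper)


# databricks-labs-blueprint 0.10.0 is rejected by the old bound, accepted by the new one.
old_ok = in_range("0.10.0", "0.9.1", "0.10")
new_ok = in_range("0.10.0", "0.9.1", "0.11")
```

Under these assumptions `old_ok` is `False` and `new_ok` is `True`; the same check applies to the `sqlglot` bounds.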
## 0.53.1

* Removed `packaging` package dependency ([#3469](https://github.com/databrickslabs/ucx/issues/3469)). In this release, we have removed the dependency on the `packaging` package in the open-source library to address a release issue. The import statements for `packaging.version.Version` and `packaging.version.InvalidVersion` have been removed. The function `_external_hms` in the `federation.py` file has been updated to retrieve the Hive Metastore version using the `spark.sql.hive.metastore.version` configuration key and validate it using a regular expression pattern. If the version is not valid, the function logs an informational message and returns `None`. This change modifies the Hive Metastore version validation logic and improves the overall reliability and maintainability of the library.
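
The regex-based validation described above might look roughly like the sketch below. It is not the code from `federation.py`: the function name `hms_version`, the plain `dict` standing in for the Spark configuration, and the exact pattern (major.minor with an optional patch component) are all assumptions made for illustration. Only the configuration key comes from the entry above.

```python
from __future__ import annotations

import logging
import re

logger = logging.getLogger(__name__)

# Assumed pattern: digits.digits with an optional .digits patch part.
_VERSION_PATTERN = re.compile(r"\d+\.\d+(?:\.\d+)?")


def hms_version(spark_conf: dict[str, str]) -> str | None:
    """Hypothetical helper: return the configured Hive Metastore version, or None."""
    raw = spark_conf.get("spark.sql.hive.metastore.version")
    if raw is None or not _VERSION_PATTERN.fullmatch(raw):
        # Mirrors the described behavior: log at info level and return None.
        logger.info("Unsupported or missing Hive Metastore version: %s", raw)
        return None
    return raw
```

Using `fullmatch` rather than `match` keeps values with trailing text, such as `2.3.9-beta`, from being accepted by this sketch.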

src/databricks/labs/ucx/__about__.py

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 # DO NOT MODIFY THIS FILE
-__version__ = "0.53.1"
+__version__ = "0.54.0"
