Releases: databrickslabs/blueprint
Release v0.11.4
What's Changed
- Added Password Prompt to operate with echo off in terminal (#265). The command-line interface now includes a secure password prompt feature, allowing users to enter sensitive information without it being visible on the screen. This is achieved through a new method that utilizes the getpass library to hide user input when entering passwords, taking a prompt message and an optional maximum number of attempts as parameters. The method repeatedly prompts the user for a password until a valid input is provided or the maximum number of attempts is reached, at which point it raises a ValueError. This addition enhances the security and usability of the interface, and its functionality is validated through new test methods that cover both successful password entry and the scenario where the maximum number of attempts is exceeded, ensuring the feature behaves as expected in various situations.
- Sniff encoding properly in XML files with a standalone directive (#256). The XML file encoding detection has been improved to support a wider range of valid XML declarations. The regular expression used to match XML declarations has been updated to correctly handle declarations in which both the encoding and standalone attributes are present. This change enables more accurate detection of the encoding attribute in XML files, even when a standalone directive is present. Additionally, test functions have been added and modified to verify this functionality, including tests for XML files with a BOM prefix and those with an XML standalone declaration, ensuring that the code can correctly read these files and detect the encoding.
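The password prompt described in #265 can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: the function name `prompt_password`, the `reader` parameter (added here so the loop can be exercised without a terminal), the default of 10 attempts, and the rule that only non-empty input counts as valid are all assumptions.

```python
import getpass
from collections.abc import Callable


def prompt_password(
    text: str,
    *,
    max_attempts: int = 10,
    reader: Callable[[str], str] = getpass.getpass,
) -> str:
    """Ask for a password with terminal echo off, retrying on empty input.

    Hypothetical helper: the name, defaults and validity rule are assumptions.
    """
    for _ in range(max_attempts):
        # getpass.getpass() reads a line without echoing it to the terminal.
        value = reader(f"{text}: ")
        if value:
            return value
        print("Password cannot be empty, please try again.")
    raise ValueError(f"cannot get a password after {max_attempts} attempts")
```

Passing a custom `reader` is purely a testing convenience in this sketch; in interactive use the default `getpass.getpass` is what keeps the input hidden.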
Full Changelog: v0.11.3...v0.11.4
v0.11.3
- Fixed configuration file unmarshalling of JSON floating-point values (#253). The unmarshalling of primitive types has been improved to ensure accurate conversion and prevent potential errors or data corruption. The updated functionality now correctly handles the conversion of JSON floating-point values to integers by refusing to truncate precision and instead raising a `SerdeError` when necessary. Additionally, string-to-boolean conversions are now strictly validated to only accept `true` or `false` (case-insensitive). Furthermore, configuration file unmarshalling has been enhanced with additional type checks to verify that loaded values match the expected types, such as strings, integers, and floats, thereby preventing incorrect type conversions and ensuring that the loaded data retains its original precision and type.
- Fixed unmarshalling of forward references on Python ≥ 3.12.4 (#252). The library's unmarshalling functionality has been updated to support forward references on Python versions 3.12.4 and later, which introduced changes to the `_evaluate` method of `ForwardRef`. A new internal utility method has been added to handle these changes, ensuring compatibility with different Python versions by conditionally passing the `recursive_guard` parameter as a keyword argument and including additional type information. Additionally, a new test function has been introduced to verify the correct handling of forward references in class fields, simulating future annotations and testing the save and load process of an instance with various field types, including strings, integers, and JSON values, to ensure that forward references are correctly resolved.
- Support detecting file encoding from XML declaration (#254). The library's file encoding detection capabilities have been enhanced to support XML files, allowing for the extraction of encoding information from the XML declaration at the start of the file. A new detection method has been introduced, which reads the initial bytes of the file to determine the potential encoding, attempts to decode the XML declaration, and returns the specified encoding if successful. This functionality is accessible through the updated `decode_with_bom` and `read_text` functions, which now accept an optional `detect_xml` parameter to enable XML declaration-based encoding detection. If no encoding is detected via the byte order mark (BOM) or XML declaration, the library defaults to the locale's preferred encoding. Additionally, new test cases have been added to verify the correct detection of XML file encodings, including scenarios with encoding declarations, byte order marks, and default UTF-8 encoding.
Contributors: @asnare, @sundarshankar89
v0.11.2
- Allow login URLs as profile `host` when configuring the workspace/admin client for CLI commands (#250). The handling of the `DATABRICKS_HOST` environment variable has been modified to ensure consistent normalization of the host URL with the Databricks Go SDK, resolving a host normalization issue that previously arose from differences in SDK implementations. Two new methods, `fix_databricks_host` and `_patch_databricks_host`, have been introduced to emulate the Go SDK's host normalization and update the environment variable if necessary. The `fix_databricks_host` method normalizes the host URL by parsing it and creating a new URL instance with empty path, parameters, query, and fragment if the netloc is empty, while the `_patch_databricks_host` method checks and updates the `DATABRICKS_HOST` environment variable accordingly. This change enables the Python SDK to receive a normalized host URL, allowing the labs CLI integration to work correctly, and updates the `needs_workspace_client` and `is_account` checks to use the normalized host URL when creating workspace or account clients. Additionally, several unit tests have been added to verify the correctness of the normalization and patching functionality for different host value types and client scenarios.
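The general shape of this kind of host normalization can be sketched with `urllib.parse`. The names `normalize_host` and `patch_host_env` are hypothetical, and the exact rules (in particular, leaving values without a network location untouched) are a simplifying assumption rather than a description of what `fix_databricks_host` or the Go SDK actually do.

```python
from collections.abc import MutableMapping
from urllib.parse import urlparse


def normalize_host(host: str) -> str:
    """Strip path, params, query and fragment from a URL-like host value.

    Assumption: values without a network location are returned unchanged.
    """
    parsed = urlparse(host)
    if not parsed.netloc:
        return host
    return parsed._replace(path="", params="", query="", fragment="").geturl()


def patch_host_env(environ: MutableMapping[str, str]) -> None:
    """Rewrite DATABRICKS_HOST in place when normalization changes it."""
    host = environ.get("DATABRICKS_HOST")
    if host is None:
        return
    fixed = normalize_host(host)
    if fixed != host:
        environ["DATABRICKS_HOST"] = fixed
```

Taking the environment as a mapping argument keeps the sketch easy to test; in a real CLI entry point one would pass `os.environ`.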
Contributors: @asnare
v0.11.1
- Expose the number of available CPUs for concurrent processing (#244). The library now provides a method to determine the number of logical CPUs available for the current process, considering factors such as containerized environments where the available CPU quota may differ from the total number of CPUs present. This method checks for the availability of the `process_cpu_count` attribute, and if not available, attempts to use the `sched_getaffinity` function on Linux or falls back to the total number of CPUs in the system, defaulting to 1 if unknown. The `gather` method has been updated to utilize this new method, allowing for more accurate determination of the available CPU count and improved concurrency. Additionally, several test cases have been added to verify the correct behavior of the method, including scenarios where the count is retrieved from different sources, ensuring a reliable way to determine the available CPU count for configuring concurrent processing in downstream applications.
- Improve support for reading text files that contain a Unicode BOM at the start (#243). The library now provides enhanced support for reading text files that contain a Unicode Byte Order Mark (BOM) at the start, allowing for accurate detection and handling of the file's encoding. New methods have been introduced to detect the BOM and decode the file accordingly, including handling of decoding errors and newline characters. The `read_text` function has been added, enabling the reading of text files with a BOM prefix, and is designed to work with both seekable and non-seekable files, although specifying a read size for non-seekable files will raise an error. Additionally, the existing code for handling Workspace files has been refactored to utilize the same implementation, and improvements have been made to support non-seekable files, ensuring a more robust and reliable reading experience for text files with Unicode BOM markers.
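The CPU-count fallback chain described for #244 could look roughly like this. The function name `available_cpu_count` is an assumption; only the standard-library calls (`os.process_cpu_count` on Python 3.13+, `os.sched_getaffinity` where the platform provides it, and `os.cpu_count`) are real.

```python
import os


def available_cpu_count() -> int:
    """Best-effort number of logical CPUs usable by the current process."""
    # Python 3.13+ exposes the per-process count directly.
    process_cpu_count = getattr(os, "process_cpu_count", None)
    if process_cpu_count is not None:
        count = process_cpu_count()
        if count:
            return count
    # On Linux (and some other Unixes) the scheduler affinity mask is available.
    sched_getaffinity = getattr(os, "sched_getaffinity", None)
    if sched_getaffinity is not None:
        return len(sched_getaffinity(0)) or 1
    # Otherwise fall back to the system-wide count, or 1 if it is unknown.
    return os.cpu_count() or 1
```

Using `getattr` rather than version checks keeps the sketch working across Python versions and platforms where one or both functions are missing.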
Contributors: @asnare
v0.11.0
- Marshalling: allow JSON-like fields (#241). The library has undergone significant changes to improve its marshalling functionality, code readability, and maintainability. A new `JsonValue` type alias has been introduced to represent the maximum bounds of values that can be saved for an installation, and support for `Any` and `object` as type annotations on data classes has been removed. The library now issues a `DeprecationWarning` when saving raw `list` and `dict` fields, and raises a specific error during loading, instructing users to use `list[T]` or `dict[T]` instead. Various methods, including `_marshal_generic_list`, `_marshal_raw_list`, `_marshal_generic_dict`, and `_marshal_raw_dict`, have been updated to handle the serialization of lists and dictionaries, while the `_unmarshal` method now handles the deserialization of unions, lists, and dictionaries. Additionally, the library has been updated to provide more informative error messages, and several tests have been added to cover various scenarios, including generic dict and list JSON values, bool in union, and raw list and dict deprecation. The `Installation` class, `MockInstallation` class, and `Paths` class have also been updated with new methods, type hints, and custom initialization to improve code flexibility and maintainability.
Contributors: @asnare
v0.10.2
- Consistent exception formatting in logs (#237). The logger's exception formatting has been enhanced to provide a consistent and readable log format, adhering to standard Python norms. When an exception occurs, the log message now ensures a newline character separates the error message from the exception details, regardless of whether logs are colorized or not. This update applies to both exception text and stack information, which are now prepended with a newline character if necessary, resulting in a uniform format for all log types. This change resolves previous inconsistencies between colorized and non-colorized logs, aligning the logging functionality with standard Python practices for exception logging, and improving overall log readability.
- Ensure that App logger emits `DEBUG` events if the CLI is invoked with `--debug` (#238). The `get_logger` function has been enhanced to provide more flexibility and consistency with standard logging practices. It now accepts an optional `manager` parameter, allowing for customization of the logging manager, and returns a `logging.Logger` object. The logger level is automatically set to `DEBUG` when the application is running in debug mode, as detected by the `is_in_debug` function, and the level is set using the `logging.DEBUG` constant for consistency. This change simplifies the code and ensures that the logger emits `DEBUG` events when the application is run with the debug flag, which is verified through an updated test suite that covers various scenarios, including logger name setting, debug mode behavior, and logger propagation.
- Ensure the names of logger levels are consistent (#234). The logger has been updated to use consistent naming conventions for logging levels, aligning with the Python ecosystem's norms. Previously, colorized logs used compact names `WARN` and `FATAL` for warning and critical levels, while non-colorized logs used the conventional `WARNING` and `CRITICAL` names. To address this inconsistency, two new dictionaries have been introduced to store colorized level names and color codes, and the `format` method has been modified to utilize these dictionaries, ensuring consistent logging level names and colorized message text. As a result, logging level names have been updated to use the conventional `WARNING` and `CRITICAL` instead of `WARN` and `FATAL`, and color codes for message text have been added for each logging level, promoting consistency and adherence to Python logging conventions.
- Ensure the non-colorized logs also include timestamps with second-lev… (#235). The log formatter has been updated to include second-granular timestamps in non-colorized logs, providing more precise logging information and ensuring consistency with colorized output. Previously, only minute-granular timestamps were logged, which was insufficient for logging purposes. The update changes the timestamp format from `%H:%M` to `%H:%M:%S` to include seconds, resulting in more detailed timestamp information. This change resolves the inconsistency between colorized and non-colorized logs, and is verified by updated tests that validate the formatter's behavior with and without colors, confirming that the formatter now correctly starts with a timestamp in both cases.
- Fixed Blueprint Install (#225). The `__version__` variable import statement has been updated to utilize a fully qualified module name, providing a more explicit and absolute reference to the module containing version information. This change ensures that the correct version is imported and used to set the user agent extra in relevant function calls, enhancing the reliability and accuracy of version tracking within the library.
- Fixed argument interpolation in colorised logs (#233). The colorised log formatter has been enhanced to correctly handle log entries containing `%`-style placeholders with arguments, a common pattern in third-party code, by retrieving the log message using `record.getMessage()` instead of directly accessing `record.msg`. This update resolves an issue with improper formatting of logging from third-party components and builds upon previous changes to address the underlying logging problem. Additionally, the corresponding test case has been updated to verify that the formatter correctly handles messages with arguments that require interpolation, both with and without colors enabled, and is no longer expected to fail, indicating that the issue with argument interpolation in the colorized log formatter has been resolved.
- Fixed logger name abbreviation fails if the logger name contains `..` (#236). The logger's format method has been enhanced to correctly abbreviate logger names containing multiple consecutive dots, which previously led to exceptions. The new logic splits the logger name into components, abbreviating all but the last two, and then reassembles them, ensuring correct abbreviation and formatting even when consecutive dots are present. This improvement also fixes the colorized logger output to handle logger names with consecutive dots without throwing an exception, and the corresponding test case has been updated to reflect this change, now directly testing the logging functionality by formatting the log record and stripping ANSI escape sequences, providing a more straightforward verification of the logging functionality.
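The logger name abbreviation described in #236 can be illustrated with a short sketch. The function name `abbreviate_logger_name` is hypothetical and the real formatter may differ in detail; the point is that slicing (`part[:1]`) tolerates the empty components produced by consecutive dots, whereas indexing (`part[0]`) would raise an `IndexError`.

```python
def abbreviate_logger_name(name: str) -> str:
    """Shorten all but the last two dot-separated components to one letter.

    Hypothetical helper: empty components (from "..") are kept as-is.
    """
    parts = name.split(".")
    head, tail = parts[:-2], parts[-2:]
    # part[:1] is safe for empty strings, unlike part[0].
    return ".".join([part[:1] for part in head] + tail)
```

For example, a name such as `databricks.labs.blueprint.installation` would be shortened to `d.l.blueprint.installation` under these assumptions.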
Contributors: @asnare, @sundarshankar89
v0.10.1
- patch hosted runner (#185). In this release, we have implemented a temporary fix to address issues with publishing artifacts in the release workflow. This fix involves changing the runner used for the job from `ubuntu-latest` to a protected runner group labeled "linux-ubuntu-latest". This ensures that the job runs on a designated hosted runner with the specified configuration, enhancing the reliability and security of the release process. The `permissions` section of the job remains unchanged, allowing authentication to PyPI and signing of release artifacts with sigstore-python. It is worth noting that this is a stopgap measure, and further changes to the release workflow may be made in the future.
Contributors: @sundarshankar89
v0.10.0
- Fixed incorrect script for no-pylint-disable (#178). In this release, we have updated the script used in the `no-cheat` GitHub workflow to address false positives in stacked pull requests. The updated script fetches the base reference from the remote repository and generates a diff between the base reference and the current branch, saving it to a file. It then runs the "no_cheat.py" script against this diff file and saves the results to a separate file. If the count of cheats (instances where linting has been intentionally disabled) is greater than one, the script outputs the contents of the results file and exits with a non-zero status, indicating an error. This change enhances the accuracy of the script and ensures it functions correctly in a stacked pull request scenario. The `no_cheat` function, which checks for the presence of certain pylint disable tags in a given diff text, has been updated to the latest version from the ucx project to improve accuracy. The function identifies tags by looking for lines starting with `-` or `+` followed by the disable tag and a list of codes, and counts the number of times each code is added and removed, reporting any net additions.
- Skip dataclasses fields only when `None` (#180). In this release, we have implemented a change that allows for the skipping of dataclass fields only when the value is `None`, enabling the inclusion of empty lists, strings, or zeros during marshalling. This modification is in response to issue #179 and involves adding a check for `None` before marshalling a dataclass field. Specifically, the previous condition `if not raw:` has been replaced with `if raw is None:`. This change ensures that empty values such as `[]`, `''`, or `0` are not skipped during the serialization process, unless they are explicitly set to `None`. This enhancement provides improved compatibility and flexibility for users working with dataclasses containing empty values, allowing for more fine-grained control during the serialization process.
Dependency updates:
- Bump codecov/codecov-action from 4 to 5 (#174).
Contributors: @nfx, @dependabot[bot], @JCZuurmond, @ericvergnaud, @sundarshankar89
v0.9.3
- Fixed issue when Databricks SDK config objects were overridden for installation config files (#170). This commit addresses an issue where Databricks SDK config objects were being overridden during installation config files creation, which has been resolved by modifying the `_marshal` method in the `installation` class to handle `databricks.sdk.core.Config` instances more carefully, and by introducing a new helper function `get_databricks_sdk_config` in the `paths.py` file, which retrieves the Databricks SDK configuration and improves the reliability and robustness of the SDK configuration. This fixes bug #169 and ensures that the SDK configuration is not accidentally modified during the installation process, preventing unexpected behavior and errors. The changes are isolated to the `paths.py` file and do not affect other parts of the codebase.
Contributors: @FastLee
v0.9.2
- Bump actions/checkout from 4.2.1 to 4.2.2 (#160). In this release, the 'actions/checkout' dependency has been updated from version 4.2.1 to 4.2.2. This update includes changes to the 'url-helper.ts' file, which now utilizes well-known environment variables for improved reliability and maintainability. Additionally, unit test coverage for the `isGhes` function has been expanded. These changes are recommended for adoption to take advantage of the enhancements. The pull request includes a detailed changelog, commit history, and instructions for managing the update using Dependabot commands and options.
- Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#166). In the latest release, the `databrickslabs/sandbox` Python package has been updated from version acceptance/v0.3.1 to 0.4.2. This update includes new features such as installation instructions, additional go-git libraries, and modifications to the README file. Dependency updates include a bump in the version of `golang.org/x/crypto` used. The pull request for this update was created by a GitHub bot, Dependabot, which will manage any conflicts and respond to comments containing specific commands. It is essential to thoroughly review and test this updated version to ensure that the new methods and modifications to existing functionality do not introduce any issues or regressions, and that the changes are well-documented and justified.
- Don't draft automated releases (#159). In this release, the draft release feature in the GitHub Actions workflow has been disabled, enhancing the release process for software engineers. The 'draft: true' parameter has been removed from the `Draft release` job, which means that automated releases will now be published immediately upon creation instead of being created as drafts. This modification simplifies and streamlines the release process, making it more efficient for engineers who adopt the project. The change is aimed at reducing the time and effort required in manually publishing draft releases, thereby improving the overall experience for project contributors and users.
- Updated custom `Path` support for python 3.13 (#161). In this revision, the project's continuous integration (CI) workflow has been updated to include Python 3.13, enhancing compatibility and enabling early identification of platform-specific issues. The `paths` module has been refactored into several submodules for better organization, and a new submodule, `databrickspath_posixpath`, has been added to distinguish `PosixPath` from `DBFSPath` and `WorkspacePath`. The comparison and equality behavior of `_DatabricksPath` objects has been modified to include `parser` property identity checks in Python 3.13, ensuring consistent behavior and eliminating exceptions when built-in paths are compared with custom paths. These updates promote confidence in the project's long-term viability and adaptability in response to evolving language standards.
Dependency updates:
- Bump actions/checkout from 4.2.1 to 4.2.2 (#160).
- Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#166).
Contributors: @asnare, @dependabot[bot], @nfx