Build data_context object in __init__() and not in execute method  #142

@Salias

Description

Right now, the self.data_context object is initialized within the execute method of the airflow BaseOperator.

This is done in:

However, this makes it impossible to interact with the data context before or after the execution.

If self.data_context were initialized in the __init__() method instead, the user could interact with this object in the pre_execute() or post_execute() methods of the Airflow BaseOperator.
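To illustrate the idea, here is a minimal sketch of the lifecycle I have in mind. It does not use the real operator or Great Expectations: `FakeDataContext`, `SketchOperator`, and `run` are hypothetical stand-ins that only mimic the pre_execute() / execute() / post_execute() call order, assuming the context is built in __init__().

```python
from typing import Any, Dict, List


class FakeDataContext:
    """Hypothetical stand-in for a data context; it only stores suites in memory."""

    def __init__(self) -> None:
        self.suites: Dict[str, List[str]] = {}


class SketchOperator:
    """Hypothetical stand-in that mimics the hook order of an Airflow operator."""

    def __init__(self) -> None:
        # Built here rather than in execute(), so the hooks below can reach it.
        self.data_context = FakeDataContext()
        self.inspected: List[str] = []

    def pre_execute(self, context: Any) -> None:
        # The context already exists, so a suite can be registered at runtime.
        self.data_context.suites["runtime_suite"] = [
            "expect_table_row_count_to_be_between"
        ]

    def execute(self, context: Any) -> List[str]:
        # Execution sees whatever pre_execute() registered.
        return self.data_context.suites["runtime_suite"]

    def post_execute(self, context: Any, result: Any = None) -> None:
        # The same context is still available after execution.
        self.inspected = sorted(self.data_context.suites)


def run(operator: SketchOperator, context: Any = None) -> List[str]:
    """Call the hooks in the order a task run would: pre, execute, post."""
    operator.pre_execute(context)
    result = operator.execute(context)
    operator.post_execute(context, result)
    return result
```

With the context created in execute() instead, the pre_execute() step above would fail because self.data_context would not exist yet.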

A possible use case, for example, is to add ExpectationSuites at runtime using an expectations store backed by an InMemoryStoreBackend:

    from typing import Any

    from great_expectations.core.expectation_configuration import ExpectationConfiguration

    def pre_execute(self, context: Any):
        """
        Create and add an expectation suite to the in-memory DataContext.
        """
        # Placeholder name for illustration; use the suite name the operator is configured with.
        suite_name = "my_suite"
        suite = self.data_context.create_expectation_suite(suite_name, overwrite_existing=True)

        # Add expectations
        # Here we'll add a simple expectation as an example
        suite.add_expectation(
            ExpectationConfiguration(
                expectation_type="expect_table_row_count_to_be_between",
                kwargs={
                    "min_value": 1,
                    "max_value": 1000000,
                },
            )
        )

        # Save the suite to the DataContext's in-memory expectations store
        self.data_context.save_expectation_suite(suite)
