Parsers (pre-processing) do not run if validation is disabled #2161

@libklein

I've noticed that parsers are not run when validation is disabled (see the code snippet below).

Is this intended, or an oversight? As outlined in the docs, the primary reason for disabling validation is to avoid expensive checks, e.g. in production. Wouldn't we still want parsers to run when validation is disabled?

Happy to hear your opinions.

def test_parser_runs() -> None:
    import pandera.pandas as pa
    import pandas as pd
    from pandas.testing import assert_frame_equal
    from pandera.config import config_context

    with config_context(validation_enabled=False):
        schema = pa.DataFrameSchema(
            parsers=pa.Parser(lambda df: df.transform(lambda s: s*2)),
            columns={
                "a": pa.Column(float, parsers=pa.Parser(lambda s: s + 1)),
            }
        )

        data = pd.DataFrame({
            "a": [1.0]
        })
        expected_data = pd.DataFrame({
            "a": [5.0]
        })

        # Fails: the dataframe is returned untransformed.
        assert_frame_equal(expected_data, schema.validate(data))
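
To make the question concrete, here is a toy sketch of the behavior I would have expected: parsers always run, and only the checks are skipped when validation is off. `ToySchema` and its `validate` signature are hypothetical and are not pandera's API or implementation. They only illustrate the separation between preprocessing and checking.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Parser = Callable[[List[float]], List[float]]
Check = Callable[[List[float]], bool]


@dataclass
class ToySchema:
    """Hypothetical stand-in for a schema; not pandera's API."""

    parsers: List[Parser] = field(default_factory=list)
    checks: List[Check] = field(default_factory=list)

    def validate(
        self, values: List[float], validation_enabled: bool = True
    ) -> List[float]:
        # Parsers transform the data, so they run unconditionally.
        for parse in self.parsers:
            values = parse(values)
        # Checks are the potentially expensive part; skip them when disabled.
        if validation_enabled:
            for check in self.checks:
                if not check(values):
                    raise ValueError("check failed")
        return values


schema = ToySchema(
    parsers=[lambda xs: [x * 2 for x in xs]],
    checks=[lambda xs: all(x >= 0 for x in xs)],
)

# With validation disabled, the data is still parsed but not checked.
result = schema.validate([1.0], validation_enabled=False)
```

In this sketch, disabling validation changes whether invalid data raises an error, but not what the returned data looks like.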

Labels: question (Further information is requested)