Commit c7b6bc8

doc update
1 parent a424945 commit c7b6bc8

2 files changed

+80
-19
lines changed

ads/feature_store/docs/source/dataset.rst

Lines changed: 39 additions & 11 deletions
@@ -119,28 +119,56 @@ With a Dataset instance, we can get the last dataset job details using ``get_las
119119

120120
.. code-block:: python3
121121
122-
# Fetch validation results for a dataset
123122
dataset_job = dataset.get_last_job()
124-
df = dataset_job.get_validation_output().to_dataframe()
125-
df.show()
126123
127124
Save expectation entity
128125
=======================
129-
Feature store allows you to define expectations on data being materialized into feature group instance. With a ``FeatureGroup`` instance, we can save the expectation entity using ``save_expectation()``
130-
131-
132-
.. image:: figures/validation.png
126+
Feature store allows you to define expectations on the data being materialized into a dataset instance. With a ``Dataset`` instance, you can save the expectation details using ``with_expectation_suite()``, which takes the following parameters:
133127

134-
The ``.save_expectation()`` method takes the following optional parameter:
135-
136-
- ``expectation: Expectation``. Expectation of great expectation
128+
- ``expectation_suite: ExpectationSuite``. An ``ExpectationSuite`` from Great Expectations
137129
- ``expectation_type: ExpectationType``. Type of expectation
138130
- ``ExpectationType.STRICT``: Fail the job if the expectation is not met
139131
- ``ExpectationType.LENIENT``: Pass the job even if the expectation is not met
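
The difference between the two expectation types can be illustrated with a small stand-alone sketch. It does not call the feature store or Great Expectations; ``ExpectationMode``, ``run_job``, and the not-null check are hypothetical stand-ins, chosen only to show the fail-versus-pass behaviour described in the list above.

```python
from enum import Enum


class ExpectationMode(Enum):
    """Hypothetical stand-in for the documented ExpectationType values."""

    STRICT = "STRICT"
    LENIENT = "LENIENT"


def run_job(rows, column, mode):
    """Check that `column` is never None, then apply the STRICT/LENIENT rule.

    Returns a dict describing the outcome. In STRICT mode an unmet
    expectation raises ValueError instead, which models a failed job.
    """
    null_count = sum(1 for row in rows if row.get(column) is None)
    expectation_met = null_count == 0
    if not expectation_met and mode is ExpectationMode.STRICT:
        raise ValueError(f"{null_count} null value(s) in column '{column}'")
    return {"success": expectation_met, "null_count": null_count}


rows = [{"id": 1, "age": 30}, {"id": 2, "age": None}]

# LENIENT: the job completes and the failure is only reported.
lenient_result = run_job(rows, "age", ExpectationMode.LENIENT)

# STRICT: the same data makes the job fail.
try:
    run_job(rows, "age", ExpectationMode.STRICT)
    strict_failed = False
except ValueError:
    strict_failed = True
```

In this sketch the lenient run still records that the expectation was not met, so the outcome stays visible even though the job is allowed to pass.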
140132

133+
.. note::
134+
135+
Great Expectations is a Python-based open-source library for validating, documenting, and profiling your data. It helps you maintain data quality and improve communication about data between teams, applying to data the kind of automated testing that software developers have long relied on to manage complex codebases.
136+
137+
.. image:: figures/validation.png
138+
141139
.. code-block:: python3
142140
143-
feature_group.save_expectation(expectation_suite, expectation_type="STRICT")
141+
expectation_suite = ExpectationSuite(
142+
expectation_suite_name="expectation_suite_name"
143+
)
144+
expectation_suite.add_expectation(
145+
ExpectationConfiguration(
146+
expectation_type="expect_column_values_to_not_be_null",
147+
kwargs={"column": "<column>"},
148+
)
)
149+
150+
dataset_resource = (
151+
Dataset()
152+
.with_description("dataset description")
153+
.with_compartment_id("<compartment_id>")
154+
.with_name("<name>")
155+
.with_entity_id(entity_id)
156+
.with_feature_store_id(feature_store_id)
157+
.with_query(f"SELECT * FROM `{entity_id}`.{feature_group_name}")
158+
.with_expectation_suite(
159+
expectation_suite=expectation_suite,
160+
expectation_type=ExpectationType.STRICT,
161+
)
162+
)
163+
164+
You can call the ``get_validation_output()`` method of the Dataset instance to fetch validation results for a specific ingestion job.
165+
The ``get_validation_output()`` method takes the following optional parameter:
166+
167+
- ``job_id: string``. ID of the dataset job
168+
169+
``get_validation_output().to_pandas()`` returns the validation results for each expectation as a pandas dataframe.
170+
171+
``get_validation_output().to_summary()`` returns the overall validation summary as a pandas dataframe.
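
A minimal sketch of how the two views might be retrieved together. ``fetch_validation_views`` is a hypothetical helper, and it assumes that ``get_validation_output()`` accepts ``job_id`` as a keyword argument, which the parameter list suggests but does not show. ``_FakeOutput`` and ``_FakeDataset`` are stubs so the snippet runs without a feature store; with a real ``Dataset`` both return values would be pandas dataframes.

```python
def fetch_validation_views(resource, job_id=None):
    """Return (per-expectation results, overall summary) for one job.

    `resource` is any object exposing get_validation_output(), such as a
    Dataset instance. Passing job_id as a keyword is an assumption; when
    it is omitted, the call is made with no arguments.
    """
    if job_id is None:
        output = resource.get_validation_output()
    else:
        output = resource.get_validation_output(job_id=job_id)
    return output.to_pandas(), output.to_summary()


class _FakeOutput:
    """Stub for the validation output object (illustration only)."""

    def to_pandas(self):
        return [
            {"expectation": "expect_column_values_to_not_be_null", "success": True}
        ]

    def to_summary(self):
        return {"success": True, "evaluated_expectations": 1}


class _FakeDataset:
    """Stub for a Dataset so the sketch runs offline."""

    def __init__(self):
        self.requested_job_ids = []

    def get_validation_output(self, job_id=None):
        # Record which job was asked for, then hand back canned output.
        self.requested_job_ids.append(job_id)
        return _FakeOutput()


fake_dataset = _FakeDataset()
results, summary = fetch_validation_views(fake_dataset, job_id="<job_id>")
```

The helper calls ``get_validation_output()`` once and reuses the returned object for both views, rather than fetching the output separately for each one.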
144172

145173
.. seealso::
146174

ads/feature_store/docs/source/feature_group.rst

Lines changed: 41 additions & 8 deletions
@@ -152,22 +152,55 @@ Feature store provides an API similar to Pandas to join feature groups together
152152
153153
Save expectation entity
154154
=======================
155-
Feature store allows you to define expectations on data being materialized into feature group instance. With a ``FeatureGroup`` instance, we can save the expectation entity using ``save_expectation()``
155+
With a ``FeatureGroup`` instance, you can save the expectation details using ``with_expectation_suite()``, which takes the following parameters:
156156

157-
158-
.. image:: figures/validation.png
159-
160-
The ``.save_expectation()`` method takes the following optional parameter:
161-
162-
- ``expectation: Expectation``. Expectation of great expectation
157+
- ``expectation_suite: ExpectationSuite``. An ``ExpectationSuite`` from Great Expectations
163158
- ``expectation_type: ExpectationType``. Type of expectation
164159
- ``ExpectationType.STRICT``: Fail the job if the expectation is not met
165160
- ``ExpectationType.LENIENT``: Pass the job even if the expectation is not met
166161

162+
.. note::
163+
164+
Great Expectations is a Python-based open-source library for validating, documenting, and profiling your data. It helps you maintain data quality and improve communication about data between teams, applying to data the kind of automated testing that software developers have long relied on to manage complex codebases.
165+
166+
.. image:: figures/validation.png
167+
167168
.. code-block:: python3
168169
169-
feature_group.save_expectation(expectation_suite, expectation_type="STRICT")
170+
expectation_suite = ExpectationSuite(
171+
expectation_suite_name="expectation_suite_name"
172+
)
173+
expectation_suite.add_expectation(
174+
ExpectationConfiguration(
175+
expectation_type="expect_column_values_to_not_be_null",
176+
kwargs={"column": "<column>"},
177+
)
)
178+
179+
feature_group_resource = (
180+
FeatureGroup()
181+
.with_feature_store_id(feature_store.id)
182+
.with_primary_keys(["<key>"])
183+
.with_name("<name>")
184+
.with_entity_id(entity.id)
185+
.with_compartment_id("<compartment_id>")
186+
.with_schema_details_from_dataframe(<dataframe>)
187+
.with_expectation_suite(
188+
expectation_suite=expectation_suite,
189+
expectation_type=ExpectationType.STRICT,
190+
)
191+
)
192+
193+
You can call the ``get_validation_output()`` method of the FeatureGroup instance to fetch validation results for a specific ingestion job.
194+
The ``get_validation_output()`` method takes the following optional parameter:
195+
196+
- ``job_id: string``. ID of the feature group job
197+
``get_validation_output().to_pandas()`` returns the validation results for each expectation as a pandas dataframe.
198+
199+
.. image:: figures/validation_results.png
200+
201+
``get_validation_output().to_summary()`` returns the overall validation summary as a pandas dataframe.
170202

203+
.. image:: figures/validation_summary.png
171204
.. seealso::
172205

173206
:ref:`Feature Validation`
