
Commit 8fa1606

added docs
1 parent fed6358 commit 8fa1606

File tree

2 files changed (+22, -13 lines)


ads/feature_store/docs/source/dataset.rst

Lines changed: 6 additions & 0 deletions
@@ -138,6 +138,10 @@ Feature store allows you to define expectations on data being materialized into

  .. code-block:: python3

+    from great_expectations.core import ExpectationSuite, ExpectationConfiguration
+    from ads.feature_store.common.enums import TransformationMode, ExpectationType
+    from ads.feature_store.feature_group import FeatureGroup
+
     expectation_suite = ExpectationSuite(
         expectation_suite_name="expectation_suite_name"
     )
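
With these imports in place the snippet is self-contained. Below is a minimal sketch of how the suite might be extended with an actual rule; the ``expect_column_values_to_not_be_null`` expectation and the ``column1`` column are illustrative assumptions, not part of this commit:

    from great_expectations.core import ExpectationSuite, ExpectationConfiguration

    # Create an empty suite, then register one illustrative expectation.
    expectation_suite = ExpectationSuite(
        expectation_suite_name="expectation_suite_name"
    )
    expectation_suite.add_expectation(
        ExpectationConfiguration(
            expectation_type="expect_column_values_to_not_be_null",
            kwargs={"column": "column1"},  # hypothetical column name
        )
    )
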
@@ -186,6 +190,7 @@ dataset or it can be updated later as well.

  .. code-block:: python3

     # Define statistics configuration for selected features
+    from ads.feature_store.statistics_config import StatisticsConfig
     stats_config = StatisticsConfig().with_is_enabled(True).with_columns(["column1", "column2"])
@@ -194,6 +199,7 @@ This can be used with dataset instance.

  .. code-block:: python3

     from ads.feature_store.dataset import Dataset
+    from ads.feature_store.statistics_config import StatisticsConfig

     dataset = (
         Dataset
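
For context, a rough sketch of how the two imports combine once the builder chain continues past the point where the diff is cut off; the ``with_name`` and ``with_statistics_config`` calls are assumptions, not confirmed by this commit:

    from ads.feature_store.dataset import Dataset
    from ads.feature_store.statistics_config import StatisticsConfig

    # Collect statistics only for two illustrative columns.
    stats_config = StatisticsConfig().with_is_enabled(True).with_columns(["column1", "column2"])

    # Hypothetical builder chain; only "dataset = ( Dataset" appears in the diff itself.
    dataset = (
        Dataset()
        .with_name("dataset_name")
        .with_statistics_config(stats_config)
    )
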

ads/feature_store/docs/source/feature_group.rst

Lines changed: 16 additions & 13 deletions
@@ -128,19 +128,18 @@ Materialise Stream

You can call the ``materialise_stream() -> FeatureGroupJob`` method of the ``FeatureGroup`` instance to load the streaming data to feature group. To persist the feature_group and save feature_group data along the metadata in the feature store, call the ``materialise_stream()``

The ``.materialise_stream()`` method takes the following parameter:
-  - ``input_dataframe``: Features in Streaming Dataframe to be saved.
-  - ``query_name``: It is possible to optionally specify a name for the query to make it easier to recognise in the Spark UI. Defaults to ``None``.
-  - ``ingestion_mode``: Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
-  - ``"append"``: Only the new rows in the streaming DataFrame/Dataset will be written to the sink. If the query doesn’t contain aggregations, it will be equivalent to
-    append mode. Defaults to ``"append"``.
-  - ``"complete"``: All the rows in the streaming DataFrame/Dataset will be written to the sink every time there is some update.
-  - ``"update"``: only the rows that were updated in the streaming DataFrame/Dataset will be written to the sink every time there are some updates.
-  - ``await_termination``: Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown. If timeout is set, it returns whether the query has terminated or not within the timeout seconds. Defaults to ``False``.
-  - ``timeout``: Only relevant in combination with ``await_termination=True``.
-    Defaults to ``None``.
-  - ``checkpoint_dir``: Checkpoint directory location. This will be used to as a reference to from where to resume the streaming job. If ``None`` then hsfs will construct as "insert_stream_" + online_topic_name. Defaults to ``None``.
-  - ``write_options``: Additional write options for Spark as key-value pairs.
-    Defaults to ``{}``.
+  - ``input_dataframe``: Features in Streaming Dataframe to be saved.
+  - ``query_name``: It is possible to optionally specify a name for the query to make it easier to recognise in the Spark UI. Defaults to ``None``.
+  - ``ingestion_mode``: Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
+    - ``append``: Only the new rows in the streaming DataFrame/Dataset will be written to the sink. If the query doesn’t contain aggregations, it will be equivalent to append mode. Defaults to ``"append"``.
+    - ``complete``: All the rows in the streaming DataFrame/Dataset will be written to the sink every time there is some update.
+    - ``update``: only the rows that were updated in the streaming DataFrame/Dataset will be written to the sink every time there are some updates.
+  - ``await_termination``: Waits for the termination of this query, either by ``query.stop()`` or by an exception. If the query has terminated with an exception, then the exception will be thrown. If timeout is set, it returns whether the query has terminated or not within the timeout seconds. Defaults to ``False``.
+  - ``timeout``: Only relevant in combination with ``await_termination=True``.
+    Defaults to ``None``.
+  - ``checkpoint_dir``: Checkpoint directory location. This will be used to as a reference to from where to resume the streaming job. Defaults to ``None``.
+  - ``write_options``: Additional write options for Spark as key-value pairs.
+    Defaults to ``{}``.

.. seealso::
    :ref:`Feature Group Job`
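
As a hedged sketch of a call that uses the parameters listed above: ``streaming_df``, the query name, and the checkpoint path are placeholders, and the keyword arguments simply mirror the documented list rather than a verified signature:

    # streaming_df is assumed to be a Spark structured-streaming DataFrame prepared elsewhere.
    feature_group_job = feature_group.materialise_stream(
        input_dataframe=streaming_df,
        query_name="fg_ingest_query",      # optional label shown in the Spark UI
        ingestion_mode="append",           # one of "append", "complete", "update"
        await_termination=False,           # return immediately instead of blocking
        timeout=None,                      # only meaningful with await_termination=True
        checkpoint_dir="oci://bucket@namespace/checkpoints/fg",  # placeholder location
        write_options={},                  # extra Spark write options as key-value pairs
    )
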
@@ -200,6 +199,9 @@ With a ``FeatureGroup`` instance, You can save the expectation details using ``w

.. image:: figures/validation.png

.. code-block:: python3
+    from great_expectations.core import ExpectationSuite, ExpectationConfiguration
+    from ads.feature_store.common.enums import TransformationMode, ExpectationType
+    from ads.feature_store.feature_group import FeatureGroup

     expectation_suite = ExpectationSuite(
         expectation_suite_name="expectation_suite_name"
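
A rough sketch of where these imports lead: the suite can carry concrete rules and then be attached to the feature group. The ``with_expectation_suite`` call and ``ExpectationType.STRICT`` are assumptions here, not shown in this diff:

    from great_expectations.core import ExpectationSuite, ExpectationConfiguration
    from ads.feature_store.common.enums import ExpectationType
    from ads.feature_store.feature_group import FeatureGroup

    expectation_suite = ExpectationSuite(expectation_suite_name="expectation_suite_name")
    expectation_suite.add_expectation(
        ExpectationConfiguration(
            expectation_type="expect_column_values_to_be_between",
            kwargs={"column": "column1", "min_value": 0, "max_value": 100},  # hypothetical rule
        )
    )

    # Hypothetical attachment; with_expectation_suite and ExpectationType.STRICT are assumptions.
    feature_group = FeatureGroup().with_expectation_suite(
        expectation_suite=expectation_suite,
        expectation_type=ExpectationType.STRICT,
    )
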
@@ -248,6 +250,7 @@ feature group or it can be updated later as well.

.. code-block:: python3

     # Define statistics configuration for selected features
+    from ads.feature_store.statistics_config import StatisticsConfig
     stats_config = StatisticsConfig().with_is_enabled(True).with_columns(["column1", "column2"])
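
Presumably the same statistics configuration can be attached on the feature group side as well; ``with_statistics_config`` on ``FeatureGroup`` is an assumption here, mirroring the dataset example above:

    from ads.feature_store.feature_group import FeatureGroup
    from ads.feature_store.statistics_config import StatisticsConfig

    stats_config = StatisticsConfig().with_is_enabled(True).with_columns(["column1", "column2"])

    # Hypothetical builder call; not part of this commit.
    feature_group = FeatureGroup().with_statistics_config(stats_config)
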
