Commit 6df592a

updating docs
1 parent 1cec8e8 commit 6df592a

14 files changed: +200 -333 lines changed

ads/opctl/operator/lowcode/anomaly/const.py

Lines changed: 2 additions & 1 deletion
@@ -13,7 +13,8 @@ class SupportedModels(str, metaclass=ExtendedEnumMeta):
 
     AutoMLX = "automlx"
     AutoTS = "autots"
-    TODS = "tods"
+    Auto = "auto"
+    # TODS = "tods"
 
 
 class TODSSubModels(str, metaclass=ExtendedEnumMeta):

ads/opctl/operator/lowcode/anomaly/model/automlx.py

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ def _build_model(self) -> pd.DataFrame:
         date_column = self.spec.datetime_column.name
         anomaly_output = AnomalyOutput(date_column=date_column)
 
-        time_budget = self.spec.model_kwargs.pop("time_budget", -1)
+        time_budget = self.spec.model_kwargs.pop("time_budget", None)
         # Iterate over the full_data_dict items
         for target, df in self.datasets.full_data_dict.items():
             est = automl.Pipeline(task="anomaly_detection", **self.spec.model_kwargs)
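
Review note: the hunk above changes only the fallback passed to `dict.pop`. The sketch below shows the general pattern under stated assumptions. The helper name `split_time_budget` is hypothetical, and how `time_budget` is consumed later in `_build_model` is not visible in this hunk. The point is that `pop` removes the key, so it is not forwarded through `**model_kwargs`, and `None` acts as a "not provided" sentinel instead of the magic value `-1`.

```python
from typing import Any, Dict, Optional, Tuple


def split_time_budget(
    model_kwargs: Dict[str, Any],
) -> Tuple[Optional[float], Dict[str, Any]]:
    """Return (time_budget, remaining kwargs) without mutating the input."""
    remaining = dict(model_kwargs)  # work on a shallow copy
    # pop() removes the key if present; None signals "no budget was given".
    time_budget = remaining.pop("time_budget", None)
    return time_budget, remaining


# Hypothetical kwargs, loosely modelled on the documentation examples.
budget, rest = split_time_budget(
    {"time_budget": 30, "model_list": ["IsolationForestOD"]}
)
```

Unlike this sketch, the operator code pops from `self.spec.model_kwargs` in place, so the spec itself no longer contains `time_budget` afterwards.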

ads/opctl/operator/lowcode/anomaly/operator_config.py

Lines changed: 7 additions & 2 deletions
@@ -10,7 +10,12 @@
 
 from ads.common.serializer import DataClassSerializable
 from ads.opctl.operator.common.utils import _load_yaml_from_uri
-from ads.opctl.operator.common.operator_config import OperatorConfig, OutputDirectory, InputData
+from ads.opctl.operator.common.operator_config import (
+    OperatorConfig,
+    OutputDirectory,
+    InputData,
+)
+from .const import SupportedModels
 
 
 @dataclass(repr=True)
@@ -64,7 +69,7 @@ def __post_init__(self):
         self.inliers_filename = self.inliers_filename or "inliers.csv"
         self.outliers_filename = self.outliers_filename or "outliers.csv"
         self.test_metrics_filename = self.test_metrics_filename or "metrics.csv"
-
+        self.model = self.model or SupportedModels.Auto
         self.generate_inliers = (
             self.generate_inliers if self.generate_inliers is not None else False
         )
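
Review note: the `__post_init__` hunk above uses two different defaulting idioms. The following is a minimal sketch of why both appear, not the operator's actual classes. `SpecSketch` is a hypothetical, trimmed-down stand-in, and `SupportedModels` is rebuilt here on the standard `enum.Enum` rather than on `ExtendedEnumMeta`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SupportedModels(str, Enum):
    # Stand-in for the ADS constant class (assumption: plain str-valued members).
    AutoMLX = "automlx"
    AutoTS = "autots"
    Auto = "auto"


@dataclass
class SpecSketch:
    # Hypothetical subset of the operator spec fields.
    model: Optional[str] = None
    generate_inliers: Optional[bool] = None

    def __post_init__(self):
        # `or` swaps in the default for any falsy value (None or "").
        self.model = self.model or SupportedModels.Auto
        # Booleans need an explicit None check: `False or default` would
        # silently discard a deliberate False.
        self.generate_inliers = (
            self.generate_inliers if self.generate_inliers is not None else False
        )
```

With this pattern, a spec that omits `model` falls back to `"auto"`, while an explicit `model: autots` is left untouched.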

ads/opctl/operator/lowcode/anomaly/schema.yaml

Lines changed: 3 additions & 2 deletions
@@ -29,8 +29,9 @@ spec:
     input_data:
       required: true
       type: dict
+      default: {"url": "data.csv"}
       meta:
-        description: "The input data."
+        description: "The payload that the detector should evaluate."
       schema:
         connect_args:
           nullable: true
@@ -82,7 +83,7 @@ spec:
       required: false
       type: dict
       meta:
-        description: "The input data."
+        description: "Data that has already been labeled as anomalous or not."
       schema:
         connect_args:
           nullable: true

ads/opctl/operator/lowcode/forecast/schema.yaml

Lines changed: 3 additions & 2 deletions
@@ -29,6 +29,7 @@ spec:
     historical_data:
       required: true
       type: dict
+      default: {"url": "data.csv"}
       meta:
         description: "This should be indexed by date and target category (optionally). It should include all targets and endogeneous data."
       schema:
@@ -335,10 +336,10 @@ spec:
 
     target_category_columns:
       type: list
-      required: true
+      required: false
       schema:
         type: string
-      default: ["Column1"]
+      default: ["Series ID"]
 
     horizon:
       required: true

docs/source/user_guide/operators/anomaly_detection_operator/advanced_use_cases.rst

Lines changed: 50 additions & 63 deletions
@@ -4,99 +4,86 @@ Advanced Use Cases
 
 **Documentation: Anomaly Detection Science and Model Parameterization**
 
-**The Science of Anomaly Detection**
+The Science of Anomaly Detection
+--------------------------------
 
-Forecasting is a complex yet essential discipline that involves predicting future values or events based on historical data and various mathematical and statistical techniques. To achieve accurate forecasts, it is crucial to understand some fundamental concepts:
+Anomaly Detection comes in many forms. We will go through some of these and give guidance as to whether this Operator is going to be helpful for each use case.
 
-**Seasonality**
+* Constructive v Destructive v Pre-Processing: This Operator focuses on the Constructive and Pre-Processing use cases. Destructive can work, but more specific parameters may be required.
+* Supervised v Semi-Supervised v Unsupervised: All 3 of these approaches are supported by AutoMLX. AutoTS supports only Unsupervised at this time.
+* Time Series. This Operator is focused on just time-series data.
 
-Seasonality refers to patterns in data that repeat at regular intervals, typically within a year. For example, retail sales often exhibit seasonality with spikes during holidays or specific seasons. Seasonal components can be daily, weekly, monthly, or yearly, and understanding them is vital for capturing and predicting such patterns accurately.
 
-**Stationarity**
+Data Parameterization
+---------------------
 
-Stationarity is a critical property of time series data. A time series is considered stationary when its statistical properties, such as mean, variance, and autocorrelation, remain constant over time. Stationary data simplifies forecasting since it allows models to assume that future patterns will resemble past patterns.
+**Read Data from the Database**
 
-**Cold Start**
+.. code-block:: yaml
+
+    kind: operator
+    type: anomaly
+    version: v1
+    spec:
+        input_data:
+            connect_args:
+                user: XXX
+                password: YYY
+                dsn: "localhost/orclpdb"
+            sql: 'SELECT Store_ID, Sales, Date FROM live_data'
+        datetime_column:
+            name: ds
+        target_column: y
 
-The "cold start" problem arises when you have limited historical data for a new product, service, or entity. Traditional forecasting models may struggle to make accurate predictions in these cases due to insufficient historical context.
 
-**Passing Parameters to Models**
+**Read Part of a Dataset**
 
-To enhance the accuracy and adaptability of forecasting models, our system allows you to pass parameters directly. Here's how to do it:
+
+.. code-block:: yaml
+
+    kind: operator
+    type: anomaly
+    version: v1
+    spec:
+        input_data:
+            url: oci://bucket@namespace/data
+            format: hdf
+            limit: 1000  # Only the first 1000 rows
+            columns: ["y", "ds"]  # Ignore other columns
+        datetime_column:
+            name: ds
+        target_column: y
 
 
 **Specify Model Type**
 
-Sometimes users will know which models they want to use. When users know this in advance, they can specify using the ``model_kwargs`` dictionary. In the following example, we will instruct the model to *only* use the ``DecisionTreeRegressor`` model.
+Sometimes users will know which models they want to use. When users know this in advance, they can specify using the ``model_kwargs`` dictionary. In the following example, we will instruct the model to *only* use the ``IsolationForestOD`` model.
 
 .. code-block:: yaml
 
     kind: operator
-    type: forecast
+    type: anomaly
     version: v1
     spec:
         model: automlx
         model_kwargs:
             model_list:
-                - NaiveForecaster
+                - IsolationForestOD
             search_space:
-                NaiveForecaster:
-                    sp: [1,100]
+                IsolationForestOD:
+                    n_estimators:
+                        range': [10, 50]
+                        type': 'discrete'
 
 
-When using autots, there are model_list *families*. These families are named after the shared characteristics of the models included. For example, we can use the autots "superfast" model_list and set it in the following way:
+AutoTS offers the same extensibility:
 
 .. code-block:: yaml
 
     kind: operator
-    type: forecast
+    type: anomaly
     version: v1
     spec:
         model: autots
         model_kwargs:
-            model_list: superfast
-
-
-Note: this is only supported for the ``autots`` model.
-
-
-**Specify Other Model Details**
-
-In addition to ``model_list``, there are many other parameters that can be specified. Users may specify, for example, the search space they want to search for their given model type. In automlx, specifying a hyperparameter range is as simple as:
-
-.. code-block:: yaml
-
-    kind: operator
-    type: forecast
-    version: v1
-    spec:
-        model: automlx
-        model_kwargs:
-            search_space:
-                LogisticRegression:
-                    C:
-                        range: [0.03125, 512]
-                        type': continuous
-                    solver:
-                        range: ['newton-cg', 'lbfgs', 'liblinear', 'sag']
-                        type': categorical
-                    class_weight:
-                        range: [None, 'balanced']
-                        type: categorical
-
-
-**When Models Perform Poorly and the "Auto" Method**
-
-Forecasting models are not one-size-fits-all, and some models may perform poorly under certain conditions. Common scenarios where models might struggle include:
-
-- **Sparse Data:** When there's limited historical data available, traditional models may have difficulty making accurate predictions, especially for cold start problems.
-
-- **High Seasonality:** Extremely seasonal data with complex patterns can challenge traditional models, as they might not capture all nuances.
-
-- **Non-Linear Relationships:** In cases where the relationships between input variables and forecasts are nonlinear, linear models may underperform.
-
-- **Changing Dynamics:** If the underlying data-generating process changes over time, static models may fail to adapt.
-
-Our system offers an "auto" method that strives to anticipate and address these challenges. It dynamically selects the most suitable forecasting model and parameterizes it based on the characteristics of your data. It can automatically detect seasonality, stationarity, and cold start issues, then choose the best-fitting model and adjust its parameters accordingly.
-
-By using the "auto" method, you can rely on the system's intelligence to adapt to your data's unique characteristics and make more accurate forecasts, even in challenging scenarios. This approach simplifies the forecasting process and often leads to better results than manual model selection and parameter tuning.
+            method: IQR
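
Review note: the final documentation example sets `method: IQR` for the `autots` model without saying what the rule does. As a rough illustration of the generic interquartile-range rule (Tukey's fences), and explicitly not a description of how AutoTS implements it, a point is flagged when it lies more than `k` times the IQR outside the first or third quartile. The function name, the default `k = 1.5`, and the sample series are assumptions made for this sketch.

```python
from statistics import quantiles
from typing import List, Sequence


def iqr_outlier_flags(values: Sequence[float], k: float = 1.5) -> List[bool]:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical series with a single obvious spike at the end.
series = [10, 11, 12, 11, 10, 12, 11, 50]
flags = iqr_outlier_flags(series)
```

The rule needs no labels, which is consistent with the documentation's statement that AutoTS supports only the unsupervised setting.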
