
Commit 2439f18

add support for local backend
1 parent c34981b commit 2439f18

File tree

4 files changed: +60 -46 lines changed

ads/opctl/operator/common/backend_factory.py

Lines changed: 10 additions & 0 deletions
@@ -76,6 +76,16 @@ class BackendFactory:
                 RUNTIME_TYPE.CONTAINER.value.lower(),
             ),
         },
+        BACKEND_NAME.LOCAL.value.lower(): {
+            RUNTIME_TYPE.PYTHON.value.lower(): (
+                BACKEND_NAME.LOCAL.value.lower(),
+                RUNTIME_TYPE.PYTHON.value.lower(),
+            ),
+            RUNTIME_TYPE.CONTAINER.value.lower(): (
+                BACKEND_NAME.LOCAL.value.lower(),
+                RUNTIME_TYPE.CONTAINER.value.lower(),
+            ),
+        },
     }
 
     BACKEND_MAP = {
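
To make the effect of this mapping concrete, here is a minimal sketch (not part of the commit) of driving an operator config through the new local backend via the same `run` helper the updated tests import. Only `datetime_column`, `output_directory`, and the `run(..., backend="local", debug=False)` call are taken from this diff; the remaining keys (`kind`, `type`, `version`, `input_data`, `target_column`) are assumptions about the operator spec, and the paths are placeholders.

```python
# Hypothetical usage sketch; spec keys beyond those seen in the tests are assumed.
from ads.opctl.operator.cmd import run

anomaly_config = {
    "kind": "operator",      # assumed operator header fields
    "type": "anomaly",
    "version": "v1",
    "spec": {
        "datetime_column": {"name": "timestamp"},     # key shape taken from the tests
        "target_column": "sensor_reading",            # assumed
        "input_data": {"url": "data/train.csv"},      # assumed; placeholder path
        "output_directory": {"url": "results/"},      # key shape taken from the tests
    },
}

# backend="local" now resolves through the new BACKEND_NAME.LOCAL entry above.
run(anomaly_config, backend="local", debug=False)
```
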

ads/opctl/operator/lowcode/anomaly/MLoperator

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ gpu: no
 keywords:
   - Anomaly Detection
 backends:
-  - job
+  - job, local
 description: |
   Anomaly Detection is the identification of rare items, events, or observations in data that
   differ significantly from the expectation. This can be used for several scenarios like asset

docs/source/user_guide/operators/anomaly_detection_operator/index.rst

Lines changed: 14 additions & 18 deletions
@@ -2,41 +2,37 @@
 Anomaly Detection Operator
 ==========================
 
-The Anomaly Detection Operator leverages historical time series data to generate accurate forecasts for future trends. This operator aims to simplify and expedite the data science process by automating the selection of appropriate models and hyperparameters, as well as identifying relevant features for a given prediction task.
+The Anomaly Detection Operator is a low-code tool for integrating anomaly detection into any enterprise application. Specifically, it leverages time series constructive anomaly detection to flag anomalous moments in your data, by time and by ID.
 
 
 Overview
 --------
 
-**Introduction to Anomaly Detection with the Python Library Module**
+**Input Data**
 
-Anomaly Detection is a crucial component of decision-making in various fields. The Operators framework is OCI's most extensible, low-code, managed ecosystem for building and deploying anomaly detection models.
+The Anomaly Detection Operator accepts a dataset with:
+* A datetime column
+* A target column
+* (Optionally) 1 or more series columns (such that the target is indexed by datetime and series)
+* (Optionally) An arbitrary number of additional variables
 
-This technical documentation introduces ``ads opctl`` for anomaly detection tasks. This module is engineered with the principles of low-code development in mind, making it accessible to users with varying degrees of technical expertise. It operates on managed infrastructure, ensuring reliability and scalability, while its configurability through YAML allows users to tailor forecasts to their specific needs.
+Besides this input data, the user can also specify validation data, if available. Validation data should have all the columns of the input data plus a binary column titled "anomaly". The "anomaly" column should be -1 for anomalies and 1 for normal rows.
 
-**Multivariate vs. Univariate Anomaly Detection**
-
-One of the fundamental decisions in anomaly detection is whether to employ multivariate or univariate models. Univariate anomaly detection involves predicting a single variable, typically based on its historical values, making it suitable for straightforward time series analysis. In contrast, multivariate anomaly detection takes into account multiple interrelated variables, allowing for a more comprehensive understanding of complex systems.
-
-**Global vs. Local Models for Multivariate Forecasts**
+Finally, the user can provide "test_data" in order to receive test metrics and evaluate the Operator's performance more easily. Test data should be indexed by date and (optionally) series, and should have a -1 for anomalous rows and 1 for normal rows.
 
-When dealing with multivariate forecasts, the choice between global and local models is pivotal. Global models assume that the relationships between variables are uniform across all data points, providing a consolidated forecast for the entire dataset. In contrast, local models consider localized relationships, allowing forecasts to adapt to variations within the dataset.
-
-**Strengths and Weaknesses of Global and Local Models**
-
-Global models are advantageous when relationships between variables remain relatively stable over time. They offer simplicity and ease of interpretation, making them suitable for a wide range of applications. However, they may struggle to capture nuances in the data when relationships are not consistent throughout the dataset.
+**Multivariate vs. Univariate Anomaly Detection**
 
-Local models, on the other hand, excel in capturing localized patterns and relationships, making them well-suited for datasets with varying dynamics. They can provide more accurate forecasts in cases where global models fall short.
+If you have additional variables that you think might be related, you should use "multivariate" anomaly detection. All additional columns given in the input data will be used in determining whether the target column is anomalous.
 
 **Auto Model Selection**
 
-Some users know which modeling frameworks (this can be a specific model, such as ARIMA and Prophet or it can be an automl library like Oracle's AutoMLX) they want to use right already, the anomaly detection operator allows these more advanced users to configure this through the ``model`` parameter. For those newer users who don't know, or want to explore multiple, the anomaly detection operator sets the ``model`` parameter to "auto" by default. "auto" will select the framework that looks most appropriate given the dataset.
+Operator users don't need to know anything about the underlying models in order to use them. By default we set ``model: auto``. However, some users want more control over the modeling parameters. These users can set the ``model`` parameter to either ``autots`` or ``automlx`` and then pass parameters directly into ``model_kwargs``. See :doc:`Advanced Examples <./advanced_use_cases>`.
 
 **Anomaly Detection Documentation**
 
-This documentation will explore these concepts in greater depth, demonstrating how to leverage the flexibility and configurability of the Python library module to implement both multivariate and univariate anomaly detection models, as well as global and local approaches. By the end of this guide, users will have the knowledge and tools needed to make informed decisions when designing anomaly detection solutions tailored to their specific requirements.
+This documentation will explore these concepts in greater depth, demonstrating how to leverage the flexibility and configurability of the Python library module to implement both multivariate and univariate anomaly detection models. By the end of this guide, users will have the knowledge and tools needed to make informed decisions when designing anomaly detection solutions tailored to their specific requirements.
 
-.. versionadded:: 2.9.0
+.. versionadded:: 2.10.1
 
 .. toctree::
   :maxdepth: 1
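
Since the rewritten docs hinge on the shape of the input and validation data, here is a small illustrative sketch of what that data might look like in pandas. The column names are placeholders; only the binary "anomaly" column with its -1/1 convention is prescribed by the doc text above.

```python
# Illustrative only; column names other than "anomaly" are placeholders.
import pandas as pd

# Input data: a datetime column, a target column, and (optionally) extra variables.
train = pd.DataFrame(
    {
        "timestamp": pd.date_range("2024-01-01", periods=6, freq="h"),
        "sensor_reading": [1.0, 1.1, 0.9, 9.7, 1.0, 1.2],
        "ambient_temp": [20.1, 20.3, 20.2, 20.4, 20.0, 20.1],  # extra variable -> multivariate
    }
)

# Validation data: the same columns plus a binary "anomaly" column,
# -1 for anomalies and 1 for normal rows, as the docs describe.
validation = train.assign(anomaly=[1, 1, 1, -1, 1, 1])
```
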

tests/operators/anomaly/test_anomaly_simple.py

Lines changed: 35 additions & 27 deletions
@@ -12,6 +12,7 @@
 import tempfile
 import os
 import numpy as np
+from ads.opctl.operator.cmd import run
 
 
 MODELS = ["automlx", "autots"]
@@ -90,12 +91,14 @@ def test_artificial_big(model):
     yaml_i["spec"]["target_category_columns"] = [TARGET_CATEGORY_COLUMN]
     yaml_i["spec"]["datetime_column"]["name"] = DATETIME_COLUMN
 
-    with open(anomaly_yaml_filename, "w") as f:
-        f.write(yaml.dump(yaml_i))
-    sleep(0.1)
-    subprocess.run(
-        f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
-    )
+    run(yaml_i, backend="local", debug=False)
+
+    # with open(anomaly_yaml_filename, "w") as f:
+    #     f.write(yaml.dump(yaml_i))
+    # sleep(0.1)
+    # subprocess.run(
+    #     f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
+    # )
     sleep(0.1)
     subprocess.run(f"ls -a {output_dirname}/", shell=True)
     assert os.path.exists(f"{output_dirname}/report.html"), "Report not generated."
@@ -128,13 +131,15 @@ def test_artificial_small(model):
     yaml_i["spec"]["output_directory"]["url"] = output_dirname
     yaml_i["spec"]["contamination"] = 0.3
 
-    with open(anomaly_yaml_filename, "w") as f:
-        f.write(yaml.dump(yaml_i))
-    sleep(0.1)
-    subprocess.run(
-        f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
-    )
-    sleep(0.1)
+    run(yaml_i, backend="local", debug=False)
+
+    # with open(anomaly_yaml_filename, "w") as f:
+    #     f.write(yaml.dump(yaml_i))
+    # sleep(0.1)
+    # subprocess.run(
+    #     f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
+    # )
+    # sleep(0.1)
     subprocess.run(f"ls -a {output_dirname}/", shell=True)
     assert os.path.exists(f"{output_dirname}/report.html"), "Report not generated."
 
@@ -180,13 +185,14 @@ def test_validation(model):
     yaml_i["spec"]["output_directory"]["url"] = output_dirname
     yaml_i["spec"]["contamination"] = 0.05
 
-    with open(anomaly_yaml_filename, "w") as f:
-        f.write(yaml.dump(yaml_i))
-    sleep(0.1)
-    subprocess.run(
-        f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
-    )
-    sleep(0.1)
+    run(yaml_i, backend="local", debug=False)
+    # with open(anomaly_yaml_filename, "w") as f:
+    #     f.write(yaml.dump(yaml_i))
+    # sleep(0.1)
+    # subprocess.run(
+    #     f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
+    # )
+    # sleep(0.1)
     subprocess.run(f"ls -a {output_dirname}/", shell=True)
     assert os.path.exists(f"{output_dirname}/report.html"), "Report not generated."
 
@@ -203,13 +209,15 @@ def test_load_datasets(model, data_dict):
     yaml_i["spec"]["datetime_column"]["name"] = data_dict["dt_col"]
     yaml_i["spec"]["output_directory"]["url"] = output_dirname
 
-    with open(f"{tmpdirname}/anomaly.yaml", "w") as f:
-        f.write(yaml.dump(yaml_i))
-    sleep(0.5)
-    subprocess.run(
-        f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
-    )
-    sleep(0.1)
+    run(yaml_i, backend="local", debug=False)
+
+    # with open(f"{tmpdirname}/anomaly.yaml", "w") as f:
+    #     f.write(yaml.dump(yaml_i))
+    # sleep(0.5)
+    # subprocess.run(
+    #     f"ads operator run -f {anomaly_yaml_filename} --debug", shell=True
+    # )
+    # sleep(0.1)
     subprocess.run(f"ls -a {output_dirname}/", shell=True)
 
     # train_metrics = pd.read_csv(f"{output_dirname}/metrics.csv")
