The [Oracle Accelerated Data Science (ADS) SDK](https://accelerated-data-science.readthedocs.io/en/latest/index.html) is maintained by the Oracle Cloud Infrastructure (OCI) [Data Science service](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) team. It speeds up common data science activities by providing tools that automate and simplify everyday tasks. Additionally, it provides data scientists a friendly, Pythonic interface to OCI services. Some of the more notable services are OCI Data Science, Model Catalog, Model Deployment, Jobs, ML Pipelines, Data Flow, Object Storage, Vault, Big Data Service, Data Catalog, and the Autonomous Database. ADS gives you an interface to manage the life cycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
With ADS you can:

- Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into Pandas dataframes.
- Tune models using hyperparameter optimization with the `ADSTuner` tool.
- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
- Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm).
- Deploy models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
- Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm).
- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
- Define and run end-to-end machine learning orchestrations covering every step of the machine learning lifecycle in a repeatable, continuous way with [ML Pipelines](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/pipeline/overview.html).
- Manage the life cycle of conda environments through the `ads conda` command line interface (CLI).
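As a sketch of the first capability above: with the `ocifs` filesystem package installed, `pandas` can read `oci://` URIs directly. The example below uses an in-memory buffer as a stand-in for the cloud read so it runs anywhere; the bucket and namespace names in the comment are placeholders, not real values.

```python
import io

import pandas as pd

# With ocifs installed, the same call accepts Object Storage URIs, e.g.:
#   pd.read_csv("oci://<bucket>@<namespace>/train.csv")   # placeholder names
# Here we read from an in-memory buffer as a local stand-in.
csv_data = io.StringIO("id,price\n1,10.5\n2,20.0\n")
df = pd.read_csv(csv_data)

print(df.shape)  # (2, 2)
```

The point is that the Object Storage integration keeps the familiar `pandas` reading API; only the URI scheme changes.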
Oracle Accelerated Data Science SDK (ADS)

modules

.. admonition:: Oracle Accelerated Data Science (ADS)
   :class: note

   Oracle Accelerated Data Science (ADS) is maintained by the Oracle Cloud Infrastructure Data Science service team. It speeds up common data science activities by providing tools that automate and simplify everyday tasks, along with a friendly, Pythonic interface for data scientists to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.

   With ADS you can:

   - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into Pandas dataframes.
   - Easily compute summary statistics on your dataframes and perform data profiling.
   - Tune models using hyperparameter optimization with the ADSTuner tool.
   - Generate detailed evaluation reports of your model candidates with the ADSEvaluator module.
   - Save machine learning models to OCI Data Science Models.
   - Deploy those models as HTTPS endpoints with Model Deployment.
   - Launch distributed ETL, data processing, and model training jobs in Spark with OCI Data Flow.
   - Train machine learning models in OCI Data Science Jobs.
   - Manage the lifecycle of conda environments through the ads conda command line interface (CLI).
   - Run distributed training with PyTorch, Horovod, and Dask.
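To illustrate what a tuner such as ADSTuner automates, here is a minimal, library-free random-search sketch over a single hyperparameter. The objective function and search range are invented for illustration; ADSTuner's actual API and search strategies differ.

```python
import random

def validation_score(lr: float) -> float:
    # Stand-in objective that peaks at lr = 0.1 -- a made-up surrogate
    # for a real model's validation metric.
    return 1.0 - (lr - 0.1) ** 2

random.seed(0)  # deterministic for reproducibility
best_lr, best_score = None, float("-inf")
for _ in range(200):
    lr = random.uniform(0.001, 1.0)   # sample from the search space
    score = validation_score(lr)
    if score > best_score:
        best_lr, best_score = lr, score

print(f"best lr ~ {best_lr:.3f}, score ~ {best_score:.4f}")
```

A real tuner adds smarter sampling (e.g. Bayesian strategies), early stopping, and parallel trials on top of this basic search loop.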
docs/source/user_guide/apachespark/dataflow-spark-magic.rst

Use the ``%help`` method to get a list of all the available commands.

   %help

.. admonition:: Tip
   :class: note

   To access the docstrings of any magic command and figure out what arguments to provide, simply add ``?`` at the end of the command. For instance: ``%create_session?``
docs/source/user_guide/apachespark/dataflow.rst

Use the config defined above to submit the cell.

.. admonition:: Tip
   :class: note

   Get more information about the dataflow extension by running ``%dataflow -h``
To submit your notebook to DataFlow using the ``ads`` CLI, run:

   ads opctl run -s <folder where notebook is located> -e <notebook name> -b dataflow

.. admonition:: Tip
   :class: note

   You can avoid running cells that are not DataFlow environment compatible by tagging the cells and then providing the tag names to ignore. In the following example, cells that are tagged ``ignore`` and ``remove`` will be ignored:
   ``--exclude-tag ignore --exclude-tag remove``

.. admonition:: Tip
   :class: note

   You can run the notebook in your local PySpark environment before submitting to DataFlow using the same CLI with ``-b local``.
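Combining the command and flags shown above into one invocation (the folder and notebook names here are placeholders, not values from the docs):

```shell
# Submit the notebook to DataFlow, skipping cells tagged "ignore" or "remove".
# Only the flags documented above are used; ./notebooks and my_notebook.ipynb
# are hypothetical paths.
ads opctl run -s ./notebooks -e my_notebook.ipynb -b dataflow \
    --exclude-tag ignore --exclude-tag remove

# Dry-run the same notebook in a local PySpark environment first.
ads opctl run -s ./notebooks -e my_notebook.ipynb -b local
```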
docs/source/user_guide/apachespark/spark.rst

Apache Spark

.. admonition:: DataFlow
   :class: note

   Oracle Cloud Infrastructure (OCI) Data Flow is a fully managed, serverless, and on-demand Apache Spark service that performs data processing or model training tasks on extremely large datasets without infrastructure to deploy or manage.
docs/source/user_guide/cli/opctl/localdev/condapack.rst

create

Build conda packs from your workstation using the ``ads opctl conda create`` subcommand.

.. admonition:: Tip
   :class: note

   To publish a conda pack that is natively installed on an Oracle Linux host (compute or laptop), use the ``NO_CONTAINER`` environment variable to remove the dependency on the ml-job container image:
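A hedged sketch of the build-then-publish flow using only the subcommand and environment variable named on this page; any additional flags your environment requires (pack name, slug, etc.) are not shown and should be taken from ``ads opctl conda --help``.

```shell
# Build a conda pack on the workstation (flags omitted; consult
# `ads opctl conda create --help` for the options your setup needs).
ads opctl conda create

# Publish from a native Oracle Linux install without pulling the
# ml-job container image; NO_CONTAINER is the variable named above.
NO_CONTAINER=1 ads opctl conda publish
```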