
Commit e3f1d60 (merge; parents 4d06a9c and 129ede9)

Merge remote-tracking branch 'refs/remotes/origin/main'

Conflicts:
    .github/workflows/publish-to-pypi.yml
    .github/workflows/publish-to-readthedocs.yml
    .github/workflows/run-unittests-default_setup.yml
    .github/workflows/run-unittests.yml
    README.md
    ads/common/decorator/deprecate.py
    ads/common/serializer.py
    ads/common/utils.py
    ads/feature_engineering/adsstring/parsers/__init__.py
    ads/jobs/ads_job.py
    ads/jobs/builders/base.py
    ads/jobs/builders/infrastructure/dataflow.py
    ads/jobs/builders/infrastructure/dsc_job.py
    ads/jobs/builders/infrastructure/dsc_job_runtime.py
    ads/jobs/builders/runtimes/base.py
    ads/jobs/builders/runtimes/python_runtime.py
    ads/jobs/serializer.py
    ads/model/artifact.py
    ads/model/common/utils.py
    ads/model/deployment/model_deployment.py
    ads/model/deployment/model_deployment_infrastructure.py
    ads/model/deployment/model_deployment_runtime.py
    ads/model/framework/spark_model.py
    ads/model/generic_model.py
    ads/model/model_metadata_mixin.py
    ads/model/serde/__init__.py
    ads/model/service/oci_datascience_model_deployment.py
    ads/opctl/backend/ads_model_deployment.py
    ads/opctl/cli.py
    ads/opctl/cmds.py
    ads/templates/score_pytorch.jinja2
    docs/source/conf.py
    docs/source/user_guide/configuration/authentication.rst
    docs/source/user_guide/jobs/data_science_job.rst
    docs/source/user_guide/jobs/index.rst
    docs/source/user_guide/jobs/policies.rst
    docs/source/user_guide/jobs/run_container.rst
    docs/source/user_guide/jobs/run_git.rst
    docs/source/user_guide/model_training/automl/quick_start.rst

File tree: 26 files changed (+107 −103 lines)

README.md (12 additions, 12 deletions)

@@ -1,21 +1,21 @@
-# Oracle Accelerated Data Science SDK (ADS)
+# Oracle Accelerated Data Science (ADS)
+
+[![PyPI](https://img.shields.io/pypi/v/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/oracle-ads/) [![Python](https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/oracle-ads/) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://github.com/ambv/black)

-[![PyPI](https://img.shields.io/pypi/v/oracle-ads.svg)](https://pypi.org/project/oracle-ads/) [![Python](https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=plastic)](https://pypi.org/project/oracle-ads/)

 The [Oracle Accelerated Data Science (ADS) SDK](https://accelerated-data-science.readthedocs.io/en/latest/index.html) is maintained by the Oracle Cloud Infrastructure (OCI) [Data Science service](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) team. It speeds up common data science activities by providing tools that automate and simplify common data science tasks. Additionally, provides data scientists a friendly pythonic interface to OCI services. Some of the more notable services are OCI Data Science, Model Catalog, Model Deployment, Jobs, ML Pipelines, Data Flow, Object Storage, Vault, Big Data Service, Data Catalog, and the Autonomous Database. ADS gives you an interface to manage the life cycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.

 With ADS you can:

-- Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3 and other sources into `Pandas dataframes`.
-- Use feature types to characterize your data, create meaning summary statistics and plot. Use the warning and validation system to test the quality of your data.
-- Tune models using hyperparameter optimization with the `ADSTuner` tool.
-- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
-- Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm).
-- Deploy models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
-- Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm).
-- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
-- Define and run an end-to-end machine learning orchestration covering all the steps of machine learning lifecycle in a repeatable, continuous [ML Pipelines](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/pipeline/overview.html#).
-- Manage the life cycle of conda environments through the `ads conda` command line interface (CLI).
+- Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3 and other sources into `Pandas dataframes`.
+- Tune models using hyperparameter optimization with the `ADSTuner` tool.
+- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
+- Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm).
+- Deploy models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
+- Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm).
+- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
+- Define and run an end-to-end machine learning orchestration covering all the steps of machine learning lifecycle in a repeatable, continuous [ML Pipelines](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/pipeline/overview.html#).
+- Manage the life cycle of conda environments through the `ads conda` command line interface (CLI).

 ## Installation

docs/requirements.txt (3 additions, 2 deletions)

@@ -2,11 +2,12 @@ autodoc
 nbsphinx
 sphinx
 sphinxcontrib-napoleon
-sphinx-rtd-theme
 sphinx_copybutton
 sphinx_code_tabs
-oracle_ads
 sphinx-autobuild
+sphinx-autorun
+oracle_ads
+furo
 IPython
 pandoc
 rstcheck
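The dependency changes track the theme migration further down in this commit: `sphinx-rtd-theme` drops out, while `furo` and `sphinx-autorun` come in (`oracle_ads` is only reordered). A quick way to confirm the new packages resolve in a docs build environment; this is a hypothetical check, not part of the commit:

    # Hypothetical sanity check for the updated docs dependencies (Python 3.8+).
    from importlib.metadata import PackageNotFoundError, version

    for dist in ("furo", "sphinx-autorun", "oracle-ads"):
        try:
            # Raises PackageNotFoundError if the distribution is missing.
            print(f"{dist}=={version(dist)}")
        except PackageNotFoundError:
            print(f"{dist} is not installed")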
(binary file, 8 KB)

docs/source/conf.py (21 additions, 41 deletions)

@@ -5,6 +5,8 @@
 import datetime
 import os
 import sys
+from typing import Any, Dict
+

 # This causes documentation within the __init__ method to be pulled into the documentation properly
 autoclass_content = "both"
@@ -21,22 +23,24 @@
 version = release = __import__("ads").__version__

 extensions = [
-    "sphinx_rtd_theme",
     "sphinx.ext.napoleon",
     "sphinx.ext.autodoc",
     "sphinx.ext.doctest",
-    "sphinx.ext.todo",
     "sphinx.ext.mathjax",
     "sphinx.ext.ifconfig",
+    "sphinx.ext.autodoc",
+    "sphinx.ext.todo",
+    "sphinx.ext.extlinks",
+    "sphinx.ext.intersphinx",
     "sphinx.ext.graphviz",
-    "sphinx.ext.inheritance_diagram",
     "nbsphinx",
     "sphinx_code_tabs",
     "sphinx_copybutton",
     "sphinx.ext.duration",
     "sphinx.ext.autosummary",
     "sphinx.ext.intersphinx",
     "sphinx.ext.viewcode",
+    "sphinx_autorun",
 ]

 # Add any paths that contain templates here, relative to this directory.
@@ -62,53 +66,29 @@
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
 # exclude_patterns = []
-exclude_patterns = ['build', '**.ipynb_checkpoints']
+exclude_patterns = ['build', '**.ipynb_checkpoints', 'Thumbs.db', '.DS_Store']

-# The name of the Pygments (syntax highlighting) style to use.
-pygments_style = None
+language = "en"

-html_logo = "_static/oracle_logo.png"
+html_theme = "furo"
+html_static_path = ["_static"]

-# -- Options for HTML output -------------------------------------------------
+html_title = f"{project} v{release}"

-# The theme to use for HTML and HTML Help pages. See the documentation for
-# a list of builtin themes.
-#
-html_theme = "sphinx_rtd_theme"
+# Disable the generation of the various indexes
+html_use_modindex = False
+html_use_index = False
+
+# html_css_files = [
+#     'pied-piper-admonition.css',
+# ]

-# Theme options are theme-specific and customize the look and feel of a theme
-# further. For a list of options available for each theme, see the
-# documentation.
-#
 html_theme_options = {
-    "logo_only": False,
-    # Toc options
-    "sticky_navigation": True,
-    "navigation_depth": 4,
-    "includehidden": True,
-    "titles_only": False,
-    "display_version": True,
+    "light_logo": "logo-light-mode.png",
+    "dark_logo": "logo-dark-mode.png",
 }

-# Add any paths that contain custom static files (such as style sheets) here,
-# relative to this directory. They are copied after the builtin static files,
-# so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ["_static"]
-
-# Custom sidebar templates, must be a dictionary that maps document names
-# to template names.
-#
-# The default sidebars (for documents that don't match any pattern) are
-# defined by theme itself. Builtin themes are using these templates by
-# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
-# 'searchbox.html']``.
-#
-# html_sidebars = {}
-
-
-# -- Options for HTMLHelp output ---------------------------------------------

-# Output file base name for HTML help builder.
 htmlhelp_basename = "pydoc"
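Read together, the conf.py hunks swap `sphinx_rtd_theme` for `furo` and wire in `sphinx-autorun`. A condensed sketch of the resulting theme configuration, assembled only from the added lines above (the real conf.py contains more settings, and the `project` value here is an assumption):

    # Sketch assembled from the added lines in this diff; not the full conf.py.
    project = "ADS"  # assumption: defined earlier in the real conf.py
    version = release = __import__("ads").__version__

    extensions = [
        # ... extensions kept from the old configuration ...
        "sphinx.ext.extlinks",
        "sphinx.ext.intersphinx",
        "sphinx_autorun",  # provides the .. runblock:: directive used in index.rst
    ]

    language = "en"
    exclude_patterns = ["build", "**.ipynb_checkpoints", "Thumbs.db", ".DS_Store"]

    html_theme = "furo"
    html_static_path = ["_static"]
    html_title = f"{project} v{release}"
    html_theme_options = {
        "light_logo": "logo-light-mode.png",  # furo switches logos with the color scheme
        "dark_logo": "logo-dark-mode.png",
    }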

docs/source/index.rst (33 additions, 34 deletions)

@@ -6,15 +6,15 @@
 library and CLI for Machine learning engineers to work with Cloud Infrastructure (CPU and GPU VMs, Storage etc, Spark) for Data, Models,
 Notebooks, Pipelines and Jobs.

-Oracle Accelerated Data Science SDK (ADS)
-=========================================
+Oracle Accelerated Data Science (ADS)
+=====================================
 |PyPI|_ |Python|_ |Notebook Examples|_

-.. |PyPI| image:: https://img.shields.io/pypi/v/oracle-ads.svg
+.. |PyPI| image:: https://img.shields.io/pypi/v/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white
 .. _PyPI: https://pypi.org/project/oracle-ads/
-.. |Python| image:: https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=plastic
+.. |Python| image:: https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white
 .. _Python: https://pypi.org/project/oracle-ads/
-.. |Notebook Examples| image:: https://img.shields.io/badge/docs-notebook--examples-blue
+.. |Notebook Examples| image:: https://img.shields.io/badge/docs-notebook--examples-blue?style=for-the-badge&logo=pypi&logoColor=white
 .. _Notebook Examples: https://github.com/oracle-samples/oci-data-science-ai-samples/tree/master/notebook_examples

 .. toctree::
@@ -66,47 +66,44 @@ Oracle Accelerated Data Science SDK (ADS)

   modules

-.. admonition:: Oracle Accelerated Data Science (ADS) SDK
+.. admonition:: Oracle Accelerated Data Science (ADS)
+  :class: note

-    The Oracle Accelerated Data Science (ADS) SDK is maintained by the Oracle Cloud Infrastructure Data Science service team. It speeds up common data science activities by providing tools that automate and/or simplify common data science tasks, along with providing a data scientist friendly pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
+  Oracle Accelerated Data Science (ADS) is maintained by the Oracle Cloud Infrastructure Data Science service team. It speeds up common data science activities by providing tools that automate and/or simplify common data science tasks, along with providing a data scientist friendly pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.

-    With ADS you can:
+  With ADS you can:

-    - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into Pandas dataframes.
-    - Easily compute summary statistics on your dataframes and perform data profiling.
-    - Tune models using hyperparameter optimization with the ADSTuner tool.
-    - Generate detailed evaluation reports of your model candidates with the ADSEvaluator module.
-    - Save machine learning models to the OCI Data Science Models.
-    - Deploy those models as HTTPS endpoints with Model Deployment.
-    - Launch distributed ETL, data processing, and model training jobs in Spark with OCI Data Flow.
-    - Train machine learning models in OCI Data Science Jobs.
-    - Manage the lifecycle of conda environments through the ads conda command line interface (CLI).
-    - Distributed Training with PyTorch, Horovod and Dask
+  - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into Pandas dataframes.
+  - Easily compute summary statistics on your dataframes and perform data profiling.
+  - Tune models using hyperparameter optimization with the ADSTuner tool.
+  - Generate detailed evaluation reports of your model candidates with the ADSEvaluator module.
+  - Save machine learning models to the OCI Data Science Models.
+  - Deploy those models as HTTPS endpoints with Model Deployment.
+  - Launch distributed ETL, data processing, and model training jobs in Spark with OCI Data Flow.
+  - Train machine learning models in OCI Data Science Jobs.
+  - Manage the lifecycle of conda environments through the ads conda command line interface (CLI).
+  - Distributed Training with PyTorch, Horovod and Dask


 .. admonition:: Installation
+  :class: note

   python3 -m pip install oracle-ads


 .. admonition:: Source Code
+  :class: note

   `https://github.com/oracle/accelerated-data-science <https://github.com/oracle/accelerated-data-science>`_

-.. code:: ipython3
-
+.. code-block:: python3
   >>> import ads
   >>> ads.hello()

-  O o-o o-o
-  / \ | \ |
-  o---o| O o-o
-  | || / |
-  o oo-o o--o
+.. runblock:: pycon

-  ADS SDK version: X.Y.Z
-  Pandas version: x.y.z
-  Debug mode: False
+  >>> import ads
+  >>> ads.hello()


 Additional Documentation
@@ -115,6 +112,8 @@ Additional Documentation
 - `OCI Data Science and AI services Examples <https://github.com/oracle/oci-data-science-ai-samples>`_
 - `Oracle AI & Data Science Blog <https://blogs.oracle.com/ai-and-datascience/>`_
 - `OCI Documentation <https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm>`_
+- `OCIFS Documentation <https://ocifs.readthedocs.io/en/latest/>`_
+- `Example Notebooks <https://github.com/oracle-samples/oci-data-science-ai-samples/tree/master/notebook_examples>`_

 Examples
 ++++++++
@@ -147,25 +146,25 @@ This example uses SQL injection safe binding variables.

 .. code-block:: python3

-  import ads
-  import pandas as pd
+    import ads
+    import pandas as pd

-  connection_parameters = {
+    connection_parameters = {
        "user_name": "<user_name>",
        "password": "<password>",
        "service_name": "<tns_name>",
        "wallet_location": "<file_path>",
-  }
+    }

-  df = pd.DataFrame.ads.read_sql(
+    df = pd.DataFrame.ads.read_sql(
        """
        SELECT *
        FROM SH.SALES
        WHERE ROWNUM <= :max_rows
        """,
        bind_variables={ max_rows : 100 },
        connection_parameters=connection_parameters,
-  )
+    )

 More Examples
 ~~~~~~~~~~~~~
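One detail worth flagging in the `read_sql` example carried as context above: `bind_variables={ max_rows : 100 }` uses an unquoted key, which would raise a `NameError` in plain Python. A sketch of the presumably intended call, with the bind variable keyed by a string (an assumption based on the `:max_rows` placeholder, not something this commit changes):

    import ads  # noqa: F401 -- importing ads registers the .ads accessor on pandas DataFrames
    import pandas as pd

    connection_parameters = {
        "user_name": "<user_name>",
        "password": "<password>",
        "service_name": "<tns_name>",
        "wallet_location": "<file_path>",
    }

    # Presumed fix: the key is the string "max_rows", matching the :max_rows
    # placeholder in the SQL text.
    df = pd.DataFrame.ads.read_sql(
        """
        SELECT *
        FROM SH.SALES
        WHERE ROWNUM <= :max_rows
        """,
        bind_variables={"max_rows": 100},
        connection_parameters=connection_parameters,
    )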

docs/source/user_guide/apachespark/dataflow-spark-magic.rst (1 addition, 0 deletions)

@@ -79,6 +79,7 @@ Use the `%help` method to get a list of all the available commands, along with a
    %help

 .. admonition:: Tip
+  :class: note

   To access the docstrings of any magic command and figure out what arguments to provide, simply add ``?`` at the end of the command. For instance: ``%create_session?``

docs/source/user_guide/apachespark/dataflow.rst (3 additions, 0 deletions)

@@ -41,6 +41,7 @@ Define config. If you have not yet configured your dataflow setting, or would li
 Use the config defined above to submit the cell.

 .. admonition:: Tip
+  :class: note

   Get more information about the dataflow extension by running ``%dataflow -h``

@@ -131,11 +132,13 @@ To submit your notebook to DataFlow using the ``ads`` CLI, run:
    ads opctl run -s <folder where notebook is located> -e <notebook name> -b dataflow

 .. admonition:: Tip
+  :class: note

   You can avoid running cells that are not DataFlow environment compatible by tagging the cells and then providing the tag names to ignore. In the following example cells that are tagged ``ignore`` and ``remove`` will be ignored -
   ``--exclude-tag ignore --exclude-tag remove``

 .. admonition:: Tip
+  :class: note

   You can run the notebook in your local pyspark environment before submitting to ``DataFlow`` using the same CLI with ``-b local``

docs/source/user_guide/apachespark/spark.rst (1 addition, 0 deletions)

@@ -4,6 +4,7 @@ Apache Spark


 .. admonition:: DataFlow
+  :class: note

   Oracle Cloud Infrastructure (OCI) Data Flow is a fully managed, serverless, and on-demand Apache Spark Service that performs data processing or model training tasks on extremely large datasets without infrastructure to deploy or manage.

docs/source/user_guide/cli/opctl/localdev/condapack.rst (1 addition, 0 deletions)

@@ -25,6 +25,7 @@ create
 Build conda packs from your workstation using ``ads opctl conda create`` subcommand.

 .. admonition:: Tip
+  :class: note

   To publish a conda pack that is natively installed on a oracle linux host (compute or laptop), use ``NO_CONTAINER`` environment variable to remove dependency on the ml-job container image:
