Commit 37250d2

Merge branch 'main' into ODSC-37153/action_to_publish_docs

2 parents 682bd94 + 5738f9f

File tree

28 files changed: +106 −71 lines

.github/workflows/publish-to-readthedocs.yml (1 addition, 0 deletions)

@@ -13,6 +13,7 @@ on:
      - main
    paths:
      - 'docs/**'
+
 env:
   RTDS_ADS_PROJECT: https://readthedocs.org/api/v3/projects/accelerated-data-science
   RTDS_ADS_TOKEN: ${{ secrets.RTDS_ADS_TOKEN }}
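The job steps that consume these two environment variables are outside this hunk, so what follows is a sketch, not the repository's actual workflow code. Assuming the workflow triggers a build through the Read the Docs v3 API (the `RTDS_ADS_PROJECT` URL is the v3 project endpoint), a step might build the request like this; the `version` slug and token value are hypothetical:

```python
# Hypothetical sketch of triggering a Read the Docs build via the v3 API,
# using the two variables the workflow defines. The real job steps are not
# shown in this diff.
def build_trigger_request(project_url: str, token: str, version: str = "latest"):
    """Return the URL and headers for the RTD v3 'trigger build' endpoint."""
    url = f"{project_url}/versions/{version}/builds/"
    headers = {"Authorization": f"Token {token}"}
    return url, headers


url, headers = build_trigger_request(
    "https://readthedocs.org/api/v3/projects/accelerated-data-science",
    "example-token",  # in CI this would come from secrets.RTDS_ADS_TOKEN
)
# A real step would then POST, e.g. requests.post(url, headers=headers)
```

Keeping the token in a repository secret (as the workflow does) means the request headers never appear in the checked-in YAML.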

README.md (12 additions, 12 deletions)

@@ -1,21 +1,21 @@
-# Oracle Accelerated Data Science SDK (ADS)
+# Oracle Accelerated Data Science (ADS)
+
+[![PyPI](https://img.shields.io/pypi/v/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/oracle-ads/) [![Python](https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/oracle-ads/) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://github.com/ambv/black)
 
-[![PyPI](https://img.shields.io/pypi/v/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/oracle-ads/) [![Python](https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white)](https://pypi.org/project/oracle-ads/)
 
 The [Oracle Accelerated Data Science (ADS) SDK](https://accelerated-data-science.readthedocs.io/en/latest/index.html) is maintained by the Oracle Cloud Infrastructure (OCI) [Data Science service](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) team. It speeds up common data science activities by providing tools that automate and simplify common data science tasks. Additionally, provides data scientists a friendly pythonic interface to OCI services. Some of the more notable services are OCI Data Science, Model Catalog, Model Deployment, Jobs, ML Pipelines, Data Flow, Object Storage, Vault, Big Data Service, Data Catalog, and the Autonomous Database. ADS gives you an interface to manage the life cycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
 
 With ADS you can:
 
-- Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3 and other sources into `Pandas dataframes`.
-- Use feature types to characterize your data, create meaning summary statistics and plot. Use the warning and validation system to test the quality of your data.
-- Tune models using hyperparameter optimization with the `ADSTuner` tool.
-- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
-- Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm).
-- Deploy models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
-- Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm).
-- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
-- Define and run an end-to-end machine learning orchestration covering all the steps of machine learning lifecycle in a repeatable, continuous [ML Pipelines](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/pipeline/overview.html#).
-- Manage the life cycle of conda environments through the `ads conda` command line interface (CLI).
+- Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3 and other sources into `Pandas dataframes`.
+- Tune models using hyperparameter optimization with the `ADSTuner` tool.
+- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
+- Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm).
+- Deploy models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
+- Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm).
+- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
+- Define and run an end-to-end machine learning orchestration covering all the steps of machine learning lifecycle in a repeatable, continuous [ML Pipelines](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/pipeline/overview.html#).
+- Manage the life cycle of conda environments through the `ads conda` command line interface (CLI).
 
 ## Installation
 
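The first README bullet (reading Object Storage datasets into pandas) can be illustrated with a small sketch. The `oci://` URI scheme comes from ocifs, ADS's fsspec filesystem for Object Storage; the bucket, namespace, and object names below are hypothetical, and the actual read is commented out because it needs OCI credentials:

```python
# Hedged sketch: pandas reads Object Storage objects through ocifs using a
# URI of the form oci://<bucket>@<namespace>/<object>. Names are made up.
def oci_uri(bucket: str, namespace: str, path: str) -> str:
    """Build an oci:// URI for an Object Storage object."""
    return f"oci://{bucket}@{namespace}/{path}"


uri = oci_uri("my-bucket", "my-namespace", "data/sales.csv")
# With ocifs installed and an OCI config in place, one would then read:
# import pandas as pd
# df = pd.read_csv(uri, storage_options={"config": "~/.oci/config"})
```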

docs/requirements.txt (1 addition, 2 deletions)

@@ -2,11 +2,10 @@ autodoc
 nbsphinx
 sphinx
 sphinxcontrib-napoleon
-sphinx-rtd-theme
 sphinx_copybutton
 sphinx_code_tabs
 sphinx-autobuild
-sphinx-design
+sphinx-autorun
 oracle_ads
 furo
 IPython
Two binary image files changed (52.3 KB and 8 KB; previews not shown in this capture).

docs/source/_static/oracle_logo.png (−12.6 KB)

Binary file not shown.

docs/source/conf.py (24 additions, 12 deletions)

@@ -23,21 +23,20 @@
 version = release = __import__("ads").__version__
 
 extensions = [
-    "sphinx_rtd_theme",
     "sphinx.ext.napoleon",
     "sphinx.ext.autodoc",
     "sphinx.ext.doctest",
-    "sphinx.ext.todo",
     "sphinx.ext.mathjax",
     "sphinx.ext.ifconfig",
-    "sphinx.ext.graphviz",
-    "sphinx.ext.inheritance_diagram",
+    "sphinx.ext.autodoc",
     "sphinx.ext.todo",
-    "sphinx.ext.viewcode",
+    "sphinx.ext.extlinks",
+    "sphinx.ext.intersphinx",
+    "sphinx.ext.graphviz",
     "nbsphinx",
     "sphinx_code_tabs",
-    "sphinx_design",
-    "sphinx_copybutton"
+    "sphinx_copybutton",
+    "sphinx_autorun",
 ]
 
 # Add any paths that contain templates here, relative to this directory.
@@ -63,16 +62,29 @@
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
 # exclude_patterns = []
-exclude_patterns = ['build', '**.ipynb_checkpoints']
+exclude_patterns = ['build', '**.ipynb_checkpoints', 'Thumbs.db', '.DS_Store']
 
-# The name of the Pygments (syntax highlighting) style to use.
-pygments_style = None
 language = "en"
 
 html_theme = "furo"
-html_logo = "_static/oracle_logo.png"
 html_static_path = ["_static"]
-html_css_files = ["pied-piper-admonition.css"]
+
+html_title = f"{project} v{release}"
+
+# Disable the generation of the various indexes
+html_use_modindex = False
+html_use_index = False
+
+# html_css_files = [
+#     'pied-piper-admonition.css',
+# ]
+
+html_theme_options = {
+    "light_logo": "logo-light-mode.png",
+    "dark_logo": "logo-dark-mode.png",
+}
+
+
 htmlhelp_basename = "pydoc"
 
 
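A note on the theme change in this file: furo's `light_logo`/`dark_logo` options name files resolved against `html_static_path`, which is why the single `html_logo` setting is dropped in favor of `html_theme_options`. A minimal sketch of how those paths resolve, assuming the two logo files live in the repository's `_static` directory:

```python
# Sketch: furo's "light_logo"/"dark_logo" values are plain file names that
# the theme looks up inside html_static_path, replacing the old html_logo.
from pathlib import Path

html_static_path = ["_static"]
html_theme_options = {
    "light_logo": "logo-light-mode.png",
    "dark_logo": "logo-dark-mode.png",
}

# Paths the theme effectively uses (relative to the docs source dir):
logo_paths = [
    (Path(html_static_path[0]) / name).as_posix()
    for name in html_theme_options.values()
]
```

With one logo per color scheme, the light and dark variants swap automatically when the reader toggles furo's theme mode.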

docs/source/index.rst (30 additions, 31 deletions)

@@ -6,8 +6,8 @@
 library and CLI for Machine learning engineers to work with Cloud Infrastructure (CPU and GPU VMs, Storage etc, Spark) for Data, Models,
 Notebooks, Pipelines and Jobs.
 
-Oracle Accelerated Data Science SDK (ADS)
-=========================================
+Oracle Accelerated Data Science (ADS)
+=====================================
 |PyPI|_ |Python|_ |Notebook Examples|_
 
 .. |PyPI| image:: https://img.shields.io/pypi/v/oracle-ads.svg?style=for-the-badge&logo=pypi&logoColor=white
@@ -66,47 +66,44 @@ Oracle Accelerated Data Science SDK (ADS)
 
    modules
 
-.. admonition:: Oracle Accelerated Data Science (ADS) SDK
+.. admonition:: Oracle Accelerated Data Science (ADS)
+  :class: note
 
-  The Oracle Accelerated Data Science (ADS) SDK is maintained by the Oracle Cloud Infrastructure Data Science service team. It speeds up common data science activities by providing tools that automate and/or simplify common data science tasks, along with providing a data scientist friendly pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
+  Oracle Accelerated Data Science (ADS) is maintained by the Oracle Cloud Infrastructure Data Science service team. It speeds up common data science activities by providing tools that automate and/or simplify common data science tasks, along with providing a data scientist friendly pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
 
-  With ADS you can:
+  With ADS you can:
 
-  - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into Pandas dataframes.
-  - Easily compute summary statistics on your dataframes and perform data profiling.
-  - Tune models using hyperparameter optimization with the ADSTuner tool.
-  - Generate detailed evaluation reports of your model candidates with the ADSEvaluator module.
-  - Save machine learning models to the OCI Data Science Models.
-  - Deploy those models as HTTPS endpoints with Model Deployment.
-  - Launch distributed ETL, data processing, and model training jobs in Spark with OCI Data Flow.
-  - Train machine learning models in OCI Data Science Jobs.
-  - Manage the lifecycle of conda environments through the ads conda command line interface (CLI).
-  - Distributed Training with PyTorch, Horovod and Dask
+  - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into Pandas dataframes.
+  - Easily compute summary statistics on your dataframes and perform data profiling.
+  - Tune models using hyperparameter optimization with the ADSTuner tool.
+  - Generate detailed evaluation reports of your model candidates with the ADSEvaluator module.
+  - Save machine learning models to the OCI Data Science Models.
+  - Deploy those models as HTTPS endpoints with Model Deployment.
+  - Launch distributed ETL, data processing, and model training jobs in Spark with OCI Data Flow.
+  - Train machine learning models in OCI Data Science Jobs.
+  - Manage the lifecycle of conda environments through the ads conda command line interface (CLI).
+  - Distributed Training with PyTorch, Horovod and Dask
 
 
 .. admonition:: Installation
+  :class: note
 
   python3 -m pip install oracle-ads
 
 
 .. admonition:: Source Code
+  :class: note
 
   `https://github.com/oracle/accelerated-data-science <https://github.com/oracle/accelerated-data-science>`_
 
-.. code:: ipython3
-
+.. code-block:: python3
   >>> import ads
   >>> ads.hello()
 
-    O  o-o   o-o
-   / \ |  \ |
-  o---o|   O o-o
-  |   ||  /     |
-  o   oo-o  o--o
+.. runblock:: pycon
 
-  ADS SDK version: X.Y.Z
-  Pandas version: x.y.z
-  Debug mode: False
+  >>> import ads
+  >>> ads.hello()
 
 
 Additional Documentation
@@ -115,6 +112,8 @@ Additional Documentation
 - `OCI Data Science and AI services Examples <https://github.com/oracle/oci-data-science-ai-samples>`_
 - `Oracle AI & Data Science Blog <https://blogs.oracle.com/ai-and-datascience/>`_
 - `OCI Documentation <https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm>`_
+- `OCIFS Documentation <https://ocifs.readthedocs.io/en/latest/>`_
+- `Example Notebooks <https://github.com/oracle-samples/oci-data-science-ai-samples/tree/master/notebook_examples>`_
 
 Examples
 ++++++++
@@ -147,25 +146,25 @@ This example uses SQL injection safe binding variables.
 
 .. code-block:: python3
 
-  import ads
-  import pandas as pd
+    import ads
+    import pandas as pd
 
-  connection_parameters = {
+    connection_parameters = {
        "user_name": "<user_name>",
        "password": "<password>",
        "service_name": "<tns_name>",
        "wallet_location": "<file_path>",
-  }
+    }
 
-  df = pd.DataFrame.ads.read_sql(
+    df = pd.DataFrame.ads.read_sql(
        """
        SELECT *
        FROM SH.SALES
        WHERE ROWNUM <= :max_rows
        """,
        bind_variables={ max_rows : 100 },
        connection_parameters=connection_parameters,
-  )
+    )
 
 More Examples
 ~~~~~~~~~~~~~
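One nit in the `read_sql` example as it appears on both sides of this hunk: `bind_variables={ max_rows : 100 }` references an undefined bare name `max_rows` and would raise a `NameError`. A runnable form presumably keys the dict with a string matching the `:max_rows` placeholder; the credential placeholders below are hypothetical, and the actual query is commented out since it needs a database:

```python
# The bind variable must be a string key matching the :max_rows placeholder
# in the SQL text; the bare name `max_rows` in the diff would be undefined.
connection_parameters = {
    "user_name": "<user_name>",
    "password": "<password>",
    "service_name": "<tns_name>",
    "wallet_location": "<file_path>",
}
bind_variables = {"max_rows": 100}  # SQL-injection-safe binding

# With real credentials one would then run:
# import ads
# import pandas as pd
# df = pd.DataFrame.ads.read_sql(
#     "SELECT * FROM SH.SALES WHERE ROWNUM <= :max_rows",
#     bind_variables=bind_variables,
#     connection_parameters=connection_parameters,
# )
```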

docs/source/user_guide/apachespark/dataflow-spark-magic.rst (1 addition, 0 deletions)

@@ -79,6 +79,7 @@ Use the `%help` method to get a list of all the available commands, along with a
   %help
 
 .. admonition:: Tip
+  :class: note
 
   To access the docstrings of any magic command and figure out what arguments to provide, simply add ``?`` at the end of the command. For instance: ``%create_session?``

docs/source/user_guide/apachespark/dataflow.rst (3 additions, 0 deletions)

@@ -41,6 +41,7 @@ Define config. If you have not yet configured your dataflow setting, or would li
 Use the config defined above to submit the cell.
 
 .. admonition:: Tip
+  :class: note
 
   Get more information about the dataflow extension by running ``%dataflow -h``
 
@@ -131,11 +132,13 @@ To submit your notebook to DataFlow using the ``ads`` CLI, run:
   ads opctl run -s <folder where notebook is located> -e <notebook name> -b dataflow
 
 .. admonition:: Tip
+  :class: note
 
   You can avoid running cells that are not DataFlow environment compatible by tagging the cells and then providing the tag names to ignore. In the following example cells that are tagged ``ignore`` and ``remove`` will be ignored -
   ``--exclude-tag ignore --exclude-tag remove``
 
 .. admonition:: Tip
+  :class: note
 
   You can run the notebook in your local pyspark environment before submitting to ``DataFlow`` using the same CLI with ``-b local``
