Commit b4e971e
ODSC-44446. Multiple fixes to documentation (#264)
1 parent e260b32 commit b4e971e

File tree

14 files changed: +141 −484 lines changed


docs/source/ads.explanations.rst

Lines changed: 2 additions & 0 deletions

@@ -0,0 +1,2 @@
+ads.explanations package
+========================

docs/source/ads.jobs.rst

Lines changed: 0 additions & 6 deletions

@@ -1,12 +1,6 @@
 ads.jobs package
 ================

-.. toctree::
-   :maxdepth: 3
-
-   ads.jobs
-
-
 Subpackages
 -----------

docs/source/conf.py

Lines changed: 8 additions & 0 deletions

@@ -173,3 +173,11 @@
 todo_include_todos = True

 mathjax_path = "math_jax_3_2_0.js"
+
+# This css will be included in html pages to process where we add .. raw:: html for nb cell nice outputs with
+# <div class="nboutput nblast docutils container"> and
+# <div class="output_area rendered_html docutils container">
+# See ads_tuner.rst dataframe output in .. raw:: html sections
+html_css_files = [
+    "nbsphinx-code-cells.css"
+]
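For readers unfamiliar with this Sphinx option: ``html_css_files`` lists extra stylesheets to include in every generated HTML page, with each path resolved relative to the directories named in ``html_static_path``. A minimal sketch of how the two options fit together in a ``conf.py`` (the ``_static`` directory name is the usual Sphinx convention, not something shown in this commit):

```python
# conf.py fragment: extra CSS for rendered notebook-cell output.
# Paths in html_css_files are resolved against html_static_path directories,
# so this expects the file at _static/nbsphinx-code-cells.css (assumed layout).
html_static_path = ["_static"]
html_css_files = ["nbsphinx-code-cells.css"]
```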

docs/source/index.rst

Lines changed: 1 addition & 0 deletions

@@ -97,6 +97,7 @@ Oracle Accelerated Data Science (ADS)
 `https://github.com/oracle/accelerated-data-science <https://github.com/oracle/accelerated-data-science>`_

 .. code-block:: python3
+
     >>> import ads
     >>> ads.hello()

docs/source/user_guide/big_data_service/file_management.rst

Lines changed: 3 additions & 4 deletions

@@ -71,7 +71,7 @@ Upload
 ------

 The `.put() <https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.put>`_ method is used to upload files from local storage to HDFS. The first parameter is the local path of the files to upload. The second parameter is the HDFS path where the files are to be stored.
-`.upload() <https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.upload>`_ is an alias of `.put()`.
+`.upload() <https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.upload>`_ is an alias of ``.put()``.
 .. code-block:: python3

     fs.put(

@@ -82,7 +82,7 @@ The `.put() <https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.sp
 Ibis
 ====

-`Ibis <https://github.com/ibis-project/ibis>`_ is an open-source library by `Cloudera <https://www.cloudera.com/>`_ that provides a Python framework to access data and perform analytical computations from different sources. Ibis allows access to the data using HDFS. You use the ``ibis.impala.hdfs_connect()`` method to make a connection to HDFS, and it returns a handler. This handler has methods such as ``.ls()`` to list, ``.get()`` to download, ``.put()`` to upload, and ``.rm()`` to delete files. These operations support globbing. Ibis' HDFS connector supports a variety of `additional operations <https://ibis-project.org/docs/dev/backends/Impala/#hdfs-interaction>`_.
+`Ibis <https://github.com/ibis-project/ibis>`_ is an open-source library by `Cloudera <https://www.cloudera.com/>`_ that provides a Python framework to access data and perform analytical computations from different sources. Ibis allows access to the data using HDFS. You use the ``ibis.impala.hdfs_connect()`` method to make a connection to HDFS, and it returns a handler. This handler has methods such as ``.ls()`` to list, ``.get()`` to download, ``.put()`` to upload, and ``.rm()`` to delete files. These operations support globbing. Ibis' HDFS connector supports a variety of `additional operations <https://ibis-project.org/backends/impala/#hdfs-interaction>`_.

 Connect
 -------

@@ -159,7 +159,7 @@ Use the `.put() <https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspe
 Pandas
 ======

-Pandas allows access to BDS' HDFS system through :ref: `FSSpec`. This section demonstrates some common operations.
+Pandas allows access to BDS' HDFS system through :ref:`FSSpec`. This section demonstrates some common operations.

 Connect
 -------

@@ -259,4 +259,3 @@ The following sample code shows several different PyArrow methods for working wi
     table = pa.Table.from_pandas(df)
     pq.write_to_dataset(table, root_path="/path/on/BDS/HDFS", partition_cols=["dt"],
                         flavor="spark", filesystem=fs)
-
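The ``.put()``/``.upload()`` behavior these docs describe can be tried without a BDS cluster by pointing ``fsspec`` at its in-memory filesystem instead of HDFS. A minimal sketch under that assumption (the ``memory`` protocol and the file paths here are illustrative stand-ins, not taken from the ADS docs):

```python
import tempfile

import fsspec  # filesystem-spec, the library the ADS HDFS docs build on

# Use the in-memory filesystem as a stand-in for an HDFS connection.
fs = fsspec.filesystem("memory")

# Create a small local file to upload.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("a,b\n1,2\n")
    local_path = f.name

# .put() copies local -> remote; .upload() is the documented alias of .put().
fs.put(local_path, "/data/example.csv")

contents = fs.cat_file("/data/example.csv")
print(contents)
```

The first argument is the local path and the second the remote path, mirroring the HDFS usage shown above.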

docs/source/user_guide/big_data_service/sql_data_management.rst

Lines changed: 1 addition & 3 deletions

@@ -13,7 +13,7 @@ Ibis
 Connect
 -------

-After obtaining a Kerberos ticket, depending on your system configuration, you may need to define the ``ibis.options.impala.temp_db`` and ``ibis.options.impala.temp_hdfs_path`` options. The ``ibis.impala.connect()`` method makes a connection to the `Impala execution backend <https://ibis-project.org/docs/dev/backends/Impala/>`_. The ``.sql()`` method allows you to run SQL commands on the data.
+After obtaining a Kerberos ticket, depending on your system configuration, you may need to define the ``ibis.options.impala.temp_db`` and ``ibis.options.impala.temp_hdfs_path`` options. The ``ibis.impala.connect()`` method makes a connection to the `Impala execution backend <https://ibis-project.org/backends/impala/>`_. The ``.sql()`` method allows you to run SQL commands on the data.

 .. code-block:: python3

@@ -167,5 +167,3 @@ It is important to close sessions when you don't need them anymore. This frees u
 .. code-block:: python3

     cursor.close()
-
-
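The ``cursor.close()`` call shown above is standard Python DB-API cleanup, not something Impala-specific. The pattern can be sketched with the stdlib ``sqlite3`` module as a stand-in for an Impala cursor (the in-memory database and table here are purely illustrative):

```python
import sqlite3

# An in-memory database stands in for a remote Impala session.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

cursor.execute("CREATE TABLE t (x INTEGER)")
cursor.execute("INSERT INTO t VALUES (1), (2)")
row_count = cursor.execute("SELECT COUNT(*) FROM t").fetchone()

# Close the cursor, then the connection, as soon as the work is done,
# so that server-side resources are freed promptly.
cursor.close()
conn.close()
```

Closing in this order (cursor first, then connection) is the same resource-release discipline the docs recommend for Impala sessions.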

docs/source/user_guide/data_transformation/data_transformation.rst

Lines changed: 17 additions & 2 deletions

@@ -20,12 +20,27 @@ You can load a ``pandas`` dataframe into an ``ADSDataset`` by calling.
 Automated Transformations
 *************************

-ADS has built in automatic transform tools for datasets. When the ``get_recommendations()`` tool is applied to an ``ADSDataset`` object, it shows the user detected issues with the data and recommends changes to apply to the dataset. You can accept the changes is as easy as clicking a button in the drop down menu. After all the changes are applied, the transformed dataset can be retrieved by calling ``get_transformed_dataset()``.
+ADS provides built-in automatic transformation tools for datasets. These tools help detect issues with the data and recommend changes to improve the dataset. The recommended changes can be accepted by clicking a button in the drop-down menu. Once the changes are applied, the transformed dataset can be retrieved using the ``get_transformed_dataset()`` method.
+
+To access the recommendations, you can use the ``get_recommendations()`` method on the ``ADSDataset`` object:

 .. code-block:: python3

+    wine_ds = DatasetFactory.from_dataframe(data, target='Price') # Specify the target variable
     wine_ds.get_recommendations()

+However, please note that ``get_recommendations()`` is not a direct method of the ``ADSDataset`` class. If you created the dataset using ``ADSDataset.from_dataframe(data)``, calling ``get_recommendations()`` directly on the ``ADSDataset`` object will result in an error. Instead, you can retrieve the recommendations by following these steps:
+
+.. code-block:: python3
+
+    from ads.dataset.factory import DatasetFactory
+
+    wine_ds = DatasetFactory.from_dataframe(data, target='Price')
+    # Get the recommendations
+    recommendations = wine_ds.get_recommendations()
+
+The ``recommendations`` variable will contain the detected issues with the dataset and the recommended changes. You can then review and accept the recommended changes as needed.
+
 Alternatively, you can use ``auto_transform()`` to apply all the recommended transformations at once. ``auto_transform()`` returns a transformed dataset with several optimizations applied automatically. The optimizations include:

 * Dropping constant and primary key columns, which has no predictive quality.

@@ -242,7 +257,7 @@ You can apply functions to update column values in existing column. This example
 Change Data Type
 ================

-You can change the data type columns with the ``astype()`` method. ADS uses the Pandas method, ``astype()``, on dataframe objects. For specifics, see `astype for a Pandas Dataframe <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.astype.html>`_, `using numpy.dtype <https://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html#numpy.dtype>`_, or `Pandas dtypes <https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html#dtypes>`_.
+You can change the data type columns with the ``astype()`` method. ADS uses the Pandas method, ``astype()``, on dataframe objects. For specifics, see `astype for a Pandas Dataframe <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.astype.html>`_, `using numpy.dtype <https://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html#numpy.dtype>`_, or `Pandas dtypes <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dtypes.html>`_.

 When you change the type of a column, ADS updates its semantic type to categorical, continuous, datetime, or ordinal. For example, if you update a column type to integer, its semantic type updates to ordinal. For data type details, see ref:`loading-data-specify-dtype`.
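Since ADS delegates ``astype()`` to pandas, the behavior in the changed paragraph can be demonstrated with plain pandas (the column names below are made up for illustration):

```python
import pandas as pd

# A toy frame with string-typed prices and float-typed years (hypothetical data).
df = pd.DataFrame({"price": ["10.5", "21.0"], "year": [1999.0, 2004.0]})

# astype() accepts a per-column mapping of target dtypes.
converted = df.astype({"price": "float64", "year": "int64"})

print(converted.dtypes)
```

In ADS terms, converting ``year`` to an integer column is the kind of change that would update its semantic type to ordinal.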
