docs/source/user_guide/data_transformation/data_transformation.rst
8 additions & 5 deletions
@@ -3,7 +3,7 @@
 Transform Data
 ##############

-When datasets are loaded with DatasetFactory, they can be transformed and manipulated easily with the built-in functions. Underlying, an ``ADSDataset`` object is a Pandas dataframe. Any operation that can be performed to a `Pandas dataframe <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html>`_ can also be applied to an ADS Dataset.
+When datasets are loaded, they can be transformed and manipulated easily with the built-in functions. Under the hood, an ``ADSDataset`` object is a Pandas dataframe. Any operation that can be performed on a `Pandas dataframe <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html>`_ can also be applied to an ADS Dataset.

 Loading the Dataset
 ********************
@@ -12,9 +12,9 @@ You can load a ``pandas`` dataframe into an ``ADSDataset`` by calling.

 .. code-block:: python3

-    from ads.dataset.factory import DatasetFactory
+    from ads.dataset.dataset import ADSDataset

-    ds = DatasetFactory.from_dataframe(df)
+    ds = ADSDataset.from_dataframe(df)


 Automated Transformations
@@ -513,11 +513,14 @@ The resulting three data subsets each have separate data (X) and labels (y).
     print(train.X) # print out all features in train dataset
     print(train.y) # print out labels in train dataset

-You can split the dataset right after the ``DatasetFactory.open()`` statement:
+You can split the dataset right after the ``ADSDatasetWithTarget.from_dataframe()`` statement:
docs/source/user_guide/loading_data/connect.rst
6 additions & 6 deletions
@@ -526,34 +526,34 @@ To load a dataframe from a remote web server source, use ``pandas`` directly and
 Convert Pandas DataFrame to ``ADSDataset``
 ==========================================

-To convert a Pandas dataframe to ``ADSDataset``, pass the ``pandas.DataFrame`` object directly into the ADS ``DatasetFactory.open`` method:
+To convert a Pandas dataframe to ``ADSDataset``, pass the ``pandas.DataFrame`` object directly into the ``ADSDataset`` constructor or the ``ADSDataset.from_dataframe()`` method:

 .. code-block:: python3

     import pandas as pd
-    from ads.dataset.factory import DatasetFactory
+    from ads.dataset.dataset import ADSDataset

     df = pd.read_csv('/path/some_data.csv') # load data with Pandas

     # use the constructor...

-    ds = DatasetFactory.open(df) # construct **ADS** Dataset from DataFrame
+    ds = ADSDataset(df) # construct **ADS** Dataset from DataFrame

     # alternative form...

-    ds = DatasetFactory.from_dataframe(df)
+    ds = ADSDataset.from_dataframe(df)

     # an example using Pandas to parse data on the clipboard as a CSV and construct an ADS Dataset object
     # this allows easily transferring data from an application like Microsoft Excel, Apple Numbers, etc.
To open a dataset from Object Storage using the Oracle Cloud Infrastructure configuration file method, include the location of the file using this format ``oci://<bucket_name>@<namespace>/<file_name>`` and modify the optional parameter ``storage_options``. Insert:
-You can use ADS to query a table from your database, and then load that table as an ``ADSDataset`` object through ``DatasetFactory``.
-When you open ``DatasetFactory``, specify the name of the table you want to pull using the ``table`` variable for a given table. For SQL expressions, use the table parameter also. For example, *(`table="SELECT * FROM sh.times WHERE rownum <= 30"`)*.
+You can use ADS to query a table from your database, and then load that table as an ``ADSDatasetWithTarget`` object.
+When you create the ``ADSDatasetWithTarget``, specify the name of the table you want to pull using the ``table`` parameter. The same parameter also accepts a SQL expression, for example ``table="SELECT * FROM sh.times WHERE rownum <= 30"``.
     df = pd.read_sql('SELECT * from <TABLENAME>', con=engine)

-You can convert the ``pd.DataFrame`` into ``ADSDataset`` using the ``DatasetFactory.from_dataframe()`` function.
+You can convert the ``pd.DataFrame`` into ``ADSDataset`` using the ``ADSDataset.from_dataframe()`` function.

 .. code-block:: python3

-    ds = DatasetFactory.from_dataframe(df)
+    ds = ADSDataset.from_dataframe(df)

 These two examples run a simple query on ADW data. With ``read_sql_query`` you can use SQL expressions not just for tables, but also to limit the number of rows and to apply conditions with filters, such as (``where``).

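As an illustration of the ``read_sql_query`` remark in the context line above, here is a minimal sketch of filtering and limiting rows inside the query. It is not part of this change. It assumes an in-memory SQLite database as a stand-in for the ADW connection, and the ``times`` table and its columns are hypothetical. SQLite caps rows with ``LIMIT``, whereas the Oracle examples in these docs use ``rownum``.

```python
import sqlite3

import pandas as pd

# Stand-in for the ADW connection: an in-memory SQLite database.
# The table name and columns are hypothetical, chosen for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE times (day_id INTEGER, calendar_year INTEGER)")
con.executemany(
    "INSERT INTO times VALUES (?, ?)",
    [(i, 2019 + i % 3) for i in range(1, 101)],
)
con.commit()

# Filter with WHERE and cap the row count inside the query itself,
# so only the rows that are needed end up in the dataframe.
query = """
    SELECT day_id, calendar_year
    FROM times
    WHERE calendar_year = ?
    ORDER BY day_id
    LIMIT 30
"""
df = pd.read_sql_query(query, con, params=(2020,))
con.close()

print(df.shape)
```

The resulting ``df`` is a plain ``pandas.DataFrame``, so it can be passed to ``ADSDataset.from_dataframe(df)`` as shown in the hunk above.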
@@ -207,7 +214,7 @@ You can also query data from ADW using cx_Oracle. Use the cx_Oracle 7.0.0 versio
     data = results.fetchall()
     df = pd.DataFrame(np.array(data))

-    ds = DatasetFactory.from_dataframe(df)
+    ds = ADSDataset.from_dataframe(df)

 .. code-block:: python3

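A side note on the ``pd.DataFrame(np.array(data))`` context line above, not part of this change: wrapping mixed-type rows in ``np.array`` coerces every value to one dtype, and the column names are lost. The sketch below shows one possible alternative under stated assumptions. It uses an in-memory SQLite database as a stand-in for cx_Oracle, and the ``iris`` table is hypothetical. It relies only on ``cursor.description``, which the Python DB-API 2.0 defines for cursors.

```python
import sqlite3

import numpy as np
import pandas as pd

# Stand-in for a cx_Oracle connection: an in-memory SQLite database
# with a small hypothetical table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE iris (sepal_length REAL, species TEXT)")
con.executemany(
    "INSERT INTO iris VALUES (?, ?)",
    [(5.1, "setosa"), (6.2, "virginica")],
)

cursor = con.execute("SELECT sepal_length, species FROM iris")
data = cursor.fetchall()

# np.array() coerces mixed rows to a single dtype, so the numbers
# end up as strings and the columns are named 0, 1, ...
coerced = pd.DataFrame(np.array(data))

# Passing the fetched rows directly keeps per-column types, and the
# DB-API cursor.description attribute supplies the column names.
columns = [col[0] for col in cursor.description]
df = pd.DataFrame(data, columns=columns)
con.close()

print(df.dtypes)
```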
@@ -230,7 +237,7 @@ This example adds predictions programmatically using cx_Oracle. It uses ``execut

 .. code-block:: python3

-    ds = DatasetFactory.open("iris.csv")
+    ds = ADSDataset(pd.read_csv("iris.csv"))

     create_table = '''CREATE TABLE IRIS_PREDICTED (
         sepal_length number,
@@ -269,24 +276,29 @@ You can open Amazon S3 public or private files in ADS. For private files, you mu
docs/source/user_guide/loading_data/supported_format.rst
2 additions & 2 deletions
@@ -3,11 +3,11 @@ Supported Formats

 You can load datasets into ADS, either locally or from network file systems.

-You can open datasets with ``DatasetFactory``, ``DatasetBrowser`` or ``pandas``. ``DatasetFactory`` allows datasets to be loaded into ADS.
+You can open datasets with ``DatasetBrowser`` or ``pandas``.

 ``DatasetBrowser`` supports opening the datasets from web sites and libraries, such as scikit-learn, directly into ADS.

-When you open a dataset in ``DatasetFactory``, you can get the summary statistics, correlations, and visualizations of the dataset.
+When you load a dataset into an ``ADSDataset`` from a ``pandas.DataFrame``, you can get the summary statistics, correlations, and visualizations of the dataset.
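For orientation, the summary statistics and correlations mentioned in the line above have direct pandas counterparts. The following is a pandas-only sketch on a small made-up dataframe. It does not show the ADS API, and the column names and values are invented for illustration.

```python
import pandas as pd

# Small made-up dataframe; the values are illustrative only.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 6.2, 5.9],
        "petal_length": [1.4, 1.4, 4.5, 5.1],
    }
)

# Per-column count, mean, std, min, quartiles and max.
summary = df.describe()

# Pairwise Pearson correlation between the numeric columns.
correlations = df.corr()

print(summary.loc[["mean", "std"]])
print(correlations)
```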