-The [Oracle Accelerated Data Science (ADS) SDK](https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/index.html) is maintained by the [Oracle Cloud Infrastructure Data Science service](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) team. It speeds up common data science activities by providing tools that automate and/or simplify common data science tasks, along with providing a data scientist friendly pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
+The [Oracle Accelerated Data Science (ADS) SDK](https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/index.html) is maintained by the Oracle Cloud Infrastructure (OCI) [Data Science service](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) team. It speeds up common data science activities by providing tools that automate and simplify common data science tasks. It also provides data scientists with a friendly Pythonic interface to OCI services. Some of the more notable services are OCI Data Science, Model Catalog, Model Deployment, Jobs, Data Flow, Object Storage, Vault, Big Data Service, Data Catalog, and the Autonomous Database. ADS gives you an interface to manage the life cycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
 
 With ADS you can:
 
 - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3 and other sources into `Pandas dataframes`.
-- Easily compute summary statistics on your dataframes and perform data profiling.
-- Tune models using hyperparameter optimization with the `ADSTuner` tool.
-- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
+- Use feature types to characterize your data, create meaningful summary statistics, and plot. Use the warning and validation system to test the quality of your data.
+- Tune models using hyperparameter optimization with the `ADSTuner` tool.
+- Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module.
 - Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm).
-- Deploy those models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
+- Deploy models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
 - Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm).
-- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
-- Manage the lifecycle of conda environments through the `ads conda` command line interface (CLI).
+- Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm).
+- Manage the life cycle of conda environments through the `ads conda` command line interface (CLI).
 
 ## Installation
 
@@ -28,44 +28,98 @@ You have various options when installing ADS.
 
 ### Installing extras libraries
 
-To use ADS within a [Notebook Session](https://docs.oracle.com/en-us/iaas/data-science/using/manage-notebook-sessions.htm) of the OCI Data Science service:
+The `all-optional` module will install all optional dependencies.
 
 ```bash
-$ python3 -m pip install oracle-ads[notebook]
+$ python3 -m pip install oracle-ads[all-optional]
 ```
 
-For machine learning tasks install
+To work with gradient boosting models, install the `boosted` module. This module includes XGBoost and LightGBM model classes.
 
 ```bash
 $ python3 -m pip install oracle-ads[boosted]
 ```
 
-To work on text related tasks run
+For big data use cases using Oracle Big Data Service (BDS), install the `bds` module. It includes the following libraries: `ibis-framework[impala]`, `hdfs[kerberos]`, and `sqlalchemy`.
 
 ```bash
-$ python3 -m pip install oracle-ads[text]
+$ python3 -m pip install oracle-ads[bds]
 ```
 
-For access to a broad set of data formats (for example, Excel, Avro, etc.) run
+To work with a broad set of data formats (for example, Excel and Avro), install the `data` module. It includes the `fastavro`, `openpyxl`, `pandavro`, `asteval`, `datefinder`, `htmllistparse`, and `sqlalchemy` libraries.
 
 ```bash
 $ python3 -m pip install oracle-ads[data]
 ```
 
+To work with geospatial data, install the `geo` module. It includes `geopandas` and libraries from the `viz` module.
+
+```bash
+$ python3 -m pip install oracle-ads[geo]
+```
+
+Install the `notebook` module to use ADS within an OCI Data Science service [notebook session](https://docs.oracle.com/en-us/iaas/data-science/using/manage-notebook-sessions.htm). This module installs the `ipywidgets` and `ipython` libraries.
+
+```bash
+$ python3 -m pip install oracle-ads[notebook]
+```
+
+To work with ONNX-compatible runtimes and libraries designed to maximize performance and model portability, install the `onnx` module. It includes the following libraries: `onnx`, `onnxruntime`, `onnxmltools`, `skl2onnx`, `xgboost`, `lightgbm`, and libraries from the `viz` module.
+
+```bash
+$ python3 -m pip install oracle-ads[onnx]
+```
+
+For infrastructure tasks, install the `opctl` module. It includes the following libraries: `oci-cli`, `docker`, `conda-pack`, `nbconvert`, `nbformat`, and `inflection`.
+
+```bash
+$ python3 -m pip install oracle-ads[opctl]
+```
+
+For hyperparameter optimization tasks, install the `optuna` module. It includes `optuna` and libraries from the `viz` module.
+
+```bash
+$ python3 -m pip install oracle-ads[optuna]
+```
+
+Install the `tensorflow` module to include `tensorflow` and libraries from the `viz` module.
+
+```bash
+$ python3 -m pip install oracle-ads[tensorflow]
+```
+
+For text-related tasks, install the `text` module. It includes the `wordcloud` and `spacy` libraries.
+
+```bash
+$ python3 -m pip install oracle-ads[text]
+```
+
+Install the `torch` module to include `pytorch` and libraries from the `viz` module.
+
+```bash
+$ python3 -m pip install oracle-ads[torch]
+```
+
+Install the `viz` module to include libraries for visualization tasks. Key packages include `bokeh`, `folium`, and `seaborn`.
+
+```bash
+$ python3 -m pip install oracle-ads[viz]
+```
+
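
Because each extra pulls in a different set of optional libraries, it can help to check which of them are importable in the current environment. The following is a minimal sketch that uses only the Python standard library. The helper name `available_modules` is hypothetical and not part of ADS, and the import names in `candidates` are assumptions based on the libraries named for each extra.

```python
from importlib.util import find_spec


def available_modules(names):
    """Map each top-level import name to True if it can be located, else False."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Import names assumed from the libraries mentioned for each extra.
    candidates = ["xgboost", "lightgbm", "geopandas", "onnx", "optuna", "spacy", "torch"]
    for name, found in available_modules(candidates).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

A check like this only confirms that a module can be located; it does not verify versions or that the import itself succeeds.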
 **Note**
 
 Multiple extra dependencies can be installed together. For example:
 This example uses SQL injection safe binding variables.
 
 ```python
+import ads
+import pandas as pd
+
 connection_parameters = {
-    "user_name": "<username>",
+    "user_name": "<user_name>",
     "password": "<password>",
-    "service_name": "<service_name_{high|med|low}>",
-    "wallet_location": "/full/path/to/my_wallet.zip",
+    "service_name": "<tns_name>",
+    "wallet_location": "<file_path>",
 }
-import pandas as pd
-import ads
 
-# simple read of a SQL query into a dataframe with no bind variables
-df = pd.DataFrame.ads.read_sql(
-    "SELECT * FROM SH.SALES",
-    connection_parameters=connection_parameters,
-)
-```
-
-### Load data from ADB (using sql-injection-safe bind variables)
-
-```python
 df = pd.DataFrame.ads.read_sql(
     """
-    SELECT
-        *
-    FROM
-        SH.SALES
-    WHERE
-        ROWNUM <= :max_rows
+    SELECT *
+    FROM SH.SALES
+    WHERE ROWNUM <= :max_rows
     """,
-    bind_variables={
-        max_rows : 100
-    },
+    bind_variables={"max_rows": 100},
     connection_parameters=connection_parameters,
 )
 ```
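
To illustrate why bind variables guard against SQL injection, the sketch below uses Python's built-in `sqlite3` module as a stand-in for an Oracle connection. This is an assumption made so the example runs anywhere; it does not use ADS. The `sales` table, its rows, and the `fetch_sales` helper are hypothetical. The `:max_rows` placeholder is passed separately from the SQL text, so the driver treats it strictly as a value and never as part of the statement.

```python
import sqlite3


def fetch_sales(conn, max_rows):
    """Return at most max_rows rows, passing the limit as a named bind variable."""
    query = "SELECT id, amount FROM sales ORDER BY id LIMIT :max_rows"
    return conn.execute(query, {"max_rows": max_rows}).fetchall()


# Hypothetical in-memory table standing in for SH.SALES.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (:id, :amount)",
    [{"id": i, "amount": 10.0 * i} for i in range(1, 6)],
)

rows = fetch_sales(conn, 2)
print(rows)
```

The same pattern applies to any driver that supports named placeholders: the query string stays constant and only the dictionary of values changes between calls.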
@@ -129,8 +172,8 @@ Find Getting Started instructions for developers in [README-development.md](http
 
 ## Security
 
-Please consult the security guide [SECURITY.md](https://github.com/oracle/accelerated-data-science/blob/main/SECURITY.md) for our responsible security vulnerability disclosure process.
+Consult the security guide [SECURITY.md](https://github.com/oracle/accelerated-data-science/blob/main/SECURITY.md) for our responsible security vulnerability disclosure process.
 
 ## License
 
-Copyright (c) 2020, 2022 Oracle and/or its affiliates. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/
+Copyright (c) 2020, 2022 Oracle and/or its affiliates. Licensed under the [Universal Permissive License v1.0](https://oss.oracle.com/licenses/upl/)