
Commit b0597ed

Merge commit '888ad02cb1cf5021813d68ce52590359b224b1e3'
Conflicts: docs/source/user_guide/jobs/run_script.rst
2 parents: b6fee53 + 888ad02

File tree

6 files changed: +127 −17 lines changed

.github/workflows/run-unittests.yml
README.md
SECURITY.md
build_spec.yaml
docs/requirements.txt
docs/source/user_guide/apachespark/dataflow.rst


.github/workflows/run-unittests.yml

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+name: Unit Tests
+
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - main
+      - 'release/**'
+      - develop
+    paths:
+      - '!docs/**'
+
+  pull_request:
+
+# Cancel in progress workflows on pull_requests.
+# https://docs.github.com/en/actions/using-jobs/using-concurrency#example-using-a-fallback-value
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+permissions:
+  contents: read
+
+jobs:
+  test:
+    name: ${{ matrix.tests-type }}, python ${{ matrix.python-version }}
+    runs-on: ubuntu-latest
+    timeout-minutes: 45
+
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.7","3.8","3.9","3.10"]
+        tests-type: ["DefaultSetup"]
+
+    steps:
+      - uses: actions/checkout@v3
+      # - uses: actions/cache@v3
+      #   with:
+      #     path: ~/.cache/pip
+      #     key: ${{ runner.os }}-pip-${{ hashFiles('**/test-requirements.txt') }}
+      #     restore-keys: |
+      #       ${{ runner.os }}-pip-
+      - uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: "Setup test env"
+        run: |
+          pip install coverage pytest-codecov tox==4.2.8
+
+      - name: "Run unit tests"
+        timeout-minutes: 45
+        shell: bash
+        run: |
+          set -x # print commands that are executed
+          # coverage erase
+          # ./scripts/runtox.sh "${{ matrix.python-version }}-${{ matrix.tests-type }}" --cov --cov-report=
+          # coverage combine .coverage-*
+          # coverage html -i
+
+      # Uploading test artifacts
+      # https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts#uploading-build-and-test-artifacts
+      # - name: "Upload artifact"
+      #   uses: actions/upload-artifact@v3
+      #   with:
+      #     name: code-coverage-report
+      #     path: htmlcov/
+      #     retention-days: 10

README.md

Lines changed: 1 addition & 1 deletion
@@ -166,7 +166,7 @@ This example uses SQL injection safe binding variables.
 
 ## Contributing
 
-This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide [CONTRIBUTING.md](https://github.com/oracle/accelerated-data-science/blob/main/CONTRIBUTING.md).
+This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](./CONTRIBUTING.md)
 
 Find Getting Started instructions for developers in [README-development.md](https://github.com/oracle/accelerated-data-science/blob/main/README-development.md)
 

SECURITY.md

Lines changed: 11 additions & 9 deletions
@@ -6,11 +6,10 @@ and privacy of all our users.
 
 Please do NOT raise a GitHub Issue to report a security vulnerability. If you
 believe you have found a security vulnerability, please submit a report to
-[secalert_us@oracle.com](mailto:secalert_us@oracle.com) preferably with a proof of concept.
-Please review some additional information on
-[how to report security vulnerabilities to Oracle](https://www.oracle.com/corporate/security-practices/assurance/vulnerability/reporting.html).
+[secalert_us@oracle.com][1] preferably with a proof of concept. Please review
+some additional information on [how to report security vulnerabilities to Oracle][2].
 We encourage people who contact Oracle Security to use email encryption using
-[our encryption key](https://www.oracle.com/security-alerts/encryptionkey.html).
+[our encryption key][3].
 
 We ask that you do not use other channels or contact the project maintainers
 directly.
@@ -22,15 +21,18 @@ security features are welcome on GitHub Issues.
 
 Security updates will be released on a regular cadence. Many of our projects
 will typically release security fixes in conjunction with the
-[Oracle Critical Patch Update](https://www.oracle.com/security-alerts/encryptionkey.html) program.
-Security updates are released on the Tuesday closest to the 17th day of January, April, July and October.
-A pre-release announcement will be published on the Thursday preceding each release. Additional
-information, including past advisories, is available on our
-[security alerts](https://www.oracle.com/security-alerts/) page.
+[Oracle Critical Patch Update][3] program. Additional
+information, including past advisories, is available on our [security alerts][4]
+page.
 
 ## Security-related information
 
 We will provide security related information such as a threat model, considerations
 for secure use, or any known security issues in our documentation. Please note
 that labs and sample code are intended to demonstrate a concept and may not be
 sufficiently hardened for production use.
+
+[1]: mailto:secalert_us@oracle.com
+[2]: https://www.oracle.com/corporate/security-practices/assurance/vulnerability/reporting.html
+[3]: https://www.oracle.com/security-alerts/encryptionkey.html
+[4]: https://www.oracle.com/security-alerts/

build_spec.yaml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Copyright (c) 2023, 2022, Oracle and/or its affiliates.
+
+version: 0.1
+component: build
+timeoutInSeconds: 1000
+shell: bash
+
+steps:
+  - type: Command
+    name: "compress the repo"
+    command: |
+      tar -cvzf ${OCI_WORKSPACE_DIR}/repo.tgz ./
+outputArtifacts:
+  - name: artifact
+    type: BINARY
+    location: ${OCI_WORKSPACE_DIR}/repo.tgz

docs/requirements.txt

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
--e .
 autodoc
 nbsphinx
 sphinx

docs/source/user_guide/apachespark/dataflow.rst

Lines changed: 30 additions & 6 deletions
@@ -36,6 +36,7 @@ Define config. If you have not yet configured your dataflow setting, or would li
     dataflow_config.logs_bucket_uri = "oci://<my-bucket>@<my-tenancy>/"
     dataflow_config.spark_version = "3.2.1"
     dataflow_config.configuration = {"spark.driver.memory": "512m"}
+    dataflow_config.private_endpoint_id = "ocid1.dataflowprivateendpoint.oc1.iad.<your private endpoint ocid>"
 
 Use the config defined above to submit the cell.
 
@@ -159,6 +160,11 @@ You could submit a notebook using ADS SDK APIs. Here is an example to submit a n
         .with_executor_shape("VM.Standard.E4.Flex")
         .with_executor_shape_config(ocpus=4, memory_in_gbs=64)
         .with_logs_bucket_uri("oci://mybucket@mytenancy/")
+        .with_private_endpoint_id("ocid1.dataflowprivateendpoint.oc1.iad.<your private endpoint ocid>")
+        .with_configuration({
+            "spark.driverEnv.myEnvVariable": "value1",
+            "spark.executorEnv.myEnvVariable": "value2",
+        })
     )
     rt = (
         DataFlowNotebookRuntime()
@@ -197,6 +203,7 @@ You can set them using the ``with_{property}`` functions:
 - ``with_num_executors``
 - ``with_spark_version``
 - ``with_warehouse_bucket_uri``
+- ``with_private_endpoint_id`` (`doc <https://docs.oracle.com/en-us/iaas/data-flow/using/pe-allowing.htm#pe-allowing>`__)
 
 For more details, see `DataFlow class documentation <https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/ads.jobs.html#module-ads.jobs.builders.infrastructure.dataflow>`__.
 
@@ -209,6 +216,7 @@ The ``DataFlowRuntime`` properties are:
 - ``with_archive_uri`` (`doc <https://docs.oracle.com/en-us/iaas/data-flow/using/dfs_data_flow_library.htm#third-party-libraries>`__)
 - ``with_archive_bucket``
 - ``with_custom_conda``
+- ``with_configuration``
 
 For more details, see the `runtime class documentation <../../ads.jobs.html#module-ads.jobs.builders.runtimes.python_runtime>`__.
 
@@ -217,7 +225,7 @@ object can be reused and combined with various ``DataFlowRuntime`` parameters to
 create applications.
 
 In the following "hello-world" example, ``DataFlow`` is populated with ``compartment_id``,
-``driver_shape``, ``driver_shape_config``, ``executor_shape``, ``executor_shape_config``
+``driver_shape``, ``driver_shape_config``, ``executor_shape``, ``executor_shape_config``
 and ``spark_version``. ``DataFlowRuntime`` is populated with ``script_uri`` and
 ``script_bucket``. The ``script_uri`` specifies the path to the script. It can be
 local or remote (an Object Storage path). If the path is local, then
@@ -267,6 +275,10 @@ accepted. In the next example, the prefix is given for ``script_bucket``.
         .with_script_uri(os.path.join(td, "script.py"))
         .with_script_bucket("oci://mybucket@namespace/prefix")
        .with_custom_conda("oci://<mybucket>@<mynamespace>/<path/to/conda_pack>")
+        .with_configuration({
+            "spark.driverEnv.myEnvVariable": "value1",
+            "spark.executorEnv.myEnvVariable": "value2",
+        })
     )
     df = Job(name=name, infrastructure=dataflow_configs, runtime=runtime_config)
     df.create()
@@ -374,6 +386,10 @@ In the next example, ``archive_uri`` is given as an Object Storage location.
         .with_executor_shape("VM.Standard.E4.Flex")
         .with_executor_shape_config(ocpus=4, memory_in_gbs=64)
         .with_spark_version("3.0.2")
+        .with_configuration({
+            "spark.driverEnv.myEnvVariable": "value1",
+            "spark.executorEnv.myEnvVariable": "value2",
+        })
     )
     runtime_config = (
         DataFlowRuntime()
@@ -545,12 +561,16 @@ into the ``Job.from_yaml()`` function to build a Data Flow job:
           language: PYTHON
           logsBucketUri: <logs_bucket_uri>
           numExecutors: 1
-          sparkVersion: 2.4.4
+          sparkVersion: 3.2.1
+          privateEndpointId: <private_endpoint_ocid>
         type: dataFlow
       name: dataflow_app_name
      runtime:
        kind: runtime
        spec:
+          configuration:
+            spark.driverEnv.myEnvVariable: value1
+            spark.executorEnv.myEnvVariable: value2
          scriptBucket: bucket_name
          scriptPathURI: oci://<bucket_name>@<namespace>/<prefix>
        type: dataFlow
@@ -618,6 +638,12 @@ into the ``Job.from_yaml()`` function to build a Data Flow job:
          sparkVersion:
            required: false
            type: string
+          privateEndpointId:
+            required: false
+            type: string
+          configuration:
+            required: false
+            type: dict
          type:
            allowed:
              - dataFlow
@@ -662,11 +688,9 @@ into the ``Job.from_yaml()`` function to build a Data Flow job:
              - service
            required: true
            type: string
-          env:
-            type: list
+          configuration:
            required: false
-            schema:
-              type: dict
+            type: dict
          freeform_tag:
            required: false
            type: dict

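For reference, the sketch below pulls the two options introduced by this commit for Data Flow jobs (with_private_endpoint_id and with_configuration) into a single job definition. It is a minimal, unverified sketch that assumes the ads.jobs builder API shown in the diff above; every OCID, bucket name, and namespace is a placeholder.

    # Minimal sketch (not part of the commit): combine the private endpoint and
    # Spark configuration options added in this change.
    # All OCIDs, buckets, and namespaces below are placeholders.
    from ads.jobs import DataFlow, DataFlowRuntime, Job

    infrastructure = (
        DataFlow()
        .with_compartment_id("ocid1.compartment.oc1..<unique_id>")
        .with_driver_shape("VM.Standard.E4.Flex")
        .with_driver_shape_config(ocpus=2, memory_in_gbs=32)
        .with_executor_shape("VM.Standard.E4.Flex")
        .with_executor_shape_config(ocpus=4, memory_in_gbs=64)
        .with_spark_version("3.2.1")
        .with_logs_bucket_uri("oci://<bucket_name>@<namespace>/")
        # New in this commit: route traffic through a Data Flow private endpoint.
        .with_private_endpoint_id("ocid1.dataflowprivateendpoint.oc1.iad.<unique_id>")
        # New in this commit: pass Spark configuration such as environment variables.
        .with_configuration({
            "spark.driverEnv.myEnvVariable": "value1",
            "spark.executorEnv.myEnvVariable": "value2",
        })
    )

    runtime = (
        DataFlowRuntime()
        .with_script_uri("oci://<bucket_name>@<namespace>/<prefix>/script.py")
        .with_script_bucket("oci://<bucket_name>@<namespace>/<prefix>")
    )

    job = Job(name="dataflow_app_name", infrastructure=infrastructure, runtime=runtime)
    job.create()       # create the Data Flow application
    # run = job.run()  # submit a run once the application is created

The same settings map onto the YAML accepted by Job.from_yaml() as the privateEndpointId and configuration keys, as shown in the last hunks of the dataflow.rst diff above.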