
Commit b0597ed

Merge commit '888ad02cb1cf5021813d68ce52590359b224b1e3'
Conflicts: docs/source/user_guide/jobs/run_script.rst
2 parents: b6fee53 + 888ad02

File tree

6 files changed: +127 −17 lines changed

.github/workflows/run-unittests.yml
README.md
SECURITY.md
build_spec.yaml
docs/requirements.txt
docs/source/user_guide/apachespark/dataflow.rst


.github/workflows/run-unittests.yml

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+name: Unit Tests
+
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - main
+      - 'release/**'
+      - develop
+    paths:
+      - '!docs/**'
+
+  pull_request:
+
+# Cancel in progress workflows on pull_requests.
+# https://docs.github.com/en/actions/using-jobs/using-concurrency#example-using-a-fallback-value
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+permissions:
+  contents: read
+
+jobs:
+  test:
+    name: ${{ matrix.tests-type }}, python ${{ matrix.python-version }}
+    runs-on: ubuntu-latest
+    timeout-minutes: 45
+
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.7","3.8","3.9","3.10"]
+        tests-type: ["DefaultSetup"]
+
+    steps:
+      - uses: actions/checkout@v3
+      # - uses: actions/cache@v3
+      #   with:
+      #     path: ~/.cache/pip
+      #     key: ${{ runner.os }}-pip-${{ hashFiles('**/test-requirements.txt') }}
+      #     restore-keys: |
+      #       ${{ runner.os }}-pip-
+      - uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: "Setup test env"
+        run: |
+          pip install coverage pytest-codecov tox==4.2.8
+
+      - name: "Run unit tests"
+        timeout-minutes: 45
+        shell: bash
+        run: |
+          set -x # print commands that are executed
+          # coverage erase
+          # ./scripts/runtox.sh "${{ matrix.python-version }}-${{ matrix.tests-type }}" --cov --cov-report=
+          # coverage combine .coverage-*
+          # coverage html -i
+
+      # Uploading test artifacts
+      # https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts#uploading-build-and-test-artifacts
+      # - name: "Upload artifact"
+      #   uses: actions/upload-artifact@v3
+      #   with:
+      #     name: code-coverage-report
+      #     path: htmlcov/
+      #     retention-days: 10

README.md

Lines changed: 1 addition & 1 deletion
@@ -166,7 +166,7 @@ This example uses SQL injection safe binding variables.
 
 ## Contributing
 
-This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide [CONTRIBUTING.md](https://github.com/oracle/accelerated-data-science/blob/main/CONTRIBUTING.md).
+This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](./CONTRIBUTING.md)
 
 Find Getting Started instructions for developers in [README-development.md](https://github.com/oracle/accelerated-data-science/blob/main/README-development.md)
 

SECURITY.md

Lines changed: 11 additions & 9 deletions
@@ -6,11 +6,10 @@ and privacy of all our users.
 
 Please do NOT raise a GitHub Issue to report a security vulnerability. If you
 believe you have found a security vulnerability, please submit a report to
-[secalert_us@oracle.com](mailto:secalert_us@oracle.com) preferably with a proof of concept.
-Please review some additional information on
-[how to report security vulnerabilities to Oracle](https://www.oracle.com/corporate/security-practices/assurance/vulnerability/reporting.html).
+[secalert_us@oracle.com][1] preferably with a proof of concept. Please review
+some additional information on [how to report security vulnerabilities to Oracle][2].
 We encourage people who contact Oracle Security to use email encryption using
-[our encryption key](https://www.oracle.com/security-alerts/encryptionkey.html).
+[our encryption key][3].
 
 We ask that you do not use other channels or contact the project maintainers
 directly.
@@ -22,15 +21,18 @@ security features are welcome on GitHub Issues.
 
 Security updates will be released on a regular cadence. Many of our projects
 will typically release security fixes in conjunction with the
-[Oracle Critical Patch Update](https://www.oracle.com/security-alerts/encryptionkey.html) program.
-Security updates are released on the Tuesday closest to the 17th day of January, April, July and October.
-A pre-release announcement will be published on the Thursday preceding each release. Additional
-information, including past advisories, is available on our
-[security alerts](https://www.oracle.com/security-alerts/) page.
+[Oracle Critical Patch Update][3] program. Additional
+information, including past advisories, is available on our [security alerts][4]
+page.
 
 ## Security-related information
 
 We will provide security related information such as a threat model, considerations
 for secure use, or any known security issues in our documentation. Please note
 that labs and sample code are intended to demonstrate a concept and may not be
 sufficiently hardened for production use.
+
+[1]: mailto:secalert_us@oracle.com
+[2]: https://www.oracle.com/corporate/security-practices/assurance/vulnerability/reporting.html
+[3]: https://www.oracle.com/security-alerts/encryptionkey.html
+[4]: https://www.oracle.com/security-alerts/

build_spec.yaml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Copyright (c) 2023, 2022, Oracle and/or its affiliates.
+
+version: 0.1
+component: build
+timeoutInSeconds: 1000
+shell: bash
+
+steps:
+  - type: Command
+    name: "compress the repo"
+    command: |
+      tar -cvzf ${OCI_WORKSPACE_DIR}/repo.tgz ./
+outputArtifacts:
+  - name: artifact
+    type: BINARY
+    location: ${OCI_WORKSPACE_DIR}/repo.tgz

docs/requirements.txt

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
--e .
 autodoc
 nbsphinx
 sphinx

docs/source/user_guide/apachespark/dataflow.rst

Lines changed: 30 additions & 6 deletions
@@ -36,6 +36,7 @@ Define config. If you have not yet configured your dataflow setting, or would li
     dataflow_config.logs_bucket_uri = "oci://<my-bucket>@<my-tenancy>/"
     dataflow_config.spark_version = "3.2.1"
     dataflow_config.configuration = {"spark.driver.memory": "512m"}
+    dataflow_config.private_endpoint_id = "ocid1.dataflowprivateendpoint.oc1.iad.<your private endpoint ocid>"
 
 Use the config defined above to submit the cell.
 
@@ -159,6 +160,11 @@ You could submit a notebook using ADS SDK APIs. Here is an example to submit a n
         .with_executor_shape("VM.Standard.E4.Flex")
         .with_executor_shape_config(ocpus=4, memory_in_gbs=64)
         .with_logs_bucket_uri("oci://mybucket@mytenancy/")
+        .with_private_endpoint_id("ocid1.dataflowprivateendpoint.oc1.iad.<your private endpoint ocid>")
+        .with_configuration({
+            "spark.driverEnv.myEnvVariable": "value1",
+            "spark.executorEnv.myEnvVariable": "value2",
+        })
     )
     rt = (
         DataFlowNotebookRuntime()
@@ -197,6 +203,7 @@ You can set them using the ``with_{property}`` functions:
 - ``with_num_executors``
 - ``with_spark_version``
 - ``with_warehouse_bucket_uri``
+- ``with_private_endpoint_id`` (`doc <https://docs.oracle.com/en-us/iaas/data-flow/using/pe-allowing.htm#pe-allowing>`__)
 
 For more details, see `DataFlow class documentation <https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/ads.jobs.html#module-ads.jobs.builders.infrastructure.dataflow>`__.
 
@@ -209,6 +216,7 @@ The ``DataFlowRuntime`` properties are:
 - ``with_archive_uri`` (`doc <https://docs.oracle.com/en-us/iaas/data-flow/using/dfs_data_flow_library.htm#third-party-libraries>`__)
 - ``with_archive_bucket``
 - ``with_custom_conda``
+- ``with_configuration``
 
 For more details, see the `runtime class documentation <../../ads.jobs.html#module-ads.jobs.builders.runtimes.python_runtime>`__.
 
@@ -217,7 +225,7 @@ object can be reused and combined with various ``DataFlowRuntime`` parameters to
 create applications.
 
 In the following "hello-world" example, ``DataFlow`` is populated with ``compartment_id``,
-``driver_shape``, ``driver_shape_config``, ``executor_shape``, ``executor_shape_config``
+``driver_shape``, ``driver_shape_config``, ``executor_shape``, ``executor_shape_config``
 and ``spark_version``. ``DataFlowRuntime`` is populated with ``script_uri`` and
 ``script_bucket``. The ``script_uri`` specifies the path to the script. It can be
 local or remote (an Object Storage path). If the path is local, then
@@ -267,6 +275,10 @@ accepted. In the next example, the prefix is given for ``script_bucket``.
         .with_script_uri(os.path.join(td, "script.py"))
         .with_script_bucket("oci://mybucket@namespace/prefix")
        .with_custom_conda("oci://<mybucket>@<mynamespace>/<path/to/conda_pack>")
+        .with_configuration({
+            "spark.driverEnv.myEnvVariable": "value1",
+            "spark.executorEnv.myEnvVariable": "value2",
+        })
     )
     df = Job(name=name, infrastructure=dataflow_configs, runtime=runtime_config)
     df.create()
@@ -374,6 +386,10 @@ In the next example, ``archive_uri`` is given as an Object Storage location.
         .with_executor_shape("VM.Standard.E4.Flex")
         .with_executor_shape_config(ocpus=4, memory_in_gbs=64)
         .with_spark_version("3.0.2")
+        .with_configuration({
+            "spark.driverEnv.myEnvVariable": "value1",
+            "spark.executorEnv.myEnvVariable": "value2",
+        })
     )
     runtime_config = (
         DataFlowRuntime()
@@ -545,12 +561,16 @@ into the ``Job.from_yaml()`` function to build a Data Flow job:
           language: PYTHON
           logsBucketUri: <logs_bucket_uri>
           numExecutors: 1
-          sparkVersion: 2.4.4
+          sparkVersion: 3.2.1
+          privateEndpointId: <private_endpoint_ocid>
         type: dataFlow
       name: dataflow_app_name
      runtime:
        kind: runtime
        spec:
+          configuration:
+            spark.driverEnv.myEnvVariable: value1
+            spark.executorEnv.myEnvVariable: value2
          scriptBucket: bucket_name
          scriptPathURI: oci://<bucket_name>@<namespace>/<prefix>
        type: dataFlow
@@ -618,6 +638,12 @@ into the ``Job.from_yaml()`` function to build a Data Flow job:
          sparkVersion:
            required: false
            type: string
+          privateEndpointId:
+            required: false
+            type: string
+          configuration:
+            required: false
+            type: dict
          type:
            allowed:
              - dataFlow
@@ -662,11 +688,9 @@ into the ``Job.from_yaml()`` function to build a Data Flow job:
              - service
            required: true
            type: string
-          env:
-            type: list
+          configuration:
            required: false
-            schema:
-              type: dict
+            type: dict
          freeform_tag:
            required: false
            type: dict

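For reference, the sketch below pulls the two options introduced by this commit for Data Flow jobs (with_private_endpoint_id and with_configuration) into a single job definition. It is a minimal, unverified sketch that assumes the ads.jobs builder API shown in the diff above; every OCID, bucket name, and namespace is a placeholder.

    # Minimal sketch (not part of the commit): combine the private endpoint and
    # Spark configuration options added in this change.
    # All OCIDs, buckets, and namespaces below are placeholders.
    from ads.jobs import DataFlow, DataFlowRuntime, Job

    infrastructure = (
        DataFlow()
        .with_compartment_id("ocid1.compartment.oc1..<unique_id>")
        .with_driver_shape("VM.Standard.E4.Flex")
        .with_driver_shape_config(ocpus=2, memory_in_gbs=32)
        .with_executor_shape("VM.Standard.E4.Flex")
        .with_executor_shape_config(ocpus=4, memory_in_gbs=64)
        .with_spark_version("3.2.1")
        .with_logs_bucket_uri("oci://<bucket_name>@<namespace>/")
        # New in this commit: route traffic through a Data Flow private endpoint.
        .with_private_endpoint_id("ocid1.dataflowprivateendpoint.oc1.iad.<unique_id>")
        # New in this commit: pass Spark configuration such as environment variables.
        .with_configuration({
            "spark.driverEnv.myEnvVariable": "value1",
            "spark.executorEnv.myEnvVariable": "value2",
        })
    )

    runtime = (
        DataFlowRuntime()
        .with_script_uri("oci://<bucket_name>@<namespace>/<prefix>/script.py")
        .with_script_bucket("oci://<bucket_name>@<namespace>/<prefix>")
    )

    job = Job(name="dataflow_app_name", infrastructure=infrastructure, runtime=runtime)
    job.create()       # create the Data Flow application
    # run = job.run()  # submit a run once the application is created

The same settings map onto the YAML accepted by Job.from_yaml() as the privateEndpointId and configuration keys, as shown in the last hunks of the dataflow.rst diff above.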