---
layout: default
title: Getting Started with Jupyter
nav_order: 4
---

[Project Jupyter](https://jupyter.org/) provides a set of tools for working with notebooks, code, and data. The
MarkLogic Spark connector can be easily integrated into these tools to allow users to access and analyze data in
MarkLogic.

To get started, install either [JupyterLab or Jupyter Notebook](https://jupyter.org/install). Both tools
allow you to work with the connector in the same fashion. The rest of this guide assumes the use of Jupyter
Notebook, though the instructions work for JupyterLab as well.

Once you have installed, started, and accessed Jupyter Notebook in your web browser - in a default Notebook
installation, it is available at http://localhost:8888/tree - click "New" in the upper right
corner of the Notebook interface and select "Python 3 (ipykernel)" to create a new notebook.

In the first cell of the notebook, enter the following to allow Jupyter Notebook to access the MarkLogic Spark
connector and to initialize Spark:

```
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars "/path/to/marklogic-spark-connector-2.0.0.jar" pyspark-shell'

from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local[*]").appName('My Notebook').getOrCreate()
spark.sparkContext.setLogLevel("WARN")
spark
```

The path `/path/to/marklogic-spark-connector-2.0.0.jar` should be changed to match the location of the connector
jar on your filesystem. You are also free to customize the `spark` session builder in any manner you see fit.
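As one illustration, the submit arguments can be assembled from a variable so the jar location only needs to be set in one place. The path below is a placeholder, not a required location, and this cell must run before `pyspark` is imported for the setting to take effect:

```
import os

# Placeholder path; point this at the connector jar on your filesystem.
jar_path = os.path.expanduser("~/tools/marklogic-spark-connector-2.0.0.jar")

# Quote the path so it survives spaces in directory names.
os.environ["PYSPARK_SUBMIT_ARGS"] = f'--jars "{jar_path}" pyspark-shell'

print(os.environ["PYSPARK_SUBMIT_ARGS"])
```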

Now that you have an initialized Spark session, you can run any of the examples found in the
[Getting Started with PySpark](getting-started-pyspark.md) guide.