
Commit d91a92f

Merge pull request #46 from marklogic/feature/466-jupyter

DEVEXP-466 Added guide for using Jupyter

2 parents: 8fff4e8 + c43dc89

4 files changed: +43 −3 lines

docs/configuration.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Configuration Reference
-nav_order: 6
+nav_order: 7
 ---
 
 The MarkLogic Spark connector has 3 sets of configuration options - connection options, reading options, and writing
```

docs/getting-started-jupyter.md

Lines changed: 40 additions & 0 deletions

```diff
@@ -0,0 +1,40 @@
+---
+layout: default
+title: Getting Started with Jupyter
+nav_order: 4
+---
+
+[Project Jupyter](https://jupyter.org/) provides a set of tools for working with notebooks, code, and data. The
+MarkLogic Spark connector can be easily integrated into these tools to allow users to access and analyze data in
+MarkLogic.
+
+To get started, install either [JupyterLab or Jupyter Notebook](https://jupyter.org/install). Both of these tools
+allow you to work with the connector in the same fashion. The rest of this guide will assume the use of Jupyter
+Notebook, though the instructions will work for JupyterLab as well.
+
+Once you have installed, started, and accessed Jupyter Notebook in your web browser - in a default Notebook
+installation, you should be able to access it at http://localhost:8889/tree - click on "New" in the upper right hand
+corner of the Notebook interface and select "Python 3 (ipykernel)" to create a new notebook.
+
+In the first cell in the notebook, enter the following to allow Jupyter Notebook to access the MarkLogic Spark connector
+and also to initialize Spark:
+
+```
+import os
+os.environ['PYSPARK_SUBMIT_ARGS'] = '--jars "/path/to/marklogic-spark-connector-2.0.0.jar" pyspark-shell'
+
+from pyspark.sql import SparkSession
+spark = SparkSession.builder.master("local[*]").appName('My Notebook').getOrCreate()
+spark.sparkContext.setLogLevel("WARN")
+spark
+```
+
+The path of `/path/to/marklogic-spark-connector-2.0.0.jar` should be changed to match the location of the connector
+jar on your filesystem. You are free to customize the `spark` variable in any manner you see fit as well.
+
+Now that you have an initialized Spark session, you can run any of the examples found in the
+[Getting Started with PySpark](getting-started-pyspark.md) guide.
+
+
+
+
```
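The first-cell setup the new guide describes can be sketched as below. The jar path is the guide's own placeholder, not a real location; the f-string construction and the existence check are additions here, added because a mistyped `--jars` path otherwise only surfaces later as a confusing Spark startup failure.

```python
import os

# Placeholder path from the guide; point this at the actual connector jar on your machine.
jar_path = "/path/to/marklogic-spark-connector-2.0.0.jar"

# PYSPARK_SUBMIT_ARGS must be set before pyspark creates its SparkSession,
# which is why the guide puts it in the first notebook cell. Quoting the
# --jars value tolerates spaces in the path.
os.environ["PYSPARK_SUBMIT_ARGS"] = f'--jars "{jar_path}" pyspark-shell'

# A quick sanity check catches a bad path before Spark fails to start.
if not os.path.exists(jar_path):
    print(f"Warning: connector jar not found at {jar_path}")
```

After this cell runs, creating the `SparkSession` proceeds exactly as shown in the guide's snippet.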

docs/reading.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Reading Data
-nav_order: 4
+nav_order: 5
 ---
 
 The MarkLogic Spark connector allows for data to be retrieved from MarkLogic as rows via an
```

docs/writing.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: Writing Data
-nav_order: 5
+nav_order: 6
 ---
 
 The MarkLogic Spark connector allows for writing rows in a Spark DataFrame to MarkLogic as documents. The sections below
```
