
charmed-spark-jupyter starts with an existing spark session #157

@gustavosr98

Description


As soon as I start the notebook, I can see exec pods running:

pysparkshell-a5c6ae98aac7dabe-exec-1                          2/2     Running   0          118s
pysparkshell-a5c6ae98aac7dabe-exec-2                          2/2     Running   0          117s

I want to be able to create a new session from the notebook, specifying config parameters myself.
It is confusing to have an auto-created session, and then to believe getOrCreate() is using the new session when it is actually returning the existing one.

Experimenting with Spark parameters through PodDefaults means destroying the notebook, updating the PodDefaults, and creating a new notebook for every change, which takes too much time.

By default, we should not trigger the creation of any session.


Kind of "Workaround"

# Run this at the start of your notebook
from pyspark.sql import SparkSession

# Stop the auto-created session so a new one can be configured
SparkSession.builder.getOrCreate().stop()

This allows creating a new session with the specified configs.
However, in the background it keeps trying to recreate the auto session with the default configs:

pysparkshell-0a9af798aad8f8c7-exec-14                         1/2     Terminating   0          12s
pysparkshell-0a9af798aad8f8c7-exec-15                         1/2     Terminating   0          8s
pysparkshell-0a9af798aad8f8c7-exec-16                         2/2     Running       0          7s
pysparkshell-0a9af798aad8f8c7-exec-17                         1/2     Running       0          3s
s3example-007e6298aad4f17c-exec-1                             2/2     Running       0          5m19s
s3example-007e6298aad4f17c-exec-2                             2/2     Running       0          5m19s

Not a great workaround. Within a few minutes it can already clutter the K8s cluster:

pysparkshell-0a9af798aad8f8c7-exec-102                        1/2     Terminating       0          11s
pysparkshell-0a9af798aad8f8c7-exec-103                        1/2     Terminating       0          11s
pysparkshell-0a9af798aad8f8c7-exec-104                        2/2     Running           0          4s
pysparkshell-0a9af798aad8f8c7-exec-105                        1/2     Running           0          4s
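For completeness, a sketch of the full workaround: stop the auto-created session, then build a replacement with explicit configs from the notebook. The app name and executor settings below are illustrative values, not anything prescribed by charmed-spark-jupyter, and the background auto-session recreation issue described above still applies.

```python
from pyspark.sql import SparkSession

# Stop whatever session the notebook image auto-created for us.
SparkSession.builder.getOrCreate().stop()

# Illustrative settings -- adjust to your workload. These are standard
# Spark config keys; the values here are just examples.
custom_conf = {
    "spark.app.name": "my-notebook-session",  # hypothetical name
    "spark.executor.instances": "4",
    "spark.executor.memory": "2g",
}

# Apply each config key/value to the builder, then create the session.
builder = SparkSession.builder
for key, value in custom_conf.items():
    builder = builder.config(key, value)

spark = builder.getOrCreate()
```

This is only a stopgap: it does not prevent the image from retrying the default session, which is why the proper fix is to not auto-create a session in the first place.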
