HELP! Can't run example against AWS EMR Cluster #247
Answered by wajda
mfrictionless asked this question in Q&A
Sorry, brand new to managing cluster configurations here. Any help would be appreciated. Full log:

pyspark \
  --packages za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:0.6.0 \
  --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" \
  --conf "spark.spline.lineageDispatcher.http.producer.url=http://localhost:9090/producer"
Python 3.7.8 (default, Jul 24 2020, 20:26:49)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Ivy Default Cache set to: /home/hadoop/.ivy2/cache
The jars for the packages stored in: /home/hadoop/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
za.co.absa.spline.agent.spark#spark-3.0-spline-agent-bundle_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-5493c2e7-11b8-4490-934a-2a46a3e1414d;1.0
confs: [default]
found za.co.absa.spline.agent.spark#spark-3.0-spline-agent-bundle_2.12;0.6.0 in central
found za.co.absa.spline.agent.spark#agent-core_2.12;0.6.0 in central
found org.scala-lang#scala-compiler;2.12.10 in central
found org.scala-lang#scala-reflect;2.12.10 in central
found org.scala-lang.modules#scala-xml_2.12;1.0.6 in central
found za.co.absa.commons#commons_2.12;0.0.27 in central
found org.json4s#json4s-ext_2.12;3.5.3 in central
found joda-time#joda-time;2.9.5 in central
found org.joda#joda-convert;1.8.1 in central
found org.scalaz#scalaz-core_2.12;7.2.29 in central
found org.scalaj#scalaj-http_2.12;2.4.1 in central
found io.github.classgraph#classgraph;4.8.87 in central
found org.scala-graph#graph-core_2.12;1.12.5 in central
:: resolution report :: resolve 631ms :: artifacts dl 15ms
:: modules in use:
io.github.classgraph#classgraph;4.8.87 from central in [default]
joda-time#joda-time;2.9.5 from central in [default]
org.joda#joda-convert;1.8.1 from central in [default]
org.json4s#json4s-ext_2.12;3.5.3 from central in [default]
org.scala-graph#graph-core_2.12;1.12.5 from central in [default]
org.scala-lang#scala-compiler;2.12.10 from central in [default]
org.scala-lang#scala-reflect;2.12.10 from central in [default]
org.scala-lang.modules#scala-xml_2.12;1.0.6 from central in [default]
org.scalaj#scalaj-http_2.12;2.4.1 from central in [default]
org.scalaz#scalaz-core_2.12;7.2.29 from central in [default]
za.co.absa.commons#commons_2.12;0.0.27 from central in [default]
za.co.absa.spline.agent.spark#agent-core_2.12;0.6.0 from central in [default]
za.co.absa.spline.agent.spark#spark-3.0-spline-agent-bundle_2.12;0.6.0 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 13 | 0 | 0 | 0 || 13 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-5493c2e7-11b8-4490-934a-2a46a3e1414d
confs: [default]
0 artifacts copied, 13 already retrieved (0kB/14ms)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/06/12 00:39:45 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/za.co.absa.spline.agent.spark_spark-3.0-spline-agent-bundle_2.12-0.6.0.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/za.co.absa.spline.agent.spark_agent-core_2.12-0.6.0.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.scala-lang_scala-compiler-2.12.10.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/za.co.absa.commons_commons_2.12-0.0.27.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.json4s_json4s-ext_2.12-3.5.3.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.scalaz_scalaz-core_2.12-7.2.29.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.scalaj_scalaj-http_2.12-2.4.1.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/io.github.classgraph_classgraph-4.8.87.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.scala-graph_graph-core_2.12-1.12.5.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.scala-lang_scala-reflect-2.12.10.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.scala-lang.modules_scala-xml_2.12-1.0.6.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/joda-time_joda-time-2.9.5.jar added multiple times to distributed cache.
21/06/12 00:39:49 WARN Client: Same path resource file:///home/hadoop/.ivy2/jars/org.joda_joda-convert-1.8.1.jar added multiple times to distributed cache.
/usr/lib/spark/python/pyspark/shell.py:45: UserWarning: Failed to initialize Spark session.
warnings.warn("Failed to initialize Spark session.")
Traceback (most recent call last):
File "/usr/lib/spark/python/pyspark/shell.py", line 41, in <module>
spark = SparkSession._create_shell_session()
File "/usr/lib/spark/python/pyspark/sql/session.py", line 489, in _create_shell_session
return SparkSession.builder.getOrCreate()
File "/usr/lib/spark/python/pyspark/sql/session.py", line 191, in getOrCreate
session._jsparkSession.sessionState().conf().setConfString(key, value)
File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/usr/lib/spark/python/pyspark/sql/utils.py", line 131, in deco
return f(*a, **kw)
File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o64.sessionState.
: java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration
at za.co.absa.spline.harvester.conf.DefaultSplineConfigurer$.apply(DefaultSplineConfigurer.scala:66)
at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener$.za$co$absa$spline$harvester$listener$SplineQueryExecutionListener$$constructEventHandler(SplineQueryExecutionListener.scala:65)
at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.<init>(SplineQueryExecutionListener.scala:37)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.util.Utils$.$anonfun$loadExtensions$1(Utils.scala:2736)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2725)
at org.apache.spark.sql.util.ExecutionListenerManager.$anonfun$new$1(QueryExecutionListener.scala:84)
at org.apache.spark.sql.util.ExecutionListenerManager.$anonfun$new$1$adapted(QueryExecutionListener.scala:83)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.sql.util.ExecutionListenerManager.<init>(QueryExecutionListener.scala:83)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$listenerManager$2(BaseSessionStateBuilder.scala:307)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.listenerManager(BaseSessionStateBuilder.scala:307)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:334)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1107)
at org.apache.spark.sql.SparkSession.$anonfun$sessionState$2(SparkSession.scala:157)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:155)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:152)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 40 more
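A note on the failure above: the root cause is the `ClassNotFoundException` at the bottom of the trace. Spline's `DefaultSplineConfigurer` needs `org.apache.commons.configuration.Configuration` (Commons Configuration 1.x), which is evidently not on the driver classpath on this cluster. Before adding the dependency, you can confirm the class really is absent by scanning the driver's jar directory; a minimal sketch (the helper names and the `/usr/lib/spark/jars` path are illustrative, adjust for your EMR image):

```python
import zipfile
from pathlib import Path

def jar_contains_class(jar_path, class_name):
    """Return True if the jar contains the given fully-qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

def find_class_in_jars(jar_dir, class_name):
    """List the jars under jar_dir that contain class_name."""
    return [str(p) for p in Path(jar_dir).glob("*.jar")
            if jar_contains_class(p, class_name)]

if __name__ == "__main__":
    # On EMR the Spark jars typically live here; adjust as needed.
    hits = find_class_in_jars("/usr/lib/spark/jars",
                              "org.apache.commons.configuration.Configuration")
    print(hits or "class not found in any jar")
```

An empty result from the scan is consistent with the `NoClassDefFoundError` seen in the trace.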
Answered by wajda, Jun 12, 2021 · 1 comment, 1 reply
Please add the following Maven dependency to the driver:

commons-configuration:commons-configuration:1.10

Answer selected by wajda.
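For the `pyspark` invocation from the question, one way to supply that dependency is to append its coordinate to `--packages`, which accepts a comma-separated list of Maven coordinates and resolves them via Ivy just like the agent bundle. A sketch of the adjusted command (untested here, same producer URL as in the question):

```shell
pyspark \
  --packages za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:0.6.0,commons-configuration:commons-configuration:1.10 \
  --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" \
  --conf "spark.spline.lineageDispatcher.http.producer.url=http://localhost:9090/producer"
```

Alternatively, the jar could be placed on the driver classpath directly (e.g. via `--jars`); the key point is that Commons Configuration 1.10 must be visible to the driver JVM when the Spline listener is constructed.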