Memory issues while running Apache Spark streaming applications on Google Dataproc cluster | OutOfMemoryError Java heap space #1026

Open
Sujay39 opened this issue Jul 3, 2023 · 0 comments

Sujay39 commented Jul 3, 2023

Background:

Cluster:

A high-availability Google Dataproc cluster was created to run Apache Spark streaming applications.
The input to these applications is a Kafka topic with 12 partitions, hosted on an n-node cluster of Google Compute Engine instances. The throughput on this topic is approximately 5k events per minute.

Application:

The application processes the events and stores the data in a Google Cloud Storage bucket. 1.5 GB of memory is allocated to the driver and to the executor of the Spark application. The Spark version in use is 3.3.3.
A Google Cloud Storage location is given as the checkpoint location.
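
The issue does not include the job code; the following is a minimal sketch of the pipeline as described, assuming Scala and a Parquet sink. The broker, topic, and bucket names are placeholders, not values from the actual deployment.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the pipeline described above. Broker, topic, and
// bucket names are placeholders; the Parquet output format is an assumption.
object KafkaToGcs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-gcs")
      .getOrCreate()

    // Kafka source: the reported topic has 12 partitions at ~5k events/min.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092") // placeholder
      .option("subscribe", "events")                      // placeholder topic
      .load()

    // Sink and checkpoint both live in GCS, as in the report.
    events.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
      .writeStream
      .format("parquet")                                         // assumed format
      .option("path", "gs://example-bucket/output/")             // placeholder
      .option("checkpointLocation", "gs://example-bucket/ckpt/") // placeholder
      .start()
      .awaitTermination()
  }
}
```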

Issue:

While writing checkpoint and metadata information, the Spark application runs out of memory and crashes. Stack traces from two occurrences:

  1. Caused by: java.lang.OutOfMemoryError: Java heap space
         at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75) ~[?:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createOutputStream(GoogleHadoopOutputStream.java:90) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:71) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.create(GoogleHadoopFileSystemBase.java:616) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS.createInternal(GoogleHadoopFS.java:98) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:626) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:701) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:697) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FileContext.create(FileContext.java:703) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.createTempFile(CheckpointFileManager.scala:327) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.CheckpointFileManager$RenameBasedFSDataOutputStream.<init>(CheckpointFileManager.scala:140) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.CheckpointFileManager$RenameBasedFSDataOutputStream.<init>(CheckpointFileManager.scala:143) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.createAtomic(CheckpointFileManager.scala:333) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.$anonfun$addNewBatchByStream$2(HDFSMetadataLog.scala:173) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog$$Lambda$2548/0x000000010135b840.apply$mcZ$sp(Unknown Source) ~[?:?]
         at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) ~[scala-library-2.12.14.jar:?]
         at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.14.jar:?]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.addNewBatchByStream(HDFSMetadataLog.scala:171) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:116) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$18(MicroBatchExecution.scala:675) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$3985/0x00000001018e3440.apply$mcV$sp(Unknown Source) ~[?:?]
         at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.14.jar:?]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:687) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:672) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:255) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$2228/0x0000000101209040.apply$mcV$sp(Unknown Source) ~[?:?]
         at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.14.jar:?]
         at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:375) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:373) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:68) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:218) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
     23/06/27 19:03:37 ERROR Utils: uncaught error in thread spark-listener-group-shared, stopping SparkContext

  2. java.lang.OutOfMemoryError: Java heap space
         at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75) ~[?:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createOutputStream(GoogleHadoopOutputStream.java:90) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:71) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.create(GoogleHadoopFileSystemBase.java:616) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS.createInternal(GoogleHadoopFS.java:98) ~[gcs-connector-hadoop3-2.2.14.jar:?]
         at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:626) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:701) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:697) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.hadoop.fs.FileContext.create(FileContext.java:703) ~[hadoop-client-api-3.3.3.jar:?]
         at org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.createTempFile(CheckpointFileManager.scala:327) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.CheckpointFileManager$RenameBasedFSDataOutputStream.<init>(CheckpointFileManager.scala:140) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.CheckpointFileManager$RenameBasedFSDataOutputStream.<init>(CheckpointFileManager.scala:143) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.createAtomic(CheckpointFileManager.scala:333) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.$anonfun$addNewBatchByStream$2(HDFSMetadataLog.scala:173) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog$$Lambda$2473/0x000000010131a440.apply$mcZ$sp(Unknown Source) ~[?:?]
         at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) ~[scala-library-2.12.14.jar:?]
         at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.14.jar:?]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.addNewBatchByStream(HDFSMetadataLog.scala:171) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:116) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$18(MicroBatchExecution.scala:675) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$3827/0x000000010184f840.apply$mcV$sp(Unknown Source) ~[?:?]
         at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.14.jar:?]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:687) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:672) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:255) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$2122/0x00000001011b1840.apply$mcV$sp(Unknown Source) ~[?:?]
         at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.14.jar:?]
         at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:375) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:373) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:68) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
         at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:218) ~[spark-sql_2.12-3.3.0.jar:3.3.0]
     Exception in thread "stream execution thread for [id = 2be4acf0-1a9e-4dcc-9db2-addc0de7e89f, runId = 37627391-5ef0-42e0-a787-b7ce2ecb8feb]" java.lang.OutOfMemoryError: Java heap space
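
Both traces fail at the same point: the BufferedOutputStream allocated inside GoogleHadoopOutputStream when the checkpoint file manager opens an atomic output stream, which suggests this allocation is simply the last one attempted on an already-full heap. A minimal diagnostic sketch for capturing a heap dump at the moment of failure, using standard HotSpot flags; the dump paths are placeholders, and in practice the driver option must be supplied via spark-submit --conf because it has to be set before the driver JVM starts:

```scala
import org.apache.spark.sql.SparkSession

// Diagnostic sketch, not from the original report: dump the heap when the
// OOM fires so the dominant allocations can be inspected offline.
// Dump paths are placeholders.
val spark = SparkSession.builder()
  .config("spark.driver.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver.hprof")
  .config("spark.executor.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/executor.hprof")
  .getOrCreate()
```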

Troubleshooting attempted:

  1. Fine-tuned the configs given in this documentation.
  2. Tried the solutions suggested here.

However, these steps have not resolved the out-of-memory error.

Any suggestions for solving this are highly appreciated.
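
Since the trace terminates in the connector's per-stream buffer allocation, one further avenue (a sketch under stated assumptions, not a confirmed fix) would be shrinking the GCS connector's output-stream buffers so each open checkpoint/metadata stream costs less heap. fs.gs.outputstream.buffer.size and fs.gs.outputstream.upload.chunk.size are documented gcs-connector settings; the values below are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: reduce the GCS connector's per-stream buffers. Both
// properties are documented gcs-connector settings; values are examples,
// and the upload chunk size must remain a multiple of 8 MiB.
val spark = SparkSession.builder()
  .config("spark.hadoop.fs.gs.outputstream.buffer.size", "1048576")       // 1 MiB, down from the 8 MiB default
  .config("spark.hadoop.fs.gs.outputstream.upload.chunk.size", "8388608") // 8 MiB, down from the 64 MiB default
  .getOrCreate()
```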
