One often sees questions in forums asking why, for a particular Spark job, certain configurations outperform others. A naive understanding of Spark might suggest that increasing the number of executors, or the number of cores per executor, will lead to faster job completion. This is wrong. In this post, we show how to optimize Apache Spark.
Faster execution of Spark jobs requires an intimate understanding of the relationship between CPU cores, executors (each comprising one or more cores), tasks, and memory. In addition, Spark can run in different modes (local, standalone, cluster), and different cluster managers handle resources (memory and cores) differently. Furthermore, the nature of the job (compute-intensive versus memory-intensive) affects the choice of parameters.
Optimizing Apache Spark is not trivial; the benefits, however, are significant. Faster completion times lower costs. For short jobs (minutes), optimizing Spark might not yield significant savings. Nevertheless, for longer jobs (hours or days), the time savings could be significant. Therefore, an analyst should have a good grasp of how to optimize Apache Spark.
OPTIMIZE APACHE SPARK PARAMETERS
Recall that Spark can run in different modes. This short guide is not intended to be a comprehensive overview of all possible Spark configurations. Instead, we compare a few possible configurations for one particular setup: Spark running on YARN. The pertinent parameters are:
- executor-cores: the number of cores in each executor. One or more executors run on each node.
- num-executors: the total number of executors across the entire cluster.
- executor-memory: the amount of memory available to each executor.
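As a concrete illustration, these parameters are passed to `spark-submit` as command-line flags. A minimal sketch (the application file and its values are placeholders, not a recommendation):

```shell
# Submit a job to a YARN cluster with an explicit resource layout.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 3 \
  --executor-cores 5 \
  --executor-memory 28G \
  my_job.py   # placeholder application
```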
Let’s illustrate how to optimize Apache Spark with an example adapted from a forum post.
The analyst runs a 3-node cluster. Each node has an i7 quad-core CPU (8 logical cores with hyperthreading) and 32 GB of memory. Thus, the entire cluster has 24 logical cores available. The analyst runs the same job with three different configurations and notices a significant speed difference. The table below shows the three tested configurations; the fourth and fifth configurations were not tested but are our suggestions, presented for discussion.
| Configuration | Executor memory (GB) | Cores per executor | Number of executors | Time (minutes) |
|---------------|----------------------|--------------------|---------------------|----------------|
| 1             | 19                   | 7                  | 3                   | 1.6× config 3  |
| 2             | n/a                  | 4                  | 3                   | n/a            |
| 3             | n/a                  | 2                  | 12                  | fastest        |
| 4 (suggested) | 9                    | 2                  | 9                   | not tested     |
| 5 (suggested) | 28                   | 5                  | 3                   | not tested     |
Config 1, with 7 cores per executor and 3 executors (1 on each node), takes 1.6 times longer than config 3, to the analyst’s surprise. Both configs 1 and 2 have 1 executor per node. Reserving ~2 GB for the operating system and Spark-related processes leaves roughly 30 GB per executor. But the analyst configures each executor to use only 19 GB, so 11 GB per node sits idle. If the job is memory-intensive, this presents a problem: Spark will repeatedly spill data to disk to free memory, and spilling to disk should be avoided at all costs. In addition, config 1 with 7 cores per executor can lead to poor input/output throughput. A good rule of thumb is to assign 5 or fewer cores per executor.
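The memory arithmetic above is easy to check. A quick sketch, assuming the ~2 GB OS/daemon reservation used in this post:

```python
node_ram_gb = 32          # physical memory per node
os_reserved_gb = 2        # rule of thumb: leave ~2 GB for OS and Spark daemons
executor_request_gb = 19  # what the analyst requested in config 1

available_gb = node_ram_gb - os_reserved_gb   # usable memory per node
idle_gb = available_gb - executor_request_gb  # memory left unused per node

print(available_gb, idle_gb)  # 30 11
```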
In config 2, the analyst assigns 4 cores per executor, with 3 executors requested in total (one per node). Now, YARN recommends reserving one core for the operating system on each node, plus one more for the application manager on the node that runs it. Therefore, there are 3 idle cores on each data node and 2 idle cores on the node running the application manager. This configuration is sub-optimal because of these unutilized resources.
Config 3, while the fastest, is misconfigured. The analyst requests 12 executors, each with 2 cores per executor, totaling 24 cores. However, the analyst does not reserve any cores for the operating system or the application manager. Following YARN’s recommendations would leave 20 cores (7+7+6) available to Spark, not 24.
Configs 4 and 5 are based on the hardware described in the analyst’s original post. We present these configurations but did not test them. How did we arrive at them? On the 2 data nodes, we reserve 1 core for the OS; on the node running the application manager, we reserve 1 core for the OS and 1 for the application manager. This leaves 20 cores available across the cluster.
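The core count works out as follows (a sketch, assuming 8 logical cores per node and the one-core reservations described above):

```python
nodes = 3
logical_cores_per_node = 8

# Reserve 1 core per node for the OS; on one node, reserve 1 more
# for the YARN application manager.
usable = []
for i in range(nodes):
    reserved = 1 + (1 if i == 0 else 0)  # node 0 hosts the application manager
    usable.append(logical_cores_per_node - reserved)

print(usable, sum(usable))  # [6, 7, 7] 20
```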
Config 4 is suitable for a compute-intensive job and utilizes 18 cores, divided among 9 executors. There are 3 executors per node; each executor comprises 2 cores. Each node makes ~30 GB available to Spark, divided equally among its 3 executors. Since YARN adds an overhead to the requested executor memory, we request 9 GB per executor rather than the full 10 GB.
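We can sanity-check config 4 with a few lines. This sketch assumes YARN's default per-executor memory overhead rule, the larger of 384 MB and 10% of the requested executor memory:

```python
usable_cores = 18          # use 18 of the 20 available so executors divide evenly
cores_per_executor = 2
num_executors = usable_cores // cores_per_executor   # total executors
executors_per_node = num_executors // 3              # executors on each node

# Requesting 9 GB per executor leaves headroom for YARN's memory overhead.
request_gb = 9
overhead_gb = max(0.384, 0.10 * request_gb)
container_gb = request_gb + overhead_gb
per_node_gb = executors_per_node * container_gb      # must fit in ~30 GB per node

print(num_executors, executors_per_node, round(per_node_gb, 1))  # 9 3 29.7
```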
Similarly, config 5 is suitable for a memory-intensive job and utilizes 15 cores, divided among 3 executors. There is 1 executor per node; each executor comprises 5 cores. Each node makes ~30 GB available to its single executor. Since YARN adds an overhead to the requested executor memory, we request 28 GB rather than the full 30 GB.
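The overhead reasoning for config 5 can be made concrete. Again assuming YARN's default overhead rule (the larger of 384 MB and 10% of the request): asking for the full 30 GB would push the container to roughly 33 GB, while 28 GB keeps it near the node's budget:

```python
def container_size_gb(request_gb):
    """Approximate YARN container size: request plus default memory overhead."""
    overhead_gb = max(0.384, 0.10 * request_gb)
    return request_gb + overhead_gb

print(round(container_size_gb(30), 1))  # 33.0 -- too large for the node
print(round(container_size_gb(28), 1))  # 30.8 -- fits within the node's budget
```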
In closing, we showed how to optimize Apache Spark. It is essential for anyone running Spark jobs to understand how to properly configure the parameters. Well-optimized jobs run efficiently, complete faster, and save time and money.