Optimize Apache Spark

Optimize Apache Spark and Hadoop in big data analytics [Part 2] [Advanced]

Forum questions often ask why, for a particular Spark job, some configurations outperform others. A naive understanding of Spark suggests that increasing the number of executors, or the number of cores per executor, will always make a job finish sooner. That assumption is wrong: more executors or more cores do not guarantee a shorter run time. In this post, we show how to optimize Apache Spark. Faster execution [...]