Apache Spark Tuning: Killing Data Skew & Mastering Broadcast Joins
There is no pain in data engineering quite like watching a Spark job race to 99% completion in 5 minutes, only to hang on the final task for 4 hours. If you are staring at the Spark UI and seeing …