Spark Starter Kit

Learn about the similarities and differences between Spark and Hadoop, and how Spark achieves faster performance than Hadoop. Explore the challenges Spark tries to address, which will give you a good idea of why Spark is needed. Understand Spark’s performance and efficiency, and get to know RDDs. Follow, step by step, how the program we write gets translated into actual execution behind the scenes in a Spark cluster.

Spark

You’ll learn how to use Spark to work with big data and build machine learning models at scale, including how to wrangle and model massive datasets with PySpark. Learn about big data and how Spark fits into the big data ecosystem. Practice processing and cleaning datasets to get comfortable with Spark’s SQL and DataFrame APIs. Debug and optimize your Spark code when running on a cluster. Use Spark’s machine learning library, MLlib, to train machine learning models at scale.