Storage & Databases
Learn about GCP's storage offerings: FS, SQL, and NoSQL.
DataFrames in Spark: definition, features, use cases, sources, and creation, with a sample Game of Thrones use case.
A deep dive into Spark's streaming module, with Structured Streaming. Learn about Spark's micro-batch strategy and aggregations.
Learn about the similarities and differences between Spark and Hadoop, and how Spark is faster than Hadoop. Explore the challenges Spark tries to address, which will give you a good idea of the need for Spark. Also covered: Spark's performance and efficiency, RDDs, and a step-by-step look at how the program we write gets translated into actual execution behind the scenes in a Spark cluster.
Spark introduction: what it is, modules, data types, operations, aggregations, joins, and developing applications.
Apache Hive: history, what it is, data flow, modeling, types, modes, and main features. Differences from RDBMS.
Learn about HBase: what it is, use cases and applications, storage, and architecture. See a quick demo.
Introduction to Hadoop. You’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
How to choose the data model for your use case