By the end of the course, you’ll have learned:
- Cluster Setup on Google Cloud
- Introduction to Apache Spark
- Resilient Distributed Datasets
- DataFrame abstraction in Spark
- Working with JSON data
- Working with Parquet data
- Working with Avro data
- Working with ORC data
- Working with DataFrame Columns
- Manipulating dates with Spark DataFrames
- Manipulating strings with Spark DataFrames
- Working with the Hive metastore
Who this course is for:
- Students who want to take the CCA175 Hadoop and Spark Developer Certification exam.