Spark SQL & Hadoop (For Data Scientists & Big Data Analysts)

What you’ll learn

  • Students will get hands-on experience working in a Spark Hadoop environment that’s free and downloadable as part of this course.
  • Students will have opportunities to solve data engineering and data analysis problems using Spark on a Hadoop cluster in the sandbox environment that comes as part
  • Issuing HDFS commands.
  • Converting a set of data values in a given format stored in HDFS into new data values or a new data format and writing them into HDFS.
  • Loading data from HDFS for use in Spark applications & writing the results back into HDFS using Spark.
  • Reading and writing files in a variety of file formats.
  • Performing standard extract, transform, load (ETL) processes on data using the Spark API.
  • Using metastore tables as an input source or an output sink for Spark applications.
  • Applying the fundamentals of querying datasets in Spark.
  • Filtering data using Spark.
  • Writing queries that calculate aggregate statistics.
  • Joining disparate datasets using Spark.
  • Producing ranked or sorted data.