Data Engineering Essentials using SQL, Python, and PySpark

Learn key Data Engineering skills such as SQL, Python, and Apache Spark (Spark SQL and PySpark) with exercises and projects in a 56-hour course

What you’ll learn

  • Set up an environment to learn SQL and Python essentials for Data Engineering
  • Database Essentials for Data Engineering using Postgres such as creating tables, indexes, running SQL Queries, using important pre-defined functions, etc.
  • Data Engineering Programming Essentials using Python such as basic programming constructs, collections, Pandas, Database Programming, etc.
  • Data Engineering using Spark DataFrame APIs (PySpark) on Databricks. Learn all the important Spark DataFrame APIs such as select, filter, groupBy, orderBy, etc.
  • Data Engineering using Spark SQL (PySpark and Spark SQL). Learn how to write high-quality Spark SQL queries using SELECT, WHERE, GROUP BY, ORDER BY, etc.
  • Relevance of the Spark Metastore and integration of DataFrames and Spark SQL
  • Ability to build Data Engineering Pipelines using Spark leveraging Python as Programming Language
  • Use of different file formats such as Parquet, JSON, CSV, etc. in building Data Engineering Pipelines
  • Set up a Hadoop and Spark cluster on GCP using Dataproc
  • Understanding the complete Spark Application Development Life Cycle to build Spark applications using PySpark. Review the applications using the Spark UI.
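
As a small taste of the file format work mentioned in the list above, here is a minimal sketch of converting CSV data to newline-delimited JSON. It is not taken from the course material: the function name `csv_to_json_lines` and the sample rows are made up for illustration, and it uses only the Python standard library rather than Pandas or Spark.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text: str) -> str:
    """Convert CSV text (with a header row) into newline-delimited JSON."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each CSV row becomes one JSON object; all values stay strings.
    return "\n".join(json.dumps(row) for row in reader)


# Hypothetical sample data, made up for illustration.
sample_csv = "order_id,status\n1,COMPLETE\n2,PENDING\n"
json_lines = csv_to_json_lines(sample_csv)
print(json_lines)
```

A real pipeline would usually also convert column types and handle malformed rows, which this sketch leaves out.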

This course is designed to help professionals at all levels acquire the required Data Engineering skills (Python, SQL, and Apache Spark).

  • Set up an environment to learn Data Engineering essentials such as SQL (using Postgres), Python, etc.
  • Set up the required tables in Postgres to practice SQL
  • Writing basic SQL Queries with practical examples using WHERE, JOIN, GROUP BY, HAVING, ORDER BY, etc
  • Advanced SQL Queries with practical examples such as cumulative aggregations, ranking, etc
  • Scenarios covering troubleshooting and debugging related to Databases.
  • Performance Tuning of SQL Queries
  • Exercises and Solutions for SQL Queries.
  • Basics of Programming using Python as Programming Language
  • Python Collections for Data Engineering
  • Data Processing or Data Engineering using Pandas
  • Two real-time Python projects with explanations (File Format Converter and Database Loader)
  • Scenarios covering troubleshooting and debugging in Python Applications
  • Performance Tuning Scenarios related to Data Engineering Applications using Python
  • Getting started with Google Cloud Platform to set up a Spark environment using Databricks
  • Writing Basic Spark SQL Queries with practical examples using WHERE, JOIN, GROUP BY, HAVING, ORDER BY, etc
  • Creating Delta Tables in Spark SQL along with CRUD Operations such as INSERT, UPDATE, DELETE, MERGE, etc
  • Advanced Spark SQL Queries with practical examples such as ranking
  • Integration of Spark SQL and PySpark
  • In-depth coverage of Apache Spark Catalyst Optimizer for Performance Tuning
  • Reading Explain Plans of Spark SQL queries or PySpark DataFrame APIs
  • In-depth coverage of columnar file formats and Performance tuning using Partitioning
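
To illustrate what the cumulative aggregation and ranking topics above look like in practice, here is a minimal sketch using window functions. It is not the course's own material. The course uses Postgres and Spark SQL, while this sketch assumes Python's built-in `sqlite3` module as a stand-in (it needs SQLite 3.25 or later for window functions). The `daily_revenue` table and its values are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_revenue (order_date TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO daily_revenue VALUES (?, ?)",
    # Hypothetical sample rows, made up for illustration.
    [("2024-01-01", 100), ("2024-01-02", 300), ("2024-01-03", 200)],
)

query = """
SELECT order_date,
       revenue,
       -- Running total of revenue up to and including each date
       SUM(revenue) OVER (ORDER BY order_date) AS cumulative_revenue,
       -- Rank of each day by revenue, highest first
       RANK() OVER (ORDER BY revenue DESC) AS revenue_rank
FROM daily_revenue
ORDER BY order_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same `SUM(...) OVER` and `RANK() OVER` syntax is standard SQL, so the query should carry over to Postgres and Spark SQL with little or no change.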

How to Enroll in the Data Engineering Essentials using SQL, Python, and PySpark Course?

  • To access "Data Engineering Essentials using SQL, Python, and PySpark", click the Enroll Now button at the end of the post. It redirects you to the Udemy course page, where you can start the enrollment process.
  • If you are new to Udemy, sign up with your email and create a password. Existing users can log in with their credentials to access the course.
  • How many members can access this course with a coupon?

    The Data Engineering Essentials using SQL, Python, and PySpark course coupon is limited to the first 1,000 enrollments. Click 'Enroll Now' to secure your spot on Udemy before the coupon reaches its enrollment limit.

    External links may contain affiliate links, meaning we get a commission if you decide to make a purchase.