Overview
Join over 9 million learners and start Cleaning Data with PySpark today!
Working with data is tricky, and working with millions or even billions of rows is worse. Did you receive some data processing code written on a laptop with fairly pristine data? Chances are you have been put in charge of moving a basic data process from prototype to production. You may have worked with real-world datasets that have missing fields, bizarre formatting, and orders of magnitude more data. Even if this is all new to you, the Cleaning Data with PySpark course at DataCamp helps you learn what is needed to prepare data processes using Python with Apache Spark. You’ll learn terminology, methods, and some best practices to create a performant, maintainable, and understandable data processing platform.
Programme Structure
Chapters include:
- A review of DataFrame fundamentals and the importance of data cleaning.
- Improving Performance
- Manipulating DataFrames in the real world
- Complex processing and data pipelines
Key information
Duration
- Part-time
- 1 day
Start dates & application deadlines
Language
Delivered
Disciplines
- Data Science & Big Data
- Web Technologies & Cloud Computing
- Machine Learning
Academic requirements
We are not aware of any specific GRE, GMAT or GPA grading score requirements for this programme.
English requirements
We are not aware of any English requirements for this programme.
Other requirements
General requirements
Prerequisites
- Intermediate Python
- Introduction to PySpark
Tuition Fee
- International: Free. Based on the tuition of 0 USD for the full programme during 1 day.
- National: Free. Based on the tuition of 0 USD for the full programme during 1 day.
Basic Access: Free; Premium (for individuals): $12.42 per month billed annually; Teams: $25 per month billed annually; Enterprise: Contact sales for pricing