Overview
Context
Working with data is tricky - working with millions or even billions of rows is worse. Did you receive some data processing code written on a laptop with fairly pristine data?
Chances are you’ve been put in charge of moving a basic data process from prototype to production. You may have worked with real-world datasets, with missing fields, bizarre formatting, and orders of magnitude more data.
Even if this is all new to you, this Cleaning Data with PySpark course offered by DataCamp helps you learn what’s needed to prepare data processes.
You’ll learn terminology, methods, and some best practices to create a performant, maintainable, and understandable data processing platform.
Programme Structure
Chapters include:
- DataFrame details
- Manipulating DataFrames in the real world
- Improving Performance
- Complex processing and data pipelines
Key information
Duration
- Part-time
- 1 day
Start dates & application deadlines
Language
Delivered
Campus Location
- New York City, United States
Disciplines
Data Science & Big Data
Academic requirements
We are not aware of any specific GRE, GMAT or GPA grading score requirements for this programme.
English requirements
We are not aware of any English requirements for this programme.
Other requirements
General requirements
Prerequisites
- Intermediate Python
- Introduction to PySpark
Tuition Fees
- International
  Non-residents: Free
  Out-of-State: Free
- Domestic
  In-State: Free
Additional Details
This course can be accessed for free with a DataCamp Premium or Teams subscription.