Overview
Context
The real world is messy, and your job is to make sense of it. Toy datasets like MTCars and Iris are the result of careful curation and cleaning; even so, their data needs to be transformed before machine learning algorithms can use it to extract meaning, forecast, classify, or cluster. This Feature Engineering with PySpark course, offered by DataCamp, covers that process. With datasets growing ever larger, let's use PySpark to cut this Big Data problem down to size!
What you will do during this course:
- Get to know a bit about your problem before you dive in! Then learn how to statistically and visually inspect your dataset!
- Real data is rarely clean and ready for analysis. In this chapter, learn to remove unneeded information, handle missing values, and add additional data to your analysis.
- You will learn how to create new features for your machine learning model to learn from. You'll look at generating them by combining fields, extracting values from messy columns, or encoding them for better results.
- You will learn how to choose which type of model to use. Then you will learn how to apply your data to the model and evaluate it. Lastly, you'll learn how to interpret the results and save the model for later!
Programme Structure
Chapters
- Exploratory Data Analysis
- Wrangling with Spark Functions
- Feature Engineering
- Building a Model
Key information
Duration
- Part-time
- 1 day
Campus Location
- New York City, United States
Disciplines
- Data Science & Big Data
Academic requirements
We are not aware of any specific GRE, GMAT or GPA grading score requirements for this programme.
English requirements
We are not aware of any English requirements for this programme.
Other requirements
General requirements
PREREQUISITES
- Supervised Learning with scikit-learn
- Introduction to PySpark
Tuition Fees
- International (non-residents, out-of-state): Free
- Domestic (in-state): Free
Additional Details
This course can be accessed for free with a DataCamp Premium or Teams subscription.