Studyportals
Certificate Online

Feature Engineering with PySpark Data Camp

Highlights

  • Tuition fee: Free
  • Duration: 1 day
  • Apply date: Anytime
  • Start date: Anytime
  • Taught in: English

About

In this Feature Engineering with PySpark course offered by Data Camp, you will learn the gritty details that data scientists spend 70-80% of their time on: data wrangling and feature engineering.

Overview

Context

The real world is messy and your job is to make sense of it. Toy datasets like MTCars and Iris are the result of careful curation and cleaning. Even so, their data still has to be transformed before powerful machine learning algorithms can use it to extract meaning, forecast, classify or cluster. This Feature Engineering with PySpark course offered by Data Camp covers that transformation work. With datasets growing ever larger, let's use PySpark to cut this Big Data problem down to size!

What you will do during this course:

  • Get to know a bit about your problem before you dive in! Then learn how to inspect your dataset statistically and visually.
  • Real data is rarely clean and ready for analysis. In this chapter, you will learn to remove unneeded information, handle missing values and add additional data to your analysis.
  • You will learn how to create new features for your machine learning model to learn from. You will generate them by combining fields, extracting values from messy columns or encoding them for better results.
  • You will learn how to choose a type of model, apply your data to it and evaluate it. Lastly, you will learn how to interpret the results and save the model for later!

Programme Structure

Chapters

  • Exploratory Data Analysis
  • Wrangling with Spark Functions
  • Feature Engineering
  • Building a Model

Key information

Duration

  • Part-time
    • 1 day

Start dates & application deadlines

You can apply for and start this programme anytime.

Language

English

Delivered

Online

Campus Location

  • New York City, United States

Academic requirements

We are not aware of any specific GRE, GMAT or GPA grading score requirements for this programme.

English requirements

We are not aware of any English requirements for this programme.

Other requirements

General requirements

PREREQUISITES

  • Supervised Learning with scikit-learn
  • Introduction to PySpark

Tuition Fees

The most likely applicable fee is shown based on your nationality.

  • International (non-residents): Free
  • Out-of-State: Free
  • Domestic (in-state): Free

Additional Details

This course can be accessed for free with a Data Camp Premium or Teams subscription.
