Introduction to Spark SQL in Python, Short Course | Part time online | Data Camp | United States

Introduction to Spark SQL in Python

Duration: 1 day
Tuition fee: Free
Apply date: Anytime
Start date: Anytime

About

The Introduction to Spark SQL in Python course at Data Camp uses a natural language text dataset that is easy to understand. Sentences are sequences of words, and window functions are well suited to manipulating sequence data.

Overview

Context of the Introduction to Spark SQL in Python course at Data Camp

If you're familiar with SQL and have heard great things about Apache Spark, this course is for you. Apache Spark is a computing framework for processing big data. Spark SQL is a component of Apache Spark that works with tabular data. Window functions are an advanced feature of SQL that let each row be computed in the context of its neighbouring rows, which makes Spark far more useful for ordered data.
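As a rough illustration of what a window function does with sequence data, the sketch below pairs each word with its previous and next word in the same sentence. It is not taken from the course: the table name `words`, its columns, and the toy sentences are hypothetical, and SQLite (from Python's standard library, version 3.25 or later) stands in for Spark so that the example runs without a cluster. The query text uses only standard `LAG`/`LEAD ... OVER (PARTITION BY ... ORDER BY ...)` syntax, which Spark SQL should also accept.

```python
import sqlite3

# Hypothetical toy data: one row per word, as (sentence id, position, word).
rows = [
    (1, 1, "the"), (1, 2, "quick"), (1, 3, "fox"),
    (2, 1, "a"), (2, 2, "lazy"), (2, 3, "dog"),
]

# SQLite is used here only as a stand-in engine for Spark SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (sent_id INTEGER, pos INTEGER, word TEXT)")
conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)

# LAG and LEAD look one row back and one row ahead within each sentence.
query = """
SELECT sent_id, pos,
       LAG(word)  OVER (PARTITION BY sent_id ORDER BY pos) AS prev_word,
       word,
       LEAD(word) OVER (PARTITION BY sent_id ORDER BY pos) AS next_word
FROM words
ORDER BY sent_id, pos
"""
context = conn.execute(query).fetchall()

for row in context:
    print(row)
```

The first and last word of each sentence get `None` for the missing neighbour, because the window is partitioned by sentence.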

You will use Spark SQL to analyze time series. You will extract the most common sequences of words from a text document. You will create feature sets from natural language text and use them to predict the last word in a sentence using logistic regression. Spark combines the power of distributed computing with the ease of use of Python and SQL.
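One way to picture "extracting the most common sequences of words" is to build word pairs with a window function and then count them. The sketch below is an assumption about how such a query might look, not the course's own solution: the corpus, the `words` table, and the column names are made up, and SQLite again stands in for Spark SQL so the code is runnable locally.

```python
import sqlite3

# Hypothetical toy corpus, exploded into (sentence id, position, word) rows.
sentences = ["the cat sat", "the cat ran", "a dog sat"]
rows = [
    (sent_id, pos, word)
    for sent_id, sentence in enumerate(sentences, start=1)
    for pos, word in enumerate(sentence.split(), start=1)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (sent_id INTEGER, pos INTEGER, word TEXT)")
conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)

# Inner query: pair each word with the word that follows it in the sentence.
# Outer query: count how often each pair occurs, most frequent first.
query = """
SELECT w1, w2, COUNT(*) AS n
FROM (
    SELECT word AS w1,
           LEAD(word) OVER (PARTITION BY sent_id ORDER BY pos) AS w2
    FROM words
) AS pairs
WHERE w2 IS NOT NULL
GROUP BY w1, w2
ORDER BY n DESC, w1, w2
"""
bigram_counts = conn.execute(query).fetchall()

for w1, w2, n in bigram_counts:
    print(w1, w2, n)
```

Longer sequences follow the same pattern by adding `LEAD(word, 2)`, `LEAD(word, 3)`, and so on as extra columns.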

The same techniques taught here can be applied to sequences of song identifiers, video IDs, or podcast IDs. Exercises include discovering frequent word sequences and converting word sequences into feature sets for training a text classifier.
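To make "converting word sequences into feature sets" concrete, here is a minimal plain-Python sketch. It assumes a simple bag-of-words encoding of the words before the last one, with the last word as the label; the course itself works in Spark and may encode features differently. The function name `to_example` and the sentences are hypothetical.

```python
from collections import Counter

# Hypothetical toy sentences; in practice these could equally be
# sequences of song, video, or podcast identifiers.
sentences = ["the cat sat", "the cat ran", "a dog sat"]

def to_example(sentence):
    """Split a sentence into (word counts of the context, last word)."""
    *context, last_word = sentence.split()
    return dict(Counter(context)), last_word

# Each example is a (features, label) pair for a classifier.
examples = [to_example(s) for s in sentences]

for features, label in examples:
    print(features, "->", label)
```

Pairs of this shape are what a classifier such as logistic regression would be trained on to predict the last word.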

Programme Structure

Chapters

  • PySpark SQL
  • Using window function SQL for natural language processing
  • Caching, Logging, and the Spark UI
  • Text classification

Key information

Duration

  • Part-time
    • 1 day

Start dates & application deadlines

You can apply for and start this programme anytime.

Language

English

Delivered

Online

Academic requirements

We are not aware of any specific GRE, GMAT or GPA grading score requirements for this programme.

English requirements

We are not aware of any English requirements for this programme.

Other requirements

General requirements

PREREQUISITES

  • Introduction to PySpark
  • Intermediate SQL
  • Python Data Science Toolbox (Part 2)

Tuition Fee

  • International: Free (based on a tuition of 0 USD for the full programme of 1 day)
  • National: Free (based on a tuition of 0 USD for the full programme of 1 day)

Basic Access: Free; Premium (for individuals): $12.42 per month billed annually; Teams: $25 per month billed annually; Enterprise: Contact sales for pricing
