Careertail
Development
PySpark Essentials for Data Scientists (Big Data + Python)
Price: Paid
Length: 17.5 hours
Content type: video
Level: all levels
Updated: 20 February 2024
Published: 21 August 2022
Rating: 4.5 (4.3k)
Students: 4263
What you will learn
1. Use Python with Big Data on a distributed framework (Apache Spark)
2. Work with REAL datasets on realistic consulting projects
3. Stream LIVE data from Twitter using Spark Structured Streaming
4. Learn how to create a "Pandora-like" app that classifies songs into genres using machine learning
5. Flag suspicious job postings using Natural Language Processing
6. Use machine learning to predict optimal cement strength and the factors that affect it
7. Classify Christmas cooking recipes using Topic Modeling (LDA)
8. Segment customers using Gaussian Mixture Modeling (Clustering)
9. Use cluster analysis to develop a strategy designed to increase college graduation rates for underprivileged populations
10. Use the k-means clustering algorithm to define a marketing outreach strategy
11. Integrate a UI to monitor your model training and development process with MLflow
12. Theory and application of cutting-edge data science algorithms
13. Manipulate, Join and Aggregate DataFrames in Spark with Python
14. Learn how to apply Spark's machine learning techniques on distributed DataFrames
15. Cross Validation & Hyperparameter Tuning
16. Frequent Pattern Mining Techniques
17. Classification & Regression Techniques
18. Data Wrangling for Natural Language Processing
19. How to write SQL Queries in Spark
Target audiences
1. Data Scientists interested in learning PySpark
2. PySpark developers looking to strengthen their coding skills
3. Python developers who need to work with big data
4. Data Scientists who want to learn to work with big data
Requirements
1. Familiarity with Python is helpful but not required
2. Some background in data science is helpful but not required
3. A hunger to LEARN
FAQ
You can view and review the lecture materials indefinitely, like an on-demand channel.
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don't have an internet connection, some instructors also let their students download course lectures. That's up to the instructor though, so make sure you get on their good side!
Description

This course is for data scientists (or aspiring data scientists) who want to get PRACTICAL training in PySpark (Python for Apache Spark) using REAL-WORLD datasets and APPLICABLE coding knowledge that you’ll use every day as a data scientist! By enrolling in this course, you’ll gain access to over 100 lectures, hundreds of example problems and quizzes, and over 100,000 lines of code!

I’m going to provide the essentials you need to know to be an expert in PySpark by the end of this course, which I’ve designed based on my EXTENSIVE experience consulting as a data scientist for clients like the IRS, the US Department of Labor and the United States Department of Veterans Affairs.

I’ve structured the lectures and coding exercises for real-world application, so you can understand how PySpark is actually used on the job. We are also going to dive into custom functions that I wrote MYSELF to get you up and running in the MLlib API fast and make building your first machine learning models a breeze! We will also touch on MLflow, which will help us manage and track our model training and evaluation process in a custom user interface, a skill that will make you even more competitive on the job market!

Each section will have a concept review lecture, code-along activities and structured problem sets for you to work through to help you put what you have learned into action, as well as the solutions to each problem in case you get stuck. Additionally, real-world consulting projects have been provided in every section with AUTHENTIC datasets to help you think through how to apply each of the concepts we have covered.

Lastly, I’ve written up some condensed review notebooks and handouts of all the course content to make it super easy for you to reference later on. This will be super helpful once you land your first job programming in PySpark!

I can’t wait to see you in the lectures! And I really hope you enjoy the course! I’ll see you in the first lecture!

Careertail
Copyright © 2021 Careertail.
All rights reserved