PySpark for Data Science – Intermediate

You will learn how to use PySpark, the Python API for Apache Spark, to perform data analysis.
What you’ll learn
- This module of the PySpark Tutorials explains intermediate concepts such as the use of SparkSession in later Spark versions (2.x and above) and the use of SparkConf and SparkContext in earlier versions.
Requirements
- The prerequisites for these PySpark Tutorials are modest: solid hands-on experience in a language such as Java, Python, or Scala; a development background; and a sound grasp of fundamental big data concepts and the Hadoop ecosystem, with which Spark is closely integrated.
Description
This module of the PySpark Tutorials explains intermediate concepts such as the use of SparkSession in later Spark versions (2.x and above) and the use of SparkConf and SparkContext in earlier versions; minimal code sketches of these entry points and of the regression and classification workflows follow the topic list below. We will learn the following in this course:
- Regression
- Linear Regression
- Output Column
- Test Data
- Prediction
- Generalized Linear Regression
- Random Forest Regression
- Classification
- Binomial Logistic Regression
- Multinomial Logistic Regression
- Decision Tree
- Random Forest
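To make the version split concrete, here is a minimal sketch of both entry points. The application name and the local master URL are illustrative choices, not part of the course material.

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

# Spark 2.x and later: SparkSession is the unified entry point.
# The app name and local[*] master are illustrative.
spark = SparkSession.builder \
    .appName("IntermediatePySpark") \
    .master("local[*]") \
    .getOrCreate()

# Earlier versions: build a SparkConf and pass it to a SparkContext.
conf = SparkConf().setAppName("IntermediatePySpark").setMaster("local[*]")
sc = SparkContext.getOrCreate(conf)
```

In 2.x and later the session wraps a SparkContext, which remains available as spark.sparkContext, so the older API is still reachable from the newer entry point.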
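The regression topics above follow one common workflow: assemble features, hold out test data, fit, and read predictions from the output column. A minimal sketch under assumed toy data and illustrative column names (x1, x2, label), reusing the spark session from the previous sketch:

```python
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import (
    LinearRegression,
    GeneralizedLinearRegression,
    RandomForestRegressor,
)

# Toy data; the column names (x1, x2, label) are illustrative.
df = spark.createDataFrame(
    [(1.0, 2.0, 3.5), (2.0, 1.0, 4.0), (3.0, 4.0, 9.5),
     (4.0, 3.0, 10.0), (5.0, 5.0, 14.5), (6.0, 2.0, 12.0)],
    ["x1", "x2", "label"],
)

# Spark ML estimators expect features in a single vector column.
data = VectorAssembler(inputCols=["x1", "x2"],
                       outputCol="features").transform(df)

# Hold out test data with a random split.
train, test = data.randomSplit([0.8, 0.2], seed=42)

# predictionCol names the output column that will hold the predictions.
lr = LinearRegression(featuresCol="features", labelCol="label",
                      predictionCol="prediction")
model = lr.fit(train)
model.transform(test).select("label", "prediction").show()

# Generalized linear and random forest regression share the same interface.
glr = GeneralizedLinearRegression(family="gaussian", link="identity",
                                  featuresCol="features", labelCol="label")
rf = RandomForestRegressor(featuresCol="features", labelCol="label",
                           numTrees=20)
```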
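The classification topics share the same fit/transform interface. A short sketch, again with hypothetical toy data, showing the binomial versus multinomial choice on LogisticRegression and the tree-based classifiers:

```python
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import (
    LogisticRegression,
    DecisionTreeClassifier,
    RandomForestClassifier,
)

# Toy binary-labelled data; the column names are illustrative.
df = spark.createDataFrame(
    [(0.0, 1.1, 0.0), (1.5, 2.3, 0.0), (2.8, 3.0, 0.0),
     (3.2, 0.5, 1.0), (4.1, 1.8, 1.0), (5.0, 2.2, 1.0)],
    ["x1", "x2", "label"],
)
data = VectorAssembler(inputCols=["x1", "x2"],
                       outputCol="features").transform(df)

# family="binomial" fits two classes; "multinomial" generalizes to many.
log_reg = LogisticRegression(featuresCol="features", labelCol="label",
                             family="binomial")

# Decision trees and random forests use the same fit/transform pattern.
tree = DecisionTreeClassifier(featuresCol="features", labelCol="label")
forest = RandomForestClassifier(featuresCol="features", labelCol="label",
                                numTrees=20)

for estimator in (log_reg, tree, forest):
    model = estimator.fit(data)
    model.transform(data).select("label", "prediction").show()
```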
Who this course is for:
- The target audience for these PySpark Tutorials includes developers, software engineers, consultants, data engineers, data scientists, data analysts, and big data programmers.