Scala and Spark for Big Data Analytics

Key Features

- Learn Scala’s sophisticated type system, which combines functional programming and object-oriented concepts
- Work on a wide array of applications, from simple batch jobs to stream processing and machine learning
- Explore the most common use cases, as well as some complex ones, for large-scale data analysis with Spark

Book Description

Scala has seen a steady rise in adoption over the past few years, especially in data science and analytics. Going hand in hand with Scala is Apache Spark, which is built on Scala and is widely used in analytics. If you want to leverage the power of both Scala and Spark to make sense of Big Data, then this book is for you.

This book is divided into three parts. The first part introduces you to Scala programming, helping you understand its fundamentals so that you can program with Spark. The second part introduces Spark and the design choices behind it, and shows you how to perform data analysis with it. Finally, to shake things up, the book moves on to advanced Spark topics, such as monitoring, configuration, debugging, testing, and deployment.

By the end of this book, you will be able to perform full-stack data analysis with Spark and feel that no amount of data is too big.

What you will learn

- Understand the basics of Scala and explore functional programming.
- Get familiar with the Collections API, one of the most prominent features of the standard library.
- Work with RDDs, the basic abstractions behind Apache Spark.
- Use Spark for the analysis of structured and unstructured data and work with SparkSQL’s APIs.
- Take advantage of Spark for the analysis of streaming data and explore interoperability with streaming software, such as Apache Kafka.
- Use common machine learning techniques, such as dimensionality reduction and one-hot encoding, and build a predictive model using Spark.
- Use Bayesian inference to build another kind of classification model and understand when the Decision Tree algorithm should be used.
- Build a clustering model and use it to make predictions.
- Tune your application and use Spark Testing Base.
- Deploy a full Spark application on a cluster using Mesos.

Author: Stefano Baghino
