Spark for Python Developers

Nonfiction, Computers, Database Management, Data Processing, Application Software, Business Software, Programming, Programming Languages
Author: Amit Nandi
ISBN: 9781784397371
Publisher: Packt Publishing
Publication: September 8, 2016
Imprint: Packt Publishing
Language: English

A concise guide for Python developers to implementing Spark Big Data analytics and building a real-time, insightful, data-intensive trend tracker app.

About This Book

  • Set up real-time streaming and batch data-intensive infrastructure using Spark and Python
  • Deliver insightful visualizations in a web app using Spark (PySpark)
  • Inject live data using Spark Streaming with real-time events

Who This Book Is For

This book is for data scientists and software developers with a focus on Python who want to work with the Spark engine. It will also benefit enterprise architects. All you need is a good background in Python and an inclination to work with Spark.

What You Will Learn

  • Create a Python development environment powered by Spark (PySpark), Blaze, and Bokeh
  • Build a real-time, data-intensive trend tracker app
  • Visualize the trends and insights gained from data using Bokeh
  • Generate insights from data using machine learning through Spark MLlib
  • Juggle data using Blaze
  • Create training datasets and train the machine learning models
  • Test the machine learning models on test datasets
  • Deploy the machine learning algorithms and models and scale them for real-time events

In Detail

Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer: an open source, fast, general-purpose cluster computing system. Spark's multi-stage in-memory primitives provide performance up to 100 times faster than Hadoop MapReduce for certain applications, and it is also well suited to machine learning algorithms.

Are you a Python developer inclined to work with the Spark engine? If so, this book will be your companion as you create a data-intensive app using Spark as a processing engine, Python visualization libraries, and web frameworks such as Flask.

To begin with, you will learn the most effective way to install the Python development environment powered by Spark, Blaze, and Bokeh. You will then find out how to connect with data stores such as MySQL, MongoDB, Cassandra, and Hadoop.

You'll expand your skills throughout, becoming familiar with the various data sources (GitHub, Twitter, Meetup, and blogs), their data structures, and solutions to effectively tackle their complexities. You'll explore datasets using IPython Notebook and discover how to optimize the data models and pipeline. Finally, you'll learn how to create training datasets and train the machine learning models.

By the end of the book, you will have created a real-time, insightful, data-intensive trend tracker app with Spark.

Style and approach

This is a comprehensive guide packed with easy-to-follow examples that will take your skills to the next level and will get you up and running with Spark.

