Programming Elastic MapReduce: Using AWS Services to Build an End-to-End Application - Kevin Schmidt, Christopher Phillips
Description
This practical guide demonstrates how to quickly launch data analysis projects in the cloud using Amazon Elastic MapReduce (EMR), the hosted Hadoop framework in Amazon Web Services (AWS). Authors Kevin Schmidt and Christopher Phillips walk you through the construction of a sample MapReduce log analysis application, showcasing best practices for using EMR and various AWS and Apache technologies.
- Get an overview of the AWS and Apache software tools used in large-scale data analysis
- Go through the process of executing a Job Flow with a simple log analyzer
- Discover useful MapReduce patterns for filtering and analyzing data sets
- Use Apache Hive and Pig instead of Java to build a MapReduce Job Flow
- Learn the basics for using Amazon EMR to run machine learning algorithms
- Develop a project cost model for using Amazon EMR and other AWS tools
Published by O'Reilly Media, this guide is aimed at developers and data analysts who want to use EMR and other AWS technologies for efficient data analysis.