Data-Intensive Text Processing with MapReduce

Nonfiction, Computers, Advanced Computing, Natural Language Processing, Artificial Intelligence, Reference & Language, Language Arts, Linguistics
Author: Jimmy Lin, Chris Dyer ISBN: 9781608453436
Publisher: Morgan & Claypool Publishers Publication: October 10, 2010
Imprint: Morgan & Claypool Publishers Language: English

Our world is being revolutionized by data-driven methods: access to large amounts of data has generated new insights and opened exciting new opportunities in commerce, science, and computing applications. Processing the enormous quantities of data necessary for these advances requires large clusters, making distributed computing paradigms more crucial than ever. MapReduce is a programming model for expressing distributed computations on massive datasets and an execution framework for large-scale data processing on clusters of commodity servers. The programming model provides an easy-to-understand abstraction for designing scalable algorithms, while the execution framework transparently handles many system-level details, ranging from scheduling to synchronization to fault tolerance. This book focuses on MapReduce algorithm design, with an emphasis on text processing algorithms common in natural language processing, information retrieval, and machine learning. We introduce the notion of MapReduce design patterns, which represent general reusable solutions to commonly occurring problems across a variety of problem domains. This book aims to help the reader "think in MapReduce" and also discusses the limitations of the programming model.

Table of Contents: Introduction / MapReduce Basics / MapReduce Algorithm Design / Inverted Indexing for Text Retrieval / Graph Algorithms / EM Algorithms for Text Processing / Closing Remarks
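
For readers unfamiliar with the programming model the description refers to, here is a minimal single-machine sketch of the classic word-count example in Python. It is not taken from the book: the names map_fn, reduce_fn, and run_mapreduce are hypothetical, and the in-memory grouping step only stands in for the shuffle that a real execution framework would perform across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Mapper (hypothetical name): emit (word, 1) for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reducer (hypothetical name): sum all partial counts for one word."""
    return word, sum(counts)


def run_mapreduce(documents: Dict[str, str]) -> Dict[str, int]:
    """Toy driver: map every document, group values by key, then reduce.

    A real framework distributes these phases over many machines and
    handles scheduling and fault tolerance; this loop only mimics the
    data flow on one machine.
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in map_fn(doc_id, text):
            groups[key].append(value)  # stand-in for the shuffle phase
    return dict(reduce_fn(key, values) for key, values in groups.items())


docs = {"d1": "a rose is a rose", "d2": "a daisy is not a rose"}
counts = run_mapreduce(docs)
print(counts)
```

The point of the sketch is that the programmer writes only the mapper and the reducer; everything inside run_mapreduce is what the execution framework would otherwise take care of.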

