Apache Hadoop YARN is a resource management layer that was introduced in Hadoop 2.x. Here are a few of the best Apache Hadoop YARN books to help you master it.
YARN stands for “Yet Another Resource Negotiator” and was a key feature introduced in Hadoop 2. It is often described as a large-scale, distributed operating system for big data applications. Essentially, it is a cluster management technology.
In Hadoop 1, both application management (the data processing part) and resource management were handled by MapReduce. In Hadoop 2.x, resource management is handled by YARN.
If you compare the overall architecture of Hadoop 1 and Hadoop 2, the difference is easy to see: in Hadoop 2, YARN sits between HDFS and the processing frameworks as a separate resource management layer.
Here, resource management means managing the resources of the Hadoop cluster, such as memory and CPU.
Here are a few advantages of YARN:
- Efficient Utilization of Resources: Multiple applications running in Hadoop can now share a common pool of resources. There are no fixed slots.
- Run applications that don’t follow the MapReduce model: Because YARN separates application management from cluster resource management, it can support multiple and varied data processing frameworks.
- No more JobTracker and TaskTracker: With YARN, Hadoop no longer needs the JobTracker and TaskTracker. YARN splits the two major responsibilities of the JobTracker, resource management and job scheduling/monitoring, into the following units:
1. Resource Manager: the central manager of cluster resources
2. Node Manager: node-specific work
3. Application Master: per-application job scheduling and monitoring
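The division of labor described above can be sketched as a toy model. This is not the YARN API: the names `Node` and `ToyResourceManager`, the first-fit placement, and the numbers are all assumptions made for illustration. The point is only that a central manager tracks free memory and CPU per node and grants containers of whatever size an application requests, instead of handing out fixed slots.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for the capacity one Node Manager reports."""
    name: str
    free_mb: int
    free_vcores: int


@dataclass
class ToyResourceManager:
    """Hypothetical central manager using first-fit placement.

    This is an illustrative assumption, not YARN's real scheduler.
    """
    nodes: list
    granted: list = field(default_factory=list)

    def allocate(self, app, mb, vcores):
        """Grant one container of the requested size, or None if nothing fits."""
        for node in self.nodes:
            if node.free_mb >= mb and node.free_vcores >= vcores:
                node.free_mb -= mb
                node.free_vcores -= vcores
                self.granted.append((app, node.name, mb, vcores))
                return node.name
        return None


rm = ToyResourceManager([Node("node1", 8192, 4), Node("node2", 4096, 2)])

# Two different kinds of application share one pool and ask for
# differently sized containers; there are no fixed slots.
placements = [
    rm.allocate("mapreduce-job", 2048, 1),
    rm.allocate("spark-job", 6144, 3),
    rm.allocate("spark-job", 4096, 2),
    rm.allocate("mapreduce-job", 1024, 1),
]
print(placements)  # ['node1', 'node1', 'node2', None]
```

The last request returns `None` because both nodes are fully used by then. A real cluster would queue the request until resources are freed.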
I hope you now have an idea of what YARN is and how it was introduced in Hadoop 2. Let’s look at some of the best Apache Hadoop YARN books to master YARN. If you are looking for general Hadoop books, you can check our list of Hadoop books for beginners.
3 Best Apache Hadoop YARN Books to Gain Expertise in YARN
Here is a compiled list of Apache Hadoop YARN books for your reference. Go through these books to build your proficiency in YARN.
1. Apache Hadoop YARN
- Book Name: Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2
- Pages: 400
- Authors: Arun Murthy, Vinod Vavilapalli, Douglas Eadline, Joseph Niemiec, and Jeff Markham
- Publication: Addison-Wesley Professional
- Price: Kindle ($3.94); Paperback ($31.99)
Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2 is a complete Apache Hadoop YARN book, with the examples you will need to master YARN.
It covers what administrators, developers, and power users of the Hadoop YARN framework need.
On Amazon, it has an average rating of 3.0 out of 5 based on reviews from 16 users. The authors are experienced and have published several other books as well.
Overall, this Apache Hadoop YARN book is designed to cover YARN and Hadoop 2 from the basics to advanced topics.
2. Hadoop 2 Quick-Start Guide
- Book Name: Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem
- Pages: 304
- Author: Douglas Eadline
- Publication: Addison-Wesley Professional
- Price: Kindle ($3.79); Paperback ($17.48-$24.46)
This is the highest-rated Apache Hadoop YARN book I have found on Amazon so far. Although it has been reviewed by just six buyers, it has an average rating of 4.8 out of 5, which makes it easy to recommend.
Douglas Eadline is a well-known writer in data science, and he has crafted this book very well.
It is an accessible starter guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Here are some of the topics this book covers:
- Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce
- Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses
- Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters
- Exploring the Hadoop Distributed File System (HDFS)
- Understanding the essentials of MapReduce and YARN application programming
- Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase
- Observing application progress, controlling jobs and managing workflows
- Managing Hadoop efficiently with Apache Ambari–including recipes for HDFS to NFSv3 gateway, HDFS snapshots, and YARN configuration
- Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark
3. YARN Essentials
- Book Name: YARN Essentials
- Pages: 185
- Authors: Amol Fasale and Nirmal Kumar
- Publication: Packt Publishing
- Price: Kindle ($5.30); Paperback ($29.99)
YARN Essentials is a comprehensive, hands-on guide to installing, administering, and configuring YARN. You can learn the inner workings of YARN and how its robust and generic framework enables optimal resource utilization across multiple applications.
It works as a beginner's book: a step-by-step, self-learning guide that helps you achieve optimal resource utilization in a cluster. Based on the five reviews it has received on Amazon, it has an average rating of 4.2.
Wrapping it up!
These were some of the best Apache Hadoop YARN books. Go through them to master YARN and Hadoop 2.
I would suggest you start with any one of these books and read it thoroughly to understand resource management.
Buy the Kindle edition from Amazon, as it will cost you less, and share your feedback with us.