Book Detail: APACHE HADOOP YARN

Book Title: 
APACHE HADOOP YARN
Resource Category: 
Publisher: 
Publication Year: 
2014
Number of Pages: 
400
ISBN: 
13:978-0-321-93450-5
10:0-321-93450-4
Language: 
English
Edition: 
Second
WishList: 
yes
Available at Shelf: 
No
Description: 

Moving beyond MapReduce and Batch Processing

Table of Contents (Summary): 
  1. Apache Hadoop YARN: A Brief History and Rationale

  2. Apache Hadoop YARN Install Quick Start

  3. Apache Hadoop YARN Core Concepts

  4. Functional Overview of YARN Components

  5. Installing Apache Hadoop YARN

  6. Apache Hadoop YARN Administration 

  7. Apache Hadoop YARN Architecture Guide

  8. Capacity Scheduler in YARN 

  9. MapReduce with Apache Hadoop YARN

  10. Apache Hadoop YARN Application Example

  11. Using Apache Hadoop YARN Distributed-Shell

  12. Apache Hadoop YARN Frameworks

Table of Contents (Expanded): 
  1. Apache Hadoop YARN: A Brief History and Rationale

    • Phase 0: The Era of Ad Hoc Clusters

    • Phase 1: Hadoop on Demand, HDFS in the HOD World, Features and Advantages of HOD, Shortcomings of Hadoop on Demand

    • Phase 2: Dawn of the Shared Compute Clusters, Evolution of Shared Clusters, Issues with Shared MapReduce Clusters

    • Phase 3: Emergence of YARN

  2. Apache Hadoop YARN Install Quick Start

    • Steps to Configure a Single-Node YARN Cluster

      • Step 1: Download Apache Hadoop     

      • Step 2: Set JAVA_HOME    

      • Step 3: Create Users and Groups       

      • Step 4: Make Data and Log Directories    

      • Step 5: Configure core-site.xml        

      • Step 6: Configure hdfs-site.xml     

      • Step 7: Configure mapred-site.xml  

      • Step 8: Configure yarn-site.xml   

      • Step 9: Modify Java Heap Sizes

      • Step 10: Format HDFS 

      • Step 11: Start the HDFS Services   

      • Step 12: Start YARN Services   

      • Step 13: Verify the Running Services Using the

    • Run Sample MapReduce Examples

    • Wrap Up 

  3. Apache Hadoop YARN Core Concepts

    • Beyond MapReduce 

      • The MapReduce Paradigm

    • Apache Hadoop MapReduce

      • The Need for Non-MapReduce Workloads

      • Addressing Scalability

      • Improved Utilization

      • User Agility  

    • Apache Hadoop YARN

    • YARN Components

      • ResourceManager

      • ApplicationMaster

      • Resource Model 

      • ResourceRequests and Containers

      • Container Specification

  4. Functional Overview of YARN Components

    • Architecture Overview

    • Resource Manager

    • YARN Scheduling Components

      • FIFO Scheduler

      • Capacity Scheduler

      • Fair Scheduler

    • Containers  

    • NodeManager 

    • ApplicationMaster

    • YARN Resource Model

      • Client Resource Request

      • ApplicationMaster Container Allocation 

      • ApplicationMaster–Container Manager Communication

    • Managing Application Dependencies 

      • LocalResources Definitions

      • LocalResource Timestamps

      • LocalResource Types

      • LocalResource Visibilities 

      • Lifetime of LocalResources  

  5. Installing Apache Hadoop YARN

    • The Basics

    • System Preparation 

      • Step 1: Install EPEL and pdsh

      • Step 2: Generate and Distribute ssh Keys

    • Script-based Installation of Hadoop 2

      • JDK Options

      • Step 1: Download and Extract the Scripts 

      • Step 2: Set the Script Variables

      • Step 3: Provide Node Names

      • Step 4: Run the Script

      • Step 5: Verify the Installation

    • Script-based Uninstall  

    • Configuration File Processing

    • Configuration File Settings

      • core-site.xml 

      • hdfs-site.xml 

      • mapred-site.xml

      • yarn-site.xml 

    • Start-up Scripts

    • Installing Hadoop with Apache Ambari   

      • Performing an Ambari-based Hadoop Installation

      • Step 1: Check Requirements  

      • Step 2: Install the Ambari Server

      • Step 3: Install and Start Ambari Agents

      • Step 4: Start the Ambari Server

      • Step 5: Install an HDP2.X Cluster  

  6. Apache Hadoop YARN Administration 

    • Script-based Configuration

    • Monitoring Cluster Health: Nagios, Monitoring Basic Hadoop Services, Monitoring the JVM 

    • Real-time Monitoring: Ganglia

    • Administration with Ambari

    • JVM Analysis

    • Basic YARN Administration

      • YARN Administrative Tools  

      • Adding and Decommissioning YARN Nodes

      • Capacity Scheduler Configuration

      • YARN WebProxy 

      • Using the JobHistoryServer

      • Refreshing User-to-Groups Mappings

      • Refreshing Superuser Proxy Groups Mappings

      • Refreshing ACLs for Administration of ResourceManager

      • Reloading the Service-level Authorization Policy File

      • Managing YARN Jobs

      • Setting Container Memory

      • Setting Container Cores

      • Setting MapReduce Properties

      • User Log Management

  7. Apache Hadoop YARN Architecture Guide

    • Overview

    • Resource Manager

      • Overview of the ResourceManager Components

      • Client Interaction with the ResourceManager    

      • Application Interaction with the ResourceManager   

      • Interaction of Nodes with the ResourceManager   

      • Core ResourceManager Components 

      • Security-related Components in the ResourceManager 

    • Node Manager

      • Overview of the NodeManager Components

      • NodeManager Components 

      • NodeManager Security Components 

      • Important NodeManager Functions

    • ApplicationMaster  

      • Overview

      • Liveliness  

      • Resource Requirements  

      • Scheduling  

      • Scheduling Protocol and Locality

      • Launching Containers  

      • Completed Containers

      • ApplicationMaster Failures and Recovery

      • Coordination and Output Commit

      • Information for Clients

      • Security 

      • Cleanup on ApplicationMaster Exit  

    • YARN Containers

      • Container Environment

      • Communication with the ApplicationMaster

    • Summary for Application-writers

  8. Capacity Scheduler in YARN 

    • Introduction to the Capacity Scheduler  

      • Elasticity with Multitenancy

      • Security 

      • Resource Awareness 

      • Granular Scheduling  

      • Locality 

      • Scheduling Policies 

    • Capacity Scheduler Configuration 

    • Queues 

    • Hierarchical Queues   

      • Key Characteristics 

      • Scheduling Among Queues

      • Defining Hierarchical Queues 

    • Queue Access Control 

    • Capacity Management with Queues

    • User Limits  

    • Reservations

    • State of the Queues

    • Limits on Applications   

    • User Interface

  9. MapReduce with Apache Hadoop YARN

    • Running Hadoop YARN MapReduce Examples 

      • Listing Available Examples

      • Running the Pi Example

      • Using the Web GUI to Monitor Examples

      • Running the Terasort Test

      • Run the TestDFSIO Benchmark  

    • MapReduce Compatibility  

    • The MapReduce ApplicationMaster 

      • Enabling Application Master Restarts

      • Enabling Recovery of Completed Tasks 

      • The JobHistory Server  

    • Calculating the Capacity of a Node

    • Changes to the Shuffle Service

    • Running Existing Hadoop Version 1 Applications      

      • Binary Compatibility of org.apache.hadoop.mapred APIs     

      • Source Compatibility of org.apache.hadoop.mapreduce APIs   

      • Compatibility of Command-line Scripts 

      • Compatibility Tradeoff Between MRv1 and Early MRv2 (0.23.x) Applications 

    • Running MapReduce Version 1 Existing Code 

      • Running Apache Pig Scripts on YARN 

      • Running Apache Hive Queries on YARN

      • Running Apache Oozie Workflows on YARN 

    • Advanced Features

      • Uber Jobs  

      • Pluggable Shuffle and Sort   

  10. Apache Hadoop YARN Application Example

    • The YARN Client 

    • The ApplicationMaster  

  11. Using Apache Hadoop YARN Distributed-Shell

    • Using the YARN Distributed-Shell 

      • A Simple Example

      • Using More Containers 

      • Distributed-Shell Examples with Shell Arguments  

    • Internals of the Distributed-Shell 

      • Application Constants

      • Client 

      • ApplicationMaster  

      • Final Containers 

  12. Apache Hadoop YARN Frameworks

    • Distributed-Shell 

    • Hadoop MapReduce  

    • Apache Tez

    • Apache Giraph 

    • Hoya: HBase on YARN

    • Dryad on YARN 

    • Apache Spark 

    • Apache Storm 

    • REEF: Retainable Evaluator Execution Framework     

    • Hamster: Hadoop and MPI on the Same Cluster 

A - Supplemental Content and Code Downloads 

B - YARN Installation Scripts

C - YARN Administration Scripts 

D - Nagios Modules 

E - Resources and Additional Information 

F - HDFS Quick Reference
