Book Detail: Big Data Made Easy

Book Title: 
Big Data Made Easy
Resource Category: 
Publisher: 
Publication Year: 
2014
Number of Pages: 
392
ISBN: 
978-1-4842-0095-7
978-1-4842-0094-0
Language: 
English
WishList: 
yes
Available at Shelf: 
No
Description: 

A Working Guide to the Complete Hadoop Toolset

Table of Contents (Summary): 
  1. The Problem with Data

  2. Storing and Configuring Data with Hadoop, YARN, and ZooKeeper

  3. Collecting Data with Nutch and Solr

  4. Processing Data with Map Reduce

  5. Scheduling and Workflow

  6. Moving Data

  7. Monitoring Data

  8. Cluster Management 

  9. Analytics with Hadoop

  10. ETL with Hadoop

  11. Reporting with Hadoop

Table of Contents (Expanded): 
  1. The Problem with Data

    • A Definition of “Big Data” 

    • The Potentials and Difficulties of Big Data

      • Requirements for a Big Data System

      • How Hadoop Tools Can Help

      • My Approach

    • Overview of the Big Data System

      • Big Data Flow and Storage

      • Benefits of Big Data Systems

    • What’s in This Book

      • Storage: Chapter 2

      • Data Collection: Chapter 3

      • Processing: Chapter 4

      • Scheduling: Chapter 5

      • Data Movement: Chapter 6

      • Monitoring: Chapter 7

      • Cluster Management: Chapter 8

      • Analysis: Chapter 9

      • ETL: Chapter 10

      • Reports: Chapter 11

  2. Storing and Configuring Data with Hadoop, YARN, and ZooKeeper

    • An Overview of Hadoop

      • The Hadoop V1 Architecture

      • The Differences in Hadoop V2

      • The Hadoop Stack

      • Environment Management

    • Hadoop V1 Installation

      • Hadoop 1.2.1 Single-Node Installation

      • Setting up the Cluster

      • Running a Map Reduce Job Check

      • Hadoop User Interfaces

    • Hadoop V2 Installation

      • ZooKeeper Installation

      • Hadoop MRv2 and YARN

    • Hadoop Commands 

      • Hadoop Shell Commands

      • Hadoop User Commands

      • Hadoop Administration Commands

  3. Collecting Data with Nutch and Solr

    • The Environment

      • Stopping the Servers

      • Changing the Environment Scripts

      • Starting the Servers

    • Architecture 1: Nutch 1.x

      • Nutch Installation

      • Solr Installation

      • Running Nutch with Hadoop

    • Architecture 2: Nutch 2.x

      • Nutch and Solr Configuration

      • HBase Installation

      • Gora Configuration

      • Running the Nutch Crawl

      • Potential Errors

    • A Brief Comparison

  4. Processing Data with Map Reduce

    • An Overview of the Word-Count Algorithm

    • Map Reduce Native

      • Java Word-Count Example 1

      • Java Word-Count Example 2

      • Comparing the Examples

    • Map Reduce with Pig

      • Installing Pig

      • Running Pig

      • Pig User-Defined Functions

    • Map Reduce with Hive

      • Installing Hive

      • Hive Word-Count Example

    • Map Reduce with Perl

  5. Scheduling and Workflow

    • An Overview of Scheduling

      • The Capacity Scheduler

      • The Fair Scheduler

    • Scheduling in Hadoop V1

      • V1 Capacity Scheduler

      • V1 Fair Scheduler 

    • Scheduling in Hadoop V2

      • V2 Capacity Scheduler

      • V2 Fair Scheduler 

    • Using Oozie for Workflow

      • Installing Oozie  

      • The Mechanics of the Oozie Workflow

      • Creating an Oozie Workflow

      • Running an Oozie Workflow

      • Scheduling an Oozie Workflow 

  6. Moving Data

    • Moving File System Data

      • The Cat Command

      • The CopyFromLocal Command

      • The CopyToLocal Command

      • The Cp Command 

      • The Get Command

      • The Put Command

      • The Mv Command

      • The Tail Command

    • Moving Data with Sqoop

      • Check the Database

      • Install Sqoop 

      • Use Sqoop to Import Data to HDFS

      • Use Sqoop to Import Data to Hive

    • Moving Data with Flume

      • Install Flume

      • A Simple Agent

      • Running the Agent

    • Moving Data with Storm

      • Install ZeroMQ

      • Install JZMQ

      • Install Storm

      • Start and Check Zookeeper

      • Run Storm

      • An Example of Storm Topology

  7. Monitoring Data

    • The Hue Browser

      • Installing Hue

      • Starting Hue

      • Potential Errors

      • Running Hue

    • Ganglia

      • Installing Ganglia

      • Potential Errors

      • The Ganglia Interface

    • Nagios

      • Installing Nagios

      • Potential Errors 

      • The Nagios Interface

  8. Cluster Management 

    • The Ambari Cluster Manager

      • Ambari Installation

    • The Cloudera Cluster Manager

      • Installing Cloudera Cluster Manager

      • Running Cloudera Cluster Manager

    • Apache Bigtop

      • Installing Bigtop 

      • Running Bigtop Smoke Tests

  9. Analytics with Hadoop

    • Cloudera Impala

      • Installation of Impala 

      • Impala User Interfaces

      • Uses of Impala 

    • Apache Hive

      • Database Creation

      • External Table Creation

      • Hive UDFs

      • Table Creation

      • The SELECT Statement 

      • The WHERE Clause

      • The Subquery

      • Table Joins

      • The INSERT Statement

      • Organization of Table Data

    • Apache Spark

      • Installation of Spark

      • Uses of Spark

      • Spark SQL

  10. ETL with Hadoop

    • Pentaho Data Integrator

      • Installing Pentaho 

      • Running the Data Integrator

      • Creating ETL 

      • Potential Errors

    • Talend Open Studio

      • Installing Open Studio for Big Data

      • Running Open Studio for Big Data

      • Creating the ETL

      • Potential Errors 

  11. Reporting with Hadoop

    • Hunk

      • Installing Hunk

      • Running Hunk

      • Creating Reports and Dashboards

      • Potential Errors

    • Talend Reports

      • Installing Talend

      • Running Talend 

      • Generating Reports

      • Potential Errors

    • Summary

Index

Average: 3.1 (173 votes)
