Error loading page.
Try refreshing the page. If that doesn't work, there may be a network issue, and you can use our self test page to see what's preventing the page from loading.
Learn more about possible network issues or contact support for more help.

Field Guide to Hadoop

ebook

If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You'll quickly understand how Hadoop's projects, subprojects, and related technologies work together.

Each chapter introduces a different topic—such as core technologies or data transfer—and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you'll have a good grasp of the playing field.

Topics include:

  • Core technologies—Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark
  • Database and data management—Cassandra, HBase, MongoDB, and Hive
  • Serialization—Avro, JSON, and Parquet
  • Management and monitoring—Puppet, Chef, Zookeeper, and Oozie
  • Analytic helpers—Pig, Mahout, and MLLib
  • Data transfer—Scoop, Flume, distcp, and Storm
  • Security, access control, auditing—Sentry, Kerberos, and Knox
  • Cloud computing and virtualization—Serengeti, Docker, and Whirr

  • Expand title description text
    Publisher: O'Reilly Media

    Kindle Book

    • Release date: March 2, 2015

    OverDrive Read

    • ISBN: 9781491947883
    • File size: 18492 KB
    • Release date: March 2, 2015

    EPUB ebook

    • ISBN: 9781491947883
    • File size: 18492 KB
    • Release date: March 2, 2015

    PDF ebook

    • ISBN: 9781491947906
    • File size: 7137 KB
    • Release date: March 2, 2015

    Formats

    Kindle Book
    OverDrive Read
    EPUB ebook
    PDF ebook

    Languages

    English

    If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You'll quickly understand how Hadoop's projects, subprojects, and related technologies work together.

    Each chapter introduces a different topic—such as core technologies or data transfer—and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you'll have a good grasp of the playing field.

    Topics include:

  • Core technologies—Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark
  • Database and data management—Cassandra, HBase, MongoDB, and Hive
  • Serialization—Avro, JSON, and Parquet
  • Management and monitoring—Puppet, Chef, Zookeeper, and Oozie
  • Analytic helpers—Pig, Mahout, and MLLib
  • Data transfer—Scoop, Flume, distcp, and Storm
  • Security, access control, auditing—Sentry, Kerberos, and Knox
  • Cloud computing and virtualization—Serengeti, Docker, and Whirr

  • Expand title description text
    • Details

      Publisher:
      O'Reilly Media

      Kindle Book
      Release date: March 2, 2015

      OverDrive Read
      ISBN: 9781491947883
      File size: 18492 KB
      Release date: March 2, 2015

      EPUB ebook
      ISBN: 9781491947883
      File size: 18492 KB
      Release date: March 2, 2015

      PDF ebook
      ISBN: 9781491947906
      File size: 7137 KB
      Release date: March 2, 2015

    • Creators
    • Formats
      Kindle Book
      OverDrive Read
      EPUB ebook
      PDF ebook
    • Languages
      English