If your role includes moving datasets into Hadoop, this book will help you do it more efficiently using Apache Flume. From installation to customization, it's a complete step-by-step guide on making the service work for you.
Overview
In Detail
Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with many failover and recovery mechanisms.
Apache Flume: Distributed Log Collection for Hadoop covers the problems of using HDFS with streaming data and logs, and how Flume can resolve them. The book explains the generalized architecture of Flume, which includes moving data to and from databases and NoSQL-style data stores, as well as optimizing performance. It also includes real-world scenarios of Flume implementation.
Apache Flume: Distributed Log Collection for Hadoop starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete installation process and compilation of Flume.
It introduces channels and channel selectors. For each architectural component (Sources, Channels, Sinks, Channel Processors, Sink Groups, and so on), the various implementations are covered in detail along with their configuration options, so you can customize Flume to your specific needs. The book also gives pointers on writing custom implementations.
What you will learn from this book
Approach
A starter guide that covers Apache Flume in detail.
Who this book is written for
Apache Flume: Distributed Log Collection for Hadoop is intended for people who are responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.
Steve Hoffman has 30 years of software development experience and holds a B.S. in computer engineering from the University of Illinois Urbana-Champaign and an M.S. in computer science from DePaul University. He is currently a Principal Engineer at Orbitz Worldwide. More information on Steve can be found at http://bit.ly/bacoboy or on Twitter @bacoboy. This is Steve's first book.
Seller: WeBuyBooks, Rossendale, LANCS, United Kingdom
Condition: Like New. Most items will be dispatched the same or the next working day. An apparently unread copy in perfect condition. Dust cover is intact with no nicks or tears. Spine has no signs of creasing. Pages are clean and not marred by notes or folds of any kind. Item no. wbs7955394991
Quantity: 1 available
Seller: ThriftBooks-Dallas, Dallas, TX, USA
Paperback. Condition: Good. No Jacket. Pages can have notes/highlighting. Spine may show signs of wear. ~ ThriftBooks: Read More, Spend Less 0.44. Item no. G1782167919I3N00
Quantity: 1 available
Seller: Ria Christie Collections, Uxbridge, United Kingdom
Condition: New. In. Item no. ria9781782167914_new
Quantity: More than 20 available