
Big Data Dictionary

Big Table/HBase

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size (petabytes of data) across thousands of commodity servers. HBase (the Hadoop Database) is an open source project that faithfully implements a clone of the Bigtable storage architecture (other systems built on Bigtable's ideas include Cassandra, Accumulo, and Hypertable). The initial prototype of HBase was released in February 2007. HBase does not support a full relational data model. However, it provides clients with a simple data model that supports dynamic control over data layout and format. In particular, an HBase table is a sparse, distributed, persistent, multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp. Each value in the map is an uninterpreted array of bytes. Thus, clients usually need to serialize various forms of structured and semi-structured data into these strings. A concrete example that reflects some of the main design decisions of HBase is the scenario of storing a copy of a large collection of web pages in a single table.
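To make the data model concrete, the following Java sketch mimics the logical structure of an HBase table as nested sorted maps. The class and method names are purely illustrative and are not part of the HBase API; the sketch only shows the shape of the map described above.

import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative sketch of the logical Bigtable/HBase data model: a sparse,
// sorted, multidimensional map of row key -> column key -> timestamp -> value,
// where every value is an uninterpreted byte array.
public class LogicalTableSketch {
    // Rows are kept in lexicographic order by row key. Within a cell, versions
    // are kept in decreasing timestamp order so the newest one is read first.
    private final NavigableMap<String,                       // row key
            NavigableMap<String,                              // column key ("family:qualifier")
            NavigableMap<Long, byte[]>>> rows = new TreeMap<>();

    public void put(String row, String column, long timestamp, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    // Returns the most recent version of a cell, or null if the cell is absent.
    public byte[] getLatest(String row, String column) {
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = rows.get(row);
        if (columns == null) return null;
        NavigableMap<Long, byte[]> versions = columns.get(column);
        return (versions == null || versions.isEmpty()) ? null : versions.firstEntry().getValue();
    }
}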


As an example, consider an HBase table where URLs are used as row keys and various aspects of web pages serve as column names. The contents of the web pages are stored in a single column that keeps multiple versions of each page under the timestamps at which they were fetched. The row keys in a table are arbitrary strings, and every read or write of data under a single row key is atomic. HBase maintains the data in lexicographic order by row key, and the row range for a table is dynamically partitioned. Each row range is called a tablet (a region, in HBase terminology), which represents the unit of distribution and load balancing. Thus, reads of short row ranges are efficient and typically require communication with only a small number of machines. HBase can have an unbounded number of columns, which are grouped into sets called column families. These column families represent the basic unit of access control. Each cell in an HBase table can contain multiple versions of the same data, indexed by their timestamps. Each client can flexibly decide how many versions of a cell should be kept. These versions are stored in decreasing timestamp order so that the most recent version can always be read first.
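A minimal sketch of that scenario, written against the HBase 2.x Java client API; the table name "webtable" and the column family "contents" simply follow the example above, and the exact builder calls may differ slightly across client versions.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class WebTableExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {

            // Create the table with one column family that keeps the 3 most recent versions.
            TableName name = TableName.valueOf("webtable");
            ColumnFamilyDescriptor contents = ColumnFamilyDescriptorBuilder
                    .newBuilder(Bytes.toBytes("contents"))
                    .setMaxVersions(3)
                    .build();
            admin.createTable(TableDescriptorBuilder.newBuilder(name)
                    .setColumnFamily(contents)
                    .build());

            // Store one fetched page; the row key is the (reversed) URL and the
            // fetch time is used as the cell's timestamp/version.
            try (Table table = conn.getTable(name)) {
                Put put = new Put(Bytes.toBytes("com.example.www/index.html"));
                put.addColumn(Bytes.toBytes("contents"), Bytes.toBytes("html"),
                        System.currentTimeMillis(),
                        Bytes.toBytes("<html>...</html>"));
                table.put(put);
            }
        }
    }
}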

The HBase API provides functions for creating and deleting tables and column families. It also provides functions for changing cluster, table, and column family metadata, such as access control rights. Client applications can write or delete values in HBase tables, look up values from individual rows, or iterate over a subset of the data in a table. At the transaction level, HBase supports only single-row transactions, which can be used to perform atomic read-modify-write sequences on data stored under a single row key (i.e. there are no general transactions across row keys). Because reads of short row ranges require communication with only a few machines, queries can be designed to suit the available network resources. At the physical level, HBase uses the Hadoop distributed file system (HDFS) in place of the Google file system (GFS). It puts updates into memory and periodically writes them out to files on disk.
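The client-side read operations described above might look as follows with the HBase 2.x Java client. The table and column names continue the hypothetical web-table example; Get.readVersions and Scan.withStartRow/withStopRow are the 2.x names for these calls, and the chosen start and stop keys are illustrative.

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class WebTableReads {
    static void read(Connection conn) throws Exception {
        try (Table table = conn.getTable(TableName.valueOf("webtable"))) {
            // Single-row lookup: reads one row key atomically, asking for up to
            // 3 stored versions of each cell.
            Get get = new Get(Bytes.toBytes("com.example.www/index.html"));
            get.readVersions(3);
            Result row = table.get(get);
            System.out.println("cells in row: " + row.size());

            // Row-range scan: because rows are sorted lexicographically, all pages
            // of one domain form a contiguous range served by a few machines.
            Scan scan = new Scan()
                    .withStartRow(Bytes.toBytes("com.example."))
                    .withStopRow(Bytes.toBytes("com.example/"));  // exclusive upper bound
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}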

The basic unit of scalability and load balancing in HBase is called a region. Regions are essentially contiguous ranges of rows stored together. They are dynamically split by the system when they become too large. Alternatively, they may also be merged to reduce their number and the required storage files. Initially there is only one region per table; as data is added, the system monitors it to ensure that it does not exceed a configured maximum size. If it exceeds the limit, the region is split at its middle key into two roughly equal halves. Each region is served by exactly one region server, and each of these servers can serve many regions at any time. Splitting and serving regions can be thought of as auto-sharding, as offered by other systems. Region servers can be added or removed while the system is up and running. The master is responsible for assigning regions to region servers and uses ZooKeeper, a reliable, highly available, persistent and distributed coordination service, to facilitate that task. In addition, it handles schema changes such as table and column family creations.
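Region boundaries can also be chosen up front. The sketch below uses the HBase 2.x Admin API to create a hypothetical table that starts with four regions and a per-region size limit, after which the automatic splitting described above takes over; the split keys, table name, and size are arbitrary illustrative values.

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
    static void createPreSplit(Admin admin) throws Exception {
        // Three boundary keys yield four initial regions:
        // (-inf,"g"), ["g","n"), ["n","t"), ["t",+inf).
        byte[][] splitKeys = {
                Bytes.toBytes("g"), Bytes.toBytes("n"), Bytes.toBytes("t")
        };
        TableDescriptor desc = TableDescriptorBuilder
                .newBuilder(TableName.valueOf("webtable_presplit"))   // hypothetical table name
                .setColumnFamily(ColumnFamilyDescriptorBuilder.of("contents"))
                .setMaxFileSize(10L * 1024 * 1024 * 1024)             // split a region beyond ~10 GB
                .build();
        admin.createTable(desc, splitKeys);
    }
}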

In HBase, all operations that mutate data are guaranteed to be atomic on a per-row basis. It uses optimistic concurrency control, which aborts an operation if it conflicts with another update. This affects all other concurrent readers and writers of that same row: they either read a consistent last mutation or may have to wait before being able to apply their own change. For data storage and access, HBase provides a Java API, a Thrift API, a REST API, and JDBC/ODBC connectivity.
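Two common single-row atomic read-modify-write patterns, sketched with the Java client. The column qualifiers fetch_count and status are hypothetical; checkAndPut is the older call name, and newer clients expose the same idea through checkAndMutate.

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleRowAtomicity {
    static void mutate(Connection conn) throws Exception {
        try (Table table = conn.getTable(TableName.valueOf("webtable"))) {
            byte[] row = Bytes.toBytes("com.example.www/index.html");

            // Atomic counter: the read-increment-write happens server side on one row.
            table.incrementColumnValue(row, Bytes.toBytes("contents"),
                    Bytes.toBytes("fetch_count"), 1L);

            // Conditional update: the Put is applied only if contents:status still
            // equals "pending" at the moment the mutation is checked.
            Put put = new Put(row);
            put.addColumn(Bytes.toBytes("contents"), Bytes.toBytes("status"),
                    Bytes.toBytes("fetched"));
            boolean applied = table.checkAndPut(row, Bytes.toBytes("contents"),
                    Bytes.toBytes("status"), Bytes.toBytes("pending"), put);
            System.out.println("conditional update applied: " + applied);
        }
    }
}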
