Many data analysis techniques (e.g. the PageRank algorithm, recursive relational queries, social network analysis) require iterative computation: the data are processed repeatedly until the computation satisfies a convergence or stopping condition. The basic MapReduce framework does not directly support such iterative applications. Instead, programmers must implement iterative programs by manually issuing multiple MapReduce jobs and orchestrating their execution with a driver program.
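The driver-program pattern described above can be sketched as follows. This is an illustrative simulation, not Hadoop or HaLoop code: the `map_phase`, `reduce_phase`, and `driver` functions stand in for submitting one MapReduce job per iteration (here, a toy PageRank), with the driver testing convergence after each round.

```python
# Hypothetical sketch of a driver program for iterative MapReduce.
# Each loop iteration simulates submitting one map/reduce job;
# the driver checks a convergence condition between jobs.
from collections import defaultdict

# Toy link graph (loop-invariant input): page -> outgoing links.
GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
DAMPING = 0.85

def map_phase(ranks):
    """Map: each page distributes its current rank over its out-links."""
    contributions = []
    for page, links in GRAPH.items():
        share = ranks[page] / len(links)
        for target in links:
            contributions.append((target, share))
    return contributions

def reduce_phase(contributions):
    """Reduce: sum the contributions per page and apply damping."""
    sums = defaultdict(float)
    for page, share in contributions:
        sums[page] += share
    n = len(GRAPH)
    return {p: (1 - DAMPING) / n + DAMPING * sums[p] for p in GRAPH}

def driver(max_iters=50, tol=1e-6):
    """Driver: 'submit' one job per iteration until ranks converge."""
    ranks = {p: 1.0 / len(GRAPH) for p in GRAPH}
    for _ in range(max_iters):
        new_ranks = reduce_phase(map_phase(ranks))
        if max(abs(new_ranks[p] - ranks[p]) for p in GRAPH) < tol:
            return new_ranks
        ranks = new_ranks
    return ranks
```

Note that in a real Hadoop deployment each iteration is a full job submission, so the loop-invariant graph is re-read from the distributed file system every round; this repeated I/O is precisely the overhead HaLoop targets.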
The HaLoop system is designed to support iterative processing on the MapReduce framework by extending the basic MapReduce framework with two main functionalities:
The above figure illustrates the architecture of HaLoop as a modified version of the basic MapReduce framework. In order to accommodate the requirements of iterative data analysis applications, HaLoop has incorporated the following changes to the basic Hadoop MapReduce framework:
In principle, HaLoop relies on the same file system and the same task queue structure as Hadoop, but the task scheduler and task tracker modules are modified, and the loop control, caching, and indexing modules are newly introduced into the architecture. The task tracker not only manages task execution but also manages the caches and indices on the slave node, and redirects each task's cache and index accesses to the local file system.
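The caching behavior described above can be illustrated with a small sketch. This is not HaLoop's actual API; the `WorkerCache` class and `load_graph` function are hypothetical names that model the idea of a slave node loading loop-invariant data from the distributed file system once, then serving it from the local cache in later iterations.

```python
# Illustrative sketch of HaLoop-style per-worker caching: the
# loop-invariant dataset is loaded on first access and reused
# across iterations instead of being re-read each round.
# All names here are hypothetical, not HaLoop's actual API.

class WorkerCache:
    """Per-worker cache for loop-invariant data across iterations."""
    def __init__(self, loader):
        self._loader = loader     # expensive load (stands in for HDFS read)
        self._cached = None
        self.loads = 0            # counts real loads, for demonstration

    def get(self):
        if self._cached is None:  # first iteration: load and cache locally
            self._cached = self._loader()
            self.loads += 1
        return self._cached       # later iterations: served from the cache

def load_graph():
    # Stands in for reading the invariant link structure from HDFS.
    return {"a": ["b"], "b": ["a"]}

cache = WorkerCache(load_graph)
for iteration in range(5):        # five loop iterations
    graph = cache.get()           # invariant data is fetched only once
```

The design point this models is that the cost of re-reading invariant data grows with the number of iterations under plain Hadoop, while with worker-local caching it is paid once per loop.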