
Big Data Dictionary

Apache Sqoop

Sqoop is an open source project, originally developed by Cloudera, designed for efficiently transferring bulk data between Apache Hadoop and structured data stores such as relational databases. It extracts data from an RDBMS and loads it into HDFS, and it can also export data from HDFS back into an RDBMS. Connectors between HDFS and many RDBMSs (e.g. MySQL, Oracle, Aster Data, EMC Greenplum, Netezza, Teradata) are available.

The above figure illustrates how the Sqoop system works. In practice, Sqoop can import a single table from a relational database, all tables in a database, or only part of a table (selected with a WHERE clause). It runs the import as a MapReduce job and writes the data to HDFS as delimited text files or SequenceFiles. It also generates a Java class file that can encapsulate one row of the imported data.
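To make the three import modes concrete, here is a minimal Python sketch that only assembles Sqoop command lines; it does not run Sqoop or contact a database. The helper names (`sqoop_import_args`, `sqoop_import_all_args`), the JDBC URL, the table name and the HDFS paths are all hypothetical and chosen for illustration. The flags used (`--connect`, `--table`, `--where`, `--target-dir`, `--warehouse-dir`, `--num-mappers`, `--as-textfile`, `--as-sequencefile`) are those of the Sqoop 1 command-line tool; check them against the version you have installed.

```python
import shlex


def sqoop_import_args(jdbc_url, table, target_dir, where=None,
                      as_sequencefile=False, num_mappers=4):
    """Build the argument list for importing one table (hypothetical helper)."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,               # JDBC URL of the source database
        "--table", table,                    # single table to import
        "--target-dir", target_dir,          # HDFS directory for the output files
        "--num-mappers", str(num_mappers),   # number of parallel map tasks
    ]
    if where:
        # Import only the rows that match this WHERE condition.
        args += ["--where", where]
    # Output format: SequenceFiles or delimited text files.
    args.append("--as-sequencefile" if as_sequencefile else "--as-textfile")
    return args


def sqoop_import_all_args(jdbc_url, warehouse_dir):
    """Build the argument list for importing every table (hypothetical helper)."""
    return [
        "sqoop", "import-all-tables",
        "--connect", jdbc_url,
        "--warehouse-dir", warehouse_dir,    # parent HDFS directory, one subdirectory per table
    ]


if __name__ == "__main__":
    url = "jdbc:mysql://db.example.com/shop"  # assumed example URL
    # Part of one table, stored as delimited text.
    print(shlex.join(sqoop_import_args(url, "orders", "/data/orders",
                                       where="order_date >= '2013-01-01'")))
    # A whole table, stored as SequenceFiles.
    print(shlex.join(sqoop_import_args(url, "orders", "/data/orders_seq",
                                       as_sequencefile=True)))
    # Every table in the database.
    print(shlex.join(sqoop_import_all_args(url, "/data/shop")))
```

Building the command as a list rather than a single string keeps the WHERE condition intact as one argument, so the list can be handed to `subprocess.run` without shell quoting problems; `shlex.join` is used here only to print a copy-pasteable form.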
