Chapter 3: Sqoop Installation
Depending on the framework you are using, you will have to configure Sqoop accordingly. Let me take you through each option separately.
Here I am considering the three setups that people normally use-
- Own configured Hadoop – Custom configuration
- HortonWorks
- Cloudera CDH
Now I will explain each of them in detail-
#1 Custom Sqoop Installation
Here I am considering that you have already installed Hadoop and Java on your system and now you just want to install Sqoop to proceed further.
Follow the below steps for Sqoop installation when you have configured your own clusters-
Step 1: Download Sqoop
In this tutorial, we are using Sqoop 1.4.6. You can download this Sqoop version from here.
You will find multiple files there; download the sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz file.
Step 2: Start with the Sqoop installation
First, untar the downloaded file. Use the below commands on your command-line interface (e.g. PuTTY) to untar the file and then switch to the root user-
$ tar -xvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
$ su
password:
Now you can move the extracted folder to the standard “/usr/lib/sqoop” location using the below command-
# mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha /usr/lib/sqoop
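The extract-and-move steps above can also be wrapped in a small function so that the archive name and target directory are not hard-coded. This is only a minimal sketch: the function name install_sqoop and its two arguments are hypothetical, and it assumes the archive unpacks into a single folder named like the archive without its .tar.gz suffix.

```shell
# Sketch: unpack a Sqoop tarball and move it to a target directory.
# install_sqoop is a hypothetical helper name, not a Sqoop command.
install_sqoop() {
    local tarball="$1"   # e.g. sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
    local dest="$2"      # e.g. /usr/lib/sqoop (needs root to write there)

    if [ ! -f "$tarball" ]; then
        echo "archive not found: $tarball" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "target already exists: $dest" >&2
        return 1
    fi

    # Assumption: the archive unpacks into one folder named like the
    # archive minus its .tar.gz suffix.
    local unpacked
    unpacked="$(dirname "$tarball")/$(basename "$tarball" .tar.gz)"

    # Extract next to the archive, then move the folder into place.
    tar -xzf "$tarball" -C "$(dirname "$tarball")" || return 1
    mv "$unpacked" "$dest"
}
```

For the layout used in this tutorial it could be called from a root shell as: install_sqoop sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz /usr/lib/sqoop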
Step 3: Configure the bashrc file
Now you will have to set SQOOP_HOME to the path where you put the extracted Sqoop folder. In the previous step we moved it to “/usr/lib/sqoop”, so set SQOOP_HOME to this location.
Append the below lines to your ~/.bashrc file to make this happen-
#Sqoop
export SQOOP_HOME=/usr/lib/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
Whenever you make changes to the bashrc file, you have to reload it for the changes to take effect. Use the below command to do that-
$ source ~/.bashrc
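If you run the setup more than once, the same lines can pile up in the bashrc file. The sketch below shows one possible way to add the Sqoop variables only once. The function name add_sqoop_env is hypothetical and not part of Sqoop; the file and path are passed in as arguments.

```shell
# Sketch: add the Sqoop variables to a shell startup file only once.
# add_sqoop_env is a hypothetical helper name.
add_sqoop_env() {
    local rc_file="$1"      # e.g. "$HOME/.bashrc"
    local sqoop_home="$2"   # e.g. /usr/lib/sqoop

    # Skip if an earlier run already added SQOOP_HOME.
    if [ -f "$rc_file" ] && grep -q '^export SQOOP_HOME=' "$rc_file"; then
        return 0
    fi

    {
        echo '#Sqoop'
        echo "export SQOOP_HOME=$sqoop_home"
        # Single quotes keep $PATH and $SQOOP_HOME unexpanded in the file.
        echo 'export PATH=$PATH:$SQOOP_HOME/bin'
    } >> "$rc_file"
}
```

It could be called as: add_sqoop_env "$HOME/.bashrc" /usr/lib/sqoop, followed by the source command shown above.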
Step 4: Configure Sqoop now
So far, we have untarred the tar file and set the SQOOP_HOME location. Now let’s perform the basic Sqoop file configuration.
To do this, we will have to make changes to sqoop-env.sh, which you will find under the $SQOOP_HOME/conf folder.
First of all, go to the conf folder and rename the template file using the below two commands-
$ cd $SQOOP_HOME/conf
$ mv sqoop-env-template.sh sqoop-env.sh
Now open the sqoop-env.sh file and set the below 2 lines (here Hadoop is assumed to be installed in /usr/local/hadoop; adjust the path to your own Hadoop location)-
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
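The same configuration can be scripted instead of edited by hand. The following is a minimal sketch under a few assumptions: the function name configure_sqoop_env is hypothetical, it copies the template with cp instead of renaming it (so the template stays available), and it simply appends the two export lines if they are not already there.

```shell
# Sketch: create sqoop-env.sh from the template and record the Hadoop paths.
# configure_sqoop_env is a hypothetical helper name.
configure_sqoop_env() {
    local conf_dir="$1"      # e.g. "$SQOOP_HOME/conf"
    local hadoop_home="$2"   # e.g. /usr/local/hadoop

    local template="$conf_dir/sqoop-env-template.sh"
    local target="$conf_dir/sqoop-env.sh"

    if [ ! -f "$target" ]; then
        # cp (rather than mv) keeps the template around for reference.
        cp "$template" "$target" || return 1
    fi

    # Append each setting unless an active export line already exists.
    if ! grep -q '^export HADOOP_COMMON_HOME=' "$target"; then
        echo "export HADOOP_COMMON_HOME=$hadoop_home" >> "$target"
    fi
    if ! grep -q '^export HADOOP_MAPRED_HOME=' "$target"; then
        echo "export HADOOP_MAPRED_HOME=$hadoop_home" >> "$target"
    fi
}
```

It could be called as: configure_sqoop_env "$SQOOP_HOME/conf" /usr/local/hadoop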
Step 5: Configure MySQL
Sqoop transfers data between an RDBMS such as MySQL or Oracle and Hadoop, and it needs the JDBC driver of that database to connect to it. So let’s add the MySQL connector as well.
First of all, download the mysql-connector-java-5.1.30.tar.gz file from this link.
Now untar the file and move the connector jar to the /usr/lib/sqoop/lib directory.
$ tar -zxf mysql-connector-java-5.1.30.tar.gz
$ su
password:
# cd mysql-connector-java-5.1.30
# mv mysql-connector-java-5.1.30-bin.jar /usr/lib/sqoop/lib
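To confirm that the jar really landed in the lib folder, a quick check like the one below may help. It is only a sketch: the function name has_mysql_connector is hypothetical, and it assumes the jar file name starts with mysql-connector-java- as in this tutorial.

```shell
# Sketch: report whether a MySQL connector jar is in Sqoop's lib directory.
# has_mysql_connector is a hypothetical helper name.
has_mysql_connector() {
    local lib_dir="$1"   # e.g. /usr/lib/sqoop/lib
    local jar
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # If nothing matches, the pattern stays literal and -f fails.
        if [ -f "$jar" ]; then
            echo "found: $jar"
            return 0
        fi
    done
    echo "no MySQL connector jar in $lib_dir" >&2
    return 1
}
```

It could be called as: has_mysql_connector /usr/lib/sqoop/lib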
You are all done now, just verify Sqoop and you are good to go.
Step 6: Verify Sqoop
Use the below commands to verify the setup-
$ cd $SQOOP_HOME/bin
$ sqoop-version
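If the verification fails, the usual reason is that the Sqoop bin folder is not on the PATH. The sketch below shows one way to check this first and print a readable message. The function name verify_sqoop is hypothetical; it relies only on the standard command -v shell builtin and the sqoop version command.

```shell
# Sketch: fail early with a readable message if sqoop is not on PATH.
# verify_sqoop is a hypothetical helper name.
verify_sqoop() {
    if ! command -v sqoop >/dev/null 2>&1; then
        echo "sqoop not found on PATH; check SQOOP_HOME and ~/.bashrc" >&2
        return 1
    fi
    # "sqoop version" prints the installed Sqoop version.
    sqoop version
}
```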
Now let me tell you about the Sqoop installation in Cloudera and HortonWorks.
#2. Install Sqoop in HortonWorks
Usually, if you are using the HortonWorks image, you will find Sqoop and MySQL already installed and can directly start using them from the command line.
You should check this link for the details on installing Sqoop in HortonWorks.
#3. Sqoop installation in Cloudera
If you are using Cloudera CDH, Sqoop usually comes with the distribution, so you should not need to install it again. You can check this using the below commands on the command line-
$ sqoop help
$ sqoop version
$ sqoop import
You can check further details on the Cloudera official site using this link.