Chapter 3: Sqoop Installation
How you install and configure Sqoop depends on the framework you are using. Let me take you through each case separately and in detail: a custom installation, HortonWorks, and Cloudera.
1. Custom Sqoop Installation
Here I am assuming that you have already installed Hadoop and Java on your system and now just want to install Sqoop.
Follow the steps below to install Sqoop on a cluster you have configured yourself.
Step 1: Download Sqoop
In this tutorial, we are using Sqoop 1.4.6, a fully functional release. You can download this Sqoop version from here.
You will find multiple files listed there; download the sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz file.
Step 2: Start with the Sqoop installation
First, untar the downloaded file from your command-line interface (e.g. PuTTY).
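The untar step can be sketched as follows. This is a minimal example that assumes the archive was downloaded into the current directory; the SQOOP_ARCHIVE variable and the existence check are illustrative additions.

```shell
# Assumption: the archive sits in the current working directory.
SQOOP_ARCHIVE="sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz"

if [ -f "$SQOOP_ARCHIVE" ]; then
    # -x extract, -z decompress gzip, -f read from the named file
    tar -xzf "$SQOOP_ARCHIVE"
else
    echo "Archive not found: $SQOOP_ARCHIVE" >&2
fi
```

Extraction should leave a directory named after the archive, without the .tar.gz suffix.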
Next, move the extracted directory to the standard location, “/usr/lib/sqoop”.
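A possible form of the move, assuming the extracted directory is in the current directory and that /usr/lib/sqoop does not exist yet. The variable names are hypothetical, and sudo is used on the assumption that writing under /usr/lib requires root on your system.

```shell
# Assumption: the previous step extracted the archive into the current directory.
SQOOP_EXTRACTED="sqoop-1.4.6.bin__hadoop-2.0.4-alpha"
SQOOP_TARGET="/usr/lib/sqoop"

if [ -d "$SQOOP_EXTRACTED" ]; then
    # Renames the extracted directory to /usr/lib/sqoop.
    # If /usr/lib/sqoop already exists, mv would place the directory inside it instead.
    sudo mv "$SQOOP_EXTRACTED" "$SQOOP_TARGET"
else
    echo "Extracted directory not found: $SQOOP_EXTRACTED" >&2
fi
```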
Step 3: Configure the bashrc file
Now set SQOOP_HOME to the directory where you placed Sqoop. In the previous step we moved it to “/usr/lib/sqoop”, so SQOOP_HOME should point to that location.
Add this setting to your ~/.bashrc file.
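The lines to append to ~/.bashrc might look like this. The SQOOP_HOME value comes from the step above; the PATH line is an assumption added here so that the sqoop command can be called by name from any directory.

```shell
# Append these lines to ~/.bashrc
# SQOOP_HOME points at the directory Sqoop was moved to in Step 2.
export SQOOP_HOME=/usr/lib/sqoop
# Assumption: also expose Sqoop's bin directory on the PATH.
export PATH="$PATH:$SQOOP_HOME/bin"
```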
Whenever you change the bashrc file, you have to reload it in the current shell for the changes to take effect.
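Reloading can be done roughly as below. The file check and the final echo are illustrative additions that make it easy to see whether SQOOP_HOME was picked up.

```shell
# "." is the portable spelling of bash's "source" command.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi

# Show what the current shell now sees.
echo "SQOOP_HOME is: ${SQOOP_HOME:-not set}"
```

Opening a new terminal session has the same effect, since ~/.bashrc is read when an interactive shell starts.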
Step 4: Configure Sqoop
So far we have untarred the archive and set SQOOP_HOME. Now let’s do the basic Sqoop file configuration.
To do this, we have to edit sqoop-env.sh, which lives in the conf folder under SQOOP_HOME.
First, change to the conf folder and copy the template file to sqoop-env.sh.
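The two commands can be sketched like this. The template name sqoop-env-template.sh is the one shipped in the Sqoop 1.4.x conf folder; the SQOOP_CONF_DIR variable and the fallback to /usr/lib/sqoop are assumptions added for illustration.

```shell
# Assumption: SQOOP_HOME is set as in Step 3; otherwise fall back to the path used in this chapter.
SQOOP_CONF_DIR="${SQOOP_HOME:-/usr/lib/sqoop}/conf"

if [ -f "$SQOOP_CONF_DIR/sqoop-env-template.sh" ]; then
    cd "$SQOOP_CONF_DIR"
    # Keep the template untouched and edit the copy.
    cp sqoop-env-template.sh sqoop-env.sh
else
    echo "Template not found in $SQOOP_CONF_DIR" >&2
fi
```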
Now open the sqoop-env.sh file and change two lines in it so that Sqoop can locate your Hadoop installation.
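The two lines are most likely the Hadoop home settings from the template, shown here as a sketch. The /usr/local/hadoop path is only an assumption; replace it with the directory where Hadoop is installed on your machine.

```shell
# In sqoop-env.sh
# Assumption: Hadoop is installed under /usr/local/hadoop.
export HADOOP_COMMON_HOME=/usr/local/hadoop   # where the Hadoop common libraries live
export HADOOP_MAPRED_HOME=/usr/local/hadoop   # where the Hadoop MapReduce libraries live
```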
Step 5: Configure MySQL
Sqoop transfers data between an RDBMS such as MySQL or Oracle and Hadoop, so it needs a driver for the database. Here we set it up for MySQL.
First, download the mysql-connector-java-5.1.30.tar.gz file from this link.
Now untar the file and move the connector JAR it contains to the /usr/lib/sqoop/lib directory.
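A minimal sketch of this step, assuming the connector archive is in the current directory. The JAR name inside the archive (mysql-connector-java-5.1.30-bin.jar) is an assumption based on how Connector/J 5.1.x archives are usually laid out, so check the extracted folder if the path differs.

```shell
CONNECTOR_ARCHIVE="mysql-connector-java-5.1.30.tar.gz"
# Assumption: the archive unpacks to a folder containing the "-bin" JAR.
CONNECTOR_JAR="mysql-connector-java-5.1.30/mysql-connector-java-5.1.30-bin.jar"

if [ -f "$CONNECTOR_ARCHIVE" ]; then
    tar -xzf "$CONNECTOR_ARCHIVE"
    # Sqoop loads JDBC drivers from its lib directory.
    mv "$CONNECTOR_JAR" /usr/lib/sqoop/lib/
else
    echo "Archive not found: $CONNECTOR_ARCHIVE" >&2
fi
```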
That completes the installation. Now just verify Sqoop and you are good to go.
Step 6: Verify Sqoop
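One way to verify the installation is to ask Sqoop for its version. The check for the command on the PATH is an illustrative addition, so that a missing setup produces a readable hint instead of a "command not found" error.

```shell
if command -v sqoop >/dev/null 2>&1; then
    # Prints the installed Sqoop version and build details.
    sqoop version
else
    echo "sqoop not found on PATH; re-check SQOOP_HOME and PATH in ~/.bashrc" >&2
fi
```

If the version reported is 1.4.6, the installation from the steps above is in place.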
2. Install Sqoop in HortonWorks
Usually, if you are using the HortonWorks image, you will find Sqoop and MySQL already installed and can start using them directly from the command line.
Check this link for details on installing Sqoop in HortonWorks.
3. Sqoop installation in Cloudera
If you are using Cloudera CDH, Sqoop usually comes with it, so you do not need to install it again. You can check the details of the installed Sqoop from the command line.
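The check could look something like the sketch below. These are standard Sqoop and shell commands rather than anything Cloudera-specific, and the guard around them is an illustrative addition.

```shell
if command -v sqoop >/dev/null 2>&1; then
    command -v sqoop   # shows where the sqoop launcher is installed
    sqoop help         # lists the tools available in this Sqoop installation
else
    echo "sqoop is not available on this machine's PATH" >&2
fi
```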