Hadoop

Ask Big Data Hadoop Related Questions

Ask Any Hadoop and Its Ecosystem Related Questions

We are building our forum. Until it goes live, you can post all your questions and doubts here, and we will make sure they are answered correctly.

All Hadoop- and BI-related questions are welcome. You can also post job-related queries.

7 Comments

  • Thanks for this website.

    Could you please provide the code, or describe a way, to schedule a daily Sqoop job that loads an incremental import into a single Hive table?

    A step-by-step solution would be a great help.

    Thanks,
    Sumit kumar

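  • -- Load both tables as space-delimited text.
    customers = LOAD './in2/customersTable.txt' USING PigStorage(' ') AS (nameCus:chararray, age:int);
    purchases = LOAD './in2/purchasesTable.txt' USING PigStorage(' ') AS (namePur:chararray, flavor:chararray);
    -- Join on the name columns declared in the schemas above.
    A = JOIN customers BY nameCus, purchases BY namePur;
    -- After a join, fields are prefixed with the source relation, not with A.
    B = FOREACH A GENERATE customers::nameCus AS nameCus, purchases::namePur AS namePur, customers::age AS age, purchases::flavor AS flavor;
    C = GROUP B BY flavor;
    -- COUNT takes the grouped bag, which is named after relation B.
    D = FOREACH C GENERATE COUNT(B) AS purchaseCount;
    E = ORDER D BY purchaseCount DESC;
    F = LIMIT E 1;
    -- STORE is a statement and cannot be assigned to an alias.
    STORE F INTO './flavorcount';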

  • Hi,
    I installed Sqoop as per your guidelines. When I run the sqoop list-databases command, I get the error "Could not find or load main class org.apache.sqoop.Sqoop". How do I resolve this issue?

    • Hi,

      You need to make sure that sqoop-1.4.3.jar is present under your SQOOP_HOME directory.

      Also, check which Sqoop version you downloaded. Please try this and comment if you run into any other issues.
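
      The jar check described above could be sketched roughly as follows. This is only an illustration: find_sqoop_jar is a hypothetical helper name, and it assumes the jar follows the sqoop-<version>.jar naming used in this reply and that SQOOP_HOME is set in your shell.

      ```shell
      # Hypothetical helper: print the first Sqoop jar found in a directory.
      # Assumes the jar is named sqoop-<version>.jar, as in the reply above.
      find_sqoop_jar() {
        dir="$1"
        for jar in "$dir"/sqoop-*.jar; do
          # The glob stays unexpanded when nothing matches, so test for a real file.
          if [ -f "$jar" ]; then
            echo "$jar"
            return 0
          fi
        done
        echo "no sqoop-*.jar found in $dir" >&2
        return 1
      }
      ```

      Running find_sqoop_jar "$SQOOP_HOME" should print the jar path if it exists. If it reports that nothing was found, the jar is missing from that directory or SQOOP_HOME points somewhere else.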

  • I know this is off topic for Hadoop, but it is somewhat related to Big Data. My question is:
    Why is a Spark RDD immutable?
    I know that immutable means it cannot be changed. Why does an RDD need to be immutable, and what are the advantages?
