I want to run a sample Pig Latin script



Hi Team, I have two simple data files. I want to load them and then run a sample Pig Latin script that accesses them, and I need some help. The Pig Latin script is below; it comes from one of the online classes that I am taking.

When I use CloudxLab to upload the two data files, what directory path should I use so that I can update 'jjzhang/lab3/people.csv', 'piglab/zip-city.csv', and 'piglab/output/Joined_Results' in the script below?

My goal is to be able to run the script below and understand what each statement does.

A = LOAD 'jjzhang/lab3/people.csv' USING PigStorage(',') AS (gender:chararray, age:int, income:int, zip:int);

B = FOREACH A GENERATE income, zip;
DUMP B;

C = FOREACH B GENERATE income/1000, zip;
DUMP C;

D = FILTER B BY income > 20000;
DUMP D;

Sorted_Income = ORDER D BY income ASC;
DUMP Sorted_Income;

ZipCity = LOAD 'piglab/zip-city.csv' USING PigStorage(',') AS (zip:int, city:chararray);

joined_results = JOIN A BY (zip), ZipCity BY (zip);
DUMP joined_results;

STORE Sorted_Income INTO 'piglab/output/Joined_Results' USING PigStorage(',');
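
On the path question above, a minimal sketch of one way the two files could be placed so that relative paths in LOAD resolve. It assumes Pig runs in MapReduce mode against HDFS, a Hadoop 2 or later cluster (for `-mkdir -p`), and that people.csv and zip-city.csv sit in the local directory where the Grunt shell was started. The local file names and the choice to keep both files under `piglab` are assumptions for illustration, not something confirmed for CloudxLab.

```pig
-- Sketch only: run inside the Grunt shell (started with `pig`).
-- Assumes people.csv and zip-city.csv are in the local working directory.

-- Create piglab under the HDFS home directory, i.e. /user/<your-username>/piglab.
fs -mkdir -p piglab

-- Copy the two local CSV files into that HDFS directory.
fs -put people.csv piglab/people.csv
fs -put zip-city.csv piglab/zip-city.csv

-- Confirm the files are where the script will look for them.
fs -ls piglab

-- A relative path in LOAD resolves against the HDFS home directory,
-- so this reads /user/<your-username>/piglab/people.csv.
A = LOAD 'piglab/people.csv' USING PigStorage(',') AS (gender:chararray, age:int, income:int, zip:int);

-- Print the schema to check that the load statement was accepted.
DESCRIBE A;
```

With this layout, the first LOAD path in the script would become 'piglab/people.csv' and the other two paths could stay as written. STORE fails if its output directory already exists, so 'piglab/output/Joined_Results' would need to be removed before the script is run a second time.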