Idea for an NLP Project


If you are looking for an NLP project that uses Machine Learning or Deep Learning, you can work on this one.

The data is located in HDFS at: /data/amazon_review_full_csv

$ hadoop fs -ls /data/amazon_review_full_csv
Found 3 items
-rw-r--r--   3 hdfs hdfs       1485 2019-08-20 18:55 /data/amazon_review_full_csv/readme.txt
-rw-r--r--   3 hdfs hdfs  292412008 2019-08-20 18:55 /data/amazon_review_full_csv/test.csv
-rw-r--r--   3 hdfs hdfs 1348342434 2019-08-20 18:55 /data/amazon_review_full_csv/train.csv

This is what the data looks like:

$ hadoop fs -cat /data/amazon_review_full_csv/train.csv|more
"3","more like funchuck","Gave this to my dad for a gag gift after directing ""Nunsense,"" he got a reall kick out
of it!"
"5","Inspiring","I hope a lot of people hear this cd. We need more strong and positive vibes like this. Great vocal
s, fresh tunes, cross-cultural happiness. Her blues is from the gut. The pop sounds are catchy and mature."
"5","The best soundtrack ever to anything.","I'm reading a lot of reviews saying that this is the best 'game soundt
rack' and I figured that I'd write a review to disagree a bit. This in my opinino is Yasunori Mitsuda's ultimate ma
sterpiece. The music is timeless and I'm been listening to it for years now and its beauty simply refuses to fade.T
he price tag on this is pretty staggering I must say, but if you are going to buy any cd for this much money, this
is the only one that I feel would be worth every penny."
"4","Chrono Cross OST","The music of Yasunori Misuda is without question my close second below the great Nobuo Uema
tsu.Chrono Cross OST is a wonderful creation filled with rich orchestra and synthesized sounds. While ambiance is o
ne of the music's major factors, yet at times it's very uplifting and vigorous. Some of my favourite tracks include
; ""Scars Left by Time, The Girl who Stole the Stars, and Another World""."
"5","Too good to be true","Probably the greatest soundtrack in history! Usually it's better to have played the game
 first but this is so enjoyable anyway! I worked so hard getting this soundtrack and after spending [money] to get
it it was really worth every penny!! Get this OST! it's amazing! The first few tracks will have you dancing around
with delight (especially Scars Left by Time)!! BUY IT NOW!!"
"5","There's a reason for the price","There's a reason this CD is so expensive, even the version that's not an impo
rt.Some of the best music ever. I could listen to every track every minute of every day. That's about all i can say

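Please note that each record contains three comma-separated fields: rating, review title and review text. Every field is enclosed in double quotes, and a double quote inside a field is written as two double quotes ("").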
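
As a rough illustration of the format, the sketch below parses such records with Python's standard csv module, which handles the quoted fields and doubled quotes. The function name parse_reviews is made up for this example, and the sample text is two records from the output above joined back onto single lines. In practice you would feed it the lines of train.csv instead of an in-memory string.

```python
import csv
import io


def parse_reviews(lines):
    """Yield (rating, title, text) tuples from an iterable of CSV lines.

    parse_reviews is a hypothetical helper name, not part of the dataset.
    """
    for row in csv.reader(lines):
        # Skip anything that does not have exactly the three expected fields.
        if len(row) != 3:
            continue
        rating, title, text = row
        yield int(rating), title, text


# Two records copied from the sample output, joined onto single lines.
sample = io.StringIO(
    '"3","more like funchuck","Gave this to my dad for a gag gift after '
    'directing ""Nunsense,"" he got a reall kick out of it!"\n'
    '"5","Inspiring","I hope a lot of people hear this cd."\n'
)

rows = list(parse_reviews(sample))
for rating, title, text in rows:
    print(rating, "|", title, "|", text)
```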

You can build a model on train.csv that predicts the rating from the review title and review text. The test.csv file can then be used to check how well it does.
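
To give an idea of the simplest possible starting point, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with only the Python standard library. It is a baseline under stated assumptions, not a recommended final approach: the class name NaiveBayesRater, the tokenizer and the four tiny training examples are all invented for illustration, and a real project would train on train.csv and most likely use a proper ML or deep learning library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRater:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()            # number of reviews per rating
        self.word_counts = defaultdict(Counter)  # word frequencies per rating
        self.vocab = set()

    def fit(self, examples):
        for rating, title, text in examples:
            tokens = tokenize(title + " " + text)
            self.class_counts[rating] += 1
            self.word_counts[rating].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, title, text):
        # Words never seen in training carry no information, so drop them.
        tokens = [t for t in tokenize(title + " " + text) if t in self.vocab]
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for rating, n in self.class_counts.items():
            denom = sum(self.word_counts[rating].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for tok in tokens:
                score += math.log((self.word_counts[rating][tok] + 1) / denom)
            if score > best_score:
                best, best_score = rating, score
        return best


# Hypothetical toy data, NOT taken from the dataset.
train = [
    (5, "Great", "great music wonderful soundtrack"),
    (5, "Amazing", "amazing wonderful great"),
    (1, "Terrible", "terrible awful waste of money"),
    (1, "Awful", "awful terrible boring"),
]

model = NaiveBayesRater().fit(train)
prediction_pos = model.predict("wonderful", "great soundtrack")
prediction_neg = model.predict("boring", "awful waste")
print(prediction_pos, prediction_neg)
```

The title and the review text are simply concatenated here. Treating them as separate features, or replacing the word counts with TF-IDF or learned embeddings, are natural next steps.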

The data is based on actual reviews from Amazon.