Lecture 7: More on Hadoop and HDFS
UPDATE: Please check the file attached below (course_Cluster_instructions.pdf)
UPDATE: The history of commands from the different terminals is available below
Project Ideas
1- Given a Twitter handle, provide the following:
-- number of tweets
-- top 10 words used by the user
2- Given two usernames, do the following:
-- compare the number of tweets each posted
-- compare the number of retweets each received
3- Given a username, do the following:
-- get the average time the user waits between tweets
-- get the shortest and longest time the user waited between tweets
4- Given a keyword, do the following:
-- get the number of tweets that contain the keyword
-- get the most retweeted tweet that contains the keyword
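
As a starting point, ideas 1 and 3 can be prototyped locally in plain Python before moving the logic to the cluster. The sketch below is only an illustration under stated assumptions: the record layout (handle, timestamp, text), the timestamp format, the sample data, and every function name are hypothetical and do not come from the course material. The word count is split into a map step and a reduce step to mirror how a MapReduce job would be organized.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (handle, timestamp, text). Made up for illustration.
TWEETS = [
    ("alice", "2015-03-01 10:00:00", "Learning Hadoop and HDFS today"),
    ("alice", "2015-03-01 10:30:00", "Hadoop streaming is fun"),
    ("alice", "2015-03-01 12:30:00", "More Hadoop tomorrow"),
    ("bob",   "2015-03-01 11:00:00", "Nothing about clusters here"),
]

def count_tweets(records, handle):
    """Idea 1: number of tweets posted by the given handle."""
    return sum(1 for user, _, _ in records if user == handle)

def map_words(records, handle):
    """Map step: emit (word, 1) for every word in the user's tweets."""
    for user, _, text in records:
        if user == handle:
            for word in re.findall(r"[a-z0-9']+", text.lower()):
                yield word, 1

def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts

def top_words(records, handle, k=10):
    """Idea 1: the k most frequent words used by the handle."""
    return reduce_counts(map_words(records, handle)).most_common(k)

def tweet_gaps(records, handle):
    """Idea 3: seconds between consecutive tweets of the handle."""
    times = sorted(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        for user, ts, _ in records
        if user == handle
    )
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]

if __name__ == "__main__":
    print(count_tweets(TWEETS, "alice"))
    print(top_words(TWEETS, "alice", k=3))
    gaps = tweet_gaps(TWEETS, "alice")
    if gaps:  # a user with fewer than two tweets has no gaps
        print(sum(gaps) / len(gaps), min(gaps), max(gaps))
```

On real data the map and reduce functions would likely become separate scripts reading from standard input, and the tokenizer would probably need to handle mentions, hashtags, URLs and stop words, which this sketch ignores.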