Distributed Computing and MapReduce

Today I conducted a session on distributed computing. It focused mainly on MapReduce and its open-source implementation, Hadoop.
The session began with a brief look at why we need distributed computing, since everything is driven by need. It then covered how a distributed file system works, focusing on the Google File System. Next came a demonstration of how Hadoop works in a single-node environment, using an example very similar to the wordcount example: an election simulation in which the input was the votes cast and the Hadoop job aggregated the results. The session closed with some finer details about Hadoop.
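
To give a flavour of the election example, here is a minimal sketch in plain Python rather than the Hadoop Java API. It is not the code shown in the session: the input format (one candidate name per line) and all function names are assumptions made for illustration. It mimics the three steps a MapReduce job goes through (map, shuffle, reduce), all in memory on a single machine.

```python
from collections import defaultdict


def map_vote(line):
    """Map step: emit a (candidate, 1) pair for each non-empty vote line.

    Assumes each input line holds exactly one candidate name.
    """
    candidate = line.strip()
    if candidate:
        yield (candidate, 1)


def shuffle(pairs):
    """Shuffle step: group every emitted value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_votes(candidate, counts):
    """Reduce step: add up the counts collected for one candidate."""
    return (candidate, sum(counts))


def count_votes(lines):
    """Run map, shuffle and reduce in sequence and return the totals as a dict."""
    pairs = (pair for line in lines for pair in map_vote(line))
    grouped = shuffle(pairs)
    return dict(reduce_votes(c, v) for c, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical votes, made up for this sketch.
    votes = ["alice", "bob", "alice", "", "carol", "alice", "bob"]
    for candidate, total in sorted(count_votes(votes).items()):
        print(candidate, total)
```

On a real cluster, Hadoop would run the map and reduce steps on many machines and take care of the shuffle itself; the sketch only shows the shape of the computation, which is the same as in wordcount with candidates in place of words.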

Some resources that are very helpful:

http://code.google.com/edu/parallel/index.html#hadoop

http://www.jakobhoman.com/search/label/Hadoop

Slides for the session:

http://www.slideshare.net/varunthacker/distributed-computing-3635200

How to set up a Hadoop environment:

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
