Nov 27, 2012

Not getting performance with MapReduce

I am working with Hadoop MapReduce to get a performance benefit, but when I run my program on Hadoop it takes about 37 minutes, whereas a simple C++ program doing the same task takes only about 5 minutes.

Here's some additional info.

Map/Reduce Tutorial

"This document comprehensively describes all user-facing facets of the Hadoop Map/Reduce framework and serves as a tutorial."

I'm not sure what the issue is or how familiar you are with MapReduce. Yahoo has a pretty good Hadoop tutorial that has two modules on MapReduce. That may be of some help. Good luck!




I am new to MapReduce and I am using Hadoop Pipes for it. I have an input file that contains a number of records, one per line. I have written a simple program to print those lines that have three words in common. In the map function I emit the word as the key and the record as the value, and I compare those records in the reduce function. I compared Hadoop's performance with a simple C++ program that reads the records from the file, splits each one into words, loads the data into a map (word as key, record as value), and compares the data once everything is loaded. But I found that for the same task Hadoop MapReduce takes much longer than the plain C++ program.
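
For reference, the plain C++ approach described above might look roughly like the sketch below. It is not the original program: the function names `splitWords` and `findSimilarRecords` are made up for illustration, and it assumes "three words are common" means that two records share at least three distinct whitespace-separated words. It builds the word-to-records map first (the part that corresponds to the map function) and then compares the records listed under each word (the part that corresponds to the reduce function).

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split a record into its distinct whitespace-separated words.
std::set<std::string> splitWords(const std::string& record) {
    std::istringstream in(record);
    std::set<std::string> words;
    std::string w;
    while (in >> w) words.insert(w);
    return words;
}

// Hypothetical helper: return index pairs (i, j) with i < j for records that
// share at least minShared distinct words (assumed meaning of "three words are common").
std::vector<std::pair<std::size_t, std::size_t>>
findSimilarRecords(const std::vector<std::string>& records, std::size_t minShared = 3) {
    // "Map" part: word -> indices of the records that contain it.
    std::map<std::string, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i < records.size(); ++i)
        for (const auto& w : splitWords(records[i]))
            index[w].push_back(i);

    // "Reduce" part: for every word, each pair of records under it gains one shared word.
    std::map<std::pair<std::size_t, std::size_t>, std::size_t> shared;
    for (const auto& entry : index) {
        const std::vector<std::size_t>& ids = entry.second;  // ascending by construction
        for (std::size_t a = 0; a < ids.size(); ++a)
            for (std::size_t b = a + 1; b < ids.size(); ++b)
                ++shared[std::make_pair(ids[a], ids[b])];
    }

    // Keep only the pairs that reach the threshold.
    std::vector<std::pair<std::size_t, std::size_t>> result;
    for (const auto& entry : shared)
        if (entry.second >= minShared) result.push_back(entry.first);
    return result;
}
```

Calling `findSimilarRecords(lines)` on the lines read from the input file returns the pairs of line indices to print. Everything here runs in one process and in memory, which is the baseline the Hadoop job is being compared against.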
