
Use Case – The Delhi Assembly elections are over. There are many polling booths, and each booth has many candidates. The voting machine records a candidate_name each time a voter casts a vote for that candidate. The Election Commission wants to count the votes for each candidate within a day.

Sample data attached: change the file extension to .txt after downloading. WordPress does not allow uploading .txt files, so I changed the extension to .pdf.

file1

file2

Solution –
This assumes you have gone through Day 2 and know how to run an application.

Creating VoteCounter.jar

1. Open JDeveloper and create a new Java Application
Specify Application Name – VoteCounterApp
Specify Project Name – VoteCounterProj
Specify Package Name – com.bigdata.votecount

2. Go to Project Properties -> Libraries & Classpath -> click “Add Jar/Libraries”
Select all jars from “\hadoop-2.5.2\share\hadoop\common\lib”
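Select hadoop-common-2.5.2.jar from “\hadoop-2.5.2\share\hadoop\common”
Select hadoop-mapreduce-client-core-2.5.2.jar from “\hadoop-2.5.2\share\hadoop\mapreduce” (the Mapper, Reducer and Job classes used below live in this jar)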

– I downloaded hadoop-2.5.2.tar.gz as discussed in the Day 2 exercise and extracted it to the \hadoop-2.5.2 folder

3. Create VoteCountReducer.java

package com.bigdata.votecount;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the 1s emitted by the mapper for each candidate.
public class VoteCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context output)
            throws IOException, InterruptedException {
        int voteCount = 0;
        for (IntWritable value : values) {
            voteCount += value.get();
        }
        // Emit (candidate name, total votes).
        output.write(key, new IntWritable(voteCount));
    }
}

4. Create VoteCountMapper.java

package com.bigdata.votecount;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits (candidate name, 1) for every input line.
public class VoteCountMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);

    @Override
    public void map(Object key, Text value, Context output) throws IOException,
            InterruptedException {

        // If more than one word is present, split on the space character.
        String[] words = value.toString().split(" ");
        // Only the first word is the candidate name.
        output.write(new Text(words[0]), one);
    }
}
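
To see what the mapper and reducer compute together without a cluster, here is a minimal plain-Java sketch of the same map, shuffle and reduce steps. It uses no Hadoop classes; the class name VoteCountSketch and the candidate names are made up for illustration, and the real framework does the grouping across machines rather than in one in-memory map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a single-process imitation of the MapReduce flow.
class VoteCountSketch {

    // "Map" step: the first space-separated word of a line is the candidate name.
    static String candidateOf(String line) {
        return line.split(" ")[0];
    }

    // "Shuffle" step: collect the 1 emitted for each vote under its candidate name.
    static Map<String, List<Integer>> shuffle(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            String candidate = candidateOf(line);
            List<Integer> ones = grouped.get(candidate);
            if (ones == null) {
                ones = new ArrayList<Integer>();
                grouped.put(candidate, ones);
            }
            ones.add(1);
        }
        return grouped;
    }

    // "Reduce" step: sum the grouped 1s to get each candidate's total.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int voteCount = 0;
            for (int value : entry.getValue()) {
                voteCount += value;
            }
            totals.put(entry.getKey(), voteCount);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical votes, one candidate name per line.
        List<String> votes = Arrays.asList("CandidateA", "CandidateB", "CandidateA");
        System.out.println(reduce(shuffle(votes))); // prints {CandidateA=2, CandidateB=1}
    }
}
```

The sketch keeps the same "first word is the candidate" rule as the mapper, so a line with extra words after the name still counts toward that candidate.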

5. Create VoteCountApplication.java

package com.bigdata.votecount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class VoteCountApplication extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new VoteCountApplication(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: [input] [output]");
            return -1;
        }

        // Use the configuration prepared by ToolRunner so generic options apply.
        Job job = Job.getInstance(getConf());
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(VoteCountMapper.class);
        job.setReducerClass(VoteCountReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setJarByClass(VoteCountApplication.class);

        // Wait for the job to finish so the exit code reflects success or failure.
        return job.waitForCompletion(true) ? 0 : 1;
    }
}

6. Deploy this project as a JAR file.
Specify JAR Name – VoteCounter.jar

Running VoteCounter.jar

7. Create a folder named “vcinput” in the “/user/hue/” folder.

8. Create two files, file1.txt and file2.txt, in that folder and put the sample data into them.

9. Run this command:

hadoop jar /media/sf_bigdata/VoteCounter/deploy/VoteCounter.jar com.bigdata.votecount.VoteCountApplication /user/hue/vcinput/ /user/hue/vcoutput/

10. Open this URL in a browser to see the job history – http://127.0.0.1:19888/jobhistory
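
Once the job has finished, the counts are written to the output folder. With TextOutputFormat each line is the key, a tab, then the value, typically in a file named like part-r-00000. As a rough sketch of what you might do next, the plain-Java snippet below picks the leading candidate from such lines. The class name VoteResultSketch and the sample lines are hypothetical, and fetching the lines from the output folder (for example with hadoop fs -cat) is left out.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: reads "candidate<TAB>count" lines that are already in memory.
class VoteResultSketch {

    // Returns the candidate with the highest count, or null if no valid line exists.
    // On a tie, the candidate that appears first in the list is returned.
    static String leader(List<String> lines) {
        String best = null;
        int bestCount = -1;
        for (String line : lines) {
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                continue; // skip blank or malformed lines
            }
            int count = Integer.parseInt(parts[1].trim());
            if (count > bestCount) {
                bestCount = count;
                best = parts[0];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical job output lines.
        List<String> output = Arrays.asList("CandidateA\t2", "CandidateB\t1");
        System.out.println("Leading candidate: " + leader(output));
    }
}
```

For a handful of candidates this is enough. For a very large output, the same comparison could be done in a second MapReduce job instead.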

Have fun!

Follow Day 5 for a detailed discussion of MapReduce programming.
