Running Hadoop WordCount

Original
2017/02/06 15:01

1. Start Hadoop

/root/hadoop/hadoop-2.6.0/sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Or, since start-all.sh is deprecated, use:
/root/hadoop/hadoop-2.6.0/sbin/start-dfs.sh
/root/hadoop/hadoop-2.6.0/sbin/start-yarn.sh

2. Prepare test files: create them in a local directory

[root@localhost /]# mkdir /root/testFile
[root@localhost /]# echo "Hello Hadoop" > /root/testFile/hello.txt
[root@localhost /]# echo "Hello Java" > /root/testFile/hello2.txt

3. Create the input directory /input on HDFS

cd /root/hadoop/hadoop-2.6.0/bin
[root@localhost bin]# hadoop fs -mkdir /input
4. Upload the files created on the local disk into /input
[root@localhost bin]# hadoop fs -put /root/testFile/hello*.txt /input
5. Location of Hadoop's bundled WordCount example JAR (contains the WordCount class):
/root/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar
6. Run wordcount
[root@localhost bin]# hadoop jar /root/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input/ /output/wordcount1
17/02/05 19:48:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/02/05 19:48:39 INFO input.FileInputFormat: Total input paths to process : 2
17/02/05 19:48:39 INFO mapreduce.JobSubmitter: number of splits:2
17/02/05 19:48:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1486108015974_0001
17/02/05 19:48:43 INFO impl.YarnClientImpl: Submitted application application_1486108015974_0001
17/02/05 19:48:44 INFO mapreduce.Job: The url to track the job: http://localhost:8099/proxy/application_1486108015974_0001/
17/02/05 19:48:44 INFO mapreduce.Job: Running job: job_1486108015974_0001
17/02/05 19:49:20 INFO mapreduce.Job: Job job_1486108015974_0001 running in uber mode : false
17/02/05 19:49:20 INFO mapreduce.Job:  map 0% reduce 0%
17/02/05 19:49:47 INFO mapreduce.Job:  map 50% reduce 0%
17/02/05 19:49:49 INFO mapreduce.Job:  map 100% reduce 0%
17/02/05 19:49:58 INFO mapreduce.Job:  map 100% reduce 100%
17/02/05 19:49:59 INFO mapreduce.Job: Job job_1486108015974_0001 completed successfully
17/02/05 19:49:59 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=54
		FILE: Number of bytes written=316700
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=229
		HDFS: Number of bytes written=24
		HDFS: Number of read operations=9
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=2
		Launched reduce tasks=1
		Data-local map tasks=2
		Total time spent by all maps in occupied slots (ms)=52251
		Total time spent by all reduces in occupied slots (ms)=6032
		Total time spent by all map tasks (ms)=52251
		Total time spent by all reduce tasks (ms)=6032
		Total vcore-seconds taken by all map tasks=52251
		Total vcore-seconds taken by all reduce tasks=6032
		Total megabyte-seconds taken by all map tasks=53505024
		Total megabyte-seconds taken by all reduce tasks=6176768
	Map-Reduce Framework
		Map input records=2
		Map output records=4
		Map output bytes=40
		Map output materialized bytes=60
		Input split bytes=205
		Combine input records=4
		Combine output records=4
		Reduce input groups=3
		Reduce shuffle bytes=60
		Reduce input records=4
		Reduce output records=3
		Spilled Records=8
		Shuffled Maps =2
		Failed Shuffles=0
		Merged Map outputs=2
		GC time elapsed (ms)=679
		CPU time spent (ms)=9280
		Physical memory (bytes) snapshot=707444736
		Virtual memory (bytes) snapshot=2677784576
		Total committed heap usage (bytes)=516423680
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=24
	File Output Format Counters 
		Bytes Written=24
[root@localhost bin]#
7. View the result
[root@localhost bin]# hdfs dfs -cat /output/wordcount1/*
Hadoop	1
Hello	2
Java	1
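For intuition, the same counts can be reproduced locally with a classic Unix pipeline that mirrors WordCount's phases: map (split lines into one word per line), shuffle (sort so identical words are adjacent), and reduce (count each group). This is a local sketch of the logic only, not the Hadoop implementation; the /tmp/testFile path is just an example directory:

```shell
# Recreate the two input files locally (same contents as the HDFS input)
mkdir -p /tmp/testFile
echo "Hello Hadoop" > /tmp/testFile/hello.txt
echo "Hello Java"   > /tmp/testFile/hello2.txt

# map: one word per line | shuffle: sort | reduce: count per word
cat /tmp/testFile/hello*.txt | tr -s ' ' '\n' | sort | uniq -c
# prints counts: Hadoop 1, Hello 2, Java 1 (same as the HDFS output above)
```

The pipeline makes it clear why "Hello" gets a count of 2 even though the two occurrences live in different files: sorting brings them together before counting, exactly what the shuffle phase does for the reducer.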

Reference: http://www.itnose.net/detail/6197823.html
