Quickly Setting Up a Hadoop Environment and Testing MapReduce

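Hadoop can run in three modes: standalone mode (everything runs locally on a single machine), pseudo-distributed mode (a simulated multi-node setup on one machine), and fully distributed mode (a real multi-node cluster). This article tests the first mode, which is the fastest way to get a Hadoop application running.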

Getting started:
Download
wget
(about 60 MB)
Extract
tar -zxvf hadoop-1.0.3.tar.gz

Configuration
[@Hadoop48 ~]$ echo $JAVA_HOME
/usr/java/jdk1.7.0

cd hadoop-1.0.3
[@Hadoop48 hadoop-1.0.3]$ vi conf/hadoop-env.sh
# Uncomment this line and set the JAVA_HOME environment variable
export JAVA_HOME=/usr/java/jdk1.7.0
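
The same edit can be made without opening an editor. The following is a minimal sketch, assuming conf/hadoop-env.sh contains a (possibly commented-out) `export JAVA_HOME=` line and that the JDK path contains no `|` or `&` characters. `set_java_home` is a hypothetical helper name, not something shipped with Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): set JAVA_HOME in a
# hadoop-env.sh-style file without opening an editor.
set_java_home() {
  local env_file="$1" java_home="$2"
  if grep -Eq '^[[:space:]]*#?[[:space:]]*export[[:space:]]+JAVA_HOME=' "$env_file"; then
    # Rewrite the existing line, dropping a leading '#' if present.
    # A backup copy is kept as <file>.bak.
    sed -i.bak -E "s|^[[:space:]]*#?[[:space:]]*export[[:space:]]+JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$env_file"
  else
    # No such line yet: append one.
    printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
  fi
}

# Usage, from inside the hadoop-1.0.3 directory:
#   set_java_home conf/hadoop-env.sh /usr/java/jdk1.7.0
```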

Test:
[@Hadoop46 hadoop-1.0.3]$ ./bin/hadoop
Usage: hadoop [--config confdir] COMMAND

Run the grep example from the bundled example programs

[@Hadoop48 hadoop-1.0.3]$ mkdir input
[@Hadoop48 hadoop-1.0.3]$ cp conf/* input
[@Hadoop48 hadoop-1.0.3]$ ./bin/hadoop jar hadoop-examples-1.0.3.jar grep input output '[a-z.]+'
12/05/22 18:03:32 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/05/22 18:03:32 WARN snappy.LoadSnappy: Snappy native library not loaded

[@Hadoop46 hadoop-1.0.3]$ cat output/*
117 value
99 property
91 name
88 description
85 the
77 of
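
The grep example extracts every string matching the regular expression from the input files, counts how often each distinct match occurs, and lists the most frequent first. For a small local directory, roughly the same result can be sketched with standard Unix tools. This is not how Hadoop computes it, only a way to sanity-check the output. `local_grep_count` is a hypothetical helper name, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
# Hypothetical helper (not part of Hadoop): count regex matches in a
# local directory, most frequent first, printed as "<count> <match>".
local_grep_count() {
  local input_dir="$1" pattern="$2"
  # -o: print each match on its own line; -h: omit file names;
  # -E: extended regular expressions.
  LC_ALL=C grep -ohE -- "$pattern" "$input_dir"/* \
    | LC_ALL=C sort \
    | uniq -c \
    | sort -k1,1nr \
    | sed -E 's/^[[:space:]]*([0-9]+)[[:space:]]+/\1 /'
}

# Usage, from inside the hadoop-1.0.3 directory:
#   local_grep_count input '[a-z.]+'
```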

Test the MapReduce wordcount example (word counting):

[@Hadoop46 hadoop-1.0.3]$ rm -r output
[@Hadoop46 hadoop-1.0.3]$ ./bin/hadoop jar hadoop-examples-1.0.3.jar wordcount input output
12/05/22 18:32:54 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/05/22 18:32:55 INFO input.FileInputFormat: Total input paths to process : 16

12/05/22 18:33:47 INFO mapred.JobClient: Map output records=2587
[@Hadoop46 hadoop-1.0.3]$

As the log timestamps show, counting the words took nearly one minute.
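
The elapsed time can be worked out from the first and last log timestamps shown above (18:32:54 and 18:33:47). A small sketch of that arithmetic follows. `elapsed_seconds` is a hypothetical helper, and it assumes both timestamps fall on the same day.

```shell
# Hypothetical helper: seconds between two HH:MM:SS timestamps
# taken on the same day.
elapsed_seconds() {
  awk -v s="$1" -v e="$2" 'BEGIN {
    split(s, a, ":"); split(e, b, ":")
    print (b[1]*3600 + b[2]*60 + b[3]) - (a[1]*3600 + a[2]*60 + a[3])
  }'
}

elapsed_seconds 18:32:54 18:33:47   # prints 53
```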

[@Hadoop46 hadoop-1.0.3]$ ls output/
part-r-00000 _SUCCESS
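
The listing shows two kinds of entries: the part file holding the reducer output, and an empty _SUCCESS marker written when the job completes. A script that consumes the results can check for the marker first. The sketch below uses two hypothetical helper names, `job_succeeded` and `print_results`, which are not part of Hadoop.

```shell
# Hypothetical helper: succeeds only if the output directory
# contains the _SUCCESS marker file.
job_succeeded() {
  [ -f "$1/_SUCCESS" ]
}

# Hypothetical helper: print the part files only when the job completed.
print_results() {
  local out_dir="$1"
  if job_succeeded "$out_dir"; then
    cat "$out_dir"/part-*
  else
    echo "no _SUCCESS marker in $out_dir; the job may have failed" >&2
    return 1
  fi
}

# Usage, from inside the hadoop-1.0.3 directory:
#   print_results output
```

Note that the output directory must not exist when a job starts, which is why the transcript removes it with rm -r before running wordcount.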

[@Hadoop46 hadoop-1.0.3]$ cat output/*
"". 4
"*" 10
"alice,bob 10
"console" 1
"hadoop.root.logger". 1
"jks". 4

which 17
who 3
will 8
with 5
worker 1
would 7
xmlns:xsl="" 1
you 1
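
The output shows that wordcount splits the text on whitespace only, so punctuation and quotes stay attached to the words, and that the results are ordered by word. For a small local directory this can be approximated with standard Unix tools, as a rough cross-check rather than a replacement for the MapReduce job. `local_wordcount` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of Hadoop): count whitespace-separated
# tokens in a local directory, printed as "<word> <count>" sorted by word.
local_wordcount() {
  local input_dir="$1"
  cat "$input_dir"/* \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2, $1 }'
}

# Usage, from inside the hadoop-1.0.3 directory:
#   local_wordcount input
```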

The whole process takes less than 10 minutes.
