⑧ Start Hadoop
On the command line, run start-all.sh. Alternatively, run start-dfs.sh first and then start-mapred.sh.
⑨ Run jps to list the service processes that were started
master node:
[root@master ~]# jps
25429 SecondaryNameNode
25500 JobTracker
25201 NameNode
25328 DataNode
18474 Jps
25601 TaskTracker
slave node (slave1):
[root@slave1 ~]# jps
4469 TaskTracker
4388 DataNode
29622 Jps
If the output looks like the above, the corresponding service processes have all started successfully.
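Comparing jps listings by eye gets tedious with more nodes, so here is a minimal sketch of an automated check. The function name check_daemons is hypothetical (it is not part of Hadoop); it only reads jps-style text from standard input and reports any expected daemon name that does not appear. It matches the second column exactly, so NameNode is not mistaken for SecondaryNameNode.

```shell
#!/usr/bin/env bash
# check_daemons is a hypothetical helper, not a Hadoop command.
# It reads "<pid> <ClassName>" lines (the format jps prints) from stdin
# and prints "missing: <name>" for each expected name that is absent.
check_daemons() {
    local listing name missing=0
    listing=$(cat)    # capture the whole jps listing once
    for name in "$@"; do
        # Compare the second field exactly against the expected name.
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"    # 0 when every expected daemon was listed
}
```

On the master node from this walkthrough it could be used as: jps | check_daemons NameNode SecondaryNameNode JobTracker DataNode TaskTracker. On a slave, pass only DataNode and TaskTracker.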
⑩ View the directory structure of the HDFS distributed file system
hadoop fs -ls /
The listing is empty at this point, because nothing has been created yet. Run the following command to create a directory:
hadoop fs -mkdir /newDir
Run hadoop fs -ls / again and the newDir directory will appear. For more on the hadoop fs command, see: HDFS commands
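If these steps end up in a setup script, it can help to create the directory only when it is missing. The sketch below is one way to do that, assuming the hadoop fs -test -e check is available in your version and exits with 0 when the path exists; ensure_hdfs_dir is a hypothetical helper name, not a Hadoop command.

```shell
#!/usr/bin/env bash
# ensure_hdfs_dir is a hypothetical helper, not part of Hadoop.
# Assumption: "hadoop fs -test -e <path>" exits 0 when the path exists.
ensure_hdfs_dir() {
    local dir="$1"
    if hadoop fs -test -e "$dir"; then
        echo "exists: $dir"
    else
        # Create the directory and report only if creation succeeded.
        hadoop fs -mkdir "$dir" && echo "created: $dir"
    fi
}
```

Calling ensure_hdfs_dir /newDir a second time should then report that the path exists instead of attempting to create it again.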
⑪ Run Hadoop's equivalent of a hello world program
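The usual first program is word count, but that requires creating directories and the like beforehand. A simpler option is the program in the bundled examples that estimates the value of π. Go to the hadoop directory and run it as follows: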
[root@slave1 hadoop-0.20.2]# hadoop jar hadoop-0.20.2-examples.jar pi 4 2
Number of Maps = 4
Samples per Map = 2
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Starting Job
12/05/20 09:45:19 INFO mapred.FileInputFormat: Total input paths to process : 4
12/05/20 09:45:19 INFO mapred.JobClient: Running job: job_201205190417_0005
12/05/20 09:45:20 INFO mapred.JobClient: map 0% reduce 0%
12/05/20 09:45:30 INFO mapred.JobClient: map 50% reduce 0%
12/05/20 09:45:31 INFO mapred.JobClient: map 100% reduce 0%
12/05/20 09:45:45 INFO mapred.JobClient: map 100% reduce 100%
12/05/20 09:45:47 INFO mapred.JobClient: Job complete: job_201205190417_0005
12/05/20 09:45:47 INFO mapred.JobClient: Counters: 18
12/05/20 09:45:47 INFO mapred.JobClient: Job Counters
12/05/20 09:45:47 INFO mapred.JobClient: Launched reduce tasks=1
12/05/20 09:45:47 INFO mapred.JobClient: Launched map tasks=4
12/05/20 09:45:47 INFO mapred.JobClient: Data-local map tasks=4
12/05/20 09:45:47 INFO mapred.JobClient: FileSystemCounters
12/05/20 09:45:47 INFO mapred.JobClient: FILE_BYTES_READ=94
12/05/20 09:45:47 INFO mapred.JobClient: HDFS_BYTES_READ=472
12/05/20 09:45:47 INFO mapred.JobClient: FILE_BYTES_WRITTEN=334
12/05/20 09:45:47 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=215
12/05/20 09:45:47 INFO mapred.JobClient: Map-Reduce Framework
12/05/20 09:45:47 INFO mapred.JobClient: Reduce input groups=8
12/05/20 09:45:47 INFO mapred.JobClient: Combine output records=0
12/05/20 09:45:47 INFO mapred.JobClient: Map input records=4
12/05/20 09:45:47 INFO mapred.JobClient: Reduce shuffle bytes=112
12/05/20 09:45:47 INFO mapred.JobClient: Reduce output records=0
12/05/20 09:45:47 INFO mapred.JobClient: Spilled Records=16
12/05/20 09:45:47 INFO mapred.JobClient: Map output bytes=72
12/05/20 09:45:47 INFO mapred.JobClient: Map input bytes=96
12/05/20 09:45:47 INFO mapred.JobClient: Combine input records=0
12/05/20 09:45:47 INFO mapred.JobClient: Map output records=8
12/05/20 09:45:47 INFO mapred.JobClient: Reduce input records=8
Job Finished in 28.952 seconds
Estimated value of Pi is 3.50000000000000000000
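To see where a figure like 3.5 can come from, here is a small local sketch of the sampling idea, assuming the example estimates π as 4 × (points inside the quarter circle) / (total points). With 4 maps × 2 samples there are only 8 points, and 7 of 8 landing inside gives exactly 3.5. The helper names pi_estimate and sample_inside are hypothetical, and the sampler uses plain pseudo-random points from awk, which is a stand-in and not necessarily how the Hadoop example generates its points.

```shell
#!/usr/bin/env bash
# Both helpers are hypothetical illustrations, not part of Hadoop.

# Estimate Pi from counts: 4 * inside / total.
pi_estimate() {
    awk -v inside="$1" -v total="$2" 'BEGIN { printf "%.4f\n", 4 * inside / total }'
}

# Draw N pseudo-random points in the unit square and print how many
# fall inside the quarter circle x^2 + y^2 <= 1. Optional 2nd arg: seed.
sample_inside() {
    awk -v n="$1" -v seed="${2:-1}" 'BEGIN {
        srand(seed)
        for (i = 0; i < n; i++) {
            x = rand(); y = rand()
            if (x * x + y * y <= 1) c++
        }
        print c + 0    # force a numeric 0 when no point was counted
    }'
}
```

For example, pi_estimate 7 8 reproduces the 3.5 reported above, while pi_estimate "$(sample_inside 100000)" 100000 shows how a larger sample count moves the estimate much closer to π.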
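The computed value of Pi is 3.5, which is in the right neighborhood given that the job used only 8 samples (4 maps × 2 samples each). I will not go through the log output here; it is worth studying once you have dug a little deeper into Hadoop. That wraps up the configuration of a three-node Hadoop cluster. Next, I will cover how to connect to Hadoop remotely from Windows and how to configure Eclipse for developing and debugging MapReduce programs.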
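I intend this to be a series: from cluster setup, to remote development and debugging from Windows, to reading and analyzing the source code. I am putting this on record here, so I have to follow through.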