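This is the configuration for MapReduce jobs. Because Hadoop 2.x uses the YARN framework, a distributed deployment requires the mapreduce.framework.name property to be set to yarn. mapred.map.tasks and mapred.reduce.tasks set the number of map and reduce tasks, respectively (in Hadoop 2.x these are deprecated aliases of mapreduce.job.maps and mapreduce.job.reduces).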
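Once mapred-site.xml has been edited as in this step, the configured value can be spot-checked from the shell. The following is only a minimal sketch and not part of the original procedure: get_prop is a hypothetical helper, and it assumes that every <name> and <value> element sits on its own line in the file.

```shell
# get_prop FILE NAME (hypothetical helper, not a Hadoop tool):
# print the <value> that follows <name>NAME</name> in a Hadoop *-site.xml.
# Assumption: every <name> and <value> element is on its own line.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")      # drop everything up to the opening tag
            sub(/<\/value>.*/, "")    # drop the closing tag and what follows
            print
            exit
        }
    ' "$1"
}

# Example call, run from the Hadoop configuration directory:
#   get_prop mapred-site.xml mapreduce.framework.name
```

If the property is missing, the helper prints nothing, which makes a forgotten or misspelled name easy to spot.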
[hadoop@linux-node1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>linux-node1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>linux-node1:19888</value>
  </property>
</configuration>

(7) Configure yarn-site.xml on the node
# This file holds the configuration for the YARN framework
<?xml version="1.0"?>
<!-- mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx400m</value>
    <!-- Not marked as final so jobs can include JVM debugging options -->
  </property>
</configuration>

<?xml version="1.0"?>
<!-- yarn-site.xml -->
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>linux-node1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>linux-node1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>linux-node1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>linux-node1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>linux-node1:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
</configuration>

7. Copy Hadoop to the other nodes
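The per-node copies in this step can also be written as a loop over the node addresses. This is a minimal sketch that assumes the same source and destination paths as the commands in this step; copy_hadoop and the DRY_RUN variable are hypothetical names introduced only for illustration.

```shell
# copy_hadoop HOST... (hypothetical helper):
# copy the Hadoop directory to each host given as an argument.
copy_hadoop() {
    src=/home/hadoop/hadoop/
    dest=/home/hadoop/
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # DRY_RUN=1 only prints the command that would be run.
            echo "scp -r $src $host:$dest"
        else
            # Stop at the first failed copy so it does not go unnoticed.
            scp -r "$src" "$host:$dest" || return 1
        fi
    done
}

# Example: preview first, then copy for real.
#   DRY_RUN=1; copy_hadoop 192.168.0.90 192.168.0.91 192.168.0.92
#   DRY_RUN=0; copy_hadoop 192.168.0.90 192.168.0.91 192.168.0.92
```

Previewing the commands first is a cheap way to catch a mistyped address before anything is transferred.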
scp -r /home/hadoop/hadoop/ 192.168.0.90:/home/hadoop/
scp -r /home/hadoop/hadoop/ 192.168.0.91:/home/hadoop/
scp -r /home/hadoop/hadoop/ 192.168.0.92:/home/hadoop/

8. On linux-node1, initialize the NameNode as the hadoop user
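Formatting a NameNode that already holds metadata discards that metadata, so a guard in front of the format command can be useful. The sketch below relies on the fact that a formatted name directory contains a current/VERSION file. The path /home/hadoop/dfs/name is an assumption and should match the value of dfs.namenode.name.dir in hdfs-site.xml; namenode_is_formatted is a hypothetical helper.

```shell
# namenode_is_formatted DIR (hypothetical helper):
# succeed if DIR looks like an already formatted NameNode directory.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example (assumed path; adjust to the value of dfs.namenode.name.dir):
#   if namenode_is_formatted /home/hadoop/dfs/name; then
#       echo "NameNode already formatted, skipping format"
#   else
#       /home/hadoop/hadoop/bin/hdfs namenode -format
#   fi
```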
/home/hadoop/hadoop/bin/hdfs namenode -format
# echo $?
# sudo yum -y install tree
# tree /home/hadoop/dfs

9. Start Hadoop
/home/hadoop/hadoop/sbin/start-dfs.sh
/home/hadoop/hadoop/sbin/stop-dfs.sh    # stops HDFS again; run only to shut it down
# Check the processes on the NameNode node
ps aux | grep --color namenode
# Check the processes on a DataNode
ps aux | grep --color datanode

10. Start the YARN distributed computing framework
[hadoop@linux-node1 .ssh]$ /home/hadoop/hadoop/sbin/start-yarn.sh
starting yarn daemons
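The ps checks used for HDFS above work for the YARN daemons as well. This is a minimal sketch, assuming the ResourceManager runs on linux-node1 (as the yarn-site.xml addresses suggest) and the NodeManagers run on the worker nodes; check_daemons is a hypothetical helper that searches a process listing passed in as text.

```shell
# check_daemons LISTING NAME... (hypothetical helper):
# report whether each NAME appears (case-insensitively) in LISTING,
# and return non-zero if at least one NAME is missing.
check_daemons() {
    listing=$1
    shift
    missing=0
    for name in "$@"; do
        if printf '%s\n' "$listing" | grep -qi -- "$name"; then
            echo "$name: running"
        else
            echo "$name: NOT found"
            missing=1
        fi
    done
    return "$missing"
}

# Example, on linux-node1:
#   check_daemons "$(ps aux)" resourcemanager
# Example, on a worker node:
#   check_daemons "$(ps aux)" nodemanager
```

Capturing the listing before searching it keeps the grep process itself out of the results.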