Before configuring, create the following directories on the cluster's file system to hold the namespace and data information.
~/dfs/name
~/dfs/data
~/temp
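The three directories above can be created in one step. This is a minimal sketch that assumes they live under the current user's home directory, as the paths suggest. BASE is a helper variable introduced here for illustration only and is not part of Hadoop.

```shell
# BASE is a hypothetical helper variable; it defaults to the home directory.
BASE="${BASE:-$HOME}"

# -p creates missing parent directories and does not fail if a directory already exists.
mkdir -p "$BASE/dfs/name" "$BASE/dfs/data" "$BASE/temp"
```

The directories have to match the paths configured in core-site.xml and hdfs-site.xml, so adjust them if your configuration points elsewhere.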
Seven configuration files are involved:
~/hadoop-2.4.0/etc/hadoop/hadoop-env.sh
~/hadoop-2.4.0/etc/hadoop/yarn-env.sh
~/hadoop-2.4.0/etc/hadoop/slaves
~/hadoop-2.4.0/etc/hadoop/core-site.xml
~/hadoop-2.4.0/etc/hadoop/hdfs-site.xml
~/hadoop-2.4.0/etc/hadoop/mapred-site.xml
~/hadoop-2.4.0/etc/hadoop/yarn-site.xml
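Some of these files do not exist by default; create them by copying the corresponding template file (for example, mapred-site.xml from mapred-site.xml.template).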
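Copying a template can be wrapped in a small function so that an existing file is never overwritten. This is only a sketch: ensure_from_template is a hypothetical helper name, not a Hadoop command, and it assumes templates are named after the target file with a .template suffix.

```shell
# ensure_from_template is a hypothetical helper, not a Hadoop command.
# Usage: ensure_from_template <conf_dir> <file_name>
ensure_from_template() {
    conf_dir="$1"
    name="$2"
    # Leave an existing file untouched so earlier edits are not lost.
    if [ -f "$conf_dir/$name" ]; then
        return 0
    fi
    # Report an error if there is no template to copy from.
    if [ ! -f "$conf_dir/$name.template" ]; then
        echo "no template for $name in $conf_dir" >&2
        return 1
    fi
    cp "$conf_dir/$name.template" "$conf_dir/$name"
}
```

For the layout used here, the call would be: ensure_from_template ~/hadoop-2.4.0/etc/hadoop mapred-site.xml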
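~/etc/hadoop/hadoop-env.sh and yarn-env.sh
The original files set the Java environment with export JAVA_HOME=${JAVA_HOME}. If JAVA_HOME is not configured in your environment variables, set it here to point to your Java installation path.
For example: export JAVA_HOME="/usr/local/jdk"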
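One way to cover both cases is to keep an existing JAVA_HOME and fall back to a fixed path only when it is unset. This is a sketch under the assumption that the JDK is at /usr/local/jdk, the example path above; replace it with your own installation path.

```shell
# Keep an exported JAVA_HOME if there is one; otherwise use the example path from this guide.
export JAVA_HOME="${JAVA_HOME:-/usr/local/jdk}"

# Warn early if the path does not contain a java binary.
if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "warning: no java executable under $JAVA_HOME" >&2
fi
```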
~/etc/hadoop/slaves
This file lists all slave nodes, one hostname per line.
Write the following into it:
Slave1
Slave2
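The slaves file can also be written from a script, which helps when the same setup is repeated on several machines. This is a minimal sketch: write_slaves is a hypothetical helper name, and the hostnames Slave1 and Slave2 are the ones used above.

```shell
# write_slaves is a hypothetical helper; it overwrites the given file
# with one slave hostname per line.
write_slaves() {
    cat > "$1" <<'EOF'
Slave1
Slave2
EOF
}
```

For the layout used here, the call would be: write_slaves ~/hadoop-2.4.0/etc/hadoop/slaves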
~/etc/hadoop/core-site.xml
Add the following properties inside the configuration node:
<property>
<name>hadoop.tmp.dir</name>
<value>file:/opt/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://Master:9000</value>
</property>
Add the options for HttpFS:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
~/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/hadoop/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/hadoop/hdfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>Master:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
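After editing hdfs-site.xml, it can help to read the values back, because a misspelled property name is silently ignored and the default is used instead. The sketch below relies on the hdfs getconf -confKey command and assumes the Hadoop bin directory is on PATH; print_hdfs_conf is a hypothetical helper name.

```shell
# print_hdfs_conf is a hypothetical helper that prints the effective
# value of a few HDFS keys as seen by the local configuration.
print_hdfs_conf() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs not found on PATH; add the Hadoop bin directory to PATH first" >&2
        return 0
    fi
    for key in dfs.replication dfs.namenode.name.dir dfs.datanode.data.dir; do
        printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
    done
}

print_hdfs_conf
```

If a printed value differs from what was written in the file, check the spelling of the property name first.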
~/etc/hadoop/yarn-site.xml
<property>
<name>yarn.resourcemanager.address</name>
<value>Master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>Master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>Master:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>Master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>Master:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>