Hadoop Installation and Configuration
[Date: 2013-05-15] Source: Linux community Author: jsjwk
Pseudo-distributed mode (a cluster mode that runs on a single machine)
============================================
7. Configure Hadoop
---------------------------------------
Configuration for Hadoop 0.23.6
(1) Edit hadoop/etc/hadoop/yarn-env.sh and add at the top:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_17
export HADOOP_PREFIX=/opt/apps/hadoop
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
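Before editing the env scripts it is worth confirming that the JDK path actually exists; a minimal sketch (the path is the one used throughout this guide; adjust it to your install):

```shell
# Sanity check (sketch): does JAVA_HOME point at a usable JDK?
# Falls back to the path used in this guide if JAVA_HOME is unset.
JAVA_HOME=${JAVA_HOME:-/usr/lib/jvm/jdk1.7.0_17}
if [ -x "$JAVA_HOME/bin/java" ]; then
  java_ok=yes
else
  java_ok=no
fi
echo "JAVA_HOME=$JAVA_HOME usable: $java_ok"
```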
(2) Edit libexec/hadoop-config.sh and add:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_17
Then link hadoop-env.sh to yarn-env.sh (in the same directory as yarn-env.sh) and create the temp directory:
ln -s yarn-env.sh hadoop-env.sh
mkdir -p /opt/apps/hadoop_tmp/hadoop-root
(3) Edit hadoop/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:54310/</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/apps/hadoop/hadoop-root</value>
</property>
<property>
<name>fs.arionfs.impl</name>
<value>org.apache.hadoop.fs.pvfs2.Pvfs2FileSystem</value>
<description>The FileSystem for arionfs.</description>
</property>
</configuration>
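The same file can be written from a heredoc; a sketch that defaults CONF_DIR to a temp directory for illustration (point it at hadoop/etc/hadoop on a real install). The fs.arionfs.impl property is omitted here since it is only needed for the PVFS2 filesystem:

```shell
# Sketch: write core-site.xml from a heredoc. CONF_DIR defaults to a temp
# directory for illustration; on a real install use /opt/apps/hadoop/etc/hadoop.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:54310/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/apps/hadoop/hadoop-root</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/core-site.xml"
```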
(4) Edit hadoop/etc/hadoop/hdfs-site.xml and add inside <configuration>:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/apps/hadoop_space/dfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/apps/hadoop_space/dfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
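Hadoop will normally create the name and data directories itself, but pre-creating them surfaces permission problems early. A sketch (BASE defaults to a temp directory for illustration; use /opt/apps/hadoop_space as in the hdfs-site.xml above on a real install):

```shell
# Sketch: pre-create the dfs.namenode.name.dir / dfs.datanode.data.dir paths.
# BASE defaults to a temp directory for illustration only.
BASE=${BASE:-$(mktemp -d)}
mkdir -p "$BASE/dfs/name" "$BASE/dfs/data"
echo "created: $(ls "$BASE/dfs" | tr '\n' ' ')"
```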
(5) Edit hadoop/etc/hadoop/mapred-site.xml and add inside <configuration>:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.job.tracker</name>
<value>hdfs://localhost:9001</value>
<final>true</final>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1536</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024M</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2560M</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>50</value>
</property>
<property>
<name>mapreduce.system.dir</name>
<value>file:/opt/apps/hadoop_space/mapred/system</value>
</property>
<property>
<name>mapreduce.local.dir</name>
<value>file:/opt/apps/hadoop_space/mapred/local</value>
<final>true</final>
</property>
(6) Edit hadoop/etc/hadoop/yarn-site.xml and add inside <configuration>:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>user.name</name>
<value>hadoop</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:54311</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:54312</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:54313</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:54314</value>
</property>
<property>
<name>yarn.web-proxy.address</name>
<value>localhost:54315</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost</value>
</property>
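Since this config pins the ResourceManager services to fixed ports (54311-54315), a quick pre-flight check that none of them is already bound can save a confusing startup failure. A sketch (uses `ss` where available; on older systems `netstat -ltn` serves the same purpose):

```shell
# Sketch: check that the ResourceManager ports configured above are free.
# Prints "unknown" when ss is not installed.
for port in 54311 54312 54313 54314 54315; do
  if command -v ss >/dev/null 2>&1; then
    if ss -ltn 2>/dev/null | grep -q ":$port "; then
      echo "port $port: in use"
    else
      echo "port $port: free"
    fi
  else
    echo "port $port: unknown (ss not found)"
  fi
done
```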
---------------------------------------
Configuration for Hadoop 1.0.4
mkdir -p /opt/apps/hadoop_tmp/hadoop-root
(1) Edit hadoop/conf/hadoop-env.sh and change the commented-out JAVA_HOME line to:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_17
(2) Edit hadoop/conf/core-site.xml and add inside <configuration>:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>HDFS address of the master; this determines the namenode</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/apps/hadoop_tmp/hadoop-root</value>
<description>The main Hadoop temp directory; other directories default to paths under it</description>
</property>
(3) Edit hadoop/conf/hdfs-site.xml and add inside <configuration>:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>Disable permission checking</description>
</property>
(4) Edit hadoop/conf/mapred-site.xml and add inside <configuration>:
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
---------------------------------------
Create the directory:
mkdir -p /opt/apps/hadoop_tmp/hadoop-root/dfs/name
8. Format the namenode (required on the first run)
From the hadoop directory, format the namenode:
hadoop namenode -format
9. Start Hadoop
---------------------------------------
Hadoop 0.23.6
In /opt/apps/hadoop/sbin:
./start-dfs.sh
./start-yarn.sh
---------------------------------------
Hadoop 1.0.4
In /opt/apps/hadoop/bin:
./start-all.sh
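Either way, it is worth checking that the daemons actually came up: `jps` (shipped with the JDK) lists running JVMs, and after a successful start it should show NameNode, DataNode, and either ResourceManager/NodeManager (0.23.6) or JobTracker/TaskTracker (1.0.4). A sketch that just counts them:

```shell
# Sketch: count Hadoop daemons among running JVMs. jps comes with the JDK;
# if it is missing, the count falls back to 0.
daemons=$( { jps 2>/dev/null || true; } \
  | grep -cE 'NameNode|DataNode|ResourceManager|NodeManager|JobTracker|TaskTracker' )
echo "hadoop daemons running: $daemons"
```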
------------------------------
PS:
If startup fails, the start order may be wrong. Clean up and try again:
rm -rf /opt/apps/hadoop/hadoop-root
rm -rf /opt/apps/hadoop_space/*
Kill all Hadoop processes, then restart.
------------------------------
Web UIs:
:50030 (MapReduce web UI)
:50070 (HDFS web UI)
Testing:
Show HDFS command-line usage:
hdfs dfs -help
List files in HDFS:
hdfs dfs -ls
Create a directory under the HDFS root:
hdfs dfs -mkdir /firstTest
Copy a local file into that directory:
hdfs dfs -copyFromLocal test.txt /firstTest
Run a small demo:
hadoop jar hadoop-mapreduce-examples-0.23.6.jar wordcount /firstTest result
Check the result: hdfs dfs -cat result/part-r-00000
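For reference, wordcount's output is one `word<TAB>count` line per distinct word. That format can be emulated locally with a shell pipeline (a sketch of the output shape, not how Hadoop computes it):

```shell
# Sketch: emulate wordcount's output format on a two-line sample input.
wc_out=$(printf 'hello world\nhello hadoop\n' \
  | tr ' ' '\n' | sort | uniq -c | awk '{print $2 "\t" $1}')
printf '%s\n' "$wc_out"
# prints:
# hadoop  1
# hello   2
# world   1
```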