Hadoop Series: Deploying Hadoop 0.20.1 on Linux

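The setup uses two test virtual machines running RHEL 5.3 x64, each with the latest JDK installed and passwordless SSH login configured correctly.
Server 1: 192.168.56.101 dev1
Server 2: 192.168.56.102 dev2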
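
Both machines need to resolve the names dev1 and dev2. The original does not say how the names are resolved, so the following is only a minimal sketch that assumes /etc/hosts is used; the helper name add_host_entry is hypothetical.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already present as a host name on a non-comment line.
# Usage: add_host_entry FILE IP NAME
add_host_entry() {
    file=$1 ip=$2 name=$3
    if ! awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
         ' "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}
```

Run as root on both machines, this would be called as add_host_entry /etc/hosts 192.168.56.101 dev1 and add_host_entry /etc/hosts 192.168.56.102 dev2. Afterwards, running ssh dev2 hostname from dev1 should print the remote host name without asking for a password.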

Download hadoop-0.20.1.tar.gz and copy it to the /usr/software/hadoop directory on dev1. Log in to dev1 and run the following commands:

# cd /usr/software/hadoop
# tar zxvf hadoop-0.20.1.tar.gz
# cp -a hadoop-0.20.1 /usr/hadoop
# cd /usr/hadoop/conf

Edit the Hadoop environment configuration file hadoop-env.sh:
# vi hadoop-env.sh

Add the following line:
export JAVA_HOME=/usr/java/jdk1.6.0_16
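
The same change can be made without opening an editor, which helps when the step has to be repeated on several machines. A minimal sketch; set_java_home is a hypothetical helper, not something shipped with Hadoop.

```shell
# Hypothetical helper: make FILE contain exactly one uncommented
# "export JAVA_HOME=DIR" line, leaving all other lines untouched.
# Usage: set_java_home FILE DIR
set_java_home() {
    file=$1 dir=$2
    tmp="$file.tmp.$$"
    # Drop any existing uncommented JAVA_HOME export (commented ones stay).
    grep -v '^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=' "$file" > "$tmp" || true
    printf 'export JAVA_HOME=%s\n' "$dir" >> "$tmp"
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For the layout used in this article the call would be set_java_home /usr/hadoop/conf/hadoop-env.sh /usr/java/jdk1.6.0_16. Calling it twice leaves a single export line.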

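Edit Hadoop's main configuration file, core-site.xml:
# vi core-site.xml
Add the following content (adjust the values to your needs; note that in Hadoop 0.20 the dfs.* properties are conventionally placed in hdfs-site.xml rather than core-site.xml):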

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://dev1</value>
    <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>/usr/hadoop/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. </description>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>/usr/hadoop/filesystem/data</value>
    <description>
      Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
    </description>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
  </property>
</configuration>
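
To double-check the values after editing, a small helper can pull a single property out of the file. This is a minimal sketch: get_property is a hypothetical name, not part of Hadoop, and it assumes each <name> and <value> element sits on one line, as in the listing above.

```shell
# Hypothetical helper: print the <value> that follows <name>NAME</name> in FILE.
# Prints nothing if the property is not defined.
# Usage: get_property FILE NAME
get_property() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}
```

For example, get_property /usr/hadoop/conf/core-site.xml fs.default.name should print hdfs://dev1 for the file shown above.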

Edit Hadoop's mapred-site.xml file:
# vi mapred-site.xml

Add the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>dev1:9001</value>
    <description>
      The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and
      reduce task.
    </description>
  </property>

</configuration>
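Edit the masters file. Despite its name, in Hadoop 0.20 this file lists the host that runs the secondary namenode; the namenode itself starts on the machine where start-all.sh is run:
# vi masters
Add the following line: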
dev1
Edit the slaves file, which lists the hosts that run a datanode (and a tasktracker):
# vi slaves
Add the following line:
dev2
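
Both files are plain lists with one host name per line, so they can also be written from a script. A minimal sketch; write_host_list is a hypothetical helper name.

```shell
# Hypothetical helper: overwrite FILE with one host name per line.
# Usage: write_host_list FILE HOST...
write_host_list() {
    file=$1
    shift
    if [ $# -eq 0 ]; then
        echo "write_host_list: no hosts given" >&2
        return 1
    fi
    printf '%s\n' "$@" > "$file"
}
```

With the layout used here, the two calls would be write_host_list /usr/hadoop/conf/masters dev1 and write_host_list /usr/hadoop/conf/slaves dev2. Further slave hosts can be appended as extra arguments.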

Install Hadoop on dev2 by following the same steps.
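
As an alternative to repeating every step by hand, the configured directory could be copied from dev1 to each slave. The sketch below only prints the copy commands so they can be reviewed first. It assumes the JDK lives at the same path on every machine and that the Hadoop directory is /usr/hadoop as above; print_sync_commands is a hypothetical helper name.

```shell
# Hypothetical helper: print (not run) one scp command per host listed in
# SLAVES_FILE. Blank lines and lines starting with # are skipped.
# Usage: print_sync_commands SLAVES_FILE HADOOP_DIR
print_sync_commands() {
    slaves=$1 dir=$2
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r host || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;
        esac
        printf 'scp -r %s %s:%s\n' "$dir" "$host" "$(dirname "$dir")"
    done < "$slaves"
}
```

Running print_sync_commands /usr/hadoop/conf/slaves /usr/hadoop on dev1 would print scp -r /usr/hadoop dev2:/usr for the slaves file shown above.
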
Format the namenode on dev1:
# cd /usr/hadoop/bin
# ./hadoop namenode -format
Installation and configuration are now complete.
Run the following commands on dev1 to start Hadoop:
# cd /usr/hadoop/bin
# ./start-all.sh
Once startup has finished, run the following command to check the basic status of the cluster:
# ./hadoop dfsadmin -report
Alternatively, open :50070/dfshealth.jsp (the namenode web interface) in a browser.
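
Another quick check is to list the running Java processes with jps, which ships with the JDK. With the masters and slaves files above, dev1 is expected to run NameNode, SecondaryNameNode and JobTracker, and dev2 is expected to run DataNode and TaskTracker. A minimal sketch; missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read `jps` output ("PID ClassName" per line) on stdin
# and print each expected daemon that is not running.
# Returns non-zero if at least one daemon is missing.
# Usage: jps | missing_daemons NameNode SecondaryNameNode JobTracker
missing_daemons() {
    running=$(awk '{ print $2 }')
    status=0
    for d in "$@"; do
        # -x matches whole lines only, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$running" | grep -qxF "$d"; then
            echo "$d"
            status=1
        fi
    done
    return $status
}
```

On dev2 the call would be jps | missing_daemons DataNode TaskTracker. No output and a zero exit status mean both daemons are up.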
