Installing and Configuring Single-Node Hadoop on Ubuntu

Environment: Ubuntu 10.10, Hadoop 0.20.2 (the release used in the commands below). First, some preparation:

1. I installed on the Desktop edition of Ubuntu, so an SSH server needs to be installed first. It is easy to find: search for "ssh" in Synaptic Package Manager and it is the first result (the openssh-server package).

2. Install the Sun JDK. Be sure to use Java 6 or later. First add the Canonical partner repository in the update manager.

Then run sudo apt-get update.

3. sudo apt-get install sun-java6-jdk

4. sudo update-java-alternatives -s java-6-sun
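To confirm the Sun JDK is now the default, a quick sanity check (not in the original steps) is to print the version:

```shell
# Should report a Sun/Oracle "java version 1.6.0_xx" line
java -version
```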

5. Add a dedicated group and user for running and accessing Hadoop:

sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop

6. Generate an SSH key pair and set up passwordless login for the hadoop user:

user@ubuntu:~$ su - hadoop
hadoop@ubuntu:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
9b:82:ea:58:b4:e0:35:d7:ff:19:66:a6:ef:ae:0e:d2 hadoop@ubuntu
The key's randomart image is:
[...snipp...]
hadoop@ubuntu:~$

hadoop@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
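If the login test that follows still prompts for a password, a common cause is overly permissive file modes on the key files; tightening them usually fixes it (an extra troubleshooting step, not in the original post):

```shell
# sshd refuses key-based login when these are group/world writable
chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys
```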

That completes the setup; test it:

hadoop@ubuntu:~$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is d7:87:25:47:ae:02:00:eb:1d:75:4f:bb:44:f9:36:26.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Linux ubuntu 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:27:30 UTC 2010 i686 GNU/Linux
Ubuntu 10.04 LTS
[...snipp...]
hadoop@ubuntu:~$

7. Disable IPv6:
Open /etc/sysctl.conf (root privileges required) and append the following to the file:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Reboot, then check whether the change took effect:

$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6

If it prints 1, IPv6 is disabled.
Next comes installing Hadoop itself.
The steps are: download, extract, and assign permissions.
Download the release tarball from the Apache site, then run:
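As an alternative to disabling IPv6 system-wide, Hadoop can be told to prefer IPv4 in its own JVMs only. This is a commonly documented option set in conf/hadoop-env.sh (not part of the original steps; use one approach or the other):

```shell
# Append to conf/hadoop-env.sh: make Hadoop's JVMs use IPv4 only
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
```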

$ cd /usr/local
$ sudo tar xzf hadoop-0.20.2.tar.gz
$ sudo mv hadoop-0.20.2 hadoop
$ sudo chown -R hadoop:hadoop hadoop

OK, the installation is done.
Now for configuration and startup.
The basic plan: set the JDK path (in conf/hadoop-env.sh), then configure core-site.xml, mapred-site.xml, and hdfs-site.xml.
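The JDK path is set in conf/hadoop-env.sh. Assuming the sun-java6-jdk package from step 3, the install path is typically the one below; verify it on your machine, e.g. with `update-java-alternatives -l`:

```shell
# conf/hadoop-env.sh: point Hadoop at the Sun JDK
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```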
First create a directory for Hadoop's data: mkdir /usr/local/hadoop-datastore
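The hadoop user needs write access to this directory, so it is worth creating it with the right ownership (a small addition to the step above):

```shell
# Create the data directory and hand it to the hadoop user
sudo mkdir -p /usr/local/hadoop-datastore
sudo chown -R hadoop:hadoop /usr/local/hadoop-datastore
```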
Open conf/core-site.xml and configure it as follows:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-datastore/</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>

conf/mapred-site.xml as follows:

<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
</configuration>

conf/hdfs-site.xml as follows:

<configuration>
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
</configuration>

OK, configuration is done.
Format HDFS:

hadoop@ubuntu:~$ /usr/local/hadoop/bin/hadoop namenode -format

Start HDFS and MapReduce:

hadoop@ubuntu:~$ /usr/local/hadoop/bin/start-all.sh

The script to stop the services:

hadoop@ubuntu:~$ /usr/local/hadoop/bin/stop-all.sh
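To confirm the daemons actually came up after start-all.sh, one quick check (not in the original post) is the JDK's jps tool; the web interfaces are another (for 0.20.x, the NameNode UI defaults to port 50070 and the JobTracker UI to 50030):

```shell
# Lists the running Java processes; after start-all.sh expect to see
# NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker
jps
```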
