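Standalone mode: simple to install and needs almost no configuration, but it is only suitable for debugging
Pseudo-distributed mode: the five daemons (NameNode, DataNode, JobTracker, TaskTracker, and SecondaryNameNode) all run on a single node, simulating the separate nodes of a distributed deployment
Fully distributed mode: a normal Hadoop cluster, made up of multiple nodes that each perform their own role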
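Installation environment
Platform: vmware2
Operating system: Oracle Linux 5.6
Software versions: hadoop-0.22.0, jdk-6u18
Cluster architecture: 3 nodes, one master node (gc) and two slave nodes (rac1, rac2)
Installation steps
1. Download Hadoop and the JDK:
For example: hadoop-0.22.0
2. Configure the hosts file
On every node (gc, rac1, rac2), edit /etc/hosts so that each node can resolve the other nodes' hostnames to IP addresses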
[root@gc ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.2.101 rac1.localdomain rac1
192.168.2.102 rac2.localdomain rac2
192.168.2.100 gc.localdomain gc
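Since every node needs the same three entries, a small check can catch a missing or mistyped name before the later steps depend on it. The following is only a sketch: the function name `check_hosts_file` is hypothetical, and it assumes whitespace-separated entries without trailing inline comments.

```shell
# check_hosts_file FILE HOST...
# Succeeds only if every HOST appears as a name or alias on a
# non-comment line of FILE; prints each missing name to stderr.
check_hosts_file() {
    file=$1
    shift
    missing=0
    for host in "$@"; do
        # Field 1 is the IP address; fields 2..NF are the names and aliases.
        if ! awk -v h="$host" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                END { exit found ? 0 : 1 }' "$file"; then
            echo "missing from $file: $host" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

On each node it could be called as `check_hosts_file /etc/hosts gc rac1 rac2`.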
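3. Create the Hadoop run account
Create the Hadoop run account on every node
[root@gc ~]# groupadd hadoop
[root@gc ~]# useradd -g hadoop grid    # be sure to specify the group here; otherwise SSH trust may fail to be established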
[root@gc ~]# id grid
uid=501(grid) gid=54326(hadoop) groups=54326(hadoop)
[root@gc ~]# passwd grid
Changing password for user grid.
New UNIX password:
BAD PASSWORD: it is too short
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
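Because the account has to be created the same way on every node, it may help to verify the primary group with a script rather than reading the `id` output by eye. This is a minimal sketch; `has_primary_group` is a hypothetical helper name, and it relies only on the standard `id -gn` option.

```shell
# has_primary_group USER GROUP
# Succeeds only if USER exists and its primary group is GROUP.
has_primary_group() {
    # id -gn prints the name of the user's primary group; it fails
    # if the user does not exist.
    actual=$(id -gn "$1" 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}
```

For the account created above, `has_primary_group grid hadoop && echo ok` would be run on each of gc, rac1, and rac2.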
4. Configure passwordless SSH login
Note: log in as the hadoop user and work in that user's home directory.
Perform the same operations below on every node
[hadoop@gc ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
54:80:fd:77:6b:87:97:ce:0f:32:34:43:d1:d2:c2:0d hadoop@gc.localdomain
[hadoop@gc ~]$ cd .ssh
[hadoop@gc .ssh]$ ls
id_rsa id_rsa.pub
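Append every node's public key (id_rsa.pub) to a single authorized_keys file and copy that file to all nodes; after that, the nodes can SSH to one another without a password.
All of this can be done from one node (gc):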
[hadoop@gc .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@gc .ssh]$ ssh rac1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'rac1 (192.168.2.101)' can't be established.
RSA key fingerprint is 19:48:e0:0a:37:e1:2a:d5:ba:c8:7e:1b:37:c6:2f:0e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'rac1,192.168.2.101' (RSA) to the list of known hosts.
hadoop@rac1's password:
[hadoop@gc .ssh]$ ssh rac2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'rac2 (192.168.2.102)' can't be established.
RSA key fingerprint is 19:48:e0:0a:37:e1:2a:d5:ba:c8:7e:1b:37:c6:2f:0e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'rac2,192.168.2.102' (RSA) to the list of known hosts.
hadoop@rac2's password:
[hadoop@gc .ssh]$ scp ~/.ssh/authorized_keys rac1:~/.ssh/authorized_keys
hadoop@rac1's password:
authorized_keys 100% 1213 1.2KB/s 00:00
[hadoop@gc .ssh]$ scp ~/.ssh/authorized_keys rac2:~/.ssh/authorized_keys
hadoop@rac2's password:
authorized_keys 100% 1213 1.2KB/s 00:00
[hadoop@gc .ssh]$ ll
总计 16
-rw-rw-r-- 1 hadoop hadoop 1213 10-30 09:18 authorized_keys
-rw------- 1 hadoop hadoop 1675 10-30 09:05 id_rsa
-rw-r--r-- 1 hadoop hadoop 403 10-30 09:05 id_rsa.pub
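The listing above shows authorized_keys as group-writable (-rw-rw-r--). With sshd's StrictModes setting enabled, which is the default, key authentication may be refused when that file is writable by group or others. The merge step can be sketched as a function that also removes duplicate keys and tightens the mode. The name `merge_pubkeys` is hypothetical, and the sketch assumes the public key files have already been collected locally.

```shell
# merge_pubkeys AUTH_FILE PUBKEY_FILE...
# Appends each public key to AUTH_FILE, drops blank and exact duplicate
# lines (keeping first-seen order), and restricts the file to its owner.
merge_pubkeys() {
    auth=$1
    shift
    tmp=$(mktemp) || return 1
    touch "$auth"
    # NF skips blank lines; !seen[$0]++ is true only the first time a line appears.
    if ! awk 'NF && !seen[$0]++' "$auth" "$@" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    cat "$tmp" > "$auth"
    rm -f "$tmp"
    chmod 600 "$auth"
}
```

A possible call on gc would be `merge_pubkeys ~/.ssh/authorized_keys ~/.ssh/id_rsa.pub rac1.pub rac2.pub`, where rac1.pub and rac2.pub are assumed local copies of the other nodes' id_rsa.pub files. Running it twice leaves the file unchanged, unlike repeated `>>` appends.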
-- Test each connection
[grid@gc .ssh]$ ssh rac1 date
2012年 11月 18日星期日 01:35:39 CST
[grid@gc .ssh]$ ssh rac2 date
2012年 10月 30日星期二 09:52:46 CST
-- As you can see, this step is the same as establishing SSH user equivalence when configuring Oracle RAC.
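The connection test can also be scripted so that a leftover password prompt is reported as a failure instead of blocking the terminal. This is a sketch under assumptions: `check_passwordless` is a hypothetical name and the 5-second timeout is arbitrary. `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
# check_passwordless HOST...
# Runs `date` on each HOST over ssh. BatchMode=yes makes ssh fail
# rather than prompt, so a password (or unknown host key) prompt
# shows up as a failure.
check_passwordless() {
    failed=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" date; then
            echo "ok: $host"
        else
            echo "FAILED: $host" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Running `check_passwordless gc rac1 rac2` from each node in turn would cover every pair. A host whose key is not yet in known_hosts is also expected to fail in batch mode, so connect to it interactively once first.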