Configure SSH (Hadoop needs SSH to manage its nodes).
Generate a key pair on every machine: ssh-keygen -t rsa -P ""
Master node:
hduser@master:~$ scp .ssh/authorized_keys root@slave1:/home/root
hduser@master:~$ scp .ssh/authorized_keys root@slave2:/home/root
Slave1 node:
root@Ubuntu:/home/hduser# chown -R hduser:hadoop authorized_keys
hduser@ubuntu:~$ cat .ssh/id_rsa.pub >> authorized_keys
hduser@ubuntu:~$ scp authorized_keys hduser@slave2:/home/hduser
Slave2 node:
hduser@ubuntu:~$ cat .ssh/id_rsa.pub >> authorized_keys
hduser@ubuntu:~$ cp authorized_keys .ssh/
hduser@ubuntu:~$ scp authorized_keys hduser@master:/home/hduser/.ssh
hduser@ubuntu:~$ scp authorized_keys hduser@slave1:/home/hduser/.ssh
This completes the SSH configuration.
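Before moving on, it is worth confirming that passwordless login actually works in the directions Hadoop will use (a quick check; hostnames follow the master/slave1/slave2 naming above, and the first connection to each host asks you to accept its key fingerprint):
hduser@master:~$ ssh master
hduser@master:~$ ssh slave1
hduser@master:~$ ssh slave2
Each command should open a shell without prompting for a password.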
5. Disable IPv6
vi /etc/sysctl.conf
Add the following to /etc/sysctl.conf:
# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Then reboot the machine.
Check whether IPv6 is disabled:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
If the value is 0, IPv6 is still enabled; if it is 1, it has been disabled.
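If a reboot is inconvenient, the settings in /etc/sysctl.conf can usually be applied immediately instead (behavior may vary slightly by distribution); re-run the check above afterwards:
$ sudo sysctl -p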
Alternatively, to keep just Hadoop from using IPv6, add the following to conf/hadoop-env.sh:
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
6. Hadoop installation:
Change the ownership of the files uploaded to /home/root to hduser (or simply upload them as the hduser user in the first place):
root@ubuntu:~# chown -R hduser:hadoop *.gz
(As the hduser user) extract the Hadoop tarball (hadoop-1.0.4.tar.gz) into /usr/local:
~$ sudo tar xzf hadoop-1.0.4.tar.gz
This produces an error:
hduser is not in the sudoers file. This incident will be reported.
Solution:
1) Find where the sudoers file lives
root@master:~# whereis sudoers
sudoers: /etc/sudoers.d /etc/sudoers /usr/share/man/man5/sudoers.5.gz
2) Add write permission
root@slave2:~# chmod u+w /etc/sudoers
3) Grant hduser sudo privileges
root@slave2:~# vi /etc/sudoers
Add the line: hduser ALL=(ALL:ALL) ALL
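After saving the file, it is a good idea to remove the write bit again (this assumes the chmod u+w workaround above; using visudo instead avoids touching the permissions at all and also checks the syntax of the edit):
root@slave2:~# chmod u-w /etc/sudoers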
Extract Hadoop:
Run the extraction again:
hduser@master:/usr/local$ sudo tar xzf hadoop-1.0.4.tar.gz
Rename the directory:
hduser@master:/usr/local$ sudo mv hadoop-1.0.4 hadoop
Change the ownership:
hduser@master:/usr/local$ sudo chown -R hduser:hadoop hadoop
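A quick optional check that the directory is in place and owned by hduser:hadoop before continuing:
hduser@master:/usr/local$ ls -ld hadoop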
Update $HOME/.bashrc (this must be done on every machine):
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/java/jdk1.6.0_43
# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
hadoop fs -cat $1 | lzop -dc | head -1000 | less
}
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
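To pick up these settings in the current shell and verify that the Hadoop binaries are on the PATH (assuming the paths above match your installation):
hduser@master:~$ source ~/.bashrc
hduser@master:~$ hadoop version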
7. Hadoop configuration:
1) Edit conf/hadoop-env.sh
Set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.6.0_43
2) On the master node only:
Change the contents of conf/masters to:
master
Change the contents of conf/slaves to:
master
slave1
slave2
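For reference, in Hadoop 1.x conf/masters names the host that runs the SecondaryNameNode, while conf/slaves lists the hosts that run the DataNode and TaskTracker daemons; listing master in conf/slaves means the master also acts as a worker. A quick sanity check of both files:
hduser@master:/usr/local/hadoop$ cat conf/masters conf/slaves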
3) Edit conf/*-site.xml on all nodes (each <property> element below goes inside the file's <configuration> element)
conf/core-site.xml (all machines)
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
conf/mapred-site.xml (all machines)
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
conf/hdfs-site.xml (all machines)
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
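Rather than editing the three files by hand on every node, one option is to push the master's copies out to the slaves (a sketch, assuming Hadoop is unpacked at /usr/local/hadoop on all nodes):
hduser@master:/usr/local/hadoop$ scp conf/*-site.xml hduser@slave1:/usr/local/hadoop/conf/
hduser@master:/usr/local/hadoop$ scp conf/*-site.xml hduser@slave2:/usr/local/hadoop/conf/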