Chapter 5: Fully Distributed Deployment of Hadoop 3.3.1 on CentOS

I. Hadoop Environment Preparation

1. Cluster plan

Hostname    IP           HDFS                          YARN
hadoop102   10.0.0.102   NameNode, DataNode            NodeManager
hadoop103   10.0.0.103   DataNode                      ResourceManager, NodeManager
hadoop104   10.0.0.104   DataNode, SecondaryNameNode   NodeManager
#1. Notes:
    1) Do not install the NameNode and the SecondaryNameNode on the same server.
    2) The ResourceManager also consumes a lot of memory; do not place it on the same machine as the NameNode or the SecondaryNameNode.

#2. Configuration files
Hadoop has two kinds of configuration files: default files and site-specific (custom) files. You only edit a site-specific file, changing the relevant property, when you want to override a default value.

    1) Default configuration files:

    Default file          Location inside the Hadoop jars
    core-default.xml      hadoop-common-3.3.1.jar/core-default.xml
    hdfs-default.xml      hadoop-hdfs-3.3.1.jar/hdfs-default.xml
    yarn-default.xml      hadoop-yarn-common-3.3.1.jar/yarn-default.xml
    mapred-default.xml    hadoop-mapreduce-client-core-3.3.1.jar/mapred-default.xml

    2) Site-specific configuration files:
    core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml are stored in $HADOOP_HOME/etc/hadoop and can be edited to suit the project.

2. Configure hostname mapping

#1. Edit the hosts file on hadoop102
    [root@hadoop102 ~]# vim /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    10.0.0.102 hadoop102
    10.0.0.103 hadoop103
    10.0.0.104 hadoop104

#2. Copy the hosts file from hadoop102 to hadoop103
    [root@hadoop102 ~]# scp /etc/hosts root@hadoop103:/etc/hosts
    root@hadoop103's password:
    hosts 100% 222 1.5KB/s 00:00

#3. Copy the hosts file from hadoop102 to hadoop104
    [root@hadoop102 ~]# scp /etc/hosts root@hadoop104:/etc/hosts
    root@hadoop104's password:
    hosts 100% 222 108.8KB/s 00:00

3. Create the deployment user

#1. Create the user (the account name is spelled "delopy" throughout this chapter)
    [root@hadoop102 ~]# useradd delopy
    [root@hadoop103 ~]# useradd delopy
    [root@hadoop104 ~]# useradd delopy

#2. Grant sudo privileges
    [root@hadoop102 ~]# vim /etc/sudoers
    ## Allow root to run any commands anywhere
    root ALL=(ALL) ALL
    delopy ALL=(ALL) ALL

#3. Copy the sudoers file to hadoop103
    [root@hadoop102 ~]# scp /etc/sudoers root@hadoop103:/etc/sudoers
    root@hadoop103's password:
    sudoers 100% 4356 1.0MB/s 00:00

#4. Copy the sudoers file to hadoop104
    [root@hadoop102 ~]# scp /etc/sudoers root@hadoop104:/etc/sudoers
    root@hadoop104's password:
    sudoers 100% 4356 769.0KB/s 00:00

#5. Create the program and data directories (shown for hadoop102; the other two machines need the same directories with the same owner)
    [root@hadoop102 ~]# mkdir /data/
    [root@hadoop102 ~]# mkdir /opt/module
    [root@hadoop102 ~]# chown -R delopy:delopy /data/
    [root@hadoop102 ~]# chown -R delopy:delopy /opt/module/

II. Passwordless SSH Login

1. Generate a key pair (all machines)

#1. Set the user's password (as root)
    [root@hadoop102 ~]# passwd delopy
    Changing password for user delopy.
    New password:
    BAD PASSWORD: The password is shorter than 8 characters
    Retype new password:
    passwd: all authentication tokens updated successfully.

#2. Switch to the delopy user
    [root@hadoop102 ~]# su delopy

#3. Generate the key pair; press Enter at every prompt
    [delopy@hadoop102 ~]$ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/delopy/.ssh/id_rsa):
    Created directory '/home/delopy/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/delopy/.ssh/id_rsa.
    Your public key has been saved in /home/delopy/.ssh/id_rsa.pub.
    The key fingerprint is:
    SHA256:8gV808AJIHCQTE8uEkUPuCn16A8IuSQrMfQSf2CBBEc delopy@hadoop102
    The key's randomart image is:
    +---[RSA 2048]----+
    |=OE== ...o.. |
    |.*+X . . oo |
    |o=*o= o o . |
    |X+.+.. o . |
    |=B. . . S . |
    |= o o . |
    |. o . |
    | . |
    | |
    +----[SHA256]-----+

2. View the keys (all machines)

#1. List the generated key pair
    [delopy@hadoop102 ~]$ cd ~/.ssh
    [delopy@hadoop102 ~/.ssh]$ ll
    total 8
    -rw------- 1 delopy delopy 1679 2021-08-31 14:59 id_rsa
    -rw-r--r-- 1 delopy delopy 398 2021-08-31 14:59 id_rsa.pub

#2. View the public keys
    [root@hadoop102 ~/.ssh]# cat id_rsa.pub
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCebmCdoFk9XrT5AVoNJFlhwoYArJY80BU9JyNwwXziR6NjuTrS4pzENBwx/Lbq0/qMI/PdZMMdiBYhpZTL/DkZyDoRf+2zRzPNQUMvTrK3bjIH4CAs3L7qSrkGICeaWQ9PIJwaRqF2yPS16qFTnq8aAimz08UiGzLfhGUHiEA+QF8usoe3titLXQ9fguRxyCfigdCEeq+xhPVuDpXCNoi6Woh4mnegGoVtJWgguFG0DU1gfUGckl0oKHM4ZbVBaQWTmQjHUKgvwwlXAO4gZ3qkVcGzMxfcc0P/OMqojYEbD5n/RFiMbN8ylCJt6QjOj23NzTG/LTNFFRbDfbLRhhm1 root@hadoop102

    [delopy@hadoop102 ~/.ssh]$ cat id_rsa.pub
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7WKwWyb3lliFTQ1HPxZS63NvFLPYYiLovVhspMCWkiRKrgIGXB++tBRi2vJvLpLyMOpJVRc0hIUD2ycBgHuWLtWYNqma/1xzeIu67OrsK+v8+CeTCzqZ97DPp881Uu+4SoVQOkla7evpH40DOibvKd7SN8L7Mk+PEsVCeIrNyA/g2iZ9+M+XWaZIIYJb15QRPZLcgj1GHcR0cf6DtuTt26pCVimSYJ8DOYNNfHfwWKyJfBKKaQUX3ByYDbKIIH+yw3VbLgyU3v9oseYCA5psqeuD0YLuERrr45rydNRL7/oeoW2NicHSG2V1H6KBQBq861HcdbmcE2nbZtWrAsKpv delopy@hadoop102

    [delopy@hadoop103 ~/.ssh]$ cat id_rsa.pub
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDLlKkbKYyIpUpwYLRBIqLhhU2YYb9o1dafpNwR8IkIj6rDBc2OzD1fqdzSQSpHX8LXShDTv2nr4R++SG1MabwqJ4q7JKwmZRSjuy/flQK0uhtSW6rPNqZX3P8Tl8rSqUMInOwwna9qCZTI8gajPrXRHAJ+oKRWWtGQ3M6t6larC4tXSoFQ4nBkPEgXUFnYphX1mYJiD0QduUXZwK7IMzFXPP/SkW+PddepFlsV2gTf2xCsLh7RHhsh0zWThkJGqLb6nPbIjOydQ84C3Z5DusAxOqlvuQk2FKpOQrB0dAgtHog7Oc/1vJqAMRe6MPdzaExl+OIEW2Xh8jJf9JWSkcs3 delopy@hadoop103

    [delopy@hadoop104 ~/.ssh]$ cat id_rsa.pub
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC2EXMXB9V4f86vRhD2cHhZEd+gqatotEy9HkwKfajelPgH1KcD4jepM7h+RmutGj+QfCSE/fj56GuebjHFJmB8eB1X5wZ0B3lBbz+KV/bNB7IAHvEWn7KG6nkdkzT47zLJrWVY6zxS0BMW86WF4wNGeyHq4R3XZnRxEW/LJ/ZjENpJkh7X2Om2H6d+tq8WjBSCvlidSB8WlG+OAnLxk/rVUaUdRmBTXqBUhcWqIsD+vMaa/rESxvXbrn/0pl83ZVguRpbNPHbpEPvUujBn/FPSvwv0DN9JEB+v+AzOQADJvT+2mDI/FDzCPpashoeSN31p1vdgXJUQEsBaIlxrm94H delopy@hadoop104

3. Configure passwordless SSH (all machines)

#1. Create a new file authorized_keys and add all four public keys listed in the previous step, one per line (root@hadoop102, delopy@hadoop102, delopy@hadoop103, delopy@hadoop104)
    [delopy@hadoop102 ~/.ssh]$ vim authorized_keys

#2. Set the file permissions to 600
    [delopy@hadoop102 ~/.ssh]$ chmod 600 authorized_keys

#3. What the files in ~/.ssh are for
    known_hosts       public keys of the hosts this machine has connected to over SSH
    id_rsa            the generated private key
    id_rsa.pub        the generated public key
    authorized_keys   public keys that are allowed to log in to this server without a password

4. Test passwordless SSH login (all machines)

#1. Log in to hadoop102 over SSH
    [delopy@hadoop102 ~/.ssh]$ ssh hadoop102
    The authenticity of host 'hadoop102 (10.0.0.102)' can't be established.
    ECDSA key fingerprint is SHA256:g6buQ4QMSFl+5MMAh8dTCmLtkIfdT8sgRFYc6uCzV3c.
    ECDSA key fingerprint is MD5:5f:d7:ad:07:e8:fe:d2:49:ec:79:2f:d4:91:59:c5:03.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'hadoop102,10.0.0.102' (ECDSA) to the list of known hosts.
    Last login: Tue Aug 31 15:21:35 2021
    [delopy@hadoop102 ~]$ logout
    Connection to hadoop102 closed.

#2. Log in to hadoop103 over SSH
    [delopy@hadoop102 ~/.ssh]$ ssh hadoop103
    The authenticity of host 'hadoop103 (10.0.0.103)' can't be established.
    ECDSA key fingerprint is SHA256:g6buQ4QMSFl+5MMAh8dTCmLtkIfdT8sgRFYc6uCzV3c.
    ECDSA key fingerprint is MD5:5f:d7:ad:07:e8:fe:d2:49:ec:79:2f:d4:91:59:c5:03.
    Are you sure you want to continue connecting (yes/no)? yes
    There were 16 failed login attempts since the last successful login.
    Last login: Tue Aug 31 14:58:54 2021
    [delopy@hadoop103 ~]$ logout
    Connection to hadoop103 closed.

#3. Log in to hadoop104 over SSH
    [delopy@hadoop102 ~/.ssh]$ ssh hadoop104
    The authenticity of host 'hadoop104 (10.0.0.104)' can't be established.
    ECDSA key fingerprint is SHA256:g6buQ4QMSFl+5MMAh8dTCmLtkIfdT8sgRFYc6uCzV3c.
    ECDSA key fingerprint is MD5:5f:d7:ad:07:e8:fe:d2:49:ec:79:2f:d4:91:59:c5:03.
    Are you sure you want to continue connecting (yes/no)? yes
    Last failed login: Tue Aug 31 15:12:11 CST 2021 from 10.0.0.102 on ssh:notty
    There were 4 failed login attempts since the last successful login.
    Last login: Tue Aug 31 15:01:13 2021
    [delopy@hadoop104 ~]$ logout
    Connection to hadoop104 closed.

III. Writing the Cluster Distribution Script xsync

1. scp (secure copy)

#1. What scp does
scp copies data from one server to another (from server1 to server2).

#2. Basic syntax
    scp      -r         $pdir/$fname            $user@$host:$pdir/$fname
    command  recursive  source path/file name   destination user@host:destination path/file name

2. rsync, the remote synchronization tool

#1. What rsync does
rsync is mainly used for backups and mirroring. It is fast, avoids copying content that is already identical, and supports symbolic links.
Difference between rsync and scp: rsync only transfers files that differ, so repeated copies are faster than with scp, which copies every file every time.

#2. Basic syntax
    rsync    -av      $pdir/$fname            $user@$host:$pdir/$fname
    command  options  source path/file name   destination user@host:destination path/file name

    Options:
    -a  archive mode
    -v  show the copy progress

3. Requirements analysis

#1. Requirement: copy files to the same directory on every node in a loop

#2. Analysis:
    1) The raw rsync command:
       rsync -av /opt/module delopy@hadoop103:/opt/
    2) Desired usage:
       xsync <name of the file to synchronize>
    3) The script should work from any path, so it goes into a directory that is on the global PATH:
       [delopy@hadoop102 ~]$ echo $PATH
       /usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin

4. Write the xsync distribution script

#1. Create the xsync file in /home/delopy/bin
    [delopy@hadoop102 ~]$ mkdir bin
    [delopy@hadoop102 ~]$ cd bin/
    [delopy@hadoop102 ~/bin]$ vim xsync
    #!/bin/bash
    # 1. Check the number of arguments
    if [ $# -lt 1 ]
    then
        echo "Not Enough Arguments!"
        exit 1
    fi
    # 2. Loop over every machine in the cluster
    for host in hadoop102 hadoop103 hadoop104
    do
        echo "==================== $host ===================="
        # 3. Loop over every file or directory given and send them one by one
        for file in "$@"
        do
            # 4. Check that the file exists
            if [ -e "$file" ]
            then
                # 5. Resolve the parent directory (following symlinks)
                pdir=$(cd -P "$(dirname "$file")"; pwd)
                # 6. Get the name of the file itself
                fname=$(basename "$file")
                ssh "$host" "mkdir -p $pdir"
                rsync -av "$pdir/$fname" "$host:$pdir"
            else
                echo "$file does not exist!"
            fi
        done
    done

#2. Make the xsync script executable
    [delopy@hadoop102 ~/bin]$ chmod +x xsync

#3. Test the script
    [delopy@hadoop102 ~/bin]$ ./xsync /home/delopy/bin

#4. Configure the environment variables
    [delopy@hadoop102 ~]$ sudo vim /etc/profile.d/my_env.sh
    # RSYNC_HOME
    export PATH=/home/delopy/bin:$PATH
    # JAVA_HOME
    export JAVA_HOME=/opt/module/jdk
    export PATH=$PATH:$JAVA_HOME/bin
    # HADOOP_HOME
    export HADOOP_HOME=/opt/module/hadoop
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_HOME/lib/native"

#5. Synchronize the environment file (it is owned by root)
    [delopy@hadoop102 ~]$ sudo ./bin/xsync /etc/profile.d/my_env.sh
Note: when xsync is run through sudo, its path must be given explicitly.

#6. Reload the environment variables on every machine and check them
    [delopy@hadoop103 bin]$ source /etc/profile
    [delopy@hadoop102 ~]$ echo $PATH
    /opt/hadoop/bin:/opt/hadoop/sbin:/home/delopy/bin:/home/delopy/bin:/home/delopy/bin/xsync:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/opt/jdk/bin

IV. JDK Installation

JDK download site: https://www.oracle.com

1. Create the software directory
    [delopy@hadoop102 ~]$ mkdir /data/software/
    [delopy@hadoop102 ~]$ cd /data/software/

2. Upload the JDK package
    [delopy@hadoop102 /data/software]$ rz
    [delopy@hadoop102 /data/software]$ ll
    total 181192
    -rw-r--r-- 1 delopy delopy 185540433 2021-06-16 14:21 jdk-8u131-linux-x64.tar.gz

3. Extract the package
    [delopy@hadoop102 /data/software]$ tar xf jdk-8u131-linux-x64.tar.gz -C /opt/module/
    [delopy@hadoop102 /data/software]$ cd /opt/module/
    [delopy@hadoop102 /opt/module]$ ll
    total 0
    drwxr-xr-x 8 delopy delopy 255 2017-03-15 16:35 jdk1.8.0_131

4. Create the symbolic link (JAVA_HOME in my_env.sh points at /opt/module/jdk)
    [delopy@hadoop102 /opt/module]$ ln -s jdk1.8.0_131 jdk

5. Push the JDK to the other machines
    [delopy@hadoop102 /opt/module]$ xsync /opt/module/

6. Verify the JDK version (all machines)
    [delopy@hadoop102 /opt/module]$ java -version
    java version "1.8.0_131"
    Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

V. Hadoop Installation

Hadoop download page: https://hadoop.apache.org/releases.html

1. Download the package
    [delopy@hadoop102 ~]$ cd /data/software/
    [delopy@hadoop102 /data/software]$ wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
    [delopy@hadoop102 /data/software]$ ll
    total 772196
    -rw-r--r-- 1 delopy delopy 605187279 2021-06-15 17:55 hadoop-3.3.1.tar.gz

2. Extract the package
    [delopy@hadoop102 /data/software]$ tar xf hadoop-3.3.1.tar.gz -C /opt/module/
    [delopy@hadoop102 /data/software]$ cd /opt/module/
    [delopy@hadoop102 /opt/module]$ ll
    total 0
    drwxr-xr-x 10 delopy delopy 215 2021-06-15 13:52 hadoop-3.3.1

3. Create the symbolic link
    [delopy@hadoop102 /opt/module]$ ln -s hadoop-3.3.1 hadoop
    [delopy@hadoop102 /opt/module]$ ll
    total 0
    lrwxrwxrwx 1 delopy delopy 12 2021-09-01 11:43 hadoop -> hadoop-3.3.1
    drwxr-xr-x 10 delopy delopy 215 2021-06-15 13:52 hadoop-3.3.1

4. Synchronize Hadoop to the other machines
    [delopy@hadoop102 /opt/module]$ xsync /opt/module/

5. Verify Hadoop (all machines)
    [delopy@hadoop102 /opt/module]$ hadoop version
    Hadoop 3.3.1
    Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2
    Compiled by ubuntu on 2021-06-15T05:13Z
    Compiled with protoc 3.7.1
    From source with checksum 88a4ddb2299aca054416d6b7f81ca55
    This command was run using /opt/module/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar

VI. Hadoop Cluster Configuration

Each of the four site files keeps the XML declaration and Apache License comment it ships with; only the <configuration> element shown below is edited.

1. Core configuration file
    [delopy@hadoop102 ~]$ cd /opt/module/hadoop/etc/hadoop/
    [delopy@hadoop102 /opt/module/hadoop/etc/hadoop]$ vim core-site.xml
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop102:8020</value>
            <description>Address of the NameNode</description>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/data/hadoop/data</value>
            <description>Directory where Hadoop stores its data</description>
        </property>
        <property>
            <name>hadoop.http.staticuser.user</name>
            <value>delopy</value>
            <description>Static user for the HDFS web UI (delopy)</description>
        </property>
    </configuration>

2. HDFS configuration file
    [delopy@hadoop102 /opt/module/hadoop/etc/hadoop]$ vim hdfs-site.xml
    <configuration>
        <property>
            <name>dfs.namenode.http-address</name>
            <value>hadoop102:9870</value>
            <description>NameNode web UI address</description>
        </property>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hadoop104:9868</value>
            <description>SecondaryNameNode web UI address</description>
        </property>
    </configuration>

3. YARN configuration file
    [delopy@hadoop102 /opt/module/hadoop/etc/hadoop]$ vim yarn-site.xml
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
            <description>Use the MapReduce shuffle auxiliary service</description>
        </property>
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hadoop103</value>
            <description>Host that runs the ResourceManager</description>
        </property>
        <property>
            <name>yarn.nodemanager.env-whitelist</name>
            <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
            <description>Environment variables that containers inherit</description>
        </property>
    </configuration>

4. MapReduce configuration file
    [delopy@hadoop102 /opt/module/hadoop/etc/hadoop]$ vim mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
            <description>Run MapReduce jobs on YARN</description>
        </property>
    </configuration>

5. Configure workers
    [delopy@hadoop102 /opt/module/hadoop/etc/hadoop]$ vim workers
    hadoop102
    hadoop103
    hadoop104
Note: entries in this file must not end with a space, and the file must not contain blank lines.

6. Distribute the Hadoop configuration files
    [delopy@hadoop102 /opt/module/hadoop/etc/hadoop]$ xsync /opt/module/

VII. Starting the Hadoop Cluster

1. Format HDFS

#1. If the cluster is being started for the first time, format the NameNode on hadoop102.
Note: formatting the NameNode generates a new cluster ID. If the DataNodes still hold the old ID, the NameNode and DataNode IDs no longer match and the cluster cannot find its earlier data. If the cluster fails while running and the NameNode has to be reformatted, first stop the namenode and datanode processes, delete the data and logs directories on every machine, and only then format.
    [delopy@hadoop102 ~]$ hdfs namenode -format
    ... ...
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at hadoop102/10.0.0.102
    ************************************************************/

2. Start HDFS

#1. Start HDFS on hadoop102
    [delopy@hadoop102 ~]$ start-dfs.sh
    Starting namenodes on [hadoop102]
    Starting datanodes
    hadoop103: WARNING: /opt/module/hadoop/logs does not exist. Creating.
    hadoop104: WARNING: /opt/module/hadoop/logs does not exist. Creating.
    Starting secondary namenodes [hadoop104]

#2. Check the HDFS processes across the cluster
    [delopy@hadoop102 ~]$ jps
    18016 Jps
    17653 NameNode
    17756 DataNode

    [delopy@hadoop103 ~]$ jps
    16681 DataNode
    16748 Jps

    [delopy@hadoop104 ~]$ jps
    31880 DataNode
    31976 SecondaryNameNode
    32024 Jps

3. Start YARN

#1. Start YARN on hadoop103
    [delopy@hadoop103 ~]$ start-yarn.sh
    Starting resourcemanager
    Starting nodemanagers

#2. Check the YARN processes across the cluster
    [delopy@hadoop103 ~]$ jps
    16968 NodeManager
    16681 DataNode
    17052 Jps
    16862 ResourceManager

    [delopy@hadoop102 ~]$ jps
    18800 NameNode
    18905 DataNode
    19323 Jps
    19229 NodeManager

    [delopy@hadoop104 ~]$ jps
    32197 Jps
    31880 DataNode
    31976 SecondaryNameNode
    32090 NodeManager

4. View the HDFS NameNode in a browser

#1. Open http://hadoop102:9870 (the dfs.namenode.http-address configured in hdfs-site.xml) in a browser; the page shows Live Nodes: 3 and Disk: 300G.
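The workers file is the easiest place in this chapter to make a silent mistake, because a trailing space or a blank line is invisible in an editor. The following is a minimal sketch of a pre-flight check for that one constraint. The function name check_workers is hypothetical and not part of Hadoop; it assumes bash and a grep that understands POSIX character classes.

```shell
#!/bin/bash
# check_workers FILE
# Hypothetical helper (not shipped with Hadoop): verifies that FILE
# contains no blank lines and no trailing whitespace.
# Returns 0 if the file is clean, 1 if a problem is found, 2 if FILE is missing.
check_workers() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    # Lines that are empty or contain only whitespace
    if grep -qE '^[[:space:]]*$' "$file"; then
        echo "blank line found in $file"
        return 1
    fi
    # Spaces or tabs after the last visible character of a line
    if grep -qE '[[:space:]]+$' "$file"; then
        echo "trailing whitespace found in $file"
        return 1
    fi
    echo "ok: $file"
    return 0
}
```

One way to use it, assuming the function has been sourced into the current shell, is to run check_workers /opt/module/hadoop/etc/hadoop/workers on hadoop102 before distributing the configuration with xsync, and to fix the file if anything other than "ok" is reported.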
