Setting Up a Single-Node Spark Cluster


1. Create a User

# useradd spark

# passwd spark

2. Download the Software

JDK, Scala, SBT, and Maven.

The versions used are as follows:

JDK jdk-7u79-linux-x64.gz

Scala scala-2.10.5.tgz

SBT sbt-0.13.7.zip

Maven apache-maven-3.2.5-bin.tar.gz

Note: if you only want a running Spark environment, the JDK and Scala are enough. SBT and Maven are only needed if you plan to build Spark from source later.

3. Extract the Archives and Configure Environment Variables

# cd /usr/local/

# tar xvf /root/jdk-7u79-linux-x64.gz

# tar xvf /root/scala-2.10.5.tgz

# tar xvf /root/apache-maven-3.2.5-bin.tar.gz

# unzip /root/sbt-0.13.7.zip

Edit the environment variable configuration file:

# vim /etc/profile

export JAVA_HOME=/usr/local/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export SCALA_HOME=/usr/local/scala-2.10.5
export MAVEN_HOME=/usr/local/apache-maven-3.2.5
export SBT_HOME=/usr/local/sbt
export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin:$MAVEN_HOME/bin:$SBT_HOME/bin

Apply the changes:

# source /etc/profile

Verify that the environment variables took effect:

# java -version

java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

# scala -version

Scala code runner version 2.10.5 -- Copyright 2002-2013, LAMP/EPFL

# mvn -version

Apache Maven 3.2.5 (12a6b3acb947671f09b81f49094c53f426d8cea1; 2014-12-15T01:29:23+08:00)
Maven home: /usr/local/apache-maven-3.2.5
Java version: 1.7.0_79, vendor: Oracle Corporation
Java home: /usr/local/jdk1.7.0_79/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-229.el7.x86_64", arch: "amd64", family: "unix"

# sbt --version

sbt launcher version 0.13.7

4. Bind the Hostname

[root@spark01 ~]# vim /etc/hosts

192.168.244.147 spark01
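To confirm the name resolves, a quick check with getent (the output should echo the entry just added):

[root@spark01 ~]# getent hosts spark01
192.168.244.147 spark01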

5. Configure Spark

Switch to the spark user.

Download Hadoop and Spark; wget works well for this (see the example after the list):

Spark spark-1.4.0-bin-hadoop2.6.tgz

Hadoop hadoop-2.6.0.tar.gz
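For example, both archives can be pulled from the Apache archive site (the exact URLs below are what I would expect for these versions; substitute your preferred mirror if needed):

[spark@spark01 ~]$ wget https://archive.apache.org/dist/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
[spark@spark01 ~]$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz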

Extract the downloaded archives and configure the environment variables.
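For example, assuming the archives sit in the spark user's home directory:

[spark@spark01 ~]$ tar xvf spark-1.4.0-bin-hadoop2.6.tgz
[spark@spark01 ~]$ tar xvf hadoop-2.6.0.tar.gz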

Edit the spark user's environment configuration file:

[spark@spark01 ~]$ vim .bash_profile

export SPARK_HOME=$HOME/spark-1.4.0-bin-hadoop2.6
export HADOOP_HOME=$HOME/hadoop-2.6.0
export HADOOP_CONF_DIR=$HOME/hadoop-2.6.0/etc/hadoop
export PATH=$PATH:$SPARK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Apply the changes:

[spark@spark01 ~]$ source .bash_profile
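A quick sanity check that the variables are now set:

[spark@spark01 ~]$ echo $SPARK_HOME
/home/spark/spark-1.4.0-bin-hadoop2.6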

Edit the Spark configuration file:

[spark@spark01 ~]$ cd spark-1.4.0-bin-hadoop2.6/conf/

[spark@spark01 conf]$ cp spark-env.sh.template spark-env.sh

[spark@spark01 conf]$ vim spark-env.sh

Append the following:

export SCALA_HOME=/usr/local/scala-2.10.5
export SPARK_MASTER_IP=spark01
export SPARK_WORKER_MEMORY=1500m
export JAVA_HOME=/usr/local/jdk1.7.0_79

If your machine has more memory, feel free to raise SPARK_WORKER_MEMORY; my VM has only 2 GB of RAM, so I gave it just 1500m.

Configure slaves

[spark@spark01 conf]$ cp slaves.template slaves

[spark@spark01 conf]$ vim slaves

Change localhost to spark01.
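After the edit, the only non-comment entry in the file should be the new hostname:

[spark@spark01 conf]$ grep -v '^#' slaves
spark01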

Start the master

[spark@spark01 spark-1.4.0-bin-hadoop2.6]$ sbin/start-master.sh

starting org.apache.spark.deploy.master.Master, logging to /home/spark/spark-1.4.0-bin-hadoop2.6/sbin/../logs/spark-spark-org.apache.spark.deploy.master.Master-1-spark01.out

Inspect the log output mentioned above:

[spark@spark01 spark-1.4.0-bin-hadoop2.6]$ cd logs/

[spark@spark01 logs]$ cat spark-spark-org.apache.spark.deploy.master.Master-1-spark01.out

 

Spark Command: /usr/local/jdk1.7.0_79/bin/java -cp /home/spark/spark-1.4.0-bin-hadoop2.6/sbin/../conf/:/home/spark/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar:/home/spark/spark-1.4.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/home/spark/spark-1.4.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/home/spark/spark-1.4.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/home/spark/hadoop-2.6.0/etc/hadoop/ -Xms512m -Xmx512m -XX:MaxPermSize=128m org.apache.spark.deploy.master.Master --ip spark01 --port 7077 --webui-port 8080
========================================
16/01/16 15:12:30 INFO master.Master: Registered signal handlers for [TERM, HUP, INT]
16/01/16 15:12:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/16 15:12:32 INFO spark.SecurityManager: Changing view acls to: spark
16/01/16 15:12:32 INFO spark.SecurityManager: Changing modify acls to: spark
16/01/16 15:12:32 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); users with modify permissions: Set(spark)
16/01/16 15:12:33 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/01/16 15:12:33 INFO Remoting: Starting remoting
16/01/16 15:12:33 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster@spark01:7077]
16/01/16 15:12:33 INFO util.Utils: Successfully started service 'sparkMaster' on port 7077.
16/01/16 15:12:34 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/01/16 15:12:34 INFO server.AbstractConnector: Started SelectChannelConnector@spark01:6066
16/01/16 15:12:34 INFO util.Utils: Successfully started service on port 6066.
16/01/16 15:12:34 INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066
16/01/16 15:12:34 INFO master.Master: Starting Spark master at spark://spark01:7077
16/01/16 15:12:34 INFO master.Master: Running Spark version 1.4.0
16/01/16 15:12:34 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/01/16 15:12:34 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:8080
16/01/16 15:12:34 INFO util.Utils: Successfully started service 'MasterUI' on port 8080.
16/01/16 15:12:34 INFO ui.MasterWebUI: Started MasterWebUI at :8080
16/01/16 15:12:34 INFO master.Master: I have been elected leader! New state: ALIVE

 

As the log shows, the master started up normally.

Next, have a look at the master's web UI, which listens on port 8080 by default.

(Screenshot: the Spark master web UI)
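If no browser is handy, the UI can also be spot-checked from the shell (this assumes curl is installed; the exact title text may vary slightly by version):

[spark@spark01 ~]$ curl -s http://spark01:8080 | grep -o '<title>.*</title>'
<title>Spark Master at spark://spark01:7077</title>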

Start the worker

[spark@spark01 spark-1.4.0-bin-hadoop2.6]$ sbin/start-slaves.sh spark://spark01:7077
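Once the worker is up, two quick checks close the loop: jps should list both daemons (the PIDs below are illustrative and will differ), and the SparkPi example shipped with the distribution should run against the cluster. A minimal sketch:

[spark@spark01 spark-1.4.0-bin-hadoop2.6]$ jps
2457 Master
2693 Worker
2788 Jps
[spark@spark01 spark-1.4.0-bin-hadoop2.6]$ MASTER=spark://spark01:7077 bin/run-example SparkPi 10

A successful run should print a line like "Pi is roughly 3.14..." near the end of the output, and the completed application should appear in the master's web UI.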
