How to Install and Configure Apache Samza on Linux

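Samza is a distributed stream-processing framework that performs near-real-time stream processing on top of the Kafka message queue. (Strictly speaking, Samza uses Kafka through a pluggable module, so it can be built on other messaging systems, but Kafka is its starting point and default implementation.)

Samza uses Apache Kafka for messaging.

Apache Hadoop YARN provides fault tolerance, processor isolation, security, and resource management.

This article describes how to install Samza on a 32-bit Ubuntu 14.04 system.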


Prerequisites:

To install and configure Apache Samza, you need the following:

JDK 1.7
Maven
Kafka
YARN
ZooKeeper

The helper packages can be installed with apt-get:

# apt-get install curl gem


Download the JDK and set its path:

We need to install the JDK and set its environment variables.

# mkdir -p /usr/java
# cd /usr/java
 
# wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2FOracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u79-b15/jdk-7u79-linux-i586.tar.gz"
 
# tar xzf jdk-7u79-linux-i586.tar.gz


With the archive extracted, set JAVA_HOME and put the JDK on the PATH:
# JAVA_HOME=/usr/java/jdk1.7.0_79
# export JAVA_HOME
# PATH=$JAVA_HOME/bin:$PATH
# export PATH

Add the lines above to ~/.bashrc and /etc/bash.bashrc (the system-wide bashrc on Ubuntu) so that they apply to every new shell.
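Appending the settings by hand works, but running the step twice leaves duplicate lines. The following is a minimal sketch of an idempotent append; persist_java_env is a hypothetical helper name, and the default JDK path is simply the one used earlier in this article.

```shell
# Append the JDK settings to a shell startup file, but only once.
# persist_java_env is a hypothetical helper, not part of any tool used here.
persist_java_env() {
    rc_file="$1"                              # e.g. "$HOME/.bashrc"
    java_home="${2:-/usr/java/jdk1.7.0_79}"   # JDK path from this article

    # Skip the append when an export for JAVA_HOME is already present.
    if [ -f "$rc_file" ] && grep -q '^export JAVA_HOME=' "$rc_file"; then
        return 0
    fi

    {
        echo "export JAVA_HOME=$java_home"
        # Single quotes keep the variables unexpanded until the file is sourced.
        echo 'export PATH=$JAVA_HOME/bin:$PATH'
    } >> "$rc_file"
}
```

Calling persist_java_env "$HOME/.bashrc" and then sourcing the file (. ~/.bashrc) should make the settings visible in the current shell.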

Install Maven:

Next, download and install Maven (the package used here provides Maven 3.0.1).

#  wget https://launchpad.net/~bneijt/+archive/ubuntu/ppa/+build/2139203/+files/maven3_3.0.1-0~ppa2_all.deb

# dpkg -i maven3_3.0.1-0~ppa2_all.deb

Check the Maven version:

#  mvn3 -version

Apache Maven 3.0.1 (r1038046; 2010-11-23 16:28:32+0530)
Java version: 1.7.0_79
Java home: /usr/java/jdk1.7.0_79/jre
Default locale: en_IN, platform encoding: UTF-8
OS name: "linux" version: "3.8.0-29-generic" arch: "i386" Family: "unix"
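Before moving on, it can help to confirm that the tools installed so far are actually on the PATH. This is a minimal sketch; missing_tools is a hypothetical helper, and the tool names in the usage note are taken from the commands that appear in this article.

```shell
# Print the names of the given commands that cannot be found on PATH.
# missing_tools is a hypothetical helper written for this walkthrough.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the single leading space before printing.
    echo "${missing# }"
}
```

For example, missing_tools java mvn3 git curl prints an empty line when all four are available, and otherwise lists the ones that still need to be installed.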

Install Hello-Samza:

We will install it under the /usr/local directory.

# cd /usr/local

Clone hello-samza into it:

# git clone git://git.apache.org/samza-hello-samza.git hello-samza

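The project ships with a script named "grid", which carries the hello-samza settings and takes care of the whole setup. Use it to install Kafka, YARN, and ZooKeeper.

Run the following commands: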

# cd /usr/local/hello-samza


root@dev:/usr/local/hello-samza# bin/grid install kafka

EXECUTING: install kafka
Downloading kafka_2.10-0.8.2.1.tgz...
  % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                Dload  Upload  Total  Spent    Left  Speed
 15 15.4M  15 2406k    0    0  304k      0  0:00:51  0:00:07  0:00:44  443k

root@dev:/usr/local/hello-samza# bin/grid install yarn

EXECUTING: install yarn
Downloading hadoop-2.6.1.tar.gz...
  % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                Dload  Upload  Total  Spent    Left  Speed
 77  187M  77  145M    0    0  239k      0  0:13:23  0:10:22  0:03:01  204k

root@dev:/usr/local/hello-samza#  bin/grid install zookeeper

EXECUTING: install zookeeper
Downloading zookeeper-3.4.3.tar.gz...
  % Total    % Received % Xferd  Average Speed  Time    Time    Time  Current
                                Dload  Upload  Total  Spent    Left  Speed
  8 15.4M    8 1324k    0    0  212k      0  0:01:14  0:00:06  0:01:08  266k
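The three install calls above can also be scripted as one loop. This is a sketch under assumptions: install_grid_components is a hypothetical wrapper, and it only relies on the "install <component>" form of the grid script shown in the transcripts above.

```shell
# Run "grid install" for each component used in this article.
# install_grid_components is a hypothetical wrapper; its first argument is
# the grid script to call (bin/grid when run from the hello-samza root).
install_grid_components() {
    grid="${1:-bin/grid}"
    for component in kafka yarn zookeeper; do
        # Stop at the first component that fails to install.
        "$grid" install "$component" || return 1
    done
}
```

From /usr/local/hello-samza, calling install_grid_components with no argument runs bin/grid for Kafka, YARN, and ZooKeeper in turn and stops if any of them fails.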

You will now find all the packages in a folder named "deploy" under the hello-samza root directory.

root@dev:/usr/local/hello-samza# cd deploy
root@dev:/usr/local/hello-samza/deploy# ls

kafka  yarn  zookeeper
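The same check can be done without reading the ls output by eye. A minimal sketch, assuming the three directory names shown above; check_deploy is a hypothetical helper.

```shell
# Verify that the deploy directory contains kafka, yarn, and zookeeper.
# check_deploy is a hypothetical helper; it returns non-zero if any is missing.
check_deploy() {
    deploy_dir="${1:-deploy}"
    status=0
    for component in kafka yarn zookeeper; do
        if [ ! -d "$deploy_dir/$component" ]; then
            echo "missing: $deploy_dir/$component" >&2
            status=1
        fi
    done
    return "$status"
}
```

Run from the hello-samza root, check_deploy succeeds silently when all three directories exist and names each missing one on standard error otherwise.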

Run the bin/grid bootstrap command:

root@dev:/usr/local/hello-samza# bin/grid bootstrap

Download
:samza-yarn_2.10:processResources
:samza-yarn_2.10:classes
:samza-yarn_2.10:lesscss
....
....
BUILD SUCCESSFUL

Total time: 20 mins 32.855 secs
/usr/local/hello-samza
EXECUTING: install zookeeper
Using previously downloaded file /root/.samza/download/zookeeper-3.4.3.tar.gz
EXECUTING: install yarn
Using previously downloaded file /root/.samza/download/hadoop-2.6.1.tar.gz
EXECUTING: install kafka
Using previously downloaded file /root/.samza/download/kafka_2.10-0.8.2.1.tgz
EXECUTING: start zookeeper
JMX enabled by default
Using config: /usr/local/hello-samza/deploy/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
EXECUTING: start yarn
starting resourcemanager, logging to /usr/local/hello-samza/deploy/yarn/logs/yarn-root-resourcemanager-dev.out
starting nodemanager, logging to /usr/local/hello-samza/deploy/yarn/logs/yarn-root-nodemanager-dev.out
EXECUTING: start kafka
 

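Once the grid command above has finished, you can verify that YARN is installed and running by opening the URL :8088 in a browser. The page you see is the YARN UI.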
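The check can also be made from the command line with curl, which was installed in the prerequisites step. This is a sketch under assumptions: yarn_ui_up is a hypothetical helper, and because the article only gives ":8088" the host defaults to localhost here as a guess, so pass your own host if YARN runs elsewhere.

```shell
# Succeed when a web server answers on the YARN UI port.
# yarn_ui_up is a hypothetical helper. The default host (localhost) is an
# assumption; the article only specifies port 8088.
yarn_ui_up() {
    host="${1:-localhost}"
    port="${2:-8088}"
    # -s hides progress output, -f turns HTTP errors (>= 400) into failures,
    # -o /dev/null discards the page body, --max-time bounds the wait.
    curl -sf -o /dev/null --max-time 5 "http://$host:$port/"
}
```

A typical use is: yarn_ui_up && echo "YARN UI is reachable". A non-zero exit status means nothing answered on that host and port within five seconds, or the server returned an HTTP error.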

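Build a Samza job package:

You need to build a job package; this is the package YARN uses to run your job on the grid.

Note: if you are building the latest version of the hello-samza project, remember to run the following command first.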

root@dev:/usr/local/hello-samza# ./gradlew publishToMavenLocal


You can then run the build commands inside the hello-samza project.
