Once packaging is done, upload the jar to a server where Spark is deployed. Since our launcher class references SparkLauncher, the spark-launcher jar must also be uploaded to the server.
[xinghailong@hnode10 launcher]$ ls
launcher_test.jar  spark-launcher_2.11-2.2.0.jar
[xinghailong@hnode10 launcher]$ pwd
/home/xinghailong/launcher

Because SparkLauncher needs SPARK_HOME to be specified, if your machine can already run spark-submit, look inside the spark-submit script to see where SPARK_HOME points:
[xinghailong@hnode10 launcher]$ which spark2-submit
/var/lib/hadoop-hdfs/bin/spark2-submit

The last few lines of that script show it:
export SPARK2_HOME=/var/lib/hadoop-hdfs/app/spark

# disable randomized hash for string in Python 3.3+
export PYTHONHASHSEED=0

exec "${SPARK2_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"

To sum up, what we need is:
A custom jar containing the Spark application and the class that drives SparkLauncher (a sketch of such a class follows this list)
The SparkLauncher jar itself, spark-launcher_2.11-2.2.0.jar here; use whichever version matches your setup
The path of the current directory
The directory the SPARK_HOME environment variable points to
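The article does not show the Launcher class itself, so here is a minimal sketch of what it might look like, assuming it takes SPARK_HOME and the master as its two arguments to match the command below. The app jar path and the Spark main class name are placeholder assumptions; replace them with your own:

import java.util.concurrent.CountDownLatch;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class Launcher {
    public static void main(String[] args) throws Exception {
        String sparkHome = args[0]; // e.g. /var/lib/hadoop-hdfs/app/spark
        String master = args[1];    // e.g. yarn

        CountDownLatch finished = new CountDownLatch(1);

        SparkAppHandle handle = new SparkLauncher()
                .setSparkHome(sparkHome)
                .setMaster(master)
                // placeholder app jar and main class -- replace with your own
                .setAppResource("/home/xinghailong/launcher/launcher_test.jar")
                .setMainClass("com.example.MySparkApp")
                .startApplication(new SparkAppHandle.Listener() {
                    @Override
                    public void stateChanged(SparkAppHandle h) {
                        System.out.println("********** state changed **********");
                        System.out.println("id    " + h.getAppId());
                        System.out.println("state " + h.getState());
                        if (h.getState().isFinal()) {
                            finished.countDown();
                        }
                    }

                    @Override
                    public void infoChanged(SparkAppHandle h) {
                        System.out.println("********** info changed **********");
                    }
                });

        System.out.println("id    " + handle.getAppId()); // null until YARN assigns one
        System.out.println("state " + handle.getState()); // UNKNOWN right after launch

        finished.await(); // block until the application reaches a final state
    }
}

startApplication spawns spark-submit as a child process and pipes its output through OutputRedirector, which is where the INFO: lines in the log further below come from.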
Then run the command to start the test:
java -Djava.ext.dirs=/home/xinghailong/launcher -cp launcher_test.jar Launcher /var/lib/hadoop-hdfs/app/spark yarn

Notes:
-Djava.ext.dirs sets the current directory as a Java extension class-loading directory, so the spark-launcher jar sitting there is picked up
Two arguments are passed in: one is SPARK_HOME, the other is the launch mode (here, yarn)
Observing the output, we can see it started and ran successfully:
id    null
state UNKNOWN
Mar 10, 2018 12:00:52 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: 18/03/10 12:00:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
********** state changed **********
... a pile of jar-copying logs omitted ...
********** info changed **********
********** state changed **********
Mar 10, 2018 12:00:55 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: 18/03/10 12:00:55 INFO yarn.Client: Application report for application_1518263195995_37615 (state: ACCEPTED)
... a pile of redirected logs omitted ...
application_1518263195995_37615 (state: ACCEPTED)
id    application_1518263195995_37615
state SUBMITTED
Mar 10, 2018 12:01:00 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: 18/03/10 12:01:00 INFO yarn.Client: Application report for application_1518263195995_37615 (state: RUNNING)
********** state changed **********
... a pile of redirected logs omitted ...
INFO: user: hdfs
********** state changed **********
Mar 10, 2018 12:01:08 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: 18/03/10 12:01:08 INFO util.ShutdownHookManager: Shutdown hook called
Mar 10, 2018 12:01:08 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: 18/03/10 12:01:08 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-f07e0213-61fa-4710-90f5-2fd2030e0701

Summary

With this, we have implemented submitting a Spark task from a Java application and obtaining its application id and state, so the job can be located and tracked.
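Besides the listener callbacks, the SparkAppHandle returned by startApplication can also be polled for the application id and state. A minimal sketch; the Tracker class name and the poll interval are made up for illustration:

import org.apache.spark.launcher.SparkAppHandle;

public class Tracker {
    // Polls a SparkAppHandle until the application reaches a final state
    // (FINISHED, FAILED, KILLED); an alternative to the listener approach above.
    public static void track(SparkAppHandle handle) throws InterruptedException {
        while (!handle.getState().isFinal()) {
            System.out.println("id    " + handle.getAppId()); // null until YARN assigns one
            System.out.println("state " + handle.getState());
            Thread.sleep(3000); // illustrative poll interval
        }
        System.out.println("final state: " + handle.getState());
    }
}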