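Spark RDD/Core Programming API Introduction Series: Hands-On Practice and Debugging of Spark File Operations, Hands-On Work with Sogou Log Files, and an In-Depth Sogou Log File Walkthrough (Part 2) (5)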

Query the result from the HDFS command line.

part-00000:

spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ bin/hadoop fs -text /soGouQSortedResult.txt/part-00000

Query the second part file the same way.

part-00001:

spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ bin/hadoop fs -text /soGouQSortedResult.txt/part-00001

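Next, we use a hadoop command to merge the contents of the two files above into one:

spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ bin/hadoop fs -getmerge hdfs://SparkSingleNode:9000/soGouQSortedResult.txt combinedSortedResult.txt      # note: the second argument is a destination path on the local file system, not in HDFS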

spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ bin/hadoop fs -ls /
Found 6 items
-rw-r--r-- 1 spark supergroup 3593 2016-09-18 10:15 /README.md
-rw-r--r-- 1 spark supergroup 216118 2016-09-27 09:17 /SogouQ.mini
drwxr-xr-x - spark supergroup 0 2016-09-26 21:17 /result
drwxr-xr-x - spark supergroup 0 2016-09-26 21:49 /resultDescSorted
drwxr-xr-x - spark supergroup 0 2016-09-27 10:08 /soGouQSortedResult.txt
drwx-wx-wx - spark supergroup 0 2016-09-09 16:28 /tmp
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ ls
bin etc libexec NOTICE.txt share
combinedSortedResult.txt include LICENSE.txt README.txt tmp
dfs lib logs sbin
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$

Alternatively:

spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ bin/hdfs dfs -getmerge hdfs://SparkSingleNode:9000/soGouQSortedResult.txt combinedSortedResult.txt       # for an HDFS path, this is equivalent to the bin/hadoop fs form

spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ ls
bin etc lib LICENSE.txt NOTICE.txt sbin tmp
dfs include libexec logs README.txt share
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ cd bin
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0/bin$ ls
container-executor hdfs mapred.cmd yarn
hadoop hdfs.cmd rcc yarn.cmd
hadoop.cmd mapred test-container-executor
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0/bin$ cd hdfs
bash: cd: hdfs: Not a directory
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0/bin$ cd ..
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ bin/hdfs dfs -getmerge hdfs://SparkSingleNode:9000/soGouQSortedResult.txt combinedSortedResult.txt
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$ ls
bin etc libexec NOTICE.txt share
combinedSortedResult.txt include LICENSE.txt README.txt tmp
dfs lib logs sbin
spark@SparkSingleNode:/usr/local/hadoop/hadoop-2.6.0$
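
To make the effect of -getmerge more concrete, here is a minimal local sketch that does not touch HDFS at all. It assumes that -getmerge behaves roughly like concatenating the files of the output directory in name order into a single local file. The scratch directory, the sample rows, and the variable names workdir and outdir are made up for illustration and are not taken from the real Sogou data.

```shell
#!/bin/sh
# Local stand-in for `hadoop fs -getmerge` (assumption: plain concatenation
# of the part files in name order). No Hadoop installation is needed.

workdir=$(mktemp -d)                       # hypothetical scratch directory
outdir="$workdir/soGouQSortedResult.txt"   # Spark saves its output as a directory
mkdir -p "$outdir"

# Two made-up partition files standing in for part-00000 and part-00001.
printf 'a\t3\nb\t2\n' > "$outdir/part-00000"
printf 'c\t1\n'       > "$outdir/part-00001"
: > "$outdir/_SUCCESS"                     # empty marker file, adds nothing

# The shell expands part-* in sorted order, so part-00000 comes first.
cat "$outdir"/part-* > "$workdir/combinedSortedResult.txt"

# Show the merged result.
cat "$workdir/combinedSortedResult.txt"
```

Because each part file is already sorted and the parts are ordered by partition, the merged file keeps the global order in this sketch. Whether that holds for a real job depends on how the RDD was sorted and partitioned.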

Copyright notice: unless otherwise stated, all articles on this site are original.

When reposting, please cite the source: https://www.heiqu.com/zwgddy.html