Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/fulong/Hadoop/hadoop-2.2.0/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
hive>
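Verification
The plan is to create a table that stores the user search behavior logs downloaded from Sogou Labs.
Data download URL:
First, create the table: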
hive> create table searchlog (time string,id string,sword string,rank int,clickrank int,url string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
At this point the command fails with the following error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown while adding/validating class(es) : ORA-01754: a table may contain only one column of type LONG
Solution:
Open hive-metastore-0.13.0.jar in ${HIVE_HOME}/lib with an archive tool, locate the file named package.jdo, open it, and find the following content.
<field name="viewOriginalText" default-fetch-group="false">
  <column jdbc-type="LONGVARCHAR"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
  <column jdbc-type="LONGVARCHAR"/>
</field>
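Both columns, VIEW_ORIGINAL_TEXT and VIEW_EXPANDED_TEXT, are declared as LONGVARCHAR, which maps to LONG in Oracle. Oracle allows at most one LONG column per table, so creating a metastore table with two of them fails with ORA-01754.
Following the recommendation on the Hive website, change the jdbc-type of both columns to CLOB. The modified content is shown below.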
<field name="viewOriginalText" default-fetch-group="false">
  <column jdbc-type="CLOB"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
  <column jdbc-type="CLOB"/>
</field>
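The manual edit above could also be scripted. The following is only a minimal sketch, not the procedure recommended by Hive: it assumes the jar entry is named package.jdo, and the function names `patch_jdo_text` and `patch_jar` are hypothetical. It rewrites the jar into a new file and changes the jdbc-type only inside the two view-text fields.

```python
import re
import zipfile

# Field names taken from the package.jdo excerpt above.
VIEW_FIELDS = ("viewOriginalText", "viewExpandedText")

# Matches one whole <field name="..."> ... </field> element for the two fields.
_FIELD_RE = re.compile(
    r'(<field\s+name="(?:%s)"[^>]*>.*?</field>)' % "|".join(VIEW_FIELDS),
    re.DOTALL,
)


def patch_jdo_text(text: str) -> str:
    """Switch LONGVARCHAR to CLOB inside the two view-text fields only."""
    def fix(match):
        return match.group(1).replace('jdbc-type="LONGVARCHAR"', 'jdbc-type="CLOB"')
    return _FIELD_RE.sub(fix, text)


def patch_jar(src_jar: str, dst_jar: str, entry_name: str = "package.jdo") -> None:
    """Copy src_jar to dst_jar, rewriting only the entry whose base name is entry_name.

    The entry name is an assumption; check the actual path inside your jar.
    """
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.rsplit("/", 1)[-1] == entry_name:
                data = patch_jdo_text(data.decode("utf-8")).encode("utf-8")
            dst.writestr(info, data)
```

Writing to a separate output jar keeps the original as a backup; the patched jar would then replace hive-metastore-0.13.0.jar in ${HIVE_HOME}/lib before hive is restarted.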
After making the change, restart hive.
Run the create table command again. This time the table is created successfully:
hive> create table searchlog (time string,id string,sword string,rank int,clickrank int,url string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
OK
Time taken: 0.986 seconds
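To make the table definition concrete, here is a rough Python sketch of how a tab-delimited text line maps onto the six searchlog columns. It is an approximation of Hive's text-format behavior, not its actual implementation: missing trailing fields and values that do not parse as integers are treated as NULL (None here), and extra fields are ignored. The helper name `parse_row` and the sample values in the usage note are hypothetical.

```python
import re

# Column names and types copied from the create table statement.
SCHEMA = [
    ("time", str), ("id", str), ("sword", str),
    ("rank", int), ("clickrank", int), ("url", str),
]

_INT_RE = re.compile(r"[+-]?\d+")


def parse_row(line: str) -> dict:
    """Split one '\\n'-terminated line on '\\t' and map it onto SCHEMA."""
    parts = line.rstrip("\n").split("\t")
    row = {}
    for i, (name, typ) in enumerate(SCHEMA):
        raw = parts[i] if i < len(parts) else None  # missing field -> NULL
        if raw is None:
            row[name] = None
        elif typ is int:
            # A value that is not a plain integer becomes NULL.
            row[name] = int(raw) if _INT_RE.fullmatch(raw) else None
        else:
            row[name] = raw
    return row
```

For example, `parse_row("00:00:00\tu1\tquery\t1\t2\thttp://example.com/\n")` yields integer values for rank and clickrank, while a line with only four fields leaves clickrank and url as None.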
Load the local data into the table:
hive> load data local inpath '/home/fulong/Downloads/SogouQ.reduced' overwrite into table searchlog;
Copying data from file:/home/fulong/Downloads/SogouQ.reduced
Copying file: file:/home/fulong/Downloads/SogouQ.reduced
Loading data to table default.searchlog
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://fulonghadoop/user/hive/warehouse/searchlog
Table default.searchlog stats: [numFiles=1, numRows=0, totalSize=152006060, rawDataSize=0]
OK
Time taken: 25.705 seconds
List all tables:
hive> show tables;
OK
searchlog
Time taken: 0.139 seconds, Fetched: 1 row(s)
Count the rows:
hive> select count(*) from searchlog;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1407233914535_0001, Tracking URL = :8088/proxy/application_1407233914535_0001/
Kill Command = /home/fulong/Hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1407233914535_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-08-20 18:03:17,667 Stage-1 map = 0%, reduce = 0%
2014-08-20 18:04:05,426 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.46 sec
2014-08-20 18:04:27,317 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.74 sec
MapReduce Total cumulative CPU time: 4 seconds 740 msec
Ended Job = job_1407233914535_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 4.74 sec HDFS Read: 152010455 HDFS Write: 8 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 740 msec
OK
1724264
Time taken: 103.154 seconds, Fetched: 1 row(s)
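Because the table was declared with lines terminated by '\n', the count returned above can be cross-checked against the number of lines in the local source file. A minimal sketch, assuming the file at /home/fulong/Downloads/SogouQ.reduced has not changed since the load; `count_records` is a hypothetical helper name.

```python
def count_records(path: str, chunk_size: int = 1 << 20) -> int:
    """Count '\\n'-terminated records, plus a final unterminated one if present."""
    newlines = 0
    last = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            newlines += chunk.count(b"\n")
            last = chunk[-1:]  # remember the final byte read so far
    # A non-empty file that does not end in '\n' has one more record.
    if last and last != b"\n":
        newlines += 1
    return newlines


if __name__ == "__main__":
    # Path taken from the load data command in this post.
    print(count_records("/home/fulong/Downloads/SogouQ.reduced"))
```

If the load was complete, the printed number should agree with the result of select count(*).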
===============================================
Ubuntu 12.10 + Hadoop 1.2.1 Cluster Configuration
Setting up the Hadoop environment (two Ubuntu systems running in virtual machines on a Windows host)
===============================================