Using a Hive UDAF and UDTF to get top values after GROUP BY (3)

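    @Override
    public void process(Object[] args) throws HiveException {
        // args[0] holds the string built by top_group: entries joined by "$*",
        // each entry being two fields joined by "$@".
        String input = args[0].toString();
        String[] entries = input.split("\\$\\*");
        for (int i = 0; i < entries.length; i++) {
            try {
                String[] result = new String[3];
                String[] fields = entries[i].split("\\$\\@");
                result[0] = fields[0];
                result[1] = fields[1];
                result[2] = String.valueOf(i + 1); // 1-based position within the group
                forward(result);                   // emit one output row
            } catch (Exception e) {
                // Skip malformed entries instead of failing the whole query.
                continue;
            }
        }
    }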

}
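To make the splitting logic easier to follow outside of Hive, here is a minimal standalone sketch of the same parsing step. The class name `ExplodeMapSketch`, the method `explode`, and the sample input are hypothetical and only for illustration. The sketch assumes, as the `process` method above does, that entries are separated by `$*` and that the two fields of each entry are separated by `$@`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; not part of the original UDTF class.
public class ExplodeMapSketch {

    // Delimiters taken from the split patterns in process().
    private static final String ENTRY_SEP = Pattern.quote("$*");
    private static final String FIELD_SEP = Pattern.quote("$@");

    // Turns "a$@1$*b$@2" into rows of {field1, field2, rank}.
    static List<String[]> explode(String input) {
        List<String[]> rows = new ArrayList<>();
        String[] entries = input.split(ENTRY_SEP);
        for (int i = 0; i < entries.length; i++) {
            String[] fields = entries[i].split(FIELD_SEP);
            if (fields.length < 2) {
                // Malformed entry: skip it, as the catch block in process() does.
                continue;
            }
            // The rank is the 1-based position in the input, so a skipped
            // entry leaves a gap in the numbering, matching process().
            rows.add(new String[] {fields[0], fields[1], String.valueOf(i + 1)});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample input is made up for illustration.
        for (String[] row : explode("a.com$@10$*b.com$@7")) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

Each returned array corresponds to one row that the UDTF would pass to `forward`, which is what `lateral view` later exposes as three columns.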

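The two functions are registered in the Hive function library under the names top_group and explode_map. An example of their use follows (for each of the top 100 landingrefer values, get its top 100 urls):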

hive -e "select t.landingrefer, mytable.col1, mytable.col2,mytable.col3 from (select landingrefer, top_group(url,100) pro, count(sid) s from pvlog  where dt=20120719 and depth=1 group by landingrefer order by s desc limit 100) t lateral view explode_map(t.pro) mytable as col1, col2, col3;"> test
