Installing and Using Pig 0.12.1

1. Installation

Unpack the archive, set the environment variables, and verify that Pig was installed successfully.

[linuxidc@jifeng02 ~]$ tar zxf pig-0.12.0.tar.gz
[linuxidc@jifeng02 ~]$ vi .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
export JAVA_HOME=$HOME/jdk1.7.0_45
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=$HOME/hadoop/hadoop-1.2.1
export ANT_HOME=$HOME/apache-ant-1.9.4

export HIVE_HOME=$HOME/hadoop/hive-0.12.0-bin
export HBASE_HOME=$HOME/hbase-0.94.21
export PIG_HOME=$HOME/pig-0.12.0

export PATH=$PATH:$ANT_HOME/bin:$HIVE_HOME/bin::$HBASE_HOME/bin:$PIG_HOME/bin
~
~
~
".bash_profile" 23L, 591C written
[linuxidc@jifeng02 ~]$ source .bash_profile

[linuxidc@jifeng02 ~]$ pig -help
which: no hadoop in (/home/linuxidc/jdk1.7.0_45/bin:/home/linuxidc/jdk1.7.0_45/bin:/home/linuxidc/jdk1.7.0_45/bin:/usr/lib/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/linuxidc/bin:/home/linuxidc/apache-ant-1.9.4/bin:/home/linuxidc/hadoop/hive-0.12.0-bin/bin::/home/linuxidc/hbase-0.94.21/bin:/home/linuxidc/bin:/home/linuxidc/apache-ant-1.9.4/bin:/home/linuxidc/hadoop/hive-0.12.0-bin/bin::/home/linuxidc/hbase-0.94.21/bin:/home/linuxidc/pig-0.12.1/bin:/home/linuxidc/bin:/home/linuxidc/apache-ant-1.9.4/bin:/home/linuxidc/hadoop/hive-0.12.0-bin/bin::/home/linuxidc/hbase-0.94.21/bin:/home/linuxidc/pig-0.12.0/bin)
Warning: $HADOOP_HOME is deprecated.

Apache Pig version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
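
The PATH printed in the output above contains both a pig-0.12.1/bin and a pig-0.12.0/bin entry, which suggests the profile was sourced more than once with different PIG_HOME values. A quick way to see which entries are actually present is a small check like the one below. This is only a sketch: path_contains is a hypothetical helper written for this article, not part of Pig, and the default PIG_HOME is an assumption based on the tarball name used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if directory $1 is an entry of the
# colon-separated list $2 (for example the value of $PATH).
path_contains() {
    local dir="$1"
    local path_list="$2"
    case ":${path_list}:" in
        *":${dir}:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed install location; adjust it to wherever the archive was unpacked.
PIG_HOME="${PIG_HOME:-$HOME/pig-0.12.0}"

if path_contains "$PIG_HOME/bin" "$PATH"; then
    echo "pig bin directory is on PATH: $PIG_HOME/bin"
else
    echo "pig bin directory is missing from PATH: $PIG_HOME/bin"
fi

# command -v prints the executable the shell would run, if there is one.
command -v pig || echo "pig executable not found"
```

Wrapping both sides in colons before matching keeps a directory such as /opt/pig from being mistaken for /opt/pig/bin.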

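2. Pig execution modes
Pig has two execution modes:
1) Local mode
In local mode, Pig runs in a single JVM and can access local files. This mode is suitable for processing small data sets or for learning.
Run the following command to start Pig in local mode: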

[linuxidc@jifeng02 ~]$ pig -x local
which: no hadoop in (/home/linuxidc/jdk1.7.0_45/bin:/home/linuxidc/jdk1.7.0_45/bin:/home/linuxidc/jdk1.7.0_45/bin:/usr/lib/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/linuxidc/bin:/home/linuxidc/apache-ant-1.9.4/bin:/home/linuxidc/hadoop/hive-0.12.0-bin/bin::/home/linuxidc/hbase-0.94.21/bin:/home/linuxidc/bin:/home/linuxidc/apache-ant-1.9.4/bin:/home/linuxidc/hadoop/hive-0.12.0-bin/bin::/home/linuxidc/hbase-0.94.21/bin:/home/linuxidc/pig-0.12.1/bin:/home/linuxidc/bin:/home/linuxidc/apache-ant-1.9.4/bin:/home/linuxidc/hadoop/hive-0.12.0-bin/bin::/home/linuxidc/hbase-0.94.21/bin:/home/linuxidc/pig-0.12.0/bin)
Warning: $HADOOP_HOME is deprecated.

2015-08-16 22:57:09,716 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
2015-08-16 22:57:09,717 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/linuxidc/pig_1439737029715.log
2015-08-16 22:57:09,735 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/linuxidc/.pigbootup not found
2015-08-16 22:57:09,828 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
grunt>
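
Besides the interactive grunt shell, local mode can also run a script file. The sketch below writes a small tab-separated input file and a three-line Pig Latin script, then runs it with pig -x local if the pig command is available. The file names, the sample data, and the aliases records and high are all made up for illustration; the relative path in LOAD is assumed to resolve against the current directory in local mode.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing in $HOME is touched.
workdir="$(mktemp -d)"

# Sample input (illustrative data only): name <TAB> score.
printf 'alice\t90\nbob\t72\ncarol\t85\n' > "$workdir/scores.txt"

# A small Pig Latin script: load the file, keep high scores, print them.
cat > "$workdir/high_scores.pig" <<'EOF'
records = LOAD 'scores.txt' USING PigStorage('\t') AS (name:chararray, score:int);
high = FILTER records BY score > 80;
DUMP high;
EOF

# Run the script in local mode only when pig is installed.
if command -v pig >/dev/null 2>&1; then
    (cd "$workdir" && pig -x local high_scores.pig)
else
    echo "pig not found; script written to $workdir/high_scores.pig"
fi
```

The same three statements can be typed one by one at the grunt> prompt instead of being saved to a file.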

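2) MapReduce mode

In MapReduce mode, Pig translates queries into MapReduce jobs and submits them to Hadoop (either a full cluster or a pseudo-distributed setup).

Check that your Pig version supports the Hadoop version you are running. A given Pig release supports only specific Hadoop versions; the Pig website lists the supported versions.

Pig uses the HADOOP_HOME environment variable. If it is not set, Pig falls back to its bundled Hadoop libraries, but there is then no guarantee that the bundled libraries are compatible with the Hadoop version you actually run, so setting HADOOP_HOME explicitly is recommended. PIG_CLASSPATH must also be set (here, to the Hadoop configuration directory):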

[linuxidc@jifeng02 ~]$ vi .bash_profile

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
export JAVA_HOME=$HOME/jdk1.7.0_45
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=$HOME/hadoop/hadoop-1.2.1
export ANT_HOME=$HOME/apache-ant-1.9.4

export HIVE_HOME=$HOME/hadoop/hive-0.12.0-bin
export HBASE_HOME=$HOME/hbase-0.94.21
export PIG_HOME=$HOME/pig-0.12.0
export PIG_CLASSPATH=$HOME/hadoop/hadoop-1.2.1/conf/
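
Before starting Pig in MapReduce mode, it can help to confirm that the two variables are in place. The following is a minimal sketch under stated assumptions: check_hadoop_env is a hypothetical helper, not a Pig or Hadoop command, and it assumes PIG_CLASSPATH holds a single directory, as in the profile above, rather than a colon-separated list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: $1 is the HADOOP_HOME value, $2 the PIG_CLASSPATH value.
# Returns 0 when HADOOP_HOME is non-empty and PIG_CLASSPATH is an existing directory.
check_hadoop_env() {
    local hadoop_home="$1"
    local pig_classpath="$2"
    if [ -z "$hadoop_home" ]; then
        echo "HADOOP_HOME is not set"
        return 1
    fi
    if [ ! -d "$pig_classpath" ]; then
        echo "PIG_CLASSPATH is not an existing directory: $pig_classpath"
        return 1
    fi
    echo "HADOOP_HOME and PIG_CLASSPATH look consistent"
}

# Check the current environment; unset variables are passed as empty strings.
if check_hadoop_env "${HADOOP_HOME:-}" "${PIG_CLASSPATH:-}"; then
    echo "start the shell with: pig -x mapreduce"
fi
```

The check only inspects the variables. It does not verify that the Hadoop daemons are running or that the Pig and Hadoop versions are compatible.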
