A Complete Walkthrough of the Hadoop Startup Scripts

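At work I often need to tune Hadoop configuration parameters, and that regularly leads to problems: a namenode that will not start after a configuration change, a newly added jar that the Hadoop JVM needs to load, a log directory that has to be relocated, and so on. Each time I had to read through the startup scripts carefully to find the cause, which is slow and tedious, so I wrote this summary for future reference.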

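Cloudera's Hadoop startup scripts are unusually complex and fragmented, with shell scripts scattered all over the system. Using the namenode startup as an example, this article traces how the scripts call one another and what each one does.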

The entry point is /etc/init.d/hadoop-hdfs-namenode. The sections below follow the scripts in the order they run when the namenode starts.

/etc/init.d/hadoop-hdfs-namenode:

#1. Source /etc/default/hadoop and /etc/default/hadoop-hdfs-namenode

#2. Run /usr/lib/hadoop/sbin/hadoop-daemon.sh to start the namenode

Cloudera starts the namenode as the hdfs user, and the default configuration directory is /etc/hadoop/conf.

start() {
  [ -x $EXEC_PATH ] || exit $ERROR_PROGRAM_NOT_INSTALLED
  [ -d $CONF_DIR ] || exit $ERROR_PROGRAM_NOT_CONFIGURED
  log_success_msg "Starting ${DESC}: "

  su -s /bin/bash $SVC_USER -c "$EXEC_PATH --config '$CONF_DIR' start $DAEMON_FLAGS"

  # Some processes are slow to start
  sleep $SLEEP_TIME
  checkstatusofproc
  RETVAL=$?

  [ $RETVAL -eq $RETVAL_SUCCESS ] && touch $LOCKFILE
  return $RETVAL
}
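To see what start() actually hands to su, the command line can be assembled and printed without running it. This is a minimal dry-run sketch, not the init script itself: the user, script path, and configuration directory come from the text above, while the value of DAEMON_FLAGS and the helper name print_start_cmd are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Values taken from the article text.
EXEC_PATH="/usr/lib/hadoop/sbin/hadoop-daemon.sh"
CONF_DIR="/etc/hadoop/conf"
SVC_USER="hdfs"
# Assumption: the init script passes the daemon name here.
DAEMON_FLAGS="namenode"

# Hypothetical helper: print the su command instead of executing it.
print_start_cmd() {
  printf '%s\n' "su -s /bin/bash $SVC_USER -c \"$EXEC_PATH --config '$CONF_DIR' start $DAEMON_FLAGS\""
}

start_cmd=$(print_start_cmd)
echo "$start_cmd"
```

Printing the command first is a cheap way to confirm which user and which configuration directory a daemon will start with before touching a live service.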

/etc/default/hadoop and /etc/default/hadoop-hdfs-namenode:

#1. Set the log directory, PID directory, and user

/usr/lib/hadoop/sbin/hadoop-daemon.sh:

#1. Source /usr/lib/hadoop/libexec/hadoop-config.sh

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
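The `${VAR:-default}` expansion used above is what makes the libexec directory overridable: the default applies only when HADOOP_LIBEXEC_DIR is unset or empty. The sketch below shows both cases. The value of `bin` matches the script location named in this article, and /opt/custom/libexec is a hypothetical override path chosen only for illustration.

```shell
#!/usr/bin/env bash
# Directory containing hadoop-daemon.sh, per the article.
bin="/usr/lib/hadoop/sbin"
DEFAULT_LIBEXEC_DIR="$bin"/../libexec

# Case 1: nothing exported, so the default is used.
unset HADOOP_LIBEXEC_DIR
resolved_default=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}

# Case 2: a value set beforehand wins (hypothetical path).
HADOOP_LIBEXEC_DIR="/opt/custom/libexec"
resolved_override=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}

echo "default:  $resolved_default"
echo "override: $resolved_override"
```

The same idiom appears later in the script for HADOOP_ROOT_LOGGER and HADOOP_STOP_TIMEOUT, so anything exported before hadoop-daemon.sh runs takes precedence over the built-in value.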

#2. Source hadoop-env.sh

if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi

#3. Set the log directory

# get log directory
if [ "$HADOOP_LOG_DIR" = "" ]; then
  export HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
fi

#4. Fill in the log file names, the log4j loggers, and related parameters

export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"INFO,RFA"}
export HADOOP_SECURITY_LOGGER=${HADOOP_SECURITY_LOGGER:-"INFO,RFAS"}
export HDFS_AUDIT_LOGGER=${HDFS_AUDIT_LOGGER:-"INFO,NullAppender"}
log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid
HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5}
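When hunting for a log or PID file it helps to expand these name patterns by hand. The sketch below repeats the three assignments with sample inputs. The directories, the ident string, and the host name are assumptions for illustration (typical of a packaged install, but check /etc/default/hadoop-hdfs-namenode on your own system).

```shell
#!/usr/bin/env bash
# Assumed sample values; the real ones come from /etc/default/* and the host.
HADOOP_LOG_DIR="/var/log/hadoop-hdfs"
HADOOP_PID_DIR="/var/run/hadoop-hdfs"
HADOOP_IDENT_STRING="hdfs"
command="namenode"
HOSTNAME="nn01.example.com"   # hypothetical host name

# Same patterns as in hadoop-daemon.sh.
HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid

echo "log4j file name: $HADOOP_LOGFILE"
echo "stdout/stderr:   $log"
echo "pid file:        $pid"
```

With these inputs the .out file would be /var/log/hadoop-hdfs/hadoop-hdfs-namenode-nn01.example.com.out, which is the first place to look when the daemon dies before log4j is initialized.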

#5. Call /usr/lib/hadoop-hdfs/bin/hdfs

hadoop_rotate_log $log
echo starting $command, logging to $log
cd "$HADOOP_PREFIX"
case $command in
  namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
    if [ -z "$HADOOP_HDFS_HOME" ]; then
      hdfsScript="$HADOOP_PREFIX"/bin/hdfs
    else
      hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
    fi
    nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
  ;;
  (*)
    nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
  ;;
esac
echo $! > $pid
sleep 1; head "$log"
sleep 3;
if ! ps -p $! > /dev/null ; then
  exit 1
fi
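The launch sequence above (background the process with nohup, record `$!` in a PID file, then verify the process is still alive) can be tried in isolation. In this minimal sketch `sleep 30` stands in for the hdfs script, the temporary directory and file names are hypothetical, and `kill -0` is swapped in for the `ps -p` check because it needs no external tool; it is an illustration of the pattern, not the Hadoop script.

```shell
#!/usr/bin/env bash
# Hypothetical scratch location for the demo log and pid files.
workdir=$(mktemp -d)
log="$workdir/demo.out"
pid="$workdir/demo.pid"

# Start a stand-in daemon detached from the terminal, output captured in $log.
nohup sleep 30 > "$log" 2>&1 < /dev/null &
echo $! > "$pid"

# Give the process a moment, then check that the recorded PID is alive.
sleep 1
if kill -0 "$(cat "$pid")" 2>/dev/null; then
  status="running"
else
  status="dead"
fi
echo "stand-in daemon is $status"

# Clean up the stand-in process and the scratch files.
kill "$(cat "$pid")" 2>/dev/null
wait 2>/dev/null
rm -rf "$workdir"
```

A process that crashes during startup leaves a PID file behind that points at nothing, which is why the script checks liveness a few seconds after the launch instead of trusting the PID file alone.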

As the script shows, the namenode's standard output and standard error are redirected to $log, that is, log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
