Hadoop script analysis
1. start-all.sh
   libexec/hadoop-config.sh                        -- sets up environment variables
   sbin/start-dfs.sh --config $HADOOP_CONF_DIR     -- starts HDFS
   sbin/start-yarn.sh --config $HADOOP_CONF_DIR    -- starts YARN

2. libexec/hadoop-config.sh
   COMMON_DIR ...
   HADOOP_CONF_DIR=...
   HEAP_SIZE=1000m, CLASSPATH=...

3. sbin/start-dfs.sh
   1) sources libexec/hdfs-config.sh
   2) gets the namenode hostnames:
      NAMENODES=$(hdfs getconf -namenodes)
   3) starts the namenode(s):
      "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
        --config "$HADOOP_CONF_DIR" \
        --hostnames "$NAMENODES" \
        --script "$bin/hdfs" start namenode $nameStartOpt
   4) starts the datanodes:
      "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
        --config "$HADOOP_CONF_DIR" \
        --script "$bin/hdfs" start datanode $dataStartOpt
   5) starts the secondary namenode (2NN):
      "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
        --config "$HADOOP_CONF_DIR" \
        --hostnames "$SECONDARY_NAMENODES" \
        --script "$bin/hdfs" start secondarynamenode
   6) starts the quorum journal nodes (if configured):
      "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
        --config "$HADOOP_CONF_DIR" \
        --hostnames "$JOURNAL_NODES" \
        --script "$bin/hdfs" start journalnode
   7) starts the ZK failover controllers, if automatic HA is enabled:
      echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
      "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
        --config "$HADOOP_CONF_DIR" \
        --hostnames "$NAMENODES" \
        --script "$bin/hdfs" start zkfc

4. libexec/hdfs-config.sh
   calls libexec/hadoop-config.sh

5. sbin/hadoop-daemons.sh -- the script that starts daemons on the slave hosts
   1) sources libexec/hdfs-config.sh
   2) exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
      slaves.sh loops over the slaves file, logs in to each remote host over ssh, and runs the given command there (a simplified sketch of this loop appears after this list)
   [hadoop-daemon.sh]  calls hadoop-config.sh
   [bin/hdfs]          this is the command that ultimately launches the Java process

6. Once the scripts are understood, each node can be operated on precisely (an example session appears after this list):
   start a daemon on the local host:      hadoop-daemon.sh
   start daemons on all slave hosts:      hadoop-daemons.sh
   hadoop-daemon.sh start namenode            (run on the namenode host)
   hadoop-daemon.sh start secondarynamenode   (log in to the 2NN host and run it there)
   hadoop-daemons.sh start datanode           (run on the namenode host; starts all datanodes)
   hadoop-daemon.sh start datanode            (run on a datanode host; starts only the local datanode)

7. bin/hadoop
   sources hadoop-config.sh, then ultimately invokes the Java program

8. bin/hdfs
   sources hadoop-config.sh, then ultimately invokes the Java program (a sketch of how the subcommand is mapped to a Java main class appears at the end of this section)
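The ssh fan-out mentioned in point 5 is done by slaves.sh. Below is a minimal sketch of that loop, assuming a Hadoop 2.x layout; details such as HADOOP_SLAVE_SLEEP handling are omitted, so treat it as an illustration rather than the exact script.

    # Minimal sketch of the slaves.sh loop (Hadoop 2.x style, abridged).
    # The host list defaults to $HADOOP_CONF_DIR/slaves unless HADOOP_SLAVES is set.
    HOSTLIST="${HADOOP_SLAVES:-$HADOOP_CONF_DIR/slaves}"

    # Strip comments and blank lines, then fan the given command out over ssh,
    # prefixing every output line with the slave's hostname.
    for slave in $(sed 's/#.*$//;/^$/d' "$HOSTLIST"); do
      ssh $HADOOP_SSH_OPTS "$slave" "$@" 2>&1 | sed "s/^/$slave: /" &
    done
    wait

This is why hadoop-daemons.sh can start a daemon on every slave from a single host: it simply asks slaves.sh to run hadoop-daemon.sh remotely on each entry of the slaves file.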
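To make point 6 concrete, a possible manual start-up sequence looks like the following. It assumes Hadoop's sbin directory is on PATH and that each command is run on the host named in its comment.

    # Start the HDFS daemons one by one instead of via start-dfs.sh.
    hadoop-daemon.sh  --config $HADOOP_CONF_DIR start namenode            # on the namenode host
    hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode            # on the namenode host; starts every datanode in the slaves file
    hadoop-daemon.sh  --config $HADOOP_CONF_DIR start secondarynamenode   # on the 2NN host itself

    # On each host, verify which daemons came up.
    jps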
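Points 7 and 8 both end in a JVM launch. The sketch below paraphrases how bin/hdfs (Hadoop 2.x) maps a subcommand to a main class and execs java; only a few branches are shown and the option handling is heavily abridged, so it is an approximation of the real script, not a copy.

    # Abridged sketch of bin/hdfs: pick the main class for the subcommand,
    # then exec a JVM (CLASSPATH has already been assembled and exported
    # by hadoop-config.sh, and JAVA / JAVA_HEAP_MAX are set there too).
    COMMAND=$1
    shift

    case "$COMMAND" in
      namenode)          CLASS=org.apache.hadoop.hdfs.server.namenode.NameNode ;;
      datanode)          CLASS=org.apache.hadoop.hdfs.server.datanode.DataNode ;;
      secondarynamenode) CLASS=org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode ;;
      dfs)               CLASS=org.apache.hadoop.fs.FsShell ;;
    esac

    exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"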
Please credit when reposting: MitNick » Hadoop script analysis