Hadoop Distributed Development Environment Setup Tutorial

Fa1c0n, March 01, 2017

Host environment:
Mac OS X El Capitan 10.11.4
Virtualization software: VMware Fusion 8.1 Professional
Download and install the VMware virtual machine software; since installing it is not the focus of this article, that part is not covered here.
The guest OS for the virtual machines is the latest Ubuntu release, Ubuntu 15.10 Wily Werewolf; click here to download the ISO image.
Ubuntu ISO Image

Once the download finishes, open VMware and create a new virtual machine. During creation, it is recommended not to enable Linux Easy Install, as shown below:
Disable QuickSetup Image
After the virtual machine has been created, do not rush to power it on and start the OS installation. First open the settings page, select Processors & Memory, and assign the number of CPU cores and an appropriate amount of memory based on the host machine; this tutorial uses 4 processor cores and 2 GB of memory. In the advanced options it is recommended to enable the hypervisor applications and code profiling applications options, which can improve the performance of some programs inside the VM, as shown below:
CPUSettings Image
Click Show All, then select Network Adapter. For Hadoop the network adapter must be set to bridged mode; pick the bridged option that matches the host machine's network, as shown below:
BridgeNetwork Image
Go back and open the Hard Disk settings, where the disk size can be increased as appropriate. Hadoop is a distributed storage system, so 40 GB of disk space is recommended. Once that is done, power on the virtual machine; it will load the ISO automatically and the Ubuntu installation will start, as shown below:
UbuntuSetupWelcome Image
Note: on the disk setup screen, encrypting the Ubuntu installation is not recommended, as it will cause trouble in later steps.
UbuntuDiskSetting Image
The Ubuntu installation process itself is not described in this article.
UbuntuSetup Image
As shown below, the Ubuntu installation is complete:
UbuntuFinish Image
UbuntuDesktop Image
After the initial Ubuntu installation, open a terminal; the following two commands need to be run first:

sudo apt-get update
sudo apt-get dist-upgrade

DistUpgrade Image
It is recommended to reboot a freshly installed Ubuntu once after dist-upgrade, since the upgrade includes kernel-level packages such as the Linux kernel image and linux-firmware, as shown below:
DistUpgradeReboot Image
Because this distributed development environment is built on a laptop with limited resources, the cluster consists of only three hosts.
As is well known, Hadoop is written in Java, and Hadoop and MapReduce need a JDK to run, so the JDK must be installed and configured before installing Hadoop. Click here to go to the official Oracle download page.
After downloading, I placed the JDK under /usr/local/jdk; adjust the path to suit your own setup.
jdkUncompressPath Image
jdkFolder Image
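For reference, here is a minimal sketch of unpacking the JDK tarball into /usr/local/jdk; the download path, archive name, and extracted folder name below are placeholders, so substitute the file you actually downloaded:

cd /usr/local
sudo tar -zxvf /home/fa1c0n/download/jdk-8uXX-linux-x64.tar.gz
sudo mv jdk1.8.0_XX jdk
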
With the JDK in place, its environment variables need to be set. In a terminal, enter:

sudo vi /etc/profile

Then append the following lines:

export JAVA_HOME=/usr/local/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

As shown below:
jdkProfileSettings Image
To make the environment variables take effect, either reboot or run the following commands, as shown below:

source /etc/profile
java -version

jdksourceProfile Image
When java -version successfully prints the Java version information, the JDK configuration is complete.

Next, download the Hadoop binary package, not the source package, from the official Hadoop website; be sure to choose the binary download rather than source. (Otherwise you would have to set up a build environment and compile Hadoop yourself; compiling Hadoop will be described in detail in another article and is not covered here.) Click here for the link. For this article I downloaded the latest release at the time, Hadoop 2.6.4, as shown below:
DownloadHadoopBinary Image
After the download completes, first enable root login on Ubuntu. A fresh Ubuntu installation does not allow logging in as root by default; to enable root login, run:

sudo vi /usr/share/lightdm/lightdm.conf.d/50-ubuntu.conf

and append the following line at the end of the file:

greeter-show-manual-login=true

As shown below:
enableRootLogin Image
Since root login has just been enabled, a password also needs to be set for the root account. Enter:

sudo passwd root

Enter the password for the root account twice, as shown below:
passwdRoot Image
After completing the commands above, reboot Ubuntu. The login screen now shows a manual login option; select the root user and enter the password to log in, as shown below:
rootLogin Image
Next, passwordless SSH login needs to be configured. Install SSH with the following command:

sudo apt-get install ssh

As shown below:
sshInstall Image
After the installation finishes, run the following command to check whether the SSH service is running, as shown below:

ps -e | grep ssh

grepSSHInstall Image
Once installed, open the SSH daemon configuration file to allow remote root login:

vi /etc/ssh/sshd_config

Change the PermitRootLogin line as follows:

#PermitRootLogin without-password
PermitRootLogin yes

permitRootLoginSSHConfig Image
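For the change to take effect, the SSH daemon generally needs to be restarted. A minimal sketch (on Ubuntu 15.10 the service command delegates to systemd):

service ssh restart
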
Generating the SSH keys has to be done on all three machines, so the SSH key setup is deferred until the cluster machines exist.
Now extract the Hadoop archive downloaded earlier, as shown below:
hadoopUncompress Image
Alternatively, extract it from the terminal with the following command:

tar -zxvf /home/fa1c0n/download/hadoop-2.6.4.tar.gz

Note that the later steps assume the extracted directory ends up at /root/hadoop-2.6.4/, so run the command from /root (or move the directory there afterwards).
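If the archive was unpacked somewhere else, a minimal sketch of moving it into place (the source path is only an example):

mv /home/fa1c0n/download/hadoop-2.6.4 /root/hadoop-2.6.4
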
Configuring Hadoop involves editing the following files; see the list:

core-site.xml
hadoop-env.sh
hdfs-site.xml
mapred-site.xml
slaves
yarn-env.sh
yarn-site.xml

Since mapred-site.xml does not exist by default, open a terminal and create it from the provided template with the following commands, as shown below:

cd /root/hadoop-2.6.4/
cd etc/hadoop/
cp mapred-site.xml.template mapred-site.xml

createmapred-sitexml Image
First, run the following command to edit core-site.xml:

vi /root/hadoop-2.6.4/etc/hadoop/core-site.xml

As shown below:
coresitexml Image
The contents of core-site.xml are as follows; to avoid typos you can paste them in directly:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hdfs_all/tmp</value>
  </property>
</configuration>

Next, edit hadoop-env.sh with a similar command:

vi /root/hadoop-2.6.4/etc/hadoop/hadoop-env.sh

As shown below:
hadoopenvsh Image
Likewise, the contents of hadoop-env.sh are as follows:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/local/jdk
# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol.  Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol.  This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored.  $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

Next, edit hdfs-site.xml in the same way:

vi /root/hadoop-2.6.4/etc/hadoop/hdfs-site.xml

As shown below:
hdfs-sitexml Image
Likewise, the contents of hdfs-site.xml are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hdfs_all/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hdfs_all/dfs/data</value>
    </property>
</configuration>
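
The local paths referenced in core-site.xml and hdfs-site.xml must be writable on every node. Hadoop can usually create them itself, but pre-creating them avoids permission surprises; a minimal sketch, run once on each cluster machine:

mkdir -p /home/hdfs_all/tmp /home/hdfs_all/dfs/name /home/hdfs_all/dfs/data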

Next, edit mapred-site.xml in the same way:

vi /root/hadoop-2.6.4/etc/hadoop/mapred-site.xml

As shown below:
mapred-sitexml Image
Likewise, the contents of mapred-site.xml are as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>

Next, edit the slaves file in the same way:

vi /root/hadoop-2.6.4/etc/hadoop/slaves

As shown below:
slaves Image
Likewise, the contents of the slaves file are as follows:

slave1
slave2

Note: the localhost entry that the file ships with must be removed, because this file lists the DataNodes, not the NameNode.

Next, edit yarn-env.sh in the same way:

vi /root/hadoop-2.6.4/etc/hadoop/yarn-env.sh

As shown below:
yarn-envsh Image
Likewise, the contents of yarn-env.sh are as follows:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# User for YARN daemons
export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn}

# resolve links - $0 may be a softlink
export YARN_CONF_DIR="${YARN_CONF_DIR:-$HADOOP_YARN_HOME/conf}"

# some Java parameters
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/local/jdk
if [ "$JAVA_HOME" != "" ]; then
  #echo "run java in $JAVA_HOME"
  JAVA_HOME=$JAVA_HOME
fi

if [ "$JAVA_HOME" = "" ]; then
  echo "Error: JAVA_HOME is not set."
  exit 1
fi

JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx1000m

# For setting YARN specific HEAP sizes please use this
# Parameter and set appropriately
# YARN_HEAPSIZE=1000

# check envvars which might override default args
if [ "$YARN_HEAPSIZE" != "" ]; then
  JAVA_HEAP_MAX="-Xmx""$YARN_HEAPSIZE""m"
fi

# Resource Manager specific parameters

# Specify the max Heapsize for the ResourceManager using a numerical value
# in the scale of MB. For example, to specify an jvm option of -Xmx1000m, set
# the value to 1000.
# This value will be overridden by an Xmx setting specified in either YARN_OPTS
# and/or YARN_RESOURCEMANAGER_OPTS.
# If not specified, the default value will be picked from either YARN_HEAPMAX
# or JAVA_HEAP_MAX with YARN_HEAPMAX as the preferred option of the two.
#export YARN_RESOURCEMANAGER_HEAPSIZE=1000

# Specify the max Heapsize for the timeline server using a numerical value
# in the scale of MB. For example, to specify an jvm option of -Xmx1000m, set
# the value to 1000.
# This value will be overridden by an Xmx setting specified in either YARN_OPTS
# and/or YARN_TIMELINESERVER_OPTS.
# If not specified, the default value will be picked from either YARN_HEAPMAX
# or JAVA_HEAP_MAX with YARN_HEAPMAX as the preferred option of the two.
#export YARN_TIMELINESERVER_HEAPSIZE=1000

# Specify the JVM options to be used when starting the ResourceManager.
# These options will be appended to the options specified as YARN_OPTS
# and therefore may override any similar flags set in YARN_OPTS
#export YARN_RESOURCEMANAGER_OPTS=

# Node Manager specific parameters

# Specify the max Heapsize for the NodeManager using a numerical value
# in the scale of MB. For example, to specify an jvm option of -Xmx1000m, set
# the value to 1000.
# This value will be overridden by an Xmx setting specified in either YARN_OPTS
# and/or YARN_NODEMANAGER_OPTS.
# If not specified, the default value will be picked from either YARN_HEAPMAX
# or JAVA_HEAP_MAX with YARN_HEAPMAX as the preferred option of the two.
#export YARN_NODEMANAGER_HEAPSIZE=1000

# Specify the JVM options to be used when starting the NodeManager.
# These options will be appended to the options specified as YARN_OPTS
# and therefore may override any similar flags set in YARN_OPTS
#export YARN_NODEMANAGER_OPTS=

# so that filenames w/ spaces are handled correctly in loops below
IFS=


# default log directory & file
if [ "$YARN_LOG_DIR" = "" ]; then
  YARN_LOG_DIR="$HADOOP_YARN_HOME/logs"
fi
if [ "$YARN_LOGFILE" = "" ]; then
  YARN_LOGFILE='yarn.log'
fi

# default policy file for service-level authorization
if [ "$YARN_POLICYFILE" = "" ]; then
  YARN_POLICYFILE="hadoop-policy.xml"
fi

# restore ordinary behaviour
unset IFS


YARN_OPTS="$YARN_OPTS -Dhadoop.log.dir=$YARN_LOG_DIR"
YARN_OPTS="$YARN_OPTS -Dyarn.log.dir=$YARN_LOG_DIR"
YARN_OPTS="$YARN_OPTS -Dhadoop.log.file=$YARN_LOGFILE"
YARN_OPTS="$YARN_OPTS -Dyarn.log.file=$YARN_LOGFILE"
YARN_OPTS="$YARN_OPTS -Dyarn.home.dir=$YARN_COMMON_HOME"
YARN_OPTS="$YARN_OPTS -Dyarn.id.str=$YARN_IDENT_STRING"
YARN_OPTS="$YARN_OPTS -Dhadoop.root.logger=${YARN_ROOT_LOGGER:-INFO,console}"
YARN_OPTS="$YARN_OPTS -Dyarn.root.logger=${YARN_ROOT_LOGGER:-INFO,console}"
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
  YARN_OPTS="$YARN_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
fi
YARN_OPTS="$YARN_OPTS -Dyarn.policy.file=$YARN_POLICYFILE"

Next, edit yarn-site.xml in the same way:

vi /root/hadoop-2.6.4/etc/hadoop/yarn-site.xml

As shown below:
yarn-sitexml Image
Likewise, the contents of yarn-site.xml are as follows:

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>

At this point all of the Hadoop configuration files are done. To avoid mistakes while pasting, you can click here to download a package of the configuration files and simply extract all of them into

/root/hadoop-2.6.4/etc/hadoop/

and you are done.
The next step is to configure the environment variables. Run the following command to open the environment file:

vi /etc/environment

As shown below:
pathenvironmentsetting Image

PATH="/root/hadoop-2.6.4/bin:/root/hadoop-2.6.4/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"

Note the Hadoop bin and sbin paths added at the front. Then run the following command to make the updated environment variable take effect; if it still does not take effect, try rebooting:

source /etc/environment

As shown below:
sourcetcenvironment Image
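As an optional check that the Hadoop binaries are now on the PATH, the following command should print the installed version (2.6.4):

hadoop version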

At this point, everything that needs to be done on a single machine is finished. Run poweroff to shut the virtual machine down, then go back to VMware Fusion 8 and clone it twice.
Click the virtual machine in the library sidebar, choose Create Full Clone, and enter a name for the cloned virtual machine, as shown below:
ubuntucloning Image
Once cloning completes, the VMware virtual machine library should look like this:

clonefinishvmwarelist Image
Next, all three virtual machines need to be worked on. Start all three at the same time, as shown below:
ubuntuVMx3 Image
Open a terminal on each of the three virtual machines and run ifconfig to find each machine's IP address. Assign the master and slave roles according to the virtual machine names and record them in the /etc/hosts file on every machine, as shown below:
etchostsettings Image
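A minimal sketch of the entries added to /etc/hosts on every machine (the IP addresses below are hypothetical; use the ones reported by ifconfig on your network):

192.168.1.101 master
192.168.1.102 slave1
192.168.1.103 slave2
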
After saving the hosts file, configure the SSH keys on each of the three hosts. On master, run the following commands:

ufw disable
ssh-keygen -t dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ls .ssh/
scp authorized_keys slave1:~/.ssh/

As shown below:
masterSSHKeygenSettings Image
On slave1, run the following commands:

ufw disable
ssh-keygen -t dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ls .ssh/
scp authorized_keys slave2:~/.ssh/

As shown below:
slave1SSHKeygenSettings Image
On slave2, run the following commands:

ufw disable
ssh-keygen -t dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ls .ssh/
scp authorized_keys master:~/.ssh/
scp authorized_keys slave1:~/.ssh/

As shown below:
slave2SSHKeygenSettings_1 Image
slave2SSHKeygenSettings_2 Image
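After this exchange, every node's authorized_keys contains all three public keys. As an optional sanity check (a sketch, using the host names defined in /etc/hosts earlier), verify passwordless login from master:

ssh slave1 hostname
ssh slave2 hostname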

That completes the SSH key setup. Next, run the following command on the master node to format the NameNode:

hadoop namenode -format

如下图所示:
hadoopnamenodeformat1 Image
hadoopnamenodeformat2 Image
hadoopnamenodeformat3 Image
hadoopnamenodeformat4 Image
hadoopnamenodeformat5 Image
hadoopnamenodeformat6 Image

Hadoop is now fully configured. Run the following command on the master node to start Hadoop:

start-all.sh

As shown below:
startallsh Image
As the screenshot above shows, Hadoop started successfully. To verify that the services are up, run jps on each of the three machines. The jps output on master is shown below:
masterjps Image
The jps output on slave1 and slave2 is shown below:
slave1jps Image
slave2jps Image
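Roughly speaking, with this configuration master should list NameNode, SecondaryNameNode, and ResourceManager, while slave1 and slave2 should list DataNode and NodeManager (plus Jps itself on every node); the process IDs will of course differ.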
If your jps output matches the screenshots, the setup was successful. Run the following command to view the cluster status:

hadoop dfsadmin -report

As shown below:
dfsadminreport1 Image
dfsadminreport2 Image
As these screenshots show, Hadoop has been configured successfully. Viewed in a browser, it looks like this:
hadoopcluster Image
hadoopoverview1 Image
hadoopoverview2 Image
hadoopoverview3 Image
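For reference, the web pages shown above should be reachable at the following addresses: 50070 is the default NameNode HTTP port in Hadoop 2.x, and 8088 is the ResourceManager web address configured in yarn-site.xml above.

http://master:50070
http://master:8088
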
At this point, the Hadoop distributed cluster development environment is complete. To stop Hadoop, run the following command:

stop-all.sh

As shown below:
stopallsh Image