Hadoop 2.6.0-cdh5.7.0 distributed installation + HA. Installation steps:

1. Extract Hadoop and set the environment variables (on every machine)

[hadoop@new-cdh1 soft]$ tar -zvxf hadoop-2.6.0-cdh5.7.0.tar.gz
[hadoop@new-cdh1 soft]$ vi ~/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

# User specific environment and startup programs
export HADOOP_HOME=/hadoop/soft/hadoop-2.6.0-cdh5.7.0
PATH=$PATH:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export PATH

[hadoop@new-cdh1 soft]$ source ~/.bash_profile
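
A quick sanity check that the new variables took effect:

[hadoop@new-cdh1 soft]$ which hadoop    # should print /hadoop/soft/hadoop-2.6.0-cdh5.7.0/bin/hadoop
[hadoop@new-cdh1 soft]$ hadoop version  # should report 2.6.0-cdh5.7.0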

2. Edit the $HADOOP_HOME/etc/hadoop/slaves file

[hadoop@new-cdh1 soft]$ vi hadoop-2.6.0-cdh5.7.0/etc/hadoop/slaves
new-cdh15
new-cdh16
new-cdh5
new-cdh6
new-cdh7
new-cdh9
new-cdh10
new-cdh11
new-cdh12
new-cdh13

3. Edit the $HADOOP_HOME/etc/hadoop/hadoop-env.sh and $HADOOP_HOME/etc/hadoop/yarn-env.sh files

[hadoop@new-cdh1 soft]$ vi hadoop-2.6.0-cdh5.7.0/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/opt/jdk1.7.0_79

[hadoop@new-cdh1 soft]$ vi hadoop-2.6.0-cdh5.7.0/etc/hadoop/yarn-env.sh

export JAVA_HOME=/opt/jdk1.7.0_79

4. Edit the $HADOOP_HOME/etc/hadoop/core-site.xml file

<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://familyha</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/hadoop/tmp/hadoop-${user.name}</value>
	</property>
	<property>
		<name>fs.trash.interval</name>
		<value>1</value>
	</property>
	<property>
		<name>io.native.lib.available</name>
		<value>true</value>
	</property>
	<property>
		<name>io.compression.codecs</name>
		<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
	</property>
	<property>
		<name>io.file.buffer.size</name>
		<value>131072</value>
	</property>
	<property>
		<name>ha.zookeeper.quorum</name>
		<value>new-cdh12:2181,new-cdh13:2181,new-cdh15:2181,new-cdh16:2181,new-cdh17:2181</value>
	</property>
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/hadoop/.ssh/id_rsa</value>
	</property>
</configuration>

5. Edit the $HADOOP_HOME/etc/hadoop/mapred-site.xml file

<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>new-cdh10:10020</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>new-cdh10:19888</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.webapp.https.address</name>
		<value>new-cdh10:19890</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.admin.address</name>
		<value>new-cdh10:10033</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.max-age-ms</name>
		<value>604800000</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.cleaner.interval</name>
		<value>86400000</value>
	</property>
	<property>
		<name>yarn.app.mapreduce.am.staging-dir</name>
		<value>/user</value>
	</property>
</configuration>

6. Edit the $HADOOP_HOME/etc/hadoop/hdfs-site.xml file

<configuration>
	<property>
		<name>dfs.nameservices</name>
		<value>familyha</value>
	</property>
	<property>
		<name>dfs.ha.namenodes.familyha</name>
		<value>family1,family2</value>
	</property>
	<property>
		<name>dfs.namenode.rpc-address.familyha.family1</name>
		<value>new-cdh1:8020</value>
	</property>
	<property>
		<name>dfs.namenode.rpc-address.familyha.family2</name>
		<value>new-cdh2:8020</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.familyha.family1</name>
		<value>new-cdh1:50070</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.familyha.family2</name>
		<value>new-cdh2:50070</value>
	</property>
	<property>
		<name>dfs.namenode.servicerpc-address.familyha.family1</name>
		<value>new-cdh1:53333</value>
	</property>
	<property>
		<name>dfs.namenode.servicerpc-address.familyha.family2</name>
		<value>new-cdh2:53333</value>
	</property>
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://new-cdh5:8485;new-cdh6:8485;new-cdh7:8485/familyha</value>
	</property>
	<property>
		<name>dfs.client.failover.proxy.provider.familyha</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>/hadoop/data/journal</value>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file:/hadoop/data/dfs/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>file:/hadoop/data/dfs/data</value>
	</property>
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>dfs.journalnode.http-address</name>
		<value>0.0.0.0:8480</value>
	</property>
	<property>
		<name>dfs.journalnode.rpc-address</name>
		<value>0.0.0.0:8485</value>
	</property>
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
</configuration>

7. Edit the $HADOOP_HOME/etc/hadoop/yarn-site.xml file

<configuration>
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>new-cdh3</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>new-cdh4</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.id</name>
		<value>rm1</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address.rm1</name>
		<value>${yarn.resourcemanager.hostname.rm1}:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm1</name>
		<value>${yarn.resourcemanager.hostname.rm1}:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.https.address.rm1</name>
		<value>${yarn.resourcemanager.hostname.rm1}:8089</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>${yarn.resourcemanager.hostname.rm1}:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
		<value>${yarn.resourcemanager.hostname.rm1}:8025</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm1</name>
		<value>${yarn.resourcemanager.hostname.rm1}:8041</value>
	</property>

	<property>
		<name>yarn.resourcemanager.address.rm2</name>
		<value>${yarn.resourcemanager.hostname.rm2}:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm2</name>
		<value>${yarn.resourcemanager.hostname.rm2}:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.https.address.rm2</name>
		<value>${yarn.resourcemanager.hostname.rm2}:8089</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>${yarn.resourcemanager.hostname.rm2}:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
		<value>${yarn.resourcemanager.hostname.rm2}:8025</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm2</name>
		<value>${yarn.resourcemanager.hostname.rm2}:8041</value>
	</property>

	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.nodemanager.local-dirs</name>
		<value>/hadoop/data/yarn/local</value>
	</property>
	<property>
		<name>yarn.nodemanager.log-dirs</name>
		<value>/hadoop/data/yarn/log</value>
	</property>
	<property>
		<name>yarn.client.failover-proxy-provider</name>
		<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk-state-store.address</name>
		<value>new-cdh12:2181,new-cdh13:2181,new-cdh15:2181,new-cdh16:2181,new-cdh17:2181</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>new-cdh12:2181,new-cdh13:2181,new-cdh15:2181,new-cdh16:2181,new-cdh17:2181</value>
	</property>
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>cluster</value>
	</property>
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
	</property>
	<property>
		<name>yarn.scheduler.fair.allocation.file</name>
		<value>/hadoop/soft/hadoop-2.6.0-cdh5.7.0/etc/hadoop/fairscheduler.xml</value>
	</property>

</configuration>

8. Create the $HADOOP_HOME/etc/hadoop/fairscheduler.xml file

<?xml version="1.0"?>
<allocations>
	<!-- queue names below are placeholders; substitute your own -->
	<queue name="queue1">
		<minResources>1024 mb, 1 vcores</minResources>
		<maxResources>1536 mb, 1 vcores</maxResources>
		<maxRunningApps>5</maxRunningApps>
		<minSharePreemptionTimeout>300</minSharePreemptionTimeout>
		<weight>1.0</weight>
		<aclSubmitApps>root,yarn,search,hdfs</aclSubmitApps>
	</queue>
	<queue name="queue2">
		<minResources>1024 mb, 1 vcores</minResources>
		<maxResources>1536 mb, 1 vcores</maxResources>
	</queue>
	<queue name="queue3">
		<minResources>1024 mb, 1 vcores</minResources>
		<maxResources>1536 mb, 1 vcores</maxResources>
	</queue>
</allocations>

9. Create the required directories

Create the following directories on the relevant machines:

mkdir -p /hadoop/data/journal;
mkdir -p /hadoop/data/dfs/name;
mkdir -p /hadoop/data/dfs/data;
mkdir -p /hadoop/data/yarn/local;
mkdir -p /hadoop/data/yarn/log
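
If passwordless SSH is already in place for the hadoop user, the directories can also be created on every node from one machine. The host list below is illustrative, adjust it to your cluster; creating all five directories everywhere is harmless, even on nodes that only need some of them:

[hadoop@new-cdh1 ~]$ for h in new-cdh1 new-cdh2 new-cdh5 new-cdh6 new-cdh7 new-cdh9 new-cdh10 new-cdh11 new-cdh12 new-cdh13; do
>   ssh $h 'mkdir -p /hadoop/data/journal /hadoop/data/dfs/name /hadoop/data/dfs/data /hadoop/data/yarn/local /hadoop/data/yarn/log'
> done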

10. Copy the installation to the other machines

[hadoop@new-cdh1 soft]$ scp -r hadoop-2.6.0-cdh5.7.0 new-cdh2:~/soft/
[hadoop@new-cdh1 soft]$ scp -r hadoop-2.6.0-cdh5.7.0 new-cdh3:~/soft/
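
The same copy has to reach every remaining node; a loop saves repeating the command (host list illustrative, adjust to your cluster):

[hadoop@new-cdh1 soft]$ for h in new-cdh4 new-cdh5 new-cdh6 new-cdh7 new-cdh9 new-cdh10 new-cdh11 new-cdh12 new-cdh13 new-cdh15 new-cdh16; do
>   scp -r hadoop-2.6.0-cdh5.7.0 $h:~/soft/
> done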


11. Edit the configuration on new-cdh4

In yarn-site.xml on new-cdh4, change yarn.resourcemanager.ha.id from rm1 to rm2:

    <property>
            <name>yarn.resourcemanager.ha.id</name>
            <value>rm2</value>
    </property>
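
If the file on new-cdh4 is an unmodified copy from new-cdh1, a one-liner can make the change; the pattern matches only the ha.id value, since the rm-ids value is rm1,rm2 rather than rm1 alone:

[hadoop@new-cdh4 ~]$ sed -i 's#<value>rm1</value>#<value>rm2</value>#' ~/soft/hadoop-2.6.0-cdh5.7.0/etc/hadoop/yarn-site.xml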

12. First startup

1. Start ZooKeeper (omitted)

2. Format the ZooKeeper cluster

Run on new-cdh1:

[hadoop@new-cdh1 soft]$ hdfs zkfc -formatZK


16/06/22 17:51:37 INFO zookeeper.ClientCnxn: Session establishment complete on server new-cdh15/192.168.36.15:2181, sessionid = 0xf5576a58dc00004, negotiated timeout = 5000
16/06/22 17:51:37 INFO ha.ActiveStandbyElector: Session connected.
16/06/22 17:51:37 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/familyha in ZK.
16/06/22 17:51:37 INFO zookeeper.ZooKeeper: Session: 0xf5576a58dc00004 closed
16/06/22 17:51:37 INFO zookeeper.ClientCnxn: EventThread shut down

Check the result in ZooKeeper:

[hadoop@new-cdh12 ~]$ zkCli.sh
Connecting to localhost:2181
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[familyha]

3. Start the JournalNode processes

Start and verify on new-cdh5, new-cdh6, and new-cdh7:

[hadoop@new-cdh5 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-journalnode-new-cdh5.out
[hadoop@new-cdh5 ~]$ jps
3697 Jps
3653 JournalNode
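
With passwordless SSH, all three JournalNodes can also be started from one machine; the full path is used because a non-interactive ssh shell does not source ~/.bash_profile:

[hadoop@new-cdh1 ~]$ for h in new-cdh5 new-cdh6 new-cdh7; do
>   ssh $h '/hadoop/soft/hadoop-2.6.0-cdh5.7.0/sbin/hadoop-daemon.sh start journalnode'
> done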

4. Format the NameNode on new-cdh1

[hadoop@new-cdh1 ~]$ hdfs namenode -format
16/06/22 18:10:12 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = new-cdh1/192.168.36.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.0-cdh5.7.0



16/06/22 18:10:18 INFO namenode.FSImage: Allocated new BlockPoolId: BP-996370904-192.168.36.1-1466590218912
16/06/22 18:10:19 INFO common.Storage: Storage directory /hadoop/data/dfs/name has been successfully formatted.
16/06/22 18:10:19 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/06/22 18:10:19 INFO util.ExitUtil: Exiting with status 0
16/06/22 18:10:19 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at new-cdh1/192.168.36.1
************************************************************/

5. Start and verify the NameNode on new-cdh1

[hadoop@new-cdh1 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-new-cdh1.out
[hadoop@new-cdh1 ~]$ jps
5486 NameNode
5577 Jps

6. Sync the metadata of the freshly formatted NameNode on new-cdh1 to the standby NameNode on new-cdh2

[hadoop@new-cdh2 ~]$ hdfs namenode -bootstrapStandby
16/06/22 18:17:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = new-cdh2/192.168.36.2
STARTUP_MSG: args = [-bootstrapStandby]
STARTUP_MSG: version = 2.6.0-cdh5.7.0
STARTUP_MSG: classpath =

About to bootstrap Standby ID family2 from:
Nameservice ID: familyha
Other Namenode ID: family1
Other NN’s HTTP address: http://new-cdh1:50070
Other NN’s IPC address: new-cdh1/192.168.36.1:53333
Namespace ID: 1800644275
Block pool ID: BP-996370904-192.168.36.1-1466590218912
Cluster ID: CID-753f5b34-21e6-4305-9672-50607ce8d630
Layout version: -60
isUpgradeFinalized: true

16/06/22 18:17:58 INFO common.Storage: Storage directory /hadoop/data/dfs/name has been successfully formatted.
16/06/22 18:17:59 INFO namenode.TransferFsImage: Opening connection to http://new-cdh1:50070/imagetransfer?getimage=1&txid=0&storageInfo=-60:1800644275:0:CID-753f5b34-21e6-4305-9672-50607ce8d630&bootstrapstandby=true
16/06/22 18:17:59 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
16/06/22 18:17:59 INFO namenode.TransferFsImage: Transfer took 0.05s at 0.00 KB/s
16/06/22 18:17:59 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 353 bytes.
16/06/22 18:17:59 INFO util.ExitUtil: Exiting with status 0
16/06/22 18:17:59 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at new-cdh2/192.168.36.2
************************************************************/

7. Start and verify the NameNode on new-cdh2

[hadoop@new-cdh2 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-new-cdh2.out
[hadoop@new-cdh2 ~]$ jps
4178 Jps
4087 NameNode

8. Start and verify all DataNodes

[hadoop@new-cdh2 ~]$ hadoop-daemons.sh start datanode
new-cdh5: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh5.out
new-cdh6: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh6.out
new-cdh12: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh12.out
new-cdh7: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh7.out
new-cdh13: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh13.out
new-cdh11: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh11.out
new-cdh9: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh9.out
new-cdh10: starting datanode, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-new-cdh10.out
[hadoop@new-cdh5 ~]$ jps
3766 DataNode
3865 Jps
3653 JournalNode
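
Optionally confirm from a NameNode that every DataNode has registered; the report should list one entry per host in the slaves file:

[hadoop@new-cdh1 ~]$ hdfs dfsadmin -report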

9. Start and verify YARN on new-cdh3

[hadoop@new-cdh3 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-new-cdh3.out
new-cdh13: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh13.out
new-cdh6: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh6.out
new-cdh7: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh7.out
new-cdh12: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh12.out
new-cdh5: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh5.out
new-cdh10: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh10.out
new-cdh11: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh11.out
new-cdh9: starting nodemanager, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-new-cdh9.out
[hadoop@new-cdh3 ~]$ jps
4016 ResourceManager
4295 Jps
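
Note that start-yarn.sh only starts the ResourceManager on the node it is run on, so the standby ResourceManager on new-cdh4 has to be started separately:

[hadoop@new-cdh4 ~]$ yarn-daemon.sh start resourcemanager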

10. Start and verify zkfc on new-cdh1 and new-cdh2

[hadoop@new-cdh2 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /hadoop/soft/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-new-cdh2.out
[hadoop@new-cdh2 ~]$ jps
4087 NameNode
4407 DFSZKFailoverController
4459 Jps
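
With both zkfc processes up, the HA state of each role can be checked from the command line; one NameNode should report active and the other standby, and likewise for the two ResourceManagers:

[hadoop@new-cdh1 ~]$ hdfs haadmin -getServiceState family1
[hadoop@new-cdh1 ~]$ hdfs haadmin -getServiceState family2
[hadoop@new-cdh3 ~]$ yarn rmadmin -getServiceState rm1
[hadoop@new-cdh3 ~]$ yarn rmadmin -getServiceState rm2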

11. Check the result in a browser: the NameNode UIs at http://new-cdh1:50070 and http://new-cdh2:50070 (the DataNode list is under the NameNode UI), and the ResourceManager UI at http://new-cdh3:8088. From now on, the cluster can be started simply by running start-dfs.sh on the HDFS master and start-yarn.sh on the YARN master.
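
As a final smoke test, one of the bundled example jobs can be submitted; in the CDH tarball the examples jar lives under share/hadoop/mapreduce (the exact jar name may differ slightly between releases):

[hadoop@new-cdh1 ~]$ hadoop jar /hadoop/soft/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10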