Setting Up a Hadoop 2.8.0 Cluster

Original
2017/04/23 18:58

ZooKeeper Cluster

A ZooKeeper cluster must be set up first; see the earlier article: Zookeeper集群

Hadoop Configuration

core-site.xml:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

<!-- Hadoop temporary directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop-2.8.0/tmp</value>
</property>

<property>
  <name>io.file.buffer.size</name>
  <value>4096</value>
</property>

<!-- ZooKeeper quorum addresses -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>192.168.56.1:2181,192.168.56.101:2181,192.168.56.102:2181</value>
</property>

<property>
  <name>ha.zookeeper.session-timeout.ms</name>
  <value>3000</value>
</property>
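The ha.zookeeper.quorum value above is just a comma-separated host:port list. As a sanity check, a small shell sketch (hosts taken from this article) can assemble it from a host list, which helps avoid typos when the cluster grows:

```shell
# Assemble the ha.zookeeper.quorum value from a list of ZooKeeper hosts.
hosts="192.168.56.1 192.168.56.101 192.168.56.102"
quorum=""
for h in $hosts; do
  quorum="${quorum:+$quorum,}$h:2181"   # prepend a comma after the first entry
done
echo "$quorum"
```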

hdfs-site.xml:

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>

<!-- mycluster has two NameNodes: nn1 and nn2 -->
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>

<!-- RPC address of nn1 -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>192.168.56.1:9000</value>
</property>

<!-- RPC address of nn2 -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>192.168.56.101:9000</value>
</property>

<!-- HTTP address of nn1 -->
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>192.168.56.1:50070</value>
</property>

<!-- HTTP address of nn2 -->
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>192.168.56.101:50070</value>
</property>

<!-- Where the NameNode metadata (shared edit log) is stored on the JournalNodes -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://192.168.56.1:8485;192.168.56.101:8485;192.168.56.102:8485/mycluster</value>
</property>

<!-- Where each JournalNode stores its data on local disk -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/hadoop-2.8.0/tmp/journal</value>
</property>

<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- Proxy provider clients use to find the active NameNode on failover -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fencing methods; multiple methods are separated by newlines, one per line -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>
    sshfence
    shell(/bin/true)
  </value>
</property>

<!-- sshfence requires passwordless SSH login -->
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>~/.ssh/id_rsa</value>
</property>

<!-- Timeout for the sshfence method -->
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>

<!-- Where the NameNode stores its namespace metadata -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///opt/hadoop-2.8.0/hdfs/name</value>
</property>

<!-- Where the DataNode stores block data -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///opt/hadoop-2.8.0/hdfs/data</value>
</property>

<!-- Replication factor -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
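The sshfence method above needs passwordless SSH from each NameNode to the other, using the private key configured in dfs.ha.fencing.ssh.private-key-files. A minimal sketch (the user and host in the ssh-copy-id step are illustrative; run the mirror-image setup on machine 02 as well):

```shell
# Key used by sshfence; path matches dfs.ha.fencing.ssh.private-key-files above
keyfile="$HOME/.ssh/id_rsa"
mkdir -p "$(dirname "$keyfile")"
chmod 700 "$(dirname "$keyfile")"

# Generate a passphrase-less RSA key pair only if none exists yet
[ -f "$keyfile" ] || ssh-keygen -t rsa -N "" -q -f "$keyfile"

# Push the public key to the *other* NameNode (hypothetical user/host);
# repeat in the opposite direction on machine 02:
# ssh-copy-id root@192.168.56.101
```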

mapred-site.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- MapReduce JobHistory Server address; the default port is 10020 -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>0.0.0.0:10020</value>
</property>

<!-- MapReduce JobHistory Server web UI address; the default port is 19888 -->
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>0.0.0.0:19888</value>
</property>
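Note that start-yarn.sh does not launch the JobHistory server configured above; in the stock Hadoop 2.x layout it has its own daemon script (run from the Hadoop home on whichever node should host it):

```shell
# Start the MapReduce JobHistory Server
sbin/mr-jobhistory-daemon.sh start historyserver
```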

yarn-site.xml:

<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>

<!-- Enable automatic recovery -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>

<!-- Cluster id for the ResourceManagers -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yrc</value>
</property>

<!-- Logical ids of the ResourceManagers -->
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>

<!-- Address of each ResourceManager -->
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>192.168.56.1</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>192.168.56.101</value>
</property>

<!-- <property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm1</value>
  <description>If we want to launch more than one RM in a single node, we need
  this configuration</description>
</property> -->

<!-- ZooKeeper cluster addresses -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>192.168.56.1:2181,192.168.56.101:2181,192.168.56.102:2181</value>
</property>

<!-- ZooKeeper connection address for the RM state store -->
<property>
  <name>yarn.resourcemanager.zk-state-store.address</name>
  <value>192.168.56.1:2181,192.168.56.101:2181,192.168.56.102:2181</value>
</property>

<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>192.168.56.1:2181,192.168.56.101:2181,192.168.56.102:2181</value>
</property>

<property>
  <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
  <value>/yarn-leader-election</value>
  <description>Optional setting. The default value is /yarn-leader-election</description>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

vi etc/hadoop/slaves:

192.168.56.1
192.168.56.101
192.168.56.102
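The same configuration files must be present on every node listed in slaves. A small helper sketch (the function name is hypothetical; the path matches the /opt/hadoop-2.8.0 layout used in this article) prints one rsync command per host so the copy can be reviewed before running:

```shell
# print_sync_cmds: print one rsync command per host listed in a slaves file,
# so the copy can be reviewed first; pipe the output through sh to execute.
print_sync_cmds() {
  conf_dir="$1"
  while read -r host; do
    # skip blank lines in the slaves file
    [ -n "$host" ] && echo rsync -a "$conf_dir/" "$host:$conf_dir/"
  done < "$conf_dir/slaves"
}

# Example:
# print_sync_cmds /opt/hadoop-2.8.0/etc/hadoop | sh
```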

Environment Variables

Add JAVA_HOME to hadoop-env.sh and yarn-env.sh:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home

 

Starting the Hadoop Cluster:

1) Start the ZooKeeper cluster: on each of the three ZooKeeper servers, run: zkServer.sh start

2) Start the JournalNode cluster: on the first machine, run: sbin/hadoop-daemons.sh start journalnode (the plural daemons script starts the daemon on every host listed in slaves)

3) Format ZKFC so the HA znode is created in ZooKeeper: on machine 01, run: hdfs zkfc -formatZK

4) Format HDFS: on machine 01: hadoop namenode -format

5) Start the NameNodes: on 01: sbin/hadoop-daemon.sh start namenode
                        on 02: bin/hdfs namenode -bootstrapStandby
                               sbin/hadoop-daemon.sh start namenode

6) Start the DataNodes: on 01: sbin/hadoop-daemons.sh start datanode

7) Start the YARN cluster: on 01: sbin/start-yarn.sh

8) Start ZKFC: on 01: sbin/hadoop-daemons.sh start zkfc
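Once step 8 completes, the HA state can be verified from machine 01 with the admin tools; nn1/nn2 and rm1/rm2 are the logical ids configured earlier:

```shell
# One NameNode should report "active", the other "standby"
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2

# Same check for the ResourceManagers
bin/yarn rmadmin -getServiceState rm1
bin/yarn rmadmin -getServiceState rm2

# jps on each node should list NameNode, DataNode, JournalNode,
# DFSZKFailoverController, ResourceManager/NodeManager as appropriate
jps
```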

Web UI

Once startup completes, the following pages are available:

http://192.168.56.1:50070

http://192.168.56.101:50070

http://192.168.56.1:8088
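As a quick non-browser check, the same pages can be probed with curl (assuming curl is installed); a 200 means the UI is up, and the standby ResourceManager may answer with a redirect instead:

```shell
# Print the HTTP status code for each web UI from this article
for url in http://192.168.56.1:50070 http://192.168.56.101:50070 http://192.168.56.1:8088; do
  curl -s -o /dev/null -w "%{http_code} $url\n" "$url"
done
```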

 
