
Create a Hadoop 2.4.1 Cluster on CentOS 6.5 (host network)

幸运的幸福
Published 2014/08/12 11:53

1. Prepare three CentOS hosts for the PoC

10.28.241.174 shuynh-gecko1
10.28.241.172 shuynh-gecko2
10.28.241.175 shuynh-gecko3
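Before going further, it helps to confirm that each hostname resolves on every host (the required /etc/hosts entries are added in step 5). A minimal sketch; the `resolves` helper is illustrative, not part of the original setup:

```shell
# Illustrative helper: report whether a hostname resolves on this host.
resolves() {
    getent hosts "$1" > /dev/null && echo "ok" || echo "unresolved"
}

for h in shuynh-gecko1 shuynh-gecko2 shuynh-gecko3; do
    echo "$h: $(resolves "$h")"
done
```

Run this on all three hosts; any `unresolved` entry will cause the Hadoop daemons to fail to find each other later.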

root@shuynh-gecko1:~# cat /etc/centos-release

2. Get the Hadoop image on each node

2.1 Pull the image

root@shuynh-gecko1:~# docker pull sequenceiq/hadoop-docker

root@shuynh-gecko2:~# docker pull sequenceiq/hadoop-docker

root@shuynh-gecko3:~# docker pull sequenceiq/hadoop-docker

2.2 Check the image on each node

root@shuynh-gecko1:~# docker images |grep sequenceiq/hadoop-docker |grep 2.4.1
sequenceiq/hadoop-docker           2.4.1               8040f2b27b10        4 weeks ago         854.1 MB


3. Get the docker-scripts source code on each node

root@shuynh-gecko1:/# git clone https://github.com/jay-lau/hadoop-docker-master-cluster.git
Cloning into 'hadoop-docker-master-cluster'...
remote: Counting objects: 16, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 16 (delta 1), reused 16 (delta 1)
Unpacking objects: 100% (16/16), done.
Checking connectivity... done.

root@shuynh-gecko2:/# git clone https://github.com/jay-lau/hadoop-docker-master-cluster.git

root@shuynh-gecko3:/# git clone https://github.com/jay-lau/hadoop-docker-master-cluster.git


Notes: By default, if the node type is N, bootstrap.sh starts both the namenode and the datanode. To start only the namenode, remove the datanode start logic from bootstrap.sh:

if [ "$3" = "N" ] ; then
    echo "starting Hadoop Namenode,resourcemanager,datanode,nodemanager"

    #rm -rf  /tmp/hadoop-root
    #$HADOOP_PREFIX/bin/hdfs namenode -format> /dev/null 2>&1
    $HADOOP_PREFIX/sbin/hadoop-daemon.sh  start namenode > /dev/null 2>&1
    echo "Succeed to start namenode"

    $HADOOP_PREFIX/sbin/yarn-daemon.sh  start resourcemanager > /dev/null 2>&1
    echo "Succeed to start resourcemanager"


    #$HADOOP_PREFIX/sbin/hadoop-daemon.sh  start datanode > /dev/null 2>&1
    #echo "Succeed to start datanode"

    #$HADOOP_PREFIX/sbin/yarn-daemon.sh  start nodemanager > /dev/null 2>&1
    #echo "Succeed to start nodemanager"

    $HADOOP_PREFIX/bin/hadoop dfsadmin -safemode leave
else
    echo "starting Hadoop Datanode,nodemanager"

    rm -rf  /tmp/hadoop-root
    $HADOOP_PREFIX/sbin/hadoop-daemon.sh  start datanode > /dev/null 2>&1
    echo "Succeed to start datanode"

    $HADOOP_PREFIX/sbin/yarn-daemon.sh  start nodemanager > /dev/null 2>&1
    echo "Succeed to start nodemanager"
fi

4. Build the Hadoop Docker image on each node

# Enter the folder containing the hadoop-docker-master-cluster scripts.

4.1 Build the image on shuynh-gecko1

root@shuynh-gecko1:~# cd /root/hadoop-docker-master-cluster
root@shuynh-gecko1:~/hadoop-docker-master-cluster# docker build -t="sequenceiq/hadoop-cluster-docker:2.4.1" .

Sending build context to Docker daemon 149 kB
Step 0 : FROM sequenceiq/hadoop-docker:2.4.1
 ---> 8040f2b27b10
Step 1 : MAINTAINER SequenceIQ
 ---> Using cache
 ---> 882cff7182a4
Step 2 : USER root
 ---> Using cache
 ---> 408f0a434373
Step 3 : ADD core-site.xml $HADOOP_PREFIX/etc/hadoop/core-site.xml
 ---> 927521fd85ae
Removing intermediate container 7df7dba3d730
Step 4 : ADD hdfs-site.xml $HADOOP_PREFIX/etc/hadoop/hdfs-site.xml
 ---> 949460061b1e
Removing intermediate container e4cb6829fdb9
Step 5 : ADD mapred-site.xml $HADOOP_PREFIX/etc/hadoop/mapred-site.xml
 ---> e268a15c1d3f
Removing intermediate container 5c901152fb30
Step 6 : ADD yarn-site.xml $HADOOP_PREFIX/etc/hadoop/yarn-site.xml
 ---> 284ca37d9857
Removing intermediate container 9c780fc17aa7
Step 7 : ADD slaves $HADOOP_PREFIX/etc/hadoop/slaves
 ---> 1e3a4ffa5632
Removing intermediate container 2094c6c5622f
Step 8 : ADD bootstrap.sh /etc/bootstrap.sh
 ---> b8c32c42b655
Removing intermediate container 0d9616f32157
Step 9 : RUN chown root:root /etc/bootstrap.sh
 ---> Running in 103d2f89a580
 ---> 765f1e58c184
Removing intermediate container 103d2f89a580
Step 10 : RUN chmod 700 /etc/bootstrap.sh
 ---> Running in 5cc86e285299
 ---> 1a4b1dfb615c
Removing intermediate container 5cc86e285299
Step 11 : ENV BOOTSTRAP /etc/bootstrap.sh
 ---> Running in 57e323c93b5f
 ---> a1082f764127
Removing intermediate container 57e323c93b5f
Step 12 : RUN rm -f /etc/ssh/ssh_host_dsa_key
 ---> Running in 6be294648cc9
 ---> a9c5d835c39c
Removing intermediate container 6be294648cc9
Step 13 : RUN rm -f /etc/ssh/ssh_host_rsa_key
 ---> Running in 80f727977a76
 ---> ae19d6e5171d
Removing intermediate container 80f727977a76
Step 14 : RUN rm -f /root/.ssh/id_rsa
 ---> Running in 3fbebc17ee38
 ---> c473e2ed5f6f
Removing intermediate container 3fbebc17ee38
Step 15 : RUN ssh-keygen -q -N "" -t dsa -f /etc/ssh/ssh_host_dsa_key
 ---> Running in 72b62e9a0656
 ---> f7444a1eb624
Removing intermediate container 72b62e9a0656
Step 16 : RUN ssh-keygen -q -N "" -t rsa -f /etc/ssh/ssh_host_rsa_key
 ---> Running in 550b8fb8809d
 ---> 3338f146799a
Removing intermediate container 550b8fb8809d
Step 17 : RUN ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
 ---> Running in 99d28e7ead76
 ---> d4befb3f8898
Removing intermediate container 99d28e7ead76
Step 18 : RUN cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
 ---> Running in 74be0823aad2
 ---> cf03143c566f
Removing intermediate container 74be0823aad2
Step 19 : EXPOSE 50020 50021 50090 50070 50010 50011 50075 50076 8031 8032 8033 8040 8042 49707 22 8088 8030
 ---> Running in 46c625d45f0d
 ---> 62fce6617879
Removing intermediate container 46c625d45f0d
Step 20 : CMD ["-h"]
 ---> Running in 4677defeb509
 ---> 268426cafb54
Removing intermediate container 4677defeb509
Step 21 : ENTRYPOINT ["/etc/bootstrap.sh"]
 ---> Running in d5d4c1a34868
 ---> b51d46a23ae3
Removing intermediate container d5d4c1a34868
Successfully built b51d46a23ae3

root@shuynh-gecko1:~/hadoop-docker-master-cluster# docker images|grep sequenceiq/hadoop-cluster-docker
sequenceiq/hadoop-cluster-docker   2.4.1               b51d46a23ae3        6 minutes ago       854.1 MB

4.2 Build the image on shuynh-gecko2

root@shuynh-gecko2:~# cd /root/hadoop-docker-master-cluster
root@shuynh-gecko2:~/hadoop-docker-master-cluster# docker build -t="sequenceiq/hadoop-cluster-docker:2.4.1" .

4.3 Build the image on shuynh-gecko3

root@shuynh-gecko3:~# cd /root/hadoop-docker-master-cluster
root@shuynh-gecko3:~/hadoop-docker-master-cluster# docker build -t="sequenceiq/hadoop-cluster-docker:2.4.1" .

5. Configure the /etc/hosts file on each node

Add the following entries to /etc/hosts on every node:
10.28.241.174 shuynh-gecko1
10.28.241.172 shuynh-gecko2
10.28.241.175 shuynh-gecko3
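A small check can confirm the entries are present on a node before starting any containers (the helper name is illustrative, not part of the original scripts):

```shell
# Illustrative check: print whether each required /etc/hosts entry is present.
check_hosts_entry() {
    grep -qF "$1" /etc/hosts && echo "ok: $1" || echo "missing: $1"
}

check_hosts_entry "10.28.241.174 shuynh-gecko1"
check_hosts_entry "10.28.241.172 shuynh-gecko2"
check_hosts_entry "10.28.241.175 shuynh-gecko3"
```

Any `missing:` line means that entry still has to be appended to /etc/hosts on that node.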

6. Create Hadoop Cluster

# Start a container
docker run   --net=host  sequenceiq/hadoop-cluster-docker:2.4.1 $1 $2 $3 $4 $5 $6

The parameters are defined as follows:
$1: HDFS port, such as 9000
$2: HDFS DataNode port, such as 50010
$3: Node type: N (namenode) or D (datanode)
$4: HDFS replication factor, default 1 (this parameter still needs improvement)
$5: Default command: run as a service with "-d", or interactively with "-bash"
$6: Master node IP address, such as 10.28.241.174

# To run interactively, add the "-i -t" options.
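The parameter list above can be sketched as a parsing step. This assumes the echo format seen in the container output below; the variable and function names are illustrative, not taken from the actual bootstrap.sh:

```shell
# Illustrative sketch of how the six positional parameters map to settings.
parse_args() {
    HDFS_PORT=$1        # e.g. 9000
    DATANODE_PORT=$2    # e.g. 50010
    NODE_TYPE=$3        # N (namenode) or D (datanode)
    REPLICATION=$4      # HDFS replication factor, default 1
    RUN_MODE=$5         # -d (service) or -bash (interactive)
    MASTER_IP=$6        # e.g. 10.28.241.174
    echo "Hdfs port:$HDFS_PORT"
    echo "Namenode or datanode:$NODE_TYPE"
    echo "Master ip:$MASTER_IP"
}

parse_args 9001 50010 N 1 -bash 10.28.241.174
```

Because the arguments are purely positional, all six must always be supplied in order, even when a default would do.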

6.1 Create NameNode and DataNode (interactive service, using -bash; add -i -t) on shuynh-gecko1:

[root@shuynh-gecko1 ~]# docker stop $(docker ps -a -q)
[root@shuynh-gecko1 ~]# docker rm $(docker ps -a -q)
[root@shuynh-gecko1 ~]# docker run  -i -t --net="host"  sequenceiq/hadoop-cluster-docker:2.4.1 9001 50010 N 1 -bash 10.28.241.174
BOOTSTRAP=/etc/bootstrap.sh
HOSTNAME=shuynh-gecko1
HADOOP_PREFIX=/usr/local/hadoop
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/java/default/bin
PWD=/
JAVA_HOME=/usr/java/default
SHLVL=1
HOME=/
_=/usr/bin/env
/
Hdfs port:9001
Hdfs DataNode port:50010
Namenode or datanode:N
Number of hdfs replication:1
Default command:-bash
Master ip:10.28.241.174
Starting sshd: [  OK  ]
starting Hadoop Namenode,resourcemanager,datanode,nodemanager
Succeed to start namenode
Succeed to start resourcemanager
Succeed to start datanode
Succeed to start nodemanager
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Safe mode is OFF

6.2 Create a DataNode (background service, using -d) on shuynh-gecko2:
[root@shuynh-gecko2 ~]# docker stop $(docker ps -a -q)
[root@shuynh-gecko2 ~]# docker rm $(docker ps -a -q)
[root@shuynh-gecko2 hadoop-docker-master-cluster]# docker run   --net="host"  sequenceiq/hadoop-cluster-docker:2.4.1 9001 50010 D 1 -d 10.28.241.174
BOOTSTRAP=/etc/bootstrap.sh
HOSTNAME=shuynh-gecko2
HADOOP_PREFIX=/usr/local/hadoop
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/java/default/bin
PWD=/
JAVA_HOME=/usr/java/default
SHLVL=1
HOME=/
_=/usr/bin/env
/
Hdfs port:9001
Hdfs DataNode port:50010
Namenode or datanode:D
Number of hdfs replication:1
Default command:-d
Master ip:10.28.241.174
starting Hadoop Datanode,nodemanager
Succeed to start datanode
Succeed to start nodemanager

6.3 Create a DataNode (background service, using -d) on shuynh-gecko3:
[root@shuynh-gecko3 ~]# docker stop $(docker ps -a -q)
[root@shuynh-gecko3 ~]# docker rm $(docker ps -a -q)
[root@shuynh-gecko3 hadoop-docker-master-cluster]# docker run   --net="host"  sequenceiq/hadoop-cluster-docker:2.4.1 9001 50010 D 1 -d 10.28.241.174

7. Check the cluster status

7.1 Access the Web GUI

Access http://10.28.241.174:50070/dfshealth.html#tab-datanode

7.2 Use the command line to check the status

bash-4.1# $HADOOP_PREFIX/bin/hdfs dfsadmin -report
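To check just the datanode count, the report can be filtered. This assumes the Hadoop 2.x report format, which contains a line like `Datanodes available: 3 (3 total, 0 dead)`; the helper name is illustrative:

```shell
# Illustrative filter: extract the live DataNode count from `hdfs dfsadmin -report`.
live_datanodes() {
    grep -oE 'Datanodes available: [0-9]+' | grep -oE '[0-9]+'
}

# Usage inside the namenode container:
# $HADOOP_PREFIX/bin/hdfs dfsadmin -report | live_datanodes
```

For the three-node cluster above, a healthy report should yield 3.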

8. Run a sample Hadoop job

bash-4.1# $HADOOP_PREFIX/bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'

14/07/30 23:59:40 INFO client.RMProxy: Connecting to ResourceManager at /10.28.241.174:8032
14/07/30 23:59:40 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
14/07/30 23:59:40 INFO input.FileInputFormat: Total input paths to process : 26
14/07/30 23:59:41 INFO mapreduce.JobSubmitter: number of splits:26
14/07/30 23:59:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406778261600_0003
14/07/30 23:59:41 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/07/30 23:59:42 INFO impl.YarnClientImpl: Submitted application application_1406778261600_0003
14/07/30 23:59:42 INFO mapreduce.Job: The url to track the job: http://shuynh-gecko1:8088/proxy/application_1406778261600_0003/
14/07/30 23:59:42 INFO mapreduce.Job: Running job: job_1406778261600_0003

14/07/30 23:59:49 INFO mapreduce.Job: Job job_1406778261600_0003 running in uber mode : false
14/07/30 23:59:49 INFO mapreduce.Job:  map 0% reduce 0%
14/07/30 23:59:57 INFO mapreduce.Job:  map 4% reduce 0%
14/07/30 23:59:58 INFO mapreduce.Job:  map 8% reduce 0%
14/07/30 23:59:59 INFO mapreduce.Job:  map 27% reduce 0%
14/07/31 00:00:00 INFO mapreduce.Job:  map 31% reduce 0%
14/07/31 00:00:01 INFO mapreduce.Job:  map 38% reduce 0%
14/07/31 00:00:04 INFO mapreduce.Job:  map 42% reduce 0%
....
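The example job counts matches of the regular expression dfs[a-z.]+ in the input files. The same pattern can be tried locally with grep to see what it matches:

```shell
# The MapReduce grep example uses the regex dfs[a-z.]+; locally it behaves like this:
printf 'dfs.replication\ndfs.namenode.name.dir\nno-match-here\n' | grep -oE 'dfs[a-z.]+'
# prints:
#   dfs.replication
#   dfs.namenode.name.dir
```

Once the job finishes, the result can be read back with the standard HDFS CLI inside the namenode container, e.g. `$HADOOP_PREFIX/bin/hdfs dfs -cat output/*`. If the job fails because `input` does not exist, populate it first (e.g. `$HADOOP_PREFIX/bin/hdfs dfs -put $HADOOP_PREFIX/etc/hadoop input`); whether the base image pre-populates `input` is an assumption to verify.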

