
Deploying Kubeflow and Other Images to Multiple Cluster Nodes

openthings
Published 2018/11/28 16:32

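Deploying or updating Kubeflow (https://github.com/kubeflow/kubeflow), Kubernetes, and similar images on a local cluster takes a series of steps. If many cluster nodes pull from external registries at the same time, the concurrent network traffic is heavy, which is both slow and costly. I therefore split the work into two stages: the first downloads the images to local storage, and in the second each node obtains its copy from the local file system or a local registry.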

1. From gcr to local storage

This stage itself has two steps.

First, pull the images on a site that can reach gcr (https://www.katacoda.com). For example:


echo ""
echo "================================================================="
echo "pull kubeflow images for system from gcr.io and hub.docker.com..."
echo "This tool created by openthings, NO WARRANTY. 2018.07.10."
echo "================================================================="

echo ""
echo "1. centraldashboard"
docker pull gcr.io/kubeflow-images-public/centraldashboard:v0.2.1

echo ""
echo "2. jupyterhub-k8s"
docker pull gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1

echo ""
echo "3. tf_operator"
docker pull gcr.io/kubeflow-images-public/tf_operator:v0.2.0 

echo ""
echo "4. ambassador"
docker pull quay.io/datawire/ambassador:0.30.1

echo ""
echo "5. redis"
docker pull redis:4.0.1 

echo ""
echo "6. seldonio/cluster-manager"
docker pull seldonio/cluster-manager:0.1.6

echo ""
echo "Finished."
echo ""
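
Pulls over a slow link can fail partway through. As a hedged variation on the script above, the sketch below reads image names from standard input and retries each pull a few times. The function names `pull_with_retry` and `pull_all`, and the `DOCKER` and `RETRY_DELAY` variables, are my own illustrative choices and not part of any tool.

```shell
# DOCKER can be overridden (for example DOCKER="sudo docker") if needed.
DOCKER="${DOCKER:-docker}"

# Pull one image, trying up to $2 times (default 3) with a pause between tries.
pull_with_retry() {
  image="$1"
  attempts="${2:-3}"
  n=1
  until $DOCKER pull "$image" </dev/null; do
    echo "pull failed for $image (attempt $n of $attempts)" >&2
    [ "$n" -lt "$attempts" ] || return 1
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"
  done
  return 0
}

# Pull every image named on stdin, one per line; list the failures at the end.
pull_all() {
  failed=""
  while read -r image; do
    [ -n "$image" ] || continue
    pull_with_retry "$image" || failed="$failed $image"
  done
  if [ -n "$failed" ]; then
    echo "could not pull:$failed" >&2
    return 1
  fi
  return 0
}
```

To use it, feed `pull_all` the six image names from the script above, one per line, for example from a text file: `pull_all < images.txt` (the file name is an assumption).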

Then push the images to a registry hosted in China (for example Alibaba Cloud, http://registry.cn-hangzhou.aliyuncs.com). For example:


echo ""
echo "================================================================="
echo "Push kubeflow images for system to aliyun.com ..."
echo "This tool created by openthings, NO WARRANTY. 2018.07.10."
echo "================================================================="

MY_REGISTRY=registry.cn-hangzhou.aliyuncs.com/openthings

echo ""
echo "1. centraldashboard"
docker tag gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1
docker push ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1

echo ""
echo "2. jupyterhub-k8s"
docker tag gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1 ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1
docker push ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1

echo ""
echo "3. tf_operator"
docker tag gcr.io/kubeflow-images-public/tf_operator:v0.2.0 ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0
docker push ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0

echo ""
echo "4. ambassador"
docker tag quay.io/datawire/ambassador:0.30.1 ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1
docker push ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1

echo ""
echo "5. redis"
docker tag redis:4.0.1 ${MY_REGISTRY}/redis:4.0.1
docker push ${MY_REGISTRY}/redis:4.0.1

echo ""
echo "6. seldonio/cluster-manager"
docker tag seldonio/cluster-manager:0.1.6 ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6
docker push ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6

echo ""
echo "Finished."
echo ""
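
The mirror names in the push script follow a pattern: a leading `gcr.io/` is dropped, and every `/` or `.` in the repository part becomes `-`, while the tag is left alone. The sketch below captures that pattern as I read it from the six images listed; the helper name `mirror_name` is hypothetical, and it assumes every image reference carries an explicit tag. It only prints the `docker tag` and `docker push` commands so they can be reviewed first.

```shell
MY_REGISTRY="${MY_REGISTRY:-registry.cn-hangzhou.aliyuncs.com/openthings}"

# Map an upstream image reference (with an explicit tag) to its mirror name.
mirror_name() {
  repo="${1%:*}"          # everything before the last ":"
  tag="${1##*:}"          # everything after the last ":"
  repo="${repo#gcr.io/}"  # the gcr.io host is dropped, other hosts are kept
  repo="$(printf '%s' "$repo" | sed 's#[/.]#-#g')"
  printf '%s/%s:%s\n' "$MY_REGISTRY" "$repo" "$tag"
}

# Print (not run) the tag and push commands for a few of the images.
for image in \
  gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 \
  quay.io/datawire/ambassador:0.30.1 \
  seldonio/cluster-manager:0.1.6
do
  target="$(mirror_name "$image")"
  echo "docker tag $image $target"
  echo "docker push $target"
done
```

Images whose repository name itself contains dots, such as the tensorflow notebook images used later, would also have those dots rewritten, which may or may not match the names actually pushed; check before relying on it.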

Next, the images can be pulled from Alibaba Cloud to the local machine and retagged with their original names:


echo ""
echo "================================================================="
echo "Pull kubeflow images for system from aliyun.com ..."
echo "This tool created by openthings, NO WARRANTY. 2018.11.28."
echo "================================================================="

MY_REGISTRY=registry.cn-hangzhou.aliyuncs.com/openthings

echo ""
echo "1. centraldashboard"
docker pull ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1
docker tag ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1 gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 

echo ""
echo "2. jupyterhub-k8s"
docker pull ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1
docker tag ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1 gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1

echo ""
echo "3. tf_operator"
docker pull ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0
docker tag ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0 gcr.io/kubeflow-images-public/tf_operator:v0.2.0

echo ""
echo "4. ambassador"
docker pull ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1
docker tag ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1 quay.io/datawire/ambassador:0.30.1

echo ""
echo "5. redis"
docker pull ${MY_REGISTRY}/redis:4.0.1
docker tag ${MY_REGISTRY}/redis:4.0.1 redis:4.0.1

echo ""
echo "6. seldonio/cluster-manager"
docker pull ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6
docker tag ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6 seldonio/cluster-manager:0.1.6

echo ""
echo "Finished."
echo ""

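Once the images have been pulled from Alibaba Cloud, they can be pushed to a local registry (such as Harbor) or packaged as *.tar files.

2. From local storage to the cluster

To install from a local Harbor, pull the images and use docker tag to restore their original names, after which they are ready to use. The approach is the same as pulling from Alibaba Cloud above.

To package the images as *.tar files instead, see: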

echo "==================================================================="
echo "Save Kubeflow images to tar."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"

echo "###################################################################"
echo "Kubeflow 0.3.3 ML system images."
echo "-------------------------------------------------------------------"

echo "A1.>> centraldashboard"
docker save gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 -o A1-kubeflow-centraldashboard-v0.2.1.tar
echo ""

echo "A2.>> jupyterhub-k8s"
docker save gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1 -o A2-kubeflow-jupyterhub-k8s-v20180531-3bb991b1.tar
echo ""

echo "A3.>> tf_operator"
docker save gcr.io/kubeflow-images-public/tf_operator:v0.2.0 -o A3-kubeflow-tf_operator-v0.2.0.tar
echo ""

echo "A4.>> ambassador"
docker save quay.io/datawire/ambassador:0.30.1 -o A4-kubeflow-ambassador-0.30.1.tar
echo ""

echo "A5.>> redis"
docker save redis:4.0.1  -o A5-kubeflow-redis-4.0.1.tar
echo ""

echo "A6.>> seldonio/cluster-manager"
docker save seldonio/cluster-manager:0.1.6 -o A6-kubeflow-seldonio-cluster-manager-0.1.6.tar
echo ""

echo "=================================================================="
echo "Kubeflow worker engine images......"
echo "B1.>> Tensorflow notebook CPU"
docker save gcr.io/kubeflow-images-public/tensorflow-1.12.0-notebook-cpu:v-base-76107ff-897 -o A6-kubeflow-tensorflow-1.12.0-notebook-cpu-v-base-76107ff-897.tar
echo ""

echo "B2.>> Tensorflow notebook GPU"
docker save gcr.io/kubeflow-images-public/tensorflow-1.12.0-notebook-gpu:v-base-76107ff-897 -o A6-kubeflow-tensorflow-1.12.0-notebook-gpu-v-base-76107ff-897.tar
echo ""

echo "==================================================================="
echo "Save Kubeflow images Finished."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"
echo "==================================================================="
echo ""
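
The tar files then need to be bundled for transfer. The exact command is not shown in this post; what follows is a minimal sketch, assuming the tar files and the load script have been collected in one directory (for example `kf-images-0.3.3`, the name the later ansible command expects). The function name `bundle_images` and the checksum file are my own additions.

```shell
# Bundle a directory of image tars into <dir>.zip and record a checksum
# so that each node can verify its copy after the upload.
bundle_images() {
  dir="$1"
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  zip -r -q "${dir}.zip" "$dir" || return 1
  sha256sum "${dir}.zip" > "${dir}.zip.sha256"
}
```

Running `bundle_images kf-images-0.3.3` would produce `kf-images-0.3.3.zip` next to the directory. If the `.sha256` file is uploaded as well, `sha256sum -c kf-images-0.3.3.zip.sha256` on a node checks that the transfer was intact.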

Compress all the images into a single zip package, then upload it to the worker nodes:

# xxxx is a placeholder for the SSH password of the supermap user.
for host in 10.1.1.202 10.1.1.203 10.1.1.142 10.1.1.193 10.1.1.234 10.1.1.205 10.1.1.112; do
  echo "Uploading ${host}"
  sshpass -p xxxx scp kf-images-0.3.3.zip "supermap@${host}:/home/supermap/"
done

echo "Upload kf-images-0.3.3.zip Finished."

Then, on each node, load the images back into Docker under their original names, as follows:

echo "==================================================================="
echo "Load Kubeflow images from tar."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"

echo "###################################################################"
echo "Kubeflow 0.3.3 ML system images."
echo "-------------------------------------------------------------------"

echo "A1<< centraldashboard"
sudo docker load -i A1-kubeflow-centraldashboard-v0.2.1.tar
echo ""

echo "A2<< jupyterhub-k8s"
sudo docker load -i A2-kubeflow-jupyterhub-k8s-v20180531-3bb991b1.tar
echo ""

echo "A3<< tf_operator"
sudo docker load -i A3-kubeflow-tf_operator-v0.2.0.tar
echo ""

echo "A4<< ambassador"
sudo docker load -i A4-kubeflow-ambassador-0.30.1.tar
echo ""

echo "A5<< redis"
sudo docker load -i A5-kubeflow-redis-4.0.1.tar
echo ""

echo "A6<< seldonio/cluster-manager"
sudo docker load -i A6-kubeflow-seldonio-cluster-manager-0.1.6.tar
echo ""

echo "=================================================================="
echo "Kubeflow worker engine images......"
echo "B1<< Tensorflow notebook CPU"
sudo docker load -i A6-kubeflow-tensorflow-1.12.0-notebook-cpu-v-base-76107ff-897.tar
echo ""

echo "B2<< Tensorflow notebook GPU"
sudo docker load -i A6-kubeflow-tensorflow-1.12.0-notebook-gpu-v-base-76107ff-897.tar
echo ""

echo "==================================================================="
echo "Load Kubeflow images Finished."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"
echo "==================================================================="
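
After loading, it can be worth confirming that every expected image is actually present on the node. The sketch below is one way to do that with `docker image inspect`; the function name `check_images` and the `DOCKER` variable are illustrative assumptions.

```shell
# DOCKER can be overridden, for example DOCKER="sudo docker".
DOCKER="${DOCKER:-docker}"

# Read image names from stdin and report any that the local Docker lacks.
# Returns non-zero if at least one image is missing.
check_images() {
  missing=0
  while read -r image; do
    [ -n "$image" ] || continue
    if $DOCKER image inspect "$image" >/dev/null 2>&1 </dev/null; then
      echo "ok       $image"
    else
      echo "MISSING  $image"
      missing=1
    fi
  done
  return "$missing"
}
```

Feeding it the same image names that were saved earlier, one per line, gives a quick per-node report before Kubeflow is deployed.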

Run the script above on every node, or use ansible to run it remotely in batch:

ansible all -i hosts_ansible -m shell -a "unzip -u /home/supermap/kf-images-0.3.3.zip -d /home/supermap && cd /home/supermap/kf-images-0.3.3 && ./kf-images-load.sh" --ask-become-pass --become --become-method=sudo

Here hosts_ansible is the ansible hosts inventory file (see the post "Ansible快速开始-指挥集群", an Ansible quick start on commanding a cluster).
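
For readers without that post at hand, here is a minimal sketch of what such an inventory might look like, written out from the shell. It assumes the seven worker addresses and the supermap user from the upload step; the group name `kubeflow_nodes` is hypothetical.

```shell
# Write a minimal INI-style inventory listing the seven worker nodes.
cat > hosts_ansible <<'EOF'
[kubeflow_nodes]
10.1.1.202
10.1.1.203
10.1.1.142
10.1.1.193
10.1.1.234
10.1.1.205
10.1.1.112

[kubeflow_nodes:vars]
ansible_user=supermap
EOF
```

Before running the load command, `ansible all -i hosts_ansible -m ping` is a quick way to confirm that every node is reachable.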

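The same process also applies to downloading and updating the images of Kubernetes itself. Further reading:

Check whether newer versions of the images are available: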
