文档章节

论文阅读 - Blogel: A Block-Centric Framework for Distr

wdfnst
 wdfnst
发布于 2016/05/16 11:18
字数 1468
阅读 27
收藏 0

Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs

如何分块?

  1. Voronoi Digram 划分 The GVD computation can be easily implemented in the vertex- centric computing model, by performing multi-source BFS. Specif- ically, in superstep 1, each source s sets block(s) = s and broad- casts it to the neighbors; for each non-source vertex v, block(v) is unassigned. Finally, the vertex votes to halt. In superstep i (i > 1), if block(v) is unassigned, v sets block(v) to an arbitrary source received, and broadcasts block(v) to its neighbors before voting to halt. Otherwise, v votes to halt directly. When the process con- verges, we have block(v) = si for each v ∈ V C(si).

  2. 2D划分
    The first job is vertex-centric and works as follows: (1)each worker samples a subset of its vertices with probability psamp and sends the sample to the master; (2)the master first partitions the sampled vertices into nx slots by the x-coordinates, and then each slot is further partitioned into ny slots by the y-coordinates.

如何管理块? (计算, 消息传递)

Block-centric algorithm. Our block-centric solution operates in VB-mode. Each vertex maintains the same fields as in the vertex- centric algorithm, and blocks do not maintain any information. In each superstep, V-compute() is first executed for all vertices, where a vertex v finds w∗ from the incoming messages as in the vertex- centric algorithm. However, now v votes to halt only if d(w∗) ≥ dist(v). Otherwise, v updates ⟨prev(v), dist(v)⟩ = ⟨w∗, d(w∗)⟩ but stays active. Then, B-compute() is executed, where each block B collects all its active vertices v into a priority queue Q (with dist(v) as the key), and makes these vertices vote to halt. B- compute() then runs Dijkstra’s algorithm on B using Q, which re- moves the vertex v ∈ Q with the smallest value of dist(v) from Q for processing each time. The out-neighbors u ∈ Γ(v) are updated as follows. For each u ∈ V (B), if dist(v)+l(v, u) < dist(u), we update ⟨prev(u), dist(u)⟩ to be ⟨v, dist(v)+l(v, u)⟩, and insert u into Q with key dist(u) if u ∈/ Q, or update dist(u) if u is already in Q. For each u ̸∈ V (B), a message ⟨v, dist(v) + l(v, u)⟩ is sent to u. B votes to halt when Q becomes empty. In the next superstep, if a vertex u receives a message, u is activated along with its block, and the block-centric computation repeats.

  • Highlight, page 1
    For processing graphs with a large diameter δ, the message (or neighbor) propagation paradigm of the vertex-centric model often leads to algorithms that require O(δ) rounds (also called super- steps) of computation.

  • Underline, page 1
    For example, a single-source shortest path algorithm in [11] takes 10,789 supersteps on a USA road network.

  • Underline, page 1
    Apart from spatial networks, some large web graphs also have large diameters (from a few hundred to thousands). For example, the vertex-centric system in [14] takes 2,450 rounds for computing strongly connected components on a web graph.

  • Highlight, page 2
    loading: each worker loads a portion of vertices from HDFS into main-memory; the workers then exchange ver- tices through the network (by hashing over vertex ID) so that each worker wi finally holds all and only those vertices assigned to wi

  • Highlight, page 3
    BTC

  • Highlight, page 3
    Friendster

  • Highlight, page 3
    USA Road

  • Highlight, page 3
    On the contrary, the block- centric model works on G and a high-degree vertex involves at most O(n/b) messages each round, where n is the number of vertices in the giant CC.

  • Highlight, page 3
    The idea is to broadcast the smallest vertex ID seen so far by each vertex v, denoted by min(v).

  • Highlight, page 3
    Each vertex be- longs to a unique block, and let block(v) be the ID of the block that v belongs to.

  • Highlight, page 4
    Similar to a vertex in Pregel, a block in Blogel also has a com- pute() function. We use B-compute() and V-compute() to denote the compute() function of a block and a vertex, respectively.

  • Highlight, page 4
    A block has access to all its vertices, and can send messages to any block B or vertex v as long as worker(B) or worker(v) is available.

  • Highlight, page 4
    Each B-worker maintains two message buffers, one for exchang- ing vertex-level messages and the other for exchanging block-level messages. A block also has a state indicating whether it is active, and may vote to halt.

  • Highlight, page 6
    Block-centric algorithm. Our block-centric solution operates in VB-mode. Each vertex maintains the same fields as in the vertex- centric algorithm, and blocks do not maintain any information. In each superstep, V-compute() is first executed for all vertices, where a vertex v finds w∗ from the incoming messages as in the vertex- centric algorithm. However, now v votes to halt only if d(w∗) ≥ dist(v). Otherwise, v updates ⟨prev(v), dist(v)⟩ = ⟨w∗, d(w∗)⟩ but stays active. Then, B-compute() is executed, where each block B collects all its active vertices v into a priority queue Q (with dist(v) as the key), and makes these vertices vote to halt. B- compute() then runs Dijkstra’s algorithm on B using Q, which re- moves the vertex v ∈ Q with the smallest value of dist(v) from Q for processing each time. The out-neighbors u ∈ Γ(v) are updated as follows. For each u ∈ V (B), if dist(v)+l(v, u) < dist(u), we update ⟨prev(u), dist(u)⟩ to be ⟨v, dist(v)+l(v, u)⟩, and insert u into Q with key dist(u) if u ∈/ Q, or update dist(u) if u is already in Q. For each u ̸∈ V (B), a message ⟨v, dist(v) + l(v, u)⟩ is sent to u. B votes to halt when Q becomes empty. In the next superstep, if a vertex u receives a message, u is activated along with its block, and the block-centric computation repeats.

  • Highlight, page 6
    v finds the in-neighbor w∗ such that d(w∗) is the small- est among all d(w) received.

  • Text Note, page 6

  • Highlight, page 7
    We first review the Graph Voronoi Diagram (GVD) [3] of an undirected unweighted graph G = (V, E).

  • Highlight, page 8
    The sampling rate psamp decides the number of blocks, and usually a small value as 0.1% is a good choice.

  • Highlight, page 8
    As for the stopping parameters, γ is usually set as 90%, and pmax as 10% with f = 2, so that there will not be too many rounds of multi-source BFS.

  • Highlight, page 8
    The GVD computation can be easily implemented in the vertex- centric computing model, by performing multi-source BFS.

  • Highlight, page 8
    Specif- ically, in superstep 1, each source s sets block(s) = s and broad- casts it to the neighbors; for each non-source vertex v, block(v) is unassigned. Finally, the vertex votes to halt. In superstep i (i > 1), if block(v) is unassigned, v sets block(v) to an arbitrary source received, and broadcasts block(v) to its neighbors before voting to halt. Otherwise, v votes to halt directly. When the process con- verges, we have block(v) = si for each v ∈ V C(si).

  • Highlight, page 8
    The multi-source BFS has linear workload since each vertex only broadcasts a message to its neighbors when block(v) is assigned, and thus the total messages exchanged by all vertices is bounded by O(|E|).

  • Highlight, page 8
    Initially, each vertex v samples itself as a source with probability psamp. Then, multi- source BFS is performed to partition the vertices into blocks.

  • Highlight, page 8
    The first job is vertex-centric and works as follows: (1)each worker samples a subset of its vertices with probability psamp and sends the sample to the master; (2)the master first partitions the sampled vertices into nx slots by the x-coordinates, and then each slot is further partitioned into ny slots by the y-coordinates.

  • Highlight, page 9
    Data Type |V| |E| AVG Deg Max Deg Web Graphs WebUK directed 133,633,040 5,507,679,822 41.21 22,429 WebBase directed 118,142,155 1,019,903,190 8.63 3,841 Social Networks Friendster undirected 65,608,366 3,612,134,270 55.06 5,214 LiveJournal directed 10,690,276 224,614,770 21.01 1,053,676 RDF BTC undirected 164,732,473 772,822,094 4.69 1,637,619 Spatial Networks USA Road undirected 23,947,347 58,333,344 2.44 9 Euro Road undirected 18,029,721 44,826,904 2.49 12

  • Highlight, page 10
    Giraph++ [18] proposed a graph coarsening method to reduce the size of the input graph so that METIS can run on the smaller graph.

© 著作权归作者所有

wdfnst
粉丝 2
博文 27
码字总数 22859
作品 1
宁波
私信 提问
MnasNet:终端轻量化模型新思路

雷锋网(公众号:雷锋网) AI 科技评论按,本文作者陈泰红(ahong007@yeah.net),他为 AI 科技评论撰写了关于 MnasNet 的独家解读文章。 1. Motivation CNN 模型近年发展非常迅猛,在多项视觉...

思颖
2018/08/12
0
0
12月20日,哈工大(深圳)邀您参与人工智能国际顶会论文报告会啦!

雷锋网 AI 科技评论按:12 月 20 日(周三),由中国人工智能学会青年工作委员会、哈工大(深圳)计算机科学与技术学院主办的「人工智能国际顶级会议论文报告会」将正式举行。雷锋网作为独家...

奕欣
2017/12/18
0
0
nginx cpu过高或过低--状态简单分析与监控

nginx cpu占用率过高,可能是CPU密集型计算导致堵塞的. 分析工具 https://github.com/agentzh/nginx-systemtap-toolkit#sample-bt https://github.com/agentzh/stapxx#ngx-lj-lua-stacks 2) ......

testwork
2016/05/31
92
0
protobuf-netbeans-plugin

protobuf-netbeans-plugin 是一个 NetBeans 用来开发 Protocol Buffers 的模块。 Protocol Buffers 是Google公司开发的一种数据描述语言,类似于XML能够将结构化数据序列化,可用于数据存储、...

匿名
2010/03/12
1K
0
方法▍想要接触人工智能吗?先要学会如何阅读论文

作者|xingoo 编辑| 树袋熊 凭借着对算法和AI的向往,终于有机会接触到人工智能的领域。现在的主要工作就是在OCR文字识别,期间也看了不少的论文,从CTPN到Faster RCNN,再到EAST和FOTS。最开...

36大数据
02/21
0
0

没有更多内容

加载失败,请刷新页面

加载更多

3分钟看懂Activity启动流程

背景介绍 从事开发到了一定阶段,想要提高就必须搞明白系统的一些工作原理。为什么?因为只有明白了这些,你才能针对平台的特性写出优质的代码。当遇到棘手的问题时,你才能更快速的结合系统...

天王盖地虎626
29分钟前
1
0
机器学习算法GPU版本安装配置

##XGBoost for GPU安装https://blog.csdn.net/weixin_30963287/article/details/79145107https://blog.csdn.net/wl2858623940/article/details/80546140https://blog.csdn.net/u01164186......

KYO4321
32分钟前
1
0
微软展开训练AI来推Windows 10 1903版自动更新

Windows 10 May 2019(1903版)正式释出将近一个月,或许已经有用户自主安装更新了,不过微软认为还不够多。微软表示将开始训练机器学习(machine learning)技术,帮助1803版本以前的PC更新...

yisy5566
今天
0
0
前后端分离-前端搭建(Vue)(2)

先安装node.js以及npm 现在基本的node.js都包含有npm,下载安装后, 可以在cmd命令里输入 node -v 和npm -v 分别查看安装的版本 两个都显示了版本就是安装ok 这次我们使用JetBrains WebStor...

咸鱼-李y
今天
0
0
好程序员web前端教程分享三大前端框架相关问题

  好程序员web前端教程分享三大前端框架相关问题,三大前端框架,有没有哪个框架的组件间交互像js的方法传值一样简单? 首先框架组件通信是为了方便组件模块之间进行数据交互的,因为框架的...

好程序员IT
今天
0
0

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部