文档章节

阅读论文 --- GPS: A Graph Processing System∗

wdfnst
 wdfnst
发布于 2016/05/16 22:12
字数 886
阅读 21
收藏 0

输入图片说明

输入图片说明

  • Highlight, page 1
    GPS is similar to Google’s proprietary Pregel system, with three new features: (1) an extended API to make global computations more easily expressed and more efficient; (2) a dynamic repartitioning scheme that re- assigns vertices to different workers during the computation, based on messaging patterns; and (3) an optimization that distributes adjacency lists of high-degree vertices across all compute nodes to improve performance.

  • Highlight, page 1
    The GPS API has an extension that enables efficient implementation of al- gorithms composed of one or more vertex-centric com- putations, combined with global computations.

  • Highlight, page 1
    GPS has an optimization called large adjacency list par- titioning (LALP), which partitions the adjacency lists of high-degree vertices across compute nodes, again to reduce communication.

  • Highlight, page 2
    Broadly

  • Underline, page 2
    1.4 Partitioning Experiments

  • Highlight, page 3
    A computation thread loops through the vertices in the worker and executes vertex.compute() on each active ver- tex. It maintains an outgoing message buffer for all work- ers in the cluster, including itself. When a buffer is full it is either given to MINA threads for sending over the network, or passed directly to the local message parser thread.

  • Highlight, page 3
    MINA threads send and receive message buffers, as well as simple coordination messages between the master and the worker. When a message buffer is received, it is passed to the message parser thread.

  • Highlight, page 3
    A message parser thread parses incoming message buffers into separate messages and enqueues them into the re- ceiving vertices’ message queues for the next superstep.

  • Highlight, page 3
    GPS is implemented in Java. The compute nodes run HDFS (Hadoop Distributed File System) [18], which is used to store persistent data such as the input graph and the checkpointing files.

  • Highlight, page 3
    each line starts with the ID of a vertex u, followed by the IDs of u’s outgoing neighbors.

  • Highlight, page 3
    GPS assigns the vertices of G to workers using the same simple round- robin scheme used by Pregel: vertex u is assigned to worker W(u mod k).

  • Highlight, page 3
    The master coordinates the computation by instructing work- ers to: (a) start parsing input files; (b) start a new super- step; (c) terminate computation; and (d) checkpoint their states for fault-tolerance.

  • Highlight, page 5
    by partitioning large graphs “intelligently” before computation begins, we can reduce total network I/O by up to 13.6x and run-time by up to 2.5x. The effects of partitioning depend on three factors: (1) the graph algorithm being executed; (2) the graph itself; and (3) the configuration of the worker tasks across compute nodes.

  • Highlight, page 5
    Domain-based: In this partitioning scheme for web graphs only, we locate all web pages from the same domain in the same partition, and partition the domains randomly across the workers.

  • Highlight, page 5
    Random: The default “mod” partitioning method

  • Box, page 6
    (a) PageRank on sk-2005-d (b) Different algorithms on uk-2007-u Figure 6: Network I/O, different partitioning schemes (a) PageRank (50 iter.) on sk-2005-d (b) Different algorithms on uk-2007-u Figure 7: Run-time, different partitioning schemes

  • Highlight, page 8
    There are three questions any dynamic repartitioning scheme must answer: (1) which vertices to reassign; (2) how and when to move the reassigned vertices to their new workers; (3) how to lo- cate the reassigned vertices.

  • Highlight, page 8
    One option is to reassign vertex u at worker Wi to a new worker Wj if u send/receives more message to/from Wj than to/from any other worker, and that number of messages is over some threshold.

  • Highlight, page 9
    An obvious option is for each worker to store an in-memory map consisting of <vertex-id, new-worker-id> pairs.

  • Highlight, page 10
    We have developed a compiler from the Green-Marl [19] domain-specific language for graph processing into GPS.

  • Highlight, page 10
    some al- gorithms become very complex when implemented in a vertex-centric fashion—a classic example is contructing or doing a reverse traversal on a BFS tree. Green-Marl’s high-level constructs can express some of these compu- tations very easily, e.g., lines 7 and 11 in Figure 13b.

  • Box, page 11
    1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Procedure PageRank(G: Graph, e,d: Double, PR: Node Prop<Double>(G)) { Int i = 0; Double N = G.NumNodes(); G.PR = 1 / N; // Init PageRank Do { // Main iteration diff = 0.0; Foreach (t: G.Nodes) { Double val = (1−d) / N + d∗Sum(w: t.InNbrs){ w.PR / w.OutDegree()}; t.PR <= val @ t; diff +=|val−t.PR|;} i++; }While(i<30);} (a) PageRank Procedure bc approx(G:Graph, BC:Node Prop<Float>) { G.BC = 0; // Initialize BC as 0 per each node Node Prop<Float> sigma, delta; G.sigma = 0; Node s = G.PickRandom(); s. sigma = 1; InBFS(v: G.Nodes From s) { // BFS−order traversal // Summing over BFS parents v.sigma = Sum(w:v.UpNbrs) { w.sigma }; } InReverse { // Reverse−BFS order traversal v.delta = // Summing over BFS children Sum (w:v.DownNbrs) { v.sigma / w.sigma ∗ (1+ w.delta) }; v.BC += v.delta; // accumulate delta into BC } (b) Approximate Betweenness Centrality Figure 13: Green-Marl Programs }

© 著作权归作者所有

共有 人打赏支持
wdfnst
粉丝 2
博文 27
码字总数 22859
作品 1
宁波
私信 提问
半定规划(SDP)for 最大割

半定规划(SDP)for 最大割 最大割问题: Instance: given an undirected graph G(V,E), find a bipartition of V into S and T that maximizing the size of the cut E(S,T)= 贪心算法、随机......

hlveying
01/02
0
0
caffe学习:卷积计算

在caffe中如何计算卷积的? caffe中, 卷积网络的前向传播过程需要计算类似 这样的连接, forward_cpu_gemm()函数用weight矩阵和输入的bottom相乘, 然后对bias进行处理, bias程序会根据情况决定...

Hanging_Gardens
01/11
0
0
100 open source Big Data architecture papers

Big Data technology has been extremely disruptive with open source playing a dominant role in shaping its evolution. While on one hand it has been disruptive, on the other it ha......

naughty
2016/04/05
55
0
每周论文清单:知识图谱,文本匹配,图像翻译,视频对象分割

在碎片化阅读充斥眼球的时代,越来越少的人会去关注每篇论文背后的探索和思考。 在这个栏目里,你会快速 get 每篇精选论文的亮点和痛点,时刻紧跟 AI 前沿成果。 点击本文底部的「阅读原文」...

c9yv2cf9i06k2a9e
2017/12/27
0
0
HanLP 关键词提取算法分析详解

HanLP 关键词提取算法分析详解 参考论文:《TextRank: Bringing Order into Texts》 TextRank算法提取关键词的Java实现 TextRank算法自动摘要的Java实现这篇文章中作者大概解释了一下TextRan...

左手的倒影
11/05
0
0

没有更多内容

加载失败,请刷新页面

加载更多

设计模式之工厂模式

本篇博文主要翻译这篇文章: https://www.journaldev.com/1392/factory-design-pattern-in-java 由于翻译水平有限,自认为许多地方翻译不恰当,欢迎各位给出宝贵的建议,建议大家去阅读原文。...

firepation
8分钟前
1
0

中国龙-扬科
11分钟前
0
0
简单谈谈vue的过渡动画

在vue中,实现过渡动画一般是下面这样: `<``transition` `name``=``"fade"``>``<``div``></``div``>``</``transition``>` 用一个transition对元素或者组件进行封装. 在过渡的时候,会......

嫣然丫丫丫
17分钟前
1
0
文件及目录处理

file_get_contents file_put_contens fopen r/r+ 只读打开,指针开头 w/w+ 写入打开,指针开头,清空文件,不存创建 a/a+ 追加打开,指针末尾,不存创建 x/x+ 创建模式打开 b 二进制打开 t 文本打开...

关元
19分钟前
0
0
如何在Angular中使用better-scroll插件

由于需要在一个固定的的高度做无限滚动,本来css的overflow-y也可以完成的,奈何安卓不是很流畅,还很生硬,就是用了第三方库better-scroll,配合angular的ng-content。angular的ng-content和...

前端攻城老湿
25分钟前
2
0

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部