A plain english introduction to CAP Theorem

nao
Published 2017/09/30 13:44

http://ksat.me/a-plain-english-introduction-to-cap-theorem/

You'll often hear about the CAP theorem, which specifies a kind of upper limit on what you can achieve when designing distributed systems. As with most of my other introductory tutorials, let's try to understand CAP by comparing it with a real-world situation.

Chapter 1: "Remembrance Inc", your new venture:

Last night, when your spouse thanked you for remembering her birthday and bringing her a gift, a strange idea struck you. People are so bad at remembering things, and you're sooo good at it. So why not start a venture that puts your talent to use? The more you think about it, the more you like it. In fact, you even come up with a newspaper ad that explains your idea.

Remembrance Inc! - Never forget, even without remembering!
   Ever felt bad that you forget so much? Don't worry. Help is just a phone call away!
    When you need to remember something, just call 555-55-REMEM and tell us what you need to remember. For example, call us and tell us your boss's phone number, then forget about remembering it. When you need it again, call back the same number (555-55-REMEM) and we'll tell you your boss's phone number.
   Charges: only $0.1 per request

So, your typical phone conversation will look like this:

  • Customer: Hey, can you store my neighbor's birthday?
  • You: When is it?
  • Customer: 2nd of Jan
  • You: (writing it down on the customer's page in your paper notebook) Stored. Call us any time to find out your neighbor's birthday again!
  • Customer: Thank you!
  • You: No problem! We charged your credit card $0.1


Chapter 2: You scale up:

Your venture gets funded by YCombinator. Your idea is so simple, needing nothing but a paper notebook and a phone, yet so effective that it spreads like wildfire. You start getting hundreds of calls every day.

And there the problem starts. You see that more and more of your customers have to wait in the queue to speak to you. Most of them even hang up, tired of the waiting tone. Besides, when you were sick the other day and could not come to work, you lost a whole day of business, not to mention all those dissatisfied customers who wanted information that day. You decide it's time to scale up and bring in your wife to help you.

You start with a simple plan:

  1. You and your wife each get an extension phone
  2. Customers still dial 555-55-REMEM and need to remember only one number
  3. A PBX will route each customer's call to whoever is free, splitting calls equally between the two of you


Chapter 3: You have your first "Bad Service":

Two days after you implement the new system, you get a call from your trusted customer Jhon. This is how it goes:

  • Jhon: Hey
  • You: Glad you called “Remembrance Inc!”. What can I do for you?
  • Jhon: Can you tell me when my flight to New Delhi is?
  • You: Sure.. 1 sec, sir (You look up your notebook) (Wow! There is no entry for “flight date” on Jhon’s page!)
  • You: Sir, I think there is a mistake. You never told us about your flight to Delhi
  • Jhon: What! I just called you guys yesterday! (hangs up!)

How did that happen? Could Jhon be lying? You think about it for a second and the reason hits you! Could Jhon's call yesterday have reached your wife? You go to your wife's desk and check her notebook. Sure enough, it's there. You tell your wife and she realizes the problem too.

What a terrible flaw in your distributed design! Your distributed system is not consistent! There is always a chance that a customer's update goes to either you or your wife, and when that customer's next call is routed to the other person, there will not be a consistent reply from Remembrance Inc!
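
To see the flaw in code, here is a minimal Python sketch of the two-notebook setup. The Operator class, the alternating pbx router and the stored flight date are all made up for illustration; the story only says that calls go to whoever is free.

```python
from itertools import cycle


class Operator:
    """One person answering calls with a private paper notebook."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}  # customer -> {what to remember: value}

    def store(self, customer, key, value):
        self.notebook.setdefault(customer, {})[key] = value

    def lookup(self, customer, key):
        return self.notebook.get(customer, {}).get(key)


you, wife = Operator("you"), Operator("wife")
# Hypothetical PBX: simply alternates between the two operators.
pbx = cycle([wife, you])

# Yesterday: Jhon's update call happens to reach your wife.
next(pbx).store("Jhon", "flight date", "made-up date")

# Today: Jhon's search call happens to reach you.
answer = next(pbx).lookup("Jhon", "flight date")
print(answer)  # None: your notebook never saw the update
```

Nothing is lost here; the entry sits safely in the other notebook. The system is inconsistent only because the two notebooks never talk to each other.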



Chapter 4: You fix the Consistency problem:

Well, your competitors may ignore bad service, but not you. You think all night in bed while your wife is sleeping and come up with a beautiful plan in the morning. You wake up your wife and tell her.

“Darling, this is what we are going to do from now on”

  • Whenever either of us gets a call for an update (when the customer wants us to remember something), we tell the other person before completing the call
  • This way both of us note down every update
  • When there is a call for a search (when the customer wants information he has already stored), we don’t need to talk to the other person. Since both of us have the latest information in our notebooks, we can just refer to it.

"There is only one problem though," you say, "and that is that an 'update' request has to involve both of us, so we cannot work in parallel during that time. For example, while you are taking an update request and telling me to update too, I cannot take other calls. But that's okay, because most of the calls we get are 'searches' anyway (a customer updates once and asks many times). Besides, we cannot give wrong information at any cost."

"Neat," your wife says, "but there is one more flaw in this system that you haven't thought of. What if one of us doesn't report to work on a particular day? On that day we won't be able to take any update calls, because the other person cannot be updated! We will have an Availability problem. For example, if an update request comes to me, I will never be able to complete that call, because even though I have written the update in my notebook, I can never update you. So I can never complete the call."

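The plan and the objection to it can be sketched in a few lines of Python. This is only an illustration under assumptions: the Operator class, the at_work flag and the stored values are invented here, and the sketch refuses the update outright instead of leaving the call hanging.

```python
class Operator:
    """One person with a notebook who must tell the peer about every update."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}
        self.at_work = True
        self.peer = None  # set once both operators exist

    def update(self, key, value):
        # The call completes only once BOTH notebooks hold the update.
        if not self.peer.at_work:
            raise RuntimeError(f"{self.name} cannot reach {self.peer.name}; update refused")
        self.notebook[key] = value
        self.peer.notebook[key] = value

    def search(self, key):
        # Searches are answered locally, without talking to the peer.
        return self.notebook.get(key)


you, wife = Operator("you"), Operator("wife")
you.peer, wife.peer = wife, you

wife.update("Jhon: flight date", "made-up date")
print(you.search("Jhon: flight date"))  # you see the update your wife took

you.at_work = False  # you take a day off
try:
    wife.update("Jhon: hotel", "made-up hotel")
except RuntimeError as err:
    print("update failed:", err)
```

Searches keep working on the day one person is absent, but every update is turned away: consistency is kept at the price of availability.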


Chapter 5: You come up with the greatest solution ever:

You begin to realize a little of why distributed systems might not be as easy as you first thought. Is it that difficult to come up with a solution that is both "Consistent and Available"? It could be difficult for others, but not for you!!! The next morning you come up with a solution that your competitors cannot think of even in their dreams! You eagerly wake your wife up again..

"Look," you tell her, "this is what we can do to be consistent and available. The plan is mostly similar to what I told you yesterday:"

  • Whenever either of us gets a call for an update (when the customer wants us to remember something), if the other person is available, we tell the other person before completing the call. This way both of us note down every update
  • But if the other person is not available (doesn’t report to work), we send the other person an email about the update.
  • The next day, when the other person comes back to work after the day off, he first goes through all the emails and updates his notebook accordingly, before taking his first call.

"Genius!" your wife says. "I can’t find any flaws in this system. Let’s put it to use.. Remembrance Inc! is now both Consistent and Available!"
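
As a rough sketch, the email plan amounts to queuing updates for an absent peer and replaying them before the peer takes a call again. The class, the inbox list and the return_to_work method below are hypothetical names chosen for this illustration.

```python
class Operator:
    """Operator who emails updates to an absent peer instead of blocking."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}
        self.inbox = []  # emailed updates waiting to be written down
        self.at_work = True
        self.peer = None  # set once both operators exist

    def update(self, key, value):
        self.notebook[key] = value
        if self.peer.at_work:
            self.peer.notebook[key] = value  # tell the peer right away
        else:
            self.peer.inbox.append((key, value))  # send an email instead

    def return_to_work(self):
        # Go through every email, in order, BEFORE taking the first call.
        for key, value in self.inbox:
            self.notebook[key] = value
        self.inbox.clear()
        self.at_work = True

    def search(self, key):
        return self.notebook.get(key)


you, wife = Operator("you"), Operator("wife")
you.peer, wife.peer = wife, you

you.at_work = False  # your day off
wife.update("Jhon: flight date", "made-up date")  # the call still completes

you.return_to_work()
print(you.search("Jhon: flight date"))  # caught up from the inbox
```

The sketch relies on the absent person answering no calls at all until the inbox is empty, which is exactly the rule in the third bullet.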



Chapter 6: Your wife gets angry:

Everything goes well for a while. Your system is consistent. Your system works well even when one of you doesn't report to work. But what if both of you report to work and one of you doesn't update the other person? Remember all those days you've been waking your wife up early with your greatest-idea-ever bullshit? What if your wife decides to take calls but is too angry with you and decides not to update you for a day? Your idea totally breaks! Your idea so far is good for consistency and availability, but it is not Partition Tolerant! You can decide to be partition tolerant by not taking any calls until you patch things up with your wife. But then your system will not be "available" during that time...
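
The choice you face while the two of you are not talking can be sketched like this. The Office class and its prefer_consistency switch are assumptions made for the example, not something the story defines.

```python
class Office:
    """Two notebooks plus a flag saying whether the operators are talking."""

    def __init__(self, prefer_consistency):
        self.notebooks = {"you": {}, "wife": {}}
        self.talking = True  # False models the partition (the angry day)
        self.prefer_consistency = prefer_consistency

    def update(self, operator, key, value):
        other = "wife" if operator == "you" else "you"
        if self.talking:
            self.notebooks[operator][key] = value
            self.notebooks[other][key] = value
            return True
        if self.prefer_consistency:
            return False  # refuse the call: consistent, but not available
        self.notebooks[operator][key] = value  # take the call: notebooks diverge
        return True

    def search(self, operator, key):
        return self.notebooks[operator].get(key)


strict = Office(prefer_consistency=True)
strict.talking = False
print(strict.update("wife", "Jhon: flight date", "made-up date"))  # False

relaxed = Office(prefer_consistency=False)
relaxed.talking = False
print(relaxed.update("wife", "Jhon: flight date", "made-up date"))  # True
print(relaxed.search("you", "Jhon: flight date"))  # None: you give a stale answer
```

Once talking is False there is no third branch: either the update is turned away or the two notebooks disagree.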



Chapter 7: Conclusion:

So let's look at the CAP Theorem now. It states that when you are designing a distributed system, you cannot achieve all three of Consistency, Availability and Partition tolerance. You can pick only two of:

  • Consistency: Your customers, once they have updated information with you, will always get the most up-to-date information when they call subsequently, no matter how quickly they call back
  • Availability: Remembrance Inc will always be available for calls as long as at least one of you (you or your wife) reports to work
  • Partition tolerance: Remembrance Inc will work even if there is a communication loss between you and your wife!


Bonus: Eventual Consistency with a run around clerk:

Here is some more food for thought. You can have a run-around clerk who updates the other notebook whenever yours or your wife's notebook is updated. The greatest benefit of this is that he can work in the background, so an "update" taken by you or your wife doesn't have to block, waiting for the other one to update. This is how many NoSQL systems work: one node updates itself locally and a background process synchronizes all the other nodes accordingly... The only problem is that you will lose consistency for some time. For example, a customer's call reaches your wife first, and before the clerk has a chance to update your notebook, the customer calls back and the call reaches you. Then he won't get a consistent reply... That said, this is not a bad idea at all if such cases are limited, for example if you can assume a customer won't forget things so quickly that he calls back within 5 minutes.
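
A small Python sketch of the clerk idea follows. To keep it deterministic, the clerk is a method you call by hand rather than a real background thread; the Office class, clerk_queue and clerk_runs are names invented for this example.

```python
from collections import deque


class Office:
    """Two notebooks kept in sync by a clerk who works through a queue."""

    def __init__(self):
        self.notebooks = {"you": {}, "wife": {}}
        self.clerk_queue = deque()  # updates the clerk still has to carry over

    def update(self, operator, key, value):
        # Write locally and return at once; the clerk syncs the other side later.
        self.notebooks[operator][key] = value
        other = "wife" if operator == "you" else "you"
        self.clerk_queue.append((other, key, value))

    def clerk_runs(self):
        # The clerk copies every pending update, oldest first.
        while self.clerk_queue:
            target, key, value = self.clerk_queue.popleft()
            self.notebooks[target][key] = value

    def search(self, operator, key):
        return self.notebooks[operator].get(key)


office = Office()
office.update("wife", "Jhon: flight date", "made-up date")

# The customer calls back before the clerk has made his round.
print(office.search("you", "Jhon: flight date"))  # None: not consistent yet

office.clerk_runs()
print(office.search("you", "Jhon: flight date"))  # now both notebooks agree
```

The gap between update and clerk_runs is the window of inconsistency; the shorter the clerk's round, the less likely a customer is to notice it.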

That's CAP and eventual consistency for you in simple English :)

Reposted from: http://ksat.me/a-plain-english-introduction-to-cap-theorem/
