
Improve API gateway throttling

stubhub
Published 2014/07/01 11:59

What is throttling?

Throttling is a flow-control feature that limits access to a resource to a certain number of times. Once the upper limit, or threshold, is reached, access to the resource is rejected. A ban list can be used to record such failed access, so that within a time window, further access to the same resource is also denied. Throttling provides privilege-based access control and shields resources against DDoS attacks.


In an API gateway, throttling is basically used to limit API access based on subscription tier, e.g. the gold tier allows 1000 API accesses per time unit. Throttling can be single-tier (e.g. per user, per IP address) or multi-tier (per user per API endpoint). In both cases, throttling can be viewed as a tuple (key, upper limit). For the multi-tier case, the key can be viewed as a path, e.g. user id + “/” + IP address.



Throttling within a VM

Counter based implementation

The timeline is divided into a series of time windows, and a counter is maintained for each window. When a new request comes in, the start of the current time window is determined and the corresponding counter is incremented. The request is denied if the counter reaches the limit, and the throttle key is added to the ban list. The counter is reset at the end of the current time window.


Refer to the following diagram:

[diagram: one counter per time window — image not available]


The time-window approach is quite simple to implement: a map is enough to capture the throttling information, and this simplicity implies little calculation overhead.

However, it is subject to a problem known as an access spike. Consider the following scenario:

Throttling criterion: 1000 requests per minute

Current time window: 15:00 – 15:01

For the first 30 seconds within the current time window, there are no requests.

For the last 30 seconds within the current time window, there are 999 requests. All requests pass throttling.

Now the next time window starts: 15:01 – 15:02

For the first 30 seconds within this time window, there are 999 requests. All requests pass throttling.

Now if you consider the one-minute span from 15:00:30 to 15:01:30, it accepted 1998 requests, almost double the throttling limit!




Given the above explanation, the choice of time window is a fairly important aspect to consider:

If the time window is too large, the access-spike problem manifests itself more.

If the time window is too small, the time spent on the throttling calculation becomes non-trivial compared to the window itself and accuracy is compromised. We normally use per-second throttling, which is usually well balanced between accuracy and the access-spike issue.



Queue based implementation

An alternative is the queue-based approach. For each key, a sorted queue is maintained to record request times. When a new request comes in, we look back through the queue and count all request times that fall between the current time and the current time minus the window size, giving the total number of requests. This total is compared against the throttling limit to allow or deny access.

In the background, a housekeeping thread cleans up the queues, removing request times earlier than the current time minus the window size.


Refer to the following diagram:

[diagram: sorted queue of request timestamps per key — image not available]

The queue-based implementation is not subject to the access spike mentioned above. But the calculation cost is higher due to the need to iterate through the queue, and the queue must be locked for multi-threaded access. This impacts performance and throughput.
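The queue-based scheme above can be sketched as follows. This is a minimal single-key illustration, not the actual implementation: housekeeping is done inline rather than by a separate thread, and since a set collapses duplicates, a real version would need finer-grained timestamps (e.g. nanoseconds) or per-timestamp counts.

```java
import java.util.concurrent.ConcurrentSkipListSet;

public class SlidingWindowThrottle {
    private final int limit;
    private final long windowMillis;
    // Sorted "queue" of request timestamps for a single throttle key.
    private final ConcurrentSkipListSet<Long> timestamps = new ConcurrentSkipListSet<>();

    public SlidingWindowThrottle(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire(long nowMillis) {
        // Housekeeping (a background thread's job in the article):
        // drop timestamps that fell out of the sliding window.
        timestamps.headSet(nowMillis - windowMillis, false).clear();
        if (timestamps.size() >= limit) {
            return false;   // window already full: deny
        }
        timestamps.add(nowMillis);
        return true;
    }
}
```

The method is synchronized to reflect the locking cost the article mentions for multi-threaded queue access.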


Scalability considerations:

Within a single VM, scalability issues mainly arise from concurrency and lock contention between threads. A carefully chosen data structure can reduce lock usage and improve performance. For the counter-based approach, a ConcurrentHashMap implements the key-to-counter mapping and a Java AtomicInteger implements each counter. For the queue-based approach, a ConcurrentSkipListSet implements the sorted queue.
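Using the data structures just named, the counter-based approach might look like this sketch (class and method names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CounterThrottle {
    private final int limit;
    private final long windowMillis;
    // throttle key + window start -> counter for that window
    private final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    public CounterThrottle(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public boolean tryAcquire(String key, long nowMillis) {
        long windowStart = (nowMillis / windowMillis) * windowMillis;
        // Embedding the window start in the key means expired windows are
        // simply abandoned; a real implementation would also evict them.
        String windowKey = key + ":" + windowStart;
        AtomicInteger counter = counters.computeIfAbsent(windowKey, k -> new AtomicInteger());
        // Atomic increment: no explicit lock needed.
        return counter.incrementAndGet() <= limit;
    }
}
```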




Throttling in a cluster environment

This simple problem becomes more intriguing in a cluster environment. API requests may be handled by different members of a cluster, yet the total number of requests should not exceed the throttling limit. Currently the WSO2 API gateway uses a peer-to-peer cluster synchronization approach, where throttling data on one cluster member is asynchronously replicated to the other members.


This approach is flawed for the following reasons:

A large number of messages is exchanged among cluster members: on receiving each throttling request, data is synchronized to all cluster members. Given an average of M throttling requests on each node and N cluster members, the total number of messages is M*N.


To make the situation worse, throttling data is stored in the axis2 message context and the whole context is replicated. Under peak load, the number of throttle keys stored on each node is huge and the throttle data has a large memory footprint, which makes serialization/deserialization and transferring the message across the network expensive.


Throttling accuracy is not guaranteed.

This is mainly due to the network transfer latency of throttling data. Consider the following scenario:

10 requests are allowed per minute.

Node 1 received 5 requests and node 2 received 4 requests previously, and both nodes are synchronized, so the counters on node 1 and node 2 are both set to 9.

Now node 1 receives its 6th request for the current time window.

Node 1 replicates its throttling data to node 2.

Node 2 receives its 5th request for the current time window. This happens before the latest state of node 1 has been replicated, so node 2 checks its local counter (still 9) and allows the request.

In total 11 requests are accepted within the time window, exceeding the throttling limit!



The proposed solution is to use a centralized throttle server to handle access requests for the whole cluster. Compared with the synchronization approach, it sends only one message to the throttle server per request, resulting in much less network overhead.


From the implementation perspective, we need a centralized key-value store with high performance. The key is the concatenation of the throttle key and the start time of the current time window; the value is the counter. The ban list is also kept in the store.

We chose memcached and Redis as candidate key-value stores.

Memcached has better read performance and slightly better write performance than Redis, especially under highly concurrent access. Redis provides cluster synchronization support and more flexible data structures. In our case, both a Redis throttle and a memcached throttle are implemented, but the memcached throttle is preferred. Refer to the following for comparisons between memcached and Redis:

http://blog.sina.com.cn/s/blog_72995dcc01018qkf.html

http://iyunlin.com/thread/200319



CAS (compare-and-swap) is another feature necessary to ensure the accuracy of the counter. A node receiving a throttle request retrieves the counter from the server and compares it with the throttle limit. Before the counter is incremented and written back to the server, it may have been updated by another node. CAS allows us to detect such data contention and avoid writing a stale value; in that case, the client is responsible for retrieving the counter again and retrying. Luckily both Redis and memcached support CAS operations.
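The CAS retry loop can be sketched as follows. MemStore here is a tiny in-memory stand-in for a memcached/Redis client's gets/cas pair (the real clients return an opaque version token with each read); all names are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;

public class CasThrottle {
    // Value plus version token, as returned by a "gets"-style read.
    public static class Entry {
        public final int value;
        public final long version;
        public Entry(int value, long version) { this.value = value; this.version = version; }
    }

    // In-memory stand-in for the remote store's gets/cas operations.
    public static class MemStore {
        private final ConcurrentHashMap<String, Entry> data = new ConcurrentHashMap<>();

        public synchronized Entry gets(String key) {
            return data.computeIfAbsent(key, k -> new Entry(0, 0));
        }

        // The write succeeds only if nobody updated the key since our read.
        public synchronized boolean cas(String key, int newValue, long expectedVersion) {
            Entry current = data.computeIfAbsent(key, k -> new Entry(0, 0));
            if (current.version != expectedVersion) {
                return false;   // stale read: caller must re-read and retry
            }
            data.put(key, new Entry(newValue, expectedVersion + 1));
            return true;
        }
    }

    private final MemStore store;
    private final int limit;

    public CasThrottle(MemStore store, int limit) { this.store = store; this.limit = limit; }

    public boolean tryAcquire(String throttleKey, long windowStart) {
        String key = throttleKey + ":" + windowStart;   // one counter per time window
        while (true) {
            Entry e = store.gets(key);
            if (e.value >= limit) {
                return false;   // limit reached: deny
            }
            if (store.cas(key, e.value + 1, e.version)) {
                return true;    // our increment won
            }
            // CAS failed: another node got there first; re-read and retry.
        }
    }
}
```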



Counter reset at the end of a time window is handled by key expiration. Each key's expiration time is set to the time-window length, so there is no need to explicitly remove keys from the store at the end of each window. Note that the smallest expiration time for memcached is 1 second, which implies that we cannot do accurate throttling at the millisecond level.



A single centralized throttle server may become a bottleneck under heavy load if it handles all throttling requests. Ideally, the centralized throttle is itself a cluster, and throttle requests are distributed among its members. So which server should handle a particular throttle request? We use hash partitioning: compute a hash value of the throttle key (the murmur hash algorithm is used) and take it modulo the number of throttle servers; the remainder is the server index. Alternatively, consistent hashing can be used. This ensures that the same throttle key always hits the same server, so we do not have to worry about distributing the key-value store among servers.

A few limitations: we have not considered data replication and backup, as memcached does not support them. If one throttle server goes down, the data stored on it is lost and we will not automatically fall back to another server. Load distribution is not even, as some keys may be accessed more frequently than others, meaning the corresponding server takes more load.

[diagram: hash-partitioned throttle server cluster — image not available]
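The routing step can be sketched like this. The murmur hash used in the real system is replaced with String.hashCode() to keep the sketch dependency-free; only the mod-N selection is the point.

```java
public class ThrottleRouter {
    private final String[] servers;

    public ThrottleRouter(String[] servers) { this.servers = servers; }

    public String serverFor(String throttleKey) {
        // floorMod keeps the index non-negative even for negative hash values.
        int index = Math.floorMod(throttleKey.hashCode(), servers.length);
        return servers[index];
    }
}
```

Because the mapping is a pure function of the key, every gateway node routes a given throttle key to the same server without any coordination.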



Further optimization

During implementation, we made some observations that can improve performance further:

The ban list can be stored on the API gateway servers as well as the central throttle server. This key affinity saves network bandwidth and is effective against DDoS attacks: if a key is rejected, the request reaches the central throttle server only the first time, and the key is added to the gateway server's local ban list. Subsequent throttle requests for the same key are rejected directly by the gateway server and never reach the central throttle server.
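A gateway-local ban list might be sketched as below. The centralAllows predicate stands in for the round trip to the central throttle server and is an assumption of this sketch, not part of any real API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

public class LocalBanList {
    // throttle key -> time (ms) until which it stays banned locally
    private final ConcurrentHashMap<String, Long> bannedUntil = new ConcurrentHashMap<>();

    public boolean allow(String key, long nowMillis, long windowEndMillis,
                         Predicate<String> centralAllows) {
        Long until = bannedUntil.get(key);
        if (until != null && nowMillis < until) {
            return false;   // short-circuit: no round trip to the central server
        }
        if (centralAllows.test(key)) {
            return true;
        }
        // Cache the rejection for the rest of the time window.
        bannedUntil.put(key, windowEndMillis);
        return false;
    }
}
```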



The counter can be distributed between the central throttle server and the API gateway servers.

Say the throttle limit is 1000 requests per minute with 20 API gateway servers in a cluster; we could allocate a quota of 40 (below the even share of 50, leaving some headroom) as the local throttle limit for each gateway server, so that the first 40 throttle requests are handled locally. Note that this is a heuristic approach on the premise that requests for a particular throttle key are distributed evenly across all gateway cluster nodes, and it is meaningful only if the throttle limit is fairly large.

[diagram: counter quota split between gateway servers and central throttle server — image not available]
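The quota split can be sketched as follows; again centralAllows is an illustrative stand-in for the remote check, and the sketch uses a plain even split of the limit.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class SplitCounter {
    private final int localQuota;
    private final AtomicInteger localCount = new AtomicInteger();

    public SplitCounter(int clusterLimit, int gatewayNodes) {
        // Even split of the cluster-wide limit; the article instead reserves
        // headroom (40 per node rather than 50 for 1000 req/min over 20 nodes).
        this.localQuota = clusterLimit / gatewayNodes;
    }

    public boolean tryAcquire(BooleanSupplier centralAllows) {
        if (localCount.incrementAndGet() <= localQuota) {
            return true;   // served from the local quota, no network call
        }
        return centralAllows.getAsBoolean();   // quota exhausted: ask the central server
    }
}
```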



CAS can be expensive. We observe that under high concurrency, CAS can easily fail and need to be retried many times. Each CAS attempt sends an additional request to the central throttle server and degrades performance drastically, so we should reduce CAS usage as much as possible. Consider the following scenario:

Throttle limit = 1000 requests per minute with 20 API gateway servers in a cluster.

The first 980 requests are safe without using CAS; a simple atomic increment operation will do, since even if all 20 nodes increment concurrently the limit cannot be exceeded. Contention only arises when we are near the throttle limit, so we should use CAS only when the counter >= 980. To further reduce contention, we can sleep for a short random time before each CAS attempt.

[diagram: plain increments below the safety threshold, CAS near the limit — image not available]
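The "CAS only near the limit" idea can be sketched like this, with an in-memory store standing in for memcached (its cas compares values rather than memcached's version tokens, which is a simplification). The safety margin is limit minus node count, i.e. 1000 − 20 = 980 in the scenario above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

public class NearLimitThrottle {
    // In-memory stand-in for the central store: incr is a plain atomic
    // increment; cas compares values instead of memcached version tokens.
    public static class Store {
        private final ConcurrentHashMap<String, AtomicInteger> data = new ConcurrentHashMap<>();
        private AtomicInteger cell(String key) {
            return data.computeIfAbsent(key, k -> new AtomicInteger());
        }
        public int get(String key) { return cell(key).get(); }
        public int incr(String key) { return cell(key).incrementAndGet(); }
        public boolean cas(String key, int expect, int update) {
            return cell(key).compareAndSet(expect, update);
        }
    }

    private final Store store;
    private final int limit;
    private final int nodes;

    public NearLimitThrottle(Store store, int limit, int nodes) {
        this.store = store; this.limit = limit; this.nodes = nodes;
    }

    public boolean tryAcquire(String key) {
        int safe = limit - nodes;   // e.g. 1000 - 20 = 980
        while (true) {
            int current = store.get(key);
            if (current >= limit) {
                return false;   // limit reached: deny
            }
            if (current < safe) {
                // Below the safety margin even concurrent increments from all
                // nodes cannot overshoot, so a cheap increment is enough.
                return store.incr(key) <= limit;
            }
            // Near the limit: back off for a short random time, then try CAS.
            try {
                Thread.sleep(ThreadLocalRandom.current().nextInt(2));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (store.cas(key, current, current + 1)) {
                return true;
            }
            // CAS lost: another node updated the counter; re-read and retry.
        }
    }
}
```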


Reposted from: http://shadowisper.blog.51cto.com/3189863/1379528
