Tigase Load Tests again - 500k user connections

Posted by 今幕明 on 2014/11/21 12:05

By admin on May 29, 2010    

For the last couple of weeks I have had the great opportunity and pleasure to use Sun's environment, both hardware and software, to run load tests on the Tigase server. Sun also offered me something more: their knowledge and help during the tests allowed me to improve Tigase and fine-tune the system to get the best results. I would like to say a big thank you to Deepak Dhanukodi, ISV Engineering at Sun, who was a huge help.

Below is a description of the tests I ran, the environment, and the results I got.

Summary

I know a summary should be at the end, but I realize that many people may be interested in the results without being interested in all the details. So here we go.

 

  • The main goal of these tests was to run the Tigase server under a load of 500k active user connections with an empty user roster. This test was meant to show how well the Tigase server handles a huge number of I/O operations and a huge number of user sessions on a single machine.

    Success! Tigase easily handled the load, with CPU usage below 30% and memory consumption at about 50%. The test was so successful that we tried to run a similar test to reach 1 million online users. This failed, however, because the client machines couldn't generate such a load.

  • The secondary goal was to run comparison tests with different user roster sizes and a user connection count above 100k, to see how the roster size affects load and resource consumption.

    This test wasn't about reaching a maximum score, but I still consider it a great success. With a roster size of 40 and above, the Tigase server started to behave unstably. Long GC activity hurt overall performance and in some cases led to an unresponsive service. More details below. I learnt not only that the default GC is a poor choice for the Tigase server under high load, but I also found the best GC and GC parameters for a stable service under an even higher load than I had planned. The CMS GC is the one that should be used to run Tigase.

  • Maximum connections with a 50-element roster was the last test I wanted to run. In most XMPP installations I have helped to set up, the average roster size was just below 50 elements. So the goal of this test was to see how many connections Tigase can handle with such a roster.

    The result, 300k user connections with a roster size of 50, is quite good. CPU usage was below 50% and memory consumption below 60%. We could certainly have tried to handle more connections. Unfortunately, I never expected that the system could handle more than 300k user connections with a 50-element roster, so that is all I had prepared in the database for the test.

 

Testing environment

I had 12 machines to run my tests. One was for the Tigase server, a second for the database, and 10 more generated the client load:

  1. Tigase server SPARC Enterprise T5220, 32GB RAM, CPU - UltraSPARC-T2 with 8 Cores and 8 threads on each core which gives 64 processing units, CPU Clock speed - 1165MHz, 146GB 10k HDD SAS and SCSI.

  2. Database server Sun Fire X4600, 32GB RAM, CPU - 2xAMD Opteron 854 with 4 Cores each which gives 8 processing units, CPU Clock speed - 2.8GHz, 73GB 10k SAS HDD.

  3. Client machines 10x - Sun Fire V20z, 4GB RAM, CPU AMD Opteron Dual Core 2.1GHz, 36GB 10k SCSI HDD.

Software used:

  1. Tigase XMPP Server 4.1.5 as XMPP (Jabber) server.

  2. Tsung 1.3.0 as clients' load generator.

  3. MySQL 5.1.33 Community Server as the database, together with its configuration file.

  4. Solaris 10 Update 6 as OS on the server, Solaris Express Community Edition snv_110 X86 as OS on load generators.

 

Test types

There were 2 main types of tests I ran:

  1. Standard test, in which the user session was about 20 minutes long and the arrival duration was 60 minutes. This test was mainly used to compare the server behavior with different user roster sizes. The maximum number of user connections was tuned by adjusting the connection rate. This was, however, limited by the database, which couldn't handle the load generated by a connection rate higher than one new connection every 0.0045 sec.

  2. Extended test, which used a script similar to the standard one, but with the user session time extended by putting the script body in a loop. This was done to reach the maximum possible number of user connections and see how Tigase handles that.
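
As a rough cross-check of how the connection rate bounds the number of concurrent connections, here is a minimal Python sketch. It assumes that the "connection rate" figures are the interval between new connections, that arrivals are evenly spaced, and that every session lasts exactly its nominal length. The function name and the steady-state formula are illustrative only; they are not part of Tsung or Tigase.

```python
def estimate_peak_connections(interarrival_s, session_s, arrival_s):
    """Estimate peak concurrent connections for evenly spaced arrivals.

    Assumption: one new connection every `interarrival_s` seconds during an
    arrival phase of `arrival_s` seconds, each staying for `session_s` seconds.
    """
    if interarrival_s <= 0:
        raise ValueError("interarrival_s must be positive")
    # Connections accumulate until the first users leave or arrivals stop,
    # whichever happens first.
    ramp_s = min(session_s, arrival_s)
    return round(ramp_s / interarrival_s)


if __name__ == "__main__":
    # Standard test: 20-minute sessions, 60-minute arrival phase.
    for interarrival in (0.015, 0.0045):
        peak = estimate_peak_connections(interarrival, 20 * 60, 60 * 60)
        print(f"{interarrival} s between connections -> about {peak} concurrent")
```

Because sessions were only about 20 minutes long, the result is best read as an order-of-magnitude estimate rather than a prediction of the measured figures.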

Tigase setup

Here is a complete description of the Tigase installation, which was fine-tuned to get maximum performance during all tests. Please note that I am not a MySQL database expert and I couldn't get it working fast enough to avoid an impact on performance. The system was therefore configured to avoid any writes to the database during the test.

The complete JVM parameters for the tests are:

-XX:+UseLargePages -XX:+UseBiasedLocking 
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
-XX:ParallelCMSThreads=8 -Dfile.encoding=UTF-8 
-Dsun.jnu.encoding=UTF-8 -Djdbc.drivers=com.mysql.jdbc.Driver
-server -d64 -Xms28G -Xmx28G -XX:PermSize=64m 
-XX:MaxPermSize=512m

The Tigase server parameters:

--property-file etc/init.properties --test

The '--test' parameter only excludes the offline messages plugin from the installation and decreases the default logging level. This is done to avoid slowing the server down with any unnecessary I/O operations. During the tests Tsung sends lots of messages to online users. In the second phase it quite often happens that a message sent to an online user is processed after the user has actually gone, and it then goes to offline storage in the database. This introduced long delays. Heavy logging also introduces significant delays and impacts overall performance, so it is set to the absolute minimum during tests.

The Tigase server configuration properties

config-type=--gen-config-def
--admins=admin@tigase.test
--virt-hosts = tigase.test
--auth-db=tigase-auth
--user-db=mysql
--user-db-uri=jdbc:mysql://192.168.111.32/tigasedb_20roster?user=tigase_user&password=tigase_passwd
--user-repo-pool-size=12
--comp-name-1=srecv
--comp-class-1=tigase.server.sreceiver.StanzaReceiver
#--debug=server
--monitoring=jmx:10000

A few notes on the parameters:

  1. 'tigase-auth' was used as the authentication connector. It uses stored procedures to perform all user login/logout actions. Normally these procedures also update the last login/logout time. For this test, however, updating the login/logout time was removed from the stored procedures to minimize database delays.

  2. Depending on the roster size a different database was used.

  3. A database connection pool of 12 was used for the user data database. There was only a single database connection for the user authentication connector.

  4. StanzaReceiver was loaded to run Tigase's internal monitoring tools, which detect system overload, thread deadlocks and other possible problems.

  5. Monitoring via JMX was enabled and the system was also monitored using JConsole.
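
Since a different database was used for each roster size, the '--user-db-uri' line has to change between runs. The following Python sketch shows one way to generate it. It assumes the other databases follow the same 'tigasedb_<N>roster' naming as the single name shown in the configuration, which the article does not confirm, and the helper name is hypothetical.

```python
def user_db_uri_line(roster_size, host="192.168.111.32",
                     user="tigase_user", password="tigase_passwd"):
    """Build the --user-db-uri property line for a given roster size.

    Assumption: databases are named tigasedb_<N>roster, extrapolated from
    the one name (tigasedb_20roster) that appears in the configuration.
    """
    if roster_size < 0:
        raise ValueError("roster_size must not be negative")
    database = f"tigasedb_{roster_size}roster"
    return (f"--user-db-uri=jdbc:mysql://{host}/{database}"
            f"?user={user}&password={password}")


if __name__ == "__main__":
    # Reproduces the line used in the configuration shown above.
    print(user_db_uri_line(20))
```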

 

The user roster

The user roster was either empty or had the same fixed size for all users. It was built in such a way that exactly half of the buddies were always online and the other half offline when the user logged in. Later on the rest of the buddies logged in too, so eventually all buddies in the roster were online for the rest of the test.
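
One way to obtain such a roster, shown here only as a hedged Python sketch and not as the method actually used for these tests, is to number the users in login order and give each user the buddies immediately before and after it. Assuming users log in in index order and earlier users are still online, the preceding half is online at login and the following half arrives later. The function name and the ring layout are assumptions.

```python
def ring_roster(user_index, total_users, roster_size):
    """Return buddy indices for one user: half before it, half after it.

    Assumption: users log in in increasing index order, so the buddies
    listed first are already online and the rest log in later. For the
    very first users the "before" half wraps around to the end of the
    range, so those buddies are not online yet.
    """
    if roster_size % 2 != 0:
        raise ValueError("roster_size must be even")
    if roster_size >= total_users:
        raise ValueError("roster_size must be smaller than total_users")
    half = roster_size // 2
    before = [(user_index - k) % total_users for k in range(1, half + 1)]
    after = [(user_index + k) % total_users for k in range(1, half + 1)]
    return before + after


if __name__ == "__main__":
    roster = ring_roster(100, 1000, 4)
    print(roster)
    # The layout is symmetric: every buddy also has this user in its roster.
    print(all(100 in ring_roster(b, 1000, 4) for b in roster))
```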

Tests and test results

Basic tests

| Name | Roster | Session length | Connection rate | Max connections | CPU usage | RAM usage | Tsung reports | Comments |
|------|--------|----------------|-----------------|-----------------|-----------|-----------|---------------|----------|
| 500k | empty | 80 min | 0.005 sec | 622k | CPU | Memory | Tsung report | We also attempted to reach 1 million connections. This failed, however, because of limitations of the load-generating machines, which were maxing out their resources above 500k connections. |
| 300k* | 50 | 20 min | 0.0045 sec* | 300k | CPU | Memory | Tsung report | The requirement was to keep the user session within 20 min, so to generate more connections the connection rate had to be changed. Unfortunately, the 0.0045 sec rate was the highest the database could handle, so 300k was the limit of the test or of the database, not of the Tigase server. |

* - the database limit.

Other tests

| No | Roster | Session length | Connection rate | Max connections | Tsung reports | Comments |
|----|--------|----------------|-----------------|-----------------|---------------|----------|
| 1. | Empty | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 2. | 10 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 3. | 20 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 4. | 30 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 5. | 40 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 6. | 50 | 20 min | 0.015 sec | >100k | Tsung report | GC settings: -XX:+UseLargePages -XX:+UseBiasedLocking -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:ParallelGCThreads=32. This didn't help much. At a certain load, GC delays make Tigase unresponsive. |
| 7. | 50 | 20 min | 0.0045 sec | 299k | Tsung report | GC settings: -XX:+UseLargePages -XX:+UseBiasedLocking -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:ParallelCMSThreads=8. This is the secret formula. CMS GC is the one that works well with Tigase and offers a stable service even under a very high load. |


This article is reposted from: http://www.tigase.net/blog-entry/tigase-load-tests-again-500k-user-connections
