1. Segment memory:
A segment is a complete Lucene inverted index, and an inverted index answers queries quickly by mapping the term dictionary (Term Dictionary) to postings lists (Postings List). Because of this, every segment keeps some of its index data resident in the heap.
The more segments there are, the more heap they carve up, and this portion of the heap cannot be reclaimed by GC!
How do you find out how much segment memory is in use? The CAT API gives the answer.
1). View the memory usage of all segments of an index:

curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/segments?v'
index                   shard prirep ip             segment generation docs.count docs.deleted size   size.memory committed searchable version compound
www-nginx-tp-2017-08-20 0     p      192.168.58.158 _s9v    36643      58203      0            56.8mb 127077      true      true       6.5.1   false
2). View the total segment memory used on a node:

curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/nodes?v&h=segments.count,segments.memory,segments.index_writer_memory,segments.version_map_memory,segments.fixed_bitset_memory'
segments.count segments.memory segments.index_writer_memory segments.version_map_memory segments.fixed_bitset_memory
          9488           1.1gb                       75.8mb                       1.3mb                       29.1kb
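The size.memory column of _cat/segments can also be aggregated yourself, for example per index, by piping the output through awk. A minimal sketch; the heredoc stands in for the live curl output (in practice pipe in `curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/segments'`), and the sample rows are taken from the output shown in this article:

```shell
# Sum size.memory (field 10 of header-less _cat/segments output) per index.
cat <<'EOF' | awk '{mem[$1] += $10} END {for (i in mem) print i, mem[i]}' | sort
api-nginx-tp-2017-09-04 0 p 192.168.58.158 _dtw 17924 622749 0 301.4mb 537606 true true 6.5.1 false
api-nginx-tp-2017-09-04 0 p 192.168.58.158 _jv5 25745 1322064 0 630.2mb 1046505 true true 6.5.1 false
www-nginx-tp-2017-08-20 0 p 192.168.58.158 _s9v 36643 58203 0 56.8mb 127077 true true 6.5.1 false
EOF
```

This prints one line per index with its summed size.memory in bytes, which is handy for spotting the indices that eat the most heap.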
There are three ways to reduce segment memory usage on a data node:
①. Delete unused indices.
②. Close indices (the files remain on disk; only the memory is released). They can be reopened when needed.
③. Periodically force merge indices that are no longer updated (in earlier versions this was called optimize).
[Closing an index frees considerably more segment memory than a force merge does.]
[The indices in this article are split by day, so the files of previous days' indices are never updated again.]
①. force merge (merge an index down completely):
How long this operation takes, and how much load it puts on the server, depends on the index size; it is advisable to merge the previous day's index in the early morning.

curl -s -XPOST -uelastic:changeme 'http://192.168.58.158:9200/api-nginx-tp-*/_forcemerge?max_num_segments=1'
Example:

# Before the merge
[root@elk ~]$ curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/indices?v' | grep api-nginx-tp-2017-09-04
green open api-nginx-tp-2017-09-04 6r1H7VDQR6K0A5HkP-dyrg 5 0 46840141 0 24.5gb 24.5gb

# 133 lines in total
[root@elk ~]$ curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/segments?v' | grep api-nginx-tp-2017-09-04
api-nginx-tp-2017-09-04 0 p 192.168.58.158 _dtw 17924 622749 0 301.4mb 537606 true true 6.5.1 false
api-nginx-tp-2017-09-04 0 p 192.168.58.158 _jv5 25745 1322064 0 630.2mb 1046505 true true 6.5.1 false
api-nginx-tp-2017-09-04 0 p 192.168.58.158 _vc1 40609 432480 0 212.6mb 394258 true true 6.5.1 false
.................................
api-nginx-tp-2017-09-04 4 p 192.168.58.158 _2m0k 121844 1437 0 946.5kb 14959 true true 6.5.1 true

# Merge the index
[root@elk-aliyun-10.47.58.158 ~]$ curl -s -XPOST -uelastic:changeme 'http://192.168.58.158:9200/api-nginx-tp-2017-09-04/_forcemerge?max_num_segments=1'
{"_shards":{"total":5,"successful":5,"failed":0}}

# After the merge
[root@elk ~]$ curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/indices?v' | grep api-nginx-tp-2017-09-04
green open api-nginx-tp-2017-09-04 6r1H7VDQR6K0A5HkP-dyrg 5 0 46840141 0 21.3gb 21.3gb

[root@elk ~]$ curl -s -uelastic:changeme 'http://192.168.58.158:9200/_cat/segments?v' | grep api-nginx-tp-2017-09-04
api-nginx-tp-2017-09-04 0 p 192.168.58.158 _2qpk 127928 9366137 0 4.2gb 6526325 true true 6.5.1 false
api-nginx-tp-2017-09-04 1 p 192.168.58.158 _2rti 129366 9366212 0 4.2gb 6522427 true true 6.5.1 false
api-nginx-tp-2017-09-04 2 p 192.168.58.158 _2tam 131278 9371827 0 4.2gb 6538908 true true 6.5.1 false
api-nginx-tp-2017-09-04 3 p 192.168.58.158 _2kc3 119667 9369922 0 4.2gb 6528417 true true 6.5.1 false
api-nginx-tp-2017-09-04 4 p 192.168.58.158 _2m0l 121845 9366043 0 4.2gb 6519260 true true 6.5.1 false

Summary:
1. Before the merge: the index was 24.5gb with 133 segments.
2. After the merge: the index is 21.3gb, with exactly one segment per shard (5 in total, as requested by max_num_segments=1).
The total size.memory of the segments was 38372792 before the merge and 32635337 after, a saving of roughly 5.5MB.
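The saving can be double-checked with a quick awk calculation using the two size.memory totals measured above:

```shell
# size.memory before and after the force merge, in bytes (from the run above)
before=38372792
after=32635337
# absolute saving in bytes and in MB (1 MB = 1048576 bytes)
awk -v b="$before" -v a="$after" \
    'BEGIN {printf "%d bytes (%.1f MB)\n", b - a, (b - a) / 1048576}'
# prints: 5737455 bytes (5.5 MB)
```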
②. close index:
For indices that are no longer used, closing them is recommended to reduce memory usage. As long as an index is open, its segments occupy memory; once closed, it only takes up disk space.

curl -s -XPOST -uelastic:changeme 'http://192.168.58.158:9200/api-nginx-tp-2017-08-28/_close'
curl -s -XPOST -uelastic:changeme 'http://192.168.58.158:9200/*-2017-08-28/_close'
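To close everything from a given number of days back, the date can be computed with GNU date and substituted into the URL. A dry-run sketch, using a fixed reference date for reproducibility and the index prefixes from this article; it only prints the curl commands (drop the echo to actually run them, and use `date -d "7 days ago"` for the live case):

```shell
# Compute the date 7 days before a reference day (GNU date syntax).
day_7=$(date -d "2017-09-05 -7 days" "+%Y-%m-%d")
# Print one _close command per index prefix (dry run; remove echo to execute).
for prefix in api-nginx-tp www-nginx-tp; do
    echo curl -s -XPOST -uelastic:changeme "http://192.168.58.158:9200/${prefix}-${day_7}/_close"
done
```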
③. Tuning approach and script:
# For nginx logs stored per day (adjust to your own situation):
1). Merging: in the early morning of each day, force merge the previous day's indices.
2). Closing: indices from 7 days ago can be closed.
The script logs the segment memory before and after the optimization, covering both the merges and the closes.
#!/bin/bash
# Author: perofu
# Email:  perofu.com@gmail.com

log_file=/tmp/elasticsearch_optimize_index.log
day=$(date "+%Y-%m-%d")
day_1=$(date -d "1 days ago" "+%Y-%m-%d")
day_7=$(date -d "7 days ago" "+%Y-%m-%d")
ip="192.168.58.158"
index_name="bb-nginx-tp aa-nginx-tp"

# set number_of_replicas to 0 for today's indices
curl -s -XPUT -uelastic:changeme -H 'Content-Type: application/json' "http://${ip}:9200/*-${day}/_settings" -d '
{
    "number_of_replicas": 0
}'

# clear caches
curl -s -XPOST -uelastic:changeme "http://${ip}:9200/_cache/clear"

echo "${day}" >> ${log_file}

# segment memory before the optimization
curl -s -uelastic:changeme "http://${ip}:9200/_cat/nodes?v&h=segments.count,segments.memory,segments.index_writer_memory,segments.version_map_memory,segments.fixed_bitset_memory" &>> ${log_file}

# force merge yesterday's indices
for line in ${index_name}
do
    echo "" >> ${log_file}
    curl -s -uelastic:changeme "http://${ip}:9200/_cat/indices?v" | grep "${line}-${day_1}" &>> ${log_file}
    # sum size.memory (column 10 of _cat/segments) for the index
    curl -s -uelastic:changeme "http://${ip}:9200/_cat/segments?v" | grep "${line}-${day_1}" | awk 'BEGIN{summ=0}{summ=summ+$10}END{print summ}' &>> ${log_file}
    curl -s -XPOST -uelastic:changeme "http://${ip}:9200/${line}-${day_1}/_forcemerge?max_num_segments=1" &>> ${log_file}
    sleep 10
    curl -s -uelastic:changeme "http://${ip}:9200/_cat/indices?v" | grep "${line}-${day_1}" &>> ${log_file}
    curl -s -uelastic:changeme "http://${ip}:9200/_cat/segments?v" | grep "${line}-${day_1}" | awk 'BEGIN{summ=0}{summ=summ+$10}END{print summ}' &>> ${log_file}
    echo "" >> ${log_file}
done

# segment memory after the merges
curl -s -uelastic:changeme "http://${ip}:9200/_cat/nodes?v&h=segments.count,segments.memory,segments.index_writer_memory,segments.version_map_memory,segments.fixed_bitset_memory" &>> ${log_file}

# close indices from 7 days ago
for line in ${index_name}
do
    echo "" >> ${log_file}
    curl -s -XPOST -uelastic:changeme "http://${ip}:9200/${line}-${day_7}/_close" &>> ${log_file}
    sleep 2
done

# segment memory after the closes
curl -s -uelastic:changeme "http://${ip}:9200/_cat/nodes?v&h=segments.count,segments.memory,segments.index_writer_memory,segments.version_map_memory,segments.fixed_bitset_memory" &>> ${log_file}
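Since the merge step is load-heavy, the article schedules it in the early morning. A crontab entry like the following could run the script at 01:00 every day (the script path here is an assumption; adjust it to wherever you save the script):

```shell
# crontab fragment: run the optimization script at 01:00 daily (path is an example)
0 1 * * * /bin/bash /usr/local/scripts/elasticsearch_optimize_index.sh
```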
4. Tuning results:
①. Index tuning results:
Environment (single machine):

==== Server info ====
CPU:          8 x Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
CPU cores:    8
Memory:       16G
Disk:         1116 G
Distro:       6.5
Architecture: x86_64
Runlevel:     3

# Per-index data volume before and after the optimization:
# Before:
2017-09-05 [60G]: 392.7mb 681.6kb 11.6gb 21.6gb 317.8mb 9.8mb 681.4mb 27.7gb
# After:
2017-09-06 [60G]: 11.2gb 9.2mb 294.8mb 26.5gb 20.6gb 620.9mb 661.4kb 376.7mb
2017-09-07 [60G]: 10.5gb 289.5mb 412.3mb 8.7mb 20gb 25.6gb 598.7mb 830.9kb
Total PV over the period:
Server load over the period:
Load before vs. after the optimization:
If you try this as well, please post your before/after comparison in the comments below. Thanks.