First, I worked with an index in the following state:
health index              pri rep docs.count docs.deleted store.size pri.store.size
green  javaindex_20160518 5   1   23330821   0            15.8gb     7.9gb
Before the merge:
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160518?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
456833526
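The awk one-liner above can be mirrored in a minimal Python sketch that sums the size.memory column. The sample text and the helper name sum_segment_memory are my own for illustration; this is not output from the real cluster.

```python
# Sum the last column (size.memory, in bytes) of a _cat/segments
# response, the way the awk one-liner does. The sample text is
# illustrative only, not real cluster output.
sample = """shard segment size size.memory
0     _0      3.2gb 91366705
1     _1      1.1gb 30277121
"""

def sum_segment_memory(cat_output):
    total = 0
    for line in cat_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (awk adds 0 for the
        # non-numeric header field, which has the same effect).
        if fields and fields[-1].isdigit():
            total += int(fields[-1])
    return total

print(sum_segment_memory(sample))  # → 121643826
```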
Merge operation:
curl -XPOST 'http://localhost:9200/javaindex_20160518/_optimize?max_num_segments=1'
After the merge:
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160518?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
369622567
Segment memory freed by the merge:
>>> print (456833526 - 369622567)
87210959
----> a reduction of about 87.2 MB
As a percentage:
>>> print (456833526 - 369622567) / 456833526.0
0.190903149696
----> a reduction of about 19%
Let's run the same test on a larger index, again merging down to a single segment.
Index size:
health index              pri rep docs.count docs.deleted store.size pri.store.size
green  javaindex_20160520 5   1   103324505  0            70.3gb     35.1gb
Before the merge:
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160520?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
1698117764
Merge operation:
curl -XPOST 'http://localhost:9200/javaindex_20160520/_optimize?max_num_segments=1'
After the merge:
[root@betaing index]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160520?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
1622962469
>>> print ( 1698117764 - 1622962469 ) / 1698117764.0
0.0442579994116
The merge freed about 4.4% of the segment memory, which comes to roughly 75.2 MB.
Summary:
The examples above show that the larger the index, the lower the percentage of segment memory released by the merge!
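For reference, the two measurements above can be recomputed in a short Python sketch; the byte counts are the before/after sums reported by the awk pipeline, nothing else is assumed.

```python
# Before/after segment memory (bytes) for each index, taken from the
# measurements above after merging down to a single segment.
runs = {
    "javaindex_20160518": (456833526, 369622567),
    "javaindex_20160520": (1698117764, 1622962469),
}
for name, (before, after) in runs.items():
    freed = before - after
    print(f"{name}: freed {freed / 1e6:.1f} MB ({freed / before:.1%})")
# → javaindex_20160518: freed 87.2 MB (19.1%)
# → javaindex_20160520: freed 75.2 MB (4.4%)
```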
Next, on a single index, let's compare the efficiency of merging down to different segment counts:
Before the merge:
health index             pri rep docs.count docs.deleted store.size pri.store.size
green  phpindex_20160526 5   1   260338401  0            96.5gb     48.2gb
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3955994758
Merging down to 10 segments:
curl -XPOST 'http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=10'
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3929919062
>>> print ( 3955994758 - 3929919062 ) / 3955994758.0
0.00659143846115
>>> print ( 3955994758 - 3929919062 )
26075696
The merge reduced segment memory by about 26 MB, a reduction of 0.66%.
Merging down to 5 segments:
curl -XPOST 'http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=5'
{"_shards":{"total":10,"successful":10,"failed":0}}
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3899949448
Efficiency after the merge:
>>> print ( 3955994758 - 3899949448 ) / 3955994758.0
0.0141671851022
>>> print ( 3955994758 - 3899949448 )
56045310
The merge reduced segment memory by about 56 MB, a reduction of 1.42%.
Merging down to 1 segment:
[root@betaing nock]# curl -XPOST 'http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=1'
{"_shards":{"total":10,"successful":10,"failed":0}}
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3892073433
Efficiency after the merge:
>>> print ( 3955994758 - 3892073433 ) / 3955994758.0
0.0161580914309
>>> print ( 3955994758 - 3892073433 )
63921325
The merge reduced segment memory by about 64 MB, a reduction of 1.6%.
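The three runs against phpindex_20160526 can be recapped in one short Python sketch; the numbers are the before/after sums measured above, nothing else is assumed.

```python
# Baseline and post-merge segment memory (bytes) for phpindex_20160526,
# as measured above for max_num_segments = 10, 5, and 1.
baseline = 3955994758
after_merge = {10: 3929919062, 5: 3899949448, 1: 3892073433}
for n, after in sorted(after_merge.items(), reverse=True):
    freed = baseline - after
    print(f"max_num_segments={n}: freed {freed / 1e6:.0f} MB ({freed / baseline:.2%})")
# → max_num_segments=10: freed 26 MB (0.66%)
# → max_num_segments=5: freed 56 MB (1.42%)
# → max_num_segments=1: freed 64 MB (1.62%)
```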
Summary:
As the target segment count decreases, the amount of segment memory released grows and the efficiency improves, but not proportionally.
The performance results are as follows: