First, I ran the test against an index in the following state:

health index                  pri rep docs.count docs.deleted store.size pri.store.size
green  javaindex_20160518       5   1   23330821            0     15.8gb          7.9gb
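For reference, a listing in this format is what the _cat/indices API prints; a call like the following (my reconstruction, not from the original session) would produce it:

curl -s 'http://localhost:9200/_cat/indices/javaindex_20160518?v'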


Before the merge:

[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160518?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
456833526
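This one-liner sums the last column (size.memory, reported in bytes) across every segment row, giving the total segment memory held by the index. Here is a variant of my own, assuming the same cluster, that also prints the total in MB:

curl -s "http://localhost:9200/_cat/segments/javaindex_20160518?h=size.memory" | awk '{sum += $1} END {printf "%d bytes (%.1f MB)\n", sum, sum/1e6}'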

Merge operation:

curl -XPOST 'http://localhost:9200/javaindex_20160518/_optimize?max_num_segments=1'
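Note that _optimize is the pre-2.1 name of this API; on Elasticsearch 2.1 and later it was renamed to _forcemerge, so the equivalent call would be:

curl -XPOST 'http://localhost:9200/javaindex_20160518/_forcemerge?max_num_segments=1'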

After the merge:

[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160518?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
369622567

Segment memory freed by the merge:

>>> print (456833526 - 369622567)
87210959  ----> 87.2MB reduction

As a percentage:

>>> print (456833526 - 369622567) / 456833526.0
0.190903149696  ----> 19% reduction


Now the same test on a larger index, again merging down to a single segment.
Index size:

health index                  pri rep docs.count docs.deleted store.size pri.store.size
green  javaindex_20160520       5   1  103324505            0     70.3gb         35.1gb

Before the merge:

[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160520?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
1698117764

Merge operation:

curl -XPOST 'http://localhost:9200/javaindex_20160520/_optimize?max_num_segments=1'

After the merge:

[root@betaing index]# curl -s "http://localhost:9200/_cat/segments/javaindex_20160520?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
1622962469
>>> print ( 1698117764 - 1622962469 )
75155295
>>> print ( 1698117764 - 1622962469 ) / 1698117764.0
0.0442579994116

The merge freed 4.4% of the segment memory, 75.2MB in absolute terms.

Summary:
From the two examples above, the larger the index, the smaller the share of segment memory a full merge frees: the 15.8GB index shed 19%, while the 70.3GB index shed only 4.4%!
 
Next, let's take a single index and compare how much is freed at different target segment counts:
Before the merge:

health index                  pri rep docs.count docs.deleted store.size pri.store.size
green  phpindex_20160526        5   1  260338401            0     96.5gb         48.2gb
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3955994758

Merging to 10 segments:

curl -XPOST 'http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=10'
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3929919062
>>> print ( 3955994758 - 3929919062 ) / 3955994758.0
0.00659143846115
>>> print ( 3955994758 - 3929919062 )
26075696

The merge freed 26MB of segment memory, a 0.66% reduction.
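As a quick sanity check (my own addition, not part of the original test), you can confirm the merge reached its target by counting the remaining segment rows; _cat/segments prints one row per segment per shard copy:

curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?h=shard,segment" | wc -l

With 5 primary and 5 replica shards merged down to 10 segments each, this should print at most 100.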
 
Merging to 5 segments:

curl -XPOST 'http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=5'
{"_shards":{"total":10,"successful":10,"failed":0}}
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3899949448

Savings after the merge:

>>> print ( 3955994758 - 3899949448 ) / 3955994758.0
0.0141671851022
>>> print ( 3955994758 - 3899949448 )
56045310

The merge freed 56MB of segment memory, a 1.42% reduction.

Merging to 1 segment:

[root@betaing nock]# curl -XPOST 'http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=1'
{"_shards":{"total":10,"successful":10,"failed":0}}
[root@betaing nock]# curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?v&h=shard,segment,size,size.memory" | awk '{sum += $NF} END {print sum}'
3892073433

Savings after the merge:

>>> print ( 3955994758 - 3892073433 ) / 3955994758.0
0.0161580914309
>>> print ( 3955994758 - 3892073433 )
63921325

The merge freed 64MB of segment memory, a 1.6% reduction.
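The whole comparison can be scripted. Here is a minimal sketch of my own, assuming the same cluster and index name as above, that merges to each target count in turn and reports the resulting segment memory (on the 1.x _optimize API the call blocks until the merge finishes by default, so the follow-up read should reflect the merged state):

for n in 10 5 1; do
  # trigger the force merge down to n segments (discard the _shards JSON reply)
  curl -s -XPOST "http://localhost:9200/phpindex_20160526/_optimize?max_num_segments=$n" > /dev/null
  # sum size.memory (bytes) across all remaining segment rows
  curl -s "http://localhost:9200/_cat/segments/phpindex_20160526?h=size.memory" | awk -v n=$n '{sum += $1} END {printf "max_num_segments=%d -> %d bytes of segment memory\n", n, sum}'
done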
 
Summary:
As the target segment count goes down, the amount of segment memory freed goes up, but far from proportionally.
 
Merge performance was as follows:

[Figures: two performance screenshots from the optimize force-merge test]