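Group partitions the results into groups and returns the grouped documents themselves.
Facet computes grouped statistics: the emphasis is on counting, and what comes back is the number of documents in each group.
I. Using Group: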
// Basic grouping configuration
params.set(GroupParams.GROUP, "true");
// Group the results by the value of the dkeys field; dkeys is best left untokenized
params.set(GroupParams.GROUP_FIELD, "dkeys");
params.set(GroupParams.GROUP_LIMIT, "5"); // at most 5 documents per group
params.set(GroupParams.GROUP_FORMAT, "grouped");
params.set(GroupParams.GROUP_MAIN, "false");
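Iterating over the Group query results:
import java.util.List;
import org.apache.solr.client.solrj.response.Group;
import org.apache.solr.client.solrj.response.GroupCommand;
import org.apache.solr.client.solrj.response.GroupResponse;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

// query is assumed to be the request object carrying the group parameters set above
QueryResponse response = solrServer.query(query);
GroupResponse groupResponse = response.getGroupResponse();
List<GroupCommand> ls = groupResponse.getValues();
for (GroupCommand gc : ls) {
    List<Group> list = gc.getValues();
    for (Group g : list) {
        SolrDocumentList sdl = g.getResult();
        // Same check as CollectionUtils.isNotEmpty(sdl), without the extra dependency
        if (sdl != null && !sdl.isEmpty()) {
            for (SolrDocument doc : sdl) {
                // Each group holds up to 5 documents that share the same dkeys value
                System.out.println(doc.toString());
            }
        }
    }
}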
The usual way of reading results does not work in this mode:
SolrDocumentList results = response.getResults();
System.out.println(ls + "\t\t" + results); // results prints as null
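II. Using Facet: similar to Group, but with its own way of iterating over the results.
1. What is Faceted Search
Facet ['fæsɪt] has no neat Chinese equivalent and is best understood through examples. Yonik Seeley, the creator of Solr, also offers more direct names for it: Guided Navigation and Parametric Search.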
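A straightforward faceted search example is a product search where brand (品牌), product feature (产品特性), and seller (卖家) are each a Facet. Brands such as Apple and Lenovo are Facet values, also called Constraints, and the number attached to each Facet value is its Facet count, or Constraint count.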
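2. Using Facet
(In the queries below, 超级本 means ultrabook, and the fields 产品特性, 品牌, 卖家, and 价格 are product feature, brand, seller, and price.)
q = 超级本
facet = true
facet.field = 产品特性
facet.field = 品牌
facet.field = 卖家
http://…/select?q=超级本&facet=true&wt=json
&facet.field=品牌&facet.field=产品特性&facet.field=卖家
A filter can also be submitted along with the query by setting fq (filter query).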
q = 超级本
facet = true
fq = 价格:[8000 TO *]
facet.mincount = 1 // once fq has filtered out non-matching documents, some constraints would otherwise be listed with a count of 0
facet.field = 产品特性
facet.field = 品牌
facet.field = 卖家
http://…/select?q=超级本&facet=true&wt=json
&fq=价格:[8000 TO *]&facet.mincount=1
&facet.field=品牌&facet.field=产品特性&facet.field=卖家
"facet_counts": {
"facet_fields": {
"品牌": [
"Apple", 4,
"Lenovo", 39
…]
"产品特性": [
"显卡", 42,
"酷睿", 38
…]
…}}
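In the JSON response above, each field under facet_fields is a flat list that alternates value, count, value, count. The following is a minimal sketch of pairing that list up into a map; FacetFieldPairs and toCounts are hypothetical names for illustration, not part of SolrJ, and the input is assumed to be already parsed into a Java list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a SolrJ class: pairs up the flat
// [value, count, value, count, ...] list found under facet_fields.
class FacetFieldPairs {

    static Map<String, Integer> toCounts(List<Object> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected value/count pairs");
        }
        // LinkedHashMap keeps the order in which Solr returned the values
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String value = String.valueOf(flat.get(i));
            int count = ((Number) flat.get(i + 1)).intValue();
            counts.put(value, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Object> brand = Arrays.asList("Apple", 4, "Lenovo", 39);
        System.out.println(toCounts(brand)); // prints {Apple=4, Lenovo=39}
    }
}
```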
If the user selects the Apple category, add another fq condition to the query and remove the facet.field that Apple belongs to (品牌).
http://…/select?q=超级本&facet=true&wt=json
&fq=价格:[8000 TO *]&fq=品牌:Apple&facet.mincount=1
&facet.field=产品特性&facet.field=卖家
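The drill-down step just described (add an fq for the chosen value, stop faceting on that field) can be sketched as a small query-string builder. DrillDownQuery and its methods are hypothetical names, the parameter set simply mirrors the URLs in the text, and values containing query-syntax characters would need quoting that this sketch leaves out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not a SolrJ class.
class DrillDownQuery {
    private final String q;
    private final List<String> filterQueries = new ArrayList<>();
    private final List<String> facetFields;

    DrillDownQuery(String q, List<String> facetFields) {
        this.q = q;
        this.facetFields = new ArrayList<>(facetFields);
    }

    void addFilter(String fq) {
        filterQueries.add(fq);
    }

    // The user picked a facet value: filter on it and stop faceting on that field.
    void select(String field, String value) {
        filterQueries.add(field + ":" + value);
        facetFields.remove(field);
    }

    String toQueryString() {
        StringBuilder sb = new StringBuilder("q=").append(encode(q))
                .append("&facet=true&facet.mincount=1&wt=json");
        for (String fq : filterQueries) {
            sb.append("&fq=").append(encode(fq));
        }
        for (String field : facetFields) {
            sb.append("&facet.field=").append(encode(field));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        DrillDownQuery query = new DrillDownQuery("超级本", Arrays.asList("品牌", "产品特性", "卖家"));
        query.addFilter("价格:[8000 TO *]");
        query.select("品牌", "Apple");
        System.out.println(query.toQueryString());
    }
}
```

The printed string is percent-encoded, so it looks different from the readable URLs in the text but carries the same parameters.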
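3. Facet parameters
facet.prefix – restricts constraints to those starting with the given prefix
facet.mincount=0 – minimum constraint count for a value to be returned; defaults to 0
facet.sort=count – sort order, either count or index
facet.offset=0 – offset into the constraint list under the current sort order; useful for paging
facet.limit=100 – maximum number of constraints returned
facet.missing=false – whether to also return a count of the documents that have no value in the field
facet.date – deprecated; use facet.range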
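To make the interplay of facet.mincount, facet.sort, facet.offset, and facet.limit concrete, here is a rough in-memory sketch that applies them in that order to a map of constraint counts. FacetPaging is a hypothetical class, not Solr code. Two simplifications are assumed: index order is approximated by Java string order, and ties under count sort are broken by that same order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the paging parameters; not how Solr implements them.
class FacetPaging {

    static Map<String, Integer> page(Map<String, Integer> counts, int mincount,
                                     boolean sortByCount, int offset, int limit) {
        // facet.mincount: drop constraints whose count is too small
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= mincount) {
                rows.add(e);
            }
        }
        // facet.sort: count (highest first) or index (assumed here: string order)
        Comparator<Map.Entry<String, Integer>> byIndex = Map.Entry.comparingByKey();
        Comparator<Map.Entry<String, Integer>> byCount =
                Map.Entry.<String, Integer>comparingByValue().reversed().thenComparing(byIndex);
        rows.sort(sortByCount ? byCount : byIndex);
        // facet.offset and facet.limit: take one page; a negative limit means no limit
        int from = Math.min(Math.max(offset, 0), rows.size());
        int to = limit < 0 ? rows.size() : Math.min(from + limit, rows.size());
        Map<String, Integer> page = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : rows.subList(from, to)) {
            page.put(e.getKey(), e.getValue());
        }
        return page;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Apple", 4);
        counts.put("Lenovo", 39);
        counts.put("Acer", 0);
        counts.put("Dell", 4);
        System.out.println(page(counts, 1, true, 0, 2)); // prints {Lenovo=39, Apple=4}
        System.out.println(page(counts, 1, true, 2, 2)); // prints {Dell=4}
    }
}
```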
facet.query
Specifies a query string to use as a Facet Constraint
facet.query = rank:[* TO 20]
facet.query = rank:[21 TO *]
"facet_counts": {
"facet_fields": {
"品牌": [
"Apple", 4,
"Lenovo", 10
…]
"产品特性": [
"显卡", 11,
"酷睿", 20
…]
…}}
facet.range
http://…/select?&facet=true
&facet.range=price
&facet.range.start=5000
&facet.range.end=8000
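&facet.range.gap=1000
(One bucket per 1000: 5000-6000, 6000-7000, and 7000-8000.)
Response showing the facet_queries section, which holds the counts for the two facet.query constraints above:
<result numFound="27" ... />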
...
<lst name="facet_counts">
<lst name="facet_queries">
<int name="rank:[* TO 20]">2</int>
<int name="rank:[21 TO *]">15</int>
</lst>
...
WARNING: each range is closed on the left and open on the right: [start, end)
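The half-open bucketing can be checked with a few lines of arithmetic. The sketch below counts integer prices into [start, end) buckets of width gap; RangeBuckets is a hypothetical class, and it assumes the gap divides end - start evenly so that no special handling of the last bucket is needed.

```java
import java.util.Arrays;

// Hypothetical illustration of [start, end) range buckets; not Solr code.
class RangeBuckets {

    static int[] count(int[] values, int start, int end, int gap) {
        if (gap <= 0 || end <= start || (end - start) % gap != 0) {
            throw new IllegalArgumentException(
                    "sketch assumes gap > 0 and that gap divides end - start evenly");
        }
        int[] counts = new int[(end - start) / gap];
        for (int v : values) {
            // Lower bound inclusive, upper bound exclusive
            if (v >= start && v < end) {
                counts[(v - start) / gap]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] prices = {4999, 5000, 5999, 6000, 7999, 8000};
        // 4999 and 8000 fall outside [5000, 8000); 6000 lands in the second bucket
        System.out.println(Arrays.toString(count(prices, 5000, 8000, 1000))); // prints [2, 1, 1]
    }
}
```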
facet.pivot
This feature is new in Solr 4.0. Pivot is as hard to grasp as facet, so an example works best.
Syntax: facet.pivot=field1,field2,field3...
e.g. facet.pivot=comment_user,grade
(The grade values 好, 中, and 差 mean good, medium, and poor.)
 | #docs | #docs grade:好 | #docs grade:中 | #docs grade:差 |
comment_user:1 | 10 | 8 | 1 | 1 |
comment_user:2 | 20 | 18 | 2 | 0 |
comment_user:3 | 15 | 12 | 2 | 1 |
comment_user:4 | 18 | 15 | 2 | 1 |
"facet_counts":{
"facet_pivot":{
"comment_user,grade":[{
"field":"comment_user",
"value":"1",
"count":10,
"pivot":[{
"field":"grade",
"value":"好",
"count":8}, {
"field":"grade",
"value":"中",
"count":1}, {
"field":"grade",
"value":"差",
"count":1}]
}, {
"field":"comment_user",
"value":"2",
"count":20,
"pivot":[{
…
Without the pivot mechanism, producing the table above might take several queries:
http://...q=comment&fq=grade:好&facet=true&facet.field=comment_user
http://...q=comment&fq=grade:中&facet=true&facet.field=comment_user
http://...q=comment&fq=grade:差&facet=true&facet.field=comment_user
"Facet.pivot - Computes a Matrix of Constraint Counts across multiple Facet Fields." (Yonik Seeley)
That definition sums it up well.
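As a rough in-memory picture of that matrix of constraint counts, the sketch below walks a list of (comment_user, grade) pairs once and builds nested counts, which is the shape of result a pivot returns in a single request. PivotCounts is a hypothetical class and the sample data is made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a two-level pivot; not Solr code.
class PivotCounts {

    // Each doc is {comment_user, grade}; the result maps user -> grade -> count.
    static Map<String, Map<String, Integer>> pivot(String[][] docs) {
        Map<String, Map<String, Integer>> matrix = new TreeMap<>();
        for (String[] doc : docs) {
            matrix.computeIfAbsent(doc[0], user -> new TreeMap<>())
                  .merge(doc[1], 1, Integer::sum);
        }
        return matrix;
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"1", "good"}, {"1", "good"}, {"1", "poor"},
            {"2", "good"}
        };
        System.out.println(pivot(docs)); // prints {1={good=2, poor=1}, 2={good=1}}
    }
}
```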
Returning per-value counts for a given field in the result set, for example the number of documents that share each city:
Documents added:
<add>
<doc>
<field name="id">1</field>
<field name="name">Company 1</field>
<field name="city">New York</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Company 2</field>
<field name="city">New Orleans</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Company 3</field>
<field name="city">New York</field>
</doc>
</add>
Query: http://localhost:8983/solr/select?q=name:company&facet=true&facet.field=city&facet.mincount=1
Result:
<lst name="facet_fields">
<lst name="city">
<int name="New York">2</int>
<int name="New Orleans">1</int>
</lst>
</lst>
Counting documents by date range: the query names the date field, a start time, an end time, and a gap that sets the width of each bucket; the result lists the count for each time bucket.
Documents added:
<add>
<doc>
<field name="id">1</field>
<field name="title">Lucene or Solr ?</field>
<field name="added">2010-12-06T12:12:12Z</field>
</doc>
<doc>
<field name="id">2</field>
<field name="title">My Solr and the rest of the world</field>
<field name="added">2010-12-07T11:11:11Z</field>
</doc>
<doc>
<field name="id">3</field>
<field name="title">Solr recipes</field>
<field name="added">2010-11-30T12:12:12Z</field>
</doc>
<doc>
<field name="id">4</field>
<field name="title">Solr cookbook</field>
<field name="added">2010-11-29T12:12:12Z</field>
</doc>
</add>
Query (facet.date is the deprecated form noted in the parameter list; when sent over HTTP, the + in the gap value must be URL-encoded as %2B):
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.date=added&facet.date.start=NOW/DAY-30DAYS&facet.date.end=NOW/DAY&facet.date.gap=+7DAY
Result:
<int name="2010-11-08T00:00:00Z">0</int>
<int name="2010-11-15T00:00:00Z">0</int>
<int name="2010-11-22T00:00:00Z">0</int>
<int name="2010-11-29T00:00:00Z">2</int>
<int name="2010-12-06T00:00:00Z">2</int>
Counting by numeric range works the same way as by date range:
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.range=price&facet.range.start=0&facet.range.end=400&facet.range.gap=100
Custom intervals instead of evenly spaced consecutive ones:
http://localhost:8983/solr/select?q=name:car&facet=true&facet.query=price:[10 TO 80]&facet.query=price:[90 TO 300]
Excluding a filter from facet counts:
http://localhost:8983/solr/select?q=name:company&facet=true&fq={!tag=stateTag}state:"New York"&facet.field={!ex=stateTag}city&facet.field={!ex=stateTag}state
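fq={!tag=stateTag}state:"New York": only returns results whose state is "New York", and tags this filter as stateTag.
facet.field={!ex=stateTag}city: counts the city field over the result set as it would be with the stateTag filter removed.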
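The effect of tag and ex can be pictured in memory: the documents returned are filtered, while a facet that excludes the tagged filter is counted over the unfiltered matches. The sketch below is a hypothetical illustration with made-up data. TagExcludeSketch is not Solr code, and each document is reduced to a {state, city} pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of {!tag=...} and {!ex=...}; not Solr code.
class TagExcludeSketch {

    // Counts the values found at fieldIndex (0 = state, 1 = city).
    static Map<String, Integer> facet(List<String[]> docs, int fieldIndex) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(doc[fieldIndex], 1, Integer::sum);
        }
        return counts;
    }

    // Plays the role of the tagged filter query on state.
    static List<String[]> filterByState(List<String[]> docs, String state) {
        List<String[]> kept = new ArrayList<>();
        for (String[] doc : docs) {
            if (doc[0].equals(state)) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> matches = Arrays.asList(
                new String[] {"New York", "New York"},
                new String[] {"New York", "Buffalo"},
                new String[] {"Louisiana", "New Orleans"});
        List<String[]> results = filterByState(matches, "New York");

        System.out.println(results.size());      // prints 2: the returned documents are filtered
        System.out.println(facet(results, 0));   // prints {New York=2}: facet without ex
        System.out.println(facet(matches, 0));   // prints {Louisiana=1, New York=2}: facet with ex
    }
}
```

Counting over the unfiltered matches is what lets a user interface keep showing the other states as selectable options after one state has been chosen.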
Naming facet result sets:
http://localhost:8983/solr/select?q=name:company&facet=true&fq={!tag=stateTag}state:Luiziana&facet.field={!key=stateFiltered}city&facet.field={!ex=stateTag key=stateUnfiltered}state
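facet.field={!key=stateFiltered}city: names the city facet stateFiltered; it is computed with the preceding filter still applied.
Sorting facet results lexicographically (the default is to sort by count):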
http://localhost:8983/solr/select?q=name:house&facet=true&facet.field=city&facet.sort=index
Implementing autocomplete: every value with the prefix so is returned; the field generally should not be tokenized.
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=title_autocomplete&facet.prefix=so
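What facet.prefix does for autocomplete can be sketched in memory: keep only the field values that start with the typed prefix and count how many documents carry each one. PrefixFacet is a hypothetical class with made-up data, and it assumes one untokenized, lowercased value per document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of facet.prefix on an untokenized field; not Solr code.
class PrefixFacet {

    static Map<String, Integer> suggest(List<String> values, String prefix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            if (value.startsWith(prefix)) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "solr recipes", "solr cookbook", "solr recipes", "lucene in action");
        System.out.println(suggest(titles, "so")); // prints {solr cookbook=1, solr recipes=2}
    }
}
```

If the field were tokenized, the values would be individual terms rather than whole titles, so the suggestions would be single words.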
Counting the documents that do not contain a given term in a field, or that have no value in the field at all:
http://localhost:8983/solr/select?q=title:solr&facet=true&facet.query=!price:[* TO *]
Controlling how many facet values are returned: -1 means all of them
http://localhost:8983/solr/select?q=title:solr&facet=true&facet.field=category&facet.limit=-1
Setting a different limit for each facet field: one unlimited, the other limited to 10
http://localhost:8983/solr/select?q=name:car&facet=true&facet.field=category&facet.field=manufacturer&f.category.facet.limit=-1&f.manufacturer.face