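This post covers three problems: querying parent documents by a child document's attributes, querying child documents by a parent document's attributes, and returning parents together with their children (a joined query). Even Google did not quickly turn up a solution. The examples are fictional, and the code is Groovy.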
Part of this is documented here:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers
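1. Saving nested documents
Solr stores documents in a flat structure (a Lucene limitation), so nesting is purely logical. This shows up in schema.xml: as far as I can tell, a field cannot contain nested sub-fields, so, much like SQL tables, nesting is implemented through a reference to the parent's id.
First, add this line to schema.xml: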
<!-- points to the root document of a block of nested documents. Required for nested document support, may be removed otherwise -->
<field name="_root_" type="string" indexed="true" stored="false"/>
My index holds several types of documents, so I add a field that identifies the document type:
<field name="docType" type="string" indexed="true" stored="true"/>
Next, save the nested documents in code like this:
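import org.apache.solr.common.SolrInputDocument

// Parent document
SolrInputDocument doc = new SolrInputDocument()
// Child document
SolrInputDocument subDoc = new SolrInputDocument()
// Attach the child to the parent so both are indexed together as one block
doc.addChildDocument(subDoc)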
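To make the "flat storage, logical nesting" point concrete, here is a toy model of how a nested document could be laid out as a block. It is only a sketch: plain maps stand in for Solr documents, the helper `flattenBlock` and the field names `id` and `children` are assumptions made for illustration, and this is not Solr's actual code. It reflects two things about block indexing: children are stored before their parent, and every document in the block carries the root's id in `_root_`.

```groovy
// Toy model only: plain maps stand in for Solr documents.
// 'flattenBlock', 'id' and 'children' are hypothetical names for illustration.
List<Map> flattenBlock(Map parent) {
    def rootId = parent.id
    def block = []
    // Children come first, each tagged with the id of the root document
    parent.children?.each { child ->
        block << (child + [_root_: rootId])
    }
    // The parent closes the block; its nested 'children' entry is dropped
    block << (parent.findAll { k, v -> k != 'children' } + [_root_: rootId])
    block
}

def student = [id: 's1', docType: 'Student', children: [
    [id: 'b1', docType: 'Book'],
    [id: 'b2', docType: 'Book']
]]
flattenBlock(student).each { println it }
```

Because the block is one contiguous unit, changing a child generally means re-indexing the whole parent with all of its children.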
2. Querying parent documents by a child document's attribute
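import org.apache.solr.client.solrj.SolrClient
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrClient

// SOLR_CORE_URL is the URL of the Solr core, defined elsewhere
SolrClient sc = new HttpSolrClient(SOLR_CORE_URL)
// "which" identifies all parent documents; the part after "}" matches children
def fq = '{!parent which="docType:Student"}bookProp:bookPropValue'
def q = 'docType:Student'
SolrQuery sq = new SolrQuery(q)
sq.addFilterQuery(fq)
def rsp = sc.query(sq)
def docs = rsp.getResults()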
3. Querying child documents by a parent document's attribute (e.g. prop)
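import org.apache.solr.client.solrj.SolrClient
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrClient

// SOLR_CORE_URL is the URL of the Solr core, defined elsewhere
SolrClient sc = new HttpSolrClient(SOLR_CORE_URL)
// "of" identifies all parent documents; the part after "}" matches parents
def fq = '{!child of="docType:Student"}studentProp:studentPropValue'
def q = 'docType:Book'
SolrQuery sq = new SolrQuery(q)
sq.addFilterQuery(fq)
def rsp = sc.query(sq)
def docs = rsp.getResults()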
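The two filter queries in sections 2 and 3 have the same shape, which is easy to mix up: in both parsers the local parameter (`which` or `of`) is a filter that identifies all parent documents, and only the part after the closing brace changes sides. A minimal sketch of that pattern follows; the helper names `parentsByChild` and `childrenByParent` are hypothetical and not part of SolrJ, and the helpers only build strings.

```groovy
// Hypothetical helpers for illustration; they only assemble query strings.

// Filter returning parents whose children match childQuery
String parentsByChild(String allParentsFilter, String childQuery) {
    '{!parent which="' + allParentsFilter + '"}' + childQuery
}

// Filter returning children whose parent matches parentQuery
String childrenByParent(String allParentsFilter, String parentQuery) {
    '{!child of="' + allParentsFilter + '"}' + parentQuery
}

def allParents = 'docType:Student'
println parentsByChild(allParents, 'bookProp:bookPropValue')
println childrenByParent(allParents, 'studentProp:studentPropValue')
```

The resulting strings can be passed to `SolrQuery.addFilterQuery` exactly as in the examples above.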
4. Joined query (parents returned together with their children)
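import org.apache.solr.client.solrj.SolrClient
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrClient
import org.apache.solr.common.SolrDocumentList
import org.apache.solr.common.params.CommonParams

// SOLR_CORE_URL is the URL of the Solr core, defined elsewhere
SolrClient sc = new HttpSolrClient(SOLR_CORE_URL)
def fq = '{!parent which="docType:Student"}'
def q = '张三'
// The [child] transformer attaches the matching children to each returned parent
def fl = '*, [child parentFilter=docType:Student]'
SolrQuery sq = new SolrQuery(q)
// Search the 'name' field by default
sq.setParam(CommonParams.DF, 'name')
sq.addFilterQuery(fq)
// sq.addField(fl)
sq.setParam(CommonParams.FL, fl)
def rsp = sc.query(sq)
SolrDocumentList docs = rsp.getResults()
docs?.each { it ->
    println(it)
    def children = it.getChildDocuments()
    // ...
}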
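The `[child ...]` transformer used in `fl` also accepts a `childFilter` to restrict which children are attached and a `limit` on how many children are returned per parent. A minimal sketch of assembling that `fl` fragment follows; the helper name `childTransformer` is hypothetical and not part of SolrJ, and it only builds the string.

```groovy
// Hypothetical helper for illustration; it only assembles the fl fragment.
String childTransformer(String parentFilter, String childFilter = null, Integer limit = null) {
    def sb = new StringBuilder('[child parentFilter=').append(parentFilter)
    // Optional: only attach children that match this filter
    if (childFilter) {
        sb.append(' childFilter=').append(childFilter)
    }
    // Optional: cap the number of children returned per parent
    if (limit != null) {
        sb.append(' limit=').append(limit)
    }
    sb.append(']').toString()
}

println '*, ' + childTransformer('docType:Student')
println '*, ' + childTransformer('docType:Student', 'docType:Book', 5)
```

If a parent seems to be missing children in the response, the `limit` value is worth checking first.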