Viewing a data index and inspecting its contents

Reposted

mob64ca1401b651 2024-03-19 21:48:01


1 Lucene directory layout


2 lucene-core-3.6.2.jar is the core JAR for developing with Lucene

The contrib directory contains a number of extension JARs.

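3 Example

Create the first Lucene project: lucene3_day1

(1) First convert the data into a Document object. Each piece of data becomes a Field(String name, String value, Field.Store store, Field.Index index).

(2) Specify the index location: Directory directory = FSDirectory.open(new File("index")); // the "index" directory under the current working directory

(3) Choose the analyzer: Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

(4) Write the index: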

IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

// Write the document into the index
indexWriter.addDocument(document);

// Close the index writer
indexWriter.close();
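To make steps (1) to (4) more concrete, the following is a minimal sketch of what analysis and indexing do conceptually: text is split into terms, and each term is mapped to the documents that contain it. It is a toy model, not Lucene's implementation or on-disk format. The class TinyInvertedIndex and its methods are hypothetical names, and the "analyzer" here simply lowercases the text and splits on non-alphanumeric characters, which is much cruder than StandardAnalyzer.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy inverted index: illustrative only, not how Lucene stores its index. */
public class TinyInvertedIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    /** Hypothetical analyzer: lowercases the text and splits on non-alphanumerics. */
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** Adds one field value of one document to the index. */
    void addDocument(int docId, String fieldValue) {
        for (String term : analyze(fieldValue)) {
            postings.computeIfAbsent(term, t -> new LinkedHashSet<>()).add(docId);
        }
    }

    /** Returns the ids of documents containing the analyzed term, in insertion order. */
    List<Integer> search(String term) {
        List<String> analyzed = analyze(term);
        if (analyzed.isEmpty()) {
            return new ArrayList<>();
        }
        Set<Integer> ids = postings.get(analyzed.get(0));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.addDocument(100, "Lucene quick start");
        index.addDocument(101, "Full-text search with Lucene");
        System.out.println(index.search("LUCENE")); // prints [100, 101]
        System.out.println(index.search("search")); // prints [101]
    }
}
```

Because the query term goes through the same analyze step as the indexed text, "LUCENE" and "Lucene" find the same documents. This is also why the same analyzer is normally used for both indexing and searching.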

Writing the example:

The example consists of two files, Article.java and LuceneTest.java.

Article.java
package cn.toto.lucene.quickstart;

public class Article {

    private int id;
    private String title;
    private String content;

    /**
     * @return the id
     */
    public int getId() {
        return id;
    }

    /**
     * @param id the id to set
     */
    public void setId(int id) {
        this.id = id;
    }

    /**
     * @return the title
     */
    public String getTitle() {
        return title;
    }

    /**
     * @param title the title to set
     */
    public void setTitle(String title) {
        this.title = title;
    }

    /**
     * @return the content
     */
    public String getContent() {
        return content;
    }

    /**
     * @param content the content to set
     */
    public void setContent(String content) {
        this.content = content;
    }
}
LuceneTest.java

package cn.toto.lucene.quickstart;

import java.io.File;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Test;

/**
 * LuceneTest.java: a test case that exercises Lucene.
 *
 * @author toto-pc
 * @date 2014-12-7
 * @note modified by 涂作权 2014/12/07
 */
public class LuceneTest {

    @Test
    public void buildIndex() throws Exception {
        // Sample data (the Chinese text is kept from the original example)
        Article article = new Article();
        article.setId(100);
        article.setTitle("Lucene快速入门");
        article.setContent("Lucene是提供了一个简单却强大的应用程式接口，"
                + "能够做全文检索索引和搜寻，在Java开发环境里Lucene是"
                + "一个成熟的免费的开放源代码工具。");

        // Convert the data to be indexed into a Document object (required by Lucene)
        Document document = new Document();
        document.add(new Field("id",   // field name
                article.getId() + "",  // field value, as a string
                Store.YES,             // store the original value in the index
                Index.ANALYZED         // index the value after running it through the analyzer
        ));
        document.add(new Field("title", article.getTitle(), Store.YES, Index.ANALYZED));
        document.add(new Field("content", article.getContent(), Store.YES, Index.ANALYZED));

        // Build the index
        // Index location: the "index" directory under the current working directory
        Directory directory = FSDirectory.open(new File("index"));
        // Analyzer
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
        // Index writer
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
        // Write the document into the index
        indexWriter.addDocument(document);
        // Close the index writer
        indexWriter.close();
    }
}
After the unit test runs, the index files are written to the index directory.
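One quick way to confirm that the index was written is to list the files in that directory. The sketch below uses only the JDK. The class name ListIndexFiles is hypothetical, and the directory name "index" is assumed to match the one passed to FSDirectory.open in the test above. The exact file names depend on the Lucene version and settings, so none are asserted here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListIndexFiles {
    /** Returns the sorted names of the regular files directly inside dir. */
    static List<String> listFiles(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    names.add(entry.getFileName().toString());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to "index" relative to the working directory (an assumption).
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "index");
        if (!Files.isDirectory(indexDir)) {
            System.out.println("No index directory at " + indexDir.toAbsolutePath());
            return;
        }
        for (String name : listFiles(indexDir)) {
            System.out.println(name);
        }
    }
}
```

Run it from the project root, or pass the path of the index directory as the first argument.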

4 Use the Luke tool to inspect the contents of the index (Luke is distributed as a JAR)


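Download: http://code.google.com/p/luke/

How to open it:

If that does not work, Luke can be started from the command line: go to the directory that contains the JAR, hold Shift and right-click, choose "Open command window here", and run: java -jar lukeall-3.5.0.jar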

When Luke starts, select the index directory and click OK. The Overview tab then shows the index information, and the Documents tab shows the individual document objects.

5 Searching

The query code, added to the same test class as above, is as follows:
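// Additional imports needed at the top of LuceneTest.java:
// import org.apache.lucene.index.IndexReader;
// import org.apache.lucene.queryParser.QueryParser;
// import org.apache.lucene.search.IndexSearcher;
// import org.apache.lucene.search.Query;
// import org.apache.lucene.search.ScoreDoc;
// import org.apache.lucene.search.TopDocs;

@Test
public void searchIndex() throws Exception {
    // Build the Query object: search by title
    String queryString = "Lucene";
    // First argument: version
    // Second argument: default field
    // Third argument: analyzer
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
    QueryParser queryParser = new QueryParser(Version.LUCENE_36, "title", analyzer);
    Query query = queryParser.parse(queryString);

    // Search with the Query
    // Index location
    Directory directory = FSDirectory.open(new File("index"));
    IndexReader indexReader = IndexReader.open(directory);
    IndexSearcher indexSearcher = new IndexSearcher(indexReader);

    // Fetch the top 100 matching documents
    TopDocs topDocs = indexSearcher.search(query, 100);
    System.out.println("Number of matching records: " + topDocs.totalHits);

    // Read the results
    ScoreDoc[] scoreDocs = topDocs.scoreDocs;
    for (int i = 0; i < scoreDocs.length; i++) {
        // First get the internal document number, then load the stored document
        int docID = scoreDocs[i].doc;
        Document document = indexSearcher.doc(docID);
        System.out.println("id:" + document.get("id"));
        System.out.println("title:" + document.get("title"));
        System.out.println("content:" + document.get("content"));
    }

    // Closing a searcher built from a reader does not close that reader, so close both
    indexSearcher.close();
    indexReader.close();
}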
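Running the test prints the number of matching records, followed by the id, title and content of each hit.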
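One detail of the search code is easy to miss: topDocs.totalHits counts every matching document, while topDocs.scoreDocs holds at most the number of hits requested (100 here). The following toy sketch illustrates that distinction using only the JDK. It is not Lucene's API or scoring model. TopHitsSketch, Hit and Result are hypothetical names, and the "score" is simply how many times the term occurs in the text.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy top-N search: illustrative only, not Lucene's scoring or API. */
public class TopHitsSketch {
    /** One hit: a document id plus a naive score (occurrence count of the term). */
    static final class Hit {
        final int docId;
        final int score;
        Hit(int docId, int score) { this.docId = docId; this.score = score; }
    }

    /** Search result: the total number of matches plus at most n best hits. */
    static final class Result {
        final int totalHits;
        final List<Hit> hits;
        Result(int totalHits, List<Hit> hits) { this.totalHits = totalHits; this.hits = hits; }
    }

    /** Counts how often term occurs among the whitespace-separated words of text. */
    static int countOccurrences(String text, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.equals(needle)) {
                count++;
            }
        }
        return count;
    }

    /** Scores every document, then keeps only the n best hits. */
    static Result search(Map<Integer, String> docs, String term, int n) {
        List<Hit> matches = new ArrayList<>();
        for (Map.Entry<Integer, String> e : docs.entrySet()) {
            int score = countOccurrences(e.getValue(), term);
            if (score > 0) {
                matches.add(new Hit(e.getKey(), score));
            }
        }
        // Highest score first; ties broken by ascending document id.
        matches.sort(Comparator.comparingInt((Hit h) -> -h.score).thenComparingInt(h -> h.docId));
        List<Hit> top = new ArrayList<>(matches.subList(0, Math.min(n, matches.size())));
        return new Result(matches.size(), top);
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new LinkedHashMap<>();
        docs.put(100, "lucene quick start");
        docs.put(101, "lucene in action with lucene examples");
        docs.put(102, "hadoop basics");
        docs.put(103, "searching with lucene");
        Result r = search(docs, "Lucene", 2);
        System.out.println("total hits: " + r.totalHits); // prints "total hits: 3"
        for (Hit h : r.hits) {
            System.out.println("doc " + h.docId + " score " + h.score);
        }
        // prints "doc 101 score 2" and then "doc 100 score 1"
    }
}
```

Three documents match but only two are returned, which mirrors how a loop over scoreDocs can print fewer records than totalHits reports.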