A simple way to use Lucene grouping queries

阿新 • Published: 2018-12-10

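Most walkthroughs of Lucene grouping queries found online are fairly complicated. This post shows a simpler implementation that covers basic grouping needs.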

1. Add the dependency

    <!-- Grouping queries -->
    <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-grouping -->
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-grouping</artifactId>
      <version>7.2.1</version>
    </dependency>

2. Build the index

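    /**
     * Adds the sample documents to the index.
     *
     * @param groupField name of the field to group on
     * @param writer     index writer; it is committed and closed before this method returns
     * @throws IOException if writing to the index fails
     */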
    public static void addDocuments(String groupField, IndexWriter writer)
            throws IOException {
        // 0
        Document doc = new Document();
        addGroupField(doc, groupField, "author1");
        doc.add(new StringField("author", "author1", Field.Store.YES));
        doc.add(new TextField("content", "random text", Field.Store.YES));
        doc.add(new StringField("id", "1", Field.Store.YES));
        writer.addDocument(doc);

        // 1
        doc = new Document();
        addGroupField(doc, groupField, "author1");
        doc.add(new StringField("author", "author1", Field.Store.YES));
        doc.add(new TextField("content", "some more random text",
                Field.Store.YES));
        doc.add(new StringField("id", "2", Field.Store.YES));
        writer.addDocument(doc);

        // 2
        doc = new Document();
        addGroupField(doc, groupField, "author1");
        doc.add(new StringField("author", "author1", Field.Store.YES));
        doc.add(new TextField("content", "some more random textual data",
                Field.Store.YES));
        doc.add(new StringField("id", "3", Field.Store.YES));
        writer.addDocument(doc);

        // 3
        doc = new Document();
        addGroupField(doc, groupField, "author2");
        doc.add(new StringField("author", "author2", Field.Store.YES));
        doc.add(new TextField("content", "some random text", Field.Store.YES));
        doc.add(new StringField("id", "4", Field.Store.YES));
        writer.addDocument(doc);

        // 4
        doc = new Document();
        addGroupField(doc, groupField, "author3");
        doc.add(new StringField("author", "author3", Field.Store.YES));
        doc.add(new TextField("content", "some more random text",
                Field.Store.YES));
        doc.add(new StringField("id", "5", Field.Store.YES));
        writer.addDocument(doc);

        // 5
        doc = new Document();
        addGroupField(doc, groupField, "author3");
        doc.add(new StringField("author", "author3", Field.Store.YES));
        doc.add(new TextField("content", "random", Field.Store.YES));
        doc.add(new StringField("id", "6", Field.Store.YES));
        writer.addDocument(doc);

        // 6 -- no group field: addGroupField is not called, so this document has no group value
        doc = new Document();
        doc.add(new StringField("author", "author4", Field.Store.YES));
        doc.add(new TextField("content",
                "random word stuck in alot of other text", Field.Store.YES));
        doc.add(new StringField("id", "7", Field.Store.YES));
        writer.addDocument(doc);
        writer.commit();
        writer.close();
    }

    /**
     * Adds the grouping field to a document.
     *
     * @param doc        document being indexed
     * @param groupField name of the field to group on
     * @param value      field value
     */
    private static void addGroupField(Document doc, String groupField,
                                      String value) {
        // The field used for grouping must be added as a SortedDocValuesField
        doc.add(new SortedDocValuesField(groupField, new BytesRef(value)));
    }
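
To make the effect of `addGroupField` more concrete, here is a minimal plain-Java sketch (no Lucene involved) that models the seven sample documents and buckets them by group value. The class `GroupFieldModel`, the `SampleDoc` type and the method names are hypothetical and exist only for this illustration. A `null` group value stands for the document that was indexed without calling `addGroupField`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupFieldModel {

    /** Hypothetical stand-in for an indexed document: position, group value (may be null), content. */
    public static final class SampleDoc {
        public final int position;
        public final String groupValue;
        public final String content;

        public SampleDoc(int position, String groupValue, String content) {
            this.position = position;
            this.groupValue = groupValue;
            this.content = content;
        }
    }

    /** The same seven documents that addDocuments writes, reduced to group value and content. */
    public static List<SampleDoc> sampleDocs() {
        List<SampleDoc> docs = new ArrayList<>();
        docs.add(new SampleDoc(0, "author1", "random text"));
        docs.add(new SampleDoc(1, "author1", "some more random text"));
        docs.add(new SampleDoc(2, "author1", "some more random textual data"));
        docs.add(new SampleDoc(3, "author2", "some random text"));
        docs.add(new SampleDoc(4, "author3", "some more random text"));
        docs.add(new SampleDoc(5, "author3", "random"));
        // No group field was added to this document, so its group value is null.
        docs.add(new SampleDoc(6, null, "random word stuck in alot of other text"));
        return docs;
    }

    /** Buckets document positions by group value; LinkedHashMap allows a single null key. */
    public static Map<String, List<Integer>> bucketByGroup(List<SampleDoc> docs) {
        Map<String, List<Integer>> buckets = new LinkedHashMap<>();
        for (SampleDoc d : docs) {
            buckets.computeIfAbsent(d.groupValue, k -> new ArrayList<>()).add(d.position);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Prints one line per bucket, e.g. "author1 -> [0, 1, 2]" and "null -> [6]".
        bucketByGroup(sampleDocs()).forEach((group, positions) ->
                System.out.println(group + " -> " + positions));
    }
}
```

In this model the index holds four buckets: `author1`, `author2`, `author3`, and one bucket with no group value.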

3. Run the grouped query. The pitfalls and key points are explained in the code comments.

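    /**
     * Tests grouped search with Lucene 7.
     */
    @Test
    public void lucene7GroupBy() throws Exception{
        GroupingSearch groupingSearch = new GroupingSearch(groupField); // field to group on
        groupingSearch.setGroupSort(new Sort(SortField.FIELD_SCORE)); // how the groups are sorted
        groupingSearch.setFillSortFields(true); // whether to fill in the sortValues of each SearchGroup
        groupingSearch.setCachingInMB(4.0, true); // cache up to 4 MB of hits (with scores) for the second pass
        groupingSearch.setAllGroups(true); // also compute the total number of matching groups
        //groupingSearch.setAllGroupHeads(true);
        groupingSearch.setGroupDocsLimit(10); // maximum number of documents returned per group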

        Analyzer analyzer = new StandardAnalyzer();
        QueryParser parser = new QueryParser("content", analyzer);
        String queryExpression = "some content";
        Query query = parser.parse(queryExpression);
        Directory directory = FSDirectory.open(Paths.get(indexDir));
        IndexReader reader = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
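        // Search the content field for documents containing the term "some" or "content" (QueryParser combines terms with OR by default); the results are grouped by the author value stored in the group field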
        TopGroups<BytesRef> result = groupingSearch.search(searcher, query, 0, 1000);

        // total number of hits
        System.out.println("Total hits: " + result.totalHitCount);
        // number of groups returned
        System.out.println("Groups: " + result.groups.length);
        // print the results group by group
        for (GroupDocs<BytesRef> groupDocs : result.groups){
            if (groupDocs != null) {
                if (groupDocs.groupValue != null) {
                    System.out.println("Group: " + groupDocs.groupValue.utf8ToString());
                }else{
                    // One document was indexed without a SortedDocValuesField on the group field, so its group has a null groupValue
                    System.out.println("Group: unknown");
                }
                System.out.println("Documents in group: " + groupDocs.totalHits);

                for(ScoreDoc scoreDoc : groupDocs.scoreDocs){
                    System.out.println("author:" + searcher.doc(scoreDoc.doc).get("author"));
                    System.out.println("content:" + searcher.doc(scoreDoc.doc).get("content"));
                    System.out.println();
                }

                System.out.println("=====================================");
            }
        }
        reader.close(); // release the index reader once the results have been printed
    }
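
As a rough cross-check of what the test above might print, the following plain-Java sketch approximates the query by lowercasing and whitespace-splitting each `content` value and treating `some content` as `some OR content`. It is not Lucene code and it ignores scoring. The class `GroupedHitsModel` and its methods are hypothetical names introduced for illustration, and `docsPerGroup` only mimics the role of `setGroupDocsLimit`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupedHitsModel {

    // Sample data in index order: {group value (null = no group field), content}
    static final String[][] DOCS = {
        {"author1", "random text"},
        {"author1", "some more random text"},
        {"author1", "some more random textual data"},
        {"author2", "some random text"},
        {"author3", "some more random text"},
        {"author3", "random"},
        {null, "random word stuck in alot of other text"},
    };

    /** True if the content contains at least one of the query terms (OR semantics). */
    public static boolean matchesAny(String content, String query) {
        List<String> tokens = Arrays.asList(content.toLowerCase().split("\\s+"));
        for (String term : query.toLowerCase().split("\\s+")) {
            if (tokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    /** Groups the positions of matching documents by group value, keeping at most docsPerGroup each. */
    public static Map<String, List<Integer>> search(String query, int docsPerGroup) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < DOCS.length; i++) {
            if (!matchesAny(DOCS[i][1], query)) {
                continue;
            }
            List<Integer> hits = groups.computeIfAbsent(DOCS[i][0], k -> new ArrayList<>());
            if (hits.size() < docsPerGroup) {
                hits.add(i);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(search("some content", 10)); // {author1=[1, 2], author2=[3], author3=[4]}
        System.out.println(search("random", 1));        // {author1=[0], author2=[3], author3=[4], null=[6]}
    }
}
```

Under these assumptions only four sample documents match `some content`, and they fall into three groups. The document without a group field does not match, so in this model the null-group branch is only reached by a query such as `random`. Lucene orders groups and documents by score, so the index-order output here says nothing about the order the real test would print.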

The complete code is available on my GitHub.
