Learning RavenDB (3): Static Indexes
阿新 • Published: 2022-04-29
For static indexes, RavenDB is really using Lucene under the hood, so many of the concepts in this part are actually Lucene's own.

1. Defining static indexes

    documentStore.DatabaseCommands.PutIndex(
        "BlogPosts/PostsCountByTag",
        new IndexDefinitionBuilder<BlogPost, BlogTagPostsCount>
        {
            // The Map function: for each tag of each post, create a new BlogTagPostsCount
            // object with the name of a tag and a count of one.
            Map = posts => from post in posts
                           from tag in post.Tags
                           select new { Tag = tag, Count = 1 },

            // The Reduce function: group all the BlogTagPostsCount objects we got back
            // from the Map function, use the Tag name as the key, and sum up all the
            // counts. Since the Map function gives each tag a Count of 1, when the Reduce
            // function returns we are going to have the correct Count of posts filed under
            // each tag.
            Reduce = results => from result in results
                                group result by result.Tag into g
                                select new { Tag = g.Key, Count = g.Sum(x => x.Count) }
        });

    public class BlogTagPostsCount
    {
        public string Tag { get; set; }
        public int Count { get; set; }
    }

2. Indexing hierarchical data

Given a document like the one below, how should we index the nested Comments?

    // posts/123
    {
        "Name": "Hello Raven",
        "Comments": [
            {
                "Author": "Ayende",
                "Text": "...",
                "Comments": [
                    {
                        "Author": "Rahien",
                        "Text": "...",
                        "Comments": []
                    }
                ]
            }
        ]
    }

Recurse walks the nested Comments collections, so every comment at every depth is indexed:

    store.DatabaseCommands.PutIndex("SampleRecurseIndex", new IndexDefinition
    {
        Map = @"from post in docs.Posts
                from comment in Recurse(post, (Func<dynamic, dynamic>)(x => x.Comments))
                select new
                {
                    Author = comment.Author,
                    Text = comment.Text
                }"
    });

We can also define the index as a class:

    public class SampleRecurseIndex : AbstractIndexCreationTask<Post>
    {
        public SampleRecurseIndex()
        {
            Map = posts => from post in posts
                           from comment in Recurse(post, x => x.Comments)
                           select new
                           {
                               Author = comment.Author,
                               Text = comment.Text
                           };
        }
    }

and then create it:

    new SampleRecurseIndex().Execute(store);

3. Indexing related documents

1) First example

Here we have Invoice and Customer. An Invoice contains the Id of its Customer, and we want to query invoices by the customer's name.

    public class Invoice
    {
        public string Id { get; set; }
        public string CustomerId { get; set; }
    }

    public class Customer
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    public class SampleIndex : AbstractIndexCreationTask<Invoice>
    {
        public SampleIndex()
        {
            Map = invoices => from invoice in invoices
                              select new
                              {
                                  CustomerId = invoice.CustomerId,
                                  // Pull the customer's name from the related document.
                                  CustomerName = LoadDocument<Customer>(invoice.CustomerId).Name
                              };
        }
    }

Once the index has been created, we can query invoices by customer name.

2) Second example

    public class Book
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    public class Author
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public IList<string> BookIds { get; set; }
    }

    public class AnotherIndex : AbstractIndexCreationTask<Author>
    {
        public AnotherIndex()
        {
            Map = authors => from author in authors
                             select new
                             {
                                 Name = author.Name,
                                 Books = author.BookIds.Select(x => LoadDocument<Book>(x).Name)
                             };
        }
    }

An Author stores the ids of all of his or her books. With this index we can look up an author's books by author name, and we can also find the author by book title.

Two things to keep in mind:

1) When a related document changes, the index is updated as well.
2) LoadDocument makes the index track the loaded document. When many documents track the same document, this becomes expensive, because a change to that one document means all the documents referencing it have to be re-indexed.

4. TransformResults

Sometimes the index is fairly complex but the data we actually need is simple. What do we do then?

    public class PurchaseHistoryIndex : AbstractIndexCreationTask<Order, Order>
    {
        public PurchaseHistoryIndex()
        {
            Map = orders => from order in orders
                            from item in order.Items
                            select new
                            {
                                UserId = order.UserId,
                                ProductId = item.Id
                            };

            TransformResults = (database, orders) =>
                from order in orders
                from item in order.Items
                let product = database.Load<Product>(item.Id)
                where product != null
                select new
                {
                    ProductId = item.Id,
                    ProductName = product.Name
                };
        }
    }

When querying we only need PurchaseHistoryViewItem (a view type matching the projected ProductId and ProductName), so we use OfType to convert the result type.

    documentSession.Query<Order, PurchaseHistoryIndex>()
        .Where(x => x.UserId == userId)
        .OfType<PurchaseHistoryViewItem>()
        .ToArray();

5. Error handling

Indexing runs on a background thread, so indexing errors are hard to notice. You can find them by looking at '/stats', '/raven/studio.html#/statistics' or '/raven/statistics.html'.

When more than 15% of the indexing attempts fail, the index is disabled. The 15% is only evaluated after the first 10 documents, so that an index is not disabled just because the first few documents happen to be bad.

Below is an example of the error information returned by '/stats':

    {
        "LastDocEtag": "00000000-0000-0b00-0000-000000000001",
        "LastAttachmentEtag": "00000000-0000-0000-0000-000000000000",
        "CountOfIndexes": 1,
        "ApproximateTaskCount": 0,
        "CountOfDocuments": 1,
        "StaleIndexes": [],
        "CurrentNumberOfItemsToIndexInSingleBatch": 512,
        "CurrentNumberOfItemsToReduceInSingleBatch": 256,
        "Indexes": [
            {
                "Name": "PostsByTitle",
                "IndexingAttempts": 1,
                "IndexingSuccesses": 0,
                "IndexingErrors": 1
            }
        ],
        "Errors": [
            {
                "Index": "PostsByTitle",
                "Error": "Cannot perform runtime binding on a null reference",
                "Timestamp": "/Date(1271778107096+0300)/",
                "Document": "bob"
            }
        ]
    }

6. Querying

Using string.Contains() in a query fails. RavenDB does not support it because it amounts to a wildcard search of the form *term*, which causes performance problems, so a NotSupportedException is thrown.

1) Multi-field index

    documentStore.DatabaseCommands.PutIndex("UsersByNameAndHobbies", new IndexDefinition
    {
        Map = "from user in docs.Users select new { user.Name, user.Hobbies }",
        Indexes =
        {
            { "Name", FieldIndexing.Analyzed },
            { "Hobbies", FieldIndexing.Analyzed }
        }
    });

2) Multi-field search

    users = session.Query<User>("UsersByNameAndHobbies")
        .Search(x => x.Name, "Adam")
        .Search(x => x.Hobbies, "sport")
        .ToList();

3) Boosting

By giving search clauses a boost value you can rank the more relevant matches higher and push less relevant content down.

    users = session.Query<User>("UsersByHobbies")
        .Search(x => x.Hobbies, "I love sport", boost: 10)
        .Search(x => x.Hobbies, "but also like reading books", boost: 5)
        .ToList();

The boost can also be set in the index definition:

    public class Users_ByName : AbstractIndexCreationTask<User>
    {
        public Users_ByName()
        {
            this.Map = users => from user in users
                                select new
                                {
                                    FirstName = user.FirstName.Boost(10),
                                    LastName = user.LastName
                                };
        }
    }

4) Operators

AND operator

    users = session.Query<User>("UsersByNameAndHobbiesAndAge")
        .Search(x => x.Hobbies, "computers")
        .Search(x => x.Name, "James")
        .Where(x => x.Age == 20)
        .ToList();

Here the Where clause is AND'ed with the search clauses. To AND two Search clauses with each other, pass SearchOptions.And:

    users = session.Query<User>("UsersByNameAndHobbies")
        .Search(x => x.Name, "Adam")
        .Search(x => x.Hobbies, "sport", options: SearchOptions.And)
        .ToList();

NOT operator

    users = session.Query<User>("UsersByName")
        .Search(x => x.Name, "James", options: SearchOptions.Not)
        .ToList();

Combining operators (AND NOT)

    users = session.Query<User>("UsersByNameAndHobbies")
        .Search(x => x.Name, "Adam")
        .Search(x => x.Hobbies, "sport", options: SearchOptions.Not | SearchOptions.And)
        .ToList();

5) Wildcards and fuzzy search

The escapeQueryOptions parameter accepts EscapeAll (default), AllowPostfixWildcard, AllowAllWildcards and RawQuery.

    users = session.Query<User>("UsersByName")
        .Search(x => x.Name, "Jo* Ad*",
                escapeQueryOptions: EscapeQueryOptions.AllowPostfixWildcard)
        .ToList();

    users = session.Query<User>("UsersByName")
        .Search(x => x.Name, "*oh* *da*",
                escapeQueryOptions: EscapeQueryOptions.AllowAllWildcards)
        .ToList();

    users = session.Query<User>("UsersByName")
        .Search(x => x.Name, "*J?n*",
                escapeQueryOptions: EscapeQueryOptions.RawQuery)
        .ToList();

6) Highlighting

    public class SearchItem
    {
        public string Id { get; set; }
        public string Text { get; set; }
    }

    public class ContentSearchIndex : AbstractIndexCreationTask<SearchItem>
    {
        public ContentSearchIndex()
        {
            Map = (docs => from doc in docs
                           select new { doc.Text });

            Index(x => x.Text, FieldIndexing.Analyzed);
            Store(x => x.Text, FieldStorage.Yes);
            TermVector(x => x.Text, FieldTermVector.WithPositionsAndOffsets);
        }
    }

    // Process the highlighted fragments after the query has run
    FieldHighlightings highlightings;
    var results = session.Advanced.LuceneQuery<SearchItem>("ContentSearchIndex")
        .Highlight("Text", 128, 1, out highlightings)
        .Search("Text", "raven")
        .ToArray();

    var builder = new StringBuilder()
        .AppendLine("<ul>");

    foreach (var result in results)
    {
        var fragments = highlightings.GetFragments(result.Id);
        builder.AppendLine(string.Format("<li>{0}</li>", fragments.First()));
    }

    var ul = builder
        .AppendLine("</ul>")
        .ToString();

    // Set the opening and closing highlight tags at query time
    FieldHighlightings highlightings;
    var results = session.Advanced.LuceneQuery<SearchItem>("ContentSearchIndex")
        .Highlight("Text", 128, 1, out highlightings)
        .SetHighlighterTags("**", "**")
        .Search("Text", "raven")
        .ToArray();

7) Suggestions

Below are a User class and an index on the user's full name:

    public class User
    {
        public string Id { get; set; }
        public string FullName { get; set; }
    }

    public class Users_ByFullName : AbstractIndexCreationTask<User>
    {
        public Users_ByFullName()
        {
            Map = users => from user in users
                           select new { user.FullName };

            Indexes.Add(x => x.FullName, FieldIndexing.Analyzed);
        }
    }

Suppose the database contains the following documents:

    // users/1
    { "FullName": "John Smith" }

    // users/2
    { "FullName": "Jack Johnson" }

    // users/3
    { "FullName": "Robery Jones" }

    // users/4
    { "FullName": "David Jones" }

and you run this query:

    var query = session.Query<User, Users_ByFullName>().Where(x => x.FullName == "johne");
    var user = query.FirstOrDefault();

If nothing is found, you can fall back to the suggestion feature:

    if (user == null)
    {
        SuggestionQueryResult suggestionResult = query.Suggest();

        Console.WriteLine("Did you mean?");

        foreach (var suggestion in suggestionResult.Suggestions)
        {
            Console.WriteLine("\t{0}", suggestion);
        }
    }

It will suggest:

    john
    jones
    johnson

Here is the same query with all of the parameters spelled out:

    session.Query<User, Users_ByFullName>()
        .Suggest(new SuggestionQuery()
        {
            Field = "FullName",
            Term = "johne",
            Accuracy = 0.4f,
            MaxSuggestions = 5,
            Distance = StringDistanceTypes.JaroWinkler,
            Popularity = true,
        });

Another way to run it is through DatabaseCommands:

    store.DatabaseCommands.Suggest("Users/ByFullName", new SuggestionQuery()
    {
        Field = "FullName",
        Term = "johne"
    });

Suggestions for multiple terms, here entering johne and davi together:

    SuggestionQueryResult resultsByMultipleWords = session.Query<User, Users_ByFullName>()
        .Suggest(new SuggestionQuery()
        {
            Field = "FullName",
            Term = "<<johne davi>>",
            Accuracy = 0.4f,
            MaxSuggestions = 5,
            Distance = StringDistanceTypes.JaroWinkler,
            Popularity = true,
        });

    Console.WriteLine("Did you mean?");

    foreach (var suggestion in resultsByMultipleWords.Suggestions)
    {
        Console.WriteLine("\t{0}", suggestion);
    }
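
To get a feel for why the term johne comes back with john, jones and johnson, here is a small stand-alone sketch that ranks terms by string distance. It is not RavenDB's or Lucene's implementation. It swaps in plain Levenshtein edit distance instead of the JaroWinkler measure used in the queries in this section, and SuggestionSketch, Suggest and the maxDistance cut-off are hypothetical names made up for this illustration. The term list is an assumption about roughly what an analyzed FullName field would contain for the four sample users.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the RavenDB client API.
public static class SuggestionSketch
{
    // Classic dynamic-programming Levenshtein (edit) distance between two strings.
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // deletion, insertion
                    d[i - 1, j - 1] + cost);                    // substitution or match
            }
        }
        return d[a.Length, b.Length];
    }

    // Returns the indexed terms within maxDistance edits of the query term, closest first.
    public static List<string> Suggest(
        IEnumerable<string> indexedTerms, string term, int maxDistance, int maxSuggestions)
    {
        return indexedTerms
            .Distinct()
            .Select(t => new { Term = t, Distance = Levenshtein(term, t) })
            .Where(x => x.Distance <= maxDistance)
            .OrderBy(x => x.Distance)
            .ThenBy(x => x.Term, StringComparer.Ordinal)
            .Take(maxSuggestions)
            .Select(x => x.Term)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed lower-cased terms for the four sample users' full names.
        var terms = new[] { "john", "smith", "jack", "johnson", "robery", "jones", "david", "jones" };

        Console.WriteLine("Did you mean?");
        foreach (var suggestion in SuggestionSketch.Suggest(terms, "johne", 3, 5))
        {
            Console.WriteLine("\t{0}", suggestion);
        }
    }
}
```

With a cut-off of 3 edits this prints john (1 edit), jones (2 edits) and johnson (3 edits), in that order. The real feature exposes Accuracy and Distance instead of a raw edit count, so the actual ranking can differ from this sketch.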