Learning RavenDB (Part 3): Static Indexes

For static indexes, RavenDB actually uses Lucene under the hood, so many of the concepts here are really Lucene's own.

1. Defining static indexes
documentStore.DatabaseCommands.PutIndex(
    "BlogPosts/PostsCountByTag",
    new IndexDefinitionBuilder<BlogPost, BlogTagPostsCount>
    {
        // The Map function: for each tag of each post, create a new BlogTagPostsCount
        // object with the name of a tag and a count of one.
        Map = posts => from post in posts
                       from tag in post.Tags
                       select new
                       {
                           Tag = tag,
                           Count = 1
                       },
 
        // The Reduce function: group all the BlogTagPostsCount objects we got back
        // from the Map function, use the Tag name as the key, and sum up all the
        // counts. Since the Map function gives each tag a Count of 1, when the Reduce
        // function returns we are going to have the correct Count of posts filed under
        // each tag.
        Reduce = results => from result in results
                            group result by result.Tag
                                into g
                                select new
                                {
                                    Tag = g.Key,
                                    Count = g.Sum(x => x.Count)
                                }
    });
public class BlogTagPostsCount
{
    public string Tag { get; set; }
    public int Count { get; set; }
}
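
To see what the Map and Reduce functions above compute, here is a minimal in-memory sketch that runs the same two LINQ shapes over a plain list. It does not use RavenDB at all; the BlogPost class (in particular its Title property) and the TagCountSketch helper are assumptions made only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document type for this sketch; only Tags is taken from the index above.
public class BlogPost
{
    public string Title { get; set; }
    public List<string> Tags { get; set; }
}

public static class TagCountSketch
{
    // Applies the same Map and Reduce shapes to an in-memory sequence.
    public static Dictionary<string, int> CountByTag(IEnumerable<BlogPost> posts)
    {
        // Map: one { Tag, Count = 1 } entry per tag of each post.
        var mapped = from post in posts
                     from tag in post.Tags
                     select new { Tag = tag, Count = 1 };

        // Reduce: group the entries by Tag and sum their counts.
        var reduced = from result in mapped
                      group result by result.Tag into g
                      select new { Tag = g.Key, Count = g.Sum(x => x.Count) };

        return reduced.ToDictionary(x => x.Tag, x => x.Count);
    }
}
```

Given one post tagged "raven" and "nosql" and another tagged only "raven", CountByTag returns a count of 2 for "raven" and 1 for "nosql".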

2. Indexing hierarchical data
Given a document like the one below, how should we index the Comments?
{  //posts/123
  'Name': 'Hello Raven',
  'Comments': [
    {
      'Author': 'Ayende',
      'Text': '...',
      'Comments': [
        {
          'Author': 'Rahien',
          'Text': '...',
          'Comments': []
        }
      ]
    }
  ]
}

store.DatabaseCommands.PutIndex("SampleRecurseIndex", new IndexDefinition
{
    Map = @"from post in docs.Posts
            from comment in Recurse(post, (Func<dynamic, dynamic>)(x => x.Comments))
            select new
            {
                Author = comment.Author,
                Text = comment.Text
            }"
});

Of course, we can also define a class:
public class SampleRecurseIndex : AbstractIndexCreationTask<Post>
{
    public SampleRecurseIndex()
    {
        Map = posts => from post in posts
                       from comment in Recurse(post, x => x.Comments)
                       select new
                       {
                           Author = comment.Author,
                           Text = comment.Text
                       };
    }
}

Then create the index with new SampleRecurseIndex().Execute(store);
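
The idea behind walking nested Comments can be sketched in plain C#. This is not RavenDB's Recurse implementation, only an illustration of flattening a comment tree into one sequence so that each comment can yield an index entry; the Comment class and the RecurseSketch helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type mirroring the shape of the JSON document above.
public class Comment
{
    public string Author { get; set; }
    public string Text { get; set; }
    public List<Comment> Comments { get; set; } = new List<Comment>();
}

public static class RecurseSketch
{
    // Yields every comment in the list and all of its descendants, depth-first.
    public static IEnumerable<Comment> Flatten(IEnumerable<Comment> comments)
    {
        foreach (var comment in comments)
        {
            yield return comment;
            foreach (var nested in Flatten(comment.Comments))
            {
                yield return nested;
            }
        }
    }
}
```

For the sample document, flattening the top-level Comments list yields the comment by Ayende followed by the nested comment by Rahien.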

3. Indexing related documents

1) First example

This example uses Invoice and Customer. An Invoice contains the Customer's Id, and we want to query invoices by the customer's name.
public class Invoice
{
    public string Id { get; set; }
 
    public string CustomerId { get; set; }
}
 
public class Customer
{
    public string Id { get; set; }
 
    public string Name { get; set; }
}

public class SampleIndex : AbstractIndexCreationTask<Invoice>
{
    public SampleIndex()
    {
        Map = invoices => from invoice in invoices
                          select new
                          {
                              CustomerId = invoice.CustomerId,
                              CustomerName = LoadDocument<Customer>(invoice.CustomerId).Name
                          };
    }
}

Once the index is created, we can query invoices by customer name.

2) Second example
public class Book
{
    public string Id { get; set; }
     
    public string Name { get; set; }
}
 
public class Author
{
    public string Id { get; set; }
 
    public string Name { get; set; }
 
    public IList<string> BookIds { get; set; }
}

public class AnotherIndex : AbstractIndexCreationTask<Author>
{
    public AnotherIndex()
    {
        Map = authors => from author in authors
                         select new
                             {
                                 Name = author.Name,
                                 Books = author.BookIds.Select(x => LoadDocument<Book>(x).Name)
                             };
    }
}

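Author stores the ids of all of the author's books. With this index we can look up the books an author has published, and we can also find the author by book title.

Things to note:
1) When a related document changes, the index is updated as well.
2) LoadDocument makes the index track the loaded document. When many documents track the same document, this becomes very expensive, because a change to that one document forces all of them to be re-indexed.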

4. TransformResults
Sometimes an index is complex but the data we need from it is simple. What do we do then?
public class PurchaseHistoryIndex : AbstractIndexCreationTask<Order, Order>
{
    public PurchaseHistoryIndex()
    {
        Map = orders => from order in orders
                        from item in order.Items
                        select new
                        {
                            UserId = order.UserId,
                            ProductId = item.Id
                        };
 
        TransformResults = (database, orders) =>
                           from order in orders
                           from item in order.Items
                           let product = database.Load<Product>(item.Id)
                           where product != null
                           select new
                           {
                               ProductId = item.Id,
                               ProductName = product.Name
                           };
    }
}

When querying we only need PurchaseHistoryViewItem, so we use OfType to convert the result type.
documentSession.Query<Order, PurchaseHistoryIndex>()
    .Where(x => x.UserId == userId)
    .OfType<PurchaseHistoryViewItem>()
    .ToArray();

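5. Error handling
Indexing runs on a background thread, so index errors are hard to notice. You can find them by looking at '/stats', '/raven/studio.html#/statistics', or '/raven/statistics.html'.
Once errors exceed 15%, the index is disabled. The 15% is only counted after the first 10 documents, so that a few bad documents at the very start do not disable the index.
Below is some error information returned by '/stats':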
{
    "LastDocEtag": "00000000-0000-0b00-0000-000000000001",
    "LastAttachmentEtag": "00000000-0000-0000-0000-000000000000",
    "CountOfIndexes": 1,
    "ApproximateTaskCount": 0,
    "CountOfDocuments": 1,
    "StaleIndexes": [],
    "CurrentNumberOfItemsToIndexInSingleBatch": 512,
    "CurrentNumberOfItemsToReduceInSingleBatch": 256,
    "Indexes":[
        {
            "Name": "PostsByTitle",
            "IndexingAttempts": 1,
            "IndexingSuccesses": 0,
            "IndexingErrors": 1
        }
    ],
    "Errors":[
        {
            "Index": "PostsByTitle",
            "Error": "Cannot perform runtime binding on a null reference",
            "Timestamp": "/Date(1271778107096+0300)/",
            "Document": "bob"
        }
    ]
}
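
The 15% rule described above can be written down as a small check. This is only a sketch of the rule as stated in this article, not RavenDB's actual code; it assumes the error ratio is computed over all indexing attempts and is simply not evaluated until more than 10 documents have been attempted. The IndexErrorRuleSketch class and its parameter names are hypothetical.

```csharp
using System;

public static class IndexErrorRuleSketch
{
    // Returns true when, under the assumptions above, the index would be disabled.
    public static bool WouldDisable(int attempts, int errors)
    {
        const int GracePeriod = 10;      // the first 10 documents are not judged
        const double Threshold = 0.15;   // more than 15% errors disables the index

        if (attempts <= GracePeriod)
        {
            return false;
        }

        return (double)errors / attempts > Threshold;
    }
}
```

With the statistics shown above (1 attempt, 1 error) the check returns false, because the index is still within its first 10 documents. With 100 attempts and 16 errors it returns true.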

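6. Querying

Calling string.Contains() in a query fails with a NotSupportedException: RavenDB does not support wildcard patterns of the form *term* there, because they cause performance problems.

1) Indexing multiple fields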
documentStore.DatabaseCommands.PutIndex("UsersByNameAndHobbies", new IndexDefinition
{
    Map = "from user in docs.Users select new { user.Name, user.Hobbies }",
    Indexes = { { "Name", FieldIndexing.Analyzed }, { "Hobbies", FieldIndexing.Analyzed } }
});

2) Searching multiple fields
users = session.Query<User>("UsersByNameAndHobbies")
               .Search(x => x.Name, "Adam")
               .Search(x => x.Hobbies, "sport").ToList();

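3) Boosting relevance
Setting a boost on a search term raises the relevance score of matching documents, so the results you care about rank higher and less relevant content is pushed down.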
users = session.Query<User>("UsersByHobbies")
               .Search(x => x.Hobbies, "I love sport", boost:10)
               .Search(x => x.Hobbies, "but also like reading books", boost:5).ToList();

A boost can also be set in the index definition:
public class Users_ByName : AbstractIndexCreationTask<User>
{
    public Users_ByName()
    {
        this.Map = users => from user in users
                            select new
                                {
                                    FirstName = user.FirstName.Boost(10),
                                    LastName = user.LastName
                                };
    }
}

4) Operators
AND operator
users = session.Query<User>("UsersByNameAndHobbiesAndAge")
               .Search(x => x.Hobbies, "computers")
               .Search(x => x.Name, "James")
               .Where(x => x.Age == 20).ToList();

Consecutive Search calls are combined with OR by default. To require both of them, pass SearchOptions.And:
users = session.Query<User>("UsersByNameAndHobbies")
               .Search(x => x.Name, "Adam")
               .Search(x => x.Hobbies, "sport", options: SearchOptions.And).ToList();

NOT operator
users = session.Query<User>("UsersByName")
        .Search(x => x.Name, "James", options: SearchOptions.Not).ToList();

Combining operators
AND NOT:
users = session.Query<User>("UsersByNameAndHobbies")
        .Search(x => x.Name, "Adam")
        .Search(x => x.Hobbies, "sport", options: SearchOptions.Not | SearchOptions.And)
        .ToList();

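5) Wildcards and fuzzy search
The escapeQueryOptions parameter accepts one of the following EscapeQueryOptions values: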
EscapeAll (default),
AllowPostfixWildcard,
AllowAllWildcards,
RawQuery.
users = session.Query<User>("UsersByName")
    .Search(x => x.Name, "Jo* Ad*",
            escapeQueryOptions:EscapeQueryOptions.AllowPostfixWildcard).ToList();

users = session.Query<User>("UsersByName")
    .Search(x => x.Name, "*oh* *da*",
            escapeQueryOptions: EscapeQueryOptions.AllowAllWildcards).ToList();

users = session.Query<User>("UsersByName")
    .Search(x => x.Name, "*J?n*",
            escapeQueryOptions: EscapeQueryOptions.RawQuery).ToList();
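
To make the wildcard patterns above concrete, the following sketch mimics their matching semantics on a single term: * stands for any run of characters and ? for exactly one character. It is not how Lucene evaluates wildcards (Lucene matches against analyzed terms in the index); the WildcardSketch helper is hypothetical, and the case-insensitive comparison is an assumption standing in for lowercasing during analysis.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardSketch
{
    // Returns true when the whole term matches the wildcard pattern.
    public static bool Matches(string term, string pattern)
    {
        // Escape everything, then turn the escaped wildcards back into regex equivalents.
        var regex = "^" + Regex.Escape(pattern)
                               .Replace("\\*", ".*")
                               .Replace("\\?", ".") + "$";
        return Regex.IsMatch(term, regex, RegexOptions.IgnoreCase);
    }
}
```

Here "Jo*" matches "John" but not "Adam", "*oh*" matches "John", and "*J?n*" matches both "Jon" and "Jan".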

6) Highlighting

public class SearchItem
{
    public string Id { get; set; }
 
    public string Text { get; set; }
}
 
public class ContentSearchIndex : AbstractIndexCreationTask<SearchItem>
{
    public ContentSearchIndex()
    {
        Map = (docs => from doc in docs
                       select new { doc.Text });
 
        Index(x => x.Text, FieldIndexing.Analyzed);
        Store(x => x.Text, FieldStorage.Yes);
        TermVector(x => x.Text, FieldTermVector.WithPositionsAndOffsets);
    }
}
// Process the highlighted fragments after the query completes
FieldHighlightings highlightings;
var results = session.Advanced.LuceneQuery<SearchItem>("ContentSearchIndex")
                 .Highlight("Text", 128, 1, out highlightings)
                 .Search("Text", "raven")
                 .ToArray();
 
var builder = new StringBuilder()
    .AppendLine("<ul>");
 
foreach (var result in results)
{
    var fragments = highlightings.GetFragments(result.Id);
    builder.AppendLine(string.Format("<li>{0}</li>", fragments.First()));
}
 
var ul = builder
    .AppendLine("</ul>")
    .ToString();

// Set the tags placed before and after each highlight at query time
FieldHighlightings highlightings;
var results = session.Advanced.LuceneQuery<SearchItem>("ContentSearchIndex")
                 .Highlight("Text", 128, 1, out highlightings)
                 .SetHighlighterTags("**", "**")
                 .Search("Text", "raven")
                 .ToArray();

7) Suggestions

Below are a User class and an index on the user's full name:
public class User
{
    public string Id { get; set; }
    public string FullName { get; set; }
}

public class Users_ByFullName : AbstractIndexCreationTask<User>
{
    public Users_ByFullName()
    {
        Map = users => from user in users
                       select new { user.FullName };
 
        Indexes.Add(x => x.FullName, FieldIndexing.Analyzed);
    }
}

Suppose the database contains the following data:
// users/1
{
    "FullName": "John Smith"
}
// users/2
{
    "FullName": "Jack Johnson"
}
// users/3
{
    "FullName": "Robery Jones"
}
// users/4
{
    "FullName": "David Jones"
}

Suppose you run the following query:
var query = session.Query<User, Users_ByFullName>().Where(x => x.FullName == "johne");
var user = query.FirstOrDefault();

If nothing is found, you can use the suggestion feature:
if (user == null)
{
    SuggestionQueryResult suggestionResult = query.Suggest();
 
    Console.WriteLine("Did you mean?");
 
    foreach (var suggestion in suggestionResult.Suggestions)
    {
        Console.WriteLine("\t{0}", suggestion);
    }
}

It will suggest:
 john
 jones
 johnson
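
Suggestions like these come from comparing the search term with terms in the index using a string distance. As a rough illustration, here is a standard Levenshtein edit distance. This is a sketch only: it is not RavenDB's or Lucene's implementation, it does not model the Accuracy or Popularity settings, and the SuggestionSketch class is hypothetical.

```csharp
using System;

public static class SuggestionSketch
{
    // Classic dynamic-programming edit distance: the number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }

        return d[a.Length, b.Length];
    }
}
```

Under this measure "johne" is 1 edit away from "john", 2 from "jones", and 3 from "johnson", which happens to agree with the order of the suggestions listed above.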
Below is a query with all of the parameters specified:
session.Query<User, Users_ByFullName>()
       .Suggest(new SuggestionQuery()
                    {
                        Field = "FullName",
                        Term = "johne",
                        Accuracy = 0.4f,
                        MaxSuggestions = 5,
                        Distance = StringDistanceTypes.JaroWinkler,
                        Popularity = true,
                    });
Another way to run the query:
store.DatabaseCommands.Suggest("Users/ByFullName", new SuggestionQuery()
                                                   {
                                                       Field = "FullName",
                                                       Term = "johne"
                                                   });

Suggestions for multiple terms:
Entering johne and davi together:
SuggestionQueryResult resultsByMultipleWords = session.Query<User, Users_ByFullName>()
       .Suggest(new SuggestionQuery()
       {
           Field = "FullName",
           Term = "<<johne davi>>",
           Accuracy = 0.4f,
           MaxSuggestions = 5,
           Distance = StringDistanceTypes.JaroWinkler,
           Popularity = true,
       });
 
Console.WriteLine("Did you mean?");
 
foreach (var suggestion in resultsByMultipleWords.Suggestions)
{
    Console.WriteLine("\t{0}", suggestion);
}