1. 程式人生 > >圖數據庫對比關系一些優點

圖數據庫對比關系一些優點

存在 partition oot 的人 sts www col 為知 rom

    上一篇簡要介紹了圖數據庫的一些基本內容(初識SQL Server2017 圖數據庫(一)),本篇通過對比關系型一些語法來體現圖數據庫模式的一些優點,比如查詢方便,語句易理解等。

在圖數據庫模型上構建查詢的優勢:

T-SQL 帶給圖表查詢一些新的語法。在SELECT語句中我們有一些特殊的語句來關聯點和邊。讓我們來演練一些,構建查詢語句檢索發帖和回復,如下:

  1. 我們檢索每個記錄的兩個部分,發帖和回復,因此我們需要在FROM子句中引用兩次ForumPosts’表,這個地方可以采用一些有意義的別名:

    

FROM dbo.ForumPosts ReplyPost, dbo.ForumPosts RepliedPost

  1. 盡管我們能選擇任何別名,但是在處理圖對象時最好選擇有意義的名字。
  2. 我們需要“posts”之間的關系,而這個關系就是表Reply_to’。語法如下:

    

FROM dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost

  1. 在WHERE 子句中,我們需要關聯所有的表,用下面這種MATCH語句來實現關聯:
FROM dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost

WHERE MATCH(ReplyPost-(Reply_to)->RepliedPost)

  1. 這個語法很有意思:“-”破折號表示邊的$From_id字段表示關系,然後“->”破折號和大於號用邊的$To_id字段表示關系。
  2. 因為知道那個別名有reply,那個別名有replied post,我們可以構建一個查詢字段列表:
FROM dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost

WHERE MATCH(ReplyPost-(Reply_to)->RepliedPost)

  1. 在關系型模型中相同功能的查詢如下:
     

select RepliedPost.PostId,RepliedPost.PostTitle,

   ReplyPost.PostId 
as ReplyId, ReplyPost.PostTitle as ReplyTitle from Forum.ForumPosts ReplyPost, Forum.ForumPosts RepliedPost where ReplyPost.PostId=RepliedPost.ReplyTo

  1. 這些查詢很相似,當然MATCH的語法更容易理解。
  2. 執行完上面語句查詢結果如下:

技術分享

  1. 我們加上寫這個回復貼人的名字。需要在FROM子句中添加‘ForumMembers’節點和‘Written_By’這個邊。語句如下:
FROM dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost, dbo.ForumMembers RepliedMember, Written_By RepliedWritten_By

  1. 還要添加MATCH語句的內部關系:
     
WHERE MATCH(ReplyPost-(Reply_to)->RepliedPost-(RepliedWritten_by)->RepliedMember)

  1. 這就可以在SELECT列表中添加回帖人的名字,最終的查詢如下:
     

-- Posts 、members 和replies

   SELECT RepliedPost.PostId,RepliedPost.PostTitle,RepliedMember.MemberName,

   ReplyPost.PostId as ReplyId, ReplyPost.PostTitle as ReplyTitle

      FROM dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost,

        dbo.ForumMembers RepliedMember, Written_By RepliedWritten_By

   WHERE MATCH(ReplyPost-(Reply_to)->RepliedPost-(RepliedWritten_by)->RepliedMember)

  1. 在關系型模型中的對應查詢如下:
SELECT RepliedPost.PostId,RepliedPost.PostTitle,

   ReplyPost.PostId as ReplyId, ReplyPost.PostTitle as ReplyTitle,

   RepliedMember.MemberName

  FROM Forum.ForumPosts ReplyPost, Forum.ForumPosts RepliedPost,

   Forum.ForumMembers RepliedMember

  WHERE ReplyPost.PostId=RepliedPost.ReplyTo

        and RepliedPost.OwnerId=RepliedMember.MemberId

  1. 結果如下所示:

技術分享

  1. 還缺少回復對象的名字。像上面一樣增加‘ForumMembers’‘Written_By’在FROM子句中:
From dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost,

dbo.ForumMembers RepliedMember, Written_By RepliedWritten_By,

dbo.ForumMembers ReplyMember, Written_By ReplyWritten_By

  1. 接下來,修改MATCH子句,‘ReplyMember’需要關聯‘ReplyPost’,但是如何去處理這個關系而不影響其他關系?需要用不同的方式來實現:
WHERE MATCH(ReplyMember<-(ReplyWritten_By)-ReplyPost-(Reply_to)->RepliedPost-(RepliedWritten_by)->RepliedMember)

  1. 註意這個符號“<-”與之前的相反方向,但是意義是相同的:一個在邊表的$to_id與節點表的關系。
  2. 最終,還需增加寫著回復的成員姓名,代碼如下:
-- Posts and members and their replies and members

   SELECT RepliedPost.PostId, RepliedPost.PostTitle,RepliedMember.MemberName,

   ReplyPost.PostId as ReplyId, ReplyPost.PostTitle as ReplyTitle,

   ReplyMember.MemberName [ReplyMemberName]

      FROM dbo.ForumPosts ReplyPost, dbo.Reply_to, dbo.ForumPosts RepliedPost,

        dbo.ForumMembers RepliedMember, Written_By RepliedWritten_By,

        dbo.ForumMembers ReplyMember, Written_By ReplyWritten_By

      WHERE MATCH(ReplyMember<-(ReplyWritten_By)-ReplyPost-(Reply_to)->RepliedPost-(RepliedWritten_by)->RepliedMember)

  1. 結果集如下:

技術分享

  1. 在關系型查詢的對應語句:
SELECT RepliedPost.PostId,RepliedPost.PostTitle,

         RepliedMember.MemberName, ReplyPost.PostId as ReplyId,

      ReplyPost.PostTitle as ReplyTitle, ReplyMember.MemberName

  FROM Forum.ForumPosts ReplyPost, Forum.ForumPosts RepliedPost,

       Forum.ForumMembers RepliedMember, Forum.ForumMembers ReplyMember

  WHERE ReplyPost.PostId=RepliedPost.ReplyTo

        and RepliedPost.OwnerId=RepliedMember.MemberId

        and ReplyPost.OwnerId=ReplyMember.MemberId

  1. 在這個時候,可能在關系型模式裏面隨著關系的增多讀取就會越困難,而在圖數據模式中MATCH子句相對就容易很多。讓我們看一下在圖數據模式中一些有趣又有用的地方。

統計每篇帖子的回復數

  SELECT distinct RepliedPost.PostID,RepliedPost.PostTitle,

          RepliedPost.PostBody,

          count(ReplyPost.PostID) over(partition by RepliedPost.PostID)

              as TotalReplies

   FROM dbo.ForumPosts ReplyPost, Reply_To, dbo.ForumPosts RepliedPost

   WHERE MATCH(ReplyPost-(Reply_To)->RepliedPost)

在這個語句中我們統計了每一篇回復的數量,但是僅僅在一個層面中,並不是在整個回復的樹結構裏面。

根貼(主貼)的列表

我們通過下面不使用MATCH的語句得到所有的根貼:

SELECT Post1.PostId,Post1.PostTitle

  FROM dbo.ForumPosts Post1

  WHERE $node_id not in (select $from_id from dbo.Reply_To

MATCH語法只是允許我們關聯三個或者更多的實體(比如兩個節點和一個關系)。當我們只想關聯其中兩個的時候,只需要一個常規的連接或者子查詢。如上面的語句一樣。

在結果中添加‘Level’字段

添加一個‘Level’字段,顯示樹結構。在T-SQL中有一個簡單的語法,叫做CTE實現遞歸。但是有一個問題,不能使用MATCH語法在一個派生表上,此時可以使用CTE。如果有必要,可以在CTE中使用MATCH,但是反之就不行了,有這樣的限制。下面展示一下使用常規的關系僅僅使用CTE來叠代,代碼如下:

 with root as

  ( select $node_id as node_id,RootPosts.PostId,

           RootPosts.PostTitle,

           1 as Level, 0 as ReplyTo

     from dbo.ForumPosts RootPosts

     where $node_id not in (select $from_id from dbo.reply_to)

  union all

     select $node_id,ReplyPost.PostId, ReplyPost.PostTitle,

            Level+1 as [Level], root.PostId as ReplyTo

     from dbo.ForumPosts ReplyPost, reply_to, root

     where ReplyPost.$node_id=reply_to.$from_id

           and root.node_id=reply_to.$to_id

  )

  select PostId,PostTitle, Level, ReplyTo

  from root

檢索一個帖子中的所有回復

使用CTE遞歸語法,我們可以用一種樹結構檢索一個帖子的所有回復。如果使用常規的語法不能在檢索帖子1的時候檢索貼子3,因為3是對2的回復,而2是對1的回復。使用CTE.當查詢帖子1的所有回復時能檢索貼子3。代碼如下:

 with root as

  ( select $node_id as node_id,RootPosts.PostId,RootPosts.PostTitle,

           1 as Level, 0 as ReplyTo

     from dbo.ForumPosts RootPosts

     where PostId=1  

  union all

     select $node_id,ReplyPost.PostId, ReplyPost.PostTitle,

            Level+1 as [Level],root.PostId as ReplyTo

     from dbo.ForumPosts ReplyPost, reply_to, root

     where ReplyPost.$node_id=reply_to.$from_id

           and root.node_id=reply_to.$to_id

  )

  select PostId,PostTitle, Level, ReplyTo

  from root

我們也可以反過來做,在樹狀結構中按順序檢索所有父貼。由於CTE不支持OUTER join,所以要在外部添加,代碼如下:

with root as

  ( select LeafPost.$node_id as node_id,LeafPost.PostId,

           LeafPost.PostTitle

     from dbo.ForumPosts LeafPost

     where LeafPost.PostId=3  -- Single post

  union all

     select RepliedPost.$node_id as node_id,RepliedPost.PostId,

            RepliedPost.PostTitle

     from dbo.ForumPosts RepliedPost, Reply_to, root

     where root.node_id=Reply_to.$from_id

           and Reply_to.$to_id=RepliedPost.$node_id

  )

  select root.PostId,root.PostTitle,

         RepliedPost.PostId ParentPostId

  from root

  left join reply_to

       on root.node_id=reply_to.$from_id

  left join dbo.ForumPosts RepliedPost

       on reply_to.$to_id=RepliedPost.$node_id

檢索一個用戶所有帖子

查詢一個用所有的信息,與帖子不同,這不需要樹,要簡單不少:

--  Peter回復的所有帖子

   SELECT distinct RepliedPost.PostID,RepliedPost.PostTitle,

           RepliedPost.PostBody

   FROM dbo.ForumPosts ReplyPost, Reply_To, dbo.ForumPosts RepliedPost,

        dbo.ForumMembers Members,Written_By

   WHERE MATCH(Members<-(Written_By)-ReplyPost-(Reply_To)->RepliedPost)

   and Members.MemberName=Peter

  -- Peter發的所有帖子

   SELECT ReplyPost.PostID,ReplyPost.PostTitle,ReplyPost.PostBody,

          RepliedPost.PostId ReplyTo

   FROM dbo.ForumPosts ReplyPost, Reply_To, dbo.ForumPosts RepliedPost,

        dbo.ForumMembers Members,Written_By

   WHERE MATCH(Members<-(Written_By)-ReplyPost-(Reply_To)->RepliedPost)

   and Members.MemberName=Peter

或許你註意到上面兩個查詢的不同,就是在展示字段上是否使用DISTINCT。這個去重是因為Peter回復同一個帖子可以超過一次。

在模型中檢索Likes(點贊)

這個查詢是有意思的:‘Likes’邊是成員和發帖表的關系。每一個關系都是唯一的,並不受其他關系影響。代碼如下:

--點贊的帖子或者被別人點贊的帖子。

  SELECT Post.PostID,Post.PostTitle,Member.MemberName

  FROM   dbo.ForumPosts Post, Likes,

         dbo.ForumMembers Member

  WHERE MATCH(Member-(Likes)->Post)

  -- 點贊的人或者被人點贊

  SELECT Member.MemberId,Member.MemberName LikeMember,

         LikedMember.MemberName LikedMemberName

  FROM dbo.ForumMembers Member, Likes, dbo.ForumMembers LikedMember

  WHERE MATCH(Member-(Likes)->LikedMember)

還可以很容易地聚合信息,以獲得每個帖子或每個成員的總的Likes。

--每個帖子總的likes

  select Post.PostId,Post.PostTitle,

         count(*) totalLikes

  from dbo.ForumPosts Post,Likes,

       dbo.ForumMembers Members

  where Match(Members-(Likes)->Post)

  group by PostId,PostTitle

  --每個成員總的點贊數select LikedMembers.MemberId,LikedMembers.MemberName,

         count(*) totalLikes

  from dbo.ForumMembers Members,Likes,

       dbo.ForumMembers LikedMembers

  where Match(Members-(Likes)->LikedMembers)

  group by LikedMembers.MemberId,

           LikedMembers.MemberName

用戶點贊並且回復帖子

我們也可以創建一些更有趣的查詢,例如,查找這些點贊並回復的人,如下:


  SELECT Member.MemberName,Member.Memberid,

         LikedPost.PostId,LikedPost.PostTitle,

         ReplyPost.PostTitle ReplyTitle

  FROM dbo.ForumPosts LikedPost, Reply_To, dbo.ForumPosts ReplyPost,

       Likes, dbo.ForumMembers Member, Written_By

  WHERE MATCH(Member-(Likes)->LikedPost<-(Reply_To)-ReplyPost-(Written_By)->Member)

註意,對於‘Member’節點使用了兩次在同一個MATCH表達式中。這形成了一種過濾:點贊並且有回復的成員,需要在‘LikedPost’和‘ReplyPost’中都有記錄才可以。

那麽在關系型模式中代碼如下:

select Likes.MemberId,Members.MemberName

  from Forum.Likes Likes, Forum.ForumPosts Posts,

       Forum.ForumMembers Members

  where Likes.MemberId=Posts.OwnerId

   and Posts.ReplyTo=Likes.PostId

   and Members.MemberId=Likes.MemberId

看起來這種寫法更難理解和讀懂。

回帖給多個帖子的成員

SELECT Members.MemberId, Members.MemberName,

         Count(distinct RepliedPost.PostId) as Total

  FROM   dbo.ForumPosts ReplyPost, Reply_To, dbo.ForumPosts RepliedPost,

         Written_By,dbo.ForumMembers Members

  WHERE  MATCH(Members<-(Written_By)-ReplyPost-(Reply_To)->RepliedPost)

  GROUP BY MemberId, Members.MemberName

  Having Count(RepliedPost.PostId) >1

回帖個一個帖子多次的成員:

SELECT Members.MemberId, Members.MemberName,

         RepliedPost.PostId RepliedId,count(*) as TotalReplies

  FROM   dbo.ForumPosts ReplyPost, Reply_To, dbo.ForumPosts RepliedPost,

       Written_By,dbo.ForumMembers Members

  WHERE MATCH(Members<-(Written_By)-ReplyPost-(Reply_To)->RepliedPost)

  GROUP BY MemberId,MemberName,RepliedPost.PostId

  Having count(*) >1

上述兩種語句中唯一的不同就是展示結果的聚合。

總結

  通過上述構建在圖數據模式下的查詢和關聯,對比了常規語句以及在關系模式下的相同查詢,不難發現無論是在易讀性,邏輯理解上還是在性能上都有很大提高。當然這只是第一個版本,所以難免有很多問題, 下一篇我講介紹這個版本存在的一部分問題。

圖數據庫對比關系一些優點