An Introduction to Scala Macros

阿新 • Published: 2018-12-10

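Recap

In the previous part I briefly covered the basic concepts of reflection and how to use runtime reflection, along with a little compiler theory. The most convoluted part, to my mind, was the several ways of using generics, and the most abstract was understanding the two notions of symbols and abstract syntax trees.

Here is a recap of the advanced uses of generics: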

Upper bound <:
Lower bound >:
View bound <%
Context bound :
Covariance +T
Contravariance -T
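As a quick reminder of how the bounds read in practice, here is a minimal sketch in plain Scala. The object and method names (BoundsDemo, maxOf, prepend, smallest) are made up for illustration and do not come from the previous article.

```scala
object BoundsDemo {
  // Upper bound: T must be a subtype of Comparable[T].
  def maxOf[T <: Comparable[T]](a: T, b: T): T =
    if (a.compareTo(b) >= 0) a else b

  // Lower bound: U must be a supertype of T, so prepending may widen the element type.
  def prepend[T, U >: T](head: U, tail: List[T]): List[U] = head :: tail

  // Context bound: an implicit Ordering[T] must be available at the call site.
  def smallest[T: Ordering](xs: List[T]): T = xs.min

  def main(args: Array[String]): Unit = {
    println(maxOf("apple", "pear"))
    println(prepend("zero", List(1, 2)))
    println(smallest(List(3, 1, 2)))
  }
}
```

View bounds (<%) are left out on purpose: they are discouraged in current Scala 2 in favor of an explicit implicit conversion parameter.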

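Come to think of it, if we already have generics, why do we need these extra features? An analogy helps: before generics existed, why were they introduced in the first place?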

For better code reuse, of course. Picture a method that takes no parameters: once you add parameters, a lot of near-duplicate code disappears.

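Likewise, what are generics? Another name for them is type parameterization. A method parameter could originally accept only one type; with generics, the method can handle arguments of many types.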

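Following the same line of thought, covariance and contravariance give the generic wrapper types a subtyping relationship of their own, and with that hierarchy in place a method can handle far more cases.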
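The subtyping relationship between wrapper types can be sketched as follows. Animal, Dog, Box, and Printer are hypothetical names chosen only for this illustration.

```scala
object VarianceDemo {
  class Animal { def name: String = "animal" }
  class Dog extends Animal { override def name: String = "dog" }

  // Covariant: Box[Dog] is a subtype of Box[Animal].
  final case class Box[+A](value: A)

  // Contravariant: Printer[Animal] is a subtype of Printer[Dog].
  trait Printer[-A] { def show(a: A): String }

  val animalPrinter: Printer[Animal] = new Printer[Animal] {
    def show(a: Animal): String = "I see a " + a.name
  }

  // Accepts any Box whose content is an Animal or a subtype of it.
  def describe(box: Box[Animal]): String = box.value.name

  // Accepts any Printer able to print a Dog, including a Printer[Animal].
  def showDog(p: Printer[Dog], d: Dog): String = p.show(d)

  def main(args: Array[String]): Unit = {
    val dogBox: Box[Dog] = Box(new Dog)
    println(describe(dogBox))
    println(showDog(animalPrinter, new Dog))
  }
}
```

Without the + and - annotations, neither call in main would compile, because Box[Dog] and Box[Animal] would be unrelated types.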

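Generics are not magic; they just save us from writing a whole series of similar code. They also bring type erasure and the pitfalls that come with it. Erasure exists for backward compatibility with older versions of the language, so there is nothing we can do about it. Still, the separation of data and logic that generics provide makes code much more concise and readable, and lets the compiler check type conversions for safety. So wherever generics fit, we still recommend using them.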
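Erasure is easy to observe directly. The sketch below (ErasureDemo and elemClassName are illustrative names) shows that two lists with different element types share one runtime class, and that a ClassTag can carry the element class explicitly when it is needed.

```scala
import scala.reflect.ClassTag

object ErasureDemo {
  val ints: List[Int] = List(1, 2, 3)
  val strs: List[String] = List("a", "b")

  // Both lists have the same runtime class; the element type is erased.
  val sameClass: Boolean = ints.getClass == strs.getClass

  // A ClassTag is passed implicitly and keeps the element class available at run time.
  def elemClassName[T](xs: List[T])(implicit ct: ClassTag[T]): String =
    ct.runtimeClass.getName

  def main(args: Array[String]): Unit = {
    println(sameClass)
    println(elemClassName(strs))
  }
}
```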

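Compile-Time Reflection

As mentioned in the previous article, compile-time reflection in Scala takes the form of macros, which are the focus of this article.

What can `Macros` do?

Put bluntly, a macro is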

Code that generates code

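Remember the AST (abstract syntax tree) from the previous article? Macros manipulate the AST at compile time (macro annotations additionally rely on a compiler plugin), which lets you do more or less whatever you please.

So you can think of a macro as a piece of code that runs at compile time. Used sensibly, it lets some work happen earlier, which means errors are found sooner, at compile time, rather than surfacing at run time. A smaller side benefit is that, because the macro call is expanded in place, the call-stack overhead of a method call goes away.

Tempting, isn't it? Let the macro feast begin.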
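To make the AST idea concrete without writing a macro yet, the runtime reflection universe can print the tree behind a snippet. This is only a sketch: it assumes scala-reflect is on the classpath and Scala 2.11 or later, and AstDemo is a made-up name.

```scala
import scala.reflect.runtime.universe._

object AstDemo {
  // A quasiquote is turned into a tree value by the compiler.
  val quoted: Tree = q"""println("hello!")"""

  // The same tree assembled by hand from node constructors.
  val manual: Tree =
    Apply(Ident(TermName("println")), List(Literal(Constant("hello!"))))

  def main(args: Array[String]): Unit = {
    // prints: Apply(Ident(TermName("println")), List(Literal(Constant("hello!"))))
    println(showRaw(quoted))
    // The hand-built tree has the same raw structure.
    println(showRaw(quoted) == showRaw(manual))
  }
}
```

A macro receives and returns trees of exactly this shape; the only difference is that it uses the compiler's universe (c.universe) instead of the runtime one.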

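Blackbox and Whitebox Macros

I won't spend much time on the general notions of black box and white box. Since Scala borrows these two words to describe macros, the difference between the two kinds is fairly self-explanatory. Note that the distinction is new in 2.11: in 2.10 there was only one kind of macro, the predecessor of today's whitebox macros.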

The official documentation describes them as follows: Macros that faithfully follow their type signatures are called blackbox macros as their implementations are irrelevant to understanding their behaviour (could be treated as black boxes). Macros that can't have precise signatures in Scala's type system are called whitebox macros (whitebox def macros do have signatures, but these signatures are only approximations).

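Everyone may read that differently, so I quoted the official description first. My own understanding: a macro whose expansion is used at exactly the declared return type is a blackbox macro. A macro that declares a return type (or even just c.Tree in its implementation) but whose real, more precise type is determined by the expansion is a whitebox macro.

That may still sound convoluted, so let me give an example. Before that, here is where each Context lives: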

2.10

scala.reflect.macros.Context

2.11 +

scala.reflect.macros.blackbox.Context
scala.reflect.macros.whitebox.Context

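Blackbox example

import scala.language.experimental.macros
import scala.reflect.macros.blackbox

object Macros {
    // The public entry point; the compiler replaces each call with the expansion.
    def hello: Unit = macro helloImpl

    def helloImpl(c: blackbox.Context): c.Expr[Unit] = {
        import c.universe._
        // Build the AST for println("hello!") by hand.
        c.Expr {
              Apply(
                    Ident(TermName("println")),
                    List(Literal(Constant("hello!")))
              )
        }
    }
}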

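Note, however, that blackbox macros come with four restrictions, mainly in these areas:

Type checking
Type inference
Implicit resolution
Pattern matching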

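Whitebox example

import scala.language.experimental.macros
import scala.reflect.macros.whitebox

object Macros {
    def hello: Unit = macro helloImpl

    // A whitebox implementation; returning c.Tree directly is allowed since 2.11.
    def helloImpl(c: whitebox.Context): c.Tree = {
      import c.universe._
      q"""println("hello!")"""
    }
}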
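The practical difference shows up at the call site. The sketch below is an illustration under stated assumptions rather than code from the official documentation: BoxDemo, black, and white are made-up names, both macros declare Any as their return type, and both expand to the literal 42. It targets Scala 2.11 to 2.13 with scala-reflect on the classpath.

```scala
import scala.language.experimental.macros
import scala.reflect.macros.{blackbox, whitebox}

object BoxDemo {
  // Both signatures promise only Any.
  def black: Any = macro blackImpl
  def white: Any = macro whiteImpl

  // Blackbox: the expansion is used at the declared type, Any.
  def blackImpl(c: blackbox.Context): c.Tree = {
    import c.universe._
    q"42"
  }

  // Whitebox: the expansion may refine the declared type, here to Int.
  def whiteImpl(c: whitebox.Context): c.Tree = {
    import c.universe._
    q"42"
  }
}

// Macros cannot be called from the compilation run that defines them,
// so the call site has to live in a separate module or source set:
//
//   val n: Int = BoxDemo.white   // compiles: the type is refined to Int
//   val m: Int = BoxDemo.black   // rejected: the type stays Any
```

This is what "signatures are only approximations" means in the quoted description: for the whitebox macro, Any is just an upper bound on what the caller actually gets.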

Using macros is easy, developing macros is hard.

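Having covered the two flavors of macros, let's look at the two ways of using them. One resembles C style: the macro is simply expanded at compile time, which saves the cost of a method call. The other is probably more familiar: annotations. Put a macro annotation on a class, method, or member, and the code you see can be turned into anything via the AST. Please don't get too carried away, though.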

Def Macros

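We have already seen def macros in the code above, so nothing surprising there. Those examples were simple, though. What if we want to pass an argument, or a type parameter?

Consider the following example: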

import scala.language.experimental.macros
import scala.reflect.macros.blackbox

object Macros {
    def hello2[T](s: String): Unit = macro hello2Impl[T]

    def hello2Impl[T](c: blackbox.Context)(s: c.Expr[String])(ttag: c.WeakTypeTag[T]): c.Expr[Unit] = {
        import c.universe._
        c.Expr {
            Apply(
                Ident(TermName("println")),
                List(
                    Apply(
                        Select(
                            Apply(
                                Select(
                                    Literal(Constant("hello ")),
                                    TermName("$plus")
                                ),
                                List(
                                    s.tree
                                )
                            ),
                            TermName("$plus")
                        ),
                        List(
                            Literal(Constant("!"))
                        )
                    )
                )
            )
        }
    }
}

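Compared with before, the exposed method hello2 gains a parameter s and a type parameter T, and the implementation hello2Impl gains two more parameter lists: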

(s: c.Expr[String])
(ttag: c.WeakTypeTag[T])

Let's go through them one by one.

c.Expr

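This is the macro's expression wrapper, here wrapping an expression of type String. Why not pass a String directly? Because a macro implementation receives the AST of each argument rather than its run-time value: arguments at the call site are handed to the implementation as c.Expr values (a plain c.Tree is accepted too since 2.11).

Note that the parameter name in (s: c.Expr[String]) must be the same as the parameter name in hello2[T](s: String).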

WeakTypeTag[T]

Recall TypeTag and ClassTag from the previous part.

scala> val ru = scala.reflect.runtime.universe
ru @ 6d657803: scala.reflect.api.JavaUniverse = scala.reflect.runtime.JavaUniverse@6d657803

scala> def foo[T: ru.TypeTag] = implicitly[ru.TypeTag[T]]
foo: [T](implicit evidence$1: reflect.runtime.universe.TypeTag[T])reflect.runtime.universe.TypeTag[T]

scala> foo[Int]
res0 @ 7eeb8007: reflect.runtime.universe.TypeTag[Int] = TypeTag[Int]

scala> foo[List[Int]]
res1 @ 7d53ccbe: reflect.runtime.universe.TypeTag[List[Int]] = TypeTag[scala.List[Int]]

So far so good. But what if I pass a type parameter instead, like this:

scala> def bar[T] = foo[T] // T is not a concrete type here, hence the error
<console>:26: error: No TypeTag available for T
       def bar[T] = foo[T]
                       ^

Right: for a type that is not concrete (a type parameter), we get an error. T needs a context bound before the call works, like this:

scala> def bar[T: TypeTag] = foo[T] // to the contrast T is concrete here
                                    // because it's bound by a concrete tag bound
bar: [T](implicit evidence$1: reflect.runtime.universe.TypeTag[T])reflect.runtime.universe.TypeTag[T]

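Sometimes, though, we cannot supply a bound for the type parameter, as in the def macros of this chapter. What then? No problem; as Mr. Yang (楊總) once said:

Any problem in computing can be solved by adding a layer of middleware.

So Scala introduces another concept, WeakTypeTag[T], which sits above TypeTag (TypeTag is a subtype of it). With it we can write: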

scala> def foo2[T] = weakTypeTag[T]
foo2: [T]=> reflect.runtime.universe.WeakTypeTag[T]

No bound is needed and it still works, whereas TypeTag does not:

scala> def foo[T] = typeTag[T]
<console>:15: error: No TypeTag available for T
       def foo[T] = typeTag[T]
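The difference between the two tags can be checked at run time. In this sketch (TagDemo, weak, and strong are illustrative names; scala-reflect is assumed to be on the classpath), the weak tag still refers to the abstract T, while the strong tag carries the concrete type chosen by the caller.

```scala
import scala.reflect.runtime.universe._

object TagDemo {
  // No bound on T: only a WeakTypeTag can be materialized here,
  // and it describes the type parameter T itself.
  def weak[T]: WeakTypeTag[T] = weakTypeTag[T]

  // With a TypeTag context bound the caller supplies the concrete type.
  def strong[T: TypeTag]: TypeTag[T] = typeTag[T]

  def main(args: Array[String]): Unit = {
    println(weak[Int].tpe)                      // the abstract T
    println(strong[Int].tpe =:= typeOf[Int])    // the concrete Int
  }
}
```

Inside a macro implementation the situation is friendlier: the c.WeakTypeTag[T] parameter is filled in by the compiler with whatever type it knows at the call site, which is why the weak variant is the one used there.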

Apply

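We have seen Apply() many times in the examples above. What does it do? Think of it as the AST constructor for a function application. Being curious, I looked at the source, and here it is: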

class ApplyExtractor{
    def apply(fun: Tree, args: List[Tree]): Apply = {
        ???
    }
}

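Look familiar? It follows the same pattern as the apply factory on the companion of Scala's List[+A]: an apply method that builds the node. That is enough for now.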

Ident

Identifier: think of it as the constructor for a tree node that refers to a Scala identifier.

Literal(Constant("hello "))

Literal: here, the constructor for a string literal.

Select

The selection constructor. Selecting what? Anything, whether a method or a class. Think of it as the . operator. For example:

scala> showRaw(q"scala.Some.apply")
res2: String = Select(Select(Ident(TermName("scala")), TermName("Some")), TermName("apply"))

The example above is another case: "hello ".$plus(s.tree)

Apply(
    Select(
        Literal(Constant("hello ")),
        TermName("$plus")
    ),
    List(
        s.tree
    )
)

The source looks like this:

class SelectExtractor {
    def apply(qualifier: Tree, name: Name): Select = {
        ???
    }
}

TermName("$plus")

Before looking at TermName, we should understand what Names are. The official documentation says:

Names are simple wrappers for strings.

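A name is just a simple wrapper around a string. Names has two subclasses, TermName and TypeName. Wrap a string in one of them and you can use it with Select to look up a member in a tree or to assemble a new tree.

See the official documentation.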
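Why "$plus" rather than "+"? Operator names are stored in their encoded form inside trees. The following sketch uses the runtime universe to show the encoding and to build the "hello " + ... selection by hand; NameDemo and its members are illustrative names, and scala-reflect is assumed to be available.

```scala
import scala.reflect.runtime.universe._

object NameDemo {
  // The source-level name "+" and its encoded spelling.
  val encodedPlus: String = TermName("+").encodedName.toString
  val decodedPlus: String = TermName("$plus").decodedName.toString

  // "hello ".$plus("world") assembled from Select and Apply.
  val concat: Tree =
    Apply(
      Select(Literal(Constant("hello ")), TermName("$plus")),
      List(Literal(Constant("world")))
    )

  // A nested selection: scala, then Some, then apply.
  val someApply: Tree = q"scala.Some.apply"

  def main(args: Array[String]): Unit = {
    println(encodedPlus)
    println(decodedPlus)
    println(showRaw(concat))
    println(showRaw(someApply))
  }
}
```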

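Quasiquotes (macro interpolators)

We just wrote a huge amount of code for such a simple feature. If that were the only way, macros would be hard to popularize no matter how powerful they are. So Scala introduced another tool: quasiquotes.

Quasiquotes make macros far easier to write and greatly improve productivity, because writing a macro starts to feel like writing ordinary Scala code.

The same functionality, implemented with quasiquotes: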

object Macros {
    def hello2[T](s: String): Unit = macro hello2Impl[T]

    def hello2Impl[T](c: blackbox.Context)(s: c.Expr[String])(ttag: c.WeakTypeTag[T]): c.Expr[Unit] = {
        import c.universe._
        val tree = q"""println("hello " + ${s.tree} + "!")"""
        
        c.Expr(tree)
    }
}

q""" ??? """ works like the s""" ??? """ and f""" ??? """ interpolators: you can use $ to refer to values from the enclosing scope, which makes the logic easy to write.
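Quasiquotes work in both directions: $ splices a tree into a larger one, and the same syntax on the left-hand side of a val or in a case takes a tree apart. A minimal sketch with the runtime universe follows (QuasiquoteDemo, name, and greeting are made-up names; inside a macro you would import c.universe._ instead).

```scala
import scala.reflect.runtime.universe._

object QuasiquoteDemo {
  // Unquoting: splice an existing tree into a larger expression.
  val name: Tree = q"userName"
  val greeting: Tree = q"""println("hello " + $name + "!")"""

  // Pattern matching: split the call into the function and its arguments.
  val q"$fun(..$args)" = greeting

  def main(args0: Array[String]): Unit = {
    println(fun)          // the function part of the call
    println(args.length)  // number of arguments in the single argument list
    println(showRaw(greeting))
  }
}
```

The ..$args form matches a whole argument list at once; the macro annotation example later in this article relies on the same mechanism with ..$stats and ...$paramss.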

Macro Annotations

Macro annotations are used just like annotations in Java. Below is an example I wrote: for a plain class, it fills in a toString() method like the one a case class gets.

package com.pharbers.macros.common.connecting

import scala.reflect.macros.whitebox
import scala.language.experimental.macros
import scala.annotation.{StaticAnnotation, compileTimeOnly}

@compileTimeOnly("enable macro paradise to expand macro annotations")
final class ToStringMacro extends StaticAnnotation {
    def macroTransform(annottees: Any*): Any = macro ToStringMacro.impl
}

object ToStringMacro {
    def impl(c: whitebox.Context)(annottees: c.Expr[Any]*): c.Expr[Any] = {
        import c.universe._

        val class_tree = annottees.map(_.tree).toList match {
            case q"$mods class $tpname[..$tparams] $ctorMods(...$paramss) extends ..$parents { $self => ..$stats }" :: Nil =>

                val params = paramss.flatMap { params =>
                    val q"..$trees" = q"..$params"
                    trees
                }
                val fields = stats.flatMap { params =>
                    val q"..$trees" = q"..$params"
                    trees.map {
                        case q"$mods def toString(): $tpt = $expr" => q""
                        case x => x
                    }.filter(_ != EmptyTree)
                }
                val total_fields = params ++ fields

                val toStringDefList = total_fields.map {
                    case q"$mods val $tname: $tpt = $expr" => q"""${tname.toString} + " = " + $tname"""
                    case q"$mods var $tname: $tpt = $expr" => q"""${tname.toString} + " = " + $tname"""
                    case _ => q""
                }.filter(_ != EmptyTree)
                val toStringBody = if(toStringDefList.isEmpty) q""" "" """ else toStringDefList.reduce { (a, b) => q"""$a + ", " + $b""" }
                val toStringDef = q"""override def toString(): String = ${tpname.toString()} + "(" + $toStringBody + ")""""

                q"""
                    $mods class $tpname[..$tparams] $ctorMods(...$paramss) extends ..$parents { $self => ..$stats
                        $toStringDef
                    }
                """

            case _ => c.abort(c.enclosingPosition, "Annotation @ToStringMacro can be used only with class")
        }

        c.Expr[Any](class_tree)
    }
}
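The long class pattern at the top of impl is the heart of this macro, and it can be tried outside a macro with the runtime universe. The sketch below is not the annotation itself: ClassPatternDemo and describe are made-up names, and it only extracts the class name and the constructor fields that the macro would later print.

```scala
import scala.reflect.runtime.universe._

object ClassPatternDemo {
  // A class definition as an untyped tree, as the annotation macro would receive it.
  val classDef: Tree = q"class Person(val name: String, var age: Int)"

  // Returns the class name and the names of its val/var constructor parameters.
  def describe(tree: Tree): Option[(String, List[String])] = tree match {
    case q"$mods class $tpname[..$tparams] $ctorMods(...$paramss) extends ..$parents { $self => ..$stats }" =>
      val fields = paramss.flatten.collect {
        case q"$mods val $tname: $tpt = $expr" => tname.toString
        case q"$mods var $tname: $tpt = $expr" => tname.toString
      }
      Some((tpname.toString, fields))
    case _ => None
  }

  def main(args: Array[String]): Unit =
    println(describe(classDef))
}
```

Here ...$paramss captures every parameter list of the constructor and ..$stats captures the statements of the class body, which is why the macro can re-emit the class unchanged and append one extra method.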

compileTimeOnly

Not mandatory, but recommended. The official explanation:

It is not mandatory, but is recommended to avoid confusion. Macro annotations look like normal annotations to the vanilla Scala compiler, so if you forget to enable the macro paradise plugin in your build, your annotations will silently fail to expand. The @compileTimeOnly annotation makes sure that no reference to the underlying definition is present in the program code after typer, so it will prevent the aforementioned situation from happening.

StaticAnnotation

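A class that extends StaticAnnotation is treated by the Scala compiler as an annotation class and is used as an annotation. It is not meant to be instantiated or extended directly, so add the final modifier.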

macroTransform

def macroTransform(annottees: Any*): Any = macro ToStringMacro.impl

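For code annotated with @ToStringMacro, the compiler calls macroTransform automatically. Its parameter is annottees: Any* and its return type is Any, mainly because Scala lacks a more precise way to describe them, so this loose signature is used to accept arguments of any type. The implementation is written the same way as for a def macro.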

impl

def impl(c: whitebox.Context)(annottees: c.Expr[Any]*): c.Expr[Any] = {
    import c.universe._
    ???
}

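Now for the actual macro implementation, which is much the same as for a def macro. For a macro annotation that takes arguments, however, you need to write it like this: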

final class One2OneConn[C](param_name: String) extends StaticAnnotation {
    def macroTransform(annottees: Any*): Any = macro One2OneConn.impl
}

object One2OneConn {
    def impl(c: whitebox.Context)(annottees: c.Expr[Any]*): c.Expr[Any] = {
        import c.universe._
        
        // Match the current annotation to extract its arguments
        val (conn_type, conn_name) = c.prefix.tree match {
            case q"new One2OneConn[$conn_type]($conn_name)" =>
                (conn_type.toString, conn_name.toString.replace("\"", ""))
            case _ => c.abort(c.enclosingPosition, "Annotation @One2OneConn must provide conn_type and conn_name !")
        }
        
        ???
    }
}
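The prefix match can also be rehearsed with the runtime universe. In this sketch a quasiquote stands in for c.prefix.tree, and PrefixDemo and annotationArgs are made-up names. Instead of calling toString and stripping the quotes, it reads the string out of the Literal(Constant(...)) node, which is one possible alternative and not the approach used above.

```scala
import scala.reflect.runtime.universe._

object PrefixDemo {
  // Stand-in for c.prefix.tree: the annotation instantiation as an untyped tree.
  val prefix: Tree = q"""new One2OneConn[String]("user")"""

  // Extracts the type argument and the string argument of the annotation.
  def annotationArgs(tree: Tree): Option[(String, String)] = tree match {
    case q"new One2OneConn[$tpe]($arg)" =>
      arg match {
        // Only a literal string is accepted as the argument.
        case Literal(Constant(name: String)) => Some((tpe.toString, name))
        case _                               => None
      }
    case _ => None
  }

  def main(args: Array[String]): Unit =
    println(annotationArgs(prefix))
}
```

Because the trees are untyped, One2OneConn does not need to exist for this sketch to run; the pattern matches on names only.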

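A few points to note:

A macro annotation can only operate on itself and on annotations placed below it. Annotations above it have already been expanded, so they can no longer be manipulated.
If a macro annotation produces several results, for example expanding the annotated class and also generating an instance of it, the results must be wrapped in a Block.
Macro annotations must be whitebox macros.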

Macro Paradise

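Scala offers a plugin called Macro Paradise that helps developers control the compilation order of Scala code containing macros and also provides debugging support. It is also what enables macro annotations in Scala 2.11 and 2.12. I won't cover it in detail here; if you are interested, see the official site: Macro Paradise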
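For reference, enabling the plugin in an sbt build usually looks like the fragment below. Treat it as a sketch: the Scala version is an assumption chosen for illustration, and you should check which paradise release is published for the exact compiler version you use.

```scala
// build.sbt (sketch; version numbers are assumptions)
scalaVersion := "2.12.8"

// Macros need the reflection API at compile time.
libraryDependencies += "org.scala-lang" % "scala-reflect" % scalaVersion.value

// Scala 2.11 / 2.12: macro annotations are enabled by the paradise compiler plugin,
// which is published per full Scala version.
addCompilerPlugin("org.scalamacros" % "paradise" % "2.1.1" cross CrossVersion.full)

// Scala 2.13: the plugin is no longer needed; use a compiler flag instead.
// scalacOptions += "-Ymacro-annotations"
```

Remember that the module defining a macro and the module using it must be compiled separately, so a typical build has one subproject for the macros and another that depends on it.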