
The lazy parameter of backref in Flask-SQLAlchemy: examples, explanation, and how to choose

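I have recently been learning SQLAlchemy in Flask, but the lazy parameter of db.relationship() stayed fuzzy to me. The main trigger was the follower/followed relationship model in the book Flask Web Development, where the use of lazy left me confused.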

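The official documentation does not say much, so I experimented myself. Starting from the SQL statements that each value of lazy produces, and using one-to-many and many-to-many relationships, this post analyzes which value of lazy (dynamic, joined, select) to choose in which scenario. The choice affects database performance, and the difference can be large, especially when there is a lot of data.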
All of the examples below are based on the following models and tables. The focus is on changing the lazy attribute of the backref in relationship.

registrations = db.Table('registrations',
                         db.Column('student_id', db.Integer, db.ForeignKey('students.id')),
                         db.Column('class_id', db.Integer, db.ForeignKey('classes.id')))
class Student(db.Model):
    __tablename__ = 'students'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    class_id = db.Column(db.Integer, db.ForeignKey('classes.id'))
    def __repr__(self):
        return '<Student: %r>' %self.name
class Class(db.Model):
    __tablename__ = 'classes'
    id = db.Column(db.Integer, primary_key=True)
    students = db.relationship('Student', backref='_class', lazy="dynamic")
    name = db.Column(db.String(64))
    def __repr__(self):
        return '<Class: %r>' %self.name

Basic introduction

First, here is what the official documentation says about lazy:

lazy decides when SQLAlchemy loads data from the database. It takes the following four values (there is also noload, which is rarely used):
select: (which is the default) means that SQLAlchemy will load the data as necessary in one go using a standard select statement.
joined: tells SQLAlchemy to load the relationship in the same query as the parent using a JOIN statement.
subquery: works like ‘joined’ but instead SQLAlchemy will use a subquery.
dynamic: is special and useful if you have many items. Instead of loading the items SQLAlchemy will return another query object which you can further refine before loading the items. This is usually what you want if you expect more than a handful of items for this relationship.

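Put simply, select loads all of the data for the attribute when the attribute is accessed. joined performs a JOIN between the two related tables so that all related objects are fetched in the same query. dynamic is different: accessing the attribute does not load anything into memory, but returns a query object, and you must call a method such as .all() on it to get the objects. The examples below show where each one fits.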

Examples

Start with the one-to-many relationship from the beginning, with one change: set the lazy of students to select:

students = db.relationship('Student', backref='_class', lazy="select")

With this setting, class.students returns the result list directly:

>>> from app.models import Student as S, Class as C
>>> c1=C.query.first()
>>> c1.students
[<Student: u'test'>, <Student: u'test2'>, <Student: u'test3'>]

This is inconvenient when there is a lot of data or when you want to refine the result further, which is where dynamic comes in:

students = db.relationship('Student', backref='_class', lazy="dynamic")

Look at the result again:

>>> from app.models import Student as S, Class as C
>>> s1=S.query.first()
>>> c1=C.query.first()
>>> c1.students
<sqlalchemy.orm.dynamic.AppenderBaseQuery object at 0x7f007d2e8ed0>
>>> print c1.students
SELECT students.id AS students_id, students.name AS students_name
FROM students, registrations
WHERE :param_1 = registrations.class_id AND students.id = registrations.student_id
>>> c1.students.all()
[<Student: u'test'>, <Student: u'test2'>, <Student: u'test3'>]

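As you can see, c1.students returns a query object, and the SQL behind that object can be inspected as well: it simply queries Student. With lazy="select" or lazy="joined", by contrast, the results are returned directly. Note that lazy="dynamic" can only be used for one-to-many and many-to-many relationships, not for one-to-one or many-to-one: if there is only a single result, there is nothing to gain from deferring the load.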
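To make the "further operations" point concrete, here is a minimal sketch of refining a dynamic relationship before anything is loaded. It uses plain SQLAlchemy with an in-memory SQLite database instead of Flask-SQLAlchemy's db object, so Base, engine and session are stand-ins assumed for the demo; the models mirror the one-to-many Student and Class above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Student(Base):
    __tablename__ = 'students'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    class_id = Column(Integer, ForeignKey('classes.id'))


class Class(Base):
    __tablename__ = 'classes'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    # lazy="dynamic": accessing .students yields a query object, not a list
    students = relationship('Student', backref='_class', lazy='dynamic')


engine = create_engine('sqlite://')  # in-memory database, demo only
Base.metadata.create_all(engine)

session = Session(engine)
c1 = Class(name='class1')
session.add(c1)
for name in ('test', 'test2', 'test3'):
    session.add(Student(name=name, _class=c1))
session.commit()

c1 = session.query(Class).first()
# Each call below builds on the query object and runs its own SELECT.
total = c1.students.count()
first_two = c1.students.order_by(Student.name).limit(2).all()
matches = c1.students.filter(Student.name == 'test2').all()
print(total, [s.name for s in first_two], [s.name for s in matches])
```

With lazy="select" the same attribute would already be a fully loaded list, so counting or filtering would have to happen in Python after every row had been fetched.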
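Everything so far sets lazy on the attribute itself, while the lazy of the backref defaults to select. What if you want to set lazy on the backref? Just use backref=db.backref('students', lazy='dynamic'). This is something to think about in many-to-many relationships.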
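As a rough illustration of setting lazy on the reverse side, the sketch below declares the relationship on Student and makes the generated Class.students attribute dynamic through backref(). It again assumes plain SQLAlchemy with an in-memory SQLite database; in Flask-SQLAlchemy the equivalents would be db.relationship and db.backref.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, backref, declarative_base, relationship

Base = declarative_base()


class Class(Base):
    __tablename__ = 'classes'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


class Student(Base):
    __tablename__ = 'students'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    class_id = Column(Integer, ForeignKey('classes.id'))
    # Forward side (Student._class) keeps the default lazy="select";
    # the reverse side (Class.students) is made dynamic via backref().
    _class = relationship('Class', backref=backref('students', lazy='dynamic'))


engine = create_engine('sqlite://')  # in-memory database, demo only
Base.metadata.create_all(engine)

session = Session(engine)
c1 = Class(name='class1')
session.add_all([c1,
                 Student(name='test', _class=c1),
                 Student(name='test2', _class=c1)])
session.commit()

s1 = session.query(Student).filter_by(name='test').one()
print(s1._class.name)                        # a loaded Class instance
print(isinstance(s1._class.students, list))  # False: it is a query object
print(s1._class.students.count())
```

The two sides are independent: each direction of the relationship gets its own lazy value, and only the one passed to backref() affects the reverse attribute.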
Start with the most basic many-to-many relationship:

registrations = db.Table('registrations',
                         db.Column('student_id', db.Integer, db.ForeignKey('students.id')),
                         db.Column('class_id', db.Integer, db.ForeignKey('classes.id')))
class Student(db.Model):
    __tablename__ = 'students'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    # class_id = db.Column(db.Integer, db.ForeignKey('classes.id'))  # commented out: the foreign key is no longer needed
    def __repr__(self):
        return '<Student: %r>' %self.name
class Class(db.Model):
    __tablename__ = 'classes'
    id = db.Column(db.Integer, primary_key=True)
    students = db.relationship('Student', secondary=registrations, backref='_class', lazy="dynamic")  # the association table is specified here
    name = db.Column(db.String(64))
    def __repr__(self):
        return '<Class: %r>' %self.name

Running it gives:

>>> s1=S.query.first()
>>> c1=C.query.first()
>>> s1._class
[<Class: u'class1'>, <Class: u'class2'>]
>>> c1.students
<sqlalchemy.orm.dynamic.AppenderBaseQuery object at 0x7ff8691a8610>
>>> c1.students.all()
[<Student: u'test'>, <Student: u'test2'>, <Student: u'test3'>]
>>> print c1.students
SELECT students.id AS students_id, students.name AS students_name
FROM students, registrations
WHERE :param_1 = registrations.class_id AND students.id = registrations.student_id

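This is very similar to the one-to-many case, except that s1._class is now a collection. Because backref="_class" still defaults to select, it returns the results directly, and the SQL for c1.students only queries students. But suppose the lazy of the backref is changed to joined: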

students = db.relationship('Student', secondary=registrations,
                           backref=db.backref('_class', lazy="joined"), lazy="dynamic")

Then look at the result:

....
>>> print c1.students
SELECT students.id AS students_id, students.name AS students_name, classes_1.id AS classes_1_id, classes_1.name AS classes_1_name
FROM registrations, students LEFT OUTER JOIN (registrations AS registrations_1 JOIN classes AS classes_1 ON classes_1.id = registrations_1.class_id) ON students.id = registrations_1.student_id
WHERE :param_1 = registrations.class_id AND students.id = registrations.student_id
>>> c1.students.all()
[<Student: u'test'>, <Student: u'test2'>, <Student: u'test3'>]
>>> s1._class
[<Class: u'class1'>, <Class: u'class2'>]

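What does not change is that s1._class still returns data directly. What changes is the SQL for c1.students: it no longer queries only Student objects, but also joins through the association table to fetch the related Class objects. What does "related" mean here? Running the SQL directly makes it clear: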

mysql> SELECT students.id AS students_id, students.name AS students_name, classes_1.id AS classes_1_id, classes_1.name AS classes_1_name  FROM registrations, students LEFT OUTER JOIN (registrations AS registrations_1 JOIN classes AS classes_1 ON classes_1.id = registrations_1.class_id) ON students.id = registrations_1.student_id  WHERE 1 = registrations.class_id AND students.id = registrations.student_id;
+-------------+---------------+--------------+----------------+
| students_id | students_name | classes_1_id | classes_1_name |
+-------------+---------------+--------------+----------------+
|           1 | test          |            1 | class1         |
|           1 | test          |            2 | class2         |
|           2 | test2         |            1 | class1         |
|           3 | test3         |            1 | class1         |
+-------------+---------------+--------------+----------------+
4 rows in set (0.00 sec)

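In other words, the Class entity for each of the returned students is fetched as well. This does not seem to buy much in this example, though, because the many-to-many relationship here is simple: the association table is not even a model, it only holds two foreign-key ids, and registrations is managed entirely by SQLAlchemy, so the program does not access it directly.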
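One way to check that the joined backref really loads the classes in the same statement is to count the statements SQLAlchemy sends. The sketch below is an assumption-laden demo, not the original setup: it uses plain SQLAlchemy with in-memory SQLite instead of MySQL, and the record listener and statements list are helper names introduced here.

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, Table,
                        create_engine, event)
from sqlalchemy.orm import Session, backref, declarative_base, relationship

Base = declarative_base()

registrations = Table(
    'registrations', Base.metadata,
    Column('student_id', Integer, ForeignKey('students.id')),
    Column('class_id', Integer, ForeignKey('classes.id')))


class Student(Base):
    __tablename__ = 'students'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


class Class(Base):
    __tablename__ = 'classes'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    students = relationship('Student', secondary=registrations,
                            backref=backref('_class', lazy='joined'),
                            lazy='dynamic')


engine = create_engine('sqlite://')  # in-memory database, demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    c1, c2 = Class(name='class1'), Class(name='class2')
    session.add_all([Student(name='test', _class=[c1, c2]),
                     Student(name='test2', _class=[c1]),
                     Student(name='test3', _class=[c1])])
    session.commit()

statements = []  # every SQL string sent to the database from here on


@event.listens_for(engine, 'before_cursor_execute')
def record(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)


with Session(engine) as session:
    c1 = session.query(Class).filter_by(name='class1').one()
    statements.clear()  # ignore the query that loaded c1 itself
    students = c1.students.order_by(Student.id).all()
    # Reading s._class needs no extra SELECT: it was filled by the JOIN.
    classes_of = {s.name: sorted(c.name for c in s._class) for s in students}
    n_queries = len(statements)

print(n_queries, classes_of)
```

Under these assumptions a single statement is recorded, and each student already carries its list of classes when the loop reads s._class.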
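The next many-to-many example shows the advantage of this lazy setting. The association table becomes a real model and gains an extra timestamp column. The model code is as follows: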

from datetime import datetime
class Registration(db.Model):
    '''Association table'''
    __tablename__ = 'registrations'
    student_id = db.Column(db.Integer, db.ForeignKey('students.id'), primary_key=True)
    class_id = db.Column(db.Integer, db.ForeignKey('classes.id'), primary_key=True)
    create_at = db.Column(db.DateTime, default=datetime.utcnow)
class Student(db.Model):
    __tablename__ = 'students'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    _class = db.relationship('Registration', foreign_keys=[Registration.student_id],
                             backref=db.backref('student', lazy="joined"), lazy="dynamic")
    def __repr__(self):
        return '<Student: %r>' %self.name
class Class(db.Model):
    __tablename__ = 'classes'
    id = db.Column(db.Integer, primary_key=True)
    students = db.relationship('Registration', foreign_keys=[Registration.class_id],
                               backref=db.backref('_class', lazy="joined"), lazy="dynamic")
    name = db.Column(db.String(64))
    def __repr__(self):
        return '<Class: %r>' %self.name

Prepare the data in advance:

mysql> select * from classes;
+----+--------+
| id | name   |
+----+--------+
|  1 | class1 |
|  2 | class2 |
+----+--------+
2 rows in set (0.00 sec)
mysql> select * from students;
+----+-------+
| id | name  |
+----+-------+
|  1 | test  |
|  2 | test2 |
|  3 | test3 |
+----+-------+
3 rows in set (0.00 sec)
mysql> select * from registrations;
+------------+----------+-----------+
| student_id | class_id | create_at |
+------------+----------+-----------+
|          1 |        1 | NULL      |
|          2 |        1 | NULL      |
|          3 |        1 | NULL      |
|          1 |        2 | NULL      |
+------------+----------+-----------+
4 rows in set (0.00 sec)

Then look at the results:

>>> s1._class.all()
[<app.models.Registration object at 0x7f0348018ed0>, <app.models.Registration object at 0x7f0348018f50>]
>>> c1.students.all()
[<app.models.Registration object at 0x7f0348018ed0>, <app.models.Registration object at 0x7f03480412d0>, <app.models.Registration object at 0x7f034c32f250>]

Both calls now return Registration objects; Student and Class objects are no longer returned directly. To get at them, use the backrefs that were added to Registration:

>>> map(lambda x: x._class, s1._class.all())
[<Class: u'class1'>, <Class: u'class2'>]
>>> map(lambda x: x.student, c1.students.all())
[<Student: u'test'>, <Student: u'test2'>, <Student: u'test3'>]

This raises a question: when _class and student are accessed on a Registration, does the database have to be queried again?

The SQL that gets executed answers it:

>>> print s1._class
SELECT registrations.student_id AS registrations_student_id, registrations.class_id AS registrations_class_id, registrations.create_at AS registrations_create_at, classes_1.id AS classes_1_id, classes_1.name AS classes_1_name, students_1.id AS students_1_id, students_1.name AS students_1_name
FROM registrations LEFT OUTER JOIN classes AS classes_1 ON classes_1.id = registrations.class_id LEFT OUTER JOIN students AS students_1 ON students_1.id = registrations.student_id
WHERE :param_1 = registrations.student_id

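As in the previous example, the query behind s1._class not only fetches the matching Registration rows but also, through the JOINs, fetches the corresponding Student and Class objects. In other words, the two backref attributes of Registration, student and _class, already point to the loaded objects, so the single statement issued by s1._class covers all of the operations above. This is the effect of backref=db.backref('_class', lazy='joined').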
Now consider what happens when lazy is changed to select:

###
_class = db.relationship('Registration', foreign_keys=[Registration.student_id],
                         backref=db.backref('student', lazy="select"), lazy="dynamic")
###
students = db.relationship('Registration', foreign_keys=[Registration.class_id],
                           backref=db.backref('_class', lazy="select"), lazy="dynamic")

Now look at the query:

>>> s1=S.query.first()
>>> print s1._class
SELECT registrations.student_id AS registrations_student_id, registrations.class_id AS registrations_class_id, registrations.create_at AS registrations_create_at
FROM registrations
WHERE :param_1 = registrations.student_id
>>> map(lambda x : x._class , s1._class)
[<Class: u'class1'>, <Class: u'class2'>]

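This is a very simple SQL statement that only returns Registration objects. The result is the same, but every Registration object issues its own database query when its _class attribute is accessed. That is expensive. For example, if a class has 100 students, going from class.students to the Student objects costs 100 additional queries. Every query has a real cost, and this is exactly what joined is for.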
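The difference can be measured directly. The following sketch rebuilds the association-object models with a configurable backref lazy value and counts the statements needed to list the students of one class. It assumes plain SQLAlchemy with in-memory SQLite; count_selects, record and the generated test data are hypothetical names and values introduced for this demo only.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import Session, backref, declarative_base, relationship


def count_selects(backref_lazy, n_students=5):
    """Return (statements issued, students found) when listing the students
    of one class with the Registration backrefs set to backref_lazy."""
    Base = declarative_base()

    class Registration(Base):
        __tablename__ = 'registrations'
        student_id = Column(Integer, ForeignKey('students.id'), primary_key=True)
        class_id = Column(Integer, ForeignKey('classes.id'), primary_key=True)

    class Student(Base):
        __tablename__ = 'students'
        id = Column(Integer, primary_key=True)
        name = Column(String(64))
        _class = relationship(Registration, lazy='dynamic',
                              backref=backref('student', lazy=backref_lazy))

    class Class(Base):
        __tablename__ = 'classes'
        id = Column(Integer, primary_key=True)
        name = Column(String(64))
        students = relationship(Registration, lazy='dynamic',
                                backref=backref('_class', lazy=backref_lazy))

    engine = create_engine('sqlite://')  # in-memory database, demo only
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        session.add(Class(id=1, name='class1'))
        for i in range(1, n_students + 1):
            session.add(Student(id=i, name='test%d' % i))
            session.add(Registration(student_id=i, class_id=1))
        session.commit()

    statements = []  # every SQL string sent to the database from here on

    @event.listens_for(engine, 'before_cursor_execute')
    def record(conn, cursor, statement, parameters, context, executemany):
        statements.append(statement)

    with Session(engine) as session:
        c1 = session.get(Class, 1)
        statements.clear()  # ignore the query that loaded c1 itself
        names = [r.student.name for r in c1.students.all()]

    return len(statements), len(names)


print(count_selects('select'))  # one query for registrations, plus one per student
print(count_selects('joined'))  # a single JOIN query
```

With five students the select variant issues six statements and the joined variant one, and the gap grows linearly with the number of rows.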

Summary