
Machine-Learning–Based Column Selection for Column Generation

Paper-reading notes based on my own understanding; corrections are welcome. This post only outlines the paper; please read the reference for details. The paper falls under the category of machine learning alongside optimization algorithms.

01 Column Generation

Column generation (CG) is widely used in combinatorial optimization and is an effective algorithm for large-scale optimization problems. It decomposes a large-scale linear program into a master problem (MP) and a pricing problem (PP). The algorithm first restricts the MP to a small set of columns, yielding the restricted master problem (RMP). Solving the RMP gives a dual solution, which is passed to the PP; solving the PP then produces columns that are added back to the RMP. The RMP and PP are solved alternately until no column with a negative reduced cost can be found, at which point the optimal solution of the RMP is also optimal for the MP, as illustrated in the figure below and sketched in code right after it:
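To make the loop concrete, here is a minimal Python sketch of the iteration just described; `solve_rmp` and `solve_pricing` are hypothetical stand-ins for an LP solver and a problem-specific pricing routine (e.g., a labeling algorithm), and `reduced_cost` is assumed to be an attribute of each generated column:

```python
# Minimal column generation loop (illustrative sketch).
def column_generation(initial_columns, solve_rmp, solve_pricing, tol=1e-6):
    columns = list(initial_columns)
    while True:
        # Solve the restricted master problem over the current columns.
        rmp_solution, duals = solve_rmp(columns)
        # Price out new columns using the dual values.
        generated = solve_pricing(duals)
        # Keep only columns with a negative reduced cost.
        improving = [p for p in generated if p.reduced_cost < -tol]
        if not improving:
            # No improving column left: the RMP optimum solves the MP.
            return rmp_solution
        columns.extend(improving)
```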

I have written several detailed articles on how column generation works; readers who are not yet familiar with it may want to start there.

02 Column Selection

Many techniques can be used to speed up the convergence of CG. One of them is to add several columns with a negative reduced cost at each iteration, which reduces the number of iterations and hence the overall running time. This is especially worthwhile when solving the subproblem once yields several columns at essentially the same cost as yielding a single one (e.g., with labeling algorithms for shortest-path pricing problems).

Which columns to add at each iteration is then worth studying, because different choices lead to very different convergence speeds. On the one hand, we want the added columns to decrease the objective value as much as possible (for a minimization problem); on the other hand, we want to add as few columns as possible, since too many columns make the RMP harder to solve. Therefore, at each iteration we build a model that selects a promising subset of columns to add to the RMP:

  • Let \(\ell\) be the CG iteration number
  • \(\Omega_{\ell}\) the set of columns present in the RMP at the start of iteration \(\ell\)
  • \(\mathcal{G}_{\ell}\) the generated columns at this iteration
  • For each column \(p \in \mathcal{G}_{\ell}\), we define a decision variable \(y_p\) that takes value one if column \(p\)
    is selected and zero otherwise

To keep the number of selected columns small, each selected column incurs a sufficiently small penalty \(\epsilon\). The resulting column selection model (a MILP) is as follows:
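In outline, writing the master constraints abstractly with costs \(c_p\), coefficients \(a_{jp}\), and right-hand sides \(b_j\) (notation assumed here for illustration), the selection MILP takes roughly the following form; the linking constraint (8) assumes, as in the set-partitioning masters studied in the paper, that \(\lambda_p \le 1\) in any RMP solution:

\[
\begin{aligned}
\min\;& \sum_{p \in \Omega_{\ell} \cup \mathcal{G}_{\ell}} c_p \lambda_p \;+\; \epsilon \sum_{p \in \mathcal{G}_{\ell}} y_p \\
\text{s.t.}\;& \sum_{p \in \Omega_{\ell} \cup \mathcal{G}_{\ell}} a_{jp} \lambda_p = b_j \quad \forall j, \\
& \lambda_p \le y_p \quad \forall p \in \mathcal{G}_{\ell}, && (8)\\
& y_p \in \{0,1\} \quad \forall p \in \mathcal{G}_{\ell}, && (9)\\
& \lambda_p \ge 0 \quad \forall p \in \Omega_{\ell} \cup \mathcal{G}_{\ell}.
\end{aligned}
\]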

Notice that without the \(y_p\) variables and constraints (8) and (9), this model is exactly the RMP of the next iteration.

Assuming \(\epsilon\) is small enough, these constraints minimize the number of columns added to the RMP, namely the columns with \(y_p=1\). The columns added to the RMP at iteration \(\ell\) are therefore those \(p \in \mathcal{G}_{\ell}\) with \(y_p = 1\) in the MILP solution; that is, \(\Omega_{\ell+1} = \Omega_{\ell} \cup \{p \in \mathcal{G}_{\ell} : y_p = 1\}\).

The overall workflow is shown in the figure below:

03 Graph Neural Networks

Solving the MILP at each iteration tells us which columns help speed up the algorithm, but the MILP itself takes time to solve, so the net effect may not be an acceleration. A more practical approach is therefore to learn a machine-learning model that imitates the MILP and directly outputs the selected columns at each iteration.

Before that, a brief introduction to graph neural networks (GNNs). GNNs are connectionist models that capture dependencies in a graph through message passing between its nodes. Unlike standard neural networks, a GNN can represent information from a node's neighborhood at arbitrary depth.

Given a graph \(G=(V,E)\), where \(V\) is the vertex set and \(E\) the edge set, each node \(v \in V\) carries a feature vector \(x_v\). The goal is to iteratively aggregate information from neighboring nodes to update each node's state. Let:

  • \(h^{(k)}_v\) be the representation vector of node \(v \in V\) (not to be confused with \(x_v\)) at iteration \(k=0,1,\ldots,K\)
  • Let \(\mathcal{N}(v)\) be the set of neighbor (adjacent) nodes of \(v \in V\)

As illustrated in the figure below, node \(v_1\) aggregates information from its neighbors \(v_2, v_3, v_4\) to update itself:

At iteration \(k > 0\), an aggregation function, denoted \(aggr\), is first applied at each node \(v \in V\) to compute an aggregated information vector \(a^{(k)}_v\):

\[a^{(k)}_v = aggr\left(\left\{\phi^{(k)}\left(h^{(k-1)}_u\right) : u \in \mathcal{N}(v)\right\}\right)\]

where \(h^{(0)}_v = x_v\) initially, and \(\phi^{(k)}\) is a learned function. \(aggr\) must be invariant to the order of the nodes; examples include the sum, mean, and min/max functions.

Next, another function, denoted \(comb\), combines the aggregated information with the node's current state to obtain the updated representation vectors:

\[h^{(k)}_v = \psi^{(k)}\left(comb\left(h^{(k-1)}_v, a^{(k)}_v\right)\right)\]

where \(\psi^{(k)}\) is another learned function. Over successive iterations, each node collects information from increasingly distant neighbors. After the final iteration \(K\), the representation \(h^{(K)}_v\) of node \(v \in V\) can be used to predict its label \(l_v\) through a final transformation function, denoted \(out\):

\[l_v = out\left(h^{(K)}_v\right)\]
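A minimal NumPy sketch of one such message-passing iteration, assuming sum aggregation and concatenation for \(comb\); `phi` and `psi` stand in for the learned functions \(\phi^{(k)}\) and \(\psi^{(k)}\):

```python
import numpy as np

def gnn_iteration(h, neighbors, phi, psi):
    """One message-passing step. h maps each node to its current
    representation vector; neighbors maps each node to its adjacent nodes."""
    h_new = {}
    for v in h:
        # Aggregate the transformed neighbor states (order-invariant sum).
        a_v = np.sum([phi(h[u]) for u in neighbors[v]], axis=0)
        # Combine with the node's own state (concatenation), then update.
        h_new[v] = psi(np.concatenate([h[v], a_v]))
    return h_new
```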

04 A Bipartite Graph for Column Selection

An obvious way to use the GNN above for column selection is to let each node represent a column and connect two columns with an edge whenever they contribute to a common constraint. However, this creates a very large number of edges, and information about the dual values is hard to represent in such a model.

Instead, the authors use a bipartite graph with two node types: column nodes \(V\) and constraint nodes \(C\). An edge \((v, c)\) exists between a node \(v \in V\) and a node \(c \in C\) if column \(v\) contributes to constraint \(c\). The benefit is that feature vectors, such as dual-solution information, can be attached to the constraint nodes \(c\), as shown in panel (a) of the figure below:

Because there are two node types, each iteration consists of two phases: phase 1 updates the constraint nodes \(c \in C\) (panel (b) above), and phase 2 updates the column nodes \(v \in V\) (panel (c) above). Finally, the column-node representations \(h^{(K)}_v, v \in V\), are used to predict the node labels \(y_v \in \{0, 1\}\). The algorithm proceeds as follows:

As described in the previous section, we start by initializing the representation vectors of both the column and constraint nodes by the feature vectors \(x_v\) and \(x_c\), respectively (steps 1 and 2). For each iteration \(k\), we perform the two phases: updating the constraint representations (steps 4 and 5), then the column ones (steps 6 and 7). The sum function is used for the aggr function and the vector concatenation for the comb function.

The functions \(\phi^{(k)}_C, \psi^{(k)}_C, \phi^{(k)}_V\), and \(\psi^{(k)}_V\) are two-layer feedforward neural networks with rectified linear unit (ReLU) activation functions, and \(out\) is a three-layer feedforward neural network with a sigmoid function for producing the final probabilities (step 9).
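A PyTorch-style sketch of one two-phase iteration, assuming sum aggregation via the binary column-constraint incidence matrix and concatenation for \(comb\); the class name, dimensions, and module layout are illustrative assumptions, not the paper's exact implementation:

```python
import torch
import torch.nn as nn

class BipartiteGNNLayer(nn.Module):
    """One iteration: update constraint nodes from column nodes (phase 1),
    then column nodes from constraint nodes (phase 2)."""

    def __init__(self, dim):
        super().__init__()

        def two_layer(d_in):
            return nn.Sequential(nn.Linear(d_in, dim), nn.ReLU(),
                                 nn.Linear(dim, dim), nn.ReLU())

        self.phi_c, self.phi_v = two_layer(dim), two_layer(dim)
        # psi consumes the concatenation [current state, aggregated info].
        self.psi_c, self.psi_v = two_layer(2 * dim), two_layer(2 * dim)

    def forward(self, h_v, h_c, adj):
        # adj: (n_columns, m_constraints) binary incidence matrix.
        # Phase 1: each constraint sums messages from its columns.
        a_c = adj.t() @ self.phi_c(h_v)
        h_c = self.psi_c(torch.cat([h_c, a_c], dim=1))
        # Phase 2: each column sums messages from its constraints.
        a_v = adj @ self.phi_v(h_c)
        h_v = self.psi_v(torch.cat([h_v, a_v], dim=1))
        return h_v, h_c
```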

A weighted binary cross entropy loss is used to evaluate the performance of the model, where the weights are used to deal with the imbalance between the two classes. Indeed, about 90% of the columns belong to the unselected class, that is, their label \(y_v = 0\).
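In PyTorch terms, such a weighted loss can be written as follows; the 9:1 weight ratio simply mirrors the class split mentioned above and is an illustrative choice:

```python
import torch.nn.functional as F

def weighted_bce(probs, labels, w_pos=9.0, w_neg=1.0):
    """probs are the sigmoid outputs; labels are 0/1 floats.
    Per-sample weights compensate for the ~90/10 class imbalance."""
    weights = w_pos * labels + w_neg * (1.0 - labels)
    return F.binary_cross_entropy(probs, labels, weight=weights)
```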

05 Data Collection

The data are collected by solving a number of instances with the MILP described earlier to obtain the column labels. Each CG iteration yields one bipartite graph, and the following information is stored (see the sketch after the list):

  • The sets of column and constraint nodes;
  • A sparse matrix \(E \in \mathbb{R}^{n\times m}\) storing the edges;
  • A column feature matrix \(X^V \in \mathbb{R}^{n\times d}\), where \(n\) is the number of columns and \(d\) the number of column features;
  • A constraint feature matrix \(X^C \in \mathbb{R}^{m\times p}\), where \(m\) is the number of constraints and \(p\) the number of constraint features;
  • The label vector \(y\) of the newly generated columns in \(\mathcal{G}_{\ell}\).
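For concreteness, one way to store such a training sample in Python (field names are illustrative; `scipy.sparse` is a natural choice for the edge matrix \(E\)):

```python
from dataclasses import dataclass

import numpy as np
import scipy.sparse as sp

@dataclass
class CGSample:
    """The bipartite graph recorded at one CG iteration."""
    edges: sp.coo_matrix        # E: (n_columns, m_constraints) incidence
    x_columns: np.ndarray       # X^V: (n, d) column features
    x_constraints: np.ndarray   # X^C: (m, p) constraint features
    labels: np.ndarray          # y: 1 if the MILP selected the column

def build_edges(coefficients):
    """Link column v to constraint c whenever the column's coefficient
    in that constraint is nonzero, i.e., the column contributes to it."""
    return sp.coo_matrix(coefficients != 0, dtype=np.int8)
```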

06 Case Study I: Vehicle and Crew Scheduling Problem

The definition of this problem is omitted here; readers can look it up on their own.

6.1 MILP Performance

The convergence curves of CG with and without the MILP (in the latter case, all generated columns with a negative reduced cost are added to the next RMP) are plotted below:

Using the MILP actually slows down convergence. This is mainly due to the rejected columns, which still have a negative reduced cost after the RMP reoptimization and keep being generated in subsequent iterations even though they do not improve the objective value (degeneracy).

To address this, a workable remedy is to add some extra columns after running the MILP. The MILP-selected columns are first added to the RMP, which is then reoptimized to obtain new duals; among the unselected columns, those whose reduced cost is still negative under the new duals become candidates for addition. If there are many of them, not all are added: they are sorted by reduced cost and only a fraction is kept (50% in this paper), as illustrated in the figure below and sketched in code right after:
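A small sketch of this post-selection step; `reduced_cost` is a hypothetical helper that reprices a column under the new duals, and `fraction=0.5` mirrors the 50% used in the paper:

```python
def extra_columns(unselected, new_duals, reduced_cost, fraction=0.5):
    """After reoptimizing the RMP with the MILP-selected columns, add a
    share of the rejected columns that still price out negative."""
    still_negative = [p for p in unselected
                      if reduced_cost(p, new_duals) < 0]
    # Most promising first: sort by reduced cost in ascending order.
    still_negative.sort(key=lambda p: reduced_cost(p, new_duals))
    keep = int(len(still_negative) * fraction)
    return still_negative[:keep]
```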

With these extra columns added, preliminary tests give the results below. (The computational time of the algorithm with column selection does not include the time spent solving the MILP at every iteration: the authors only want to measure the effect of the selection on column generation, and the MILP is ultimately to be replaced by the much faster GNN model anyway.)

As can be seen, the MILP selection saves a substantial amount of computing time, reducing the total time by about 34%.

6.2 Comparison

The following strategies are then compared:

  • No selection (NO-S): This is the standard CG algorithm with no selection involved, with the use of the acceleration strategies described in Section 2.
  • MILP selection (MILP-S): The MILP is used to select the columns at each iteration, with 50% additional columns to avoid convergence issues. Because the MILP is considered to be the expert we want to learn from and we are looking to replace it with a fast approximation, the total computational time does not include the time spent solving the MILP.
  • GNN selection (GNN-S): The learned model is used to select the columns. At every CG iteration, the column features are extracted, the predictions are obtained, and the selected columns are added to the RMP.
  • Sorting selection (Sort-S): The generated columns are sorted by reduced cost in ascending order, and a subset of the columns with the lowest reduced cost are selected. The number of columns selected is on average the same as with the GNN selection.
  • Random selection (Rand-S): A subset of the columns is selected randomly. The number of columns selected is on average the same as with the GNN selection.

The comparison results are shown below; the time-reduction column compares the GNN-S to the NO-S algorithm. On average, GNN-S reduces the computing time by 26%.

07 Case Study II: Vehicle Routing Problem with Time Windows

The VRPTW needs no further introduction here; the comparison results are as follows:

The last column corresponds to the time reduction when comparing GNN-S with NO-S. One can see that the column selection with the GNN model gives positive results, yielding average reductions ranging from 20% to 29%. These reductions could have been larger if the number of CG iterations performed had not increased.

References

  • [1] Mouad Morabit, Guy Desaulniers, Andrea Lodi (2021). Machine-Learning–Based Column Selection for Column Generation. Transportation Science.