
Link prediction with graph neural networks

gcn for prediction of protein interactions

Project repository: https://github.com/jiangnanboy/gcn_for_prediction_of_protein_interactions

Link prediction of protein interactions using various graph neural networks.

Guide

Intro

At present, the project mainly implements link prediction on the protein data in data/yeast/yeast.edgelist.

Model

The models are mainly graph neural networks, such as GAE and VGAE.
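
As a rough illustration of what a graph autoencoder (GAE) computes, the sketch below runs a single linear GCN layer as the encoder and an inner-product decoder in NumPy. It is a simplification under stated assumptions, not the project's code: the weights are random rather than trained, there is only one layer, and the names `normalize_adj` and `gae_forward` are made up for this example. A VGAE would additionally sample the embeddings from a learned Gaussian, which is omitted here.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetrically normalize A + I as D^-1/2 (A + I) D^-1/2."""
    adj_self = adj + np.eye(adj.shape[0])       # add self-loops
    deg = adj_self.sum(axis=1)                  # degrees are >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj_self * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gae_forward(adj, features, weight):
    """Encode nodes with one linear GCN layer, decode edges with an inner product."""
    z = normalize_adj(adj) @ features @ weight  # node embeddings
    logits = z @ z.T                            # inner-product decoder
    return 1.0 / (1.0 + np.exp(-logits))        # sigmoid gives edge probabilities

# Toy graph: a path 0 - 1 - 2 (hypothetical data, not the yeast network)
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.eye(3)                            # identity features for a featureless graph
rng = np.random.default_rng(0)
weight = rng.normal(scale=0.1, size=(3, 2))     # untrained, illustrative weights
adj_prob = gae_forward(adj, features, weight)
print(adj_prob.shape)
```

Because the decoder is an inner product of the embeddings with themselves, the reconstructed matrix is always symmetric, which suits an undirected interaction network.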

Usage

  • The configuration parameters for each model are in the config.cfg file in that model's folder; this file is loaded during training and prediction.

  • Training and prediction

    1. GCNModelVAE (src/vgae)

    (1) Training

    from src.vgae.train import Train
    train = Train()
    train.train_model('config.cfg')
    
      Epoch: 0001 train_loss =  0.73368 val_roc_score =  0.77485 average_precision_score =  0.69364 time= 0.79382
      Epoch: 0002 train_loss =  0.73334 val_roc_score =  0.80637 average_precision_score =  0.74248 time= 0.78920
      Epoch: 0003 train_loss =  0.73341 val_roc_score =  0.85901 average_precision_score =  0.84317 time= 0.78759
      Epoch: 0004 train_loss =  0.73353 val_roc_score =  0.86936 average_precision_score =  0.85909 time= 0.78880
      Epoch: 0005 train_loss =  0.73334 val_roc_score =  0.86945 average_precision_score =  0.86092 time= 0.78438
      Epoch: 0006 train_loss =  0.73353 val_roc_score =  0.87117 average_precision_score =  0.86205 time= 0.78761
      Epoch: 0007 train_loss =  0.73352 val_roc_score =  0.87235 average_precision_score =  0.86407 time= 0.78210
      Epoch: 0008 train_loss =  0.73338 val_roc_score =  0.87317 average_precision_score =  0.86462 time= 0.78477
      Epoch: 0009 train_loss =  0.73341 val_roc_score =  0.87462 average_precision_score =  0.86755 time= 0.78378
      Epoch: 0010 train_loss =  0.73348 val_roc_score =  0.87606 average_precision_score =  0.86853 time= 0.78587
      Epoch: 0011 train_loss =  0.73344 val_roc_score =  0.87686 average_precision_score =  0.86923 time= 0.78406
      Epoch: 0012 train_loss =  0.73331 val_roc_score =  0.87665 average_precision_score =  0.86880 time= 0.78253
      Epoch: 0013 train_loss =  0.73357 val_roc_score =  0.87426 average_precision_score =  0.86521 time= 0.78202
      Epoch: 0014 train_loss =  0.73327 val_roc_score =  0.87218 average_precision_score =  0.86192 time= 0.78299
      Epoch: 0015 train_loss =  0.73336 val_roc_score =  0.87118 average_precision_score =  0.85946 time= 0.78166
      Epoch: 0016 train_loss =  0.73336 val_roc_score =  0.86960 average_precision_score =  0.85835 time= 0.78792
      Epoch: 0017 train_loss =  0.73355 val_roc_score =  0.87126 average_precision_score =  0.85940 time= 0.78401
      Epoch: 0018 train_loss =  0.73357 val_roc_score =  0.87050 average_precision_score =  0.85648 time= 0.78511
      Epoch: 0019 train_loss =  0.73332 val_roc_score =  0.86737 average_precision_score =  0.84906 time= 0.78132
      Epoch: 0020 train_loss =  0.73345 val_roc_score =  0.86632 average_precision_score =  0.84532 time= 0.78603
      
      test roc score: 0.863696753293295
      test ap score: 0.8381410617542567
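
The ROC and average-precision values in the log are ranking metrics. The sketch below shows how such scores can be computed, assuming the usual link-prediction setup in which held-out edges are the positives and sampled non-edges are the negatives; the source does not show how the project computes them. The function names and toy scores are illustrative, and ties get no special treatment in the average-precision function.

```python
def roc_auc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def average_precision(pos_scores, neg_scores):
    """Mean precision measured at the rank of each positive, highest score first."""
    ranked = sorted([(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores],
                    key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank      # precision at this positive's rank
    return total / len(pos_scores)

# Hypothetical decoder scores for three held-out edges and three non-edges
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
print(roc_auc(pos, neg), average_precision(pos, neg))
```

The pairwise loop is quadratic and only meant to make the definition visible; for real evaluation sets a library routine such as scikit-learn's `roc_auc_score` and `average_precision_score` is the practical choice.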
    

    (2) Prediction

    from src.vgae.predict import Predict
    
    predict = Predict()
    predict.load_model_adj('config.cfg')
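    # Returns the original graph adjacency matrix and the adjacency matrix reconstructed by
    # inner-product decoding of the model's hidden embeddings; compare the two to obtain link predictions.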
    adj_orig, adj_rec = predict.predict()
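
One possible way to compare the two matrices is to rank the node pairs that have no edge in the original graph by their reconstructed score. The sketch below assumes both matrices are dense, square, and NumPy-compatible, and that a larger value in the reconstructed matrix means a more likely edge; none of this is confirmed by the source. If the project returns a SciPy sparse matrix, convert it with `.toarray()` first. The function `candidate_links`, the `top_k` parameter, and the demo matrices are hypothetical.

```python
import numpy as np

def candidate_links(adj_orig, adj_rec, top_k=5):
    """Return the top_k node pairs absent from adj_orig with the highest score in adj_rec."""
    adj_orig = np.asarray(adj_orig)
    adj_rec = np.asarray(adj_rec, dtype=float)
    rows, cols = np.triu_indices(adj_orig.shape[0], k=1)  # each undirected pair once
    missing = adj_orig[rows, cols] == 0                     # pairs with no known edge
    rows, cols = rows[missing], cols[missing]
    scores = adj_rec[rows, cols]
    order = np.argsort(-scores)[:top_k]                     # highest scores first
    return [(int(rows[i]), int(cols[i]), float(scores[i])) for i in order]

# Toy stand-ins for the two matrices returned by predict.predict()
adj_orig_demo = np.array([[0, 1, 0, 0],
                          [1, 0, 1, 0],
                          [0, 1, 0, 0],
                          [0, 0, 0, 0]])
adj_rec_demo = np.array([[0.9, 0.8, 0.7, 0.1],
                         [0.8, 0.9, 0.9, 0.2],
                         [0.7, 0.9, 0.9, 0.3],
                         [0.1, 0.2, 0.3, 0.9]])
for i, j, score in candidate_links(adj_orig_demo, adj_rec_demo, top_k=2):
    print(i, j, score)
```

Ranking by score avoids committing to a fixed threshold, which would otherwise have to be tuned on validation edges.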
    
    2. GCNModelARGA (src/arga)

    (1) Training

    from src.arga.train import Train
    train = Train()
    train.train_model('config.cfg')
    
      Epoch: 0001 train_loss =  2.17176 val_roc_score =  0.77090 average_precision_score =  0.69050 time= 0.81113
      Epoch: 0002 train_loss =  2.16173 val_roc_score =  0.84636 average_precision_score =  0.81340 time= 0.81458
      Epoch: 0003 train_loss =  2.14979 val_roc_score =  0.87660 average_precision_score =  0.86472 time= 0.80898
      Epoch: 0004 train_loss =  2.13698 val_roc_score =  0.87735 average_precision_score =  0.86534 time= 0.80995
      Epoch: 0005 train_loss =  2.12339 val_roc_score =  0.87765 average_precision_score =  0.86592 time= 0.80865
      Epoch: 0006 train_loss =  2.10753 val_roc_score =  0.87756 average_precision_score =  0.86571 time= 0.80748
      Epoch: 0007 train_loss =  2.08996 val_roc_score =  0.87806 average_precision_score =  0.86621 time= 0.80738
      Epoch: 0008 train_loss =  2.06920 val_roc_score =  0.87801 average_precision_score =  0.86623 time= 0.80744
      Epoch: 0009 train_loss =  2.04701 val_roc_score =  0.87795 average_precision_score =  0.86618 time= 0.80932
      Epoch: 0010 train_loss =  2.02241 val_roc_score =  0.87830 average_precision_score =  0.86643 time= 0.80722
      Epoch: 0011 train_loss =  1.99754 val_roc_score =  0.87807 average_precision_score =  0.86620 time= 0.80533
      Epoch: 0012 train_loss =  1.97255 val_roc_score =  0.87749 average_precision_score =  0.86586 time= 0.80859
      Epoch: 0013 train_loss =  1.94664 val_roc_score =  0.87607 average_precision_score =  0.86483 time= 0.80660
      Epoch: 0014 train_loss =  1.92208 val_roc_score =  0.87408 average_precision_score =  0.86320 time= 0.80300
      Epoch: 0015 train_loss =  1.89869 val_roc_score =  0.87290 average_precision_score =  0.86218 time= 0.80400
      Epoch: 0016 train_loss =  1.87584 val_roc_score =  0.87244 average_precision_score =  0.86186 time= 0.80392
      Epoch: 0017 train_loss =  1.85415 val_roc_score =  0.87554 average_precision_score =  0.86400 time= 0.80675
      Epoch: 0018 train_loss =  1.83373 val_roc_score =  0.87653 average_precision_score =  0.86473 time= 0.80762
      Epoch: 0019 train_loss =  1.81515 val_roc_score =  0.87718 average_precision_score =  0.86532 time= 0.80596
      Epoch: 0020 train_loss =  1.79975 val_roc_score =  0.87745 average_precision_score =  0.86551 time= 0.80889
      
      test roc score: 0.8797451083479302
      test ap score: 0.8681038618348471
    

    (2) Prediction

    from src.arga.predict import Predict
    
    predict = Predict()
    predict.load_model_adj('config.cfg')
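    # Returns the original graph adjacency matrix and the adjacency matrix reconstructed by
    # inner-product decoding of the model's hidden embeddings; compare the two to obtain link predictions.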
    adj_orig, adj_rec = predict.predict()
    

Dataset

The data is the yeast protein-protein interaction network. The dataset format is shown below; see the data directory for details.

 YLR418C	YOL145C
 YOL145C	YLR418C
 YLR418C	YOR123C
 YOR123C	YLR418C
 ......         ......
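
An edge list in this format can be read with a few lines of standard-library Python. The sketch below is one way to do it, not the project's loader: the function name `load_edgelist` is made up, each undirected edge is stored once even though the file lists both directions, and self-loops are dropped as a simplifying assumption.

```python
def load_edgelist(lines):
    """Parse whitespace-separated node pairs into an index map and a set of undirected edges."""
    node_index = {}
    edges = set()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue                              # skip blank or malformed lines
        # Assign each new protein name the next free integer id
        u, v = (node_index.setdefault(name, len(node_index)) for name in parts)
        if u != v:                                # drop self-loops (an assumption)
            edges.add((min(u, v), max(u, v)))     # store each undirected edge once
    return node_index, edges

# The four sample rows shown above
sample = [
    "YLR418C\tYOL145C",
    "YOL145C\tYLR418C",
    "YLR418C\tYOR123C",
    "YOR123C\tYLR418C",
]
node_index, edges = load_edgelist(sample)
print(len(node_index), "nodes,", len(edges), "edges")
```

To read the real file, pass an open file object instead of the list, for example `with open("data/yeast/yeast.edgelist") as f: node_index, edges = load_edgelist(f)`.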