Globally Setting the Visible GPU Index in TensorFlow

阿新 • Published: 2020-07-01

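I needed TensorFlow to run on exactly one GPU of a multi-GPU machine, with that GPU chosen dynamically from system parameters, so I could not simply rely on a CUDA_VISIBLE_DEVICES setting.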

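One option is to wrap everything in the scope created by tf.device, but the device number has to be fixed before the graph is built, which is still not flexible enough.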

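The documentation shows that visible_device_list, a field of GPUOptions in the session config, defines the mapping from visible GPU IDs to virtual devices. In other words, it determines which GPU devices TensorFlow can see, and therefore sets the visible GPU index globally. The code is as follows: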

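import tensorflow as tf

# device_num: index of the GPU to expose, chosen at run time (TensorFlow 1.x API)
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(device_num)
sess = tf.Session(config=config)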
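As a rough illustration of how the device number might be chosen dynamically, the sketch below reads it from a command-line flag and passes it to visible_device_list. The --gpu flag and the helper names parse_gpu_index and make_session are assumptions made for this example, not part of the original code. tf.compat.v1 is used so that the same calls are also available under TensorFlow 2.x.

```python
import argparse


def parse_gpu_index(argv):
    """Read the GPU index from a hypothetical --gpu flag (defaults to 0)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", type=int, default=0)
    args = parser.parse_args(argv)
    if args.gpu < 0:
        raise ValueError("GPU index must be non-negative")
    return args.gpu


def make_session(device_num):
    """Create a session that only sees the GPU with the given index."""
    # Deferred import so the argument parsing can be used without TensorFlow.
    import tensorflow as tf

    config = tf.compat.v1.ConfigProto()
    # visible_device_list is a comma-separated string of GPU IDs.
    config.gpu_options.visible_device_list = str(device_num)
    return tf.compat.v1.Session(config=config)
```

A script could then call make_session(parse_gpu_index(sys.argv[1:])) once at start-up and reuse the returned session.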

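Following the article "Hiding some GPUs on a multi-GPU server and configuring TensorFlow's GPU memory usage", you can also set the environment variable CUDA_VISIBLE_DEVICES through the os module, as follows: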

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"  # set before TensorFlow initializes its GPUs
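Because the CUDA runtime reads CUDA_VISIBLE_DEVICES when it is initialized, the variable has to be set before TensorFlow first touches the GPU. A minimal sketch of that ordering, assuming a hypothetical helper restrict_cuda_devices that is not part of the original article:

```python
import os


def restrict_cuda_devices(indices):
    """Expose only the given physical GPU indices to CUDA in this process."""
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this before importing TensorFlow (or at least before any GPU is used).
restrict_cuda_devices([2])

# import tensorflow as tf   # physical GPU 2 would now appear as "/GPU:0"
```

Note that the remaining devices are renumbered from zero, so code running afterwards refers to the single exposed GPU as device 0.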

Supplementary notes: setting the GPUs visible to a program and logical partitions in TensorFlow

Setting the GPUs visible to the program in TensorFlow (multi-GPU machines)

import matplotlib as mpl
import matplotlib.pyplot as plt
# IPython/Jupyter magic; remove the next line when running as a plain script
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf

from tensorflow import keras

print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

# Log the device each operation is placed on
tf.debugging.set_log_device_placement(True)

# List the physical GPUs
gpus = tf.config.experimental.list_physical_devices("GPU")

if len(gpus) >= 1:
    # Make only the first GPU visible to this program
    tf.config.experimental.set_visible_devices(gpus[0], "GPU")

print("Number of physical GPUs:", len(gpus))

# List the logical GPUs
logical_gpus = tf.config.experimental.list_logical_devices("GPU")
print("Number of logical GPUs:", len(logical_gpus))
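To pick the visible GPU from a run-time value rather than hard-coding gpus[0], the index can be validated first. This is only a sketch: select_device and use_single_gpu are hypothetical helper names, and the TensorFlow import is deferred so that the validation logic can be exercised on its own.

```python
def select_device(devices, index):
    """Return devices[index], raising a clear error when the index is out of range."""
    if not 0 <= index < len(devices):
        raise IndexError(
            "GPU index %d out of range; %d GPU(s) found" % (index, len(devices)))
    return devices[index]


def use_single_gpu(index):
    """Make only the physical GPU at `index` visible to TensorFlow 2.x."""
    import tensorflow as tf  # deferred import (assumes TensorFlow 2.x is installed)

    gpus = tf.config.experimental.list_physical_devices("GPU")
    chosen = select_device(gpus, index)
    # Must run before TensorFlow has initialized its GPUs.
    tf.config.experimental.set_visible_devices(chosen, "GPU")
    return chosen
```

Rejecting a bad index early gives a clearer message than the bare IndexError from gpus[index], and it also rejects negative values, which would otherwise silently select a GPU counted from the end of the list.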

Setting logical GPU partitions in TensorFlow

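import matplotlib as mpl
import matplotlib.pyplot as plt
# IPython/Jupyter magic; remove the next line when running as a plain script
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf

from tensorflow import keras

print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

# Log the device each operation is placed on
tf.debugging.set_log_device_placement(True)

# List the physical GPUs
gpus = tf.config.experimental.list_physical_devices("GPU")

if len(gpus) >= 1:
    # Make only the first GPU visible to this program
    tf.config.experimental.set_visible_devices(gpus[0], "GPU")

    # Split that GPU into two logical partitions of 3072 MB each
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=3072),
         tf.config.experimental.VirtualDeviceConfiguration(memory_limit=3072)])

print("Number of physical GPUs:", len(gpus))

# List the logical GPUs
logical_gpus = tf.config.experimental.list_logical_devices("GPU")
print("Number of logical GPUs:", len(logical_gpus))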
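The two 3072 MB partitions are written out by hand in the listing above. As an illustrative sketch, the list of memory limits could be generated instead. equal_memory_limits and partition_first_gpu are hypothetical helpers, and a 6144 MB budget is an assumed figure chosen to match two 3072 MB partitions.

```python
def equal_memory_limits(total_mb, parts):
    """Split a memory budget (in MB) into `parts` equal limits, rounding down."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    return [total_mb // parts] * parts


def partition_first_gpu(total_mb, parts):
    """Split the first physical GPU into `parts` logical GPUs (TensorFlow 2.x)."""
    import tensorflow as tf  # deferred import (assumes TensorFlow 2.x is installed)

    gpus = tf.config.experimental.list_physical_devices("GPU")
    if not gpus:
        return []
    configs = [
        tf.config.experimental.VirtualDeviceConfiguration(memory_limit=limit)
        for limit in equal_memory_limits(total_mb, parts)
    ]
    # Must be called before the GPU has been initialized.
    tf.config.experimental.set_virtual_device_configuration(gpus[0], configs)
    return tf.config.experimental.list_logical_devices("GPU")
```

Under these assumptions, partition_first_gpu(6144, 2) would request the same two 3072 MB logical devices as the hand-written version.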

Manually assigning operations to GPUs in TensorFlow

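import matplotlib as mpl
import matplotlib.pyplot as plt
# IPython/Jupyter magic; remove the next line when running as a plain script
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf

from tensorflow import keras

print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

# Log the device each operation is placed on
tf.debugging.set_log_device_placement(True)

# Let TensorFlow pick another device automatically if the requested one is unavailable
tf.config.set_soft_device_placement(True)

# List the physical GPUs
gpus = tf.config.experimental.list_physical_devices("GPU")
for gpu in gpus:
    # Allocate GPU memory on demand instead of reserving it all up front
    tf.config.experimental.set_memory_growth(gpu, True)
print("Number of physical GPUs:", len(gpus))

# List the logical GPUs
logical_gpus = tf.config.experimental.list_logical_devices("GPU")
print("Number of logical GPUs:", len(logical_gpus))

c = []

# Iterate over the current logical GPUs
for gpu in logical_gpus:
    print(gpu.name)

    # Manually place the following operations on this GPU
    with tf.device(gpu.name):
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

        # Multiply the matrices and append the result to the list
        c.append(tf.matmul(a, b))

# Manually place the final sum on the first GPU
with tf.device("/GPU:0"):
    matmul_sum = tf.add_n(c)

print(matmul_sum)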
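As a sanity check on the printed result, the product can be worked out by hand: each logical GPU computes the same 2×2 matrix, so the sum is that matrix multiplied by the number of logical GPUs. The plain-Python sketch below reproduces the arithmetic without TensorFlow. The matmul helper is written for this example, and the choice of two logical GPUs is an assumption for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

product = matmul(a, b)
print(product)  # [[22.0, 28.0], [49.0, 64.0]]

num_logical_gpus = 2  # assumed value for this illustration
total = [[num_logical_gpus * v for v in row] for row in product]
print(total)  # [[44.0, 56.0], [98.0, 128.0]]
```

With a single logical GPU the expected tensor is simply the product itself.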
