1. 程式人生 > >ROS機器人Diego 1#整合Tensorflow MNIST,玩數字識別

ROS機器人Diego 1#整合Tensorflow MNIST,玩數字識別

機器學習中最經典的例子就是MNIST通過圖片來識別0~9的數字,這篇文章將介紹如何將基於Tensorflow MNIST整合到Diego1#機器人中作為一個節點,此節點將訂閱Image訊息,通過MNIST識別後將結果釋出訊息給訊飛語音節點,訊飛語音節點會告訴我們識別的數字是幾。

相關原始碼已經上傳到本人的github。

1. 安裝Tensorflow

只需一句命令即可安裝

pip install tensorflow

官方有4中安裝方法,在這裡選擇直接安裝的方式

2. 建立diego_tensorflow_mnist 包

catkin_create_pkg diego_tensorflow_mnist std_msgs rospy roscpp cv_bridge

這裡寫圖片描述

在diego_tensorflow_mnist目錄下建立scripts和launch目錄
這裡寫圖片描述
scripts目錄用於存放python的原始碼
launch目錄用於存放ROS launch檔案

下載相關程式碼到scripts目錄
這裡寫圖片描述

3.ROS節點

有關nnist的演算法都已經寫好,我們只需要呼叫其中的功能封裝成ROS節點即可,有關封裝的程式碼請見tensorflow_in_ros_mnist.py

#!/usr/bin/env python

import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import
Int16 from std_msgs.msg import String from cv_bridge import CvBridge import cv2 import numpy as np import tensorflow as tf def weight_variable(shape): initial = tf.truncated_normal(shape, stddev=0.1) return tf.Variable(initial) def bias_variable(shape): initial = tf.constant(0.1, shape=shape) return
tf.Variable(initial) def conv2d(x, W): return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') def max_pool_2x2(x): return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') def makeCNN(x,keep_prob): # --- define CNN model W_conv1 = weight_variable([5, 5, 1, 32]) b_conv1 = bias_variable([32]) h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1) h_pool1 = max_pool_2x2(h_conv1) W_conv2 = weight_variable([3, 3, 32, 64]) b_conv2 = bias_variable([64]) h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2) h_pool2 = max_pool_2x2(h_conv2) W_fc1 = weight_variable([7 * 7 * 64, 1024]) b_fc1 = bias_variable([1024]) h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64]) h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1) h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) W_fc2 = weight_variable([1024, 10]) b_fc2 = bias_variable([10]) y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2) return y_conv class RosTensorFlow(): def __init__(self): rospy.init_node('rostensorflow') # Set the shutdown function (stop the robot) rospy.on_shutdown(self.shutdown) model_path = rospy.get_param("~model_path", "") image_topic = rospy.get_param("~image_topic", "") self._cv_bridge = CvBridge() self.x = tf.placeholder(tf.float32, [None,28,28,1], name="x") self.keep_prob = tf.placeholder("float") self.y_conv = makeCNN(self.x,self.keep_prob) self._saver = tf.train.Saver() self._session = tf.InteractiveSession() init_op = tf.global_variables_initializer() self._session.run(init_op) self._saver.restore(self._session, model_path+"/model.ckpt") self._sub = rospy.Subscriber(image_topic, Image, self.callback, queue_size=1) #self._pub = rospy.Publisher('result', Int16, queue_size=1) self._pub = rospy.Publisher('xfwords', String, queue_size=1) def callback(self, image_msg): cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8") cv_image_gray = cv2.cvtColor(cv_image, cv2.COLOR_RGB2GRAY) ret,cv_image_binary = cv2.threshold(cv_image_gray,128,255,cv2.THRESH_BINARY_INV) cv_image_28 = cv2.resize(cv_image_binary,(28,28)) np_image = np.reshape(cv_image_28,(1,28,28,1)) predict_num = self._session.run(self.y_conv, feed_dict={self.x:np_image,self.keep_prob:1.0}) answer = np.argmax(predict_num,1) rospy.loginfo('%d' % answer) self._pub.publish(str(answer)) rospy.sleep(3) def shutdown(self): rospy.loginfo("Stopping the tensorflow nnist...") rospy.sleep(1) if __name__ == '__main__': try: RosTensorFlow() rospy.spin() except rospy.ROSInterruptException: rospy.loginfo("RosTensorFlow has started.")

有關MNIST具體演算法實現部分網上有很多教程,這裡只說明與ROS整合部分

class RosTensorFlow():
    def __init__(self):

    rospy.init_node('rostensorflow')

    # Set the shutdown function (stop the robot)
        rospy.on_shutdown(self.shutdown)

        model_path = rospy.get_param("~model_path", "")
        image_topic = rospy.get_param("~image_topic", "")

在RosTensorFlow類的開始部分,是標準的節點定義方法,model_path變數用於獲取launch檔案中定義的model的路徑,image_topic變數用於獲取launch檔案中定義image主題

        self._sub = rospy.Subscriber(image_topic, Image, self.callback, queue_size=1)

以上這段程式碼是讓該節點訂閱image topic,並且知道回撥函式

        self._pub = rospy.Publisher('xfwords', String, queue_size=1)

以上這段程式碼定義將釋出訊飛語音主題,xfwords

    def callback(self, image_msg):
        cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8")
        cv_image_gray = cv2.cvtColor(cv_image, cv2.COLOR_RGB2GRAY)
        ret,cv_image_binary = cv2.threshold(cv_image_gray,128,255,cv2.THRESH_BINARY_INV)
        cv_image_28 = cv2.resize(cv_image_binary,(28,28))
        np_image = np.reshape(cv_image_28,(1,28,28,1))
        predict_num = self._session.run(self.y_conv, feed_dict={self.x:np_image,self.keep_prob:1.0})
        answer = np.argmax(predict_num,1)
        rospy.loginfo('%d' % answer)
        self._pub.publish(str(answer))
        rospy.sleep(3) 

主要的處理都在callback回撥函式中,首先將從image主題中經過一系列的處理轉換成numpy陣列,然後呼叫tensorflow進行識別,將可能的結果過放在predict_num陣列中,取其中最有可能的值,就是結果,將結果作為訊飛語音topic傳送出去

4.launch檔案

在launch資料夾下建立一個名為nnist.launch的檔案,檔案內容如下:

<launch>

    <node pkg="diego_tensorflow_nnist" name="tensorflow_in_ros_mnist" type="tensorflow_in_ros_mnist.py" output="screen">
        <param name="image_topic" value="/usb_cam/image_raw" />
        <param name="model_path" value="$(find diego_tensorflow_nnist)/scripts/model" />
    </node> 
</launch>

相關的主題,和路徑可以在這裡修改

5.啟動節點

roscore
rosrun xfei_asr tts_subscribe_speak
roslaunch usb_cam usb_cam-test.launch
roslaunch diego_tensorflow_nnist nest.launch

啟動後我們就可以在紙上面寫幾個數字,放在攝像頭前,diego1#會告訴你數字是多少。