A Survey of Action Recognition Datasets

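Before deep learning, the best-performing algorithm was iDT (improved Dense Trajectories)^{[1][2]}, and most later work in that period built on it. iDT uses the optical flow field to obtain trajectories through the video sequence, then extracts four kinds of features along each trajectory: HOF, HOG, MBH, and the trajectory itself. Of these, HOG is computed from the grayscale image, while the others are based on dense optical flow. The features are then encoded with FV (Fisher Vector), and an SVM classifier is trained on the encoded features. After deep learning arrived, several families of approaches were proposed for the problem, including Two-Stream^{[3][4]}, C3D (3D convolution)^{[6]}, and RNN-based methods^{[7]}.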
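To make the "trajectory" feature more concrete, here is a minimal sketch of a trajectory shape descriptor, assuming the normalized-displacement form commonly described for dense trajectories (the displacement vectors along a track, divided by the sum of their magnitudes). The function name `trajectory_shape` and the sample points are hypothetical, and the optical flow tracking, HOG/HOF/MBH extraction, Fisher Vector encoding, and SVM stages are not shown.

```python
import math


def trajectory_shape(points):
    """Return a normalized displacement descriptor for one tracked point.

    points: list of (x, y) positions of the same point in consecutive frames.
    The result is a flat list with 2 * (len(points) - 1) values.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to form a trajectory")

    # Frame-to-frame displacement vectors along the trajectory.
    deltas = [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

    # Sum of displacement magnitudes, used as the normalization factor.
    total = sum(math.hypot(dx, dy) for dx, dy in deltas)
    if total == 0:
        # A static point carries no motion; return an all-zero descriptor.
        return [0.0] * (2 * len(deltas))

    # Flatten the normalized displacements as (dx0, dy0, dx1, dy1, ...).
    return [component / total for delta in deltas for component in delta]


# Hypothetical track of one point over three frames.
descriptor = trajectory_shape([(0, 0), (3, 4), (6, 8)])
```

Normalizing by the total path length makes the descriptor depend on the shape of the motion rather than on its absolute speed, which is one plausible reason for this design.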

ActivityNet

A Large-Scale Video Benchmark for Human Activity Understanding
Dataset introduction link

Categories
  • Eating and Drinking
  • Food and drink preparation
  • Kitchen and food clean-up
  • Participating in sports, exercises and recreation
  • Participation in equestrian sports
  • Socializing, relaxing and leisure
  • Personal care (brushing teeth, ...)
  • Household activities
  • Vehicle repair and maintenance
    ……

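2017 results
An untrimmed video has not been cut and contains segments of several different actions. A trimmed video has been cut so that it contains a single action, and the task is to classify it. Temporal action proposals locate the time intervals in which an action occurs.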
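As an illustration of what it means to locate the time interval of an action, the sketch below compares a proposed segment with an annotated one using temporal intersection-over-union (tIoU), a common way to score such proposals. The function name `temporal_iou` and the segment values are hypothetical and are not taken from the 2017 results.

```python
def temporal_iou(segment_a, segment_b):
    """Return the temporal IoU of two (start, end) segments, in seconds."""
    start_a, end_a = segment_a
    start_b, end_b = segment_b

    # Length of the overlap between the two segments (zero if disjoint).
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))

    # Total time covered by either segment.
    union = (end_a - start_a) + (end_b - start_b) - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical proposal and annotation inside one untrimmed video.
proposal = (12.0, 20.0)
ground_truth = (10.0, 18.0)
score = temporal_iou(proposal, ground_truth)
```

A proposal is usually counted as correct when its tIoU with some annotated segment exceeds a chosen threshold; the threshold itself depends on the benchmark.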

UCF101 Action Recognition Data Set


HMDB51

brush hair, cartwheel, catch, chew, clap, climb, climb stairs, dive, draw sword, dribble, drink, eat, fall floor, fencing, flic flac, golf, handstand, hit, hug, jump, kick, stand, kick ball, kiss, laugh, pick, pour, pullup, punch, push, pushup, ride bike, ride horse, run, shake hands, shoot ball, shoot bow, shoot gun, sit, situp, smile, smoke, somersault, swing baseball, sword exercise, sword, talk, throw, turn, walk, wave.