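Type: Basic research    Pre-defense date: 2018-01-25
Start (proposal) date: 2013-11-27    Thesis completion date: 2017-12-04
Venue: Conference Room, Key Laboratory of the Ministry of Education, 2nd Floor, Central Building    Thesis topic source: Not a funded project    Thesis length: 6.1 (×10,000 characters)
Title: Research on Motion Features in Video-Based Action Recognition
Keywords: action recognition, motion features, feature learning, semi-supervised learning, auto-encoder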
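Abstract: Video-based action recognition has broad application prospects in areas such as information retrieval and security surveillance, and is one of the active research topics in computer vision and pattern recognition. Research on action recognition has made important progress in recent years. Traditional action feature methods extract interest points from videos or images and then design descriptors, such as histograms of gradients, to represent local action features. Although these methods are simple and effective, they can no longer meet the requirements of extracting and describing actions in videos that contain multiple targets and complex backgrounds. With the successful application of feature learning algorithms in image recognition, action feature learning automatically learns semantically meaningful feature descriptions from data through unsupervised or supervised learning, and has achieved some success in action recognition. In general, how to generate action features with stronger discriminative and representational power has become one of the core research problems. This thesis addresses the design and learning of action features. Combining unsupervised learning, semi-supervised learning, and manifold learning, and considering how to extract more effective motion information from video and describe action content more completely, it studies feature learning algorithms based on motion information and uses motion features to improve the accuracy of action recognition. The main work is summarized as follows. (1) For the design and use of motion features, an action recognition algorithm based on fast Haar3D motion features is proposed. The algorithm designs a family of three-dimensional Haar wavelet features to quickly detect and represent motion regions in video, then obtains a rich action description through over-complete spatio-temporal feature pooling, and finally uses an online feature selection algorithm to select the optimal feature dimensions as the classification model. The algorithm achieves good results in computational efficiency and recognition accuracy, improving on classical action recognition methods. Local features are an important approach in action recognition; such methods represent actions through feature point detection, feature descriptor design, and spatio-temporal pyramid modeling. To improve the computational efficiency of motion features, this thesis extends two-dimensional Haar wavelets to three dimensions to form Haar3D motion features, and extracts the components of the target's moving parts in each direction as local features. To improve the descriptive power of the traditional spatio-temporal pyramid model, over-complete spatio-temporal feature pooling is used to increase the completeness of the action description and the spatio-temporal correlation of local features. Finally, an online feature selection method selects the optimal classification features to improve recognition efficiency. (2) For the learning of motion features, an action recognition algorithm based on motion saliency feature learning is proposed. Based on the motion attention mechanism in vision, the algorithm first extracts motion edges through motion analysis, then learns jointly from the original video and the motion edges, constraining the two to share the same representation of the action, which is used as the action feature. This feature learning algorithm reduces the correlation between action features and image texture, color, and similar factors, improves the representational and discriminative ability of action features, and achieves good recognition accuracy. For action features, effectively representing the moving parts of a video is the key to improving recognition ability. Traditional feature learning algorithms learn directly from the pixels of images or videos, and the resulting features contain a large amount of image texture, color, and other information. To make the features attend more strongly to motion, the algorithm first designs a new motion edge extraction method; the resulting motion edges preserve the target's motion information while removing interference from other factors in the image. A feature learning algorithm is then applied to learn, jointly from the original video and the motion edges, motion features that share the same action representation. Compared with traditional action feature learning algorithms, this algorithm extracts and represents the moving parts of the video more effectively and improves action recognition accuracy. (3) For the fusion of motion information and static information in action features, an action recognition algorithm based on an optical flow constrained auto-encoder is proposed. This motion feature fusion learning algorithm uses the motion optical flow field to constrain an auto-encoder that takes video pixels as input, and the action features learned this way effectively improve the recognition accuracy of auto-encoder-based action recognition. The auto-encoder is a widely used learning algorithm, applied in feature learning, deep learning, and other areas. This thesis designs a new regularized auto-encoder network for learning action features. It consists of a main network and an auxiliary network: the main network learns the local image properties of the video through an auto-encoder, and the auxiliary network constrains the feature output by adding the motion optical flow field. The two networks are trained cooperatively to obtain action features that fuse video pixel information and dynamic optical flow information. This learning algorithm overcomes a weakness of traditional action features, which rely only on video pixels and cannot distinguish changes along the time dimension from changes along the spatial dimensions, and it better represents the dynamic nature of actions. Experiments show that the optical flow constrained auto-encoder improves the discriminative ability of action features and further improves recognition performance over traditional features. (4) For the semi-supervised problem in feature learning, an auto-encoder algorithm based on a semi-supervised manifold constraint is proposed. The algorithm applies manifold learning to auto-encoder feature learning to address semi-supervised feature learning in which only some samples are labeled. Semi-supervised learning is a valuable research problem in pattern recognition: for massive data, labeled samples often account for only a small fraction of all samples. To obtain a model with good generalization ability from a small number of labeled samples, this thesis applies manifold learning to semi-supervised feature learning and designs an auto-encoder network based on a maximum entropy constraint. The network consists of an auto-encoder network and a regularization network based on the semi-supervised manifold constraint. The auto-encoder network performs an unsupervised nonlinear feature mapping of the unlabeled data, the semi-supervised manifold network uses class labels to adjust the nonlinear mapping, and the two act together to form the final feature mapping. Experimental results show that, in the mapped space, the semi-supervised auto-encoder compresses the distances between nearby samples of the same class while preserving the distances between samples of different classes, and that it achieves better recognition results than traditional semi-supervised algorithms in both image recognition and action recognition.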
English title: THE RESEARCH OF MOTION FEATURES IN VIDEO-BASED ACTION RECOGNITION
English keywords: Action recognition, Motion feature, Feature learning, Semi-supervised learning, Auto-encoder
English abstract: Video-based action recognition has broad application prospects in information retrieval, security monitoring, and other areas. It is one of the most active topics in computer vision and pattern recognition. In recent years, many advances have been made in the study of action recognition, and methods based on local action features have achieved great success. Traditional action feature methods extract keypoints from videos or images and then design fixed-dimensional vectors called descriptors, such as HOG and HOF, to represent local motion. Although these features are simple and effective, they cannot meet the requirements of representing and describing videos that contain multiple targets and complex backgrounds. Therefore, how to obtain discriminative and representative motion features has always been a core issue in action recognition research. With the successful application of feature learning algorithms in image recognition, the introduction of feature learning into action recognition has also achieved state-of-the-art results. Such algorithms obtain action feature representations directly from the data through unsupervised or supervised learning, and are better able to represent complex videos. This thesis focuses on feature design and learning in action recognition, and combines unsupervised learning, semi-supervised learning, and manifold learning to study how to describe and analyze videos in order to improve the accuracy of action recognition. The main work of this thesis is summarized as follows: (1) An action recognition algorithm based on fast Haar3D features is proposed. 
The algorithm designs a family of three-dimensional wavelet features to detect and represent motion regions quickly, then obtains a rich global description through over-complete spatio-temporal pooling, and finally selects the optimal feature dimensions as the classification model via an online feature selection algorithm. The algorithm improves on classical action recognition methods in both computational efficiency and recognition accuracy. Local features are an important approach in action recognition. These methods represent actions through feature detection, feature descriptor design, and spatio-temporal pyramid modeling. To improve the computational efficiency of motion features, this thesis generalizes the Haar wavelet to three-dimensional space to form Haar3D motion features, and extracts the components of the object's moving parts in all directions as local motion representations. Because the traditional pyramid model has insufficient descriptive power, over-complete spatio-temporal pooling is used to increase the completeness of the action description and the spatio-temporal correlation of the local features. Finally, an online feature selection method is used to select the best classification features, improving recognition efficiency while maintaining recognition accuracy. (2) An action recognition algorithm based on motion saliency feature learning is proposed. The algorithm applies a visual attention mechanism: it extracts motion boundaries through motion analysis and then uses the original video and the motion boundaries to jointly learn motion features, constraining the two to share the same action representation. This feature learning algorithm reduces the correlation between motion features and image texture and color, and improves the representational and discriminative ability of motion features. 
For motion features, effectively representing the motion in the video is the key to improving recognition ability. Traditional feature learning algorithms learn directly from the pixels of an image or video, so the learned features contain a large amount of information such as the texture and color of the image itself. To improve the ability of the features to focus on motion, the algorithm first extracts motion boundaries, which preserve the motion while removing interference from other information in the image, and then jointly learns features with the same motion representation from the original video and the motion boundaries. Compared with traditional learning algorithms that depend only on video pixels, this motion feature learning algorithm captures the moving parts of the video more effectively, avoids interference from the video's appearance, and further improves recognition accuracy. (3) An action feature learning algorithm based on an optical flow constrained auto-encoder is proposed. To improve the ability of action feature learning to extract the dynamic characteristics of video, the algorithm uses the optical flow field to constrain the auto-encoder, which effectively improves action recognition results. The auto-encoder is a widely used learning algorithm, applied in deep learning and other areas. In this thesis, a new regularized auto-encoder network is designed, consisting of a main network and an auxiliary network. The main network learns the statistical properties of local video content through the auto-encoder, and the auxiliary network constrains the output of the main network by adding the optical flow field. The two networks are trained together to obtain motion features that combine video pixel information and dynamic optical flow information. 
This learning algorithm improves on traditional action feature learning methods, which rely only on video pixels and cannot distinguish changes in the time dimension from changes in the spatial dimension. The optical flow constrained auto-encoder not only improves how well the action features represent the dynamic characteristics of the video, but also enhances their discriminative ability. (4) A semi-supervised manifold constrained auto-encoder is proposed. The algorithm applies a semi-supervised manifold learning method to the auto-encoder and addresses the semi-supervised feature learning problem with few labeled samples. Semi-supervised learning is one of the most valuable problems in pattern recognition. For massive data sets, labeled samples often make up only a very small part of the total. To obtain a model that generalizes well from a small number of labeled samples, this thesis applies a semi-supervised manifold learning method to feature learning and designs an auto-encoder network based on a maximum entropy constraint. The network consists of an auto-encoder network and a regularization network: the auto-encoder network performs an unsupervised nonlinear feature mapping of unlabeled data, and the semi-supervised manifold network uses labels to adjust the nonlinear mapping of features. Experiments achieve state-of-the-art results in both image recognition and action recognition tasks.
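As an illustration of item (1) of the abstract, the following is a minimal sketch of how a three-dimensional Haar-like feature can be evaluated quickly. It assumes the standard integral-volume trick (a 3D extension of the integral image); the abstract does not state how the thesis computes its Haar3D features, so the function names `integral_volume`, `box_sum`, and `haar3d_temporal`, and the single temporal filter shown, are hypothetical and not the author's implementation.

```python
def integral_volume(video):
    """Build S with S[t][y][x] = sum of video[:t][:y][:x] (one cell of zero padding)."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    S = [[[0.0] * (W + 1) for _ in range(H + 1)] for _ in range(T + 1)]
    for t in range(T):
        for y in range(H):
            for x in range(W):
                # 3D inclusion-exclusion over the seven already-computed neighbours.
                S[t + 1][y + 1][x + 1] = (
                    video[t][y][x]
                    + S[t][y + 1][x + 1] + S[t + 1][y][x + 1] + S[t + 1][y + 1][x]
                    - S[t][y][x + 1] - S[t][y + 1][x] - S[t + 1][y][x]
                    + S[t][y][x]
                )
    return S


def box_sum(S, t0, t1, y0, y1, x0, x1):
    """Sum of video[t0:t1][y0:y1][x0:x1] using eight lookups in S."""
    return (
        S[t1][y1][x1]
        - S[t0][y1][x1] - S[t1][y0][x1] - S[t1][y1][x0]
        + S[t0][y0][x1] + S[t0][y1][x0] + S[t1][y0][x0]
        - S[t0][y0][x0]
    )


def haar3d_temporal(S, t0, y0, x0, dt, dy, dx):
    """Temporal Haar-like response: later half of the box minus earlier half.

    The box spans 2*dt frames, dy rows, and dx columns (assumed layout).
    """
    earlier = box_sum(S, t0, t0 + dt, y0, y0 + dy, x0, x0 + dx)
    later = box_sum(S, t0 + dt, t0 + 2 * dt, y0, y0 + dy, x0, x0 + dx)
    return later - earlier


# Toy clips of four 2x2 frames: one brightens halfway through, one never changes.
moving = [[[0, 0], [0, 0]]] * 2 + [[[1, 1], [1, 1]]] * 2
static = [[[1, 1], [1, 1]]] * 4

response_moving = haar3d_temporal(integral_volume(moving), 0, 0, 0, 2, 2, 2)
response_static = haar3d_temporal(integral_volume(static), 0, 0, 0, 2, 2, 2)
```

In this sketch the filter responds to the clip whose intensity changes over time and gives zero on the static clip, and each response costs a fixed number of lookups regardless of box size, which is the usual reason Haar-like features are considered fast.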
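Academic seminars
Organizer    Date    Location    Speaker    Topic
Aeronautical Society, CIFC    2013-9-13    Shenzhen    Li Yawei (李亚玮)    Action recognition based on sparse motion feature learning
Chinese Control Conference    2013-7-26    Xi'an    Li Yawei (李亚玮)    Efficient local filter bank with over-complete spatiotemporal pooling in action recognition
Information Fusion Conference    2016-7-23    Nanjing    Li Yawei (李亚玮)    A Combined Visual Tracker based on Global Appearance and Local Features
Information Fusion Conference    2017-7-21    Xi'an    Li Yawei (李亚玮)    Regularized auto-encoders
School of Automation    2012-10-30    Zhizhi Hall (致知堂)    Xia Siyu (夏思宇)    Analysis and understanding of people in images
School of Automation    2012-5-3    Conference Room, Key Laboratory of the Ministry of Education, 2nd Floor, Central Building    Xu Lei (徐雷)    BYY intelligent systems, harmony learning theory, and the five-element iterative algorithm
School of Automation    2015-7-2    Conference Room, Key Laboratory of the Ministry of Education, 2nd Floor, Central Building    Yi Yang    Large Scale Video Analysis in the Real World
School of Automation    2015-8-27    Conference Room, Key Laboratory of the Ministry of Education, 2nd Floor, Central Building    Richard Xu    Computer vision experiments and Monte-carlo Inference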
     
Academic conferences
Conference    Date    Location    Presented by the author    Presentation title
IEEE WCCI    2016-6-24    Vancouver    Semi-supervised Auto-encoder Based on Manifold learning
Chinese Control Conference    2015-7-28    Hangzhou    Local Salient motion analysis for action recognition
     
Representative works
Paper title
Action recognition based on an optical flow constrained auto-encoder
Semi-supervised auto-encoder based on manifold learning
 
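Defense committee members
Name    Rank and title    Supervisor category    Affiliation    Chair    Remarks
Yue Dong (岳东)    Senior rank, Professor    Doctoral supervisor    Nanjing University of Posts and Telecommunications
Luo Qi (罗琦)    Senior rank, Professor    Doctoral supervisor    Nanjing University of Information Science and Technology
Fei Shumin (费树岷)    Senior rank, Professor    Doctoral supervisor    Southeast University
Yuan Xiaohui (袁晓辉)    Senior rank, Professor    Master's supervisor    Southeast University
Li Jiuxian (李久贤)    Senior rank, Professor    Master's supervisor    Southeast University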
      
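Defense secretary
Name    Rank and title    Affiliation    Remarks
Xia Siyu (夏思宇)    Associate senior rank, Associate Professor    Southeast University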