Zheng Li, Xiaobing Du, Cuixia Ma, Yanfeng Li, Hongan Wang. Interactive System for Video Summarization Based on Multimodal Fusion[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2019, 28(1): 27-34. doi: 10.15918/j.jbit1004-0579.18023

Interactive System for Video Summarization Based on Multimodal Fusion

doi:10.15918/j.jbit1004-0579.18023

1. School of Management, Hefei University of Technology, Hefei 230009, China
2. Jinling Institute of Technology, Nanjing Software Research Institute, Nanjing 211169, China; University of Chinese Academy of Sciences, Beijing 100049, China
3. University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
4. Jinling Institute of Technology, Nanjing Software Research Institute, Nanjing 211169, China

Received Date: 2018-01-21

Abstract

Biography videos portray the lives of prominent historical figures. This paper proposes a novel interactive video summarization system for biography videos based on multimodal fusion, which visualizes the distinctive features of such videos and supports interaction with their content by combining multiple modalities. In general, a movie's story advances through the characters' dialogue, and the subtitles are produced from that dialogue, so the subtitles capture the information that the dialogue conveys about the movie. JGibbsLDA is applied to extract keywords from the subtitles, because a biography video covers different aspects of the subject's whole life. To fuse keywords and key-frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. The resulting summary, built on multimodal fusion, describes the video content more completely. To reduce the time spent searching for content of interest and to reveal the relationships between the main characters, a map-style visualization is adopted to display the video content and to interact with the summary. An experiment conducted to evaluate the summarization shows that the system effectively facilitates the exploration of video content, improves interaction, and helps users find events of interest efficiently.
- video visualization
- interaction
- multimodal fusion
- video summarization
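
The abstract names affinity propagation as the step that relates key-frame clusters to keywords, but it does not give the features, the similarity measure, or the parameters. As a rough illustration only, the sketch below runs the standard message-passing updates of affinity propagation (responsibilities and availabilities, with damping) on a toy list of one-dimensional values. The function names `neg_sq_dist` and `affinity_propagation`, the toy `features` list, the negative squared distance similarity, and the preference value of -1.0 are all assumptions made for this example and are not taken from the paper.

```python
def neg_sq_dist(x, y):
    """Assumed similarity: negative squared distance (larger means more alike)."""
    return -(x - y) ** 2


def affinity_propagation(points, preference, damping=0.5, iterations=200):
    """Return, for each point, the index of the exemplar it is assigned to."""
    n = len(points)
    # S[i][k]: how well point k would serve as the exemplar of point i.
    # The diagonal holds the shared preference (self-similarity).
    S = [[preference if i == k else neg_sq_dist(points[i], points[k])
          for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i,k) <- s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) <- min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) <- sum over i' != k of max(0, r(i',k))
        for k in range(n):
            support = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(support) - support[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

    # Each point picks the candidate k that maximizes a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


# Toy stand-in for key-frame features: two well-separated groups.
features = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = affinity_propagation(features, preference=-1.0)
print(labels)
```

The preference on the diagonal controls how many exemplars emerge: values closer to zero make more points willing to act as exemplars, while more negative values yield fewer clusters. In a real pipeline the scalar features would be replaced by key-frame descriptors and a similarity suited to them.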
