Multimedia group meetings Lab gathering photos

MOST 108-2221-E-003-017-MY2 Deep Cross-Modal Embedding Models for Multilabel Classification 2019/08/01-2021/07/31
Qualcomm NAT-414697 Zero-Shot Learning for Multilabel Classification and Reinforcement Learning on Improving Gaming Programs 2019/06/01-2019/12/31
MOST 106-2221-E-003-031-MY2 Analysis of High Dimensional Data: Computational Topology and Its Application in Video Summarization 2017/08/01-2019/07/31
MOST 105-2221-E-003-023- Computational Methods for Understanding Social Face Perception: Babyface as Example 2016/08/01-2017/07/31
MOST 104-2221-E-003-020- Aesthetic Quality Analysis of Self-Portrait Photographs Based on Angle 2015/08/01-2016/07/31
MOST 103-2221-E-003-015- A Study on Automated Cinemagraph 2014/08/01-2015/07/31
NSC 102-2221-E-003-026- Aesthetic Quality Analysis of Images and Videos 2013/08/01-2014/07/31
NSC 101-2221-E-003-023- A Fusion Framework for Face Clustering and Social Network Construction from Movies 2012/08/01-2013/07/31
NTNU 100A06 A Mobile Product Recognition System 2012/01/01-2012/12/31
NSC 99-2221-E-003-027- A Study on Real-World Face Recognition Techniques 2010/08/01-2011/07/31
NSC 100-2015-S-003-003- Scientific Exploration of Multimedia Technology (2/2) 2011/05/01-2012/04/30
NSC 99-2515-S-003-004- Scientific Exploration of Multimedia Technology (1/2) 2010/05/01-2011/04/30
NSC 99-2218-E-003-001- A Study on Advanced Methods for Video Copy Detection 2010/02/01-2010/10/31
NTNU 98091039 A Study on Machine Tagging Techniques for Personal Photos 2009/10/01-2010/07/31


Yun-Jie Jerry Lin

Shih-Min Ethan Yang

Fang Li

Yi-Ru Lin

Kuan-Ying Chen

Shun-Ta Wang

Yu-Ting Lai

Bo-Heng Li


Chuan-Shen Hu
PhD, Dept. of Math

Ben Wu
Class of 2018

Yi-Ning Chen
Class of 2018

Yi-Nan Li
Class of 2017

Chun-Ju Lin
Class of 2017

Shu-Yao Chang
Class of 2017

Yi-Tsung Hsieh
Class of 2016

Tsuo-Chen Wu
Class of 2016

Yen-Wei Tsai
Class of 2014

Yin-Ting Tsai
Class of 2014

Hsiao-Wei Lin
Class of 2014

Shao-Ting Yang
Class of 2014

Chun-Hui Chuang
Class of 2013

Yin-Tzu Chan
Class of 2013

Hao-Chen Hsu
Class of 2013

Po-Yi Li
Class of 2012

Wen-Po Wu
Class of 2012

Chih-Chieh Tai
Class of 2011

Yu-Chen Cheng
Class of 2011

Ming-Chi Tseng
Class of 2011

Projects (before June 2009)

Fast Video Copy Detection


Sequence matching techniques are effective for comparing two videos. However, existing approaches are computationally expensive and therefore do not scale to large applications. In this paper, we propose a two-level filtration approach that significantly accelerates the matching process.
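The abstract does not say which two filters the paper uses, so the following is only a generic filter-then-verify sketch of the idea. It assumes each video has already been reduced to a string of frame symbols; the histogram filter, the edit-distance verifier, the thresholds, and all names (`histogram_distance`, `find_copies`, and so on) are hypothetical choices made for illustration.

```python
from collections import Counter


def histogram_distance(a, b):
    """Cheap coarse measure: L1 distance between symbol histograms, scaled to [0, 1]."""
    ca, cb = Counter(a), Counter(b)
    diff = sum(abs(ca[s] - cb[s]) for s in set(ca) | set(cb))
    return diff / (len(a) + len(b))


def edit_distance(a, b):
    """Expensive fine measure: classic Levenshtein distance, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def find_copies(query, database, coarse_threshold=0.3, fine_threshold=2):
    """Return the names of database videos that pass both filtering levels."""
    matches = []
    for name, frames in database.items():
        # Level 1: discard obviously different videos without sequence matching.
        if histogram_distance(query, frames) > coarse_threshold:
            continue
        # Level 2: run the costly sequence matching only on the survivors.
        if edit_distance(query, frames) <= fine_threshold:
            matches.append(name)
    return matches


# Toy database: each character stands for one quantized frame (an assumption).
database = {
    "original": "aabbccddee",
    "re-encoded copy": "aabbccdcee",
    "unrelated": "xxyyzzxxyy",
    "same frames, shuffled": "eeddccbbaa",
}
print(find_copies("aabbccddee", database))
```

In this toy run the unrelated video is rejected by the cheap first level, while the shuffled video passes it and is rejected only by the sequence matcher, which is the division of labour a two-level scheme relies on.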

Approximate String Matching on Visual Data


We present an approach to measuring similarities between visual data based on approximate string matching. In this approach, an image is represented by an ordered list of feature descriptors. We show how local feature sequences are extracted from two types of 2-D signals: scene images and shape images. Our experimental study shows that this representation is more discriminative than a bag-of-features representation and that the similarity measure based on string matching is effective. (more)
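As a rough illustration of matching ordered descriptor lists, the sketch below computes a weighted edit distance in which the substitution cost depends on how far apart two descriptors are. The cost function, the gap penalty, and the 2-D toy descriptors are assumptions made for this example and are not taken from the paper.

```python
import math


def descriptor_cost(u, v, scale=1.0):
    """Substitution cost in [0, 1]: Euclidean distance between descriptors, capped at 1."""
    return min(1.0, math.dist(u, v) / scale)


def sequence_distance(seq_a, seq_b, gap=1.0, scale=1.0):
    """Weighted edit distance between two ordered lists of feature descriptors."""
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j] + gap,  # skip a descriptor of seq_a
                dp[i][j - 1] + gap,  # skip a descriptor of seq_b
                dp[i - 1][j - 1] + descriptor_cost(seq_a[i - 1], seq_b[j - 1], scale),
            )
    return dp[n][m]


# Toy 2-D descriptors; real local descriptors have many more dimensions.
image_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
image_b = [(0.0, 0.1), (1.0, 0.0), (1.0, 1.0)]  # slightly perturbed version of image_a
image_c = [(1.0, 1.0), (1.0, 0.0), (0.0, 0.0)]  # same descriptors as image_a, reversed

print(sequence_distance(image_a, image_b))
print(sequence_distance(image_a, image_c))
```

Here `image_a` and `image_c` contain exactly the same descriptors, so an orderless bag-of-features comparison could not tell them apart, whereas the order-aware distance separates them.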

Automatic Face Annotation in Images


The objective of this work is to automatically annotate all localized faces in an image with labels, using a small number of training faces. We formulate the face annotation problem as a face detection and classification problem, and add categories for non-face and anonymous face images. Experiments conducted on two realistic face datasets show encouraging results for recognizing known faces, as well as for rejecting anonymous faces and non-face images.

Machine Tagging for Personal Photos


Large sets of personal media in a future digital home are not readily accessible due to the lack of meaningful metadata. Our objective is to enable a set of pragmatic ways to access personal media across a collection, specifically of photos and home videos.

Multimodal Fusion for Image Categorization


We have developed a multimodal fusion scheme that improves image classification accuracy by incorporating information derived from text embedded in the image being classified. Experiments on a challenging image database demonstrate that the proposed fusion framework achieves higher accuracy than state-of-the-art methods. (more)

Face Categorization using SIFT


We view the face recognition problem as an object class recognition problem, in which different people are treated as different object categories. In this project, SIFT features are integrated into a bag-of-words representation of face images. The method achieves promising preliminary results; it can serve as a first step toward face recognition and could be further improved by considering spatial relations between features.
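A minimal sketch of the bag-of-words step follows. It assumes the local descriptors have already been extracted and that a codebook of visual words is available (in practice it is commonly learned by clustering descriptors, for example with k-means). The toy 2-D descriptors, the codebook, and the histogram-intersection similarity are illustrative assumptions, not details from the project.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(descriptor, codebook[k]))


def bag_of_words(descriptors, codebook):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[k] / total for k in range(len(codebook))]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical codebook of three visual words in 2-D (SIFT descriptors are 128-D).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
face_a = [(0.1, 0.0), (0.9, 0.1), (0.0, 0.1), (1.0, 0.1)]
face_b = [(0.0, 0.9), (0.1, 1.0), (0.1, 0.1), (0.0, 1.1)]

hist_a = bag_of_words(face_a, codebook)
hist_b = bag_of_words(face_b, codebook)
print(hist_a, hist_b, histogram_intersection(hist_a, hist_b))
```

The histograms discard where each feature was found in the image, which is why adding spatial relations between features is a natural next refinement.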

(slides) (report)

Manifold Learning


Image clustering and classification tasks often deal with very high-dimensional data. To alleviate the curse of dimensionality, several methods, such as Isomap, LLE and KPCA, have been proposed and applied to learn low-dimensional, non-linear embedded manifolds. In this work, we empirically examine these methods on a more realistic, yet moderately difficult, dataset. We discuss the promises and limitations of these dimension-reduction schemes. (paper)
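To give a feel for why such methods can help, the sketch below reproduces only the first two steps of Isomap: building a k-nearest-neighbour graph and approximating geodesic distances by shortest paths. The final embedding step is omitted, and the semicircle data, the choice `k=2`, and the function names are assumptions made for this example.

```python
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as a dense matrix of edge lengths."""
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            d = math.dist(points[i], points[j])
            graph[i][j] = graph[j][i] = d
    return graph


def geodesic_distances(graph):
    """All-pairs shortest paths (Floyd-Warshall) over the neighbourhood graph."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist


# 21 samples from a unit semicircle: a 1-D manifold embedded in 2-D.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
geo = geodesic_distances(knn_graph(points, k=2))

straight = math.dist(points[0], points[-1])  # cuts across the semicircle
along = geo[0][-1]                           # follows the curve
print(round(straight, 3), round(along, 3))
```

The straight-line distance between the two endpoints is 2, while the graph distance is close to the arc length of pi. Isomap preserves the latter when it unrolls the curve, which is something a linear projection cannot do.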

MBRS for H.264 Video Decoder


In this project, we design and implement functional models of MBRS, which produces the macroblock-layer parameter information from a video bit stream parser in an H.264 decoder. The models were designed specifically for the Intel Deepwater platform.

Shape Coding


Binary shape coding, a new feature of MPEG-4, is essential for efficiently coding and representing the arbitrary shape of any content object. In this work, we present an optimal chain-code-like representation to code contour shapes. In addition, this representation can be easily extended to a scalable form. A lossy coding scheme is also proposed for low-bit-rate applications. Compared with the block-based CAE method in MPEG-4 and DCC with an arithmetic coder, this method achieves a higher compression ratio with fewer computation steps. (paper)
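For readers unfamiliar with chain codes, here is a minimal sketch of the classic Freeman 8-direction chain code, which is the family of representations this work builds on. It is not the optimal representation proposed in the paper; the direction numbering, the y-up coordinate convention, and the function names are assumptions of this example.

```python
# Freeman 8-direction codes: 0 = east, then counter-clockwise in 45-degree steps
# (assuming the y axis points up).
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def encode(contour):
    """Turn a list of 8-connected contour points into (start point, chain codes)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        codes.append(DIRECTIONS.index((x1 - x0, y1 - y0)))
    return contour[0], codes


def decode(start, codes):
    """Rebuild the contour points from the start point and the chain codes."""
    points = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# A closed 2x2 square traced counter-clockwise from the origin.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = encode(square)
print(codes)
print(decode(start, codes) == square)
```

Each move needs only 3 bits instead of a full coordinate pair, and the runs of repeated codes are what an entropy coder can compress further.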