YUAN XIAOTONG

Email Address
eleyuxi@nus.edu.sg


Organizational Units
Organizational Unit
ENGINEERING
faculty

Publication Search Results

Now showing 1 - 10 of 16
  • Publication
    Visual classification with multi-task joint sparse representation
    (2010) Yuan, X.-T.; Yan, S.; ELECTRICAL & COMPUTER ENGINEERING
    We address the problem of computing joint sparse representation of visual signal across multiple kernel-based representations. Such a problem arises naturally in supervised visual recognition applications where one aims to reconstruct a test sample with multiple features from as few training subjects as possible. We cast the linear version of this problem into a multi-task joint covariate selection model [15], which can be very efficiently optimized via a kernelizable accelerated proximal gradient method. Furthermore, two kernel-view extensions of this method are provided to handle the situations where descriptors and similarity functions are in the form of kernel matrices. We then investigate two applications of our algorithm to feature combination: 1) fusing gray-level and LBP features for face recognition, and 2) combining multiple kernels for object categorization. Experimental results on challenging real-world datasets show that the feature combination capability of our proposed algorithm is competitive with the state-of-the-art multiple kernel learning methods. © 2010 IEEE.
  • Publication
    Movie2comics: A feast of multimedia artwork
    (2010) Hong, R.; Yuan, X.-T.; Xu, M.; Wang, M.; Yan, S.; Chua, T.-S.; ELECTRICAL & COMPUTER ENGINEERING; COMPUTER SCIENCE
    As a type of artwork, comics are prevalent and popular around the world. However, although several assistive software tools are available, the creation of comics is still a tedious and labor-intensive process. This paper proposes a scheme that is able to automatically turn a movie into comics with two principles: (1) optimizing the preservation of the movie's information; and (2) generating outputs following the rules and styles of comics. The scheme mainly contains three components: script-face mapping, key-scene extraction, and cartoonization. Script-face mapping utilizes face recognition and tracking techniques to accomplish the mapping between characters' faces and their scripts. Key-scene extraction then combines the frames derived from subshots and the extracted index frames based on subtitles to select a sequence of frames for cartoonization. Finally, the cartoonization is accomplished via four steps: panel scale, stylization, word balloon placement and comics layout. Experiments conducted on a set of movie clips demonstrate the usefulness and effectiveness of the scheme. © 2010 ACM.
  • Publication
    A finite Newton algorithm for non-degenerate piecewise linear systems
    (2011) Yuan, X.-T.; Yan, S.; ELECTRICAL & COMPUTER ENGINEERING
    We investigate Newton-type optimization methods for solving piecewise linear systems (PLS) with non-degenerate coefficient matrices. Such systems arise, for example, from the numerical solution of linear complementarity problems, which are useful for modeling several learning and optimization problems. In this paper, we propose an effective damped Newton method, namely PLS-DN, to find the exact solution of non-degenerate PLS. PLS-DN exhibits a provable semi-iterative property, i.e., the algorithm converges globally to the exact solution in a finite number of iterations. The rate of convergence is shown to be at least linear before termination. We emphasize the applications of our method to modeling, from a novel perspective of PLS, several statistical learning problems such as elitist Lasso, non-negative least squares and support vector machines. Numerical results on synthetic and benchmark data sets are presented to demonstrate the effectiveness and efficiency of PLS-DN on these problems. Copyright 2011 by the authors.
  • Publication
    Supervised sparse patch coding towards misalignment-robust face recognition
    (2011) Lang, C.; Cheng, B.; Feng, S.; Yuan, X.; ELECTRICAL & COMPUTER ENGINEERING
    We address the challenging problem of face recognition in scenarios where both training and test data are possibly contaminated with spatial misalignments. A supervised sparse coding framework is developed in this paper towards a practical solution to misalignment-robust face recognition. Each given probe face image is uniformly divided into a set of local patches. We propose to sparsely reconstruct each probe image patch from the patches of all gallery images, and at the same time the reconstructions for all patches of the probe image are regularized by one term towards enforcing sparsity on the subjects of those selected patches. The reconstruction coefficients derived by ℓ1-norm minimization are then utilized to fuse the subject information of the patches for identifying the probe face. Such a supervised sparse coding framework provides a unique solution to face recognition. Extensive face recognition experiments on three benchmark face datasets demonstrate the advantages of the proposed framework over holistic sparse coding and conventional subspace learning based algorithms in terms of robustness to spatial misalignments and image occlusions. © 2011 IEEE.
  • Publication
    iComics: Automatic conversion of movie into comics
    (2010) Hong, R.; Wang, M.; Li, G.; Yuan, X.-T.; Yan, S.; Chua, T.-S.; ELECTRICAL & COMPUTER ENGINEERING; COMPUTER SCIENCE
    This demonstration presents a system, named iComics, for automatic conversion of movies into comics. We design three components to realize the system: script-face mapping, key-scene extraction, and cartoonization. Script-face mapping utilizes face recognition and tracking techniques to accomplish the mapping between characters' faces and their scripts. Key-scene extraction combines the frames derived from subshots and the extracted index frames based on subtitles to select a sequence of frames for cartoonization. Finally, the cartoonization is accomplished via four steps: panel scale, stylization, word balloon placement and comics layout. © 2010 ACM.
  • Publication
    Robust low-rank subspace segmentation with semidefinite guarantees
    (2010) Ni, Y.; Sun, J.; Yuan, X.; Yan, S.; Cheong, L.-F.; INTERACTIVE & DIGITAL MEDIA INSTITUTE; ELECTRICAL & COMPUTER ENGINEERING; COMPUTER SCIENCE
    Recently there has been a line of research work proposing to employ Spectral Clustering (SC) to segment (group) high-dimensional structural data such as those (approximately) lying on subspaces or low-dimensional manifolds. By learning the affinity matrix in the form of sparse reconstruction, techniques proposed in this vein often considerably boost the performance in subspace settings where traditional SC can fail. Despite the success, there are fundamental problems that have been left unsolved: the spectrum property of the learned affinity matrix cannot be gauged in advance, and there is often one ugly symmetrization step that post-processes the affinity for SC input. Hence we advocate enforcing the symmetric positive semidefinite constraint explicitly during learning (Low-Rank Representation with Positive Semidefinite constraint, or LRR-PSD), and show that it can in fact be solved efficiently by a tailored scheme rather than by general-purpose SDP solvers, which usually scale poorly. We provide rigorous mathematical derivations to show that, in its canonical form, LRR-PSD is equivalent to the recently proposed Low-Rank Representation (LRR) scheme [1], and hence offer theoretic and practical insights to both LRR-PSD and LRR, inviting future research. As for the computational cost, our proposal is at most comparable to that of LRR, if not less. We validate our theoretic analysis and optimization scheme by experiments on both synthetic and real data sets. © 2010 IEEE.
  • Publication
    Towards multi-semantic image annotation with graph regularized exclusive group Lasso
    (2011) Chen, X.; Yuan, X.-T.; Yan, S.; Tang, J.; Rui, Y.; Chua, T.-S.; ELECTRICAL & COMPUTER ENGINEERING; COMPUTER SCIENCE
    To bridge the semantic gap between low-level features and human perception, most of the existing algorithms aim mainly at annotating images with concepts coming from only one semantic space, e.g. cognitive or affective. The naive combination of the outputs from these spaces will implicitly force conditional independence and ignore the correlations among the spaces. In this paper, to exploit the comprehensive semantics of images, we propose a general framework for harmoniously integrating the above multiple semantics, and investigate the problem of learning to annotate images with training images labeled in two or more correlated semantic spaces, such as fascinating nighttime, or exciting cat. This kind of semantic annotation is more oriented to real-world search scenarios. Our proposed approach outperforms the baseline algorithms by making the following contributions. 1) Unlike previous methods that annotate images within only one semantic space, our proposed multi-semantic annotation associates each image with labels from multiple semantic spaces. 2) We develop a multi-task linear discriminative model to learn a linear mapping from features to labels. The tasks are correlated by imposing the exclusive group lasso regularization for competitive feature selection, and the graph Laplacian regularization to deal with the insufficient training sample issue. 3) A Nesterov-type smoothing approximation algorithm is presented for efficient optimization of our model. Extensive experiments on the NUS-WIDE-Emotive dataset (56k images) with 8×81 emotive cognitive concepts and the Object&Scene datasets from NUS-WIDE well validate the effectiveness of the proposed approach. © 2011 ACM.
  • Publication
    Efficient subspace segmentation via quadratic programming
    (2011) Wang, S.; Yuan, X.; Yao, T.; Yan, S.; Shen, J.; ELECTRICAL & COMPUTER ENGINEERING
    We explore in this paper efficient algorithmic solutions to robust subspace segmentation. We propose SSQP, namely Subspace Segmentation via Quadratic Programming, to partition data drawn from multiple subspaces into multiple clusters. The basic idea of SSQP is to express each datum as the linear combination of other data regularized by an overall term targeting zero reconstruction coefficients over vectors from different subspaces. The coefficient matrix derived by solving a quadratic programming problem is taken as an affinity matrix, upon which spectral clustering is applied to obtain the ultimate segmentation result. Similar to sparse subspace clustering (SSC) and low-rank representation (LRR), SSQP is robust to data noise as validated by experiments on toy data. Experiments on the Hopkins 155 database show that SSQP can achieve accuracy competitive with SSC and LRR in segmenting affine subspaces, while experimental results on the Extended Yale Face Database B demonstrate SSQP's superiority over SSC and LRR. Beyond segmentation accuracy, all experiments show that SSQP is much faster than both SSC and LRR in the practice of subspace segmentation. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.
  • Publication
    Multi-label visual classification with label exclusive context
    (2011) Chen, X.; Yuan, X.-T.; Chen, Q.; Yan, S.; Chua, T.-S.; ELECTRICAL & COMPUTER ENGINEERING; COMPUTER SCIENCE
    We introduce in this paper a novel approach to multi-label image classification which incorporates a new type of context, label exclusive context, with linear representation and classification. Given a set of exclusive label groups that describe the negative relationship among class labels, our method, namely LELR for Label Exclusive Linear Representation, enforces repulsive assignment of the labels from each group to a query image. The problem can be formulated as an exclusive Lasso (eLasso) model with group overlaps and affine transformation. Since existing eLasso solvers are not directly applicable to solving such a variant of eLasso in our setting, we propose a Nesterov's smoothing approximation algorithm for efficient optimization. Extensive comparative experiments on challenging real-world visual classification benchmarks demonstrate the effectiveness of incorporating label exclusive context into visual classification. © 2011 IEEE.
  • Publication
    Visual tracking via weakly supervised learning from multiple imperfect oracles
    (2010) Zhong, B.; Yao, H.; Chen, S.; Ji, R.; Yuan, X.; Liu, S.; Gao, W.; ELECTRICAL & COMPUTER ENGINEERING
    Long-term persistent tracking in ever-changing environments is a challenging task, which often requires addressing difficult object appearance update problems. To solve them, most top-performing methods rely on online learning-based algorithms. Unfortunately, one inherent problem of online learning-based trackers is drift, a gradual adaptation of the tracker to non-targets. To alleviate this problem, we consider visual tracking in a novel weakly supervised learning scenario where (possibly noisy) labels but no ground truth are provided by multiple imperfect oracles (i.e., trackers), some of which may be mediocre. A probabilistic approach is proposed to simultaneously infer the most likely object position and the accuracy of each tracker. Moreover, an online evaluation strategy of trackers and a heuristic training data selection scheme are adopted to make the inference more effective and faster. Consequently, the proposed method can avoid the pitfalls of purely single-tracker approaches and obtain reliable labeled samples to incrementally update each tracker (if it is an appearance-adaptive tracker) to capture the appearance changes. Extensive comparative experiments on challenging video sequences demonstrate the robustness and effectiveness of the proposed method. © 2010 IEEE.
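
Note on the first entry above (multi-task joint sparse representation): a proximal gradient method alternates a gradient step on the reconstruction loss with a proximal step on the sparsity-inducing regularizer. The following is only a minimal sketch of that proximal step, assuming the regularizer is the mixed ℓ1,2 norm (the sum of the Euclidean norms of the coefficient rows, one row per training sample across all features). The abstract does not state the regularizer, so this is an assumption, and the function name group_soft_threshold and its parameters are hypothetical, not taken from the paper.

```python
import math


def group_soft_threshold(rows, tau):
    """Row-wise group soft-thresholding (proximal operator of tau * sum of row L2 norms).

    rows: list of coefficient rows (each a list of floats); hypothetical layout
          in which one row holds one training sample's weights across features.
    tau:  non-negative threshold (step size times regularization weight).
    """
    shrunk = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows whose norm is at most tau are set to zero entirely, which is
        # what makes the selected training samples shared across features.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        shrunk.append([scale * v for v in row])
    return shrunk
```

For instance, group_soft_threshold([[3.0, 4.0], [0.3, 0.4]], 1.0) scales the first row (norm 5) by 0.8 and zeroes the second row (norm 0.5), so a whole training sample is either kept or dropped for all features at once.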