Single-Class Classification (SCC) seeks to distinguish one class of data from a universal set of multiple classes. We call the target class positive and the complement set of samples negative. In SCC problems, it is assumed that a reasonable sample of the negative data is not available. Since it is not natural to collect the ``non-interesting'' objects (i.e., negative data) to train the concept of the ``interesting'' objects (i.e., positive data), SCC problems are prevalent in the real world, where positive and unlabeled data are widely available but negative data are hard or expensive to acquire.
Our SCC algorithm, Mapping Convergence (MC), computes boundary functions of the target class from positive and unlabeled data (without labeled negative data). The basic idea of MC is to exploit the natural ``gap'' between positive and negative data by incrementally labeling negative data from the unlabeled data using the margin maximization property of SVM. Our analyses show that the MC algorithm significantly outperforms other SCC methods, and that its classification functions come very close to those of an SVM trained on fully labeled data when the positive data are not severely under-sampled.
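The iterative labeling described above can be sketched on a one-dimensional toy problem. This is only an illustration under stated assumptions, not the authors' implementation: in one dimension a hard-margin linear separator is simply the midpoint of the gap between the two classes, so that midpoint stands in for training an SVM. The function names, the `initial_cutoff` parameter used to seed the first negatives, and the sample data are all hypothetical.

```python
def max_margin_threshold(positives, negatives):
    """Hard-margin separator in 1-D: the midpoint of the gap between classes.

    Assumes every positive value lies below every negative value.
    """
    return (max(positives) + min(negatives)) / 2.0


def mapping_convergence_1d(positives, unlabeled, initial_cutoff):
    """Toy sketch of incremental negative labeling on 1-D data.

    `initial_cutoff` is a hypothetical stand-in for whatever rule picks the
    first, clearly negative points; it must be at least max(positives).
    Returns the final threshold and the threshold after each iteration.
    """
    # Seed step (assumption): unlabeled points far from the positives
    # are treated as the initial negatives.
    negatives = [x for x in unlabeled if x > initial_cutoff]
    remaining = [x for x in unlabeled if x <= initial_cutoff]
    if not negatives:
        raise ValueError("no initial negatives found; lower initial_cutoff")

    history = []
    while True:
        # Fit the max-margin boundary between positives and current negatives.
        threshold = max_margin_threshold(positives, negatives)
        history.append(threshold)

        # Label as negative every remaining unlabeled point that falls
        # on the negative side of the boundary.
        newly_negative = [x for x in remaining if x > threshold]
        if not newly_negative:
            # Converged: the boundary no longer absorbs unlabeled points.
            return threshold, history

        negatives.extend(newly_negative)
        remaining = [x for x in remaining if x <= threshold]


if __name__ == "__main__":
    positives = [1.0, 1.5, 2.0]
    # Unlabeled data mixes hidden positives (near 1-2) and hidden negatives.
    unlabeled = [0.8, 1.2, 1.8, 5.0, 6.0, 7.0, 9.0, 10.0]
    threshold, history = mapping_convergence_1d(positives, unlabeled, 8.0)
    print(history)    # [5.5, 4.0, 3.5]
    print(threshold)  # 3.5
```

On this toy data the threshold moves from 5.5 to 4.0 to 3.5 and then stops, because no unlabeled point remains on the negative side; the final boundary sits in the gap between the two groups.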
H. Yu, "Single-Class Classification with Mapping Convergence", Machine Learning, Springer, 61:49-69, 2005. (ML'05)
H. Yu, J. Han & K. C.-C. Chang, "PEBL: Web Page Classification without Negative Examples", IEEE Transactions on Knowledge and Data Engineering, Special Issue on Mining and Searching the Web, IEEE Computer Society, 16(1): 70-81, 2004. (TKDE'04 Special Issue, 11% accepted)
H. Yu, "SVMC: Single-Class Classification With Support Vector Machines", Proc. of Int. Joint Conf. on Artificial Intelligence, 2003. (IJCAI'03 full paper, 20% accepted, received student scholarship award)
H. Yu, J. Han & K. C.-C. Chang, "PEBL: Positive Example Based Learning for Web Page Classification Using SVM", Proc. of ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2002. (KDD'02 full paper, 14% accepted, received student scholarship award)
DM - Data Mining Lab, Department of Computer Science and Engineering, Pohang University of Science and Technology