Towards Information-Economical Classification and Ranking (2008)

by Kin Fai Kan

Abstract: The standard formulations of classification and ranking are rather strict in terms of information usage and may incur high information costs. Standard classification assumes that we need all the attributes to classify every test example. This is neither cheap nor necessary because the attributes can be expensive to obtain and quite often we can predict the label of a test example by looking at a small subset of the attributes. A more economical approach is to acquire the attributes of test examples sequentially and weigh the benefit and cost of acquiring more attributes at every step. To do this, we need to learn a sequence of decision rules that together minimizes the penalty costs due to misclassification and information acquisition costs. We focus on rejecting negative examples as early as possible and present the catenary support vector machine (catSVM).

Standard classification learns a prediction function from a set of labeled examples. However, it is often expensive (and sometimes impossible) to obtain labels of individual ex- amples. One economical alternative is to learn from aggregate label information which are easily available and cheap to obtain. We propose an SVM method to learn classification from group label proportions and provide a theoretical bound on its generalization error. The idea of learning from aggregate label information is also useful for ranking. Standard ranking relies on the relative preferences between individual examples for training which can be expensive or difficult to obtain. We propose a probabilistic model for the relative preferences among groups of individual examples and present several estimation methods.

Download Information

Kin Fai Kan (2008). Towards Information-Economical Classification and Ranking. Doctoral dissertation, University of California at Riverside. pdf        

Bibtex citation

@phdthesis{Kan08,
   author = "Kin Fai Kan",
   title = "Towards Information-Economical Classification and Ranking",
   school = "University of California at Riverside",
   schoolabbr = "UC Riverside",
   year = 2008,
   month = Aug,
}

full list