  • Paperback

Product description
In a number of applications involving classification, the final goal is not determining which class (or classes) individual unlabelled instances belong to, but estimating the prevalence (or "relative frequency", or "prior probability") of each class in the unlabelled data. In recent years it has been pointed out that, in these cases, it would make sense to directly optimise machine learning algorithms for this goal, rather than (somewhat indirectly) just optimising the classifier's ability to label individual instances. The task of training estimators of class prevalence via supervised learning is known as learning to quantify, or, more simply, quantification. It is by now well known that performing quantification by classifying each unlabelled instance via a standard classifier and then counting the instances that have been assigned to the class (the Classify and Count method) usually leads to biased estimators of class prevalence, i.e., to poor quantification accuracy; as a result, methods (and evaluation measures) that address quantification as a task in its own right have been developed.

This book covers the main applications of quantification, the main methods that have been developed for learning to quantify, the measures that have been adopted for evaluating it, and the challenges that still need to be addressed by future research. The book is divided into seven chapters. Chapter 1 sets the stage for the rest of the book by introducing fundamental notions such as class distributions, their estimation, and dataset shift, by arguing for the suboptimality of using classification techniques for performing this estimation, and by discussing why learning to quantify has evolved as a task of its own, rather than remaining a by-product of classification.
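
The bias of the Classify and Count method mentioned above can be illustrated with a small sketch. This is not taken from the book: the function names, the toy sample, and the classifier's true positive and false positive rates (0.8 and 0.1) are assumptions chosen only for illustration.

```python
from typing import Sequence


def classify_and_count(predicted_labels: Sequence[int]) -> float:
    """Estimate positive-class prevalence as the fraction of predicted positives."""
    if not predicted_labels:
        raise ValueError("need at least one prediction")
    return sum(predicted_labels) / len(predicted_labels)


def expected_cc_estimate(true_prevalence: float, tpr: float, fpr: float) -> float:
    """Expected Classify and Count estimate for a classifier with the given rates."""
    return true_prevalence * tpr + (1.0 - true_prevalence) * fpr


# Hypothetical classifier (assumed values): 80% true positive rate,
# 10% false positive rate.
TPR, FPR = 0.8, 0.1

# Deterministic toy sample: 1000 unlabelled instances, 200 of them truly positive.
n_pos, n_neg = 200, 800
true_prevalence = n_pos / (n_pos + n_neg)

# Labels the hypothetical classifier would assign, on average.
hits = round(n_pos * TPR)          # positives correctly labelled positive
false_alarms = round(n_neg * FPR)  # negatives wrongly labelled positive
predictions = (
    [1] * hits + [0] * (n_pos - hits)
    + [1] * false_alarms + [0] * (n_neg - false_alarms)
)

cc = classify_and_count(predictions)
print(f"true prevalence:    {true_prevalence:.2f}")
print(f"Classify and Count: {cc:.2f}")
```

In this toy setting the true prevalence is 0.20 but Classify and Count returns 0.24, because false positives and false negatives do not cancel out. The size and direction of the error depend on the true prevalence itself, which is why a classifier tuned for per-instance accuracy can still be a poor estimator of class prevalence.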
Chapter 2 provides the motivation for what is to come by describing the applications that quantification has been put to, ranging from improving classification accuracy in domain adaptation, to measuring and improving the fairness of classification systems with respect to a sensitive attribute, to supporting research and development in the social sciences, political science, epidemiology, market research, and other fields. In Chapter 3 we move on to discuss the experimental evaluation of quantification systems; we look at evaluation measures for the various types of quantification systems (binary, single-label multiclass, multi-label multiclass, ordinal), but also at evaluation protocols for quantification, which essentially consist of ways to extract multiple testing samples for use in quantification evaluation