Decontamination of Mutual Contamination Models
- Many machine learning problems can be characterized by mutual contamination models. In these problems, one observes several random samples from different convex combinations of a set of unknown base distributions and the goal is to infer these base distributions. This paper considers the general setting where the base distributions are defined on arbitrary probability spaces. We examine three popular machine learning problems that arise in this general setting: multiclass classification with label noise, demixing of mixed membership models, and classification with partial labels. In each case, we give sufficient conditions for identifiability and present algorithms for the infinite and finite sample settings, with associated performance guarantees.
Author details: | Julian Katz-Samuels, Gilles Blanchard, Clayton Scott |
---|---|
URL: | http://arxiv.org/abs/1710.01167 |
ISSN: | 1532-4435 |
Title of parent work (English): | Journal of machine learning research |
Publisher: | Microtome Publishing |
Place of publishing: | Cambridge, Mass. |
Publication type: | Article |
Language: | English |
Date of first publication: | 2019/04/11 |
Publication year: | 2019 |
Release date: | 2021/05/19 |
Tag: | classification with partial labels; mixed membership models; multiclass classification with label noise; mutual contamination models; topic modeling |
Volume: | 20 |
Number of pages: | 57 |
Funding institution: | National Science Foundation (NSF) [1422157, 1838179]; German Research Foundation (DFG) [FOR-1735]; DFG under the Collaborative Research Center [SFB-1294] |
Organizational units: | Mathematisch-Naturwissenschaftliche Fakultät / Institut für Mathematik |
DDC classification: | 5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics |
Peer review: | Refereed |