## Statistical Scaling of Categorical Data

Estimation and testing of distributions in metric spaces are well understood. R.A. Fisher, J. Neyman, W. Cochran and M. Bartlett achieved essential results on the statistical analysis of categorical data, and in the last 40 years many other statisticians have contributed important results in this field. Data sets often contain categorical data, e.g. levels of factors or names, and no ordering or distance exists between these categories. At each level, metric or categorical values are measured. We introduce a new method of scaling based on statistical decisions. For this we define empirical probabilities for the original observations and find a class of distributions in a metric space in which these empirical probabilities arise as approximations of equivalently defined probabilities. In this way we identify the probabilities connected with the categorical data with probabilities in metric spaces. This gives a mapping from the levels of factors or names into points of a metric space, and this mapping yields the scale for the categorical data. From the statistical point of view we use multivariate statistical methods, calculate maximum likelihood estimates, and compare different approaches to scaling.
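The abstract does not state the class of distributions or the estimator used, so the following is only a minimal sketch of the general idea under loudly stated assumptions, not the authors' method. It assumes a binary response measured at each level and a hypothetical latent model in which level ℓ corresponds to a point μ_ℓ on the real line with P(N(μ_ℓ, 1) > 0) equal to the success probability at that level. Under that assumed model, the maximum likelihood estimate of μ_ℓ is the standard normal quantile of the empirical probability. The names `empirical_probabilities`, `scale_levels`, and the example data are illustrative only.

```python
from collections import Counter
from statistics import NormalDist


def empirical_probabilities(observations):
    """Relative frequency of response == 1 at each categorical level.

    `observations` is an iterable of (level, response) pairs with
    response in {0, 1} (an assumption made for this sketch).
    """
    observations = list(observations)
    totals = Counter(level for level, _ in observations)
    successes = Counter(level for level, y in observations if y == 1)
    return {level: successes[level] / totals[level] for level in totals}


def scale_levels(observations):
    """Map each level to a point on the real line.

    Assumed model (not taken from the source): P(N(mu, 1) > 0) = p,
    hence mu = Phi^{-1}(p). Levels whose empirical probability is
    exactly 0 or 1 have no finite estimate and are mapped to None.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scale = {}
    for level, p in empirical_probabilities(observations).items():
        scale[level] = std_normal.inv_cdf(p) if 0.0 < p < 1.0 else None
    return scale


# Made-up example data: three unordered levels with binary responses.
example_data = [
    ("oak", 1), ("oak", 1), ("oak", 0), ("oak", 1),
    ("pine", 0), ("pine", 1),
    ("birch", 0), ("birch", 0), ("birch", 0), ("birch", 1),
]
example_scale = scale_levels(example_data)
```

In this toy setting the unordered names acquire both an ordering and distances on the real line, which is the kind of scale the abstract describes. A different assumed family of distributions would generally produce a different scale.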

- Authors: Henning Läuter, Ayad Ramadan
- URN: urn:nbn:de:kobv:517-opus-49566
- Series: Mathematische Statistik und Wahrscheinlichkeitstheorie : Preprint (2010, 01)
- Document type: Preprint
- Language: English
- Year: 2010
- Publisher: Universität Potsdam
- Release date: 2011/03/31
- Classification: SI 990
- Institution: Mathematisch-Naturwissenschaftliche Fakultät / Institut für Mathematik
- Dewey Decimal Classification: 5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics
- License: No usage license granted; German copyright law applies