## Modeling and Scaling of Categorical Data

Estimation and testing of distributions in metric spaces are well known. R.A. Fisher, J. Neyman, W. Cochran and M. Bartlett achieved essential results on the statistical analysis of categorical data, and in the last 40 years many other statisticians have found important results in this field. Data sets often contain categorical data, e.g. levels of factors or names, for which neither an ordering nor a distance between the categories exists. At each level, some metric or categorical values are measured. We introduce a new method of scaling based on statistical decisions. For this we define empirical probabilities for the original observations and find a class of distributions in a metric space in which these empirical probabilities can be found as approximations of equivalently defined probabilities. With this method we identify probabilities connected with the categorical data with probabilities in metric spaces. This gives a mapping from the levels of factors or names into points of a metric space, and this mapping yields the scale for the categorical data. From the statistical point of view, we use multivariate statistical methods, calculate maximum likelihood estimates and compare different approaches to scaling.…

Author: | Henning Läuter, Ayad Ramadan |
---|---|
URN: | urn:nbn:de:kobv:517-opus-49572 |
Series (Serial Number): | Mathematische Statistik und Wahrscheinlichkeitstheorie : Preprint / Institut für Mathematik (2010, 03) |
Document Type: | Preprint |
Language: | German |
Date of Publication (online): | 2011/03/31 |
Year of Completion: | 2010 |
Publishing Institution: | Universität Potsdam |
Release Date: | 2011/03/31 |
RVK - Regensburg Classification: | SI 990 |
Organizational units: | Mathematisch-Naturwissenschaftliche Fakultät / Institut für Mathematik |
Dewey Decimal Classification: | 5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics |
Collections: | Universität Potsdam / Articles (pre- and postprints) / Mathematisch-Naturwissenschaftliche Fakultät / Institut für Mathematik / Probability theory |
Licence (German): | No usage licence granted - German copyright law applies |