In Python available: St. Nicolas House Algorithm (SNHA) with bootstrap support for improved performance in dense networks
The St. Nicolas House Algorithm (SNHA) finds association chains of directly dependent variables in a data set. The dependency is based on the correlation coefficient, and the resulting associations are visualized as an undirected graph. The network prediction is improved by a bootstrap routine, which also enables the computation of an empirical p-value used to evaluate the significance of the predicted edges. Synthetic data generated with the Monte Carlo method were used, first, to compare the Python package with the original R package and, second, to evaluate the predicted networks using sensitivity, specificity, balanced classification rate and the Matthews correlation coefficient (MCC). The Python implementation yields the same results as the R package; hence, the algorithm was correctly ported to Python. The SNHA achieves high specificity for all tested graphs. For graphs with high edge densities, the other evaluation metrics decrease due to lower sensitivity, which could be partially improved by using the bootstrap, while for graphs with low edge densities the algorithm achieves high evaluation scores. The empirical p-values indicated that the predicted edges are indeed significant.
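
The four evaluation metrics named in the abstract can all be derived from edge-level confusion counts between a true and a predicted undirected graph. The following is a minimal sketch under stated assumptions, not code from the SNHA package: the function name `edge_metrics`, the edge-list representation and the toy graphs are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def edge_metrics(true_edges, predicted_edges, nodes):
    """Compare a predicted undirected graph with the true graph.

    Hypothetical helper for illustration; not part of the SNHA package.
    """
    # Store edges as unordered pairs so (a, b) and (b, a) count as one edge.
    true_set = {frozenset(edge) for edge in true_edges}
    pred_set = {frozenset(edge) for edge in predicted_edges}

    tp = fp = fn = tn = 0
    # Every unordered node pair is one classification decision: edge or no edge.
    for pair in combinations(nodes, 2):
        key = frozenset(pair)
        in_true = key in true_set
        in_pred = key in pred_set
        if in_true and in_pred:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_true:
            fn += 1
        else:
            tn += 1

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Balanced classification rate: mean of sensitivity and specificity.
    bcr = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "bcr": bcr,
        "mcc": mcc,
    }


# Toy example (made-up graphs): a four-node chain versus a prediction
# that misses one true edge and adds one false edge.
nodes = ["A", "B", "C", "D"]
true_edges = [("A", "B"), ("B", "C"), ("C", "D")]
predicted_edges = [("A", "B"), ("B", "C"), ("A", "D")]
metrics = edge_metrics(true_edges, predicted_edges, nodes)
print(metrics)
```

Because a dense true graph offers many more edges to miss, false negatives lower the sensitivity and, with it, the balanced classification rate and the MCC, while specificity depends only on the absent node pairs.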
Author details: | Tim Hake, Bernhard Bodenberger, Detlef Groth |
---|---|
DOI: | https://doi.org/10.52905/hbph2023.1.63 |
ISSN: | 2748-9957 |
Title of parent work (English): | Human biology and public health |
Publisher: | Universitätsverlag Potsdam |
Place of publishing: | Potsdam |
Publication type: | Article |
Language: | English |
Date of first publication: | 2023/07/21 |
Publication year: | 2023 |
Release date: | 2023/07/21 |
Tag: | Python; St. Nicolas House Algorithm; bootstrap; correlation; network reconstruction |
Volume: | 1 |
Number of pages: | 16 |
Organizational units: | Mathematisch-Naturwissenschaftliche Fakultät / Institut für Biochemie und Biologie |
DDC classification: | 5 Natural sciences and mathematics / 57 Life sciences; biology / 570 Life sciences; biology |
6 Technology, medicine, applied sciences / 61 Medicine and health / 610 Medicine and health | |
Peer review: | Peer-reviewed |
Publishing method: | Universitätsverlag Potsdam |
Open Access / Gold Open-Access | |
License: | CC BY - Attribution 4.0 International |