TY  - GEN
A1  - Degkwitz, Andreas
A1  - Andermann, Heike
T1  - Angebots-, Nutzungs- und Bezugsstrukturen elektronischer Fachinformation in Deutschland
N2  - Mit dem Übergang zum digitalen Medium haben sich die Bezugsstrukturen und das Angebot an elektronischer Fachinformation in den Bibliotheken nachhaltig verändert. In den vorliegenden Untersuchungen wird das Angebot elektronischer Zeitschriften und Datenbanken und die Nutzung elektronischer Zeitschriften in fünf ausgewählten Fachgebieten und in unterschiedlichen Bibliothekstypen dargelegt. Darüber hinaus werden die derzeitigen Bezugsstrukturen beschrieben sowie die Ergebnisse einer Befragung der Konsortien zu Zielsetzungen, Vertragsformen und Geschäftsmodellen dargestellt. Chancen und Risiken der konsortialen Bezugsform werden erörtert.
N2  - With the transition to the digital medium, the purchasing structures for electronic scientific information and its supply in libraries have changed lastingly. The present studies set out the supply of electronic journals and databases and the usage of electronic journals in five selected disciplines and in different types of libraries. Furthermore, the current purchasing structures are described, together with the results of a survey of consortia regarding objectives, forms of contract, and pricing models. Opportunities and risks of consortial purchasing are discussed.
KW  - elektronische Zeitschriften
KW  - Datenbanken
KW  - Konsortien
KW  - Bezugsstrukturen
KW  - electronic journals
KW  - databases
KW  - consortia
KW  - purchasing structures
Y1  - 2003
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-672
ER  - 
TY  - JOUR
A1  - Klie, Sebastian
A1  - Nikoloski, Zoran
A1  - Selbig, Joachim
T1  - Biological cluster evaluation for gene function prediction
JF  - Journal of computational biology
N2  - Recent advances in high-throughput omics techniques render it possible to decode the function of genes by using the "guilt-by-association" principle on biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottleneck issues: (1) the choice of the number of clusters, and (2) the external measures, which do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address the identified bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is shown to be valid on synthetic data and available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction which relies on the clustering of optimal score and the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set.
KW  - algorithms
KW  - biochemical networks
KW  - combinatorics
KW  - computational molecular biology
KW  - databases
KW  - functional genomics
KW  - gene expression
KW  - NP-completeness
Y1  - 2014
U6  - https://doi.org/10.1089/cmb.2009.0129
SN  - 1066-5277
SN  - 1557-8666
VL  - 21
IS  - 6
SP  - 428
EP  - 445
PB  - Liebert
CY  - New Rochelle
ER  - 
TY  - JOUR
A1  - Subetto, D. A.
A1  - Nazarova, Larisa B.
A1  - Pestryakova, Luidmila Agafyevna
A1  - Syrykh, Liudmila
A1  - Andronikov, A. V.
A1  - Biskaborn, Boris
A1  - Diekmann, Bernhard
A1  - Kuznetsov, D. D.
A1  - Sapelko, T. V.
A1  - Grekov, I. M.
T1  - Paleolimnological studies in Russian northern Eurasia
BT  - a review
JF  - Contemporary Problems of Ecology
N2  - This article presents a review of the current data on the level of paleolimnological knowledge about lakes in the Russian part of northern Eurasia. The results of investigations in the northwestern European part of Russia, the best paleolimnologically studied sector of the Russian north, are presented in detail. The conditions of lacustrine sedimentation at the boundary between the Late Pleistocene and Holocene and the role of different external factors in the formation of the chemical composition of the sediments, including volcanic activity and possible large meteorite impacts, are also discussed. The results of major paleoclimatic and paleoecological reconstructions in northern Siberia are presented. Particular attention is given to the databases of abiotic and biotic parameters of lake ecosystems as an important basis for quantitative reconstructions of climatic and ecological changes in the Late Pleistocene and Holocene.
KW  - paleolimnology
KW  - lakes
KW  - bottom sediments
KW  - northern Eurasia
KW  - Russian Arctic
KW  - databases
Y1  - 2017
U6  - https://doi.org/10.1134/S1995425517040102
SN  - 1995-4255
SN  - 1995-4263
VL  - 10
SP  - 327
EP  - 335
PB  - Pleiades Publ.
CY  - New York
ER  - 
TY  - JOUR
A1  - Caruccio, Loredana
A1  - Deufemia, Vincenzo
A1  - Naumann, Felix
A1  - Polese, Giuseppe
T1  - Discovering relaxed functional dependencies based on multi-attribute dominance
JF  - IEEE transactions on knowledge and data engineering
N2  - With the advent of big data and data lakes, data are often integrated from multiple sources. Such integrated data are often of poor quality, due to inconsistencies, errors, and so forth. One way to check the quality of data is to infer functional dependencies (FDs). However, in many modern applications it might be necessary to extract properties and relationships that are not captured through FDs, due to the necessity to admit exceptions, or to consider similarity rather than equality of data values. Relaxed FDs (RFDs) have been introduced to meet these needs, but their discovery from data adds further complexity to an already complex problem, also due to the necessity of specifying similarity and validity thresholds. We propose Domino, a new discovery algorithm for RFDs that exploits the concept of dominance in order to derive similarity thresholds of attribute values while inferring RFDs. An experimental evaluation on real datasets demonstrates the discovery performance and the effectiveness of the proposed algorithm.
KW  - Complexity theory
KW  - Approximation algorithms
KW  - Big Data
KW  - Distributed databases
KW  - Semantics
KW  - Lakes
KW  - Functional dependencies
KW  - data profiling
KW  - data cleansing
Y1  - 2020
U6  - https://doi.org/10.1109/TKDE.2020.2967722
SN  - 1041-4347
SN  - 1558-2191
VL  - 33
IS  - 9
SP  - 3212
EP  - 3228
PB  - Institute of Electrical and Electronics Engineers
CY  - New York, NY
ER  - 
TY  - JOUR
A1  - Gronau, Norbert
A1  - Schaefer, Martin
T1  - Why metadata matters for the future of copyright
JF  - European Intellectual Property Review
N2  - In the copyright industries of the 21st century, metadata is the grease required to make the engine of copyright run smoothly and powerfully for the benefit of creators, copyright industries and users alike. However, metadata is difficult to acquire and even more difficult to keep up to date as the rights in content are mostly multi-layered, fragmented, international and volatile. This article explores the idea of a neutral metadata search and enhancement tool that could constitute a buffer to safeguard the interests of the various proprietary database owners and avoid the shortcomings of centralised databases.
KW  - copyright
KW  - databases
KW  - metadata
KW  - music industry
Y1  - 2021
SN  - 0142-0461
VL  - 43
IS  - 8
SP  - 488
EP  - 494
PB  - Sweet & Maxwell
CY  - London
ER  - 
TY  - JOUR
A1  - Datta, Suparno
A1  - Sachs, Jan Philipp
A1  - Freitas da Cruz, Harry
A1  - Martensen, Tom
A1  - Bode, Philipp
A1  - Morassi Sasso, Ariane
A1  - Glicksberg, Benjamin S.
A1  - Böttinger, Erwin
T1  - FIBER
BT  - enabling flexible retrieval of electronic health records data for clinical predictive modeling
JF  - JAMIA open
N2  - Objectives: The development of clinical predictive models hinges upon the availability of comprehensive clinical data. Tapping into such resources requires considerable effort from clinicians, data scientists, and engineers. Specifically, these efforts are focused on data extraction and preprocessing steps required prior to modeling, including complex database queries. A handful of software libraries exist that can reduce this complexity by building upon data standards. However, a gap remains concerning electronic health records (EHRs) stored in star schema clinical data warehouses, an approach often adopted in practice. In this article, we introduce the FlexIBle EHR Retrieval (FIBER) tool: a Python library built on top of a star schema (i2b2) clinical data warehouse that enables flexible generation of modeling-ready cohorts as data frames. Materials and Methods: FIBER was developed on top of a large-scale star schema EHR database which contains data from 8 million patients and over 120 million encounters. To illustrate FIBER's capabilities, we present its application by building a heart surgery patient cohort with subsequent prediction of acute kidney injury (AKI) with various machine learning models. Results: Using FIBER, we were able to build the heart surgery cohort (n = 12 061), identify the patients that developed AKI (n = 1005), and automatically extract relevant features (n = 774). Finally, we trained machine learning models that achieved area under the curve values of up to 0.77 for this exemplary use case. Conclusion: FIBER is an open-source Python library developed for extracting information from star schema clinical data warehouses and reduces time-to-modeling, helping to streamline the clinical modeling process.
KW  - databases, factual
KW  - electronic health records
KW  - information storage and retrieval
KW  - workflow
KW  - software/instrumentation
Y1  - 2021
U6  - https://doi.org/10.1093/jamiaopen/ooab048
SN  - 2574-2531
VL  - 4
IS  - 3
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Rana, Kamal
A1  - Öztürk, Ugur
A1  - Malik, Nishant
T1  - Landslide geometry reveals its trigger
JF  - Geophysical research letters
N2  - Electronic databases of landslides seldom include the triggering mechanisms, rendering these inventories unusable for landslide hazard modeling. We present a method for classifying the triggering mechanisms of landslides in existing inventories, thus allowing these inventories to aid in landslide hazard modeling corresponding to the correct event chain. Our method uses various geometric characteristics of landslides as the feature space for a random forest machine-learning classifier, resulting in accurate and robust classifications of landslide triggers. We applied the method to six landslide inventories spread over the Japanese archipelago in several different test and training configurations to demonstrate the effectiveness of our approach. We achieved mean accuracy ranging from 67% to 92%. We also provide an illustrative example of a real-world usage scenario for our method using an additional inventory with unknown ground truth. Furthermore, our feature importance analysis indicates that landslides having identical trigger mechanisms exhibit similar geometric properties.
KW  - databases
KW  - Japan
KW  - landslides
KW  - random forest
Y1  - 2021
U6  - https://doi.org/10.1029/2020GL090848
SN  - 0094-8276
SN  - 1944-8007
VL  - 48
IS  - 4
PB  - American Geophysical Union
CY  - Washington
ER  -