TY  - JOUR
A1  - Levy, Jessica
A1  - Mussack, Dominic
A1  - Brunner, Martin
A1  - Keller, Ulrich
A1  - Cardoso-Leite, Pedro
A1  - Fischbach, Antoine
T1  - Contrasting classical and machine learning approaches in the estimation of value-added scores in large-scale educational data
JF  - Frontiers in psychology
N2  - There is no consensus on which statistical model estimates school value-added (VA) most accurately. To date, the two most common statistical models used for the calculation of VA scores are two classical methods: linear regression and multilevel models. These models have the advantage of being relatively transparent and thus understandable for most researchers and practitioners. However, these statistical models are bound to certain assumptions (e.g., linearity) that might limit their prediction accuracy. Machine learning methods, which have yielded spectacular results in numerous fields, may be a valuable alternative to these classical models. Although big data is not new in general, it is relatively new in the realm of social sciences and education. New types of data require new data analytical approaches. Such techniques have already evolved in fields with a long tradition in crunching big data (e.g., gene technology). The objective of the present paper is to competently apply these "imported" techniques to education data, more precisely VA scores, and assess when and how they can extend or replace the classical psychometrics toolbox. The different models include linear and non-linear methods and extend classical models with the most commonly used machine learning methods (i.e., random forest, neural networks, support vector machines, and boosting). We used representative data of 3,026 students in 153 schools who took part in the standardized achievement tests of the Luxembourg School Monitoring Program in grades 1 and 3. Multilevel models outperformed classical linear and polynomial regressions, as well as different machine learning models. However, it could be observed that across all schools, school VA scores from different model types correlated highly. Yet, the percentage of disagreements as compared to multilevel models was not trivial and real-life implications for individual schools may still be dramatic depending on the model type used. Implications of these results and possible ethical concerns regarding the use of machine learning methods for decision-making in education are discussed.
KW  - value-added modeling
KW  - school effectiveness
KW  - machine learning
KW  - model
KW  - comparison
KW  - longitudinal data
Y1  - 2020
U6  - https://doi.org/10.3389/fpsyg.2020.02190
SN  - 1664-1078
VL  - 11
PB  - Frontiers Research Foundation
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Ayzel, Georgy
A1  - Izhitskiy, Alexander
T1  - Climate Change Impact Assessment on Freshwater Inflow into the Small Aral Sea
JF  - Water
N2  - During the last few decades, the rapid separation of the Small Aral Sea from the isolated basin has changed its hydrological and ecological conditions tremendously. In the present study, we developed and validated the hybrid model for the Syr Darya River basin based on a combination of state-of-the-art hydrological and machine learning models. Climate change impact on freshwater inflow into the Small Aral Sea for the projection period 2007-2099 has been quantified based on the developed hybrid model and bias corrected and downscaled meteorological projections simulated by four General Circulation Models (GCM) for each of three Representative Concentration Pathway scenarios (RCP). The developed hybrid model reliably simulates freshwater inflow for the historical period with a Nash-Sutcliffe efficiency of 0.72 and a Kling-Gupta efficiency of 0.77. Results of the climate change impact assessment showed that the freshwater inflow projections produced by different GCMs are misleading by providing contradictory results for the projection period. However, we identified that the relative runoff changes are expected to be more pronounced in the case of more aggressive RCP scenarios. The simulated projections of freshwater inflow provide a basis for further assessment of climate change impacts on hydrological and ecological conditions of the Small Aral Sea in the 21st Century.
KW  - Small Aral Sea
KW  - hydrology
KW  - climate change
KW  - modeling
KW  - machine learning
Y1  - 2019
U6  - https://doi.org/10.3390/w11112377
SN  - 2073-4441
VL  - 11
IS  - 11
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Frommhold, Martin
A1  - Heim, Arend
A1  - Barabanov, Mikhail
A1  - Maier, Franziska
A1  - Mühle, Ralf-Udo
A1  - Smirenski, Sergei M.
A1  - Heim, Wieland
T1  - Breeding habitat and nest-site selection by an obligatory "nest-cleptoparasite", the Amur Falcon Falco amurensis
JF  - Ecology and evolution
N2  - The selection of a nest site is crucial for successful reproduction of birds. Animals which re-use or occupy nest sites constructed by other species often have limited choice. Little is known about the criteria of nest-stealing species to choose suitable nesting sites and habitats. Here, we analyze breeding-site selection of an obligatory "nest-cleptoparasite", the Amur Falcon Falco amurensis. We collected data on nest sites at Muraviovka Park in the Russian Far East, where the species breeds exclusively in nests of the Eurasian Magpie Pica pica. We sampled 117 Eurasian Magpie nests, 38 of which were occupied by Amur Falcons. Nest-specific variables were assessed, and a recently developed habitat classification map was used to derive landscape metrics. We found that Amur Falcons chose a wide range of nesting sites, but significantly preferred nests with a domed roof. Breeding pairs of Eurasian Hobby Falco subbuteo and Eurasian Magpie were often found to breed near the nest in about the same distance as neighboring Amur Falcon pairs. Additionally, the occurrence of the species was positively associated with bare soil cover, forest cover, and shrub patches within their home range and negatively with the distance to wetlands. Areas of wetlands and fallow land might be used for foraging since Amur Falcons mostly depend on an insect diet. Additionally, we found that rarely burned habitats were preferred. Overall, the effect of landscape variables on the choice of actual nest sites appeared to be rather small. We used different classification methods to predict the probability of occurrence, of which the Random forest method showed the highest accuracy. The areas determined as suitable habitat showed a high concordance with the actual nest locations. We conclude that Amur Falcons prefer to occupy newly built (domed) nests to ensure high nest quality, as well as nests surrounded by available feeding habitats.
KW  - cleptoparasitism
KW  - fire
KW  - habitat use
KW  - machine learning
KW  - magpie
KW  - nest-site selection
KW  - random forest
Y1  - 2019
U6  - https://doi.org/10.1002/ece3.5878
SN  - 2045-7758
VL  - 9
IS  - 24
SP  - 14430
EP  - 14441
PB  - Wiley
CY  - Hoboken
ER  - 
TY  - JOUR
A1  - Wilksch, Moritz
A1  - Abramova, Olga
T1  - PyFin-sentiment
BT  - towards a machine-learning-based model for deriving sentiment from financial tweets
JF  - International journal of information management data insights
N2  - Responding to the poor performance of generic automated sentiment analysis solutions on domain-specific texts, we collect a dataset of 10,000 tweets discussing the topics of finance and investing. We manually assign each tweet its market sentiment, i.e., the investor’s anticipation of a stock’s future return. Using this data, we show that all existing sentiment models trained on adjacent domains struggle with accurate market sentiment analysis due to the task’s specialized vocabulary. Consequently, we design, train, and deploy our own sentiment model. It outperforms all previous models (VADER, NTUSD-Fin, FinBERT, TwitterRoBERTa) when evaluated on Twitter posts. On posts from a different platform, our model performs on par with BERT-based large language models. We achieve this result at a fraction of the training and inference costs due to the model’s simple design. We publish the artifact as a Python library to facilitate its use by future researchers and practitioners.
KW  - sentiment analysis
KW  - financial market sentiment
KW  - opinion mining
KW  - machine learning
KW  - deep learning
Y1  - 2023
U6  - https://doi.org/10.1016/j.jjimei.2023.100171
SN  - 2667-0968
VL  - 3
IS  - 1
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Brandes, Stefanie
A1  - Sicks, Florian
A1  - Berger, Anne
T1  - Behaviour classification on giraffes (Giraffa camelopardalis) using machine learning algorithms on triaxial acceleration data of two commonly used GPS devices and its possible application for their management and conservation
JF  - Sensors
N2  - Averting today's loss of biodiversity and ecosystem services can be achieved through conservation efforts, especially of keystone species. Giraffes (Giraffa camelopardalis) play an important role in sustaining Africa's ecosystems, but are 'vulnerable' according to the IUCN Red List since 2016. Monitoring an animal's behavior in the wild helps to develop and assess their conservation management. One mechanism for remote tracking of wildlife behavior is to attach accelerometers to animals to record their body movement. We tested two different commercially available high-resolution accelerometers, e-obs and Africa Wildlife Tracking (AWT), attached to the top of the heads of three captive giraffes and analyzed the accuracy of automatic behavior classifications, focused on the Random Forests algorithm. For both accelerometers, behaviors of lower variety in head and neck movements could be better predicted (i.e., feeding above eye level, mean prediction accuracy e-obs/AWT: 97.6%/99.7%; drinking: 96.7%/97.0%) than those with a higher variety of body postures (such as standing: 90.7-91.0%/75.2-76.7%; rumination: 89.6-91.6%/53.5-86.5%). Nonetheless both devices come with limitations and especially the AWT needs technological adaptations before applying it on animals in the wild. Nevertheless, looking at the prediction results, both are promising accelerometers for behavioral classification of giraffes. Therefore, these devices when applied to free-ranging animals, in combination with GPS tracking, can contribute greatly to the conservation of giraffes.
KW  - giraffe
KW  - triaxial acceleration
KW  - machine learning
KW  - random forests
KW  - behavior classification
KW  - giraffe conservation
Y1  - 2021
U6  - https://doi.org/10.3390/s21062229
SN  - 1424-8220
VL  - 21
IS  - 6
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Hampf, Anna
A1  - Nendel, Claas
A1  - Strey, Simone
A1  - Strey, Robert
T1  - Biotic yield losses in the Southern Amazon, Brazil
BT  - making use of smartphone-assisted plant disease diagnosis data
JF  - Frontiers in plant science : FPLS
N2  - Pathogens and animal pests (P&A) are a major threat to global food security as they directly affect the quantity and quality of food. The Southern Amazon, Brazil's largest domestic region for soybean, maize and cotton production, is particularly vulnerable to the outbreak of P&A due to its (sub)tropical climate and intensive farming systems. However, little is known about the spatial distribution of P&A and the related yield losses. Machine learning approaches for the automated recognition of plant diseases can help to overcome this research gap. The main objectives of this study are to (1) evaluate the performance of Convolutional Neural Networks (ConvNets) in classifying P&A, (2) map the spatial distribution of P&A in the Southern Amazon, and (3) quantify perceived yield and economic losses for the main soybean and maize P&A. The objectives were addressed by making use of data collected with the smartphone application Plantix. The core of the app's functioning is the automated recognition of plant diseases via ConvNets. Data on expected yield losses were gathered through a short survey included in an "expert" version of the application, which was distributed among agronomists. Between 2016 and 2020, Plantix users collected approximately 78,000 georeferenced P&A images in the Southern Amazon. The study results indicate a high performance of the trained ConvNets in classifying 420 different crop-disease combinations. Spatial distribution maps and expert-based yield loss estimates indicate that maize rust, bacterial stalk rot and the fall armyworm are among the most severe maize P&A, whereas soybean is mainly affected by P&A like anthracnose, downy mildew, frogeye leaf spot, stink bugs and brown spot. Perceived soybean and maize yield losses amount to 12 and 16%, respectively, resulting in annual yield losses of approximately 3.75 million tonnes for each crop and economic losses of US$2 billion for both crops together. The high level of accuracy of the trained ConvNets, when paired with widespread use from following a citizen-science approach, results in a data source that will shed new light on yield loss estimates, e.g., for the analysis of yield gaps and the development of measures to minimise them.
KW  - plant pathology
KW  - animal pests
KW  - pathogens
KW  - machine learning
KW  - digital
KW  - image processing
KW  - disease diagnosis
KW  - crowdsourcing
KW  - crop losses
Y1  - 2021
U6  - https://doi.org/10.3389/fpls.2021.621168
SN  - 1664-462X
VL  - 12
PB  - Frontiers Media
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Bornhorst, Julia
A1  - Nustede, Eike Jannik
A1  - Fudickar, Sebastian
T1  - Mass Surveilance of C. elegans - Smartphone-Based DIY Microscope and Machine-Learning-Based Approach for Worm Detection
JF  - Sensors
N2  - The nematode Caenorhabditis elegans (C. elegans) is often used as an alternative animal model due to several advantages such as morphological changes that can be seen directly under a microscope. Limitations of the model include the usage of expensive and cumbersome microscopes, and restrictions of the comprehensive use of C. elegans for toxicological trials. With the general applicability of the detection of C. elegans from microscope images via machine learning, as well as of smartphone-based microscopes, this article investigates the suitability of smartphone-based microscopy to detect C. elegans in a complete Petri dish. Thereby, the article introduces a smartphone-based microscope (including optics, lighting, and housing) for monitoring C. elegans and the corresponding classification via a trained Histogram of Oriented Gradients (HOG) feature-based Support Vector Machine for the automatic detection of C. elegans. Evaluation showed classification sensitivity of 0.90 and specificity of 0.85, and thereby confirms the general practicability of the chosen approach.
KW  - Caenorhabditis elegans
KW  - machine learning
KW  - smartphone
KW  - microscope
KW  - SVM
KW  - HOG
Y1  - 2019
U6  - https://doi.org/10.3390/s19061468
SN  - 1424-8220
VL  - 19
IS  - 6
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Baumgart, Lene
A1  - Boos, Pauline
A1  - Eckstein, Bernd
T1  - Datafication and algorithmic contingency
BT  - how agile organisations deal with technical systems
JF  - Work organisation, labour & globalisation
N2  - In the context of persistent images of self-perpetuated technologies, we discuss the interplay of digital technologies and organisational dynamics against the backdrop of systems theory. Building on the case of an international corporation that, during an agile reorganisation, introduced an AI-based personnel management platform, we show how technical systems produce a form of algorithmic contingency that subsequently leads to the emergence of formal and informal interaction systems. Using the concept of datafication, we explain how these interactions are barriers to the self-perpetuation of data-based decision-making, making it possible to take into consideration further decision factors and complementing the output of the platform. The research was carried out within the scope of the research project ‘Organisational Implications of Digitalisation: The Development of (Post-)Bureaucratic Organisational Structures in the Context of Digital Transformation’ funded by the German Research Foundation (DFG).
KW  - digitalisation
KW  - datafication
KW  - organisation
KW  - agile
KW  - technical system
KW  - systems theory
KW  - interaction
KW  - algorithmic contingency
KW  - machine learning
KW  - platform
Y1  - 2023
U6  - https://doi.org/10.13169/workorgalaboglob.17.1.0061
SN  - 1745-641X
SN  - 1745-6428
VL  - 17
IS  - 1
SP  - 61
EP  - 73
PB  - Pluto Journals
CY  - London
ER  - 
TY  - JOUR
A1  - Kühn, Daniela
A1  - Hainzl, Sebastian
A1  - Dahm, Torsten
A1  - Richter, Gudrun
A1  - Vera Rodriguez, Ismael
T1  - A review of source models to further the understanding of the seismicity of the Groningen field
JF  - Netherlands journal of geosciences : NJG
N2  - The occurrence of felt earthquakes due to gas production in Groningen has initiated numerous studies and model attempts to understand and quantify induced seismicity in this region. The whole bandwidth of available models spans the range from fully deterministic models to purely empirical and stochastic models. In this article, we summarise the most important model approaches, describing their main achievements and limitations. In addition, we discuss remaining open questions and potential future directions of development.
KW  - deterministic
KW  - empirical
KW  - hybrid
KW  - machine learning
KW  - seismicity model
Y1  - 2022
U6  - https://doi.org/10.1017/njg.2022.7
SN  - 0016-7746
SN  - 1573-9708
VL  - 101
PB  - Cambridge Univ. Press
CY  - Cambridge
ER  - 
TY  - JOUR
A1  - Ebers, Martin
A1  - Hoch, Veronica R. S.
A1  - Rosenkranz, Frank
A1  - Ruschemeier, Hannah
A1  - Steinrötter, Björn
T1  - The European Commission’s proposal for an Artificial Intelligence Act
BT  - a critical assessment by members of the Robotics and AI Law Society (RAILS)
JF  - J : multidisciplinary scientific journal
N2  - On 21 April 2021, the European Commission presented its long-awaited proposal for a Regulation “laying down harmonized rules on Artificial Intelligence”, the so-called “Artificial Intelligence Act” (AIA). This article takes a critical look at the proposed regulation. After an introduction (1), the paper analyzes the unclear preemptive effect of the AIA and EU competences (2), the scope of application (3), the prohibited uses of Artificial Intelligence (AI) (4), the provisions on high-risk AI systems (5), the obligations of providers and users (6), the requirements for AI systems with limited risks (7), the enforcement system (8), the relationship of the AIA with the existing legal framework (9), and the regulatory gaps (10). The last section draws some final conclusions (11).
KW  - artificial intelligence
KW  - machine learning
KW  - European Union
KW  - regulation
KW  - harmonization
KW  - Artificial Intelligence Act
Y1  - 2021
U6  - https://doi.org/10.3390/j4040043
SN  - 2571-8800
VL  - 4
IS  - 4
SP  - 589
EP  - 603
PB  - MDPI
CY  - Basel
ER  - 