“Broadcast your gender.”
(2022)
Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as the sociodemographic characteristics of agents is often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we are interested in estimating the gender of German social media creators. Using a random sample of 200 YouTube channels as an example, we compare four classification methods, namely (1) a survey among university staff, (2) a name dictionary method with the World Gender Name Dictionary as a reference list, (3) an algorithmic approach using the website gender-api.com, and (4) a Multinomial Naïve Bayes (MNB) machine learning technique. These methods identify gender attributes based on YouTube channel names and descriptions in German but are adaptable to other languages. Our contribution evaluates the share of identifiable channels, the accuracy and meaningfulness of classification, and the limits and benefits of each approach. We aim to address methodological challenges connected to classifying gender attributes for YouTube channels, as well as challenges related to reinforcing stereotypes and ethical implications.
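As an illustration of method (4), the following is a minimal sketch of a Multinomial Naïve Bayes text classifier with Laplace smoothing, written in plain Python. It is not the authors' implementation: the whitespace tokenizer, the class name `MultinomialNB`, the smoothing value, and the four toy channel descriptions are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.token_counts[label].update(tokenize(text))
        self.vocab = {tok for c in self.token_counts.values() for tok in c}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            total = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / n_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / total)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, not taken from the study.
train_texts = ["lisa kocht", "anna reist", "max kocht", "paul reist"]
train_labels = ["female", "female", "male", "male"]

model = MultinomialNB().fit(train_texts, train_labels)
print(model.predict("anna kocht"))
print(model.predict("paul reist"))
```

With so little data the classifier effectively learns which first names co-occur with which label, so it shares the limitations of name-based approaches discussed in the abstract.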
Nowadays, production planning and control must cope with mass customization, increased fluctuations in demand, and high competition pressures. Despite prevailing market risks, planning accuracy and increased adaptability in the event of disruptions or failures must be ensured, while simultaneously optimizing key process indicators. To manage that complex task, neural networks that can process large quantities of high-dimensional data in real time have been widely adopted in recent years. Although these are already extensively deployed in production systems, a systematic review of applications and implemented agent embeddings and architectures has not yet been conducted. The main contribution of this paper is to provide researchers and practitioners with an overview of applications and applied embeddings and to motivate further research in neural agent-based production. Findings indicate that neural agents are not only deployed in diverse applications, but are also increasingly implemented in multi-agent environments or in combination with conventional methods, improving performance relative to benchmarks and reducing dependence on human experience. This implies not only a more sophisticated focus on distributed production resources, but also a broadening of the perspective from a local to a global scale. Nevertheless, future research must further increase scalability and reproducibility to guarantee a simplified transfer of results to reality.
A review of source models to further the understanding of the seismicity of the Groningen field
(2022)
The occurrence of felt earthquakes due to gas production in Groningen has initiated numerous studies and modelling attempts to understand and quantify induced seismicity in this region. The available models range from fully deterministic to purely empirical and stochastic. In this article, we summarise the most important model approaches, describing their main achievements and limitations. In addition, we discuss remaining open questions and potential future directions of development.
Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record modalities in addition to voice, which can potentially increase analytical performance.
Forest microclimate can buffer biotic responses to summer heat waves, which are expected to become more extreme under climate warming. Prediction of forest microclimate is limited because meteorological observation standards seldom include situations inside forests.
We use eXtreme Gradient Boosting - a Machine Learning technique - to predict the microclimate of forest sites in Brandenburg, Germany, using seasonal data comprising weather features.
The analysis was supplemented by applying SHapley Additive exPlanations (SHAP) to show the interaction effects of variables and individualised feature attributions.
We evaluate model performance in comparison to artificial neural networks, random forest, support vector machine, and multi-linear regression.
After implementing a feature selection, an ensemble approach was applied to combine individual models for each forest and improve robustness over a given single prediction model.
The resulting model can be applied to translate climate change scenarios into temperatures inside forests to assess temperature-related ecosystem services provided by forests.
Seismology, like many scientific fields, e.g., music information retrieval and speech signal processing, is experiencing exponential growth in the amount of data acquired by modern seismological networks. In this thesis, I take advantage of the opportunities offered by "big data" and by the methods developed in the areas of music information retrieval and machine learning to better predict the ground motion generated by earthquakes and to study the properties of the surface layers of the Earth. In order to better predict seismic ground motions, I propose two approaches based on unsupervised deep learning methods, an autoencoder network and Generative Adversarial Networks. The autoencoder technique explores a massive amount of ground motion data, evaluates the required parameters, and generates synthetic ground motion data in the Fourier amplitude spectra (FAS) domain. This method is tested on two synthetic datasets and one real dataset. The application on the real dataset shows that the substantial information contained within the FAS data can be encoded to a four- to five-dimensional manifold. Consequently, only a few independent parameters are required for efficient ground motion prediction. I also propose a method based on Conditional Generative Adversarial Networks (CGAN) for simulating ground motion records in the time-frequency and time domains. CGAN generates the time-frequency domains based on the parameters magnitude, distance, and shear wave velocity to 30 m depth (VS30). After generating the amplitude of the time-frequency domains using the CGAN model, instead of conventional methods that assume the amplitude spectra with a random phase spectrum, the phase of the time-frequency domains is recovered by minimizing the difference between the observed and reconstructed spectrograms. In the second part of this dissertation, I propose two methods for the monitoring and characterization of near-surface materials and site effect analyses.
I implement an autocorrelation function and an interferometry method to monitor the velocity changes of near-surface materials resulting from the Kumamoto earthquake sequence (Japan, 2016). The observed seismic velocity changes during the strong shaking are due to the non-linear response of the near-surface materials. The results show that the velocity changes lasted for about two months after the Kumamoto mainshock. Furthermore, I used the velocity changes to evaluate the in-situ strain-stress relationship. I also propose a method for assessing the site proxy "VS30" using non-invasive analysis. In the proposed method, a dispersion curve of surface waves is inverted to estimate the shear wave velocity of the subsurface. This method is based on the Dix-like linear operators, which relate the shear wave velocity to the phase velocity. The proposed method is fast, efficient, and stable. All of the methods presented in this work can be used for processing "big data" in seismology and for the analysis of weak and strong ground motion data, to predict ground shaking, and to analyze site responses by considering potential time dependencies and nonlinearities.
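For readers unfamiliar with the site proxy VS30, the sketch below shows its standard definition as the time-averaged shear wave velocity over the top 30 m of a layered profile, that is, 30 m divided by the vertical shear wave travel time through those 30 m. It does not reproduce the thesis's inversion of dispersion curves with Dix-like operators; it only illustrates the final step once a layered velocity profile is available. The function name `vs30` and the example profile are assumptions.

```python
def vs30(thicknesses_m, velocities_m_s):
    """Time-averaged shear wave velocity (m/s) over the top 30 m.

    thicknesses_m: layer thicknesses from the surface downwards, in metres.
    velocities_m_s: shear wave velocity of each layer, in m/s.
    """
    depth = 0.0
    travel_time = 0.0
    for thickness, velocity in zip(thicknesses_m, velocities_m_s):
        if depth >= 30.0:
            break
        used = min(thickness, 30.0 - depth)  # clip the layer at 30 m depth
        travel_time += used / velocity
        depth += used
    if depth < 30.0:
        raise ValueError("profile is shallower than 30 m")
    return 30.0 / travel_time


# Hypothetical two-layer profile: 10 m at 200 m/s over 20 m at 400 m/s.
example = vs30([10.0, 20.0], [200.0, 400.0])
print(round(example, 1))
```

Because the average is taken over travel time rather than thickness, slow shallow layers weigh more heavily than their thickness alone would suggest.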
The intensity of cosmic radiation may vary by over five orders of magnitude within a few hours or days during Solar Particle Events (SPEs), thus increasing the probability of Single Event Upsets (SEUs) in space-borne electronic systems by several orders of magnitude. Therefore, it is vital to enable the early detection of SEU rate changes in order to ensure timely activation of dynamic radiation hardening measures. In this paper, an embedded approach for the prediction of SPEs and SRAM SEU rate is presented. The proposed solution combines a real-time SRAM-based SEU monitor, an offline-trained machine learning model, and an online learning algorithm for the prediction. With respect to the state of the art, our solution brings the following benefits: (1) use of existing on-chip data storage SRAM as a particle detector, thus minimizing the hardware and power overhead; (2) prediction of the SRAM SEU rate one hour in advance, with fine-grained hourly tracking of SEU variations during SPEs as well as under normal conditions; (3) online optimization of the prediction model for enhancing the prediction accuracy during run-time; (4) negligible cost of hardware accelerator design for the implementation of the selected machine learning model and online learning algorithm. The proposed design is intended for a highly dependable and self-adaptive multiprocessing system employed in space applications, allowing the radiation mitigation mechanisms to be triggered before the onset of high radiation levels.
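The abstract does not name the online learning algorithm, so the following is only a generic sketch of the idea of refining a one-step-ahead predictor at run time. It uses a normalized least-mean-squares (NLMS) filter as a stand-in; the function name `nlms_forecast`, the filter order, the step size, and the constant test series are all assumptions and are not taken from the paper.

```python
def nlms_forecast(series, order=3, mu=0.5, eps=1e-9):
    """One-step-ahead forecasts with weights updated after each observation.

    series: past hourly values (e.g. SEU counts), oldest first.
    order:  number of previous values used as inputs (assumed).
    mu:     NLMS step size, 0 < mu < 2 (assumed).
    """
    weights = [0.0] * order
    predictions = []
    for t in range(order, len(series)):
        window = series[t - order:t]
        pred = sum(w * x for w, x in zip(weights, window))
        predictions.append(pred)
        # Online update: correct the weights using the newly observed value.
        error = series[t] - pred
        norm = eps + sum(x * x for x in window)
        weights = [w + mu * error * x / norm for w, x in zip(weights, window)]
    return predictions, weights


# Hypothetical constant hourly rate: the forecast should converge to it.
hourly_rate = [5.0] * 60
preds, final_weights = nlms_forecast(hourly_rate)
print(round(preds[-1], 6))
```

In a real system the initial weights would come from the offline-trained model rather than zeros, with the online step only adapting them to current conditions.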
This paper sheds new light on the role of communication for cartel formation. Using machine learning to evaluate free-form chat communication among firms in a laboratory experiment, we identify typical communication patterns for both explicit cartel formation and indirect attempts to collude tacitly. We document that firms are less likely to communicate explicitly about price fixing and more likely to use indirect messages when sanctioning institutions are present. This effect of sanctions on communication reinforces the direct cartel-deterring effect of sanctions as collusion is more difficult to reach and sustain without an explicit agreement. Indirect messages have no, or even a negative, effect on prices.