TY  - THES
A1  - Tan, Jing
T1  - Multi-Agent Reinforcement Learning for Interactive Decision-Making
T1  - Multiagenten Verstärkendes Lernen für Interaktive Entscheidungsfindung
N2  - Distributed decision-making studies the choices made among a group of interactive and self-interested agents. Specifically, this thesis is concerned with the optimal sequence of choices an agent makes as it tries to maximize its achievement on one or multiple objectives in a dynamic environment. The optimization of distributed decision-making is important in many real-life applications, e.g., resource allocation (of products, energy, bandwidth, computing power, etc.) and robotics (heterogeneous agent cooperation on games or tasks), in various fields such as vehicular networks, the Internet of Things, and smart grids.
This thesis proposes three multi-agent reinforcement learning algorithms combined with game-theoretic tools to study strategic interaction between decision makers, using resource allocation in vehicular networks as an example. Specifically, the thesis designs an interaction mechanism based on a second-price auction, incentivizes the agents to maximize multiple short-term and long-term, individual and system objectives, and simulates a dynamic environment with realistic mobility data to evaluate algorithm performance and study agent behavior.

Theoretical results show that, in a stationary environment, the mechanism has Nash equilibria, maximizes social welfare, and yields a Pareto-optimal allocation of resources. Empirical results show that in the dynamic environment, our proposed learning algorithms outperform state-of-the-art algorithms in single- and multi-objective optimization and generalize very well to significantly different environments. Specifically, with the long-term multi-objective learning algorithm, we demonstrate that by considering the long-term impact of decisions, as well as by incentivizing the agents with a system fairness reward, the agents achieve better results in both individual and system objectives, even when their objectives are private, randomized, and changing over time. Moreover, the agents show competitive behavior to maximize individual payoff when resources are scarce, and cooperative behavior in achieving a system objective when resources are abundant; they also learn the rules of the game, without prior knowledge, to overcome disadvantages in initial parameters (e.g., a lower budget).

To address practicality concerns, the thesis also provides several methods for improving computational performance and tests the algorithm on a single-board computer. Results show the feasibility of online training and inference in milliseconds.

There are many potential future topics following this work. 1) The interaction mechanism can be modified into a double auction, eliminating the auctioneer and resembling a completely distributed, ad hoc network; 2) the objectives are assumed to be independent in this thesis, whereas a more realistic assumption may involve correlation between objectives, such as a hierarchy of objectives; 3) the current work limits information-sharing between agents, a setup that befits applications with privacy requirements or sparse signaling; by allowing more information-sharing between the agents, the algorithms can be modified for more cooperative scenarios such as robotics.
N2  - Die Verteilte Entscheidungsfindung untersucht Entscheidungen innerhalb einer Gruppe von interaktiven und eigennützigen Agenten. Diese Arbeit befasst sich insbesondere mit der optimalen Folge von Entscheidungen eines Agenten, der das Erreichen eines oder mehrerer Ziele in einer dynamischen Umgebung zu maximieren versucht. Die Optimierung einer verteilten Entscheidungsfindung ist in vielen alltäglichen Anwendungen relevant, z.B. zur Allokation von Ressourcen (Produkte, Energie, Bandbreite, Rechenressourcen etc.) und in der Robotik (heterogene Agenten-Kooperation in Spielen oder Aufträgen) in diversen Feldern wie Fahrzeugkommunikation, Internet of Things, Smart Grid, usw.
Diese Arbeit schlägt drei Multi-Agenten Reinforcement Learning Algorithmen kombiniert mit spieltheoretischen Ansätzen vor, um die strategische Interaktion zwischen Entscheidungsträgern zu untersuchen. Dies wird am Beispiel einer Ressourcenallokation in der Fahrzeug-zu-X-Kommunikation (vehicle-to-everything) gezeigt. Speziell wird in der Arbeit ein Interaktionsmechanismus entwickelt, der auf Basis einer Zweitpreisauktion den Agenten zur Maximierung mehrerer kurz- und langfristiger Ziele sowie individueller und Systemziele anregt. Dabei wird eine dynamische Umgebung mit realistischen Mobilitätsdaten simuliert, um die Leistungsfähigkeit des Algorithmus zu evaluieren und das Agentenverhalten zu untersuchen.

Eine theoretische Analyse zeigt, dass bei diesem Mechanismus das Nash-Gleichgewicht sowie eine Maximierung von Wohlfahrt und Pareto-optimaler Ressourcenallokation in einer statischen Umgebung vorliegen. Empirische Untersuchungen ergeben, dass in einer dynamischen Umgebung der vorgeschlagene Lernalgorithmus den aktuellen Stand der Technik bei ein- und mehrdimensionaler Optimierung übertrifft, und dabei sehr gut auch auf stark abweichende Umgebungen generalisiert werden kann.

Speziell mit dem langfristigen mehrdimensionalen Lernalgorithmus wird gezeigt, dass bei Berücksichtigung von langfristigen Auswirkungen von Entscheidungen, als auch durch einen Anreiz zur Systemgerechtigkeit, die Agenten in individuellen als auch Systemzielen bessere Ergebnisse liefern, und das auch, wenn ihre Ziele privat, zufällig und zeitveränderlich sind. Weiter zeigen die Agenten Wettbewerbsverhalten, um ihre eigenen Ziele zu maximieren, wenn die Ressourcen knapp sind, und kooperatives Verhalten, um Systemziele zu erreichen, wenn die Ressourcen ausreichend sind. Darüber hinaus lernen sie die Ziele des Spiels ohne vorheriges Wissen über dieses, um Startschwierigkeiten, wie z.B. ein niedrigeres Budget, zu überwinden.

Für die praktische Umsetzung zeigt diese Arbeit auch mehrere Methoden auf, welche die Rechenleistung verbessern können, und testet den Algorithmus auf einem handelsüblichen Einplatinencomputer. Die Ergebnisse zeigen die Durchführbarkeit von inkrementellem Lernen und Inferenz innerhalb weniger Millisekunden auf. Ausgehend von den Ergebnissen dieser Arbeit könnten sich verschiedene Forschungsfragen anschließen: 1) Der Interaktionsmechanismus kann zu einer Doppelauktion verändert und dabei der Auktionator entfernt werden. Dies würde einem vollständig verteilten Ad-Hoc-Netzwerk entsprechen. 2) Die Ziele werden in dieser Arbeit als unabhängig betrachtet. Es könnte eine Korrelation zwischen mehreren Zielen angenommen werden, so wie eine Zielhierarchie. 3) Die aktuelle Arbeit begrenzt den Informationsaustausch zwischen Agenten. Diese Annahme passt zu Anwendungen mit Anforderungen an den Schutz der Privatsphäre oder bei spärlichen Signalen. Indem der Informationsaustausch erhöht wird, könnte der Algorithmus auf stärker kooperative Anwendungen wie z.B. in der Robotik erweitert werden.
KW  - V2X
KW  - distributed systems
KW  - reinforcement learning
KW  - game theory
KW  - auction
KW  - decision making
KW  - behavioral sciences
KW  - multi-objective
KW  - V2X
KW  - Verteilte Systeme
KW  - Spieltheorie
KW  - Auktion
KW  - Entscheidungsfindung
KW  - Verhaltensforschung
KW  - verstärkendes Lernen
KW  - Multiziel
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-607000
ER  - 
TY  - THES
A1  - Negri, Michael
T1  - How coaches influence referee decisions
BT  - a principal-agent perspective on non-professional soccer
BT  - eine Prinzipal-Agent Perspektive auf den nicht-professionellen Fußball
N2  - The work examines whether coaches in non-professional soccer can influence referee decisions. Modeled from a principal-agent perspective, the managing referee boards can be seen as the principal. They aim to facilitate a fair competition in accordance with the existing rules and regulations. To this end, the referees are assigned as impartial agents on the pitch. The coaches take on a non-legitimate, principal-like role, trying to influence the referees even though they do not have the formal right to do so.
Separate questionnaires were set up for referees and coaches. The coach questionnaire aimed at identifying the extent and the forms of influencing attempts by coaches. The referee questionnaire examined whether referees notice possible influencing attempts and how they react to them.
The results were related to official match data in order to identify significant influences on personal sanctions (yellow cards, second yellow cards, red cards) and the match result.
It is found that there is a slight effect on the referee’s decisions. However, this effect is rather disadvantageous for the influencing coach and there is no evidence for an impact on the match result itself.
N2  - Die Arbeit untersucht die Frage, ob Trainer im nicht-professionellen Fußball Schiedsrichterentscheidungen beeinflussen können. Aufbauend auf einer Prinzipal-Agent Perspektive nehmen die Schiedsrichterausschüsse die Rolle des Prinzipals ein. Sie zielen darauf ab, einen fairen Wettkampf zu organisieren, der in Übereinstimmung mit dem geltenden Regelwerk durchgeführt wird. Um dies zu erreichen, werden die Schiedsrichter als unparteiische Agenten auf dem Spielfeld eingesetzt. Die Trainer nehmen in dieser Konstellation eine illegitime, prinzipal-ähnliche Rolle ein und versuchen, den Schiedsrichter zu ihren Gunsten zu beeinflussen. Dies geschieht, ohne dass die Trainer ein entsprechendes Recht dazu hätten.
Sowohl für die Trainer als auch für die Schiedsrichter wurde ein Fragebogen entworfen. Der erstgenannte zielt darauf ab, das Ausmaß und die Form von Beeinflussungsversuchen zu erheben. Der Schiedsrichter-Fragebogen hingegen erörtert die Fragen, ob die Unparteiischen mögliche Beeinflussungsversuche durch die Trainer wahrnehmen und wie sie gegebenenfalls darauf reagieren.
Die Ergebnisse wurden mit offiziellen Spieldaten in Verbindung gebracht, um potentielle Einflüsse auf Spiele (persönliche Strafen sowie das Endergebnis) zu identifizieren.
Es wurde herausgefunden, dass es einen leichten Effekt auf Schiedsrichterentscheidungen gibt. Dieser ist zumeist jedoch nachteilig für den jeweiligen Trainer und es gibt kein Indiz für einen Einfluss auf das Endergebnis.
T2  - Wie Trainer Schiedsrichterentscheidungen beeinflussen
KW  - referees
KW  - decision making
KW  - soccer
KW  - bias
KW  - Schiedsrichter
KW  - Trainer
KW  - Entscheidungen
KW  - Urteilsverzerrung
KW  - Prinzipal-Agent
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-72247
ER  - 
TY  - THES
A1  - Rühling, Markus
T1  - Substitution effect through fiscal transfers?! : incidence of the Peruvian property tax
T1  - Substituierungseffekt durch Finanzzuweisungen?! : Auswirkungen auf die Grundsteuererhebung im Falle Perus.
N2  - Whether the results of fiscal transfers have positive or negative implications depends upon the incentives that transfer systems create for both central and local governments. The complexity and ambiguity of the relationship between fiscal transfers and tax revenues of local governments is one of the main reasons why research projects, even in the same country, come to different results. This investigation seriously questions the often-stated substitution effect based only on an analysis of aggregated data and, in the qualitative part of this research (using survey techniques), ultimately rejects a substitution effect in the majority of the assessed municipalities. While most theories model governments as tax-maximizers (Leviathan) or as being prone to fiscal laziness, this investigation shows that mayors react to a whole set of incentives. Most mayors react rationally and rather pragmatically to the incentives and constraints which are established by the particular context of a municipality, the central government and their own personality/identity/interests. While the yield on property tax in Peru is low, there are no signs that increases in transfers have had, on average, a negative impact on local revenue generation. On an individual basis there exist mayors who are revenue maximizers, others who are substituting revenues and others who show apathy. Many engage in property tax. While rural or small municipalities have limited potential, property taxes are the main revenue sources for the Peruvian urban municipalities, rising on average by 10% during the last five years. The property tax in Peru accounts for less than 0.2% of GDP, which, compared to the Latin American average, is extremely low. In 2002, property tax contributed about 10% of the overall budget of local governments nationwide. In 2006, the share was closer to 6% due to windfall transfers.
The property tax can enhance accountability at the local level and has important impacts on urban spatial development. It is also important considering that most charges or transfers are earmarked such that property tax yields can cover discretionary finances. The intergovernmental fiscal transfers can be described as a patchwork of political liabilities of the past rather than connected with thorough compensation or service improvement functions. The fiscal base of local governments in Peru remains small, and the incentive structure to enhance property tax revenues is far from optimal. The central government and sector institutions, which, in the Peruvian institutional design of the property tax, are responsible for the enabling environment, can reinforce local tax efforts. In the past the central government constantly changed the rules of the game, giving municipalities reduced predictability of policy choices. There are no relevant signs that a stronger property tax is captured by Peruvian interest groups. Since the central government has responsibility for tax regulation and, in part, valuation, there has been little debate about financial issues on the local political agenda. Most council members are therefore not familiar with tax issues. If the central government did not set the tax rate and valuation, then there would probably be a more vigorous public debate and an electorate that was better informed about local politics. Elected mayors (as political and administrative leaders) are not counterbalanced and held in check by an active council and/or by vigorous local political parties. Local politics are concentrated on the mayor, electoral rules, the institutional design and political culture – all of which are not helpful in increasing the degree of influence that citizens and associations have upon collective decision-making at the local level.
The many alternations between democracy and autocracy have not been helpful in building strong institutions at the local level. Property tax revenues react slowly and the institutional context matters because an effective tax system as a public good can only be created if actors have long time horizons. The property tax has substantial revenue potential; however, since municipalities are going through a transfer bonanza, it is especially difficult to make a plea for increasing their own revenue base. Local governments should be the proponents of property tax reform, but in Peru they have little policy clout because the municipal associations are dispersed and there exists little relevant information concerning important local policy issues.
N2  - Ob die Auswirkungen von Fiskaltransfers auf die Generierung von lokalen Steuereinnahmen positiv oder negativ sind, wird in der akademischen Literatur weiterhin offen diskutiert. Die Komplexität und Ambivalenz der Fiskalbeziehungen zwischen Gebietskörperschaften und Zentralregierung führt manchmal selbst innerhalb eines gleichen Landes zu unterschiedlichen Ergebnissen. Die hier vorliegende Untersuchung hinterfragt kritisch den oft postulierten Effekt, in dem Eigeneinnahmen durch Transferzahlungen substituiert werden. Während die meisten wissenschaftlichen Arbeiten Regierungen entweder als tax-maximizers (Leviathan) oder als fiscal lazy darstellen, zeigt diese Untersuchung, dass die meisten Bürgermeister spezifisch auf eine Vielzahl von Anreizen rational und pragmatisch reagieren. Obwohl die Eigeneinnahmen der Lokalregierungen in Peru generell niedrig sind, kann ein direkter Zusammenhang zwischen kontinuierlich ansteigenden Grundsteuereinnahmen und Fiskalzuweisungen eher verneint werden. Die Anreizstruktur in Peru zur Generierung von lokalen Steuereinnahmen ist hinderlich und teilweise sogar kontraproduktiv. Die Zentralregierung und gewisse Spezialinstitutionen spielen in Peru wichtige Funktionen hinsichtlich lokaler Steuergenerierung und sind mitverantwortlich für die positive Gestaltung der Anreizstruktur.
KW  - Fiskaltransfers
KW  - Grundsteuer
KW  - Fiskalausgleich
KW  - Lokalsteuern
KW  - Dezentralisierung
KW  - fiscal transfers
KW  - property tax
KW  - decentralization
KW  - decision making
KW  - municipal government
Y1  - 2008
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-42100
ER  -