A distributed data exchange engine for polystores

Kaitoua, Abdulrahman; Rabl, Tilmann; Markl, Volker

doi:10.1515/itit-2019-0037

There is an increasing interest in fusing data from heterogeneous sources. Combining data sources increases the utility of existing datasets, generating new information and creating services of higher quality. A central issue in working with heterogeneous sources is data migration: in order to share and process data in different engines, resource-intensive and complex movements and transformations between computing engines, services, and stores are necessary. Muses is a distributed, high-performance data migration engine that is able to interconnect distributed data stores by forwarding, transforming, repartitioning, or broadcasting data among distributed engines' instances in a resource-, cost-, and performance-adaptive manner. As such, it performs seamless information sharing across all participating resources in a standard, modular manner. We show an overall improvement of 30 % for pipelining jobs across multiple engines, even when we count the overhead of Muses in the execution time. This performance gain implies that Muses can be used to optimise large pipelines that leverage multiple engines.

Metadata
Author details:	Abdulrahman Kaitoua, Tilmann Rabl, Volker Markl
DOI:	https://doi.org/10.1515/itit-2019-0037
ISSN:	1611-2776
ISSN:	2196-7032
Title of parent work (English):	Information technology : methods and applications of informatics and information technology
Title of parent work (German):	Information technology : Methoden und innovative Anwendungen der Informatik und Informationstechnik
Publisher:	De Gruyter
Place of publication:	Berlin
Publication type:	Scientific article
Language:	English
Date of first publication:	4 March 2020
Year of publication:	2020
Release date:	27 March 2023
Free keyword / tag:	big data; data integration; data migration; data transformation; distributed systems; engine
Volume:	62
Issue:	3-4
Number of pages:	12
First page:	145
Last page:	156
Funding institution:	German Ministry for Education and Research as BI-FOLD [01IS18025A; 01IS18037]
Organizational units:	Affiliated institutes / Hasso-Plattner-Institut für Digital Engineering gGmbH
DDC classification:	0 Computer science, information and general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science
	6 Technology, medicine, applied sciences / 62 Engineering / 620 Engineering and allied operations
Peer review:	Refereed
