
Migol : a fault-tolerant service framework for MPI applications in the grid

Especially for the sciences, the provision of massively parallel CPU capacity is one of the most attractive features of a grid. A major challenge in a distributed, inherently dynamic grid is fault tolerance. The more resources and components are involved, the more complicated and error-prone the system becomes. In a grid with potentially thousands of interconnected machines, the reliability of individual resources cannot be guaranteed. The benefit of the grid is that, in case of a failure, an application may be migrated and restarted from a checkpoint file on another site. This approach requires a service infrastructure that handles the necessary activities transparently. In this article, we present Migol, a fault-tolerant and self-healing grid middleware for MPI applications. Migol is based on open standards and extends the services of the Globus toolkit to support the fault tolerance of grid applications. Further, the Migol framework itself is designed with a special focus on fault tolerance. For example, Migol replicates critical services and uses a ring-based replication protocol to achieve data consistency. (c) 2007 Elsevier B.V. All rights reserved.

Metadata
Authors: André Luckow, Bettina Schnor
DOI: https://doi.org/10.1016/j.future.2007.03.007
Publication type: Scientific article
Language: English
Year of first publication: 2008
Publication year: 2008
Release date: 25 March 2017
Source: Future Generation Computer Systems 24 (2008), 2, pp. 142-152
Organizational units: Mathematisch-Naturwissenschaftliche Fakultät / Institut für Informatik und Computational Science
Name of the institution at the time of publication: Mathematisch-Naturwissenschaftliche Fakultät / Institut für Informatik