2009
Nieto Tovar, E.; Hernández Palacios, R.; Camacho Cruz, H. E.; Díaz García, A. F.; Anguita López, M.; Ortega Lopera, J. (2009). Data replication in PVFS2 to achieve fault tolerance. Escuela Superior de Huejutla. UAEH. Mexico. ISBN: In process.
Abstract
Using the disks of cluster nodes as a global storage system is inexpensive, but for it to be viable, the frequent failures of disks and cluster nodes must be addressed. These failures cause file access errors in applications. The number of such errors is especially significant on platforms with a large number of nodes that use parallel file systems such as PVFS. This paper shows how data replication has been added to the second version of the PVFS parallel file system (PVFS2) to achieve fault tolerance, and the impact of this implementation on performance.