Date of Original Version
© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract or Description
Lack of a highly scalable and parallel metadata service is the Achilles heel of many cluster file system deployments in both the HPC world and the Internet services world. This is because most cluster file systems have focused on scaling the data path, i.e., providing high-bandwidth parallel I/O to files that are gigabytes in size. But with the proliferation of massively parallel applications that produce metadata-intensive workloads, such as large numbers of simultaneous file creates and large-scale storage management, cluster file systems also need to scale metadata performance. To realize these goals, this paper makes a case for a scalable metadata service middleware that layers on existing cluster file system deployments and distributes file system metadata, including the namespace tree, small directories, and large directories, across many servers. Our key idea is to effectively synthesize a concurrent indexing technique, used to distribute metadata, with a tabular, on-disk representation of all file system metadata.
Proceedings of SC Companion: High Performance Computing, Networking Storage and Analysis Workshop, 2012, 30-35.