Garbage Collected Filesystem

In computer science, a Garbage Collected Filesystem is a computer filesystem which uses garbage collection to free disk space when files or directories are deleted. Unlike traditional Unix-derived filesystems hard links to directories are allowed.

Data & Metadata
Files and directories consist of two parts: data and metadata. In this case, data is the contents of the file and metadata is data about the file or directory such as file size, owner, and permissions.

Names and Nodes
The filesystem can be considered to be a graph. Files and directories are nodes on that graph, and names are arrows from directory nodes to file nodes or other directory nodes.

Operations that operate on names are mknod (create new node linked to by this name), open (get node from existing name), unlink (remove name from directory). Operations that operate on nodes are anything that accesses any file data or metadata (most other operations). .

The special case is the link operation. On traditional systems, it takes two names and creates a link between the two. However since the first name already exists, a version that takes a node can be conceived, and should be provided.

Overview of Semantics
Local semantics are preserved from traditional hierarchical filesystems under most cases (an exception is .. which is ambiguous). Hard links exist in other filesystems: UNIX has always had them, and Windows NT has had them from at least version 4. These filesystems do not allow hard links to directories, only to files. In a garbage collected filesystem, it is always allowed to hard link directories, even if loops will be created. If /sandbox/foo and /sandbox/bar are two links to the same directory, then any /sandbox/foo/quux is also /sandbox/bar/quux. In addition, directory loops can be created. /sandbox/foo/baz can be created as a hard link to /sandbox/bar. /sandbox/foo/baz/baz/baz/quux is now the same as /sandbox/foo/quux.

Unlinking /sandbox/bar disturbs nothing: even /sandbox/foo/baz/quux works. If /sandbox/foo is unlinked the directory still exists, but it cannot be reached because all names and links to it have been removed. Also, quux is unreachable because its parent (/sandbox/foo or /sandbox/bar) is unreachable. The garbage collector will (when it next runs) find all unreachable nodes and convert them to free disk space.
 
< Prev   Next >