A Fast File System for UNIX

A common design for storing a file’s data blocks is via direct and indirect pointers. Direct pointers in blue point to data blocks containing the file’s data in pink. Indirect pointers are in green and point to blocks of direct pointers. Double indirect pointers are in orange pointing to blocks of indirect pointers, and triple indirect pointers are in red pointing to blocks of double indirect pointers. Having indirect pointers help scale the file (and its number of data blocks). For example, consider a block that can hold 4 pointers. With all direct pointers, it can point to 4 data blocks, but with 1 direct pointer it can point to 7 data blocks (3 direct, and 4 via the indirect block).
The first image shows the layout of the old file system (OFS) — superblock, inodes, and data blocks laid out sequentially. The second image shows one of the performance problems with OFS’s layout — a file’s inode and data can be far apart on the disk, requiring costly seek operations. The third image shows another problem with OFS’s layout — over time, because of file allocations and deletions, the free blocks can be fragmented across the disk. When a new file is allocated using these fragmented free blocks, its data is spread across the disk and accessing it requires expensive seek operations.
Cylinder groups, as shown in the OSTEP chapter on fast file system. The diagram shows platter with tracks. The outermost tracks (in dark gray) across all the platters form a cylinder. A group of cylinder (here, 3 outermost ones) form a cylinder group.
An schematic diagram of the fast file system (FFS) layout. It shows 4 cylinder groups. Each cylinder group has a copy of the file system’s super block. Each group has its own set of inodes and data blocks. Essentially, each cylinder group has its own mini-OFS. This addresses the key performance problem of OFS that it required costly seek operations. Each mini-OFS instance is within a cylinder group and seeks within a cylinder group are fast. FFS also tries to allocated related data (e.g., files in the same directory) to the same cylinder group while also trying to load-balance across cylinder groups (e.g., in its allocation of directories to cylinder groups).

--

--

--

Visit afterhoursacademic.com for summaries of computer science research papers.

Love podcasts or audiobooks? Learn on the go with our new app.

What is Spark?

My lifelong love affair with Microsoft Excel

Excel spreadsheet with love letter inside

Clind: Grow with your tribe

Sell your NFT in multiple currencies at the same time

STEPN: An NFT project that you can earn money with your every single step

SAFE and Solid: the internet as it should be

CUDA by Example : Notes

Zoker Game Demo — Information

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Rajat Kateja, After Hours Academic

Rajat Kateja, After Hours Academic

Visit afterhoursacademic.com for summaries of computer science research papers.

More from Medium

The value of fast feedback loop for developers

How to store multiple Organization's CloudTrail Logs into an S3 Bucket of a separate account

Indexing strategies in databases

Apache Pulsar is #5 in Commits to ASF Projects