OS/2 Filesystem Shootout!

By Michal Necasek

The days of DOS when you could choose any filesystem as long as it was FAT are long gone. Since the introduction of OS/2 Warp Server for e-Business, OS/2 and eCS users can choose between three IBM-supplied filesystems (FAT, HPFS, JFS) and four installable file system (IFS) drivers - for HPFS there are two possibilities, "regular" HPFS and HPFS386, although the latter is only available in server versions and at considerable extra cost. I don't even count the variety of third-party IFSs - most of them have very specific purposes and are not suitable as generic filesystem drivers due to poor performance, lack of features or special intended uses.

Because I'm a curious person, I wanted to know which one of these IFSs is the best. The following paragraphs are an attempt to find an answer to this question. Perhaps not too surprisingly, there is no clear answer... but read on.

If we want to choose the "best" IFS, we need to be able to tell if one IFS is better than another. I decided to compare the filesystems based on performance and features. Features are rather difficult to express in numbers, which leaves us with performance, and to compare filesystem performance it is necessary to run benchmarks. Unlike some hot benchmarking "pros" I'm not going to pretend that my benchmarks are universally applicable or necessarily even important. Draw your own conclusions from the numbers - and at your own risk. If you want to be really sure, roll your own benchmarks which model how you use your computer.

First I will describe the hardware used for the benchmarks. It is a 600 MHz Pentium III with 256MB RAM and a Maxtor 40GB 7200RPM EIDE disk. I did not run the tests on this disk however - for that I used a Fujitsu Ultra-160 18.2GB 7200RPM SCSI disk attached to a venerable Adaptec 2940UW PCI SCSI host controller. This setup allowed me to create one relatively small 1GB volume and reformat it with the various filesystems. That ensured that all tests were run on a clean, unfragmented volume and that there was no skew introduced by different speeds of different parts of the disk. With this setup I am reasonably certain that it was only the IFSs that made the difference between test results and no other factors.

In retrospect, using the fast SCSI drive for benchmarking wasn't a very good idea for a simple reason: a fast drive tends to equalize the filesystem differences in performance. Look at it this way - if the disk was infinitely fast, there should be no differences in filesystem performance. The slower the disk, the more important the IFS efficiency becomes. Still, even with the fast drive there were clearly discernible trends in filesystem performance.

For both HPFS386 and JFS I used 32MB caches with default settings for lazy write timeouts etc. For plain HPFS I used the maximum 2MB cache and for FAT the maximum 14MB cache. The base system was eCS GA with HPFS386.IFS from WSeB GA and JFS.IFS from October 16, 2001.

At first I intended to run "classic" benchmarks from Sysbench but then decided against it for two important reasons:


 * Sysbench for some reason refused to run on my JFS volumes. That alone was a very good reason not to use it.
 * I realized that Sysbench runs read/write tests in 1:1 ratio. That is very different from real world usage where the read:write ratio is usually more like 10:1 or even much higher.

So I decided to set up my own tests. Yes, these tests were completely arbitrary. And yes, I believe they told me more than Sysbench would have - because they were "real world" tests. I will briefly describe the tests:


 * ZIP test. A largish (120MB) ZIP file containing mostly large files (about a hundred in total) was unzipped, zipped up to another archive and the uncompressed files deleted. This test primarily stressed the read and write throughput of the filesystem, with relatively little file creation and deletion.
 * Build test. Running "dmake build" on the SciTech MGL libraries using the Watcom 11.0c C/C++ compiler. The choice of compiler and source code is largely irrelevant; I'm sure the results would be similar with other projects and other compilers. This test primarily stressed cache efficiency and also creation and deletion of a large number of mostly very small files (lots of small temporary files get created and deleted in the build process).
 * Read test. Reading a big (about 600MB) file. This test stressed raw filesystem read throughput (the results were rather intriguing).
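
A read test of this kind is easy to approximate with a short script. The following is only a minimal sketch in Python, not the tool used for the measurements in this article; the function names, the 64KB chunk size and the default of three runs are assumptions chosen for illustration.

```python
import time


def time_sequential_read(path, chunk_size=64 * 1024):
    """Read the whole file front to back; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering; the OS and the
    # filesystem cache still sit between this loop and the disk.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty result means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def average_read_time(path, runs=3):
    """Repeat the read test and return the mean elapsed time in seconds."""
    times = [time_sequential_read(path)[1] for _ in range(runs)]
    return sum(times) / len(times)
```

Calling `average_read_time` with the path of a large file gives one number per filesystem to compare. The file should be much bigger than the filesystem cache, otherwise the repeated runs measure the cache rather than the disk.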

The following table summarizes the test results. Each test was run at least three times and the results were averaged to suppress any flukes. All figures are given in seconds, hence smaller numbers are better. For clarity, the best score in each test is highlighted in blue and the worst score in red.



The above table makes several points very clear:


 * The performance of plain HPFS is not very good. The maximum cache size is ridiculously small but that doesn't explain the surprisingly poor performance of large sequential reads.
 * The performance delta between HPFS386 and JFS is very small.
 * HPFS386 is the fastest on writes; JFS is somewhat slowed down by the journaling overhead.
 * JFS clearly has the best read throughput, most likely due to a straighter path through the kernel. I suspect that FAT has a similar advantage (though for different reasons).
 * The differences in raw read throughput are simply amazing. The winner (JFS) was very nearly 100% faster than the loser (HPFS). I was impressed by JFS's performance because the theoretical maximum throughput of UW SCSI is 40MB/sec. I consider achieving slightly over 75% of the theoretical maximum at application level quite good.
 * It is necessary to differentiate between the filesystem layout on storage media and the actual filesystem driver. The latter is obviously tremendously important as the comparison of HPFS versus HPFS386 shows. The performance difference is striking when we consider that both IFSs organize the data on storage media in exactly the same way.
 * It is interesting that out of only three tests and four filesystems, no filesystem consistently scored best or worst. That shows how difficult it is to pick a winner.
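
The bus utilization arithmetic behind the read throughput remark can be written down in a few lines. This is a minimal sketch in Python; the function names are hypothetical, and the only inputs taken from the text are the 40MB/sec theoretical maximum of UW SCSI, the approximate 600MB size of the test file and the 75% figure.

```python
# Theoretical maximum throughput of UW SCSI, as quoted in the text.
UW_SCSI_MAX_MB_PER_SEC = 40.0


def throughput_mb_per_sec(size_mb, seconds):
    """Application-level throughput for reading size_mb in the given time."""
    return size_mb / seconds


def bus_utilization(size_mb, seconds, bus_max=UW_SCSI_MAX_MB_PER_SEC):
    """Fraction of the theoretical bus maximum that was actually achieved."""
    return throughput_mb_per_sec(size_mb, seconds) / bus_max


def seconds_for_utilization(size_mb, fraction, bus_max=UW_SCSI_MAX_MB_PER_SEC):
    """Time needed to read size_mb at a given fraction of the bus maximum."""
    return size_mb / (fraction * bus_max)
```

For example, `seconds_for_utilization(600, 0.75)` returns 20.0: reading about 600MB at 75% of the bus maximum (30MB/sec) takes roughly 20 seconds. That number is derived from the figures above for illustration and is not a measured result.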

So which filesystem is the best? The answer is "it depends" - that is, it depends on the user's needs. To make things simpler, let's first see which filesystems are not the best:


 * FAT - the performance isn't terribly good even with a big fat cache. And when it comes to features, FAT is the clear loser. The lack of long filenames and the maximum volume size limit of 2GB preclude FAT from serious use. Its only saving grace is wide compatibility with other OSes and the fact that FAT is still a good filesystem for floppies.
 * HPFS - the features are almost as good as those of HPFS386 but the performance isn't stellar. The extremely small cache size limit seems to be HPFS's worst deficiency but sequential read performance isn't very impressive either - HPFS was by far the slowest in that test, slower even than FAT. My recommendation: use plain HPFS as little as possible.

That leaves two contestants ahead of the pack: HPFS386 and JFS. There is no clear winner. There is little difference between these two IFSs performance-wise. HPFS386 is faster on writes but JFS has a clear edge when it comes to reading big chunks of data. Both have very efficient caches - in the build test the CPU was 100% utilized almost all the time with both filesystems. Unless you actually take a stopwatch, both IFSs perform equally well, although each of them has specific strengths and weaknesses.

These differences are not surprising given these filesystems' very different histories: HPFS386 was reportedly written by Gordon Letwin, the father of HPFS, in the late 1980s. JFS was developed for IBM's variant of Unix, AIX (probably in the early 1990s). HPFS386 is hand-optimized 386 assembly code, JFS is written in C - but the above numbers show that there's more to performance than tight loops.

If we can't decide which IFS is better on performance, we can try to base the decision on features. As I hinted above, this is not easy because features are impossible to quantify. Only an individual user with specific needs can decide which feature set suits him or her best. A detailed description of the respective filesystems' feature sets can be found elsewhere so I'll just summarize the points that I find most important:


 * HPFS386 is bootable, JFS is not (although that might change in the future).
 * HPFS386 supports files with a maximum size of 2GB; the JFS limit is far larger.
 * CHKDSK on a large HPFS386 volume may take hours, on JFS it's usually seconds.
 * JFS comes with newer versions of OS/2 and eCS, HPFS386 costs extra.

I have been using both JFS and HPFS386 over the past years. I have found both to be fast and reliable - those are my own experiences, I can't speak for anyone else.

My personal choice is JFS for one simple reason: HPFS386 is a dead end. It is partially owned by Microsoft and no one can seriously expect any new development to be done on it. That was the reason why IBM decided to include JFS in WSeB after all. JFS on the other hand is available in source code and JFS support was added to Linux. In other words - nobody can take JFS away which is more than can be said for HPFS386.

Conclusion
I'm not going to make a choice for anyone else. You decide what's best for you. I attempted to present some interesting (and perhaps even unexpected) data here but you'll have to draw your own conclusions. I won't pretend I know what's best for you. Maybe you don't either, but that's another matter.