There are 2 posts tagged with benchmark.

There is an Atom feed for posts tagged with benchmark.

Tail call elimination is good in C too

2008-04-09 Tags: , ,

All recursive algorithms can be converted into an iterative version. They teach that in school. They teach it as something that should be done, and it used to be true. However, we a living in a wonderful time and old wisdoms sometimes cease to be true.

Some algorithms are much more elegant when written recursively; the recursive version is also easier to prove. On the other hand, the iterative version will avoid many stack manipulation operations. Theoretically, any tail recursive function can be converted into an iteration transparently by the compiler without performance penalty compared to the hand written iterative version. Theoretically.

When I wanted to prove that Common Lisp had really fast implementations, I did my best to use the iterative version of the Mandelbrot formula. Unfortunately, there was no way around it, the iterative version would always run a bit faster. "So be it," I said to myself in resignation.

Too many files: Reiser FS vs hashed paths

2007-12-26 Tags: , ,

On my hard drive, some directories attract lint faster than my CPU fan. I wipe then clean but as soon as I blink they already contain a zillion files. And for some reason, having a zillion files in a directory on GNU/Linux is a really bad thing.

With Gazest, I decided to store only the files meta data in the database and to keep the content on disk. To prevent name clashes, the content filenames are the HMAC-SHA1 hashes of the data. Of course, that means that I have to expect a zillion files in the content directory and as soon as you mention the problem of to many files you hear "Reiser FS," loud and proud from the cube next to yours. But it is really the best solution?

I'll focus on a single problem: access time of a file stored in a directory with a lot of other files. I don't care about CPU usage, disk usage, fragmentation, paging, or recovery tools. When I read a file, how long will it take? Thats it. Surprisingly, I could not find good benchmark for that typical use case. Sure, there are many anecdotal tales of people who noticed a significant improvement with a new FS when they re-installed their mail server but they end up comparing completely different setups with a variable real world server load. Lack of measurements lead to counter productive faith based debates so here is my attempt to quantify the issue.