Chasing FIFO Bug

Apr 27, 2007 | 2 minutes read

Tags: Bug, Crash

While testing RELENG_6 (for weeks now), I encountered a strange bug when running both a UP and an SMP kernel under heavy parallel load, such as building world with the -j flag. The symptom was simply a panic, no more, no less.

After posting about it on current@, I received good help from Robert N M Watson himself, who asked for crash dumps and testing. After some work on his side, he came up with a large set of changes to FreeBSD's FIFO implementation, now committed to the source tree.

As a side note, it should be mentioned that these modifications seem to solve the problem even on the UP machine, but Robert said that this one may have had another origin and may not be totally solved. Here is an excerpt from one of his replies:

This is actually interesting -- Kris Kennaway reported the same thing -- that is, that with the most recent spate of FIFO bug fixes, the problem has gone away. The only problem is that I don't think I fixed a bug with the symptoms that you have experienced, which suggests instead that I've changed the timing of the bug so that it occurs less rarely. I sent Kris a set of assertion patches to run with, and he generates some nice parallel make load, and I have a regression test I've been running. I think we're set for now and I'll continue to work on reproducing it after my recent fifo cleanup. It is possible I fixed a race condition without meaning to, of course... Please let me know if you have any further FIFO panics! Thanks again, Robert N M Watson

I can only say that the panic hasn't occurred since then. Thanks to him.