possible deadlock through raid5/md

on 05.10.2006 10:24:52 by ptb

A user has sent me a ps ax output showing an enbd client daemon
blocked in get_active_stripe (I presume in raid5.c).

ps ax -o f,uid,pid,ppid,pri,ni,vsz,rss,wchan:30,stat,tty,time,command

F UID PID PPID PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
5 0 26540 1 23 0 2140 1048 get_active_stripe Ds ? 00:00:00 enbd-client iss04 1300 -i iss04-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndl

Any idea how it can get there and what the blockage is? I assume it
is in wait_event_lock_irq(conf->wait_for_stripe ...) or
unplug_slaves(), modulo inlining.

Judging by the other info supplied, I believe the client(s) were, in
general terms, doing a read over the network. That means something
would be writing into a local kernel buffer attached to a bh.

That buffer would have come in attached to a kernel request to the enbd
driver. I presume that the enbd device is a component of a raid5 array,
being read.

Curiously, the above client daemon appears NOT to be a transfer daemon,
but rather a "watchdog". Its only function is to hold the enbd device
open. Getting into a D state like that is a neat trick, but nothing
compared with the trick of getting into the raid code!

My theory, and it is mine, is that on the last close of a device, the
blkdev_put/get code does a flush of requests to the device as the
openers count falls to zero. That would exert pressure through the
device at least, which could deadlock since the transfer daemons are
dying, but again, I have no idea how anything got into raid code.

Maybe a method attached to a page or bh? If, in order to write into
a buffer, the buffer somehow had to be "decided" by the raid code
via an attached method, maybe that would account for where this
got parked?

I'll add more info later as I get it. For the moment, wild theories
appreciated.

Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: possible deadlock through raid5/md

on 15.10.2006 22:06:51 by ptb

While travelling the last few days, a theory has occurred to me to
explain this sort of thing ...

> A user has sent me a ps ax output showing an enbd client daemon
> blocked in get_active_stripe (I presume in raid5.c).
>
> ps ax -o f,uid,pid,ppid,pri,ni,vsz,rss,wchan:30,stat,tty,time,command
>
> F UID PID PPID PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
> 5 0 26540 1 23 0 2140 1048 get_active_stripe Ds ? 00:00:00 enbd-client iss04 1300 -i iss04-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndl

Suppose that memory is full of dirty buffers and that the _transport_
for the medium on which one of the raid disks is running (in this case
tcp, under enbd and elsewhere) needs buffers. It needs buffers both to
read and to write. But there are none available, so the call through
the user process which wants to use the transport causes the kernel to
try to free pages.

That causes the user process to end up in the kernel routines which try
to flush devices to disk, and through them in the various (request?)
functions of device drivers, and perhaps even in raid5's
get_active_stripe.

However, if that stripe is on a remote disk available through tcp, then
tcp is blocked by lack of the very resources that we are trying to
free, so we are in deadlock?

Sound plausible? Cure ought to be to keep some kernel memory available
for tcp that is not available to dirty buffers.

Peter
