Re: [PATCH 05/10] block: remove per-queue plugging
Re: [PATCH 05/10] block: remove per-queue plugging
am 18.04.2011 09:25:51 von NeilBrown
On Mon, 18 Apr 2011 08:38:24 +0200 Jens Axboe wrote:
> On 2011-04-18 00:19, NeilBrown wrote:
> > On Mon, 11 Apr 2011 14:11:58 +0200 Jens Axboe wrote:
> >
> >>> Yes. But I need to know when to release the requests that I have stored.
> >>> I need to know when ->write_pages or ->read_pages or whatever has finished
> >>> submitting a pile of pages so that I can start processing the request that I
> >>> have put aside. So I need a callback from blk_finish_plug.
> >>
> >> OK fair enough, I'll add your callback patch.
> >>
> >
> > But you didn't did you? You added a completely different patch which is
> > completely pointless.
> > If you don't like my patch I would really prefer you said so rather than
> > silently replace it with something completely different (and broken).
>
> First of all, you were CC'ed on all that discussion, yet didn't speak up
> until now. This was last week. Secondly, please change your tone.
Yes, I was CC'ed on a discussion. In that discussion it was never mentioned
that you had completely changed the patch I sent you, and it never contained
the new patch in-line for review. Nothing that was discussed was
particularly relevant to md's needs so there was nothing to speak up about.
Yes- there were 'git pull' requests and I could have done a pull myself to
review the code but there seemed to be no urgency because you had already
agreed to apply my patch.
When I did finally pull the patches (after all the other issues had settle
down and I had time to finish of the RAID side) I found ... what I found.
I apologise for my tone, but I was very frustrated.
>
> > I'll try to explain again.
> >
> > md does not use __make_request. At all.
> > md does not use 'struct request'. At all.
> >
> > The 'list' in 'struct blk_plug' is a list of 'struct request'.
>
> I'm well aware of how these facts, but thanks for bringing it up.
>
> > Therefore md cannot put anything useful on the list in 'struct blk_plug'.
> >
> > So when blk_flush_plug_list calls queue_unplugged() on a queue that belonged
> > to a request found on the blk_plug list, that queue cannot possibly ever be
> > for an 'md' device (because no 'struct request' ever belongs to an md device,
> > because md doesn't not use 'struct request').
> >
> > So your patch (commit f75664570d8b) doesn't help MD at all.
> >
> > For md, I need to attach something to blk_plug which somehow identifies an md
> > device, so that blk_finish_plug can get to that device and let it unplug.
> > The most sensible thing to have is a completely generic callback. That way
> > different block devices (which choose not to use __make_request) can attach
> > different sorts of things to blk_plug.
> >
> > So can we please have my original patch applied? (Revised version using
> > list_splice_init included below).
> >
> > Or if not, a clear explanation of why not?
>
> So correct me if I'm wrong here, but the _only_ real difference between
> this patch and the current code in the tree, is the checking of the
> callback list indicating a need to flush the callbacks. And that's
> definitely an oversight. It should be functionally equivelant if md
> would just flag this need to get a callback, eg instead of queueing a
> callback on the list, just set plug->need_unplug from md instead of
> queuing a callback and have blk_needs_flush_plug() do:
>
> return plug && (!list_empty(&plug->list) || plug->need_unplug);
>
> instead. Something like the below, completely untested.
>
No, that is not the only real difference.
The real difference is that in the current code, md has no way to register
anything with a blk_plug because you can only register a 'struct request' on a
blk_plug, and md doesn't make any use of 'struct request'.
As I said in the Email you quote above:
> > Therefore md cannot put anything useful on the list in 'struct blk_plug'.
That is the heart of the problem.
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] block: remove per-queue plugging
am 18.04.2011 10:10:18 von Jens Axboe
On 2011-04-18 09:25, NeilBrown wrote:
> On Mon, 18 Apr 2011 08:38:24 +0200 Jens Axboe wrote:
>
>> On 2011-04-18 00:19, NeilBrown wrote:
>>> On Mon, 11 Apr 2011 14:11:58 +0200 Jens Axboe wrote:
>>>
>>>>> Yes. But I need to know when to release the requests that I have stored.
>>>>> I need to know when ->write_pages or ->read_pages or whatever has finished
>>>>> submitting a pile of pages so that I can start processing the request that I
>>>>> have put aside. So I need a callback from blk_finish_plug.
>>>>
>>>> OK fair enough, I'll add your callback patch.
>>>>
>>>
>>> But you didn't did you? You added a completely different patch which is
>>> completely pointless.
>>> If you don't like my patch I would really prefer you said so rather than
>>> silently replace it with something completely different (and broken).
>>
>> First of all, you were CC'ed on all that discussion, yet didn't speak up
>> until now. This was last week. Secondly, please change your tone.
>
> Yes, I was CC'ed on a discussion. In that discussion it was never mentioned
> that you had completely changed the patch I sent you, and it never contained
> the new patch in-line for review. Nothing that was discussed was
> particularly relevant to md's needs so there was nothing to speak up about.
>
> Yes- there were 'git pull' requests and I could have done a pull myself to
> review the code but there seemed to be no urgency because you had already
> agreed to apply my patch.
> When I did finally pull the patches (after all the other issues had settle
> down and I had time to finish of the RAID side) I found ... what I found.
>
> I apologise for my tone, but I was very frustrated.
>
>>
>>> I'll try to explain again.
>>>
>>> md does not use __make_request. At all.
>>> md does not use 'struct request'. At all.
>>>
>>> The 'list' in 'struct blk_plug' is a list of 'struct request'.
>>
>> I'm well aware of how these facts, but thanks for bringing it up.
>>
>>> Therefore md cannot put anything useful on the list in 'struct blk_plug'.
>>>
>>> So when blk_flush_plug_list calls queue_unplugged() on a queue that belonged
>>> to a request found on the blk_plug list, that queue cannot possibly ever be
>>> for an 'md' device (because no 'struct request' ever belongs to an md device,
>>> because md doesn't not use 'struct request').
>>>
>>> So your patch (commit f75664570d8b) doesn't help MD at all.
>>>
>>> For md, I need to attach something to blk_plug which somehow identifies an md
>>> device, so that blk_finish_plug can get to that device and let it unplug.
>>> The most sensible thing to have is a completely generic callback. That way
>>> different block devices (which choose not to use __make_request) can attach
>>> different sorts of things to blk_plug.
>>>
>>> So can we please have my original patch applied? (Revised version using
>>> list_splice_init included below).
>>>
>>> Or if not, a clear explanation of why not?
>>
>> So correct me if I'm wrong here, but the _only_ real difference between
>> this patch and the current code in the tree, is the checking of the
>> callback list indicating a need to flush the callbacks. And that's
>> definitely an oversight. It should be functionally equivelant if md
>> would just flag this need to get a callback, eg instead of queueing a
>> callback on the list, just set plug->need_unplug from md instead of
>> queuing a callback and have blk_needs_flush_plug() do:
>>
>> return plug && (!list_empty(&plug->list) || plug->need_unplug);
>>
>> instead. Something like the below, completely untested.
>>
>
> No, that is not the only real difference.
>
> The real difference is that in the current code, md has no way to register
> anything with a blk_plug because you can only register a 'struct request' on a
> blk_plug, and md doesn't make any use of 'struct request'.
>
> As I said in the Email you quote above:
>
>>> Therefore md cannot put anything useful on the list in 'struct blk_plug'.
>
> That is the heart of the problem.
Hmm, I don't really see a way to avoid the list in that case. You really
do need some way to queue items, a single callback or flag or pointer
will not suffice.
I've added the patch and removed the (now) useless ->unplugged_fn
callback. I suggest you base your md changes on top of my for-linus
branch and tell me when you are confident it looks good, then I'll pull
in your MD changes and submit them later today.
OK with you?
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] block: remove per-queue plugging
am 18.04.2011 11:19:59 von Christoph Hellwig
Btw, I really start to wonder if the request level is the right place
to do this on-stack plugging. Wouldn't it be better to just plug
bios in the on-stack queue? That way we could also stop doing the
special case merging when adding to the plug list, and leave all the
merging / I/O schedule logic in the __make_request path. Probably
not .39 material, but worth a prototype?
Also what this dicussion brought up is that the block layer data
structures are highly confusing. Using a small subset of the
request_queue also for make_request based driver just doesn't make
sense. It seems like we should try to migrate the required state
to struct gendisk, and submit I/O through a block_device_ops.submit
method, leaving the request_queue as an internal abstraction for
the request based drivers.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [dm-devel] [PATCH 05/10] block: remove per-queue plugging
am 18.04.2011 11:40:28 von Hannes Reinecke
On 04/18/2011 11:19 AM, hch@infradead.org wrote:
> Btw, I really start to wonder if the request level is the right place
> to do this on-stack plugging. Wouldn't it be better to just plug
> bios in the on-stack queue? That way we could also stop doing the
> special case merging when adding to the plug list, and leave all the
> merging / I/O schedule logic in the __make_request path. Probably
> not .39 material, but worth a prototype?
>
> Also what this dicussion brought up is that the block layer data
> structures are highly confusing. Using a small subset of the
> request_queue also for make_request based driver just doesn't make
> sense. It seems like we should try to migrate the required state
> to struct gendisk, and submit I/O through a block_device_ops.submit
> method, leaving the request_queue as an internal abstraction for
> the request based drivers.
>
Good point.
It would also help us we the device-mapper redesign agk and myself=20
discussed at LSF. Having a block_device_ops.submit function would
allow us remap the actual request queue generically; and we would=20
even be able to address more than one request queue, which sounds=20
awfully similar to what Jens is trying to do ...
Cheers,
Hannes
--=20
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" i=
n
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/10] block: remove per-queue plugging
am 18.04.2011 11:46:30 von Jens Axboe
On 2011-04-18 11:19, hch@infradead.org wrote:
> Btw, I really start to wonder if the request level is the right place
> to do this on-stack plugging. Wouldn't it be better to just plug
> bios in the on-stack queue? That way we could also stop doing the
> special case merging when adding to the plug list, and leave all the
> merging / I/O schedule logic in the __make_request path. Probably
> not .39 material, but worth a prototype?
>
> Also what this dicussion brought up is that the block layer data
> structures are highly confusing. Using a small subset of the
> request_queue also for make_request based driver just doesn't make
> sense. It seems like we should try to migrate the required state
> to struct gendisk, and submit I/O through a block_device_ops.submit
> method, leaving the request_queue as an internal abstraction for
> the request based drivers.
Partially agree, I've never really liked the two methods we have where
the light light version was originally meant for stacked devices but
gets used elsewhere now too. It also causes IO scheduling problems, and
then you get things like request based dm to work around that.
But the idea is really to move more towards private queueing more
localized, the multiqueue setup will really apply well there too. I'm
trying to flesh out the design of that, ideally it would be nice to
unify the different bits we have now.
But agree on pulling the stacked bits into some lower part, like the
gendisk. It would clean that up nicely.
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [dm-devel] [PATCH 05/10] block: remove per-queue plugging
am 18.04.2011 11:47:39 von Jens Axboe
On 2011-04-18 11:40, Hannes Reinecke wrote:
> On 04/18/2011 11:19 AM, hch@infradead.org wrote:
>> Btw, I really start to wonder if the request level is the right place
>> to do this on-stack plugging. Wouldn't it be better to just plug
>> bios in the on-stack queue? That way we could also stop doing the
>> special case merging when adding to the plug list, and leave all the
>> merging / I/O schedule logic in the __make_request path. Probably
>> not .39 material, but worth a prototype?
>>
>> Also what this dicussion brought up is that the block layer data
>> structures are highly confusing. Using a small subset of the
>> request_queue also for make_request based driver just doesn't make
>> sense. It seems like we should try to migrate the required state
>> to struct gendisk, and submit I/O through a block_device_ops.submit
>> method, leaving the request_queue as an internal abstraction for
>> the request based drivers.
>>
> Good point.
> It would also help us we the device-mapper redesign agk and myself
> discussed at LSF. Having a block_device_ops.submit function would
> allow us remap the actual request queue generically; and we would
> even be able to address more than one request queue, which sounds
> awfully similar to what Jens is trying to do ...
The multiqueue bits would still have one request_queue, but multiple
queueing structures (I called those blk_queue_ctx, iirc).
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html