Return-Path: <axboe@suse.de>
Delivered-To: kernel@kolivas.org
Received: from bhhdoa.org.au (bhhdoa.org.au [216.17.101.199])
	by mail.kolivas.org (Postfix) with ESMTP id B2F1E3E52C
	for <kernel@kolivas.org>; Wed, 14 Jul 2004 22:12:16 +1000 (EST)
Received: from virtualhost.dk (ns.virtualhost.dk [195.184.98.160])
	by bhhdoa.org.au (Postfix) with ESMTP id 08EEB50B4E
	for <kernel@kolivas.org>; Wed, 14 Jul 2004 05:45:00 -0400 (EDT)
Received: from [62.242.22.158] (helo=wiggum.home.kernel.dk)
	by virtualhost.dk with esmtp (Exim 3.36 #1)
	id 1BkicB-0006v8-00; Wed, 14 Jul 2004 14:11:59 +0200
Received: from axboe by wiggum.home.kernel.dk with local (Exim 4.30)
	id 1Bkic9-0001Rn-N4; Wed, 14 Jul 2004 14:11:57 +0200
Date: Wed, 14 Jul 2004 14:11:57 +0200
From: Jens Axboe <axboe@suse.de>
To: Andrew Morton <akpm@osdl.org>
Cc: kernel@kolivas.org
Subject: [axboe@suse.de: Re: [waider@waider.ie: some sort of IDE-related hang in the last two or three 2.6test kernel rpms]]
Message-ID: <20040714121157.GA5558@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline


Two fixes:

o Still a potential __GFP_WAIT allocation under queue lock

o memset crq after alloc; with slab poisoning enabled, rb_next() can
  return a bogus next pointer.

----- Forwarded message from Jens Axboe <axboe@suse.de> -----

From: Jens Axboe <axboe@suse.de>
Date: Wed, 14 Jul 2004 11:56:27 +0200
To: Arjan van de Ven <arjanv@redhat.com>
Subject: Re: [waider@waider.ie: some sort of IDE-related hang in the last two or three 2.6test kernel rpms]

On Wed, Jul 14 2004, Jens Axboe wrote:
> On Wed, Jul 14 2004, Jens Axboe wrote:
> > On Wed, Jul 14 2004, Arjan van de Ven wrote:
> > > Hi,
> > > 
> > > This is the 3rd such report I get... I wonder if there's still something
> > > wrong with the cfq changes......
> > 
> > Fudge, let me double check it...
> 
> Duh... I'll test this now.

Actually, I think there are two bugs... crq needs to be completely
cleared if it is poisoned, else rb_next() can return a bogus pointer.
This one works for me.

Index: linux-2.6.7-ck/drivers/block/cfq-iosched.c
===================================================================
--- linux-2.6.7-ck.orig/drivers/block/cfq-iosched.c	2004-07-30 10:47:18.121053387 +1000
+++ linux-2.6.7-ck/drivers/block/cfq-iosched.c	2004-07-30 10:47:21.923457496 +1000
@@ -460,22 +460,36 @@
 					 int gfp_mask)
 {
 	const int hashval = hash_long(current->tgid, CFQ_QHASH_SHIFT);
-	struct cfq_queue *cfqq = __cfq_find_cfq_hash(cfqd, pid, hashval);
+	struct cfq_queue *cfqq, *new_cfqq = NULL;
+	request_queue_t *q = cfqd->queue;
 
-	if (!cfqq) {
-		cfqq = mempool_alloc(cfq_mpool, gfp_mask);
+retry:
+	cfqq = __cfq_find_cfq_hash(cfqd, pid, hashval);
 
-		if (cfqq) {
-			INIT_LIST_HEAD(&cfqq->cfq_hash);
-			INIT_LIST_HEAD(&cfqq->cfq_list);
-			RB_CLEAR_ROOT(&cfqq->sort_list);
-
-			cfqq->pid = pid;
-			cfqq->queued[0] = cfqq->queued[1] = 0;
-			list_add(&cfqq->cfq_hash, &cfqd->cfq_hash[hashval]);
-		}
+	if (!cfqq) {
+		if (new_cfqq) {
+			cfqq = new_cfqq;
+			new_cfqq = NULL;
+		} else if (gfp_mask & __GFP_WAIT) {
+			spin_unlock_irq(q->queue_lock);
+			new_cfqq = mempool_alloc(cfq_mpool, gfp_mask);
+			spin_lock_irq(q->queue_lock);
+			goto retry;
+		} else
+			return NULL;
+
+		INIT_LIST_HEAD(&cfqq->cfq_hash);
+		INIT_LIST_HEAD(&cfqq->cfq_list);
+		RB_CLEAR_ROOT(&cfqq->sort_list);
+
+		cfqq->pid = pid;
+		cfqq->queued[0] = cfqq->queued[1] = 0;
+		list_add(&cfqq->cfq_hash, &cfqd->cfq_hash[hashval]);
 	}
 
+	if (new_cfqq)
+		mempool_free(new_cfqq, cfq_mpool);
+
 	return cfqq;
 }
 
@@ -653,6 +667,7 @@
 
 	crq = mempool_alloc(cfqd->crq_pool, gfp_mask);
 	if (crq) {
+		memset(crq, 0, sizeof(*crq));
 		RB_CLEAR(&crq->rb_node);
 		crq->request = rq;
 		crq->cfq_queue = NULL;

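For readers who want to try the first fix outside the kernel, here is a
rough userspace sketch of the same "drop the lock, allocate, retake the
lock, look up again" pattern. It is only an illustration under stated
assumptions: a pthread mutex stands in for the queue lock, malloc() for
mempool_alloc(), and a plain linked list for the cfq hash. The names
table_lock, struct entry, find_entry_locked(), get_entry() and may_sleep
are hypothetical and do not appear in the patch.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cfq_queue, keyed by pid. */
struct entry {
	int pid;
	struct entry *next;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *table_head;

/* Lookup; caller must hold table_lock. */
static struct entry *find_entry_locked(int pid)
{
	struct entry *e;

	for (e = table_head; e; e = e->next)
		if (e->pid == pid)
			return e;
	return NULL;
}

/*
 * Find or create the entry for pid. Called with table_lock held and
 * returns with it held. may_sleep plays the role of __GFP_WAIT: only
 * then do we drop the lock around the allocation.
 */
static struct entry *get_entry(int pid, int may_sleep)
{
	struct entry *e, *new_e = NULL;

retry:
	e = find_entry_locked(pid);
	if (!e) {
		if (new_e) {
			/* Second pass: use the entry allocated earlier. */
			e = new_e;
			new_e = NULL;
		} else if (may_sleep) {
			/* Never allocate (and possibly block) under the lock. */
			pthread_mutex_unlock(&table_lock);
			new_e = malloc(sizeof(*new_e));
			pthread_mutex_lock(&table_lock);
			if (!new_e)
				return NULL; /* unlike a mempool, malloc() can fail */
			/* Someone may have added pid while we were unlocked. */
			goto retry;
		} else {
			return NULL;
		}

		e->pid = pid;
		e->next = table_head;
		table_head = e;
	}

	/* Lost the race: another thread inserted pid first. */
	if (new_e)
		free(new_e);

	return e;
}
```

The second lookup after retaking the lock is the important part: without
it, two callers racing for the same pid could each insert their own
entry.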
