From: Linux Kernel Mailing List <linux-kernel-AT-vger.kernel.org>
To: bk-commits-head-AT-vger.kernel.org
Subject: [PATCH] make the pagecache lock irq-safe.
Date: Mon, 12 Apr 2004 20:10:41 +0000
ChangeSet 1.1928, 2004/04/12 13:10:41-07:00, akpm@osdl.org
[PATCH] make the pagecache lock irq-safe.
Intro to these patches:
- Major surgery against the pagecache, radix-tree and writeback code. This
work is to address the O_DIRECT-vs-buffered data exposure horrors which
we've been struggling with for months.
As a side-effect, 32 bytes are saved from struct inode and eight bytes
are removed from struct page. This comes at a cost of approximately 2.5
bits per page in the radix tree nodes on 4k pagesize, assuming the
pagecache is densely populated. Not all pages are pagecache; other pages
gain the full 8 byte saving.
This change will break any arch code which is using page->list and will
also break any arch code which is using page->lru of memory which was
obtained from slab.
The basic problem which we (mainly Daniel McNeil) have been struggling
with is in getting a really reliable fsync() across the page lists while
other processes are performing writeback against the same file. It's like
juggling four bars of wet soap with your eyes shut while someone is
whacking you with a baseball bat. Daniel pretty much has the problem
plugged but I suspect that's just because we don't have testcases to
trigger the remaining problems. The complexity and additional locking
which those patches add is worrisome.
So the approach taken here is to remove the page lists altogether and
replace the list-based writeback and wait operations with in-order
radix-tree walks.
The radix-tree code has been enhanced to support "tagging" of pages, for
later searches for pages which have a particular tag set. This means that
we can ask the radix tree code "find me the next 16 dirty pages starting at
pagecache index N" and it will do that in O(log64(N)) time.
This affects I/O scheduling potentially quite significantly. It is no
longer the case that the kernel will submit pages for I/O in the order in
which the application dirtied them. We instead submit them in file-offset
order all the time.
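To make the tagged lookup concrete, here is a minimal user-space sketch. It
is not the kernel's radix-tree code: the tree is fixed at two levels of 64
slots (so it covers indices 0..4095 only), it tracks a single "dirty" tag,
and every name in it (struct tree, tag_set_dirty, gang_lookup_dirty) is
hypothetical. It shows the two properties described above: a clear tag bit
in a parent lets the search skip a whole subtree, and results come back in
index order regardless of the order in which pages were dirtied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FANOUT 64                 /* slots per node, as in the log64 remark */

/* Hypothetical leaf node: one dirty bit per page slot. */
struct leaf {
	uint64_t dirty;
};

/* Hypothetical fixed two-level tree covering indices 0..4095. */
struct tree {
	uint64_t dirty;           /* bit i set: leaf i holds a dirty page */
	struct leaf leaves[FANOUT];
};

/* Tag one index dirty and propagate the tag to the parent. */
void tag_set_dirty(struct tree *t, unsigned long index)
{
	assert(index < FANOUT * FANOUT);
	unsigned long l = index / FANOUT, s = index % FANOUT;

	t->leaves[l].dirty |= (uint64_t)1 << s;
	t->dirty |= (uint64_t)1 << l;
}

/* Clear the tag; clear the parent's bit once the leaf is fully clean. */
void tag_clear_dirty(struct tree *t, unsigned long index)
{
	assert(index < FANOUT * FANOUT);
	unsigned long l = index / FANOUT, s = index % FANOUT;

	t->leaves[l].dirty &= ~((uint64_t)1 << s);
	if (t->leaves[l].dirty == 0)
		t->dirty &= ~((uint64_t)1 << l);
}

/* Collect up to max dirty indices >= start, in ascending index order. */
size_t gang_lookup_dirty(const struct tree *t, unsigned long start,
			 unsigned long *out, size_t max)
{
	size_t n = 0;

	for (unsigned long l = start / FANOUT; l < FANOUT && n < max; l++) {
		if (!(t->dirty & ((uint64_t)1 << l)))
			continue;  /* whole subtree is clean: skip it */

		/* Only the first leaf visited starts mid-node. */
		unsigned long s = (l == start / FANOUT) ? start % FANOUT : 0;

		for (; s < FANOUT && n < max; s++)
			if (t->leaves[l].dirty & ((uint64_t)1 << s))
				out[n++] = l * FANOUT + s;
	}
	return n;
}
```

Dirtying indices 700, 5 and 64 in that order and then asking for dirty
pages starting at index 6 yields 64 followed by 700, which is the
file-offset ordering described above.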
This is likely to be advantageous when applications are seeking all over
a large file randomly writing small amounts of data. I haven't performed
much benchmarking, but tiobench random write throughput seems to be
increased by 30%. Other tests appear to be unaltered. dbench may have got
10-20% quicker, but it's variable.
There is one large file which everyone seeks all over randomly writing
small amounts of data: the blockdev mapping which caches filesystem
metadata. The kernel's IO submission patterns for this are now ideal.
Because writeback and wait-for-writeback use a tree walk instead of a
list walk they are no longer livelockable. This probably means that we no
longer need to hold i_sem across O_SYNC writes and perhaps fsync() and
fdatasync(). This may be beneficial for databases: multiple processes
writing and syncing different parts of the same file at the same time can
now all submit and wait upon writes to just their own little bit of the
file, so we can get a lot more data into the queues.
It is trivial to implement a part-file-fdatasync() as well, so
applications can say "sync the file from byte N to byte M", and multiple
applications can do this concurrently. This is easy for ext2 filesystems,
but probably needs lots of work for data-journalled filesystems and XFS and
it probably doesn't offer much benefit over an i_semless O_SYNC write.
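As an illustration of why a byte-range sync maps naturally onto an
index-ordered tree walk, the sketch below converts an inclusive byte range
into the inclusive range of pagecache indices it touches. It assumes 4k
pages, and the names (struct page_range, byte_range_to_pages,
page_ranges_overlap) are hypothetical; nothing here is taken from the
patches themselves.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12          /* assumption: 4k pages */

/* Hypothetical inclusive range of pagecache indices. */
struct page_range {
	uint64_t first;
	uint64_t last;
};

/* Inclusive byte range [from, to] -> inclusive page index range. */
struct page_range byte_range_to_pages(uint64_t from, uint64_t to)
{
	assert(from <= to);
	struct page_range r = { from >> PAGE_SHIFT_4K, to >> PAGE_SHIFT_4K };
	return r;
}

/* Nonzero if two syncers would walk at least one common page. */
int page_ranges_overlap(struct page_range a, struct page_range b)
{
	return a.first <= b.last && b.first <= a.last;
}
```

Two processes syncing bytes 0..4095 and 4096..10000 would walk page 0 and
pages 1..2 respectively, so their walks do not overlap.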
These patches can end up making ext3 (even) slower:
	for i in 1 2 3 4
	do
		dd if=/dev/zero of=$i bs=1M count=2000 &
	done
runs awfully slow on SMP. This is, yet again, because all the file
blocks are jumbled up and the per-file linear writeout causes tons of
seeking. The above test runs sweetly on UP because on UP we don't
allocate blocks to different files in parallel.
Mingming and Badari are working on getting block reservation working for
ext3 (preallocation on steroids). That should fix ext3 up.
This patch:
- Later, we'll need to access the radix trees from inside disk I/O
completion handlers. So make mapping->page_lock irq-safe. And rename it
to tree_lock to reliably break any missed conversions.
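For readers unfamiliar with why a lock taken from a completion handler must
be irq-safe, here is a user-space analogy, not kernel code. A POSIX signal
stands in for the interrupt and blocking that signal stands in for
disabling interrupts. All names (fake_irq, lock_irq, unlock_irq, run_demo)
are hypothetical. If the handler could run while the lock is held, a real
spinlock would spin forever on the same CPU; masking the "interrupt" for
the duration of the critical section defers the handler until after the
unlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t lock_held;        /* stand-in for tree_lock */
static volatile sig_atomic_t handler_runs;     /* times the "irq" fired */
static volatile sig_atomic_t handler_saw_lock; /* "irq" ran under the lock */

/* The "completion handler": it would need the lock itself. */
static void fake_irq(int sig)
{
	(void)sig;
	handler_runs++;
	if (lock_held)
		handler_saw_lock = 1;  /* a real spinlock would deadlock here */
}

/* Analogue of spin_lock_irq(): mask the "interrupt", then take the lock. */
static void lock_irq(sigset_t *saved)
{
	sigset_t block;

	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, saved);
	lock_held = 1;
}

/*
 * Drop the lock, then unmask. Restoring a saved mask is closer to
 * spin_lock_irqsave()/spin_unlock_irqrestore(); spin_unlock_irq()
 * re-enables interrupts unconditionally.
 */
static void unlock_irq(const sigset_t *saved)
{
	lock_held = 0;
	sigprocmask(SIG_SETMASK, saved, NULL); /* pending signal runs here */
}

/* Returns 1 if the handler was deferred until after the unlock. */
int run_demo(void)
{
	sigset_t saved;

	handler_runs = 0;
	handler_saw_lock = 0;
	signal(SIGUSR1, fake_irq);

	lock_irq(&saved);
	raise(SIGUSR1);                  /* "interrupt" arrives mid-section */
	int ran_inside = handler_runs;   /* still masked, so still 0 */
	unlock_irq(&saved);

	return ran_inside == 0 && handler_runs == 1 && !handler_saw_lock;
}
```

The same reasoning explains the nested pattern in mm/swap_state.c below:
once the outer lock has been taken with spin_lock_irq(), interrupts are
already off, so the inner lock can use plain spin_lock().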
# This patch includes the following deltas:
# ChangeSet 1.1927 -> 1.1928
# fs/cifs/file.c 1.41 -> 1.42
# mm/readahead.c 1.41 -> 1.42
# mm/vmscan.c 1.199 -> 1.200
# mm/swapfile.c 1.92 -> 1.93
# fs/mpage.c 1.45 -> 1.46
# include/linux/fs.h 1.299 -> 1.300
# mm/filemap.c 1.230 -> 1.231
# mm/swap_state.c 1.62 -> 1.63
# ipc/shm.c 1.34 -> 1.35
# fs/buffer.c 1.226 -> 1.227
# mm/page-writeback.c 1.77 -> 1.78
# mm/truncate.c 1.12 -> 1.13
# fs/inode.c 1.115 -> 1.116
# fs/fs-writeback.c 1.46 -> 1.47
#
fs/buffer.c | 8 ++++----
fs/cifs/file.c | 10 +---------
fs/fs-writeback.c | 4 ++--
fs/inode.c | 2 +-
fs/mpage.c | 10 +++++-----
include/linux/fs.h | 2 +-
ipc/shm.c | 2 --
mm/filemap.c | 50 +++++++++++++++++++++++++-------------------------
mm/page-writeback.c | 10 +++++-----
mm/readahead.c | 8 ++++----
mm/swap_state.c | 22 +++++++++++-----------
mm/swapfile.c | 8 ++++----
mm/truncate.c | 8 ++++----
mm/vmscan.c | 13 ++++---------
14 files changed, 71 insertions(+), 86 deletions(-)
diff -Nru a/fs/buffer.c b/fs/buffer.c
--- a/fs/buffer.c Tue Apr 13 01:32:22 2004
+++ b/fs/buffer.c Tue Apr 13 01:32:22 2004
@@ -396,7 +396,7 @@
* Hack idea: for the blockdev mapping, i_bufferlist_lock contention
* may be quite high. This code could TryLock the page, and if that
* succeeds, there is no need to take private_lock. (But if
- * private_lock is contended then so is mapping->page_lock).
+ * private_lock is contended then so is mapping->tree_lock).
*/
static struct buffer_head *
__find_get_block_slow(struct block_device *bdev, sector_t block, int unused)
@@ -867,14 +867,14 @@
spin_unlock(&mapping->private_lock);
if (!TestSetPageDirty(page)) {
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
if (!mapping->backing_dev_info->memory_backed)
inc_page_state(nr_dirty);
list_del(&page->list);
list_add(&page->list, &mapping->dirty_pages);
}
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
}
@@ -1254,7 +1254,7 @@
* inode to its superblock's dirty inode list.
*
* mark_buffer_dirty() is atomic. It takes bh->b_page->mapping->private_lock,
- * mapping->page_lock and the global inode_lock.
+ * mapping->tree_lock and the global inode_lock.
*/
void fastcall mark_buffer_dirty(struct buffer_head *bh)
{
diff -Nru a/fs/cifs/file.c b/fs/cifs/file.c
--- a/fs/cifs/file.c Tue Apr 13 01:32:22 2004
+++ b/fs/cifs/file.c Tue Apr 13 01:32:22 2004
@@ -898,11 +898,9 @@
if(list_empty(pages))
break;
- spin_lock(&mapping->page_lock);
page = list_entry(pages->prev, struct page, list);
list_del(&page->list);
- spin_unlock(&mapping->page_lock);
if (add_to_page_cache(page, mapping, page->index, GFP_KERNEL)) {
page_cache_release(page);
@@ -962,14 +960,10 @@
pagevec_init(&lru_pvec, 0);
for(i = 0;i<num_pages;) {
- spin_lock(&mapping->page_lock);
- if(list_empty(page_list)) {
- spin_unlock(&mapping->page_lock);
+ if(list_empty(page_list))
break;
- }
page = list_entry(page_list->prev, struct page, list);
offset = (loff_t)page->index << PAGE_CACHE_SHIFT;
- spin_unlock(&mapping->page_lock);
/* for reads over a certain size could initiate async read ahead */
@@ -989,12 +983,10 @@
cFYI(1,("Read error in readpages: %d",rc));
/* clean up remaing pages off list */
- spin_lock(&mapping->page_lock);
while (!list_empty(page_list) && (i < num_pages)) {
page = list_entry(page_list->prev, struct page, list);
list_del(&page->list);
}
- spin_unlock(&mapping->page_lock);
break;
} else if (bytes_read > 0) {
pSMBr = (struct smb_com_read_rsp *)smb_read_data;
diff -Nru a/fs/fs-writeback.c b/fs/fs-writeback.c
--- a/fs/fs-writeback.c Tue Apr 13 01:32:22 2004
+++ b/fs/fs-writeback.c Tue Apr 13 01:32:22 2004
@@ -159,10 +159,10 @@
* read speculatively by this cpu before &= ~I_DIRTY -- mikulas
*/
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
if (wait || !wbc->for_kupdate || list_empty(&mapping->io_pages))
list_splice_init(&mapping->dirty_pages, &mapping->io_pages);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
spin_unlock(&inode_lock);
ret = do_writepages(mapping, wbc);
diff -Nru a/fs/inode.c b/fs/inode.c
--- a/fs/inode.c Tue Apr 13 01:32:22 2004
+++ b/fs/inode.c Tue Apr 13 01:32:22 2004
@@ -187,7 +187,7 @@
sema_init(&inode->i_sem, 1);
init_rwsem(&inode->i_alloc_sem);
INIT_RADIX_TREE(&inode->i_data.page_tree, GFP_ATOMIC);
- spin_lock_init(&inode->i_data.page_lock);
+ spin_lock_init(&inode->i_data.tree_lock);
init_MUTEX(&inode->i_data.i_shared_sem);
atomic_set(&inode->i_data.truncate_count, 0);
INIT_LIST_HEAD(&inode->i_data.private_list);
diff -Nru a/fs/mpage.c b/fs/mpage.c
--- a/fs/mpage.c Tue Apr 13 01:32:22 2004
+++ b/fs/mpage.c Tue Apr 13 01:32:22 2004
@@ -635,7 +635,7 @@
if (get_block == NULL)
writepage = mapping->a_ops->writepage;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
while (!list_empty(&mapping->io_pages) && !done) {
struct page *page = list_entry(mapping->io_pages.prev,
struct page, list);
@@ -655,10 +655,10 @@
list_add(&page->list, &mapping->locked_pages);
page_cache_get(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
/*
- * At this point we hold neither mapping->page_lock nor
+ * At this point we hold neither mapping->tree_lock nor
* lock on the page itself: the page may be truncated or
* invalidated (changing page->mapping to NULL), or even
* swizzled back from swapper_space to tmpfs file mapping.
@@ -695,12 +695,12 @@
unlock_page(page);
}
page_cache_release(page);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
}
/*
* Leave any remaining dirty pages on ->io_pages
*/
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
if (bio)
mpage_bio_submit(WRITE, bio);
return ret;
diff -Nru a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h Tue Apr 13 01:32:22 2004
+++ b/include/linux/fs.h Tue Apr 13 01:32:22 2004
@@ -322,7 +322,7 @@
struct address_space {
struct inode *host; /* owner: inode, block_device */
struct radix_tree_root page_tree; /* radix tree of all pages */
- spinlock_t page_lock; /* and spinlock protecting it */
+ spinlock_t tree_lock; /* and spinlock protecting it */
struct list_head clean_pages; /* list of clean pages */
struct list_head dirty_pages; /* list of dirty pages */
struct list_head locked_pages; /* list of locked pages */
diff -Nru a/ipc/shm.c b/ipc/shm.c
--- a/ipc/shm.c Tue Apr 13 01:32:22 2004
+++ b/ipc/shm.c Tue Apr 13 01:32:22 2004
@@ -380,9 +380,7 @@
if (is_file_hugepages(shp->shm_file)) {
struct address_space *mapping = inode->i_mapping;
- spin_lock(&mapping->page_lock);
*rss += (HPAGE_SIZE/PAGE_SIZE)*mapping->nrpages;
- spin_unlock(&mapping->page_lock);
} else {
struct shmem_inode_info *info = SHMEM_I(inode);
spin_lock(&info->lock);
diff -Nru a/mm/filemap.c b/mm/filemap.c
--- a/mm/filemap.c Tue Apr 13 01:32:22 2004
+++ b/mm/filemap.c Tue Apr 13 01:32:22 2004
@@ -59,7 +59,7 @@
* ->private_lock (__free_pte->__set_page_dirty_buffers)
* ->swap_list_lock
* ->swap_device_lock (exclusive_swap_page, others)
- * ->mapping->page_lock
+ * ->mapping->tree_lock
*
* ->i_sem
* ->i_shared_sem (truncate->invalidate_mmap_range)
@@ -78,12 +78,12 @@
*
* ->inode_lock
* ->sb_lock (fs/fs-writeback.c)
- * ->mapping->page_lock (__sync_single_inode)
+ * ->mapping->tree_lock (__sync_single_inode)
*
* ->page_table_lock
* ->swap_device_lock (try_to_unmap_one)
* ->private_lock (try_to_unmap_one)
- * ->page_lock (try_to_unmap_one)
+ * ->tree_lock (try_to_unmap_one)
* ->zone.lru_lock (follow_page->mark_page_accessed)
*
* ->task->proc_lock
@@ -93,7 +93,7 @@
/*
* Remove a page from the page cache and free it. Caller has to make
* sure the page is locked and that nobody else uses it - or that usage
- * is safe. The caller must hold a write_lock on the mapping's page_lock.
+ * is safe. The caller must hold a write_lock on the mapping's tree_lock.
*/
void __remove_from_page_cache(struct page *page)
{
@@ -114,9 +114,9 @@
if (unlikely(!PageLocked(page)))
PAGE_BUG(page);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
__remove_from_page_cache(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
}
static inline int sync_page(struct page *page)
@@ -148,9 +148,9 @@
if (mapping->backing_dev_info->memory_backed)
return 0;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
list_splice_init(&mapping->dirty_pages, &mapping->io_pages);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
ret = do_writepages(mapping, &wbc);
return ret;
}
@@ -185,7 +185,7 @@
restart:
progress = 0;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
while (!list_empty(&mapping->locked_pages)) {
struct page *page;
@@ -199,7 +199,7 @@
if (!PageWriteback(page)) {
if (++progress > 32) {
if (need_resched()) {
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
__cond_resched();
goto restart;
}
@@ -209,16 +209,16 @@
progress = 0;
page_cache_get(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
wait_on_page_writeback(page);
if (PageError(page))
ret = -EIO;
page_cache_release(page);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
}
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
/* Check for outstanding write errors */
if (test_and_clear_bit(AS_ENOSPC, &mapping->flags))
@@ -267,7 +267,7 @@
if (error == 0) {
page_cache_get(page);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
error = radix_tree_insert(&mapping->page_tree, offset, page);
if (!error) {
SetPageLocked(page);
@@ -275,7 +275,7 @@
} else {
page_cache_release(page);
}
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
radix_tree_preload_end();
}
return error;
@@ -411,11 +411,11 @@
* We scan the hash list read-only. Addition to and removal from
* the hash-list needs a held write-lock.
*/
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
page = radix_tree_lookup(&mapping->page_tree, offset);
if (page)
page_cache_get(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
return page;
}
@@ -428,11 +428,11 @@
{
struct page *page;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
page = radix_tree_lookup(&mapping->page_tree, offset);
if (page && TestSetPageLocked(page))
page = NULL;
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
return page;
}
@@ -454,15 +454,15 @@
{
struct page *page;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
repeat:
page = radix_tree_lookup(&mapping->page_tree, offset);
if (page) {
page_cache_get(page);
if (TestSetPageLocked(page)) {
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
lock_page(page);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
/* Has the page been truncated while we slept? */
if (page->mapping != mapping || page->index != offset) {
@@ -472,7 +472,7 @@
}
}
}
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
return page;
}
@@ -546,12 +546,12 @@
unsigned int i;
unsigned int ret;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
ret = radix_tree_gang_lookup(&mapping->page_tree,
(void **)pages, start, nr_pages);
for (i = 0; i < ret; i++)
page_cache_get(pages[i]);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
return ret;
}
diff -Nru a/mm/page-writeback.c b/mm/page-writeback.c
--- a/mm/page-writeback.c Tue Apr 13 01:32:22 2004
+++ b/mm/page-writeback.c Tue Apr 13 01:32:22 2004
@@ -472,12 +472,12 @@
if (wait)
wait_on_page_writeback(page);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
list_del(&page->list);
if (test_clear_page_dirty(page)) {
list_add(&page->list, &mapping->locked_pages);
page_cache_get(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
ret = mapping->a_ops->writepage(page, &wbc);
if (ret == 0 && wait) {
wait_on_page_writeback(page);
@@ -487,7 +487,7 @@
page_cache_release(page);
} else {
list_add(&page->list, &mapping->clean_pages);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
unlock_page(page);
}
return ret;
@@ -515,7 +515,7 @@
struct address_space *mapping = page->mapping;
if (mapping) {
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
BUG_ON(page->mapping != mapping);
if (!mapping->backing_dev_info->memory_backed)
@@ -523,7 +523,7 @@
list_del(&page->list);
list_add(&page->list, &mapping->dirty_pages);
}
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
if (!PageSwapCache(page))
__mark_inode_dirty(mapping->host,
I_DIRTY_PAGES);
diff -Nru a/mm/readahead.c b/mm/readahead.c
--- a/mm/readahead.c Tue Apr 13 01:32:22 2004
+++ b/mm/readahead.c Tue Apr 13 01:32:22 2004
@@ -230,7 +230,7 @@
/*
* Preallocate as many pages as we will need.
*/
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
for (page_idx = 0; page_idx < nr_to_read; page_idx++) {
unsigned long page_offset = offset + page_idx;
@@ -241,16 +241,16 @@
if (page)
continue;
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
page = page_cache_alloc_cold(mapping);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
if (!page)
break;
page->index = page_offset;
list_add(&page->list, &page_pool);
ret++;
}
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
/*
* Now start the IO. We ignore I/O errors - if the page is not
diff -Nru a/mm/swap_state.c b/mm/swap_state.c
--- a/mm/swap_state.c Tue Apr 13 01:32:22 2004
+++ b/mm/swap_state.c Tue Apr 13 01:32:22 2004
@@ -25,7 +25,7 @@
struct address_space swapper_space = {
.page_tree = RADIX_TREE_INIT(GFP_ATOMIC),
- .page_lock = SPIN_LOCK_UNLOCKED,
+ .tree_lock = SPIN_LOCK_UNLOCKED,
.clean_pages = LIST_HEAD_INIT(swapper_space.clean_pages),
.dirty_pages = LIST_HEAD_INIT(swapper_space.dirty_pages),
.io_pages = LIST_HEAD_INIT(swapper_space.io_pages),
@@ -182,9 +182,9 @@
entry.val = page->index;
- spin_lock(&swapper_space.page_lock);
+ spin_lock_irq(&swapper_space.tree_lock);
__delete_from_swap_cache(page);
- spin_unlock(&swapper_space.page_lock);
+ spin_unlock_irq(&swapper_space.tree_lock);
swap_free(entry);
page_cache_release(page);
@@ -195,8 +195,8 @@
struct address_space *mapping = page->mapping;
int err;
- spin_lock(&swapper_space.page_lock);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&swapper_space.tree_lock);
+ spin_lock(&mapping->tree_lock);
err = radix_tree_insert(&swapper_space.page_tree, entry.val, page);
if (!err) {
@@ -204,8 +204,8 @@
___add_to_page_cache(page, &swapper_space, entry.val);
}
- spin_unlock(&mapping->page_lock);
- spin_unlock(&swapper_space.page_lock);
+ spin_unlock(&mapping->tree_lock);
+ spin_unlock_irq(&swapper_space.tree_lock);
if (!err) {
if (!swap_duplicate(entry))
@@ -231,8 +231,8 @@
entry.val = page->index;
- spin_lock(&swapper_space.page_lock);
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&swapper_space.tree_lock);
+ spin_lock(&mapping->tree_lock);
err = radix_tree_insert(&mapping->page_tree, index, page);
if (!err) {
@@ -240,8 +240,8 @@
___add_to_page_cache(page, mapping, index);
}
- spin_unlock(&mapping->page_lock);
- spin_unlock(&swapper_space.page_lock);
+ spin_unlock(&mapping->tree_lock);
+ spin_unlock_irq(&swapper_space.tree_lock);
if (!err) {
swap_free(entry);
diff -Nru a/mm/swapfile.c b/mm/swapfile.c
--- a/mm/swapfile.c Tue Apr 13 01:32:22 2004
+++ b/mm/swapfile.c Tue Apr 13 01:32:22 2004
@@ -253,10 +253,10 @@
/* Is the only swap cache user the cache itself? */
if (p->swap_map[swp_offset(entry)] == 1) {
/* Recheck the page count with the pagecache lock held.. */
- spin_lock(&swapper_space.page_lock);
+ spin_lock_irq(&swapper_space.tree_lock);
if (page_count(page) - !!PagePrivate(page) == 2)
retval = 1;
- spin_unlock(&swapper_space.page_lock);
+ spin_unlock_irq(&swapper_space.tree_lock);
}
swap_info_put(p);
}
@@ -324,13 +324,13 @@
retval = 0;
if (p->swap_map[swp_offset(entry)] == 1) {
/* Recheck the page count with the pagecache lock held.. */
- spin_lock(&swapper_space.page_lock);
+ spin_lock_irq(&swapper_space.tree_lock);
if ((page_count(page) == 2) && !PageWriteback(page)) {
__delete_from_swap_cache(page);
SetPageDirty(page);
retval = 1;
}
- spin_unlock(&swapper_space.page_lock);
+ spin_unlock_irq(&swapper_space.tree_lock);
}
swap_info_put(p);
diff -Nru a/mm/truncate.c b/mm/truncate.c
--- a/mm/truncate.c Tue Apr 13 01:32:22 2004
+++ b/mm/truncate.c Tue Apr 13 01:32:22 2004
@@ -62,7 +62,7 @@
* This is for invalidate_inode_pages(). That function can be called at
* any time, and is not supposed to throw away dirty pages. But pages can
* be marked dirty at any time too. So we re-check the dirtiness inside
- * ->page_lock. That provides exclusion against the __set_page_dirty
+ * ->tree_lock. That provides exclusion against the __set_page_dirty
* functions.
*/
static int
@@ -74,13 +74,13 @@
if (PagePrivate(page) && !try_to_release_page(page, 0))
return 0;
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
if (PageDirty(page)) {
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
return 0;
}
__remove_from_page_cache(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
ClearPageUptodate(page);
page_cache_release(page); /* pagecache ref */
return 1;
diff -Nru a/mm/vmscan.c b/mm/vmscan.c
--- a/mm/vmscan.c Tue Apr 13 01:32:22 2004
+++ b/mm/vmscan.c Tue Apr 13 01:32:22 2004
@@ -354,7 +354,6 @@
goto keep_locked;
if (!may_write_to_queue(mapping->backing_dev_info))
goto keep_locked;
- spin_lock(&mapping->page_lock);
if (test_clear_page_dirty(page)) {
int res;
struct writeback_control wbc = {
@@ -364,9 +363,6 @@
.for_reclaim = 1,
};
- list_move(&page->list, &mapping->locked_pages);
- spin_unlock(&mapping->page_lock);
-
SetPageReclaim(page);
res = mapping->a_ops->writepage(page, &wbc);
if (res < 0)
@@ -381,7 +377,6 @@
}
goto keep;
}
- spin_unlock(&mapping->page_lock);
}
/*
@@ -415,7 +410,7 @@
if (!mapping)
goto keep_locked; /* truncate got there first */
- spin_lock(&mapping->page_lock);
+ spin_lock_irq(&mapping->tree_lock);
/*
* The non-racy check for busy page. It is critical to check
@@ -423,7 +418,7 @@
* not in use by anybody. (pagecache + us == 2)
*/
if (page_count(page) != 2 || PageDirty(page)) {
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
goto keep_locked;
}
@@ -431,7 +426,7 @@
if (PageSwapCache(page)) {
swp_entry_t swap = { .val = page->index };
__delete_from_swap_cache(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
swap_free(swap);
__put_page(page); /* The pagecache ref */
goto free_it;
@@ -439,7 +434,7 @@
#endif /* CONFIG_SWAP */
__remove_from_page_cache(page);
- spin_unlock(&mapping->page_lock);
+ spin_unlock_irq(&mapping->tree_lock);
__put_page(page);
free_it:
-
To unsubscribe from this list: send the line "unsubscribe bk-commits-head" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html