Skip to content

Commit

Permalink
memfd: Fix locking when tagging pins
Browse files Browse the repository at this point in the history
The RCU lock is insufficient to protect the radix tree iteration as
a deletion from the tree can occur before we take the spinlock to
tag the entry.  In 4.19, this has manifested as a bug with the following
trace:

kernel BUG at lib/radix-tree.c:1429!
invalid opcode: 0000 [beagleboard#1] SMP KASAN PTI
CPU: 7 PID: 6935 Comm: syz-executor.2 Not tainted 4.19.36 beagleboard#25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:radix_tree_tag_set+0x200/0x2f0 lib/radix-tree.c:1429
Code: 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 44 24 10 e8 a3 29 7e fe 48 8b 44 24 10 48 0f ab 03 e9 d2 fe ff ff e8 90 29 7e fe <0f> 0b 48 c7 c7 e0 5a 87 84 e8 f0 e7 08 ff 4c 89 ef e8 4a ff ac fe
RSP: 0018:ffff88837b13fb60 EFLAGS: 00010016
RAX: 0000000000040000 RBX: ffff8883c5515d58 RCX: ffffffff82cb2ef0
RDX: 0000000000000b72 RSI: ffffc90004cf2000 RDI: ffff8883c5515d98
RBP: ffff88837b13fb98 R08: ffffed106f627f7e R09: ffffed106f627f7e
R10: 0000000000000001 R11: ffffed106f627f7d R12: 0000000000000004
R13: ffffea000d7fea80 R14: 1ffff1106f627f6f R15: 0000000000000002
FS:  00007fa1b8df2700(0000) GS:ffff8883e2fc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa1b8df1db8 CR3: 000000037d4d2001 CR4: 0000000000160ee0
Call Trace:
 memfd_tag_pins mm/memfd.c:51 [inline]
 memfd_wait_for_pins+0x2c5/0x12d0 mm/memfd.c:81
 memfd_add_seals mm/memfd.c:215 [inline]
 memfd_fcntl+0x33d/0x4a0 mm/memfd.c:247
 do_fcntl+0x589/0xeb0 fs/fcntl.c:421
 __do_sys_fcntl fs/fcntl.c:463 [inline]
 __se_sys_fcntl fs/fcntl.c:448 [inline]
 __x64_sys_fcntl+0x12d/0x180 fs/fcntl.c:448
 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:293

The problem does not occur in mainline due to the XArray rewrite which
changed the locking to exclude modification of the tree during iteration.
At the time, nobody realised this was a bugfix.  Backport the locking
changes to stable.

Cc: [email protected]
Reported-by: zhong jiang <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
  • Loading branch information
Matthew Wilcox (Oracle) authored and gregkh committed Oct 29, 2019
1 parent 2770f80 commit 99b45e7
Showing 1 changed file with 10 additions and 8 deletions.
18 changes: 10 additions & 8 deletions mm/memfd.c
Original file line number Diff line number Diff line change
Expand Up @@ -34,11 +34,12 @@ static void memfd_tag_pins(struct address_space *mapping)
void __rcu **slot;
pgoff_t start;
struct page *page;
unsigned int tagged = 0;

lru_add_drain();
start = 0;
rcu_read_lock();

xa_lock_irq(&mapping->i_pages);
radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
page = radix_tree_deref_slot(slot);
if (!page || radix_tree_exception(page)) {
Expand All @@ -47,18 +48,19 @@ static void memfd_tag_pins(struct address_space *mapping)
continue;
}
} else if (page_count(page) - page_mapcount(page) > 1) {
xa_lock_irq(&mapping->i_pages);
radix_tree_tag_set(&mapping->i_pages, iter.index,
MEMFD_TAG_PINNED);
xa_unlock_irq(&mapping->i_pages);
}

if (need_resched()) {
slot = radix_tree_iter_resume(slot, &iter);
cond_resched_rcu();
}
if (++tagged % 1024)
continue;

slot = radix_tree_iter_resume(slot, &iter);
xa_unlock_irq(&mapping->i_pages);
cond_resched();
xa_lock_irq(&mapping->i_pages);
}
rcu_read_unlock();
xa_unlock_irq(&mapping->i_pages);
}

/*
Expand Down

0 comments on commit 99b45e7

Please sign in to comment.