From 7eb54824b76793dd86afb54f182ef9aa64b3a45a Mon Sep 17 00:00:00 2001 From: Andy Whitcroft Date: Fri, 23 May 2008 13:04:50 -0700 Subject: zonelists: handle a node zonelist with no applicable entries When booting 2.6.26-rc3 on a multi-node x86_32 numa system we are seeing panics when trying node local allocations: BUG: unable to handle kernel NULL pointer dereference at 0000034c IP: [] get_page_from_freelist+0x4a/0x18e *pdpt = 00000000013a7001 *pde = 0000000000000000 Oops: 0000 [#1] SMP Modules linked in: Pid: 0, comm: swapper Not tainted (2.6.26-rc3-00003-g5abc28d #82) EIP: 0060:[] EFLAGS: 00010282 CPU: 0 EIP is at get_page_from_freelist+0x4a/0x18e EAX: c1371ed8 EBX: 00000000 ECX: 00000000 EDX: 00000000 ESI: f7801180 EDI: 00000000 EBP: 00000000 ESP: c1371ec0 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 0, ti=c1370000 task=c12f5b40 task.ti=c1370000) Stack: 00000000 00000000 00000000 00000000 000612d0 000412d0 00000000 000412d0 f7801180 f7c0101c f7c01018 c10426e4 f7c01018 00000001 00000044 00000000 00000001 c12f5b40 00000001 00000010 00000000 000412d0 00000286 000412d0 Call Trace: [] __alloc_pages_internal+0x99/0x378 [] __alloc_pages+0x7/0x9 [] kmem_getpages+0x66/0xef [] cache_grow+0x8f/0x123 [] ____cache_alloc_node+0xb9/0xe4 [] kmem_cache_alloc_node+0x92/0xd2 [] setup_cpu_cache+0xaf/0x177 [] kmem_cache_create+0x2c8/0x353 [] kmem_cache_init+0x1ce/0x3ad [] start_kernel+0x178/0x1ee This occurs when we are scanning the zonelists looking for a ZONE_NORMAL page. In this system there is only ZONE_DMA and ZONE_NORMAL memory on node 0, all other nodes are mapped above 4GB physical. Here is a dump of the zonelists from this system: zonelists pgdat=c1400000 0: c14006c0:2 f7c006c0:2 f7e006c0:2 c1400360:1 c1400000:0 1: c14006c0:2 c1400360:1 c1400000:0 zonelists pgdat=f7c00000 0: f7c006c0:2 f7e006c0:2 c14006c0:2 c1400360:1 c1400000:0 1: f7c006c0:2 zonelists pgdat=f7e00000 0: f7e006c0:2 c14006c0:2 f7c006c0:2 c1400360:1 c1400000:0 1: f7e006c0:2 When performing a node local allocation we call get_page_from_freelist() looking for a page. It in turn calls first_zones_zonelist() which returns a preferred_zone. Where there are no applicable zones this will be NULL. However we use this unconditionally, leading to this panic. Where there are no applicable zones there is no possibility of a successful allocation, so simply fail the allocation. Signed-off-by: Andy Whitcroft Acked-by: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/page_alloc.c | 3 +++ 1 file changed, 3 insertions(+) (limited to 'mm/page_alloc.c') diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 035300299f94..7f4c66ff65b7 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1396,6 +1396,9 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order, (void)first_zones_zonelist(zonelist, high_zoneidx, nodemask, &preferred_zone); + if (!preferred_zone) + return NULL; + classzone_idx = zone_idx(preferred_zone); zonelist_scan: -- cgit v1.2.3