summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-12-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds3-9/+22
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: packet: fix possible dev refcnt leak when bind fail netem: dont call vfree() under spinlock and BH disabled netfilter: ctnetlink: fix scheduling while atomic if helper is autoloaded netfilter: ctnetlink: fix return value of ctnetlink_get_expect()
2011-12-29Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds3-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix raw_spin_unlock_irqrestore() usage oprofile, arm/sh: Fix oprofile_arch_exit() linkage issue
2011-12-29Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds3-25/+43
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: log all dirty inodes in xfs_fs_sync_fs xfs: log the inode in ->write_inode calls for kupdate
2011-12-29Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds3-12/+15
* 'for-linus' of git://git.kernel.dk/linux-block: block: fix blk_queue_end_tag() block: re-use existing 'reading' variable instead of checking direction again block, cfq: fix empty queue crash caused by request merge
2011-12-29mm: hugetlb: fix non-atomic enqueue of huge pageHillf Danton1-1/+1
If a huge page is enqueued under the protection of hugetlb_lock, then the operation is atomic and safe. Signed-off-by: Hillf Danton <dhillf@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> [2.6.37+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-29procfs: do not confuse jiffies with cputime64_tAndreas Schwab5-2/+8
Commit 2a95ea6c0d129b4 ("procfs: do not overflow get_{idle,iowait}_time for nohz") did not take into account that one some architectures jiffies and cputime use different units. This causes get_idle_time() to return numbers in the wrong units, making the idle time fields in /proc/stat wrong. Instead of converting the usec value returned by get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function usecs_to_cputime64 to convert it to the correct unit of cputime64_t. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Artem S. Tashkinov" <t.artem@mailcity.com> Cc: Dave Jones <davej@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-29mm/mempolicy.c: refix mbind_range() vma issueKOSAKI Motohiro1-1/+10
commit 8aacc9f550 ("mm/mempolicy.c: fix pgoff in mbind vma merge") is the slightly incorrect fix. Why? Think following case. 1. map 4 pages of a file at offset 0 [0123] 2. map 2 pages just after the first mapping of the same file but with page offset 2 [0123][23] 3. mbind() 2 pages from the first mapping at offset 2. mbind_range() should treat new vma is, [0123][23] |23| mbind vma but it does [0123][23] |01| mbind vma Oops. then, it makes wrong vma merge and splitting ([01][0123] or similar). This patch fixes it. [testcase] test result - before the patch case4: 126: test failed. expect '2,4', actual '2,2,2' case5: passed case6: passed case7: passed case8: passed case_n: 246: test failed. expect '4,2', actual '1,4' ------------[ cut here ]------------ kernel BUG at mm/filemap.c:135! invalid opcode: 0000 [#4] SMP DEBUG_PAGEALLOC (snip long bug on messages) test result - after the patch case4: passed case5: passed case6: passed case7: passed case8: passed case_n: passed source: mbind_vma_test.c ============================================================ #include <numaif.h> #include <numa.h> #include <sys/mman.h> #include <stdio.h> #include <unistd.h> #include <stdlib.h> #include <string.h> static unsigned long pagesize; void* mmap_addr; struct bitmask *nmask; char buf[1024]; FILE *file; char retbuf[10240] = ""; int mapped_fd; char *rubysrc = "ruby -e '\ pid = %d; \ vstart = 0x%llx; \ vend = 0x%llx; \ s = `pmap -q #{pid}`; \ rary = []; \ s.each_line {|line|; \ ary=line.split(\" \"); \ addr = ary[0].to_i(16); \ if(vstart <= addr && addr < vend) then \ rary.push(ary[1].to_i()/4); \ end; \ }; \ print rary.join(\",\"); \ '"; void init(void) { void* addr; char buf[128]; nmask = numa_allocate_nodemask(); numa_bitmask_setbit(nmask, 0); pagesize = getpagesize(); sprintf(buf, "%s", "mbind_vma_XXXXXX"); mapped_fd = mkstemp(buf); if (mapped_fd == -1) perror("mkstemp "), exit(1); unlink(buf); if (lseek(mapped_fd, pagesize*8, SEEK_SET) < 0) perror("lseek "), exit(1); if (write(mapped_fd, "\0", 1) < 0) perror("write "), exit(1); addr = mmap(NULL, pagesize*8, PROT_NONE, MAP_SHARED, mapped_fd, 0); if (addr == MAP_FAILED) perror("mmap "), exit(1); if (mprotect(addr+pagesize, pagesize*6, PROT_READ|PROT_WRITE) < 0) perror("mprotect "), exit(1); mmap_addr = addr + pagesize; /* make page populate */ memset(mmap_addr, 0, pagesize*6); } void fin(void) { void* addr = mmap_addr - pagesize; munmap(addr, pagesize*8); memset(buf, 0, sizeof(buf)); memset(retbuf, 0, sizeof(retbuf)); } void mem_bind(int index, int len) { int err; err = mbind(mmap_addr+pagesize*index, pagesize*len, MPOL_BIND, nmask->maskp, nmask->size, 0); if (err) perror("mbind "), exit(err); } void mem_interleave(int index, int len) { int err; err = mbind(mmap_addr+pagesize*index, pagesize*len, MPOL_INTERLEAVE, nmask->maskp, nmask->size, 0); if (err) perror("mbind "), exit(err); } void mem_unbind(int index, int len) { int err; err = mbind(mmap_addr+pagesize*index, pagesize*len, MPOL_DEFAULT, NULL, 0, 0); if (err) perror("mbind "), exit(err); } void Assert(char *expected, char *value, char *name, int line) { if (strcmp(expected, value) == 0) { fprintf(stderr, "%s: passed\n", name); return; } else { fprintf(stderr, "%s: %d: test failed. expect '%s', actual '%s'\n", name, line, expected, value); // exit(1); } } /* AAAA PPPPPPNNNNNN might become PPNNNNNNNNNN case 4 below */ void case4(void) { init(); sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6); mem_bind(0, 4); mem_unbind(2, 2); file = popen(buf, "r"); fread(retbuf, sizeof(retbuf), 1, file); Assert("2,4", retbuf, "case4", __LINE__); fin(); } /* AAAA PPPPPPNNNNNN might become PPPPPPPPPPNN case 5 below */ void case5(void) { init(); sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6); mem_bind(0, 2); mem_bind(2, 2); file = popen(buf, "r"); fread(retbuf, sizeof(retbuf), 1, file); Assert("4,2", retbuf, "case5", __LINE__); fin(); } /* AAAA PPPPNNNNXXXX might become PPPPPPPPPPPP 6 */ void case6(void) { init(); sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6); mem_bind(0, 2); mem_bind(4, 2); mem_bind(2, 2); file = popen(buf, "r"); fread(retbuf, sizeof(retbuf), 1, file); Assert("6", retbuf, "case6", __LINE__); fin(); } /* AAAA PPPPNNNNXXXX might become PPPPPPPPXXXX 7 */ void case7(void) { init(); sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6); mem_bind(0, 2); mem_interleave(4, 2); mem_bind(2, 2); file = popen(buf, "r"); fread(retbuf, sizeof(retbuf), 1, file); Assert("4,2", retbuf, "case7", __LINE__); fin(); } /* AAAA PPPPNNNNXXXX might become PPPPNNNNNNNN 8 */ void case8(void) { init(); sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6); mem_bind(0, 2); mem_interleave(4, 2); mem_interleave(2, 2); file = popen(buf, "r"); fread(retbuf, sizeof(retbuf), 1, file); Assert("2,4", retbuf, "case8", __LINE__); fin(); } void case_n(void) { init(); sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6); /* make redundunt mappings [0][1234][34][7] */ mmap(mmap_addr + pagesize*4, pagesize*2, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED, mapped_fd, pagesize*3); /* Expect to do nothing. */ mem_unbind(2, 2); file = popen(buf, "r"); fread(retbuf, sizeof(retbuf), 1, file); Assert("4,2", retbuf, "case_n", __LINE__); fin(); } int main(int argc, char** argv) { case4(); case5(); case6(); case7(); case8(); case_n(); return 0; } ============================================================= Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Caspar Zhang <caspar@casparzhang.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: <stable@vger.kernel.org> [3.1.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-29gspca: Fix bulk mode cameras no longer working (regression fix)Hans de Goede1-2/+2
The new iso bandwidth calculation code accidentally has broken support for bulk mode cameras. This has broken the following drivers: finepix, jeilinj, ovfx2, ov534, ov534_9, se401, sq905, sq905c, sq930x, stv0680, vicam. Thix patch fixes this. Fix tested with: se401, sq905, sq905c, stv0680 & vicam cams. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-29block: fix blk_queue_end_tag()Dan Williams1-11/+2
Commit 5e081591 "block: warn if tag is greater than real_max_depth" cleaned up blk_queue_end_tag() to warn when the tag is truly invalid (greater than real_max_depth). However, it changed behavior in the tag < max_depth case to not end the request. Leading to triggering of BUG_ON(blk_queued_rq(rq)) in the request completion path: http://marc.info/?l=linux-kernel&m=132204370518629&w=2 In order to allow blk_queue_resize_tags() to shrink the tag space blk_queue_end_tag() must always complete tags with a value less than real_max_depth regardless of the current max_depth. The comment about "handling the shrink case" seems to be what prompted changes in this space, so remove it and BUG on all invalid tags (made even simpler by Matthew's suggestion to use an unsigned compare). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Tao Ma <boyu.mt@taobao.com> Cc: Matthew Wilcox <matthew@wil.cx> Reported-by: Meelis Roos <mroos@ut.ee> Reported-by: Ed Nadolski <edmund.nadolski@intel.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-12-27packet: fix possible dev refcnt leak when bind failWei Yongjun1-1/+5
If bind is fail when bind is called after set PACKET_FANOUT sock option, the dev refcnt will leak. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-26drm/i915: Disable RC6 on Sandybridge by defaultKeith Packard1-5/+3
RC6 fails again. > I found my system freeze mostly during starting up X and KDE. Sometimes it > works for some minutes, sometimes it freezes immediatly. When the freeze > happens, everything is dead (even the reset button does not work, I need to > power cycle). > I disabled RC6, and my system runs wonderfully. > The system is a Z68 Pro board with Sandybridge i5-2500K processor, 8 > GB of RAM and UEFI firmware. Reported-by: Kai Krakow <hurikhan77@gmail.com> Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-26drm/i915: Disable semaphores by default on SNBKeith Packard1-2/+2
Semaphores still cause problems on some machines: > From Udo Steinberg: > > With Linux-3.2-rc6 I'm frequently seeing GPU hangs when large amounts of > text scroll in an xterm, such as when extracting a tar archive. Such as this > one (note the timestamps): > > I can reproduce it fairly easily with something > as simple as: > > while true; do dmesg; done This patch turns them off on SNB while leaving them on for IVB. Reported-by: Udo Steinberg <udo@hypervisor.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Eugeni Dodonov <eugeni@dodonov.net> Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-26Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds10-56/+154
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: e500: include linux/export.h KVM: PPC: fix kvmppc_start_thread() for CONFIG_SMP=N KVM: PPC: protect use of kvmppc_h_pr KVM: PPC: move compute_tlbie_rb to book3s_64 common header KVM: Don't automatically expose the TSC deadline timer in cpuid KVM: Device assignment permission checks KVM: Remove ability to assign a device without iommu support KVM: x86: Prevent starting PIT timers in the absence of irqchip support
2011-12-26Merge tag 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 post 3.2-rc7 pull request * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: MAINTAINERS: firewire git URL update
2011-12-26vfs: fix handling of lock allocation failure in lease-break caseLinus Torvalds1-8/+3
Bruce Fields notes that commit 778fc546f749 ("locks: fix tracking of inprogress lease breaks") introduced a possible error pointer dereference on failure to allocate memory. locks_conflict() will dereference the passed-in new lease lock structure that may be an error pointer. This means an open (without O_NONBLOCK set) on a file with a lease applied (generally only done when Samba or nfsd (with v4) is running) could crash if a kmalloc() fails. So instead of playing games with IS_ERROR() all over the place, just check the allocation failure early. That makes the code more straightforward, and avoids this possible bad pointer dereference. Based-on-patch-by: J. Bruce Fields <bfields@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-26KVM: PPC: e500: include linux/export.hScott Wood1-0/+1
This is required for THIS_MODULE. We recently stopped acquiring it via some other header. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-12-26KVM: PPC: fix kvmppc_start_thread() for CONFIG_SMP=NMichael Neuling1-1/+1
Currently kvmppc_start_thread() tries to wake other SMT threads via xics_wake_cpu(). Unfortunately xics_wake_cpu only exists when CONFIG_SMP=Y so when compiling with CONFIG_SMP=N we get: arch/powerpc/kvm/built-in.o: In function `.kvmppc_start_thread': book3s_hv.c:(.text+0xa1e0): undefined reference to `.xics_wake_cpu' The following should be fine since kvmppc_start_thread() shouldn't called to start non-zero threads when SMP=N since threads_per_core=1. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-12-26KVM: PPC: protect use of kvmppc_h_prAndreas Schwab1-0/+2
kvmppc_h_pr is only available if CONFIG_KVM_BOOK3S_64_PR. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-12-26KVM: PPC: move compute_tlbie_rb to book3s_64 common headerAndreas Schwab2-33/+33
compute_tlbie_rb is only used on ppc64 and cannot be compiled on ppc32. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-12-26KVM: Don't automatically expose the TSC deadline timer in cpuidJan Kiszka3-10/+19
Unlike all of the other cpuid bits, the TSC deadline timer bit is set unconditionally, regardless of what userspace wants. This is broken in several ways: - if userspace doesn't use KVM_CREATE_IRQCHIP, and doesn't emulate the TSC deadline timer feature, a guest that uses the feature will break - live migration to older host kernels that don't support the TSC deadline timer will cause the feature to be pulled from under the guest's feet; breaking it - guests that are broken wrt the feature will fail. Fix by not enabling the feature automatically; instead report it to userspace. Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not KVM_GET_SUPPORTED_CPUID. Fixes the Illumos guest kernel, which uses the TSC deadline timer feature. [avi: add the KVM_CAP + documentation] Reported-by: Alexey Zaytsev <alexey.zaytsev@gmail.com> Tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-25KVM: Device assignment permission checksAlex Williamson2-0/+79
Only allow KVM device assignment to attach to devices which: - Are not bridges - Have BAR resources (assume others are special devices) - The user has permissions to use Assigning a bridge is a configuration error, it's not supported, and typically doesn't result in the behavior the user is expecting anyway. Devices without BAR resources are typically chipset components that also don't have host drivers. We don't want users to hold such devices captive or cause system problems by fencing them off into an iommu domain. We determine "permission to use" by testing whether the user has access to the PCI sysfs resource files. By default a normal user will not have access to these files, so it provides a good indication that an administration agent has granted the user access to the device. [Yang Bai: add missing #include] [avi: fix comment style] Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Yang Bai <hamo.by@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-25KVM: Remove ability to assign a device without iommu supportAlex Williamson2-9/+12
This option has no users and it exposes a security hole that we can allow devices to be assigned without iommu protection. Make KVM_DEV_ASSIGN_ENABLE_IOMMU a mandatory option. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-25KVM: x86: Prevent starting PIT timers in the absence of irqchip supportJan Kiszka1-3/+7
User space may create the PIT and forgets about setting up the irqchips. In that case, firing PIT IRQs will crash the host: BUG: unable to handle kernel NULL pointer dereference at 0000000000000128 IP: [<ffffffffa10f6280>] kvm_set_irq+0x30/0x170 [kvm] ... Call Trace: [<ffffffffa11228c1>] pit_do_work+0x51/0xd0 [kvm] [<ffffffff81071431>] process_one_work+0x111/0x4d0 [<ffffffff81071bb2>] worker_thread+0x152/0x340 [<ffffffff81075c8e>] kthread+0x7e/0x90 [<ffffffff815a4474>] kernel_thread_helper+0x4/0x10 Prevent this by checking the irqchip mode before starting a timer. We can't deny creating the PIT if the irqchips aren't set up yet as current user land expects this order to work. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-25MAINTAINERS: firewire git URL updateStefan Richter1-1/+1
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2011-12-24Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2-3/+15
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: vmwgfx: fix incorrect VRAM size check in vmw_kms_fb_create() drm/radeon/kms: bail on BTC parts if MC ucode is missing
2011-12-24Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller1-5/+13
2011-12-24netem: dont call vfree() under spinlock and BH disabledEric Dumazet1-3/+4
commit 6373a9a286 (netem: use vmalloc for distribution table) added a regression, since vfree() is called while holding a spinlock and BH being disabled. Fix this by doing the pointers swap in critical section, and freeing after spinlock release. Also add __GFP_NOWARN to the kmalloc() try, since we fallback to vmalloc(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-24netfilter: ctnetlink: fix scheduling while atomic if helper is autoloadedPablo Neira Ayuso1-0/+3
This patch fixes one scheduling while atomic error: [ 385.565186] ctnetlink v0.93: registering with nfnetlink. [ 385.565349] BUG: scheduling while atomic: lt-expect_creat/16163/0x00000200 It can be triggered with utils/expect_create included in libnetfilter_conntrack if the FTP helper is not loaded. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-24netfilter: ctnetlink: fix return value of ctnetlink_get_expect()Pablo Neira Ayuso1-5/+10
This fixes one bogus error that is returned to user-space: libnetfilter_conntrack/utils# ./expect_get TEST: get expectation (-1)(Unknown error 18446744073709551504) This patch includes the correct handling for EAGAIN (nfnetlink uses this error value to restart the operation after module auto-loading). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-23Linux 3.2-rc7Linus Torvalds1-1/+1
2011-12-23Merge branch 'for-linus' of ↵Linus Torvalds1-4/+32
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: VFS: Fix race between CPU hotplug and lglocks
2011-12-23Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds2-13/+13
for linus: writeback reason binary tracing format fix * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: show writeback reason with __print_symbolic
2011-12-23Merge branch 'rc-fixes' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kconfig: adapt update-po-config to new UML layout
2011-12-23Merge branch 'v4l_for_linus' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes
2011-12-23Merge branch 'for-linus' of ↵Linus Torvalds2-5/+7
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: call d_instantiate after all ops are setup Btrfs: fix worker lock misuse in find_worker
2011-12-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds1-2/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().
2011-12-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds10-20/+26
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: netfilter: xt_connbytes: handle negation correctly net: relax rcvbuf limits rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt() net: introduce DST_NOPEER dst flag mqprio: Avoid panic if no options are provided bridge: provide a mtu() method for fake_dst_ops
2011-12-23xfs: log all dirty inodes in xfs_fs_sync_fsChristoph Hellwig3-24/+42
Since Linux 2.6.36 the writeback code has introduces various measures for live lock prevention during sync(). Unfortunately some of these are actively harmful for the XFS model, where the inode gets marked dirty for metadata from the data I/O handler. The older_than_this checks that are now more strictly enforced since writeback: avoid livelocking WB_SYNC_ALL writeback by only calling into __writeback_inodes_sb and thus only sampling the current cut off time once. But on a slow enough devices the previous asynchronous sync pass might not have fully completed yet, and thus XFS might mark metadata dirty only after that sampling of the cut off time for the blocking pass already happened. I have not myself reproduced this myself on a real system, but by introducing artificial delay into the XFS I/O completion workqueues it can be reproduced easily. Fix this by iterating over all XFS inodes in ->sync_fs and log all that are dirty. This might log inode that only got redirtied after the previous pass, but given how cheap delayed logging of inodes is it isn't a major concern for performance. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Tested-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2011-12-23xfs: log the inode in ->write_inode calls for kupdateChristoph Hellwig1-1/+1
If the writeback code writes back an inode because it has expired we currently use the non-blockin ->write_inode path. This means any inode that is pinned is skipped. With delayed logging and a workload that has very little log traffic otherwise it is very likely that an inode that gets constantly written to is always pinned, and thus we keep refusing to write it. The VM writeback code at that point redirties it and doesn't try to write it again for another 30 seconds. This means under certain scenarious time based metadata writeback never happens. Fix this by calling into xfs_log_inode for kupdate in addition to data integrity syncs, and thus transfer the inode to the log ASAP. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Tested-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2011-12-23Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller1-3/+3
2011-12-23perf/x86: Fix raw_spin_unlock_irqrestore() usageRobert Richter1-1/+1
Use raw_spin_unlock_irqrestore() as equivalent to raw_spin_lock_irqsave(). Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1324646665-13334-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-23netfilter: xt_connbytes: handle negation correctlyFlorian Westphal1-3/+3
"! --connbytes 23:42" should match if the packet/byte count is not in range. As there is no explict "invert match" toggle in the match structure, userspace swaps the from and to arguments (i.e., as if "--connbytes 42:23" were given). However, "what <= 23 && what >= 42" will always be false. Change things so we use "||" in case "from" is larger than "to". This change may look like it breaks backwards compatibility when "to" is 0. However, older iptables binaries will refuse "connbytes 42:0", and current releases treat it to mean "! --connbytes 0:42", so we should be fine. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-23Btrfs: call d_instantiate after all ops are setupAl Viro1-4/+5
This closes races where btrfs is calling d_instantiate too soon during inode creation. All of the callers of btrfs_add_nondir are updated to instantiate after the inode is fully setup in memory. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-12-23Btrfs: fix worker lock misuse in find_workerChris Mason1-1/+2
Dan Carpenter noticed that we were doing a double unlock on the worker lock, and sometimes picking a worker thread without the lock held. This fixes both errors. Signed-off-by: Chris Mason <chris.mason@oracle.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
2011-12-23oprofile, arm/sh: Fix oprofile_arch_exit() linkage issueVladimir Zapolskiy2-3/+3
This change fixes a linking problem, which happens if oprofile is selected to be compiled as built-in: `oprofile_arch_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o The problem is appeared after commit 87121ca504, which introduced oprofile_arch_exit() calls from __init function. Note that the aforementioned commit has been backported to stable branches, and the problem is known to be reproduced at least with 3.0.13 and 3.1.5 kernels. Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@nokia.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/20111222151540.GB16765@erda.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-23net: relax rcvbuf limitsEric Dumazet3-10/+6
skb->truesize might be big even for a small packet. Its even bigger after commit 87fb4b7b533 (net: more accurate skb truesize) and big MTU. We should allow queueing at least one packet per receiver, even with a low RCVBUF setting. Reported-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()Xi Wang1-2/+5
Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will cause a kernel oops due to insufficient bounds checking. if (count > 1<<30) { /* Enforce a limit to prevent overflow */ return -EINVAL; } count = roundup_pow_of_two(count); table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count)); Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as: ... + (count * sizeof(struct rps_dev_flow)) where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow 32 bits. This patch replaces the magic number (1 << 30) with a symbolic bound. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22net: introduce DST_NOPEER dst flagEric Dumazet4-4/+5
Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit 87c48fa3b46 (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22mqprio: Avoid panic if no options are providedThomas Graf1-1/+1
Userspace may not provide TCA_OPTIONS, in fact tc currently does so not do so if no arguments are specified on the command line. Return EINVAL instead of panicing. Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-22bridge: provide a mtu() method for fake_dst_opsEric Dumazet1-0/+6
Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol depended handlers) forgot the bridge netfilter case, adding a NULL dereference in ip_fragment(). Reported-by: Chris Boot <bootc@bootc.net> CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>