summaryrefslogtreecommitdiff
path: root/drivers/char/mem.c
AgeCommit message (Collapse)AuthorFilesLines
2017-04-12mm: Tighten x86 /dev/mem with zeroing readsKees Cook1-30/+52
Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is disallowed. However, on x86, the first 1MB was always allowed for BIOS and similar things, regardless of it actually being System RAM. It was possible for heap to end up getting allocated in low 1MB RAM, and then read by things like x86info or dd, which would trip hardened usercopy: usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes) This changes the x86 exception for the low 1MB by reading back zeros for System RAM areas instead of blindly allowing them. More work is needed to extend this to mmap, but currently mmap doesn't go through usercopy, so hardened usercopy won't Oops the kernel. Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com> Tested-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-01-11drivers: char: mem: Fix thinkos in kmem address checksRobin Murphy1-6/+4
When borrowing the pfn_valid() check from mmap_kmem(), somebody managed to get physical and virtual addresses spectacularly muddled up, such that we've ended up with checks for one being the other. Whilst this does indeed prevent out-of-bounds accesses crashing, on most systems it also prevents the more desirable use-case of working at all ever. Check the *virtual* offset correctly for what it is. Furthermore, do so in the right place - a read or write may span multiple pages, so a single up-front check is insufficient. High memory accesses already have a similar validity check just before the copy_to_user() call, so just make the low memory path fully consistent with that. Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> CC: stable@vger.kernel.org Fixes: 148a1bc84398 ("drivers: char: mem: Check {read,write}_kmem() addresses") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-31drivers: char: mem: Check {read,write}_kmem() addressesRobin Murphy1-0/+6
Arriving at read_kmem() with an offset representing a bogus kernel address (e.g. 0 from a simple "cat /dev/kmem") leads to copy_to_user faulting on the kernel-side read. x86_64 happens to get away with this since the optimised implementation uses "rep movs*", thus the user write (which is allowed to fault) and the kernel read are the same instruction, the kernel-side fault falls into the user-side fixup handler and the chain of events which transpires ends up returning an error as one might expect, even if it's an inappropriate -EFAULT. On other architectures, though, the read is not covered by the fixup entry for the write, and we get a big scary "Unable to hande kernel paging request..." dump. The more typical use-case of mmap_kmem() has always (within living memory at least) returned -EIO for addresses which don't satisfy pfn_valid(), so let's make that consistent across {read,write}_kem() too. Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-26shmem: get_unmapped_area align huge pageHugh Dickins1-0/+24
Provide a shmem_get_unmapped_area method in file_operations, called at mmap time to decide the mapping address. It could be conditional on CONFIG_TRANSPARENT_HUGEPAGE, but save #ifdefs in other places by making it unconditional. shmem_get_unmapped_area() first calls the usual mm->get_unmapped_area (which we treat as a black box, highly dependent on architecture and config and executable layout). Lots of conditions, and in most cases it just goes with the address that chose; but when our huge stars are rightly aligned, yet that did not provide a suitable address, go back to ask for a larger arena, within which to align the mapping suitably. There have to be some direct calls to shmem_get_unmapped_area(), not via the file_operations: because of the way shmem_zero_setup() is called to create a shmem object late in the mmap sequence, when MAP_SHARED is requested with MAP_ANONYMOUS or /dev/zero. Though this only matters when /proc/sys/vm/shmem_huge has been set. Link: http://lkml.kernel.org/r/1466021202-61880-29-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-08x86/mm/pat, /dev/mem: Remove superfluous error messageJiri Kosina1-5/+1
Currently it's possible for broken (or malicious) userspace to flood a kernel log indefinitely with messages a-la Program dmidecode tried to access /dev/mem between f0000->100000 because range_is_allowed() is case of CONFIG_STRICT_DEVMEM being turned on dumps this information each and every time devmem_is_allowed() fails. Reportedly userspace that is able to trigger contignuous flow of these messages exists. It would be possible to rate limit this message, but that'd have a questionable value; the administrator wouldn't get information about all the failing accessess, so then the information would be both superfluous and incomplete at the same time :) Returning EPERM (which is what is actually happening) is enough indication for userspace what has happened; no need to log this particular error as some sort of special condition. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1607081137020.24757@cbobk.fhfr.pm Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-05drivers: char: mem: fix IS_ERROR_VALUE usageAndrzej Hajda1-1/+1
IS_ERR_VALUE macro should be used only with unsigned long type. Specifically it works incorrectly with longer types. The patch follows conclusion from discussion on LKML [1][2]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2120927 [2]: http://permalink.gmane.org/gmane.linux.kernel/2150581 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-22wrappers for ->i_mutex accessAl Viro1-2/+2
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11switch drivers/char/mem.c to ->read_iter/->write_iterAl Viro1-9/+9
Note that _these_ guys have ->read() and ->write() left in place - they are eqiuvalent to what we'd get if we replaced those with NULL, but we are talking about hot paths here. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11make new_sync_{read,write}() staticAl Viro1-2/+0
All places outside of core VFS that checked ->read and ->write for being NULL or called the methods directly are gone now, so NULL {read,write} with non-NULL {read,write}_iter will do the right thing in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-25fs: move struct kiocb to fs.hChristoph Hellwig1-1/+1
struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-15Merge tag 'char-misc-3.20-rc1' of ↵Linus Torvalds1-14/+8
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc patches from Greg KH: "Here's the big char/misc driver update for 3.20-rc1. Lots of little things in here, all described in the changelog. Nothing major or unusual, except maybe the binder selinux stuff, which was all acked by the proper selinux people and they thought it best to come through this tree. All of this has been in linux-next with no reported issues for a while" * tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (90 commits) coresight: fix function etm_writel_cp14() parameter order coresight-etm: remove check for unknown Kconfig macro coresight: fixing CPU hwid lookup in device tree coresight: remove the unnecessary function coresight_is_bit_set() coresight: fix the debug AMBA bus name coresight: remove the extra spaces coresight: fix the link between orphan connection and newly added device coresight: remove the unnecessary replicator property coresight: fix the replicator subtype value pdfdocs: Fix 'make pdfdocs' failure for 'uio-howto.tmpl' mcb: Fix error path of mcb_pci_probe virtio/console: verify device has config space ti-st: clean up data types (fix harmless memory corruption) mei: me: release hw from reset only during the reset flow mei: mask interrupt set bit on clean reset bit extcon: max77693: Constify struct regmap_config extcon: adc-jack: Release IIO channel on driver remove extcon: Remove duplicated include from extcon-class.c Drivers: hv: vmbus: hv_process_timer_expiration() can be static Drivers: hv: vmbus: serialize Offer and Rescind offer ...
2015-01-20fs: introduce f_op->mmap_capabilities for nommu mmap supportChristoph Hellwig1-32/+32
Since "BDI: Provide backing device capability information [try #3]" the backing_dev_info structure also provides flags for the kind of mmap operation available in a nommu environment, which is entirely unrelated to it's original purpose. Introduce a new nommu-only file operation to provide this information to the nommu mmap code instead. Splitting this from the backing_dev_info structure allows to remove lots of backing_dev_info instance that aren't otherwise needed, and entirely gets rid of the concept of providing a backing_dev_info for a character device. It also removes the need for the mtd_inodefs filesystem. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tejun Heo <tj@kernel.org> Acked-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-12drivers: char: mem: Fix Missing blank line issuesRob Ward1-0/+2
Fixes "Missing a blank line after declarations" reported by checkpatch. This patch introduces no functional changes. Signed-off-by: Rob Ward <robert.ward114@googlemail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-12drivers: char: mem: Replace usage of asm includeRob Ward1-1/+1
Replaces the use of asm/uaccess.h with linux/uaccess.h. Signed-off-by: Rob Ward <robert.ward114@googlemail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-12drivers: char: mem: Simplify DEVPORT configurationRob Ward1-5/+1
Simplify the use of CONFIG_DEVPORT by making the port_fops so that it includes __maybe_unused. This enabled the multiple #ifdef's used for this structure to be removed and brings it in line with the use of CONFIG_DEVMEM This change should introduce no functional changes. Signed-off-by: Rob Ward <robert.ward114@googlemail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-12drivers: char: mem: Simplify DEVKMEM configurationRob Ward1-7/+1
Simplify the use of CONFIG_DEVKMEM by making the kmem_fops so that it is __maybe_unused. This enabled the multiple #ifdef's used for this structure to be removed and brings it in line with the use of CONFIG_DEVMEM This change should introduce no functional changes. Signed-off-by: Rob Ward <robert.ward114@googlemail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-12drivers: char: mem: Make /dev/mem an optional deviceRob Ward1-1/+3
Adds Kconfig option CONFIG_DEVMEM that allows the /dev/mem device to be disabled. Option defaults to /dev/mem enabled. Signed-off-by: Rob Ward <robert.ward114@googlemail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-10/dev/mem: Use more consistent data typesThierry Reding1-5/+8
The xlate_dev_{kmem,mem}_ptr() functions take either a physical address or a kernel virtual address, so data types should be phys_addr_t and void *. They both return a kernel virtual address which is only ever used in calls to copy_{from,to}_user(), so make variables that store it void * rather than char * for consistency. Also only define a weak unxlate_dev_mem_ptr() function if architectures haven't overridden them in the asm/io.h header file. Signed-off-by: Thierry Reding <treding@nvidia.com>
2014-10-09switch /dev/zero and /dev/full to ->read_iter()Al Viro1-43/+13
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-02-15/dev/mem: handle out-of-bounds read/writePetr Tesarik1-0/+6
The loff_t type may be wider than phys_addr_t (e.g. on 32-bit systems). Consequently, the file offset may be truncated in the assignment. Currently, /dev/mem wraps around, which may cause applications to read or write incorrect regions of memory by accident. Let's follow POSIX file semantics here and return 0 when reading from and -EFBIG when writing to an offset that cannot be represented by a phys_addr_t. Note that the conditional is optimized out by the compiler if loff_t has the same size as phys_addr_t. Signed-off-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-21mm/memblock: remove unnecessary inclusions of bootmem.hGrygorii Strashko1-1/+0
Clean-up to remove depedency with bootmem headers. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03/dev/oldmem: Remove the interfaceZhang Yanfei1-47/+0
/dev/oldmem provides the interface for us to access the "old memory" in the dump-capture kernel. Unfortunately, no one actually uses this interface. And this interface could actually cause some real problems if used on ia64 where the cached/uncached accesses are mixed. See the discussion from the link: https://lkml.org/lkml/2013/4/12/386. So Eric suggested that we should remove /dev/oldmem as an unused piece of code. [akpm@linux-foundation.org: mention /dev/oldmem obsolescence in devices.txt] Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Hansen <dave@sr71.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-06drivers: char: mem: use IS_ERR_VALUE() in memory_lseek()Rasmus Villemoes1-1/+1
Use IS_ERR_VALUE() instead of comparing the new offset to a hard-coded value of -MAX_ERRNO (which is also off-by-one due to the use of ~ instead of -). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07aio: don't include aio.h in sched.hKent Overstreet1-0/+1
Faster kernel compiles by way of fewer unnecessary includes. [akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07char: add aio_{read,write} to /dev/{null,zero}Zach Brown1-0/+35
These are handy for measuring the cost of the aio infrastructure with operations that do very little and complete immediately. Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-22new helper: file_inode(file)Al Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-06drivers/char/mem.c: fix small coding style issuesHans Grob1-5/+5
This patch fixes four foo * bar errors, and one trailing whitespace complaint from checkpatch.pl Signed-off-by: Hans Grob <H.Grob@physik.uni-muenchen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-24/dev/mem: use phys_addr_t for physical addressesCyril Chemparathy1-5/+5
This patch fixes the /dev/mem driver to use phys_addr_t for physical addresses. This is required on PAE systems, especially those that run entirely out of >4G physical memory space. Signed-off-by: Cyril Chemparathy <cyril@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-09mm: kill vma flag VM_RESERVED and mm->reserved_vm counterKonstantin Khlebnikov1-1/+1
A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA, currently it lost original meaning but still has some effects: | effect | alternative flags -+------------------------+--------------------------------------------- 1| account as reserved_vm | VM_IO 2| skip in core dump | VM_IO, VM_DONTDUMP 3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP 4| do not mlock | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP This patch removes reserved_vm counter from mm_struct. Seems like nobody cares about it, it does not exported into userspace directly, it only reduces total_vm showed in proc. Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP. remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP. remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP. [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-11powerpc: Disable /dev/port interface on systems without an ISA bridgeHaren Myneni1-1/+10
Some power systems do not have legacy ISA devices. So, /dev/port is not a valid interface on these systems. User level tools such as kbdrate is trying to access the device using this interface which is causing the system crash. This patch will fix this issue by not creating this interface on these powerpc systems. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-05-09printk - fix compilation for CONFIG_PRINTK=nKay Sievers1-0/+2
Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-07kmsg: export printk records to the /dev/kmsg interfaceKay Sievers1-60/+1
Support for multiple concurrent readers of /dev/kmsg, with read(), seek(), poll() support. Output of message sequence numbers, to allow userspace log consumers to reliably reconnect and reconstruct their state at any given time. After open("/dev/kmsg"), read() always returns *all* buffered records. If only future messages should be read, SEEK_END can be used. In case records get overwritten while /dev/kmsg is held open, or records get faster overwritten than they are read, the next read() will return -EPIPE and the current reading position gets updated to the next available record. The passed sequence numbers allow the log consumer to calculate the amount of lost messages. [root@mop ~]# cat /dev/kmsg 5,0,0;Linux version 3.4.0-rc1+ (kay@mop) (gcc version 4.7.0 20120315 ... 6,159,423091;ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff]) 7,160,424069;pci_root PNP0A03:00: host bridge window [io 0x0000-0x0cf7] (ignored) SUBSYSTEM=acpi DEVICE=+acpi:PNP0A03:00 6,339,5140900;NET: Registered protocol family 10 30,340,5690716;udevd[80]: starting version 181 6,341,6081421;FDC 0 is a S82078B 6,345,6154686;microcode: CPU0 sig=0x623, pf=0x0, revision=0x0 7,346,6156968;sr 1:0:0:0: Attached scsi CD-ROM sr0 SUBSYSTEM=scsi DEVICE=+scsi:1:0:0:0 6,347,6289375;microcode: CPU1 sig=0x623, pf=0x0, revision=0x0 Cc: Karel Zak <kzak@redhat.com> Tested-by: William Douglas <william.douglas@intel.com> Signed-off-by: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-07printk: convert byte-buffer to variable-length record bufferKay Sievers1-17/+38
- Record-based stream instead of the traditional byte stream buffer. All records carry a 64 bit timestamp, the syslog facility and priority in the record header. - Records consume almost the same amount, sometimes less memory than the traditional byte stream buffer (if printk_time is enabled). The record header is 16 bytes long, plus some padding bytes at the end if needed. The byte-stream buffer needed 3 chars for the syslog prefix, 15 char for the timestamp and a newline. - Buffer management is based on message sequence numbers. When records need to be discarded, the reading heads move on to the next full record. Unlike the byte-stream buffer, no old logged lines get truncated or partly overwritten by new ones. Sequence numbers also allow consumers of the log stream to get notified if any message in the stream they are about to read gets discarded during the time of reading. - Better buffered IO support for KERN_CONT continuation lines, when printk() is called multiple times for a single line. The use of KERN_CONT is now mandatory to use continuation; a few places in the kernel need trivial fixes here. The buffering could possibly be extended to per-cpu variables to allow better thread-safety for multiple printk() invocations for a single line. - Full-featured syslog facility value support. Different facilities can tag their messages. All userspace-injected messages enforce a facility value > 0 now, to be able to reliably distinguish them from the kernel-generated messages. Independent subsystems like a baseband processor running its own firmware, or a kernel-related userspace process can use their own unique facility values. Multiple independent log streams can co-exist that way in the same buffer. All share the same global sequence number counter to ensure proper ordering (and interleaving) and to allow the consumers of the log to reliably correlate the events from different facilities. Tested-by: William Douglas <william.douglas@intel.com> Signed-off-by: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-01-03switch device_get_devnode() and ->devnode() to umode_t *Al Viro1-2/+2
both callers of device_get_devnode() are only interested in lower 16bits and nobody tries to return anything wider than 16bit anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-10-31drivers/char: Add export.h for EXPORT_SYMBOL/THIS_MODULE as requiredPaul Gortmaker1-0/+1
They will need it called out explicitly in the near future due to a module.h usage cleanup that removes its implicit presence everywhere. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-04-19kmsg: properly support writev to avoid interleaved printk lines fixAndrew Morton1-3/+4
make `len' size_t, avoid multiple-assignments. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Lennart Poettering <lennart@poettering.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-19printk: /dev/kmsg - properly support writev() to avoid interleaved printk() ↵Kay Sievers1-15/+26
lines printk: /dev/kmsg - properly support writev() to avoid interleaved printk lines We should avoid calling printk() in a loop, when we pass a single string to /dev/kmsg with writev(). Cc: Lennart Poettering <lennart@poettering.net> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-23drivers/char/mem.c: clean up the codeChangli Gao1-4/+1
Reduce the lines of code and simplify the logic. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-25vfs: introduce FMODE_UNSIGNED_OFFSET for allowing negative f_posKAMEZAWA Hiroyuki1-0/+4
Now, rw_verify_area() checsk f_pos is negative or not. And if negative, returns -EINVAL. But, some special files as /dev/(k)mem and /proc/<pid>/mem etc.. has negative offsets. And we can't do any access via read/write to the file(device). So introduce FMODE_UNSIGNED_OFFSET to allow negative file offsets. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-22Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bklLinus Torvalds1-0/+3
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek
2010-10-15llseek: automatically add .llseek fopArnd Bergmann1-0/+3
All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-09-22char: Mark /dev/zero and /dev/kmem as not capable of writebackJan Kara1-1/+2
These devices don't do any writeback but their device inodes still can get dirty so mark bdi appropriately so that bdi code does the right thing and files inodes to lists of bdi carrying the device inodes. Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-06Fix init ordering of /dev/console vs callers of modprobeDavid Howells1-1/+1
Make /dev/console get initialised before any initialisation routine that invokes modprobe because if modprobe fails, it's going to want to open /dev/console, presumably to write an error message to. The problem with that is that if the /dev/console driver is not yet initialised, the chardev handler will call request_module() to invoke modprobe, which will fail, because we never compile /dev/console as a module. This will lead to a modprobe loop, showing the following in the kernel log: request_module: runaway loop modprobe char-major-5-1 request_module: runaway loop modprobe char-major-5-1 request_module: runaway loop modprobe char-major-5-1 request_module: runaway loop modprobe char-major-5-1 request_module: runaway loop modprobe char-major-5-1 This can happen, for example, when the built in md5 module can't find the built in cryptomgr module (because the latter fails to initialise). The md5 module comes before the call to tty_init(), presumably because 'crypto' comes before 'drivers' alphabetically. Fix this by calling tty_init() from chrdev_init(). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07frv: hide uncached_access() when pgprot_noncached is not #definedDavid Howells1-0/+2
Hide uncached_access() when pgprot_noncached is not #defined. This prevents the following warning: CC drivers/char/mem.o drivers/char/mem.c:229: warning: 'uncached_access' defined but not used Repairs d7d4d849b4e3acc405ec222884936800ffb26d48 ("drivers/char/mem.c: cleanups"). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07/dev/mem: allow rewindingEric Dumazet1-5/+0
commit dcefafb6 ("/dev/mem: dont allow seek to last page") inadvertently disabled rewinding on /dev/mem. This broke x86info for example. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07devmem: handle class_create() failureAnton Blanchard1-0/+3
I hit this when we had a bug in IDR for a few days. Basically sysfs would fail to create new inodes since it uses an IDR and therefore class_create would fail. While we are unlikely to see this fail we may as well handle it instead of oopsing. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12drivers/char/mem.c: cleanupsAndrew Morton1-102/+102
- fix switch statement layout - fix whitespace stuff - fix comment layout - remove unneeded inlining - use __weak - remove trailing whitespace - move uncached_access() inside `#ifndef __HAVE_PHYS_MEM_ACCESS_PROT' - it is otherwise unused. Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12/dev/mem: dont allow seek to last pageWu Fengguang1-6/+13
So as to return a uniform error -EOVERFLOW instead of a random one: # kmem-seek 0xfffffffffffffff0 seek /dev/kmem: Device or resource busy # kmem-seek 0xfffffffffffffff1 seek /dev/kmem: Block device required Suggested by OGAWA Hirofumi. Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-02devmem: fix kmem write bug on memory holesWu Fengguang1-1/+1
write_kmem() used to assume vwrite() always return the full buffer length. However now vwrite() could return 0 to indicate memory hole. This creates a bug that "buf" is not advanced accordingly. Fix it to simply ignore the return value, hence the memory hole. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tejun Heo <tj@kernel.org> Cc: Nick Piggin <npiggin@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>