summaryrefslogtreecommitdiff
path: root/fs/sysfs
AgeCommit message (Collapse)AuthorFilesLines
2012-12-17Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...
2012-11-26sysfs: Mark sysfs_attr_ns staticJosh Triplett1-2/+2
Nothing outside of fs/sysfs/file.c references this function, so mark it static. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-20userns: Allow unprivilged mounts of proc and sysfsEric W. Biederman1-0/+1
- The context in which proc and sysfs are mounted have no effect on the the uid/gid of their files so no conversion is needed except allowing the mount. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-10-24sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat()Geert Uytterhoeven1-8/+8
The warning check for duplicate sysfs entries can cause a buffer overflow when printing the warning, as strcat() doesn't check buffer sizes. Use strlcat() instead. Since strlcat() doesn't return a pointer to the passed buffer, unlike strcat(), I had to convert the nested concatenation in sysfs_add_one() to an admittedly more obscure comma operator construct, to avoid emitting code for the concatenation if CONFIG_BUG is disabled. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-04sysfs: Fix comment typo "sysf_create_link".Robert P. J. Day1-1/+1
More pedantry. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-01Merge branch 'for-linus' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second vfs pile from Al Viro: "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the deadlock reproduced by xfstests 068), symlink and hardlink restriction patches, plus assorted cleanups and fixes. Note that another fsfreeze deadlock (emergency thaw one) is *not* dealt with - the series by Fernando conflicts a lot with Jan's, breaks userland ABI (FIFREEZE semantics gets changed) and trades the deadlock for massive vfsmount leak; this is going to be handled next cycle. There probably will be another pull request, but that stuff won't be in it." Fix up trivial conflicts due to unrelated changes next to each other in drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c} * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) delousing target_core_file a bit Documentation: Correct s_umount state for freeze_fs/unfreeze_fs fs: Remove old freezing mechanism ext2: Implement freezing btrfs: Convert to new freezing mechanism nilfs2: Convert to new freezing mechanism ntfs: Convert to new freezing mechanism fuse: Convert to new freezing mechanism gfs2: Convert to new freezing mechanism ocfs2: Convert to new freezing mechanism xfs: Convert to new freezing code ext4: Convert to new freezing mechanism fs: Protect write paths by sb_start_write - sb_end_write fs: Skip atime update on frozen filesystem fs: Add freezing handling to mnt_want_write() / mnt_drop_write() fs: Improve filesystem freezing handling switch the protection of percpu_counter list to spinlock nfsd: Push mnt_want_write() outside of i_mutex btrfs: Push mnt_want_write() outside of i_mutex fat: Push mnt_want_write() outside of i_mutex ...
2012-07-31sysfs: Push file_update_time() into bin_page_mkwrite()Jan Kara1-0/+2
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-26Merge tag 'driver-core-3.6-rc1' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core changes from Greg Kroah-Hartman: "Here's the big driver core pull request for 3.6-rc1. Unlike 3.5, this kernel should be a lot tamer, with the printk changes now settled down. All we have here is some extcon driver updates, w1 driver updates, a few printk cleanups that weren't needed for 3.5, but are good to have now, and some other minor fixes/changes in the driver core. All of these have been in the linux-next releases for a while now. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits) printk: Export struct log size and member offsets through vmcoreinfo Drivers: hv: Change the hex constant to a decimal constant driver core: don't trigger uevent after failure extcon: MAX77693: Add extcon-max77693 driver to support Maxim MAX77693 MUIC device sysfs: fail dentry revalidation after namespace change fix sysfs: fail dentry revalidation after namespace change extcon: spelling of detach in function doc extcon: arizona: Stop microphone detection if we give up on it extcon: arizona: Update cable reporting calls and split headset PM / Runtime: Do not increment device usage counts before probing kmsg - do not flush partial lines when the console is busy kmsg - export "continuation record" flag to /dev/kmsg kmsg - avoid warning for CONFIG_PRINTK=n compilations kmsg - properly print over-long continuation lines driver-core: Use kobj_to_dev instead of re-implementing it driver-core: Move kobj_to_dev from genhd.h to device.h driver core: Move deferred devices to the end of dpm_list before probing driver core: move uevent call to driver_register driver core: fix shutdown races with probe/remove(v3) Extcon: Arizona: Add driver for Wolfson Arizona class devices ...
2012-07-17sysfs: fail dentry revalidation after namespace change fixAndrew Morton1-3/+5
don't assume that KOBJ_NS_TYPE_NONE==0. Also save a test-n-branch. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-17sysfs: fail dentry revalidation after namespace changeGlauber Costa1-0/+8
When we change the namespace tag of a sysfs entry, the associated dentry is still kept around. readdir() will work correctly and not display the old entries, but open() will still succeed, so will reads and writes. This will no longer happen if sysfs is remounted, hinting that this is a cache-related problem. I am using the following sequence to demonstrate that: shell1: ip link add type veth unshare -nm shell2: ip link set veth1 <pid_of_shell_1> cat /sys/devices/virtual/net/veth1/ifindex Before that patch, this will succeed (fail to fail). After it, it will correctly return an error. Differently from a normal rename, which we handle fine, changing the object namespace will keep it's path intact. So this check seems necessary as well. [ v2: get type from parent, as suggested by Eric Biederman ] Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Tejun Heo <tj@kernel.org> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-14VFS: Pass mount flags to sget()David Howells1-2/+1
Pass mount flags to sget() so that it can use them in initialising a new superblock before the set function is called. They could also be passed to the compare function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14sysfs: just use d_materialise_unique()Al Viro1-8/+1
same as for nfs et.al. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14sysfs: switch to ->s_d_op and ->d_release()Al Viro3-10/+8
a) ->d_iput() is wrong here - what we do to inode is completely usual, it's dentry->d_fsdata that we want to drop. Just use ->d_release(). b) switch to ->s_d_op - no need to play with d_set_d_op() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14stop passing nameidata to ->lookup()Al Viro1-1/+1
Just the flags; only NFS cares even about that, but there are legitimate uses for such argument. And getting rid of that completely would require splitting ->lookup() into a couple of methods (at least), so let's leave that alone for now... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14stop passing nameidata * to ->d_revalidate()Al Viro1-2/+2
Just the lookup flags. Die, bastard, die... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-28Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds1-1/+1
Pull writeback tree from Wu Fengguang: "Mainly from Jan Kara to avoid iput() in the flusher threads." * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Avoid iput() from flusher thread vfs: Rename end_writeback() to clear_inode() vfs: Move waiting for inode writeback from end_writeback() to evict_inode() writeback: Refactor writeback_single_inode() writeback: Remove wb->list_lock from writeback_single_inode() writeback: Separate inode requeueing after writeback writeback: Move I_DIRTY_PAGES handling writeback: Move requeueing when I_SYNC set to writeback_sb_inodes() writeback: Move clearing of I_SYNC into inode_sync_complete() writeback: initialize global_dirty_limit fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds mm: page-writeback.c: local functions should not be exposed globally
2012-05-23Merge branch 'for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...
2012-05-15userns: Convert sysfs to use kgid/kuid where appropriateEric W. Biederman1-2/+2
Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-05-14sysfs: get rid of some lockdep false positivesAlan Stern1-5/+26
This patch (as1554) fixes a lockdep false-positive report. The problem arises because lockdep is unable to deal with the tree-structured locks created by the device core and sysfs. This particular problem involves a sysfs attribute method that unregisters itself, not from the device it was called for, but from a descendant device. Lockdep doesn't understand the distinction and reports a possible deadlock, even though the operation is safe. This is the sort of thing that would normally be handled by using a nested lock annotation; unfortunately it's not feasible to do that here. There's no sensible way to tell sysfs when attribute removal occurs in the context of a parent attribute method. As a workaround, the patch adds a new flag to struct attribute telling sysfs not to inform lockdep when it acquires a readlock on a sysfs_dirent instance for the attribute. The readlock is still acquired, but lockdep doesn't know about it and hence does not complain about impossible deadlock scenarios. Also added are macros for static initialization of attribute structures with the ignore_lockdep flag set. The three offending attributes in the USB subsystem are converted to use the new macros. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Tejun Heo <tj@kernel.org> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-06vfs: Rename end_writeback() to clear_inode()Jan Kara1-1/+1
After we moved inode_sync_wait() from end_writeback() it doesn't make sense to call the function end_writeback() anymore. Rename it to clear_inode() which well says what the function really does - set I_CLEAR flag. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-05-02sysfs: Removed dup_name entirely in sysfs_renameSasikantha babu1-4/+2
Since no one using "dup_name", removed it completely in sysfs_rename. Signed-off-by: Sasikantha babu <sasikanth.v19@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-10sysfs: handle 'parent deleted before child added'Dan Williams1-0/+3
In scsi at least two cases of the parent device being deleted before the child is added have been observed. 1/ scsi is performing async scans and the device is removed prior to the async can thread running (can happen with an in-opportune / unlikely unplug during initial scan). 2/ libsas discovery event running after the parent port has been torn down (this is a bug in libsas). Result in crash signatures like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6 ... Process scsi_scan_8 (pid: 5417, threadinfo ffff88080bd16000, task ffff880801b8a0b0) Stack: 00000000fffffffe ffff880813470628 ffff88080bd17cd0 ffff88080614b7e8 ffff88080b45c108 00000000fffffffe ffff88080bd17d20 ffffffff8125e4a8 ffff88080bd17cf0 ffffffff81075149 ffff88080bd17d30 ffff88080614b7e8 Call Trace: [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3 [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8125e641>] kobject_add_varg+0x41/0x50 [<ffffffff8125e70b>] kobject_add+0x64/0x66 [<ffffffff8131122b>] device_add+0x12d/0x63a In this scenario the parent is still valid (because we have a reference), but it has been device_del()'d which means its kobj->sd pointer is NULL'd via: device_del()->kobject_del()->sysfs_remove_dir() ...and then sysfs_create_dir() (without this fix) goes ahead and de-references parent_sd via sysfs_ns_type(): return (sd->s_flags & SYSFS_NS_TYPE_MASK) >> SYSFS_NS_TYPE_SHIFT; This scenario is being fixed in scsi/libsas, but if other subsystems present the same ordering the system need not immediately crash. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: James Bottomley <JBottomley@parallels.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-10sysfs: Prevent crash on unset sysfs group attributesBruno Prémont1-1/+5
Do not let the kernel crash when a device is registered with sysfs while group attributes are not set (aka NULL). Warn about the offender with some information about the offending device. This would warn instead of trying NULL pointer deref like: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81152673>] internal_create_group+0x83/0x1a0 PGD 0 Oops: 0000 [#1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc1-x86_64 #3 HP ProLiant DL360 G4 RIP: 0010:[<ffffffff81152673>] [<ffffffff81152673>] internal_create_group+0x83/0x1a0 RSP: 0018:ffff88019485fd70 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 RDX: ffff880192e99908 RSI: ffff880192e99630 RDI: ffffffff81a26c60 RBP: ffff88019485fdc0 R08: 0000000000000000 R09: 0000000000000000 R10: ffff880192e99908 R11: 0000000000000000 R12: ffffffff81a16a00 R13: ffff880192e99908 R14: ffffffff81a16900 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88019bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a0c000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/0 (pid: 1, threadinfo ffff88019485e000, task ffff880194878000) Stack: ffff88019485fdd0 ffff880192da9d60 0000000000000000 ffff880192e99908 ffff880192e995d8 0000000000000001 ffffffff81a16a00 ffff880192da9d60 0000000000000000 0000000000000000 ffff88019485fdd0 ffffffff811527be Call Trace: [<ffffffff811527be>] sysfs_create_group+0xe/0x10 [<ffffffff81376ca6>] device_add_groups+0x46/0x80 [<ffffffff81377d3d>] device_add+0x46d/0x6a0 ... Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-09sysfs: Update the name hash for an entry after changing the namespaceTom Goff1-1/+1
This is needed to allow renaming network devices that have been moved to another network namespace. Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-21Merge branch 'for-linus' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "This is _not_ all; in particular, Miklos' and Jan's stuff is not there yet." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits) ext4: initialization of ext4_li_mtx needs to be done earlier debugfs-related mode_t whack-a-mole hfsplus: add an ioctl to bless files hfsplus: change finder_info to u32 hfsplus: initialise userflags qnx4: new helper - try_extent() qnx4: get rid of qnx4_bread/qnx4_getblk take removal of PF_FORKNOEXEC to flush_old_exec() trim includes in inode.c um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it um: embed ->stub_pages[] into mmu_context gadgetfs: list_for_each_safe() misuse ocfs2: fix leaks on failure exits in module_init ecryptfs: make register_filesystem() the last potential failure exit ntfs: forgets to unregister sysctls on register_filesystem() failure logfs: missing cleanup on register_filesystem() failure jfs: mising cleanup on register_filesystem() failure make configfs_pin_fs() return root dentry on success configfs: configfs_create_dir() has parent dentry in dentry->d_parent configfs: sanitize configfs_create() ...
2012-03-20switch open-coded instances of d_make_root() to new helperAl Viro1-2/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-08Revert "sysfs: Kill nlink counting."Greg Kroah-Hartman3-0/+10
This reverts commit 524b6c5b39b931311dfe5a2f5abae2f5c9731676. It has shown to break userspace tools, which is not acceptable. Reported-by: Jiri Slaby <jslaby@suse.cz> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-24sysfs: Fix memory leak in sysfs_sd_setsecdata().Masami Ichikawa1-5/+6
This patch fixies follwing two memory leak patterns that reported by kmemleak. sysfs_sd_setsecdata() is called during sys_lsetxattr() operation. It checks sd->s_iattr is NULL or not. Then if it is NULL, it calls sysfs_init_inode_attrs() to allocate memory. That code is this. iattrs = sd->s_iattr; if (!iattrs) iattrs = sysfs_init_inode_attrs(sd); The iattrs recieves sysfs_init_inode_attrs()'s result, but sd->s_iattr doesn't know the address. so it needs to set correct address to sd->s_iattr to free memory in other function. unreferenced object 0xffff880250b73e60 (size 32): comm "systemd", pid 1, jiffies 4294683888 (age 94.553s) hex dump (first 32 bytes): 73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f system_u:object_ 72 3a 73 79 73 66 73 5f 74 3a 73 30 00 00 00 00 r:sysfs_t:s0.... backtrace: [<ffffffff814cb1d0>] kmemleak_alloc+0x73/0x98 [<ffffffff811270ab>] __kmalloc+0x100/0x12c [<ffffffff8120775a>] context_struct_to_string+0x106/0x210 [<ffffffff81207cc1>] security_sid_to_context_core+0x10b/0x129 [<ffffffff812090ef>] security_sid_to_context+0x10/0x12 [<ffffffff811fb0da>] selinux_inode_getsecurity+0x7d/0xa8 [<ffffffff811fb127>] selinux_inode_getsecctx+0x22/0x2e [<ffffffff811f4d62>] security_inode_getsecctx+0x16/0x18 [<ffffffff81191dad>] sysfs_setxattr+0x96/0x117 [<ffffffff811542f0>] __vfs_setxattr_noperm+0x73/0xd9 [<ffffffff811543d9>] vfs_setxattr+0x83/0xa1 [<ffffffff811544c6>] setxattr+0xcf/0x101 [<ffffffff81154745>] sys_lsetxattr+0x6a/0x8f [<ffffffff814efda9>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff88024163c5a0 (size 96): comm "systemd", pid 1, jiffies 4294683888 (age 94.553s) hex dump (first 32 bytes): 00 00 00 00 ed 41 00 00 00 00 00 00 00 00 00 00 .....A.......... 00 00 00 00 00 00 00 00 0c 64 42 4f 00 00 00 00 .........dBO.... backtrace: [<ffffffff814cb1d0>] kmemleak_alloc+0x73/0x98 [<ffffffff81127402>] kmem_cache_alloc_trace+0xc4/0xee [<ffffffff81191cbe>] sysfs_init_inode_attrs+0x2a/0x83 [<ffffffff81191dd6>] sysfs_setxattr+0xbf/0x117 [<ffffffff811542f0>] __vfs_setxattr_noperm+0x73/0xd9 [<ffffffff811543d9>] vfs_setxattr+0x83/0xa1 [<ffffffff811544c6>] setxattr+0xcf/0x101 [<ffffffff81154745>] sys_lsetxattr+0x6a/0x8f [<ffffffff814efda9>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff ` Signed-off-by: Masami Ichikawa <masami256@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-02Merge 3.3-rc2 into the driver-core-next branch.Greg Kroah-Hartman2-1/+10
This was done to resolve a merge and build problem with the drivers/acpi/processor_driver.c file. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-01-31sysfs: Update the name hash when renaming sysfs entriesEric W. Biederman1-0/+1
This fixes a bug introduced with sysfs name hashes where renaming a network device appears to succeed but silently makes the sysfs files for that network device inaccessible. In at least one configuration this bug has stopped networking from coming up during boot. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Tested-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: change permissions for /sys from 0755 to 0555Vitaly Kuznetsov1-1/+1
There is a misleading difference between /proc and /sys permissions, /proc is 0555 and /sys is 0755. But as it is impossible to create or unlink something in /sys it would be nice to have same permissions. Signed-off-by: Vitaly Kuznetsov <vitty@altlinux.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Kill nlink counting.Eric W. Biederman3-10/+0
Tracking the number of subdirectories requires an extra field that increases the size of sysfs_dirent. nlinks are not particularly interesting for sysfs and the nlink counts are wrong when network namespaces are involved so stop counting them, and always return nlink == 1. Userspace already knows that directories with nlink == 1 have an nlink count they can't use to count subdirectories. This reduces the size of sysfs_dirent by 8 bytes on 64bit platforms. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Store the sysfs inode in an unsigned int.Eric W. Biederman2-3/+3
Store the sysfs inode number in an unsided int because ida inode allocator can return at most a 31 bit number, reducing the size of struct sysfs_dirent by 8 bytes on 64bit platforms. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Reduce s_flags to an unsinged short so it packs well with s_modeEric W. Biederman1-3/+3
On 32bit this reduces sizeof(struct sysfs_dirent) by 2 bytes. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Add s_hash to sysfs_dirent and order directory entries by hashEric W. Biederman2-108/+120
Compute a 31 bit hash of directory entries (that can fit in a signed 32bit off_t) and index the sysfs directory entries by that hash, replacing the per directory indexes by name and by inode. Because we now only use a single rbtree this reduces the size of sysfs_dirent by 2 pointers. Because we have fewer cases to deal with the code is now simpler. For now I use the simple hash that the dcache uses as that is easy to use and seems simple enough. In addition to makeing the code simpler using a hash for the file position in readdir brings sysfs in line with other filesystems that have non-trivial directory structures. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24sysfs: Complain bitterly about attempts to remove files from nonexistent ↵Eric W. Biederman2-1/+10
directories. Recently an OOPS was observed from the usb serial io_ti driver when it tried to remove sysfs directories. Upon investigation it turns out this driver was always buggy and that a recent sysfs change had stopped guarding itself against removing attributes from sysfs directories that had already been removed. :( Historically we have been silent about attempting to files from nonexistent sysfs directories and have politely returned error codes. That has resulted in people writing broken code that ignores the error codes. Issue a kernel WARNING and a stack backtrace to make it clear in no uncertain terms that abusing sysfs is not ok, and the callers need to fix their code. This change transforms the io_ti OOPS into a more comprehensible error message and stack backtrace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reported-by: Wolfgang Frisch <wfpub@roembden.net> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-03sysfs: propagate umode_tAl Viro3-4/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch sysfs_chmod_file() to umode_tAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch ->is_visible() to returning umode_tAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-11-02filesystems: add set_nlink()Miklos Szeredi1-1/+1
Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-11-01sysfs: Make sysfs_rename safe with sysfs_dirents in rbtrees.Eric W. Biederman1-8/+6
In sysfs_rename we need to remove the optimization of not calling sysfs_unlink_sibling and sysfs_link_sibling if the renamed parent directory is not changing. This optimization is no longer valid now that sysfs dirents are stored in an rbtree sorted by name. Move the assignment of s_ns before the call of sysfs_link_sibling. With no sysfs_dirent fields changing after the call of sysfs_link_sibling this allows sysfs_link_sibling to take any of the directory entries into account when it builds the rbtrees, and s_ns looks like a prime canidate to be used in the rbtree in the future. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Greg KH <gregkh@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-25sysfs: Remove support for tagged directories with untagged members (again)Eric W. Biederman1-4/+4
In commit 8a9ea3237e7e ("Merge git://.../davem/net-next") where my sysfs changes from the net tree merged with the sysfs rbtree changes from Mickulas Patocka the conflict resolution failed to preserve the simplified property that was the point of my changes. That is sysfs_find_dirent can now say something is a match if and only s_name and s_ns match what we are looking for, and sysfs_readdir can simply return all of the directory entries where s_ns matches the directory that we should be returning. Now that we are back to exact matches we can tweak sysfs_find_dirent and the name rb_tree to order sysfs_dirents by s_ns s_name and remove the second loop in sysfs_find_dirent. However that change seems a bit much for a conflict resolution so it can come later. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds3-7/+65
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits) dp83640: free packet queues on remove dp83640: use proper function to free transmit time stamping packets ipv6: Do not use routes from locally generated RAs |PATCH net-next] tg3: add tx_dropped counter be2net: don't create multiple RX/TX rings in multi channel mode be2net: don't create multiple TXQs in BE2 be2net: refactor VF setup/teardown code into be_vf_setup/clear() be2net: add vlan/rx-mode/flow-control config to be_setup() net_sched: cls_flow: use skb_header_pointer() ipv4: avoid useless call of the function check_peer_pmtu TCP: remove TCP_DEBUG net: Fix driver name for mdio-gpio.c ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces ipv4: fix ipsec forward performance regression jme: fix irq storm after suspend/resume route: fix ICMP redirect validation net: hold sock reference while processing tx timestamps tcp: md5: add more const attributes Add ethtool -g support to virtio_net ... Fix up conflicts in: - drivers/net/Kconfig: The split-up generated a trivial conflict with removal of a stale reference to Documentation/networking/net-modules.txt. Remove it from the new location instead. - fs/sysfs/dir.c: Fairly nasty conflicts with the sysfs rb-tree usage, conflicting with Eric Biederman's changes for tagged directories.
2011-10-19sysfs: Reject with a warning invalid uses of tagged directories.Eric W. Biederman2-3/+14
sysfs is a core piece of ifrastructure that many people use and few people have all of the rules in their head on how to use it correctly. Add warnings for people using tagged directories improperly to that any misuses can be caught and diagnosed quickly. A single inexpensive test in sysfs_find_dirent is almost sufficient to catch all possible misuses. An additional warning is needed in sysfs_add_dirent so that we actually fail when attempting to add an untagged dirent in a tagged directory. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19sysfs: Remove support for tagged directories with untagged members.Eric W. Biederman2-5/+3
Now that /sys/class/net/bonding_masters is implemented as a tagged sysfs file we can remove support for untagged files in tagged directories. This change removes any ambiguity of what a NULL namespace value means. A NULL namespace parameter after this patch means that we are talking about an untagged sysfs dirent. This makes the sysfs code much less prone to mistakes when during maintenance. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19sysfs: Implement support for tagged files in sysfs.Eric W. Biederman1-2/+51
Looking up files in sysfs is hard to understand and analyize because we currently allow placing untagged files in tagged directories. In the implementation of that we have two subtly different meanings of NULL. NULL meaning there is no tag on a directory entry and NULL meaning we don't care which namespace the lookup is performed for. This multiple uses of NULL have resulted in subtle bugs (since fixed) in the code. Currently it is only the bonding driver that needs to have an untagged file in a tagged directory. To untagle this mess I am adding support for tagged files to sysfs. Modifying the bonding driver to implement bonding_masters as a tagged file. Registering bonding_masters once for each network namespace. Then I am removing support for untagged entries in tagged sysfs directories. Resulting in code that is much easier to reason about. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-26sysfs: add unsigned long cast to prevent compile warningHeiko Carstens1-1/+2
"sysfs: use rb-tree for inode number lookup" added a new printk which causes a new compile warning on s390 (and few other architectures): fs/sysfs/dir.c: In function 'sysfs_link_sibling': fs/sysfs/dir.c:63:4: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'ino_t' [-Wform Add an explicit unsigned long cast since ino_t is an unsigned long on most architectures. Cc: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: use rb-tree for inode number lookupMikulas Patocka2-42/+52
sysfs: use rb-tree for inode number lookup This patch makes sysfs use red-black tree for inode number lookup. Together with a previous patch to use red-black tree for name lookup, this patch makes all sysfs lookups to have O(log n) complexity. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: remove s_sibling hacksMikulas Patocka2-12/+12
sysfs: remove s_sibling hacks s_sibling was used for three different purposes: 1) as a linked list of entries in the directory 2) as a linked list of entries to be deleted 3) as a pointer to "struct completion" This patch removes the hack and introduces new union u which holds pointers for cases 2) and 3). This change is needed for the following patch that removes s_sibling at all and replaces it with a rb tree. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-22sysfs: use rb-tree for name lookupsMikulas Patocka2-7/+55
sysfs: use rb-tree for name lookups Use red-black tree for name lookups. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>