summaryrefslogtreecommitdiff
path: root/net/sunrpc/xprtsock.c
AgeCommit message (Collapse)AuthorFilesLines
2010-09-24net: fix a lockdep splatEric Dumazet1-14/+14
We have for each socket : One spinlock (sk_slock.slock) One rwlock (sk_callback_lock) Possible scenarios are : (A) (this is used in net/sunrpc/xprtsock.c) read_lock(&sk->sk_callback_lock) (without blocking BH) <BH> spin_lock(&sk->sk_slock.slock); ... read_lock(&sk->sk_callback_lock); ... (B) write_lock_bh(&sk->sk_callback_lock) stuff write_unlock_bh(&sk->sk_callback_lock) (C) spin_lock_bh(&sk->sk_slock) ... write_lock_bh(&sk->sk_callback_lock) stuff write_unlock_bh(&sk->sk_callback_lock) spin_unlock_bh(&sk->sk_slock) This (C) case conflicts with (A) : CPU1 [A] CPU2 [C] read_lock(callback_lock) <BH> spin_lock_bh(slock) <wait to spin_lock(slock)> <wait to write_lock_bh(callback_lock)> We have one problematic (C) use case in inet_csk_listen_stop() : local_bh_disable(); bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock) WARN_ON(sock_owned_by_user(child)); ... sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock) lockdep is not happy with this, as reported by Tetsuo Handa It seems only way to deal with this is to use read_lock_bh(callbacklock) everywhere. Thanks to Jarek for pointing a bug in my first attempt and suggesting this solution. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds1-6/+22
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix an Oops in the NFSv4 atomic open code NFS: Fix the selection of security flavours in Kconfig NFS: fix the return value of nfs_file_fsync() rpcrdma: Fix SQ size calculation when memreg is FRMR xprtrdma: Do not truncate iova_start values in frmr registrations. nfs: Remove redundant NULL check upon kfree() nfs: Add "lookupcache" to displayed mount options NFS: allow close-to-open cache semantics to apply to root of NFS filesystem SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)
2010-08-11param: use ops in struct kernel_param, rather than get and set fns directlyRusty Russell1-11/+15
This is more kernel-ish, saves some space, and also allows us to expand the ops without breaking all the callers who are happy for the new members to be NULL. The few places which defined their own param types are changed to the new scheme (more which crept in recently fixed in following patches). Since we're touching them anyway, we change get() and set() to take a const struct kernel_param (which they really are). This causes some harmless warnings until we fix them (in following patches). To reduce churn, module_param_call creates the ops struct so the callers don't have to change (and casts the functions to reduce warnings). The modern version which takes an ops struct is called module_param_cb. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Takashi Iwai <tiwai@suse.de> Tested-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ville Syrjala <syrjala@sci.fi> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Alessandro Rubini <rubini@ipvvis.unipv.it> Cc: Michal Januszewski <spock@gentoo.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: linux-kernel@vger.kernel.org Cc: linux-input@vger.kernel.org Cc: linux-fbdev-devel@lists.sourceforge.net Cc: linux-nfs@vger.kernel.org Cc: netdev@vger.kernel.org
2010-08-10SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)Andy Chittenden1-6/+22
When reusing a TCP connection, ensure that it's aborted if a previous shutdown attempt has been made on that connection so that the RPC over TCP recovery mechanism succeeds. Signed-off-by: Andy Chittenden <andyc.bluearc@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()Trond Myklebust1-16/+22
If the attempt to read the calldir fails, then instead of storing the read bytes, we currently discard them. This leads to a garbage final result when upon re-entry to the same routine, we read the remaining bytes. Fixes the regression in bugzilla number 16213. Please see https://bugzilla.kernel.org/show_bug.cgi?id=16213 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-05-26sunrpc: fix leak on error on socket xprt setupJ. Bruce Fields1-11/+18
Also collect exit code together while we're at it. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds1-4/+0
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits) qlcnic: adding co maintainer ixgbe: add support for active DA cables ixgbe: dcb, do not tag tc_prio_control frames ixgbe: fix ixgbe_tx_is_paused logic ixgbe: always enable vlan strip/insert when DCB is enabled ixgbe: remove some redundant code in setting FCoE FIP filter ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp ixgbe: fix header len when unsplit packet overflows to data buffer ipv6: Never schedule DAD timer on dead address ipv6: Use POSTDAD state ipv6: Use state_lock to protect ifa state ipv6: Replace inet6_ifaddr->dead with state cxgb4: notify upper drivers if the device is already up when they load cxgb4: keep interrupts available when the ports are brought down cxgb4: fix initial addition of MAC address cnic: Return SPQ credit to bnx2x after ring setup and shutdown. cnic: Convert cnic_local_flags to atomic ops. can: Fix SJA1000 command register writes on SMP systems bridge: fix build for CONFIG_SYSFS disabled ARCNET: Limit com20020 PCI ID matches for SOHARD cards ... Fix up various conflicts with pcmcia tree drivers/net/ {pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and wireless/orinoco/spectrum_cs.c} and feature removal (Documentation/feature-removal-schedule.txt). Also fix a non-content conflict due to pm_qos_requirement getting renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c
2010-05-17net: Remove unnecessary returns from void function()sJoe Perches1-4/+0
This patch removes from net/ (but not any netfilter files) all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-14SUNRPC: Move the task->tk_bytes_sent and tk_rtt to struct rpc_rqstTrond Myklebust1-2/+2
It seems strange to maintain stats for bytes_sent in one structure, and bytes received in another. Try to assemble all the RPC request-related stats in struct rpc_rqst Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14SUNRPC: Fix xs_setup_bc_tcp()Trond Myklebust1-3/+0
It is a BUG for anybody to call this function without setting args->bc_xprt. Trying to return an error value is just wrong, since the user cannot fix this: it is a programming error, not a user error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14SUNRPC: RPC metrics and RTT estimator should use same RTT valueChuck Lever1-1/+0
Compute an RPC request's RTT once, and use that value both for reporting RPC metrics, and for adjusting the RTT context used by the RPC client's RTT estimator algorithm. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14SUNRPC: Fail over more quickly on connect errorsTrond Myklebust1-17/+0
We should not allow soft tasks to wait for longer than the major timeout period when waiting for a reconnect to occur. Remove the field xprt->connect_timeout since it has been obsoleted by xprt->reestablish_timeout. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14SUNRPC: Move the test for XPRT_CONNECTING into xprt_connect()Trond Myklebust1-14/+1
This fixes a bug with setting xprt->stat.connect_start. Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-22SUNRPC: Fix a use after free bug with the NFSv4.1 backchannelTrond Myklebust1-3/+0
The ->release_request() callback was designed to allow the transport layer to do housekeeping after the RPC call is done. It cannot be used to free the request itself, and doing so leads to a use-after-free bug in xprt_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-18Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds1-6/+2
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: ensure bdi_unregister is called on mount failure. NFS: Avoid a deadlock in nfs_release_page NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode() nfs4: Make the v4 callback service hidden nfs: fix unlikely memory leak rpc client can not deal with ENOSOCK, so translate it into ENOCONN
2010-03-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds1-5/+4
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (108 commits) bridge: ensure to unlock in error path in br_multicast_query(). drivers/net/tulip/eeprom.c: fix bogus "(null)" in tulip init messages sky2: Avoid rtnl_unlock without rtnl_lock ipv6: Send netlink notification when DAD fails drivers/net/tg3.c: change the field used with the TG3_FLAG_10_100_ONLY constant ipconfig: Handle devices which take some time to come up. mac80211: Fix memory leak in ieee80211_if_write() mac80211: Fix (dynamic) power save entry ipw2200: use kmalloc for large local variables ath5k: read eeprom IQ calibration values correctly for G mode ath5k: fix I/Q calibration (for real) ath5k: fix TSF reset ath5k: use fixed antenna for tx descriptors libipw: split ieee->networks into small pieces mac80211: Fix sta_mtx unlocking on insert STA failure path rt2x00: remove KSEG1ADDR define from rt2x00soc.h net: add ColdFire support to the smc91x driver asix: fix setting mac address for AX88772 ipv6 ip6_tunnel: eliminate unused recursion field from ip6_tnl{}. net: Fix dev_mc_add() ...
2010-03-08net/sunrpc: Convert (void)snprintf to snprintfJoe Perches1-2/+2
(Applies on top of "Remove uses of NIPQUAD, use %pI4") Casts to void of snprintf are most uncommon in kernel source. 9 use casts, 1301 do not. Remove the remaining uses in net/sunrpc/ Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-08net/sunrpc: Remove uses of NIPQUAD, use %pI4Joe Perches1-3/+2
Originally submitted Jan 1, 2010 http://patchwork.kernel.org/patch/71221/ Convert NIPQUAD to the %pI4 format extension where possible Convert %02x%02x%02x%02x/NIPQUAD to %08x/ntohl Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-08rpc client can not deal with ENOSOCK, so translate it into ENOCONNBian Naimeng1-6/+2
If NFSv4 client send a request before connect, or the old connection was broken because a ETIMEOUT error catched by call_status, ->send_request will return ENOSOCK, but rpc layer can not deal with it, so make sure ->send_request can translate ENOSOCK into ENOCONN. Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02SUNRPC: Handle EINVAL error returns from the TCP connect operationTrond Myklebust1-0/+5
This can, for instance, happen if the user specifies a link local IPv6 address. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-02-10xprtsock.c: make bc_{malloc/free} staticH Hartley Sweeten1-2/+2
xprtsock.c: make bc_{malloc/free} static The server backchannel buf_alloc and buf_free methods should be static since they are not used outside this file. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-13Merge branch 'nfs-for-2.6.33'Trond Myklebust1-1/+1
2009-12-03SUNRPC: Allow RPCs to fail quickly if the server is unreachableChuck Lever1-1/+1
The kernel sometimes makes RPC calls to services that aren't running. Because the kernel's RPC client always assumes the hard retry semantic when reconnecting a connection-oriented RPC transport, the underlying reconnect logic takes a long while to time out, even though the remote may have responded immediately with ECONNREFUSED. In certain cases, like upcalls to our local rpcbind daemon, or for NFS mount requests, we'd like the kernel to fail immediately if the remote service isn't reachable. This allows another transport to be tried immediately, or the pending request can be abandoned quickly. Introduce a per-request flag which controls how call_transmit_status() behaves when request transmission fails because the server cannot be reached. We don't want soft connection semantics to apply to other errors. The default case of the switch statement in call_transmit_status() no longer falls through; the fall through code is copied to the default case, and a "break;" is added. The transport's connection re-establishment timeout is also ignored for such requests. We want the request to fail immediately, so the reconnect delay is skipped. Additionally, we don't want a connect failure here to further increase the reconnect timeout value, since this request will not be retried. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-11-18sysctl: Drop & in front of every proc_handler.Eric W. Biederman1-5/+5
For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-12sysctl net: Remove unused binary sysctl codeEric W. Biederman1-16/+2
Now that sys_sysctl is a compatiblity wrapper around /proc/sys all sysctl strategy routines, and all ctl_name and strategy entries in the sysctl tables are unused, and can be revmoed. In addition neigh_sysctl_register has been modified to no longer take a strategy argument and it's callers have been modified not to pass one. Cc: "David Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-09-23NFS/RPC: fix problems with reestablish_timeout and related code.Neil Brown1-0/+9
[[resending with correct cc: - "vfs.kernel.org" just isn't right!]] xprt->reestablish_timeout is used to cause TCP connection attempts to back off if the connection fails so as not to hammer the network, but to still allow immediate connections when there is no reason to believe there is a problem. It is not used for the first connection (when transport->sock is NULL) but only on reconnects. It is currently set: a/ to 0 when xs_tcp_state_change finds a state of TCP_FIN_WAIT1 on the assumption that the client has closed the connection so the reconnect should be immediate when needed. b/ to at least XS_TCP_INIT_REEST_TO when xs_tcp_state_change detects TCP_CLOSING or TCP_CLOSE_WAIT on the assumption that the server closed the connection so a small delay at least is required. c/ as above when xs_tcp_state_change detects TCP_SYN_SENT, so that it is never 0 while a connection has been attempted, else the doubling will produce 0 and there will be no backoff. d/ to double is value (up to a limit) when delaying a connection, thus providing exponential backoff and e/ to XS_TCP_INIT_REEST_TO in xs_setup_tcp as simple initialisation. So you can see it is highly dependant on xs_tcp_state_change being called as expected. However experimental evidence shows that xs_tcp_state_change does not see all state changes. ("rpcdebug -m rpc trans" can help show what actually happens). Results show: TCP_ESTABLISHED is reported when a connection is made. TCP_SYN_SENT is never reported, so rule 'c' above is never effective. When the server closes the connection, TCP_CLOSE_WAIT and TCP_LAST_ACK *might* be reported, and TCP_CLOSE is always reported. This rule 'b' above will sometimes be effective, but not reliably. When the client closes the connection, it used to result in TCP_FIN_WAIT1, TCP_FIN_WAIT2, TCP_CLOSE. However since commit f75e674 (SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnect) we don't see *any* events on client-close. I think this is because xs_restore_old_callbacks is called to disconnect xs_tcp_state_change before the socket is closed. In any case, rule 'a' no longer applies. So all that is left are rule d, which successfully doubles the timeout which is never rest, and rule e which initialises the timeout. Even if the rules worked as expected, there would be a problem because a successful connection does not reset the timeout, so a sequence of events where the server closes the connection (e.g. during failover testing) will cause longer and longer timeouts with no good reason. This patch: - sets reestablish_timeout to 0 in xs_close thus effecting rule 'a' - sets it to 0 in xs_tcp_data_ready to ensure that a successful connection resets the timeout - sets it to at least XS_TCP_INIT_REEST_TO after it is doubled, thus effecting rule c I have not reimplemented rule b and the new version of rule c seems sufficient. I suspect other code in xs_tcp_data_ready needs to be revised as well. For example I don't think connect_cookie is being incremented as often as it should be. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-09-13nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannelAlexandros Batsakis1-2/+94
[sunrpc: change idle timeout value for the backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-11nfsd41: sunrpc: Added rpc server-side backchannel handlingRahul Iyer1-0/+146
When the call direction is a reply, copy the xid and call direction into the req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns rpc_garbage. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [get rid of CONFIG_NFSD_V4_1] [sunrpc: refactoring of svc_tcp_recvfrom] [nfsd41: sunrpc: create common send routine for the fore and the back channels] [nfsd41: sunrpc: Use free_page() to free server backchannel pages] [nfsd41: sunrpc: Document server backchannel locking] [nfsd41: sunrpc: remove bc_connect_worker()] [nfsd41: sunrpc: Define xprt_server_backchannel()[ [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions] [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()] [nfsd41: sunrpc: Don't auto close the server backchannel connection] [nfsd41: sunrpc: Remove unused functions] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: change bc_sock to bc_xprt] [nfsd41: sunrpc: move struct rpc_buffer def into a common header file] [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex] [removed cosmetic changes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [sunrpc: add new xprt class for nfsv4.1 backchannel] [sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [reverted more cosmetic leftovers] [got rid of xprt_server_backchannel] [separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Cc: Trond Myklebust <trond.myklebust@netapp.com> [sunrpc: change idle timeout value for the backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Trond Myklebust <trond.myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-10Merge branch 'patches_cel-for-2.6.32' into nfs-for-2.6.32Trond Myklebust1-138/+97
2009-08-09SUNRPC: Update xprt address strings after an rpcbind completesChuck Lever1-41/+41
After a bind completes, update the transport instance's address strings so debugging messages display the current port the transport is connected to. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Kill RPC_DISPLAY_ALLChuck Lever1-16/+34
At some point, I recall that rpc_pipe_fs used RPC_DISPLAY_ALL. Currently there are no uses of RPC_DISPLAY_ALL outside the transport modules themselves, so we can safely get rid of it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Rename sock_xprt.addr as sock_xprt.srcaddrChuck Lever1-12/+12
Clean up: Give the "addr" and "port" field less ambiguous names. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Use rpc_ntop() for constructing transport address stringsChuck Lever1-75/+34
Clean up: In addition to using the new generic rpc_ntop() and rpc_get_port() functions, have the RPC client compute the presentation address buffer sizes dynamically using kstrdup(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: Remove duplicate universal address generationChuck Lever1-18/+0
RPC universal address generation is currently done in several places: rpcb_clnt.c, nfs4proc.c xprtsock.c, and xprtrdma.c. Remove the redundant cases that convert a socket address to a universal address. The nfs4proc.c case takes a pre-formatted presentation address string, not a socket address, so we'll leave that one. Because the new uaddr constructor uses the recently introduced rpc_ntop(), it now supports proper "::" shorthanding for IPv6 addresses. This allows the kernel to register properly formed universal addresses with the local rpcbind service, in _all_ cases. The kernel can now also send properly formed universal addresses in RPCB_GETADDR requests, and support link-local properly when encoding and decoding IPv6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09SUNRPC: convert some sysctls into module parametersTrond Myklebust1-0/+52
Parameters like the minimum reserved port, and the number of slot entries should really be module parameters rather than sysctls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-18Merge branch 'devel-for-2.6.31' into for-2.6.31Trond Myklebust1-0/+1
Conflicts: fs/nfs/client.c fs/nfs/super.c
2009-06-17Merge commit 'linux-pnfs/nfs41-for-2.6.31' into nfsv41-for-2.6.31Trond Myklebust1-20/+196
2009-06-17nfs41: Backchannel callback service helper routinesRicardo Labiaga1-0/+3
Executes the backchannel task on the RPC state machine using the existing open connection previously established by the client. Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> nfs41: Add bc_svc.o to sunrpc Makefile. [nfs41: bc_send() does not need to be exported outside RPC module] [nfs41: xprt_free_bc_request() need not be exported outside RPC module] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Update copyright] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17SUNRPC: Fix a missing "break" option in xs_tcp_setup_socket()Trond Myklebust1-0/+1
In the case of -EADDRNOTAVAIL and/or unhandled connection errors, we want to get rid of the existing socket and retry immediately, just as the comment says. Currently we end up sleeping for a minute, due to the missing "break" statement. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17nfs41: New xs_tcp_read_data()Ricardo Labiaga1-18/+126
Handles RPC replies and backchannel callbacks. Traditionally the NFS client has expected only RPC replies on its open connections. With NFSv4.1, callbacks can arrive over an existing open connection. This patch refactors the old xs_tcp_read_request() into an RPC reply handler: xs_tcp_read_reply(), a new backchannel callback handler: xs_tcp_read_callback(), and a common routine to read the data off the transport: xs_tcp_read_common(). The new xs_tcp_read_callback() queues callback requests onto a queue where the callback service (a separate thread) is listening for the processing. This patch incorporates work and suggestions from Rahul Iyer (iyer@netapp.com) and Benny Halevy (bhalevy@panasas.com). xs_tcp_read_callback() drops the connection when the number of expected callbacks is exceeded. Use xprt_force_disconnect(), ensuring tasks on the pending queue are awaken on disconnect. [nfs41: Keep track of RPC call/reply direction with a flag] [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: xs_tcp_read_callback() should use xprt_force_disconnect()] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Moves embedded #ifdefs into #ifdef function blocks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17nfs41: Process the RPC call directionRicardo Labiaga1-6/+25
Reading and storing the RPC direction is a three step process. 1. xs_tcp_read_calldir() reads the RPC direction, but it will not store it in the XDR buffer since the 'struct rpc_rqst' is not yet available. 2. The 'struct rpc_rqst' is obtained during the TCP_RCV_COPY_DATA state. This state need not necessarily be preceeded by the TCP_RCV_READ_CALLDIR. For example, we may be reading a continuation packet to a large reply. Therefore, we can't simply obtain the 'struct rpc_rqst' during the TCP_RCV_READ_CALLDIR state and assume it's available during TCP_RCV_COPY_DATA. This patch adds a new TCP_RCV_READ_CALLDIR flag to indicate the need to read the RPC direction. It then uses TCP_RCV_COPY_CALLDIR to indicate the RPC direction needs to be saved after the 'struct rpc_rqst' has been allocated. 3. The 'struct rpc_rqst' is obtained by the xs_tcp_read_data() helper functions. xs_tcp_read_common() then saves the RPC direction in the XDR buffer if TCP_RCV_COPY_CALLDIR is set. This will happen when we're reading the data immediately after the direction was read. xs_tcp_read_common() then clears this flag. [was nfs41: Skip past the RPC call direction] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: Add RPC direction back into the XDR buffer] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: Don't skip past the RPC call direction] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17nfs41: Add ability to read RPC call direction on TCP stream.Ricardo Labiaga1-3/+49
NFSv4.1 callbacks can arrive over an existing connection. This patch adds the logic to read the RPC call direction (call or reply). It does this by updating the state machine to look for the call direction invoking xs_tcp_read_calldir(...) after reading the XID. [nfs41: Keep track of RPC call/reply direction with a flag] As per 11/14/08 review of RFC 53/85. Add a new flag to track whether the incoming message is an RPC call or an RPC reply. TCP_RPC_REPLY is set in the 'struct sock_xprt' tcp_flags in xs_tcp_read_calldir() if the message is an RPC reply sent on the forechannel. It is cleared if the message is an RPC request sent on the back channel. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-03net: skb->dst accessorsEric Dumazet1-1/+1
Define three accessors to get/set dst attached to a skb struct dst_entry *skb_dst(const struct sk_buff *skb) void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-02SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnectTrond Myklebust1-5/+21
See http://bugzilla.kernel.org/show_bug.cgi?id=13034 If the port gets into a TIME_WAIT state, then we cannot reconnect without binding to a new port. Tested-by: Petr Vandrovec <petr@vandrovec.name> Tested-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01Merge branch 'devel' into for-linusTrond Myklebust1-135/+228
2009-03-26Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller1-5/+18
Conflicts: drivers/net/wimax/i2400m/usb-notif.c
2009-03-19SUNRPC: Ensure we close the socket on EPIPE errors too...Trond Myklebust1-1/+1
As long as one task is holding the socket lock, then calls to xprt_force_disconnect(xprt) will not succeed in shutting down the socket. In particular, this would mean that a server initiated shutdown will not succeed until the lock is relinquished. In order to avoid the deadlock, we should ensure that xs_tcp_send_request() closes the socket on EPIPE errors too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: xs_tcp_connect_worker{4,6}: merge common codeTrond Myklebust1-68/+72
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: Add a sysctl to control the duration of the socket linger timeoutTrond Myklebust1-2/+11
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC socketsTrond Myklebust1-17/+81
This fixes a regression against FreeBSD servers as reported by Tomas Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers don't ever react to the client closing the socket, and so commit e06799f958bf7f9f8fae15f0c6f519953fb0257c (SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket) causes the setup to hang forever whenever the client attempts to close and then reconnect. We break the deadlock by adding a 'linger2' style timeout to the socket, after which, the client will abort the connection using a TCP 'RST'. The default timeout is set to 15 seconds. A subsequent patch will put it under user control by means of a systctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>