summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2016-03-09kcm: Add receive message timeoutTom Herbert2-2/+36
This patch adds receive timeout for message assembly on the attached TCP sockets. The timeout is set when a new messages is started and the whole message has not been received by TCP (not in the receive queue). If the completely message is subsequently received the timer is cancelled, if the timer expires the RX side is aborted. The timeout value is taken from the socket timeout (SO_RCVTIMEO) that is set on a TCP socket (i.e. set by get sockopt before attaching a TCP socket to KCM. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09kcm: Add memory limit for receive message constructionTom Herbert2-2/+48
Message assembly is performed on the TCP socket. This is logically equivalent of an application that performs a peek on the socket to find out how much memory is needed for a receive buffer. The receive socket buffer also provides the maximum message size which is checked. The receive algorithm is something like: 1) Receive the first skbuf for a message (or skbufs if multiple are needed to determine message length). 2) Check the message length against the number of bytes in the TCP receive queue (tcp_inq()). - If all the bytes of the message are in the queue (incluing the skbuf received), then proceed with message assembly (it should complete with the tcp_read_sock) - Else, mark the psock with the number of bytes needed to complete the message. 3) In TCP data ready function, if the psock indicates that we are waiting for the rest of the bytes of a messages, check the number of queued bytes against that. - If there are still not enough bytes for the message, just return - Else, clear the waiting bytes and proceed to receive the skbufs. The message should now be received in one tcp_read_sock Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09kcm: Sendpage supportTom Herbert1-2/+145
Implement kcm_sendpage. Set in sendpage to kcm_sendpage in both dgram and seqpacket ops. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09kcm: Splice supportTom Herbert1-2/+96
Implement kcm_splice_read. This is supported only for seqpacket. Add kcm_seqpacket_ops and set splice read to kcm_splice_read. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09kcm: Add statistics and proc interfacesTom Herbert3-1/+503
This patch adds various counters for KCM. These include counters for messages and bytes received or sent, as well as counters for number of attached/unattached TCP sockets and other error or edge events. The statistics are exposed via a proc interface. /proc/net/kcm provides statistics per KCM socket and per psock (attached TCP sockets). /proc/net/kcm_stats provides aggregate statistics. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09kcm: Kernel Connection Multiplexor moduleTom Herbert5-0/+2031
This module implements the Kernel Connection Multiplexor. Kernel Connection Multiplexor (KCM) is a facility that provides a message based interface over TCP for generic application protocols. With KCM an application can efficiently send and receive application protocol messages over TCP using datagram sockets. For more information see the included Documentation/networking/kcm.txt Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09tcp: Add tcp_inq to get available receive bytes on socketTom Herbert1-14/+1
Create a common kernel function to get the number of bytes available on a TCP socket. This is based on code in INQ getsockopt and we now call the function for that getsockopt. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09net: Walk fragments in __skb_splice_bitsTom Herbert1-23/+16
Add walking of fragments in __skb_splice_bits. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09net: Add MSG_BATCH flagTom Herbert1-0/+5
Add a new msg flag called MSG_BATCH. This flag is used in sendmsg to indicate that more messages will follow (i.e. a batch of messages is being sent). This is similar to MSG_MORE except that the following messages are not merged into one packet, they are sent individually. sendmmsg is updated so that each contained message except for the last one is marked as MSG_BATCH. MSG_BATCH is a performance optimization in cases where a socket implementation can benefit by transmitting packets in a batch. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09net: Allow MSG_EOR in each msghdr of sendmmsgTom Herbert1-4/+6
This patch allows setting MSG_EOR in each individual msghdr passed in sendmmsg. This allows a sendmmsg to send multiple messages when using SOCK_SEQPACKET. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-09net: Make sock_alloc exportableTom Herbert1-1/+2
Export it for cases where we want to create sockets by hand. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08ipv6: per netns FIB garbage collectionMichal Kubeček1-5/+4
One of our customers observed issues with FIB6 garbage collectors running in different network namespaces blocking each other, resulting in soft lockups (fib6_run_gc() initiated from timer runs always in forced mode). Now that FIB6 walkers are separated per namespace, there is no more need for instances of fib6_run_gc() in different namespaces blocking each other. There is still a call to icmp6_dst_gc() which operates on shared data but this function is protected by its own shared lock. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08ipv6: per netns fib6 walkersMichal Kubeček1-32/+36
The IPv6 FIB data structures are separated per network namespace but there is still only one global walkers list and one global walker list lock. This means changes in one namespace unnecessarily interfere with walkers in other namespaces. Replace the global list with per-netns lists (and give each its own lock). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08ipv6: replace global gc_args with local variableMichal Kubeček1-6/+8
Global variable gc_args is only used in fib6_run_gc() and functions called from it. As fib6_run_gc() makes sure there is at most one instance of fib6_clean_all() running at any moment, we can replace gc_args with a local variable which will be needed once multiple instances (per netns) of garbage collector are allowed. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08net_sched: dsmark: use qdisc_dequeue_peeked()Kyeong Yoo1-1/+1
This fix is for dsmark similar to commit 3557619f0f6f7496ed453d4825e249 ("net_sched: prio: use qdisc_dequeue_peeked") and makes use of qdisc_dequeue_peeked() instead of direct dequeue() call. First time, wrr peeks dsmark, which will then peek into sfq. sfq dequeues an skb and it's stored in sch->gso_skb. Next time, wrr tries to dequeue from dsmark, which will call sfq dequeue directly. This results skipping the previously peeked skb. So changed dsmark dequeue to call qdisc_dequeue_peeked() instead to use peeked skb if exists. Signed-off-by: Kyeong Yoo <kyeong.yoo@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller24-320/+586
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Remove useless debug message when deleting IPVS service, from Yannick Brosseau. 2) Get rid of compilation warning when CONFIG_PROC_FS is unset in several spots of the IPVS code, from Arnd Bergmann. 3) Add prandom_u32 support to nft_meta, from Florian Westphal. 4) Remove unused variable in xt_osf, from Sudip Mukherjee. 5) Don't calculate IP checksum twice from netfilter ipv4 defrag hook since fixing af_packet defragmentation issues, from Joe Stringer. 6) On-demand hook registration for iptables from netns. Instead of registering the hooks for every available netns whenever we need one of the support tables, we register this on the specific netns that needs it, patchset from Florian Westphal. 7) Add missing port range selection to nf_tables masquerading support. BTW, just for the record, there is a typo in the description of 5f6c253ebe93b0 ("netfilter: bridge: register hooks only when bridge interface is added") that refers to the cluster match as deprecated, but it is actually the CLUSTERIP target (which registers hooks inconditionally) the one that is scheduled for removal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf, vxlan, geneve, gre: fix usage of dst_cache on xmitDaniel Borkmann2-5/+7
The assumptions from commit 0c1d70af924b ("net: use dst_cache for vxlan device"), 468dfffcd762 ("geneve: add dst caching support") and 3c1cb4d2604c ("net/ipv4: add dst cache support for gre lwtunnels") on dst_cache usage when ip_tunnel_info is used is unfortunately not always valid as assumed. While it seems correct for ip_tunnel_info front-ends such as OVS, eBPF however can fill in ip_tunnel_info for consumers like vxlan, geneve or gre with different remote dsts, tos, etc, therefore they cannot be assumed as packet independent. Right now vxlan, geneve, gre would cache the dst for eBPF and every packet would reuse the same entry that was first created on the initial route lookup. eBPF doesn't store/cache the ip_tunnel_info, so each skb may have a different one. Fix it by adding a flag that checks the ip_tunnel_info. Also the !tos test in vxlan needs to be handeled differently in this context as it is currently inferred from ip_tunnel_info as well if present. ip_tunnel_dst_cache_usable() helper is added for the three tunnel cases, which checks if we can use dst cache. Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device") Fixes: 468dfffcd762 ("geneve: add dst caching support") Fixes: 3c1cb4d2604c ("net/ipv4: add dst cache support for gre lwtunnels") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf: support for access to tunnel optionsDaniel Borkmann1-6/+77
After eBPF being able to programmatically access/manage tunnel key meta data via commit d3aa45ce6b94 ("bpf: add helpers to access tunnel metadata") and more recently also for IPv6 through c6c33454072f ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key"), this work adds two complementary helpers to generically access their auxiliary tunnel options. Geneve and vxlan support this facility. For geneve, TLVs can be pushed, and for the vxlan case its GBP extension. I.e. setting tunnel key for geneve case only makes sense, if we can also read/write TLVs into it. In the GBP case, it provides the flexibility to easily map the group policy ID in combination with other helpers or maps. I chose to model this as two separate helpers, bpf_skb_{set,get}_tunnel_opt(), for a couple of reasons. bpf_skb_{set,get}_tunnel_key() is already rather complex by itself, and there may be cases for tunnel key backends where tunnel options are not always needed. If we would have integrated this into bpf_skb_{set,get}_tunnel_key() nevertheless, we are very limited with remaining helper arguments, so keeping compatibility on structs in case of passing in a flat buffer gets more cumbersome. Separating both also allows for more flexibility and future extensibility, f.e. options could be fed directly from a map, etc. Moreover, change geneve's xmit path to test only for info->options_len instead of TUNNEL_GENEVE_OPT flag. This makes it more consistent with vxlan's xmit path and allows for avoiding to specify a protocol flag in the API on xmit, so it can be protocol agnostic. Having info->options_len is enough information that is needed. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf: allow to propagate df in bpf_skb_set_tunnel_keyDaniel Borkmann1-1/+5
Added by 9a628224a61b ("ip_tunnel: Add dont fragment flag."), allow to feed df flag into tunneling facilities (currently supported on TX by vxlan, geneve and gre) as a hint from eBPF's bpf_skb_set_tunnel_key() helper. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf: make helper function protos staticDaniel Borkmann1-9/+9
They are only used here, so there's no reason they should not be static. Only the vlan push/pop protos are used in the test_bpf suite. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf: add flags to bpf_skb_store_bytes for clearing hashDaniel Borkmann1-1/+3
When overwriting parts of the packet with bpf_skb_store_bytes() that were fed previously into skb->hash calculation, we should clear the current hash with skb_clear_hash(), so that a next skb_get_hash() call can determine the correct hash related to this skb. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08bpf: allow bpf_csum_diff to feed bpf_l3_csum_replace as wellDaniel Borkmann1-0/+6
Commit 7d672345ed29 ("bpf: add generic bpf_csum_diff helper") added a generic checksum diff helper that can feed bpf_l4_csum_replace() with a target __wsum diff that is to be applied to the L4 checksum. This facility is very flexible, can be cascaded, allows for adding, removing, or diffing data, or for calculating the pseudo header checksum from scratch, but it can also be reused for working with the IPv4 header checksum. Thus, analogous to bpf_l4_csum_replace(), add a case for header field value of 0 to change the checksum at a given offset through a new helper csum_replace_by_diff(). Also, in addition to that, this provides an easy to use interface for feeding precalculated diffs f.e. coming from a map. It nicely complements bpf_l3_csum_replace() that currently allows only for csum updates of 2 and 4 byte diffs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller35-82/+205
Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds30-74/+188
Pull networking fixes from David Miller: 1) Fix ordering of WEXT netlink messages so we don't see a newlink after a dellink, from Johannes Berg. 2) Out of bounds access in minstrel_ht_set_best_prob_rage, from Konstantin Khlebnikov. 3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb. 4) Wrong units used to set initial TCP rto from cached metrics, also from Konstantin Khlebnikov. 5) Fix stale IP options data in the SKB control block from leaking through layers of encapsulation, from Bernie Harris. 6) Zero padding len miscalculated in bnxt_en, from Michael Chan. 7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix from Hannes Frederic Sowa. 8) Fix suspend/resume with JME networking devices, from Diego Violat and Guo-Fu Tseng. 9) Checksums not validated properly in bridge multicast support due to the placement of the SKB header pointers at the time of the check, fix from Álvaro Fernández Rojas. 10) Fix hang/tiemout with r8169 if a stats fetch is done while the device is runtime suspended. From Chun-Hao Lin. 11) The forwarding database netlink dump facilities don't track the state of the dump properly, resulting in skipped/missed entries. From Minoura Makoto. 12) Fix regression from a recent 3c59x bug fix, from Neil Horman. 13) Fix list corruption in bna driver, from Ivan Vecera. 14) Big endian machines crash on vlan add in bnx2x, fix from Michal Schmidt. 15) Ethtool RSS configuration not propagated properly in mlx5 driver, from Tariq Toukan. 16) Fix regression in PHY probing in stmmac driver, from Gabriel Fernandez. 17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin Poirier. 18) A past change to skip empty routing headers in ipv6 extention header parsing accidently caused fragment headers to not be matched any longer. Fix from Florian Westphal. 19) eTSEC-106 erratum needs to be applied to more gianfar chips, from Atsushi Nemoto. 20) Fix netdev reference after free via workqueues in usb networking drivers, from Oliver Neukum and Bjørn Mork. 21) mdio->irq is now an array rather than a pointer to dynamic memory, but several drivers were still trying to free it :-/ Fixes from Colin Ian King. 22) act_ipt iptables action forgets to set the family field, thus LOG netfilter targets don't work with it. Fix from Phil Sutter. 23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon. 24) pskb_may_pull() cannot be called with interrupts disabled, fix code that tries to do this in vmxnet3 driver, from Neil Horman. 25) be2net driver leaks iomap'd memory on removal, fix from Douglas Miller. 26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths, from Guillaume Nault. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits) ppp: release rtnl mutex when interface creation fails cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind tcp: fix tcpi_segs_in after connection establishment net: hns: fix the bug about loopback jme: Fix device PM wakeup API usage jme: Do not enable NIC WoL functions on S0 udp6: fix UDP/IPv6 encap resubmit path be2net: Don't leak iomapped memory on removal. vmxnet3: avoid calling pskb_may_pull with interrupts disabled net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM ibmveth: check return of skb_linearize in ibmveth_start_xmit cdc_ncm: toggle altsetting to force reset before setup usbnet: cleanup after bind() in probe() mlxsw: pci: Correctly determine if descriptor queue is full mlxsw: spectrum: Always decrement bridge's ref count tipc: fix nullptr crash during subscription cancel net: eth: altera: do not free array priv->mdio->irq net/ethoc: do not free array priv->mdio->irq net: sched: fix act_ipt for LOG target asix: do not free array priv->mdio->irq ...
2016-03-07tcp: fix tcpi_segs_in after connection establishmentEric Dumazet1-1/+2
If final packet (ACK) of 3WHS is lost, it appears we do not properly account the following incoming segment into tcpi_segs_in While we are at it, starts segs_in with one, to count the SYN packet. We do not yet count number of SYN we received for a request sock, we might add this someday. packetdrill script showing proper behavior after fix : // Tests tcpi_segs_in when 3rd packet (ACK) of 3WHS is lost 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK> +.020 < P. 1:1001(1000) ack 1 win 32792 +0 accept(3, ..., ...) = 4 +.000 %{ assert tcpi_segs_in == 2, 'tcpi_segs_in=%d' % tcpi_segs_in }% Fixes: 2efd055c53c06 ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07udp6: fix UDP/IPv6 encap resubmit pathBill Sommerfeld1-4/+2
IPv4 interprets a negative return value from a protocol handler as a request to redispatch to a new protocol. In contrast, IPv6 interprets a negative value as an error, and interprets a positive value as a request for redispatch. UDP for IPv6 was unaware of this difference. Change __udp6_lib_rcv() to return a positive value for redispatch. Note that the socket's encap_rcv hook still needs to return a negative value to request dispatch, and in the case of IPv6 packets, adjust IP6CB(skb)->nhoff to identify the byte containing the next protocol. Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07tipc: move netlink policies to netlink.cRichard Alpe9-74/+85
Make the c files less cluttered and enable netlink attributes to be shared between files. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07arp: correct return value of arp_rcvZhang Shengju1-15/+20
Currently, arp_rcv() always return zero on a packet delivery upcall. To make its behavior more compliant with the way this API should be used, this patch changes this to let it return NET_RX_SUCCESS when the packet is proper handled, and NET_RX_DROP otherwise. v1->v2: If sanity check is failed, call kfree_skb() instead of consume_skb(), then return the correct return value. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07netlabel: do not initialise statics to NULLWei Tang1-2/+2
This patch fixes the checkpatch.pl error to netlabel_domainhash.c: ERROR: do not initialise statics to NULL Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07netlink: do not initialise statics to 0 or NULLWei Tang1-3/+3
This patch fixes the checkpatch.pl error to netlabel_unlabeled.c: ERROR: do not initialise statics to 0 or NULL Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06tipc: remove pre-allocated message header in link structJon Paul Maloy5-77/+45
Until now, we have kept a pre-allocated protocol message header aggregated into struct tipc_link. Apart from adding unnecessary footprint to the link instances, this requires extra code both to initialize and re-initialize it. We now remove this sub-optimization. This change also makes it possible to clean up the function tipc_build_proto_msg() and remove a couple of small functions that were accessing the mentioned header. In particular, we can replace all occurrences of the local function call link_own_addr(link) with the generic tipc_own_addr(net). Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06tipc: fix nullptr crash during subscription cancelParthasarathy Bhuvaragan1-1/+2
commit 4d5cfcba2f6e ('tipc: fix connection abort during subscription cancel'), removes the check for a valid subscription before calling tipc_nametbl_subscribe(). This will lead to a nullptr exception when we process a subscription cancel request. For a cancel request, a null subscription is passed to tipc_nametbl_subscribe() resulting in exception. In this commit, we call tipc_nametbl_subscribe() only for a valid subscription. Fixes: 4d5cfcba2f6e ('tipc: fix connection abort during subscription cancel') Reported-by: Anders Widell <anders.widell@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06net: sched: fix act_ipt for LOG targetPhil Sutter1-0/+2
Before calling the destroy() or target() callbacks, the family parameter field has to be initialized. Otherwise at least the LOG target will refuse to work and upon removal oops the kernel. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06tipc: make sure required IPv6 addresses are scopedRichard Alpe1-0/+5
Make sure the user has provided a scope for multicast and link local addresses used locally by a UDP bearer. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06tipc: safely copy UDP netlink data from userRichard Alpe1-11/+13
The netlink policy for TIPC_NLA_UDP_LOCAL and TIPC_NLA_UDP_REMOTE is of type binary with a defined length. This causes the policy framework to threat the defined length as maximum length. There is however no protection against a user sending a smaller amount of data. Prior to this patch this wasn't handled which could result in a partially incomplete sockaddr_storage struct containing uninitialized data. In this patch we use nla_memcpy() when copying the user data. This ensures a potential gap at the end is cleared out properly. This was found by Julia with Coccinelle tool. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06tipc: don't check link reset on non existing linkRichard Alpe1-1/+1
Make sure we have a link before checking if it has been reset or not. Prior to this patch tipc_link_is_reset() could be called with a non existing link, resulting in a null pointer dereference. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06tipc: add net device to skb before UDP xmitRichard Alpe1-0/+2
Prior to this patch enabling a IPv4 UDP bearer caused a null pointer dereference in iptunnel_xmit_stats(), when it tried to dereference the net device from the skb. To resolve this we now point the skb device to the net device resolved from the routing table. Fixes: 039f50629b7f (ip_tunnel: Move stats update to iptunnel_xmit()) Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-06act_ife: fix a typo in kmemdup() parametersWANG Cong1-2/+2
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-04wireless: use reset to set mac headerZhang Shengju1-1/+1
Since offset is zero, it's not necessary to use set function. Reset function is straightforward, and will remove the unnecessary add operation in set function. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-04mac80211: use reset to set header pointerZhang Shengju4-9/+9
Since offset is zero, it's not necessary to use set function. Reset function is straightforward, and will remove the unnecessary add operation in set function. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-04rxrpc: Don't try to map ICMP to error as the lower layer already did thatDavid Howells1-10/+0
In the ICMP message processing code, don't try to map ICMP codes to UNIX error codes as the caller (IPv4/IPv6) already did that for us (ee_errno). Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: Clear the unused part of a sockaddr_rxrpc for memcmp() useDavid Howells1-3/+5
Clear the unused part of a sockaddr_rxrpc structs so that memcmp() can be used to compare them. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: rxkad: Casts are needed when comparing be32 valuesDavid Howells1-1/+1
Forced casts are needed to avoid sparse warning when directly comparing be32 values. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: rxkad: The version number in the response should be net byte orderDavid Howells1-8/+9
The version number rxkad places in the response should be network byte order. Whilst we're at it, rearrange the code to be more readable. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: Use ACCESS_ONCE() when accessing circular buffer pointersDavid Howells1-4/+7
Use ACCESS_ONCE() when accessing the other-end pointer into a circular buffer as it's possible the other-end pointer might change whilst we're doing this, and if we access it twice, we might get some weird things happening. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: Adjust some whitespace and commentsDavid Howells7-29/+22
Remove some excess whitespace, insert some missing spaces and adjust a couple of comments. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: Be more selective about the types of received packets we acceptDavid Howells1-1/+2
Currently, received RxRPC packets outside the range 1-13 are rejected. There are, however, holes in the range that should also be rejected - plus at least one type we don't yet support - so reject these also. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: Fix defined range for /proc/sys/net/rxrpc/rx_mtuDavid Howells1-1/+1
The upper bound of the defined range for rx_mtu is being set in the same member as the lower bound (extra1) rather than the correct place (extra2). I'm not entirely sure why this compiles. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: The protocol family should be set to PF_RXRPC not PF_UNIXDavid Howells1-1/+1
Fix the protocol family set in the proto_ops for rxrpc to be PF_RXRPC not PF_UNIX. Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04rxrpc: Keep the skb private record of the Rx header in host byte orderDavid Howells17-381/+431
Currently, a copy of the Rx packet header is copied into the the sk_buff private data so that we can advance the pointer into the buffer, potentially discarding the original. At the moment, this copy is held in network byte order, but this means we're doing a lot of unnecessary translations. The reasons it was done this way are that we need the values in network byte order occasionally and we can use the copy, slightly modified, as part of an iov array when sending an ack or an abort packet. However, it seems more reasonable on review that it would be better kept in host byte order and that we make up a new header when we want to send another packet. To this end, rename the original header struct to rxrpc_wire_header (with BE fields) and institute a variant called rxrpc_host_header that has host order fields. Change the struct in the sk_buff private data into an rxrpc_host_header and translate the values when filling it in. This further allows us to keep values kept in various structures in host byte order rather than network byte order and allows removal of some fields that are byteswapped duplicates. Signed-off-by: David Howells <dhowells@redhat.com>