summaryrefslogtreecommitdiff
path: root/net/ipv4/ip_tunnel.c
AgeCommit message (Collapse)AuthorFilesLines
2015-01-19tunnels: advertise link netns via netlinkNicolas Dichtel1-0/+8
Implement rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-16ip_tunnel: Add missing validation of encap type to ip_tunnel_encap_setup()Thomas Graf1-0/+3
The encap->type comes straight from Netlink. Validate it against max supported encap types just like ip_encap_hlen() already does. Fixes: a8c5f9 ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-16ip_tunnel: Add sanity checks to ip_tunnel_encap_add_ops()Thomas Graf1-0/+6
The symbols are exported and could be used by external modules. Fixes: a8c5f9 ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12ip_tunnel: Ops registration for secondary encap (fou, gue)Tom Herbert1-25/+53
Instead of calling fou and gue functions directly from ip_tunnel use ops for these that were previously registered. This patch adds the logic to add and remove encapsulation operations for ip_tunnel, and modified fou (and gue) to register with ip_tunnels. This patch also addresses a circular dependency between ip_tunnel and fou that was causing link errors when CONFIG_NET_IP_TUNNEL=y and CONFIG_NET_FOU=m. References to fou an gue have been removed from ip_tunnel.c Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05net: Move fou_build_header into fou.c and refactorTom Herbert1-49/+12
Move fou_build_header out of ip_tunnel.c and into fou.c splitting it up into fou_build_header, gue_build_header, and fou_build_udp. This allows for other users for TX of FOU or GUE. Change ip_tunnel_encap to call fou_build_header or gue_build_header based on the tunnel encapsulation type. Similarly, added fou_encap_hlen and gue_encap_hlen functions which are called by ip_encap_hlen. New net/fou.h has prototypes and defines for this. Added NET_FOU_IP_TUNNELS configuration. When this is set, IP tunnels can use FOU/GUE and fou module is also selected. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03ip_tunnel: Add GUE supportTom Herbert1-0/+13
This patch allows configuring IPIP, sit, and GRE tunnels to use GUE. This is very similar to fou excpet that we need to insert the GUE header in addition to the UDP header on transmit. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03ip_tunnel: Account for secondary encapsulation header in max_headroomTom Herbert1-1/+1
When adjusting max_header for the tunnel interface based on egress device we need to account for any extra bytes in secondary encapsulation (e.g. FOU). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+8
Conflicts: drivers/net/usb/r8152.c net/netfilter/nfnetlink.c Both r8152 and nfnetlink conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26ip_tunnel: Don't allow to add the same tunnel multiple times.Steffen Klassert1-3/+8
When we try to add an already existing tunnel, we don't return an error. Instead we continue and call ip_tunnel_update(). This means that we can change existing tunnels by adding the same tunnel multiple times. It is even possible to change the tunnel endpoints of the fallback device. We fix this by returning an error if we try to add an existing tunnel. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+3
Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22ipv4: do not use this_cpu_ptr() in preemptible contextEric Dumazet1-3/+3
this_cpu_ptr() in preemptible context is generally bad Sep 22 05:05:55 br kernel: [ 94.608310] BUG: using smp_processor_id() in preemptible [00000000] code: ip/2261 Sep 22 05:05:55 br kernel: [ 94.608316] caller is tunnel_dst_set.isra.28+0x20/0x60 [ip_tunnel] Sep 22 05:05:55 br kernel: [ 94.608319] CPU: 3 PID: 2261 Comm: ip Not tainted 3.17.0-rc5 #82 We can simply use raw_cpu_ptr(), as preemption is safe in these contexts. Should fix https://bugzilla.kernel.org/show_bug.cgi?id=84991 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Joe <joe9mail@gmail.com> Fixes: 9a4aa9af447f ("ipv4: Use percpu Cache route in IP tunnels") Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: Changes to ip_tunnel to support foo-over-udp encapsulationTom Herbert1-1/+90
This patch changes IP tunnel to support (secondary) encapsulation, Foo-over-UDP. Changes include: 1) Adding tun_hlen as the tunnel header length, encap_hlen as the encapsulation header length, and hlen becomes the grand total of these. 2) Added common netlink define to support FOU encapsulation. 3) Routines to perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-11/+18
Conflicts: drivers/net/Makefile net/ipv6/sysctl_net_ipv6.c Two ipv6_table_template[] additions overlap, so the index of the ipv6_table[x] assignments needed to be adjusted. In the drivers/net/Makefile case, we've gotten rid of the garbage whereby we had to list every single USB networking driver in the top-level Makefile, there is just one "USB_NETWORKING" that guards everything. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30ip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"Dmitry Popov1-11/+18
Ipv4 tunnels created with "local any remote $ip" didn't work properly since 7d442fab0 (ipv4: Cache dst in tunnels). 99% of packets sent via those tunnels had src addr = 0.0.0.0. That was because only dst_entry was cached, although fl4.saddr has to be cached too. Every time ip_tunnel_xmit used cached dst_entry (tunnel_rtable_get returned non-NULL), fl4.saddr was initialized with tnl_params->saddr (= 0 in our case), and wasn't changed until iptunnel_xmit(). This patch adds saddr to ip_tunnel->dst_cache, fixing this issue. Reported-by: Sergey Popov <pinkbyte@gentoo.org> Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+8
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15net: set name_assign_type in alloc_netdev()Tom Gundersen1-1/+1
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) | -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) | -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08ip_tunnel: fix ip_tunnel_lookupDmitry Popov1-4/+8
This patch fixes 3 similar bugs where incoming packets might be routed into wrong non-wildcard tunnels: 1) Consider the following setup: ip address add 1.1.1.1/24 dev eth0 ip address add 1.1.1.2/24 dev eth0 ip tunnel add ipip1 remote 2.2.2.2 local 1.1.1.1 mode ipip dev eth0 ip link set ipip1 up Incoming ipip packets from 2.2.2.2 were routed into ipip1 even if it has dst = 1.1.1.2. Moreover even if there was wildcard tunnel like ip tunnel add ipip0 remote 2.2.2.2 local any mode ipip dev eth0 but it was created before explicit one (with local 1.1.1.1), incoming ipip packets with src = 2.2.2.2 and dst = 1.1.1.2 were still routed into ipip1. Same issue existed with all tunnels that use ip_tunnel_lookup (gre, vti) 2) ip address add 1.1.1.1/24 dev eth0 ip tunnel add ipip1 remote 2.2.146.85 local 1.1.1.1 mode ipip dev eth0 ip link set ipip1 up Incoming ipip packets with dst = 1.1.1.1 were routed into ipip1, no matter what src address is. Any remote ip address which has ip_tunnel_hash = 0 raised this issue, 2.2.146.85 is just an example, there are more than 4 million of them. And again, wildcard tunnel like ip tunnel add ipip0 remote any local 1.1.1.1 mode ipip dev eth0 wouldn't be ever matched if it was created before explicit tunnel like above. Gre & vti tunnels had the same issue. 3) ip address add 1.1.1.1/24 dev eth0 ip tunnel add gre1 remote 2.2.146.84 local 1.1.1.1 key 1 mode gre dev eth0 ip link set gre1 up Any incoming gre packet with key = 1 were routed into gre1, no matter what src/dst addresses are. Any remote ip address which has ip_tunnel_hash = 0 raised the issue, 2.2.146.84 is just an example, there are more than 4 million of them. Wildcard tunnel like ip tunnel add gre2 remote any local any key 1 mode gre dev eth0 wouldn't be ever matched if it was created before explicit tunnel like above. All this stuff happened because while looking for a wildcard tunnel we didn't check that matched tunnel is a wildcard one. Fixed. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-25ipv4: fix dst race in sk_dst_get()Eric Dumazet1-9/+5
When IP route cache had been removed in linux-3.6, we broke assumption that dst entries were all freed after rcu grace period. DST_NOCACHE dst were supposed to be freed from dst_release(). But it appears we want to keep such dst around, either in UDP sockets or tunnels. In sk_dst_get() we need to make sure dst refcount is not 0 before incrementing it, or else we might end up freeing a dst twice. DST_NOCACHE set on a dst does not mean this dst can not be attached to a socket or a tunnel. Then, before actual freeing, we need to observe a rcu grace period to make sure all other cpus can catch the fact the dst is no longer usable. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dormando <dormando@rydia.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-6/+10
Conflicts: net/core/rtnetlink.c net/core/skbuff.c Both conflicts were very simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11ip_tunnel: fix i_key matching in ip_tunnel_findDmitry Popov1-2/+3
Some tunnels (though only vti as for now) can use i_key just for internal use: for example vti uses it for fwmark'ing incoming packets. So raw i_key value shouldn't be treated as a distinguisher for them. ip_tunnel_key_match exists for cases when we want to compare two ip_tunnel_parms' i_keys. Example bug: ip link add type vti ikey 1 local 1.0.0.1 remote 2.0.0.2 ip link add type vti ikey 2 local 1.0.0.1 remote 2.0.0.2 spawned two tunnels, although it doesn't make sense. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11ip_vti: Fix 'ip tunnel add' with 'key' parametersDmitry Popov1-4/+6
ip tunnel add remote 10.2.2.1 local 10.2.2.2 mode vti ikey 1 okey 2 translates to p->iflags = VTI_ISVTI|GRE_KEY and p->i_key = 1, but GRE_KEY != TUNNEL_KEY, so ip_tunnel_ioctl would set i_key to 0 (same story with o_key) making us unable to create vti tunnels with [io]key via ip tunnel. We cannot simply translate GRE_KEY to TUNNEL_KEY (as GRE module does) because vti_tunnels with same local/remote addresses but different ikeys will be treated as different then. So, imo the best option here is to move p->i_flags & *_KEY check for vti tunnels from ip_tunnel.c to ip_vti.c and to think about [io]_mark field for ip_tunnel_parm in the future. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-05ip_tunnel: fix possible rtable leakDmitry Popov1-0/+1
ip_rt_put(rt) is always called in "error" branches above, but was missed in skb_cow_head branch. As rt is not yet bound to skb here we have to release it by hand. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+3
Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21ip_tunnel: Initialize the fallback device properlySteffen Klassert1-0/+1
We need to initialize the fallback device to have a correct mtu set on this device. Otherwise the mtu is set to null and the device is unusable. Fixes: fd58156e456d ("IPIP: Use ip-tunneling code.") Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16ipv4: ip_tunnels: disable cache for nbma gre tunnelsTimo Teräs1-1/+2
The connected check fails to check for ip_gre nbma mode tunnels properly. ip_gre creates temporary tnl_params with daddr specified to pass-in the actual target on per-packet basis from neighbor layer. Detect these tunnels by inspecting the actual tunnel configuration. Minimal test case: ip route add 192.168.1.1/32 via 10.0.0.1 ip route add 192.168.1.2/32 via 10.0.0.2 ip tunnel add nbma0 mode gre key 1 tos c0 ip addr add 172.17.0.0/16 dev nbma0 ip link set nbma0 up ip neigh add 172.17.0.1 lladdr 192.168.1.1 dev nbma0 ip neigh add 172.17.0.2 lladdr 192.168.1.2 dev nbma0 ping 172.17.0.1 ping 172.17.0.2 The second ping should be going to 192.168.1.2 and head 10.0.0.2; but cached gre tunnel level route is used and it's actually going to 192.168.1.1 via 10.0.0.1. The lladdr's need to go to separate dst for the bug to trigger. Test case uses separate route entries, but this can also happen when the route entry is same: if there is a nexthop exception or the GRE tunnel is IPsec'ed in which case the dst points to xfrm bundle unique to the gre lladdr. Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels") Signed-off-by: Timo Teräs <timo.teras@iki.fi> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16ip_tunnel: don't add tunnel twiceDuan Jiong1-4/+2
When using command "ip tunnel add" to add a tunnel, the tunnel will be added twice, through ip_tunnel_create() and ip_tunnel_update(). Because the second is unnecessary, so we can just break after adding tunnel through ip_tunnel_create(). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15ip_tunnel: delete unneeded call to netdev_privJulia Lawall1-2/+1
Netdev_priv is an accessor function, and has no purpose if its result is not used. A simplified version of the semantic match that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ local idexpression x; @@ -x = netdev_priv(...); ... when != x // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05ip_tunnel: Set network header properly for IP_ECN_decapsulate()Ying Cai1-0/+2
In ip_tunnel_rcv(), set skb->network_header to inner IP header before IP_ECN_decapsulate(). Without the fix, IP_ECN_decapsulate() takes outer IP header as inner IP header, possibly causing error messages or packet drops. Note that this skb_reset_network_header() call was in this spot when the original feature for checking consistency of ECN bits through tunnels was added in eccc1bb8d4b4 ("tunnel: drop packet if ECN present with not-ECT"). It was only removed from this spot in 3d7b46cd20e3 ("ip_tunnel: push generic protocol handling to ip_tunnel module."). Fixes: 3d7b46cd20e3 ("ip_tunnel: push generic protocol handling to ip_tunnel module.") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Ying Cai <ycai@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16ip_tunnel: use the right netns in ioctl handlerNicolas Dichtel1-8/+7
Because the netdevice may be in another netns than the i/o netns, we should use the i/o netns instead of dev_net(dev). The variable 'tunnel' was used only to get 'itn', hence to simplify code I remove it and use 't' instead. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15ipv4: add a sock pointer to dst->output() path.Eric Dumazet1-1/+1
In the dst->output() path for ipv4, the code assumes the skb it has to transmit is attached to an inet socket, specifically via ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the provider of the packet is an AF_PACKET socket. The dst->output() method gets an additional 'struct sock *sk' parameter. This needs a cascade of changes so that this parameter can be propagated from vxlan to final consumer. Fixes: 8f646c922d55 ("vxlan: keep original skb ownership") Reported-by: lucien xin <lucien.xin@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+0
Conflicts: drivers/net/ethernet/marvell/mvneta.c The mvneta.c conflict is a case of overlapping changes, a conversion to devm_ioremap_resource() vs. a conversion to netdev_alloc_pcpu_stats. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-26ip_tunnel: Fix dst ref-count.Pravin B Shelar1-3/+0
Commit 10ddceb22ba (ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer) removed dst-drop call from ip-tunnel-recv. Following commit reintroduce dst-drop and fix the original bug by checking loopback packet before releasing dst. Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681 CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-49/+4
Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25ip_tunnel: Make vti work with i_key setSteffen Klassert1-1/+5
Vti uses the o_key to mark packets that were transmitted or received by a vti interface. Unfortunately we can't apply different marks to in and outbound packets with only one key availabe. Vti interfaces typically use wildcard selectors for vti IPsec policies. On forwarding, the same output policy will match for both directions. This generates a loop between the IPsec gateways until the ttl of the packet is exceeded. The gre i_key/o_key are usually there to find the right gre tunnel during a lookup. When vti uses the i_key to mark packets, the tunnel lookup does not work any more because vti does not use the gre keys as a hash key for the lookup. This patch workarounds this my not including the i_key when comupting the hash for the tunnel lookup in case of vti tunnels. With this we have separate keys available for the transmitting and receiving side of the vti interface. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-20sit: fix panic with route cache in ip tunnelsNicolas Dichtel1-3/+4
Bug introduced by commit 7d442fab0a67 ("ipv4: Cache dst in tunnels"). Because sit code does not call ip_tunnel_init(), the dst_cache was not initialized. CC: Tom Herbert <therbert@google.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-20ip_tunnel: Move ip_tunnel_get_stats64 into ip_tunnel_core.cDavid S. Miller1-46/+0
net/built-in.o:(.rodata+0x1707c): undefined reference to `ip_tunnel_get_stats64' Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17ip_tunnel: return more precise errno value when adding tunnel failsFlorian Westphal1-5/+10
Currently this always returns ENOBUFS, because the return value of __ip_tunnel_create is discarded. A more common failure is a duplicate name (EEXIST). Propagate the real error code so userspace can display a more meaningful error message. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-14net: introduce netdev_alloc_pcpu_stats() for driversWANG Cong1-8/+2
There are many drivers calling alloc_percpu() to allocate pcpu stats and then initializing ->syncp. So just introduce a helper function for them. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-03ip_tunnel: fix panic in ip_tunnel_xmit()Eric Dumazet1-18/+11
Setting rt variable to NULL at the beginning of ip_tunnel_xmit() missed possible use of this variable as a scratch value. Also fixes a possible dst leak in tunnel_dst_check() : If we had to call tunnel_dst_reset(), we forgot to release the reference on dst. Merges tunnel_dst_get()/tunnel_dst_check() into a single tunnel_rtable_get() function for clarity. Many thanks to Tommi for his report and tests. Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels") Reported-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Tommi Rantala <tt.rantala@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27net: ipv4: Use PTR_ERR_OR_ZEROSachin Kamat1-1/+2
PTR_RET is deprecated. Use PTR_ERR_OR_ZERO instead. While at it also include missing err.h header. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-23ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is calledDuan Jiong1-0/+1
commit a622260254ee48("ip_tunnel: fix kernel panic with icmp_dest_unreach") clear IPCB in ip_tunnel_xmit() , or else skb->cb[] may contain garbage from GSO segmentation layer. But commit 0e6fbc5b6c621("ip_tunnels: extend iptunnel_xmit()") refactor codes, and it clear IPCB behind the dst_link_failure(). So clear IPCB in ip_tunnel_xmit() just like commti a622260254ee48("ip_tunnel: fix kernel panic with icmp_dest_unreach"). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-21ipv4: remove the useless argument from ip_tunnel_hash()Duan Jiong1-5/+4
Since commit c544193214("GRE: Refactor GRE tunneling code") introduced function ip_tunnel_hash(), the argument itn is no longer in use, so remove it. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18ipv4: be friend with drop monitorEric Dumazet1-2/+2
Replace some dev_kfree_skb() with kfree_skb() calls when we drop one skb, this might help bug tracking. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-17ipv4: fix a dst leak in tunnelsEric Dumazet1-20/+14
This patch : 1) Remove a dst leak if DST_NOCACHE was set on dst Fix this by holding a reference only if dst really cached. 2) Remove a lockdep warning in __tunnel_dst_set() This was reported by Cong Wang. 3) Remove usage of a spinlock where xchg() is enough 4) Remove some spurious inline keywords. Let compiler decide for us. Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <cwang@twopensource.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-09ip_tunnel: fix sparse non static symbol warningWei Yongjun1-1/+1
Fixes the following sparse warning: net/ipv4/ip_tunnel.c:116:18: warning: symbol 'tunnel_dst_check' was not declared. Should it be static? Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04net: unify the pcpu_tstats and br_cpu_netstats as oneLi RongQing1-4/+5
They are same, so unify them as one, pcpu_sw_netstats. Define pcpu_sw_netstat in netdevice.h, remove pcpu_tstats from if_tunnel and remove br_cpu_netstats from br_private.h Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03ipv4: Use percpu Cache route in IP tunnelsTom Herbert1-12/+32
percpu route cache eliminates share of dst refcnt between CPUs. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-03ipv4: Cache dst in tunnelsTom Herbert1-24/+89
Avoid doing a route lookup on every packet being tunneled. In ip_tunnel.c cache the route returned from ip_route_output if the tunnel is "connected" so that all the rouitng parameters are taken from tunnel parms for a packet. Specifically, not NBMA tunnel and tos is from tunnel parms (not inner packet). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-2/+2
Pull networking fixes from David Miller: "Mostly these are fixes for fallout due to merge window changes, as well as cures for problems that have been with us for a much longer period of time" 1) Johannes Berg noticed two major deficiencies in our genetlink registration. Some genetlink protocols we passing in constant counts for their ops array rather than something like ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were using fixed IDs for their multicast groups. We have to retain these fixed IDs to keep existing userland tools working, but reserve them so that other multicast groups used by other protocols can not possibly conflict. In dealing with these two problems, we actually now use less state management for genetlink operations and multicast groups. 2) When configuring interface hardware timestamping, fix several drivers that simply do not validate that the hwtstamp_config value is one the driver actually supports. From Ben Hutchings. 3) Invalid memory references in mwifiex driver, from Amitkumar Karwar. 4) In dev_forward_skb(), set the skb->protocol in the right order relative to skb_scrub_packet(). From Alexei Starovoitov. 5) Bridge erroneously fails to use the proper wrapper functions to make calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki Makita. 6) When detaching a bridge port, make sure to flush all VLAN IDs to prevent them from leaking, also from Toshiaki Makita. 7) Put in a compromise for TCP Small Queues so that deep queued devices that delay TX reclaim non-trivially don't have such a performance decrease. One particularly problematic area is 802.11 AMPDU in wireless. From Eric Dumazet. 8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts here. Fix from Eric Dumzaet, reported by Dave Jones. 9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn. 10) When computing mergeable buffer sizes, virtio-net fails to take the virtio-net header into account. From Michael Dalton. 11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic bumping, this one has been with us for a while. From Eric Dumazet. 12) Fix NULL deref in the new TIPC fragmentation handling, from Erik Hugne. 13) 6lowpan bit used for traffic classification was wrong, from Jukka Rissanen. 14) macvlan has the same issue as normal vlans did wrt. propagating LRO disabling down to the real device, fix it the same way. From Michal Kubecek. 15) CPSW driver needs to soft reset all slaves during suspend, from Daniel Mack. 16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet. 17) The xen-netfront RX buffer refill timer isn't properly scheduled on partial RX allocation success, from Ma JieYue. 18) When ipv6 ping protocol support was added, the AF_INET6 protocol initialization cleanup path on failure was borked a little. Fix from Vlad Yasevich. 19) If a socket disconnects during a read/recvmsg/recvfrom/etc that blocks we can do the wrong thing with the msg_name we write back to userspace. From Hannes Frederic Sowa. There is another fix in the works from Hannes which will prevent future problems of this nature. 20) Fix route leak in VTI tunnel transmit, from Fan Du. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits) genetlink: make multicast groups const, prevent abuse genetlink: pass family to functions using groups genetlink: add and use genl_set_err() genetlink: remove family pointer from genl_multicast_group genetlink: remove genl_unregister_mc_group() hsr: don't call genl_unregister_mc_group() quota/genetlink: use proper genetlink multicast APIs drop_monitor/genetlink: use proper genetlink multicast APIs genetlink: only pass array to genl_register_family_with_ops() tcp: don't update snd_nxt, when a socket is switched from repair mode atm: idt77252: fix dev refcnt leak xfrm: Release dst if this dst is improper for vti tunnel netlink: fix documentation typo in netlink_set_err() be2net: Delete secondary unicast MAC addresses during be_close be2net: Fix unconditional enabling of Rx interface options net, virtio_net: replace the magic value ping: prevent NULL pointer dereference on write to msg_name bnx2x: Prevent "timeout waiting for state X" bnx2x: prevent CFC attention bnx2x: Prevent panic during DMAE timeout ...
2013-11-14core/dev: do not ignore dmac in dev_forward_skb()Alexei Starovoitov1-2/+2
commit 06a23fe31ca3 ("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()") and refactoring 64261f230a91 ("dev: move skb_scrub_packet() after eth_type_trans()") are forcing pkt_type to be PACKET_HOST when skb traverses veth. which means that ip forwarding will kick in inside netns even if skb->eth->h_dest != dev->dev_addr Fix order of eth_type_trans() and skb_scrub_packet() in dev_forward_skb() and in ip_tunnel_rcv() Fixes: 06a23fe31ca3 ("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()") CC: Isaku Yamahata <yamahatanetdev@gmail.com> CC: Maciej Zenczykowski <zenczykowski@gmail.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>