Age | Commit message | Author | Files | Lines
2011-03-27 | irda: prevent heap corruption on invalid nickname | Dan Rosenberg | 1 | -0/+3
Invalid nicknames containing only spaces will result in an underflow in a memcpy size calculation, subsequently destroying the heap and panicking. v2 also catches the case where the provided nickname is longer than the buffer size, which can result in controllable heap corruption. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
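The failure mode described in this message can be sketched in user-space C. This is only an illustration under assumptions: `NICK_MAX` and `copy_nickname()` are hypothetical names and the trimming loop is not taken from the IrDA code. It shows how an all-space name trims down to length zero (so a later `len - 1` style size calculation would wrap around), and how rejecting both empty and oversized input avoids the bad copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NICK_MAX 21 /* hypothetical buffer size, including the NUL */

/*
 * Copy a nickname of `len` bytes into dst (NICK_MAX bytes) after
 * stripping trailing spaces. Returns 0 on success, -1 if the name is
 * empty after trimming or does not fit in the buffer.
 */
int copy_nickname(char dst[NICK_MAX], const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    /* Strip trailing spaces; an all-space name ends with len == 0. */
    while (len > 0 && src[len - 1] == ' ')
        len--;

    /* Without this check, a later "len - 1" would wrap around. */
    if (len == 0)
        return -1;

    /* Reject names that would overflow the destination buffer. */
    if (len > NICK_MAX - 1)
        return -1;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

A caller would pass the user-supplied bytes and their length, and treat a return of -1 as an invalid nickname rather than copying anything.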
2011-03-27 | dst: Clone child entry in skb_dst_pop | Steffen Klassert | 2 | -2/+2
We clone the child entry in skb_dst_pop before we call skb_dst_drop(). Otherwise we might kill the child right before we return it to the caller. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 | xfrm: Force a dst refcount before entering the xfrm type handlers | Steffen Klassert | 2 | -0/+4
Crypto requests might return asynchronously. In this case we leave the RCU-protected region, so force a refcount on the skb's destination entry before we enter the xfrm type input/output handlers. This fixes a crash when a route is deleted whilst sending IPsec data that is transformed by an asynchronous algorithm. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-25 | ipv4: do not ignore route errors | Julian Anastasov | 1 | -2/+2
The "ipv4: Inline fib_semantic_match into check_leaf" change forgets to return the route errors. check_leaf should return the same results as fib_table_lookup. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-25 | route: Take the right src and dst addresses in ip_route_newports | Steffen Klassert | 1 | -2/+2
When we set up the flow information in ip_route_newports(), we take the address information from the rt_key_src and rt_key_dst fields of the rtable. They appear to be empty, so take the address information from rt_src and rt_dst instead. This issue was introduced by commit 5e2b61f78411be25f0b84f97d5b5d312f184dfd1 ("ipv4: Remove flowi from struct rtable."). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 | ipv4: Fix nexthop caching wrt. scoping. | David S. Miller | 4 | -20/+16
Move the scope value out of the fib alias entries and into fib_info, so that we always use the correct scope when recomputing the nexthop cached source address. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 | ipv4: Invalidate nexthop cache nh_saddr more correctly. | David S. Miller | 5 | -29/+33
Any operation that: 1) Brings up an interface 2) Adds an IP address to an interface 3) Deletes an IP address from an interface can potentially invalidate the nh_saddr value, requiring it to be recomputed. Perform the recomputation lazily using a generation ID. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
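Lazy recomputation with a generation ID can be sketched as follows. This is a minimal user-space illustration, not the kernel code: `addr_genid`, `struct nexthop`, `select_saddr()` and the other names are assumptions made up for the example. Every invalidating event only bumps a counter, and the cached value is recomputed the next time it is read with a stale generation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bumped whenever an interface or address change may stale the caches. */
static unsigned int addr_genid;

struct nexthop {
    uint32_t cached_saddr;    /* cached source address */
    unsigned int saddr_genid; /* generation the cache was computed in */
    unsigned int recomputes;  /* recomputation count, for illustration */
};

/* Hypothetical stand-in for the expensive source address selection. */
static uint32_t select_saddr(void)
{
    return 0x0a000001u + addr_genid;
}

/* Start with a generation that cannot match, forcing a first compute. */
void nexthop_init(struct nexthop *nh)
{
    assert(nh != NULL);
    nh->cached_saddr = 0;
    nh->saddr_genid = addr_genid - 1;
    nh->recomputes = 0;
}

/* Called on interface up, address add, or address delete. */
void addr_changed(void)
{
    addr_genid++;
}

/* Return the cached address, recomputing only if the generation moved. */
uint32_t nexthop_saddr(struct nexthop *nh)
{
    assert(nh != NULL);
    if (nh->saddr_genid != addr_genid) {
        nh->cached_saddr = select_saddr();
        nh->saddr_genid = addr_genid;
        nh->recomputes++;
    }
    return nh->cached_saddr;
}
```

The appeal of this design is that invalidation costs one increment no matter how many cached entries exist; the cost of recomputing is paid only by entries that are actually used again.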
2011-03-24 | net: fix pch_gbe section mismatch warning | Randy Dunlap | 1 | -3/+3
Fix section mismatch warning by renaming the pci_driver variable to a recognized (whitelisted) name. WARNING: drivers/net/pch_gbe/pch_gbe.o(.data+0x1f8): Section mismatch in reference from the variable pch_gbe_pcidev to the variable .devinit.rodata:pch_gbe_pcidev_id The variable pch_gbe_pcidev references the variable __devinitconst pch_gbe_pcidev_id If the reference is valid then annotate the variable with __init* or __refdata (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 | ipv4: fix fib metrics | Eric Dumazet | 1 | -1/+1
Alessandro Suardi reported that we could not change route metrics : ip ro change default .... advmss 1400 This regression came with commit 9c150e82ac50 (Allocate fib metrics dynamically). fib_metrics is no longer an array, but a pointer to an array. Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Alessandro Suardi <alessandro.suardi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
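The message does not show the changed line, so the following is only an illustration of the class of bug that appears when an array member becomes a pointer: `sizeof` silently starts returning the pointer size. All names here (`NMETRICS`, `struct route_info`, the two comparison helpers) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NMETRICS 14 /* hypothetical number of metric slots */

struct route_info {
    uint32_t *metrics; /* points to NMETRICS values; was once an array */
};

/* Buggy: sizeof now yields the pointer size, not the array size. */
int metrics_equal_buggy(const struct route_info *a,
                        const struct route_info *b)
{
    size_t n = sizeof(a->metrics); /* 4 or 8 bytes on common ABIs */

    assert(a != NULL && b != NULL);
    return memcmp(a->metrics, b->metrics, n) == 0;
}

/* Fixed: spell out the full size of the pointed-to array. */
int metrics_equal_fixed(const struct route_info *a,
                        const struct route_info *b)
{
    size_t n = sizeof(uint32_t) * NMETRICS;

    assert(a != NULL && b != NULL);
    return memcmp(a->metrics, b->metrics, n) == 0;
}
```

With two metric sets that differ only in a late slot, the buggy helper reports them as equal, which is the kind of symptom where a requested metric change appears to be ignored.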
2011-03-24 | mlx4_en: Removing HW info from ethtool -i report. | Yevgeny Petrilin | 1 | -14/+1
Avoiding abuse of ethtool_drvinfo.driver field. HW specific info can be retrieved using lspci. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 | David S. Miller | 4 | -2/+7
2011-03-24 | net_sched: fix THROTTLED/RUNNING race | Eric Dumazet | 1 | -4/+4
commit fd245a4adb52 (net_sched: move TCQ_F_THROTTLED flag) added a race. qdisc_watchdog() is run from softirq, so special care should be taken or we can lose one state transition (THROTTLED/RUNNING) Prior to fd245a4adb52, we were manipulating q->flags (qdisc->flags &= ~TCQ_F_THROTTLED;) and this manipulation could only race with qdisc_warn_nonwc(). Since we want to avoid atomic ops in qdisc fast path - it was the meaning of commit 371121057607e (QDISC_STATE_RUNNING dont need atomic bit ops) - fix is to move THROTTLE bit into 'state' field, this one being manipulated with SMP and IRQ safe operations. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
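The idea of keeping a flag that is touched from timer or softirq context in a word that is only modified atomically can be sketched with C11 atomics. This is a user-space analogy under assumptions: `struct toy_qdisc`, the bit names and the helpers are hypothetical, and C11 atomics stand in for the kernel's SMP and IRQ safe bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit assignments within the shared state word. */
#define TOY_STATE_SCHED     (1u << 0)
#define TOY_STATE_THROTTLED (1u << 1)

struct toy_qdisc {
    atomic_uint state; /* modified from more than one context */
};

/* Atomic read-modify-write: a concurrent change to another bit is kept. */
void toy_throttle(struct toy_qdisc *q)
{
    assert(q != NULL);
    atomic_fetch_or(&q->state, TOY_STATE_THROTTLED);
}

void toy_unthrottle(struct toy_qdisc *q)
{
    assert(q != NULL);
    atomic_fetch_and(&q->state, ~TOY_STATE_THROTTLED);
}

bool toy_is_throttled(struct toy_qdisc *q)
{
    assert(q != NULL);
    return (atomic_load(&q->state) & TOY_STATE_THROTTLED) != 0;
}
```

A plain `flags &= ~BIT` is a separate load, modify and store, so an update made by another context between the load and the store can be overwritten; the atomic forms above avoid that at the cost of an atomic operation, which is why the fast-path bit is kept elsewhere.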
2011-03-23 | Merge branch 'sfc-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6 | David S. Miller | 2 | -2/+18
2011-03-23 | drivers/net/a2065.c: Convert release_resource to release_region/release_mem_region | Julia Lawall | 1 | -5/+5
Request_mem_region should be used with release_mem_region, not release_resource. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,E;
@@
*x = request_mem_region(...)
... when != release_mem_region(x)
    when != x = E
* release_resource(x);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | drivers/net/ariadne.c: Convert release_resource to release_region/release_mem_region | Julia Lawall | 1 | -5/+5
Request_mem_region should be used with release_mem_region, not release_resource. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,E;
@@
*x = request_mem_region(...)
... when != release_mem_region(x)
    when != x = E
* release_resource(x);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | bonding: fix rx_handler locking | Jiri Pirko | 2 | -25/+32
This prevents a possible race between bond_enslave and bond_handle_frame, as reported by Nicolas, by moving the rx_handler register/unregister. slave->bond is added to hold a pointer to the master bonding structure. That way dev->master is no longer used in bond_handle_frame. Also, this removes the "BUG: scheduling while atomic" message. Reported-by: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com> Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Tested-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | myri10ge: fix rmmod crash | Stanislaw Gruszka | 1 | -0/+1
rmmod of myri10ge crashes at free_netdev() -> netif_napi_del(), because the napi structures are already deallocated. To fix this, call netif_napi_del() before kfree() in myri10ge_free_slices(). Cc: stable@kernel.org Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: updated driver version to 1.5.4.1 | Yevgeny Petrilin | 1 | -2/+2
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: Using blue flame support | Yevgeny Petrilin | 3 | -22/+64
Doorbell is used according to usage of BlueFlame. For Blue Flame to work in Ethernet mode QP number should have 0 at bits 6,7. Allocating range of QPs accordingly. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
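The constraint that a QP number must have 0 at bits 6 and 7 can be expressed as a mask test, and a contiguous range that satisfies it can be found with simple arithmetic. The sketch below is an illustration only; `BF_QPN_MASK`, `qpn_bf_ok()` and `qpn_bf_range()` are hypothetical names and do not describe how the driver actually reserves its QP range.

```c
#include <assert.h>
#include <stdint.h>

#define BF_QPN_MASK 0xC0u /* bits 6 and 7 */

/* True if this QP number has bits 6 and 7 clear. */
int qpn_bf_ok(uint32_t qpn)
{
    return (qpn & BF_QPN_MASK) == 0;
}

/*
 * Return the lowest base >= start such that base .. base+count-1 all
 * have bits 6 and 7 clear. Such numbers occupy the first 64 values of
 * every 256-value block, so count must be between 1 and 64.
 * Wraparound near UINT32_MAX is not handled in this sketch.
 */
uint32_t qpn_bf_range(uint32_t start, uint32_t count)
{
    assert(count >= 1 && count <= 64);

    uint32_t low = start & 0xFFu; /* position inside the 256 block */

    if (low + count <= 64)
        return start;
    return (start & ~0xFFu) + 256; /* skip to the next block */
}
```

For example, a request for 8 numbers starting at 60 cannot fit before bit 6 becomes set at 64, so the search moves on to 256.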
2011-03-23 | mlx4_core: reserve UARs for userspace consumers | Eli Cohen | 1 | -0/+8
Do not allow a kernel consumer to allocate a UAR to serve for blue flame if the number of available UARs gets below MLX4_NUM_RESERVED_UARS (currently 8). This will allow userspace apps to open a device file and run things like ibv_devinfo. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_core: maintain available field in bitmap allocator | Eli Cohen | 2 | -0/+15
Add mlx4_bitmap_avail() to give the number of available resources. We want to use this as a hint as to whether to allocate a resource or not. This patch is introduced to be used when allocating blue flame registers. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
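A bitmap allocator that maintains an "available" counter, and a caller that uses it as a hint to keep a reserve, might look roughly like this. The `toy_bitmap_*` names and the single 64-bit word are assumptions for the sake of a small example; this is not the mlx4 bitmap implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_bitmap {
    uint64_t bits;  /* bit i set means resource i is in use */
    uint32_t max;   /* number of resources, at most 64 */
    uint32_t avail; /* free resources, updated on alloc and free */
};

void toy_bitmap_init(struct toy_bitmap *bm, uint32_t max)
{
    assert(bm != NULL && max <= 64);
    bm->bits = 0;
    bm->max = max;
    bm->avail = max;
}

/* Allocate the lowest free index, or return -1 if none is left. */
int toy_bitmap_alloc(struct toy_bitmap *bm)
{
    for (uint32_t i = 0; i < bm->max; i++) {
        if (!(bm->bits & (UINT64_C(1) << i))) {
            bm->bits |= UINT64_C(1) << i;
            bm->avail--;
            return (int)i;
        }
    }
    return -1;
}

void toy_bitmap_free(struct toy_bitmap *bm, uint32_t idx)
{
    assert(idx < bm->max && (bm->bits & (UINT64_C(1) << idx)));
    bm->bits &= ~(UINT64_C(1) << idx);
    bm->avail++;
}

/* O(1) query, so callers can check it before every allocation. */
uint32_t toy_bitmap_avail(const struct toy_bitmap *bm)
{
    return bm->avail;
}

/* Caller-side hint: refuse once fewer than `reserve` entries remain. */
int toy_alloc_with_reserve(struct toy_bitmap *bm, uint32_t reserve)
{
    if (toy_bitmap_avail(bm) < reserve)
        return -1;
    return toy_bitmap_alloc(bm);
}
```

Keeping the counter in step with every allocation and release avoids scanning the bitmap just to decide whether an optional consumer should be allowed to take a resource.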
2011-03-23 | mlx4: Add blue flame support for kernel consumers | Eli Cohen | 5 | -0/+142
Using blue flame can improve latency by allowing the HW to more efficiently access the WQE. This patch presents two functions that are used to allocate or release HW resources for using blue flame; the caller needs to supply a struct mlx4_bf object when allocating resources. Consumers that make use of this API should post doorbells to the UAR object pointed to by the initialized struct mlx4_bf. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: Enabling new steering | Yevgeny Petrilin | 9 | -61/+300
The mlx4_en module now uses the new steering mechanism. The RX packets are now steered through the MCG table instead of the MAC table for unicast, and the default entry for multicast. The feature is enabled through INIT_HCA. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4: Add support for promiscuous mode in the new steering model. | Yevgeny Petrilin | 3 | -11/+620
For Ethernet mode only. When we want to register a QP as promiscuous, it must be added to all the existing steering entries and also to the default one. The promiscuous QP might also be one of the "real" QPs, which means we need to monitor every entry to avoid duplicates and ensure we close an entry when all it has is promiscuous QPs. The same mechanism is used for both unicast and multicast. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4: generalization of multicast steering. | Yevgeny Petrilin | 7 | -43/+102
The same packet steering mechanism will be used for both IB and Ethernet, and for both multicast and unicast. This commit prepares the general infrastructure for this. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: Reporting HW revision in ethtool -i | Yevgeny Petrilin | 5 | -3/+20
HW revision is derived from device ID and rev id. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4: Wake on LAN support | Yevgeny Petrilin | 6 | -2/+93
The driver queries the FW for WOL support. Ethtool get/set_wol is implemented accordingly. Only magic packets are supported at this time. Signed-off-by: Igor Yarovinsky <igory@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: using new mlx4 interrupt scheme | Yevgeny Petrilin | 5 | -23/+62
Each RX ring will have its own interrupt vector, and TX rings will share one (we mostly use polling for TX completions). The vectors are assigned the first time the device is opened, and each vector's name includes the interface name and ring number. Signed-off-by: Markuze Alex <markuze@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4: Changing interrupt scheme | Yevgeny Petrilin | 6 | -13/+134
Adding a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core customers (mlx4_ib, mlx4_en). The consumers will assign their own names to the interrupt vectors. Those vectors are not opened at mlx4 device initialization; they are opened on demand. Changed the max number of possible EQs according to the new scheme, which no longer relies on the number of cores. The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq(). Customers that do not use the new API will get completion vectors as before. Signed-off-by: Markuze Alex <markuze@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: bringing link up when registering netdevice | Yevgeny Petrilin | 1 | -0/+17
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: optimize adaptive moderation algorithm for better latency | Yevgeny Petrilin | 2 | -13/+6
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: moderation parameters are not reseted. | Yevgeny Petrilin | 1 | -2/+1
Instead of resetting the module parameters on each ifup or MTU change, they are set once at device initialization. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | mlx4_en: going out of range of TX rings when reporting stats | Yevgeny Petrilin | 1 | -1/+1
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | ath9k: Fix TX queue stuck issue. | Senthil Balasubramanian | 1 | -0/+2
commit 86271e460a66003dc1f4cbfd845adafb790b7587 introduced a regression that left the mac80211 queues in the stopped state. ath_drain_all_txq is called in driver flush, which would reset the stopped flag, and the mac80211 queues were never started after that. iperf traffic is completely stalled due to this issue. Restart the mac80211 queues in driver flush only if the txqs were drained. Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-23 | ath9k: Fix kernel panic caused by invalid rate index access. | Senthil Balasubramanian | 1 | -1/+1
With the recent tx status optimization in mac80211, we bail out as and when an invalid rate index is found. So the behavior of resetting rate idx to -1 and count to 0 has changed for the rate indexes that were not part of the driver's retry series. This has resulted in ath9k using an incorrect rate table index, which caused the system to panic. Ideally ath9k needs to loop only over the indexes that were part of the retry series, so simply use hw->max_rates as the loop counter. The stack trace of the panic is pasted here for reference.
[ 754.093192] BUG: unable to handle kernel paging request at ffff88046a9025b0
[ 754.093256] IP: [<ffffffffa02eac49>] ath_tx_status+0x209/0x2f0 [ath9k]
[ 754.094888] Call Trace:
[ 754.094903] <IRQ>
[ 754.094928] [<ffffffffa051f883>] ieee80211_tx_status+0x203/0x9e0 [mac80211]
[ 754.094975] [<ffffffffa053e305>] ? __ieee80211_wake_queue+0x125/0x140 [mac80211]
[ 754.095017] [<ffffffffa02e66c9>] ath_tx_complete_buf+0x1b9/0x370 [ath9k]
[ 754.095054] [<ffffffffa02e6fcf>] ath_tx_complete_aggr+0x51f/0xb50 [ath9k]
[ 754.095098] [<ffffffffa05382a3>] ? ieee80211_prepare_and_rx_handle+0x173/0xab0 [mac80211]
[ 754.095148] [<ffffffff81350e62>] ? _raw_spin_unlock_irqrestore+0x32/0x40
[ 754.095186] [<ffffffffa02e9735>] ath_tx_tasklet+0x365/0x4b0 [ath9k]
[ 754.095224] [<ffffffff8107a2a2>] ? clockevents_program_event+0x62/0xa0
[ 754.095261] [<ffffffffa02e2628>] ath9k_tasklet+0x168/0x1c0 [ath9k]
[ 754.095298] [<ffffffff8105599b>] tasklet_action+0x6b/0xe0
[ 754.095331] [<ffffffff81056278>] __do_softirq+0x98/0x120
[ 754.095361] [<ffffffff8100cd5c>] call_softirq+0x1c/0x30
[ 754.095393] [<ffffffff8100efb5>] do_softirq+0x65/0xa0
[ 754.095423] [<ffffffff810563fd>] irq_exit+0x8d/0x90
[ 754.095453] [<ffffffff8100ebc1>] do_IRQ+0x61/0xe0
[ 754.095482] [<ffffffff81351413>] ret_from_intr+0x0/0x15
[ 754.095513] <EOI>
[ 754.095531] [<ffffffff81014375>] ? native_sched_clock+0x15/0x70
[ 754.096475] [<ffffffffa02bcfa6>] ? acpi_idle_enter_bm+0x24d/0x285 [processor]
[ 754.096475] [<ffffffffa02bcf9f>] ? acpi_idle_enter_bm+0x246/0x285 [processor]
[ 754.096475] [<ffffffff8127fab2>] cpuidle_idle_call+0x82/0x100
[ 754.096475] [<ffffffff8100a236>] cpu_idle+0xa6/0xf0
[ 754.096475] [<ffffffff81339bc1>] rest_init+0x91/0xa0
[ 754.096475] [<ffffffff814efccd>] start_kernel+0x3fd/0x408
[ 754.096475] [<ffffffff814ef347>] x86_64_start_reservations+0x132/0x136
[ 754.096475] [<ffffffff814ef451>] x86_64_start_kernel+0x106/0x115
[ 754.096475] RIP [<ffffffffa02eac49>] ath_tx_status+0x209/0x2f0 [ath9k]
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-23 | orinoco: Clear dangling pointer on hardware busy | armadefuego@gmail.com | 1 | -0/+3
On hardware busy the scan request pointer should be cleared, as higher levels will release. This avoids a crash when that pointer is erroneously used later. Signed-off-by: Joseph J. Gunn <armadefuego@yahoo.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-23 | iwlagn: fix error in command waiting | Johannes Berg | 1 | -1/+1
Clearly a mistake, since pointers won't suddenly change their value... Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-23 | ipv4: fix ip_rt_update_pmtu() | Eric Dumazet | 1 | -2/+0
commit 2c8cec5c10bc (Cache learned PMTU information in inetpeer) added an extra inet_putpeer() call in ip_rt_update_pmtu(). This results in various problems, since we can free one inetpeer, while it is still in use. Ref: http://www.spinics.net/lists/netdev/msg159121.html Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | ipv4: Fallback to FIB local table in __ip_dev_find(). | David S. Miller | 1 | -0/+16
In commit 9435eb1cf0b76b323019cebf8d16762a50a12a19 ("ipv4: Implement __ip_dev_find using new interface address hash.") we reimplemented __ip_dev_find() so that it doesn't have to do a full FIB table lookup. Instead, it consults a hash table of addresses configured to interfaces. This works identically to the old code in all except one case, and that is for loopback subnets. The old code would match the loopback device for any IP address that falls within a subnet configured to the loopback device. Handle this corner case by doing the FIB lookup. We could implement this via inet_addr_onlink() but: 1) Someone could configure many addresses to loopback and inet_addr_onlink() is a simple list traversal. 2) We know the old code works. Reported-by: Julian Anastasov <ja@ssi.bg> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | tcp: Make undo_ssthresh arg to tcp_undo_cwr() a bool. | David S. Miller | 1 | -6/+6
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | tcp: avoid cwnd moderation in undo | Yuchung Cheng | 1 | -5/+7
In the current undo logic, cwnd is moderated after it was restored to the value prior to entering fast-recovery. It was moderated first in tcp_try_undo_recovery then again in tcp_complete_cwr. Since the undo indicates recovery was false, these moderations are not necessary. If the undo is triggered when most of the outstanding data have been acknowledged, the (restored) cwnd is falsely pulled down to a small value. This patch removes these cwnd moderations if cwnd is undone a) during fast-recovery b) by receiving DSACKs past fast-recovery Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | bridge: Fix possibly wrong MLD queries' ethernet source address | Linus Lüssing | 1 | -1/+1
ipv6_dev_get_saddr() is currently called with an uninitialized destination address. Although in tests it nevertheless seemed to fetch the right source address, there seems to be a possible race condition. Therefore this commit sets the destination address first and only then fetches the source address. Reported-by: Jan Beulich <JBeulich@novell.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | net: davinci_emac:Fix translation logic for buffer descriptor | Sriram | 4 | -4/+14
With recent changes to the driver (switch to the new cpdma layer), the support for buffer descriptor address translation logic is broken. This affects platforms where the physical address of the descriptors as seen by the DMA engine is different from the physical address. Original patch adding translation logic support: commit ad021ae8862209864dc8ebd3b7d3a55ce84b9ea2 Signed-off-by: Sriramakrishnan A G <srk@ti.com> Tested-By: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | ipv6: ip6_route_output does not modify sk parameter, so make it const | Florian Westphal | 2 | -2/+2
This avoids an explicit cast to avoid a 'discards qualifiers' compiler warning in a netfilter patch that I've been working on. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 | sfc: Siena: Disable write-combining when SR-IOV is enabled | Steve Hodgson | 2 | -2/+18
If SR-IOV is enabled by firmware, even if it is not enabled in the PCI capability, TX pushes using write-combining may be corrupted. We want to know whether it is enabled before mapping the NIC registers, and even if PCI extended capabilities are not accessible. Therefore, we look for the MSI capability, which is removed if SR-IOV is enabled. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-03-22 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 | David S. Miller | 9 | -24/+34
2011-03-22 | ipv4: optimize route adding on secondary promotion | Julian Anastasov | 1 | -1/+2
Optimize the calling of fib_add_ifaddr for all secondary addresses after the promoted one to start from their place, not from the new place of the promoted secondary. It will save some CPU cycles because we are sure the promoted secondary was first for the subnet and all next secondaries do not change their place. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | ipv4: remove the routes on secondary promotion | Julian Anastasov | 1 | -0/+11
The secondary address promotion relies on fib_sync_down_addr to remove all routes created for the secondary addresses when the old primary address is deleted. It does not happen for cases when the primary address is also in another subnet. Fix that by deleting local and broadcast routes for all secondaries while they are on device list and by faking that all addresses from this subnet are to be deleted. It relies on fib_del_ifaddr being able to ignore the IPs from the concerned subnet while checking for duplication. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | ipv4: fix route deletion for IPs on many subnets | Julian Anastasov | 2 | -13/+89
Alex Sidorenko reported problems with local routes left after IP addresses are deleted. It happens when the same IPs are used in more than one subnet for the device. Fix fib_del_ifaddr to restrict the checks for duplicate local and broadcast addresses only to the IFAs that use our primary IFA or another primary IFA with the same address. And we expect the prefsrc to be matched when the routes are deleted because it is possible for them to differ only by prefsrc. This patch prevents local and broadcast routes from being leaked until their primary IP is finally deleted from the box. As the secondary address promotion needs to delete the routes for all secondaries that used the old primary IFA, add an option to ignore these secondaries from the checks and to assume they are already deleted, so that we can safely delete the route while these IFAs are still on the device list. Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 | ipv4: match prefsrc when deleting routes | Julian Anastasov | 1 | -0/+2
fib_table_delete forgets to match the routes by prefsrc. Callers can specify a known IP in fc_prefsrc and we should remove the exact route. This is needed for cases when the same local or broadcast addresses are used in different subnets and the routes differ only in prefsrc. All callers that do not provide fc_prefsrc will ignore the route prefsrc as before and will delete the first occurrence. That is how the ip route del default magic works. Current callers are:
- ip_rt_ioctl where rtentry_to_fib_config provides fc_prefsrc only when the provided device name matches IP label with colon.
- inet_rtm_delroute where RTA_PREFSRC is optional too
- fib_magic which deals with routes when deleting addresses and where the fc_prefsrc is always set with the primary IP for the concerned IFA.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
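The matching rule described in this message (match prefsrc exactly when the caller supplies one, otherwise delete the first route found) can be sketched on a flat table. This is an illustration under assumptions: `struct toy_route` and `toy_route_delete()` are hypothetical, addresses are plain integers, and the real FIB lookup is far more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_route {
    uint32_t dst;     /* destination address */
    uint32_t prefsrc; /* preferred source address */
    int in_use;       /* nonzero while the entry is live */
};

/*
 * Delete the first live route to dst. If prefsrc is nonzero, only a
 * route with that exact prefsrc matches; if it is zero, prefsrc is
 * ignored. Returns the index deleted, or -1 if nothing matched.
 */
int toy_route_delete(struct toy_route *tbl, size_t n,
                     uint32_t dst, uint32_t prefsrc)
{
    assert(tbl != NULL);

    for (size_t i = 0; i < n; i++) {
        if (!tbl[i].in_use || tbl[i].dst != dst)
            continue;
        if (prefsrc && tbl[i].prefsrc != prefsrc)
            continue; /* same destination, different prefsrc */
        tbl[i].in_use = 0;
        return (int)i;
    }
    return -1;
}
```

With two routes that share a destination and differ only in prefsrc, a delete that names the second prefsrc removes the second entry instead of whichever one happens to come first.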