summaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)AuthorFilesLines
2005-09-22[NETFILTER] Fix conntrack event cache deadlock/oopsHarald Welte5-28/+28
This patch fixes a number of bugs. It cannot be reasonably split up in multiple fixes, since all bugs interact with each other and affect the same function: Bug #1: The event cache code cannot be called while a lock is held. Therefore, the call to ip_conntrack_event_cache() within ip_ct_refresh_acct() needs to be moved outside of the locked section. This fixes a number of 2.6.14-rcX oops and deadlock reports. Bug #2: We used to call ct_add_counters() for unconfirmed connections without holding a lock. Since the add operations are not atomic, we could race with another CPU. Bug #3: ip_ct_refresh_acct() lost REFRESH events in some cases where refresh (and the corresponding event) are desired, but no accounting shall be performed. Both, evenst and accounting implicitly depended on the skb parameter bein non-null. We now re-introduce a non-accounting "ip_ct_refresh()" variant to explicitly state the desired behaviour. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22[NETFILTER] Fix sparse endian warnings in pptp helperAlexey Dobriyan1-6/+8
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22[NETFILTER] fix DEBUG statement in PPTP helperHarald Welte1-1/+1
As noted by Alexey Dobriyan, the DEBUGP statement prints the wrong callID. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22[TCP]: Adjust Reno SACK estimate in tcp_fragmentHerbert Xu1-0/+9
Since the introduction of TSO pcount a year ago, it has been possible for tcp_fragment() to cause packets_out to decrease. Prior to that, tcp_retrans_try_collapse() was the only way for that to happen on the retransmission path. When this happens with Reno, it is possible for sasked_out to become invalid because it is only an estimate and not tied to any particular packet on the retransmission queue. Therefore we need to adjust sacked_out as well as left_out in the Reno case. The following patch does exactly that. This bug is pretty difficult to trigger in practice though since you need a SACKless peer with a retransmission that occurs just as the cached MTU value expires. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21[TCP]: Set default congestion control correctly for incoming connections.Stephen Hemminger1-1/+1
Patch from Joel Sing to fix the default congestion control algorithm for incoming connections. If a new congestion control handler is added (via module), it should become the default for new connections. Instead, the incoming connections use reno. The cause is incorrect initialisation causes the tcp_init_congestion_control() function to return after the initial if test fails. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Ian McDonald <imcdnzl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21[FIB_TRIE]: message cleanupStephen Hemminger1-12/+3
Cleanup the printk's in fib_trie: * Convert a couple of places in the dump code to BUG_ON * Put log level's on each message The version message really needed the message since it leaks out on the pretty Fedora bootup. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Robert Olsson <Robert.Olsson@data.slu.se>, Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds12-25/+1805
2005-09-19[PATCH] raw_sendmsg DoS on 2.6Mark J Cox1-1/+1
Fix unchecked __get_user that could be tricked into generating a memory read on an arbitrary address. The result of the read is not returned directly but you may be able to divine some information about it, or use the read to cause a crash on some architectures by reading hardware state. CAN-2004-2492. Fix from Al Viro, ack from Dave Miller. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-19[TCP]: Handle SACK'd packets properly in tcp_fragment().Herbert Xu1-3/+7
The problem is that we're now calling tcp_fragment() in a context where the packets might be marked as SACKED_ACKED or SACKED_RETRANS. This was not possible before as you never retransmitted packets that are so marked. Because of this, we need to adjust sacked_out and retrans_out in tcp_fragment(). This is exactly what the following patch does. We also need to preserve the SACKED_ACKED/SACKED_RETRANS marking if they exist. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19[NETFILTER]: Export ip_nat_port_{nfattr_to_range,range_to_nfattr}Harald Welte1-0/+2
Those exports are needed by the PPTP helper following in the next couple of changes. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19[NETFILTER]: Rename misnamed functionPatrick McHardy3-5/+5
Both __ip_conntrack_expect_find and ip_conntrack_expect_find_get take a reference to the expectation, the difference is that callers of __ip_conntrack_expect_find must hold ip_conntrack_lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19[NETFILTER]: Add new PPTP conntrack and NAT helperHarald Welte6-0/+1774
This new "version 3" PPTP conntrack/nat helper is finally ready for mainline inclusion. Special thanks to lots of last-minute bugfixing by Patric McHardy. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19[IPV4]: fib_trie RCU refinementsRobert Olsson1-11/+10
* This patch is from Paul McKenney's RCU reviewing. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19[IPV4]: fib_trie tnode stats refinementsRobert Olsson1-6/+7
* Prints the route tnode and set the stats level deepth as before. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18[NETFILTER]: Solve Kconfig dependency problemHarald Welte1-3/+1
As suggested by Roman Zippel. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-17[NETFILTER] Fix Kconfig dependencies for nfnetlink/ctnetlinkHarald Welte1-6/+10
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16[NETFILTER]: Fix oops in conntrack event cacheHarald Welte1-1/+4
ip_ct_refresh_acct() can be called without a valid "skb" pointer. This used to work, since ct_add_counters() deals with that fact. However, the recently-added event cache doesn't handle this at all. This patch is a quick fix that is supposed to be replaced soon by a cleaner solution during the pending redesign of the event cache. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16[NETFILTER] CLUSTERIP: use a bitmap to store node responsibility dataKOVACS Krisztian1-82/+61
Instead of maintaining an array containing a list of nodes this instance is responsible for let's use a simple bitmap. This provides the following features: * clusterip_responsible() and the add_node()/delete_node() operations become very simple and don't need locking * the config structure is much smaller In spite of the completely different internal data representation the user-space interface remains almost unchanged; the only difference is that the proc file does not list nodes in the order they were added. (The target info structure remains the same.) Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16[NETFILTER] CLUSTERIP: introduce reference counting for entriesKOVACS Krisztian1-18/+62
The CLUSTERIP target creates a procfs entry for all different cluster IPs. Although more than one rules can refer to a single cluster IP (and thus a single config structure), removal of the procfs entry is done unconditionally in destroy(). In more complicated situations involving deferred dereferencing of the config structure by procfs and creating a new rule with the same cluster IP it's also possible that no entry will be created for the new rule. This patch fixes the problem by counting the number of entries referencing a given config structure and moving the config list manipulation and procfs entry deletion parts to the clusterip_config_entry_put() function. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14[IPVS]: ip_vs_ftp breaks connections using persistenceJulian Anastasov3-17/+60
ip_vs_ftp when loaded can create NAT connections with unknown client port for passive FTP. For such expectations we lookup with cport=0 on incoming packet but it matches the format of the persistence templates causing packets to other persistent virtual servers to be forwarded to real server without creating connection. Later the reply packets are treated as foreign and not SNAT-ed. This patch changes the connection lookup for packets from clients: * introduce IP_VS_CONN_F_TEMPLATE connection flag to mark the connection as template * create new connection lookup function just for templates - ip_vs_ct_in_get * make sure ip_vs_conn_in_get hits only connections with IP_VS_CONN_F_NO_CPORT flag set when s_port is 0. By this way we avoid returning template when looking for cport=0 (ftp) Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14[IPVS]: Really invalidate persistent templatesJulian Anastasov1-1/+1
Agostino di Salle noticed that persistent templates are not invalidated due to buggy optimization. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14[MCAST]: Fix MCAST_EXCLUDE line dupesDenis Lukianov1-1/+1
This patch fixes line dupes at /ipv4/igmp.c and /ipv6/mcast.c in the 2.6 kernel, where MCAST_EXCLUDE is mistakenly used instead of MCAST_INCLUDE. Signed-off-by: Denis Lukianov <denis@voxelsoft.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14[TCP]: Compute in_sacked properly when we split up a TSO frame.Herbert Xu2-7/+11
The problem is that the SACK fragmenting code may incorrectly call tcp_fragment() with a length larger than the skb->len. This happens when the skb on the transmit queue completely falls to the LHS of the SACK. And add a BUG() check to tcp_fragment() so we can spot this kind of error more quickly in the future. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13[NETFILTER]: Fix DHCP + MASQUERADE problemPatrick McHardy1-0/+6
In 2.6.13-rcX the MASQUERADE target was changed not to exclude local packets for better source address consistency. This breaks DHCP clients using UDP sockets when the DHCP requests are caught by a MASQUERADE rule because the MASQUERADE target drops packets when no address is configured on the outgoing interface. This patch makes it ignore packets with a source address of 0. Thanks to Rusty for this suggestion. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13[NETFILTER]: Fix rcu race in ipt_REDIRECTPatrick McHardy1-6/+10
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13[NETFILTER]: Simplify netbios helperPatrick McHardy1-12/+7
Don't parse the packet, the data is already available in the conntrack structure. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13[NETFILTER]: Use correct type for "ports" module parameterPatrick McHardy3-9/+9
With large port numbers the helper_names buffer can overflow. Noticed by Samir Bellabes <sbellabes@mandriva.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12[NET]: fix-up schedule_timeout() usageNishanth Aravamudan1-4/+2
Use schedule_timeout_{,un}interruptible() instead of set_current_state()/schedule_timeout() to reduce kernel size. Also use human-time conversion functions instead of hard-coded division to avoid rounding issues. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-10[TCP]: Fix double adjustment of tp->{lost,left}_out in tcp_fragment().Herbert Xu1-5/+0
There is an extra left_out/lost_out adjustment in tcp_fragment which means that the lost_out accounting is always wrong. This patch removes that chunk of code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-09Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Linus Torvalds2-432/+385
2005-09-09[PATCH] timer initialization cleanup: DEFINE_TIMERIngo Molnar1-2/+1
Clean up timer initialization by introducing DEFINE_TIMER a'la DEFINE_SPINLOCK. Build and boot-tested on x86. A similar patch has been been in the -RT tree for some time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] files: lock-free fd look-upDipankar Sarma1-0/+1
With the use of RCU in files structure, the look-up of files using fds can now be lock-free. The lookup is protected by rcu_read_lock()/rcu_read_unlock(). This patch changes the readers to use lock-free lookup. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[IPV4] fib_trie: fix proc interfaceStephen Hemminger2-432/+385
Create one iterator for walking over FIB trie, and use it for all the /proc functions. Add a /proc/net/route output for backwards compatibility with old applications. Make initialization of fib_trie same as fib_hash so no #ifdef is needed in af_inet.c Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=5209 Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08[XFRM]: Always release dst_entry on error in xfrm_lookupPatrick McHardy1-4/+1
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08[TCP]: Fix off by one in tcp_fragment() "already sent" test.Herbert Xu1-1/+1
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08[NETFILTER]: ip_conntrack_netbios_ns.c gcc-2.95.x build fixAndrew Morton1-4/+20
gcc-2.95.x can't do this sort of initialisation Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08[IPV4]: Fix refcount damaging in net/ipv4/route.cJulian Anastasov1-15/+14
One such place that can damage the dst refcnts is route.c with CONFIG_IP_ROUTE_MULTIPATH_CACHED enabled, i don't see the user's .config. In this new code i see that rt_intern_hash is called before dst->refcnt is set to 1, dst is the 2nd arg to rt_intern_hash. Arg 2 of rt_intern_hash must come with refcnt 1 as it is added to table or dropped depending on error/add/update. One such example is ip_mkroute_input where __mkroute_input return rth with refcnt 0 which is provided to rt_intern_hash. ip_mkroute_output looks like a 2nd such place. Appending untested patch for comments and review. The idea is to put previous reference as we are going to return next result/error. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08[IPV4] udp: trim forgets about CHECKSUM_HWStephen Hemminger1-1/+1
A UDP packet may contain extra data that needs to be trimmed off. But when doing so, UDP forgets to fixup the skb checksum if CHECKSUM_HW is being used. I think this explains the case of a NFS receive using skge driver causing 'udp hw checksum failures' when interacting with a crufty settop box. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[IPV4]: Reassembly trim not clearing CHECKSUM_HWStephen Hemminger1-1/+1
This was found by inspection while looking for checksum problems with the skge driver that sets CHECKSUM_HW. It did not fix the problem, but it looks like it is needed. If IP reassembly is trimming an overlapping fragment, it should reset (or adjust) the hardware checksum flag on the skb. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: Missing unlock in TCP connection tracking error pathPatrick McHardy1-0/+1
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: kill __ip_ct_expect_unlink_destroyPablo Neira Ayuso3-18/+16
The following patch kills __ip_ct_expect_unlink_destroy and export unlink_expect as ip_ct_unlink_expect. As it was discussed [1], the function __ip_ct_expect_unlink_destroy is a bit confusing so better do the following sequence: ip_ct_destroy_expect and ip_conntrack_expect_put. [1] https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020794.html Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: Don't increase master refcount on expectationsPablo Neira Ayuso1-4/+4
As it's been discussed [1][2]. We shouldn't increase the master conntrack refcount for non-fulfilled conntracks. During the conntrack destruction, the expectations are always killed before the conntrack itself, this guarantees that there won't be any orphan expectation. [1]https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020783.html [2]https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020904.html Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: Handle NAT module load racePatrick McHardy2-2/+27
When the NAT module is loaded when connections are already confirmed it must not change their tuples anymore. This is especially important with CONFIG_NETFILTER_DEBUG, the netfilter listhelp functions will refuse to remove an entry from a list when it can not be found on the list, so when a changed tuple hashes to a new bucket the entry is kept in the list until and after the conntrack is freed. Allocate the exact conntrack tuple for NAT for already confirmed connections or drop them if that fails. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: Fix CONNMARK Kconfig dependencyYasuyuki Kozakai1-0/+1
Connection mark tracking support is one of the feature in connection tracking, so IP_NF_CONNTRACK_MARK depends on IP_NF_CONNTRACK. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: Add NetBIOS name service helperPatrick McHardy3-0/+151
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[NETFILTER]: Add support for permanent expectationsPatrick McHardy6-4/+13
A permanent expectation exists until timeing out and can expect multiple related connections. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05[TCP]: Fix TCP_OFF() bug check introduced by previous change.Herbert Xu1-2/+2
The TCP_OFF assignment at the bottom of that if block can indeed set TCP_OFF without setting TCP_PAGE. Since there is not much to be gained from avoiding this situation, we might as well just zap the offset. The following patch should fix it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05[IPV4]: net/ipv4/ipconfig.c should #include <linux/nfs_fs.h>Adrian Bunk1-0/+1
Every file should #include the header files containing the prototypes of it's global functions. nfs_fs.h contains the prototype of root_nfs_parse_addr(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01[TCP]: Keep TSO enabled even during loss events.David S. Miller2-47/+44
All we need to do is resegment the queue so that we record SACK information accurately. The edges of the SACK blocks guide our resegmenting decisions. With help from Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01[TCP]: Fix sk_forward_alloc underflow in tcp_sendmsgHerbert Xu1-5/+9
I've finally found a potential cause of the sk_forward_alloc underflows that people have been reporting sporadically. When tcp_sendmsg tacks on extra bits to an existing TCP_PAGE we don't check sk_forward_alloc even though a large amount of time may have elapsed since we allocated the page. In the mean time someone could've come along and liberated packets and reclaimed sk_forward_alloc memory. This patch makes tcp_sendmsg check sk_forward_alloc every time as we do in do_tcp_sendpages. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>