author     Linus Torvalds <torvalds@linux-foundation.org>  2014-12-11 14:27:06 -0800
committer  Linus Torvalds <torvalds@linux-foundation.org>  2014-12-11 14:27:06 -0800
commit     70e71ca0af244f48a5dcf56dc435243792e3a495
tree       f7d9c4c4d9a857a00043e9bf6aa2d6f533a34778 /net/netfilter
parent     bae41e45b7400496b9bf0c70c6004419d9987819
parent     00c83b01d58068dfeb2e1351cca6fccf2a83fa8f
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) New offloading infrastructure and example 'rocker' driver for
offloading of switching and routing to hardware.
This work was done by a large group of dedicated individuals,
including, but not limited to: Scott Feldman, Jiri Pirko, Thomas Graf,
John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli
and Roopa Prabhu.
2) Start making the networking stack operate on IOV iterators instead
of modifying iov objects in situ during transfers. Thanks to Al Viro
and Herbert Xu.
3) A set of new netlink interfaces for the TIPC stack, from Richard
Alpe.
4) Remove unnecessary looping during ipv6 routing lookups, from Martin
KaFai Lau.
5) Add PAUSE frame generation support to gianfar driver, from Matei
Pavaluca.
6) Allow for larger reordering levels in TCP, which are easily
achievable in the real world right now, from Eric Dumazet.
7) Add a variant of napi_schedule() that doesn't need to disable cpu
interrupts, from Eric Dumazet.
8) Use a doubly linked list to optimize neigh_parms_release(), from
Nicolas Dichtel.
9) Various enhancements to the kernel BPF verifier, and allow eBPF
programs to actually be attached to sockets. From Alexei
Starovoitov.
10) Support TSO/LSO in sunvnet driver, from David L Stevens.
11) Allow controlling ECN usage via routing metrics, from Florian
Westphal.
12) Remote checksum offload, from Tom Herbert.
13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
driver, from Thomas Lendacky.
14) Add MPLS support to openvswitch, from Simon Horman.
15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
Klassert.
16) Do gro flushes on a per-device basis using a timer, from Eric
Dumazet. This tries to reconcile the conflicting goals of handling
bulk vs. RPC-like traffic.
17) Allow userspace to ask for the CPU on which a packet was
received/steered, via SO_INCOMING_CPU; a usage sketch follows this
list. From Eric Dumazet.
18) Limit GSO packets to half the current congestion window, from Eric
Dumazet.
19) Add a generic helper so that all drivers set their RSS keys in a
consistent way, from Eric Dumazet.
20) Add xmit_more support to enic driver, from Govindarajulu
Varadarajan.
21) Add VLAN packet scheduler action, from Jiri Pirko.
22) Support configurable RSS hash functions via ethtool, from Eyal
Perry.
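
For items 9 and 17 above, the user-visible interface is plain socket
options: SO_ATTACH_BPF takes the file descriptor of an eBPF program
previously loaded with the bpf() syscall, and SO_INCOMING_CPU is read
back with getsockopt(). The following is a minimal userspace sketch,
assuming headers that already define these constants (Linux 3.19+);
'fd' and 'prog_fd' are placeholders and error handling is trimmed:

    #include <stdio.h>
    #include <sys/socket.h>

    /* Item 17: ask which CPU the most recent packet for this socket was
     * received/steered on.  'fd' is assumed to be a connected socket
     * that has already seen traffic. */
    int incoming_cpu(int fd)
    {
            int cpu = -1;
            socklen_t len = sizeof(cpu);

            if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len) < 0) {
                    perror("getsockopt(SO_INCOMING_CPU)");
                    return -1;
            }
            return cpu;
    }

    /* Item 9: attach an already-loaded eBPF socket filter.  'prog_fd'
     * is assumed to come from bpf(BPF_PROG_LOAD, ...) for a
     * socket-filter program; the load step itself is not shown. */
    int attach_ebpf_filter(int fd, int prog_fd)
    {
            return setsockopt(fd, SOL_SOCKET, SO_ATTACH_BPF,
                              &prog_fd, sizeof(prog_fd));
    }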
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
Fix race condition between vxlan_sock_add and vxlan_sock_release
net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
net/mlx4: Add support for A0 steering
net/mlx4: Refactor QUERY_PORT
net/mlx4_core: Add explicit error message when rule doesn't meet configuration
net/mlx4: Add A0 hybrid steering
net/mlx4: Add mlx4_bitmap zone allocator
net/mlx4: Add a check if there are too many reserved QPs
net/mlx4: Change QP allocation scheme
net/mlx4_core: Use tasklet for user-space CQ completion events
net/mlx4_core: Mask out host side virtualization features for guests
net/mlx4_en: Set csum level for encapsulated packets
be2net: Export tunnel offloads only when a VxLAN tunnel is created
gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
net: fec: only enable mdio interrupt before phy device link up
net: fec: clear all interrupt events to support i.MX6SX
net: fec: reset fep link status in suspend function
net: sock: fix access via invalid file descriptor
net: introduce helper macro for_each_cmsghdr
...
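
One of the shortlog entries above introduces a for_each_cmsghdr helper.
The merged macro is not reproduced here, but it is presumably a thin
iteration wrapper over the long-standing CMSG_FIRSTHDR()/CMSG_NXTHDR()
accessors, along these lines (illustrative sketch only):

    #include <sys/socket.h>

    /* Illustrative definition; guarded so a real one takes precedence. */
    #ifndef for_each_cmsghdr
    #define for_each_cmsghdr(cmsg, msg) \
            for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg))
    #endif

    /* Example use: count the control-message headers in a received msghdr. */
    int count_cmsgs(struct msghdr *msg)
    {
            struct cmsghdr *cmsg;
            int n = 0;

            for_each_cmsghdr(cmsg, msg)
                    n++;
            return n;
    }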
Diffstat (limited to 'net/netfilter')
27 files changed, 502 insertions, 225 deletions
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig index ae5096ab65eb..b02660fa9eb0 100644 --- a/net/netfilter/Kconfig +++ b/net/netfilter/Kconfig @@ -411,6 +411,13 @@ config NF_NAT_TFTP depends on NF_CONNTRACK && NF_NAT default NF_NAT && NF_CONNTRACK_TFTP +config NF_NAT_REDIRECT + tristate "IPv4/IPv6 redirect support" + depends on NF_NAT + help + This is the kernel functionality to redirect packets to local + machine through NAT. + config NETFILTER_SYNPROXY tristate @@ -505,6 +512,15 @@ config NFT_MASQ This option adds the "masquerade" expression that you can use to perform NAT in the masquerade flavour. +config NFT_REDIR + depends on NF_TABLES + depends on NF_CONNTRACK + depends on NF_NAT + tristate "Netfilter nf_tables redirect support" + help + This options adds the "redirect" expression that you can use + to perform NAT in the redirect flavour. + config NFT_NAT depends on NF_TABLES depends on NF_CONNTRACK @@ -835,6 +851,7 @@ config NETFILTER_XT_TARGET_RATEEST config NETFILTER_XT_TARGET_REDIRECT tristate "REDIRECT target support" depends on NF_NAT + select NF_NAT_REDIRECT ---help--- REDIRECT is a special case of NAT: all incoming connections are mapped onto the incoming interface's address, causing the packets to diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index a9571be3f791..89f73a9e9874 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -51,6 +51,7 @@ nf_nat-y := nf_nat_core.o nf_nat_proto_unknown.o nf_nat_proto_common.o \ obj-$(CONFIG_NF_LOG_COMMON) += nf_log_common.o obj-$(CONFIG_NF_NAT) += nf_nat.o +obj-$(CONFIG_NF_NAT_REDIRECT) += nf_nat_redirect.o # NAT protocols (nf_nat) obj-$(CONFIG_NF_NAT_PROTO_DCCP) += nf_nat_proto_dccp.o @@ -88,6 +89,7 @@ obj-$(CONFIG_NFT_HASH) += nft_hash.o obj-$(CONFIG_NFT_COUNTER) += nft_counter.o obj-$(CONFIG_NFT_LOG) += nft_log.o obj-$(CONFIG_NFT_MASQ) += nft_masq.o +obj-$(CONFIG_NFT_REDIR) += nft_redir.o # generic X tables obj-$(CONFIG_NETFILTER_XTABLES) += x_tables.o xt_tcpudp.o diff --git a/net/netfilter/core.c b/net/netfilter/core.c index 024a2e25c8a4..fea9ef566427 100644 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@ -17,6 +17,7 @@ #include <linux/interrupt.h> #include <linux/if.h> #include <linux/netdevice.h> +#include <linux/netfilter_ipv6.h> #include <linux/inetdevice.h> #include <linux/proc_fs.h> #include <linux/mutex.h> diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index fee7c64e4dd1..974ff386db0f 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -147,16 +147,22 @@ hbucket_elem_add(struct hbucket *n, u8 ahash_max, size_t dsize) #else #define __CIDR(cidr, i) (cidr) #endif + +/* cidr + 1 is stored in net_prefixes to support /0 */ +#define SCIDR(cidr, i) (__CIDR(cidr, i) + 1) + #ifdef IP_SET_HASH_WITH_NETS_PACKED -/* When cidr is packed with nomatch, cidr - 1 is stored in the entry */ -#define CIDR(cidr, i) (__CIDR(cidr, i) + 1) +/* When cidr is packed with nomatch, cidr - 1 is stored in the data entry */ +#define GCIDR(cidr, i) (__CIDR(cidr, i) + 1) +#define NCIDR(cidr) (cidr) #else -#define CIDR(cidr, i) (__CIDR(cidr, i)) +#define GCIDR(cidr, i) (__CIDR(cidr, i)) +#define NCIDR(cidr) (cidr - 1) #endif #define SET_HOST_MASK(family) (family == AF_INET ? 
32 : 128) -#ifdef IP_SET_HASH_WITH_MULTI +#ifdef IP_SET_HASH_WITH_NET0 #define NLEN(family) (SET_HOST_MASK(family) + 1) #else #define NLEN(family) SET_HOST_MASK(family) @@ -292,24 +298,22 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 nets_length, u8 n) int i, j; /* Add in increasing prefix order, so larger cidr first */ - for (i = 0, j = -1; i < nets_length && h->nets[i].nets[n]; i++) { + for (i = 0, j = -1; i < nets_length && h->nets[i].cidr[n]; i++) { if (j != -1) continue; else if (h->nets[i].cidr[n] < cidr) j = i; else if (h->nets[i].cidr[n] == cidr) { - h->nets[i].nets[n]++; + h->nets[cidr - 1].nets[n]++; return; } } if (j != -1) { - for (; i > j; i--) { + for (; i > j; i--) h->nets[i].cidr[n] = h->nets[i - 1].cidr[n]; - h->nets[i].nets[n] = h->nets[i - 1].nets[n]; - } } h->nets[i].cidr[n] = cidr; - h->nets[i].nets[n] = 1; + h->nets[cidr - 1].nets[n] = 1; } static void @@ -320,16 +324,12 @@ mtype_del_cidr(struct htype *h, u8 cidr, u8 nets_length, u8 n) for (i = 0; i < nets_length; i++) { if (h->nets[i].cidr[n] != cidr) continue; - if (h->nets[i].nets[n] > 1 || i == net_end || - h->nets[i + 1].nets[n] == 0) { - h->nets[i].nets[n]--; + h->nets[cidr -1].nets[n]--; + if (h->nets[cidr -1].nets[n] > 0) return; - } - for (j = i; j < net_end && h->nets[j].nets[n]; j++) { + for (j = i; j < net_end && h->nets[j].cidr[n]; j++) h->nets[j].cidr[n] = h->nets[j + 1].cidr[n]; - h->nets[j].nets[n] = h->nets[j + 1].nets[n]; - } - h->nets[j].nets[n] = 0; + h->nets[j].cidr[n] = 0; return; } } @@ -486,7 +486,7 @@ mtype_expire(struct ip_set *set, struct htype *h, u8 nets_length, size_t dsize) pr_debug("expired %u/%u\n", i, j); #ifdef IP_SET_HASH_WITH_NETS for (k = 0; k < IPSET_NET_COUNT; k++) - mtype_del_cidr(h, CIDR(data->cidr, k), + mtype_del_cidr(h, SCIDR(data->cidr, k), nets_length, k); #endif ip_set_ext_destroy(set, data); @@ -633,29 +633,6 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, bool flag_exist = flags & IPSET_FLAG_EXIST; u32 key, multi = 0; - if (h->elements >= h->maxelem && SET_WITH_FORCEADD(set)) { - rcu_read_lock_bh(); - t = rcu_dereference_bh(h->table); - key = HKEY(value, h->initval, t->htable_bits); - n = hbucket(t,key); - if (n->pos) { - /* Choosing the first entry in the array to replace */ - j = 0; - goto reuse_slot; - } - rcu_read_unlock_bh(); - } - if (SET_WITH_TIMEOUT(set) && h->elements >= h->maxelem) - /* FIXME: when set is full, we slow down here */ - mtype_expire(set, h, NLEN(set->family), set->dsize); - - if (h->elements >= h->maxelem) { - if (net_ratelimit()) - pr_warn("Set %s is full, maxelem %u reached\n", - set->name, h->maxelem); - return -IPSET_ERR_HASH_FULL; - } - rcu_read_lock_bh(); t = rcu_dereference_bh(h->table); key = HKEY(value, h->initval, t->htable_bits); @@ -680,15 +657,32 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, j != AHASH_MAX(h) + 1) j = i; } + if (h->elements >= h->maxelem && SET_WITH_FORCEADD(set) && n->pos) { + /* Choosing the first entry in the array to replace */ + j = 0; + goto reuse_slot; + } + if (SET_WITH_TIMEOUT(set) && h->elements >= h->maxelem) + /* FIXME: when set is full, we slow down here */ + mtype_expire(set, h, NLEN(set->family), set->dsize); + + if (h->elements >= h->maxelem) { + if (net_ratelimit()) + pr_warn("Set %s is full, maxelem %u reached\n", + set->name, h->maxelem); + ret = -IPSET_ERR_HASH_FULL; + goto out; + } + reuse_slot: if (j != AHASH_MAX(h) + 1) { /* Fill out reused slot */ data = ahash_data(n, j, set->dsize); #ifdef IP_SET_HASH_WITH_NETS for (i = 0; i < 
IPSET_NET_COUNT; i++) { - mtype_del_cidr(h, CIDR(data->cidr, i), + mtype_del_cidr(h, SCIDR(data->cidr, i), NLEN(set->family), i); - mtype_add_cidr(h, CIDR(d->cidr, i), + mtype_add_cidr(h, SCIDR(d->cidr, i), NLEN(set->family), i); } #endif @@ -705,7 +699,7 @@ reuse_slot: data = ahash_data(n, n->pos++, set->dsize); #ifdef IP_SET_HASH_WITH_NETS for (i = 0; i < IPSET_NET_COUNT; i++) - mtype_add_cidr(h, CIDR(d->cidr, i), NLEN(set->family), + mtype_add_cidr(h, SCIDR(d->cidr, i), NLEN(set->family), i); #endif h->elements++; @@ -766,7 +760,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, h->elements--; #ifdef IP_SET_HASH_WITH_NETS for (j = 0; j < IPSET_NET_COUNT; j++) - mtype_del_cidr(h, CIDR(d->cidr, j), NLEN(set->family), + mtype_del_cidr(h, SCIDR(d->cidr, j), NLEN(set->family), j); #endif ip_set_ext_destroy(set, data); @@ -827,15 +821,15 @@ mtype_test_cidrs(struct ip_set *set, struct mtype_elem *d, u8 nets_length = NLEN(set->family); pr_debug("test by nets\n"); - for (; j < nets_length && h->nets[j].nets[0] && !multi; j++) { + for (; j < nets_length && h->nets[j].cidr[0] && !multi; j++) { #if IPSET_NET_COUNT == 2 mtype_data_reset_elem(d, &orig); - mtype_data_netmask(d, h->nets[j].cidr[0], false); - for (k = 0; k < nets_length && h->nets[k].nets[1] && !multi; + mtype_data_netmask(d, NCIDR(h->nets[j].cidr[0]), false); + for (k = 0; k < nets_length && h->nets[k].cidr[1] && !multi; k++) { - mtype_data_netmask(d, h->nets[k].cidr[1], true); + mtype_data_netmask(d, NCIDR(h->nets[k].cidr[1]), true); #else - mtype_data_netmask(d, h->nets[j].cidr[0]); + mtype_data_netmask(d, NCIDR(h->nets[j].cidr[0])); #endif key = HKEY(d, h->initval, t->htable_bits); n = hbucket(t, key); @@ -883,7 +877,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext, /* If we test an IP address and not a network address, * try all possible network sizes */ for (i = 0; i < IPSET_NET_COUNT; i++) - if (CIDR(d->cidr, i) != SET_HOST_MASK(set->family)) + if (GCIDR(d->cidr, i) != SET_HOST_MASK(set->family)) break; if (i == IPSET_NET_COUNT) { ret = mtype_test_cidrs(set, d, ext, mext, flags); @@ -1107,8 +1101,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set, hsize = sizeof(*h); #ifdef IP_SET_HASH_WITH_NETS - hsize += sizeof(struct net_prefixes) * - (set->family == NFPROTO_IPV4 ? 
32 : 128); + hsize += sizeof(struct net_prefixes) * NLEN(set->family); #endif h = kzalloc(hsize, GFP_KERNEL); if (!h) diff --git a/net/netfilter/ipset/ip_set_hash_netiface.c b/net/netfilter/ipset/ip_set_hash_netiface.c index 35dd35873442..758b002130d9 100644 --- a/net/netfilter/ipset/ip_set_hash_netiface.c +++ b/net/netfilter/ipset/ip_set_hash_netiface.c @@ -115,6 +115,7 @@ iface_add(struct rb_root *root, const char **iface) #define IP_SET_HASH_WITH_NETS #define IP_SET_HASH_WITH_RBTREE #define IP_SET_HASH_WITH_MULTI +#define IP_SET_HASH_WITH_NET0 #define STREQ(a, b) (strcmp(a, b) == 0) diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c b/net/netfilter/ipset/ip_set_hash_netnet.c index da00284b3571..ea8772afb6e7 100644 --- a/net/netfilter/ipset/ip_set_hash_netnet.c +++ b/net/netfilter/ipset/ip_set_hash_netnet.c @@ -46,6 +46,7 @@ struct hash_netnet4_elem { __be64 ipcmp; }; u8 nomatch; + u8 padding; union { u8 cidr[2]; u16 ccmp; @@ -271,6 +272,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[], struct hash_netnet6_elem { union nf_inet_addr ip[2]; u8 nomatch; + u8 padding; union { u8 cidr[2]; u16 ccmp; diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c index b8053d675fc3..bfaa94c7baa7 100644 --- a/net/netfilter/ipset/ip_set_hash_netportnet.c +++ b/net/netfilter/ipset/ip_set_hash_netportnet.c @@ -53,6 +53,7 @@ struct hash_netportnet4_elem { u8 cidr[2]; u16 ccmp; }; + u16 padding; u8 nomatch:1; u8 proto; }; @@ -324,6 +325,7 @@ struct hash_netportnet6_elem { u8 cidr[2]; u16 ccmp; }; + u16 padding; u8 nomatch:1; u8 proto; }; diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index ac7ba689efe7..b8295a430a56 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -465,8 +465,7 @@ __ip_vs_bind_svc(struct ip_vs_dest *dest, struct ip_vs_service *svc) static void ip_vs_service_free(struct ip_vs_service *svc) { - if (svc->stats.cpustats) - free_percpu(svc->stats.cpustats); + free_percpu(svc->stats.cpustats); kfree(svc); } diff --git a/net/netfilter/ipvs/ip_vs_pe.c b/net/netfilter/ipvs/ip_vs_pe.c index 1a82b29ce8ea..0df17caa8af6 100644 --- a/net/netfilter/ipvs/ip_vs_pe.c +++ b/net/netfilter/ipvs/ip_vs_pe.c @@ -37,8 +37,7 @@ struct ip_vs_pe *__ip_vs_pe_getbyname(const char *pe_name) rcu_read_unlock(); return pe; } - if (pe->module) - module_put(pe->module); + module_put(pe->module); } rcu_read_unlock(); diff --git a/net/netfilter/ipvs/ip_vs_sched.c b/net/netfilter/ipvs/ip_vs_sched.c index 4dbcda6258bc..199760c71f39 100644 --- a/net/netfilter/ipvs/ip_vs_sched.c +++ b/net/netfilter/ipvs/ip_vs_sched.c @@ -104,8 +104,7 @@ static struct ip_vs_scheduler *ip_vs_sched_getbyname(const char *sched_name) mutex_unlock(&ip_vs_sched_mutex); return sched; } - if (sched->module) - module_put(sched->module); + module_put(sched->module); } mutex_unlock(&ip_vs_sched_mutex); diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c index 7162c86fd50d..c47ffd7a0a70 100644 --- a/net/netfilter/ipvs/ip_vs_sync.c +++ b/net/netfilter/ipvs/ip_vs_sync.c @@ -820,8 +820,7 @@ ip_vs_conn_fill_param_sync(struct net *net, int af, union ip_vs_sync_conn *sc, p->pe_data = kmemdup(pe_data, pe_data_len, GFP_ATOMIC); if (!p->pe_data) { - if (p->pe->module) - module_put(p->pe->module); + module_put(p->pe->module); return -ENOMEM; } p->pe_data_len = pe_data_len; diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index bd90bf8107da..3aedbda7658a 100644 --- 
a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -293,7 +293,6 @@ __ip_vs_get_out_rt(int skb_af, struct sk_buff *skb, struct ip_vs_dest *dest, &dest->addr.ip, &dest_dst->dst_saddr.ip, atomic_read(&rt->dst.__refcnt)); } - daddr = dest->addr.ip; if (ret_saddr) *ret_saddr = dest_dst->dst_saddr.ip; } else { @@ -344,7 +343,7 @@ __ip_vs_get_out_rt(int skb_af, struct sk_buff *skb, struct ip_vs_dest *dest, skb_dst_drop(skb); if (noref) { if (!local) - skb_dst_set_noref_force(skb, &rt->dst); + skb_dst_set_noref(skb, &rt->dst); else skb_dst_set(skb, dst_clone(&rt->dst)); } else @@ -488,7 +487,7 @@ __ip_vs_get_out_rt_v6(int skb_af, struct sk_buff *skb, struct ip_vs_dest *dest, skb_dst_drop(skb); if (noref) { if (!local) - skb_dst_set_noref_force(skb, &rt->dst); + skb_dst_set_noref(skb, &rt->dst); else skb_dst_set(skb, dst_clone(&rt->dst)); } else diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 5016a6929085..a11674806707 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -824,22 +824,19 @@ __nf_conntrack_alloc(struct net *net, u16 zone, atomic_dec(&net->ct.count); return ERR_PTR(-ENOMEM); } - /* - * Let ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.next - * and ct->tuplehash[IP_CT_DIR_REPLY].hnnode.next unchanged. - */ - memset(&ct->tuplehash[IP_CT_DIR_MAX], 0, - offsetof(struct nf_conn, proto) - - offsetof(struct nf_conn, tuplehash[IP_CT_DIR_MAX])); spin_lock_init(&ct->lock); ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple = *orig; ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.pprev = NULL; ct->tuplehash[IP_CT_DIR_REPLY].tuple = *repl; /* save hash for reusing when confirming */ *(unsigned long *)(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev) = hash; + ct->status = 0; /* Don't set timer yet: wait for confirmation */ setup_timer(&ct->timeout, death_by_timeout, (unsigned long)ct); write_pnet(&ct->ct_net, net); + memset(&ct->__nfct_init_offset[0], 0, + offsetof(struct nf_conn, proto) - + offsetof(struct nf_conn, __nfct_init_offset[0])); #ifdef CONFIG_NF_CONNTRACK_ZONES if (zone) { struct nf_conntrack_zone *nf_ct_zone; diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c index 3a3a60b126e0..1d69f5b9748f 100644 --- a/net/netfilter/nf_conntrack_h323_main.c +++ b/net/netfilter/nf_conntrack_h323_main.c @@ -728,7 +728,8 @@ static int expect_h245(struct sk_buff *skb, struct nf_conn *ct, /* If the calling party is on the same side of the forward-to party, * we don't need to track the second call */ -static int callforward_do_filter(const union nf_inet_addr *src, +static int callforward_do_filter(struct net *net, + const union nf_inet_addr *src, const union nf_inet_addr *dst, u_int8_t family) { @@ -750,9 +751,9 @@ static int callforward_do_filter(const union nf_inet_addr *src, memset(&fl2, 0, sizeof(fl2)); fl2.daddr = dst->ip; - if (!afinfo->route(&init_net, (struct dst_entry **)&rt1, + if (!afinfo->route(net, (struct dst_entry **)&rt1, flowi4_to_flowi(&fl1), false)) { - if (!afinfo->route(&init_net, (struct dst_entry **)&rt2, + if (!afinfo->route(net, (struct dst_entry **)&rt2, flowi4_to_flowi(&fl2), false)) { if (rt_nexthop(rt1, fl1.daddr) == rt_nexthop(rt2, fl2.daddr) && @@ -774,9 +775,9 @@ static int callforward_do_filter(const union nf_inet_addr *src, memset(&fl2, 0, sizeof(fl2)); fl2.daddr = dst->in6; - if (!afinfo->route(&init_net, (struct dst_entry **)&rt1, + if (!afinfo->route(net, (struct dst_entry **)&rt1, flowi6_to_flowi(&fl1), false)) { - if (!afinfo->route(&init_net, 
(struct dst_entry **)&rt2, + if (!afinfo->route(net, (struct dst_entry **)&rt2, flowi6_to_flowi(&fl2), false)) { if (ipv6_addr_equal(rt6_nexthop(rt1), rt6_nexthop(rt2)) && @@ -807,6 +808,7 @@ static int expect_callforwarding(struct sk_buff *skb, __be16 port; union nf_inet_addr addr; struct nf_conntrack_expect *exp; + struct net *net = nf_ct_net(ct); typeof(nat_callforwarding_hook) nat_callforwarding; /* Read alternativeAddress */ @@ -816,7 +818,7 @@ static int expect_callforwarding(struct sk_buff *skb, /* If the calling party is on the same side of the forward-to party, * we don't need to track the second call */ if (callforward_filter && - callforward_do_filter(&addr, &ct->tuplehash[!dir].tuple.src.u3, + callforward_do_filter(net, &addr, &ct->tuplehash[!dir].tuple.src.u3, nf_ct_l3num(ct))) { pr_debug("nf_ct_q931: Call Forwarding not tracked\n"); return 0; diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c index 5b3eae7d4c9a..bd9d31537905 100644 --- a/net/netfilter/nf_conntrack_helper.c +++ b/net/netfilter/nf_conntrack_helper.c @@ -250,7 +250,7 @@ out: } EXPORT_SYMBOL_GPL(__nf_ct_try_assign_helper); -/* appropiate ct lock protecting must be taken by caller */ +/* appropriate ct lock protecting must be taken by caller */ static inline int unhelp(struct nf_conntrack_tuple_hash *i, const struct nf_conntrack_helper *me) { diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c index 6e3b9117db1f..43c926cae9c0 100644 --- a/net/netfilter/nf_log.c +++ b/net/netfilter/nf_log.c @@ -19,6 +19,9 @@ static struct nf_logger __rcu *loggers[NFPROTO_NUMPROTO][NF_LOG_TYPE_MAX] __read_mostly; static DEFINE_MUTEX(nf_log_mutex); +#define nft_log_dereference(logger) \ + rcu_dereference_protected(logger, lockdep_is_held(&nf_log_mutex)) + static struct nf_logger *__find_logger(int pf, const char *str_logger) { struct nf_logger *log; @@ -28,8 +31,7 @@ static struct nf_logger *__find_logger(int pf, const char *str_logger) if (loggers[pf][i] == NULL) continue; - log = rcu_dereference_protected(loggers[pf][i], - lockdep_is_held(&nf_log_mutex)); + log = nft_log_dereference(loggers[pf][i]); if (!strncasecmp(str_logger, log->name, strlen(log->name))) return log; } @@ -45,8 +47,7 @@ void nf_log_set(struct net *net, u_int8_t pf, const struct nf_logger *logger) return; mutex_lock(&nf_log_mutex); - log = rcu_dereference_protected(net->nf.nf_loggers[pf], - lockdep_is_held(&nf_log_mutex)); + log = nft_log_dereference(net->nf.nf_loggers[pf]); if (log == NULL) rcu_assign_pointer(net->nf.nf_loggers[pf], logger); @@ -61,8 +62,7 @@ void nf_log_unset(struct net *net, const struct nf_logger *logger) mutex_lock(&nf_log_mutex); for (i = 0; i < NFPROTO_NUMPROTO; i++) { - log = rcu_dereference_protected(net->nf.nf_loggers[i], - lockdep_is_held(&nf_log_mutex)); + log = nft_log_dereference(net->nf.nf_loggers[i]); if (log == logger) RCU_INIT_POINTER(net->nf.nf_loggers[i], NULL); } @@ -75,6 +75,7 @@ EXPORT_SYMBOL(nf_log_unset); int nf_log_register(u_int8_t pf, struct nf_logger *logger) { int i; + int ret = 0; if (pf >= ARRAY_SIZE(init_net.nf.nf_loggers)) return -EINVAL; @@ -82,16 +83,25 @@ int nf_log_register(u_int8_t pf, struct nf_logger *logger) mutex_lock(&nf_log_mutex); if (pf == NFPROTO_UNSPEC) { + for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++) { + if (rcu_access_pointer(loggers[i][logger->type])) { + ret = -EEXIST; + goto unlock; + } + } for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++) rcu_assign_pointer(loggers[i][logger->type], logger); } else { - /* register at end of list to honor 
first register win */ + if (rcu_access_pointer(loggers[pf][logger->type])) { + ret = -EEXIST; + goto unlock; + } rcu_assign_pointer(loggers[pf][logger->type], logger); } +unlock: mutex_unlock(&nf_log_mutex); - - return 0; + return ret; } EXPORT_SYMBOL(nf_log_register); @@ -144,8 +154,7 @@ int nf_logger_find_get(int pf, enum nf_log_type type) struct nf_logger *logger; int ret = -ENOENT; - logger = loggers[pf][type]; - if (logger == NULL) + if (rcu_access_pointer(loggers[pf][type]) == NULL) request_module("nf-logger-%u-%u", pf, type); rcu_read_lock(); @@ -297,8 +306,7 @@ static int seq_show(struct seq_file *s, void *v) int i; struct net *net = seq_file_net(s); - logger = rcu_dereference_protected(net->nf.nf_loggers[*pos], - lockdep_is_held(&nf_log_mutex)); + logger = nft_log_dereference(net->nf.nf_loggers[*pos]); if (!logger) seq_printf(s, "%2lld NONE (", *pos); @@ -312,8 +320,7 @@ static int seq_show(struct seq_file *s, void *v) if (loggers[*pos][i] == NULL) continue; - logger = rcu_dereference_protected(loggers[*pos][i], - lockdep_is_held(&nf_log_mutex)); + logger = nft_log_dereference(loggers[*pos][i]); seq_printf(s, "%s", logger->name); if (i == 0 && loggers[*pos][i + 1] != NULL) seq_printf(s, ","); @@ -387,8 +394,7 @@ static int nf_log_proc_dostring(struct ctl_table *table, int write, mutex_unlock(&nf_log_mutex); } else { mutex_lock(&nf_log_mutex); - logger = rcu_dereference_protected(net->nf.nf_loggers[tindex], - lockdep_is_held(&nf_log_mutex)); + logger = nft_log_dereference(net->nf.nf_loggers[tindex]); if (!logger) table->data = "NONE"; else diff --git a/net/netfilter/nf_nat_redirect.c b/net/netfilter/nf_nat_redirect.c new file mode 100644 index 000000000000..97b75f9bfbcd --- /dev/null +++ b/net/netfilter/nf_nat_redirect.c @@ -0,0 +1,127 @@ +/* + * (C) 1999-2001 Paul `Rusty' Russell + * (C) 2002-2006 Netfilter Core Team <coreteam@netfilter.org> + * Copyright (c) 2011 Patrick McHardy <kaber@trash.net> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * Based on Rusty Russell's IPv4 REDIRECT target. Development of IPv6 + * NAT funded by Astaro. 
+ */ + +#include <linux/if.h> +#include <linux/inetdevice.h> +#include <linux/ip.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/netdevice.h> +#include <linux/netfilter.h> +#include <linux/types.h> +#include <linux/netfilter_ipv4.h> +#include <linux/netfilter_ipv6.h> +#include <linux/netfilter/x_tables.h> +#include <net/addrconf.h> +#include <net/checksum.h> +#include <net/protocol.h> +#include <net/netfilter/nf_nat.h> +#include <net/netfilter/nf_nat_redirect.h> + +unsigned int +nf_nat_redirect_ipv4(struct sk_buff *skb, + const struct nf_nat_ipv4_multi_range_compat *mr, + unsigned int hooknum) +{ + struct nf_conn *ct; + enum ip_conntrack_info ctinfo; + __be32 newdst; + struct nf_nat_range newrange; + + NF_CT_ASSERT(hooknum == NF_INET_PRE_ROUTING || + hooknum == NF_INET_LOCAL_OUT); + + ct = nf_ct_get(skb, &ctinfo); + NF_CT_ASSERT(ct && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED)); + + /* Local packets: make them go to loopback */ + if (hooknum == NF_INET_LOCAL_OUT) { + newdst = htonl(0x7F000001); + } else { + struct in_device *indev; + struct in_ifaddr *ifa; + + newdst = 0; + + rcu_read_lock(); + indev = __in_dev_get_rcu(skb->dev); + if (indev != NULL) { + ifa = indev->ifa_list; + newdst = ifa->ifa_local; + } + rcu_read_unlock(); + + if (!newdst) + return NF_DROP; + } + + /* Transfer from original range. */ + memset(&newrange.min_addr, 0, sizeof(newrange.min_addr)); + memset(&newrange.max_addr, 0, sizeof(newrange.max_addr)); + newrange.flags = mr->range[0].flags | NF_NAT_RANGE_MAP_IPS; + newrange.min_addr.ip = newdst; + newrange.max_addr.ip = newdst; + newrange.min_proto = mr->range[0].min; + newrange.max_proto = mr->range[0].max; + + /* Hand modified range to generic setup. */ + return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST); +} +EXPORT_SYMBOL_GPL(nf_nat_redirect_ipv4); + +static const struct in6_addr loopback_addr = IN6ADDR_LOOPBACK_INIT; + +unsigned int +nf_nat_redirect_ipv6(struct sk_buff *skb, const struct nf_nat_range *range, + unsigned int hooknum) +{ + struct nf_nat_range newrange; + struct in6_addr newdst; + enum ip_conntrack_info ctinfo; + struct nf_conn *ct; + + ct = nf_ct_get(skb, &ctinfo); + if (hooknum == NF_INET_LOCAL_OUT) { + newdst = loopback_addr; + } else { + struct inet6_dev *idev; + struct inet6_ifaddr *ifa; + bool addr = false; + + rcu_read_lock(); + idev = __in6_dev_get(skb->dev); + if (idev != NULL) { + list_for_each_entry(ifa, &idev->addr_list, if_list) { + newdst = ifa->addr; + addr = true; + break; + } + } + rcu_read_unlock(); + + if (!addr) + return NF_DROP; + } + + newrange.flags = range->flags | NF_NAT_RANGE_MAP_IPS; + newrange.min_addr.in6 = newdst; + newrange.max_addr.in6 = newdst; + newrange.min_proto = range->min_proto; + newrange.max_proto = range->max_proto; + + return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST); +} +EXPORT_SYMBOL_GPL(nf_nat_redirect_ipv6); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>"); diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 66e8425dbfe7..129a8daa4abf 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -2477,7 +2477,7 @@ static int nf_tables_getset(struct sock *nlsk, struct sk_buff *skb, const struct nfgenmsg *nfmsg = nlmsg_data(nlh); int err; - /* Verify existance before starting dump */ + /* Verify existence before starting dump */ err = nft_ctx_init_from_setattr(&ctx, skb, nlh, nla); if (err < 0) return err; @@ -3665,8 +3665,7 @@ static int nf_tables_abort(struct sk_buff *skb) break; 
case NFT_MSG_NEWCHAIN: if (nft_trans_chain_update(trans)) { - if (nft_trans_chain_stats(trans)) - free_percpu(nft_trans_chain_stats(trans)); + free_percpu(nft_trans_chain_stats(trans)); nft_trans_destroy(trans); } else { diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c index 5f1be5ba3559..11d85b3813f2 100644 --- a/net/netfilter/nfnetlink_log.c +++ b/net/netfilter/nfnetlink_log.c @@ -12,6 +12,9 @@ * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include <linux/module.h> #include <linux/skbuff.h> #include <linux/if_arp.h> @@ -337,9 +340,6 @@ nfulnl_alloc_skb(struct net *net, u32 peer_portid, unsigned int inst_size, skb = nfnetlink_alloc_skb(net, pkt_size, peer_portid, GFP_ATOMIC); - if (!skb) - pr_err("nfnetlink_log: can't even alloc %u bytes\n", - pkt_size); } } @@ -570,10 +570,8 @@ __build_packet_message(struct nfnl_log_net *log, struct nlattr *nla; int size = nla_attr_size(data_len); - if (skb_tailroom(inst->skb) < nla_total_size(data_len)) { - printk(KERN_WARNING "nfnetlink_log: no tailroom!\n"); - return -1; - } + if (skb_tailroom(inst->skb) < nla_total_size(data_len)) + goto nla_put_failure; nla = (struct nlattr *)skb_put(inst->skb, nla_total_size(data_len)); nla->nla_type = NFULA_PAYLOAD; @@ -1069,19 +1067,19 @@ static int __init nfnetlink_log_init(void) netlink_register_notifier(&nfulnl_rtnl_notifier); status = nfnetlink_subsys_register(&nfulnl_subsys); if (status < 0) { - pr_err("log: failed to create netlink socket\n"); + pr_err("failed to create netlink socket\n"); goto cleanup_netlink_notifier; } status = nf_log_register(NFPROTO_UNSPEC, &nfulnl_logger); if (status < 0) { - pr_err("log: failed to register logger\n"); + pr_err("failed to register logger\n"); goto cleanup_subsys; } status = register_pernet_subsys(&nfnl_log_net_ops); if (status < 0) { - pr_err("log: failed to register pernet ops\n"); + pr_err("failed to register pernet ops\n"); goto cleanup_logger; } return status; diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c index 8892b7b6184a..1e316ce4cb5d 100644 --- a/net/netfilter/nft_hash.c +++ b/net/netfilter/nft_hash.c @@ -65,7 +65,7 @@ static int nft_hash_insert(const struct nft_set *set, if (set->flags & NFT_SET_MAP) nft_data_copy(he->data, &elem->data); - rhashtable_insert(priv, &he->node, GFP_KERNEL); + rhashtable_insert(priv, &he->node); return 0; } @@ -88,7 +88,7 @@ static void nft_hash_remove(const struct nft_set *set, pprev = elem->cookie; he = rht_dereference((*pprev), priv); - rhashtable_remove_pprev(priv, he, pprev, GFP_KERNEL); + rhashtable_remove_pprev(priv, he, pprev); synchronize_rcu(); kfree(he); @@ -153,10 +153,12 @@ static unsigned int nft_hash_privsize(const struct nlattr * const nla[]) return sizeof(struct rhashtable); } -static int lockdep_nfnl_lock_is_held(void) +#ifdef CONFIG_PROVE_LOCKING +static int lockdep_nfnl_lock_is_held(void *parent) { return lockdep_nfnl_is_held(NFNL_SUBSYS_NFTABLES); } +#endif static int nft_hash_init(const struct nft_set *set, const struct nft_set_desc *desc, @@ -171,7 +173,9 @@ static int nft_hash_init(const struct nft_set *set, .hashfn = jhash, .grow_decision = rht_grow_above_75, .shrink_decision = rht_shrink_below_30, +#ifdef CONFIG_PROVE_LOCKING .mutex_is_held = lockdep_nfnl_lock_is_held, +#endif }; return rhashtable_init(priv, ¶ms); diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c index 1e7c076ca63a..e99911eda915 100644 --- 
a/net/netfilter/nft_meta.c +++ b/net/netfilter/nft_meta.c @@ -165,6 +165,12 @@ void nft_meta_get_eval(const struct nft_expr *expr, goto err; dest->data[0] = out->group; break; + case NFT_META_CGROUP: + if (skb->sk == NULL) + break; + + dest->data[0] = skb->sk->sk_classid; + break; default: WARN_ON(1); goto err; @@ -240,6 +246,7 @@ int nft_meta_get_init(const struct nft_ctx *ctx, case NFT_META_CPU: case NFT_META_IIFGROUP: case NFT_META_OIFGROUP: + case NFT_META_CGROUP: break; default: return -EOPNOTSUPP; diff --git a/net/netfilter/nft_redir.c b/net/netfilter/nft_redir.c new file mode 100644 index 000000000000..9e8093f28311 --- /dev/null +++ b/net/netfilter/nft_redir.c @@ -0,0 +1,99 @@ +/* + * Copyright (c) 2014 Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/module.h> +#include <linux/netlink.h> +#include <linux/netfilter.h> +#include <linux/netfilter/nf_tables.h> +#include <net/netfilter/nf_nat.h> +#include <net/netfilter/nf_tables.h> +#include <net/netfilter/nft_redir.h> + +const struct nla_policy nft_redir_policy[NFTA_REDIR_MAX + 1] = { + [NFTA_REDIR_REG_PROTO_MIN] = { .type = NLA_U32 }, + [NFTA_REDIR_REG_PROTO_MAX] = { .type = NLA_U32 }, + [NFTA_REDIR_FLAGS] = { .type = NLA_U32 }, +}; +EXPORT_SYMBOL_GPL(nft_redir_policy); + +int nft_redir_init(const struct nft_ctx *ctx, + const struct nft_expr *expr, + const struct nlattr * const tb[]) +{ + struct nft_redir *priv = nft_expr_priv(expr); + int err; + + err = nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT); + if (err < 0) + return err; + + if (tb[NFTA_REDIR_REG_PROTO_MIN]) { + priv->sreg_proto_min = + ntohl(nla_get_be32(tb[NFTA_REDIR_REG_PROTO_MIN])); + + err = nft_validate_input_register(priv->sreg_proto_min); + if (err < 0) + return err; + + if (tb[NFTA_REDIR_REG_PROTO_MAX]) { + priv->sreg_proto_max = + ntohl(nla_get_be32(tb[NFTA_REDIR_REG_PROTO_MAX])); + + err = nft_validate_input_register(priv->sreg_proto_max); + if (err < 0) + return err; + } else { + priv->sreg_proto_max = priv->sreg_proto_min; + } + } + + if (tb[NFTA_REDIR_FLAGS]) { + priv->flags = ntohl(nla_get_be32(tb[NFTA_REDIR_FLAGS])); + if (priv->flags & ~NF_NAT_RANGE_MASK) + return -EINVAL; + } + + return 0; +} +EXPORT_SYMBOL_GPL(nft_redir_init); + +int nft_redir_dump(struct sk_buff *skb, const struct nft_expr *expr) +{ + const struct nft_redir *priv = nft_expr_priv(expr); + + if (priv->sreg_proto_min) { + if (nla_put_be32(skb, NFTA_REDIR_REG_PROTO_MIN, + htonl(priv->sreg_proto_min))) + goto nla_put_failure; + if (nla_put_be32(skb, NFTA_REDIR_REG_PROTO_MAX, + htonl(priv->sreg_proto_max))) + goto nla_put_failure; + } + + if (priv->flags != 0 && + nla_put_be32(skb, NFTA_REDIR_FLAGS, htonl(priv->flags))) + goto nla_put_failure; + + return 0; + +nla_put_failure: + return -1; +} +EXPORT_SYMBOL_GPL(nft_redir_dump); + +int nft_redir_validate(const struct nft_ctx *ctx, const struct nft_expr *expr, + const struct nft_data **data) +{ + return nft_chain_validate_dependency(ctx->chain, NFT_CHAIN_T_NAT); +} +EXPORT_SYMBOL_GPL(nft_redir_validate); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>"); diff --git a/net/netfilter/xt_DSCP.c b/net/netfilter/xt_DSCP.c index ae8271652efa..3f83d38c4e5b 100644 --- a/net/netfilter/xt_DSCP.c +++ b/net/netfilter/xt_DSCP.c 
@@ -37,7 +37,8 @@ dscp_tg(struct sk_buff *skb, const struct xt_action_param *par) if (!skb_make_writable(skb, sizeof(struct iphdr))) return NF_DROP; - ipv4_change_dsfield(ip_hdr(skb), (__u8)(~XT_DSCP_MASK), + ipv4_change_dsfield(ip_hdr(skb), + (__force __u8)(~XT_DSCP_MASK), dinfo->dscp << XT_DSCP_SHIFT); } @@ -54,7 +55,8 @@ dscp_tg6(struct sk_buff *skb, const struct xt_action_param *par) if (!skb_make_writable(skb, sizeof(struct ipv6hdr))) return NF_DROP; - ipv6_change_dsfield(ipv6_hdr(skb), (__u8)(~XT_DSCP_MASK), + ipv6_change_dsfield(ipv6_hdr(skb), + (__force __u8)(~XT_DSCP_MASK), dinfo->dscp << XT_DSCP_SHIFT); } return XT_CONTINUE; diff --git a/net/netfilter/xt_REDIRECT.c b/net/netfilter/xt_REDIRECT.c index 22a10309297c..03f0b370e178 100644 --- a/net/netfilter/xt_REDIRECT.c +++ b/net/netfilter/xt_REDIRECT.c @@ -26,48 +26,12 @@ #include <net/checksum.h> #include <net/protocol.h> #include <net/netfilter/nf_nat.h> - -static const struct in6_addr loopback_addr = IN6ADDR_LOOPBACK_INIT; +#include <net/netfilter/nf_nat_redirect.h> static unsigned int redirect_tg6(struct sk_buff *skb, const struct xt_action_param *par) { - const struct nf_nat_range *range = par->targinfo; - struct nf_nat_range newrange; - struct in6_addr newdst; - enum ip_conntrack_info ctinfo; - struct nf_conn *ct; - - ct = nf_ct_get(skb, &ctinfo); - if (par->hooknum == NF_INET_LOCAL_OUT) - newdst = loopback_addr; - else { - struct inet6_dev *idev; - struct inet6_ifaddr *ifa; - bool addr = false; - - rcu_read_lock(); - idev = __in6_dev_get(skb->dev); - if (idev != NULL) { - list_for_each_entry(ifa, &idev->addr_list, if_list) { - newdst = ifa->addr; - addr = true; - break; - } - } - rcu_read_unlock(); - - if (!addr) - return NF_DROP; - } - - newrange.flags = range->flags | NF_NAT_RANGE_MAP_IPS; - newrange.min_addr.in6 = newdst; - newrange.max_addr.in6 = newdst; - newrange.min_proto = range->min_proto; - newrange.max_proto = range->max_proto; - - return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST); + return nf_nat_redirect_ipv6(skb, par->targinfo, par->hooknum); } static int redirect_tg6_checkentry(const struct xt_tgchk_param *par) @@ -98,48 +62,7 @@ static int redirect_tg4_check(const struct xt_tgchk_param *par) static unsigned int redirect_tg4(struct sk_buff *skb, const struct xt_action_param *par) { - struct nf_conn *ct; - enum ip_conntrack_info ctinfo; - __be32 newdst; - const struct nf_nat_ipv4_multi_range_compat *mr = par->targinfo; - struct nf_nat_range newrange; - - NF_CT_ASSERT(par->hooknum == NF_INET_PRE_ROUTING || - par->hooknum == NF_INET_LOCAL_OUT); - - ct = nf_ct_get(skb, &ctinfo); - NF_CT_ASSERT(ct && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED)); - - /* Local packets: make them go to loopback */ - if (par->hooknum == NF_INET_LOCAL_OUT) - newdst = htonl(0x7F000001); - else { - struct in_device *indev; - struct in_ifaddr *ifa; - - newdst = 0; - - rcu_read_lock(); - indev = __in_dev_get_rcu(skb->dev); - if (indev && (ifa = indev->ifa_list)) - newdst = ifa->ifa_local; - rcu_read_unlock(); - - if (!newdst) - return NF_DROP; - } - - /* Transfer from original range. */ - memset(&newrange.min_addr, 0, sizeof(newrange.min_addr)); - memset(&newrange.max_addr, 0, sizeof(newrange.max_addr)); - newrange.flags = mr->range[0].flags | NF_NAT_RANGE_MAP_IPS; - newrange.min_addr.ip = newdst; - newrange.max_addr.ip = newdst; - newrange.min_proto = mr->range[0].min; - newrange.max_proto = mr->range[0].max; - - /* Hand modified range to generic setup. 
*/ - return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST); + return nf_nat_redirect_ipv4(skb, par->targinfo, par->hooknum); } static struct xt_target redirect_tg_reg[] __read_mostly = { diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c index fbc66bb250d5..29ba6218a820 100644 --- a/net/netfilter/xt_connlimit.c +++ b/net/netfilter/xt_connlimit.c @@ -134,6 +134,7 @@ static bool add_hlist(struct hlist_head *head, static unsigned int check_hlist(struct net *net, struct hlist_head *head, const struct nf_conntrack_tuple *tuple, + u16 zone, bool *addit) { const struct nf_conntrack_tuple_hash *found; @@ -147,8 +148,7 @@ static unsigned int check_hlist(struct net *net, /* check the saved connections */ hlist_for_each_entry_safe(conn, n, head, node) { - found = nf_conntrack_find_get(net, NF_CT_DEFAULT_ZONE, - &conn->tuple); + found = nf_conntrack_find_get(net, zone, &conn->tuple); if (found == NULL) { hlist_del(&conn->node); kmem_cache_free(connlimit_conn_cachep, conn); @@ -201,7 +201,7 @@ static unsigned int count_tree(struct net *net, struct rb_root *root, const struct nf_conntrack_tuple *tuple, const union nf_inet_addr *addr, const union nf_inet_addr *mask, - u8 family) + u8 family, u16 zone) { struct xt_connlimit_rb *gc_nodes[CONNLIMIT_GC_MAX_NODES]; struct rb_node **rbnode, *parent; @@ -229,7 +229,7 @@ count_tree(struct net *net, struct rb_root *root, } else { /* same source network -> be counted! */ unsigned int count; - count = check_hlist(net, &rbconn->hhead, tuple, &addit); + count = check_hlist(net, &rbconn->hhead, tuple, zone, &addit); tree_nodes_free(root, gc_nodes, gc_count); if (!addit) @@ -245,7 +245,7 @@ count_tree(struct net *net, struct rb_root *root, continue; /* only used for GC on hhead, retval and 'addit' ignored */ - check_hlist(net, &rbconn->hhead, tuple, &addit); + check_hlist(net, &rbconn->hhead, tuple, zone, &addit); if (hlist_empty(&rbconn->hhead)) gc_nodes[gc_count++] = rbconn; } @@ -290,7 +290,7 @@ static int count_them(struct net *net, const struct nf_conntrack_tuple *tuple, const union nf_inet_addr *addr, const union nf_inet_addr *mask, - u_int8_t family) + u_int8_t family, u16 zone) { struct rb_root *root; int count; @@ -306,7 +306,7 @@ static int count_them(struct net *net, spin_lock_bh(&xt_connlimit_locks[hash % CONNLIMIT_LOCK_SLOTS]); - count = count_tree(net, root, tuple, addr, mask, family); + count = count_tree(net, root, tuple, addr, mask, family, zone); spin_unlock_bh(&xt_connlimit_locks[hash % CONNLIMIT_LOCK_SLOTS]); @@ -324,13 +324,16 @@ connlimit_mt(const struct sk_buff *skb, struct xt_action_param *par) enum ip_conntrack_info ctinfo; const struct nf_conn *ct; unsigned int connections; + u16 zone = NF_CT_DEFAULT_ZONE; ct = nf_ct_get(skb, &ctinfo); - if (ct != NULL) + if (ct != NULL) { tuple_ptr = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; - else if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), - par->family, &tuple)) + zone = nf_ct_zone(ct); + } else if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), + par->family, &tuple)) { goto hotdrop; + } if (par->family == NFPROTO_IPV6) { const struct ipv6hdr *iph = ipv6_hdr(skb); @@ -343,7 +346,7 @@ connlimit_mt(const struct sk_buff *skb, struct xt_action_param *par) } connections = count_them(net, info->data, tuple_ptr, &addr, - &info->mask, par->family); + &info->mask, par->family, zone); if (connections == 0) /* kmalloc failed, drop it entirely */ goto hotdrop; diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c index a9faae89f955..30dbe34915ae 100644 --- 
a/net/netfilter/xt_recent.c +++ b/net/netfilter/xt_recent.c @@ -43,25 +43,29 @@ MODULE_LICENSE("GPL"); MODULE_ALIAS("ipt_recent"); MODULE_ALIAS("ip6t_recent"); -static unsigned int ip_list_tot = 100; -static unsigned int ip_pkt_list_tot = 20; -static unsigned int ip_list_hash_size = 0; -static unsigned int ip_list_perms = 0644; -static unsigned int ip_list_uid = 0; -static unsigned int ip_list_gid = 0; +static unsigned int ip_list_tot __read_mostly = 100; +static unsigned int ip_list_hash_size __read_mostly; +static unsigned int ip_list_perms __read_mostly = 0644; +static unsigned int ip_list_uid __read_mostly; +static unsigned int ip_list_gid __read_mostly; module_param(ip_list_tot, uint, 0400); -module_param(ip_pkt_list_tot, uint, 0400); module_param(ip_list_hash_size, uint, 0400); module_param(ip_list_perms, uint, 0400); module_param(ip_list_uid, uint, S_IRUGO | S_IWUSR); module_param(ip_list_gid, uint, S_IRUGO | S_IWUSR); MODULE_PARM_DESC(ip_list_tot, "number of IPs to remember per list"); -MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP address to remember (max. 255)"); MODULE_PARM_DESC(ip_list_hash_size, "size of hash table used to look up IPs"); MODULE_PARM_DESC(ip_list_perms, "permissions on /proc/net/xt_recent/* files"); MODULE_PARM_DESC(ip_list_uid, "default owner of /proc/net/xt_recent/* files"); MODULE_PARM_DESC(ip_list_gid, "default owning group of /proc/net/xt_recent/* files"); +/* retained for backwards compatibility */ +static unsigned int ip_pkt_list_tot __read_mostly; +module_param(ip_pkt_list_tot, uint, 0400); +MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP address to remember (max. 255)"); + +#define XT_RECENT_MAX_NSTAMPS 256 + struct recent_entry { struct list_head list; struct list_head lru_list; @@ -79,6 +83,7 @@ struct recent_table { union nf_inet_addr mask; unsigned int refcnt; unsigned int entries; + u8 nstamps_max_mask; struct list_head lru_list; struct list_head iphash[0]; }; @@ -90,7 +95,8 @@ struct recent_net { #endif }; -static int recent_net_id; +static int recent_net_id __read_mostly; + static inline struct recent_net *recent_pernet(struct net *net) { return net_generic(net, recent_net_id); @@ -171,12 +177,15 @@ recent_entry_init(struct recent_table *t, const union nf_inet_addr *addr, u_int16_t family, u_int8_t ttl) { struct recent_entry *e; + unsigned int nstamps_max = t->nstamps_max_mask; if (t->entries >= ip_list_tot) { e = list_entry(t->lru_list.next, struct recent_entry, lru_list); recent_entry_remove(t, e); } - e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * ip_pkt_list_tot, + + nstamps_max += 1; + e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * nstamps_max, GFP_ATOMIC); if (e == NULL) return NULL; @@ -197,7 +206,7 @@ recent_entry_init(struct recent_table *t, const union nf_inet_addr *addr, static void recent_entry_update(struct recent_table *t, struct recent_entry *e) { - e->index %= ip_pkt_list_tot; + e->index &= t->nstamps_max_mask; e->stamps[e->index++] = jiffies; if (e->index > e->nstamps) e->nstamps = e->index; @@ -326,6 +335,7 @@ static int recent_mt_check(const struct xt_mtchk_param *par, kuid_t uid; kgid_t gid; #endif + unsigned int nstamp_mask; unsigned int i; int ret = -EINVAL; size_t sz; @@ -349,19 +359,33 @@ static int recent_mt_check(const struct xt_mtchk_param *par, return -EINVAL; if ((info->check_set & XT_RECENT_REAP) && !info->seconds) return -EINVAL; - if (info->hit_count > ip_pkt_list_tot) { - pr_info("hitcount (%u) is larger than " - "packets to be remembered (%u)\n", - info->hit_count, ip_pkt_list_tot); + 
if (info->hit_count >= XT_RECENT_MAX_NSTAMPS) { + pr_info("hitcount (%u) is larger than allowed maximum (%u)\n", + info->hit_count, XT_RECENT_MAX_NSTAMPS - 1); return -EINVAL; } if (info->name[0] == '\0' || strnlen(info->name, XT_RECENT_NAME_LEN) == XT_RECENT_NAME_LEN) return -EINVAL; + if (ip_pkt_list_tot && info->hit_count < ip_pkt_list_tot) + nstamp_mask = roundup_pow_of_two(ip_pkt_list_tot) - 1; + else if (info->hit_count) + nstamp_mask = roundup_pow_of_two(info->hit_count) - 1; + else + nstamp_mask = 32 - 1; + mutex_lock(&recent_mutex); t = recent_table_lookup(recent_net, info->name); if (t != NULL) { + if (info->hit_count > t->nstamps_max_mask) { + pr_info("hitcount (%u) is larger than packets to be remembered (%u) for table %s\n", + info->hit_count, t->nstamps_max_mask + 1, + info->name); + ret = -EINVAL; + goto out; + } + t->refcnt++; ret = 0; goto out; @@ -377,6 +401,7 @@ static int recent_mt_check(const struct xt_mtchk_param *par, goto out; } t->refcnt = 1; + t->nstamps_max_mask = nstamp_mask; memcpy(&t->mask, &info->mask, sizeof(t->mask)); strcpy(t->name, info->name); @@ -497,9 +522,12 @@ static void recent_seq_stop(struct seq_file *s, void *v) static int recent_seq_show(struct seq_file *seq, void *v) { const struct recent_entry *e = v; + struct recent_iter_state *st = seq->private; + const struct recent_table *t = st->table; unsigned int i; - i = (e->index - 1) % ip_pkt_list_tot; + i = (e->index - 1) & t->nstamps_max_mask; + if (e->family == NFPROTO_IPV4) seq_printf(seq, "src=%pI4 ttl: %u last_seen: %lu oldest_pkt: %u", &e->addr.ip, e->ttl, e->stamps[i], e->index); @@ -717,7 +745,9 @@ static int __init recent_mt_init(void) { int err; - if (!ip_list_tot || !ip_pkt_list_tot || ip_pkt_list_tot > 255) + BUILD_BUG_ON_NOT_POWER_OF_2(XT_RECENT_MAX_NSTAMPS); + + if (!ip_list_tot || ip_pkt_list_tot >= XT_RECENT_MAX_NSTAMPS) return -EINVAL; ip_list_hash_size = 1 << fls(ip_list_tot); diff --git a/net/netfilter/xt_set.c b/net/netfilter/xt_set.c index 5732cd64acc0..0d47afea9682 100644 --- a/net/netfilter/xt_set.c +++ b/net/netfilter/xt_set.c @@ -157,7 +157,7 @@ set_match_v1_destroy(const struct xt_mtdtor_param *par) /* Revision 3 match */ static bool -match_counter(u64 counter, const struct ip_set_counter_match *info) +match_counter0(u64 counter, const struct ip_set_counter_match0 *info) { switch (info->op) { case IPSET_COUNTER_NONE: @@ -192,14 +192,60 @@ set_match_v3(const struct sk_buff *skb, struct xt_action_param *par) if (!(ret && opt.cmdflags & IPSET_FLAG_MATCH_COUNTERS)) return ret; - if (!match_counter(opt.ext.packets, &info->packets)) + if (!match_counter0(opt.ext.packets, &info->packets)) return 0; - return match_counter(opt.ext.bytes, &info->bytes); + return match_counter0(opt.ext.bytes, &info->bytes); } #define set_match_v3_checkentry set_match_v1_checkentry #define set_match_v3_destroy set_match_v1_destroy +/* Revision 4 match */ + +static bool +match_counter(u64 counter, const struct ip_set_counter_match *info) +{ + switch (info->op) { + case IPSET_COUNTER_NONE: + return true; + case IPSET_COUNTER_EQ: + return counter == info->value; + case IPSET_COUNTER_NE: + return counter != info->value; + case IPSET_COUNTER_LT: + return counter < info->value; + case IPSET_COUNTER_GT: + return counter > info->value; + } + return false; +} + +static bool +set_match_v4(const struct sk_buff *skb, struct xt_action_param *par) +{ + const struct xt_set_info_match_v4 *info = par->matchinfo; + ADT_OPT(opt, par->family, info->match_set.dim, + info->match_set.flags, info->flags, UINT_MAX); + int ret; 
+ + if (info->packets.op != IPSET_COUNTER_NONE || + info->bytes.op != IPSET_COUNTER_NONE) + opt.cmdflags |= IPSET_FLAG_MATCH_COUNTERS; + + ret = match_set(info->match_set.index, skb, par, &opt, + info->match_set.flags & IPSET_INV_MATCH); + + if (!(ret && opt.cmdflags & IPSET_FLAG_MATCH_COUNTERS)) + return ret; + + if (!match_counter(opt.ext.packets, &info->packets)) + return 0; + return match_counter(opt.ext.bytes, &info->bytes); +} + +#define set_match_v4_checkentry set_match_v1_checkentry +#define set_match_v4_destroy set_match_v1_destroy + /* Revision 0 interface: backward compatible with netfilter/iptables */ static unsigned int @@ -573,6 +619,27 @@ static struct xt_match set_matches[] __read_mostly = { .destroy = set_match_v3_destroy, .me = THIS_MODULE }, + /* new revision for counters support: update, match */ + { + .name = "set", + .family = NFPROTO_IPV4, + .revision = 4, + .match = set_match_v4, + .matchsize = sizeof(struct xt_set_info_match_v4), + .checkentry = set_match_v4_checkentry, + .destroy = set_match_v4_destroy, + .me = THIS_MODULE + }, + { + .name = "set", + .family = NFPROTO_IPV6, + .revision = 4, + .match = set_match_v4, + .matchsize = sizeof(struct xt_set_info_match_v4), + .checkentry = set_match_v4_checkentry, + .destroy = set_match_v4_destroy, + .me = THIS_MODULE + }, }; static struct xt_target set_targets[] __read_mostly = { |