From 27caca9d2e01c92b26d0690f065aad093fea01c7 Mon Sep 17 00:00:00 2001 From: Alexander Kochetkov Date: Tue, 18 Nov 2014 21:00:58 +0400 Subject: i2c: omap: fix NACK and Arbitration Lost irq handling commit 1d7afc95946487945cc7f5019b41255b72224b70 (i2c: omap: ack IRQ in parts) changed the interrupt handler to complete transfers without clearing XRDY (AL case) and ARDY (NACK case) flags. XRDY or ARDY interrupts will be fired again. As a result, ISR keep processing transfer after it was already complete (from the driver code point of view). A didn't see real impacts of the 1d7afc9, but it is really bad idea to have ISR running on user data after transfer was complete. It looks, what 1d7afc9 violate TI specs in what how AL and NACK should be handled (see Note 1, sprugn4r, Figure 17-31 and Figure 17-32). According to specs (if I understood correctly), in case of NACK and AL driver must reset NACK, AL, ARDY, RDR, and RRDY (Master Receive Mode), and NACK, AL, ARDY, and XDR (Master Transmitter Mode). All that is done down the code under the if condition: if (stat & (OMAP_I2C_STAT_ARDY | OMAP_I2C_STAT_NACK | OMAP_I2C_STAT_AL)) ... The patch restore pre 1d7afc9 logic of handling NACK and AL interrupts, so no interrupts is fired after ISR informs the rest of driver what transfer complete. Note: instead of removing break under NACK case, we could just replace 'break' with 'continue' and allow NACK transfer to finish using ARDY event. I found that NACK and ARDY bits usually set together. That case confirm TI wiki: http://processors.wiki.ti.com/index.php/I2C_Tips#Detecting_and_handling_NACK In order if someone interested in the event traces for NACK and AL cases, I sent them to mailing list. Tested on Beagleboard XM C. Signed-off-by: Alexander Kochetkov Fixes: 1d7afc9 i2c: omap: ack IRQ in parts Cc: # v3.7+ Acked-by: Felipe Balbi Tested-by: Aaro Koskinen Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-omap.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c index 26942c159de1..32dc65183ea8 100644 --- a/drivers/i2c/busses/i2c-omap.c +++ b/drivers/i2c/busses/i2c-omap.c @@ -922,14 +922,12 @@ omap_i2c_isr_thread(int this_irq, void *dev_id) if (stat & OMAP_I2C_STAT_NACK) { err |= OMAP_I2C_STAT_NACK; omap_i2c_ack_stat(dev, OMAP_I2C_STAT_NACK); - break; } if (stat & OMAP_I2C_STAT_AL) { dev_err(dev->dev, "Arbitration lost\n"); err |= OMAP_I2C_STAT_AL; omap_i2c_ack_stat(dev, OMAP_I2C_STAT_AL); - break; } /* -- cgit v1.2.3 From d39f77b06a712fcba6185a20bb209e357923d980 Mon Sep 17 00:00:00 2001 From: Andrew Jackson Date: Fri, 7 Nov 2014 12:10:44 +0000 Subject: i2c: designware: prevent early stop on TX FIFO empty If the Designware core is configured with IC_EMPTYFIFO_HOLD_MASTER_EN set to zero, allowing the TX FIFO to become empty causes a STOP condition to be generated on the I2C bus. If the transmit FIFO threshold is set too high, an erroneous STOP condition can be generated on long transfers - particularly where the interrupt latency is extended. Signed-off-by: Andrew Jackson Signed-off-by: Liviu Dudau Tested-by: Mika Westerberg Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-designware-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-designware-core.c b/drivers/i2c/busses/i2c-designware-core.c index edca99dbba23..23628b7bfb8d 100644 --- a/drivers/i2c/busses/i2c-designware-core.c +++ b/drivers/i2c/busses/i2c-designware-core.c @@ -359,7 +359,7 @@ int i2c_dw_init(struct dw_i2c_dev *dev) } /* Configure Tx/Rx FIFO threshold levels */ - dw_writel(dev, dev->tx_fifo_depth - 1, DW_IC_TX_TL); + dw_writel(dev, dev->tx_fifo_depth / 2, DW_IC_TX_TL); dw_writel(dev, 0, DW_IC_RX_TL); /* configure the i2c master */ -- cgit v1.2.3 From ccfc866356674cb3a61829d239c685af6e85f197 Mon Sep 17 00:00:00 2001 From: Alexander Kochetkov Date: Fri, 21 Nov 2014 04:16:51 +0400 Subject: i2c: omap: fix i207 errata handling commit 6d9939f651419a63e091105663821f9c7d3fec37 (i2c: omap: split out [XR]DR and [XR]RDY) changed the way how errata i207 (I2C: RDR Flag May Be Incorrectly Set) get handled. 6d9939f6514 code doesn't correspond to workaround provided by errata. According to errata ISR must filter out spurious RDR before data read not after. ISR must read RXSTAT to get number of bytes available to read. Because RDR could be set while there could no data in the receive FIFO. Restored pre 6d9939f6514 way of handling errata. Found by code review. Real impact haven't seen. Tested on Beagleboard XM C. Signed-off-by: Alexander Kochetkov Fixes: 6d9939f651419a63e09110 i2c: omap: split out [XR]DR and [XR]RDY Tested-by: Felipe Balbi Reviewed-by: Felipe Balbi Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-omap.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c index 32dc65183ea8..277a2288d4a8 100644 --- a/drivers/i2c/busses/i2c-omap.c +++ b/drivers/i2c/busses/i2c-omap.c @@ -952,11 +952,13 @@ omap_i2c_isr_thread(int this_irq, void *dev_id) if (dev->fifo_size) num_bytes = dev->buf_len; - omap_i2c_receive_data(dev, num_bytes, true); - - if (dev->errata & I2C_OMAP_ERRATA_I207) + if (dev->errata & I2C_OMAP_ERRATA_I207) { i2c_omap_errata_i207(dev, stat); + num_bytes = (omap_i2c_read_reg(dev, + OMAP_I2C_BUFSTAT_REG) >> 8) & 0x3F; + } + omap_i2c_receive_data(dev, num_bytes, true); omap_i2c_ack_stat(dev, OMAP_I2C_STAT_RDR); continue; } -- cgit v1.2.3 From bdfa7542d40e6251c232a802231b37116bd31b11 Mon Sep 17 00:00:00 2001 From: Ville Syrjälä Date: Tue, 27 May 2014 21:33:09 +0300 Subject: drm/i915: Ignore SURFLIVE and flip counter when the GPU gets reset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit During a GPU reset we need to get pending page flip cleared out since the ring contents are gone and flip will never complete on its own. This used to work until the mmio vs. CS flip race detection came about. That piece of code is looking for a specific surface address in the SURFLIVE register, but as a flip to that address may never happen the check may never pass. So we should just skip the SURFLIVE and flip counter checks when the GPU gets reset. intel_display_handle_reset() tries to effectively complete the flip anyway by calling .update_primary_plane(). But that may not satisfy the conditions of the mmio vs. CS race detection since there's no guarantee that a modeset didn't sneak in between the GPU reset and intel_display_handle_reset(). Such a modeset will not wait for pending flips due to the ongoing GPU reset, and then the primary plane updates performed by intel_display_handle_reset() will already use the new surface address, and thus the surface address the flip is waiting for might never appear in SURFLIVE. The result is that the flip will never complete and attempts to perform further page flips will fail with -EBUSY. During the GPU reset intel_crtc_has_pending_flip() will return false regardless, so the deadlock with a modeset vs. the error work acquiring crtc->mutex was avoided. And the reset_counter check in intel_crtc_has_pending_flip() actually made this bug even less severe since it allowed normal modesets to go through even though there's a pending flip. This is a regression introduced by me here: commit 75f7f3ec600524c9544cc31695155f1a9ddbe1d9 Author: Ville Syrjälä Date: Tue Apr 15 21:41:34 2014 +0300 drm/i915: Fix mmio vs. CS flip race on ILK+ Testcase: igt/kms_flip/flip-vs-panning-vs-hang Signed-off-by: Ville Syrjälä Reviewed-by: Chris Wilson Reviewed-by: Daniel Vetter Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/intel_display.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index f0a1a56406eb..8bcdb981d540 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9408,6 +9408,10 @@ static bool page_flip_finished(struct intel_crtc *crtc) struct drm_device *dev = crtc->base.dev; struct drm_i915_private *dev_priv = dev->dev_private; + if (i915_reset_in_progress(&dev_priv->gpu_error) || + crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) + return true; + /* * The relevant registers doen't exist on pre-ctg. * As the flip done interrupt doesn't trigger for mmio -- cgit v1.2.3 From afa4e53a7bcd4328d88e25c7a63746b65dc6bbe2 Mon Sep 17 00:00:00 2001 From: Ville Syrjälä Date: Tue, 25 Nov 2014 15:43:48 +0200 Subject: drm/i915: Cancel vdd off work before suspend MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently we just make sure vdd is off before suspending, but we don't cancel the vdd off work. The work wil not touch vdd if want_panel_vdd==false so in theory this is fine. In the past that was perfectly fine since the vdd off work didn't do anything when want_panel_vdd==false, so even if the work would have been run during system resume before i915 has resumed, nothing would happen. However since pps_lock() will now grab the power domain references before it can check want_panel_vdd, we may end up toggling the power wells on/off already before the driver has resumed. That is not really acceptable, so cancel the vdd off work when suspending the encoder. The problem appeared when pps_lock() was introduced in: commit 773538e86081d146e0020435d614f4b96996c1f9 Author: Ville Syrjälä Date: Thu Sep 4 14:54:56 2014 +0300 drm/i915: Reset power sequencer pipe tracking when disp2d is off Signed-off-by: Ville Syrjälä Reviewed-by: Imre Deak Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/intel_dp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 5ad45bfff3fe..4bcd91757321 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -4450,6 +4450,7 @@ static void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder) * vdd might still be enabled do to the delayed vdd off. * Make sure vdd is actually turned off here. */ + cancel_delayed_work_sync(&intel_dp->panel_vdd_work); pps_lock(intel_dp); edp_panel_vdd_off_sync(intel_dp); pps_unlock(intel_dp); -- cgit v1.2.3 From 094cb98179f19b75acf9ff471daabf3948ce98e6 Mon Sep 17 00:00:00 2001 From: Ian Campbell Date: Tue, 25 Nov 2014 15:05:13 +0000 Subject: of/fdt: memblock_reserve /memreserve/ regions in the case of partial overlap memblock_is_region_reserved() returns true in the case of a partial overlap, meaning that the current code fails to reserve the non-overlapping portion. This call was introduced as part of d1552ce449eb "of/fdt: move memreserve and dtb memory reservations into core" which went into v3.16. I observed this causing a Midway system with a buggy fdt (the header declares itself to be larger than it really is) failing to boot because the over-inflated size of the fdt was causing it to seem to run into the swapper_pg_dir region, meaning the DT wasn't reserved. The symptoms were failing to find an disks or network and failing to boot. However given the ambiguity of whether things like the initrd are covered by /memreserve/ and similar I think it is best to also register the region rather than just ignoring it. Since memblock_reserve() handles overlaps just fine lets just warn and carry on. Signed-off-by: Ian Campbell Signed-off-by: Grant Likely Cc: Rob Herring Cc: stable@vger.kernel.org # v3.16+ --- drivers/of/fdt.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index 30e97bcc4f88..d134710de96d 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -964,8 +964,6 @@ void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size) int __init __weak early_init_dt_reserve_memory_arch(phys_addr_t base, phys_addr_t size, bool nomap) { - if (memblock_is_region_reserved(base, size)) - return -EBUSY; if (nomap) return memblock_remove(base, size); return memblock_reserve(base, size); -- cgit v1.2.3 From 1348579433566355e570008929daa09a0db64fd8 Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Fri, 14 Nov 2014 12:08:34 -0500 Subject: drm/radeon: report disconnected for LVDS/eDP with PX if ddc fails If ddc fails, presumably the i2c mux (and hopefully the signal mux) are switched to the other GPU so don't fetch the edid from the vbios so that the connector reports disconnected. bug: https://bugzilla.opensuse.org/show_bug.cgi?id=904417 Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/radeon_connectors.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c index 300c4b3d4669..26baa9c05f6c 100644 --- a/drivers/gpu/drm/radeon/radeon_connectors.c +++ b/drivers/gpu/drm/radeon/radeon_connectors.c @@ -322,6 +322,12 @@ static void radeon_connector_get_edid(struct drm_connector *connector) } if (!radeon_connector->edid) { + /* don't fetch the edid from the vbios if ddc fails and runpm is + * enabled so we report disconnected. + */ + if ((rdev->flags & RADEON_IS_PX) && (radeon_runtime_pm != 0)) + return; + if (rdev->is_atom_bios) { /* some laptops provide a hardcoded edid in rom for LCDs */ if (((connector->connector_type == DRM_MODE_CONNECTOR_LVDS) || @@ -826,6 +832,8 @@ static int radeon_lvds_mode_valid(struct drm_connector *connector, static enum drm_connector_status radeon_lvds_detect(struct drm_connector *connector, bool force) { + struct drm_device *dev = connector->dev; + struct radeon_device *rdev = dev->dev_private; struct radeon_connector *radeon_connector = to_radeon_connector(connector); struct drm_encoder *encoder = radeon_best_single_encoder(connector); enum drm_connector_status ret = connector_status_disconnected; @@ -842,7 +850,11 @@ radeon_lvds_detect(struct drm_connector *connector, bool force) /* check if panel is valid */ if (native_mode->hdisplay >= 320 && native_mode->vdisplay >= 240) ret = connector_status_connected; - + /* don't fetch the edid from the vbios if ddc fails and runpm is + * enabled so we report disconnected. + */ + if ((rdev->flags & RADEON_IS_PX) && (radeon_runtime_pm != 0)) + ret = connector_status_disconnected; } /* check for edid as well */ @@ -1589,6 +1601,11 @@ radeon_dp_detect(struct drm_connector *connector, bool force) /* check if panel is valid */ if (native_mode->hdisplay >= 320 && native_mode->vdisplay >= 240) ret = connector_status_connected; + /* don't fetch the edid from the vbios if ddc fails and runpm is + * enabled so we report disconnected. + */ + if ((rdev->flags & RADEON_IS_PX) && (radeon_runtime_pm != 0)) + ret = connector_status_disconnected; } /* eDP is always DP */ radeon_dig_connector->dp_sink_type = CONNECTOR_OBJECT_ID_DISPLAYPORT; -- cgit v1.2.3 From f6c6fda4c9e17940b0a2ba206b0408babfdc930c Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Thu, 27 Nov 2014 00:22:33 +0100 Subject: bond: Check length of IFLA_BOND_ARP_IP_TARGET attributes Fixes: 7f28fa10 ("bonding: add arp_ip_target netlink support") Reported-by: John Fastabend Cc: Scott Feldman Signed-off-by: Thomas Graf Acked-by: John Fastabend Signed-off-by: David S. Miller --- drivers/net/bonding/bond_netlink.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c index c13d83e15ace..45f09a66e6c9 100644 --- a/drivers/net/bonding/bond_netlink.c +++ b/drivers/net/bonding/bond_netlink.c @@ -225,7 +225,12 @@ static int bond_changelink(struct net_device *bond_dev, bond_option_arp_ip_targets_clear(bond); nla_for_each_nested(attr, data[IFLA_BOND_ARP_IP_TARGET], rem) { - __be32 target = nla_get_be32(attr); + __be32 target; + + if (nla_len(attr) < sizeof(target)) + return -EINVAL; + + target = nla_get_be32(attr); bond_opt_initval(&newval, (__force u64)target); err = __bond_opt_set(bond, BOND_OPT_ARP_TARGETS, -- cgit v1.2.3 From e0ebde0e131b529fd721b24f62872def5ec3718c Mon Sep 17 00:00:00 2001 From: Nicolas Dichtel Date: Thu, 27 Nov 2014 10:16:15 +0100 Subject: rtnetlink: release net refcnt on error in do_setlink() rtnl_link_get_net() holds a reference on the 'struct net', we need to release it in case of error. CC: Eric W. Biederman Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users") Signed-off-by: Nicolas Dichtel Reviewed-by: "Eric W. Biederman" Signed-off-by: David S. Miller --- net/core/rtnetlink.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index b9b7dfaf202b..76321ea442c3 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1498,6 +1498,7 @@ static int do_setlink(const struct sk_buff *skb, goto errout; } if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) { + put_net(net); err = -EPERM; goto errout; } -- cgit v1.2.3 From 4d6a949c62f123569fb355b6ec7f314b76f93735 Mon Sep 17 00:00:00 2001 From: Mitsuhiro Kimura Date: Thu, 27 Nov 2014 20:34:00 +0900 Subject: sh_eth: Fix skb alloc size and alignment adjust rule. In the current driver, allocation size of skb does not care the alignment adjust after allocation. And also, in the current implementation, buffer alignment method by sh_eth_set_receive_align function has a bug that this function displace buffer start address forcedly when the alignment is corrected. In the result, tail of the skb will exceed allocated area and kernel panic will be occurred. This patch fix this issue. Signed-off-by: Mitsuhiro Kimura Signed-off-by: Yoshihiro Kaneko Signed-off-by: David S. Miller --- drivers/net/ethernet/renesas/sh_eth.c | 32 +++++++++++++------------------- drivers/net/ethernet/renesas/sh_eth.h | 4 ++-- 2 files changed, 15 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c index 60e9c2cd051e..f9e30b87ecd5 100644 --- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -917,21 +917,13 @@ static int sh_eth_reset(struct net_device *ndev) return ret; } -#if defined(CONFIG_CPU_SH4) || defined(CONFIG_ARCH_SHMOBILE) static void sh_eth_set_receive_align(struct sk_buff *skb) { - int reserve; + uintptr_t reserve = (uintptr_t)skb->data & (SH_ETH_RX_ALIGN - 1); - reserve = SH4_SKB_RX_ALIGN - ((u32)skb->data & (SH4_SKB_RX_ALIGN - 1)); if (reserve) - skb_reserve(skb, reserve); + skb_reserve(skb, SH_ETH_RX_ALIGN - reserve); } -#else -static void sh_eth_set_receive_align(struct sk_buff *skb) -{ - skb_reserve(skb, SH2_SH3_SKB_RX_ALIGN); -} -#endif /* CPU <-> EDMAC endian convert */ @@ -1119,6 +1111,7 @@ static void sh_eth_ring_format(struct net_device *ndev) struct sh_eth_txdesc *txdesc = NULL; int rx_ringsize = sizeof(*rxdesc) * mdp->num_rx_ring; int tx_ringsize = sizeof(*txdesc) * mdp->num_tx_ring; + int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN - 1; mdp->cur_rx = 0; mdp->cur_tx = 0; @@ -1131,21 +1124,21 @@ static void sh_eth_ring_format(struct net_device *ndev) for (i = 0; i < mdp->num_rx_ring; i++) { /* skb */ mdp->rx_skbuff[i] = NULL; - skb = netdev_alloc_skb(ndev, mdp->rx_buf_sz); + skb = netdev_alloc_skb(ndev, skbuff_size); mdp->rx_skbuff[i] = skb; if (skb == NULL) break; - dma_map_single(&ndev->dev, skb->data, mdp->rx_buf_sz, - DMA_FROM_DEVICE); sh_eth_set_receive_align(skb); /* RX descriptor */ rxdesc = &mdp->rx_ring[i]; + /* The size of the buffer is a multiple of 16 bytes. */ + rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16); + dma_map_single(&ndev->dev, skb->data, rxdesc->buffer_length, + DMA_FROM_DEVICE); rxdesc->addr = virt_to_phys(PTR_ALIGN(skb->data, 4)); rxdesc->status = cpu_to_edmac(mdp, RD_RACT | RD_RFP); - /* The size of the buffer is 16 byte boundary. */ - rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16); /* Rx descriptor address set */ if (i == 0) { sh_eth_write(ndev, mdp->rx_desc_dma, RDLAR); @@ -1397,6 +1390,7 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota) struct sk_buff *skb; u16 pkt_len = 0; u32 desc_status; + int skbuff_size = mdp->rx_buf_sz + SH_ETH_RX_ALIGN - 1; rxdesc = &mdp->rx_ring[entry]; while (!(rxdesc->status & cpu_to_edmac(mdp, RD_RACT))) { @@ -1448,7 +1442,7 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota) if (mdp->cd->rpadir) skb_reserve(skb, NET_IP_ALIGN); dma_sync_single_for_cpu(&ndev->dev, rxdesc->addr, - mdp->rx_buf_sz, + ALIGN(mdp->rx_buf_sz, 16), DMA_FROM_DEVICE); skb_put(skb, pkt_len); skb->protocol = eth_type_trans(skb, ndev); @@ -1468,13 +1462,13 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota) rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16); if (mdp->rx_skbuff[entry] == NULL) { - skb = netdev_alloc_skb(ndev, mdp->rx_buf_sz); + skb = netdev_alloc_skb(ndev, skbuff_size); mdp->rx_skbuff[entry] = skb; if (skb == NULL) break; /* Better luck next round. */ - dma_map_single(&ndev->dev, skb->data, mdp->rx_buf_sz, - DMA_FROM_DEVICE); sh_eth_set_receive_align(skb); + dma_map_single(&ndev->dev, skb->data, + rxdesc->buffer_length, DMA_FROM_DEVICE); skb_checksum_none_assert(skb); rxdesc->addr = virt_to_phys(PTR_ALIGN(skb->data, 4)); diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h index b37c427144ee..9fa93321d512 100644 --- a/drivers/net/ethernet/renesas/sh_eth.h +++ b/drivers/net/ethernet/renesas/sh_eth.h @@ -162,9 +162,9 @@ enum { /* Driver's parameters */ #if defined(CONFIG_CPU_SH4) || defined(CONFIG_ARCH_SHMOBILE) -#define SH4_SKB_RX_ALIGN 32 +#define SH_ETH_RX_ALIGN 32 #else -#define SH2_SH3_SKB_RX_ALIGN 2 +#define SH_ETH_RX_ALIGN 2 #endif /* Register's bits -- cgit v1.2.3 From 28603d13997e2ef47f18589cc9a44553aad49c86 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Thu, 27 Nov 2014 21:05:34 +0800 Subject: stmmac: platform: Move plat_dat checking earlier Original code only check/alloc plat_dat for the CONFIG_OF case, this patch check/alloc it earlier and unconditionally to avoid kernel build warnings: drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:275 stmmac_pltfr_probe() warn: variable dereferenced before check 'plat_dat' V2: Fix coding style. Signed-off-by: Huacai Chen Signed-off-by: David S. Miller --- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 5b0da3986216..58a1a0a423d4 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -265,6 +265,15 @@ static int stmmac_pltfr_probe(struct platform_device *pdev) plat_dat = dev_get_platdata(&pdev->dev); + if (!plat_dat) + plat_dat = devm_kzalloc(&pdev->dev, + sizeof(struct plat_stmmacenet_data), + GFP_KERNEL); + if (!plat_dat) { + pr_err("%s: ERROR: no memory", __func__); + return -ENOMEM; + } + /* Set default value for multicast hash bins */ plat_dat->multicast_filter_bins = HASH_TABLE_SIZE; @@ -272,15 +281,6 @@ static int stmmac_pltfr_probe(struct platform_device *pdev) plat_dat->unicast_filter_entries = 1; if (pdev->dev.of_node) { - if (!plat_dat) - plat_dat = devm_kzalloc(&pdev->dev, - sizeof(struct plat_stmmacenet_data), - GFP_KERNEL); - if (!plat_dat) { - pr_err("%s: ERROR: no memory", __func__); - return -ENOMEM; - } - ret = stmmac_probe_config_dt(pdev, plat_dat, &mac); if (ret) { pr_err("%s: main dt probe failed", __func__); -- cgit v1.2.3 From 7fa2955ff70ce4532f144d26b8a087095f9c9ffc Mon Sep 17 00:00:00 2001 From: Mitsuhiro Kimura Date: Fri, 28 Nov 2014 10:04:15 +0900 Subject: sh_eth: Fix sleeping function called from invalid context This resolves the following bug which can be reproduced by building the kernel with CONFIG_DEBUG_ATOMIC_SLEEP=y and reading network statistics while the network interface is down. e.g.: ifconfig eth0 down cat /sys/class/net/eth0/statistics/tx_errors ---- [ 1238.161349] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:952 [ 1238.188279] in_atomic(): 1, irqs_disabled(): 0, pid: 1388, name: cat [ 1238.207425] CPU: 0 PID: 1388 Comm: cat Not tainted 3.10.31-ltsi-00046-gefa0b46 #1087 [ 1238.230737] Backtrace: [ 1238.238123] [] (dump_backtrace+0x0/0x10c) from [] (show_stack+0x18/0x1c) [ 1238.263499] r6:000003b8 r5:c06160c0 r4:c0669e00 r3:00404000 [ 1238.280583] [] (show_stack+0x0/0x1c) from [] (dump_stack+0x20/0x28) [ 1238.304631] [] (dump_stack+0x0/0x28) from [] (__might_sleep+0xf8/0x118) [ 1238.329734] [] (__might_sleep+0x0/0x118) from [] (__pm_runtime_resume+0x38/0x90) [ 1238.357170] r7:d616f000 r6:c049c458 r5:00000004 r4:d6a17210 [ 1238.374251] [] (__pm_runtime_resume+0x0/0x90) from [] (sh_eth_get_stats+0x44/0x280) [ 1238.402468] r7:d616f000 r6:c049c458 r5:d5c21000 r4:d5c21000 [ 1238.419552] [] (sh_eth_get_stats+0x0/0x280) from [] (dev_get_stats+0x54/0x88) [ 1238.446204] r5:d5c21000 r4:d5ed7e08 [ 1238.456980] [] (dev_get_stats+0x0/0x88) from [] (netstat_show.isra.15+0x54/0x9c) [ 1238.484413] r6:d5c21000 r5:d5c21238 r4:00000028 r3:00000001 [ 1238.501495] [] (netstat_show.isra.15+0x0/0x9c) from [] (show_tx_errors+0x18/0x1c) [ 1238.529196] r7:d5f945d8 r6:d5f945c0 r5:c049716c r4:c0650e7c [ 1238.546279] [] (show_tx_errors+0x0/0x1c) from [] (dev_attr_show+0x24/0x50) [ 1238.572157] [] (dev_attr_show+0x0/0x50) from [] (sysfs_read_file+0xb0/0x140) [ 1238.598554] r5:c049716c r4:d5c21240 [ 1238.609326] [] (sysfs_read_file+0x0/0x140) from [] (vfs_read+0xb0/0x13c) [ 1238.634679] [] (vfs_read+0x0/0x13c) from [] (SyS_read+0x44/0x74) [ 1238.657944] r8:bef45bf0 r7:00000000 r6:d6ac0600 r5:00000000 r4:00000000 [ 1238.678172] [] (SyS_read+0x0/0x74) from [] (ret_fast_syscall+0x0/0x30) ---- Signed-off-by: Mitsuhiro Kimura Signed-off-by: Yoshihiro Kaneko Signed-off-by: Simon Horman Signed-off-by: David S. Miller --- drivers/net/ethernet/renesas/sh_eth.c | 64 +++++++++++++++++++---------------- drivers/net/ethernet/renesas/sh_eth.h | 1 + 2 files changed, 36 insertions(+), 29 deletions(-) diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c index f9e30b87ecd5..b5db6b3f939f 100644 --- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -2036,6 +2036,8 @@ static int sh_eth_open(struct net_device *ndev) if (ret) goto out_free_irq; + mdp->is_opened = 1; + return ret; out_free_irq: @@ -2125,6 +2127,36 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev) return NETDEV_TX_OK; } +static struct net_device_stats *sh_eth_get_stats(struct net_device *ndev) +{ + struct sh_eth_private *mdp = netdev_priv(ndev); + + if (sh_eth_is_rz_fast_ether(mdp)) + return &ndev->stats; + + if (!mdp->is_opened) + return &ndev->stats; + + ndev->stats.tx_dropped += sh_eth_read(ndev, TROCR); + sh_eth_write(ndev, 0, TROCR); /* (write clear) */ + ndev->stats.collisions += sh_eth_read(ndev, CDCR); + sh_eth_write(ndev, 0, CDCR); /* (write clear) */ + ndev->stats.tx_carrier_errors += sh_eth_read(ndev, LCCR); + sh_eth_write(ndev, 0, LCCR); /* (write clear) */ + + if (sh_eth_is_gether(mdp)) { + ndev->stats.tx_carrier_errors += sh_eth_read(ndev, CERCR); + sh_eth_write(ndev, 0, CERCR); /* (write clear) */ + ndev->stats.tx_carrier_errors += sh_eth_read(ndev, CEECR); + sh_eth_write(ndev, 0, CEECR); /* (write clear) */ + } else { + ndev->stats.tx_carrier_errors += sh_eth_read(ndev, CNDCR); + sh_eth_write(ndev, 0, CNDCR); /* (write clear) */ + } + + return &ndev->stats; +} + /* device close function */ static int sh_eth_close(struct net_device *ndev) { @@ -2139,6 +2171,7 @@ static int sh_eth_close(struct net_device *ndev) sh_eth_write(ndev, 0, EDTRR); sh_eth_write(ndev, 0, EDRRR); + sh_eth_get_stats(ndev); /* PHY Disconnect */ if (mdp->phydev) { phy_stop(mdp->phydev); @@ -2157,36 +2190,9 @@ static int sh_eth_close(struct net_device *ndev) pm_runtime_put_sync(&mdp->pdev->dev); - return 0; -} - -static struct net_device_stats *sh_eth_get_stats(struct net_device *ndev) -{ - struct sh_eth_private *mdp = netdev_priv(ndev); - - if (sh_eth_is_rz_fast_ether(mdp)) - return &ndev->stats; - - pm_runtime_get_sync(&mdp->pdev->dev); - - ndev->stats.tx_dropped += sh_eth_read(ndev, TROCR); - sh_eth_write(ndev, 0, TROCR); /* (write clear) */ - ndev->stats.collisions += sh_eth_read(ndev, CDCR); - sh_eth_write(ndev, 0, CDCR); /* (write clear) */ - ndev->stats.tx_carrier_errors += sh_eth_read(ndev, LCCR); - sh_eth_write(ndev, 0, LCCR); /* (write clear) */ - if (sh_eth_is_gether(mdp)) { - ndev->stats.tx_carrier_errors += sh_eth_read(ndev, CERCR); - sh_eth_write(ndev, 0, CERCR); /* (write clear) */ - ndev->stats.tx_carrier_errors += sh_eth_read(ndev, CEECR); - sh_eth_write(ndev, 0, CEECR); /* (write clear) */ - } else { - ndev->stats.tx_carrier_errors += sh_eth_read(ndev, CNDCR); - sh_eth_write(ndev, 0, CNDCR); /* (write clear) */ - } - pm_runtime_put_sync(&mdp->pdev->dev); + mdp->is_opened = 0; - return &ndev->stats; + return 0; } /* ioctl to device function */ diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h index 9fa93321d512..22301bf9c21d 100644 --- a/drivers/net/ethernet/renesas/sh_eth.h +++ b/drivers/net/ethernet/renesas/sh_eth.h @@ -522,6 +522,7 @@ struct sh_eth_private { unsigned no_ether_link:1; unsigned ether_link_active_low:1; + unsigned is_opened:1; }; static inline void sh_eth_soft_swap(char *src, int len) -- cgit v1.2.3 From 2f19cad94cee3c9bd52d0c9ca584ef506302fb7c Mon Sep 17 00:00:00 2001 From: Chris Mason Date: Sun, 30 Nov 2014 08:56:33 -0500 Subject: btrfs: zero out left over bytes after processing compression streams Don Bailey noticed that our page zeroing for compression at end-io time isn't complete. This reworks a patch from Linus to push the zeroing into the zlib and lzo specific functions instead of trying to handle the corners inside btrfs_decompress_buf2page Signed-off-by: Chris Mason Reviewed-by: Josef Bacik Reported-by: Don A. Bailey cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds --- fs/btrfs/compression.c | 33 +++++++++++++++++++++++++++++++-- fs/btrfs/compression.h | 4 +++- fs/btrfs/lzo.c | 15 +++++++++++++++ fs/btrfs/zlib.c | 20 ++++++++++++++++++-- 4 files changed, 67 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index d3220d31d3cb..dcd9be32ac57 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -1011,8 +1011,6 @@ int btrfs_decompress_buf2page(char *buf, unsigned long buf_start, bytes = min(bytes, working_bytes); kaddr = kmap_atomic(page_out); memcpy(kaddr + *pg_offset, buf + buf_offset, bytes); - if (*pg_index == (vcnt - 1) && *pg_offset == 0) - memset(kaddr + bytes, 0, PAGE_CACHE_SIZE - bytes); kunmap_atomic(kaddr); flush_dcache_page(page_out); @@ -1054,3 +1052,34 @@ int btrfs_decompress_buf2page(char *buf, unsigned long buf_start, return 1; } + +/* + * When uncompressing data, we need to make sure and zero any parts of + * the biovec that were not filled in by the decompression code. pg_index + * and pg_offset indicate the last page and the last offset of that page + * that have been filled in. This will zero everything remaining in the + * biovec. + */ +void btrfs_clear_biovec_end(struct bio_vec *bvec, int vcnt, + unsigned long pg_index, + unsigned long pg_offset) +{ + while (pg_index < vcnt) { + struct page *page = bvec[pg_index].bv_page; + unsigned long off = bvec[pg_index].bv_offset; + unsigned long len = bvec[pg_index].bv_len; + + if (pg_offset < off) + pg_offset = off; + if (pg_offset < off + len) { + unsigned long bytes = off + len - pg_offset; + char *kaddr; + + kaddr = kmap_atomic(page); + memset(kaddr + pg_offset, 0, bytes); + kunmap_atomic(kaddr); + } + pg_index++; + pg_offset = 0; + } +} diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h index 0c803b4fbf93..d181f70caae0 100644 --- a/fs/btrfs/compression.h +++ b/fs/btrfs/compression.h @@ -45,7 +45,9 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 start, unsigned long nr_pages); int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, int mirror_num, unsigned long bio_flags); - +void btrfs_clear_biovec_end(struct bio_vec *bvec, int vcnt, + unsigned long pg_index, + unsigned long pg_offset); struct btrfs_compress_op { struct list_head *(*alloc_workspace)(void); diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c index 78285f30909e..617553cdb7d3 100644 --- a/fs/btrfs/lzo.c +++ b/fs/btrfs/lzo.c @@ -373,6 +373,8 @@ cont: } done: kunmap(pages_in[page_in_index]); + if (!ret) + btrfs_clear_biovec_end(bvec, vcnt, page_out_index, pg_offset); return ret; } @@ -410,10 +412,23 @@ static int lzo_decompress(struct list_head *ws, unsigned char *data_in, goto out; } + /* + * the caller is already checking against PAGE_SIZE, but lets + * move this check closer to the memcpy/memset + */ + destlen = min_t(unsigned long, destlen, PAGE_SIZE); bytes = min_t(unsigned long, destlen, out_len - start_byte); kaddr = kmap_atomic(dest_page); memcpy(kaddr, workspace->buf + start_byte, bytes); + + /* + * btrfs_getblock is doing a zero on the tail of the page too, + * but this will cover anything missing from the decompressed + * data. + */ + if (bytes < destlen) + memset(kaddr+bytes, 0, destlen-bytes); kunmap_atomic(kaddr); out: return ret; diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c index 759fa4e2de8f..fb22fd8d8fb8 100644 --- a/fs/btrfs/zlib.c +++ b/fs/btrfs/zlib.c @@ -299,6 +299,8 @@ done: zlib_inflateEnd(&workspace->strm); if (data_in) kunmap(pages_in[page_in_index]); + if (!ret) + btrfs_clear_biovec_end(bvec, vcnt, page_out_index, pg_offset); return ret; } @@ -310,10 +312,14 @@ static int zlib_decompress(struct list_head *ws, unsigned char *data_in, struct workspace *workspace = list_entry(ws, struct workspace, list); int ret = 0; int wbits = MAX_WBITS; - unsigned long bytes_left = destlen; + unsigned long bytes_left; unsigned long total_out = 0; + unsigned long pg_offset = 0; char *kaddr; + destlen = min_t(unsigned long, destlen, PAGE_SIZE); + bytes_left = destlen; + workspace->strm.next_in = data_in; workspace->strm.avail_in = srclen; workspace->strm.total_in = 0; @@ -341,7 +347,6 @@ static int zlib_decompress(struct list_head *ws, unsigned char *data_in, unsigned long buf_start; unsigned long buf_offset; unsigned long bytes; - unsigned long pg_offset = 0; ret = zlib_inflate(&workspace->strm, Z_NO_FLUSH); if (ret != Z_OK && ret != Z_STREAM_END) @@ -384,6 +389,17 @@ next: ret = 0; zlib_inflateEnd(&workspace->strm); + + /* + * this should only happen if zlib returned fewer bytes than we + * expected. btrfs_get_block is responsible for zeroing from the + * end of the inline extent (destlen) to the end of the page + */ + if (pg_offset < destlen) { + kaddr = kmap_atomic(dest_page); + memset(kaddr + pg_offset, 0, destlen - pg_offset); + kunmap_atomic(kaddr); + } return ret; } -- cgit v1.2.3 From 009d0431c3914de64666bec0d350e54fdd59df6a Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 30 Nov 2014 16:42:27 -0800 Subject: Linux 3.18-rc7 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 2fd5c4e5c139..ce70361f766e 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 3 PATCHLEVEL = 18 SUBLEVEL = 0 -EXTRAVERSION = -rc6 +EXTRAVERSION = -rc7 NAME = Diseased Newt # *DOCUMENTATION* -- cgit v1.2.3 From 2cb4a18262fd0108cb8abd875710c59d0aa66f1d Mon Sep 17 00:00:00 2001 From: Sebastian Ott Date: Fri, 28 Nov 2014 15:40:57 +0100 Subject: s390: fix machine check handling Commit eb7e7d76 "s390: Replace __get_cpu_var uses" broke machine check handling. We copy machine check information from per-cpu to a stack variable for local processing. Next we should zap the per-cpu variable, not the stack variable. Signed-off-by: Sebastian Ott Reviewed-by: Heiko Carstens Acked-by: Christoph Lameter Signed-off-by: Martin Schwidefsky --- arch/s390/kernel/nmi.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c index dd1c24ceda50..3f51cf4e8f02 100644 --- a/arch/s390/kernel/nmi.c +++ b/arch/s390/kernel/nmi.c @@ -54,12 +54,8 @@ void s390_handle_mcck(void) */ local_irq_save(flags); local_mcck_disable(); - /* - * Ummm... Does this make sense at all? Copying the percpu struct - * and then zapping it one statement later? - */ - memcpy(&mcck, this_cpu_ptr(&cpu_mcck), sizeof(mcck)); - memset(&mcck, 0, sizeof(struct mcck_struct)); + mcck = *this_cpu_ptr(&cpu_mcck); + memset(this_cpu_ptr(&cpu_mcck), 0, sizeof(mcck)); clear_cpu_flag(CIF_MCCK_PENDING); local_mcck_enable(); local_irq_restore(flags); -- cgit v1.2.3 From aa9d4437893f7e015ce5b6d6c443a9ba92c8a2e7 Mon Sep 17 00:00:00 2001 From: David Howells Date: Mon, 1 Dec 2014 22:52:45 +0000 Subject: KEYS: Fix the size of the key description passed to/from userspace When a key description argument is imported into the kernel from userspace, as happens in add_key(), request_key(), KEYCTL_JOIN_SESSION_KEYRING, KEYCTL_SEARCH, the description is copied into a buffer up to PAGE_SIZE in size. PAGE_SIZE, however, is a variable quantity, depending on the arch. Fix this at 4096 instead (ie. 4095 plus a NUL termination) and define a constant (KEY_MAX_DESC_SIZE) to this end. When reading the description back with KEYCTL_DESCRIBE, a PAGE_SIZE internal buffer is allocated into which the information and description will be rendered. This means that the description will get truncated if an extremely long description it has to be crammed into the buffer with the stringified information. There is no particular need to copy the description into the buffer, so just copy it directly to userspace in a separate operation. Reported-by: Christian Kastner Signed-off-by: David Howells Tested-by: Christian Kastner --- security/keys/keyctl.c | 56 +++++++++++++++++++++++--------------------------- 1 file changed, 26 insertions(+), 30 deletions(-) diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index eff88a5f5d40..4743d71e4aa6 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -26,6 +26,8 @@ #include #include "internal.h" +#define KEY_MAX_DESC_SIZE 4096 + static int key_get_type_from_user(char *type, const char __user *_type, unsigned len) @@ -78,7 +80,7 @@ SYSCALL_DEFINE5(add_key, const char __user *, _type, description = NULL; if (_description) { - description = strndup_user(_description, PAGE_SIZE); + description = strndup_user(_description, KEY_MAX_DESC_SIZE); if (IS_ERR(description)) { ret = PTR_ERR(description); goto error; @@ -177,7 +179,7 @@ SYSCALL_DEFINE4(request_key, const char __user *, _type, goto error; /* pull the description into kernel space */ - description = strndup_user(_description, PAGE_SIZE); + description = strndup_user(_description, KEY_MAX_DESC_SIZE); if (IS_ERR(description)) { ret = PTR_ERR(description); goto error; @@ -287,7 +289,7 @@ long keyctl_join_session_keyring(const char __user *_name) /* fetch the name from userspace */ name = NULL; if (_name) { - name = strndup_user(_name, PAGE_SIZE); + name = strndup_user(_name, KEY_MAX_DESC_SIZE); if (IS_ERR(name)) { ret = PTR_ERR(name); goto error; @@ -562,8 +564,9 @@ long keyctl_describe_key(key_serial_t keyid, { struct key *key, *instkey; key_ref_t key_ref; - char *tmpbuf; + char *infobuf; long ret; + int desclen, infolen; key_ref = lookup_user_key(keyid, KEY_LOOKUP_PARTIAL, KEY_NEED_VIEW); if (IS_ERR(key_ref)) { @@ -586,38 +589,31 @@ long keyctl_describe_key(key_serial_t keyid, } okay: - /* calculate how much description we're going to return */ - ret = -ENOMEM; - tmpbuf = kmalloc(PAGE_SIZE, GFP_KERNEL); - if (!tmpbuf) - goto error2; - key = key_ref_to_ptr(key_ref); + desclen = strlen(key->description); - ret = snprintf(tmpbuf, PAGE_SIZE - 1, - "%s;%d;%d;%08x;%s", - key->type->name, - from_kuid_munged(current_user_ns(), key->uid), - from_kgid_munged(current_user_ns(), key->gid), - key->perm, - key->description ?: ""); - - /* include a NUL char at the end of the data */ - if (ret > PAGE_SIZE - 1) - ret = PAGE_SIZE - 1; - tmpbuf[ret] = 0; - ret++; + /* calculate how much information we're going to return */ + ret = -ENOMEM; + infobuf = kasprintf(GFP_KERNEL, + "%s;%d;%d;%08x;", + key->type->name, + from_kuid_munged(current_user_ns(), key->uid), + from_kgid_munged(current_user_ns(), key->gid), + key->perm); + if (!infobuf) + goto error2; + infolen = strlen(infobuf); + ret = infolen + desclen + 1; /* consider returning the data */ - if (buffer && buflen > 0) { - if (buflen > ret) - buflen = ret; - - if (copy_to_user(buffer, tmpbuf, buflen) != 0) + if (buffer && buflen >= ret) { + if (copy_to_user(buffer, infobuf, infolen) != 0 || + copy_to_user(buffer + infolen, key->description, + desclen + 1) != 0) ret = -EFAULT; } - kfree(tmpbuf); + kfree(infobuf); error2: key_ref_put(key_ref); error: @@ -649,7 +645,7 @@ long keyctl_keyring_search(key_serial_t ringid, if (ret < 0) goto error; - description = strndup_user(_description, PAGE_SIZE); + description = strndup_user(_description, KEY_MAX_DESC_SIZE); if (IS_ERR(description)) { ret = PTR_ERR(description); goto error; -- cgit v1.2.3 From 054f6180d8b5602b431b5924976c956e760488b1 Mon Sep 17 00:00:00 2001 From: David Howells Date: Mon, 1 Dec 2014 22:52:50 +0000 Subject: KEYS: Simplify KEYRING_SEARCH_{NO,DO}_STATE_CHECK flags Simplify KEYRING_SEARCH_{NO,DO}_STATE_CHECK flags to be two variations of the same flag. They are effectively mutually exclusive and one or the other should be provided, but not both. Keyring cycle detection and key possession determination are the only things that set NO_STATE_CHECK, except that neither flag really does anything there because neither purpose makes use of the keyring_search_iterator() function, but rather provides their own. For cycle detection we definitely want to check inside of expired keyrings, just so that we don't create a cycle we can't get rid of. Revoked keyrings are cleared at revocation time and can't then be reused, so shouldn't be a problem either way. For possession determination, we *might* want to validate each keyring before searching it: do you possess a key that's hidden behind an expired or just plain inaccessible keyring? Currently, the answer is yes. Note that you cannot, however, possess a key behind a revoked keyring because they are cleared on revocation. keyring_search() sets DO_STATE_CHECK, which is correct. request_key_and_link() currently doesn't specify whether to check the key state or not - but it should set DO_STATE_CHECK. key_get_instantiation_authkey() also currently doesn't specify whether to check the key state or not - but it probably should also set DO_STATE_CHECK. Signed-off-by: David Howells Tested-by: Chuck Lever --- security/keys/keyring.c | 7 ++++--- security/keys/request_key.c | 1 + security/keys/request_key_auth.c | 1 + 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/security/keys/keyring.c b/security/keys/keyring.c index 8177010174f7..238aa172f25b 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -628,6 +628,10 @@ static bool search_nested_keyrings(struct key *keyring, ctx->index_key.type->name, ctx->index_key.description); +#define STATE_CHECKS (KEYRING_SEARCH_NO_STATE_CHECK | KEYRING_SEARCH_DO_STATE_CHECK) + BUG_ON((ctx->flags & STATE_CHECKS) == 0 || + (ctx->flags & STATE_CHECKS) == STATE_CHECKS); + if (ctx->index_key.description) ctx->index_key.desc_len = strlen(ctx->index_key.description); @@ -637,7 +641,6 @@ static bool search_nested_keyrings(struct key *keyring, if (ctx->match_data.lookup_type == KEYRING_SEARCH_LOOKUP_ITERATE || keyring_compare_object(keyring, &ctx->index_key)) { ctx->skipped_ret = 2; - ctx->flags |= KEYRING_SEARCH_DO_STATE_CHECK; switch (ctx->iterator(keyring_key_to_ptr(keyring), ctx)) { case 1: goto found; @@ -649,8 +652,6 @@ static bool search_nested_keyrings(struct key *keyring, } ctx->skipped_ret = 0; - if (ctx->flags & KEYRING_SEARCH_NO_STATE_CHECK) - ctx->flags &= ~KEYRING_SEARCH_DO_STATE_CHECK; /* Start processing a new keyring */ descend_to_keyring: diff --git a/security/keys/request_key.c b/security/keys/request_key.c index bb4337c7ae1b..0bb23f98e4ca 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -516,6 +516,7 @@ struct key *request_key_and_link(struct key_type *type, .match_data.cmp = key_default_cmp, .match_data.raw_data = description, .match_data.lookup_type = KEYRING_SEARCH_LOOKUP_DIRECT, + .flags = KEYRING_SEARCH_DO_STATE_CHECK, }; struct key *key; key_ref_t key_ref; diff --git a/security/keys/request_key_auth.c b/security/keys/request_key_auth.c index 6639e2cb8853..5d672f7580dd 100644 --- a/security/keys/request_key_auth.c +++ b/security/keys/request_key_auth.c @@ -249,6 +249,7 @@ struct key *key_get_instantiation_authkey(key_serial_t target_id) .match_data.cmp = key_default_cmp, .match_data.raw_data = description, .match_data.lookup_type = KEYRING_SEARCH_LOOKUP_DIRECT, + .flags = KEYRING_SEARCH_DO_STATE_CHECK, }; struct key *authkey; key_ref_t authkey_ref; -- cgit v1.2.3 From 0b0a84154eff56913e91df29de5c3a03a0029e38 Mon Sep 17 00:00:00 2001 From: David Howells Date: Mon, 1 Dec 2014 22:52:53 +0000 Subject: KEYS: request_key() should reget expired keys rather than give EKEYEXPIRED Since the keyring facility can be viewed as a cache (at least in some applications), the local expiration time on the key should probably be viewed as a 'needs updating after this time' property rather than an absolute 'anyone now wanting to use this object is out of luck' property. Since request_key() is the main interface for the usage of keys, this should update or replace an expired key rather than issuing EKEYEXPIRED if the local expiration has been reached (ie. it should refresh the cache). For absolute conditions where refreshing the cache probably doesn't help, the key can be negatively instantiated using KEYCTL_REJECT_KEY with EKEYEXPIRED given as the error to issue. This will still cause request_key() to return EKEYEXPIRED as that was explicitly set. In the future, if the key type has an update op available, we might want to upcall with the expired key and allow the upcall to update it. We would pass a different operation name (the first column in /etc/request-key.conf) to the request-key program. request_key() returning EKEYEXPIRED is causing an NFS problem which Chuck Lever describes thusly: After about 10 minutes, my NFSv4 functional tests fail because the ownership of the test files goes to "-2". Looking at /proc/keys shows that the id_resolv keys that map to my test user ID have expired. The ownership problem persists until the expired keys are purged from the keyring, and fresh keys are obtained. I bisected the problem to 3.13 commit b2a4df200d57 ("KEYS: Expand the capacity of a keyring"). This commit inadvertantly changes the API contract of the internal function keyring_search_aux(). The root cause appears to be that b2a4df200d57 made "no state check" the default behavior. "No state check" means the keyring search iterator function skips checking the key's expiry timeout, and returns expired keys. request_key_and_link() depends on getting an -EAGAIN result code to know when to perform an upcall to refresh an expired key. This patch can be tested directly by: keyctl request2 user debug:fred a @s keyctl timeout %user:debug:fred 3 sleep 4 keyctl request2 user debug:fred a @s Without the patch, the last command gives error EKEYEXPIRED, but with the command it gives a new key. Reported-by: Carl Hetherington Reported-by: Chuck Lever Signed-off-by: David Howells Tested-by: Chuck Lever --- security/keys/internal.h | 1 + security/keys/keyring.c | 3 ++- security/keys/request_key.c | 3 ++- 3 files changed, 5 insertions(+), 2 deletions(-) diff --git a/security/keys/internal.h b/security/keys/internal.h index b8960c4959a5..200e37867336 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -117,6 +117,7 @@ struct keyring_search_context { #define KEYRING_SEARCH_NO_UPDATE_TIME 0x0004 /* Don't update times */ #define KEYRING_SEARCH_NO_CHECK_PERM 0x0008 /* Don't check permissions */ #define KEYRING_SEARCH_DETECT_TOO_DEEP 0x0010 /* Give an error on excessive depth */ +#define KEYRING_SEARCH_SKIP_EXPIRED 0x0020 /* Ignore expired keys (intention to replace) */ int (*iterator)(const void *object, void *iterator_data); diff --git a/security/keys/keyring.c b/security/keys/keyring.c index 238aa172f25b..e72548b5897e 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -546,7 +546,8 @@ static int keyring_search_iterator(const void *object, void *iterator_data) } if (key->expiry && ctx->now.tv_sec >= key->expiry) { - ctx->result = ERR_PTR(-EKEYEXPIRED); + if (!(ctx->flags & KEYRING_SEARCH_SKIP_EXPIRED)) + ctx->result = ERR_PTR(-EKEYEXPIRED); kleave(" = %d [expire]", ctx->skipped_ret); goto skipped; } diff --git a/security/keys/request_key.c b/security/keys/request_key.c index 0bb23f98e4ca..0c7aea4dea54 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -516,7 +516,8 @@ struct key *request_key_and_link(struct key_type *type, .match_data.cmp = key_default_cmp, .match_data.raw_data = description, .match_data.lookup_type = KEYRING_SEARCH_LOOKUP_DIRECT, - .flags = KEYRING_SEARCH_DO_STATE_CHECK, + .flags = (KEYRING_SEARCH_DO_STATE_CHECK | + KEYRING_SEARCH_SKIP_EXPIRED), }; struct key *key; key_ref_t key_ref; -- cgit v1.2.3 From 5106787a9e08dc2901d6b2513ed8f377671befa8 Mon Sep 17 00:00:00 2001 From: Thierry Reding Date: Thu, 27 Nov 2014 09:54:09 +0100 Subject: PCI: tegra: Use physical range for I/O mapping Commit 0b0b0893d49b ("of/pci: Fix the conversion of IO ranges into IO resources") changed how I/O resources are parsed from DT. Rather than containing the physical address of the I/O region, the addresses will now be in I/O address space. On Tegra the union of all ranges is used to expose a top-level memory- mapped resource for the PCI host bridge. This helps to make /proc/iomem more readable. Combining both of the above, the union would now include the I/O space region. This causes a regression on Tegra20, where the physical base address of the PCIe controller (and therefore of the union) is located at physical address 0x80000000. Since I/O space starts at 0, the union will now include all of system RAM which starts at 0x00000000. This commit fixes this by keeping two copies of the I/O range: one that represents the range in the CPU's physical address space, the other for the range in the I/O address space. This allows the translation setup within the driver to reuse the physical addresses. The code registering the I/O region with the PCI core uses both ranges to establish the mapping. Fixes: 0b0b0893d49b ("of/pci: Fix the conversion of IO ranges into IO resources") Reported-by: Marc Zyngier Tested-by: Marc Zyngier Suggested-by: Arnd Bergmann Signed-off-by: Thierry Reding Signed-off-by: Bjorn Helgaas Reviewed-by: Arnd Bergmann --- drivers/pci/host/pci-tegra.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c index 3d43874319be..19bb19c7db4a 100644 --- a/drivers/pci/host/pci-tegra.c +++ b/drivers/pci/host/pci-tegra.c @@ -276,6 +276,7 @@ struct tegra_pcie { struct resource all; struct resource io; + struct resource pio; struct resource mem; struct resource prefetch; struct resource busn; @@ -658,7 +659,6 @@ static int tegra_pcie_setup(int nr, struct pci_sys_data *sys) { struct tegra_pcie *pcie = sys_to_pcie(sys); int err; - phys_addr_t io_start; err = devm_request_resource(pcie->dev, &pcie->all, &pcie->mem); if (err < 0) @@ -668,14 +668,12 @@ static int tegra_pcie_setup(int nr, struct pci_sys_data *sys) if (err) return err; - io_start = pci_pio_to_address(pcie->io.start); - pci_add_resource_offset(&sys->resources, &pcie->mem, sys->mem_offset); pci_add_resource_offset(&sys->resources, &pcie->prefetch, sys->mem_offset); pci_add_resource(&sys->resources, &pcie->busn); - pci_ioremap_io(nr * SZ_64K, io_start); + pci_ioremap_io(pcie->pio.start, pcie->io.start); return 1; } @@ -786,7 +784,6 @@ static irqreturn_t tegra_pcie_isr(int irq, void *arg) static void tegra_pcie_setup_translations(struct tegra_pcie *pcie) { u32 fpci_bar, size, axi_address; - phys_addr_t io_start = pci_pio_to_address(pcie->io.start); /* Bar 0: type 1 extended configuration space */ fpci_bar = 0xfe100000; @@ -799,7 +796,7 @@ static void tegra_pcie_setup_translations(struct tegra_pcie *pcie) /* Bar 1: downstream IO bar */ fpci_bar = 0xfdfc0000; size = resource_size(&pcie->io); - axi_address = io_start; + axi_address = pcie->io.start; afi_writel(pcie, axi_address, AFI_AXI_BAR1_START); afi_writel(pcie, size >> 12, AFI_AXI_BAR1_SZ); afi_writel(pcie, fpci_bar, AFI_FPCI_BAR1); @@ -1690,8 +1687,23 @@ static int tegra_pcie_parse_dt(struct tegra_pcie *pcie) switch (res.flags & IORESOURCE_TYPE_BITS) { case IORESOURCE_IO: - memcpy(&pcie->io, &res, sizeof(res)); - pcie->io.name = np->full_name; + memcpy(&pcie->pio, &res, sizeof(res)); + pcie->pio.name = np->full_name; + + /* + * The Tegra PCIe host bridge uses this to program the + * mapping of the I/O space to the physical address, + * so we override the .start and .end fields here that + * of_pci_range_to_resource() converted to I/O space. + * We also set the IORESOURCE_MEM type to clarify that + * the resource is in the physical memory space. + */ + pcie->io.start = range.cpu_addr; + pcie->io.end = range.cpu_addr + range.size - 1; + pcie->io.flags = IORESOURCE_MEM; + pcie->io.name = "I/O"; + + memcpy(&res, &pcie->io, sizeof(res)); break; case IORESOURCE_MEM: -- cgit v1.2.3 From 32f3869184d498850d36b7e6aa3b9f5260ea648a Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Mon, 1 Dec 2014 16:22:23 -0800 Subject: jbd2: fix regression where we fail to initialize checksum seed when loading When we're enabling journal features, we cannot use the predicate jbd2_journal_has_csum_v2or3() because we haven't yet set the sb feature flag fields! Moreover, we just finished loading the shash driver, so the test is unnecessary; calculate the seed always. Without this patch, we fail to initialize the checksum seed the first time we turn on journal_checksum, which means that all journal blocks written during that first mount are corrupt. Transactions written after the second mount will be fine, since the feature flag will be set in the journal superblock. xfstests generic/{034,321,322} are the regression tests. (This is important for 3.18.) Signed-off-by: Darrick J. Wong Reported-by: Eric Whitney Signed-off-by: Theodore Ts'o --- fs/jbd2/journal.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index e4dc74713a43..1df94fabe4eb 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -1853,13 +1853,12 @@ int jbd2_journal_set_features (journal_t *journal, unsigned long compat, journal->j_chksum_driver = NULL; return 0; } - } - /* Precompute checksum seed for all metadata */ - if (jbd2_journal_has_csum_v2or3(journal)) + /* Precompute checksum seed for all metadata */ journal->j_csum_seed = jbd2_chksum(journal, ~0, sb->s_uuid, sizeof(sb->s_uuid)); + } } /* If enabling v1 checksums, downgrade superblock */ -- cgit v1.2.3 From 19a10828814aa3ba483301b416b27f94330e0c80 Mon Sep 17 00:00:00 2001 From: Ben Skeggs Date: Mon, 1 Dec 2014 11:44:27 +1000 Subject: drm/nouveau/fifo/g84-: ack non-stall interrupt before handling it Closes a very unlikely race that can occur if another NonStallInterrupt method passes between checking fences and acking the previous interrupt. With this change, the interrupt will re-fire under such conditions. Tested-by: Tobias Klausmann Signed-off-by: Ben Skeggs --- drivers/gpu/drm/nouveau/core/engine/fifo/nv04.c | 2 +- drivers/gpu/drm/nouveau/core/engine/fifo/nvc0.c | 4 ++-- drivers/gpu/drm/nouveau/core/engine/fifo/nve0.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nv04.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nv04.c index 5ae6a43893b5..1931057f9962 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nv04.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nv04.c @@ -551,8 +551,8 @@ nv04_fifo_intr(struct nouveau_subdev *subdev) } if (status & 0x40000000) { - nouveau_fifo_uevent(&priv->base); nv_wr32(priv, 0x002100, 0x40000000); + nouveau_fifo_uevent(&priv->base); status &= ~0x40000000; } } diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nvc0.c index 1fe1f8fbda0c..074d434c3077 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nvc0.c @@ -740,6 +740,8 @@ nvc0_fifo_intr_engine_unit(struct nvc0_fifo_priv *priv, int engn) u32 inte = nv_rd32(priv, 0x002628); u32 unkn; + nv_wr32(priv, 0x0025a8 + (engn * 0x04), intr); + for (unkn = 0; unkn < 8; unkn++) { u32 ints = (intr >> (unkn * 0x04)) & inte; if (ints & 0x1) { @@ -751,8 +753,6 @@ nvc0_fifo_intr_engine_unit(struct nvc0_fifo_priv *priv, int engn) nv_mask(priv, 0x002628, ints, 0); } } - - nv_wr32(priv, 0x0025a8 + (engn * 0x04), intr); } static void diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.c index d2f0fd39c145..f8734eb74eaa 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nve0.c @@ -952,8 +952,8 @@ nve0_fifo_intr(struct nouveau_subdev *subdev) } if (stat & 0x80000000) { - nve0_fifo_intr_engine(priv); nv_wr32(priv, 0x002100, 0x80000000); + nve0_fifo_intr_engine(priv); stat &= ~0x80000000; } -- cgit v1.2.3 From 0ec5f02f0e2c6fe88ba5817790e11fe33ee298a7 Mon Sep 17 00:00:00 2001 From: Maarten Lankhorst Date: Mon, 1 Dec 2014 19:11:06 +1000 Subject: drm/nouveau: prevent stale fence->channel pointers, and protect with rcu Tested-by: Alexandre Courbot Signed-off-by: Ben Skeggs --- drivers/gpu/drm/nouveau/nouveau_fence.c | 92 +++++++++++++++++++++++---------- drivers/gpu/drm/nouveau/nouveau_fence.h | 4 +- 2 files changed, 67 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index 515cd9aebb99..f32a434724e3 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -52,20 +52,24 @@ nouveau_fctx(struct nouveau_fence *fence) return container_of(fence->base.lock, struct nouveau_fence_chan, lock); } -static void +static int nouveau_fence_signal(struct nouveau_fence *fence) { + int drop = 0; + fence_signal_locked(&fence->base); list_del(&fence->head); + rcu_assign_pointer(fence->channel, NULL); if (test_bit(FENCE_FLAG_USER_BITS, &fence->base.flags)) { struct nouveau_fence_chan *fctx = nouveau_fctx(fence); if (!--fctx->notify_ref) - nvif_notify_put(&fctx->notify); + drop = 1; } fence_put(&fence->base); + return drop; } static struct nouveau_fence * @@ -88,16 +92,23 @@ nouveau_fence_context_del(struct nouveau_fence_chan *fctx) { struct nouveau_fence *fence; - nvif_notify_fini(&fctx->notify); - spin_lock_irq(&fctx->lock); while (!list_empty(&fctx->pending)) { fence = list_entry(fctx->pending.next, typeof(*fence), head); - nouveau_fence_signal(fence); - fence->channel = NULL; + if (nouveau_fence_signal(fence)) + nvif_notify_put(&fctx->notify); } spin_unlock_irq(&fctx->lock); + + nvif_notify_fini(&fctx->notify); + fctx->dead = 1; + + /* + * Ensure that all accesses to fence->channel complete before freeing + * the channel. + */ + synchronize_rcu(); } static void @@ -112,21 +123,23 @@ nouveau_fence_context_free(struct nouveau_fence_chan *fctx) kref_put(&fctx->fence_ref, nouveau_fence_context_put); } -static void +static int nouveau_fence_update(struct nouveau_channel *chan, struct nouveau_fence_chan *fctx) { struct nouveau_fence *fence; - + int drop = 0; u32 seq = fctx->read(chan); while (!list_empty(&fctx->pending)) { fence = list_entry(fctx->pending.next, typeof(*fence), head); if ((int)(seq - fence->base.seqno) < 0) - return; + break; - nouveau_fence_signal(fence); + drop |= nouveau_fence_signal(fence); } + + return drop; } static int @@ -135,18 +148,21 @@ nouveau_fence_wait_uevent_handler(struct nvif_notify *notify) struct nouveau_fence_chan *fctx = container_of(notify, typeof(*fctx), notify); unsigned long flags; + int ret = NVIF_NOTIFY_KEEP; spin_lock_irqsave(&fctx->lock, flags); if (!list_empty(&fctx->pending)) { struct nouveau_fence *fence; + struct nouveau_channel *chan; fence = list_entry(fctx->pending.next, typeof(*fence), head); - nouveau_fence_update(fence->channel, fctx); + chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock)); + if (nouveau_fence_update(fence->channel, fctx)) + ret = NVIF_NOTIFY_DROP; } spin_unlock_irqrestore(&fctx->lock, flags); - /* Always return keep here. NVIF refcount is handled with nouveau_fence_update */ - return NVIF_NOTIFY_KEEP; + return ret; } void @@ -262,7 +278,10 @@ nouveau_fence_emit(struct nouveau_fence *fence, struct nouveau_channel *chan) if (!ret) { fence_get(&fence->base); spin_lock_irq(&fctx->lock); - nouveau_fence_update(chan, fctx); + + if (nouveau_fence_update(chan, fctx)) + nvif_notify_put(&fctx->notify); + list_add_tail(&fence->head, &fctx->pending); spin_unlock_irq(&fctx->lock); } @@ -276,13 +295,16 @@ nouveau_fence_done(struct nouveau_fence *fence) if (fence->base.ops == &nouveau_fence_ops_legacy || fence->base.ops == &nouveau_fence_ops_uevent) { struct nouveau_fence_chan *fctx = nouveau_fctx(fence); + struct nouveau_channel *chan; unsigned long flags; if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->base.flags)) return true; spin_lock_irqsave(&fctx->lock, flags); - nouveau_fence_update(fence->channel, fctx); + chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock)); + if (chan && nouveau_fence_update(chan, fctx)) + nvif_notify_put(&fctx->notify); spin_unlock_irqrestore(&fctx->lock, flags); } return fence_is_signaled(&fence->base); @@ -387,12 +409,18 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e if (fence && (!exclusive || !fobj || !fobj->shared_count)) { struct nouveau_channel *prev = NULL; + bool must_wait = true; f = nouveau_local_fence(fence, chan->drm); - if (f) - prev = f->channel; + if (f) { + rcu_read_lock(); + prev = rcu_dereference(f->channel); + if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0)) + must_wait = false; + rcu_read_unlock(); + } - if (!prev || (prev != chan && (ret = fctx->sync(f, prev, chan)))) + if (must_wait) ret = fence_wait(fence, intr); return ret; @@ -403,19 +431,22 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e for (i = 0; i < fobj->shared_count && !ret; ++i) { struct nouveau_channel *prev = NULL; + bool must_wait = true; fence = rcu_dereference_protected(fobj->shared[i], reservation_object_held(resv)); f = nouveau_local_fence(fence, chan->drm); - if (f) - prev = f->channel; + if (f) { + rcu_read_lock(); + prev = rcu_dereference(f->channel); + if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0)) + must_wait = false; + rcu_read_unlock(); + } - if (!prev || (prev != chan && (ret = fctx->sync(f, prev, chan)))) + if (must_wait) ret = fence_wait(fence, intr); - - if (ret) - break; } return ret; @@ -463,7 +494,7 @@ static const char *nouveau_fence_get_timeline_name(struct fence *f) struct nouveau_fence *fence = from_fence(f); struct nouveau_fence_chan *fctx = nouveau_fctx(fence); - return fence->channel ? fctx->name : "dead channel"; + return !fctx->dead ? fctx->name : "dead channel"; } /* @@ -476,9 +507,16 @@ static bool nouveau_fence_is_signaled(struct fence *f) { struct nouveau_fence *fence = from_fence(f); struct nouveau_fence_chan *fctx = nouveau_fctx(fence); - struct nouveau_channel *chan = fence->channel; + struct nouveau_channel *chan; + bool ret = false; + + rcu_read_lock(); + chan = rcu_dereference(fence->channel); + if (chan) + ret = (int)(fctx->read(chan) - fence->base.seqno) >= 0; + rcu_read_unlock(); - return (int)(fctx->read(chan) - fence->base.seqno) >= 0; + return ret; } static bool nouveau_fence_no_signaling(struct fence *f) diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h index 943b0b17b1fc..96e461c6f68f 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.h +++ b/drivers/gpu/drm/nouveau/nouveau_fence.h @@ -14,7 +14,7 @@ struct nouveau_fence { bool sysmem; - struct nouveau_channel *channel; + struct nouveau_channel __rcu *channel; unsigned long timeout; }; @@ -47,7 +47,7 @@ struct nouveau_fence_chan { char name[32]; struct nvif_notify notify; - int notify_ref; + int notify_ref, dead; }; struct nouveau_fence_priv { -- cgit v1.2.3 From 226d63a1addea8cbe8fc671978e62dc84927b046 Mon Sep 17 00:00:00 2001 From: Ilia Mirkin Date: Sun, 30 Nov 2014 12:56:18 -0500 Subject: drm/nouveau/gf116: remove copy1 engine Indications are that no GF116's actually have a copy engine there, but actually have the decompression engine. This engine can be made to do copies, but that should be done separately. Unclear why this didn't turn up on all GF116's, but perhaps the non-mobile ones came with enough VRAM to not trigger ttm migrations in test scenarios. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=85465 Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=59168 Cc: stable@vger.kernel.org Signed-off-by: Ilia Mirkin Signed-off-by: Ben Skeggs --- drivers/gpu/drm/nouveau/core/engine/device/nvc0.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/core/engine/device/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/device/nvc0.c index cd05677ad4b7..72a40f95d048 100644 --- a/drivers/gpu/drm/nouveau/core/engine/device/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/engine/device/nvc0.c @@ -218,7 +218,6 @@ nvc0_identify(struct nouveau_device *device) device->oclass[NVDEV_ENGINE_BSP ] = &nvc0_bsp_oclass; device->oclass[NVDEV_ENGINE_PPP ] = &nvc0_ppp_oclass; device->oclass[NVDEV_ENGINE_COPY0 ] = &nvc0_copy0_oclass; - device->oclass[NVDEV_ENGINE_COPY1 ] = &nvc0_copy1_oclass; device->oclass[NVDEV_ENGINE_DISP ] = nva3_disp_oclass; device->oclass[NVDEV_ENGINE_PERFMON] = &nvc0_perfmon_oclass; break; -- cgit v1.2.3 From 8b62c8c6df08ca567c78afa51aa7bbc554cede06 Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Tue, 2 Dec 2014 16:27:25 +1000 Subject: nouveau: move the hotplug ignore to correct place. Introduced in b440bde74f, however it was added to the wrong function in nouveau. https://bugzilla.kernel.org/show_bug.cgi?id=86011 Cc: Bjorn Helgaas CC: stable@vger.kernel.org # v3.15+ Signed-off-by: Dave Airlie --- drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 57238076049f..62b97c4eef8d 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -629,7 +629,6 @@ int nouveau_pmops_suspend(struct device *dev) pci_save_state(pdev); pci_disable_device(pdev); - pci_ignore_hotplug(pdev); pci_set_power_state(pdev, PCI_D3hot); return 0; } @@ -933,6 +932,7 @@ static int nouveau_pmops_runtime_suspend(struct device *dev) ret = nouveau_do_suspend(drm_dev, true); pci_save_state(pdev); pci_disable_device(pdev); + pci_ignore_hotplug(pdev); pci_set_power_state(pdev, PCI_D3cold); drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF; return ret; -- cgit v1.2.3 From 594416a72032684792bb22510c538098db10b750 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Tue, 25 Nov 2014 17:40:25 -0800 Subject: block: fix regression where bio_integrity_process uses wrong bio_vec iterator bio integrity handling is broken on a system with LVM layered atop a DIF/DIX SCSI drive because device mapper clones the bio, modifies the clone, and sends the clone to the lower layers for processing. However, the clone bio has bi_vcnt == 0, which means that when the sd driver calls bio_integrity_process to attach DIX data, the for_each_segment_all() call (which uses bi_vcnt) returns immediately and random garbage is sent to the disk on a disk write. The disk of course returns an error. Therefore, teach bio_integrity_process() to use bio_for_each_segment() to iterate the bio_vecs, since the per-bio iterator tracks which bio_vecs are associated with that particular bio. The integrity handling code is effectively part of the "driver" (it's not the bio owner), so it must use the correct iterator function. v2: Fix a compiler warning about abandoned local variables. This patch supersedes "block: bio_integrity_process uses wrong bio_vec iterator". Patch applies against 3.18-rc6. Signed-off-by: Darrick J. Wong Acked-by: Martin K. Petersen Signed-off-by: Jens Axboe --- block/bio-integrity.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/block/bio-integrity.c b/block/bio-integrity.c index 0984232e429f..5cbd5d9ea61d 100644 --- a/block/bio-integrity.c +++ b/block/bio-integrity.c @@ -216,9 +216,10 @@ static int bio_integrity_process(struct bio *bio, { struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev); struct blk_integrity_iter iter; - struct bio_vec *bv; + struct bvec_iter bviter; + struct bio_vec bv; struct bio_integrity_payload *bip = bio_integrity(bio); - unsigned int i, ret = 0; + unsigned int ret = 0; void *prot_buf = page_address(bip->bip_vec->bv_page) + bip->bip_vec->bv_offset; @@ -227,11 +228,11 @@ static int bio_integrity_process(struct bio *bio, iter.seed = bip_get_seed(bip); iter.prot_buf = prot_buf; - bio_for_each_segment_all(bv, bio, i) { - void *kaddr = kmap_atomic(bv->bv_page); + bio_for_each_segment(bv, bio, bviter) { + void *kaddr = kmap_atomic(bv.bv_page); - iter.data_buf = kaddr + bv->bv_offset; - iter.data_size = bv->bv_len; + iter.data_buf = kaddr + bv.bv_offset; + iter.data_size = bv.bv_len; ret = proc_fn(&iter); if (ret) { -- cgit v1.2.3 From 86b276385c6a986872e4cd144f5940b156053c3f Mon Sep 17 00:00:00 2001 From: Christian König Date: Thu, 27 Nov 2014 13:12:58 +0100 Subject: drm/radeon: sync all BOs involved in a CS v2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Not just the userspace relocs, otherwise we won't wait for a swapped out page tables to be swapped in again. v2: rebased on Alex current drm-fixes-3.18 Signed-off-by: Christian König Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher --- drivers/gpu/drm/radeon/radeon_cs.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index a3e7aed7e680..6f377de099f9 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -251,22 +251,19 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority static int radeon_cs_sync_rings(struct radeon_cs_parser *p) { - int i, r = 0; + struct radeon_cs_reloc *reloc; + int r; - for (i = 0; i < p->nrelocs; i++) { + list_for_each_entry(reloc, &p->validated, tv.head) { struct reservation_object *resv; - if (!p->relocs[i].robj) - continue; - - resv = p->relocs[i].robj->tbo.resv; + resv = reloc->robj->tbo.resv; r = radeon_semaphore_sync_resv(p->rdev, p->ib.semaphore, resv, - p->relocs[i].tv.shared); - + reloc->tv.shared); if (r) - break; + return r; } - return r; + return 0; } /* XXX: note that this is called from the legacy UMS CS ioctl as well */ -- cgit v1.2.3 From a08b588e4199e4200d26027ffcdf3ab2fa906412 Mon Sep 17 00:00:00 2001 From: Michel Dänzer Date: Thu, 27 Nov 2014 18:00:54 +0900 Subject: drm/radeon: Ignore RADEON_GEM_GTT_WC on 32-bit x86 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84627 Signed-off-by: Michel Dänzer Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/radeon_object.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index 99a960a4f302..4c0d786d5c7a 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -213,6 +213,13 @@ int radeon_bo_create(struct radeon_device *rdev, if (!(rdev->flags & RADEON_IS_PCIE)) bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC); +#ifdef CONFIG_X86_32 + /* XXX: Write-combined CPU mappings of GTT seem broken on 32-bit + * See https://bugs.freedesktop.org/show_bug.cgi?id=84627 + */ + bo->flags &= ~RADEON_GEM_GTT_WC; +#endif + radeon_ttm_placement_from_domain(bo, domain); /* Kernel allocation are uninterruptible */ down_read(&rdev->pm.mclk_lock); -- cgit v1.2.3 From f5475cc43c899e33098d4db44b7c5e710f16589d Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Thu, 27 Nov 2014 16:57:21 +0100 Subject: drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6 I was unable too boot 3.18.0-rc6 because of the following kernel panic in drm_calc_vbltimestamp_from_scanoutpos(): [drm] Initialized drm 1.1.0 20060810 [drm] radeon kernel modesetting enabled. [drm] initializing kernel modesetting (RV100 0x1002:0x515E 0x15D9:0x8080). [drm] register mmio base: 0xC8400000 [drm] register mmio size: 65536 radeon 0000:0b:01.0: VRAM: 128M 0x00000000D0000000 - 0x00000000D7FFFFFF (16M used) radeon 0000:0b:01.0: GTT: 512M 0x00000000B0000000 - 0x00000000CFFFFFFF [drm] Detected VRAM RAM=128M, BAR=128M [drm] RAM width 16bits DDR [TTM] Zone kernel: Available graphics memory: 3829346 kiB [TTM] Zone dma32: Available graphics memory: 2097152 kiB [TTM] Initializing pool allocator [TTM] Initializing DMA pool allocator [drm] radeon: 16M of VRAM memory ready [drm] radeon: 512M of GTT memory ready. [drm] GART: num cpu pages 131072, num gpu pages 131072 [drm] PCI GART of 512M enabled (table at 0x0000000037880000). radeon 0000:0b:01.0: WB disabled radeon 0000:0b:01.0: fence driver on ring 0 use gpu addr 0x00000000b0000000 and cpu addr 0xffff8800bbbfa000 [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [drm] Driver supports precise vblank timestamp query. [drm] radeon: irq initialized. [drm] Loading R100 Microcode radeon 0000:0b:01.0: Direct firmware load for radeon/R100_cp.bin failed with error -2 radeon_cp: Failed to load firmware "radeon/R100_cp.bin" [drm:r100_cp_init] *ERROR* Failed to load firmware! radeon 0000:0b:01.0: failed initializing CP (-2). radeon 0000:0b:01.0: Disabling GPU acceleration [drm] radeon: cp finalized BUG: unable to handle kernel NULL pointer dereference at 000000000000025c IP: [] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320 PGD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-4-default #2649 Hardware name: Supermicro X7DB8/X7DB8, BIOS 6.00 07/26/2006 task: ffff880234da2010 ti: ffff880234da4000 task.ti: ffff880234da4000 RIP: 0010:[] [] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320 RSP: 0000:ffff880234da7918 EFLAGS: 00010086 RAX: ffffffff81557890 RBX: 0000000000000000 RCX: ffff880234da7a48 RDX: ffff880234da79f4 RSI: 0000000000000000 RDI: ffff880232e15000 RBP: ffff880234da79b8 R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000000a R11: 0000000000000001 R12: ffff880232dda1c0 R13: ffff880232e1518c R14: 0000000000000292 R15: ffff880232e15000 FS: 0000000000000000(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000000025c CR3: 0000000002014000 CR4: 00000000000007e0 Stack: ffff880234da79d8 0000000000000286 ffff880232dcbc00 0000000000002480 ffff880234da7958 0000000000000296 ffff880234da7998 ffffffff8151b51d ffff880234da7a48 0000000032dcbeb0 ffff880232dcbc00 ffff880232dcbc58 Call Trace: [] ? drm_vma_offset_remove+0x1d/0x110 [] radeon_get_vblank_timestamp_kms+0x38/0x60 [] ? ttm_bo_release_list+0xba/0x180 [] drm_get_last_vbltimestamp+0x41/0x70 [] vblank_disable_and_save+0x73/0x1d0 [] ? try_to_del_timer_sync+0x4f/0x70 [] drm_vblank_cleanup+0x65/0xa0 [] radeon_irq_kms_fini+0x1a/0x70 [] r100_init+0x26e/0x410 [] radeon_device_init+0x7ae/0xb50 [] radeon_driver_load_kms+0x8f/0x210 [] drm_dev_register+0xb5/0x110 [] drm_get_pci_dev+0x8f/0x200 [] radeon_pci_probe+0xad/0xe0 [] local_pci_probe+0x45/0xa0 [] pci_device_probe+0xd1/0x130 [] driver_probe_device+0x12d/0x3e0 [] __driver_attach+0x9b/0xa0 [] ? __device_attach+0x40/0x40 [] bus_for_each_dev+0x63/0xa0 [] driver_attach+0x1e/0x20 [] bus_add_driver+0x180/0x240 [] driver_register+0x64/0xf0 [] __pci_register_driver+0x4c/0x50 [] drm_pci_init+0xf5/0x120 [] ? ttm_init+0x6a/0x6a [] radeon_init+0x97/0xb5 [] do_one_initcall+0xbc/0x1f0 [] ? __wake_up+0x48/0x60 [] kernel_init_freeable+0x18a/0x215 [] ? initcall_blacklist+0xc0/0xc0 [] ? rest_init+0x80/0x80 [] kernel_init+0xe/0xf0 [] ret_from_fork+0x7c/0xb0 [] ? rest_init+0x80/0x80 Code: 45 ac 0f 88 a8 01 00 00 3b b7 d0 01 00 00 49 89 ff 0f 83 99 01 00 00 48 8b 47 20 48 8b 80 88 00 00 00 48 85 c0 0f 84 cd 01 00 00 <41> 8b b1 5c 02 00 00 41 8b 89 58 02 00 00 89 75 98 41 8b b1 60 RIP [] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320 RSP CR2: 000000000000025c ---[ end trace ad2c0aadf48e2032 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 It has helped me to add a NULL pointer check that was suggested at http://lists.freedesktop.org/archives/dri-devel/2014-October/070663.html I am not familiar with the code. But the change looks sane and we need something fast at this stage of 3.18 development. Suggested-by: Helge Deller Signed-off-by: Petr Mladek Tested-by: Petr Mladek Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/radeon_kms.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index 8309b11e674d..03586763ee86 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -795,6 +795,8 @@ int radeon_get_vblank_timestamp_kms(struct drm_device *dev, int crtc, /* Get associated drm_crtc: */ drmcrtc = &rdev->mode_info.crtcs[crtc]->base; + if (!drmcrtc) + return -EINVAL; /* Helper routine in DRM core does all the work: */ return drm_calc_vbltimestamp_from_scanoutpos(dev, crtc, max_error, -- cgit v1.2.3 From bc127bda37db2792fa7f02a8d258731ecf3e4b8b Mon Sep 17 00:00:00 2001 From: Rafael Aquini Date: Tue, 2 Dec 2014 15:59:22 -0800 Subject: mm: do not overwrite reserved pages counter at show_mem() Minor fixlet to perform the reserved pages counter aggregation for each node, at show_mem() Signed-off-by: Rafael Aquini Acked-by: Rik van Riel Acked-by: Johannes Weiner Acked-by: Yasuaki Ishimatsu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- lib/show_mem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/show_mem.c b/lib/show_mem.c index 09225796991a..5e256271b47b 100644 --- a/lib/show_mem.c +++ b/lib/show_mem.c @@ -28,7 +28,7 @@ void show_mem(unsigned int filter) continue; total += zone->present_pages; - reserved = zone->present_pages - zone->managed_pages; + reserved += zone->present_pages - zone->managed_pages; if (is_highmem_idx(zoneid)) highmem += zone->present_pages; -- cgit v1.2.3 From fb993fa1a2f669215fa03a09eed7848f2663e336 Mon Sep 17 00:00:00 2001 From: Weijie Yang Date: Tue, 2 Dec 2014 15:59:25 -0800 Subject: mm: frontswap: invalidate expired data on a dup-store failure If a frontswap dup-store failed, it should invalidate the expired page in the backend, or it could trigger some data corruption issue. Such as: 1. use zswap as the frontswap backend with writeback feature 2. store a swap page(version_1) to entry A, success 3. dup-store a newer page(version_2) to the same entry A, fail 4. use __swap_writepage() write version_2 page to swapfile, success 5. zswap do shrink, writeback version_1 page to swapfile 6. version_2 page is overwrited by version_1, data corrupt. This patch fixes this issue by invalidating expired data immediately when meet a dup-store failure. Signed-off-by: Weijie Yang Cc: Konrad Rzeszutek Wilk Cc: Seth Jennings Cc: Dan Streetman Cc: Minchan Kim Cc: Bob Liu Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/frontswap.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/frontswap.c b/mm/frontswap.c index c30eec536f03..f2a3571c6e22 100644 --- a/mm/frontswap.c +++ b/mm/frontswap.c @@ -244,8 +244,10 @@ int __frontswap_store(struct page *page) the (older) page from frontswap */ inc_frontswap_failed_stores(); - if (dup) + if (dup) { __frontswap_clear(sis, offset); + frontswap_ops->invalidate_page(type, offset); + } } if (frontswap_writethrough_enabled) /* report failure so swap also writes to swap device */ -- cgit v1.2.3 From 91b57191cfd152c02ded0745250167d0263084f8 Mon Sep 17 00:00:00 2001 From: Andrew Morton Date: Tue, 2 Dec 2014 15:59:28 -0800 Subject: mm/vmpressure.c: fix race in vmpressure_work_fn() In some android devices, there will be a "divide by zero" exception. vmpr->scanned could be zero before spin_lock(&vmpr->sr_lock). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=88051 [akpm@linux-foundation.org: neaten] Reported-by: ji_ang Cc: Anton Vorontsov Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/vmpressure.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/vmpressure.c b/mm/vmpressure.c index d4042e75f7c7..c5afd573d7da 100644 --- a/mm/vmpressure.c +++ b/mm/vmpressure.c @@ -165,6 +165,7 @@ static void vmpressure_work_fn(struct work_struct *work) unsigned long scanned; unsigned long reclaimed; + spin_lock(&vmpr->sr_lock); /* * Several contexts might be calling vmpressure(), so it is * possible that the work was rescheduled again before the old @@ -173,11 +174,12 @@ static void vmpressure_work_fn(struct work_struct *work) * here. No need for any locks here since we don't care if * vmpr->reclaimed is in sync. */ - if (!vmpr->scanned) + scanned = vmpr->scanned; + if (!scanned) { + spin_unlock(&vmpr->sr_lock); return; + } - spin_lock(&vmpr->sr_lock); - scanned = vmpr->scanned; reclaimed = vmpr->reclaimed; vmpr->scanned = 0; vmpr->reclaimed = 0; -- cgit v1.2.3 From 8d609725d4357f499e2103e46011308b32f53513 Mon Sep 17 00:00:00 2001 From: Seth Forshee Date: Tue, 25 Nov 2014 20:28:24 -0600 Subject: xen-netfront: Remove BUGs on paged skb data which crosses a page boundary These BUGs can be erroneously triggered by frags which refer to tail pages within a compound page. The data in these pages may overrun the hardware page while still being contained within the compound page, but since compound_order() evaluates to 0 for tail pages the assertion fails. The code already iterates through subsequent pages correctly in this scenario, so the BUGs are unnecessary and can be removed. Fixes: f36c374782e4 ("xen/netfront: handle compound page fragments on transmit") Cc: # 3.7+ Signed-off-by: Seth Forshee Reviewed-by: David Vrabel Signed-off-by: David S. Miller --- drivers/net/xen-netfront.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index cca871346a0f..ece8d1804d13 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -496,9 +496,6 @@ static void xennet_make_frags(struct sk_buff *skb, struct netfront_queue *queue, len = skb_frag_size(frag); offset = frag->page_offset; - /* Data must not cross a page boundary. */ - BUG_ON(len + offset > PAGE_SIZE<> PAGE_SHIFT; offset &= ~PAGE_MASK; @@ -506,8 +503,6 @@ static void xennet_make_frags(struct sk_buff *skb, struct netfront_queue *queue, while (len > 0) { unsigned long bytes; - BUG_ON(offset >= PAGE_SIZE); - bytes = PAGE_SIZE - offset; if (bytes > len) bytes = len; -- cgit v1.2.3 From 4c2d518695338801110bc166eece6aa02822b0b4 Mon Sep 17 00:00:00 2001 From: Hariprasad Shenai Date: Fri, 28 Nov 2014 18:35:14 +0530 Subject: cxgb4: Fill in supported link mode for SFP modules Signed-off-by: Hariprasad Shenai Signed-off-by: David S. Miller --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 8520d5529df8..279873cb6e3a 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -2442,9 +2442,13 @@ static unsigned int from_fw_linkcaps(unsigned int type, unsigned int caps) SUPPORTED_10000baseKR_Full | SUPPORTED_1000baseKX_Full | SUPPORTED_10000baseKX4_Full; else if (type == FW_PORT_TYPE_FIBER_XFI || - type == FW_PORT_TYPE_FIBER_XAUI || type == FW_PORT_TYPE_SFP) + type == FW_PORT_TYPE_FIBER_XAUI || type == FW_PORT_TYPE_SFP) { v |= SUPPORTED_FIBRE; - else if (type == FW_PORT_TYPE_BP40_BA) + if (caps & FW_PORT_CAP_SPEED_1G) + v |= SUPPORTED_1000baseT_Full; + if (caps & FW_PORT_CAP_SPEED_10G) + v |= SUPPORTED_10000baseT_Full; + } else if (type == FW_PORT_TYPE_BP40_BA) v |= SUPPORTED_40000baseSR4_Full; if (caps & FW_PORT_CAP_ANEG) -- cgit v1.2.3 From 92788ac1eb06e69a822de45e2a8a63fa45eb5be2 Mon Sep 17 00:00:00 2001 From: Andrew Morton Date: Tue, 2 Dec 2014 15:59:31 -0800 Subject: drivers/input/evdev.c: don't kfree() a vmalloc address If kzalloc() failed and then evdev_open_device() fails, evdev_open() will pass a vmalloc'ed pointer to kfree. This might fix https://bugzilla.kernel.org/show_bug.cgi?id=88401, where there was a crash in kfree(). Reported-by: Christian Casteyde Belatedly-Acked-by: Dmitry Torokhov Cc: Henrik Rydberg Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- drivers/input/evdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c index bc203485716d..8afa28e4570e 100644 --- a/drivers/input/evdev.c +++ b/drivers/input/evdev.c @@ -421,7 +421,7 @@ static int evdev_open(struct inode *inode, struct file *file) err_free_client: evdev_detach_client(evdev, client); - kfree(client); + kvfree(client); return error; } -- cgit v1.2.3 From e8577d1f0329d4842e8302e289fb2c22156abef4 Mon Sep 17 00:00:00 2001 From: Manfred Spraul Date: Tue, 2 Dec 2014 15:59:34 -0800 Subject: ipc/sem.c: fully initialize sem_array before making it visible ipc_addid() makes a new ipc identifier visible to everyone. New objects start as locked, so that the caller can complete the initialization after the call. Within struct sem_array, at least sma->sem_base and sma->sem_nsems are accessed without any locks, therefore this approach doesn't work. Thus: Move the ipc_addid() to the end of the initialization. Signed-off-by: Manfred Spraul Reported-by: Rik van Riel Acked-by: Rik van Riel Acked-by: Davidlohr Bueso Acked-by: Rafael Aquini Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- ipc/sem.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/ipc/sem.c b/ipc/sem.c index 454f6c6020a8..53c3310f41c6 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -507,13 +507,6 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params) return retval; } - id = ipc_addid(&sem_ids(ns), &sma->sem_perm, ns->sc_semmni); - if (id < 0) { - ipc_rcu_putref(sma, sem_rcu_free); - return id; - } - ns->used_sems += nsems; - sma->sem_base = (struct sem *) &sma[1]; for (i = 0; i < nsems; i++) { @@ -528,6 +521,14 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params) INIT_LIST_HEAD(&sma->list_id); sma->sem_nsems = nsems; sma->sem_ctime = get_seconds(); + + id = ipc_addid(&sem_ids(ns), &sma->sem_perm, ns->sc_semmni); + if (id < 0) { + ipc_rcu_putref(sma, sem_rcu_free); + return id; + } + ns->used_sems += nsems; + sem_unlock(sma, -1); rcu_read_unlock(); -- cgit v1.2.3 From 1ead0e79bfedd4b563b8ea7c585ca3884b0c89a7 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Tue, 2 Dec 2014 15:59:37 -0800 Subject: fat: fix oops on corrupted vfat fs a) don't bother with ->d_time for positives - we only check it for negatives anyway. b) make sure to set it at unlink and rmdir time - at *that* point soon-to-be negative dentry matches then-current directory contents c) don't go into renaming of old alias in vfat_lookup() unless it has the same parent (which it will, unless we are seeing corrupted image) [hirofumi@mail.parknet.co.jp: make change minimum, don't call d_move() for dir] Signed-off-by: Al Viro Signed-off-by: OGAWA Hirofumi Cc: [3.17.x] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/fat/namei_vfat.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index 6df8d3d885e5..b8b92c2f9683 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -736,7 +736,12 @@ static struct dentry *vfat_lookup(struct inode *dir, struct dentry *dentry, } alias = d_find_alias(inode); - if (alias && !vfat_d_anon_disconn(alias)) { + /* + * Checking "alias->d_parent == dentry->d_parent" to make sure + * FS is not corrupted (especially double linked dir). + */ + if (alias && alias->d_parent == dentry->d_parent && + !vfat_d_anon_disconn(alias)) { /* * This inode has non anonymous-DCACHE_DISCONNECTED * dentry. This means, the user did ->lookup() by an @@ -755,12 +760,9 @@ static struct dentry *vfat_lookup(struct inode *dir, struct dentry *dentry, out: mutex_unlock(&MSDOS_SB(sb)->s_lock); - dentry->d_time = dentry->d_parent->d_inode->i_version; - dentry = d_splice_alias(inode, dentry); - if (dentry) - dentry->d_time = dentry->d_parent->d_inode->i_version; - return dentry; - + if (!inode) + dentry->d_time = dir->i_version; + return d_splice_alias(inode, dentry); error: mutex_unlock(&MSDOS_SB(sb)->s_lock); return ERR_PTR(err); @@ -793,7 +795,6 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode, inode->i_mtime = inode->i_atime = inode->i_ctime = ts; /* timestamp is already written, so mark_inode_dirty() is unneeded. */ - dentry->d_time = dentry->d_parent->d_inode->i_version; d_instantiate(dentry, inode); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); @@ -824,6 +825,7 @@ static int vfat_rmdir(struct inode *dir, struct dentry *dentry) clear_nlink(inode); inode->i_mtime = inode->i_atime = CURRENT_TIME_SEC; fat_detach(inode); + dentry->d_time = dir->i_version; out: mutex_unlock(&MSDOS_SB(sb)->s_lock); @@ -849,6 +851,7 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry) clear_nlink(inode); inode->i_mtime = inode->i_atime = CURRENT_TIME_SEC; fat_detach(inode); + dentry->d_time = dir->i_version; out: mutex_unlock(&MSDOS_SB(sb)->s_lock); @@ -889,7 +892,6 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) inode->i_mtime = inode->i_atime = inode->i_ctime = ts; /* timestamp is already written, so mark_inode_dirty() is unneeded. */ - dentry->d_time = dentry->d_parent->d_inode->i_version; d_instantiate(dentry, inode); mutex_unlock(&MSDOS_SB(sb)->s_lock); -- cgit v1.2.3 From 2022b4d18a491a578218ce7a4eca8666db895a73 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Tue, 2 Dec 2014 15:59:39 -0800 Subject: mm: fix swapoff hang after page migration and fork I've been seeing swapoff hangs in recent testing: it's cycling around trying unsuccessfully to find an mm for some remaining pages of swap. I have been exercising swap and page migration more heavily recently, and now notice a long-standing error in copy_one_pte(): it's trying to add dst_mm to swapoff's mmlist when it finds a swap entry, but is doing so even when it's a migration entry or an hwpoison entry. Which wouldn't matter much, except it adds dst_mm next to src_mm, assuming src_mm is already on the mmlist: which may not be so. Then if pages are later swapped out from dst_mm, swapoff won't be able to find where to replace them. There's already a !non_swap_entry() test for stats: move that up before the swap_duplicate() and the addition to mmlist. Signed-off-by: Hugh Dickins Cc: Kelley Nielsen Cc: [2.6.18+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/memory.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 3e503831e042..d5f2ae9c4a23 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -815,20 +815,20 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, if (!pte_file(pte)) { swp_entry_t entry = pte_to_swp_entry(pte); - if (swap_duplicate(entry) < 0) - return entry.val; - - /* make sure dst_mm is on swapoff's mmlist. */ - if (unlikely(list_empty(&dst_mm->mmlist))) { - spin_lock(&mmlist_lock); - if (list_empty(&dst_mm->mmlist)) - list_add(&dst_mm->mmlist, - &src_mm->mmlist); - spin_unlock(&mmlist_lock); - } - if (likely(!non_swap_entry(entry))) + if (likely(!non_swap_entry(entry))) { + if (swap_duplicate(entry) < 0) + return entry.val; + + /* make sure dst_mm is on swapoff's mmlist. */ + if (unlikely(list_empty(&dst_mm->mmlist))) { + spin_lock(&mmlist_lock); + if (list_empty(&dst_mm->mmlist)) + list_add(&dst_mm->mmlist, + &src_mm->mmlist); + spin_unlock(&mmlist_lock); + } rss[MM_SWAPENTS]++; - else if (is_migration_entry(entry)) { + } else if (is_migration_entry(entry)) { page = migration_entry_to_page(entry); if (PageAnon(page)) -- cgit v1.2.3 From c4ea95d7cd08d9ffd7fa75e6c5e0332d596dd11e Mon Sep 17 00:00:00 2001 From: Daniel Forrest Date: Tue, 2 Dec 2014 15:59:42 -0800 Subject: mm: fix anon_vma_clone() error treatment Andrew Morton noticed that the error return from anon_vma_clone() was being dropped and replaced with -ENOMEM (which is not itself a bug because the only error return value from anon_vma_clone() is -ENOMEM). I did an audit of callers of anon_vma_clone() and discovered an actual bug where the error return was being lost. In __split_vma(), between Linux 3.11 and 3.12 the code was changed so the err variable is used before the call to anon_vma_clone() and the default initial value of -ENOMEM is overwritten. So a failure of anon_vma_clone() will return success since err at this point is now zero. Below is a patch which fixes this bug and also propagates the error return value from anon_vma_clone() in all cases. Fixes: ef0855d334e1 ("mm: mempolicy: turn vma_set_policy() into vma_dup_policy()") Signed-off-by: Daniel Forrest Reviewed-by: Michal Hocko Cc: Konstantin Khlebnikov Cc: Andrea Arcangeli Cc: Rik van Riel Cc: Tim Hartrick Cc: Hugh Dickins Cc: Michel Lespinasse Cc: Vlastimil Babka Cc: [3.12+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/mmap.c | 10 +++++++--- mm/rmap.c | 6 ++++-- 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 87e82b38453c..ae919891a087 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -776,8 +776,11 @@ again: remove_next = 1 + (end > next->vm_end); * shrinking vma had, to cover any anon pages imported. */ if (exporter && exporter->anon_vma && !importer->anon_vma) { - if (anon_vma_clone(importer, exporter)) - return -ENOMEM; + int error; + + error = anon_vma_clone(importer, exporter); + if (error) + return error; importer->anon_vma = exporter->anon_vma; } } @@ -2469,7 +2472,8 @@ static int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma, if (err) goto out_free_vma; - if (anon_vma_clone(new, vma)) + err = anon_vma_clone(new, vma); + if (err) goto out_free_mpol; if (new->vm_file) diff --git a/mm/rmap.c b/mm/rmap.c index 19886fb2f13a..3e4c7213210c 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -274,6 +274,7 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma) { struct anon_vma_chain *avc; struct anon_vma *anon_vma; + int error; /* Don't bother if the parent process has no anon_vma here. */ if (!pvma->anon_vma) @@ -283,8 +284,9 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma) * First, attach the new VMA to the parent VMA's anon_vmas, * so rmap can find non-COWed pages in child processes. */ - if (anon_vma_clone(vma, pvma)) - return -ENOMEM; + error = anon_vma_clone(vma, pvma); + if (error) + return error; /* Then add our own anon_vma. */ anon_vma = anon_vma_alloc(); -- cgit v1.2.3 From b724aa213df7aee08d57237ee0bbafffef09aea5 Mon Sep 17 00:00:00 2001 From: Michal Simek Date: Tue, 2 Dec 2014 15:59:45 -0800 Subject: lib/genalloc.c: export devm_gen_pool_create() for modules Modules can use this function for creating pool. Signed-off-by: Michal Simek Acked-by: Lad, Prabhakar Cc: Laura Abbott Cc: Olof Johansson Cc: Catalin Marinas Cc: Will Deacon Cc: Vladimir Zapolskiy Cc: Philipp Zabel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- lib/genalloc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/genalloc.c b/lib/genalloc.c index cce4dd68c40d..2e65d206b01c 100644 --- a/lib/genalloc.c +++ b/lib/genalloc.c @@ -598,6 +598,7 @@ struct gen_pool *devm_gen_pool_create(struct device *dev, int min_alloc_order, return pool; } +EXPORT_SYMBOL(devm_gen_pool_create); /** * dev_get_gen_pool - Obtain the gen_pool (if any) for a device -- cgit v1.2.3 From 7c3fbbdd04a681a1992ad6a3d7a36a63ff668753 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Tue, 2 Dec 2014 15:59:48 -0800 Subject: slab: fix nodeid bounds check for non-contiguous node IDs The bounds check for nodeid in ____cache_alloc_node gives false positives on machines where the node IDs are not contiguous, leading to a panic at boot time. For example, on a POWER8 machine the node IDs are typically 0, 1, 16 and 17. This means that num_online_nodes() returns 4, so when ____cache_alloc_node is called with nodeid = 16 the VM_BUG_ON triggers, like this: kernel BUG at /home/paulus/kernel/kvm/mm/slab.c:3079! Call Trace: .____cache_alloc_node+0x5c/0x270 (unreliable) .kmem_cache_alloc_node_trace+0xdc/0x360 .init_list+0x3c/0x128 .kmem_cache_init+0x1dc/0x258 .start_kernel+0x2a0/0x568 start_here_common+0x20/0xa8 To fix this, we instead compare the nodeid with MAX_NUMNODES, and additionally make sure it isn't negative (since nodeid is an int). The check is there mainly to protect the array dereference in the get_node() call in the next line, and the array being dereferenced is of size MAX_NUMNODES. If the nodeid is in range but invalid (for example if the node is off-line), the BUG_ON in the next line will catch that. Fixes: 14e50c6a9bc2 ("mm: slab: Verify the nodeid passed to ____cache_alloc_node") Signed-off-by: Paul Mackerras Reviewed-by: Yasuaki Ishimatsu Reviewed-by: Pekka Enberg Acked-by: David Rientjes Cc: Christoph Lameter Cc: Joonsoo Kim Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/slab.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slab.c b/mm/slab.c index eb2b2ea30130..f34e053ec46e 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -3076,7 +3076,7 @@ static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, void *obj; int x; - VM_BUG_ON(nodeid > num_online_nodes()); + VM_BUG_ON(nodeid < 0 || nodeid >= MAX_NUMNODES); n = get_node(cachep, nodeid); BUG_ON(!n); -- cgit v1.2.3 From 7cc78f8fa02c2485104b86434acbc1538a3bd807 Mon Sep 17 00:00:00 2001 From: Andy Lutomirski Date: Wed, 3 Dec 2014 15:37:08 -0800 Subject: context_tracking: Restore previous state in schedule_user MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It appears that some SCHEDULE_USER (asm for schedule_user) callers in arch/x86/kernel/entry_64.S are called from RCU kernel context, and schedule_user will return in RCU user context. This causes RCU warnings and possible failures. This is intended to be a minimal fix suitable for 3.18. Reported-and-tested-by: Dave Jones Cc: Oleg Nesterov Cc: Frédéric Weisbecker Acked-by: Paul E. McKenney Signed-off-by: Andy Lutomirski Signed-off-by: Linus Torvalds --- kernel/sched/core.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 24beb9bb4c3e..89e7283015a6 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2874,10 +2874,14 @@ asmlinkage __visible void __sched schedule_user(void) * or we have been woken up remotely but the IPI has not yet arrived, * we haven't yet exited the RCU idle mode. Do it here manually until * we find a better solution. + * + * NB: There are buggy callers of this function. Ideally we + * should warn if prev_state != IN_USER, but that will trigger + * too frequently to make sense yet. */ - user_exit(); + enum ctx_state prev_state = exception_enter(); schedule(); - user_enter(); + exception_exit(prev_state); } #endif -- cgit v1.2.3 From 6fb2a756739aa507c1fd5b8126f0bfc2f070dc46 Mon Sep 17 00:00:00 2001 From: Tom Herbert Date: Sat, 29 Nov 2014 09:59:45 -0800 Subject: gre: Set inner mac header in gro complete Set the inner mac header to point to the GRE payload when doing GRO. This is needed if we proceed to send the packet through GRE GSO which now uses the inner mac header instead of inner network header to determine the length of encapsulation headers. Fixes: 14051f0452a2 ("gre: Use inner mac length when computing tunnel length") Reported-by: Wolfgang Walter Signed-off-by: Tom Herbert Signed-off-by: David S. Miller --- net/ipv4/gre_offload.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c index bb5947b0ce2d..51973ddc05a6 100644 --- a/net/ipv4/gre_offload.c +++ b/net/ipv4/gre_offload.c @@ -247,6 +247,9 @@ static int gre_gro_complete(struct sk_buff *skb, int nhoff) err = ptype->callbacks.gro_complete(skb, nhoff + grehlen); rcu_read_unlock(); + + skb_set_inner_mac_header(skb, nhoff + grehlen); + return err; } -- cgit v1.2.3 From 769e0de6475e5512f88bfb4dbf6d6323fd23514f Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Sat, 29 Nov 2014 14:46:13 -0800 Subject: bpf: x86: fix epilogue generation for eBPF programs classic BPF has a restriction that last insn is always BPF_RET. eBPF doesn't have BPF_RET instruction and this restriction. It has BPF_EXIT insn which can appear anywhere in the program one or more times and it doesn't have to be last insn. Fix eBPF JIT to emit epilogue when first BPF_EXIT is seen and all other BPF_EXIT instructions will be emitted as jump. Since jump offset to epilogue is computed as: jmp_offset = ctx->cleanup_addr - addrs[i] we need to change type of cleanup_addr to signed to compute the offset as: (long long) ((int)20 - (int)30) instead of: (long long) ((unsigned int)20 - (int)30) Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Signed-off-by: Alexei Starovoitov Signed-off-by: David S. Miller --- arch/x86/net/bpf_jit_comp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 3f627345d51c..7e90244c84e3 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -178,7 +178,7 @@ static void jit_fill_hole(void *area, unsigned int size) } struct jit_context { - unsigned int cleanup_addr; /* epilogue code offset */ + int cleanup_addr; /* epilogue code offset */ bool seen_ld_abs; }; @@ -192,6 +192,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, struct bpf_insn *insn = bpf_prog->insnsi; int insn_cnt = bpf_prog->len; bool seen_ld_abs = ctx->seen_ld_abs | (oldproglen == 0); + bool seen_exit = false; u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; int i; int proglen = 0; @@ -854,10 +855,11 @@ common_load: goto common_load; case BPF_JMP | BPF_EXIT: - if (i != insn_cnt - 1) { + if (seen_exit) { jmp_offset = ctx->cleanup_addr - addrs[i]; goto emit_jmp; } + seen_exit = true; /* update cleanup_addr */ ctx->cleanup_addr = proglen; /* mov rbx, qword ptr [rbp-X] */ -- cgit v1.2.3 From 8961b1940200ac5e91bee1c1bc69086365e1b7c9 Mon Sep 17 00:00:00 2001 From: Lino Sanfilippo Date: Sun, 30 Nov 2014 11:49:36 +0100 Subject: pxa168: close race between napi and irq activation In pxa168_eth_open() the irqs are enabled before napi. This opens a tiny time window in which the irq handler is processed, disables irqs but then is not able to schedule the not yet activated napi, leaving irqs disabled forever (since irqs are reenabled in napi poll function). Fix this race by activating napi before irqs are activated. Signed-off-by: Lino Sanfilippo Signed-off-by: David S. Miller --- drivers/net/ethernet/marvell/pxa168_eth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/pxa168_eth.c b/drivers/net/ethernet/marvell/pxa168_eth.c index c3b209cd0660..a3e394c47819 100644 --- a/drivers/net/ethernet/marvell/pxa168_eth.c +++ b/drivers/net/ethernet/marvell/pxa168_eth.c @@ -1153,8 +1153,8 @@ static int pxa168_eth_open(struct net_device *dev) pep->rx_used_desc_q = 0; pep->rx_curr_desc_q = 0; netif_carrier_off(dev); - eth_port_start(dev); napi_enable(&pep->napi); + eth_port_start(dev); return 0; out_free_rx_skb: rxq_deinit(dev); -- cgit v1.2.3 From 6276288a4c1f755e6b65b0f2e1177855b0340e31 Mon Sep 17 00:00:00 2001 From: Lino Sanfilippo Date: Sun, 30 Nov 2014 12:51:31 +0100 Subject: skge: Unmask interrupts in case of spurious interrupts In case of a spurious interrupt dont forget to reenable the interrupts that have been masked by reading the interrupt source register. Signed-off-by: Lino Sanfilippo Acked-by: Mirko Lindner Signed-off-by: David S. Miller --- drivers/net/ethernet/marvell/skge.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c index 264eab7d3b26..7173836fe361 100644 --- a/drivers/net/ethernet/marvell/skge.c +++ b/drivers/net/ethernet/marvell/skge.c @@ -3433,10 +3433,9 @@ static irqreturn_t skge_intr(int irq, void *dev_id) if (status & IS_HW_ERR) skge_error_irq(hw); - +out: skge_write32(hw, B0_IMSK, hw->intr_mask); skge_read32(hw, B0_IMSK); -out: spin_unlock(&hw->hw_lock); return IRQ_RETVAL(handled); -- cgit v1.2.3 From ea589e9b7838f5d1c3d4998f9fe08854872187fc Mon Sep 17 00:00:00 2001 From: Lino Sanfilippo Date: Sun, 30 Nov 2014 12:56:51 +0100 Subject: sky2: avoid pci write posting after disabling irqs In sky2_change_mtu setting B0_IMSK to 0 may be delayed due to PCI write posting which could result in irqs being still active when synchronize_irq is called. Since we are not prepared to handle any further irqs after synchronize_irq (our resources are freed after that) force the write by a consecutive read from the same register. Similar situation in sky2_all_down: Here we disabled irqs by a write to B0_IMSK but did not ensure that this write took place before synchronize_irq. Fix that too. Signed-off-by: Lino Sanfilippo Acked-by: Stephen Hemminger Signed-off-by: David S. Miller --- drivers/net/ethernet/marvell/sky2.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c index bd3366267039..f14544c8d73f 100644 --- a/drivers/net/ethernet/marvell/sky2.c +++ b/drivers/net/ethernet/marvell/sky2.c @@ -2419,6 +2419,7 @@ static int sky2_change_mtu(struct net_device *dev, int new_mtu) imask = sky2_read32(hw, B0_IMSK); sky2_write32(hw, B0_IMSK, 0); + sky2_read32(hw, B0_IMSK); dev->trans_start = jiffies; /* prevent tx timeout */ napi_disable(&hw->napi); @@ -3487,8 +3488,8 @@ static void sky2_all_down(struct sky2_hw *hw) int i; if (hw->flags & SKY2_HW_IRQ_SETUP) { - sky2_read32(hw, B0_IMSK); sky2_write32(hw, B0_IMSK, 0); + sky2_read32(hw, B0_IMSK); synchronize_irq(hw->pdev->irq); napi_disable(&hw->napi); -- cgit v1.2.3 From 4fc11e61f46dd66730a18e8136607386007ee8c9 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Mon, 1 Dec 2014 10:16:17 +0900 Subject: uapi: fix to export linux/vm_sockets.h A typo "header=y" was introduced by commit 7071cf7fc435 (uapi: add missing network related headers to kbuild). Signed-off-by: Masahiro Yamada Cc: Stephen Hemminger Signed-off-by: David S. Miller --- include/uapi/linux/Kbuild | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild index 4c94f31a8c99..8523f9bb72f2 100644 --- a/include/uapi/linux/Kbuild +++ b/include/uapi/linux/Kbuild @@ -427,7 +427,7 @@ header-y += virtio_net.h header-y += virtio_pci.h header-y += virtio_ring.h header-y += virtio_rng.h -header=y += vm_sockets.h +header-y += vm_sockets.h header-y += vt.h header-y += wait.h header-y += wanrouter.h -- cgit v1.2.3 From f2a01517f2a1040a0b156f171a7cefd748f2fd03 Mon Sep 17 00:00:00 2001 From: Pravin B Shelar Date: Sun, 30 Nov 2014 23:04:17 -0800 Subject: openvswitch: Fix flow mask validation. Following patch fixes typo in the flow validation. This prevented installation of ARP and IPv6 flows. Fixes: 19e7a3df72 ("openvswitch: Fix NDP flow mask validation") Signed-off-by: Pravin B Shelar Reviewed-by: Thomas Graf Signed-off-by: David S. Miller --- net/openvswitch/flow_netlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index 089b195c064a..918e96645b05 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -145,7 +145,7 @@ static bool match_validate(const struct sw_flow_match *match, if (match->key->eth.type == htons(ETH_P_ARP) || match->key->eth.type == htons(ETH_P_RARP)) { key_expected |= 1 << OVS_KEY_ATTR_ARP; - if (match->mask && (match->mask->key.tp.src == htons(0xff))) + if (match->mask && (match->mask->key.eth.type == htons(0xffff))) mask_allowed |= 1 << OVS_KEY_ATTR_ARP; } @@ -220,7 +220,7 @@ static bool match_validate(const struct sw_flow_match *match, htons(NDISC_NEIGHBOUR_SOLICITATION) || match->key->tp.src == htons(NDISC_NEIGHBOUR_ADVERTISEMENT)) { key_expected |= 1 << OVS_KEY_ATTR_ND; - if (match->mask && (match->mask->key.tp.src == htons(0xffff))) + if (match->mask && (match->mask->key.tp.src == htons(0xff))) mask_allowed |= 1 << OVS_KEY_ATTR_ND; } } -- cgit v1.2.3 From dc50ddcd4c58a5a0226038307d6ef884bec9f8c2 Mon Sep 17 00:00:00 2001 From: Stephane Grosjean Date: Fri, 28 Nov 2014 14:08:48 +0100 Subject: can: peak_usb: fix memset() usage This patchs fixes a misplaced call to memset() that fills the request buffer with 0. The problem was with sending PCAN_USBPRO_REQ_FCT requests, the content set by the caller was thus lost. With this patch, the memory area is zeroed only when requesting info from the device. Signed-off-by: Stephane Grosjean Cc: linux-stable Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/peak_usb/pcan_usb_pro.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c index 263dd921edc4..f7f796a2c50b 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c @@ -333,8 +333,6 @@ static int pcan_usb_pro_send_req(struct peak_usb_device *dev, int req_id, if (!(dev->state & PCAN_USB_STATE_CONNECTED)) return 0; - memset(req_addr, '\0', req_size); - req_type = USB_TYPE_VENDOR | USB_RECIP_OTHER; switch (req_id) { @@ -345,6 +343,7 @@ static int pcan_usb_pro_send_req(struct peak_usb_device *dev, int req_id, default: p = usb_rcvctrlpipe(dev->udev, 0); req_type |= USB_DIR_IN; + memset(req_addr, '\0', req_size); break; } -- cgit v1.2.3 From af35d0f1cce7a990286e2b94c260a2c2d2a0e4b0 Mon Sep 17 00:00:00 2001 From: Stephane Grosjean Date: Fri, 28 Nov 2014 13:49:10 +0100 Subject: can: peak_usb: fix cleanup sequence order in case of error during init This patch sets the correct reverse sequence order to the instructions set to run, when any failure occurs during the initialization steps. It also adds the missing unregistration call of the can device if the failure appears after having been registered. Signed-off-by: Stephane Grosjean Cc: linux-stable Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/peak_usb/pcan_usb_core.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c index 644e6ab8a489..dc807e10f802 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c @@ -735,7 +735,7 @@ static int peak_usb_create_dev(struct peak_usb_adapter *peak_usb_adapter, dev->cmd_buf = kmalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL); if (!dev->cmd_buf) { err = -ENOMEM; - goto lbl_set_intf_data; + goto lbl_free_candev; } dev->udev = usb_dev; @@ -775,7 +775,7 @@ static int peak_usb_create_dev(struct peak_usb_adapter *peak_usb_adapter, err = register_candev(netdev); if (err) { dev_err(&intf->dev, "couldn't register CAN device: %d\n", err); - goto lbl_free_cmd_buf; + goto lbl_restore_intf_data; } if (dev->prev_siblings) @@ -788,14 +788,14 @@ static int peak_usb_create_dev(struct peak_usb_adapter *peak_usb_adapter, if (dev->adapter->dev_init) { err = dev->adapter->dev_init(dev); if (err) - goto lbl_free_cmd_buf; + goto lbl_unregister_candev; } /* set bus off */ if (dev->adapter->dev_set_bus) { err = dev->adapter->dev_set_bus(dev, 0); if (err) - goto lbl_free_cmd_buf; + goto lbl_unregister_candev; } /* get device number early */ @@ -807,11 +807,14 @@ static int peak_usb_create_dev(struct peak_usb_adapter *peak_usb_adapter, return 0; -lbl_free_cmd_buf: - kfree(dev->cmd_buf); +lbl_unregister_candev: + unregister_candev(netdev); -lbl_set_intf_data: +lbl_restore_intf_data: usb_set_intfdata(intf, dev->prev_siblings); + kfree(dev->cmd_buf); + +lbl_free_candev: free_candev(netdev); return err; -- cgit v1.2.3 From 62bc24f67abda56e486735706be6a4dea60fdb4c Mon Sep 17 00:00:00 2001 From: Stephane Grosjean Date: Fri, 5 Dec 2014 14:11:09 +0100 Subject: can: peak_usb: fix multi-byte values endianess This patch fixes the endianess definition as well as the usage of the multi-byte fields in the data structures exchanged with the PEAK-System USB adapters. By fixing the endianess, this patch also fixes the wrong usage of a 32-bits local variable for handling the error status 16-bits field, in function pcan_usb_pro_handle_error(). Signed-off-by: Stephane Grosjean Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/peak_usb/pcan_usb.c | 14 ++++---- drivers/net/can/usb/peak_usb/pcan_usb_core.c | 3 +- drivers/net/can/usb/peak_usb/pcan_usb_pro.c | 20 ++++++------ drivers/net/can/usb/peak_usb/pcan_usb_pro.h | 48 ++++++++++++++-------------- 4 files changed, 43 insertions(+), 42 deletions(-) diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c index 925ab8ec9329..4e1659d07979 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb.c @@ -316,7 +316,7 @@ static int pcan_usb_get_serial(struct peak_usb_device *dev, u32 *serial_number) if (err) { netdev_err(dev->netdev, "getting serial failure: %d\n", err); } else if (serial_number) { - u32 tmp32; + __le32 tmp32; memcpy(&tmp32, args, 4); *serial_number = le32_to_cpu(tmp32); @@ -347,7 +347,7 @@ static int pcan_usb_get_device_id(struct peak_usb_device *dev, u32 *device_id) */ static int pcan_usb_update_ts(struct pcan_usb_msg_context *mc) { - u16 tmp16; + __le16 tmp16; if ((mc->ptr+2) > mc->end) return -EINVAL; @@ -371,7 +371,7 @@ static int pcan_usb_decode_ts(struct pcan_usb_msg_context *mc, u8 first_packet) { /* only 1st packet supplies a word timestamp */ if (first_packet) { - u16 tmp16; + __le16 tmp16; if ((mc->ptr + 2) > mc->end) return -EINVAL; @@ -614,7 +614,7 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len) return -ENOMEM; if (status_len & PCAN_USB_STATUSLEN_EXT_ID) { - u32 tmp32; + __le32 tmp32; if ((mc->ptr + 4) > mc->end) goto decode_failed; @@ -622,9 +622,9 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len) memcpy(&tmp32, mc->ptr, 4); mc->ptr += 4; - cf->can_id = le32_to_cpu(tmp32 >> 3) | CAN_EFF_FLAG; + cf->can_id = (le32_to_cpu(tmp32) >> 3) | CAN_EFF_FLAG; } else { - u16 tmp16; + __le16 tmp16; if ((mc->ptr + 2) > mc->end) goto decode_failed; @@ -632,7 +632,7 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len) memcpy(&tmp16, mc->ptr, 2); mc->ptr += 2; - cf->can_id = le16_to_cpu(tmp16 >> 5); + cf->can_id = le16_to_cpu(tmp16) >> 5; } cf->can_dlc = get_can_dlc(rec_len); diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c index dc807e10f802..c62f48a1161d 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c @@ -856,6 +856,7 @@ static int peak_usb_probe(struct usb_interface *intf, const struct usb_device_id *id) { struct usb_device *usb_dev = interface_to_usbdev(intf); + const u16 usb_id_product = le16_to_cpu(usb_dev->descriptor.idProduct); struct peak_usb_adapter *peak_usb_adapter, **pp; int i, err = -ENOMEM; @@ -863,7 +864,7 @@ static int peak_usb_probe(struct usb_interface *intf, /* get corresponding PCAN-USB adapter */ for (pp = peak_usb_adapters_list; *pp; pp++) - if ((*pp)->device_id == usb_dev->descriptor.idProduct) + if ((*pp)->device_id == usb_id_product) break; peak_usb_adapter = *pp; diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c index f7f796a2c50b..4cfa3b8605b1 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c @@ -78,8 +78,8 @@ struct pcan_usb_pro_msg { int rec_buffer_size; int rec_buffer_len; union { - u16 *rec_cnt_rd; - u32 *rec_cnt; + __le16 *rec_cnt_rd; + __le32 *rec_cnt; u8 *rec_buffer; } u; }; @@ -155,7 +155,7 @@ static int pcan_msg_add_rec(struct pcan_usb_pro_msg *pm, u8 id, ...) *pc++ = va_arg(ap, int); *pc++ = va_arg(ap, int); *pc++ = va_arg(ap, int); - *(u32 *)pc = cpu_to_le32(va_arg(ap, u32)); + *(__le32 *)pc = cpu_to_le32(va_arg(ap, u32)); pc += 4; memcpy(pc, va_arg(ap, int *), i); pc += i; @@ -165,7 +165,7 @@ static int pcan_msg_add_rec(struct pcan_usb_pro_msg *pm, u8 id, ...) case PCAN_USBPRO_GETDEVID: *pc++ = va_arg(ap, int); pc += 2; - *(u32 *)pc = cpu_to_le32(va_arg(ap, u32)); + *(__le32 *)pc = cpu_to_le32(va_arg(ap, u32)); pc += 4; break; @@ -173,21 +173,21 @@ static int pcan_msg_add_rec(struct pcan_usb_pro_msg *pm, u8 id, ...) case PCAN_USBPRO_SETBUSACT: case PCAN_USBPRO_SETSILENT: *pc++ = va_arg(ap, int); - *(u16 *)pc = cpu_to_le16(va_arg(ap, int)); + *(__le16 *)pc = cpu_to_le16(va_arg(ap, int)); pc += 2; break; case PCAN_USBPRO_SETLED: *pc++ = va_arg(ap, int); - *(u16 *)pc = cpu_to_le16(va_arg(ap, int)); + *(__le16 *)pc = cpu_to_le16(va_arg(ap, int)); pc += 2; - *(u32 *)pc = cpu_to_le32(va_arg(ap, u32)); + *(__le32 *)pc = cpu_to_le32(va_arg(ap, u32)); pc += 4; break; case PCAN_USBPRO_SETTS: pc++; - *(u16 *)pc = cpu_to_le16(va_arg(ap, int)); + *(__le16 *)pc = cpu_to_le16(va_arg(ap, int)); pc += 2; break; @@ -200,7 +200,7 @@ static int pcan_msg_add_rec(struct pcan_usb_pro_msg *pm, u8 id, ...) len = pc - pm->rec_ptr; if (len > 0) { - *pm->u.rec_cnt = cpu_to_le32(*pm->u.rec_cnt+1); + *pm->u.rec_cnt = cpu_to_le32(le32_to_cpu(*pm->u.rec_cnt) + 1); *pm->rec_ptr = id; pm->rec_ptr = pc; @@ -571,7 +571,7 @@ static int pcan_usb_pro_handle_canmsg(struct pcan_usb_pro_interface *usb_if, static int pcan_usb_pro_handle_error(struct pcan_usb_pro_interface *usb_if, struct pcan_usb_pro_rxstatus *er) { - const u32 raw_status = le32_to_cpu(er->status); + const u16 raw_status = le16_to_cpu(er->status); const unsigned int ctrl_idx = (er->channel >> 4) & 0x0f; struct peak_usb_device *dev = usb_if->dev[ctrl_idx]; struct net_device *netdev = dev->netdev; diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_pro.h b/drivers/net/can/usb/peak_usb/pcan_usb_pro.h index 32275af547e0..837cee267132 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_pro.h +++ b/drivers/net/can/usb/peak_usb/pcan_usb_pro.h @@ -33,27 +33,27 @@ /* PCAN_USBPRO_INFO_BL vendor request record type */ struct __packed pcan_usb_pro_blinfo { - u32 ctrl_type; + __le32 ctrl_type; u8 version[4]; u8 day; u8 month; u8 year; u8 dummy; - u32 serial_num_hi; - u32 serial_num_lo; - u32 hw_type; - u32 hw_rev; + __le32 serial_num_hi; + __le32 serial_num_lo; + __le32 hw_type; + __le32 hw_rev; }; /* PCAN_USBPRO_INFO_FW vendor request record type */ struct __packed pcan_usb_pro_fwinfo { - u32 ctrl_type; + __le32 ctrl_type; u8 version[4]; u8 day; u8 month; u8 year; u8 dummy; - u32 fw_type; + __le32 fw_type; }; /* @@ -80,46 +80,46 @@ struct __packed pcan_usb_pro_fwinfo { struct __packed pcan_usb_pro_btr { u8 data_type; u8 channel; - u16 dummy; - u32 CCBT; + __le16 dummy; + __le32 CCBT; }; struct __packed pcan_usb_pro_busact { u8 data_type; u8 channel; - u16 onoff; + __le16 onoff; }; struct __packed pcan_usb_pro_silent { u8 data_type; u8 channel; - u16 onoff; + __le16 onoff; }; struct __packed pcan_usb_pro_filter { u8 data_type; u8 dummy; - u16 filter_mode; + __le16 filter_mode; }; struct __packed pcan_usb_pro_setts { u8 data_type; u8 dummy; - u16 mode; + __le16 mode; }; struct __packed pcan_usb_pro_devid { u8 data_type; u8 channel; - u16 dummy; - u32 serial_num; + __le16 dummy; + __le32 serial_num; }; struct __packed pcan_usb_pro_setled { u8 data_type; u8 channel; - u16 mode; - u32 timeout; + __le16 mode; + __le32 timeout; }; struct __packed pcan_usb_pro_rxmsg { @@ -127,8 +127,8 @@ struct __packed pcan_usb_pro_rxmsg { u8 client; u8 flags; u8 len; - u32 ts32; - u32 id; + __le32 ts32; + __le32 id; u8 data[8]; }; @@ -141,15 +141,15 @@ struct __packed pcan_usb_pro_rxmsg { struct __packed pcan_usb_pro_rxstatus { u8 data_type; u8 channel; - u16 status; - u32 ts32; - u32 err_frm; + __le16 status; + __le32 ts32; + __le32 err_frm; }; struct __packed pcan_usb_pro_rxts { u8 data_type; u8 dummy[3]; - u32 ts64[2]; + __le32 ts64[2]; }; struct __packed pcan_usb_pro_txmsg { @@ -157,7 +157,7 @@ struct __packed pcan_usb_pro_txmsg { u8 client; u8 flags; u8 len; - u32 id; + __le32 id; u8 data[8]; }; -- cgit v1.2.3 From 2e46477a12f6fd273e31a220b155d66e8352198c Mon Sep 17 00:00:00 2001 From: Denis Kirjanov Date: Mon, 1 Dec 2014 12:57:02 +0300 Subject: mips: bpf: Fix broken BPF_MOD Remove optimize_div() from BPF_MOD | BPF_K case since we don't know the dividend and fix the emit_mod() by reading the mod operation result from HI register Signed-off-by: Denis Kirjanov Reviewed-by: Markos Chandras Signed-off-by: David S. Miller --- arch/mips/net/bpf_jit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c index 9b55143d19db..9fd6834a2172 100644 --- a/arch/mips/net/bpf_jit.c +++ b/arch/mips/net/bpf_jit.c @@ -426,7 +426,7 @@ static inline void emit_mod(unsigned int dst, unsigned int src, u32 *p = &ctx->target[ctx->idx]; uasm_i_divu(&p, dst, src); p = &ctx->target[ctx->idx + 1]; - uasm_i_mflo(&p, dst); + uasm_i_mfhi(&p, dst); } ctx->idx += 2; /* 2 insts */ } @@ -971,7 +971,7 @@ load_ind: break; case BPF_ALU | BPF_MOD | BPF_K: /* A %= k */ - if (k == 1 || optimize_div(&k)) { + if (k == 1) { ctx->flags |= SEEN_A; emit_jit_reg_move(r_A, r_zero, ctx); } else { -- cgit v1.2.3 From aebea2ba0f7495e1a1c9ea5e753d146cb2f6b845 Mon Sep 17 00:00:00 2001 From: willy tarreau Date: Tue, 2 Dec 2014 08:13:04 +0100 Subject: net: mvneta: fix Tx interrupt delay The mvneta driver sets the amount of Tx coalesce packets to 16 by default. Normally that does not cause any trouble since the driver uses a much larger Tx ring size (532 packets). But some sockets might run with very small buffers, much smaller than the equivalent of 16 packets. This is what ping is doing for example, by setting SNDBUF to 324 bytes rounded up to 2kB by the kernel. The problem is that there is no documented method to force a specific packet to emit an interrupt (eg: the last of the ring) nor is it possible to make the NIC emit an interrupt after a given delay. In this case, it causes trouble, because when ping sends packets over its raw socket, the few first packets leave the system, and the first 15 packets will be emitted without an IRQ being generated, so without the skbs being freed. And since the socket's buffer is small, there's no way to reach that amount of packets, and the ping ends up with "send: no buffer available" after sending 6 packets. Running with 3 instances of ping in parallel is enough to hide the problem, because with 6 packets per instance, that's 18 packets total, which is enough to grant a Tx interrupt before all are sent. The original driver in the LSP kernel worked around this design flaw by using a software timer to clean up the Tx descriptors. This timer was slow and caused terrible network performance on some Tx-bound workloads (such as routing) but was enough to make tools like ping work correctly. Instead here, we simply set the packet counts before interrupt to 1. This ensures that each packet sent will produce an interrupt. NAPI takes care of coalescing interrupts since the interrupt is disabled once generated. No measurable performance impact nor CPU usage were observed on small nor large packets, including when saturating the link on Tx, and this fixes tools like ping which rely on too small a send buffer. If one wants to increase this value for certain workloads where it is safe to do so, "ethtool -C $dev tx-frames" will override this default setting. This fix needs to be applied to stable kernels starting with 3.10. Tested-By: Maggie Mae Roxas Signed-off-by: Willy Tarreau Signed-off-by: David S. Miller --- drivers/net/ethernet/marvell/mvneta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index ade067de1689..bb4afe6ccc85 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -216,7 +216,7 @@ /* Various constants */ /* Coalescing */ -#define MVNETA_TXDONE_COAL_PKTS 16 +#define MVNETA_TXDONE_COAL_PKTS 1 #define MVNETA_RX_COAL_PKTS 32 #define MVNETA_RX_COAL_USEC 100 -- cgit v1.2.3 From 5f478b41033606d325e420df693162e2524c2b94 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 2 Dec 2014 04:30:59 -0800 Subject: net: mvneta: fix race condition in mvneta_tx() mvneta_tx() dereferences skb to get skb->len too late, as hardware might have completed the transmit and TX completion could have freed the skb from another cpu. Fixes: 71f6d1b31fb1 ("net: mvneta: replace Tx timer with a real interrupt") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- drivers/net/ethernet/marvell/mvneta.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index bb4afe6ccc85..67a84cfaefa1 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1721,6 +1721,7 @@ static int mvneta_tx(struct sk_buff *skb, struct net_device *dev) u16 txq_id = skb_get_queue_mapping(skb); struct mvneta_tx_queue *txq = &pp->txqs[txq_id]; struct mvneta_tx_desc *tx_desc; + int len = skb->len; int frags = 0; u32 tx_cmd; @@ -1788,7 +1789,7 @@ out: u64_stats_update_begin(&stats->syncp); stats->tx_packets++; - stats->tx_bytes += skb->len; + stats->tx_bytes += len; u64_stats_update_end(&stats->syncp); } else { dev->stats.tx_dropped++; -- cgit v1.2.3 From a5a519b2710be43fce3cf9ce7bd8de8db3f2a9de Mon Sep 17 00:00:00 2001 From: Alexander Duyck Date: Tue, 2 Dec 2014 10:58:21 -0800 Subject: fib_trie: Fix /proc/net/fib_trie when CONFIG_IP_MULTIPLE_TABLES is not defined In recent testing I had disabled CONFIG_IP_MULTIPLE_TABLES and as a result when I ran "cat /proc/net/fib_trie" the main trie was displayed multiple times. I found that the problem line of code was in the function fib_trie_seq_next. Specifically the line below caused the indexes to go in the opposite direction of our traversal: h = tb->tb_id & (FIB_TABLE_HASHSZ - 1); This issue was that the RT tables are defined such that RT_TABLE_LOCAL is ID 255, while it is located at TABLE_LOCAL_INDEX of 0, and RT_TABLE_MAIN is 254 with a TABLE_MAIN_INDEX of 1. This means that the above line will return 1 for the local table and 0 for main. The result is that fib_trie_seq_next will return NULL at the end of the local table, fib_trie_seq_start will return the start of the main table, and then fib_trie_seq_next will loop on main forever as h will always return 0. The fix for this is to reverse the ordering of the two tables. It has the advantage of making it so that the tables now print in the same order regardless of if multiple tables are enabled or not. In order to make the definition consistent with the multiple tables case I simply masked the to RT_TABLE_XXX values by (FIB_TABLE_HASHSZ - 1). This way the two table layouts should always stay consistent. Fixes: 93456b6 ("[IPV4]: Unify access to the routing tables") Signed-off-by: Alexander Duyck Signed-off-by: David S. Miller --- include/net/ip_fib.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index dc9d2a27c315..09a819ee2151 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -201,8 +201,8 @@ void fib_free_table(struct fib_table *tb); #ifndef CONFIG_IP_MULTIPLE_TABLES -#define TABLE_LOCAL_INDEX 0 -#define TABLE_MAIN_INDEX 1 +#define TABLE_LOCAL_INDEX (RT_TABLE_LOCAL & (FIB_TABLE_HASHSZ - 1)) +#define TABLE_MAIN_INDEX (RT_TABLE_MAIN & (FIB_TABLE_HASHSZ - 1)) static inline struct fib_table *fib_get_table(struct net *net, u32 id) { -- cgit v1.2.3 From 03ccc4c0a9dac4b94f4e07cd0b7a9a1cb142ed36 Mon Sep 17 00:00:00 2001 From: "Lendacky, Thomas" Date: Tue, 2 Dec 2014 18:16:48 -0600 Subject: amd-xgbe: Do not clear interrupt indicator The interrupt value within the xgbe_ring_data structure is used as an indicator of which Rx descriptor should have the INTE bit set to generate an interrupt when that Rx descriptor is used. This bit was mistakenly cleared in the xgbe_unmap_rdata function, effectively nullifying the ethtool rx-frames support. Signed-off-by: Tom Lendacky Signed-off-by: David S. Miller --- drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c index 6fc5da01437d..43b7d2e948f7 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c @@ -356,7 +356,6 @@ static void xgbe_unmap_skb(struct xgbe_prv_data *pdata, rdata->tso_header = 0; rdata->len = 0; - rdata->interrupt = 0; rdata->mapped_as_page = 0; if (rdata->state_saved) { -- cgit v1.2.3 From c153031773887fca473d25a2854cc37bd946c010 Mon Sep 17 00:00:00 2001 From: "Lendacky, Thomas" Date: Tue, 2 Dec 2014 18:16:54 -0600 Subject: amd-xgbe: Associate Tx SKB with proper ring descriptor The SKB for a Tx packet is associated with an xgbe_ring_data structure in the xgbe_map_tx_skb function. However, it is being saved in the structure after the last structure used when the SKB is mapped. Use the last used structure to save the SKB value. Signed-off-by: Tom Lendacky Signed-off-by: David S. Miller --- drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c index 43b7d2e948f7..b15551bad7fa 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c @@ -480,7 +480,11 @@ static int xgbe_map_tx_skb(struct xgbe_channel *channel, struct sk_buff *skb) } } - /* Save the skb address in the last entry */ + /* Save the skb address in the last entry. We always have some data + * that has been mapped so rdata is always advanced past the last + * piece of mapped data - use the entry pointed to by cur_index - 1. + */ + rdata = XGBE_GET_DESC_DATA(ring, cur_index - 1); rdata->skb = skb; /* Save the number of descriptor entries used */ -- cgit v1.2.3 From 79af221d67e7e7502d740d57eebd93ddd66cffc4 Mon Sep 17 00:00:00 2001 From: Hariprasad Shenai Date: Wed, 3 Dec 2014 11:49:50 +0530 Subject: cxgb4: Add a check for flashing FW using ethtool Don't let T4 firmware flash on a T5 adapter and vice-versa using ethtool Based on original work by Casey Leedom Signed-off-by: Hariprasad Shenai Signed-off-by: David S. Miller --- drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c index 163a2a14948c..c623f1fc2e3d 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c @@ -1131,6 +1131,27 @@ unsigned int t4_flash_cfg_addr(struct adapter *adapter) return FLASH_CFG_START; } +/* Return TRUE if the specified firmware matches the adapter. I.e. T4 + * firmware for T4 adapters, T5 firmware for T5 adapters, etc. We go ahead + * and emit an error message for mismatched firmware to save our caller the + * effort ... + */ +static bool t4_fw_matches_chip(const struct adapter *adap, + const struct fw_hdr *hdr) +{ + /* The expression below will return FALSE for any unsupported adapter + * which will keep us "honest" in the future ... + */ + if ((is_t4(adap->params.chip) && hdr->chip == FW_HDR_CHIP_T4) || + (is_t5(adap->params.chip) && hdr->chip == FW_HDR_CHIP_T5)) + return true; + + dev_err(adap->pdev_dev, + "FW image (%d) is not suitable for this adapter (%d)\n", + hdr->chip, CHELSIO_CHIP_VERSION(adap->params.chip)); + return false; +} + /** * t4_load_fw - download firmware * @adap: the adapter @@ -1170,6 +1191,8 @@ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size) FW_MAX_SIZE); return -EFBIG; } + if (!t4_fw_matches_chip(adap, hdr)) + return -EINVAL; for (csum = 0, i = 0; i < size / sizeof(csum); i++) csum += ntohl(p[i]); @@ -3080,6 +3103,9 @@ int t4_fw_upgrade(struct adapter *adap, unsigned int mbox, const struct fw_hdr *fw_hdr = (const struct fw_hdr *)fw_data; int reset, ret; + if (!t4_fw_matches_chip(adap, fw_hdr)) + return -EINVAL; + ret = t4_fw_halt(adap, mbox, force); if (ret < 0 && !force) return ret; -- cgit v1.2.3 From c5ac97042aea4c358c5124119fdb98cc79bb9144 Mon Sep 17 00:00:00 2001 From: Hariprasad Shenai Date: Wed, 3 Dec 2014 12:29:31 +0530 Subject: cxgb4: Update FW version string to match FW binary version 1.12.25.0 Signed-off-by: Hariprasad Shenai Signed-off-by: David S. Miller --- drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index 3c481b260745..0514b7431746 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -50,13 +50,13 @@ #include "cxgb4_uld.h" #define T4FW_VERSION_MAJOR 0x01 -#define T4FW_VERSION_MINOR 0x0B -#define T4FW_VERSION_MICRO 0x1B +#define T4FW_VERSION_MINOR 0x0C +#define T4FW_VERSION_MICRO 0x19 #define T4FW_VERSION_BUILD 0x00 #define T5FW_VERSION_MAJOR 0x01 -#define T5FW_VERSION_MINOR 0x0B -#define T5FW_VERSION_MICRO 0x1B +#define T5FW_VERSION_MINOR 0x0C +#define T5FW_VERSION_MICRO 0x19 #define T5FW_VERSION_BUILD 0x00 #define CH_WARN(adap, fmt, ...) dev_warn(adap->pdev_dev, fmt, ## __VA_ARGS__) -- cgit v1.2.3 From 9772b54c55266ce80c639a80aa68eeb908f8ecf5 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Wed, 3 Dec 2014 12:13:58 +0100 Subject: net: sctp: use MAX_HEADER for headroom reserve in output path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit To accomodate for enough headroom for tunnels, use MAX_HEADER instead of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs of trinity an skb_under_panic() via SCTP output path (see reference). I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere in other protocols might be one possible cause for this. In any case, it looks like accounting on chunks themself seems to look good as the skb already passed the SCTP output path and did not hit any skb_over_panic(). Given tunneling was enabled in his .config, the headroom would have been expanded by MAX_HEADER in this case. Reported-by: Robert Święcki Reference: https://lkml.org/lkml/2014/12/1/507 Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().") Signed-off-by: Daniel Borkmann Acked-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller --- net/sctp/output.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/sctp/output.c b/net/sctp/output.c index 42dffd428389..fc5e45b8a832 100644 --- a/net/sctp/output.c +++ b/net/sctp/output.c @@ -401,12 +401,12 @@ int sctp_packet_transmit(struct sctp_packet *packet) sk = chunk->skb->sk; /* Allocate the new skb. */ - nskb = alloc_skb(packet->size + LL_MAX_HEADER, GFP_ATOMIC); + nskb = alloc_skb(packet->size + MAX_HEADER, GFP_ATOMIC); if (!nskb) goto nomem; /* Make sure the outbound skb has enough header room reserved. */ - skb_reserve(nskb, packet->overhead + LL_MAX_HEADER); + skb_reserve(nskb, packet->overhead + MAX_HEADER); /* Set the owning socket so that we know where to get the * destination IP address. -- cgit v1.2.3 From 9b8d16cf81f9d5f7284aee55050fa6753535101b Mon Sep 17 00:00:00 2001 From: Giuseppe CAVALLARO Date: Wed, 3 Dec 2014 12:32:58 +0100 Subject: stmmac: fix max coal timer parameter This patch is to fix the max coalesce timer setting that can be provided by ethtool. The default value (STMMAC_COAL_TX_TIMER) was used in the set_coalesce helper instead of the max one (STMMAC_MAX_COAL_TX_TICK, so defined but not used). Signed-off-by: Giuseppe Cavallaro Signed-off-by: David S. Miller --- drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c index 3a08a1f78c73..771cda2a48b2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c @@ -696,7 +696,7 @@ static int stmmac_set_coalesce(struct net_device *dev, (ec->tx_max_coalesced_frames == 0)) return -EINVAL; - if ((ec->tx_coalesce_usecs > STMMAC_COAL_TX_TIMER) || + if ((ec->tx_coalesce_usecs > STMMAC_MAX_COAL_TX_TICK) || (ec->tx_max_coalesced_frames > STMMAC_TX_MAX_FRAMES)) return -EINVAL; -- cgit v1.2.3 From c9cdc74dfab260ebc37190bc359f99d83fc7104f Mon Sep 17 00:00:00 2001 From: Yaniv Rosner Date: Wed, 3 Dec 2014 14:15:20 +0200 Subject: bnx2x: Limit 1G link enforcement Change 1G-SFP module detection by verifying not only that it's not compliant with 10G-Ethernet, but also that it's 1G-ethernet compliant. Signed-off-by: Yaniv Rosner Signed-off-by: Yuval Mintz Signed-off-by: David S. Miller --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c index 549549eaf580..778e4cd32571 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c @@ -8119,10 +8119,11 @@ static int bnx2x_get_edc_mode(struct bnx2x_phy *phy, case SFP_EEPROM_CON_TYPE_VAL_LC: case SFP_EEPROM_CON_TYPE_VAL_RJ45: check_limiting_mode = 1; - if ((val[SFP_EEPROM_10G_COMP_CODE_ADDR] & + if (((val[SFP_EEPROM_10G_COMP_CODE_ADDR] & (SFP_EEPROM_10G_COMP_CODE_SR_MASK | SFP_EEPROM_10G_COMP_CODE_LR_MASK | - SFP_EEPROM_10G_COMP_CODE_LRM_MASK)) == 0) { + SFP_EEPROM_10G_COMP_CODE_LRM_MASK)) == 0) && + (val[SFP_EEPROM_1G_COMP_CODE_ADDR] != 0)) { DP(NETIF_MSG_LINK, "1G SFP module detected\n"); phy->media_type = ETH_PHY_SFP_1G_FIBER; if (phy->req_line_speed != SPEED_1000) { -- cgit v1.2.3 From 5d330cddb907df0911652c447baf9ee1a137245d Mon Sep 17 00:00:00 2001 From: Andrew Shewmaker Date: Wed, 3 Dec 2014 14:07:31 -0800 Subject: Update old iproute2 and Xen Remus links Signed-off-by: Andrew Shewmaker Acked-by: Stephen Hemminger Signed-off-by: David S. Miller --- Documentation/Changes | 2 +- net/sched/Kconfig | 7 ++++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/Documentation/Changes b/Documentation/Changes index 1de131bb49fb..74bdda9272a4 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -383,7 +383,7 @@ o Ip-route2 --------- -o +o OProfile -------- diff --git a/net/sched/Kconfig b/net/sched/Kconfig index a1a8e29e5fc9..d17053d9e9e2 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -22,8 +22,9 @@ menuconfig NET_SCHED This code is considered to be experimental. To administer these schedulers, you'll need the user-level utilities - from the package iproute2+tc at . - That package also contains some documentation; for more, check out + from the package iproute2+tc at + . That package + also contains some documentation; for more, check out . This Quality of Service (QoS) support will enable you to use @@ -336,7 +337,7 @@ config NET_SCH_PLUG of virtual machines by allowing the generated network output to be rolled back if needed. - For more information, please refer to http://wiki.xensource.com/xenwiki/Remus + For more information, please refer to Say Y here if you are using this kernel for Xen dom0 and want to protect Xen guests with Remus. -- cgit v1.2.3 From 244d62be91ddcea55ec6d456dbb7f71d411d21f0 Mon Sep 17 00:00:00 2001 From: "Lendacky, Thomas" Date: Thu, 4 Dec 2014 11:52:35 -0600 Subject: amd-xgbe: Prevent Tx cleanup stall When performing Tx cleanup, the dirty index counter is compared to the current index counter as one of the tests used to determine when to stop cleanup. The "less than" test will fail when the current index counter rolls over to zero causing cleanup to never occur again. Update the test to a "not equal" to avoid this situation. Signed-off-by: Tom Lendacky Signed-off-by: David S. Miller --- drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c index 2349ea970255..d0e35302410f 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c @@ -1554,7 +1554,7 @@ static int xgbe_tx_poll(struct xgbe_channel *channel) spin_lock_irqsave(&ring->lock, flags); while ((processed < XGBE_TX_DESC_MAX_PROC) && - (ring->dirty < ring->cur)) { + (ring->dirty != ring->cur)) { rdata = XGBE_GET_DESC_DATA(ring, ring->dirty); rdesc = rdata->rdesc; -- cgit v1.2.3 From 51de7bb9ab430f1db4d07e2b5e0711c48cb5a7e6 Mon Sep 17 00:00:00 2001 From: Joe Stringer Date: Fri, 5 Dec 2014 11:35:46 -0800 Subject: bnx2x: Implement ndo_gso_check() Use vxlan_gso_check() to advertise offload support for this NIC. Signed-off-by: Joe Stringer Signed-off-by: David S. Miller --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 74fbf9ea7bd8..893cdb6a423e 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include #include @@ -12550,6 +12551,11 @@ static int bnx2x_get_phys_port_id(struct net_device *netdev, return 0; } +static bool bnx2x_gso_check(struct sk_buff *skb, struct net_device *dev) +{ + return vxlan_gso_check(skb); +} + static const struct net_device_ops bnx2x_netdev_ops = { .ndo_open = bnx2x_open, .ndo_stop = bnx2x_close, @@ -12581,6 +12587,7 @@ static const struct net_device_ops bnx2x_netdev_ops = { #endif .ndo_get_phys_port_id = bnx2x_get_phys_port_id, .ndo_set_vf_link_state = bnx2x_set_vf_link_state, + .ndo_gso_check = bnx2x_gso_check, }; static int bnx2x_set_coherency_mask(struct bnx2x *bp) -- cgit v1.2.3 From f15650b7f94879667f253bc32de7431c1baf2d6e Mon Sep 17 00:00:00 2001 From: Jan Beulich Date: Tue, 9 Dec 2014 11:47:04 +0000 Subject: netback: don't store invalid vif pointer When xenvif_alloc() fails, it returns a non-NULL error indicator. To avoid eventual races, we shouldn't store that into struct backend_info as readers of it only check for NULL. Signed-off-by: Jan Beulich Acked-by: Ian Campbell Signed-off-by: David S. Miller --- drivers/net/xen-netback/xenbus.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index fab0d4b42f58..d44cd19169bd 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -404,6 +404,7 @@ static int backend_create_xenvif(struct backend_info *be) int err; long handle; struct xenbus_device *dev = be->dev; + struct xenvif *vif; if (be->vif != NULL) return 0; @@ -414,13 +415,13 @@ static int backend_create_xenvif(struct backend_info *be) return (err < 0) ? err : -EINVAL; } - be->vif = xenvif_alloc(&dev->dev, dev->otherend_id, handle); - if (IS_ERR(be->vif)) { - err = PTR_ERR(be->vif); - be->vif = NULL; + vif = xenvif_alloc(&dev->dev, dev->otherend_id, handle); + if (IS_ERR(vif)) { + err = PTR_ERR(vif); xenbus_dev_fatal(dev, err, "creating interface"); return err; } + be->vif = vif; kobject_uevent(&dev->dev.kobj, KOBJ_ONLINE); return 0; -- cgit v1.2.3 From 0f85feae6b710ced3abad5b2b47d31dfcb956b62 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 9 Dec 2014 09:56:08 -0800 Subject: tcp: fix more NULL deref after prequeue changes When I cooked commit c3658e8d0f1 ("tcp: fix possible NULL dereference in tcp_vX_send_reset()") I missed other spots we could deref a NULL skb_dst(skb) Again, if a socket is provided, we do not need skb_dst() to get a pointer to network namespace : sock_net(sk) is good enough. Reported-by: Dann Frazier Bisected-by: Dann Frazier Tested-by: Dann Frazier Signed-off-by: Eric Dumazet Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller --- net/ipv4/tcp_ipv4.c | 4 ++-- net/ipv6/tcp_ipv6.c | 28 ++++++++++++++-------------- 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 147be2024290..ef7089ca86e2 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -623,6 +623,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb) arg.iov[0].iov_base = (unsigned char *)&rep; arg.iov[0].iov_len = sizeof(rep.th); + net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); #ifdef CONFIG_TCP_MD5SIG hash_location = tcp_parse_md5sig_option(th); if (!sk && hash_location) { @@ -633,7 +634,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb) * Incoming packet is checked with md5 hash with finding key, * no RST generated if md5 hash doesn't match. */ - sk1 = __inet_lookup_listener(dev_net(skb_dst(skb)->dev), + sk1 = __inet_lookup_listener(net, &tcp_hashinfo, ip_hdr(skb)->saddr, th->source, ip_hdr(skb)->daddr, ntohs(th->source), inet_iif(skb)); @@ -681,7 +682,6 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb) if (sk) arg.bound_dev_if = sk->sk_bound_dev_if; - net = dev_net(skb_dst(skb)->dev); arg.tos = ip_hdr(skb)->tos; ip_send_unicast_reply(net, skb, &TCP_SKB_CB(skb)->header.h4.opt, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index dc495ae2ead0..c277951d783b 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -787,16 +787,16 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = { .queue_hash_add = inet6_csk_reqsk_queue_hash_add, }; -static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win, - u32 tsval, u32 tsecr, int oif, - struct tcp_md5sig_key *key, int rst, u8 tclass, - u32 label) +static void tcp_v6_send_response(struct sock *sk, struct sk_buff *skb, u32 seq, + u32 ack, u32 win, u32 tsval, u32 tsecr, + int oif, struct tcp_md5sig_key *key, int rst, + u8 tclass, u32 label) { const struct tcphdr *th = tcp_hdr(skb); struct tcphdr *t1; struct sk_buff *buff; struct flowi6 fl6; - struct net *net = dev_net(skb_dst(skb)->dev); + struct net *net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev); struct sock *ctl_sk = net->ipv6.tcp_sk; unsigned int tot_len = sizeof(struct tcphdr); struct dst_entry *dst; @@ -946,7 +946,7 @@ static void tcp_v6_send_reset(struct sock *sk, struct sk_buff *skb) (th->doff << 2); oif = sk ? sk->sk_bound_dev_if : 0; - tcp_v6_send_response(skb, seq, ack_seq, 0, 0, 0, oif, key, 1, 0, 0); + tcp_v6_send_response(sk, skb, seq, ack_seq, 0, 0, 0, oif, key, 1, 0, 0); #ifdef CONFIG_TCP_MD5SIG release_sk1: @@ -957,13 +957,13 @@ release_sk1: #endif } -static void tcp_v6_send_ack(struct sk_buff *skb, u32 seq, u32 ack, - u32 win, u32 tsval, u32 tsecr, int oif, +static void tcp_v6_send_ack(struct sock *sk, struct sk_buff *skb, u32 seq, + u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, struct tcp_md5sig_key *key, u8 tclass, u32 label) { - tcp_v6_send_response(skb, seq, ack, win, tsval, tsecr, oif, key, 0, tclass, - label); + tcp_v6_send_response(sk, skb, seq, ack, win, tsval, tsecr, oif, key, 0, + tclass, label); } static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) @@ -971,7 +971,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) struct inet_timewait_sock *tw = inet_twsk(sk); struct tcp_timewait_sock *tcptw = tcp_twsk(sk); - tcp_v6_send_ack(skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt, + tcp_v6_send_ack(sk, skb, tcptw->tw_snd_nxt, tcptw->tw_rcv_nxt, tcptw->tw_rcv_wnd >> tw->tw_rcv_wscale, tcp_time_stamp + tcptw->tw_ts_offset, tcptw->tw_ts_recent, tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw), @@ -986,10 +986,10 @@ static void tcp_v6_reqsk_send_ack(struct sock *sk, struct sk_buff *skb, /* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV * sk->sk_state == TCP_SYN_RECV -> for Fast Open. */ - tcp_v6_send_ack(skb, (sk->sk_state == TCP_LISTEN) ? + tcp_v6_send_ack(sk, skb, (sk->sk_state == TCP_LISTEN) ? tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt, - tcp_rsk(req)->rcv_nxt, - req->rcv_wnd, tcp_time_stamp, req->ts_recent, sk->sk_bound_dev_if, + tcp_rsk(req)->rcv_nxt, req->rcv_wnd, + tcp_time_stamp, req->ts_recent, sk->sk_bound_dev_if, tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr), 0, 0); } -- cgit v1.2.3 From 11d3d2a16cc1f05c6ece69a4392e99efb85666a6 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Tue, 9 Dec 2014 18:43:28 +0000 Subject: xen-netfront: use correct linear area after linearizing an skb Commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix handling packets on compound pages with skb_linearize) attempted to fix a problem where an skb that would have required too many slots would be dropped causing TCP connections to stall. However, it filled in the first slot using the original buffer and not the new one and would use the wrong offset and grant access to the wrong page. Netback would notice the malformed request and stop all traffic on the VIF, reporting: vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144 vif vif-3-0 vif3.0: fatal error; disabling device Reported-by: Anthony Wright Tested-by: Anthony Wright Signed-off-by: David Vrabel Signed-off-by: David S. Miller --- drivers/net/xen-netfront.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index ece8d1804d13..eeed0ce620f3 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -627,6 +627,9 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) slots, skb->len); if (skb_linearize(skb)) goto drop; + data = skb->data; + offset = offset_in_page(data); + len = skb_headlen(skb); } spin_lock_irqsave(&queue->tx_lock, flags); -- cgit v1.2.3 From 69204cf7eb9c5a72067ce6922d4699378251d053 Mon Sep 17 00:00:00 2001 From: "Valdis.Kletnieks@vt.edu" Date: Tue, 9 Dec 2014 16:15:50 -0500 Subject: net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.c commit 46e5da40ae (net: qdisc: use rcu prefix and silence sparse warnings) triggers a spurious warning: net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage! The code should be using the _bh variant of rcu_dereference. Signed-off-by: Valdis Kletnieks Acked-by: Eric Dumazet Acked-by: John Fastabend Signed-off-by: David S. Miller --- net/sched/sch_fq_codel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index b9ca32ebc1de..1e52decb7b59 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -94,7 +94,7 @@ static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch, TC_H_MIN(skb->priority) <= q->flows_cnt) return TC_H_MIN(skb->priority); - filter = rcu_dereference(q->filter_list); + filter = rcu_dereference_bh(q->filter_list); if (!filter) return fq_codel_hash(q, skb) + 1; -- cgit v1.2.3