summaryrefslogtreecommitdiff
path: root/net
diff options
context:
space:
mode:
authorJason Gunthorpe <jgg@nvidia.com>2020-10-26 11:25:49 -0300
committerJason Gunthorpe <jgg@nvidia.com>2020-10-28 09:14:49 -0300
commit071ba4cc559de47160761b9500b72e8fa09d923d (patch)
tree0433f309f2aae8e868f09260cd1647671236af85 /net
parent7d66a71488d7c14506ab81d6455c095992efca04 (diff)
RDMA: Add rdma_connect_locked()
There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the handler triggers a completion and another thread does rdma_connect() or the handler directly calls rdma_connect(). In all cases rdma_connect() needs to hold the handler_mutex, but when handler's are invoked this is already held by the core code. This causes ULPs using the 2nd method to deadlock. Provide a rdma_connect_locked() and have all ULPs call it from their handlers. Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state") Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Diffstat (limited to 'net')
-rw-r--r--net/rds/ib_cm.c5
1 files changed, 3 insertions, 2 deletions
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 06603dd1c8aa..b36b60668b1d 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -956,9 +956,10 @@ int rds_ib_cm_initiate_connect(struct rdma_cm_id *cm_id, bool isv6)
rds_ib_cm_fill_conn_param(conn, &conn_param, &dp,
conn->c_proposed_version,
UINT_MAX, UINT_MAX, isv6);
- ret = rdma_connect(cm_id, &conn_param);
+ ret = rdma_connect_locked(cm_id, &conn_param);
if (ret)
- rds_ib_conn_error(conn, "rdma_connect failed (%d)\n", ret);
+ rds_ib_conn_error(conn, "rdma_connect_locked failed (%d)\n",
+ ret);
out:
/* Beware - returning non-zero tells the rdma_cm to destroy