From 98123866fcf3fe95a0c1b198ef122dfdbd351916 Mon Sep 17 00:00:00 2001
From: Benjamin Coddington
Date: Fri, 16 Dec 2022 07:45:27 -0500
Subject: Treewide: Stop corrupting socket's task_frag
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
GFP_NOIO flag on sk_allocation which the networking system uses to decide
when it is safe to use current->task_frag.  The results of this are
unexpected corruption in task_frag when SUNRPC is involved in memory
reclaim.

The corruption can be seen in crashes, but the root cause is often
difficult to ascertain as a crashing machine's stack trace will have no
evidence of being near NFS or SUNRPC code.  I believe this problem to
be much more pervasive than reports to the community may indicate.

Fix this by having kernel users of sockets that may corrupt task_frag due
to reclaim set sk_use_task_frag = false.  Preemptively correcting this
situation for users that still set sk_allocation allows them to convert to
memalloc_nofs_save/restore without the same unexpected corruptions that are
sure to follow, unlikely to show up in testing, and difficult to bisect.

CC: Philipp Reisner
CC: Lars Ellenberg
CC: "Christoph Böhmwalder"
CC: Jens Axboe
CC: Josef Bacik
CC: Keith Busch
CC: Christoph Hellwig
CC: Sagi Grimberg
CC: Lee Duncan
CC: Chris Leech
CC: Mike Christie
CC: "James E.J. Bottomley"
CC: "Martin K. Petersen"
CC: Valentina Manea
CC: Shuah Khan
CC: Greg Kroah-Hartman
CC: David Howells
CC: Marc Dionne
CC: Steve French
CC: Christine Caulfield
CC: David Teigland
CC: Mark Fasheh
CC: Joel Becker
CC: Joseph Qi
CC: Eric Van Hensbergen
CC: Latchesar Ionkov
CC: Dominique Martinet
CC: Ilya Dryomov
CC: Xiubo Li
CC: Chuck Lever
CC: Jeff Layton
CC: Trond Myklebust
CC: Anna Schumaker
CC: Steffen Klassert
CC: Herbert Xu
Suggested-by: Guillaume Nault
Signed-off-by: Benjamin Coddington
Reviewed-by: Guillaume Nault
Signed-off-by: Jakub Kicinski
---
 fs/cifs/connect.c      | 1 +
 fs/dlm/lowcomms.c      | 2 ++
 fs/ocfs2/cluster/tcp.c | 1 +
 3 files changed, 4 insertions(+)

(limited to 'fs')

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index e80252a83225..7bc7b5e03c51 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -2944,6 +2944,7 @@ generic_ip_connect(struct TCP_Server_Info *server)
 		cifs_dbg(FYI, "Socket created\n");
 		server->ssocket = socket;
 		socket->sk->sk_allocation = GFP_NOFS;
+		socket->sk->sk_use_task_frag = false;
 		if (sfamily == AF_INET6)
 			cifs_reclassify_socket6(socket);
 		else
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 8b80ca0cd65f..4450721ec83c 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -645,6 +645,7 @@ static void add_sock(struct socket *sock, struct connection *con)
 	if (dlm_config.ci_protocol == DLM_PROTO_SCTP)
 		sk->sk_state_change = lowcomms_state_change;
 	sk->sk_allocation = GFP_NOFS;
+	sk->sk_use_task_frag = false;
 	sk->sk_error_report = lowcomms_error_report;
 	release_sock(sk);
 }
@@ -1769,6 +1770,7 @@ static int dlm_listen_for_all(void)
 	listen_con.sock = sock;
 
 	sock->sk->sk_allocation = GFP_NOFS;
+	sock->sk->sk_use_task_frag = false;
 	sock->sk->sk_data_ready = lowcomms_listen_data_ready;
 	release_sock(sock->sk);
 
diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c
index 37d222bdfc8c..a07b24d170f2 100644
--- a/fs/ocfs2/cluster/tcp.c
+++ b/fs/ocfs2/cluster/tcp.c
@@ -1602,6 +1602,7 @@ static void o2net_start_connect(struct work_struct *work)
 	sc->sc_sock = sock; /* freed by sc_kref_release */
 
 	sock->sk->sk_allocation = GFP_ATOMIC;
+	sock->sk->sk_use_task_frag = false;
 
 	myaddr.sin_family = AF_INET;
 	myaddr.sin_addr.s_addr = mynode->nd_ipv4_address;
--
cgit v1.2.3
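
Reader's note (not part of the patch): the commit message says the networking code decides per socket whether current->task_frag may be used, and that setting sk_use_task_frag = false opts a socket out. The sketch below is a userspace model of that decision only. The struct and function names (frag_model, task_model, sock_model, pick_frag) are hypothetical stand-ins, and the per-socket fallback fragment is an assumption for illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page fragment; only its identity matters here. */
struct frag_model {
	int bytes_used;
};

/* Models the current task, which owns one shared fragment (current->task_frag). */
struct task_model {
	struct frag_model task_frag;
};

/* Models a socket. use_task_frag mirrors sk_use_task_frag from the patch;
 * sk_frag is an assumed per-socket fallback fragment. */
struct sock_model {
	bool use_task_frag;
	struct frag_model sk_frag;
};

/* Pick the fragment a send path would fill. With the flag cleared, the
 * socket never touches the task's fragment, so a nested send issued from
 * memory reclaim cannot overwrite what the interrupted send was building. */
static struct frag_model *pick_frag(struct sock_model *sk, struct task_model *cur)
{
	if (sk->use_task_frag)
		return &cur->task_frag;
	return &sk->sk_frag;
}

/* Small self-check of the two cases, callable from any test harness. */
static void pick_frag_selfcheck(void)
{
	struct task_model task = { { 0 } };
	struct sock_model user_sock = { true, { 0 } };
	struct sock_model fs_sock = { false, { 0 } };

	assert(pick_frag(&user_sock, &task) == &task.task_frag);
	assert(pick_frag(&fs_sock, &task) == &fs_sock.sk_frag);
}
```

In this model, fs_sock plays the role of the CIFS, DLM and OCFS2 sockets touched by the hunks above: each one clears the flag right after setting sk_allocation.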