scsi: core: Avoid leaving shost->last_reset with stale value if EH does not run

The changes to issue the abort from the scmd->abort_work instead of the EH thread introduced a problem if eh_deadline is used. If aborting the command(s) is successful, and there are never any scmds added to the shost->eh_cmd_q, there is no code path which will reset the ->last_reset value back to zero. The effect of this is that after a successful abort with no EH thread activity, a subsequent timeout, perhaps a long time later, might immediately be considered past a user-set eh_deadline time, and the host will be reset with no attempt at recovery. Fix this by resetting ->last_reset back to zero in scmd_eh_abort_handler() if it is determined that the EH thread will not run to do this. Thanks to Gopinath Marappan for investigating this problem. Link: https://lore.kernel.org/r/20211029194311.17504-2-emilne@redhat.com Fixes: e494f6a728 ("[SCSI] improved eh timeout handler") Cc: stable@vger.kernel.org Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-29 15:43:10 -04:00
parent 5f7cf82c1d
commit 5ae17501bc
5 changed files with 29 additions and 1 deletions
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -551,6 +551,7 @@ struct Scsi_Host {

 	struct mutex		scan_mutex;/* serialize scanning activity */

+	struct list_head	eh_abort_list;
 	struct list_head	eh_cmd_q;
 	struct task_struct    * ehandler;  /* Error recovery thread. */
 	struct completion     * eh_action; /* Wait for specific actions on the