ANDROID: sched: Fix missing RQCF_UPDATED in migrate_tasks

Currently, the sched code checks if the rq clock has been
updated after its lock has been held when CONFIG_SCHED_DEBUG
is enabled. It tracks this by clearing the RQCF_UPDATED bit
when a lock is acquired and setting it upon a subsequent
update_rq_clock() call. It warns if rq clock is read without
RQCF_UPDATED flag indicating the code path missed updating
the clock.

When migrate_tasks() is called during a pause_cpus() event,
the local variable orf is updated with the contents of *rf,
prior to the call to update_rq_clock(). As a result, when
migrate_tasks() restores *rf from the local variable the
RQCF_UPDATED flag is lost. This clearing out of the
RQCF_UPDATED flag leads to a warning when the next task
is being pushed out.

For example in migrate_tasks()

   orf = rf; // save flags, RQCF_UPDATE cleared
   update_rq_clock() // set RQCF_UPDATE
   for()
      ...
      __migrate_task(dead_rq, new_cpu)
      ...
      --> if migration, restore dead_rq's flags with orf.
         --> We loose RQCF_UPDATE
	 rq_relock(dead_rq, orf)

This leaves the current cpu's rq clock_update_flags with the
RQCF_UPDATED flag cleared, an error condition with
CONFIG_SCHED_DEBUG enabled.

Fix the issue for by ensuring that the local variable orf
has the RQCF_UPDATED flag set, allowing the current
CPU's rq to have the flag set and leaving it in a good state
for future usage.

pause_cpus() is currently Android specific. As cpu_pause does
not rely on stop_machine_cpuslocked() like the regular
hotunplug path does, there's a risk for another CPU to
read the rq_clock, after we cleared RQCF_UPDATE, when using
pause_cpus(). This change will have little or no impact outside
of Android currently. If pause_cpus() or drain_rq_cpu_stop()
are merged upstream this change should be merged as well.

Bug: 186222712
Change-Id: Id241122e1449cdd4dcd15f94eb68735b40e3d6f5
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
This commit is contained in:
Stephen Dickey
2021-04-23 20:04:49 -07:00
committed by Todd Kjos
parent 28b4b1588e
commit a64f42d1df

View File

@@ -6932,6 +6932,11 @@ static void migrate_tasks(struct rq *dead_rq, struct rq_flags *rf, bool force)
*/
update_rq_clock(rq);
#ifdef CONFIG_SCHED_DEBUG
/* note the clock update in orf */
orf.clock_update_flags |= RQCF_UPDATED;
#endif
for (;;) {
/*
* There's this thread running, bail when that's the only