- if the actual performance deviates from the predicted (scored) performance, the system easily enters a degenerate bottlenecked state.
- and if that happens, the many internal queues make diagnosis, root-causing, and confidence in any fix all exponentially worse.
Now you might assert that this will be applied in situations where scores are accurate and brown failures do not occur. Those aren’t the situations I deal with.
Sounds like the author of this would be interested in Queueing Theory[1] (in the sense of being interested in mathematical formalisms to explore this stuff). Apportionment[2] is also studied as a very specific thing unto itself.
There's a huge mass of published research "out there" dealing with queueing and scheduling. Not all of it pertains to "thread scheduling," but there's quite a bit of conceptual overlap between something like thread scheduling and job-shop scheduling. And some of the work on apportionment likewise probably relates, at least by analogy.
[1]: https://en.wikipedia.org/wiki/Queueing_theory
[2]: https://en.wikipedia.org/wiki/Mathematics_of_apportionment
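To make the queueing-theory pointer concrete, here's a minimal sketch (my own illustration, not from any of the comments): the classic M/M/1 result shows how sharply waiting time blows up as utilization approaches 1, which is exactly the "degenerate bottlenecked state" described upthread when actual load deviates from what the scheduler predicted.

```python
def mm1_mean_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean time a job spends in an M/M/1 queue: W = 1 / (mu - lambda).

    arrival_rate (lambda) and service_rate (mu) are in jobs per second.
    """
    if arrival_rate >= service_rate:
        # Utilization >= 1: the backlog grows without bound.
        raise ValueError("unstable queue: utilization >= 1")
    return 1.0 / (service_rate - arrival_rate)

service_rate = 10.0  # one server handling 10 jobs/sec
for arrival_rate in (5.0, 9.0, 9.9):
    w = mm1_mean_time_in_system(arrival_rate, service_rate)
    print(f"utilization {arrival_rate / service_rate:.2f}: "
          f"mean time in system {w:.2f}s")
```

At 50% utilization the mean time in system is 0.2s; at 99% it is 10s. Small mispredictions of load near saturation produce wildly disproportionate latency, which is why systems relying on accurate scores degrade so badly when the scores are wrong.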
That's why in the vast majority of circumstances you'll be running many things on many CPUs, you just throw all the work at the CPUs and let the chips fall where they may. Deliberate scheduling is a tool, but an unusual one, especially as many times the correct solution to tight scheduling situations is to throw more resources at it anyhow. (Trying to eke out wins by changing your scheduling implies that you're also in a situation where slight increases in workloads will back the entire system up no matter what you do.)
Pipelining is strictly about processing stages where production of the input and processing of the inputs are not synchronized. For example, a client sends n requests via a pipelined protocol to a remote server without waiting for an ack for each request. There may be only one such processing pipeline (and thus no parallelism) even though there is pipelining.
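A minimal in-process sketch of that distinction (my own illustration): the producer enqueues n "requests" without waiting for each to be acknowledged, while a single consumer drains them. That's pipelining with no parallelism, since only one worker ever processes inputs.

```python
import queue
import threading

def pipelined_round_trip(requests):
    q = queue.Queue()
    results = []

    def server():
        # Single consumer: no parallelism, just unsynchronized intake.
        while True:
            item = q.get()
            if item is None:  # sentinel: all requests sent
                break
            results.append(item * 2)  # stand-in for real request handling

    t = threading.Thread(target=server)
    t.start()
    for r in requests:   # send everything up front; no per-request ack wait
        q.put(r)
    q.put(None)
    t.join()             # wait once, at the end, for all responses
    return results

print(pipelined_round_trip([1, 2, 3]))  # [2, 4, 6]
```

The producer finishes submitting long before the consumer finishes processing; contrast with a request/ack protocol, where each `put` would block on the previous item completing.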
Maybe, if the processes at each stage are I/O-bound, this way of pipelining makes sense. But if they are CPU-bound, I'm not sure it helps: you're moving data between different CPUs, destroying cache locality.