去除了小手段的BFS调度器最终将小手段去除是重要的,否则BFS最终还是会陷入类似O(1),CFS等复杂化的泥潭里面不可自拔,因此在后续的patch中,BFS去除了上述的小手段,用统一的O(n)复杂度来pick-next,毕竟前面已经说了O(n)在特定环境下并不是问题的关键,该patch在2.6.31.14-bfs318-330test.patch中体现。
队列外部执行BFS调度器和CFS是一样的,都是队列外执行进程的,这样可以减少锁争用带来的性能问题。再列出作者的一席话:
BFS has one single lock protecting the process local data of every task in the
global queue. Thus every insertion, removal and modification of task data in the
global runqueue needs to grab the global lock. However, once a task is taken by
a CPU, the CPU has its own local data copy of the running process' accounting
information which only that CPU accesses and modifies (such as during a
timer tick) thus allowing the accounting data to be updated lockless. Once a
CPU has taken a task to run, it removes it from the global queue. Thus the
global queue only ever has, at most,
(number of tasks requesting cpu time) - (number of logical CPUs) + 1
tasks in the global queue. This value is relevant for the time taken to look up
tasks during scheduling. This will increase if many tasks with CPU affinity set
in their policy to limit which CPUs they're allowed to run on if they outnumber
the number of CPUs. The +1 is because when rescheduling a task, the CPU's
currently running task is put back on the queue. Lookup will be described after
the virtual deadline mechanism is explained.
在schedule核心函数中,使用return_task来把prev进程重新入队,在earliest_deadline_task这个pick-next中,使用take_task将选中的next从队列取出,从而实现队列外执行。
综上的结论从上面的论述,我们丝毫没有看到有任何的诸如“SMP负载均衡”,“CPU亲和力”,“补偿”,“惩罚”之类的字眼,是的,这些字眼在BFS中完全不需要,BFS也正是摒弃了这些字眼才获得成功的,毕竟在一个一般人使用的桌面操作系统中,没有这么多的套套,大多数人使用的就是一个只有一个到两个处理器核心的系统,难道有必要搞什么调度域么?难道有必要搞什么NUMA么?需求决定一切,面对大型服务器,有UNIX的机制站在那里,而如果我们想把Linux推广到每一个掌上设备,那就没必要复制UNIX的那套了,BFS完全可以完美的搞定一切。小手段的去除,说明BFS调度器的发展方向起码是正确的。
BFS对SMP的支持如何呢?答案是它仅仅支持少量CPU的SMP体系,别忘了BFS的应用场合。因为在调度过程中需要一个遍历所有CPU的O(m)复杂度的计算,这就明确告诉人们,别指望BFS使用在拥有4096个CPU的系统上,正如没人用这种系统看视频一样,那样的话,还是乖乖使用CFS吧。
返璞归真的Linux BFS调度器(4)
内容版权声明:除非注明,否则皆为本站原创文章。