Make it possible to keep latency as low as possible at high loads by
shortening the timeslice as the machine becomes more loaded. Do so by adding
a tunable, latency_bias, which is disabled by default. Valid values are from
0 to 100, where higher values bias more towards latency as load increases.
Note that this should still maintain fairness, but will sacrifice throughput,
potentially dramatically, to try and keep latencies as low as possible. Hz
will still be a limiting factor, so the higher Hz is, the lower the latencies
that can be maintained.

The effect of enabling this tunable will be to ensure that very low CPU usage
processes, such as mouse cursor movement, will remain fluid no matter how
high the load is. It's possible to have a smooth mouse cursor with massive
loads, but the effect on throughput can be up to a 20% loss at ultra high
loads. At meaningful loads, a value of one will have minimal impact on
throughput and ensure that under the occasional overload condition the
machine will still feel fluid.

-ck

---
 kernel/sched_bfs.c |   31 +++++++++++++++++++++++++++++--
 kernel/sysctl.c    |   10 ++++++++++
 2 files changed, 39 insertions(+), 2 deletions(-)

Index: linux-2.6.35.5-ck1/kernel/sched_bfs.c
===================================================================
--- linux-2.6.35.5-ck1.orig/kernel/sched_bfs.c	2010-10-03 14:00:10.713750940 +1100
+++ linux-2.6.35.5-ck1/kernel/sched_bfs.c	2010-10-03 15:30:40.896092403 +1100
@@ -135,6 +135,14 @@ int rr_interval __read_mostly = 6;
 int sched_iso_cpu __read_mostly = 70;
 
 /*
+ * latency_bias is used to determine whether we should sacrifice throughput
+ * as load increases to try and keep latencies bound to rr_interval. It does
+ * not change the fairness, so heavy CPU users will still run slow (slower
+ * since throughput decreases dramatically the higher this is set to).
+ */
+int latency_bias __read_mostly;
+
+/*
  * The relative length of deadline for each priority(nice) level.
  */
 static int prio_ratios[PRIO_RANGE] __read_mostly;
@@ -2048,11 +2056,30 @@ update_cpu_clock(struct rq *rq, struct t
 		 * time_slice accounting.
 		 */
 		if (unlikely(time_diff <= 0))
-			time_diff = JIFFIES_TO_NS(1) / 2;
+			time_diff = HALF_JIFFY_NS;
 		else if (unlikely(time_diff > JIFFIES_TO_NS(1)))
 			time_diff = JIFFIES_TO_NS(1);
 
-		rq->rq_time_slice -= NS_TO_US(time_diff);
+		/*
+		 * If we are overloaded, then shorten the effective timeslices
+		 * to ensure latencies are kept as small as is possible by
+		 * making them expire at a rate proportional to load/CPUs. Use
+		 * latency_bias to determine the upper limit for how much to
+		 * shorten the effective timeslice.
+		 */
+		if (latency_bias) {
+			int nr = nr_running(), nol = num_online_cpus();
+
+			if (nr > nol) {
+				time_diff /= nol;
+				if (nr > latency_bias * nol)
+					nr = latency_bias * nol;
+				/* Start shortening when load is==CPUs */
+				time_diff *= nr + 1;
+			}
+		}
+		time_diff = NS_TO_US(time_diff);
+		rq->rq_time_slice -= time_diff;
 	}
 	rq->rq_last_ran = rq->timekeep_clock = rq->clock;
 }
Index: linux-2.6.35.5-ck1/kernel/sysctl.c
===================================================================
--- linux-2.6.35.5-ck1.orig/kernel/sysctl.c	2010-10-03 13:32:31.271522225 +1100
+++ linux-2.6.35.5-ck1/kernel/sysctl.c	2010-10-03 14:34:21.557481440 +1100
@@ -119,6 +119,7 @@ static int __maybe_unused one_hundred =
 #ifdef CONFIG_SCHED_BFS
 extern int rr_interval;
 extern int sched_iso_cpu;
+extern int latency_bias;
 static int __read_mostly one_thousand = 1000;
 #endif
 #ifdef CONFIG_PRINTK
@@ -802,6 +803,15 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof (int),
 		.mode		= 0644,
 		.proc_handler	= &proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one_hundred,
+	},
+	{
+		.procname	= "latency_bias",
+		.data		= &latency_bias,
+		.maxlen		= sizeof (int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
 		.extra1		= &zero,
 		.extra2		= &one_hundred,
 	},
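
For illustration only, the scaling arithmetic added to update_cpu_clock()
can be modelled in userspace roughly as below. This is a sketch, not part of
the patch: scale_time_diff() and its parameters are hypothetical names, with
nr standing in for nr_running() and nol for num_online_cpus(), and the
assert() is an added guard that the kernel code does not have.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the latency_bias scaling. Returns the
 * time (in ns) that would be charged against the timeslice for an elapsed
 * time_diff_ns, given nr runnable tasks and nol online CPUs.
 */
static long long scale_time_diff(long long time_diff_ns, int nr, int nol,
				 int latency_bias)
{
	assert(nol > 0);	/* added guard: avoid division by zero */

	/* Tunable disabled, or not overloaded: charge the real elapsed time. */
	if (!latency_bias || nr <= nol)
		return time_diff_ns;

	/* Same order of operations as the patch: divide first, then multiply. */
	time_diff_ns /= nol;
	/* latency_bias caps how much load is allowed to count. */
	if (nr > latency_bias * nol)
		nr = latency_bias * nol;
	time_diff_ns *= nr + 1;

	return time_diff_ns;
}
```

Under this model, 1000000 ns of elapsed time on 4 CPUs with 8 runnable tasks
is charged as 1250000 ns with latency_bias set to 1, and as 2250000 ns with
latency_bias set to 100, so the timeslice expires correspondingly sooner.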