High CPU utilization but low load average



25















We are running into a strange behavior where we see high CPU utilization but quite low load average.



The behavior is best illustrated by the following graphs from our monitoring system.



[Graph: CPU usage and load]



At about 11:57 the CPU utilization goes from 25% to 75%. The load average does not change significantly.

We run servers with 12 cores, each with 2 hyper-threads. The OS sees this as 24 CPUs.

The CPU utilization data is collected by running /usr/bin/mpstat 60 1 each minute. The data for the all row and the %usr column is shown in the chart above. I am certain this shows the per-CPU average, not the "stacked" utilization. While we see 75% utilization in the chart, top shows a process using about 2000% "stacked" CPU.
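
(For illustration only, not our actual collector: a minimal Python sketch of pulling the all-row %usr figure out of mpstat output. Column names and the number of leading timestamp tokens vary between sysstat versions, so the sketch locates the column via the header and indexes from the right.)

import subprocess

# Run mpstat once over a 60 second interval and capture its output.
out = subprocess.check_output(["/usr/bin/mpstat", "60", "1"]).decode()

header, average = None, None
for line in out.splitlines():
    cols = line.split()
    if not cols:
        continue
    if "%usr" in cols or "%user" in cols:      # header row; column name differs by sysstat version
        header = cols
    elif cols[0] == "Average:" and "all" in cols:
        average = cols                         # the "all" summary row

if header and average:
    name = "%usr" if "%usr" in header else "%user"
    # Index from the right: the timestamp prefix has a different number of
    # tokens than the "Average:" prefix, but the stat columns line up.
    idx = header.index(name) - len(header)
    print("average %usr over all CPUs: " + average[idx])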



The load average figure is taken from /proc/loadavg each minute.
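
(Likewise illustrative: the load averages are simply the first three fields of /proc/loadavg, which some tools then divide by the CPU count to get a "normalized" load.)

import multiprocessing

# The first three fields of /proc/loadavg are the 1-, 5- and 15-minute load averages.
with open("/proc/loadavg") as f:
    one, five, fifteen = (float(x) for x in f.read().split()[:3])

ncpus = multiprocessing.cpu_count()   # 24 on the servers described above
# A normalized 1-minute load of 1.0 would mean, on average, one runnable
# (or uninterruptibly blocked) task per CPU.
print("loadavg: %.2f %.2f %.2f, normalized 1m: %.2f" % (one, five, fifteen, one / ncpus))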



uname -a gives:



Linux ab04 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux


The Linux distribution is Red Hat Enterprise Linux Server release 6.3 (Santiago).



We run a couple of Java web applications under fairly heavy load on the machines, think 100 requests/s per machine.



If I interpret the CPU utilization data correctly, 75% CPU utilization means that our CPUs are executing a process 75% of the time, on average. However, if our CPUs are busy 75% of the time, shouldn't we see a higher load average? How can the CPUs be 75% busy while we only have 2-4 jobs in the run queue?



Are we interpreting our data correctly? What can cause this behavior?










linux cpu-usage troubleshooting load-average






asked Feb 12 '15 at 11:53 by K Erlandsson (edited Feb 12 '15 at 14:46)












  • Is the monitoring system showing normalized CPU load (load / #CPUs)? Regular Linux CPU load is hard to compare across systems with different core/cpu counts so some tools use a normalized CPU load instead.

    – Brian
    Feb 12 '15 at 12:08












  • Do you mean dividing each data point by the number of CPUs? I.e. loadavg/24 in our case? I can easily create such a chart from the data if that helps.

    – K Erlandsson
    Feb 12 '15 at 12:14











  • I was suggesting your chart may already be showing that.

    – Brian
    Feb 12 '15 at 12:51












  • Ah, sorry for misunderstanding you. It would have been a nice explanation, but unfortunately it is the system-wide load average that is shown. I just triple checked.

    – K Erlandsson
    Feb 12 '15 at 12:54


























7 Answers


















39














On Linux at least, the load average and CPU utilization are actually two different things. Load average is a measurement of how many tasks are waiting in a kernel run queue (not just CPU time but also disk activity) over a period of time. CPU utilization is a measure of how busy the CPU is right now. The most load that a single CPU thread pegged at 100% for one minute can "contribute" to the 1-minute load average is 1. A 4-core CPU with hyperthreading (8 virtual cores) all at 100% for 1 minute would contribute 8 to the 1-minute load average.

Often these two numbers have patterns that correlate with each other, but you can't think of them as the same. You can have a high load with nearly 0% CPU utilization (such as when you have a lot of IO data stuck in a wait state), and you can have a load of 1 and 100% CPU when you have a single-threaded process running full tilt. Also, for short periods of time you can see the CPU at close to 100% while the load is still below 1 because the average metrics haven't "caught up" yet.

I've seen a server with a load of over 15,000 (yes, really, that's not a typo) and a CPU % of close to 0%. It happened because a Samba share was having issues and lots and lots of clients started getting stuck in an IO wait state. Chances are that if you are seeing a regular high load number with no corresponding CPU activity, you have a storage problem of some kind. On virtual machines this can also mean that there are other VMs heavily competing for storage resources on the same VM host.

High load is also not necessarily a bad thing; most of the time it just means the system is being utilized to its fullest capacity, or maybe it is beyond its capability to keep up (if the load number is higher than the number of processor cores). At a place where I used to be a sysadmin, they had someone who watched the load average on their primary system closer than Nagios did. When the load was high, they would call me 24/7 faster than you could say SMTP. Most of the time nothing was actually wrong, but they associated the load number with something being wrong and watched it like a hawk. After checking, my response was usually that the system was just doing its job. Of course this was the same place where the load got up over 15,000 (not the same server though), so sometimes it does mean something is wrong. You have to consider the purpose of your system. If it's a workhorse, then expect the load to be naturally high.
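
A quick way to see the single-threaded case for yourself (an illustrative sketch, assuming an otherwise idle Linux machine; run it and watch top alongside it):

import time

# Busy-spin on one CPU for a bit over a minute: top will show this process at
# roughly 100% of a single core, yet the 1-minute load average should settle
# around 1.0, because only one task is ever runnable.
deadline = time.time() + 90
n = 0
while time.time() < deadline:
    n += 1

with open("/proc/loadavg") as f:
    print("loadavg after the spin: " + f.read().strip())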































  • How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

    – K Erlandsson
    Feb 13 '15 at 9:17











  • I just did a test on my own: the threads in a Java process contribute to the load average as if they were separate processes (i.e. a Java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

    – K Erlandsson
    Feb 13 '15 at 9:26











  • I mean if you have a non-multithreading process (i.e., one that just uses a single CPU at a time). For instance, if you just write a simple C program that runs a busy loop, it's just a single thread running and it uses only 1 CPU at a time.

    – deltaray
    Feb 13 '15 at 20:36











  • All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

    – K Erlandsson
    Feb 14 '15 at 13:13











  • For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

    – Nickolay
    Sep 12 '18 at 12:50


















23














Load is a very deceptive number. Take it with a grain of salt.



If you spawn many tasks in very quick succession which complete very quickly, the number of processes in the run queue is too small to register the load for them (the kernel counts load once every five seconds).



Consider this example: on my host, which has 8 logical cores, this Python script will register large CPU usage in top (about 85%), yet hardly any load.



import os, sys

while True:
    for j in range(8):
        parent = os.fork()
        if not parent:
            n = 0
            for i in range(10000):
                n += 1
            sys.exit(0)
    for j in range(8):
        os.wait()


Another implementation, this one avoiding waiting in groups of 8 (which would skew the test). Here the parent always attempts to keep the number of children equal to the number of active CPUs, such that it will be much busier than the first method and hopefully more accurate.



/* Compile with flags -O0 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <err.h>
#include <errno.h>

#include <sys/signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#define ITERATIONS 50000

int maxchild = 0;
volatile int numspawned = 0;

void childhandle(
    int signal)
{
    int stat;
    /* Handle all exited children, until none are left to handle */
    while (waitpid(-1, &stat, WNOHANG) > 0)
        numspawned--;
}

/* Stupid task for our children to do */
void do_task(
    void)
{
    int i, j = 0;
    for (i = 0; i < ITERATIONS; i++)
        j++;
    exit(0);
}

int main()
{
    pid_t pid;

    struct sigaction act;
    sigset_t sigs, old;

    maxchild = sysconf(_SC_NPROCESSORS_ONLN);

    /* Setup child handler */
    memset(&act, 0, sizeof(act));
    act.sa_handler = childhandle;
    if (sigaction(SIGCHLD, &act, NULL) < 0)
        err(EXIT_FAILURE, "sigaction");

    /* Defer the sigchild signal */
    sigemptyset(&sigs);
    sigaddset(&sigs, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &sigs, &old) < 0)
        err(EXIT_FAILURE, "sigprocmask");

    /* Create processes while our maxchild value is not met */
    while (1) {
        while (numspawned < maxchild) {
            pid = fork();
            if (pid < 0)
                err(EXIT_FAILURE, "fork");
            else if (pid == 0) /* child process */
                do_task();
            else /* parent */
                numspawned++;
        }

        /* Atomically unblock the signal; the handler picks it up, then it is reblocked on return */
        if (sigsuspend(&old) < 0 && errno != EINTR)
            err(EXIT_FAILURE, "sigsuspend");
    }
}




The reason for this behaviour is that the algorithm spends more time creating child processes than it does running the actual task (counting to 10000). Tasks not yet created cannot count towards the 'runnable' state, yet they will take up %sys CPU time as they are spawned.

So, the answer in your case could really be that whatever work is being done spawns large numbers of tasks in quick succession (threads, or processes).































  • Thank you for the suggestion. The chart in my question shows %user time (CPU system time is excluded, we do only see a very slight increase in system time). Could many small tasks be the explanation anyways? If the load average is sampled every 5 seconds, is the CPU utilization data as given by mpstat more frequently sampled?

    – K Erlandsson
    Feb 12 '15 at 13:23











  • I am not familiar with how CPU sampling is done there. Never read the kernel source regarding it. In my example %usr was 70%+ and %sys was 15%.

    – Matthew Ife
    Feb 12 '15 at 13:30












  • Good examples!

    – Xavier Lucas
    Feb 12 '15 at 18:06


















5














If the load average doesn't increase much, it just means that your hardware specs and the nature of the tasks to be processed result in good overall throughput, so tasks don't pile up in the task queue for long.

If there were a contention phenomenon, because for instance the average task complexity is too high or the average task processing time takes too many CPU cycles, then yes, the load average would increase.

UPDATE:

It may not be clear in my original answer, so I'm clarifying now:

The exact formula for the load average is: loadavg = tasks running + tasks waiting (for cores) + tasks blocked.

You can definitely have good throughput and get close to a load average of 24 without any penalty on task processing time. On the other hand, you can also have 2-4 periodic tasks not completing quickly enough; then you will see the number of tasks waiting (for CPU cycles) grow and you will eventually reach a high load average. Another thing that can happen is having tasks run outstanding synchronous I/O operations and then block a core, lowering the throughput and making the waiting task queue grow (in that case you may see the iowait metric change).
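
(Illustrative sketch, not part of the original answer: the instantaneous ingredients of that formula can be eyeballed on Linux via /proc/stat, which exposes procs_running and procs_blocked counters next to the smoothed figures in /proc/loadavg.)

# Print the instantaneous counts of runnable and blocked tasks (/proc/stat)
# next to the smoothed load averages (/proc/loadavg).
def proc_counts():
    running = blocked = None
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("procs_running"):
                running = int(line.split()[1])
            elif line.startswith("procs_blocked"):
                blocked = int(line.split()[1])
    return running, blocked

with open("/proc/loadavg") as f:
    loadavg = f.read().strip()

running, blocked = proc_counts()
print("runnable=%s blocked=%s loadavg=%s" % (running, blocked, loadavg))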































  • It is my understanding that load average also includes the tasks currently executing. That would mean we definitely can have an increase in load average without actual contention for the CPUs. Or am I mistaken/misunderstanding you?

    – K Erlandsson
    Feb 12 '15 at 13:24












  • @KristofferE You are completely right. The actual formula is loadavg = tasks running + tasks waiting (for available cores) + tasks blocked. This means you can have a load average of 24 with no task waiting or blocked, thus just "full usage" of your hardware capacity without any contention. As you seemed confused about load average vs number of processes running vs CPU usage, I mainly focused my answer on explaining how a load average can still grow with so few running processes overall. It may indeed not be that clear after re-reading it.

    – Xavier Lucas
    Feb 12 '15 at 14:26



















2














Load average includes tasks that are blocked on disk IO, so you can easily have zero CPU utilization and a load average of 10 just by having 10 tasks all trying to read from a very slow disk. Thus it is common for a busy server to start thrashing the disk; all of the seeking causes lots of blocked tasks, driving up the load average, while CPU usage drops, since all of the tasks are blocked on the disk.




































    1














While Matthew Ife's answer was very helpful and led us in the right direction, it was not exactly what caused the behavior in our case. In our case we have a multi-threaded Java application that uses thread pooling, so no work is done creating the actual tasks.

However, the actual work the threads do is short-lived and includes IO waits or synchronization waits. As Matthew mentions in his answer, the load average is sampled by the OS, so short-lived tasks can be missed.

I made a Java program that reproduces the behavior. The following Java class generates a CPU utilization of 28% (650% stacked) on one of our servers. While doing this, the load average is about 1.3. The key here is the sleep() inside the thread; without it the load calculation is correct.



import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MultiThreadLoad {

    private ThreadPoolExecutor e = new ThreadPoolExecutor(200, 200, 0l, TimeUnit.SECONDS,
            new ArrayBlockingQueue<Runnable>(1000), new ThreadPoolExecutor.CallerRunsPolicy());

    public void load() {
        while (true) {
            e.execute(new Runnable() {

                @Override
                public void run() {
                    sleep100Ms();
                    // Short burst of CPU work after the sleep
                    for (long i = 0; i < 5000000l; i++)
                        ;
                }

                private void sleep100Ms() {
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                        throw new RuntimeException(e);
                    }
                }
            });
        }
    }

    public static void main(String[] args) {
        new MultiThreadLoad().load();
    }
}




To summarize, the theory is that the threads in our applications idle a lot and then perform short-lived work, which is why the tasks are not correctly sampled by the load average calculation.




































      0














Load average is the average number of processes in the CPU queue. It is specific to each system; you cannot say that one LA is generically high on all systems and another is low.
So you have 12 cores, and for the LA to increase significantly the number of processes must be really high.

Another question is what is meant by the "CPU Usage" graph. If it's taken from SNMP, like it should be, and your SNMP implementation is net-snmp, then it just stacks the CPU load from each of your 12 CPUs. So for net-snmp the total amount of CPU load is 1200%.

If my assumptions are correct, then the CPU usage didn't increase significantly.
Thus, the LA didn't increase significantly.





























      • The CPU usage is taken from mpstat, the "all" row. I am fairly certain it is an average across all CPUs; it is not stacked. For example, when the problem occurs, top shows 2000% CPU usage for one process. That is stacked usage.

        – K Erlandsson
        Feb 12 '15 at 12:31


















      0














The scenario here is not particularly unexpected, although it is a little unusual. What Xavier touches on, but does not develop much, is that although Linux (by default) and most flavours of Unix implement pre-emptive multi-tasking, on a healthy machine tasks will rarely be pre-empted. Each task is allotted a time slice for occupying the CPU; it is only pre-empted if it exceeds this time and there are other tasks waiting to run (note that load reports the average number of processes both in the CPU and waiting to run). Most of the time, a process will yield rather than being interrupted.

(In general you only need to worry about load when it gets close to the number of CPUs - i.e. when the scheduler starts pre-empting tasks.)

"if our CPUs are busy 75% of the time, shouldn't we see higher load average?"

It's all about the pattern of activity: clearly, increased utilization of the CPU by some tasks (most likely a small minority) was not having an adverse effect on the processing of other tasks. If you could isolate the transactions being processed, I would expect you would see a new group emerging during the slowdown, while the extant task set was not affected.

update

One common scenario where high CPU can occur without a big increase in load is where a task triggers one (or a sequence) of other tasks, e.g. on receipt of a network request the handler routes the request to a separate thread, and that thread then makes some asynchronous calls to other processes... the sampling of the run queue causes the load to be reported lower than it really is - but it does not rise linearly with CPU usage - the chain of tasks triggered would not have been runnable without the initial event, and because they occur (more or less) sequentially the run queue is not inflated.































      • The OP originally provided indications that the aggregate CPU% was "2000%", suggesting there are many tasks using up CPU rather than just 1 busy process. If it was a consistent 2000% for a minute you'd normally anticipate the load to be 20-ish.

        – Matthew Ife
        Feb 12 '15 at 14:10











      • ...in a comment, not in the question, and he's not very sure about that. In the absence of the 'ALL' option, mpstat reports the total % usage not the average. But that doesn't change the answer - it's about the pattern of activity.

        – symcbean
        Feb 12 '15 at 14:40











      • I'm 100% positive that the CPU util we see in the chart is the "average per CPU". Mpstat is run without ALL, but that only leaves out the per-CPU info, the all row still shows the average per CPU. I will clarify the question.

        – K Erlandsson
        Feb 12 '15 at 14:43












      • Could you please elaborate your last section a bit? I fail to grasp what you mean, and the part of my question you cited is the part I have most trouble understanding.

        – K Erlandsson
        Feb 12 '15 at 14:44











      Your Answer








      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "2"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f667078%2fhigh-cpu-utilization-but-low-load-average%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      7 Answers
      7






      active

      oldest

      votes








      7 Answers
      7






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      39














      On Linux at least, the load average and CPU utilization are actually two different things. Load average is a measurement of how many tasks are waiting in a kernel run queue (not just CPU time but also disk activity) over a period of time. CPU utilization is a measure of how busy the CPU is right now. The most load that a single CPU thread pegged at 100% for one minute can "contribute" to the 1 minute load average is 1. A 4 core CPU with hyperthreading (8 virtual cores) all at 100% for 1 minute would contribute 8 to the 1 minute load average.



      Often times these two numbers have patterns that correlate to each other, but you can't think of them as the same. You can have a high load with nearly 0% CPU utilization (such as when you have a lot of IO data stuck in a wait state) and you can have a load of 1 and 100% CPU, when you have a single threaded process running full tilt. Also for short periods of time you can see the CPU at close to 100% but the load is still below 1 because the average metrics haven't "caught up" yet.



      I've seen a server have a load of over 15,000 (yes really that's not a typo) and a CPU % of close to 0%. It happened because a Samba share was having issues and lots and lots of clients started getting stuck in an IO wait state. Chances are if you are seeing a regular high load number with no corresponding CPU activity, you are having a storage problem of some kind. On virtual machines this can also mean that there are other VMs heavily competing for storage resources on the same VM host.



      High load is also not necessarily a bad thing, most of the time it just means the system is being utilized to it's fullest capacity or maybe is beyond it's capability to keep up (if the load number is higher than the number of processor cores). At a place I used to be a sysadmin, they had someone who watched the load average on their primary system closer than Nagios did. When the load was high, they would call me 24/7 faster than you could say SMTP. Most of the time nothing was actually wrong, but they associated the load number with something being wrong and watched it like a hawk. After checking, my response was usually that the system was just doing it's job. Of course this was the same place where the load got up over 15000 (not the same server though) so sometimes it does mean something is wrong. You have to consider the purpose of your system. If it's a workhorse, then expect the load to be naturally high.






      share|improve this answer

























      • How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

        – K Erlandsson
        Feb 13 '15 at 9:17











      • I just did a test on my own, the threads in a Java process contributes to the load average as if they where separate processes (I.e. a java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

        – K Erlandsson
        Feb 13 '15 at 9:26











      • I mean if you have a non-multithreading process (ie, one that just uses a single CPU at a time). For instance if you just write a simple C program that runs a busy loop, its just a single thread running and uses only 1 CPU at a time.

        – deltaray
        Feb 13 '15 at 20:36











      • All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

        – K Erlandsson
        Feb 14 '15 at 13:13











      • For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

        – Nickolay
        Sep 12 '18 at 12:50















      39














      On Linux at least, the load average and CPU utilization are actually two different things. Load average is a measurement of how many tasks are waiting in a kernel run queue (not just CPU time but also disk activity) over a period of time. CPU utilization is a measure of how busy the CPU is right now. The most load that a single CPU thread pegged at 100% for one minute can "contribute" to the 1 minute load average is 1. A 4 core CPU with hyperthreading (8 virtual cores) all at 100% for 1 minute would contribute 8 to the 1 minute load average.



      Often times these two numbers have patterns that correlate to each other, but you can't think of them as the same. You can have a high load with nearly 0% CPU utilization (such as when you have a lot of IO data stuck in a wait state) and you can have a load of 1 and 100% CPU, when you have a single threaded process running full tilt. Also for short periods of time you can see the CPU at close to 100% but the load is still below 1 because the average metrics haven't "caught up" yet.



      I've seen a server have a load of over 15,000 (yes really that's not a typo) and a CPU % of close to 0%. It happened because a Samba share was having issues and lots and lots of clients started getting stuck in an IO wait state. Chances are if you are seeing a regular high load number with no corresponding CPU activity, you are having a storage problem of some kind. On virtual machines this can also mean that there are other VMs heavily competing for storage resources on the same VM host.



      High load is also not necessarily a bad thing, most of the time it just means the system is being utilized to it's fullest capacity or maybe is beyond it's capability to keep up (if the load number is higher than the number of processor cores). At a place I used to be a sysadmin, they had someone who watched the load average on their primary system closer than Nagios did. When the load was high, they would call me 24/7 faster than you could say SMTP. Most of the time nothing was actually wrong, but they associated the load number with something being wrong and watched it like a hawk. After checking, my response was usually that the system was just doing it's job. Of course this was the same place where the load got up over 15000 (not the same server though) so sometimes it does mean something is wrong. You have to consider the purpose of your system. If it's a workhorse, then expect the load to be naturally high.






      share|improve this answer

























      • How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

        – K Erlandsson
        Feb 13 '15 at 9:17











      • I just did a test on my own, the threads in a Java process contributes to the load average as if they where separate processes (I.e. a java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

        – K Erlandsson
        Feb 13 '15 at 9:26











      • I mean if you have a non-multithreading process (ie, one that just uses a single CPU at a time). For instance if you just write a simple C program that runs a busy loop, its just a single thread running and uses only 1 CPU at a time.

        – deltaray
        Feb 13 '15 at 20:36











      • All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

        – K Erlandsson
        Feb 14 '15 at 13:13











      • For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

        – Nickolay
        Sep 12 '18 at 12:50













      39












      39








      39







      On Linux at least, the load average and CPU utilization are actually two different things. Load average is a measurement of how many tasks are waiting in a kernel run queue (not just CPU time but also disk activity) over a period of time. CPU utilization is a measure of how busy the CPU is right now. The most load that a single CPU thread pegged at 100% for one minute can "contribute" to the 1 minute load average is 1. A 4 core CPU with hyperthreading (8 virtual cores) all at 100% for 1 minute would contribute 8 to the 1 minute load average.



      Often times these two numbers have patterns that correlate to each other, but you can't think of them as the same. You can have a high load with nearly 0% CPU utilization (such as when you have a lot of IO data stuck in a wait state) and you can have a load of 1 and 100% CPU, when you have a single threaded process running full tilt. Also for short periods of time you can see the CPU at close to 100% but the load is still below 1 because the average metrics haven't "caught up" yet.



      I've seen a server have a load of over 15,000 (yes really that's not a typo) and a CPU % of close to 0%. It happened because a Samba share was having issues and lots and lots of clients started getting stuck in an IO wait state. Chances are if you are seeing a regular high load number with no corresponding CPU activity, you are having a storage problem of some kind. On virtual machines this can also mean that there are other VMs heavily competing for storage resources on the same VM host.



      High load is also not necessarily a bad thing, most of the time it just means the system is being utilized to it's fullest capacity or maybe is beyond it's capability to keep up (if the load number is higher than the number of processor cores). At a place I used to be a sysadmin, they had someone who watched the load average on their primary system closer than Nagios did. When the load was high, they would call me 24/7 faster than you could say SMTP. Most of the time nothing was actually wrong, but they associated the load number with something being wrong and watched it like a hawk. After checking, my response was usually that the system was just doing it's job. Of course this was the same place where the load got up over 15000 (not the same server though) so sometimes it does mean something is wrong. You have to consider the purpose of your system. If it's a workhorse, then expect the load to be naturally high.






      share|improve this answer















      On Linux at least, the load average and CPU utilization are actually two different things. Load average is a measurement of how many tasks are waiting in a kernel run queue (not just CPU time but also disk activity) over a period of time. CPU utilization is a measure of how busy the CPU is right now. The most load that a single CPU thread pegged at 100% for one minute can "contribute" to the 1 minute load average is 1. A 4 core CPU with hyperthreading (8 virtual cores) all at 100% for 1 minute would contribute 8 to the 1 minute load average.



      Often times these two numbers have patterns that correlate to each other, but you can't think of them as the same. You can have a high load with nearly 0% CPU utilization (such as when you have a lot of IO data stuck in a wait state) and you can have a load of 1 and 100% CPU, when you have a single threaded process running full tilt. Also for short periods of time you can see the CPU at close to 100% but the load is still below 1 because the average metrics haven't "caught up" yet.



      I've seen a server have a load of over 15,000 (yes really that's not a typo) and a CPU % of close to 0%. It happened because a Samba share was having issues and lots and lots of clients started getting stuck in an IO wait state. Chances are if you are seeing a regular high load number with no corresponding CPU activity, you are having a storage problem of some kind. On virtual machines this can also mean that there are other VMs heavily competing for storage resources on the same VM host.



      High load is also not necessarily a bad thing, most of the time it just means the system is being utilized to it's fullest capacity or maybe is beyond it's capability to keep up (if the load number is higher than the number of processor cores). At a place I used to be a sysadmin, they had someone who watched the load average on their primary system closer than Nagios did. When the load was high, they would call me 24/7 faster than you could say SMTP. Most of the time nothing was actually wrong, but they associated the load number with something being wrong and watched it like a hawk. After checking, my response was usually that the system was just doing it's job. Of course this was the same place where the load got up over 15000 (not the same server though) so sometimes it does mean something is wrong. You have to consider the purpose of your system. If it's a workhorse, then expect the load to be naturally high.







      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited 47 mins ago

























      answered Feb 12 '15 at 21:38









      deltaraydeltaray

      980613




      980613












      • How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

        – K Erlandsson
        Feb 13 '15 at 9:17











      • I just did a test on my own, the threads in a Java process contributes to the load average as if they where separate processes (I.e. a java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

        – K Erlandsson
        Feb 13 '15 at 9:26











      • I mean if you have a non-multithreading process (ie, one that just uses a single CPU at a time). For instance if you just write a simple C program that runs a busy loop, its just a single thread running and uses only 1 CPU at a time.

        – deltaray
        Feb 13 '15 at 20:36











      • All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

        – K Erlandsson
        Feb 14 '15 at 13:13











      • For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

        – Nickolay
        Sep 12 '18 at 12:50

















      • How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

        – K Erlandsson
        Feb 13 '15 at 9:17











      • I just did a test on my own, the threads in a Java process contributes to the load average as if they where separate processes (I.e. a java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

        – K Erlandsson
        Feb 13 '15 at 9:26











      • I mean if you have a non-multithreading process (ie, one that just uses a single CPU at a time). For instance if you just write a simple C program that runs a busy loop, its just a single thread running and uses only 1 CPU at a time.

        – deltaray
        Feb 13 '15 at 20:36











      • All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

        – K Erlandsson
        Feb 14 '15 at 13:13











      • For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

        – Nickolay
        Sep 12 '18 at 12:50
















      How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

      – K Erlandsson
      Feb 13 '15 at 9:17





      How do you mean that I can have a load of 1 and 100% CPU with a single threaded process? What kind of threads are you talking about? If we consider our Java processes, they have tons of threads, but I was under the assumption that the threads were treated as processes from the perspective of the OS (they have separate PIDs on Linux after all). Could it be so that a single multi threaded java process is only counted as one task from a load average perspective?

      – K Erlandsson
      Feb 13 '15 at 9:17













      I just did a test on my own, the threads in a Java process contributes to the load average as if they where separate processes (I.e. a java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

      – K Erlandsson
      Feb 13 '15 at 9:26





      I just did a test on my own, the threads in a Java process contributes to the load average as if they where separate processes (I.e. a java class that runs 10 threads in a busy-wait loop gives me a load close to 10). I would appreciate a clarification about the threaded process you mentioned above. Thank you!

      – K Erlandsson
      Feb 13 '15 at 9:26













      I mean if you have a non-multithreading process (ie, one that just uses a single CPU at a time). For instance if you just write a simple C program that runs a busy loop, its just a single thread running and uses only 1 CPU at a time.

      – deltaray
      Feb 13 '15 at 20:36





      I mean if you have a non-multithreading process (ie, one that just uses a single CPU at a time). For instance if you just write a simple C program that runs a busy loop, its just a single thread running and uses only 1 CPU at a time.

      – deltaray
      Feb 13 '15 at 20:36













      All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

      – K Erlandsson
      Feb 14 '15 at 13:13





      All information I have found says that threads count as separate processes when seen from the kernel and when calculating load. Hence I fail to see how I could have a multi threaded process on full tilt resulting in 1 load and 100% CPU on a multi-CPU system. Could you please help me understand how you mean?

      – K Erlandsson
      Feb 14 '15 at 13:13













      For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

      – Nickolay
      Sep 12 '18 at 12:50





      For anyone looking for more detail: "Linux Load Averages: Solving the Mystery" by Brendan Gregg had all the answers I ever needed.

      – Nickolay
      Sep 12 '18 at 12:50













      23














      Load is a very deceptive number. Take it with a grain of salt.



      If you spawn many tasks in very quick succession which complete very quickly, the number of processes in the run queue is too small to register the load for them (the kernel counts load once every five seconds).



      Consider this example, on my host which has 8 logical cores, this python script will register a large CPU usage in top (about 85%), yet hardly any load.



      import os, sys

      while True:
      for j in range(8):
      parent = os.fork()
      if not parent:
      n = 0
      for i in range(10000):
      n += 1
      sys.exit(0)
      for j in range(8):
      os.wait()


      Another implementation, this one avoids wait in groups of 8 (which would skew the test). Here the parent always attempts to keep the number of children at the number of active CPUs such it will be much busier than the first method and hopefully more accurate.



/* Compile with flags -O0 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <err.h>
#include <errno.h>

#include <sys/signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#define ITERATIONS 50000

int maxchild = 0;
volatile int numspawned = 0;

void childhandle(
    int signal)
{
  int stat;
  /* Handle all exited children, until none are left to handle */
  while (waitpid(-1, &stat, WNOHANG) > 0) {
    numspawned--;
  }
}

/* Stupid task for our children to do */
void do_task(
    void)
{
  int i, j;
  for (i = 0; i < ITERATIONS; i++)
    j++;
  exit(0);
}

int main()
{
  pid_t pid;

  struct sigaction act;
  sigset_t sigs, old;

  maxchild = sysconf(_SC_NPROCESSORS_ONLN);

  /* Setup child handler */
  memset(&act, 0, sizeof(act));
  act.sa_handler = childhandle;
  if (sigaction(SIGCHLD, &act, NULL) < 0)
    err(EXIT_FAILURE, "sigaction");

  /* Defer the sigchild signal */
  sigemptyset(&sigs);
  sigaddset(&sigs, SIGCHLD);
  if (sigprocmask(SIG_BLOCK, &sigs, &old) < 0)
    err(EXIT_FAILURE, "sigprocmask");

  /* Create processes, where our maxchild value is not met */
  while (1) {
    while (numspawned < maxchild) {
      pid = fork();
      if (pid < 0)
        err(EXIT_FAILURE, "fork");
      else if (pid == 0) /* child process */
        do_task();
      else /* parent */
        numspawned++;
    }

    /* Atomically unblocks signal, handler then picks it up, reblocks on finish */
    if (sigsuspend(&old) < 0 && errno != EINTR)
      err(EXIT_FAILURE, "sigsuspend");
  }
}




The reason for this behaviour is that the algorithm spends more time creating child processes than it does running the actual task (counting to 10000). Tasks not yet created cannot count towards the 'runnable' state, yet will take up %sys on CPU time as they are spawned.



      So, the answer could really be in your case that whatever work is being done spawns large numbers of tasks in quick succession (threads, or processes).
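To make the five-second sampling concrete, here is a simplified sketch (not the actual kernel code; the constant approximates the 1-minute damping factor) of how the load average is maintained as an exponentially damped average of the run-queue length. A short burst of tasks that starts and finishes between two samples barely moves it:

# Simplified sketch of load-average accounting (not the kernel code): the
# 1-minute figure is an exponentially damped average of the run-queue
# length, sampled roughly every 5 seconds. Tasks that appear and exit
# between two samples contribute almost nothing.
import math

SAMPLE_INTERVAL = 5.0                          # seconds between kernel samples
DECAY_1MIN = math.exp(-SAMPLE_INTERVAL / 60)   # damping factor for the 1-minute average

def update_loadavg(loadavg, runnable_now):
    """Fold one run-queue sample into the running average."""
    return loadavg * DECAY_1MIN + runnable_now * (1.0 - DECAY_1MIN)

loadavg = 0.0
for runnable_now in [0, 0, 16, 0, 0, 0]:       # one brief burst of 16 runnable tasks
    loadavg = update_loadavg(loadavg, runnable_now)
print(round(loadavg, 2))                       # stays far below 16

The real kernel uses fixed-point arithmetic and also counts uninterruptible (D-state) tasks, but the sampling behaviour is the same in spirit.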






– Matthew Ife, answered Feb 12 '15 at 13:05 (edited Feb 12 '15 at 14:08)

























      • Thank you for the suggestion. The chart in my question shows %user time (CPU system time is excluded, we do only see a very slight increase in system time). Could many small tasks be the explanation anyways? If the load average is sampled every 5 seconds, is the CPU utilization data as given by mpstat more frequently sampled?

        – K Erlandsson
        Feb 12 '15 at 13:23











      • I am not familiar with how CPU sampling is done there. Never read the kernel source regarding it. In my example %usr was 70%+ and %sys was 15%.

        – Matthew Ife
        Feb 12 '15 at 13:30












• Good examples!

        – Xavier Lucas
        Feb 12 '15 at 18:06















      5














If the load average doesn't increase much, it just means that your hardware specs and the nature of the tasks being processed result in good overall throughput, so tasks don't pile up in the task queue for long.



If there were a contention phenomenon, for instance because the average task complexity is too high or the average task takes too many CPU cycles, then yes, the load average would increase.



UPDATE:



It may not be clear in my original answer, so I'm clarifying now:



The exact formula for the load average calculation is: loadavg = tasks running + tasks waiting (for cores) + tasks blocked.



You can definitely have good throughput and get close to a load average of 24 without any penalty on task processing time. On the other hand you can also have 2-4 periodic tasks not completing quickly enough; then you will see the number of tasks waiting (for CPU cycles) growing and you will eventually reach a high load average. Another thing that can happen is having tasks running outstanding synchronous I/O operations that block a core, lowering throughput and making the waiting task queue grow (in that case you may see the iowait metric changing).
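As a rough way to inspect those ingredients on a live box, a minimal sketch, assuming the usual Linux /proc layout: it prints the load averages from /proc/loadavg next to an instantaneous count of tasks in the R (runnable) and D (uninterruptible sleep) states, the two states that feed the formula above.

# Minimal sketch, assuming the usual Linux /proc layout: print the load
# averages and an instantaneous count of tasks in the R (runnable) and
# D (uninterruptible sleep) states.
import glob

def count_task_states():
    counts = {"R": 0, "D": 0}
    for stat_path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(stat_path) as f:
                # format: "pid (comm) state ..."; the state follows the closing paren
                state = f.read().rsplit(")", 1)[1].split()[0]
        except (OSError, IndexError):
            continue  # task exited while we were scanning
        if state in counts:
            counts[state] += 1
    return counts

with open("/proc/loadavg") as f:
    print("loadavg:", " ".join(f.read().split()[:3]))
print("runnable (R) / blocked (D) tasks right now:", count_task_states())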






– Xavier Lucas, answered Feb 12 '15 at 13:00 (edited Feb 12 '15 at 14:37)

























      • It is my understanding that load average also includes the tasks currently executing. That would mean we definitely can have an increase in load average without actual contention for the CPUs. Or am I mistaken/misunderstanding you?

        – K Erlandsson
        Feb 12 '15 at 13:24












• @KristofferE You are completely right. The actual formula is loadavg = tasks running + tasks waiting (for available cores) + tasks blocked. This means you can have a load average of 24 with no task waiting or blocked, thus having just a "full usage" of your hardware capacity without any contention. As you seemed confused about load average vs number of processes running vs CPU usage, I mainly focused my answer on explanations about how a load average can still grow with so few running processes overall. It may not be that clear indeed after re-reading it.

        – Xavier Lucas
        Feb 12 '15 at 14:26
















      2














      Load average includes tasks that are blocked on disk IO, so you can easily have zero cpu utilization and a load average of 10 just by having 10 tasks all trying to read from a very slow disk. Thus it is common for a busy server to start thrashing the disk and all of the seeking causes lots of blocked tasks, driving up the load average, while cpu usage drops, since all of the tasks are blocked on the disk.
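If this is suspected, one quick check is to list the tasks sitting in uninterruptible sleep and the kernel wait channel they are blocked on; a small sketch, assuming Linux and enough privileges to read other processes' /proc entries:

# Sketch: list tasks in "D" (uninterruptible sleep), the state that raises
# the load average while using no CPU, along with the kernel wait channel
# they are blocked in. Assumes Linux; reading other users' entries may
# require root.
import glob

for stat_path in glob.glob("/proc/[0-9]*/stat"):
    try:
        with open(stat_path) as f:
            pid_part, rest = f.read().split(") ", 1)
        if rest.split()[0] != "D":
            continue
        pid = pid_part.split()[0]
        with open("/proc/%s/comm" % pid) as f:
            comm = f.read().strip()
        with open("/proc/%s/wchan" % pid) as f:
            wchan = f.read().strip() or "?"
        print(pid, comm, wchan)
    except (OSError, ValueError, IndexError):
        continue  # process exited while scanning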






– psusi, answered Feb 12 '15 at 20:34



























              1














While Matthew Ife's answer was very helpful and led us in the right direction, it was not exactly what caused the behavior in our case. In our case we have a multi-threaded Java application that uses thread pooling, which is why no work is done creating the actual tasks.



However, the actual work the threads do is short-lived and includes IO waits or synchronization waits. As Matthew mentions in his answer, the load average is sampled by the OS, thus short-lived tasks can be missed.



I made a Java program that reproduced the behavior. The following Java class generates a CPU utilization of 28% (650% stacked) on one of our servers. While doing this, the load average is about 1.3. The key here is the sleep() inside the thread; without it the load calculation is correct.



import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MultiThreadLoad {

    private ThreadPoolExecutor e = new ThreadPoolExecutor(200, 200, 0l, TimeUnit.SECONDS,
            new ArrayBlockingQueue<Runnable>(1000), new ThreadPoolExecutor.CallerRunsPolicy());

    public void load() {
        while (true) {
            e.execute(new Runnable() {

                @Override
                public void run() {
                    sleep100Ms();
                    for (long i = 0; i < 5000000l; i++)
                        ;
                }

                private void sleep100Ms() {
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                        throw new RuntimeException(e);
                    }
                }
            });
        }
    }

    public static void main(String[] args) {
        new MultiThreadLoad().load();
    }
}





To summarize, the theory is that the threads in our application idle a lot and then perform short-lived work, which is why the tasks are not correctly sampled by the load average calculation.
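A rough way to check this theory is to sample the instantaneous number of runnable tasks much more often than the load accounting does while the class above is running; if the threads really do sleep most of the time and only burst briefly, most samples stay small even though the averaged CPU usage is high. A sketch, assuming Linux's /proc/stat and its procs_running field:

# Sketch: sample the instantaneous run-queue length (procs_running in
# /proc/stat) every 50 ms for about 10 seconds. If work arrives as short
# bursts separated by sleeps, most samples are tiny even when average CPU
# usage is high.
import time

def procs_running():
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("procs_running"):
                return int(line.split()[1])
    return 0

samples = []
for _ in range(200):
    samples.append(procs_running())
    time.sleep(0.05)

print("max runnable seen:", max(samples))
print("mean runnable:", sum(samples) / len(samples))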






– K Erlandsson, answered Feb 17 '15 at 8:45



























                      0














Load average is the average number of processes in the CPU queue. It is specific to each system; you cannot say that one LA is generically high on all systems and another is low.
So you have 12 cores, and for the LA to increase significantly the number of processes must be really high.



Another question is what is meant by the "CPU Usage" graph. If it's taken from SNMP, like it should be, and your SNMP implementation is net-snmp, then it just stacks the CPU load from each of your 12 CPUs. So for net-snmp the total amount of CPU load is 1200%.



If my assumptions are correct, then the CPU usage didn't increase significantly.
Thus, the LA didn't increase significantly.
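For reference, the arithmetic between the two conventions is just a division by the number of logical CPUs; a trivial sketch (the 2000% figure is the example mentioned in the comments, not a measurement):

# Trivial sketch: converting a "stacked" percentage (top's per-process
# figure, summed over cores) into a per-CPU average (mpstat's "all" row).
import os

stacked_percent = 2000.0           # example figure from the discussion above
ncpus = os.cpu_count() or 1        # 24 logical CPUs on the machine in question
print(stacked_percent / ncpus)     # 2000% stacked is roughly an 83% per-CPU average on 24 CPUs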






– drookie, answered Feb 12 '15 at 12:21























                      • The cpu usage is taken from mpstat, the all row. I am fairly certain it is an average across all CPUs, it is not stacked. For example, when the problem occurs, top shows 2000% CPU usage for one process. That is stacked usage.

                        – K Erlandsson
                        Feb 12 '15 at 12:31















                      0














The scenario here is not particularly unexpected, although it is a little unusual. What Xavier touches on, but does not develop much, is that although Linux (by default) and most flavours of Unix implement pre-emptive multi-tasking, on a healthy machine tasks will rarely be pre-empted. Each task is allotted a time slice for occupying the CPU; it is only pre-empted if it exceeds this time and there are other tasks waiting to run (note that load reports the average number of processes both in the CPU and waiting to run). Most of the time, a process will yield rather than being interrupted.



(In general you only need to worry about load when it gets close to the number of CPUs - i.e. when the scheduler starts pre-empting tasks.)




                      if our CPUs are busy 75% of the time, shouldn't we see higher load average?




It's all about the pattern of activity: clearly the increased utilization of the CPU by some tasks (most likely a small minority) was not having an adverse effect on the processing of other tasks. If you could isolate the transactions being processed, I would expect you would see a new group emerging during the slowdown, while the extant task set was not affected.



                      update



One common scenario where high CPU can occur without a big increase in load is where a task triggers one (or a sequence) of other tasks, e.g. on receipt of a network request, the handler routes the request to a separate thread, which then makes some asynchronous calls to other processes... the sampling of the run queue causes the load to be reported lower than it really is - but it does not rise linearly with CPU usage - the chain of tasks triggered would not have been runnable without the initial event, and because they occur (more or less) sequentially the run queue is not inflated.






– symcbean, answered Feb 12 '15 at 13:55 (edited Feb 12 '15 at 14:55)

























• The OP originally provided indications that the aggregate CPU% was "2000%", suggesting there are many tasks using up CPU rather than just 1 busy process. If it was a consistent 2000% for a minute you'd normally anticipate the load to be 20-ish.

                        – Matthew Ife
                        Feb 12 '15 at 14:10











                      • ...in a comment, not in the question, and he's not very sure about that. In the absence of the 'ALL' option, mpstat reports the total % usage not the average. But that doesn't change the answer - it's about the pattern of activity.

                        – symcbean
                        Feb 12 '15 at 14:40











                      • I'm 100% positive that the CPU util we see in the chart is the "average per CPU". Mpstat is run without ALL, but that only leaves out the per-CPU info, the all row still shows the average per CPU. I will clarify the question.

                        – K Erlandsson
                        Feb 12 '15 at 14:43












• Could you please elaborate on your last section a bit? I fail to grasp what you mean, while the part of my question you cited is the part I have most trouble understanding.

                        – K Erlandsson
                        Feb 12 '15 at 14:44















                      0














                      The scenario here is not particularly unexpected although it is a little unusual. What Xavier touches on, but does not develop much, is that although Linux (by default) and most flavours of Unix implement pre-emptive multi-tasking, on a healthy machine, tasks will rarely be pre-empted. Each task is alotted a time slice for occupying the CPU, it is only pre-empted if it exceeds this time and there are other tasks waiting to run (note that load reports the average number of processes both in the CPU and waiting to run). Most of the time, a process will yield rather than being interrupted.



                      (in general you only need to worry about load when it gets close the number of CPUs - i.e. when the scheduler starts pre-empting tasks).




                      if our CPUs are busy 75% of the time, shouldn't we see higher load average?




                      Its all about the pattern of activity, clearly increased utilization of the CPU by some tasks (most likely a small mintority) was not having an adverse effect on the processing of other tasks. If you could isolate the transactions being processed, I would expect you would see a new group emerging during the slowdown, while the extant task set was not affected.



                      update



                      One common scenario where high CPU can occur without a big increase in load is where a task triggers one (or a sequence) of other tasks, e.g. on receipt of a network request, the handler routes the request to a seperate thread, the seperate thread then makes some asynchronous calls to other processes.... the sampling of the runqueue causes the load to reported lower than it really is - but it does not rise linearly with CPU usage - the chain of tasks triggerred would not have been runnable without the initial event, and because they occur (more or less) sequentially the run queue is not inflated.






                      share|improve this answer

























                      • The OP originally provided indications that the aggregate CPU% was "2000%" suggesting there are many tasks using up CPU, rather than just 1 busy process. If it was a consistent 2000% for a minute you'd normally anticipate the load be 20-ish.

                        – Matthew Ife
                        Feb 12 '15 at 14:10











                      • ...in a comment, not in the question, and he's not very sure about that. In the absence of the 'ALL' option, mpstat reports the total % usage not the average. But that doesn't change the answer - it's about the pattern of activity.

                        – symcbean
                        Feb 12 '15 at 14:40











                      • I'm 100% positive that the CPU util we see in the chart is the "average per CPU". Mpstat is run without ALL, but that only leaves out the per-CPU info, the all row still shows the average per CPU. I will clarify the question.

                        – K Erlandsson
                        Feb 12 '15 at 14:43












                      • Could you please elaborate yoru last section a bit? I fail to grasp what you mean, while the part of my question you cited is the part I have most trouble understanding.

                        – K Erlandsson
                        Feb 12 '15 at 14:44













                      0












                      0








                      0







                      The scenario here is not particularly unexpected although it is a little unusual. What Xavier touches on, but does not develop much, is that although Linux (by default) and most flavours of Unix implement pre-emptive multi-tasking, on a healthy machine, tasks will rarely be pre-empted. Each task is alotted a time slice for occupying the CPU, it is only pre-empted if it exceeds this time and there are other tasks waiting to run (note that load reports the average number of processes both in the CPU and waiting to run). Most of the time, a process will yield rather than being interrupted.



                      (in general you only need to worry about load when it gets close the number of CPUs - i.e. when the scheduler starts pre-empting tasks).




                      if our CPUs are busy 75% of the time, shouldn't we see higher load average?




                      Its all about the pattern of activity, clearly increased utilization of the CPU by some tasks (most likely a small mintority) was not having an adverse effect on the processing of other tasks. If you could isolate the transactions being processed, I would expect you would see a new group emerging during the slowdown, while the extant task set was not affected.



                      update



                      One common scenario where high CPU can occur without a big increase in load is where a task triggers one (or a sequence) of other tasks, e.g. on receipt of a network request, the handler routes the request to a seperate thread, the seperate thread then makes some asynchronous calls to other processes.... the sampling of the runqueue causes the load to reported lower than it really is - but it does not rise linearly with CPU usage - the chain of tasks triggerred would not have been runnable without the initial event, and because they occur (more or less) sequentially the run queue is not inflated.






                      share|improve this answer















                      The scenario here is not particularly unexpected although it is a little unusual. What Xavier touches on, but does not develop much, is that although Linux (by default) and most flavours of Unix implement pre-emptive multi-tasking, on a healthy machine, tasks will rarely be pre-empted. Each task is alotted a time slice for occupying the CPU, it is only pre-empted if it exceeds this time and there are other tasks waiting to run (note that load reports the average number of processes both in the CPU and waiting to run). Most of the time, a process will yield rather than being interrupted.



                      (in general you only need to worry about load when it gets close the number of CPUs - i.e. when the scheduler starts pre-empting tasks).




                      if our CPUs are busy 75% of the time, shouldn't we see higher load average?




                      Its all about the pattern of activity, clearly increased utilization of the CPU by some tasks (most likely a small mintority) was not having an adverse effect on the processing of other tasks. If you could isolate the transactions being processed, I would expect you would see a new group emerging during the slowdown, while the extant task set was not affected.



                      update



                      One common scenario where high CPU can occur without a big increase in load is where a task triggers one (or a sequence) of other tasks, e.g. on receipt of a network request, the handler routes the request to a seperate thread, the seperate thread then makes some asynchronous calls to other processes.... the sampling of the runqueue causes the load to reported lower than it really is - but it does not rise linearly with CPU usage - the chain of tasks triggerred would not have been runnable without the initial event, and because they occur (more or less) sequentially the run queue is not inflated.







                      share|improve this answer














                      share|improve this answer



                      share|improve this answer








                      edited Feb 12 '15 at 14:55

























                      answered Feb 12 '15 at 13:55









                      symcbeansymcbean

                      18.7k12339




                      18.7k12339












                      • The OP originally provided indications that the aggregate CPU% was "2000%" suggesting there are many tasks using up CPU, rather than just 1 busy process. If it was a consistent 2000% for a minute you'd normally anticipate the load be 20-ish.

                        – Matthew Ife
                        Feb 12 '15 at 14:10











                      • ...in a comment, not in the question, and he's not very sure about that. In the absence of the 'ALL' option, mpstat reports the total % usage not the average. But that doesn't change the answer - it's about the pattern of activity.

                        – symcbean
                        Feb 12 '15 at 14:40











                      • I'm 100% positive that the CPU util we see in the chart is the "average per CPU". Mpstat is run without ALL, but that only leaves out the per-CPU info, the all row still shows the average per CPU. I will clarify the question.

                        – K Erlandsson
                        Feb 12 '15 at 14:43












                      • Could you please elaborate yoru last section a bit? I fail to grasp what you mean, while the part of my question you cited is the part I have most trouble understanding.

                        – K Erlandsson
                        Feb 12 '15 at 14:44
















