# Unresponsive system with almost 100% RAM and swap usage

## root.exe

This is probably a general Linux kernel problem. When you manage to fill up both your RAM and swap, the system (sometimes) becomes completely unresponsive. Eventually some memory-hungry processes get killed, which frees up some memory.

For example, I have a system with 8 GB of RAM and 2 GB of swap (on an HDD, while the whole OS is on an SSD). I have to run many different applications (hundreds of Chromium tabs count here too), sometimes plus a couple of virtual machines, so it's not hard to run out of memory. Of course, vm.swappiness is set to 0 so that fast RAM is used rather than the slow HDD until almost all RAM is filled up. But instead of just a slow system, I may end up with a completely unresponsive one, where I can't move the cursor or see the current time. It may take up to an hour (usually several minutes) for the OOM killer to do its work. I can try forcing that via Alt-SysRq-F, but this may kill important processes that shouldn't be killed.
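Whether Alt-SysRq-F is honoured at all depends on the kernel.sysrq sysctl: a value of 1 enables every function, and any other non-zero value is a bitmask in which bit 64 covers signalling of processes, including the manual OOM kill. A minimal sketch of that check follows; the helper name `sysrq_allows_oom_kill` is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: does a given kernel.sysrq value permit Alt-SysRq-F?
# 1 means "all functions enabled"; otherwise bit 64 (0x40) enables
# signalling of processes (term, kill, oom-kill).
sysrq_allows_oom_kill() {
    value=$1
    if [ "$value" -eq 1 ] || [ $(( value & 64 )) -ne 0 ]; then
        echo yes
    else
        echo no
    fi
}
```

On a live system the current value can be passed in with `sysrq_allows_oom_kill "$(cat /proc/sys/kernel/sysrq)"`. As for processes that must survive, root can write -1000 to /proc/<pid>/oom_score_adj to exempt a given PID from the OOM killer, which may soften the concern about losing important processes.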

Without a swap, the problem is the same.

So the question is: what can be done to keep the system usable in such cases?

----------

## NeddySeagoon

root.exe,

Welcome to Gentoo

Your system is swapping to HDD. You just can't see it easily.

The swap space is used only for dynamically allocated RAM. If swap cannot be used, the kernel will flush anything that has a permanent space on the HDD to free RAM, then reload it later.

By denying the use of swap, you remove one of the kernel's swapping choices, since all dynamically allocated RAM is effectively locked into RAM.

If you have dirty disk buffers, they must be committed before they are flushed. Code and read-only data will be flushed immediately.

Let swap be used and/or fit more RAM.

----------

## PaulBredbury

Put the swap partition/file on the SSD ;)

----------

## root.exe

NeddySeagoon,

PaulBredbury,

thanks for your replies!

So, when I see 0 MB usage of the swap partition, that's not the whole picture, right? Even with vm.swappiness set to its default value of 60, I saw the same 0 MB until 6-7 GB of RAM were filled up.

Previously, the swap was on a 5400 RPM HDD (really slow), but now I'm looking at the behaviour without the slow swap. Sometimes I just get an out-of-memory error (instead of a really slow system with HDD swap), and sometimes the same freeze as when swap is enabled.

PaulBredbury, 

I'm still not sure an SSD should be used for a swap partition, even though I have a fairly modern MLC SSD (OCZ Agility 4). Do you have any experience of using an SSD for swap over a fairly long time?

So the preferred way is to leave swap on and increase its size if needed?

----------

## NeddySeagoon

root.exe,

"0 MB usage of swap partition" means exactly that. It does not imply that no swapping is taking place.

During normal use, you should have about 10 MB of swap in use, as dynamically allocated space for startup/shutdown stuff is swapped out.

That's fine; it's not needed until you shut down.

Run top when your system gets slow and post the first 6 lines. free is also useful.

----------

## root.exe

At the very beginning (I had just rebooted and opened several light VMs and Chromium with a few tabs):

```
top - 23:52:01 up  1:04,  6 users,  load average: 0.91, 0.65, 0.51
Tasks: 234 total,   2 running, 232 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.8 us,  2.7 sy,  0.0 ni, 93.7 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8160216 total,  8018572 used,   141644 free,      372 buffers
KiB Swap:  2097148 total,        0 used,  2097148 free,   127972 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 5028 user      20   0 4603032 3.046g 3.004g S  19.0 39.1  10:35.77 VirtualBox
 2559 root      20   0  248032  70540  45392 R   6.3  0.9   0:29.41 X
 3711 user      20   0 3010100  46344   3308 S   6.3  0.6   0:36.48 kwin
 4155 user      20   0 1812952 329664 293584 S   6.3  4.0   1:54.07 VirtualBox
 4781 user      20   0 2013428 594888 562168 S   6.3  7.3   2:38.47 VirtualBox
```

After the first failed attempt to get a freeze:

```
top - 23:54:36 up  1:06,  6 users,  load average: 3.06, 2.21, 1.15
Tasks: 234 total,   1 running, 233 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.8 us,  2.9 sy,  0.0 ni, 92.8 id,  1.4 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8160216 total,  8033956 used,   126260 free,      384 buffers
KiB Swap:  2097148 total,   495840 used,  1601308 free,    66572 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 5028 user      20   0 4603032 3.014g 3.003g S  18.5 38.7  10:52.84 VirtualBox
 4641 user      20   0 2010308 560180 558312 S   6.2  6.9   2:42.66 VirtualBox
 4781 user      20   0 2013428 563264 561292 S   6.2  6.9   2:43.38 VirtualBox
10558 user      20   0  123504   1608   1084 R   6.2  0.0   0:00.01 top
    1 root      20   0    4232     20      0 S   0.0  0.0   0:00.47 init
```

And the most interesting part (with little free memory left, I ran firefox in a while true loop inside a VM that had a memory limit of several GB; of course, I didn't have that much memory available):

```
top - 23:57:00 up  1:09,  6 users,  load average: 30.30, 12.16, 4.92
Tasks: 232 total,   1 running, 231 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.7 us,  3.1 sy,  0.0 ni, 90.0 id,  4.2 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8160216 total,  8053452 used,   106764 free,      360 buffers
KiB Swap:  2097148 total,   630584 used,  1466564 free,    32888 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  679 root      20   0       0      0      0 D   6.2  0.0   0:23.17 kswapd0
10661 user      20   0  123352   1604   1084 R   6.2  0.0   0:00.01 top
    1 root      20   0    4232      0      0 S   0.0  0.0   0:00.54 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.22 ksoftirqd/0

top - 23:57:29 up  1:09,  6 users,  load average: 30.97, 13.83, 5.67
Tasks: 233 total,   1 running, 225 sleeping,   0 stopped,   7 zombie
%Cpu(s):  2.7 us,  3.1 sy,  0.0 ni, 89.8 id,  4.4 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8160216 total,  4766084 used,  3394132 free,     1032 buffers
KiB Swap:  2097148 total,   573244 used,  1523904 free,    89556 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10095 user      20   0 3570292 1.767g 1.720g S  84.2 22.7   1:39.72 VirtualBox
 4155 user      20   0 1812952 290392 288996 S   6.5  3.6   2:02.67 VirtualBox
    1 root      20   0    4232      0      0 S   0.0  0.0   0:00.57 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.22 ksoftirqd/0
```

Here I was dealing with an absolutely unresponsive system. I suppose these two outputs were taken right before and right after pressing Alt-SysRq-F, respectively (though VirtualBox should be present in the first one).

I guess a load average of over 30 speaks for itself. I have 8 virtual cores. The script was run this way:

```
while true; do top | head -n 12 >> /tmp/memlog2; sleep 5; done
```

So, the interval was 29 seconds instead of 5.
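The stretched interval can be read off the log itself: the two top headers above are stamped 23:57:00 and 23:57:29. A small sketch for measuring such gaps is shown below; `gap_seconds` is a made-up helper name, and it assumes both stamps fall on the same day.

```shell
# Hypothetical helper: seconds between two HH:MM:SS stamps (same day).
# awk converts each field to a number, so leading zeros are harmless.
gap_seconds() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":"); split(b, y, ":")
        print (y[1]*3600 + y[2]*60 + y[3]) - (x[1]*3600 + x[2]*60 + x[3])
    }'
}
```

For example, `gap_seconds 23:57:00 23:57:29` prints 29. For unattended logging, `top -b -n 1` (batch mode, a single iteration) is probably a safer thing to pipe into head than interactive top.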

By the way, after RAM usage has stabilized, I may keep getting some after-effects, e.g. the first launch of any application may be slower than usual, or window decorations are drawn before the window itself is. I haven't checked, but I hope running swapoff/swapon would fix such things.

----------

## NeddySeagoon

root.exe

```
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 5028 user      20   0 4603032 3.046g 3.004g S  19.0 39.1  10:35.77 VirtualBox
 2559 root      20   0  248032  70540  45392 R   6.3  0.9   0:29.41 X
 3711 user      20   0 3010100  46344   3308 S   6.3  0.6   0:36.48 kwin
 4155 user      20   0 1812952 329664 293584 S   6.3  4.0   1:54.07 VirtualBox
 4781 user      20   0 2013428 594888 562168 S   6.3  7.3   2:38.47 VirtualBox
```

You plan to overcommit RAM. Going by the VIRT column, you have 8G in use by your three VirtualBox sessions and 3G in use by kwin.

So sooner or later, as you only have 8G of RAM, you will have 3G always being swapped. That's what happens when you overcommit RAM.

This 3G may not be in the swap partition.
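For comparison, the RES column is closer to what each process actually holds in RAM than VIRT is. The sketch below totals RES for process lines pasted from top; the function name `sum_res_kib` is made up, and it assumes the default top layout, where RES is the sixth field, in KiB unless suffixed with m or g. Shared pages (SHR) are counted once per process, so the total overstates wherever processes share memory.

```shell
# Hypothetical helper: sum the RES column (field 6) of top process lines
# read from stdin and print the total in KiB. Plain values are KiB;
# an "m" or "g" suffix is scaled from MiB or GiB.
sum_res_kib() {
    awk '{
        v = $6
        if (v ~ /g$/)      { sub(/g$/, "", v); v *= 1024 * 1024 }
        else if (v ~ /m$/) { sub(/m$/, "", v); v *= 1024 }
        total += v
    }
    END { printf "%d\n", total }'
}
```

Fed the five process lines quoted above, this comes to roughly 4 GiB resident, against the 11G or so suggested by VIRT.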

----------

## root.exe

NeddySeagoon,

OK, that's right; however, we both know the virtual memory numbers mean nothing.

Well, the main question is about the kernel. How do I get rid of these freezes? What exactly is it trying to do under such memory pressure, and why does that lead to a load average in the tens?

----------

## NeddySeagoon

root.exe,

Check your iowait in top, although here, 

```
top - 23:57:29 up  1:09,  6 users,  load average: 30.97, 13.83, 5.67
Tasks: 233 total,   1 running, 225 sleeping,   0 stopped,   7 zombie
%Cpu(s):  2.7 us,  3.1 sy,  0.0 ni, 89.8 id,  4.4 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8160216 total,  4766084 used,  3394132 free,     1032 buffers
KiB Swap:  2097148 total,   573244 used,  1523904 free,    89556 cached
```

it's only 4.4%.

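The load average is the average number of jobs in the ready-to-run state over the time period (on Linux it also counts tasks in uninterruptible sleep, typically waiting on disk I/O). A value of 1.0 means that, on average, there is always one job running or waiting for the CPU.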
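To put a figure like 30.97 in perspective, it helps to divide it by the core count. On Linux the load average also includes tasks in uninterruptible sleep (state D, as kswapd0 shows in one of the listings above), so a high load alongside a mostly idle CPU tends to point at tasks blocked on I/O or memory reclaim rather than at CPU contention. A minimal sketch, with `load_per_core` as a made-up helper name:

```shell
# Hypothetical helper: load average divided by the number of cores,
# printed with two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}
```

With the numbers from this thread, `load_per_core 30.97 8` prints 3.87. On a live system the inputs could come from `cut -d' ' -f1 /proc/loadavg` and `nproc`.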

With nearly 600MB of swap in use, just for dynamically allocated memory, you clearly have a real shortage of RAM, but you also have 3.3G free, which doesn't make sense.

Under those circumstances you can issue a 

```
swapoff -a
```

to flush swap back to RAM. 
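Since swapoff has to pull everything in swap back into RAM, it seems prudent to check first that enough memory is available. The sketch below does only that comparison; `can_swapoff` is a made-up helper name, and both arguments are assumed to be in KiB.

```shell
# Hypothetical helper: is there enough available RAM to absorb
# everything currently in swap? Both arguments are in KiB.
can_swapoff() {
    avail_kib=$1
    swap_used_kib=$2
    if [ "$avail_kib" -gt "$swap_used_kib" ]; then
        echo yes
    else
        echo no
    fi
}
```

With the snapshot above (3394132 KiB free, 573244 KiB of swap used) it answers yes; with the earlier one (106764 free, 630584 used) it answers no. On a live system the inputs could be taken from /proc/meminfo: MemAvailable on kernels that provide it, and SwapTotal minus SwapFree.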

Have you set swappiness too high, so that the kernel swaps way before it needs to?

----------

## root.exe

NeddySeagoon,

yes, iowait seems to be OK. As for the load average, yes again, I do know that; there are that many jobs in the queue while the kernel is busy calculating scores for the OOM killer, or swapping processes out and bringing them back into RAM, or whatever else. Unfortunately, I don't know what it is actually busy with.

```
# sysctl vm.swappiness
vm.swappiness = 0
```

And the same was true during the experiment.

I just don't want to get such freezes. If there is no spare or free memory available, I'd prefer (as most users would, I suppose) to get an OOM error and stay with a working, usable system. Are there any kernel configuration options responsible for this behaviour? CONFIG_PREEMPT, CONFIG_HZ or the like? I don't know.
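One runtime knob aimed at this preference is strict overcommit accounting: with vm.overcommit_memory=2 the kernel refuses allocations beyond a commit limit, so programs see allocation failures instead of the machine thrashing. Whether that suits a desktop running VirtualBox is not something this thread establishes. The limit is roughly swap plus vm.overcommit_ratio percent of RAM (the default ratio is 50). A sketch of the arithmetic, ignoring huge pages, with a made-up function name:

```shell
# Hypothetical helper: approximate commit limit in KiB under
# vm.overcommit_memory=2, ignoring huge pages.
# Arguments: RAM in KiB, swap in KiB, vm.overcommit_ratio in percent.
commit_limit_kib() {
    ram_kib=$1
    swap_kib=$2
    ratio=$3
    echo $(( ram_kib * ratio / 100 + swap_kib ))
}
```

For the machine described here, `commit_limit_kib 8160216 2097148 50` gives 6177256 KiB, i.e. less than the installed RAM, so the ratio would likely need raising before such a setting became practical.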

----------

## eccerr0r

I don't think swappiness will help in this case.

I think what's happening is that you are truly out of memory here, but the culprit is not swap; it is the inability to swap. Not only does the kernel have problems swapping, there are also not enough pages of swappable memory to work with, and your performance tanks.

VMMs, to keep the performance of the virtual machines consistent, will tend to lock pages in memory so they can't be swapped out. The solution is to reduce the physical memory allocated to the VMs and add swap inside the VMs, or to get more RAM. (Having 8GB of RAM does not mean you can run four 2GB VMs: at first you can allocate all four, but once each uses up its own piece of RAM, the host machine will be in trouble. Overcommitting is evil but helps poorly designed software run...)

[ soapbox ]

I don't know why people insist on swappiness=0 and/or having no swap. I think Linux does a good job of choosing what to swap out when the machine is idle. If it finds idle memory, it will swap that process out, so that when the next huge allocation request comes, the page is ready and doesn't have to be swapped out before the application grabs and uses it. And even if no application needs it, using it as disk cache is better than wasting it on sleeping processes.

[ /soapbox ]

----------

## CleanTestr

I'm looking at your top in the 'about to fail' condition...

Perhaps you could set up a *not*-nice sshd (i.e. one running at high priority) on a port that you can talk to, just to get things restarted *when needed*. Most of the time it would be blocked and not running, so it oughtn't to get in the way of the rest of the system...

Not knowing your particular job mix, perhaps a way could be found to rework some of the script work so that programs exit completely after finishing a task, rather than waiting around in loops for more input; this would reduce swapping somewhat (?)

----------

## eccerr0r

The more I look into virtual machines, the more it seems that you should never use the host machine for general-purpose work.

Make sure you have enough RAM to encompass the virtual machines, plus an extra 512MB or so for maintaining them. Don't use this 512MB for general-purpose computing.
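That rule of thumb is simple arithmetic and can be checked before starting another VM. The sketch below uses made-up helper names (`host_headroom_mb`, `fits_with_reserve`) and assumes every VM eventually touches all the RAM assigned to it.

```shell
# Hypothetical helper: RAM left for the host (in MB) after subtracting
# every VM's allocation. First argument is total RAM, the rest are VMs.
host_headroom_mb() {
    left=$1
    shift
    for vm in "$@"; do
        left=$(( left - vm ))
    done
    echo "$left"
}

# Does the plan keep at least reserve_mb free for the host?
# Usage: fits_with_reserve reserve_mb total_mb vm_mb...
fits_with_reserve() {
    reserve=$1
    shift
    if [ "$(host_headroom_mb "$@")" -ge "$reserve" ]; then
        echo yes
    else
        echo no
    fi
}
```

Four 2GB VMs on an 8GB host leave nothing, so `fits_with_reserve 512 8192 2048 2048 2048 2048` answers no, while three of them leave 2048MB and pass.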

If you had been using all the RAM, say by running four 2GB simulations that were free to swap on an 8GB machine, the machine would not become unresponsive if another 1GB application were suddenly started. It would slow down under the virtual memory pressure, but it would be free to swap what it needed, instead of being forced not to swap as in the virtual machine case.

I don't know what would happen if virtual machine images were allowed to swap. This would keep your machine from getting into the stuck state, but the performance of the VMs could be very choppy, as memory that the guest OS thought was in RAM would actually be swapped out to disk... a fake page fault: for the guest OS the access is not a page fault, but for the host OS it is. The application running in the guest accesses a page that should not have faulted, and the access suddenly takes 1 second instead of nanoseconds...

Oh... and I'm curious whether you found this to be true or not: the virtual machines would be running just fine while the host is undergoing a serious OOM condition. Other than the hard drive trying to service swap requests, the virtual machines would not really know the host was in trouble...

----------

## NeddySeagoon

root.exe, 

It may be worth investigating Kernel Samepage Merging (KSM). This is designed to help the host reduce memory requirements when supporting several virtual machines, since common pages are kept in RAM only once, not once per VM.

Read the kernel help. 
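On kernels built with CONFIG_KSM, the counters live under /sys/kernel/mm/ksm, and pages_sharing indicates how many page mappings are being saved by merging. A small sketch of turning that counter into a size follows; `ksm_saved_kib` is a made-up helper name. Note that KSM only scans memory an application has marked with madvise(MADV_MERGEABLE), and whether VirtualBox does so is not covered in this thread.

```shell
# Hypothetical helper: memory saved by KSM in KiB, given the value of
# /sys/kernel/mm/ksm/pages_sharing and the page size in bytes.
ksm_saved_kib() {
    pages_sharing=$1
    page_size=$2
    echo $(( pages_sharing * page_size / 1024 ))
}
```

On a live system this could be called as `ksm_saved_kib "$(cat /sys/kernel/mm/ksm/pages_sharing)" "$(getconf PAGESIZE)"`, after root has enabled scanning by writing 1 to /sys/kernel/mm/ksm/run.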

I can confirm what eccerr0r said about RAM overcommit. I managed to start 8 x 2G VMs on a host with only 8G of physical RAM.

The OOM did not kick in but the VMs ran very very slowly due to the swapping.

The bare metal host in this case does nothing except support VMs. That was an accident I hope I never repeat.

----------

## CleanTestr

A question: if all 'swappable' pages have been written to swap, and therefore 'everything else' in RAM is non-swappable, does that 100'000 figure from top mean that any future attempts to do I/O, DMA or swap will require loading pages in chunks no larger than that size (100'000), rather than something larger, as would happen on a lightly loaded machine?

Could one pre-allocate (reserve and 'never' delete or re-use for some other purpose, such as executing code) an I/O or DMA 'space' of some 256Meg to do larger 'chunk-size' bus accesses through? Can one specify that the 'buffer' (not, in this case, buffers) for kernel or pipe should use a specified memory range (e.g. fuse)?

----------

## eccerr0r

All the pages ineligible for swap are basically removed from the pool of available memory. So your 8GB machine basically becomes a 128MB machine when 8064MB is locked to the VMs. The VMs are also doing disk requests, so that 128MB is being stressed by their cache requests too.

FUSE is messed up as it is; it's slow not because of buffer memory, but because it has to cross the kernel-user boundary often, and there's a lot of overhead there. The FUSE driver can be written to add its own buffers too...

----------

