# Would a Gentoo KVM Server Benefit From an M.2/NVMe Drive?

## dman777

I am slowly building a KVM server with Gentoo. I thought about getting an M.2/NVMe drive, but I want to keep my server/PC as cool as possible. I have read that they give off a significant amount of heat, even when idle. Cost is not really an issue, it's more just the heat thing. 

I don't move around large files. Most of the work is development web server Node stuff, plus some backend stuff like a DB. But mainly, I think, most of the speed needed is random access/random writes/random reads. As far as compiling goes, very minimal once I get all my tools installed. 

Curious, would I see a difference with my Gentoo KVM host and Gentoo KVM guests if I used an M.2/NVMe drive over a normal SSD?

----------

## aidanjt

It depends on your specific workloads. If your system's bottleneck is the IOPS/bandwidth of your mass storage device, then NVMe will certainly remove that bottleneck.
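Whether that bottleneck exists is measurable before buying anything. A minimal sketch of the random 4 KiB read pattern dman777 describes, run against a scratch file (fio with `--direct=1` against the real device is the proper benchmark; the file name, sizes, and offsets here are just examples):

```shell
#!/bin/sh
# Rough sketch of a random-read access pattern against a scratch file.
# The page cache skews the numbers badly; for real measurements use
# fio (e.g. fio --rw=randread --bs=4k --direct=1 ...).
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=64 status=none   # 64 MiB scratch file
i=1
while [ "$i" -le 100 ]; do
    off=$(( (i * 163) % 16384 ))   # pseudo-random 4 KiB block in 64 MiB
    dd if="$f" of=/dev/null bs=4k count=1 skip="$off" status=none
    i=$((i + 1))
done
rm -f "$f"
result="100 random 4k reads done"
echo "$result"
```

Run it once on the current SSD: if the device keeps up easily, an NVMe drive won't change much for this workload.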

----------

## pjp

Moved from Off the Wall to Kernel & Hardware.

----------

## mike155

 *Quote:*   

> I have read that they give off a significant amount of heat, even when idle. Cost is not really an issue, it's more just the heat thing.

 

It depends. Please look at this article and scroll down to the tables with the power consumption of various NVMe SSDs.

You will see that an NVMe SSD consumes far less power than a spinning disk drive. On the other hand, you would need multiple NVMe SSDs to match the capacity of a 14 TB spinning disk drive.

In my QEMU/KVM server, I use two 2TB Samsung 970 Evo Plus NVMe SSDs. No problems so far - I'm happy.

----------

## dman777

 *mike155 wrote:*   

>  *Quote:*   I have read that they give off a significant amount of heat, even when idle. Cost is not really an issue, it's more just the heat thing. 
> 
> It depends. Please look at this article and scroll down to find tables with the power consumption of various NVMe SSDs.
> 
> You will see that a NVMe SSD consumes far less power than a spinning disk drive. On the other hand, you will need multiple NVMe SSDs to get the same capacity as a 14TB spinning disk drive.
> ...

 

How does the heat of those compare to a normal SSD? 

Also, are you using an NVMe drive for your host system or just the guests?

----------

## mike155

 *Quote:*   

> How does the heat of those compare to a normal SSD? 

 

There's a difference between 'generated heat' and 'surface temperature'.

The 'generated heat' is important, because that's what you'll have to transport out of your chassis. The 'generated heat' is equivalent to the power consumption. If you look at the power consumption tables in the article I referred to, you will see that the Samsung 970 (an NVMe SSD) consumes about 20% more power than the Samsung 960 (a 'normal' 2.5" SSD). That's probably negligible. A spinning disk drive consumes around 5 W, more than twice as much.

Don't confuse 'generated heat' with 'surface temperature'. The NVMe controller chip has a much smaller surface than a 'normal' SSD, so its surface temperature will be much higher than that of a 'normal' SSD.

 *Quote:*   

> Also, are you using a NVMe for your host system or just the guests?

 

I use the two NVMe SSDs for my host system as well as for the guests.

----------

## dman777

Oh.... I read that article before.... really good article. 

So, I assume your NVMe SSDs never go into deep power save mode, since they are on a KVM system? Also, do you notice any latency from the other power saver modes? It seems that the latency from power saver modes offsets the benefits of NVMe in some ways. I don't like the wait when opening a file in vim if I haven't done any activity in a while.

----------

## DaggyStyle

Technically speaking, PCIe bandwidth is higher than that of a SATA port, so you should get higher performance with NVMe than with a normal SSD.

In fact, Intel backed this up when they decided to start selling IMDT.
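The negotiated PCIe link can be checked from sysfs on the host. A small sketch (the nvme0 path is an assumption; adjust it for your system, and the files only exist when an NVMe device is present):

```shell
#!/bin/sh
# Show the negotiated PCIe link speed/width of an NVMe device.
# /sys/class/nvme/nvme0/device is a symlink to the PCI device node.
dev=/sys/class/nvme/nvme0/device
if [ -r "$dev/current_link_speed" ]; then
    link="$(cat "$dev/current_link_speed"), x$(cat "$dev/current_link_width")"
else
    link="no NVMe device found"
fi
echo "$link"
```

For comparison: SATA III tops out at 6 Gbit/s (roughly 600 MB/s after encoding overhead), while a PCIe 3.0 x4 link carries close to 4 GB/s.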

----------

## mike155

 *Quote:*   

> So, I assume your NVMe SSDs never go into deep power save mode? 

 

I replaced my old spinning disk drives with Samsung 970 EVO Plus NVMe SSDs two weeks ago. In the last two weeks, I copied all my data and made sure that everything works. 

The next topic on my agenda is power management. Here is what I've found out so far:

Samsung NVMe SSDs support multiple power states. You can get a list of the power states with

```
nvme id-ctrl /dev/nvme0
```

Below are the last lines of the output (shortened):

```
ps    0 : mp:7.50W operational enlat:0 exlat:0
ps    1 : mp:5.90W operational enlat:0 exlat:0
ps    2 : mp:3.60W operational enlat:0 exlat:0
ps    3 : mp:0.0700W non-operational enlat:210 exlat:1200
ps    4 : mp:0.0050W non-operational enlat:2000 exlat:8000
```

The NVMe SSD supports 5 different power states. Power consumption ranges from 0.005 W to 7.5 W. enlat is the time needed to enter the state in microseconds; exlat is the time needed to exit the state in microseconds.
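To put those PS4 numbers in perspective, converting the latencies from microseconds to milliseconds:

```shell
#!/bin/sh
# PS4 latencies from the id-ctrl output above, in microseconds.
enlat_us=2000
exlat_us=8000
summary=$(awk -v en="$enlat_us" -v ex="$exlat_us" \
    'BEGIN { printf "enter: %g ms, exit: %g ms", en/1000, ex/1000 }')
echo "$summary"
```

So the cost of waking from the deepest state is single-digit milliseconds, which matters for the latency discussion below.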

Does Linux support those power states? Yes, a patch from Andy Lutomirski was included into the mainline kernel in 2017. See: https://lore.kernel.org/patchwork/patch/711737/

Does it work on my machine? Is automatic switching between power states enabled on my machine? When will the NVMe SSD enter power states 3 and 4?

```
nvme get-feature -f 0x0c -H /dev/nvme0
```

Below are the first lines of the output (shortened):

```
get-feature:0xc (Autonomous Power State Transition), Current value:0x000001
Autonomous Power State Transition Enable (APSTE): Enabled
Auto PST Entries
.................
Entry[ 0]
Idle Time Prior to Transition (ITPT): 71 ms
Idle Transition Power State   (ITPS): 3
.................
Entry[ 1]
Idle Time Prior to Transition (ITPT): 71 ms
Idle Transition Power State   (ITPS): 3
.................
Entry[ 2]
Idle Time Prior to Transition (ITPT): 71 ms
Idle Transition Power State   (ITPS): 3
.................
Entry[ 3]
Idle Time Prior to Transition (ITPT): 500 ms
Idle Transition Power State   (ITPS): 4
```

Autonomous Power State Transition is enabled. The output shows that the SSD will enter PS3 after 71 ms and PS4 after 500 ms of... of what? Inactivity?

Right now, I'm looking for a command that can read the current power state of my NVMe SSDs. Does anybody know such a command? Please tell me - I haven't found such a command yet.

I have some evidence that my NVMe SSDs enter PS4. Look at the statements below, which clear the cache and then read two blocks from an NVMe SSD:

```
echo 3 >/proc/sys/vm/drop_caches; \
/usr/bin/time dd if=/dev/nvme1n1 of=/dev/null bs=512 count=1; \
/usr/bin/time dd if=/dev/nvme1n1 of=/dev/null bs=512 count=1 skip=2G
```

Below is the output (shortened):

```
512 bytes copied, 0.00719804 s
512 bytes copied, 6.0646e-05 s
```

Please note that the first dd statement takes a little longer: 7 ms - that's roughly the exit latency from PS4. 

----------

## DaggyStyle

 *mike155 wrote:*   

> 
> 
> Right now, I'm looking for a command that can read the current power state of my NVMe SSDs. Does anybody know such a command? Please tell me - I haven't found such a command yet.
> 
> 

 

I think you are looking for feature 0x2

----------

## mike155

 *DaggyStyle wrote:*   

> I think you are looking for feature 0x2

 

Great! That's it:

```
nvme get-feature -f 0x02 -H /dev/nvme0
```

returns

```
get-feature:0x2 (Power Management), Current value:0x000004

Workload Hint (WH): 0 - No Workload

Power State   (PS): 4
```

Power state 4 (0.005 W = 5 mW) - exactly what I hoped for!

@DaggyStyle: Thanks!!!

@dman: you shouldn't worry too much about generated heat...

----------

## dman777

Is power state 4 what you want for a KVM guest? I ask because:

1) That guest is always running and accepting ssh connections

2) The wake-up lag is greater than with a normal SSD

----------

## dman777

 *mike155 wrote:*   

>  *DaggyStyle wrote:*   I think you are looking for feature 0x2 
> 
> Great! That's it:
> 
> ```
> ...

 

If I may ask, are you using direct pass through? I was wondering if those power states would work with direct pass through of the disk to the guest.

----------

## mike155

 *dman777 wrote:*   

> Is power state 4 what you want for a kvm guest? I ask because:
> 
> 1) That guest is always running and accepting ssh connections
> 
> 2)  The wake up time lag is greater than a normal ssd

 

I wouldn't worry about a few ms wakeup time from PS4.

A few years ago, we had spinning disks. They had a few ms seek delay for nearly every access. Now we have NVMe SSDs, which don't have seek delays. Access is much faster! Only the first access after an inactivity period of 500 ms or more will have a delay of a few ms. There's really no reason to worry about that. I think that even 'normal' SSDs have sleep states and probably a short wakeup delay. I don't know - I never cared about that.

Furthermore, there are multiple levels of caches between the NVMe and the SSH daemon. It might well be that no read access on the NVMe is necessary if a user logs in via SSH. And most writes to the NVMe will be cached anyway.
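The caching effect is easy to see with a scratch file. A rough sketch (note that the write just before means even the first read here is likely cached; on a real system you'd evict the cache first, as root, with `echo 3 > /proc/sys/vm/drop_caches`):

```shell
#!/bin/sh
# Read the same scratch file twice and time both passes. The second
# pass is served from the page cache and never touches the drive, so
# it cannot trigger a power-state wakeup.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=32 status=none
t0=$(date +%s%N)
dd if="$f" of=/dev/null bs=1M status=none
t1=$(date +%s%N)
dd if="$f" of=/dev/null bs=1M status=none
t2=$(date +%s%N)
first_ms=$(( (t1 - t0) / 1000000 ))
second_ms=$(( (t2 - t1) / 1000000 ))
echo "first: ${first_ms} ms, second (cached): ${second_ms} ms"
rm -f "$f"
```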

----------

## mike155

 *dman777 wrote:*   

> If I may ask, are you using direct pass through? I was wondering if those power states would work with direct pass through of the disk to the guest.

 

No, I create QEMU image files on an ext4 filesystem and use them as disk volumes in my QEMU/KVM guests.
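For reference, such an image is created with qemu-img; the path, name, and size below are only placeholders:

```shell
#!/bin/sh
# Create a sparse qcow2 image to use as a guest disk volume.
# The path is a placeholder; /var/lib/libvirt/images/ is typical.
img=$(mktemp -u /tmp/guest1-XXXXXX).qcow2
if command -v qemu-img >/dev/null 2>&1; then
    qemu-img create -f qcow2 "$img" 20G >/dev/null
    msg="created $img"
    rm -f "$img"
else
    msg="qemu-img not installed"
fi
echo "$msg"
# Attach it to a guest, e.g.:
#   qemu-system-x86_64 ... -drive file=$img,format=qcow2,if=virtio
```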

The answer to your question depends on what exactly you mean by 'direct pass through'. Are you talking about direct access to the NVMe:

 *man qemu wrote:*   

> NVM Express (NVMe) storage controllers can be accessed directly by a userspace driver in QEMU. This bypasses the host kernel file system and block layers while retaining QEMU block layer functionalities, such as block jobs, I/O throttling, image formats, etc. Disk I/O performance is typically higher than with -drive file=/dev/sda using either thread pool or linux-aio. 

 

I don't know what will happen to NVMe power states if you completely bypass the Linux kernel on the host system.

----------

