# NCQ support since which year?

## Keruskerfuerst

Since which year has the Linux kernel supported NCQ?

----------

## theotherjoe

Found this article via Google:

http://lwn.net/Articles/183234/

Under "big serial ATA changes" one finds NCQ mentioned in a set of patches which were to go into 2.6.18.

2.6.18 was apparently released in September 2006.

----------

## Keruskerfuerst

As far as I know, NCQ works as follows:

NCQ reorders a queue of requests so that the accesses to the disk are sorted to reduce head movement.

I have Suse 42.1 with kernel 4.1.15-pv installed.

During boot, I see the following in dmesg:

NCQ not fully supported

I think NCQ for an SSD should reorder the reads/writes to the disk for serial row and column access.

----------

## NeddySeagoon

Keruskerfuerst,

libata, which provides the kernel SATA drivers, maintains a blacklist of drives whose NCQ is buggy.

NCQ also needs both kernel and drive support to work.
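A quick way to see what the kernel ended up with for a given drive is the queue depth it exposes in sysfs. This is only a minimal sketch: it assumes the drive shows up as `sda` and that your kernel provides `/sys/block/<dev>/device/queue_depth`; the helper name `ncq_queue_depth` and its `sysfs_root` parameter are made up for illustration.

```python
from pathlib import Path


def ncq_queue_depth(dev="sda", sysfs_root="/sys"):
    """Return the queue depth for a block device, or None if unavailable.

    Assumption: the kernel exposes it at
    <sysfs_root>/block/<dev>/device/queue_depth.
    """
    path = Path(sysfs_root) / "block" / dev / "device" / "queue_depth"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing device, unreadable file, or unexpected content.
        return None


if __name__ == "__main__":
    depth = ncq_queue_depth("sda")
    if depth is None:
        print("queue depth not available")
    elif depth > 1:
        print(f"queue depth {depth}: commands can be queued")
    else:
        print("queue depth 1: no queuing in effect")
```

A depth of 1 means only one command is outstanding at a time, so there is nothing for the drive to reorder.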

NCQ does little or nothing for SSDs as they are almost random access.

----------

## Keruskerfuerst

I think NCQ should reorder the accesses to an SSD from LBA order to row/column order.

Flash is like normal RAM and is accessed in a similar way.

----------

## Keruskerfuerst

And access to a "normal" (= magnetic) disc should be done this way:

A queue of a certain length should be reordered.

To minimize the (read/write) head movement.

The firmware of the disc knows the actual position of the (read/write) head.

The cylinder position is relevant.

----------

## NeddySeagoon

Keruskerfuerst,

In an SSD, the operating system has no knowledge of the internal structure of the media.

That's hidden behind the LBA that the SATA interface imposes between the operating system and the data.

Rotating rust is a bit more difficult as HDDs are divided into zones.  The number of physical sectors per track varies from zone to zone; there are about 50% fewer sectors per track near the spindle than at the outside of the drive.  This is easily detectable by doing read speed tests on a per-partition basis.

This makes the mapping between LBA and the location on the drive non-linear.  It gets harder still when the drive has relocated logical blocks to different physical sectors as the physical sectors fail.  Only the drive knows this mapping.
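To make the non-linearity concrete, here is a toy sketch of zoned mapping. The zone table is entirely hypothetical (three zones, with the innermost holding half as many sectors per track as the outermost); real drives have many more zones, multiple heads and remapped sectors, and keep the real table private in firmware. The function name `lba_to_location` is also just illustrative.

```python
# Hypothetical zone table, outermost zone first:
# (tracks_in_zone, sectors_per_track).
ZONES = [(1000, 200), (1000, 150), (1000, 100)]


def lba_to_location(lba, zones=ZONES):
    """Map an LBA to (zone index, absolute track, sector within track).

    Ignores multiple heads/platters and remapped sectors.
    """
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    first_track = 0
    for zone_index, (tracks, sectors_per_track) in enumerate(zones):
        zone_size = tracks * sectors_per_track
        if lba < zone_size:
            track, sector = divmod(lba, sectors_per_track)
            return zone_index, first_track + track, sector
        # Not in this zone: skip past it and try the next one inwards.
        lba -= zone_size
        first_track += tracks
    raise ValueError("LBA beyond end of drive")


for lba in (0, 100_000, 200_000, 300_000, 400_000):
    print(lba, lba_to_location(lba))
```

Equal steps in LBA land 500 tracks apart in the outer zone of this toy table but further apart in the inner zones, which is why the operating system cannot guess physical positions from LBAs alone.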

SSDs do bad block remapping too, so the same applies. It's just that as they are near random access, the speed penalty for accessing a remapped block is much reduced. Both sorts of drive may read data from remapped sectors out of order to minimise seek times.  That's the drive doing NCQ for itself, without any operating system involvement.  This does not mean that the data is returned to the operating system out of sequence, as it is when the operating system has several commands in flight and the drive returns data in a different sequence to that requested by the operating system.

----------

## Keruskerfuerst

The reordering of data accesses is done by the firmware of the disc (magnetic or flash).

Magnetic: reorder the accesses according to the position of the disc's read/write head.

Flash: reorder the data accesses to serialize the access to flash RAM.

How exactly is RAM accessed (row, column)?

Is it done by accessing a row serially

or

by accessing a column serially?

----------

## krinn

NCQ helps because of the rotational time needed to reach an area on the drive, and the drive's internals are laid out so that sector access takes the rotation into account.

If your disk needs 1ms to read a sector, and you put sector0 and sector1 next to each other on the disk platter, then any time you want to access sector0 and sector1, the 1ms taken to read sector0 lets the rotation carry the head past sector1, and to access sector1 you must wait for the next rotation.

To optimize that you can put sector0 and sector1 1ms apart: now you access sector0, that takes 1ms, and the disk has rotated exactly to sector1's position, allowing you to access sector1 immediately.

This is predictable for linear access, but with random access it can go badly.

If your heads are over sector4, sector12 is ahead of them and will be reached sooner than sector1: as sector1 is before sector4, your heads will need to wait for the next rotation to reach sector1, but during the current rotation they will soon be over sector12.

In this case your NCQ queue is ordered 12, then 1 (access sector 12 before 1), because it will read sector12 first and sector1 on the next rotation.

But if your heads are over sector0, the NCQ queue should be rearranged to read sector 1 and then sector 12: you will read sector1, which is next to 0, and on the same rotation you will soon be able to read sector12.

With your heads over sector4, not reordering the reads delays access to the wanted sectors.

Taking again the "we need 1ms to reach the next sector" assumption, say your heads are over sector4 on a disk made of 20 sectors

and you again ask for sector 1 and sector 12

- accessing sector 1 will take

> ((20-4) + 1 + 1) * 1ms = 18ms (you wait to pass over sector5, 6, 7... 20, and 0 and 1)

- accessing sector 12 afterwards will take

> (12-1) * 1ms = 11ms (you wait to pass over sector2, 3... until 12)

and if you had rearranged them

- accessing sector 12 will take

> (12-4) * 1ms = 8ms (you pass over 5, 6... up to 12)

- accessing sector 1 will take

> ((20-12) + 1 + 1) * 1ms = 10ms (you pass over sector13, 14... 20, and 0 and 1)

So with your heads over sector4, accessing sector1 then 12 takes 18+11ms = 29ms

and accessing sector12 then 1 takes 8+10ms = 18ms
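The arithmetic above can be checked with a few lines of Python. This is only a toy model of a single track: it assumes 21 positions per rotation (the counting above runs over sectors 0 through 20), 1ms per sector passing under the head, and no seek or transfer time. The names `wait_ms`, `total_ms` and `best_order` are made up for this sketch and say nothing about how real firmware decides.

```python
from itertools import permutations

SLOTS = 21          # the post counts sectors 0..20, i.e. 21 positions per rotation
MS_PER_SECTOR = 1   # assumed time for one sector to pass under the head


def wait_ms(head, target, slots=SLOTS):
    """Time until `target` comes under the head, starting over sector `head`."""
    return ((target - head) % slots) * MS_PER_SECTOR


def total_ms(head, order):
    """Total time to service the requests in the given order."""
    total = 0
    for sector in order:
        total += wait_ms(head, sector)
        head = sector  # the head is now over the sector just read
    return total


def best_order(head, requests):
    """Brute-force the cheapest order; fine for a tiny queue."""
    return min(permutations(requests), key=lambda order: total_ms(head, order))


print(total_ms(4, (1, 12)))    # 18 + 11 = 29
print(total_ms(4, (12, 1)))    # 8 + 10 = 18
print(best_order(4, [1, 12]))  # (12, 1)
print(best_order(0, [12, 1]))  # (1, 12)
```

With the heads over sector4 the reordered queue (12, then 1) wins by 11ms; with the heads over sector0 the natural order (1, then 12) is already the best one.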

In reality it's more complex because disks have multiple heads and multiple platters.

I think NCQ from the OS point of view is just that: you submit the sector accesses you need, and the OS doing NCQ just issues an "I want sector 1 and sector 12" query.

The disk controller then reads that queue and arranges it based on head positions and count, rotational speed, number of platters... in order to answer the queries faster.

PS: that's my theory on NCQ; I never really dug into it to confirm the behavior, so don't take my words for reality ;)

----------

## NeddySeagoon

Keruskerfuerst,

FLASH memory does not normally use rows and columns, that's a DRAM trick.

Instead, it has an address bus - the whole address is applied to the chip in one go, and one or more chip select lines to make selecting the right chips easy.

The chip select is usually driven by decoding the higher address bits that don't fit on an individual chip address bus.
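As a rough illustration of that decoding, here is a sketch that splits a flat address into a chip select index and an on-chip address. The sizes are hypothetical (four chips with a 20-bit address bus each), and `decode` is just an illustrative name, not any real controller's interface.

```python
CHIP_ADDRESS_BITS = 20   # hypothetical: each chip has a 20-bit address bus
NUM_CHIPS = 4            # hypothetical: four chips, so two higher bits pick the chip


def decode(address):
    """Split a flat address into (chip select index, address on that chip)."""
    # The higher bits that do not fit on the chip's address bus select the chip.
    chip = address >> CHIP_ADDRESS_BITS
    if not 0 <= chip < NUM_CHIPS:
        raise ValueError("address outside the populated chips")
    # The low bits go to the selected chip's address bus unchanged.
    on_chip = address & ((1 << CHIP_ADDRESS_BITS) - 1)
    return chip, on_chip


print(decode(0x000000))  # (0, 0)
print(decode(0x0FFFFF))  # (0, 1048575)
print(decode(0x100000))  # (1, 0)
print(decode(0x3ABCDE))  # (3, 703710)
```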

----------

## frostschutz

I don't use NCQ (libata.force=noncq). In regards to SSD/Queued TRIM there have been too many bugs for me to feel comfortable with it; and in regards to my WD Green HDDs they seem to perform poorly with NCQ.

As for how long the support has been there, you should be able to tell from the git history (e.g. on GitHub...). It's been a while in any case, NCQ is not particularly new.

----------

## Keruskerfuerst

Flash is like normal RAM, which does discharge.

----------

## krinn

 *frostschutz wrote:*   

> and in regards to my WD Green HDDs they seem to perform poorly with NCQ.

 

That's just an impression because they perform poorly in general; strictly in terms of NCQ they should benefit the most, as the lower the rpm, the more NCQ helps. It's SSDs that should have no real use for NCQ.

----------

## NeddySeagoon

Keruskerfuerst,

DRAM leaks by design, thus needs to be refreshed.

Also reads are destructive, the data can only be read once and it must be written back.  The DRAM does this itself.

FLASH leaks because 

a) the insulator cannot be made any better (it's good enough)

b) quantum tunnelling

FLASH does not need to be refreshed, reads are not destructive.

The memory cell designs are quite different.

----------

