# SSD writes: should I really reduce them?

## hujuice

Hello everybody.

I would like to share a question I've been asking myself since I bought an SSD.  :Rolling Eyes: 

The Internet is full of serious recommendations about the need to avoid massive daily writes, in order to avoid write speed degradation.

Despite this large amount of papers, I'm still unable to understand how large this degradation can become: 50%? 10%? 1%?

The result, as an SSD noob, is that by moving frequently updated dirs away, I spent €100 to make use of 5 GB of disk: that's foolish!  :Shocked: 

The first effect is that I tell all my friends: «think twice before buying an SSD». Even my richest friends.

Can this degradation make the SSD slower at writes than a modern, SATA 3 rotating HDD?

The question is very important. All the more important since I mostly want quick reads (running binaries, reading data).

If I start to use my SSD like an HDD, without taking any care to save writes, what's the risk?

1) I need a better understanding: reads could degrade too, down to a very slow general throughput.

2) I could reach an extreme situation where my SSD becomes slower at writes than an HDD. Reads are still fast.

3) I can make full use of my €100 (writing logs, caches and so on), but only for a limited amount of time; then the drive becomes significantly slower at writes, but still faster than my HDD. Reads are still fast.

4) Since reads are the most important for me, if I make full use of the whole SSD I will be happy forever.

What do you think?

Regards,

HUjuice

PS: the drive is an OCZ Agility 3 SATA III 2.5'' SSD (60 GB).

----------

## NeddySeagoon

hujuice,

SSDs are much faster to read than to write - writes are between 10s and 100s of times slower.

The problem is that writes require a prior erase operation, and it's actually the erase that is slow, not the write.

To address this 'feature' the controller on the SSD keeps track of which blocks are clean (already erased) and which have data in them.

Writing to an already erased block is almost as fast as a read, since the erase operation can be skipped.

Eventually, all the free space on the SSD contains data that must be erased before the block can be written.  When this happens, writes are slower than a 'spinny' hard drive but reads are not affected.

There is more.  If the SSD knew in advance when data was being discarded, the block could be erased at data discard time, so that the unused blocks in the SSD would always be blank (erased) and ready to write without the erase operation. This is the purpose of the trim instruction, which is supported by most SSDs now and by some filesystems under some circumstances.

In short, whether you get trim support depends on your SSD, your setup and your choice of filesystem.
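As a rough sketch of how to check this on Linux (the device name /dev/sda is an assumption; these need root):

```shell
# Does the drive itself advertise TRIM support? (device name is an assumption)
hdparm -I /dev/sda | grep -i trim

# Does the kernel report a non-zero discard granularity for it?
lsblk --discard /dev/sda

# If the filesystem is not mounted with the 'discard' option,
# unused blocks can instead be trimmed in batches, e.g. from cron:
fstrim -v /
```

The commands are hardware-dependent, so treat them as a checklist rather than a script.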

----------

## Hypnos

My understanding --

* SSD writes do slow down over time, due to two effects:

- Write amplification, mitigated by having TRIM enabled

- Degradation of blocks, causing the controller to use spare blocks which are not contiguously located.

However, I have never heard of SSD writes becoming slower than those on a regular disk.

* SSD reads will never become slower than a regular disk.  Random access reads on an SSD are faster than even contiguous reads on a regular disk.

If your SSD ever becomes as slow as a regular disk it probably means it's about to fail.

----------

## albright

 *Quote:*   

> Degradation of blocks, causing the controller to use spare blocks which are not contiguously located

 

while degradation is real, I don't think SSDs care about contiguity any more than RAM does (but I could be wrong ...)

----------

## Hypnos

Contiguity matters even for SSDs.  Look at any benchmark.  (For example)

----------

## albright

 *Hypnos wrote:*   

> Contiguity matters even for ssds

 

That would seem to imply that defragging is needed, which I find doubtful. What speed improvement would be realized by defragging an SSD whose files were quite discontiguous (all else being the same, of course)? I can see there might be a purely theoretical difference. ...

----------

## Hypnos

There are two separate fragmentation issues: filesystem fragmentation, and disk fragmentation.

On a regular disk these are nearly one and the same, except for any bad blocks.  With an SSD they can be rather different, as the controller presents a contiguous block table to the operating system while secretly shuffling blocks around as they go bad.

I imagine that controllers are programmed so that logically contiguous blocks are actually as physically contiguous as possible, in order to keep up their benchmark numbers.  But with enough bad blocks this can't be done.

I have encountered the issue that my sequential benchmarks are much worse on a full filesystem than on a fresh filesystem, and have yet to solve it. (link)

----------

## whiteghost

I cannot find the article, but AnandTech calculated that with normal desktop use an SSD will last years.

I'm sure I read somewhere that fragmentation is purposeful, so that writes are not done in one area too long and don't generate unwanted heat.

Do your best tweaks and forget about it.

My /tmp is mounted tmpfs,

I have 8 GB of memory, so /var/tmp/portage is tmpfs too. Probably most useful for Gentoo users.

AHCI driver, ext4 and the discard mount option.

hujuice, I have an Agility 2 60 GB, 1 year old. Tests as in

https://wiki.archlinux.org/index.php/SSD_Benchmarking#OCZ-VERTEX2_240GB

```
hdparm -Tt /dev/sda

/dev/sda:
 Timing cached reads:   8226 MB in  2.00 seconds = 4114.15 MB/sec
 Timing buffered disk reads: 584 MB in  3.01 seconds = 193.98 MB/sec

localhost mike # dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.56786 s, 235 MB/s

localhost mike # echo 3 > /proc/sys/vm/drop_caches

localhost mike # dd if=tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.15339 s, 259 MB/s

localhost mike # dd if=tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.268517 s, 4.0 GB/s
```

----------

## destroyedlolo

Hello,

The problem is not fragmentation or write access; the problem is that SSD cells can endure only a certain number of write cycles (in the millions).

So if, for example, your filesystem logs access times to files, you will write the same cells again and again, every time you access the same file.

SSD controllers have a workaround for this, using spare cells.

As an example, some microcontrollers such as PICs use flash memory: they have a guaranteed retention time of 40 years, but you can kill a chip in only a few hours with a loop that continuously writes to its flash.

The solutions are:

configure the filesystem not to log access times to files (noatime)

don't use it for disk-intensive applications (databases)

configure plenty of caches

I don't have any links at hand, but I think this can easily be found on the web.

Bye

Laurent

----------

## WorBlux

There's a huge thread sticky on the OCZ linux support forum about how to maximize SSD performance for linux.

To summarize it...

1. Align the partitions to the drive sectors.

2. Mount with noatime and trim options

3. If you can reasonably avoid a swap partition, do so. 

4. Use the deadline or no-op disk schedulers for the drive.

The Agility 3 series are serious drives; with a few precautions they ought to have a long life, the first two points being the more important.
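A minimal sketch of points 1 and 4 on a running system (the device name /dev/sda is an assumption):

```shell
# Point 4: check and set the I/O scheduler for the SSD (sda is an assumption;
# the echo lasts until reboot - make it permanent via the kernel
# command line or a udev rule).
cat /sys/block/sda/queue/scheduler
echo deadline > /sys/block/sda/queue/scheduler

# Point 1: list partition start sectors; starts divisible by 8
# (with 512-byte logical sectors) are 4 KiB-aligned.
fdisk -lu /dev/sda
```

Like most /sys tweaks, these are hardware-dependent one-liners rather than a portable script.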

----------

## hujuice

I find this discussion very interesting, even if I'm still unable to find a way to handle practical situations.  :Surprised: 

First of all, I don't have any problem with aligning, trimming and so on.

The Internet is full of papers about it.

My problem is that avoiding writes could mean avoiding disk usage.

When I was a child it was forbidden to walk on my grandmother's carpet, to keep it clean and new.

But I was a kid, and stepping and lying on it was the reason for a carpet to exist.

So, I didn't understand that rule.

Similarly, «avoid writes» doesn't feel like a good principle to me.  :Rolling Eyes: 

If you need write performance, what's the sense in directing writes to a slower device?

And if you mostly need read performance (my case)... again, what's the sense in writing elsewhere, degrading the reads?

I think that the general advice «avoid writes» should be rewritten as something more precise: «avoid secondary writes if you need write performance for primary tasks».

So, I'm considering this:

```
/tmp     -> tmpfs
/var/tmp -> tmpfs
/var/log -> to my "old" HDD (a secondary task, in performance needs)
*        -> to my new SSD
```

Do you think I will somehow waste my €100?
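For what it's worth, that layout could be sketched as /etc/fstab entries like these (device names and filesystem types are assumptions, not taken from the thread):

```
tmpfs      /tmp      tmpfs  defaults,noatime          0 0
tmpfs      /var/tmp  tmpfs  defaults,noatime          0 0
/dev/sdb1  /var/log  ext4   defaults,noatime          0 2
/dev/sda1  /         ext4   defaults,noatime,discard  0 1
```

The first two lines keep the hottest temp dirs in RAM, the HDD partition absorbs log writes, and only the root line carries the SSD-specific discard option.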

HUjuice

----------

## whiteghost

```
/tmp     -> tmpfs
/var/tmp -> tmpfs
/var/log -> to my "old" HDD (a secondary task, in performance needs)
*        -> to my new SSD
```

perfect

----------

## Hypnos

hujuice,

I think your scheme is sensible.  In my case, I have a laptop preinstalled with an SSD.  I put /tmp and /var/tmp/paludis on tmpfs, but /var/log is just on the SSD.  Don't forget the "noatime,discard" options for your SSD.  I have also:

* reduced Firefox disk cache to 0

* only use ccache for specific packages

* set swappiness to zero (I have a swapfile for hibernation)

Each person must decide what functionality they can go without for the sake of preserving the SSD.  I should probably put /var/log in tmpfs, as it's just a few MB.
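For reference, the swappiness setting can be made persistent with an /etc/sysctl.conf fragment like this (the value 0 mirrors the post; whether it suits a given machine is a judgment call):

```
# /etc/sysctl.conf - keep the kernel from swapping unless it must
vm.swappiness = 0
```

It takes effect at boot, or immediately with `sysctl -p`.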

----------

## whiteghost

I disabled ccache.

I checked to see if I got 'hits' that would show it was doing me any good, and got very few.

----------

## Hypnos

I develop software for my field, so I often end up doing quick rebuilds on my personal ebuilds.   I use a wildcard in my Paludis config to use ccache only for these packages:

```
if ( echo "${CATEGORY}" | grep -q "sci-" ); then
   CFLAGS="${CFLAGS} -ggdb"
   CXXFLAGS="${CXXFLAGS} -ggdb"
   PATH="/usr/lib/ccache/bin:${PATH}"
   CCACHE_DIR="/var/tmp/ccache"
   SANDBOX_WRITE="${SANDBOX_WRITE}:${CCACHE_DIR}"
fi
```

Plenty of hits, of course.

----------

## hujuice

It's a "headless" host for small LAN services: apache, common development area, portage sharing via NFS, dnsmasq for dhcp/DNS purposes, centralized logs, backups, what else?

No X.

Actually, I care mostly about webserver reads, portage sharing and local backups.

/var/www and /usr/portage are on the HDD and I will move them back to the SSD, to improve performance.

Also, I back up everything to the HDD (I love rdiff-backup!) and it doesn't make sense if /var/www is already there.

But these are my little, dirty annoyances  :Smile: 

Thank you!

Hujuice

----------

## NeddySeagoon

albright,

Contiguousness matters because the erase operation works on much bigger areas of the disk than the sector size.

This means that random writes cause the drive either to do read/modify/write cycles, which are slow, or to move a chunk of data equal to the erase size somewhere else.

That's like a read/modify/write but skipping the erase.  Thus contiguous writes do not incur the overhead associated with moving data around.

----------

## hujuice

A very interesting read is a series of articles on AnandTech.

Some of you may already know them.

About write degradation: http://www.anandtech.com/show/2738/8

About the amount of degradation: http://www.anandtech.com/show/2738/13

About the (almost non-existent) read degradation: http://www.anandtech.com/show/2738/14

And so on...

Regards,

HUjuice

----------

## albright

 *NeddySeagoon wrote:*   

> 
> 
> Contiguousness matters because the erase operation works on much bigger areas of the disk than the sector size.
> 
> This means that random writes cause the drive to either do read/modify/write cycles which are slow, or to move a chunk of data equal to the erase size somewhere else.
> ...

 

thanks - that is helpful. But won't that make a difference only on a drive that is running out of space which has never been written to ... and doesn't trim solve that problem?

----------

## NeddySeagoon

albright,

Trim is not really relevant here.  Let's take a worked example.

Suppose the SSD has a 4k sector size. Further, suppose that the minimum erase chunk size is 64k.

The 4k sector size is real. The 64k erase size was typical when I used to work with raw FLASH chips.

To write a random 4k sector, at best the drive needs to relocate 64k, changing one 4k sector on the way. So a random 4k write actually causes a read of 60k and a write of 64k.
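That worst case can be checked with a little shell arithmetic (the 4 KiB sector and 64 KiB erase-block figures are the assumed numbers from the example, not properties of any particular drive):

```shell
#!/bin/sh
# Assumed figures from the worked example: 4 KiB sectors, 64 KiB erase blocks.
sector_kib=4
erase_kib=64

# A worst-case random 4 KiB write relocates a whole erase block:
# read back everything except the sector being replaced, then write it all.
read_kib=$(( erase_kib - sector_kib ))
write_kib=$erase_kib
amplification=$(( write_kib / sector_kib ))

echo "read ${read_kib}k, write ${write_kib}k, amplification ${amplification}x"
```

With these numbers a single 4k write costs a 60k read and a 64k write, i.e. 16x write amplification.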

Trim can then erase the 64k area that was relocated.  All this happens in the on drive RAM.

For sequential writes, the drive knows that the erase area is clean at the start, and a sequence of 4k sectors can be written. 

These are the two extremes. Tricks like write combining help alleviate the worst case, also the drive keeps track of erased sectors, so a write to a clean sector does not incur the 64k move.

----------

