# NFS vs SMB?

## pwnell

Hi,

I have a Gigabit Ethernet switch to which I have connected one Gentoo Linux 2.6.15-rc2 machine and two Apple Macs - one PowerBook and one G5.  All three machines have gigabit NICs and are functioning correctly at 1000Mbps.

I have fast 400GB SATA 7200rpm HDDs (WD4000YR) in all the machines.  I want to share a 400GB HDD on the Linux machine for the Macs' use.  Obviously I need to choose SMB or NFS.  My question is: which one is better, and what performance can I typically expect?  (The figures below are based on a transfer of thousands of 8MB files.)

I am getting 14MB/s write FROM Mac TO Linux using NFS.

I am getting 10MB/s read FROM Mac reading off Linux using NFS.

I am getting 33MB/s write FROM Mac TO Linux using SMB.

I am getting 22MB/s read FROM Mac reading off Linux using SMB.

I am getting 34MB/s write FROM Mac TO Linux using FTP.

Are these figures good or bad?
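For scale: gigabit Ethernet carries at most 125MB/s of raw bits (usable payload is lower after framing), so even the best figure above uses barely a quarter of the wire. A quick back-of-envelope comparison:

```python
# Rough yardstick: how the measured rates compare to raw gigabit capacity.
# (125 MB/s is the raw bit rate; usable payload is lower after framing.)
wire_mb_s = 1000 / 8  # 1000 Mbit/s over 8 bits per byte = 125 MB/s

measured = {"NFS write": 14, "NFS read": 10,
            "SMB write": 33, "SMB read": 22, "FTP write": 34}
for name, mb_s in measured.items():
    print(f"{name}: {mb_s} MB/s ({mb_s / wire_mb_s:.0%} of raw gigabit)")
```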

----------

## zxiiro

Looks normal to me. I've always had terrible speeds with NFS compared to SMB, so I got rid of NFS in favour of SMB.

----------

## pwnell

I am just a bit confused.  If my hard disk drives are rated at 69MB/s read/write, how come I only get 34MB/s using FTP (with almost no overhead)?

----------

## sundialsvc4

Well, if those drives are "a network away" from you .. then the speed of the network and of the server-software would matter too, would it not?

----------

## pwnell

 *sundialsvc4 wrote:*   

> Well, if those drives are "a network away" from you .. then the speed of the network and of the server-software would matter too, would it not?

 

Yes but I am on a switched Gigabit LAN with no traffic - the two machines are the only ones on the network.

----------

## 1U

Scp is great for copying files too. It's not as convenient as having a filesystem permanently mounted (for that you can try shfs or sshfs), but it's VERY fast. I once did 90 gigabytes with scp without a single problem, and though I did not monitor transfer rates, it was done in under 2 hours on 100 Mbit. Give it a try if you haven't already; every system with ssh installed should have it.

----------

## allucid

Really? I tried SCP between two machines on a local network and it wasn't fast at all (compared to FTP). This is with two fast machines and using the blowfish cipher.

----------

## pwnell

 *1U wrote:*   

> Scp is great for copying files too. It's not as convenient as having a filesystem permanently mounted (for that you can try shfs or sshfs), but it's VERY fast. I once did 90 gigabytes with scp without a single problem, and though I did not monitor transfer rates, it was done in under 2 hours on 100 Mbit. Give it a try if you haven't already; every system with ssh installed should have it.

 

I'll give shfs and sshfs a try.  I cannot use scp since I need my Mac to see the share as part of the file system (I want to store Aperture's library there).

Btw - it is pretty much impossible to transfer 90GB over a 100Mbps link in under 2 hours.  Assuming your machines were fast enough not to be a bottleneck given the encryption overhead in ssh, the theoretical throughput is about 11MB/s (there is protocol overhead involved).  That gives 92160MB / 11MB/s = 8378s = 139 minutes > 120 minutes (two hours)...
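The same arithmetic, spelled out:

```python
# 90 GB at ~11 MB/s effective payload rate on 100 Mbit cannot finish
# inside two hours, even under optimistic assumptions.
size_mb = 90 * 1024      # 90 GB as 92160 MB
rate_mb_s = 11           # optimistic payload rate on 100 Mbit
secs = size_mb / rate_mb_s
print(f"{size_mb} MB / {rate_mb_s} MB/s = {secs:.0f} s = {secs / 60:.1f} min")
# -> 92160 MB / 11 MB/s = 8378 s = 139.6 min, i.e. more than two hours
```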

----------

## 1U

The transfer felt faster than 2 hours. I had 1000 Mbit NICs, although the switch was only capable of 100 Mbit. Scp was reporting 13MB/s transfer rates. I was amazed at first too; I didn't expect it to be that quick. But I checked the files and they were not corrupt and nothing was wrong.

Scp doesn't use encryption by default, so it didn't create any overhead. Another thing I noticed was that the machines were using all the free RAM they could, even a gigabyte of it - it seems a lot of file data was being cached to improve performance. Most of what I copied were very large files, so that's another reason I was able to move so much without wasting time on per-file overhead.

Also, as for shfs: a friend and I tried it and we both noticed occasional errors. After all, it's still supposed to be experimental. It's convenient for things like copying a few files here and there when working on a website remotely, for example. But I wouldn't trust it to move any really important or large files. My friend tried it with larger files and reported heavy corruption, slow speeds, and other errors. I also noticed it sometimes gives read errors even on small tasks. Perhaps others have been lucky, but I'm going to use other means of copying, like scp, for now.

----------

## allucid

 *1U wrote:*   

> Scp doesn't use encryption by default, so it didn't create any overhead.

 

ssh and scp always use encryption; you cannot disable it. Even if you don't specify a cipher, a default cipher is used. Maybe you are thinking of rsh/rcp?

----------

## nobspangle

@pwnell

What bus are your gigabit and sata cards on?

If they are on PCI, that is your bottleneck.

34MB/s sounds pretty quick to me for one drive. You can never expect to get close to the maximum speed. If you want more speed, try a RAID array.

----------

## pwnell

 *nobspangle wrote:*   

> @pwnell
> 
> What bus are your gigabit and sata cards on?
> 
> If they are on PCI, that is your bottleneck.
> ...

 

Well, I actually have a RAID array: RAID5 with 4 disks, on an Adaptec hardware RAID controller.  I get a raw disk throughput of 55MB/s, both read and write, to/from the array.

The cards are all on PCI - but I do not believe that to be the bottleneck.  The PCI bus can handle 127MB/s - the max of the network is 100MB/s.  However - if they share the bandwidth then that would translate to 63MB/s per device - right?  Still - I am only getting 34MB/s........

----------

## groovin

 *pwnell wrote:*   

> 
> 
> The cards are all on PCI - but I do not believe that to be the bottleneck.  The PCI bus can handle 127MB/s - the max of the network is 100MB/s.  However - if they share the bandwidth then that would translate to 63MB/s per device - right?  Still - I am only getting 34MB/s........

 

Those are all theoretical speeds... in real life you don't see numbers that high.
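For reference, the textbook figures behind that argument: 32-bit/33MHz PCI tops out around 133MB/s and gigabit at 125MB/s, and a network-to-disk copy crosses a shared PCI bus twice, so the theoretical ceiling is roughly half the bus. A sketch of that arithmetic:

```python
# Theoretical ceilings for a network-to-disk copy over shared 32-bit/33 MHz
# PCI: the NIC DMAs each byte into RAM across the bus, then the SATA
# controller pulls it back across the same bus, so the usable rate is at
# most half the bus bandwidth - before any real-world PCI inefficiency.
pci_mb_s = 33.33e6 * 4 / 1e6   # ~133 MB/s theoretical bus bandwidth
gige_mb_s = 1000 / 8           # 125 MB/s raw gigabit
ceiling = pci_mb_s / 2         # ~66 MB/s network-to-disk ceiling
print(f"bus {pci_mb_s:.0f} MB/s, wire {gige_mb_s:.0f} MB/s, "
      f"copy ceiling ~{ceiling:.0f} MB/s")
```

Real sustained PCI throughput falls well short of the theoretical number, so 34MB/s is not unreasonable for this setup.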

----------

## neuron

Gah, yet another user who finds SMB faster than NFS. I have no idea why, but my SMB is about 7MB/s and NFS is 11MB/s (on standard 100 Mbit) :/

----------

## pwnell

Ok I am now TOTALLY lost.

I decided to try netatalk, since it is AFP and might work better with the Mac.  I compiled AFP support into the kernel and started the daemon.  The volume mounted fine from Mac OS X 10.4.3.

I copied a file from the Mac to Linux via AFP (netatalk).  It goes quickly (about 40MB/s) up to 205MB, then stalls for a second or two, then carries on quickly, then stalls again, and so on.  The stalls correspond *exactly* with the times pdflush writes the dirty pages to disk:

```
waldopcl vm # vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0    344 1480188     48 505532    0    0   261  1120  254   213  0  1 95  3
 0  0    344 1480188     48 505532    0    0     0     0  255    26  0  0 100  0
 0  0    344 1480188     48 505532    0    0     0     0  253    26  0  0 100  0
 1  0    344 1471740     84 515016    0    0    20     0 1411  2035  0  5 94  1
 0  0    344 1449980    104 536824    0    0     0    48 2826  4536  0  9 91  0
 0  0    344 1434108    120 552652    0    0     0     0 2121  3312  0  6 94  0
 0  0    344 1410556    144 576156    0    0     0     0 2990  4848  0  9 92  0
 1  0    344 1386748    168 599864    0    0     0     0 3020  4907  0 10 90  0
 1  0    344 1362172    196 624520    0    0     4     9 3157  5152  0 10 89  1
 1  0    344 1338876    216 647824    0    0    16    24 2977  4829  0  9 91  0
 0  0    344 1315324    240 671464    0    0     0     0 3016  4933  0 10 90  0
 0  0    344 1291516    264 695240    0    0     0     0 3015  4920  1 11 88  0
 0  0    344 1275388    280 711408    0    0     0     0 2149  3350  0  7 93  0
 0  0    344 1269236    284 717388    0    0     0  4148 1001  1302  0  3 97  0
 0  0    344 1267700    288 719084    0    0    40    56  560   567  0  1 99  0
 0  0    344 1261012    292 725744    0    0     0  8252 1181  1590  0  2 99  0
 0  0    344 1252700    304 733892    0    0     4  8296 1274  1756  0  4 96  0
 0  1    344 1252004    304 734436    0    0     0 74176  357   176  0  1 95  4
 0  1    344 1252004    304 734436    0    0     0 58424  379    28  0  1 50 49
 0  2    344 1252020    304 734436    0    0     0 58424  380    59  0  1 50 49
 0  1    344 1252020    304 734436    0    0     0 17380  377    48  0  1 50 49
 0  0    344 1241284    316 745236    0    0     0     4 1815  2652  0  5 68 27
 0  0    344 1217220    340 769284    0    0     0     0 3071  5007  0 10 90  0
 1  0    344 1195492    360 790616    0    0     0     0 2752  4426  0 11 89  0
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0    344 1175780    380 810044    0    0     0    28 2539  4073  0  7 93  0
 0  0    344 1158884    396 826620    0    0     0     0 2189  3448  0  6 94  0
 0  0    344 1138148    416 847136    0    0     0     0 2651  4261  0  9 91  0
 1  0    344 1113044    444 871656    0    0     4     0 3109  5065  0 10 90  1
 0  0    344 1109460    448 875256    0    0     0   186  709   748  0  1 98  0
 1  0    344 1089748    468 894888    0    0     0     0 2657  4275  0  8 93  0
 1  0    344 1066452    488 917512    0    0     0     0 2891  4685  0  9 91  0
 0  0    344 1043724    512 940200    0    0     0     0 2928  4742  0 10 89  0
 0  0    344 1039372    516 944412    0    0     0  4144  768   920  0  2 98  0
 0  0    344 1034012    520 949848    0    0     0  4144  930  1198  0  2 98  0
 0  0    344 1030428    524 953108    0    0     0  4144  829  1022  0  1 99  0
 0  0    344 1011516    540 971928    0    0     0 16576 2559  4086  0  8 93  0
 1  0    344 1007164    544 976276    0    0     0  4144  817   991  0  1 99  0
 0  0    344 1003068    548 980216    0    0     0  4144  722   834  0  1 99  0
 0  0    344 994876    556 988232    0    0     0  8288 1555  2322  0  4 97  0
 1  0    344 990812    564 992304    0    0     4  4144  953  1255  0  2 98  0
 0  0    344 986460    568 996312    0    0     0  4144  755   904  0  2 98  0
 0  0    344 974436    580 1008268    0    0     0 12432 1884  2882  0  7 93  0
 0  0    344 970084    584 1012480    0    0     0     0  925  1192  1  2 98  0
 0  0    344 958836    596 1023688    0    0     0 16585 1683  2504  0  5 95  0
 0  0    344 953460    600 1029056    0    0     0  4144  907  1171  0  2 98  0
 0  0    344 952692    600 1029736    0    0     0     0  367   225  0  0 100  0
 0  0    344 940916    612 1041148    0    0     0 12432 1710  2534  0  7 93  0
 0  0    344 936596    616 1045632    0    0     0  4144  809   981  0  2 98  0
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0    344 1424532    616 558412    0    0     0     0  291    75  0 11 89  0
 0  0    344 1476620    616 507140    0    0  1528    10  408   219  0  2 91  7
 0  0    344 1476620    616 507140    0    0     0     0  255    26  0  0 100  0
```

I understand where the 205MB comes from: my /proc/sys/vm/dirty_background_ratio is set to 10, i.e. 10% of 2GB of RAM, which is about 205MB.  What bugs me, however, is (a) *why* the system does not flush the buffers continuously once they fill up (keeping up with the incoming data instead of stalling the sender), and (b) why writing the data to disk causes 99% CPU (Hyperthreading enabled) and blocks the client application (netatalk)?  I am 99% sure DMA is enabled.
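The 205MB stall point drops straight out of the arithmetic:

```python
# Where the ~205 MB stall point comes from: dirty_background_ratio is a
# percentage of total RAM, so with 2 GB and the default value of 10,
# background writeback only starts once ~205 MB of pages are dirty.
ram_mb = 2 * 1024                 # 2 GB of RAM in MB
dirty_background_ratio = 10       # from /proc/sys/vm/dirty_background_ratio
threshold_mb = ram_mb * dirty_background_ratio / 100
print(f"background writeback kicks in at {threshold_mb:.0f} MB dirty")
# -> background writeback kicks in at 205 MB dirty
```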

It works fine with SMB and FTP.  I get sustained 34MB/s.  Also dd if=/dev/zero of=/tmp/xxx gives "1069570560 bytes (1.1 GB) copied, 9.39887 seconds, 114 MB/s".  

How do I fix this?  I am running 2.6.15-rc2.  I want AFP (netatalk) to write data to disk continuously as it receives new data.  Reading from the TCP buffers should happen at whatever speed the disk subsystem's write speed allows - not read from the network, write to disk (while pausing the network data flow), then read some more once all the data has been flushed, and so on.
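One knob worth experimenting with: the 2.6-era writeback thresholds under /proc/sys/vm. A sketch only (the paths are the 2.6 interface; the values below are guesses to tune, not recommendations):

```python
# Sketch (needs root; 2.6-era /proc paths; the values are guesses to tune):
# lower the thresholds so pdflush starts flushing earlier and stays
# incremental, instead of stalling the sender with one ~200 MB avalanche.
for knob, value in [("dirty_background_ratio", 2),   # start flushing sooner
                    ("dirty_ratio", 10)]:            # throttle writers sooner
    with open(f"/proc/sys/vm/{knob}", "w") as f:
        f.write(str(value))
```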

----------

## pwnell

For me this behaviour looks exactly like bus contention.  The traffic goes at full speed over the wire until the host cache fills up (205MB).  At that point pdflush needs to write out the dirty buffers.  However, writing to a SATA device connected to the standard PCI bus (the same bus the Gigabit NIC is on) means they need to share the 127MB/s bandwidth.  Since two devices cannot be bus master simultaneously, the NIC necessarily gives up bus mastership while the SATA device flushes its data quickly.  Once the data has been flushed, the SATA controller stops being bus master and the NIC acquires bus master status again.

What bothers me, though, is why FTP manages a sustained 34MB/s.  If bus contention were the cause, wouldn't it affect anything transferring data over the NIC?

----------

## pwnell

I have tested again and can verify that FTP exhibits the same behaviour - copying from Linux to the G5 works fine, with high sustained transfers; copying from the G5 to the Linux machine is the problem.  I am 100% sure it is bus contention (the G5 uses PCIe buses/devices throughout, so there is no contention on that side).

Does anyone agree?

----------

## frenkel

 *pwnell wrote:*   

> For me this behaviour looks exactly like bus contention.  The traffic goes at full speed over the wire until the host cache fills up (205MB).  At that point pdflush needs to write out the dirty buffers.  However, writing to a SATA device connected to the standard PCI bus (the same bus the Gigabit NIC is on) means they need to share the 127MB/s bandwidth.  Since two devices cannot be bus master simultaneously, the NIC necessarily gives up bus mastership while the SATA device flushes its data quickly.  Once the data has been flushed, the SATA controller stops being bus master and the NIC acquires bus master status again.
> 
> What bothers me, though, is why FTP manages a sustained 34MB/s.  If bus contention were the cause, wouldn't it affect anything transferring data over the NIC?

 

Yes, that would explain this problem best. Do you have any IDE drives to test with? It shouldn't happen with IDE drives then...

----------

## pwnell

This just got a bit more complicated.  Copying one huge file causes the issues I have described.  Copying lots of 8MB files works perfectly - sustained throughput and a sustained disk write.  I assume, then, that it is attributable to the way Linux writes dirty pages to disk while the close() call has not yet been made?  Seems like an implementation issue to me.

----------

