# Help me with Samba performance

## mbar

I have a home router/gateway/file server running Gentoo 2006.1 AMD64, configured as a simple Samba server for my home LAN (3 computers in total). For quite some time I've been struggling to optimize the performance of a simple large-file copy (say, a 700 MB AVI file) from the Samba server to a Windows desktop. The transfer speed is quite disappointing: approx. 30 MB/s on a gigabit network. I measure the copy speed using Total Commander, and when I copy the same file again (while it still resides in Gentoo's file cache) the speed jumps to approx. 115 MB/s with the 2.6.19-rc4-mm2 kernel and roughly 75-80 MB/s with the 2.6.17-emission8 kernel. So I think raw network performance is quite good, and the problem might be with HDD read performance, right? Here are some facts about my systems:

- The Windows desktop has 2x250 GB SATA 300 drives in RAID0, so it is surely faster than 30 MB/s when writing; 2 GB of RAM helps too. It has a built-in Nforce4 network card (gigabit, of course).

- The Gentoo Samba server is based around a Sempron 3100+ (256 kB cache, overclocked to 2200 MHz) with 1 GB of single-channel RAM. It has an Intel Pro 1000 GT gigabit card. Both computers are connected to a D-Link DGS-1005D switch.

Gentoo configs:

```
gateway ~ # ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:0E:0C:C0:39:F1
          inet addr:10.0.1.1  Bcast:10.0.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:4500  Metric:1
          RX packets:3302448 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3780438 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2666641187 (2543.1 Mb)  TX bytes:7318655755 (6979.6 Mb)
          Base address:0xe800 Memory:ec120000-ec140000
```

MTU for both computers is set to 4500 (jumbo frame).
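
Setting an MTU like that can be done with either the classic `ifconfig` or the newer iproute2 syntax (just a sketch; the `eth1` name matches the output above, it needs root, every device in the path including the switch has to support jumbo frames, and the setting does not survive a reboot unless scripted):

```shell
# classic ifconfig syntax
ifconfig eth1 mtu 4500
# equivalent iproute2 syntax
ip link set dev eth1 mtu 4500
# verify the change
ip link show eth1
```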

```
[ebuild   R   ] net-fs/samba-3.0.23c  USE="async cups pam python readline -acl -automount -doc -examples -kerberos -ldap -oav -quotas (-selinux) -swat -syslog -winbind" LINGUAS="pl -ja" 17,290 kB [1]

```

```
gateway ~ # cat /etc/samba/smb.conf
[global]
workgroup = SIATKA
server string = Gateway %v
log file = /var/log/samba/log.%m
max log size = 256
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=8760 SO_SNDBUF=8760
getwd cache = yes
#aio read size = 65536
#aio write size = 65536
preferred master = yes
interfaces = lo eth0 eth1 ath0
#ra0 ath0
bind interfaces only = yes
hosts allow = 127.0.0.1 10.0.0.0/24 10.0.1.0/24 10.0.2.0/24
hosts deny = 0.0.0.0/0
security = share
guest account = nobody
guest ok = yes
printcap name = cups
printing = cups
enhanced browsing = yes
large readwrite = yes
max xmit = 65536
client lanman auth = yes
client use spnego = no
client signing = disabled

[raid]
browseable = no
writable = yes
public = no
create mode = 0766
guest ok = yes
path = /home/raid
```

The Gentoo server has 2 x 320 GB Seagate 7200.10 SATA 150 drives in in-kernel RAID0, and the system is on a separate 60 GB IDE disk. The RAID0 array has a 1024 kB stripe size and an XFS file system (on one big 640 GB partition).

```
gateway ~ # hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads:   1414 MB in  2.00 seconds = 706.46 MB/sec
 Timing buffered disk reads:  220 MB in  3.00 seconds =  73.28 MB/sec

gateway ~ # hdparm -tT /dev/sdb
/dev/sdb:
 Timing cached reads:   1408 MB in  2.00 seconds = 703.77 MB/sec
 Timing buffered disk reads:  226 MB in  3.00 seconds =  75.32 MB/sec

gateway ~ # hdparm -tT /dev/md0
/dev/md0:
 Timing cached reads:   1406 MB in  2.00 seconds = 702.49 MB/sec
 Timing buffered disk reads:  400 MB in  3.00 seconds = 133.12 MB/sec
```
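
As a rough sanity check, a two-disk RAID0 should read at about the sum of its members, and md0 is in that ballpark (73 and 75 MB/s are the buffered-read figures from the hdparm runs above):

```shell
# ideal two-disk RAID0 read throughput = sum of member speeds (MB/s)
ideal=$((73 + 75))
echo "$ideal MB/s ideal"
# md0 measured 133 MB/s, i.e. about this many percent of ideal:
echo "$((133 * 100 / ideal))%"
```

So the array itself scales fine; the bottleneck is somewhere between the disks and the network.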

```
gateway ~ # blockdev --report
RO    RA   SSZ   BSZ   StartSec     Size    Device
rw  8192   512  4096          0  117231408  /dev/hda
rw  8192   512  4096         63    1285137  /dev/hda1
rw  8192   512  1024    1285200      48195  /dev/hda2
rw  8192   512  4096    1333395  115892910  /dev/hda3
rw  8192   512  4096          0  625142448  /dev/sda
rw  8192   512   512         63  625121217  /dev/sda1
rw  8192   512  1024          0  625132714  /dev/sdb
rw  8192   512   512         63  625121217  /dev/sdb1
rw 16384   512   512          0 1250238464  /dev/md0
```

```
gateway ~ # cat /etc/raidtab
#/home/raid (RAID0)
raiddev                 /dev/md0
raid-level              0
nr-raid-disks           2
chunk-size              1024
persistent-superblock   1
device                  /dev/sda1
raid-disk               0
device                  /dev/sdb1
raid-disk               1
```

```
gateway ~ # emerge --info
Portage 2.1.2_rc1-r3 (default-linux/amd64/2006.1/server, gcc-4.1.1, glibc-2.5-r0, 2.6.19-rc4-mm2 x86_64)
=================================================================
System uname: 2.6.19-rc4-mm2 x86_64 AMD Sempron(tm) Processor 3100+
Gentoo Base System version 1.12.6
Last Sync: Sat, 04 Nov 2006 10:29:01 +0000
distcc 2.18.3 x86_64-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
app-admin/eselect-compiler: [Not Present]
dev-java/java-config: [Not Present]
dev-lang/python:     2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     [Not Present]
dev-util/confcache:  [Not Present]
sys-apps/sandbox:    1.2.18.1
sys-devel/autoconf:  2.13, 2.60
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.17.50.0.6
sys-devel/gcc-config: 1.3.14
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.17-r1
ACCEPT_KEYWORDS="amd64 ~amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=athlon64 -pipe -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -march=athlon64 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks metadata-transfer parallel-fetch prelink sandbox sfperms strict"
GENTOO_MIRRORS="http://src.gentoo.pl http://gentoo.prz.rzeszow.pl http://gentoo.zie.pg.gda.pl"
LANG="pl_PL"
LC_ALL="pl_PL"
LDFLAGS="-Wl,-O1 -Wl,-s -Wl,--hash-style=both"
LINGUAS="pl"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 acpi async atm bitmap-fonts bzip2 cgi cli cracklib crosscompile crypt cups dhcp dlloader dri elibc_glibc fbcon foomaticdb ftp gcc64 gd geoip glibc-omitfp gpm graphlcd hashstyle iconv input_devices_evdev input_devices_keyboard input_devices_mouse isdnlog ithreads jpeg kernel_linux libg++ linguas_pl lm_sensors madwifi mailwrapper mysql ncurses nls nptl nptlonly offensive pam pcre perl php pic png ppds pppd pppoa python readline reflection samba session snmp sockets spell spl ssl symlink tcpd threads tiff transparent-proxy truetype truetype-fonts type1-fonts udev unicode usb userland_GNU userlocales video_cards_apm video_cards_ark video_cards_ati video_cards_chips video_cards_cirrus video_cards_cyrix video_cards_dummy video_cards_fbdev video_cards_glint video_cards_i128 video_cards_i810 video_cards_mga video_cards_neomagic video_cards_nv video_cards_rendition video_cards_s3 video_cards_s3virge video_cards_savage video_cards_siliconmotion video_cards_sis video_cards_sisusb video_cards_tdfx video_cards_tga video_cards_trident video_cards_tseng video_cards_v4l video_cards_vesa video_cards_vga video_cards_via video_cards_vmware video_cards_voodoo wifi xml xml2 xorg zip zlib"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, PORTAGE_RSYNC_EXTRA_OPTS
```

And yet I can't read from the Samba raid share faster than 30-32 MB/s. This is kinda annoying, because I could buy the cheapest IDE drives with only 2 MB of cache and get the same level of performance. Can someone spare some tips?

----------

## V0rtex

I'm not completely sure what your problem may be, but I can offer my smb.conf file as an example that works pretty well (I get no less than 50-55 MB/s, and usually more, on my gigabit network):

```
[global]
    workgroup = DEKAWAR
    netbios name = nibbler
    server string = Nibbler's Box (%v)
    load printers = no
    log file = /var/log/samba/log.%m
    max log size = 50
    log level = 1
    hosts allow = 192.168.0. 192.168.1. 127.
    map to guest = bad user
    security = user
    username level = 8
    encrypt passwords = yes
    smb passwd file = /var/lib/samba/private/smbpasswd
    unix password sync = yes
    pam password change = yes
    passwd program = /usr/bin/passwd %u
    username map = /etc/samba/smbusers
    include = /etc/samba/smb.conf.%U
    obey pam restrictions = yes
    socket options = TCP_NODELAY
    interfaces = 192.168.0.200/24 192.168.1.200/24
    remote announce = 192.168.0.255 192.168.1.255
    local master = no
    name resolve order = wins bcast lmhosts host
    wins support = yes
    dns proxy = no
    unix extensions = no
```

I remember when I was setting mine up, I was trying to tweak the socket options to make things faster, because I was having slowness issues as well. Depending on what I set SO_SNDBUF and SO_RCVBUF to, I either saw no improvement or it got slower. After I discovered the root of my issues (it was using the wrong interface), I set it back to just TCP_NODELAY in the socket options.
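
If you want to experiment with socket options without editing smb.conf each time, smbclient's `-O` switch can override them per connection. A sketch (10.0.1.1 and the raid share are from earlier in the thread; the file name is made up, and guest access is assumed to work):

```shell
# baseline transfer with the server's configured socket options
smbclient //10.0.1.1/raid -N -c "get somefile.avi /dev/null"
# same transfer with different socket options, for comparison
smbclient //10.0.1.1/raid -N -O "TCP_NODELAY SO_RCVBUF=65536" \
    -c "get somefile.avi /dev/null"
```

smbclient prints the transfer rate after each get, so you can compare runs directly.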

On my server, I have 2 RAID arrays, each in a RAID 0 configuration, and they seem to work out just fine. For comparison (if it helps), my server has an AMD Athlon 64 3200+ processor and 1 GB of RAM.

----------

## mbar

Do you have Nforce mainboard?

----------

## V0rtex

Yes, it is an ASUS K8N-E Deluxe and it does have an nforce chipset.

----------

## mbar

Maybe the MB in my Gentoo server is the problem. It's a cheapo SiS 760 based MB; it doesn't even have integrated gigabit LAN, so I had to buy a PCI gigabit adapter. Maybe it chokes when simultaneously transferring data from the SATA controllers (via the internal PCI bus) and out to the PCI gigabit card. That would explain the high read performance when the file is already in the memory cache...

I also have ASUS K8N-E but use it on desktop  :Smile: 

----------

## V0rtex

That's certainly possible, but here's another theory:  Are you using Cat5-e cable or just normal Cat5 cable for the network connection?  With gigabit LAN you won't get full performance unless you're using Cat5-e cable for the network connection.  I recently encountered this with another server I built here at home which made me think of this as a possible solution to your problem.

Also, you could probably test your idea by using net-misc/iperf on two machines to measure raw network bandwidth while simultaneously using hdparm to exercise the drives. If there is a noticeable difference between the network speed while the hard drive is being accessed vs. doing nothing but network transfers, it could indicate that your theory is true.
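
A concrete way to run that test (classic iperf options; 10.0.1.1 is the server address from the ifconfig output earlier, and the dd command just keeps the array busy so you can see whether throughput drops):

```shell
# on the Gentoo server: start an iperf server
iperf -s

# on the other machine: a 30-second test with a larger TCP window
iperf -c 10.0.1.1 -w 256k -t 30

# then rerun the client test while the server reads from the array,
# to see whether disk I/O on the PCI bus eats network bandwidth:
dd if=/dev/md0 of=/dev/null bs=1M count=2000
```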

----------

## mbar

Yes, I have Cat 5e cables; both computers are in the same room. I'm going to try iperf now, as you suggested. But in the meantime I managed to kinda "optimize" file transfer performance by increasing the PCI latency timer to the max for the IDE/SATA controller and the network card. I also overclocked the PCI bus in the BIOS (to 37 MHz), and thanks to that the transfer speed is now in the 37-38 MB/s range.
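
For anyone who wants to repeat the latency timer part from userspace instead of the BIOS, setpci can write it directly (the 02:0b.0 address below is only an example, look up your real devices with lspci first; the value resets at reboot):

```shell
# find the bus addresses of the SATA controller and the NIC
lspci | grep -Ei 'sata|ethernet'
# read the current PCI latency timer of an example device
setpci -s 02:0b.0 latency_timer
# raise it near the maximum (0xf8)
setpci -s 02:0b.0 latency_timer=f8
```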

----------

## Janne Pikkarainen

 *mbar wrote:*   

> Yes, I have Cat 5e cables; both computers are in the same room. I'm going to try iperf now, as you suggested. But in the meantime I managed to kinda "optimize" file transfer performance by increasing the PCI latency timer to the max for the IDE/SATA controller and the network card. I also overclocked the PCI bus in the BIOS (to 37 MHz), and thanks to that the transfer speed is now in the 37-38 MB/s range.

 

This really starts to sound like a hardware issue (the MB can't keep up with all that data coming and going  :Smile: ).

----------

## ijdod

Do you mean bits or bytes? The common (but unofficial) convention is that MB means megabytes and Mb means megabits.

Rule of thumb: anything over 100 Mbit/s (roughly 10 MByte/s) is fine, if not optimal. Given two systems with decent (recent) specs, realistic one-on-one performance over a gigabit link for SMB is about 300-500 Mbit/s, or roughly 30-50 MByte/s. SMB is a rather chatty protocol, which doesn't help throughput. There are other factors as well, obviously, such as system architecture.
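
The bit/byte arithmetic behind that rule of thumb, for the record (integer shell math, ignoring protocol overhead):

```shell
# 300-500 Mbit/s is the realistic gigabit SMB range; /8 gives MByte/s
echo $((300 / 8))   # => 37 MByte/s
echo $((500 / 8))   # => 62 MByte/s
# the ~30 MByte/s seen in this thread, expressed back in Mbit/s:
echo $((30 * 8))    # => 240 Mbit/s
```

So 30 MByte/s is only about a quarter of the gigabit line rate, but within the realistic SMB range.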

Check for fragmentation. Some torrent clients fragment their files an awful lot, and that kills performance when you try to push the data through a gigabit link.

As mentioned above, test point-to-point connectivity with a tool like ttcp (which is essentially what iperf is). It will eliminate a lot of factors from the equation. Remember to experiment with larger window sizes too (and don't be afraid to go past the 64 KB mark, either). Large windows are your friend.

Jumbo frames should be avoided unless you're doing some really fancy stuff like MPLS and know what you're doing. Remember that there is no standard for jumbo frames in this context, and no host-to-host negotiation for them. Jumbo frames were a MacGyver (aka duct tape/paper clips) solution to an interface performance problem with early gigabit adapters.

 *V0rtex wrote:*   

> That's certainly possible, but here's another theory:  Are you using Cat5-e cable or just normal Cat5 cable for the network connection?  With gigabit LAN you won't get full performance unless you're using Cat5-e cable for the network connection.  I recently encountered this with another server I built here at home which made me think of this as a possible solution to your problem.

 

Picking nits: you will get full performance on Cat5. Gigabit Ethernet was developed for use with Cat5. While it is certainly true that Cat5e gives you more headroom with regard to bad cabling, electrically noisy environments and what have you, Cat5 in itself is fine. Also remember that just because it says Cat5 on the jacket doesn't mean the whole finished product (including the connectors) conforms to the Cat5 specs. This goes for Cat5e and Cat6 as well, obviously.

That said, Cat5 as a standard was replaced by Cat5e, so unless we're talking about an installed base, go for 5e or better  :Very Happy: 

----------

## mbar

Yeah, you're right, I used MB as short for "mainboard" (should have used "mb"). Anyway, iperf shows ~700 Mbit/s when the disks are idle and ~500 Mbit/s when I cat some large file to /dev/null. So raw network performance is now rather good (after tweaking the latency and PCI clock), and it seems that Samba can't keep up or something. I'm done with it for now  :Smile:  thanks for the support.

----------

