# Gentoo as a File Server

## remix

i need to make a file server for a small office (5 humans). they want 1TB of space, and backups.

How would you go about doing this? i've read a gentoo guide, but it was only for raid1.

what configuration would you use to get this job done?

tia.

----------

## Philz

Well I don't know your budget, but I suggest something like this:

Hardware:

Normal CPU (any more or less recent will do)

512 to 1024 MB RAM (512 should be enough for headless server)

Mainboard with a x4 PCI Express (PCIe) port or a x16 PCIe port (or both of course)

Linux Compatible RAID Controller (Highpoint, Promise, whatever)

4 disks à 300 - 400 GB (data disks)

1 disk à 120 GB (system disk)

Tape drive or DVD burner (expensive/cheap)

Software:

Gentoo Linux

Samba

Cron

Configure the data disks as a RAID5 volume,

set up Linux on the system disk.

Install, configure & run samba.

Get yourself a script to either:

a) write backups to the tape drive

b) create ISO files to burn them on DVD
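A minimal cron-able script for either option might look like this (a sketch; the share path, staging directory and tape device below are assumptions, so adjust them to your setup):

```shell
#!/bin/sh
# backup.sh - nightly backup sketch (paths are placeholders)

backup() {
    # archive $1 into a dated .tar.gz under staging dir $2
    src=$1; staging=$2
    mkdir -p "$staging"
    tar czf "$staging/backup-$(date +%Y%m%d).tar.gz" \
        -C "$(dirname "$src")" "$(basename "$src")"
}

# b) stage an archive, then burn the resulting file to DVD:
# backup /srv/share /var/backups

# a) or stream straight to a tape drive (commonly /dev/st0):
# tar czf /dev/st0 /srv/share
```

Run it from cron nightly; for tape, rotate media by weekday.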

----------

## drescherjm

I have made a few of these over the last 6 months, and I am very happy with my latest design (performance, looks, almost silent ...):

Everything was purchased at newegg

1  ASUS M2N-E Socket AM2 NVIDIA nForce 570 Ultra MCP ATX AMD Motherboard  $94

http://www.newegg.com/Product/Product.asp?Item=N82E16813131022

6  Seagate Barracuda 7200.10 ST3320620AS (Perpendicular Recording Technology) 320GB 7200 RPM 16MB Cache SATA 3.0Gb/s Hard Drive - OEM $94.99

http://www.newegg.com/Product/Product.asp?Item=N82E16822148140

1  SAMSUNG Black 18X DVD+R 8X DVD+RW 8X DVD+R DL 18X DVD-R 6X DVD-RW 12X DVD-RAM 16X DVD-ROM 48X CD-R 32X CD-RW 48X CD-ROM 2M Cache IDE DVD Burner With 12X DVD-RAM Write, LightScribe Technology - OEM $32

http://www.newegg.com/Product/Product.asp?Item=N82E16827151136

1  AMD Athlon 64 3800+ Orleans 2.4GHz 512KB L2 Cache Socket AM2 Processor - Retail $115.99

http://www.newegg.com/Product/Product.asp?Item=N82E16819103029

1  Albatron 6600LEQ GeForce 6600LE 256MB 128-bit DDR PCI Express x16 Video Card - Retail $67

http://www.newegg.com/Product/Product.asp?Item=N82E16814170091

1   Antec PERFORMANCE TX TX1050B Black Steel ATX Mid Tower Computer Case 500W ATX12V v2.0 Power Supply - Retail $129

http://www.newegg.com/Product/Product.asp?Item=N82E16811129158

1   CORSAIR XMS2 2GB (2 x 1GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual Channel Kit Desktop Memory - Retail $280

http://www.newegg.com/Product/Product.asp?Item=N82E16820145590

Software:

Gentoo 2006.1 AMD64

Raid Setup:

I installed all 6 drives connected to the motherboard's 6 SATA2 ports and partitioned them identically.

Partition /dev/sda then copy the partition table to the rest of the disks:

```

sfdisk -d /dev/sda | sfdisk /dev/sdb
sfdisk -d /dev/sda | sfdisk /dev/sdc
sfdisk -d /dev/sda | sfdisk /dev/sdd
sfdisk -d /dev/sda | sfdisk /dev/sde
sfdisk -d /dev/sda | sfdisk /dev/sdf
```

On all drives I created a 256 MB partition and gave it type FD (raid autodetect)

Then on all drives I created a 512 MB partition and made it type 82 (swap)

Then on all drives I created a 12 GB partition and gave it type FD (raid autodetect)

Then on all drives I created a type FD partition with the rest of the space.

Now to create the raid arrays.

For partition 1 I created a RAID1 array and added all 6 drives to it. Yes, it keeps 6 copies of the data. This is the boot partition.

Partition 2 on all drives is swap.

```
mkswap /dev/sda2
mkswap /dev/sdb2
mkswap /dev/sdc2
mkswap /dev/sdd2
mkswap /dev/sde2
mkswap /dev/sdf2
```

For Partition 3 I created a RAID6 array with all 6 members and a 256K block size. This is /dev/md2. And I mount this as /

For Partition 4 I created a RAID6 array with all 6 members and a 512k block size. This is /dev/md3. And on this array I used LVM to create a physical volume on the whole array so that I can create smaller logical volumes for my data instead of a single 1.2TB filesystem.
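For reference, creating the arrays described above would look something like this with mdadm and LVM (a sketch; the md numbering and partition layout follow this post, but the volume group and logical volume names/sizes are made up):

```shell
# boot: 6-way RAID1 mirror of the first partitions
mdadm --create /dev/md0 --level=1 --raid-devices=6 /dev/sd[a-f]1

# root: RAID6 over the third partitions, 256K chunks -> /dev/md2
mdadm --create /dev/md2 --level=6 --raid-devices=6 --chunk=256 /dev/sd[a-f]3

# data: RAID6 over the fourth partitions, 512K chunks -> /dev/md3
mdadm --create /dev/md3 --level=6 --raid-devices=6 --chunk=512 /dev/sd[a-f]4

# LVM on top of the data array, so smaller logical volumes can be
# carved out instead of one big 1.2TB filesystem
pvcreate /dev/md3
vgcreate data /dev/md3
lvcreate -L 200G -n shares data    # example volume
mkfs.ext3 /dev/data/shares
```

These commands need root and the real block devices, so run them from the install environment after partitioning.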

The reason for splitting the disks this way is both performance and easier recovery from drive failures (which I actually had with one disk on one of these systems). The idea is that with boot, / and the data space in separate arrays, a rebuild after replacing a drive can happen in stages: recover boot, then root, then actually boot your working system and let it rebuild the large data array.

I also forgot to mention: I installed grub on all the disks too, so that if one goes bad I can easily take it out and still have a working system.
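With the legacy grub of that era, putting a boot loader on every disk looks roughly like this (a sketch; it assumes /boot is partition 1 on each drive, as in the layout above, and you repeat the block for hd1/sdb through hd5/sdf):

```shell
# write grub to the MBR of the first disk; repeat for each remaining
# disk, bumping the device name and (hdN) number together
grub --batch <<'EOF'
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
quit
EOF
```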

----------

## DooMi

a good alternative seems to be an HDD box with samba built in.

i saw some tests of these boxes the other day, and it seems they are very stable.

for pure file storage these boxes with samba included are a fine alternative, i think.

check em out

----------

## Rad

Using gentoo for a fileserver is no different from using any other distro, if that was part of the question.

I personally would tend towards software sata raid 5; for 5 humans that usually should be fast and reliable enough. If that doesn't suffice, you can also use other raid levels, spare disks, and/or hardware raid. As for how to set the raid up and manage it - EVMS is nice and quite simple to use (it has two GUIs as well as a command-line interface), so you should give it a try.

Backups could also be made with EVMS (+cron for instance), or you can use a script/program that does it file by file. Amanda / flexbackup  / rsync are  my favorites, but there's a lot more programs in app-backup/.

Since you said 5 people, you should be fine with something simple like SCP, FTP or a small Samba setup.

Anyways. I concur that some NAS device may also be an alternative to an entry-level server.

----------

## Ast0r

If you have the money, I recommend getting a 3ware RAID controller to do the RAID5 array. They work perfectly in linux, perform well, and are easy to use and manage. I can say nothing but good things about them, as they have served me very well on our servers.

Also, the more disks you use, the faster your array will be. Keep that in mind. I'm building a new file-server/domain controller for a small company that will have about 15-20 simultaneous users and I'm going to use a 3ware 9580SE-8LP with 8x80GB disks (one will be hot spare, so only 7 in the RAID array). This will yield 480GB of usable space on the array, but it should have really good read and write performance. I'm guessing that the server will be able to read from the array at around 400MB/s. More importantly, the RAID5 + hot-spare configuration will give them maximum uptime. If a disk dies, the 3ware controller will *automatically* start rebuilding the array with the hot-spare, and then I can replace the failed drive at my leisure.

So for my situation that will be fine because each client puts a decent amount of load on the server. For your situation, something like that may or may not be overkill, depending on how heavily they use the server. If each of them is constantly dumping files that are multiple gigabytes in size, then you might need a more powerful setup. If the data accrues slowly and is mostly just archive data that is seldom accessed, then a less powerful system with software RAID might be more appropriate. Without knowing the load that you are going to put it under, it's hard to say.

----------

## remix

thanks guys.... all good advice...

a few things to mention, i'm NOT experienced with raid arrays, although i did read a fair share of manuals about them.

what i'm looking for is a simple setup, maybe with a wiki/tutorial to help step me through it with no problems.

and a raid1+0 or raid0+1 looked fairly simple to me (while reading the manuals)

but Ast0r, how much are one of those 3ware raid controllers? i don't know how to use them - are they easy to install and set up?

i do have a 'decent' budget of $1000-1300, and all i need is 1TB. but i don't need a monitor, so i won't need the video card. i don't mind raid5, i think i can manage that.

----------

## Rad

The 3ware controllers are easy to install and setup (they have proper linux drivers, an on-bios configuration utility which is OS independent and usually also come with monitoring and administration software for linux), but they're quite expensive (like 20-40% of the cost of the attached storage). Apart from that, they also require a decent mainboard to operate at full speed, although I think today's high end desktop mainboards can handle the smaller 3ware controllers just fine.

Anyways, the question would be whether you need the performance of a 3ware controller. You'll already get about 25-30 MB/s with a software raid on a normal mainboard... and managing a linux software raid with evms is just as simple as managing the raid array in the 3ware card's BIOS.

About the raid levels - raid 1 and 0 in either order "wastes" 50% of the disk space for redundancy, and up to half of your drives can fail at the same time without you losing data (though it depends on which disks fail - only one failure is guaranteed to be safe).

Raid 5 will only "waste" one drive's worth of capacity, and any one drive (even if you have ten of them) can fail without losing data. Raid 5 is probably the most popular choice for small servers nowadays.

Raid 6 is also more elegant than raid 1+0 or 0+1, since any two disks can fail, but it requires more calculations by either your raid controller (so only the pricier ones support that level) or your CPU (linux software raid, or the proprietary software raid on cheap controllers).
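The space/redundancy trade-offs above are easy to sanity-check with a little shell arithmetic (N identical disks of SIZE GB each):

```shell
# usable capacity in GB for N disks of SIZE GB each
raid10_usable() { echo $(( $1 / 2 * $2 )); }    # half the disks are mirrors
raid5_usable()  { echo $(( ($1 - 1) * $2 )); }  # one disk's worth of parity
raid6_usable()  { echo $(( ($1 - 2) * $2 )); }  # two disks' worth of parity

raid10_usable 6 320   # 960  GB
raid5_usable  6 320   # 1600 GB
raid6_usable  6 320   # 1280 GB
```

So with six 320GB drives, raid 6 gives up one more drive of capacity than raid 5 in exchange for surviving a second failure.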

Refer to Wikipedia if you need more information about the different raid levels. But I think raid 5 is most attractive for small file servers that don't have any special performance requirements.

----------

## Ast0r

You can find the pricing for the 3ware cards on newegg. A nice 4-port SATA-II card runs about $300. The 8-port versions are about $500, and the 12-port versions are about $700. It varies a bit based on which model you get, though. I like the 9000 series the best, personally. They're substantially better than the 8000 series (although the 8000 series is nice too and has good linux support). They're available in 64-bit PCI-X models and 4x PCI-E (which will work in an 8x or 16x PCI-E slot).

If your budget is $1200, though, I think that puts it a little out of your price range. You need about $2000 to do what I was recommending.

----------

## drescherjm

The system I showed above came in around $1450, but it could easily be reduced to your range by dropping the memory to 1GB instead of 2; a cheaper video card could also shave off a few bucks. For me 2GB is not needed for file serving, but these machines do double duty as distcc servers (which greatly speeds up build times in gentoo), and they perform a few other server roles as well. You could also drop one drive and make the system raid 5, which I do not recommend: with raid 5, if one drive goes bad (and you do not have a spare installed) you are essentially running raid 0 until things are fixed. If you are going to permanently add a hot spare, why not make it available for the extra redundancy that raid 6 provides? I am not convinced that a hot spare drive will last any longer by sitting in a powered-on machine with no data on it than by actually being used. Remember, with raid 6 if one drive dies you have raid 5, and if a second dies you have raid 0.

And then there is the choice between hardware and software raid. Hardware raid is probably easier to set up and install, but it does not give you the ability to split the drives into smaller chunks with multiple arrays across them for boot, root and swap, and for me this is important. I see no reason why swap needs to be on a raid 5 or raid 6 array, and with hardware raid you are forced to do this or to add an additional drive for the operating system. Then again, maybe this is not that important anymore, because how much do we actually use swap these days? On the machine I am typing on right now I have 4GB of memory and 4GB of swap, but I almost never see any swap in use.

For performance I give a slight edge to hardware raid, but the gap is shrinking as processors get faster. Think of it this way: your main cpu is 2 to 3 GHz, while the processor on these hardware cards is on the order of 100 to 200 MHz - the card has the advantage of only running one or maybe two tasks at a time, while your system cpu probably has 100 or more threads running in the background. Either way, with software raid on my systems I rarely see cpu usage go above 5% for the raid array, and that is for raid 6. The only time it will be more is during a rebuild.

And then there is tolerance to failure. Here I give software the edge, for two reasons. First, with a software array, if the controller dies I can easily swap the disks into a new system or install a new cheap controller and be up in minutes; with hardware I absolutely have to get a card from at least the same company, as hardware cards are not compatible between vendors. Second, with software, if the drives fail and somehow get out of sync, you can as a last resort force them online and then recover any data that has not changed since the first drive went offline. With hardware I have not seen this in the past: if anything is detected wrong, the controller usually rejects the array and you are out of luck (this may have changed recently, as I have not bought a hardware controller in a few years).

----------

## Rad

 *drescherjm wrote:*   

> If you are going to permanently add a hot spare, why not make it available for the extra redundancy that raid 6 provides? I am not convinced that a hot spare drive will last any longer by sitting in a powered-on machine with no data on it than by actually being used.

 

A drive will last longer as a spare, since drives wear down mechanically. If it idles in standby mode and doesn't use its read/write heads, it's of course not going to wear out as fast as it would being actively used in the RAID. Another point in favor of RAID 5 + a spare is that RAID 6 simply requires significantly more calculations than RAID 5, and hence requires better hardware or more CPU time.

 *drescherjm wrote:*   

> 
> 
> And then there is the choice between hardware and software raid. I can say that hardware raid is probably easier to set up and install, but it does not give you the ability to split the drives into smaller chunks with multiple arrays on the drives for boot, root, swap.

 

Booting off a hardware RAID array is usually possible without any tricks - hence you don't actually need to leave /boot off the array. Only bad mainboards don't allow booting off PCI-attached devices these days.

Also, quite a few 3ware cards would actually provide said ability to split drives logically before putting arrays over the chunks; that's not exclusively a software RAID feature.

----------

## drescherjm

 *Quote:*   

> Another point of using RAID 5 + a spare is that RAID 6 just requires significantly more calculations to work than RAID 5, and hence requires better hardware or more CPU time. 

 

I would not consider 5 to 10% cpu usage on a 2 GHz Opteron for software raid 6 too high a cost.

----------

## remix

ok so i read all your suggestions... raid5 seems to be the popular advice.

so what do you guys think of this setup for raid5

athlonXP (3200)

4 port sata on the motherboard

a cheap $50 2-port pci sata controller

5 250GB hard drives - 4 for 1TB of space, + 1 for parity

a dvd burner

this is well within my budget, and i'm looking for decent speed, redundancy, and backups, with software raid5

is this a good setup? am i missing something? pci too slow? will i get what i want?

will it take me weeks of configuring (i've never done a raid system before, but i read 2 wikis on raid)

----------

## Rad

Looks okay; using standard pci for up to two drives shouldn't be a problem as such, even with SATA. Obviously the controller could still be bad, or worse, not have a driver for linux.

You should probably worry more about the onboard sata controller, in particular if you plan on using a cheaper mainboard...

----------

## remix

ok, i'm buying the hardware now then... hopefully my supplier has a good pci sata controller card that linux can support.

----------

## remix

oh wait! how the hell am i going to back up a 1TB server with a dvd burner? all the compression in the world won't make that 1TB fit on a 4.7GB dvd....

any solutions for backup?

----------

## Antimatter

There are a few options.

1) online backup: you upload the data to a server/datacenter somewhere

2) tape drive backup

3) Another "array" somewhere else to copy your data over to.

4) a lot of monkeys for backing up to dvd's

----------

## Ast0r

 *Antimatter wrote:*   

> 4) a lot of monkeys for backing up to dvd's

 

Yeah, spider monkeys!

----------

## remix

wow tape drives are expensive!!!!!

hmm.... the network thing might be the best way... i'm not exactly sure how to do it, but i can research.

----------

## tgh

My current gentoo boxes are very similar to drescherjm's.  The M2N-E motherboards are very good for entry level servers.

First, the power supply: if you can't go with a server case that has redundant PSUs, there's the less expensive route, but you need to be able to afford downtime while you swap out the PSU. We run everything in Xen domains, with hot-spare hardware, so that we can move drives or individual guest domains to other machines as needed. You also need to decide how many hard drives you want to install.

LIAN LI PC-6077B Black Aluminum ATX Mid Tower

- Can support 10 SATA drives in hot-swap cages

LIAN-LI PC-7077B SERVER CASE

- Can support 15 SATA drives in hot-swap cages

The drives can be stored in cages like this:

$110 Athena Power 5:3 SATA-II Backplane SATA3051B

http://www.mwave.com/mwave/viewspec.hmx?scriteria=BA22992

They take up (3) 5.25" bay slots and hold (5) SATA drives.  They also come in a 4:3 style (some cases, the 5:3 style won't fit due to metal tabs inside the 5.25" bay area) and a 3:2 style that fits 3 drives into 2 bays.

For a PSU, grab something in the 650-750W range to support that many drives. Something like the Thermaltake W0106RU is very attractive because it has modular plugs, which means you can swap out a bad PSU a lot faster. (You should always have a spare PSU on hand.)

For the DVD-RW, go with any of the mid-level, $40 units that do DVD-RAM and the two dozen other formats.

Motherboard - go with the Asus M2N-E, but make sure you buy the right RAM.  Buy a 2nd motherboard to have on-hand if you can't deal with downtimes that last until the next AM (when you can get a replacement).

Memory - I always use Kingston ValueRAM KVR533D2E4K2/2G 2GB kits (1GB x 2, matched pair, PC2-4200 533MHz CL4 ECC 240-pin DDR2 DIMM) with my M2N-E and M2N32-SLI Deluxe motherboards. Go with 4GB because it will help the file server cache files better. In a pinch, you can go with only 2GB, but you'll upgrade to 4GB or 6GB down the road. Plus, if you buy 4 sticks of RAM, you can always pull 2 to troubleshoot bad memory issues.

CPU - AMD Athlon64 X2 3800+ / 4200+ / 4600+ *AM2* - the AM2 allows you to do hardware virtualization if needed

Video card - Go with a PCI video card, or use the low end GeForce 6200 LE which is fanless.

NIC - Get an Intel dual-NIC PCI-X or PCIe card - that will allow you to do NIC bonding for fault tolerance against bad / disconnected network cables, plus you can get increased bandwidth out of the server.  You can use the PCI-X versions in regular PCI slots on the M2N-E motherboard, or you can use up your only PCIe x4 slot for the PCIe version of the Intel NIC.

Now for the hard drives. A basic config would be (5) 750GB SATA drives filling up a single bank. You can do all 5 using the on-board (6) channel NVIDIA SATA plugs. If you have to expand down the road, you'll need to buy PCIe 4-port and 8-port RAID cards, so plan ahead and leave those slots free. 4-port cards require either a PCIe x1 slot or a PCIe x4 slot; the 8-port cards require PCIe x4 slots.

(Which is why you should buy a cheap PCI-only video card - so you don't use up your PCIe x16 slot.)

drescherjm's setup for mdadm is pretty much perfect, except that I'd add a hot-spare drive into the mix.

For performance, I'd recommend the following setup though:

md0 - 256MB, RAID1, 5 members, using the 1st partition on the disk

md1 - 12GB RAID1, 3 members, using the 2nd partition on the first 3 disks (sda - sdc)

md2 - 12GB RAID1, 2 members, using the 2nd partition on the last 2 disks (sdd - sde)

md3 - 512MB 4-disk RAID10 w/ hot-spare (1GB net space) for the swap

md4 - 4-disk RAID10 w/ hot spare using the rest of each disk (should net you around 1.3TB)

You can manage all of the space within md4 using LVM and create your data partitions in LVM.

For servers, you should put your swap file on top of the RAID.  Otherwise when a disk dies, you run a good chance of a server crash.  (The performance hit is negligible, especially since you're doing it on top of RAID10.)

Notes on md1 vs md2.  The md1 partition is where you will install Gentoo (it's the root partition).  Once you have gentoo configured and in a working state, you should copy the OS over to md2 and change the hostname to be something like "machinename-emergency".  This gives you a 2nd OS that is bootable in case the primary OS gets hosed.  You'll want to create a 2nd entry in your GRUB/LILO config that allows someone to boot into the 2nd emergency root partition when the primary root partition is hosed.
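The dual-root boot menu might look like this in a legacy grub.conf (a sketch; the kernel file name is a placeholder, and the md numbering follows the layout above):

```
default 0
timeout 10

title Gentoo (primary root on md1)
root (hd0,0)
kernel /kernel-2.6 root=/dev/md1

title machinename-emergency (fallback root on md2)
root (hd0,0)
kernel /kernel-2.6 root=/dev/md2
```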

Of course, your fallback position is to have someone put the LiveCD in, fire it up, set the password, check the IP address and start up SSHD.  But a fall-back root partition saves a few steps.

For backing all this up? Either write to removable disk drives or go with tape. Tape is better for long-term archives (due to storage density), but tape drives are very temperamental and it's best to have (2) tape drives so that you can troubleshoot.

----------

## Strunzdesign

Hi!

This thread is very interesting, but I would NEVER NEVER use a combination of RAID5 data storage and swap-"RAID0" (the kind of high-speed parallel use of disks that you get when you use multiple swap partitions with the same priority).

When a single drive fails, all your swap space will be messed up and all swapped-out memory pages will be lost! Your system will crash! On the other hand... if you think your system does not make use of swap space, why add swap space at all?

What I would do:

Make your swap partitions "raid autodetect", set up a RAID5 for your swap space, and you get all the benefits of a RAID5 system with no hard crashes when a single drive fails.
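In mdadm terms (a sketch; device names follow the six-drive layout discussed earlier in the thread):

```shell
# one RAID5 across the six swap partitions instead of six independent
# equal-priority swap partitions
mdadm --create /dev/md1 --level=5 --raid-devices=6 /dev/sd[a-f]2
mkswap /dev/md1
swapon /dev/md1
```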

Florian

----------

## nevynxxx

You don't mention a budget.....but I would say the following.

You don't have much experience with raid, and you want a pure file server....

http://www.hp.com/sbso/serverstorage/all-in-one-network-storage.html from HP would give you what you want.

Or just about any NFS server on the market. Why bother with a custom build that you have to maintain when you can get these things off the shelf?

Just another POV.

----------

## gohmdoree

sounds pretty cool. i've become more of an advocate of raid, though i haven't ventured beyond a raid 1 setup, going from software raid to hardware raid. i think the 3ware cards are great, though they can come at a price, depending on how many drives you intend to hold.

time to rebuild my file server at home, running on a single drive configuration.

----------

