# Serious XFS errors that will not go away.

## mmontg1

I'm getting XFS internal errors on a server that has been online for quite a while.  I initially thought it was a hard drive issue and replaced the drive, to no avail.  I also thought it could be an IDE port or motherboard issue, so I tried a new motherboard.  No go.  I assumed it might be a kernel bug, so I upgraded to a newer kernel.  Nothing has worked so far.

Strangely, the easiest way to produce the internal errors is by running an 'emerge --sync'  (hda7 is /var, and it gets dumped, and mounted read-only)

Here's the error:

```
May 10 22:25:22 tux 0x0: 49 41 42 54 00 00 00 b4 ff ff ff ff ff ff ff ff

May 10 22:25:22 tux Filesystem "hda7": XFS internal error xfs_ialloc_read_agi at line 1372 of file fs/xfs/xfs_ialloc.c.  Caller 0xc022efaf

May 10 22:25:22 tux [<c0230784>] xfs_ialloc_read_agi+0xf4/0x146

May 10 22:25:22 tux [<c0230784>] xfs_ialloc_read_agi+0xf4/0x146

May 10 22:25:22 tux [<c022efaf>] xfs_ialloc_ag_select+0x28f/0x2f0

May 10 22:25:22 tux [<c022efaf>] xfs_ialloc_ag_select+0x28f/0x2f0

May 10 22:25:22 tux [<c022efaf>] xfs_ialloc_ag_select+0x28f/0x2f0

May 10 22:25:22 tux [<c022efaf>] xfs_ialloc_ag_select+0x28f/0x2f0

May 10 22:25:22 tux [<c022efaf>] xfs_ialloc_ag_select+0x28f/0x2f0

May 10 22:25:22 tux [<c022efaf>] xfs_ialloc_ag_select+0x28f/0x2f0

May 10 22:25:22 tux [<c022fb31>] xfs_dialloc+0xb21/0xb50

May 10 22:25:22 tux [<c022fb31>] xfs_dialloc+0xb21/0xb50

May 10 22:25:22 tux [<c021c5e9>] xfs_da_brelse+0xa9/0xe0

May 10 22:25:22 tux [<c021c5e9>] xfs_da_brelse+0xa9/0xe0

May 10 22:25:22 tux [<c0241310>] xlog_grant_log_space+0x120/0x370

May 10 22:25:22 tux [<c0241310>] xlog_grant_log_space+0x120/0x370

May 10 22:25:22 tux [<c0236a66>] xfs_ialloc+0x66/0x510

May 10 22:25:22 tux [<c0236a66>] xfs_ialloc+0x66/0x510

May 10 22:25:22 tux [<c025b965>] kmem_zone_zalloc+0x35/0x90

May 10 22:25:22 tux [<c025b965>] kmem_zone_zalloc+0x35/0x90

May 10 22:25:22 tux [<c025083f>] xfs_dir_ialloc+0x8f/0x2e0

May 10 22:25:22 tux [<c025083f>] xfs_dir_ialloc+0x8f/0x2e0

May 10 22:25:22 tux [<c024d663>] xfs_trans_reserve+0x83/0x1e0

May 10 22:25:22 tux [<c024d663>] xfs_trans_reserve+0x83/0x1e0

May 10 22:25:22 tux [<c0257899>] xfs_mkdir+0x2b9/0x760

May 10 22:25:22 tux [<c0257899>] xfs_mkdir+0x2b9/0x760

May 10 22:25:22 tux [<c01f503b>] xfs_acl_vhasacl_default+0x3b/0x50

May 10 22:25:22 tux [<c01f503b>] xfs_acl_vhasacl_default+0x3b/0x50

May 10 22:25:22 tux [<c0262a6b>] linvfs_mknod+0x36b/0x410

May 10 22:25:22 tux [<c0262a6b>] linvfs_mknod+0x36b/0x410

May 10 22:25:22 tux [<c015def1>] real_lookup+0xc1/0xf0

May 10 22:25:22 tux [<c015def1>] real_lookup+0xc1/0xf0

May 10 22:25:22 tux [<c0167e24>] dput+0x24/0x210

May 10 22:25:22 tux [<c0167e24>] dput+0x24/0x210

May 10 22:25:22 tux [<c015dd25>] path_release+0x15/0x50

May 10 22:25:22 tux [<c015dd25>] path_release+0x15/0x50

May 10 22:25:22 tux [<c015e490>] link_path_walk+0x2d0/0xd20

May 10 22:25:22 tux [<c015e490>] link_path_walk+0x2d0/0xd20

May 10 22:25:22 tux [<c0254c1d>] xfs_access+0x4d/0x60

May 10 22:25:22 tux [<c0254c1d>] xfs_access+0x4d/0x60

May 10 22:25:22 tux [<c0262b4a>] linvfs_mkdir+0x2a/0x30

May 10 22:25:22 tux [<c0262b4a>] linvfs_mkdir+0x2a/0x30

May 10 22:25:22 tux [<c0160354>] vfs_mkdir+0x94/0x110

May 10 22:25:22 tux [<c0160354>] vfs_mkdir+0x94/0x110

May 10 22:25:22 tux [<c0160465>] sys_mkdir+0x95/0x100

May 10 22:25:22 tux [<c0160465>] sys_mkdir+0x95/0x100

May 10 22:25:22 tux [<c010271b>] syscall_call+0x7/0xb

May 10 22:25:22 tux [<c010271b>] syscall_call+0x7/0xb
```

Then you'll get this:

```
May 10 22:25:23 tux xfs_force_shutdown(hda7,0x8) called from line 1091 of file fs/xfs/xfs_trans.c.  Return address = 0xc026638c

May 10 22:25:23 tux xfs_force_shutdown(hda7,0x8) called from line 1091 of file fs/xfs/xfs_trans.c.  Return address = 0xc026638c

May 10 22:25:23 tux Filesystem "hda7": Corruption of in-memory data detected.  Shutting down filesystem: hda7

May 10 22:25:23 tux Please umount the filesystem, and rectify the problem(s)

May 10 22:25:51 tux xfs_force_shutdown(hda7,0x1) called from line 353 of file fs/xfs/xfs_rw.c.  Return address = 0xc026638c

May 10 22:25:51 tux xfs_force_shutdown(hda7,0x1) called from line 353 of file fs/xfs/xfs_rw.c.  Return address = 0xc026638c

```
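A side note for anyone reading the trace: as far as I can tell, the `0x0:` line at the top of the first log is a hex dump of the start of the buffer that XFS rejected. An AGI header should begin with the magic bytes `58 41 47 49` (`XAGI`), so decoding the dumped bytes to ASCII shows what was read instead. A minimal sketch in POSIX shell; the function name `hex_to_ascii` is made up for illustration:

```shell
#!/bin/sh
# Convert space-separated hex bytes (as printed in the "0x0:" dump line)
# to ASCII. Non-printable bytes are shown as ".".
# hex_to_ascii is a hypothetical helper, not part of any XFS tool.
hex_to_ascii() {
    out=""
    for byte in "$@"; do
        code=$((0x$byte))                      # hex string -> decimal
        if [ "$code" -ge 32 ] && [ "$code" -le 126 ]; then
            # %b expands the \0NNN octal escape into the actual character
            out="$out$(printf '%b' "\\0$(printf '%o' "$code")")"
        else
            out="$out."
        fi
    done
    printf '%s\n' "$out"
}

# First eight bytes from the dump posted above
hex_to_ascii 49 41 42 54 00 00 00 b4
```

For the dump posted here the first four bytes decode to `IABT`, which is the magic of an XFS inode B-tree block rather than an AGI header. That only says the wrong kind of block was in the buffer; it does not say whether the disk, the RAM, or the kernel put it there.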

More extended system information:

```

tux log # uname -a

Linux tux 2.6.11.6 #4 Fri Jun 10 20:19:00 CDT 2005 i686 AMD Athlon(tm) XP 2600+ AuthenticAMD GNU/Linux

tux linux # grep -i xfs .config

# XFS support

CONFIG_XFS_FS=y

CONFIG_XFS_EXPORT=y

# CONFIG_XFS_RT is not set

CONFIG_XFS_QUOTA=y

CONFIG_XFS_SECURITY=y

CONFIG_XFS_POSIX_ACL=y

# CONFIG_VXFS_FS is not set

tux linux # lspci -v

0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, 66Mhz, medium devsel, latency 8

        Memory at e0000000 (32-bit, prefetchable)

        Capabilities: [80] AGP version 3.5

        Capabilities: [c0] Power Management version 2

0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge (prog-if 00 [Normal decode])

        Flags: bus master, 66Mhz, medium devsel, latency 0

        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0

        Capabilities: [80] Power Management version 2

0000:00:08.0 VGA compatible controller: S3 Inc. 86c764/765 [Trio32/64/64V+] (rev 53) (prog-if 00 [VGA])

        Flags: medium devsel, IRQ 12

        Memory at e8000000 (32-bit, non-prefetchable)

0000:00:09.0 Ethernet controller: D-Link System Inc RTL8139 Ethernet (rev 10)

        Subsystem: D-Link System Inc DFE-530TX+ 10/100 Ethernet Adapter

        Flags: bus master, medium devsel, latency 32, IRQ 10

        I/O ports at b000

        Memory at ec820000 (32-bit, non-prefetchable) [size=256]

        Capabilities: [50] Power Management version 2

0000:00:0a.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 64)

        Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100

        Flags: bus master, medium devsel, latency 32, IRQ 5

        I/O ports at b400

        Memory at ec822000 (32-bit, non-prefetchable) [size=128]

        Capabilities: [dc] Power Management version 1

0000:00:0b.0 RAID bus controller: 3ware Inc 3ware Inc 3ware 7xxx/8xxx-series PATA/SATA-RAID (rev 01)

        Subsystem: 3ware Inc 3ware Inc 3ware 7xxx/8xxx-series PATA/SATA-RAID

        Flags: bus master, medium devsel, latency 32, IRQ 11

        I/O ports at b800

        Memory at ec821000 (32-bit, non-prefetchable) [size=16]

        Memory at ec000000 (32-bit, non-prefetchable) [size=8M]

        Capabilities: [40] Power Management version 1

0000:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 10

        I/O ports at bc00

        I/O ports at c000 [size=4]

        I/O ports at c400 [size=8]

        I/O ports at c800 [size=4]

        I/O ports at cc00 [size=16]

        I/O ports at d000 [size=256]

        Capabilities: [c0] Power Management version 2

0000:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32

        I/O ports at d400 [size=16]

        Capabilities: [c0] Power Management version 2

0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) (prog-if 00 [UHCI])

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 12

        I/O ports at d800 [size=32]

        Capabilities: [80] Power Management version 2

0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) (prog-if 00 [UHCI])

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 12

        I/O ports at dc00 [size=32]

        Capabilities: [80] Power Management version 2

0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) (prog-if 00 [UHCI])

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 10

        I/O ports at e000 [size=32]

        Capabilities: [80] Power Management version 2

0000:00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81) (prog-if 00 [UHCI])

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 10

        I/O ports at e400 [size=32]

        Capabilities: [80] Power Management version 2

0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86) (prog-if 20 [EHCI])

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 5

        Memory at ec823000 (32-bit, non-prefetchable)

        Capabilities: [80] Power Management version 2

0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, stepping, medium devsel, latency 0

        Capabilities: [c0] Power Management version 2

0000:00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: medium devsel, IRQ 5

        I/O ports at e800

        Capabilities: [c0] Power Management version 2

0000:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)

        Subsystem: ABIT Computer Corp.: Unknown device 1408

        Flags: bus master, medium devsel, latency 32, IRQ 12

        I/O ports at ec00

        Memory at ec824000 (32-bit, non-prefetchable) [size=256]

        Capabilities: [40] Power Management version 2

tux linux # mount

/dev/hda3 on / type xfs (rw,noatime)

none on /dev type devfs (rw)

none on /proc type proc (rw)

none on /sys type sysfs (rw)

none on /dev/pts type devpts (rw)

/dev/hda5 on /tmp type xfs (rw,noexec,nosuid,nodev,noatime)

/dev/hda6 on /var type xfs (rw,nosuid,nodev,noatime)

/dev/hda7 on /usr type xfs (rw,nodev,noatime)

none on /dev/shm type tmpfs (rw)

/dev/md0 on /home type reiserfs (rw,noexec,nosuid,nodev,noatime)

tux etc # cat make.conf  (cut for brevity)

# These settings were set by the catalyst build script that automatically built this stage

# Please consult /etc/make.conf.example for a more detailed example

USE="-X -ipv6 encode -gdbm xml xml2 maildir -mbox perl sasl -pdflib tcpd pam ssl sasl postfix libwww apache php mod_php imap pop3 gd dba snmp session mysql apache2 crypt -qt"

CFLAGS="-O2 -mcpu=athlon-xp -fomit-frame-pointer"

CHOST="i686-pc-linux-gnu"

CXXFLAGS="${CFLAGS}"

```

I am at a total loss now, and this is causing nothing but issues, as you can well imagine.

Does anybody have any information or suggestions that could help me out?  It would be greatly appreciated.

----------

## mmontg1

Since no one had any insight, I went ahead and booted the machine into single-user mode and changed /var and /usr from XFS to ext3, since ext3 seems to be stable.

Maybe I should change the name of this thread to "xfs is not stable".

Have a good day guys.

----------

## fbcyborg

Hello,

Same problem here, on kernel 2.6.28-gentoo.

```
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

Filesystem "md5": XFS internal error xfs_ialloc_read_agi at line 1408 of file fs/xfs/xfs_ialloc.c.  Caller 0xffffffff803c7162

Pid: 5784, comm: kio_file Tainted: P           2.6.28-gentoo #3

Call Trace:

 [<ffffffff803c7162>] 0xffffffff803c7162

 [<ffffffff803c5e61>] 0xffffffff803c5e61

 [<ffffffff803c7162>] 0xffffffff803c7162

 [<ffffffff8069aa90>] 0xffffffff8069aa90

 [<ffffffff803c7162>] 0xffffffff803c7162

 [<ffffffff803c755d>] 0xffffffff803c755d

 [<ffffffff8022991c>] 0xffffffff8022991c

 [<ffffffff80228624>] 0xffffffff80228624

 [<ffffffff802267d0>] 0xffffffff802267d0

 [<ffffffff803cee1e>] 0xffffffff803cee1e

 [<ffffffff803eb13a>] 0xffffffff803eb13a

 [<ffffffff803e478b>] 0xffffffff803e478b

 [<ffffffff803e12d6>] 0xffffffff803e12d6

 [<ffffffff8028dd19>] 0xffffffff8028dd19

 [<ffffffff803e8ab1>] 0xffffffff803e8ab1

 [<ffffffff803f3535>] 0xffffffff803f3535

 [<ffffffff8028ee50>] 0xffffffff8028ee50

 [<ffffffff80291219>] 0xffffffff80291219

 [<ffffffff8028584a>] 0xffffffff8028584a

 [<ffffffff8020b5db>] 0xffffffff8020b5db

```

I have been using XFS for a few months, and this is the first time I've had this problem.

I'm using RAID 0 across two disks.

----------

## Akkara

Unmount the filesystem (or remount read-only if it is root), and try running xfs_check to see what it says.  Check your /var/log/messages for disk I/O errors as well during this.

If you aren't getting disk I/O errors, try xfs_repair.  Back up any important files first.

If you are also getting disk I/O errors, the disk is dying, so pull whatever you need off of it while you can.
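The order of those steps can be sketched roughly as below. This is only an outline under assumptions: `/dev/hda7` and `/var/log/messages` are placeholders, the helper names are made up, and the `I/O error` pattern is a guess at what the kernel typically logs for a failing disk (for example `end_request: I/O error` or `Buffer I/O error on device`). The commands that touch the disk are left as comments so nothing runs by accident.

```shell
#!/bin/sh
# Placeholders: adjust both for your system.
DEV="${DEV:-/dev/hda7}"
LOG="${LOG:-/var/log/messages}"

# True if the given log file contains kernel disk I/O error messages.
# The pattern is an assumption about typical kernel wording.
has_io_errors() {
    grep -q 'I/O error' "$1"
}

# Print the suggested next step for the given log file.
next_step() {
    if has_io_errors "$1"; then
        echo "disk I/O errors found: copy your data off the disk first"
    else
        echo "no disk I/O errors found: back up, then try xfs_repair"
    fi
}

# Typical manual sequence (needs root):
#   umount "$DEV"
#   xfs_repair -n "$DEV"    # -n: check only, make no changes
#   next_step "$LOG"
#   xfs_repair "$DEV"       # only after a backup and no I/O errors
```

Running `xfs_repair -n` first is a cautious choice: it reports what it would fix without writing to the filesystem.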

----------

## pdw_hu

I've been using XFS for ages now, and never encountered anything like this. The only difference (apart from the drive) is that I only have plain XFS in the kernel, none of the options you do. It might be worth trying to disable those (unless you need them of course) and test it.
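To see which XFS options are switched on beyond the base driver, something like this should do. It is just a sketch: `xfs_extra_options` is a made-up name, and the config path in the usage line is the usual Gentoo location, which may differ on your box.

```shell
#!/bin/sh
# List XFS sub-options that are enabled (built in or as a module) in a
# kernel config file, leaving out the base CONFIG_XFS_FS entry.
# xfs_extra_options is a hypothetical helper name.
xfs_extra_options() {
    grep -E '^CONFIG_XFS_[A-Z0-9_]+=[ym]$' "$1" \
        | grep -v '^CONFIG_XFS_FS=' \
        | cut -d= -f1
}

# Usage (path is an assumption):
#   xfs_extra_options /usr/src/linux/.config
```

Each name it prints is a candidate to turn off for a test kernel, as long as nothing on the system depends on it (quotas, ACLs, NFS export).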

----------

## fbcyborg

Thank you guys,

I'm currently running xfs_repair on that partition.

I had never encountered these errors either in all the time I've been using XFS.

----------

## fbcyborg

I've just finished running xfs_repair.

The only thing I can read is:

"Sorry, could not find valid secondary superblock" and a few other similar messages during the check.

----------

## Nerevar

 *Quote:*   

> May 10 22:25:23 tux Filesystem "hda7": Corruption of in-memory data detected.  Shutting down filesystem: hda7

 

Have you checked your RAM to rule out that possibility? At least the error didn't get to disk.

----------

## fbcyborg

The problem happened to me again.

Maybe I will try another filesystem.

----------

## snIP3r

Hi!

I have also been using XFS for a long time, with this config:

```

area52 ~ # less /usr/src/linux/.config |grep XFS

CONFIG_XFS_FS=y

CONFIG_XFS_QUOTA=y

CONFIG_XFS_POSIX_ACL=y

CONFIG_XFS_RT=y

# CONFIG_XFS_DEBUG is not set

# CONFIG_VXFS_FS is not set

```

and I have never had such problems (across different kernel versions too). Perhaps your problems are memory or hard disk related...

HTH

snIP3r

----------

## fbcyborg

Thank you. I don't know why it happened, but I formatted my partition with the ext4 filesystem and I had no more problems.

----------

