# ls causes segmentation fault on drbd + ocfs2

## petero

Hello, 

I've been trying to get drbd + ocfs2 working by following this guide: http://webcache.googleusercontent.com/search?q=cache:njQEooenU8cJ:en.gentoo-wiki.com/wiki/Active-active_DRBD_with_OCFS2+&cd=1&hl=en&ct=clnk

For the most part it works. However, when I create a directory containing symlinks to other directories inside the drbd partition and then run `ls` on that directory, it ends with a segmentation fault (or sometimes it just hangs), and I find this in the log:

kernel: general protection fault: 0000 [#1] SMP

...

kernel: Pid: 17098, comm: ls Not tainted 3.5.7-gentoo #5 VMware, Inc. VMware Virtual Platform

...

kernel: Call Trace:

kernel: [<ffffffff8125388a>] ocfs2_fast_symlink_readpage+0xde/0x15c

kernel: [<ffffffff8108c494>] ? add_to_page_cache_lru+0x2f/0x39

kernel: [<ffffffff8108c602>] do_read_cache_page+0x8e/0x13c

kernel: [<ffffffff812537ac>] ? ocfs2_unblock_signals+0x1c/0x1c

kernel: [<ffffffff8108c6ea>] read_cache_page_async+0x17/0x19

kernel: [<ffffffff8108c6f5>] read_cache_page+0x9/0x13

kernel: [<ffffffff810c1127>] page_getlink.clone.29+0x28/0x82

kernel: [<ffffffff810c11a2>] page_follow_link_light+0x21/0x34

kernel: [<ffffffff810bfbce>] generic_readlink+0x3a/0x97

kernel: [<ffffffff810bacb1>] sys_readlinkat+0x76/0x94

kernel: [<ffffffff810bace5>] sys_readlink+0x16/0x18

kernel: [<ffffffff81488262>] system_call_fastpath+0x16/0x1b

kernel: Code: d1 48 8d 44 0a ff 40 38 30 74 0a 48 ff c8 48 39 d0 73 f3 31 c0 c9 c3 55 48 89 f8 48 89 e5 eb 03 48 ff c0 48 85 f6 74 08 48 ff ce <80> 38 00 75 f0 48 29 f8 c9 c3 55 31 c0 48 89 e5 eb 17 44 38 c1 

kernel: RIP  [<ffffffff812e6786>] strnlen+0x14/0x1e

kernel: RSP <ffff88031f8c1d08>

kernel: ---[ end trace add4a6818eca9284 ]---
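
The reproduction described above can be sketched roughly as follows. This is only an illustration: `TESTDIR` is a made-up variable that defaults to a scratch directory, and the directory names are invented. To exercise the real setup it would have to point somewhere on the OCFS2 mount.

```shell
#!/bin/sh
# Hypothetical reproduction sketch, not the exact commands from the post.
# TESTDIR defaults to a scratch directory; set it to a directory on the
# OCFS2 mount (for example TESTDIR=/mnt/ocfs2/symlink-test, an assumed path)
# to test the filesystem from this thread.
TESTDIR="${TESTDIR:-$(mktemp -d)}"

mkdir -p "$TESTDIR/targets/a" "$TESTDIR/targets/b" "$TESTDIR/links"

# Symlinks pointing at other directories on the same partition.
ln -sfn "$TESTDIR/targets/a" "$TESTDIR/links/a"
ln -sfn "$TESTDIR/targets/b" "$TESTDIR/links/b"

# ls -l reads every link target via readlink, the syscall in the call trace.
ls -l "$TESTDIR/links"

# readlink exercises the same path one link at a time.
readlink "$TESTDIR/links/a"
```

On an affected kernel the `ls -l` or `readlink` step is where the crash or hang would be expected to show up.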

My kernel is 3.5.7-gentoo x64.

Initially I was getting warnings that the drbd kernel-space version (8.3.13) doesn't match the drbd user-space tools (8.3.11), but I continued installing anyway and got to the state where everything was working except the symlinks.

So then I tried installing version 8.3.13 of the drbd tools to see if it helps, and also a different version of the ocfs2 tools (I finished the installation on 1.8.2 and later downgraded to 1.6.4), but none of those changes made any difference: ls is still crashing.

My colleague has previously done the same installation on Ubuntu using drbd 8.3.11-0ubuntu1 and ocfs2 1.6.3-4ubuntu1, and there it all works fine (our install steps were basically the same, apart from the Gentoo vs. Ubuntu specifics).

Could running both kernel-space and user-space drbd at version 8.3.11 (the same as the successful Ubuntu install) help? And how can I install an older-than-default version of drbd into the kernel? I am a total Gentoo beginner, so I have no clue what else to do.

Any ideas how to investigate this further or how to fix it?

----------

## syn0ptik

Try `strace -f -o /tmp/out your_app`.

There may be a bug in libc, because

```
kernel: RIP [<ffffffff812e6786>] strnlen+0x14/0x1e
```

happened, or in the kernel modules.

Or switch those modules for ocfs.

----------

## petero

*syn0ptik wrote:*

> Try strace -f -o /tmp/out your_app
>
> There may be a bug in libc, because
>
> ...

Hey, if you mean I should run `strace -f -o /tmp/out ls`, that's what I've just tried: ls doesn't crash that way. Instead it correctly prints the contents of the folder to the console. The logged /tmp/out is quite long. Should I post it here (even though ls didn't crash)?
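
One way to keep the trace short enough to post might be to record only the symlink-reading syscalls that appear in the call trace. This is a sketch, not something tried in this thread: `trace_readlinks` is a made-up helper name, and the syscall names assume an x86_64 system where both `readlink` and `readlinkat` exist.

```shell
# Hypothetical helper: trace only readlink/readlinkat while listing a directory.
# $1 = directory to list, $2 = trace output file (defaults to /tmp/out).
trace_readlinks() {
    dir="$1"
    out="${2:-/tmp/out}"
    if ! command -v strace >/dev/null 2>&1; then
        echo "strace not found" >&2
        return 127
    fi
    # -f follows child processes, -e trace=... limits the log to the named syscalls.
    strace -f -e trace=readlink,readlinkat -o "$out" ls -l "$dir"
}

# Example call (the path is an assumption, adjust to your mount point):
# trace_readlinks /mnt/ocfs2/links /tmp/out
```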

What do you mean by: "switch those modules for ocfs" ?

----------

## petero

So I updated world and installed a new kernel (3.7.3), but the problem remains. What else can I do? Should I report this as a bug, or...?

----------

## syn0ptik

No, it's for tracing. It skips a couple of things when it runs.

Which command crashed?

----------

## randalla

I'm sorry to bring this old post back up, but I ran into the same thing today. What I also found today was a fix for it:

http://comments.gmane.org/gmane.comp.file-systems.ocfs2.devel/8008

After applying the patch discussed there, I have not had any issues with symlinks on the ocfs2 partition.
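
For anyone unsure how to apply such a patch, a minimal sketch could look like this. It assumes the patch from the linked discussion has been saved to a local file and uses the usual `a/` and `b/` path prefixes (hence `-p1`). The helper name `apply_patch_checked` and both example paths are made up for illustration.

```shell
# Hypothetical helper: dry-run a patch against a source tree, then apply it.
# $1 = source tree, $2 = absolute path to the patch file.
apply_patch_checked() {
    src="$1"
    patchfile="$2"
    (
        cd "$src" || exit 1
        # --dry-run reports whether every hunk applies without touching files.
        patch -p1 --dry-run < "$patchfile" || exit 1
        # Only reached when the dry run succeeded.
        patch -p1 < "$patchfile"
    )
}

# Example call (both paths are assumptions, adjust to your system):
# apply_patch_checked /usr/src/linux /root/ocfs2-symlink-fix.patch
```

The kernel still has to be rebuilt and booted afterwards for the change to take effect.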

Adam.

----------

## petero

Thx, randalla. I will give it a try.

----------

## 666threesixes666

upstream says:

"Dok: issue is in ocfs2 not DRBD

Dok: the real question is 'do you really need a clustered filesystem?'

666threesixes666: what FS do you recommend for drbd?  did you test jfs for it?

Dok: I have

Dok: jfs takes some voodoo to get working in rhel, but it works

Dok: DRBD is just a block device

Dok: the filesystem, as with anything else, really depends upon your expected use case

Dok: I like ext4 myself

Dok: it seems to be the most stable"

You really don't want to move backwards in versions. The latest stable upstream is 8.4.3.

(I seriously advise against this...)

(as root)

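```
# Mask drbd 8.3.12 and newer so Portage picks an older version
echo ">=sys-cluster/drbd-8.3.12" >> /etc/portage/package.mask
# Re-emerge drbd; -a asks for confirmation, -v is verbose
emerge -av drbd
```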

and that will put you on 8.3.11-r1

----------

## petero

OK, I can confirm that the fix posted by randalla works for me as well.

On kernel 3.9.6 the patch is already included, and everything works OK there out of the box.

So it seems that if you use a reasonably new kernel you shouldn't run into this problem.

On the other node I have kernel 3.8.2, and there I needed to manually edit fs/ocfs2/symlink.c.

After that (once the new kernel was built and installed), everything works fine on both nodes.
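
A quick way to compare a node's running kernel against the version that worked here could look like the sketch below. Note that 3.9.6 is only the version reported as working in this thread, not necessarily the first release containing the fix, and `version_at_least` is a made-up helper that relies on GNU `sort -V`.

```shell
# Succeeds if version $1 is greater than or equal to version $2 (needs GNU sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any local suffix such as "-gentoo" from the running kernel version.
running="$(uname -r | cut -d- -f1)"

# 3.9.6 is just the version reported as working in this thread.
if version_at_least "$running" "3.9.6"; then
    echo "kernel $running: at or above the version reported working here"
else
    echo "kernel $running: older than 3.9.6, the symlink.c patch may still be needed"
fi
```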

Thanks !

----------

