# ext4 issues on 3.5.1 and 3.5.2

## Cr0t

I upgraded my fileserver from 3.5.0 to 3.5.2. When I delete a large file, the kernel oopses and keeps burning CPU cycles. A clean reboot becomes impossible and I have to hit the reset button. I have seen this when deleting a VM through VirtualBox, when deleting a local file, and when deleting files over NFS. The individual files were between 30 and 100 GB. It seems to be a rare issue.

The crash is in `ext4_ext_remove_space` (just google it); the issue does not seem to be present in 3.5.0.

```
Aug 19 01:13:45 datastorm kernel: [22963.756343] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

Aug 19 01:13:45 datastorm kernel: [22963.756643] IP: [<ffffffff8116db24>] ext4_ext_remove_space+0x9f4/0xce0

Aug 19 01:13:45 datastorm kernel: [22963.756873] PGD 1a5343067 PUD 1a5342067 PMD 0 

Aug 19 01:13:45 datastorm kernel: [22963.757033] Oops: 0000 [#1] PREEMPT SMP 

Aug 19 01:13:45 datastorm kernel: [22963.757174] CPU 1 

Aug 19 01:13:45 datastorm kernel: [22963.757240] Modules linked in: vboxnetflt(O) vboxnetadp(O) vboxdrv(O) iptable_mangle iptable_nat iptable_filter ip_tables x_tables nf_nat_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_irc nf_conntrack_ftp nf_conntrack nvidia(PO) coretemp snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_page_alloc snd_timer firewire_ohci kvm_intel snd firewire_core kvm soundcore r8169 lpc_ich i2c_i801 mfd_core mii crc_itu_t

Aug 19 01:13:45 datastorm kernel: [22963.757325] 

Aug 19 01:13:45 datastorm kernel: [22963.757325] Pid: 6657, comm: VBoxSVC Tainted: P       A   O 3.5.2-gentoo #2 .   .  /IP35 Pro(Intel P35-ICH9R)

Aug 19 01:13:45 datastorm kernel: [22963.757325] RIP: 0010:[<ffffffff8116db24>]  [<ffffffff8116db24>] ext4_ext_remove_space+0x9f4/0xce0

Aug 19 01:13:45 datastorm kernel: [22963.757325] RSP: 0018:ffff8801933dbd38  EFLAGS: 00010246

Aug 19 01:13:45 datastorm kernel: [22963.757325] RAX: 0000000000000000 RBX: ffff8801928ce4b0 RCX: 00000000918c5000

Aug 19 01:13:45 datastorm kernel: [22963.757325] RDX: 0000000000000001 RSI: 000000001e90f7f9 RDI: 0000000000000002

Aug 19 01:13:45 datastorm kernel: [22963.757325] RBP: ffff880156145d88 R08: 00000000918c5000 R09: ffff8801928ce480

Aug 19 01:13:45 datastorm kernel: [22963.757325] R10: ffffffff8116d5e2 R11: 0000000000000000 R12: ffff8801928ce4e0

Aug 19 01:13:45 datastorm kernel: [22963.757325] R13: 0000000000000001 R14: ffff8801928ce4e0 R15: 0000000000d407ff

Aug 19 01:13:45 datastorm kernel: [22963.757325] FS:  00007fa331a68700(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000

Aug 19 01:13:45 datastorm kernel: [22963.757325] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b

Aug 19 01:13:45 datastorm kernel: [22963.757325] CR2: 0000000000000028 CR3: 00000001d4d14000 CR4: 00000000000007e0

Aug 19 01:13:45 datastorm kernel: [22963.757325] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

Aug 19 01:13:45 datastorm kernel: [22963.757325] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Aug 19 01:13:45 datastorm kernel: [22963.757325] Process VBoxSVC (pid: 6657, threadinfo ffff8801933da000, task ffff88011e868620)

Aug 19 01:13:45 datastorm kernel: [22963.757325] Stack:

Aug 19 01:13:45 datastorm kernel: [22963.757325]  ffff88015ca83410 ffffffff81170913 0000000000000000 ffff880156145d88

Aug 19 01:13:45 datastorm kernel: [22963.757325]  ffff880153ab0200 00000000fffffff5 ffff880221f10400 ffff880100000007

Aug 19 01:13:45 datastorm kernel: [22963.757325]  ffff880221559ca8 fffffffe81156a66 ffff880152e2900c ffff880152e29000

Aug 19 01:13:45 datastorm kernel: [22963.757325] Call Trace:

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff81170913>] ? __ext4_handle_dirty_metadata+0x93/0x120

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff8116f9a4>] ? ext4_ext_truncate+0x194/0x1f0

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff81158e48>] ? ext4_evict_inode+0x368/0x3c0

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff810fd807>] ? evict+0xa7/0x1b0

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff810f2687>] ? do_unlinkat+0x167/0x1c0

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff810e2f7f>] ? filp_close+0x5f/0x90

Aug 19 01:13:45 datastorm kernel: [22963.757325]  [<ffffffff8146fdbd>] ? system_call_fastpath+0x1a/0x1f

Aug 19 01:13:45 datastorm kernel: [22963.757325] Code: cf 02 00 00 66 a9 ff 7f 0f 84 c3 02 00 00 48 c7 44 24 60 00 00 00 00 66 0d 00 80 66 89 43 04 e9 5e fe ff ff 0f 1f 00 48 8b 43 28 <48> 8b 40 28 48 89 43 20 48 8b 43 18 48 85 c0 0f 85 a0 f8 ff ff 

Aug 19 01:13:45 datastorm kernel: [22963.757325] RIP  [<ffffffff8116db24>] ext4_ext_remove_space+0x9f4/0xce0

Aug 19 01:13:45 datastorm kernel: [22963.757325]  RSP <ffff8801933dbd38>

Aug 19 01:13:45 datastorm kernel: [22963.757325] CR2: 0000000000000028

Aug 19 01:13:45 datastorm kernel: [22963.796068] ---[ end trace 0bb9109a0ccfb97b ]---
```
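
For anyone comparing their own logs against this trace, here is a minimal Python sketch that pulls the fault address, the faulting symbol, and the call chain out of an oops like the one above. The function name `summarize_oops` and the regular expressions are my own assumptions for illustration; they are not part of any kernel tooling and only cover the line layout seen in this log.

```python
import re

# Matches "[<address>] symbol+0xoffset/0xsize", with an optional "? " marker
# before the symbol as printed in the Call Trace section.
SYMBOL = r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"


def summarize_oops(text):
    """Return fault address, faulting symbol and call trace from an oops dump."""
    fault = re.search(r"NULL pointer dereference at ([0-9a-f]+)", text)
    # \b keeps this from matching the later "RIP: ..." register line.
    ip = re.search(r"\bIP: " + SYMBOL, text)

    trace = []
    in_trace = False
    for line in text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = re.search(SYMBOL, line)
            if m:
                trace.append(m.group(1))
            elif line.strip():
                # First non-empty line without a symbol ends the trace.
                break

    return {
        "fault_address": int(fault.group(1), 16) if fault else None,
        "faulting_symbol": ip.group(1) if ip else None,
        "call_trace": trace,
    }
```

Fed the log above, this should report a fault address of 0x28 inside `ext4_ext_remove_space`, with a call trace that includes `ext4_ext_truncate`, `ext4_evict_inode`, `evict` and `do_unlinkat`, which fits a crash while the file's extents are being freed on delete.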

----------

## toralf

/me assumes that you report such issues at lkml.org too?

----------

## asturm

Lots of ext4 patches in stable-queue: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Fstable-queue.git&a=search&h=HEAD&st=commit&s=ext4

I've found a patch currently pending in stable-queue that you perhaps want to try out: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-3.5/ext4-fix-kernel-bug-on-large-scale-rm-rf-commands.patch;h=b9f17c711c3882824fb5aeff156affbd6ae5b485;hb=86e2fbe7a7aef808ed4d33158d431ee071b85034

 *Quote:*   

> Commit 968dee7722: "ext4: fix hole punch failure when depth is greater than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel crashes when users ran run "rm -rf" on large directory hierarchy on ext4 filesystems on RAID devices:
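
To check quickly whether a box is running one of the stable releases discussed in this thread, something like the following Python sketch could be used. The suspect set (3.5.1 and 3.5.2) is taken from the thread title and the commit message quoted above; the thread does not say which release finally carries the fix, so treat the set as an assumption to adjust. `is_suspect_release` is a made-up helper name, and the check ignores release candidates such as v3.6-rc1.

```python
import platform
import re

# Assumption: stable releases named in this thread as showing the crash.
SUSPECT = {(3, 5, 1), (3, 5, 2)}


def is_suspect_release(release):
    """True if a release string such as '3.5.2-gentoo' is in SUSPECT."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        return False
    # A missing patch level (e.g. "3.5") is treated as ".0".
    major, minor, patch = (int(g) if g else 0 for g in m.groups())
    return (major, minor, patch) in SUSPECT


if __name__ == "__main__":
    rel = platform.release()
    verdict = "suspect" if is_suspect_release(rel) else "not in the suspect list"
    print(rel, verdict)
```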

 

----------

## Cr0t

 *genstorm wrote:*   

> Lots of ext4 patches in stable-queue: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Fstable-queue.git&a=search&h=HEAD&st=commit&s=ext4
> 
> I've found a patch currently pending in stable-queue that you perhaps want to try out: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-3.5/ext4-fix-kernel-bug-on-large-scale-rm-rf-commands.patch;h=b9f17c711c3882824fb5aeff156affbd6ae5b485;hb=86e2fbe7a7aef808ed4d33158d431ee071b85034
> 
>  *Quote:*   
> 
> > Commit 968dee7722: "ext4: fix hole punch failure when depth is greater than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel crashes when users ran run "rm -rf" on large directory hierarchy on ext4 filesystems on RAID devices:

I can't test this right now, since I will be going on a trip next week and don't want to leave the server in a questionable state. I posted this in case someone else is running into the same issue.

----------

