# Script: Measuring fragmentation on Reiserfs (and other fs)

## _droop_

Hi,

Reiserfs is one filesystem that has no tool to measure fragmentation.

I wondered how fragmentation could be measured on a reiserfs partition, so I wrote a script that does it.

The script requires the e2fsprogs package (for the filefrag command), which is installed on almost every Gentoo system. It is written in Perl.

Since filefrag works on various filesystem types (including reiserfs), my script should work on them too.

The script:

```
#!/usr/bin/perl -w

#this script search for frag on a fs
use strict;

#number of files
my $files = 0;
#number of fragment
my $fragments = 0;
#number of fragmented files
my $fragfiles = 0;

#search fs for all file
open (FILES, "find " . $ARGV[0] . " -xdev -type f |");
while (defined (my $file = <FILES>)) {
        #quote some chars in filename
        $file =~ s/!/\\!/g;
        $file =~ s/#/\\#/g;
        $file =~ s/&/\\&/g;
        $file =~ s/>/\\>/g;
        $file =~ s/</\\</g;
        $file =~ s/\$/\\\$/g;
        $file =~ s/\(/\\\(/g;
        $file =~ s/\)/\\\)/g;
        $file =~ s/\|/\\\|/g;
        $file =~ s/'/\\'/g;
        $file =~ s/ /\\ /g;

        #nb of fragment for the file
        open (FRAG, "filefrag $file |");
        my $res = <FRAG>;
        if ($res =~ m/.*:\s+(\d+) extents? found/) {
                my $fragment = $1;
                $fragments+=$fragment;
                if ($fragment > 1) {
                        $fragfiles++;
                }
                $files++;
        } else {
                print ("$res : not understand for $file.\n");
        }
        close (FRAG);
}
close (FILES);

print ( $fragfiles / $files * 100 . "% non contiguous files, " . $fragments / $files . " average fragments.\n");
```
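
The heart of the script is the pattern match on the filefrag output. Here is a minimal sketch of that step on its own, assuming the output has the form `<file>: <n> extent(s) found` that the script expects. `parse_extents` is a hypothetical helper name and is not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): extract the extent
# count from one line of filefrag output. Returns undef when the line
# does not look like "<file>: <n> extent(s) found".
sub parse_extents {
    my ($line) = @_;
    return undef unless defined $line;
    return $line =~ /:\s+(\d+) extents? found/ ? $1 + 0 : undef;
}

# Sample lines written by hand in the assumed format.
for my $line ("/etc/fstab: 1 extent found", "/big.iso: 7060 extents found") {
    my $n = parse_extents($line);
    print "$line -> ", (defined $n ? $n : "not understood"), "\n";
}
```

Returning undef for anything unexpected (an error message, or no output at all) lets the caller skip the file instead of matching against an undefined value.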

To run it, save it somewhere with your beloved(!?) editor, for example as /root/fragck.pl, then make it executable:

```
chmod u+x /root/fragck.pl
```

It's ready.

You must be root to run the script. It takes one argument: the mount point of the filesystem to analyze. It reports the percentage of fragmented files and the average number of fragments per file (see the explanation below).

It has to scan every regular file on the examined partition, so the script is quite slow (depending on the number of files). It takes about 5 minutes on my root partition /.

Some examples from my personal computer:

```
/root/fragck.pl /
5.10010105507148% non contiguous files, 1.11087084031179 average fragments.

/root/fragck.pl /divers
79.5620437956204% non contiguous files, 7060.22627737226 average fragments.
```

Ideally the script should report "0% non contiguous files, 1 average fragments.". The second fs is highly fragmented: "7060 average fragments" means that the files on that fs are split into 7060 pieces on average.
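
To make the two figures concrete, here is a small worked sketch of the arithmetic, using made-up extent counts rather than real measurements. `frag_stats` is a hypothetical name; unlike the script, it also guards against an empty file list.

```perl
use strict;
use warnings;

# Given the extent count of every file, return the two figures the script
# prints: the percentage of files with more than one extent, and the mean
# number of extents per file. Returns an empty list when there are no files.
sub frag_stats {
    my @extents = @_;
    return unless @extents;
    my ($total, $fragmented) = (0, 0);
    for my $n (@extents) {
        $total += $n;
        $fragmented++ if $n > 1;
    }
    return (100 * $fragmented / @extents, $total / @extents);
}

# Four imaginary files: two contiguous, one in 2 pieces, one in 4 pieces.
my ($pct, $avg) = frag_stats(1, 1, 2, 4);
printf "%.1f%% non contiguous files, %.2f average fragments.\n", $pct, $avg;
# prints "50.0% non contiguous files, 2.00 average fragments."
```

A single badly fragmented large file can push the average up a lot while the percentage stays low, so the two numbers are best read together.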

I hope this script will help some people   :Very Happy: 

You may run into problems with files whose names contain special characters. Please report them and I will try to fix the problem.

Enjoy !

PS : the initial thread where I posted the script is https://forums.gentoo.org/viewtopic-t-429134-start-0.html.

PPS : English is not my mother tongue, feel free to correct me...

PPPS : I put no licence on the code, you are free to do what you want with it. But I'm interested in any enhancements   :Very Happy: 

----------

## dhave

Thanks, _droop_. I've tested your script on all the directories on my reiserfs partition, and it worked as advertised. It's a very useful tool. I posted a link to this thread on www.linuxquestions.org, by the way.

----------

## mdeininger

very useful script, I will give it a shot tomorrow

:thumbs up:

----------

## batistuta

I've been waiting for this for years!! OK, I only started using Linux a year ago   :Very Happy: 

Thanks

----------

## batistuta

Could you give us some metrics of what the output means? I.e., when should we defragment? I've found, for example, that my portage tree rebuild was MUCH faster after copying the portage tree back and forth. How would such a badly fragmented partition show up?

It would be cool if there was a cron job or something that would monitor partitions for fragmentation and do this automatically. What seems to you like a sensible threshold for defragmenting (if there is such a thing)?

----------

## fangorn

Out of curiosity:

Is there an actual application/script which can defragment a reiserfs partition? I just don't happen to have another 250 GB at hand to move the files and move them back  :Wink: 

----------

## dhave

 *fangorn wrote:*   

> Out of curiosity:
> 
> Is there an actual application/script which can defragment a reiserfs partition? I just don't happen to have another 250 GB at hand to move the files and move them back 

 

Well, you could look into Con Kolivas's defrag script, which is supposed to work on any filesystem. I don't think he claims that it's the solution to all your defrag problems, but it should reduce fragmentation some, since it rearranges all your data according to file size.

It worked well for me, though I used it on just a few select directories. You should probably try it on a few smallish directories before turning it loose on an entire partition.

----------

## fangorn

Will surely test this before my data partition is entirely "cleaned"  :Twisted Evil: 

----------

## Kateikyoushi

 *dhave wrote:*   

> 
> 
> It worked well for me, though I used it on just a few select directories. You should probably try it on a few smallish directories before turning it loose on an entire partition.

 

I let it loose on a half-year-old reiserfs which was around 30% fragmented; it pushed that down to 5%.

This will do till reiser4 hits mainstream.

----------

## grenouille

nice! thanks  :Smile: 

----------

## wrc1944

I assume this can be run on mounted partitions- correct?

----------

## fangorn

This has to be run on mounted partitions, because it uses shell functions to find and categorize files and move them according to file size. That means the script needs free space at least a little bigger than the biggest file on the filesystem. The more contiguous free space you have, the less fragmented the first files it processes will be.
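
A rough way to check the free-space requirement is to find the biggest file first and compare it with what `df` reports as available. The sketch below only does the first half; `largest_file_size` is a hypothetical helper, and unlike `find -xdev` it does not stop at mount points.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: size in bytes of the largest regular file under $dir
# (0 if there are none). Symlinks are not followed.
sub largest_file_size {
    my ($dir) = @_;
    my $max = 0;
    find({
        no_chdir => 1,
        wanted   => sub {
            my @st = lstat $File::Find::name or return;
            return unless -f _;            # regular files only
            $max = $st[7] if $st[7] > $max; # field 7 of lstat is the size
        },
    }, $dir);
    return $max;
}

# Usage: perl largest.pl /mnt/data
if (@ARGV) {
    printf "largest file under %s: %d bytes\n", $ARGV[0], largest_file_size($ARGV[0]);
}
```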

----------

## gmichels

Nice script, I just found a little bug when a filename contains weird chars such as a double quote mark (") or a parenthesis:

```
./fragck.pl /home
statfs: No such file or directory
sh: -).jpg: command not found
Use of uninitialized value in pattern match (m//) at ./fragck.pl line 32.
Use of uninitialized value in concatenation (.) or string at ./fragck.pl line 40.
 : not understand for /home/gmichels/Documentos/Fotos/Pessoal/Celular/Cacio\ ;-\).jpg
.
statfs: No such file or directory
Use of uninitialized value in pattern match (m//) at ./fragck.pl line 32.
Use of uninitialized value in concatenation (.) or string at ./fragck.pl line 40.
 : not understand for /home/gmichels/Documentos/Pessoal/Músicas/U2/War/10\ -\ "40".mp3
```

----------

## dhave

 *gmichels wrote:*   

> Nice script, I just found a little bug when a filename contains weird chars such as a double quote mark (") or a parenthesis:
> 
> ```
> ./fragck.pl /home
> ...
> ```

 

You might want to send _droop_ a pm to let him know. He can fix this easily; I had a similar problem with special characters in file names.

----------

## dundas

testing, book marked for now, thx partners.

----------

## sirtalon42

I ran your program on my /usr/portage directory and got results that don't seem possible (<1% non contiguous files):

```
# ./fragchk.pl /usr/portage
0.121585826566503% non contiguous files, 1.00151982283208 average fragments.
```

/usr/portage isn't on its own partition, and that's the only thing I can think of that would be throwing off these results.

In the morning I'm gonna try running it on '/', but I'm guessing stuff like /proc and /dev will mess up the results some (lowering the average fragmentation).

EDIT: Oops, for some reason I forgot that the ideal % of non-contiguous files is 0% (and not 1%), though I am rather shocked that portage is so little fragmented (though I guess portage's own files are most likely small enough to fit in between other files, while causing other files to fragment...).

Last edited by sirtalon42 on Tue Feb 14, 2006 3:00 pm; edited 1 time in total

----------

## TGL

 *sirtalon42 wrote:*   

> I ran your program on my /usr/portage directory and got results that don't seem possible (<1% non contiguous files)

 

Lot of very small (1 block) files, so yes, it's quite possible.

 *Quote:*   

> In the morning I'm gonna try running it on '/', but I'm guessing stuff like /proc and /dev will mess up the results some (lowering the average fragmentation).

 

Won't be a problem: it will only check your root partition ("find -xdev").
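
For the curious, the effect of `-xdev` can be approximated in pure Perl by comparing device numbers. This is only a sketch of the idea, not how find itself is implemented, and `files_on_same_device` is a hypothetical name.

```perl
use strict;
use warnings;
use File::Find;

# Return the regular files under $root that live on the same device as
# $root, roughly what "find $root -xdev -type f" prints.
sub files_on_same_device {
    my ($root) = @_;
    my $root_dev = (lstat $root)[0];   # field 0 of lstat is the device number
    my @files;
    find({
        no_chdir => 1,
        wanted   => sub {
            my @st = lstat $File::Find::name or return;
            if ($st[0] != $root_dev) {
                # Another filesystem (e.g. /proc): do not descend into it.
                $File::Find::prune = 1;
                return;
            }
            push @files, $File::Find::name if -f _;
        },
    }, $root);
    return @files;
}

# Usage: perl samedev.pl /
if (@ARGV) {
    print "$_\n" for files_on_same_device($ARGV[0]);
}
```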

----------

## as

With find -print0, setting the record separator to \0, and running filefrag without the shell interfering, you can drop all the filename mangling. It's a lot safer too.

```
#!/usr/bin/perl -w

#this script search for frag on a fs
use strict;

#number of files
my $files = 0;
#number of fragment
my $fragments = 0;
#number of fragmented files
my $fragfiles = 0;

#search fs for all file
open (FILES, "find " . $ARGV[0] . " -xdev -type f -print0 |");
$/ = "\0";
while (defined (my $file = <FILES>)) {
        #drop the trailing \0 record separator
        chomp $file;
        open (FRAG, "-|", "filefrag", $file);
        my $res = <FRAG>;
        if ($res =~ m/.*:\s+(\d+) extents? found/) {
                my $fragment = $1;
                $fragments += $fragment;
                if ($fragment > 1) {
                        $fragfiles++;
                }
                $files++;
        } else {
                print ("$res : not understand for $file.\n");
        }
        close (FRAG);
}
close (FILES);

print ( $fragfiles / $files * 100 . "% non contiguous files, " . $fragments / $files . " average fragments.\n");
```

----------

## revertex

@ as,

Thanks for the fix, works fine here. 

WTF, all partitions were created 3 months ago: reiserfs3 with the noatime and notail flags.

```
fragck.pl /home
2.5240078184754% non contiguous files, 3.15788221296847 average fragments.
```

Looks like the guys at nemesis are telling us a big lie about reiserfs fragmentation.

Reiser is only fast in the first week after formatting; a few weeks later the slowdown becomes irritating.

```
fragck.pl /tmp
34.8745364350046% non contiguous files, 1.89023744160285 average fragments.
```

After Kolivas's defrag script:

```
fragck.pl /tmp
34.8793526946973% non contiguous files, 1.89038192939363 average fragments.
```

Looks like it does nothing to reiserfs.

My conclusion: if you reformat once a month then reiserfs should be a good choice.

----------

## d4rkside

Sweet Script for reiserfs users! Thanks

@as what does the script that you posted do differently?

Thanks.

----------

## revertex

 *d4rkside wrote:*   

> Sweet Script for reiserfs users! Thanks
> 
> @as what does the script that you posted do differently?
> 
> Thanks.

 

d4rkside,

It handles filenames with spaces and other special characters, because filefrag is no longer run through the shell.
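
The difference can be seen in isolation with the list form of `open`, which hands each argument to the child process untouched. In this sketch the child is perl itself (`$^X`) echoing its argument, standing in for filefrag so that it runs anywhere; `echo_without_shell` is a hypothetical name.

```perl
use strict;
use warnings;

# Run a child process with the argument passed as-is (list form of open),
# so no shell ever parses it. The child is perl itself and simply prints
# its first argument back; "--" stops perl from reading it as a switch.
sub echo_without_shell {
    my ($arg) = @_;
    open(my $fh, "-|", $^X, "-e", 'print $ARGV[0]', "--", $arg)
        or die "cannot fork: $!";
    local $/;                 # slurp everything the child prints
    my $out = <$fh>;
    close $fh;
    return $out;
}

# One of the names that broke the first version of the script.
my $name = q{Cacio ;-).jpg};
print echo_without_shell($name) eq $name ? "intact\n" : "mangled\n";
```

With the one-string form (`open(FRAG, "filefrag $file |")`) the same name would be split at the `;` by the shell unless every special character is escaped by hand.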

----------

## dundas

weird, I guess mine looks fine? but my gentoo is more than 1 yr old

```
# ./frag.sh /
3.24979630988634% non contiguous files, 1.13087017572245 average fragments.
```

but thx for the new script

----------

## G2k

I didn't know that ReiserFS fragmented  :Crying or Very sad:  that makes me sad. Is there a file system that doesn't fragment available for Linux now? EXT3? JFS? Or is this something that Reiser4 promises to bring?

----------

## PabOu

All filesystems fragment over time...

I have read somewhere that Reiser4 has an auto-defrag if you leave the computer on without disk activity...

----------

## G2k

I don't know how much of a difference there is, but using as's code instead of _droop_'s, these were the results:

```
./fragchk.pl /
2.89009086202895% non contiguous files, 1.21397383030757 average fragments.
```

meh...I guess it holds out better than NTFS

----------

## johntash

Awesome.  I was just thinking this morning that I should look for a script like this to see how my music partition is doing.  I'm trying it out now,  thanks!  =]

----------

## Letharion

Does anyone have any idea why the defrag script increases the fragmentation slightly on my /home/ every time I run it? Reiserfs drive, seems very strange....

----------

## Kelvie

Just as an FYI:

When large (>100MB) files get overly fragmented, and you attempt to unlink (rm) them, reiserfs craps out and overflows its buffer, causing a kernel panic.  I had a few videos stored on a fairly full drive, and let me tell you, this caused way too many problems (including *gasp* reboots).  The solution was to upgrade to a 2.6.17 or newer kernel, or use this patch:

```
kelvie@valour linux $ cat reiserfs-fix-transaction-overflowing.patch
From: Alexander Zarochentzev <zam@namesys.com>
This patch fixes a bug in reiserfs truncate.  A transaction might overflow
when truncating long highly fragmented file.  The fix is to split
truncation into several transactions to avoid overflowing.
Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Cc; Charles McColgan <cm@chuck.net>
Cc: Alexander Zarochentsev <zam@namesys.com>
Cc: Hans Reiser <reiser@namesys.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
 fs/reiserfs/stree.c         |  208 +++++++++++-----------------------
 include/linux/reiserfs_fs.h |    5 
 2 files changed, 76 insertions(+), 137 deletions(-)
diff -puN fs/reiserfs/stree.c~reiserfs-fix-transaction-overflowing fs/reiserfs/stree.c
--- devel/fs/reiserfs/stree.c~reiserfs-fix-transaction-overflowing   2006-02-21 12:50:26.000000000 -0800
+++ devel-akpm/fs/reiserfs/stree.c   2006-02-21 12:50:26.000000000 -0800
@@ -981,6 +981,8 @@ static inline int prepare_for_direntry_i
    return M_CUT;
 }
 
+#define JOURNAL_FOR_FREE_BLOCK_AND_UPDATE_SD (2 * JOURNAL_PER_BALANCE_CNT + 1)
+
 /*  If the path points to a directory or direct item, calculate mode and the size cut, for balance.
     If the path points to an indirect item, remove some number of its unformatted nodes.
     In case of file truncate calculate whether this item must be deleted/truncated or last
@@ -1020,148 +1022,79 @@ static char prepare_for_delete_or_cut(st
 
    /* Case of an indirect item. */
    {
-      int n_unfm_number,   /* Number of the item unformatted nodes. */
-       n_counter, n_blk_size;
-      __le32 *p_n_unfm_pointer;   /* Pointer to the unformatted node number. */
-      __u32 tmp;
-      struct item_head s_ih;   /* Item header. */
-      char c_mode;   /* Returned mode of the balance. */
-      int need_research;
-
-      n_blk_size = p_s_sb->s_blocksize;
-
-      /* Search for the needed object indirect item until there are no unformatted nodes to be removed. */
-      do {
-         need_research = 0;
-         p_s_bh = PATH_PLAST_BUFFER(p_s_path);
-         /* Copy indirect item header to a temp variable. */
-         copy_item_head(&s_ih, PATH_PITEM_HEAD(p_s_path));
-         /* Calculate number of unformatted nodes in this item. */
-         n_unfm_number = I_UNFM_NUM(&s_ih);
-
-         RFALSE(!is_indirect_le_ih(&s_ih) || !n_unfm_number ||
-                pos_in_item(p_s_path) + 1 != n_unfm_number,
-                "PAP-5240: invalid item %h "
-                "n_unfm_number = %d *p_n_pos_in_item = %d",
-                &s_ih, n_unfm_number, pos_in_item(p_s_path));
-
-         /* Calculate balance mode and position in the item to remove unformatted nodes. */
-         if (n_new_file_length == max_reiserfs_offset(inode)) {   /* Case of delete. */
-            pos_in_item(p_s_path) = 0;
-            *p_n_cut_size = -(IH_SIZE + ih_item_len(&s_ih));
-            c_mode = M_DELETE;
-         } else {   /* Case of truncate. */
-            if (n_new_file_length < le_ih_k_offset(&s_ih)) {
-               pos_in_item(p_s_path) = 0;
-               *p_n_cut_size =
-                   -(IH_SIZE + ih_item_len(&s_ih));
-               c_mode = M_DELETE;   /* Delete this item. */
-            } else {
-               /* indirect item must be truncated starting from *p_n_pos_in_item-th position */
-               pos_in_item(p_s_path) =
-                   (n_new_file_length + n_blk_size -
-                    le_ih_k_offset(&s_ih)) >> p_s_sb->
-                   s_blocksize_bits;
-
-               RFALSE(pos_in_item(p_s_path) >
-                      n_unfm_number,
-                      "PAP-5250: invalid position in the item");
-
-               /* Either convert last unformatted node of indirect item to direct item or increase
-                  its free space.  */
-               if (pos_in_item(p_s_path) ==
-                   n_unfm_number) {
-                  *p_n_cut_size = 0;   /* Nothing to cut. */
-                  return M_CONVERT;   /* Maybe convert last unformatted node to the direct item. */
-               }
-               /* Calculate size to cut. */
-               *p_n_cut_size =
-                   -(ih_item_len(&s_ih) -
-                     pos_in_item(p_s_path) *
-                     UNFM_P_SIZE);
+       int blk_size = p_s_sb->s_blocksize;
+       struct item_head s_ih;
+       int need_re_search;
+       int delete = 0;
+       int result = M_CUT;
+       int pos = 0;
+
+       if ( n_new_file_length == max_reiserfs_offset (inode) ) {
+      /* prepare_for_delete_or_cut() is called by
+       * reiserfs_delete_item() */
+      n_new_file_length = 0;
+      delete = 1;
+       }
+
+       do {
+      need_re_search = 0;
+      *p_n_cut_size = 0;
+      p_s_bh = PATH_PLAST_BUFFER(p_s_path);
+      copy_item_head(&s_ih, PATH_PITEM_HEAD(p_s_path));
+      pos = I_UNFM_NUM(&s_ih);
 
-               c_mode = M_CUT;   /* Cut from this indirect item. */
-            }
-         }
+      while (le_ih_k_offset (&s_ih) + (pos - 1) * blk_size > n_new_file_length) {
+          __u32 *unfm, block;
 
-         RFALSE(n_unfm_number <= pos_in_item(p_s_path),
-                "PAP-5260: invalid position in the indirect item");
+          /* Each unformatted block deletion may involve one additional
+           * bitmap block into the transaction, thereby the initial
+           * journal space reservation might not be enough. */
+          if (!delete && (*p_n_cut_size) != 0 &&
+         reiserfs_transaction_free_space(th) < JOURNAL_FOR_FREE_BLOCK_AND_UPDATE_SD) {
+         break;
+          }
 
-         /* pointers to be cut */
-         n_unfm_number -= pos_in_item(p_s_path);
-         /* Set pointer to the last unformatted node pointer that is to be cut. */
-         p_n_unfm_pointer =
-             (__le32 *) B_I_PITEM(p_s_bh,
-                   &s_ih) + I_UNFM_NUM(&s_ih) -
-             1 - *p_n_removed;
-
-         /* We go through the unformatted nodes pointers of the indirect
-            item and look for the unformatted nodes in the cache. If we
-            found some of them we free it, zero corresponding indirect item
-            entry and log buffer containing that indirect item. For this we
-            need to prepare last path element for logging. If some
-            unformatted node has b_count > 1 we must not free this
-            unformatted node since it is in use. */
-         reiserfs_prepare_for_journal(p_s_sb, p_s_bh, 1);
-         // note: path could be changed, first line in for loop takes care
-         // of it
+          unfm = (__u32 *)B_I_PITEM(p_s_bh, &s_ih) + pos - 1;
+          block = get_block_num(unfm, 0);
 
-         for (n_counter = *p_n_removed;
-              n_counter < n_unfm_number;
-              n_counter++, p_n_unfm_pointer--) {
-
-            cond_resched();
-            if (item_moved(&s_ih, p_s_path)) {
-               need_research = 1;
-               break;
-            }
-            RFALSE(p_n_unfm_pointer <
-                   (__le32 *) B_I_PITEM(p_s_bh, &s_ih)
-                   || p_n_unfm_pointer >
-                   (__le32 *) B_I_PITEM(p_s_bh,
-                         &s_ih) +
-                   I_UNFM_NUM(&s_ih) - 1,
-                   "vs-5265: pointer out of range");
-
-            /* Hole, nothing to remove. */
-            if (!get_block_num(p_n_unfm_pointer, 0)) {
-               (*p_n_removed)++;
-               continue;
-            }
+          if (block != 0) {
+         reiserfs_prepare_for_journal(p_s_sb, p_s_bh, 1);
+         put_block_num(unfm, 0, 0);
+         journal_mark_dirty (th, p_s_sb, p_s_bh);
+         reiserfs_free_block(th, inode, block, 1);
+          }
 
-            (*p_n_removed)++;
+          cond_resched();
 
-            tmp = get_block_num(p_n_unfm_pointer, 0);
-            put_block_num(p_n_unfm_pointer, 0, 0);
-            journal_mark_dirty(th, p_s_sb, p_s_bh);
-            reiserfs_free_block(th, inode, tmp, 1);
-            if (item_moved(&s_ih, p_s_path)) {
-               need_research = 1;
-               break;
-            }
-         }
+          if (item_moved (&s_ih, p_s_path))  {
+         need_re_search = 1;
+         break;
+          }
 
-         /* a trick.  If the buffer has been logged, this
-          ** will do nothing.  If we've broken the loop without
-          ** logging it, it will restore the buffer
-          **
-          */
-         reiserfs_restore_prepared_buffer(p_s_sb, p_s_bh);
-
-         /* This loop can be optimized. */
-      } while ((*p_n_removed < n_unfm_number || need_research) &&
-          search_for_position_by_key(p_s_sb, p_s_item_key,
-                      p_s_path) ==
-          POSITION_FOUND);
-
-      RFALSE(*p_n_removed < n_unfm_number,
-             "PAP-5310: indirect item is not found");
-      RFALSE(item_moved(&s_ih, p_s_path),
-             "after while, comp failed, retry");
-
-      if (c_mode == M_CUT)
-         pos_in_item(p_s_path) *= UNFM_P_SIZE;
-      return c_mode;
+          pos --;
+          (*p_n_removed) ++;
+          (*p_n_cut_size) -= UNFM_P_SIZE;
+
+          if (pos == 0) {
+         (*p_n_cut_size) -= IH_SIZE;
+         result = M_DELETE;
+         break;
+          }
+      }
+      /* a trick.  If the buffer has been logged, this will do nothing.  If
+      ** we've broken the loop without logging it, it will restore the
+      ** buffer */
+      reiserfs_restore_prepared_buffer(p_s_sb, p_s_bh);
+       } while (need_re_search &&
+           search_for_position_by_key(p_s_sb, p_s_item_key, p_s_path) == POSITION_FOUND);
+       pos_in_item(p_s_path) = pos * UNFM_P_SIZE;
+
+       if (*p_n_cut_size == 0) {
+      /* Nothing were cut. maybe convert last unformatted node to the
+       * direct item? */
+      result = M_CONVERT;
+       }
+       return result;
    }
 }
 
@@ -1948,7 +1881,8 @@ int reiserfs_do_truncate(struct reiserfs
        ** sure the file is consistent before ending the current trans
        ** and starting a new one
        */
-      if (journal_transaction_should_end(th, th->t_blocks_allocated)) {
+      if (journal_transaction_should_end(th, 0) ||
+          reiserfs_transaction_free_space(th) <= JOURNAL_FOR_FREE_BLOCK_AND_UPDATE_SD) {
          int orig_len_alloc = th->t_blocks_allocated;
          decrement_counters_in_path(&s_search_path);
 
@@ -1962,7 +1896,7 @@ int reiserfs_do_truncate(struct reiserfs
          if (err)
             goto out;
          err = journal_begin(th, p_s_inode->i_sb,
-                   JOURNAL_PER_BALANCE_CNT * 6);
+                   JOURNAL_FOR_FREE_BLOCK_AND_UPDATE_SD + JOURNAL_PER_BALANCE_CNT * 4) ;
          if (err)
             goto out;
          reiserfs_update_inode_transaction(p_s_inode);
diff -puN include/linux/reiserfs_fs.h~reiserfs-fix-transaction-overflowing include/linux/reiserfs_fs.h
--- devel/include/linux/reiserfs_fs.h~reiserfs-fix-transaction-overflowing   2006-02-21 12:50:26.000000000 -0800
+++ devel-akpm/include/linux/reiserfs_fs.h   2006-02-21 12:50:26.000000000 -0800
@@ -1704,6 +1704,11 @@ static inline int reiserfs_transaction_r
    return 0;
 }
 
+static inline int reiserfs_transaction_free_space(struct reiserfs_transaction_handle *th)
+{
+   return th->t_blocks_allocated - th->t_blocks_logged;
+}
+
 int reiserfs_async_progress_wait(struct super_block *s);
 
 struct reiserfs_transaction_handle *reiserfs_persistent_transaction(struct
_
```

I've applied this patch to all three of my boxes (2.6.16-reiser4 opteron, 2.6.16-gentoo-sources athlon, 2.6.15-suspend2 pentium-m), and it fixed the issues.
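
To see quickly whether a box is already on a kernel new enough to not need the patch, something like the following sketch could be used. It only compares version numbers against the 2.6.17 mentioned above; it cannot tell whether an older kernel has been patched. `kernel_at_least_2_6_17` is a hypothetical helper name.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: true if a kernel release string such as
# "2.6.16-gentoo-r9" is at least 2.6.17. Unparseable strings count as false.
sub kernel_at_least_2_6_17 {
    my ($release) = @_;
    my @have = $release =~ /^(\d+)\.(\d+)(?:\.(\d+))?/ or return 0;
    $have[2] //= 0;                    # "3.0" is treated as 3.0.0
    my @want = (2, 6, 17);
    for my $i (0 .. 2) {
        return 1 if $have[$i] > $want[$i];
        return 0 if $have[$i] < $want[$i];
    }
    return 1;                          # exactly 2.6.17
}

# The third field returned by POSIX::uname is the release, as in "uname -r".
my $release = (POSIX::uname())[2];
print "$release: ",
    (kernel_at_least_2_6_17($release) ? "2.6.17 or newer" : "older than 2.6.17"),
    "\n";
```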

Sadly, reiserfs is still the best filesystem out there, despite stupid bugs like these.  Just have to wait until reiser4 comes with a grow utility, I guess.

Kelvie

----------

## G2k

 *Kelvie wrote:*   

> Sadly, reiserfs is still the best filesystem out there

 I don't second that although I use reiserfs myself. Ext3 and XFS are good competitors depending on what you have to do with your partition.

----------

## c07

 *sirtalon42 wrote:*   

> I ran your program on my /usr/portage directory and got results that don't seem possible (<1% non contiguous files):
> 
> ```
> # ./fragchk.pl /usr/portage
> ...
> ```

 

Use a more verbose script like the following one (based on as' script):

```
#! /usr/bin/perl -w
use strict;

@ARGV >= 1 && @ARGV <= 2 or die "usage: $0 <dir> [<block size in KB>]";
$/= "\0";
my ($files, $blocks, $fragments, $frag, $fragblocks, $multi, $empty)= (0) x 7;
my $dir= shift;
my $blocksize= (shift || 4) + 0;
print qq|scanning "$dir", using block size $blocksize KB ...\n|;

open my $find, "-|", "find", $dir, qw"-xdev -type f -print0";
while ( my $file= <$find> ) {
  chomp $file;  # drop the trailing \0 record separator
  { open my $fh, "-|", "filefrag", $file; $_= <$fh> }
  /:\s+(\d+) extents? found/ or (print qq|"$_"?\n|), next;
  my $n= $1 + 0;
  { open my $fh, "-|", "ls", "-sk", $file; $_= <$fh> }
  /^(\d+)\s/ or (print qq|"$_" (ls)?\n|), next;
  my $s= $1 / $blocksize;
  ++$files;
  $s or ++$empty, next;
  $blocks += $s;
  $fragments += $n;
  ++$frag, $fragblocks += $s if $n > 1;
  ++$multi if $s > 1;
}

my $single= $files - $multi - $empty;
my $nonfrag= $files - $frag - $empty;
if ( ! $files ) { print "no files\n" }
else {
  printf "$files files, $frag (%.3f %%) fragmented\n", 100 * $frag / $files;
  if ( ! $multi ) { print "no multi-block files\n" }
  else {
    printf "$multi multi-block files, %.3f %% fragmented\n",
      100 * $frag / $multi;
  }
  print "$blocks blocks, $fragments fragments, $empty empty files\n";
  if ( $fragments ) {
    printf "average %.3f fragments per file, %.3f blocks per fragment,\n",
      $fragments / $files, $blocks / $fragments;
    if ( $multi ) {
      printf "%.3f fragments per multi-block file, %.3f blocks each,\n",
        ($fragments - $single) / $multi,
        ($blocks - $single) / ($fragments - $single);
      if ( $frag ) {
        printf "%.3f fragments per fragmented file, %.3f blocks each\n",
        ($fragments - $nonfrag) / $frag,
        $fragblocks / ($fragments - $nonfrag);
} } } }
```

Assumes 4 KB block size, but you can override it by supplying a second argument.

Performance for the portage tree doesn't depend much on internal file fragmentation, but on directory fragmentation (files scattered all around the filesystem).

To make fragmentation visible, you can use a script like this:

```
#! /usr/bin/perl -w
use strict;
require "linux/fs.ph";  # Perl header generated by h2ph; provides FIGETBSZ and FIBMAP

@ARGV == 1 or die "usage: $0 <dir>";
$/= "\0";
my (%blocks, $last);
my $dir= shift;
print qq|scanning "$dir" ...\n|;

open my $find, "-|", "find", $dir, qw"-xdev -type f -print0";
while ( my $file= <$find> ) {
  chomp $file;  # drop the trailing \0 record separator
  open my $fh, "<", $file or die $!;
  my $x; { no warnings; ioctl $fh, &FIGETBSZ, $x }
  $x= unpack "L!", $x;
  $x or next;
  my ($blocks, $frag, $last)= ((-s $fh) / $x, 1);
  for ( my $i= 0; $i < $blocks; ++$i ) {
    $x= pack "L!", $i;
    ioctl $fh, &FIBMAP, $x;
    $x= unpack "L!", $x;
    ++$frag if defined $last && $x != $last + 1;
    $last= $x;
    $blocks{$x}= exists $blocks{$x} ? "+" :
      $frag == 1 ? "*" : $frag > 9 ? "#" : $frag;
  }
}

for my $k ( sort { $a <=> $b } keys %blocks ) {
  if ( ! defined $last || $k - $last > 128 ) {
    printf "\n0x%.8x: ", $k;
    print " " x ($k & 0x3f);
    $last= $k;
  }
  else {
    while ( 1 ) {
      ++$last & 0x3f or print "\n", " " x 12;
      last if $last == $k;
      print ".";
    }
  }
  print $blocks{$k};
}
print "\n";
```

Takes a directory (or single file) as argument and prints a block map of all contained files. "*" marks first fragments, "2" to "9" the 2nd to 9th fragments, "#" further fragments, "+" blocks used more than once (this should only apply to block #0, which is returned for any tails that do not have a real block associated), and "." blocks not used by these files. Note that "." blocks may still be used by ext2's indirect blocks (which are supposed to fragment larger files; filefrag accounts for this), directory lists, inodes, ... The map skips to the next used block if 128 or more consecutive blocks aren't used, and prints the number of that block.
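
The fragment counting in the inner loop boils down to one rule: a new fragment starts whenever a block is not the direct successor of the previous one. Here it is as a small stand-alone sketch on a hand-written block list, with no ioctl involved; `count_fragments` is a hypothetical name.

```perl
use strict;
use warnings;

# Count fragments in a file given its physical block numbers in logical
# order. An empty list yields 0, a single block yields 1.
sub count_fragments {
    my @blocks = @_;
    return 0 unless @blocks;
    my ($frag, $last) = (1, undef);
    for my $blk (@blocks) {
        ++$frag if defined $last && $blk != $last + 1;
        $last = $blk;
    }
    return $frag;
}

# Blocks 100-102, then a jump to 500-501, then a jump back to 90.
print count_fragments(100, 101, 102, 500, 501, 90), "\n";   # prints 3
```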

Probably it should do some error checking. Use it at your own risk.

----------

## dalek

I keep getting this:

```
root@smoker / # /root/fragck.pl /
/root/fragck.pl: line 5: use: command not found
/root/fragck.pl: line 8: my: command not found
/root/fragck.pl: line 10: my: command not found
/root/fragck.pl: line 12: my: command not found
/root/fragck.pl: line 15: syntax error near unexpected token `FILES,'
/root/fragck.pl: line 15: `open (FILES, "find " . $ARGV[0] . " -xdev -type f |");'
root@smoker / # 
```

What package has these commands so I can install them?

I read that XFS is really fast and a really good file system.  Anybody here use it a lot??

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## BitJam

Dalek, it looks like you may have copied the script incorrectly.

It looks like Bash is trying to interpret the script instead of Perl.  Also, the "use" error is reported to be on line 5 while in the actual script that line is line 4.   The very first line must be:

```
#!/usr/bin/perl -w 
```

A quick and dirty fix would be to run the Perl program explicitly:

```
# perl -w /root/fragck.pl / 
```

If this works then that would tend to confirm there is a problem with the first line in the script.
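
A quick way to check that first line without opening an editor is sketched below. `has_perl_shebang` is a hypothetical helper; it only looks at the first line of the file and does not verify that the interpreter path exists.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file starts with "#!" at the very first
# byte and that line mentions perl. A leading blank line or a leading
# space makes this fail.
sub has_perl_shebang {
    my ($path) = @_;
    open(my $fh, "<", $path) or die "cannot open $path: $!";
    my $first = <$fh>;
    close $fh;
    return defined $first && $first =~ m{^#!.*\bperl\b} ? 1 : 0;
}

# Usage: perl checkshebang.pl /root/fragck.pl
for my $script (@ARGV) {
    printf "%s: %s\n", $script,
        has_perl_shebang($script) ? "shebang OK" : "first line is not a perl shebang";
}
```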

----------

## dalek

That was it.  There was a blank line for the first line.  Then there was a space in what was supposed to be the first line as well.  I had that thing badly confused.

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## dalek

Well, I have been running mine for a good while and this was the one that had the highest fragments:

```
root@smoker / # /root/fragck.pl /
6.23781676413255% non contiguous files, 1.16560503036536 average fragments.
root@smoker / #
```

I have been using it for a couple of years, so that is not too bad really.  What do you folks think??

Oh, here is how mine is set up:

```
root@smoker / # df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda6             14647740   2951284  11696456  21% /
udev                    517288       196    517092   1% /dev
/dev/hda1               146612     39832    106780  28% /boot
/dev/hda7              9765136   3496720   6268416  36% /home
/dev/hda8             14647740   6256632   8391108  43% /usr
/dev/hda9              4882532   2134980   2747552  44% /usr/portage
/dev/hda10            35462336  14888940  20573396  42% /mnt/data
none                    517288         0    517288   0% /dev/shm
root@smoker / # 
```

Thoughts??

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## BitJam

One quick check would be to run: 

```
# time du -sh /usr/portage
```

I used the procedure outlined in this post and the time dropped from over 3 minutes to 20 seconds, almost a factor of 10!  My system was brand new two months ago.  It has also made the rsync portion of "emerge --sync" much faster, but I don't have actual numbers to report.

If you want to try this make sure you move distfiles out of /usr/portage and make /usr/portage/distfiles a symlink to the real location.

----------

## dalek

I'm not sure if this is a bug or what.  I get a lot of these on one of my partitions:

```
statfs: No such file or directory

sh: [1].htm: command not found

Use of uninitialized value in pattern match (m//) at /root/fragck.pl line 32.

Use of uninitialized value in concatenation (.) or string at /root/fragck.pl line 41.

 : not understand for /mnt/data/Teresa/Documents\ and\ Settings2/Tee/Local\ Settings/Temporary\ Internet\ Files/Content.IE5/C18XM70P/keywords;kw=computer+keyboards+with+lights;cat=58058;tcat=51148;items=1743;sz=440x198;tile=5;ord=1146016301420;[1].htm

.

```

I notice that it has a lot of \ in the paths.  Could that have something to do with it?  I back up my wife's laptop to this partition with samba.  reiserfsck says it is fine though.  Any ideas?

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 
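A likely cause, judging from the `sh: [1].htm: command not found` line: the script escapes a fixed list of characters before handing the file name to the shell, and `;` is not in that list, so the shell cuts the command at each `;`. The backslashes are only the script's own escaping. One way around this, sketched below and not taken from droop's script, is Perl's list form of `open`, which starts the program directly with no shell in between. The helper names `first_line_of` and `extents_in` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a command without going through /bin/sh and return its first line
# of output (undef if there is none). The arguments are passed as a list,
# so ';', '[', spaces and similar characters never reach a shell.
sub first_line_of {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "cannot run $cmd[0]: $!";
    my $line = <$fh>;
    close($fh);
    return $line;
}

# Extract the extent count from one line of filefrag output, using the
# same pattern as the script in this thread. Returns undef on no match.
sub extents_in {
    my ($line) = @_;
    return undef unless defined $line;
    return $line =~ /:\s+(\d+) extents? found/ ? $1 : undef;
}

# Example use, assuming filefrag is installed and $file holds a path:
#   my $n = extents_in(first_line_of('filefrag', $file));
```

With this approach the whole block of `s/.../.../g` escaping lines would no longer be needed, since nothing has to be quoted for a shell.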

----------

## Gentree

 *Quote:*   

> Sadly, reiserfs is still the best filesystem out there, 

 

No, sadly reiserfs sucks. The good news is it's not the best out there.   :Wink: 

R4 is streets ahead of reiserfs, and if you really can't stand the idea of a fs that is not yet included in the mainline kernel, use ext3 (with a few -b tweaks etc. if you feel like it).

I don't see why having growfs should be the determining factor in the choice of fs. 

Learn to manage your disk space better with several smaller partitions and you'll find backing up, and life in general, a lot easier.

To the guy who has everything on a single 250G   :Rolling Eyes:  try using tar to archive portage, clean out the files and untar it back. This will defrag that bit at least.

 :Cool: 

----------

## dalek

Any idea what version of gentoo-sources will have reiserfs4 in it?  I have read it is a good thing.

Just curious.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Gentree

No version of the standard gentoo, vanilla or other sources supports this yet, although it is getting active attention.

You will need to either patch your kernel sources (very simple: go get the reiser4 patch for your kernel version, apply the patch, rebuild the kernel. Really a doddle to do.) or use one of the many patched kernel sources in the unsupported forum that do, e.g. no-sources, beyond, viper... Many of these are much more responsive as well and well worth checking out.

 :Cool: 

----------

## dalek

I tried to patch a gentoo kernel once and I kept getting errors because it was already patched with the Gentoo stuff and things were not where they were supposed to be, or something.  Would that still be an issue, or should I just wait till it comes out and is stable?  

I looked everywhere to see if I could find a date it would be released and I found nothing.  I looked here:

http://www.namesys.com/

Any place else that may have a clue??

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Gentree

You should find the appropriate patch file here: ftp://ftp.namesys.com/pub/reiser4-for-2.6

If you are a bit unsure, just go for something like no-sources and enable support through make menuconfig.

 :Cool: 

----------

## Gentree

getting a bit OT here so back On T:

just ran droop's script on two of my R4 partitions. 

```
bash-3.1#fragck.pl /tmpd

28.0939116593713% non contiguous files, 1.60206923995225 average fragments.

bash-3.1#fragck.pl /usr/portage 

6.75392192630427% non contiguous files, 1.28991345413272 average fragments.

```

ran beautifully, surprisingly fast. 

Nice tool. Thanks a lot.  :Cool: 

That is one reason I like to keep my partitions small. Every week or so I clone my root fs as a backup. If I need to defrag, I clone with cp -ax and reboot to the clone. System backup and defrag take about 15 minutes, 12 of which are disk activity.

----------

## dalek

Well, how is this?

```
root@smoker / # /root/fragck.pl /

16.4263677754833% non contiguous files, 1.46472644490334 average fragments.

root@smoker / # /root/fragck.pl /boot/

4% non contiguous files, 1.12 average fragments.

root@smoker / # /root/fragck.pl /home/

8.08759345705719% non contiguous files, 1.70317659114198 average fragments.

root@smoker / # /root/fragck.pl /usr/

1.93224184188215% non contiguous files, 1.06032884115818 average fragments.

root@smoker / # /root/fragck.pl /usr/portage/

0.199848115432271% non contiguous files, 1.00777808865262 average fragments.

root@smoker / #   
```

Looks like the / partition is the worst don't it?  Or does that sort of include all the others too?  May be time for a backup and redoing my partitions again.  That will fix it.    :Shocked: 

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 
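On the "does / include the others" question: the script calls `find` with `-xdev`, so a run on `/` should stay on the root partition and not descend into `/boot`, `/home`, `/usr` and the other mount points listed in the `df` output earlier. The same idea can be sketched in pure Perl with File::Find by comparing device numbers. `files_on_same_fs` is a hypothetical helper, not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Collect regular files under $top without crossing into other mounted
# filesystems, roughly what "find $top -xdev -type f" does.
sub files_on_same_fs {
    my ($top) = @_;
    my $top_dev = (lstat($top))[0];          # device number of the start dir
    die "cannot stat $top: $!" unless defined $top_dev;
    my @found;
    find({
        no_chdir => 1,
        wanted   => sub {
            my ($dev) = lstat($File::Find::name);
            return unless defined $dev;
            if ($dev != $top_dev) {          # a mount point: do not descend
                $File::Find::prune = 1;
                return;
            }
            # "_" reuses the lstat result; symlinks are not counted as files
            push @found, $File::Find::name if -f _;
        },
    }, $top);
    return @found;
}

# Example: print the number of files on the filesystem holding $ARGV[0].
if (@ARGV) {
    my @files = files_on_same_fs($ARGV[0]);
    print scalar(@files), " regular files\n";
}
```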

----------

## nico_calais

Here's my /home partition after two weeks of use :

```
7.87931034482759% non contiguous files, 24.4593103448276 average fragments.
```

And after running the defrag tool:

```
3.271913208197% non contiguous files, 2.54537627001894 average fragments.
```

Thanks for your tool. And the defrag tool is very nice too   :Wink: 

----------

## gerardo

Defrag increases fragmentation for me.

```
/mnt/data # ~/defrag.sh /mnt/data/

Creating list of files...

156 files will be reordered

 0 files left                                                               

Succeeded

/mnt/data # ~/fragcheck.pl /mnt/data/

scanning "/mnt/data/", using block size 4 KB ...

155 files, 60 (38.710 %) fragmented

139 multi-block files, 43.165 % fragmented

3740532.75 blocks, 7376 fragments, 0 empty files

average 47.587 fragments per file, 507.122 blocks per fragment,

52.950 fragments per multi-block file, 508.222 blocks each,

121.350 fragments per fragmented file, 509.856 blocks each

/mnt/data # ~/defrag.sh /mnt/data/

Creating list of files...

156 files will be reordered

 0 files left                                                               

Succeeded

/mnt/data # ~/fragcheck.pl /mnt/data/

scanning "/mnt/data/", using block size 4 KB ...

155 files, 63 (40.645 %) fragmented

139 multi-block files, 45.324 % fragmented

3740532.75 blocks, 6444 fragments, 0 empty files

average 41.574 fragments per file, 580.468 blocks per fragment,

46.245 fragments per multi-block file, 581.910 blocks each,

100.825 fragments per fragmented file, 584.524 blocks each

```

This happens on a drive formatted with reiserfs 3.6 where there are a lot of files > 100 Mbytes.

Disk usage is only 48%:

```
/dev/md/2             31261248  14995300  16265948  48% /mnt/data
```

Next, I moved the largest files (700 MB and more) to other partitions.

Now usage is 39%:

```
/dev/md/2             31261248  12126788  19134460  39% /mnt/data
```

After another defrag, it is even worse:

```
/mnt/data # ~/defrag.sh /mnt/data/

Creating list of files...

153 files will be reordered

 0 files left                                                              

Succeeded

/mnt/data # ~/fragcheck.pl /mnt/data/

scanning "/mnt/data/", using block size 4 KB ...

152 files, 69 (45.395 %) fragmented

146 multi-block files, 47.260 % fragmented

3023422.5 blocks, 3035 fragments, 0 empty files

average 19.967 fragments per file, 996.185 blocks per fragment,

20.747 fragments per multi-block file, 998.157 blocks each,

42.783 fragments per fragmented file, 1015.926 blocks each

```

Any clues why and how to solve it?

----------

## buddabrod

A little late, I know, but where is your problem? It's much better: look at the number of fragments per file and at the size of the fragments.
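To put numbers on that, the averages in gerardo's output can be recomputed from the totals it prints. A small sketch, with the figures copied from the three runs above; the labels and helper names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Totals copied from the three fragcheck.pl runs quoted in this thread.
my @runs = (
    { label => 'after first defrag',  files => 155, fragments => 7376, blocks => 3740532.75 },
    { label => 'after second defrag', files => 155, fragments => 6444, blocks => 3740532.75 },
    { label => 'after moving files',  files => 152, fragments => 3035, blocks => 3023422.5  },
);

# Average number of fragments per file.
sub frags_per_file {
    my ($r) = @_;
    return $r->{fragments} / $r->{files};
}

# Average fragment size in blocks.
sub blocks_per_frag {
    my ($r) = @_;
    return $r->{blocks} / $r->{fragments};
}

for my $r (@runs) {
    printf "%-20s %7.3f fragments/file, %9.3f blocks/fragment\n",
        $r->{label}, frags_per_file($r), blocks_per_frag($r);
}
```

The share of fragmented files creeps up (38.7 % to 45.4 %), but the total fragment count drops from 7376 to 3035 and the average fragment roughly doubles in size, which is the improvement being pointed out.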

----------

## user11

We can use xargs to improve performance.

The script below:

1. is based on as's script (uses -print0),

2. has much better performance (thanks to xargs),

3. reports a little more detail if the -v option is specified,

4. rounds values to a reasonable precision.

Note: xargs is a really safe and powerful tool  :Smile: 

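```
#!/usr/bin/perl -w

# This script measures file fragmentation on a filesystem.
# Usage: fragck.pl [-v] directory

use strict;

my $files     = 0;   # number of files examined
my $fragments = 0;   # total number of fragments (extents)
my $fragfiles = 0;   # number of files with more than one fragment

# Optional -v flag: also print the raw totals.
my $verbose = 0;
if (@ARGV && $ARGV[0] eq '-v') { shift @ARGV; $verbose = 1; }
die "Usage: $0 [-v] directory\n" unless @ARGV;

# Truncate $v to the precision given by divisor $p
# (1 for '123', 10 for '123.4', 100 for '123.45').
# Defined before its first use, without a prototype, so -w has nothing to warn about.
sub round {
   my ($v, $p) = @_;
   return int($v * $p) / $p;
}

# -print0 and xargs -0 pass file names NUL-separated, so spaces and shell
# metacharacters in file names are safe. The directory argument itself is
# still interpolated into a shell command and must not contain such characters.
open (REPORT, "find " . $ARGV[0] . " -xdev -type f -print0 | xargs -0 filefrag |")
        or die "Cannot run find: $!\n";

while (defined (my $res = <REPORT>)) {
        if ($res =~ m/.*:\s+(\d+) extents? found$/) {
                my $fragment = $1;
                $fragments += $fragment;
                $fragfiles++ if $fragment > 1;
                $files++;
        } else {
                print "Failed to parse: $res";
        }
}
close (REPORT);

if ($verbose) {
   print "Total files:      $files\n";
   print "Fragmented files: $fragfiles\n";
   print "Fragments:        $fragments\n";
}

# Avoid dividing by zero when no file could be measured.
die "No files could be measured.\n" unless $files;

print ( round($fragfiles / $files * 100, 10) . "% non contiguous files, " . round($fragments / $files, 10) . " average fragments.\n");
```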

----------

## Naib

 *Quote:*   

> 16:18:16 < greybot> xargs is a broken tool if you do not use the -0 option. Use ''find -exec'' or ''for file in *'' instead if at all possible. Two xargs 'bugs' in one: xargs rm <<< "Don't cry.mp3"

 

DON'T use xargs...

----------

## user11

 *Quote:*   

> xargs is a broken tool if you do not use the -0 option

 

Hmmm... I DO use the -0 option in the script above. AFAIK (and as far as I have tested) it is really safe.

 *Quote:*   

> DON'T use xargs...

 

Any other objection?

----------

## Naib

There are better ways to do it.

That is what the -exec option to find is for.

----------

## user11

But -exec is often much slower. The script is an example.
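The difference comes from process count: `find ... -exec filefrag {} \;` starts one filefrag per file, while xargs packs many names into each invocation (as does the `-exec ... {} +` form of find). The batching idea can be sketched in Perl as below. This is an illustration, not what the script above does, and `batches` and `run_batched` are hypothetical names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list of file names into groups of at most $size names, so that
# one process can be started per group instead of one per file.
sub batches {
    my ($size, @names) = @_;
    my @groups;
    push @groups, [ splice(@names, 0, $size) ] while @names;
    return @groups;
}

# Run a command once per batch and return all output lines.
# List-form open is used, so no shell sees the file names.
sub run_batched {
    my ($size, $cmd, @names) = @_;
    my @lines;
    for my $group (batches($size, @names)) {
        open(my $fh, '-|', $cmd, @$group) or die "cannot run $cmd: $!";
        push @lines, <$fh>;
        close($fh);
    }
    return @lines;
}

# Example use, assuming filefrag is installed and @files holds the names:
#   my @report = run_batched(500, 'filefrag', @files);
```

With 10,000 files and a batch size of 500 this starts 20 processes instead of 10,000. The batch size of 500 is an arbitrary choice for the example.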

----------

## honeymak

Here's what I have... any comments?

```

hm01 ~ # mount

/dev/sde4 on / type jfs (rw,noatime)

proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)

udev on /dev type tmpfs (rw,nosuid,relatime,size=10240k,mode=755)

devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)

/dev/sde3 on /usr type reiserfs (rw,noatime)

/dev/md1 on /home type jfs (rw,noatime)

/home/root_var on /var type none (rw,bind)

shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)

usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)

hm01 ~ # df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/sde4             555G  616M  555G   1% /

udev                   10M  224K  9.8M   3% /dev

/dev/sde3             373G  6.4G  367G   2% /usr

/dev/md1              3.7T   28G  3.7T   1% /home

shm                   995M     0  995M   0% /dev/shm

hm01 ~ # ./fragck.pl /usr

0% non contiguous files, 1 average fragments.

hm01 ~ # ./fragck.pl /   

100% non contiguous files, 3 average fragments.

hm01 ~ # ./fragck.pl /home

0% non contiguous files, 0 average fragments.

hm01 ~ # 

```

----------

