# ldap database recovery - problem

## drrrl

Hi,

in the past I have noticed several problems with ldap databases and strange slapd behaviour (total hangup).

Finally I found a workaround that really works:

- restart slapd once per week

- run db recovery in the ldap start script:

```
start() {
        ebegin "Starting ldap-server"
        # db recovery
        cd /var/lib/openldap-data && /usr/bin/db4.1_recover -c -v
        eval start-stop-daemon --start --quiet --pidfile /var/run/openldap/slapd.pid --exec /usr/lib/openldap/slapd -- -u ldap -g ldap "${OPTS}"
        eend $?
}
```

(a single database recovery worked for about one month, then a problem appeared; that's why I run it weekly)

Anyway... It worked really well, and since I implemented this solution slapd has never hung again.
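One caveat with the `start()` function above: the exit status of the recover command is never checked, so slapd is started even when recovery fails. A minimal sketch of a guarded variant follows, assuming the same paths as in the script. The names `recover_db`, `start_guarded`, `RECOVER_BIN` and `DB_DIR` are hypothetical and chosen only for illustration; `-h` is db_recover's option for naming the environment directory.

```shell
#!/bin/sh
# Sketch only: refuse to continue when database recovery fails.
# RECOVER_BIN and DB_DIR are assumptions; adjust to the installed Berkeley DB slot.
RECOVER_BIN="${RECOVER_BIN:-/usr/bin/db4.1_recover}"
DB_DIR="${DB_DIR:-/var/lib/openldap-data}"

recover_db() {
    # Same flags as the original script (-c -v), with the environment
    # directory passed via -h instead of changing directory first.
    "$RECOVER_BIN" -c -v -h "$DB_DIR"
}

start_guarded() {
    if ! recover_db; then
        echo "database recovery failed in $DB_DIR, not starting slapd" >&2
        return 1
    fi
    # Placeholder: this is the point where slapd would be started.
    echo "recovery ok"
}
```

In the init script itself, `start()` could call something like `recover_db || { eend 1; return 1; }` before the `start-stop-daemon` line, so that a failed recovery is reported instead of being followed by `[ ok ]`.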

Meanwhile I installed Berkeley DB 4.2:

```
* sys-libs/db
     Available versions:  1.85-r1 1.85-r2 3.2.9-r7 3.2.9-r10 4.0.14-r2 4.0.14-r3 4.1.25_p1-r3 4.1.25_p1-r4 ~4.2.52_p1 4.2.52_p2 [M]4.3.27
     Installed:           1.85-r2 4.1.25_p1-r4 4.2.52_p2
     Homepage:            http://www.sleepycat.com/
     Description:         Berkeley DB
```

and during slapd restart I noticed the following errors:

```
 * Starting ldap-server ...
db_recover: unable to join the environment
db_recover: Ignoring log file: log.0000000005: unsupported log version 8
db_recover: Invalid log file: log.0000000005: Invalid argument
db_recover: PANIC: Invalid argument
db_recover: PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
db_recover: fatal region error detected; run recovery
db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database reco  [ ok ]
```

which is not a big surprise, as log.0000000005 really is in a newer version:

```
# pwd
/var/lib/openldap-data
# file log.000000000*
log.0000000001: Berkeley DB (Log, version 7, native byte-order)
log.0000000002: Berkeley DB (Log, version 7, native byte-order)
log.0000000003: Berkeley DB (Log, version 7, native byte-order)
log.0000000004: Berkeley DB (Log, version 7, native byte-order)
log.0000000005: Berkeley DB (Log, version 8, native byte-order)
```
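The check done by eye on the `file` output above can also be scripted. The following is a rough sketch, assuming the output format shown in that listing; `log_versions` and `has_mixed_log_versions` are hypothetical helper names, not part of any Berkeley DB tool.

```shell
#!/bin/sh
# Sketch only: detect an environment whose log files were written by
# more than one Berkeley DB log format version.

# Reads `file`-style lines on stdin and prints each distinct log
# version once, in ascending numeric order.
log_versions() {
    sed -n 's/.*(Log, version \([0-9][0-9]*\),.*/\1/p' | sort -un
}

# Succeeds (exit status 0) when more than one log version is present.
has_mixed_log_versions() {
    log_versions | awk 'END { exit (NR > 1) ? 0 : 1 }'
}
```

It could be used as `file /var/lib/openldap-data/log.* | has_mixed_log_versions && echo "mixed log versions"`, for example before deciding which recover binary to run.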

I remembered that db4.1_recover is used in the ldap start script, so I changed it to db4.2_recover. And... I got other errors:

```
# /etc/init.d/slapd start
 * Starting ldap-server ...
db_recover: Finding last valid log LSN: file: 5 offset 3460
db_recover: Ignoring log file: log.0000000004: unreadable log version 7
db_recover: Ignoring log file: log.0000000004: unreadable log version 7
db_recover: Ignoring log file: log.0000000003: unreadable log version 7
db_recover: Ignoring log file: log.0000000002: unreadable log version 7
db_recover: Ignoring log file: log.0000000001: unreadable log version 7
db_recover: Recovery starting from [5][28]
db_recover: Ignoring log file: log.0000000004: unreadable log version 7
db_recover: DB_ENV->log_flush: LSN of 5/5037379 past current end-of-log of 5/3460
db_recover: Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
db_recover: uid.bdb: unable to flush page: 0
db_recover: txn_checkpoint: failed to flush the buffer cache Invalid argument
db_recover: PANIC: Invalid argument
db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database reco  [ ok ]
```

and after a few restarts over the last months, the error message is now as follows:

```
 * Starting ldap-server ...
db_recover: Finding last valid log LSN: file: 5 offset 431890
db_recover: Ignoring log file: log.0000000004: unreadable log version 7
db_recover: Ignoring log file: log.0000000004: unreadable log version 7
db_recover: Ignoring log file: log.0000000003: unreadable log version 7
db_recover: Ignoring log file: log.0000000002: unreadable log version 7
db_recover: Ignoring log file: log.0000000001: unreadable log version 7
db_recover: Recovery starting from [5][28]
db_recover: Ignoring log file: log.0000000004: unreadable log version 7
db_recover: Log sequence error: page LSN 5 312547; previous LSN 5 4477179
db_recover: Recovery function for LSN 5 5609 failed on forward pass
db_recover: PANIC: Invalid argument
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database recovery
                                               [ ok ]
```

So... for now the LDAP server seems to run properly. But I'm getting more and more worried, and I suspect the system will collapse soon. Could you please advise me how to run a proper database recovery?

----------

## Pol

I had problems with openldap databases when I used bdb; now I use ldbm and have no problems  :Wink: 

Hope it helps...

----------

## drrrl

Hmm...

how did you perform the transition from one database to another?

The question is whether I could use another db across the whole system... for instance with Cyrus?

That could be hard and dangerous work to do - or am I too pessimistic?

----------

## Pol

I didn't make any transition. I had problems with bdb while I was learning openldap, and I saw many example configuration files on the net using ldbm... so I changed bdb to ldbm and I didn't get any more problems  :Wink: 

I think the best way to do a transition is to make a dump of your ldap directory and then re-import it into an ldbm database...
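A rough sketch of such a dump and reload, using OpenLDAP's `slapcat` and `slapadd` tools, could look like the following. The LDIF path and the helper names (`dump_ldap`, `reload_ldap`, `RUN`) are assumptions made for illustration. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: dump the directory to LDIF, then reload it after the
# backend has been changed in slapd.conf. Paths are assumptions.
LDIF="${LDIF:-/root/ldap-backup.ldif}"
DB_DIR="${DB_DIR:-/var/lib/openldap-data}"
RUN="${RUN:-echo}"   # prints the commands by default; set RUN= to execute them

# Step 1: stop slapd and export every entry to an LDIF file.
dump_ldap() {
    $RUN /etc/init.d/slapd stop &&
    $RUN slapcat -l "$LDIF"
}

# Step 2 (manual): change the "database" line in slapd.conf and move
# the old database files out of DB_DIR.

# Step 3: import the LDIF into the new backend and restart slapd.
# slapadd run as root creates root-owned files, so hand them back to
# the ldap user that slapd runs as.
reload_ldap() {
    $RUN slapadd -l "$LDIF" &&
    $RUN chown -R ldap:ldap "$DB_DIR" &&
    $RUN /etc/init.d/slapd start
}
```

Keeping the LDIF file around afterwards also gives a backend-independent backup of the directory contents.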

Yes, you can use another database. It will not affect the rest of the system, because other software doesn't use the database files directly; it asks the ldap server, not the files  :Wink: 

I hope I'm understandable :p

Cya

----------

## drrrl

Yeah, sure  :Smile: , but I wonder what will happen for instance with cyrus when I uninstall bdb and install ldbm? It uses several files in bdb format like /var/imap/mailboxes.db. 

It seems that I should find any software that uses bdb, dump the files, change databases and rebuild db files, right?

----------

## Pol

 *drrrl wrote:*   

> Yeah, sure , but I wonder what will happen for instance with cyrus when I uninstall bdb and install ldbm? It uses several files in bdb format like /var/imap/mailboxes.db. 
> 
> It seems that I should find any software that uses bdb, dump the files, change databases and rebuild db files, right?

 

Yes, I think that's the best thing to do... I'm not an expert, but I think that's it.

----------

## drrrl

Seems I need a long holiday  :Wink: 

Thanks for the help.

----------

## AchilleTalon

Maybe it is far too late, but anyway, since I ran into this thread while looking for a solution to the same problem with Sleepycat Berkeley DB: I found that you simply have to upgrade your DB files to the 4.2 format.

You can do this as follows:

```
# or any other directory where you put your database
cd /var/lib/openldap-data
# replace all_your_bdb_files.bdb with each actual .bdb file you have
db4.2_upgrade all_your_bdb_files.bdb
```

It doesn't hurt to take a backup before doing this.  :Wink: 
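Put together, the backup and the per-file upgrade might be scripted roughly as below. This is a sketch under assumptions: slapd is stopped, the slotted Berkeley DB upgrade utility is installed as `db4.2_upgrade` (check the actual name on your system), and `upgrade_bdb_files`, `UPGRADE_BIN`, `DB_DIR` and `RUN` are hypothetical names. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: copy the database directory aside, then upgrade each
# .bdb file in place. Run this with slapd stopped.
DB_DIR="${DB_DIR:-/var/lib/openldap-data}"
UPGRADE_BIN="${UPGRADE_BIN:-db4.2_upgrade}"   # assumed binary name
RUN="${RUN:-echo}"   # prints the commands by default; set RUN= to execute them

upgrade_bdb_files() {
    # Backup first; stop if the copy fails.
    $RUN cp -a "$DB_DIR" "$DB_DIR.bak" || return 1
    for f in "$DB_DIR"/*.bdb; do
        # Skip the unexpanded pattern when no .bdb file exists.
        [ -e "$f" ] || continue
        $RUN "$UPGRADE_BIN" "$f" || return 1
    done
}
```

Stopping at the first failure leaves the remaining files untouched, so the copy in `$DB_DIR.bak` can be restored as a whole.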

----------

## drrrl

Hi,

it is indeed far too late for my openldap. Some time after my previous posts there was a horrible crash that was unrecoverable except by a full restore from backup. I also noticed that openldap was the reason my system had a constant load of about 1 while the CPU was 95% idle (really, to this day I don't know why), so I decided to throw openldap away.

...but - your solution can still be helpful, as some other databases are still in DB 4.2 format  :Smile: 

----------

