# Cyrus stopped - please help!

## drrrl

Hello,

environment:

net-mail/cyrus-imapd     2.2.12

sys-libs/db                    4.1.25_p1-r4 1.85-r2 (recently updated)

This morning I noticed that postfix could not deliver mail to mailboxes. A quick look at /var/log/everything/* showed that Cyrus had gone crazy with messages like these:

```
Mar 19 12:29:33 [lmtpunix] DBERROR db4: environment not yet opened

Mar 19 12:29:33 [lmtpunix] DBERROR: opening /var/imap/deliver.db: Invalid argument

Mar 19 12:29:33 [lmtpunix] DBERROR: opening /var/imap/deliver.db: cyrusdb error

Mar 19 12:29:33 [lmtpunix] FATAL: lmtpd: unable to init duplicate delivery database

Mar 19 12:29:33 [master] service lmtpunix pid 30445 in READY state: terminated abnormally

```

Unfortunately the logs have already rotated so many times that I could not find any information about the beginning of the disaster.

Anyway, `/etc/init.d/cyrus restart` fortunately helped, and after that everything was fine.

But then... cyrus went down again with logs like this:

```

Mar 19 22:27:28 [postfix/pipe] 86247530B5: to=<grzes@pyza.from.cx>, orig_to=<root@pyza.from.cx>, relay=cyrus, delay=1, status=sent (pyza.from.cx)

Mar 19 22:27:29 [postfix/pipe] 86247530B5: to=<marcin@pyza.from.cx>, orig_to=<root@pyza.from.cx>, relay=cyrus, delay=2, status=sent (pyza.from.cx)

Mar 19 22:27:29 [postfix/qmgr] 86247530B5: removed

(till now it was ok)

Mar 19 22:30:01 [/usr/sbin/cron] (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )

Mar 19 22:32:15 [ctl_cyrusdb] checkpointing cyrus databases

Mar 19 22:32:17 [ctl_cyrusdb] DBERROR db4: DB_LOGC->get: log record checksum mismatch

Mar 19 22:32:17 [ctl_cyrusdb] DBERROR db4: DB_LOGC->get: catastrophic recovery may be required

Mar 19 22:32:17 [ctl_cyrusdb] DBERROR db4: PANIC: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:32:17 [ctl_cyrusdb] DBERROR: critical database situation

Mar 19 22:33:20 [imap] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:33:20 [imap] DBERROR: dbenv->open '/var/imap/db' failed: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:33:20 [imap] DBERROR: init() on berkeley

Mar 19 22:33:20 [imap] sql_select option missing

Mar 19 22:33:20 [imap] auxpropfunc error no mechanism available

Mar 19 22:33:20 [imap] login: localhost [127.0.0.1] marcin plaintext User logged in

Mar 19 22:38:21 [imap] DBERROR db4: fatal region error detected; run recovery

```

and then:

```
Mar 19 22:51:05 [imaps] idle for too long, closing connection

                - Last output repeated twice -

Mar 19 22:52:05 [imaps] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:52:05 [imaps] DBERROR: error closing: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:52:05 [imaps] DBERROR: error closing tlsdb: cyrusdb error

Mar 19 22:52:05 [imaps] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:52:05 [imaps] DBERROR: error exiting application: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:52:13 [imaps] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:52:13 [imaps] DBERROR: error closing: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:52:13 [imaps] DBERROR: error closing tlsdb: cyrusdb error

Mar 19 22:52:13 [imaps] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:52:13 [imaps] DBERROR: error exiting application: DB_RUNRECOVERY: Fatal error, run database recovery

```

 (I use imaps for remote mail clients)

I tried to restart cyrus, but got only:

```
Mar 19 22:59:57 [master] exiting on SIGTERM/SIGINT

Mar 19 22:59:58 [master] setrlimit: Unable to set file descriptors limit to -1: Operation not permitted

Mar 19 22:59:58 [master] retrying with 1024 (current max)

Mar 19 22:59:58 [master] process started

Mar 19 22:59:58 [ctl_cyrusdb] DBERROR db4: DB_ENV->log_flush: LSN past current end-of-log

Mar 19 22:59:58 [ctl_cyrusdb] DBERROR db4: /var/imap/deliver.db: unable to flush page: 0

Mar 19 22:59:58 [ctl_cyrusdb] DBERROR db4: txn_checkpoint: failed to flush the buffer cache Invalid argument

Mar 19 22:59:58 [ctl_cyrusdb] DBERROR db4: PANIC: Invalid argument

Mar 19 22:59:58 [ctl_cyrusdb] DBERROR: critical database situation

Mar 19 22:59:58 [master] process 3885 exited, status 75

Mar 19 22:59:59 [master] ready for work

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: dbenv->open '/var/imap/db' failed: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: init() on berkeley

Mar 19 22:59:59 [ctl_cyrusdb] checkpointing cyrus databases

Mar 19 22:59:59 [tls_prune] DBERROR db4: fatal region error detected; run recovery

Mar 19 22:59:59 [tls_prune] DBERROR: dbenv->open '/var/imap/db' failed: DB_RUNRECOVERY: Fatal error, run database recovery

Mar 19 22:59:59 [tls_prune] DBERROR: init() on berkeley

Mar 19 22:59:59 [tls_prune] DBERROR db4: environment not yet opened

Mar 19 22:59:59 [tls_prune] DBERROR: opening /var/imap/tls_sessions.db: Invalid argument

Mar 19 22:59:59 [tls_prune] DBERROR: opening /var/imap/tls_sessions.db: cyrusdb error

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR db4: txn_checkpoint interface requires an environment configured for the transaction subsystem

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: couldn't checkpoint: Invalid argument

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: sync /var/imap/db: cyrusdb error

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR db4: DB_ENV->log_archive interface requires an environment configured for the logging subsystem

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: error listing log files: Invalid argument

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: archive /var/imap/db: cyrusdb error

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR db4: txn_checkpoint interface requires an environment configured for the transaction subsystem

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: couldn't checkpoint: Invalid argument

Mar 19 22:59:59 [ctl_cyrusdb] DBERROR: sync /var/imap/db: cyrusdb error

[...]

```

I tried `ctl_cyrusdb -r`, with no effect:

```
Mar 19 23:48:16 [ctl_cyrusdb] DBERROR db4: DB_ENV->log_flush: LSN past current end-of-log

Mar 19 23:48:16 [ctl_cyrusdb] DBERROR db4: /var/imap/deliver.db: unable to flush page: 0

Mar 19 23:48:16 [ctl_cyrusdb] DBERROR db4: txn_checkpoint: failed to flush the buffer cache Invalid argument

Mar 19 23:48:16 [ctl_cyrusdb] DBERROR db4: PANIC: Invalid argument

Mar 19 23:48:16 [ctl_cyrusdb] DBERROR: critical database situation

```

Please help!

----------

## drrrl

BTW: I don't think this is a permissions problem:

```
root@pyza imap # pwd

/var/imap

root@pyza imap # ls -la

total 316

drwxr-x---  12 cyrus mail   4096 Mar 19 23:17 .

drwxr-xr-x  20 root  root   4096 Feb 22 19:47 ..

-rw-r--r--   1 root  root      0 Feb 21 23:22 .keep

-rw-------   1 cyrus mail    144 Mar 19 12:32 annotations.db

drwxr-x---   2 cyrus mail   4096 Mar 19 23:48 db

drwx------   2 cyrus mail   4096 Mar 19 23:17 db.backup1

drwx------   2 cyrus mail   4096 Mar 19 22:59 db.backup2

-rw-------   1 cyrus mail 237568 Mar 19 22:32 deliver.db

drwxr-x---   2 cyrus mail   4096 Dec  4 21:50 log

-rw-------   1 cyrus mail  16952 Mar 19 12:32 mailboxes.db

drwxr-x---   2 cyrus mail   4096 Dec  4 21:50 msg

drwxr-x---   2 cyrus mail   4096 Mar 19 23:17 proc

drwxr-xr-x  28 root  root   4096 Dec  4 21:50 quota

drwxr-xr-x  28 root  root   4096 Feb 21 23:22 sieve

drwxr-x---   2 cyrus mail   4096 Mar 19 23:17 socket

-rw-------   1 cyrus mail   8192 Mar 19 22:32 tls_sessions.db

drwxr-xr-x  28 root  root   4096 Dec  4 21:50 user

root@pyza imap # ls -l db

total 18964

-rw-------  1 cyrus mail     8192 Mar 19 23:48 __db.001

-rw-------  1 cyrus mail   663552 Mar 19 23:48 __db.002

-rw-------  1 cyrus mail    98304 Mar 19 23:48 __db.003

-rw-------  1 cyrus mail 18563072 Mar 19 23:48 __db.004

-rw-------  1 cyrus mail    32768 Mar 19 23:48 __db.005

-rw-------  1 cyrus mail  6258280 Mar 19 22:32 log.0000000001

-rw-------  1 cyrus mail        4 Mar 19 12:32 skipstamp

```

----------

## langthang

 *Quote:*   

> sys-libs/db 4.1.25_p1-r4 1.85-r2 (recently updated)

 

Could be your problem. Please post your `genlop cyrus-imapd db`

----------

## drrrl

 *langthang wrote:*   

> Please post your `genlop cyrus-imapd db`

 

Here it is:

```
 # genlop cyrus-imapd db

 * net-mail/cyrus-imapd

     Sat Dec  4 21:50:36 2004 >>> net-mail/cyrus-imapd-2.2.10

     Mon Feb 21 23:24:57 2005 >>> net-mail/cyrus-imapd-2.2.12

 * sys-libs/db

     Thu Oct 28 00:03:52 2004 >>> sys-libs/db-4.1.25_p1-r3

     Sat Dec  4 15:51:45 2004 >>> sys-libs/db-4.1.25_p1-r4

     Mon Dec  6 10:34:07 2004 >>> sys-libs/db-1.85-r1

     Sun Dec 12 22:36:55 2004 >>> sys-libs/db-4.1.25_p1-r4

     Tue Mar 15 19:04:49 2005 >>> sys-libs/db-1.85-r2

```

----------

## langthang

So you didn't upgrade db from 4.1 to 4.2 or cyrus-imapd from 2.1 to 2.2; those upgrades produce the symptoms you described. Anyway, could you try the steps in http://asg.web.cmu.edu/archive/message.php?mailbox=archive.info-cyrus&searchterm=skiplist&msg=32337 to rebuild your database? Notes: back up your mail database first, and use the paths that match the Gentoo cyrus install.
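A minimal sketch of the "back up first" step, assuming a plain tarball is good enough. `backup_dir` is a hypothetical helper, not a Cyrus tool, and the destination directory in the usage note is only an example:

```shell
# Hypothetical helper: archive a directory into a timestamped tarball
# and print the path of the archive that was written.
backup_dir() {
    src=$1    # directory to back up, e.g. /var/imap
    dest=$2   # directory that receives the archive
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/$(basename "$src")-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C makes the paths inside the archive relative to the parent of $src
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}
```

With Cyrus stopped, something like `backup_dir /var/imap /root/cyrus-backup` would capture the databases before anything is removed. The mail spool itself needs its own backup.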

----------

## drrrl

Hi,

Thanks a lot for the help. I tried some of the steps and it worked :)

Here are details of what I did:

```

# /etc/init.d/cyrus stop

.

.

$ sudo -u cyrus bash

$ cd /var/imap/

$ rm db/*

$ rm db.backup?/*

$ rm deliver.db

$ rm tls_sessions.db

$ /usr/lib/cyrus/ctl_mboxlist -d > mailboxes.txt

$ mv mailboxes.db mailboxes.db.old

$ /usr/lib/cyrus/ctl_mboxlist -u < mailboxes.txt

$ diff mailboxes.db*

Files mailboxes.db and mailboxes.db.old differ

.

.

# /etc/init.d/cyrus start

```
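The `diff mailboxes.db*` above compares two binary files, so it can only report that they differ. A sketch of a more telling check, assuming two text dumps made with `ctl_mboxlist -d`; `same_dump` is a hypothetical helper:

```shell
# Hypothetical helper: succeed when two text dumps contain the same
# lines, regardless of the order in which they were written.
same_dump() {
    a_sorted=$(sort "$1") || return 2   # status 2: a dump could not be read
    b_sorted=$(sort "$2") || return 2
    [ "$a_sorted" = "$b_sorted" ]
}
```

For example, after the rebuild one could dump again with `ctl_mboxlist -d > mailboxes.new.txt` (file name hypothetical) and run `same_dump mailboxes.txt mailboxes.new.txt && echo "mailbox lists match"`.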

The last part of the solution, converting the users' .seen files, was unnecessary. In fact, after doing it all messages were marked as unread :( Fortunately I had a backup ;) so I restored the .seen files and now everything seems to be OK.

OK, my system is up, many thanks, but I'm still worried because I don't know the reason for this failure. What happened and why? And I still don't know how likely the same problem is in the near future. Just in case, I set "soft_bounce = yes" in postfix main.cf and will monitor the system.
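One possible way to do that monitoring is to count the `DBERROR` lines that appeared in every failure above. This is a minimal sketch, assuming the logs are plain text files; `count_dberrors` is a hypothetical helper, not part of Cyrus:

```shell
# Hypothetical helper: print how many lines in a log file mention DBERROR.
count_dberrors() {
    logfile=$1
    [ -r "$logfile" ] || { echo "cannot read: $logfile" >&2; return 2; }
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so mask the status to keep "no errors" a success.
    grep -c 'DBERROR' "$logfile" || true
}
```

Run from cron against a file under /var/log/everything/ (the exact file name depends on the logger), a non-zero count would be the cue to look at Cyrus before the mail queue fills up.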

----------

