# syslog-ng-3.6.2 stops logging (hangs)

## hanj

Hello All

I've been running into a problem with syslog-ng-3.6.2. Originally, I thought it was a problem with DHCP (https://forums.gentoo.org/viewtopic-t-1016268-highlight-syslogng.html), but I have narrowed it down to syslog-ng-3.6.2. syslog-ng logs normally, then intermittently (sometimes after hours, sometimes after minutes) stops logging. While tailing the logs, issuing `logger yo` writes nothing to them.
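A minimal sketch of automating the `logger yo` check described above, assuming Python 3 and that test messages end up in `/var/log/messages`. The function names (`tail_contains`, `token_in_log`, `probe`) and the timeouts are made up for illustration and are not part of syslog-ng.

```python
import subprocess
import time
import uuid


def tail_contains(log_path, token, tail_bytes=65536):
    """Return True if token occurs in the last tail_bytes of log_path."""
    try:
        with open(log_path, "rb") as f:
            f.seek(0, 2)  # jump to the end of the file
            f.seek(max(0, f.tell() - tail_bytes))
            return token.encode() in f.read()
    except FileNotFoundError:
        return False  # log file does not exist (yet)


def token_in_log(log_path, token, timeout=5.0, interval=0.2):
    """Poll the log until token shows up or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if tail_contains(log_path, token):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def probe(log_path="/var/log/messages", timeout=5.0):
    """Send a unique message with logger(1) and check that it gets written."""
    token = "probe-" + uuid.uuid4().hex
    try:
        # logger itself may block if the daemon is wedged, so bound it as well.
        subprocess.run(["logger", token], timeout=timeout, check=True)
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError,
            FileNotFoundError):
        return False
    return token_in_log(log_path, token, timeout=timeout)
```

Calling `probe()` from cron and alerting when it returns `False` would at least timestamp when the logging stops, which could help correlate the hang with other events on the box.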

The only way to correct this is to restart syslog-ng. When I do so, it complains that syslog-ng is unable to 'stop'. I need to kill -9 the process and remove the pid file. Once it's started again, I can restart it over and over without incident until the next hang. I believe that when syslog-ng hangs, it begins to interfere with other services (e.g. DHCP, mentioned above).
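The "remove the pid file" step can be made a little safer by only deleting the file when the PID inside it is really gone. This is a rough sketch, assuming Python 3 on a POSIX system; the helper names are hypothetical, and the pid file location depends on how syslog-ng was built and started.

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_pidfile(pidfile):
    """Delete pidfile if the PID stored in it is no longer running.

    Returns True if the file was removed, False otherwise.
    """
    path = Path(pidfile)
    try:
        pid = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no pid file, or its contents are not a number
    if pid <= 0 or pid_is_running(pid):
        return False  # refuse odd values, and never touch a live process's file
    path.unlink()
    return True
```

After the `kill -9`, something like `remove_stale_pidfile("/var/run/syslog-ng.pid")` (path assumed) would clean up only if the old process is actually dead.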

After rolling syslog-ng back to syslog-ng-3.4.8, the problem goes away.

Here is my 3.6 syslog-ng config. I'm thinking something needs to be optimized in the config or possibly the kernel?

```
@version: 3.6
# $Header: /var/cvsroot/gentoo-x86/app-admin/syslog-ng/files/syslog-ng.conf.gentoo.3.2,v 1.1 2011/01/18 17:44:14 mr_bones_ Exp $
#
# Syslog-ng default configuration file for Gentoo Linux

options {
        chain_hostnames(no);
        stats_freq(43200);
        use_fqdn(yes);
        keep_hostname(yes);
        use_dns(yes);
        log_fifo_size(10000);
};

source src {
        unix-stream("/dev/log");
        unix-stream("/chroot/dhcp/dev/log");
        internal();
        file("/proc/kmsg");
};

source kernsrc {
        file("/proc/kmsg");
};

destination messages { file("/var/log/messages" owner(root) group(adm) perm(0640)); };
destination mail { file("/var/log/mail.log" owner(root) group(adm) perm(0640)); };
destination authlog { file("/var/log/auth.log" owner(root) group(adm) perm(0640)); };
destination console { usertty("root"); };
destination console_all { file("/dev/tty12"); };
destination d_sec {
        program("/usr/bin/sec -input=\"-\" -conf=/etc/sec/sec.conf -log=/var/log/sec.log -pid=/var/run/sec.pid");
};

filter f_mail { facility(mail); };
filter f_messages { level(info..emerg) and not facility(mail); };
filter f_emergency { level(emerg); };
filter f_auth { facility(auth); };
filter f_authpriv { facility(auth, authpriv); };

log { source(src); filter(f_mail); destination(mail); };
log { source(src); filter(f_messages); destination(messages); };
log { source(src); filter(f_emergency); destination(console); };
log { source(src); filter(f_authpriv); destination(authlog); };

destination loghost {tcp("127.0.0.1" port(514));};
log { source(src); filter(f_messages); destination(loghost); };
log { source(src); destination(d_sec); };
log { source(src); destination(console_all); };
```

I tried some additional debugging with strace, which just shows the stall but no errors. I configured syslog-ng to dump core, but no core dump is created during the hang. The process is active (not a zombie), but it appears stuck. I have 3.6 installed on other servers with the same kernel and config without incident, so this is very puzzling. I increased verbosity and, again, saw nothing unusual before the hang; it just stops.

Disk space is fine on the logging partition (/var is at 1% use).

Any ideas?

Thanks!

hanji

----------

## czanik

Hi,

Normally the same configuration should work across version upgrades (apart from changing the version number and, sometimes, other minor problems reported by "syslog-ng -s" after the upgrade). Looking at your configuration, I see two possible problems:

- use_dns(yes); -- if DNS is unavailable for any reason, name resolution is a blocking operation

- see note at https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.6-guides/en/syslog-ng-ose-v3.6-guide-admin/html-single/index.html#configuring-sources-unixstream : use unix-dgram() instead of unix-stream()

Let me know if these resolve your problems in 3.6.2!
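For anyone with several boxes to check, the two points above can be scanned for mechanically. This is only a rough sketch in Python 3 using plain regular expressions, not a real syslog-ng config parser; `flag_settings` and the `CHECKS` table are names invented for this example.

```python
import re

# Patterns for the two settings flagged above. syslog-ng accepts both
# '_' and '-' in keywords, so both spellings are matched.
CHECKS = [
    (re.compile(r"\buse[_-]dns\s*\(\s*yes\s*\)"),
     "use_dns(yes): name lookups can block if DNS is unavailable"),
    (re.compile(r"\bunix[_-]stream\s*\("),
     "unix-stream(): consider unix-dgram() instead"),
]


def flag_settings(config_text):
    """Return (line_number, note) pairs for non-comment text matching CHECKS."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # crude: ignores '#' inside quoted strings
        for pattern, note in CHECKS:
            if pattern.search(code):
                findings.append((number, note))
    return findings
```

Running `flag_settings(open("/etc/syslog-ng/syslog-ng.conf").read())` (path assumed) lists the lines worth a second look; it does not say whether either setting is the cause of a hang.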

--

Peter Czanik (CzP) <peter.czanik@balabit.com>

BalaBit IT Security / syslog-ng upstream

http://czanik.blogs.balabit.com/

https://twitter.com/PCzanik

----------

## hanj

Hello

I made the changes you suggested:

```
@version: 3.6
# $Header: /var/cvsroot/gentoo-x86/app-admin/syslog-ng/files/syslog-ng.conf.gentoo.3.2,v 1.1 2011/01/18 17:44:14 mr_bones_ Exp $
#
# Syslog-ng default configuration file for Gentoo Linux

options {
        chain_hostnames(no);
        stats_freq(43200);
        use_fqdn(yes);
        keep_hostname(no);
        use_dns(no);
        log_fifo_size(10000);
};

source src {
        unix-dgram("/dev/log");
        internal();
        file("/proc/kmsg");
};

source kernsrc {
        file("/proc/kmsg");
};

destination messages { file("/var/log/messages" owner(root) group(adm) perm(0640)); };
destination mail { file("/var/log/mail.log" owner(root) group(adm) perm(0640)); };
destination authlog { file("/var/log/auth.log" owner(root) group(adm) perm(0640)); };
destination console { usertty("root"); };
destination console_all { file("/dev/tty12"); };
destination d_sec {
        program("/usr/bin/sec -input=\"-\" -conf=/etc/sec/sec.conf -log=/var/log/sec.log -pid=/var/run/sec.pid");
};

filter f_mail { facility(mail); };
filter f_messages { level(info..emerg) and not facility(mail); };
filter f_emergency { level(emerg); };
filter f_auth { facility(auth); };
filter f_authpriv { facility(auth, authpriv); };

log { source(src); filter(f_mail); destination(mail); };
log { source(src); filter(f_messages); destination(messages); };
log { source(src); filter(f_emergency); destination(console); };
log { source(src); filter(f_authpriv); destination(authlog); };

destination loghost {tcp("127.0.0.1" port(514));};
log { source(src); filter(f_messages); destination(loghost); };
log { source(src); destination(d_sec); };
log { source(src); destination(console_all); };
```

Also, here are my USE flags in case that sheds some light:

```
[ebuild   R    ] app-admin/syslog-ng-3.6.2::gentoo  USE="ssl tcpd -amqp -caps -dbi -geoip -ipv6 -json -mongodb -pacct -redis -smtp -spoof-source -systemd" 3133 KiB
```

After the upgrade, I also ran:

```
syslog-ng -s
```

And that returns nothing.

I then updated to syslog-ng-3.6. Last night at 3 AM, when logrotate runs, everything hung. I had to kill the stunnel and syslog-ng processes this morning to get the machine responsive again, and load on the machine was through the roof. I didn't have this problem yesterday with syslog-ng-3.4. My guess is that logrotate issuing the reload could be the trigger. This is different behavior than before. I'll keep 3.6 running to see whether it hangs during the day; if so, I'll need to roll back to 3.4.

Thanks!

hanji

----------

## hanj

OK, it just hung again. With the config changes above, the behavior is different: syslog-ng hangs hard, and `logger yo` hangs too. I have to kill the process to completely restart it. That explains why logrotate had problems last night. This didn't happen prior to the config change.
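One possible mechanism for `logger yo` itself hanging, offered as an assumption and not as a diagnosis of this box: if the daemon stops reading from its log socket, the socket buffer fills up and blocking writers stall. The Python 3 sketch below demonstrates that general behavior with a local Unix datagram socket pair whose receiving end never reads. It does not touch `/dev/log`, and `datagrams_until_full` is a made-up name.

```python
import socket


def datagrams_until_full(payload=b"x" * 64, cap=100_000):
    """Send datagrams to a peer that never reads; return how many were accepted."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    # Non-blocking, so a full buffer raises an error here.
    # A blocking sender would simply stall at this point instead.
    sender.setblocking(False)
    sent = 0
    try:
        while sent < cap:
            try:
                sender.send(payload)
            except OSError:
                # Buffer is full. On Linux this is BlockingIOError (EAGAIN);
                # other platforms may report a different OSError.
                break
            sent += 1
    finally:
        sender.close()
        receiver.close()
    return sent
```

The count returned depends on the kernel's socket buffer settings. The point is only that it is finite: once the reader stops draining, every program that logs with blocking writes can end up waiting.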

Rolling back to 3.4

Thanks!

hanji

----------

## hanj

*czanik wrote:*

> Hi,
> 
> Normally the same configuration should work across version upgrades (apart from changing the version number and, sometimes, other minor problems reported by "syslog-ng -s" after the upgrade). Looking at your configuration, I see two possible problems:
> 
> - use_dns(yes); -- if DNS is unavailable for any reason, name resolution is a blocking operation
> ...

Hello Peter

Did you have a chance to review this? It's definitely related to 3.6: since I rolled back to 3.4, everything has been fine, whereas 3.6 chokes rather quickly. I also noticed that it hangs DHCP and Postfix when things go wrong.

Thanks!

hanji

----------

## bhakimi

I've got the same issue. syslog-ng starts to eat my CPU, I start getting these errors, and after a while my box dies:

```
syslog-ng[1271]: Destination queue full, dropping messages; queue_len='10000', log_fifo_size='10000', count='2', persist_name='affile_dw_queue(/dev/tty12)'
May 27 11:03:29 dnsnode1 syslog-ng[24696]: Suspending write operation because of an I/O error; fd='17', time_reopen='60'
May 27 11:04:29 dnsnode1 syslog-ng[24696]: Error suspend timeout has elapsed, attempting to write again; fd='17'
May 27 11:19:01 dnsnode1 syslog-ng[24696]: I/O error occurred while writing; fd='17', error='Resource temporarily unavailable (11)'
May 27 11:19:01 dnsnode1 syslog-ng[24696]: Suspending write operation because of an I/O error; fd='17', time_reopen='60'
May 27 11:20:01 dnsnode1 syslog-ng[24696]: Error suspend timeout has elapsed, attempting to write again; fd='17'
May 27 11:23:33 dnsnode1 syslog-ng[24696]: I/O error occurred while writing; fd='17', error='Resource temporarily unavailable (11)'
May 27 11:23:33 dnsnode1 syslog-ng[24696]: Suspending write operation because of an I/O error; fd='17', time_reopen='60'
May 27 11:24:33 dnsnode1 syslog-ng[24696]: Error suspend timeout has elapsed, attempting to write again; fd='17'
May 27 12:18:24 dnsnode1 syslog-ng[24696]: I/O error occurred while writing; fd='17', error='Resource temporarily unavailable (11)'
May 27 12:18:24 dnsnode1 syslog-ng[24696]: Suspending write operation because of an I/O error; fd='17', time_reopen='60'
May 27 12:19:24 dnsnode1 syslog-ng[24696]: Error suspend timeout has elapsed, attempting to write again; fd='17'
```
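In the output above, the first line names the queue for `/dev/tty12` as full, and the repeated "Resource temporarily unavailable (11)" is the text for EAGAIN on Linux. To see which destinations and file descriptors are involved across a longer log, the `key='value'` pairs at the end of each message can be pulled out. This is a small Python 3 sketch that assumes the messages keep the format shown here; `parse_fields` and `count_io_errors` are names invented for this example.

```python
import re
from collections import Counter

# Matches pairs such as fd='17' or persist_name='affile_dw_queue(/dev/tty12)'.
PAIR = re.compile(r"(\w+)='([^']*)'")


def parse_fields(line):
    """Extract the key='value' pairs from one syslog-ng internal message."""
    return dict(PAIR.findall(line))


def count_io_errors(lines):
    """Count 'I/O error occurred while writing' messages per file descriptor."""
    counts = Counter()
    for line in lines:
        if "I/O error occurred while writing" in line:
            counts[parse_fields(line).get("fd", "?")] += 1
    return counts
```

Mapping the reported `fd` back to a file (for example via `/proc/<pid>/fd` while the process is still alive) would show which destination is refusing writes.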

----------

## hanj

I'm not sure how similar this is. My syslog completely stops logging, and logging-dependent services (e.g. Postfix and DHCP) begin to die, but the box stays up and my CPU is fine. I'm still sticking with 3.4 for now. I'm wondering whether 3.6 has issues with certain kernel options or versions. I just built a new hardened-sources kernel without grsec, thinking that might be a factor, but testing is a pain; it's a fairly important box.

Hopefully, we'll have an answer soon.

h

----------

## czanik

syslog-ng 3.6.3 is expected to be released in the coming days with many bugfixes.

----------

