# Syslog-ng central logserver is dropping logs

## humbletech99

I've got a central logserver running syslog-ng and I find that it is definitely dropping logs, but I don't know why yet. It's logging ~30-40 network machines to a MySQL database as well as to plain text logs on the filesystem at the same time. The network machines' stats say they dropped no logs, but the central logserver's own logs say it did drop some. Is the logserver struggling to keep pace?

Logs for the logserver host:

```
syslog-notice   2006-06-20 03:10:06   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-19 15:10:06   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-19 03:10:06   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-18 15:10:06   syslog-ng[5848]: STATS: dropped 248
syslog-notice   2006-06-18 03:10:06   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-17 15:10:06   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-16 15:10:21   syslog-ng[5848]: STATS: dropped 85
syslog-notice   2006-06-16 03:10:21   syslog-ng[5848]: STATS: dropped 94
syslog-notice   2006-06-15 15:10:21   syslog-ng[5848]: STATS: dropped 679
syslog-notice   2006-06-15 03:10:21   syslog-ng[5848]: STATS: dropped 2433
syslog-notice   2006-06-14 15:10:21   syslog-ng[5848]: STATS: dropped 21250
syslog-notice   2006-06-14 03:10:21   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-13 15:10:21   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-13 03:10:20   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-12 15:10:20   syslog-ng[5848]: STATS: dropped 3013
syslog-notice   2006-06-12 03:10:19   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-11 15:10:18   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-11 03:10:18   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-10 15:10:18   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-09 23:06:34   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-09 11:06:34   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-08 23:06:33   syslog-ng[5848]: STATS: dropped 0
syslog-notice   2006-06-08 03:44:24   syslog-ng[5846]: STATS: dropped 4984
syslog-notice   2006-06-07 15:44:24   syslog-ng[5846]: STATS: dropped 0
```
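To see how the drops are spread over time, the STATS lines above can be tallied with a short script. This is only a minimal sketch, assuming lines in exactly the layout shown in the excerpt; the name `parse_dropped` and the regular expression are illustrative and not part of syslog-ng.

```python
import re

# Matches lines such as:
#   syslog-notice   2006-06-18 15:10:06   syslog-ng[5848]: STATS: dropped 248
# The layout is assumed from the excerpt above; other setups may log differently.
STATS_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"syslog-ng\[\d+\]: STATS: dropped (?P<dropped>\d+)"
)


def parse_dropped(lines):
    """Return a list of (timestamp, dropped) for every STATS line found."""
    results = []
    for line in lines:
        match = STATS_RE.search(line)
        if match:
            stamp = f"{match['date']} {match['time']}"
            results.append((stamp, int(match["dropped"])))
    return results


# Three lines copied from the excerpt above.
sample = [
    "syslog-notice   2006-06-15 03:10:21   syslog-ng[5848]: STATS: dropped 2433",
    "syslog-notice   2006-06-14 15:10:21   syslog-ng[5848]: STATS: dropped 21250",
    "syslog-notice   2006-06-14 03:10:21   syslog-ng[5848]: STATS: dropped 0",
]

stats = parse_dropped(sample)
nonzero = [(stamp, n) for stamp, n in stats if n > 0]
total = sum(n for _, n in stats)
print(f"{len(nonzero)} of {len(stats)} intervals had drops, {total} messages in total")
```

Since the counter in the excerpt goes back to 0 after a non-zero value, each line appears to cover one reporting interval, so summing the values gives the total for the period.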

----------

## think4urs11

Roughly how many log entries per $time_unit are we talking about here?

I have one syslog-ng server with a MySQL backend logging ~440 machines at the moment (WinTel, network equipment, Unix/Linux, etc.) and have never seen a STATS dropped value >0.

Can you post your syslog-ng.conf and some specs for the machine maybe?

The fact that your syslog 'clients', so to speak, don't complain about dropped logs is perfectly normal - syslog uses UDP by default, so the clients don't care at all whether or not anyone picks the messages up 'from the wire'.

----------

## humbletech99

Actually, I've moved the clients I can to syslog-ng with TCP transport, and those clients report 0 dropped in their logs. The plain syslog clients don't say anything about stats; this is a syslog-ng-only feature, I think...

The machine is an Opteron 2 GHz, 1GB RAM, SATA 160GB 7200rpm disk, running Gentoo of course, 2.6.15 kernel.

I get 50 logs a minute from 40 hosts. The syslog-ng config file is:

```
options {
        chain_hostnames(off);
        sync(0);
        stats(43200);
        keep_hostname(no);
        use_dns(yes);
        dns_cache(yes);
        create_dirs(yes);
};

source s_log { unix-stream("/dev/log"); };
source s_int { internal(); };
source s_kern { file("/proc/kmsg" log_prefix("kernel: ") ); };
source s_tcp { tcp( port(50514) ); };
source s_udp { udp(); };

destination messages { file("/var/log/messages"); };
destination d_net { file("/var/log/NETWORK/$HOST/$FACILITY-$YEAR-$MONTH-$DAY"); };
destination d_mysql { pipe("/var/log/mysql.pipe" template("INSERT INTO logs (host, facility, priority, level, tag, datetime, program, msg) VALUES ( '$HOST', '$FACILITY', '$PRIORITY', '$LEVEL', '$TAG', '$YEAR-$MONTH-$DAY $HOUR:$MIN:$SEC', '$PROGRAM', '$MSG' );\n") template-escape(yes)); };

log { source(s_log); source(s_int); source(s_kern); destination(messages); };
log { source(s_log); source(s_int); source(s_kern); source(s_tcp); source(s_udp); destination(d_net); };
log { source(s_log); source(s_int); source(s_kern); source(s_tcp); source(s_udp); destination(d_mysql); };
```

----------

## think4urs11

The first thing that comes to mind is to disable DNS usage.

If you *must* use it you should

- run a caching-only DNS server (e.g. dnsmasq) on the syslog machine to avoid lengthy lookups as far as possible

and/or

- tune the DNS-related parameters in syslog-ng.conf (dns_cache_expire + dns_cache_expire_failed + dns_cache_size)

The next candidate would be sync - try a higher value so the log files get written every 'x' lines instead of being written directly to disk.

Lastly (I've never seen this to be needed), gc_busy_threshold + gc_idle_threshold, which control the garbage collector, which needs to run regularly.

*edit*

For the record: I've seen my machine (P4 1.8, 1.5GB, 74GB SCSI) write up to ~800-1,000 msg/second without getting into trouble

(until I noticed that mysql was compiled without big-tables support   :Rolling Eyes:  )

Normally I have approx. 3-5 million log entries in total over a one-week period.

----------

## humbletech99

syslog-ng has a DNS cache, so a caching DNS server isn't needed.

The default cache time is 1 hour and the default cache size is 1007, which is many times the number of hosts in the network! So I consider these to be sufficient.

I may consider changing the sync option, but I have no idea about the garbage collection, should go look at the docs again....

----------

## think4urs11

log_fifo_size is also a very good candidate for tuning

----------

## humbletech99

thanks for the help, you seem to be a syslog-ng veteran, I remember you helped me out before, I appreciate it.

Having looked up the garbage collection thingy, a quote from the 2.0 Reference:

 *Quote:*   

> Syslog-ng 2.0 is a complete reimplementation of syslog-ng 1.6, and does not use the mark and sweep garbage collector at all. So the garbage collector parameters (gc_idle_threshold, gc_busy_threshold) are still accepted but are ignored.

 

So me not knowing about it is ok! 

I've set log_fifo_size(1000), I'll see how it goes with this....

----------

## think4urs11

 *humbletech99 wrote:*   

> thanks for the help, you seem to be a syslog-ng veteran, I remember you helped me out before, I appreciate it.

 

veteran? no no

just one of the poor souls who needs to maintain two big servers covering an EMEA-wide infrastructure and facing issues with French character sets, wrongly formatted logs, QoS needed on the WAN, binary logs, etc.

oh btw. i'm still on v1.6.9

----------

## humbletech99

Sounds pretty experienced with it to me!

Do you log over subnet VPNs or something? This is what I do: the VPN link to France takes care of the networking and routing, the hosts just need an IP and the gateways deal with the rest. Just curious if you did the same. Haven't had any problems with French machines yet, but then I don't have many of them logging here.

Do you have 2 logservers to cover this whole third of the world? They must be busy, much busier than my one...

How do binary logs affect you, does syslog-ng even deal with them? I would think that syslog would send only text logs via the syslog protocol and so this wouldn't be a central logging problem...

Also, how do you fix wrongly formatted logs, you'd need to fix the sender wouldn't you? Is there much you can do with them after the fact?

Perhaps version 2.0 would be a nice upgrade when you get time, it's only an emerge away...

----------

## think4urs11

 *humbletech99 wrote:*   

> Do you log over subnet VPNs or something?

 

Nope - atm just some sort of bandwidth-restricted 'pipe' between two of our bigger data centers over our corporate WAN for syslog/ftp traffic towards the log servers.

Big enough to handle 'normal' amounts of log traffic but dropping if one/many machines go really crazy. (4GB syslog, one machine, ~16 hours - quite impressive even within the same LAN segment)

If that happens, losing some log entries is one of our smaller problems actually  :Wink: 

The other option would be to have >10 boxes instead of two - money versus risk, money wins 1:0  :Wink: 

 *humbletech99 wrote:*   

> Do you have 2 logservers to cover this whole third of the world? They must be busy, much busier than my one...

 

As we speak, there's a big one here covering 'middle Europe' - that's the one with 440 'clients'. The second one is being built up atm and already partially covers the southern/western region, with the Nordic region to come. (unlimited -as my SAN admins told me- disk space behind ... we'll see   :Twisted Evil:  )

Luckily we have afaik nothing to cover in Africa. I'm already getting headaches when looking at Greece with their somewhat strange charsets, same for Israel - afaik nothing to cover there ... puuuuuh

 *humbletech99 wrote:*   

> How do binary logs affect you, does syslog-ng even deal with them? I would think that syslog would send only text logs via the syslog protocol and so this wouldn't be a central logging problem...

 

We have created some im/exporters, converters and stuff to be able to log AS400s to syslog for example. The logs are transferred via FTP to the log server and get imported there into the DB. Mgmt. wanted to have one single place to have everything logged...

Had some issues with the AS400s set to a non-ASCII/non-English system language - which breaks nearly everything syslog-wise (the logs include e.g. 0x00 bytes). Worked around that one by e.g. recoding the incoming logs to UTF-8 before importing them, sed'ing out the 0x00 bytes, etc. etc. etc.
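The recode-and-strip workaround described above could look roughly like the following. This is a sketch under stated assumptions, not the converter actually used: the thread does not name the source encoding, so `cp500` (an EBCDIC code page) is a hypothetical default, and `clean_record` is an illustrative name.

```python
def clean_record(raw: bytes, source_encoding: str = "cp500") -> bytes:
    """Decode a raw log record, drop NUL characters, and re-encode as UTF-8.

    `source_encoding` is an assumption; use whatever the sending system
    really writes.
    """
    # errors="replace" keeps the import going when a byte cannot be decoded.
    text = raw.decode(source_encoding, errors="replace")
    # NUL characters are the part that breaks line-oriented tools downstream.
    text = text.replace("\x00", "")
    return text.encode("utf-8")


# Hypothetical record with accented characters and embedded NULs.
raw_record = "Échec de la tâche\x00\x00 QBATCH".encode("cp500")
print(clean_record(raw_record).decode("utf-8"))
```

Decoding first and stripping afterwards means the NUL removal works on characters rather than raw bytes, so it cannot cut a multi-byte sequence in half.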

 *humbletech99 wrote:*   

> Also, how do you fix wrongly formatted logs, you'd need to fix the sender wouldn't you? Is there much you can do with them after the fact?

 

Building syslog messages incl. pri/fac and so on manually isn't too complicated, the format is fairly easy to implement.
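As a rough illustration of how little is involved, here is a sketch of assembling a classic BSD-style syslog line, where the priority value is facility * 8 + severity. The function name, the short facility/severity tables and the host name `as400-01` are made up for the example; this is not the importer described in the thread.

```python
from datetime import datetime

# Numeric codes as used by the BSD syslog format (subset, for illustration).
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4, "local0": 16}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_syslog_line(facility, severity, host, tag, msg, when):
    """Return '<PRI>Mmm dd hh:mm:ss host tag: msg' for the given fields."""
    pri = FACILITIES[facility] * 8 + SEVERITIES[severity]
    # The day of month is space-padded to two characters in this format.
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {tag}: {msg}"


# Hypothetical host and message.
line = build_syslog_line("local0", "notice", "as400-01", "import",
                         "job finished", datetime(2006, 6, 8, 3, 44, 24))
print(line)
```

The resulting string would then typically be sent as a single UDP datagram to port 514 on the log server, or written to whatever transport the receiver listens on.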

If it's native/direct syslog you can't actually do very much - luckily it seems that even my French clients are not able to break anything here. Snare Agent seems to do its job well enough for Wintel boxes. Unix and other boxes work perfectly (of course)

 *humbletech99 wrote:*   

> Perhaps version 2.0 would be a nice upgrade when you get time, it's only an emerge away...

 

Time? How do you spell that?

----------

## humbletech99

Snare, I'll have to check that out, looks good. I've been using NTsyslog, which looks a little dated but is nice because all the config is in one reg key. So I've got a reg file and a directory with the binary and bits, and a batch file that copies it all to Program Files, uses the reg file to get all the settings, installs the service and starts it, which works nicely.

I'll see if Snare is better, which it may well be, but then I just hope it's as easily configurable and as easy to deploy automatically.

----------

## think4urs11

 *humbletech99 wrote:*   

> I'll see if Snare is better, which it may well be, but then I just hope it's as easily configurable and as easy to deploy automatically.

 

It is. One of my Wintel colleagues built an MSI package with all our defaults set, so installing it is actually the Windows equivalent of emerge snare.

Plus it can be configured completely from remote via a small built-in web server - very handy at times.

----------

## humbletech99

I take it he used WinINSTALL LE to build the MSI package? Very nice, I'll have to switch over...

----------

## think4urs11

No idea - that's Windows. It hasn't been my business for ~5 years.

All I can tell is that installation is as easy as can be with his package - he's actually our MSI guru.

If it's software - he can create an MSI, no matter what   :Wink: 

----------

## humbletech99

I don't think it's difficult at all, I did it before and am intending to do it a lot more in the future. If you can install something on windows then you can MSI package it for installation. If you find out what he uses, do post back. There are various tools I believe and some are better than others.

----------

## think4urs11

 *humbletech99 wrote:*   

> Perhaps version 2.0 would be a nice upgrade when you get time, it's only an emerge away...

 

Sure? I only have 1.6.9 and 1.6.11 in portage, the latter being ~arch (testing).

I need to check what improvements 1.9/2.0 would actually give me, even though I won't switch until 2.x is stable (stable upstream, that is, not necessarily in Gentoo portage).

----------

## humbletech99

oh, I stand corrected, I am running 1.6.9 too, but I've been working off the 2.0 reference since it's newest, I just assumed that it would have the newest version and didn't actually pay close attention to it. doh.

ok, let's both wait til it's in stable and then upgrade, still can't be bothered with garbage collection then if it's gotta be obsolete after an emerge.... it'll fix itself and I have a hundred million other things to do...

----------

## deanpence

 *humbletech99 wrote:*   

> I've got a central logserver running syslog-ng and I find that it is definitely dropping logs but I don't know why yet. It's logging ~30-40 network machines to a mysql database as well as to plain text logs on the filesystem at the same time. The network machine stats say they dropped no logs but the logs for the central logserver says it did drop some, is the logserver struggling to keep pace?

 

To give some idea of how well syslog-ng performs (on a dual-core, dual Opteron machine with 4GB of RAM at least): in my largest datacenter, I log with a single syslog-ng instance to files only for about 325 hosts on a high-availability network, and it's a major aberration for me to see any machine reporting dropped syslog lines, although my DNS server logs way too much information:

```
2006-06-22 01:01:49 [hostname omitted]: syslog-ng[27166]: STATS: dropped 1386871
```

Is it possible that logging to mysql is part of what's causing syslog-ng to drop some logs? A good test might be to disable mysql logging for a short time and see how it behaves then.

----------

## humbletech99

Well, it's in production and I'm relying on it having a full history. If I stop mysql, it'll screw up the audit trail of tens of thousands of logs for the sake of a few dozen/hundred.

It's possible this happened because syslog came up before mysql, and the output pipe filled up in syslog, which I've now changed to be larger. I'll see how it goes...
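The suspected failure mode, a bounded output queue filling up while nothing reads from the pipe, can be shown with a toy model. This is only a sketch of the general idea and makes no claim about how syslog-ng implements its FIFO internally; the class, the message counts and the drain rate are all invented for illustration, with 1000 taken from the log_fifo_size value mentioned earlier in the thread.

```python
from collections import deque


class BoundedFifo:
    """Toy output queue: holds at most `size` messages, counts the rest as dropped."""

    def __init__(self, size):
        self.size = size
        self.queue = deque()
        self.dropped = 0

    def push(self, msg):
        if len(self.queue) >= self.size:
            self.dropped += 1  # queue full: this message is lost
        else:
            self.queue.append(msg)

    def drain(self, count):
        """Simulate the reader taking up to `count` messages off the queue."""
        for _ in range(min(count, len(self.queue))):
            self.queue.popleft()


def simulate(size, incoming, stalled_for):
    """Feed `incoming` messages; the reader is absent for the first `stalled_for`."""
    fifo = BoundedFifo(size)
    for i in range(incoming):
        if i >= stalled_for:
            fifo.drain(2)  # reader is back and faster than the writer
        fifo.push(f"msg {i}")
    return fifo.dropped


small = simulate(size=100, incoming=500, stalled_for=300)
large = simulate(size=1000, incoming=500, stalled_for=300)
print(f"dropped with size 100: {small}, with size 1000: {large}")
```

In this model a larger queue only helps if the reader comes back before the queue fills, which fits the idea that the drops happened while syslog-ng was up and mysql was not yet reading.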

----------

