# [SOLVED] Migration to systemd: Flood from systemd-journald

## lcj

I'm using ~x86 with this week's latest updates and the latest kernel.

Using the Gentoo Wiki, I've migrated to systemd. OpenRC is still installed, and I managed to recompile everything with USE="-consolekit systemd".

I configured the kernel pretty much line by line as the wiki says.

I use lilo as the boot manager and added the init line, just as the wiki says.

Now the system boots, but the console is immediately flooded with the following message (the first line might be slightly different; I had to record a high-speed video of the console to get this):

```
[   OK    ] Started udev Kernel Device Manager
systemd-journald[1460]: /dev/kmsg buffer overrun, some messages lost
systemd-journald[1460]: /dev/kmsg buffer overrun, some messages lost
systemd-journald[1460]: /dev/kmsg buffer overrun, some messages lost
systemd-journald[1460]: /dev/kmsg buffer overrun, some messages lost
systemd-journald[1460]: /dev/kmsg buffer overrun, some messages lost
```

And this goes on with the flood...

After that not much happens. OpenRC usually started KDE, but here I don't get there even after 5 minutes; I just have a black screen, like right before the nvidia module kicks in and starts X.

I'm able to shut down safely with the power button or CTRL+ALT+DEL. A few times the system stopped at the flood and I never got to a console.

I did tell /etc/systemd/journald.conf NOT TO LOG anything to the console or to syslog, and uninstalled sysklogd.

If I do not pass systemd to the kernel while booting, I get a running system, except that I can't control NetworkManager or mount USB sticks as a user... Of course, systemd is then not talking to dbus.

----------

## ulenrich

You could add this to your kernel cmdline (grub or lilo):

```
systemd.unit=multi-user.target blacklist=nvidia 
```

or emergency.target, and later on, once you have solved some issues, add

systemd.unit=graphical.target 

Before rebooting, don't forget to enable kdm, gdm or lightdm:

```
systemctl enable kdm.service
```

----------

## TomWij

Try increasing the kernel message log buffer length with the kernel parameter log_buf_len, for example: log_buf_len=262144
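For reference, log_buf_len is given in bytes, and its compile-time counterpart CONFIG_LOG_BUF_SHIFT is the base-2 logarithm of the buffer size. A minimal sketch of the arithmetic (the helper name log_buf_bytes is made up for illustration):

```shell
# Convert a CONFIG_LOG_BUF_SHIFT value into a buffer size in bytes (2^shift).
log_buf_bytes() {
    echo $((1 << $1))
}

# Print the sizes for a few shift values.
for shift in 16 18 19; do
    echo "CONFIG_LOG_BUF_SHIFT=$shift -> $(log_buf_bytes "$shift") bytes"
done
# CONFIG_LOG_BUF_SHIFT=16 -> 65536 bytes
# CONFIG_LOG_BUF_SHIFT=18 -> 262144 bytes
# CONFIG_LOG_BUF_SHIFT=19 -> 524288 bytes
```

So log_buf_len=262144 on the kernel command line asks for the same size as CONFIG_LOG_BUF_SHIFT=18.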

----------

## lcj

Wow, the obvious thing of enabling kdm helped. Thanks ulenrich, my bad for not noticing that!

I see that NetworkManager is now talking to dbus, and I can control it!

The VPN started to work too, but it seems that the WiFi passwords were not remembered; that's a completely different thing, though.

TomWij - I still saw the messages...

I did recompile the kernel and changed CONFIG_LOG_BUF_SHIFT from the default 16 (64K) to 18 (256K)...

systemd-journald is taking up to 95% of a CPU core, and the load is at 1.53.

What I can see is that cat /proc/kmsg shows a repetitive power_supply BAT0: uevent
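To see which message dominates without scrolling through the flood, the kernel log can be reduced to a count per distinct message. A rough sketch, assuming standard sed, sort and uniq are available; the function name top_kmsg is hypothetical:

```shell
# Strip the leading "[ timestamp]" and count identical message bodies,
# most frequent first. The optional argument limits the number of rows.
top_kmsg() {
    sed 's/^\[[^]]*\] *//' | sort | uniq -c | sort -rn | head -n "${1:-5}"
}

# On a live system (needs permission to read the kernel log):
#   dmesg | top_kmsg 5
# Demo on three sample lines in the format seen in this thread:
printf '%s\n' \
    '[  694.675137] power_supply AC0: uevent' \
    '[  694.675323] power_supply AC0: uevent' \
    '[  694.675342] power_supply AC0: prop ONLINE=1' | top_kmsg 5
```

The demo lists the uevent line with a count of 2 ahead of the ONLINE line with a count of 1.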

I'll investigate tomorrow and ask it to log something via syslog-ng and via the suggested lilo line, and remove the acpi debugs...

Big thanks - it's the first time since 2004 that my problem was not already solved (using search) by the users of this forum.

----------

## TomWij

 *lcj wrote:*   

> What I can see is that cat /proc/kmsg shows a repetitive power_supply BAT0: uevent

 

Sounds like either some ACPI bug or a real problem with the power supply or battery. Best to investigate this. If you find no other way to fix the flooding, you may want to disable support for the ACPI Power Supply and/or the ACPI Battery (not ACPI itself) in the kernel.

----------

## lcj

The battery just last month started to complain because it's around 3 years old, so the Dell utility says after every boot: Battery may load normally, but its performance might be decreased over time.

So after removing battery and AC power supply support from the kernel, my kernel log is no longer flooded. Thanks.

----------

## Barvinok

Having the same problem.

What if I need kernel support for the power supply and battery? How do I just stop journald flooding or redirect it elsewhere?

----------

## ulenrich

 *Barvinok wrote:*   

> Having the same problem.
> 
> What if I need kernel support for the power supply and battery? How do I just stop journald flooding or redirect it elsewhere?

 

If you know your hardware is rotten like @lcj it is hilarious to expect support from any software package. This thread is solved.

----------

## lcj

I can suggest:

#1 Try removing the battery and check the status

#2 Expand the kmsg log buffer in kernel beyond 256K

----------

## ulenrich

 *lcj wrote:*   

> #2 Expand the kmsg log buffer in kernel beyond 256K

 

@lcj, do you refer to:

```
$ zcat /proc/config.gz |grep -i log_buf
CONFIG_LOG_BUF_SHIFT=19
```

?

----------

## lcj

Yes, I'm at

```
zcat /proc/config.gz |grep -i log_buf 
CONFIG_LOG_BUF_SHIFT=18
```

so, there's still room to try.

Not trying to confuse, but with kernel 3.12.0-gentoo the flood seems to be *huge*, while with the previous one (umh... 3.9.12 I guess) there was a message every 5-6 minutes, one from BAT followed by one from AC.

----------

## ulenrich

 *lcj wrote:*   

> with kernel 3.12.0-gentoo the flood seems to be *huge*, while with the previous one (umh... 3.9.12 I guess) there was a message every 5-6 minutes, one from BAT followed by one from AC.

 

Such reduced time lapse of this service is surely meant as an improvement: Think of a use case when being mobile and having a heavy duty task run, which results in a battery with only one minute of power left !

----------

## lcj

Well - not that I complain, but a flood is probably not what we would like to get. Windows reports the battery getting old once; maybe the interpretation of the ACPI event from this particular vendor (DELL) should be suppressed after all.

I will investigate further but I think we should not continue this thread - the actual problem IS SOLVED.

----------

## Barvinok

 *ulenrich wrote:*   

>  *Barvinok wrote:*   Having the same problem.
> 
> What if I need kernel support for the power supply and battery? How do I just stop journald flooding or redirect it elsewhere? 
> 
> If you know your hardware is rotten like @lcj it is hilarious to expect support from any software package. This thread is solved.

 

I don't think so.

First off, my hardware is NOT rotten. It is a Dell Inspiron 1720 with a replacement battery that works pretty well under both Windows and my previous Gentoo setup (OpenRC + GNOME2 + same kernel 3.10.17).

That switching to systemd caused a massive battery event flood in the journal, while OpenRC and Windows had no problems with it whatsoever, means there's something wrong with systemd (a misconfiguration, perhaps) and NOT with the hardware.

 *lcj wrote:*   

> Well - not that I complain, but a flood is probably not what we would like to get. Windows reports the battery getting old once; maybe the interpretation of the ACPI event from this particular vendor (DELL) should be suppressed after all. I will investigate further but I think we should not continue this thread - the actual problem IS SOLVED.

 

I'm not having "old battery" messages from Windows so I presume it is not old, yet systemd still floods my journal with battery events on a large scale. Increasing kernel buffers did help in the sense that I now see that the flood is associated with battery.

Turning off kernel support for battery and power supply is not the way to go, because I'd like ALL of my hardware working as expected and supported by the OS.

So, the question lingers -- where do I find settings pertaining to the verbosity of journald or the suppression of specific kinds of messages?

----------

## ulenrich

```
$ zcat /proc/config.gz |grep LOGLEVEL
CONFIG_DEFAULT_MESSAGE_LOGLEVEL=1
```

And there are specific journald options in the conf files in /etc/systemd.

But your more verbose output could be interesting for @lcj also. If there is a flood of messages there is a high chance you encountered a failure of your system!

----------

## Barvinok

Okay, now is the time for some ACPI mystery.

I was playing with the journald config on this machine while the ac and battery modules were turned off in the kernel. After that I re-enabled the ac and battery modules, only to see there's no more uevent flood. Worse still, the flood didn't reappear when I reverted the journald settings to their previous values. So now the issue on the Dell Inspiron seems fixed, but I still don't know how to fix it.

For the next experiments, I took a much newer laptop with its stock battery, this time an ASUS K52N, with exactly the same previous Gentoo setup and also in transition to systemd and GNOME3. It flooded in exactly the same way as the Inspiron did before, only this time I managed to take a peek at it:

1. modprobe ac results in the following flood which continues infinitely:

```
[  694.674997] systemd-journald[77]: /dev/kmsg buffer overrun, some messages lost.
[  694.675137] power_supply AC0: uevent
[  694.675139] power_supply AC0: POWER_SUPPLY_NAME=AC0
[  694.675157] power_supply AC0: prop ONLINE=1
[  694.675181] systemd-journald[77]: /dev/kmsg buffer overrun, some messages lost.
[  694.675323] power_supply AC0: uevent
[  694.675325] power_supply AC0: POWER_SUPPLY_NAME=AC0
[  694.675342] power_supply AC0: prop ONLINE=1
[  694.675366] systemd-journald[77]: /dev/kmsg buffer overrun, some messages lost.
[  694.675507] power_supply AC0: uevent
[  694.675509] power_supply AC0: POWER_SUPPLY_NAME=AC0
[  694.675526] power_supply AC0: prop ONLINE=1
```

2. modprobe battery results in the following flood, which also continues infinitely:

```
[ 1520.753770] systemd-journald[77]: /dev/kmsg buffer overrun, some messages lost.
[ 1520.753925] power_supply BAT0: uevent
[ 1520.753928] power_supply BAT0: POWER_SUPPLY_NAME=BAT0
[ 1520.753931] power_supply BAT0: prop STATUS=Unknown
[ 1520.753933] power_supply BAT0: prop PRESENT=1
[ 1520.753935] power_supply BAT0: prop TECHNOLOGY=Li-ion
[ 1520.753938] power_supply BAT0: prop CYCLE_COUNT=0
[ 1520.753940] power_supply BAT0: prop VOLTAGE_MIN_DESIGN=14400000
[ 1520.753942] power_supply BAT0: prop VOLTAGE_NOW=16616000
[ 1520.753945] power_supply BAT0: prop POWER_NOW=0
[ 1520.753947] power_supply BAT0: prop ENERGY_FULL_DESIGN=30100000
[ 1520.753949] power_supply BAT0: prop ENERGY_FULL=31332000
[ 1520.753952] power_supply BAT0: prop ENERGY_NOW=30800000
[ 1520.753954] power_supply BAT0: prop CAPACITY=98
[ 1520.753956] power_supply BAT0: prop MODEL_NAME=K52F-22
[ 1520.753959] power_supply BAT0: prop MANUFACTURER=ASUSTek
[ 1520.753961] power_supply BAT0: prop SERIAL_NUMBER=
[ 1520.753994] systemd-journald[77]: /dev/kmsg buffer overrun, some messages lost.
[ 1520.754135] power_supply BAT0: uevent
[ 1520.754137] power_supply BAT0: POWER_SUPPLY_NAME=BAT0
[ 1520.754140] power_supply BAT0: prop STATUS=Unknown
[ 1520.754142] power_supply BAT0: prop PRESENT=1
[ 1520.754145] power_supply BAT0: prop TECHNOLOGY=Li-ion
[ 1520.754147] power_supply BAT0: prop CYCLE_COUNT=0
[ 1520.754149] power_supply BAT0: prop VOLTAGE_MIN_DESIGN=14400000
[ 1520.754152] power_supply BAT0: prop VOLTAGE_NOW=16616000
[ 1520.754154] power_supply BAT0: prop POWER_NOW=0
[ 1520.754157] power_supply BAT0: prop ENERGY_FULL_DESIGN=30100000
[ 1520.754159] power_supply BAT0: prop ENERGY_FULL=31332000
[ 1520.754161] power_supply BAT0: prop ENERGY_NOW=30800000
[ 1520.754164] power_supply BAT0: prop CAPACITY=98
[ 1520.754166] power_supply BAT0: prop MODEL_NAME=K52F-22
[ 1520.754169] power_supply BAT0: prop MANUFACTURER=ASUSTek
[ 1520.754171] power_supply BAT0: prop SERIAL_NUMBER=
```

I don't see any hints of failures, the messages seem to be generic "i'm here and ok" kind.

And I can't turn off ac+battery modules for production because these laptops are often used off-AC and the users need to know the battery status.

Interestingly, if I reboot the system into OpenRC, these same modules do NOT produce a flood. So the cause must be in systemd.

Any ideas?

----------

## Barvinok

Here are my latest discoveries on the issue.

I'm no longer blaming systemd; this seems to be some specific hardware issue unaccounted for in the kernel.

By stuffing dump_stack() into various places in the kernel source, namely in the power_supply and battery modules, I came to see that the uevent flood is reported by the power_supply_uevent() function, called via this:

```
Dec 08 07:45:25 dolphin kernel:  [<ffffffff813ff5ab>] dev_attr_show+0x1b/0x60
Dec 08 07:45:25 dolphin kernel:  [<ffffffff810a5fe2>] ? __get_free_pages+0x12/0x50
Dec 08 07:45:25 dolphin kernel:  [<ffffffff81140094>] sysfs_read_file+0xa4/0x180
Dec 08 07:45:25 dolphin kernel:  [<ffffffff810da173>] vfs_read+0xa3/0x170
Dec 08 07:45:25 dolphin kernel:  [<ffffffff810da3dd>] SyS_read+0x4d/0x90
Dec 08 07:45:25 dolphin kernel:  [<ffffffff81803a90>] system_call_fastpath+0x16/0x1b
```

Not being a skilled kernel developer, I don't know what the implications of this are, so as a temporary solution I wrote this dirty hack:

```
*** power_supply_sysfs.c.bak    2013-12-08 08:00:33.505634432 +0200
--- power_supply_sysfs.c        2013-12-08 08:01:26.942007870 +0200
***************
*** 264,269 ****
--- 264,270 ----
  int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env)
  {
+       return 0; // dirty hack to prevent ac/battery uevent flood on ASUS K52N
        struct power_supply *psy = dev_get_drvdata(dev);
        int ret = 0, j;
        char *prop_buf;
```

For now, ac and battery modules are working as expected. I hope nothing else was broken by this patch   :Laughing: 

----------

## lcj

@Barvinok:

So it's not related to actual hardware problems? I do know that I had connected a PSU to this Dell that wasn't powerful enough (65W where it should be 90W), and the battery is not BAD yet.

But it seems this ACPI flood is unrelated to actual hardware issues.

----------

## Barvinok

 *lcj wrote:*   

> So it's not related to actual hardware problems?

 

I don't know for sure but this seems unlikely. The Asus laptop is quite new.

 *lcj wrote:*   

> I do know that I had connected a PSU to this Dell that wasn't powerful enough (65W where it should be 90W), and the battery is not BAD yet. But it seems this ACPI flood is unrelated to actual hardware issues.

 

I'm thinking of elevating the issue to where kernel devs hang out.

----------

## lcj

I agree, elevate this!

----------

## Navar

 *ulenrich wrote:*   

> If there is a flood of messages there is a high chance you encountered a failure of your system!

 

Proof?  So far, the vector presented was with systemd-journald.  Other vectors didn't exhibit, but you automatically leap to the assumption systemd is somehow errata free.

 *ulenrich wrote:*   

> If you know your hardware is rotten like @lcj it is hilarious to expect support from any software package. This thread is solved.

 

Proof?  Give your definition of 'rotten hardware'.  Exactly what do you think all the various code handling hardware 'quirks' in the kernel are for?  So you think handling hardware exceptions via software is 'hilarious' and should not be supported?  Have you removed all the various quirks, particularly ACPI oriented, within your kernel code?  Who are these particular non-'rotten hardware' system vendors where no ACPI issues exist in Linux?

These are community driven forums.  The OP should decide if their thread is solved by merit of their own determination, not yours.

 *ulenrich wrote:*   

> Such reduced time lapse of this service is surely meant as an improvement: Think of a use case when being mobile and having a heavy duty task run, which results in a battery with only one minute of power left !

 

... like systemd-journald?  :Rolling Eyes: 

 *lcj wrote:*   

> systemd-journald is taking up to 95% of a CPU core, and the load is at 1.53.

 

----------

## ulenrich

@Navar,

my "solved" statement referred to this message just two above it:  *lcj wrote:*   

> The battery just last month started to complain because it's around 3 years old, so the Dell utility says after every boot: Battery may load normally, but its performance might be decreased over time.
> 
> So after removing battery and AC power supply support from the kernel, my kernel log is no longer flooded. Thanks.

 

Regarding a system log: do you prefer hardware failures to be kept hidden from the logs? There is a fine balance yet to be achieved between hiding a flood and (please not) hiding an issue. Not getting any hint from syslog is what I would call an error. journald causing such a high system load isn't favorable either, but it is less of an error. Your argument amounts to: "The secure way of having a car is without fuel."

----------

## Navar

@ulenrich,

Let's not presume what I may prefer.  My argument was not against the importance of system logging, on the contrary.  Instead, seemingly non-existent rate limiting on messages with a syslog function process acting, frankly, out of control on reporting duplicate events is the issue.  It's hardly the first time a freedesktop.org 'invention' has shown race conditions and runaway event handling (evolution, pulseaudio, nautilus, etc.).  Hey, it's software development, stuff happens.  But why it always has to be yet another not-ready-for-prime-time affair with multi-year userbase 'testing' is beyond me.

And if we're going there (systemd), which we're not--I'm clarifying my stance on this particular issue.  I personally loathe the Redmond Way binary closed format of system and event logging rather than the simple de facto text output we're used to that many simple, non-proprietary, tools can easily process.  When have obscure closed binary formats ever been a good idea for system logs?  When we care less about immediate performance and simplicity and more about large stores of metadata (now the metaphor on function is less a log and more a database with associated problems).  When there is money to be made off forcing your users to learn your particular proprietary tools for any access and recovery.  And apparently due to freedesktop documentation, when there is some burning need to store large chunks of binary data... what, are we storing core file data now?  Have I utilized the systemd way of logging?  Thankfully, no, but I've read enough frustrations of others it has been foisted upon and screen shots displaying the situation to know I'm wary.

As convoluted and massive in scope as XML related technologies are, they're still far and away better in helping data exchange than dealing with binary formats.  Efficiency is their main weak point.  The only arguments for binary data have been efficiency and obscurity.  If systemd were producing XML output (as an option) along with plain text like existing loggers do, I could see both sides.  Hell, even an option to opt out of the binary journal with its proprietary database scheme in favor of simple text writes would be OK.  But those options don't exist.  In fact, the freedesktop documentation strongly 'recommends' using their tools only rather than bothering to absorb their C API to produce your own parsing tools.

----------

## TomWij

 *Navar wrote:*   

>  *ulenrich wrote:*   If there is a flood of messages there is a high chance you encountered a failure of your system! 
> 
> Proof?  So far, the vector presented was with systemd-journald.  Other vectors didn't exhibit, but you automatically leap to the assumption systemd is somehow errata free.

 

Don't shoot the messenger; I'm going to be that third dog that steals a bone when two dogs try to take it and mention that it could very well be a kernel bug too, which isn't errata free either.

 *Navar wrote:*   

>  *ulenrich wrote:*   If you know your hardware is rotten like @lcj it is hilarious to expect support from any software package. This thread is solved. 
> 
> Proof?  Give your definition of 'rotten hardware'.

 

Yes, a definition would be nice; because until given, the following questions may or may not apply:

 *Navar wrote:*   

> Exactly what do you think all the various code handling hardware 'quirks' in the kernel are for?  So you think handling hardware exceptions via software is 'hilarious' and should not be supported?  Have you removed all the various quirks, particularly ACPI oriented, within your kernel code?  Who are these particular non-'rotten hardware' system vendors where no ACPI issues exist in Linux?

 

A system failure may very well be the result of broken hardware, but there's a lot between that hardware and the screen to rule out before drawing that conclusion. 'Rotten hardware' is a bit of a far-fetched expression, given that at the moment it was stated the hardware was not even known.

 *Navar wrote:*   

> These are community driven forums.  The OP should decide if their thread is solved by merit of their own determination, not yours.

 

+1

 *Navar wrote:*   

>  *ulenrich wrote:*   Such reduced time lapse of this service is surely meant as an improvement: Think of a use case when being mobile and having a heavy duty task run, which results in a battery with only one minute of power left ! 
> 
> ... like systemd-journald? 
> 
>  *lcj wrote:*   systemd-journald is taking up to 95% of a CPU core, and the load is at 1.53. 

 

In the Harry Potter movie similar things happened when tons of letters arrived at their house, which made the house quite busy and loaded; ...

----------

## TomWij

 *Navar wrote:*   

> Instead, seemingly non-existent rate limiting on messages with a syslog function process acting, frankly, out of control on reporting duplicate events is the issue.

 

It brought attention to the issue and the events are readable; so, what's the issue?

 *Navar wrote:*   

> And if we're going there (systemd), which we're not--I'm clarifying my stance on this particular issue.  I personally loathe the Redmond Way binary closed format of system and event logging rather than the simple de facto text output we're used to that many simple, non-proprietary, tools can easily process.  When have obscure closed binary formats ever been a good idea for system logs?  When we care less about immediate performance and simplicity and more about large stores of metadata (now the metaphor on function is less a log and more a database with associated problems).  When there is money to be made off forcing your users to learn your particular proprietary tools for any access and recovery.  And apparently due to freedesktop documentation, when there is some burning need to store large chunks of binary data... what, are we storing core file data now?  Have I utilized the systemd way of logging?  Thankfully, no, but I've read enough frustrations of others it has been foisted upon and screen shots displaying the situation to know I'm wary.
> 
> As convoluted and massive in scope as XML related technologies are, they're still far and away better in helping data exchange than dealing with binary formats.  Efficiency is their main weak point.  The only arguments for binary data have been efficiency and obscurity.  If systemd were producing XML output (as an option) along with plain text like existing loggers do, I could see both sides.  Hell, even an option to opt out of the binary journal with its proprietary database scheme in favor of simple text writes would be OK.  But those options don't exist.  In fact, the freedesktop documentation strongly 'recommends' using their tools only rather than bothering to absorb their C API to produce your own parsing tools.

 

Such options do exist:

```
  -o --output=STRING       Change journal output mode (short, short-iso,
                           short-precise, short-monotonic, verbose,
                           export, json, json-pretty, json-sse, cat)
```

From there on JSON to XML is only one step away.

----------

## nialv7

I think the flood of power_supply messages indicates some process is constantly reading the power_supply uevent file.

And it's a little bit weird that if I stop systemd-journald, the flood goes away.

And I think this problem is far from "[SOLVED]"

----------

## nialv7

A better workaround is to disable CONFIG_POWER_SUPPLY_DEBUG in your kernel. Though this doesn't solve the real problem.
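Whether the running kernel has the option set can be checked from its config, provided the kernel exposes /proc/config.gz (CONFIG_IKCONFIG_PROC). A small sketch; config_enabled is a made-up helper name, and the demo feeds it a sample line rather than a real config:

```shell
# Succeed if the given option is set to y or m in the kernel config on stdin.
config_enabled() {
    grep -q "^$1=[ym]\$"
}

# On a running system:
#   zcat /proc/config.gz | config_enabled CONFIG_POWER_SUPPLY_DEBUG && echo "debug output is on"
# Demo on a sample config line:
printf 'CONFIG_POWER_SUPPLY_DEBUG=y\n' \
    | config_enabled CONFIG_POWER_SUPPLY_DEBUG && echo "debug output is on"
```

A disabled option shows up in the config as a comment ("# CONFIG_POWER_SUPPLY_DEBUG is not set"), which the pattern deliberately does not match.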

----------

## Bartlomiej_G

Either delete these lines:

drivers/power/power_supply_sysfs.c:272:	dev_dbg(dev, "uevent\n");

drivers/power/power_supply_sysfs.c-275-		dev_dbg(dev, "No power supply yet\n");

drivers/power/power_supply_sysfs.c-279-	dev_dbg(dev, "POWER_SUPPLY_NAME=%s\n", psy->name);

or change them to e.g. pr_debug("uevent\n"); if you wish to keep some output,

or disable CONFIG_POWER_SUPPLY_DEBUG in your config file as suggested by nialv7.

I have seen the same behaviour with the CONFIG_I2C_DEBUG_CORE flag when using an I2C EEPROM on an embedded system. I've spent 2 weeks figuring out what was wrong and how to avoid the situation.

dev_dbg triggers journald to read the uevent file, which invokes dev_dbg again, generating a cyclic graph (a loop).
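The loop described above can be illustrated with a toy model. This is only a sketch under loud assumptions: every read of the uevent file emits a fixed number of device-tagged debug lines, and the logger re-reads the file once per tagged line it receives. Real journald behaviour may well differ, and simulate_loop is a made-up name.

```shell
# Toy model of the feedback loop (not real journald or kernel code).
# $1 = number of rounds to simulate
# $2 = debug lines emitted per read of the uevent file (0 = debug disabled)
simulate_loop() {
    rounds=$1
    lines_per_read=$2
    pending=1   # one initial read of the uevent file
    total=0     # debug lines emitted so far
    i=0
    while [ "$i" -lt "$rounds" ] && [ "$pending" -gt 0 ]; do
        emitted=$((pending * lines_per_read))
        total=$((total + emitted))
        pending=$emitted   # assumption: each tagged line causes one more read
        i=$((i + 1))
    done
    echo "$total"
}

simulate_loop 5 0   # debug disabled: prints 0, the loop never starts
simulate_loop 5 1   # prints 5, one line per round, forever
simulate_loop 5 3   # prints 363 (3 + 9 + 27 + 81 + 243)
```

With the debug lines removed or CONFIG_POWER_SUPPLY_DEBUG disabled, the model corresponds to the first call: a read produces nothing to log, so nothing triggers the next read.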

I wonder whether this should be reported as a bug in the kernel or in systemd-journald.

----------

