# ripmime eating CPU

## selig

I have been having a problem with ripmime for several months. I am using the latest stable version on amd64, which is 1.4.0.6. This happens about twice an hour - that is, ripmime starts consuming 100% CPU and continues to do so unless it is killed either manually or by something else (I do not know what kills it) after a considerable amount of time - I think about 30-60 minutes.

The problem is, I cannot find anything useful in the logs and when I run strace on the process, it does not output anything.

As a workaround, I have put a script into cron which checks for ripmime processes which have eaten more than 1 minute of CPU time and kills them. However, I would really like to solve this more elegantly. Could you give me any hints where to look and what to check? Thanks!
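The cron workaround described above could look roughly like the following. This is only a sketch, not the script actually in use: the function names are made up for illustration, it assumes the procps `ps` (with `-C` and `-o`), and it assumes SIGTERM is enough to stop the spinning process.

```python
import os
import signal
import subprocess


def cputime_to_seconds(cputime: str) -> int:
    """Convert ps-style CPU time ("[DD-]HH:MM:SS") to seconds."""
    days = 0
    if "-" in cputime:
        day_part, cputime = cputime.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in cputime.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def runaway_pids(ps_output: str, limit_seconds: int = 60) -> list[int]:
    """Return PIDs whose accumulated CPU time exceeds limit_seconds."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        pid, cputime = fields
        if cputime_to_seconds(cputime) > limit_seconds:
            pids.append(int(pid))
    return pids


def kill_runaway_ripmime(limit_seconds: int = 60, sig: int = signal.SIGTERM) -> list[int]:
    """Signal every ripmime process that has used too much CPU time."""
    # Separate -o options print the pid and cputime columns without headers.
    result = subprocess.run(
        ["ps", "-C", "ripmime", "-o", "pid=", "-o", "cputime="],
        capture_output=True,
        text=True,
    )  # ps exits non-zero when nothing matches, so no check=True here
    pids = runaway_pids(result.stdout, limit_seconds)
    for pid in pids:
        os.kill(pid, sig)
    return pids
```

Calling `kill_runaway_ripmime()` from a small script run by cron every minute would match the one-minute threshold mentioned above.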

Oh, I should also add that I am using it with qmail/vpopmail/qmailscanner.

----------

## Hu

Rebuild ripmime and its dependencies with debug information.  Next time it goes into a spin loop, kill it with SIGABRT and post a backtrace from the resulting core file.

----------

## cach0rr0

With this sort of thing you will normally find it's a specific email message that has caused something somewhere to deadlock, go into an infinite loop, etc.

So, for the less code-like/scientific method, I'd start by checking what messages you have in the queue being processed, copying them all out into a temporary directory, and then running ripmime over them one by one to see which, if any, cause the behaviour to appear.
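A rough sketch of that one-by-one check, in case it helps. The helper name and the 60-second timeout are made up for illustration, and the default command assumes ripmime's `-i` (input file) and `-d` (output directory) options; adjust it to however qmailscanner actually invokes ripmime on your box.

```python
import subprocess
import tempfile
from pathlib import Path


def find_hanging_messages(
    message_dir,
    timeout_seconds=60,
    command=("ripmime", "-i", "{message}", "-d", "{outdir}"),
):
    """Run the unpacker on each file and return those that hit the timeout."""
    hanging = []
    for message in sorted(Path(message_dir).iterdir()):
        if not message.is_file():
            continue
        # Unpack into a throwaway directory so the runs do not interfere.
        with tempfile.TemporaryDirectory() as outdir:
            argv = [arg.format(message=message, outdir=outdir) for arg in command]
            try:
                subprocess.run(
                    argv,
                    timeout=timeout_seconds,
                    stdout=subprocess.DEVNULL,
                    stderr=subprocess.DEVNULL,
                )
            except subprocess.TimeoutExpired:
                # subprocess.run kills the child before raising, so nothing keeps spinning.
                hanging.append(message)
    return hanging
```

Something like `find_hanging_messages("/tmp/queue-copy")` (a hypothetical path) would then list the messages worth looking at more closely.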

The problem I see is that even if you identify the issue, and identify a bug, it does not look as though there has been a ripmime release since November 2008. I do not know how useful it will be to report bugs, or if it is still actively developed - http://pldaniels.com/ripmime/

(amavisd-new does my recursive unpacking nowadays - I made that change plus the move from qmail to Postfix years ago, and haven't looked back!)

----------

## Hu

For serious bugs like crashes or infinite loops, Gentoo might carry a patch even if the upstream project is abandoned.  This depends on the maintainer's level of interest and whether you can find an interested party with the expertise to write a patch for the problem.

----------

## selig

I have found what kind of messages are causing this... Well, some people are insane enough to send huge e-mails with vmw or pps and similar... When the delivery fails, the message is returned as a "text" message and it looks like 12 MB of text. ripmime separates it into the appropriate files, and it looks like it then tries to run itself on the 12 MB "text" file, which is actually the original message with binary content. That probably causes it to loop indefinitely.

I do not need such messages to be delivered at all but I do not know how to stop them... maybe it would be easiest to fix this loop?

----------

## cach0rr0

 *selig wrote:*   

> I have found what kind of messages are causing this... Well, some people are insane enough to send huge e-mails with vmw or pps and similar... When the delivery fails, the message is returned as a "text" message and it looks like 12 MB of text. ripmime separates it into the appropriate files, and it looks like it then tries to run itself on the 12 MB "text" file, which is actually the original message with binary content. That probably causes it to loop indefinitely.
> 
> I do not need such messages to be delivered at all but I do not know how to stop them... maybe it would be easiest to fix this loop?

 

The "correct" thing (i.e. fixing the bug) is not the easiest.

If by "loop" you mean keeping the NDR from being generated, then yes, if that is possible, I agree: you should configure whatever is generating the NDR to not include the entire original message, but instead just a summary. Or, better still, if you aren't doing so already, you should configure qmail to reject e-mail to invalid recipients at SMTP time, before it's queued, instead of accepting the message and later deciding to bounce it.

NDRs should, generally speaking, not contain attachments - there is no need. Worse still, because qmail tends to do this, spammers often exploit mail systems in this way, using your server's propensity for generating backscatter to spam an unwitting third party (e.g. I forge a return address, qmail stupidly accepts the spoofed message I send, later decides it can't deliver it, and generates an NDR to be sent to the spoofed return address; however, as the address is spoofed, you're spamming someone who never sent you an email).

(ultimately, qmail just doesn't handle bounces particularly well, unless you apply a fair chunk of third party patches - http://www.disciplina.net/musings/qmail_rant )

If by "loop" you mean the ripmime loop that causes the CPU utilization, it is going to be a bit more tricky, but still worth doing - not an immediate fix, however. 

To an extent, heavy CPU utilization is to be expected when unpacking large emails/files. But at the absolute most it should not take more than a few seconds in the most extreme cases to decode base64. If the process is at high CPU utilization, and stays there forever, this is a problem. If the process only periodically spikes to high CPU utilization, but then returns to normal shortly after, I would say this is normal and expected. 

As far as unpacking the message, the attachment is going to be a huge string of base64 encoded text, both when incoming and outgoing (well, virtually everyone does base64 nowadays). 
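To get a feel for how cheap base64 decoding is, here is a small timing sketch. It measures Python's own decoder on random data, not ripmime's, and the 12 MiB default is only an assumption loosely based on the size mentioned earlier in the thread, so treat the result as a ballpark for "seconds, not minutes" and nothing more.

```python
import base64
import os
import time


def time_base64_decode(size_bytes=12 * 1024 * 1024):
    """Return the seconds needed to decode size_bytes of base64-encoded data."""
    payload = os.urandom(size_bytes)
    # encodebytes produces MIME-style output with a newline every 76 characters.
    encoded = base64.encodebytes(payload)
    start = time.perf_counter()
    decoded = base64.decodebytes(encoded)
    elapsed = time.perf_counter() - start
    # Sanity check: the round trip must give back the original bytes.
    assert decoded == payload
    return elapsed
```

If a decoder spends tens of minutes at 100% CPU on input of this size, the time is very unlikely to be going into the base64 decoding itself.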

If you copy this large PPS out to a temporary directory, and run ripmime over it by hand, from the command line, does the behavior happen? If you can reproduce this easily, the bug is easier to handle for developers. I would be curious to see the message myself, just to see if there is anything strange about its composition and formatting. 

Anyway, I forgot the rest of what I was going to say. If you can't stop the bounce message, then we will need to look into this further and see whether it is a bug (i.e. the CPU utilization peaks and never returns to normal) or simply expected behaviour with large attachments (i.e. the CPU spikes for a few seconds, but then returns to normal). Running ripmime over the message from the command line should tell you one way or the other. And as has been mentioned, if you can reproduce it, and it is a bug, we should try to snag a core or even analyze it in gdb (emerge ripmime with FEATURES="splitdebug").

(sorry for the long response!)

----------

