# DSPAM as SMTP relay (aka 'standalone' mode)

## DNAspark99

Well, it states on the DSPAM homepage that it can 'be deployed as a stand-alone SMTP appliance' - but it turns out, it's not so easy, as it's not so well documented...

I've read and followed both the doc/relay.txt and the following howto:

http://dspamwiki.expass.de/Installation/Postfix/RelayStepByStep

Neither got me a working setup. Just wondering if anyone with some DSPAM experience out there has it working as an SMTP relay (sitting in front of an already working mail server, filtering email for users)?

In fact, the only way I can get it forwarding to my mailserver is by setting the following in dspam.conf, something not mentioned ANYWHERE in the docs I've read???

Without this the mail goes in through postfix->dspam, but doesn't reach my mailserver.

```
DeliveryHost        10.100.100.100
DeliveryPort        25
DeliveryIdent       dspam
DeliveryProto       SMTP
```

It filters like this, but for the most part, seems to completely disregard the SQL database I've set up for it...

Like I say, I've followed both the doc/relay.txt, which got me nowhere near a working setup, and the URL above, which got me a bit closer, once I set this SMTP DeliveryHost... (which isn't mentioned!)

----------

## steveb

The documentation for DSPAM is horrible. Anyway... can you quickly write down what exactly you want to do? Or at least describe your infrastructure. If I understand your post correctly, you want DSPAM to accept mail from outside, process it, and forward it to your real MTA. What MTA would that be? Where do you maintain a list of valid users? Do you have something like a directory service?

// SteveB

----------

## DNAspark99

k, basically yea, it's on its own machine (with postfix) to filter spam mail from hitting the real mail server (which runs 'atmail' mail software, but that shouldn't matter according to the docs)

 *Quote:*   

> 
> 
>                      box1                           box2
> 
> net--> [ dspam + postfix ] ----> [ real mail server ]
> ...

 

as for the list of valid users, i'm still working out how best to handle that, be it in a list or db

----------

## steveb

 *DNAspark99 wrote:*   

> k, basically yea, it's on its own machine (with postfix) to filter spam mail from hitting the real mail server (which runs 'atmail' mail software, but that shouldn't matter according to the docs)
> 
>  *Quote:*   
> 
>                      box1                           box2
> ...

 

Okay. I see. So this here:

```
DeliveryHost        10.100.100.100
DeliveryPort        25
DeliveryIdent       dspam
DeliveryProto       SMTP
```

Won't that create an infinite loop? You get mail from the internet (on port 25), pass it somehow to DSPAM, and then DSPAM passes it back to Postfix on port 25. Well... this is not going to work right.

What I do is:

1. Postfix listens on ports 25, 465 and 587.
2. Many checks run inside Postfix (SPF, DKIM, greylisting, anti-virus, DCC, various Postfix policies doing RBL/DNSBL/RHBL/WL/etc).
3. Postfix sends the mail over LMTP to DSPAM.
4. DSPAM sends it back over SMTP (but not on port 25) to Postfix.
5. Postfix delivers the mail.
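The loop concern above boils down to one check: does DSPAM's `DeliveryHost`/`DeliveryPort` point back at a listener on the filter box that hands mail to DSPAM again? A minimal Python sketch of that check follows. The function name, the address sets and port 10026 are made up for illustration; only the 10.100.100.x addresses come from this thread.

```python
from typing import Set


def delivery_loops_back(local_addrs: Set[str], filtered_ports: Set[int],
                        delivery_host: str, delivery_port: int) -> bool:
    """Return True if DSPAM would re-inject into a listener that filters again.

    local_addrs    -- addresses owned by the filter box (assumed values)
    filtered_ports -- ports whose smtpd hands mail to DSPAM as a content filter
    """
    return delivery_host in local_addrs and delivery_port in filtered_ports


# Assumed layout: the filter box owns these addresses and filters on port 25.
FILTER_BOX = {"127.0.0.1", "10.100.100.50"}
FILTERED = {25}

if __name__ == "__main__":
    # Handing back to the same box on port 25 would loop.
    print(delivery_loops_back(FILTER_BOX, FILTERED, "127.0.0.1", 25))
    # A separate, unfiltered re-injection port (10026 is hypothetical) would not.
    print(delivery_loops_back(FILTER_BOX, FILTERED, "127.0.0.1", 10026))
    # Neither would handing off to a different machine.
    print(delivery_loops_back(FILTER_BOX, FILTERED, "10.100.100.100", 25))
```

If the check comes back true for a given setup, either the re-injection port or the delivery host would need to change.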

Are you going to do Anti-Virus filtering as well? With what?

Could you post how you have integrated DSPAM into Postfix?

// SteveB

----------

## DNAspark99

ok I should clarify this a bit, note the example IPs :

```
net--> [ dspam + postfix (10.100.100.50) ] ----> [ real mail server (10.100.100.100) ]
```

So this bit in dspam.conf:

```
DeliveryHost        10.100.100.100
DeliveryPort        25
DeliveryIdent       dspam
DeliveryProto       SMTP
```

...with this, dspam seems to be doing the actual 'handing off' of mail (which dspam gets from postfix on port 25 of 10.100.100.50) over to the real mail server (10.100.100.100). The real mail server has SMTP on ports 25, 465 and 587 (as well as imap + pop3 services, webmail, SpamAssassin, ClamAV, etc... basically it's a fully functional mailserver on its own right now for many clients + domains)

The source of this issue is that some client domains are HAMMERED with spam (1130188 caught by SA so far... THIS WEEK!), and at this level there always seems to be some slipping through, which hopefully dspam can help stop. So I'm setting it up on an alternate machine (actually another VM on the same host machine that hosts the mailserver VM, but I digress) so that we can try dspam out for certain domains by changing the MX entry on a domain-by-domain basis. The actual handling of outgoing SMTP for clients will remain with the current mailserver, as it works fine... I just want to set dspam up in 'appliance' mode to try out some additional filtering of incoming mail.

ok enough background...back to the setup

I've got postfix running, and using dspam as a filter via the following in /etc/postfix/master.cf:

```
smtp      inet  n       -       n       -       -       smtpd
    -o content_filter=dspam:
dspam     unix  -       n       n       -       10      pipe
    flags=Rhqu user=dspam argv=/usr/bin/dspam --deliver=innocent --user ${recipient} -i -f ${sender} -- ${recipient}
```

and the following in /etc/postfix/main.cf: - basically using textfiles at this point vs any of the DB stuff, just trying to get it working first

```
virtual_transport = lmtp:unix:/var/run/dspam/dspam.sock
virtual_mailbox_domains = dspam.gravit-e.ca
virtual_mailbox_maps = mysql:/etc/postfix/vmailbox.cf
dspam_destination_recipient_limit = 1
relay_recipient_maps = hash:/etc/postfix/relay_recipients
transport_maps = hash:/etc/postfix/transport
alias_maps = hash:/etc/mail/aliases
relay_domains = $transport_maps
```

It actually does seem to work on a limited basis right now: it's accepting mail in, marking it with dspam headers, and handing it off to the real account on the real mailserver. But it's doing a few odd things with regard to user preferences and whatnot. Is there a 'guaranteed to be marked as spam' message I can test with? It doesn't seem to be quarantining any of the tests I'm sending through.
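One way to see what DSPAM decided about a test message is to read the `X-DSPAM-*` headers it adds. Here is a rough Python sketch using only the standard library. The sample message and its header values are invented for illustration, and which headers actually show up will depend on the build and preferences in use.

```python
from email import message_from_string
from typing import Dict


def dspam_headers(raw_message: str) -> Dict[str, str]:
    """Collect every X-DSPAM-* header from a raw RFC 822 message."""
    msg = message_from_string(raw_message)
    return {name: value for name, value in msg.items()
            if name.lower().startswith("x-dspam-")}


# Invented sample message; real header values will differ.
SAMPLE = (
    "From: sender@example.org\n"
    "To: user@example.com\n"
    "Subject: test\n"
    "X-DSPAM-Result: Innocent\n"
    "X-DSPAM-Confidence: 0.9899\n"
    "X-DSPAM-Signature: 3123812498648216126a\n"
    "\n"
    "body text\n"
)

if __name__ == "__main__":
    for name, value in dspam_headers(SAMPLE).items():
        print(f"{name}: {value}")
```

Saving a delivered test message from the real mailserver and feeding its raw text to `dspam_headers` would show whether DSPAM classified it at all, independent of whether it was quarantined.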

Also, it says that to train it I'll need to send missed spam to spam-username@domain.com, but I'd rather have ONE general spam@domain.com address all users can forward the mail to... also, ideally, without the !DSPAM:3123812498648216126a! token/signatures in the message body....
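For checking whether a message still carries the body token, a small Python sketch like the one below may help. The regular expression is an assumption based on the `!DSPAM:...!` form quoted above, not DSPAM's own parser, and the function names are made up. Where DSPAM puts the signature is governed by the `signatureLocation` preference ('message' or 'headers', per the comments in the configs posted in this thread).

```python
import re
from typing import List

# Assumed token shape: "!DSPAM:" + signature + "!", as in the example above.
SIGNATURE_TOKEN = re.compile(r"!DSPAM:([^!\s]+)!")


def find_signatures(body: str) -> List[str]:
    """Return every signature found in a message body, in order."""
    return SIGNATURE_TOKEN.findall(body)


def strip_signatures(body: str) -> str:
    """Return the body with all signature tokens removed (display only)."""
    return SIGNATURE_TOKEN.sub("", body)


if __name__ == "__main__":
    body = "Hi there\n\n!DSPAM:3123812498648216126a!\n"
    print(find_signatures(body))
    print(repr(strip_signatures(body)))
```

Retraining by signature presumably needs the signature to survive somewhere (body or headers), so stripping is only safe on copies meant for reading.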

hopefully this isn't asking too much :P

----------

## steveb

Okay. I see now what you are doing.

For DSPAM to do the quarantine you need to enable this inside dspam.conf. Post your dspam.conf if possible.

If you want to filter out spam or test the filtering then you need somehow to train DSPAM in order to get high accuracy. Have you done anything in that direction? Are you using DSPAM groups? Do you want DSPAM to tag the message and allow the user to interact with DSPAM or do you just want DSPAM to tag the message and not allow the user to do anything with DSPAM?

What classifier are you using with DSPAM?

What tokenizer are you using with DSPAM?

What storage engine are you using with DSPAM?

// SteveB

----------

## DNAspark99

Ok, here's the current dspam.conf... 

I see now I may need to change the StorageDriver option, that may explain why it seems to ignore the MySQL db I set up...

And no, I've done no training yet, only been sending test messages to make sure it a: goes through dspam (it does) and b: delivers to real mailserver with dspam headers (it does) and c: stores user settings (sort of does, still playing with how best to handle user aliases)

End result I'm after is an additional layer of spam filtering for the users, that hopefully doesn't give too many false positives or false negatives, and that if/when it does, they can either un-quarantine or re-train the filters themselves

```

Home /var/spool/dspam
StorageDriver /usr/lib64/dspam/libhash_drv.so
TrustedDeliveryAgent "/usr/bin/procmail"
DeliveryHost        10.200.100.73
DeliveryPort        25
DeliveryIdent       dspam.mydomain.com
DeliveryProto       SMTP
OnFail error
Trust root
Trust dspam
Trust apache
Trust mail
Trust mailnull
Trust smmsp
Trust daemon
Trust postfix
Debug *
TrainingMode teft
TestConditionalTraining on
Feature whitelist
Algorithm graham burton
Tokenizer chain
PValue bcr
WebStats on
Preference "spamAction=quarantine"
Preference "signatureLocation=headers"  # 'message' or 'headers'
Preference "showFactors=on"
AllowOverride trainingMode
AllowOverride spamAction spamSubject
AllowOverride statisticalSedation
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride signatureLocation
AllowOverride showFactors
AllowOverride optIn optOut
AllowOverride whitelistThreshold
AllowOverride localStore
MySQLServer     /var/run/mysqld/mysqld.sock
MySQLPort               3306
MySQLUser               dspam
MySQLPass               bX10flfV
MySQLDb                 dspam
MySQLCompress           true
MySQLCompress           true
HashRecMax              98317
HashAutoExtend          on
HashMaxExtents          0
HashExtentSize          49157
HashPctIncrease 10
HashMaxSeek             10
HashConnectionCache     10
Notifications   off
PurgeSignature  off # Specified in purge.sql
PurgeNeutral   90
PurgeUnused    off # Specified in purge.sql
PurgeHapaxes   off # Specified in purge.sql
PurgeHits1S    off # Specified in purge.sql
PurgeHits1I    off # Specified in purge.sql
LocalMX 127.0.0.1
SystemLog on
UserLog   on
Opt out
ParseToHeaders on
ChangeModeOnParse on
ChangeUserOnParse off
ServerQueueSize 32
ServerPID              /var/run/dspam/dspam.pid
ServerMode standard
ServerParameters        "--deliver=innocent"
ServerIdent             "dspam.mydomain.com"
ServerDomainSocketPath  "/var/run/dspam/dspam.sock"
ClientHost      "/var/run/dspam/dspam.sock"
ProcessorURLContext on
ProcessorBias on
```
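A quick sanity check for a config like the one above: the Python sketch below flags `MySQL*` settings when `StorageDriver` does not name a MySQL driver, and lists lines that appear more than once. It is a naive line parser written for this thread (the `lint` and `parse_directives` names are made up), not a DSPAM tool, and it treats every `#` as the start of a comment.

```python
from collections import Counter
from typing import List, Tuple


def parse_directives(text: str) -> List[Tuple[str, str]]:
    """Return (name, value) pairs for every non-blank, non-comment line."""
    directives = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive: cuts at the first '#'
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.append((parts[0], value))
    return directives


def lint(text: str) -> List[str]:
    """Return human-readable warnings for a dspam.conf-style text."""
    directives = parse_directives(text)
    warnings = []
    drivers = [value for name, value in directives if name == "StorageDriver"]
    has_mysql = any(name.startswith("MySQL") for name, _ in directives)
    if has_mysql and not any("mysql" in d.lower() for d in drivers):
        warnings.append("MySQL* settings present, but StorageDriver is not a MySQL driver")
    for (name, value), count in Counter(directives).items():
        if count > 1:
            warnings.append(f"line repeated {count} times: {name} {value}")
    return warnings


# Short excerpt in the style of the config above, for demonstration only.
SAMPLE = """\
StorageDriver /usr/lib64/dspam/libhash_drv.so
MySQLUser     dspam
MySQLCompress true
MySQLCompress true
"""

if __name__ == "__main__":
    for warning in lint(SAMPLE):
        print(warning)
```

Running `lint(open("/path/to/dspam.conf").read())` against a real file (path is a placeholder) gives a quick list of things to look at before digging into the logs.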

----------

## steveb

Try using this dspam.conf (quickly made one with your data):

```
## $Id: dspam.conf.in,v 1.82 2006/06/23 03:11:31 jonz Exp $

## dspam.conf -- DSPAM configuration file

##

#

# DSPAM Home: Specifies the base directory to be used for DSPAM storage

#

Home /var/spool/dspam

#

# StorageDriver: Specifies the storage driver backend (library) to use.

# You'll only need to set this if you are using dynamic storage driver plugins

# from a binary distribution. The default build statically links the storage

# driver (when only one is specified at configure time), overriding this

# setting, which only comes into play if multiple storage drivers are specified

# at configure time. When using dynamic linking, be sure to include the path

# to the library if necessary, and some systems may use an extension other

# than .so (e.g. OSX uses .dylib).

#

# Options include:

#

#   libmysql_drv.so     libpgsql_drv.so   libsqlite_drv.so

#   libsqlite3_drv.so   libhash_drv.so

#

# IMPORTANT: Switching storage drivers requires more than merely changing

# this option. If you do not wish to lose all of your data, you will need to

# migrate it to the new backend before making this change.

#

StorageDriver /usr/lib64/dspam/libmysql_drv.so

#

# Trusted Delivery Agent: Specifies the local delivery agent DSPAM should call

# when delivering mail as a trusted user. Use %u to specify the user DSPAM is

# processing mail for. It is generally a good idea to allow the MTA to specify

# the pass-through arguments at run-time, but they may also be specified here.

#

# Most operating system defaults:

#TrustedDeliveryAgent "/usr/bin/procmail"       # Linux

#TrustedDeliveryAgent "/usr/bin/mail"           # Solaris

#TrustedDeliveryAgent "/usr/libexec/mail.local" # FreeBSD

#TrustedDeliveryAgent "/usr/bin/procmail"       # Cygwin

#

# Other popular configurations:

#TrustedDeliveryAgent "/usr/cyrus/bin/deliver"  # Cyrus

#TrustedDeliveryAgent "/bin/maildrop"           # Maildrop

#TrustedDeliveryAgent "/usr/local/sbin/exim -oMr spam-scanned" # Exim

#

TrustedDeliveryAgent "/usr/bin/procmail"

#

# Untrusted Delivery Agent: Specifies the local delivery agent and arguments

# DSPAM should use when delivering mail and running in untrusted user mode.

# Because DSPAM will not allow pass-through arguments to be specified to

# untrusted users, all arguments should be specified here. Use %u to specify

# the user DSPAM is processing mail for. This configuration parameter is only

# necessary if you plan on allowing untrusted processing.

#

UntrustedDeliveryAgent "/usr/bin/procmail -d %u"

#

# SMTP or LMTP Delivery: Alternatively, you may wish to use SMTP or LMTP

# delivery to deliver your message to the mail server instead of using a

# delivery agent. You will need to configure with --enable-daemon to use host

# delivery, however you do not need to operate in daemon mode. Specify an IP

# address or UNIX path to a domain socket below as a host.

#

# If you would like to set up DeliveryHost's on a per-domain basis, use

# the syntax: DeliveryHost.domain.com 1.2.3.4

#

#DeliveryHost        127.0.0.1

#DeliveryPort        24

#DeliveryIdent       localhost

#DeliveryProto       LMTP

##

DeliveryHost        10.200.100.73

DeliveryPort        25

DeliveryIdent       dspam.mydomain.com

DeliveryProto       SMTP

#

# FallbackDomains: If you want to specify certain domains as fallback domains,

# enable this option. For example, you could create a user @domain.com, and

# if bob@domain.com does not resolve to a known user on the system, the user

# could default to your @domain.com user. NOTE: This also requires designating

# fallbackDomain for the domain name;

# e.g. dspam_admin ch pref domain.com fallbackDomain on

#

#FallbackDomains on

#

# Quarantine Agent: DSPAM's default behavior is to quarantine all mail it

# thinks is spam. If you wish to override this behavior, you may specify

# a quarantine agent which will be called with all messages DSPAM thinks is

# spam. Use %u to specify the user DSPAM is processing mail for.

#

#QuarantineAgent        "/usr/bin/procmail -d spam"

#

# DSPAM can optionally process "plused users" (addresses in the user+detail

# form) by truncating the username just before the "+", so all internal

# processing occurs for "user", but delivery will be performed for

# "user+detail". This is only useful if the LDA can handle "plused users"

# (for example Cyrus IMAP) and when configured for LMTP delivery above

#

#EnablePlusedDetail     on

#

# Quarantine Mailbox: DSPAM's LMTP code can send spam mail using LMTP to a

# "plused" mailbox (such as user+quarantine) leaving quarantine processing

# for retraining or deletion to be performed by the LDA and the mail client.

# "plused" mailboxes are supported by Cyrus IMAP and possibly other LDAs.

# The mailbox name must have the +

#

#QuarantineMailbox      +quarantine

#

# OnFail: What to do if local delivery or quarantine should fail. If set

# to "unlearn", DSPAM will unlearn the message prior to exiting with an

# unsuccessful return code. The default option, "error" will not unlearn

# the message but return the appropriate error code. The unlearn option

# is useful on some systems where local delivery failures will cause the

# message to be requeued for delivery, and could result in the message

# being processed multiple times. During a very large failure, however,

# this could cause a significant load increase.

#

OnFail error

#

# Trusted Users: Only the users specified below will be allowed to perform

# administrative functions in DSPAM such as setting the active user and

# accessing tools. All other users attempting to run DSPAM will be restricted;

# their uids will be forced to match the active username and they will not be

# able to specify delivery agent privileges or use tools.

#

Trust root

Trust dspam

Trust apache

Trust mail

Trust mailnull

Trust smmsp

Trust daemon

Trust mailman

Trust postfix

#Trust nobody

#Trust majordomo

#

# Debugging: Enables debugging for some or all users. IMPORTANT: DSPAM must

# be compiled with debug support in order to use this option. DSPAM should

# never be running in production with debug active unless you are

# troubleshooting problems.

#

# DebugOpt: One or more of: process, classify, spam, fp, inoculation, corpus

#   process     standard message processing

#   classify    message classification using --classify

#   spam        error correction of missed spam

#   fp          error correction of false positives

#   inoculation message inoculations (source=inoculation)

#   corpus      corpusfed messages (source=corpus)

#

#Debug *

#Debug bob bill

#

#DebugOpt process spam fp

#DebugOpt process classify spam fp inoculation corpus

#

# ClassAlias: Alias a particular class to spam/nonspam. This is useful if

# classifying things other than spam.

#

#ClassAliasSpam badstuff

#ClassAliasNonspam goodstuff

#

# Training Mode: The default training mode to use for all operations, when

# one has not been specified on the commandline or in the user's preferences.

# Acceptable values are:

#     toe     Train on Error (Only)

#     teft    Train Everything (Trains on every message)

#     tum     Train Until Mature (Train only tokens without enough data)

#     notrain Do not train or store signatures (large ISP systems, post-train)

#

TrainingMode toe

#

# TestConditionalTraining: By default, dspam will retrain certain errors

# until the condition is no longer met. This usually accelerates learning.

# Some people argue that this can increase the risk of errors, however.

#

TestConditionalTraining on

#

# Features: Specify features to activate by default; can also be specified

# on the commandline. See the documentation for a list of available features.

# If _any_ features are specified on the commandline, these are ignored.

#

#Feature sbph

#Feature chained

Feature noise

Feature whitelist

# Training Buffer: The training buffer waters down statistics during training.

# It is designed to prevent false positives, but can also dramatically reduce

# dspam's catch rate during initial training. This can be a number from 0

# (no buffering) to 10 (maximum buffering). If you are paranoid about false

# positives, you should probably enable this option.

#

Feature tb=5

#

# Algorithms: Specify the statistical algorithms to use, overriding any

# defaults configured in the build. The options are:

#    naive       Naive-Bayesian (All Tokens)

#    graham      Graham-Bayesian ("A Plan for Spam")

#    burton      Burton-Bayesian (SpamProbe)

#    robinson    Robinson's Geometric Mean Test (Obsolete)

#    chi-square  Fisher-Robinson's Chi-Square Algorithm

#

# You may have multiple algorithms active simultaneously, but it is strongly

# recommended that you group Bayesian algorithms with other Bayesian

# algorithms, and any use of Chi-Square remain exclusive.

#

# NOTE: For standard "CRM114" Markovian weighting, use 'naive', or consider

#       using 'burton' for slightly better accuracy

#

# Don't mess with this unless you know what you're doing

#

#Algorithm chi-square

#Algorithm naive

Algorithm burton graham naive

#Algorithm burton

#

# Tokenizer: Specify the tokenizer to use. The tokenizer is the piece

# responsible for parsing the message into individual tokens. Depending on

# how many resources you are willing to trade off vs. accuracy, you may

# choose to use a less or more detailed tokenizer:

#   word    uniGram (single word) tokenizer

#           Tokenizes message into single individual words/tokens

#           example: "free" and "viagra"

#   chain   biGram (chained tokens) tokenizer (default)

#           Single words + chains adjacent tokens together

#           example: "free" and "viagra" and "free viagra"

#   sbph    Sparse Binary Polynomial Hashing tokenizer

#           Creates sparse token patterns across sliding window of 5-tokens

#           example: "the quick * fox jumped" and "the * * fox jumped"

#   osb     Orthogonal Sparse biGram

#           Similar to SBPH, but only uses the biGrams

#           example: "the * * fox" and "the * * * jumped"

#

#Tokenizer chain

Tokenizer osb

#

# PValue: Specify the technique used for calculating Probability Values,

# overriding any defaults configured in the build. These options are:

#    bcr         Bayesian Chain Rule (Graham's Technique - "A Plan for Spam")

#    robinson    Robinson's Technique (used in Chi-Square)

#    markov      Markovian Weighted Technique (for Markovian discrimination)

#

# Unlike the "Algorithms" property, you may only have one of these defined.

# Use of the chi-square algorithm automatically changes this to robinson.

#

# Don't mess with this unless you know what you're doing.

#

#PValue robinson

#PValue markov

PValue bcr

#

# WebStats: Enable this if you are using the CGI, which writes .stats files

WebStats on

#

# ImprobabilityDrive: Calculate odds-ratios for ham/spam, and add to

# X-DSPAM-Improbability headers

#

ImprobabilityDrive on

#

# Preferences: Specify any preferences to set by default, unless otherwise

# overridden by the user (see next section) or a default.prefs file.

# If user or default.prefs are found, the user's preferences will override any

# defaults.

#

Preference "trainingMode=TOE"      # TEFT, TUM, TOE

Preference "spamAction=quarantine"      # tag, quarantine, deliver

Preference "signatureLocation=headers"   # 'message' or 'headers'

Preference "spamSubject="

Preference "statisticalSedation=5"   # 0 to 9

Preference "enableBNR=on"      # on, off

Preference "showFactors=off"      # on, off

Preference "enableWhitelist=on"      # on, off

Preference "whitelistThreshold=5"

#

# Overrides: Specifies the user preferences which may override configuration

# and commandline defaults. Any other preferences supplied by an untrusted user

# will be ignored.

#

AllowOverride enableBNR

AllowOverride enableWhitelist

AllowOverride fallbackDomain

AllowOverride ignoreGroups

AllowOverride localStore

AllowOverride makeCorpus

AllowOverride optIn

AllowOverride optOut

AllowOverride optOutClamAV

AllowOverride processorBias

AllowOverride showFactors

AllowOverride signatureLocation

AllowOverride spamAction

AllowOverride spamSubject

AllowOverride statisticalSedation

AllowOverride storeFragments

AllowOverride tagNonspam

AllowOverride tagSpam

AllowOverride trainPristine

AllowOverride trainingMode

AllowOverride whitelistThreshold

# --- MySQL ---

#

# Storage driver settings: Specific to a particular storage driver. Uncomment

# the configuration specific to your installation, if applicable.

#

MySQLServer             /var/run/mysqld/mysqld.sock

MySQLPort

MySQLUser               dspam

MySQLPass               bX10flfV

MySQLDb                 dspam

MySQLCompress           true

# If you are using replication for clustering, you can also specify a separate

# server to perform all writes to.

#

#MySQLWriteServer       /var/run/mysqld/mysqld.sock

#MySQLWritePort

#MySQLWriteUser         dspam

#MySQLWritePass         changeme

#MySQLWriteDb           dspam_write

#MySQLCompress          true

# If your replication isn't close to real-time, your retraining might fail if

# the  signature isn't found. One workaround for this is to use the write

# database for all signature reads:

#

#MySQLReadSignaturesFromWriteDb on

# Use this if you have the 4.1 quote bug (see doc/mysql.txt)

#MySQLSupressQuote      on

# If you're running DSPAM in client/server (daemon) mode, uncomment the

# setting below to override the default connection cache size (the number

# of connections the server pools between all clients). The connection cache

# represents the maximum number of database connections *available* and should

# be set based on the maximum number of concurrent connections you're likely

# to have. Each connection may be used by only one thread at a time, so all

# other threads _will block_ until another connection becomes available.

#

#MySQLConnectionCache   10

MySQLConnectionCache    10

# If you're using vpopmail or some other type of virtual setup and wish to

# change the table dspam uses to perform username/uid lookups, you can

# override it below

#MySQLVirtualTable          dspam_virtual_uids

#MySQLVirtualUIDField       uid

#MySQLVirtualUsernameField  username

# UIDInSignature: MySQL supports the insertion of the user id into the DSPAM

# signature. This allows you to create one single spam or fp alias

# (pointing to some arbitrary user), and the uid in the signature will

# switch to the correct user. Result: you need only one spam alias

MySQLUIDInSignature     on

# --- PostgreSQL ---

#PgSQLServer            127.0.0.1

#PgSQLPort              5432

#PgSQLUser              dspam

#PgSQLPass              changeme

#PgSQLDb                dspam

# If you're running DSPAM in client/server (daemon) mode, uncomment the

# setting below to override the default connection cache size (the number

# of connections the server pools between all clients).

#

#PgSQLConnectionCache   3

# UIDInSignature: PgSQL supports the insertion of the user id into the DSPAM

# signature. This allows you to create one single spam or fp alias

# (pointing to some arbitrary user), and the uid in the signature will

# switch to the correct user. Result: you need only one spam alias

#PgSQLUIDInSignature    on

# If you're using vpopmail or some other type of virtual setup and wish to

# change the table dspam uses to perform username/uid lookups, you can

# override it below

#PgSQLVirtualTable          dspam_virtual_uids

#PgSQLVirtualUIDField       uid

#PgSQLVirtualUsernameField  username

# --- SQLite ---

#SQLitePragma   "synchronous = OFF"

# --- Hash ---

#

# HashRecMax: Default number of records to create in the initial segment when

# building hash files. 100,000 yields files 1.6MB in size, but can fill up

# fast, so be sure to increase this (to a million or more) if you're not using

# autoextend.

#

# NOTE: If you're using a heavy-weight tokenizer, such as SBPH, you should be

#       looking for settings in the 'millions' of records.

#

# Primes List:

#  53, 97, 193, 389, 769, 1543, 3079, 6151, 12289, 24593, 49157, 98317, 196613,

#  393241, 786433, 1572869, 3145739, 6291469, 12582917, 25165843, 50331653,

#  100663319, 201326611, 402653189, 805306457, 1610612741, 3221225473,

#  4294967291

#

HashRecMax              98317

#

# HashAutoExtend: Autoextend hash databases when they fill up. This allows

# them to continue to train by adding extents (extensions) to the file. There

# will be a small delay during the growth process, as everything needs to be

# closed and remapped.

#

HashAutoExtend          on

#

# HashMaxExtents: The maximum number of extents that may be created in a single

# hash file. Set this to zero for unlimited

#

HashMaxExtents          0

#

# HashExtentSize: The initial record size for newly created extents. Creating

# this too small could result in many extents being created. Creating this too

# large could result in excessive disk space usage. Typically, a value close

# to half of the HashRecMax size is good.

#

HashExtentSize          49157

#

# HashPctIncrease: Increase the next extent size by n% from the size of the

# last extent. This is useful in accommodating systems where the default

# HashExtentSize can be too small for certain high-volume users, and can also

# help keep seeks nice and speedy and/or prevent too many unnecessary extents

# from being created when using a low HashMaxSeek. The default behavior, when

# HashPctIncrease is not used, is to always use HashExtentSize with no

# increase.

#

HashPctIncrease 10

#

# HashMaxSeek: The maximum number of record seeks when inserting a new record

# before failing or adding a new extent. This ultimately translates into the

# max # of acceptable seeks per segment. Setting this too high will exhaustively

# scan each segment and hurt performance. Typically, a low value is acceptable

# as even older extents will continue to fill as training progresses.

#

HashMaxSeek             10

#

# HashConcurrentUser: If you are using a single, stateful hash database in

# daemon mode, specifying a concurrent user below will cause the user to be

# permanently mapped into memory and shared via rwlocks. This is very fast and

# very cool if you are running a "userless" relay appliance.

#

#HashConcurrentUser     user

#

# HashConnectionCache: If running in daemon mode, this is the max # of

# concurrent connections that will be supported. NOTE: If you are using

# HashConcurrentUser, this option is ignored, as all connections are read-

# write locked instead of mutex locked.

#

HashConnectionCache     10

# -- LDAP --

#

# LDAP: Perform various LDAP functions depending on LDAPMode variable.

# Presently, the only mode supported is 'verify', which will verify the

# existence of an unknown user in LDAP prior to creating them as a new user in

# the system.  This is useful on some systems acting as gateway machines.

#

#LDAPMode       verify

#LDAPHost       ldaphost.mydomain.com

#LDAPFilter     "(mail=%u)"

#LDAPBase       ou=people,dc=domain,dc=com

# -- Profiles --

#

# You can specify multiple storage profiles, and specify the server to

# use on the commandline with --profile. For example:

#

#Profile DECAlpha

#MySQLServer.DECAlpha   10.0.0.1

#MySQLPort.DECAlpha     3306

#MySQLUser.DECAlpha     dspam

#MySQLPass.DECAlpha     changeme

#MySQLDb.DECAlpha       dspam

#MySQLCompress.DECAlpha true

#

#Profile Sun420R

#MySQLServer.Sun420R    10.0.0.2

#MySQLPort.Sun420R      3306

#MySQLUser.Sun420R      dspam

#MySQLPass.Sun420R      changeme

#MySQLDb.Sun420R        dspam

#MySQLCompress.Sun420R  false

#

#DefaultProfile DECAlpha

#

# If you're using storage profiles, you can set failovers for each profile.

# Of course, if you'll be failing over to another database, that database

# must have the same information as the first. If you're using a global

# database with no training, this should be relatively simple. If you're

# configuring per-user data, however, you'll need to set up some type of

# replication between databases.

#

#Failover.DECAlpha      SUN420R

#Failover.Sun420R       DECAlpha

# If the storage fails, the agent will follow each profile's failover up to

# a maximum number of failover attempts. This should be set to a maximum of

# the number of profiles you have, otherwise the agent could loop and try

# the same profile multiple times (unless this is your desired behavior).

#

#FailoverAttempts       1

#

# Ignored headers: If DSPAM is behind other tools which may add a header to

# incoming emails, it may be beneficial to ignore these headers - especially

# if they are coming from another spam filter. If you are _not_ using one of

# these tools, however, leaving the appropriate headers commented out will

# allow DSPAM to use them as telltale signs of forged email.

#

IgnoreHeader X--MailScanner-SpamCheck

IgnoreHeader X-Admission-MailScanner-SpamCheck

IgnoreHeader X-Admission-MailScanner-SpamScore

IgnoreHeader X-Amavis-Alert

IgnoreHeader X-Antispam

IgnoreHeader X-AntiVirus

IgnoreHeader X-Antivirus-Scanner

IgnoreHeader X-Antivirus-Status

IgnoreHeader X-Assp-Spam-Prob

IgnoreHeader X-AV-Scanned

IgnoreHeader X-AVAS-Spam-Level

IgnoreHeader X-AVAS-Spam-Score

IgnoreHeader X-AVAS-Spam-Status

IgnoreHeader X-AVAS-Spam-Symbols

IgnoreHeader X-AVAS-Virus-Status

IgnoreHeader X-Barracuda-Bayes

IgnoreHeader X-AVK-Virus-Check

IgnoreHeader X-Barracuda

IgnoreHeader X-Barracuda-Spam-Flag

IgnoreHeader X-Barracuda-Spam-Report

IgnoreHeader X-Barracuda-Spam-Score

IgnoreHeader X-Barracuda-Spam-Status

IgnoreHeader X-Barracuda-Virus-Scanned

IgnoreHeader X-BTI-AntiSpam

IgnoreHeader X-Bogosity

IgnoreHeader X-ClamAntiVirus-Scanner

IgnoreHeader X-CRM114-CacheID

IgnoreHeader X-CRM114-Status

IgnoreHeader X-CRM114-Version

IgnoreHeader X-Despammed-Tracer

IgnoreHeader X-ELTE-SpamCheck

IgnoreHeader X-ELTE-SpamCheck-Details

IgnoreHeader X-ELTE-SpamScore

IgnoreHeader X-ELTE-SpamVersion

IgnoreHeader X-ELTE-VirusStatus

IgnoreHeader X-GMX-Antispam

IgnoreHeader X-GMX-Antivirus

IgnoreHeader X-Greylist

IgnoreHeader X-GWSPAM

IgnoreHeader X-HTMLM

IgnoreHeader X-HTMLM-Info

IgnoreHeader X-HTMLM-Score

IgnoreHeader X-iHateSpam-Checked

IgnoreHeader X-iHateSpam-Quarantined

IgnoreHeader X-IMAIL-SPAM-STATISTICS

IgnoreHeader X-IMAIL-SPAM-URL-DBL

IgnoreHeader X-IMAIL-SPAM-VALFROM

IgnoreHeader X-IMAIL-SPAM-VALHELO

IgnoreHeader X-IMAIL-SPAM-VALREVDNS

IgnoreHeader X-IronPort-Anti-Spam-Filtered

IgnoreHeader X-IronPort-Anti-Spam-Result

IgnoreHeader X-Kaspersky-Antivirus

IgnoreHeader X-KSV-Antispam

IgnoreHeader X-Mailer

IgnoreHeader X-MailScanner

IgnoreHeader X-MailScanner-Information

IgnoreHeader X-MailScanner-SpamCheck

IgnoreHeader X-MDaemon-Deliver-To

IgnoreHeader X-MDAV-Processed

IgnoreHeader X-MDRemoteIP

IgnoreHeader X-MIE-MailScanner-SpamCheck

IgnoreHeader X-MIMEOLE

IgnoreHeader X-Mlf-Spam-Status

IgnoreHeader X-MSMail-Priority

IgnoreHeader X-NAI-Spam-Checker-Version

IgnoreHeader X-NAI-Spam-Flag

IgnoreHeader X-NAI-Spam-Level

IgnoreHeader X-NAI-Spam-Route

IgnoreHeader X-NAI-Spam-Rules

IgnoreHeader X-NAI-Spam-Score

IgnoreHeader X-NAI-Spam-Threshold

IgnoreHeader X-NetcoreISpam1-ECMScanner

IgnoreHeader X-NetcoreISpam1-ECMScanner-From

IgnoreHeader X-NetcoreISpam1-ECMScanner-Information

IgnoreHeader X-NetcoreISpam1-ECMScanner-SpamCheck

IgnoreHeader X-NetcoreISpam1-ECMScanner-SpamScore

IgnoreHeader X-NEWT-spamscore

IgnoreHeader X-No-Spam

IgnoreHeader X-Olypen-Virus

IgnoreHeader X-OWM-SpamCheck

IgnoreHeader X-OWM-VirusCheck

IgnoreHeader X-PAA-AntiVirus

IgnoreHeader X-PAA-AntiVirus-Message

IgnoreHeader X-PIRONET-NDH-MailScanner-SpamCheck

IgnoreHeader X-PIRONET-NDH-MailScanner-SpamScore

IgnoreHeader X-PN-SPAMFiltered

IgnoreHeader X-Priority

IgnoreHeader X-Proofpoint-Spam-Details

IgnoreHeader X-purgate

IgnoreHeader X-purgate-Ad

IgnoreHeader X-purgate-ID

IgnoreHeader X-PMX

IgnoreHeader X-PMX-Version

IgnoreHeader X-RAV-AntiVirus

IgnoreHeader X-Rc-Spam

IgnoreHeader X-Rc-Virus

IgnoreHeader X-RedHat-Spam-Score

IgnoreHeader X-RedHat-Spam-Warning

IgnoreHeader X-RegEx

IgnoreHeader X-RegEx-Score

IgnoreHeader X-RITmySpam

IgnoreHeader X-RITmySpam-IP

IgnoreHeader X-RITmySpam-Spam

IgnoreHeader X-Rocket-Spam

IgnoreHeader X-SA-GROUP

IgnoreHeader X-SA-RECEIPTSTATUS

IgnoreHeader X-Sohu-Antivirus

IgnoreHeader X-Spam

IgnoreHeader X-Spam-ASN

IgnoreHeader X-Spam-Check

IgnoreHeader X-Spam-Checked-By

IgnoreHeader X-Spam-Checker

IgnoreHeader X-Spam-Checker-Version

IgnoreHeader X-Spam-DCC

IgnoreHeader X-Spam-Details

IgnoreHeader X-Spam-detection-level

IgnoreHeader X-Spam-Filter

IgnoreHeader X-Spam-Filtered

IgnoreHeader X-Spam-Flag

IgnoreHeader X-Spam-Level

IgnoreHeader X-Spam-OrigSender

IgnoreHeader X-Spam-Pct

IgnoreHeader X-Spam-Prev-Subject

IgnoreHeader X-Spam-Processed

IgnoreHeader X-Spam-Pyzor

IgnoreHeader X-Spam-Rating

IgnoreHeader X-Spam-Report

IgnoreHeader X-Spam-Scanned

IgnoreHeader X-Spam-Score

IgnoreHeader X-Spam-Status

IgnoreHeader X-Spam-Tagged

IgnoreHeader X-Spam-Tests

IgnoreHeader X-Spam-Tests-Failed

IgnoreHeader X-Spam-Virus

IgnoreHeader X-Spamadvice

IgnoreHeader X-Spamarrest-noauth

IgnoreHeader X-Spamarrest-speedcode

IgnoreHeader X-SpamBouncer

IgnoreHeader X-Spambayes-Classification

IgnoreHeader X-SpamCatcher-Score

IgnoreHeader X-SpamCop-Checked

IgnoreHeader X-SpamCop-Disposition

IgnoreHeader X-SpamCop-Whitelisted

IgnoreHeader X-Spamcount

IgnoreHeader X-SpamDetected

IgnoreHeader X-SpamInfo

IgnoreHeader X-SpamPal

IgnoreHeader X-SpamPal-Timeout

IgnoreHeader X-SpamReason

IgnoreHeader X-SpamScore

IgnoreHeader X-Spamsensitivity

IgnoreHeader X-SpamTest-Categories

IgnoreHeader X-SpamTest-Info

IgnoreHeader X-SpamTest-Method

IgnoreHeader X-SpamTest-Status

IgnoreHeader X-SpamTest-Version

IgnoreHeader X-STA-NotSpam

IgnoreHeader X-STA-Spam

IgnoreHeader X-TERRACE-SPAMMARK

IgnoreHeader X-TERRACE-SPAMRATE

IgnoreHeader X-to-viruscore

IgnoreHeader X-Text-Classification

IgnoreHeader X-Text-Classification-Data

IgnoreHeader X-UCD-Spam-Score

IgnoreHeader x-uscspam

IgnoreHeader X-Virus-Check

IgnoreHeader X-Virus-Checked

IgnoreHeader X-Virus-Checker-Version

IgnoreHeader X-Virus-Scan

IgnoreHeader X-Virus-Scanned

IgnoreHeader X-Virus-Scanner

IgnoreHeader X-Virus-Scanner-Result

IgnoreHeader X-Virus-Status

IgnoreHeader X-VirusChecked

IgnoreHeader X-Virusscan

IgnoreHeader X-WinProxy-AntiVirus

IgnoreHeader X-WinProxy-AntiVirus-Message

#

# Lookup: Perform lookups on streamlined blackhole list servers (see

# http://www.nuclearelephant.com/projects/sbl/). The streamlined blacklist

# server is a machine-automated, unsupervised blacklisting system designed to

# provide real-time and highly accurate blacklisting based on network spread.

# When performing a lookup, DSPAM will automatically learn the inbound message

# as spam if the source IP is listed. Until an official public RABL server is

# available, this feature is only useful if you are running your own

# streamlined blackhole list server for internal reporting among multiple mail

# servers. Provide the name of the lookup zone below to use.

#

# This function performs standard reverse-octet.domain lookups, and while it

# will function with many RBLs, it's strongly discouraged to use those

# maintained by humans as they're often inaccurate and could hurt filter

# learning and accuracy.

#

#Lookup "sbl.yourdomain.com"

#

# RBLInoculate: If you want to inoculate the user from RBL'd messages it would

# have otherwise missed, set this to on.

#

#RBLInoculate on

#

# Notifications: Enable the sending of notification emails to users (first

# message, quarantine full, etc.)

#

Notifications   on

#

# Purge configuration: Set dspam_clean purge default options, if not otherwise

# specified on the commandline

#

#PurgeSignatures 14          # Stale signatures

#PurgeNeutral    90          # Tokens with neutralish probabilities

#PurgeUnused     90          # Unused tokens

#PurgeHapaxes    30          # Tokens with less than 5 hits (hapaxes)

#PurgeHits1S    15          # Tokens with only 1 spam hit

#PurgeHits1I    15          # Tokens with only 1 innocent hit

#

# Purge configuration for SQL-based installations using purge.sql

#

PurgeSignature off # Specified in purge.sql

PurgeNeutral   90

PurgeUnused    off # Specified in purge.sql

PurgeHapaxes   off # Specified in purge.sql

PurgeHits1S    off # Specified in purge.sql

PurgeHits1I    off # Specified in purge.sql

#

# Local Mail Exchangers: Used for source address tracking, tells DSPAM which

# mail exchangers are local and therefore should be ignored in the Received:

# header when tracking the source of an email. Note: you should use the address

# of the host as appears between brackets [ ] in the Received header.

#

LocalMX 127.0.0.1 10.200.100.73 

#

# Logging: Disabling logging for users will make usage graphs unavailable to

# them. Disabling system logging will make admin graphs unavailable.

#

SystemLog on

UserLog   on

#

# TrainPristine: for systems where the original message remains server side

# and can therefore be presented in pristine format for retraining. This option

# will cause DSPAM to cease all writing of signatures and DSPAM headers to the

# message, and deliver the message in as pristine format as possible. This mode

# REQUIRES that the original message in its pristine format (as of delivery)

# be presented for retraining, as in the case of webmail, imap, or other

# applications where the message is actually kept server-side during reading,

# and is preserved. DO NOT use this switch unless the original message can be

# presented for retraining with the ORIGINAL HEADERS and NO MODIFICATIONS.

#

# NOTE: You can't use this setting with dspam_train; if you're going to use it,

#       wait until after you train any corpora.

#

#TrainPristine on

#

# Opt: in or out; determines DSPAM's default filtering behavior. If this value

# is set to in, users must opt-in to filtering by dropping a .dspam file in

# /var/dspam/opt-in/user.dspam (or if you have homedirs configured, a .dspam

# folder in their home directory).  The default is opt-out, which means all

# users will be filtered unless a .nodspam file is dropped in

# /var/dspam/opt-out/user.nodspam

#

Opt out

#

# TrackSources: specify which (if any) source addresses to track and report

# them to syslog (mail.info). This is useful if you're running a firewall or

# blacklist and would like to use this information. Spam reporting also drops

# RABL blacklist files (see http://www.nuclearelephant.com/projects/rabl/).

#

TrackSources spam nonspam virus

#

# ParseToHeaders: In lieu of setting up individual aliases for each user,

# DSPAM can be configured to automatically parse the To: address for spam and

# false positive forwards. From there, it can be configured to either set the

# DSPAM user based on the username specified in the header and/or change the

# training class and source accordingly. The options below can be used to

# customize most common types of header parsing behavior to avoid the need for

# multiple aliases, or if using LMTP, aliases entirely.

#

# ParseToHeaders: Parse the To: headers of an incoming message. This must be

#                set to 'on' to use either of the following features.

#

# ChangeModeOnParse: Automatically change the class (to spam or innocent)

#   depending on whether spam- or notspam- was specified, and change the source

#   to 'error'. This is convenient if you're not using aliases at all, but

#   are delivering via LMTP.

#

# ChangeUserOnParse: Automatically change the username to match that specified

#   in the To: header. For example, spam-bob@domain.tld will set the username

#   to bob, ignoring any --user passed in. This may not always be desirable if

#   you are using virtual email addresses as usernames. Options:

#     on or user        take the portion before the @ sign only

#     full              take everything after the initial {spam,notspam}-.

#

ParseToHeaders on

ChangeModeOnParse on

ChangeUserOnParse full

#

# Broken MTA Options: Some MTAs don't support the proper functionality

# necessary. In these cases you can activate certain features in DSPAM to

# compensate. 'returnCodes' causes DSPAM to return an exit code of 99 if

# the message is spam, 0 if not, or a negative code if an error has occurred.

# Specifying 'case' causes DSPAM to force the input usernames to lowercase.

# Specifying 'lineStripping' causes DSPAM to strip ^M's from messages passed

# in.

#

#Broken returnCodes

Broken case

Broken lineStripping

#

# MaxMessageSize: You may specify a maximum message size for DSPAM to process.

# If the message is larger than the maximum size, it will be delivered

# without processing. Value is in bytes.

#

MaxMessageSize 20971520

#

# Virus Checking: If you are running clamd, DSPAM can perform stream-based

# virus checking using TCP. Uncomment the values below to enable virus

# checking.

#

# ClamAVResponse: reject (reject or drop the message with a permanent failure)

#                 accept (accept the message and quietly drop the message)

#                 spam   (treat as spam and quarantine/tag/whatever)

#

#ClamAVPort     3310

#ClamAVHost     127.0.0.1

#ClamAVResponse accept

# -- CLIENT / SERVER --

#

# Daemonized Server: If you are running DSPAM as a daemonized server using

# --daemon, the following parameters will override the default. Use the

# ServerPass option to set up accounts for each client machine. The DSPAM

# server will process and deliver the message based on the parameters

# specified. If you want the client machine to perform delivery, use

# the --stdout option in conjunction with a local setup.

#

#ServerPort             24

ServerQueueSize         32

ServerPID               /var/run/dspam/dspam.pid

#

# ServerMode specifies the type of LMTP server to start. This can be one of:

#     dspam: DSPAM-proprietary DLMTP server, for communicating with dspamc

#  standard: Standard LMTP server, for communicating with Postfix or other MTA

#      auto: Speak both DLMTP and LMTP; auto-detect by ServerPass.IDENT

#

ServerMode auto

# If supporting DLMTP (dspam) mode, dspam clients will require authentication

# as they will be passing in parameters. The idents below will be used to

# determine which clients will be speaking DLMTP, so if you will be using

# both LMTP and DLMTP from the same host, be sure to use something other

# than the server's hostname below (which will be sent by the MTA during a

# standard LMTP LHLO).

#

#ServerPass.Relay1      "secret"

#ServerPass.Relay2      "password"

# If supporting standard LMTP mode, server parameters will need to be specified

# here, as they will not be passed in by the mail server. The ServerIdent

# specifies the 250 response code ident sent back to connecting clients and

# should be set to the hostname of your server, or an alias.

#

# NOTE: If you specify --user in ServerParameters, the RCPT TO will be

#       used only for delivery, and not set as the active user for processing.

#

#ServerParameters       "--deliver=innocent,spam -d %u"

ServerParameters        "--deliver=innocent"

ServerIdent             "dspam.mydomain.com"

# If you wish to use a local domain socket instead of a TCP socket, uncomment

# the following. It is strongly recommended you use local domain sockets if

# you are running the client and server on the same machine, as it eliminates

# much of the bandwidth overhead.

#

ServerDomainSocketPath  "/var/run/dspam/dspam.sock"

#

# Client Mode: If you are running DSPAM in client/server mode, uncomment and

# set these variables. A ClientHost beginning with a / will be treated as

# a domain socket.

#

#ClientHost     /tmp/dspam.sock

#ClientIdent    "secret@Relay1"

#

#ClientHost     127.0.0.1

#ClientPort     24

#ClientIdent    "secret@Relay1"

#

ClientHost      /var/run/dspam/dspam.sock

# RABLQueue: Touch files in the RABL queue

# If you are a reporting streamlined blackhole list participant, you can

# touch ip addresses within the directory the rabl_client process is watching.

#

#RABLQueue       /var/spool/rabl

# DataSource: If you are using any type of data source that does not include

# email-like headers (such as documents), uncomment the line below. This

# will cause the entire input to be treated like a message "body"

#

#DataSource      document

# ProcessorWordFrequency: By default, words are only counted once per message.

# If you are classifying large documents, however, you may wish to count once

# per occurrence instead.

#

#ProcessorWordFrequency  occurrence

# ProcessorURLContext: By default, a URL context is generated for URLs, which

# records their tokens as separate from words found in documents. To use

# URL tokens in the same context as words, turn this feature off.

#

ProcessorURLContext on

# ProcessorBias: Bias causes the filter to lean more toward 'innocent', and

# usually greatly reduces false positives. It is the default behavior of

# most Bayesian filters (including dspam).

#

# NOTE: You probably DON'T want this if you're using Markovian Weighting, unless

# you are paranoid about false positives.

#

ProcessorBias on

## EOF
```

In your master.cf, please change the way you call DSPAM to something like this:

```
smtp      inet  n       -       n       -       -       smtpd
   -o content_filter=lmtp:unix:/var/run/dspam/dspam.sock
```

cheers

SteveB

----------

## steveb

Some info:

- TOE is much gentler on the database than TEFT.
- The OSB tokenizer is much more advanced than chain.
- Adding naive to the Algorithm (beside burton and graham) will help you get better accuracy.
- Using LMTP from Postfix to DSPAM is faster than piping from Postfix to the DSPAM binary.
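For reference, the first three suggestions map onto dspam.conf settings roughly like the following. This is only a sketch, assuming a DSPAM 3.8-style dspam.conf in which `Tokenizer` and `Algorithm` are separate directives (older releases select the tokenizer through `Feature` instead), so check the comments in your own dspam.conf before copying anything. The LMTP hand-off is the master.cf `content_filter` line shown earlier.

```
# Train only on error (TOE) rather than on everything (TEFT)
Preference "trainingMode=TOE"

# Orthogonal Sparse biGram tokenizer instead of 'chain'
Tokenizer osb

# Add naive alongside graham and burton
Algorithm graham burton naive
```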

// SteveB

----------

## DNAspark99

Wow, thanks a lot, there's a lot of 'subtle' information there that isn't really mentioned elsewhere

I'll give it a try and post up any questions I may encounter :p

thanks again!

----------

## steveb

 *DNAspark99 wrote:*   

> Wow, thanks a lot, there's a lot of 'subtle' information there that isn't really mentioned elsewhere

 It pays off to use DSPAM for years and hang out on the mailing list :)

// Steve

----------

## steveb

Are you open to new stuff? I could post some more info on how to make DSPAM more accurate. Just let me know.

Steve

----------

## DNAspark99

well, now that the forums are back online - yea, always open for more suggestions!...

It's finally actually classifying some spam right now - the thing is, it's still in 'testing' mode, so it's not being hit by a lot of 'real world' spam right now as no more than one 'testing' domain is currently pointed at it. However, I've been testing with my own spam, and after maybe 4-5 re-classifications in the web-ui it started to improve in its identification (and subsequent quarantine) of email.. this wasn't working at all before, now it does... thanks again for that - it's a step towards actual usability now.

Now, I'm left with two immediate issues, well ok, three. 

1: spam reporting for clients.  I'm still not clear on how to set up a universal 'fwd your spam that gets through here' address, aka, spam@mydomain.com, so any client user, say user2813@clientdomain813.com as well as user17@clientdomain632.com can use to report spam... and in fact, this may be futile anyways, due to issue #2...

2: as stated, I'm using this as an additional layer of filtering for an existing mail server. (aka 'standalone appliance mode'). So a domain's MX entry will be updated to point to this 'dspam.mydomain.com' as required... however, since clients are using the real mailserver, aka 'mail.mydomain.com' for SMTP, the current mailserver seems to be ignoring dns/MX entries for domains it knows are local anyways - so even if I have spam@mydomain.com set up to train dspam, mail forwarded from the client accounts just sees 'mydomain.com' as local, and looks for the 'spam@' user, and if it exists, the mail gets thrown into the mailbox without ever traversing out through the net and back in through the dspam filter (proper MX)... make sense? 

For this, I may try getting tricky with some aliases and forwards, basically using a local forward (spam@mydomain.com) to forward out to an externally hosted mail domain, (spam@externalhostedmx.com), then have that be a forward BACK to 'real_spam@mydomain.com'... which should do a proper MX lookup and run it through the dspam box... I'm just hoping the dspam headers, tokens and whatnot aren't mangled in the forwarding and can still be processed as necessary ... assuming I can't somehow force the mailserver to do proper MX lookups and route accordingly, I'll have to ask the vendor about that one...it should be doable, it's an exim-based SMTP server....I'll be looking into that soon... but then there's the next issue:

3: Notifications - 'first run' notifications go out twice the first time an address receives mail.. it fails to create it the first time. Any ideas?

Even if you don't have answers or suggestions to any of my questions here, I'm all ears for any additional DSPAM info you may feel like sharing :D

----------

## steveb

 *DNAspark99 wrote:*   

> well, now that the forums are back online - yea, always open for more suggestions!...
> 
> It's finally actually classifying some spam right now - the thing is, it's still in 'testing' mode, so it's not being hit by a lot of 'real world' spam right now as no more than one 'testing' domain is currently pointed at it. However, I've been testing with my own spam, and after maybe 4-5 re-classifications in the web-ui it started to improve in its identification (and subsequent quarantine) of email.. this wasn't working at all before, now it does... thanks again for that - it's a step towards actual usability now.

 Cool

 *DNAspark99 wrote:*   

> Now, I'm left with two immediate issues, well ok, three. 
> 
> 1: spam reporting for clients.  I'm still not clear on how to set up a universal 'fwd your spam that gets through here' address, aka, spam@mydomain.com, so any client user, say user2813@clientdomain813.com as well as user17@clientdomain632.com can use to report spam... and in fact, this may be futile anyways, due to issue #2...

 This is easy. The problem you will have is that you need to secure these global ham/spam aliases. Do you force your users to use SMTP AUTH? If not, would it be a problem to at least force SMTP AUTH on those two global ham/spam aliases?

 *DNAspark99 wrote:*   

> 2: as stated, I'm using this as an additional layer of filtering for an existing mail server. (aka 'standalone appliance mode'). So a domain's MX entry will be updated to point to this 'dspam.mydomain.com' as required... however, since clients are using the real mailserver, aka 'mail.mydomain.com' for SMTP, the current mailserver seems to be ignoring dns/MX entries for domains it knows are local anyways - so even if I have spam@mydomain.com set up to train dspam, mail forwarded from the client accounts just sees 'mydomain.com' as local, and looks for the 'spam@' user, and if it exists, the mail gets thrown into the mailbox without ever traversing out through the net and back in through the dspam filter (proper MX)... make sense? 
> 
> For this, I may try getting tricky with some aliases and forwards, basically using a local forward (spam@mydomain.com) to forward out to an externally hosted mail domain, (spam@externalhostedmx.com), then have that be a forward BACK to 'real_spam@mydomain.com'... which should do a proper MX lookup and run it through the dspam box... I'm just hoping the dspam headers, tokens and whatnot aren't mangled in the forwarding and can still be processed as necessary ... assuming I can't somehow force the mailserver to do proper MX lookups and route accordingly, I'll have to ask the vendor about that one...it should be doable, it's an exim-based SMTP server....I'll be looking into that soon... but then there's the next issue:

 Oh! Exim! I am more of a Postfix person but Exim is good as well. I just don't have enough know-how for Exim.

 *DNAspark99 wrote:*   

> 3: Notifications - 'first run' notifications go out twice the first time an address receives mail.. it fails to create it the first time. Any ideas?

 What? I don't understand the question. I understand that it goes out twice but I don't understand exactly what you mean by "it fails to create it the first time". Create what? The notification message?

 *DNAspark99 wrote:*   

> Even if you don't have answers or suggestions to any of my questions here, I'm all ears for any additional DSPAM info you may feel like sharing 

 Okay. I would suggest you use DSPAM's unique feature called groups. With groups you could pretrain a group with ham/spam and then use that group to increase the accuracy for your users. Should I explain what to do to get that working?

// Steve

----------

## DNAspark99

 *steveb wrote:*   

> 
> 
>  *DNAspark99 wrote:*   Now, I'm left with two immediate issues, well ok, three. 
> 
> 1: spam reporting for clients.  I'm still not clear on how to set up a universal 'fwd your spam that gets through here' address, aka, spam@mydomain.com, so any client user, say user2813@clientdomain813.com as well as user17@clientdomain632.com can use to report spam... and in fact, this may be futile anyways, due to issue #2... 
> ...

 

SMTP AUTH is implemented on the mailserver, but it's not mandatory right now, because relay access is also granted by way of successful POP/IMAP auth. For the moment, I'm just making users re-train through the web-ui, now that I've got mod_auth_imap in place... not really 'easy' for the typical 'user', but oh well, it works! How would I go about setting up these global aliases - and would it train per-user or be considered training the 'global' settings of dspam?

 *steveb wrote:*   

> 
> 
>  *DNAspark99 wrote:*   3: Notifications - 'first run' notifications go out twice the first time an address receives mail.. it fails to create it the first time. Any ideas?
> 
> What? I don't understand the question. I understand that it goes out twice but I don't understand exactly what you mean by "it fails to create it the first time". Create what? The notification message?

 

Well, so far I've got the dspam_virtual_uids filled with our user accounts... the first time a user's address is mailed, the domain and/or user dir gets created under /var/spool/dspam/data/domainname/username - and I've figured out that since dspam recognizes this is the first time the account is being 'dealt with', it tries to send out the 'first run' notification, somehow before the user dir is created - and since the domain and/or user dir didn't exist, it can't write the /var/spool/dspam/data/domainname/username/username.firstrun file... 

However, the user dir is created sometime immediately after, as the mail does go through, the user dir now exists, so it gets a username.log and username.stats file successfully written. On the 2nd email that goes to the account, the username.firstrun file is written without issue, and the user will not get another 'first run' notification unless this file is removed. So somewhere in there I need to ensure the user dir is created first before it attempts to write the user.firstrun file.... any ideas?

 *steveb wrote:*   

> 
> 
>  *DNAspark99 wrote:*   Even if you don't have answers or suggestions to any of my questions here, I'm all ears for any additional DSPAM info you may feel like sharing
> 
> Okay. I would suggest you use DSPAM's unique feature called groups. With groups you could pretrain a group with ham/spam and then use that group to increase the accuracy for your users. Should I explain what to do to get that working?
> 
> // Steve

 

Yes please, I've begun to read up on this 'global group' concept, but haven't gotten around to playing with any settings just yet...

----------

## steveb

 *DNAspark99 wrote:*   

> SMTP AUTH is implemented on the mailserver, but it's not mandatory right now, because relay access is also granted by way of successful POP/IMAP auth. For the moment, I'm just making users re-train through the web-ui, now that I've got mod_auth_imap in place... not really 'easy' for the typical 'user', but oh well, it works! How would I go about setting up these global aliases - and would it train per-user or be considered training the 'global' settings of dspam?

 One way would be to add a restriction class into Postfix main.cf:

```
smtpd_restriction_classes =
  ...
  enforce_auth
  ...

smtpd_sender_restrictions =
  ...
  check_recipient_access hash:$config_directory/dspam_retrain_rcpt
  ...

enforce_auth =
  reject_sender_login_mismatch
  permit_sasl_authenticated
  reject
```

The map file (dspam_retrain_rcpt) would then be something like this:

```
spam@example.com      enforce_auth
ham@example.com       enforce_auth
```

This would enforce authentication on the training aliases. Now what you need to do is reroute the mail sent to those addresses to DSPAM, for example with a simple transport. In main.cf:

```
transport_maps =
  ...
  hash:$config_directory/dspam_retrain_transport
  ...
```

The map file (dspam_retrain_transport) would then be something like this:

```
spam@example.com      dspam-spam
ham@example.com       dspam-ham
```

In master.cf you add the two new transports:

```
dspam-ham     unix   -      n       n       -        -      pipe
   flags=Rhq user=dspam:mail argv=/usr/bin/dspam
   --user ${sender}
   --class=innocent
   --source=error
   --deliver=spam,innocent
   --stdout

dspam-spam     unix   -      n       n       -        -      pipe
   flags=Rhq user=dspam:mail argv=/usr/bin/dspam
   --user ${sender}
   --class=spam
   --source=error
   --deliver=spam,innocent
   --stdout
```

Now bouncing a spam/ham mail (one with a signature) to one of the above aliases will relearn the mail for the user who sent the mail. The reason for SMTP AUTH is that the aliases need to be protected from the outside world. Otherwise others would be able to mess with the training. And you don't want that.

 *DNAspark99 wrote:*   

> Well, so far I've got the dspam_virtual_uids filled with our user accounts... the first time a user's address is mailed, the domain and/or user dir gets created under /var/spool/dspam/data/domainname/username - and I've figured out that since dspam recognizes this is the first time the account is being 'dealt with', it tries to send out the 'first run' notification, somehow before the user dir is created - and since the domain and/or user dir didn't exist, it can't write the /var/spool/dspam/data/domainname/username/username.firstrun file... 
> 
> However, the user dir is created sometime immediately after, as the mail does go through, the user dir now exists, so it gets a username.log and username.stats file successfully written. On the 2nd email that goes to the account, the username.firstrun file is written without issue, and the user will not get another 'first run' notification unless this file is removed. So somewhere in there I need to ensure the user dir is created first before it attempts to write the user.firstrun file.... any ideas?

 Aha. I see. I don't know why the directory does not get created. Is it a problem if the notification arrives after the second message?
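One workaround that might sidestep the ordering problem is to create the per-user directories ahead of time, so the `.firstrun` file has somewhere to go on the very first message. The following is a minimal Python sketch and not something taken from the DSPAM docs: it assumes the `/var/spool/dspam/data/domainname/username` layout described in the quote above, and `user_dir` and `precreate_user_dirs` are hypothetical helper names. The directories must end up owned by the user DSPAM runs as, so run it as that user or fix the ownership afterwards.

```python
import os


def user_dir(data_root, address):
    """Return the data directory for one address: <root>/<domain>/<user>."""
    user, sep, domain = address.partition("@")
    # Refuse anything that is not a plain user@domain (and could escape the root).
    if not sep or not user or not domain or "/" in address:
        raise ValueError("not a plain user@domain address: %r" % address)
    return os.path.join(data_root, domain, user)


def precreate_user_dirs(data_root, addresses):
    """Create the directory for each address if missing; return the paths."""
    paths = []
    for address in addresses:
        path = user_dir(data_root, address)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths
```

Calling `precreate_user_dirs("/var/spool/dspam/data", addresses)` with the same addresses that were loaded into dspam_virtual_uids would cover every known account. Whether this actually stops the duplicate notification is untested here.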

 *DNAspark99 wrote:*   

> Yes please, I've begun to read up on this 'global group' concept, but haven't gotten around to playing with any settings just yet...

 Are you using the preference extension DSPAM offers with MySQL?

----------

## DNAspark99

 *steveb wrote:*   

> 
> 
> Now bouncing a spam/ham mail (one with a signature) to one of the above aliases will relearn the mail for the user who sent the mail. The reason for SMTP AUTH is that the aliases need to be protected from the outside world. Otherwise others would be able to mess with the training. And you don't want that.
> 
> 

 

I've been playing with this a little, but it seems there's one problem with this. With this being set up in 'appliance' mode, rather than 'full mailserver mode', no clients will be using this machine as their SMTP server - it's just a domain's MX filter for incoming - so, they're not authenticating against it anyways and the mail they send to the spam/ham addresses will get refused, no?

```
Sep 13 09:54:33 dspambox postfix/smtpd[9029]: NOQUEUE: reject: RCPT from py-out-1112.google.com[64.233.166.180]: 554 5.7.1 <spam@mydomain.com>: Recipient address rejected: Access denied; from=<mygmailaddress@gmail.com> to=<spam@mydomain.com> proto=ESMTP helo=<py-out-1112.google.com>

```

Supposing for a second that SMTP auth is not required, wouldn't the system take the --user "${sender}" bit and retrain based on that anyways, aka, a spammer would have to specifically target the spam/ham@ addresses and forge headers for each of our users? I can see the value of the auth, but in this setup it may not be practical...  

 *steveb wrote:*   

> 
> 
> Are you using the preference extension DSPAM offers with MySQL?

 

Probably not... is this as simple as the 'Preference' settings in dspam.conf? If so, this is what I've currently got enabled...

```
Preference "trainingMode=TOE"      # TEFT, TUM, TOE
Preference "spamAction=quarantine"      # tag, quarantine, deliver
Preference "signatureLocation=headers"   # 'message' or 'headers'
Preference "spamSubject="
Preference "statisticalSedation=5"   # 0 to 9
Preference "enableBNR=on"      # on, off
Preference "showFactors=off"      # on, off
Preference "enableWhitelist=on"      # on, off
Preference "whitelistThreshold=5"
```

----------

## steveb

 *DNAspark99 wrote:*   

> I've been playing with this a little, but it seems there's one problem with this. With this being set up in 'appliance' mode, rather than 'full mailserver mode', no clients will be using this machine as their SMTP server - it's just a domain's MX filter for incoming - so, they're not authenticating against it anyways and the mail they send to the spam/ham addresses will get refused, no?

 But the user has to send the mail somewhere, and that will be their real, original mail server. That is where you can catch unauthenticated users. It does not have to be on the system where DSPAM is running.

 *DNAspark99 wrote:*   

> 
> 
> ```
> Sep 13 09:54:33 dspambox postfix/smtpd[9029]: NOQUEUE: reject: RCPT from py-out-1112.google.com[64.233.166.180]: 554 5.7.1 <spam@mydomain.com>: Recipient address rejected: Access denied; from=<mygmailaddress@gmail.com> to=<spam@mydomain.com> proto=ESMTP helo=<py-out-1112.google.com>
> 
> ...

 Yes! Don't allow externals to send to your aliases.

 *DNAspark99 wrote:*   

> Supposing for a second that SMTP auth is not required, wouldn't the system take the --user "${sender}" bit and retrain based on that anyway? In other words, a spammer would have to specifically target the spam/ham@ addresses and forge headers for each of our users. I can see the value of the auth, but in this setup it may not be practical...

 An attacker would need to forge the sender address and would also need a valid DSPAM signature. Not just any signature: a real signature belonging to the user being forged. The reason is that DSPAM does not relearn the whole message. It just flips the tokens recorded for that signature, so only the signature is needed. The content of the original mail is not needed at all.

 *DNAspark99 wrote:*   

> Probably not... is this as simple as the 'Preference' settings in dspam.conf? If so, this is what I've currently got enabled...
> 
> ```
> Preference "trainingMode=TOE"      # TEFT, TUM, TOE
> 
> ...

 No! This has nothing to do with my question. Please check whether you have the DSPAM preference table in your MySQL database. Then execute this and tell me if you get a match:

```
dspam --version 2>&1|grep "\-\-enable\-preferences\-extension"
```

// SteveB

----------

## DNAspark99

 *steveb wrote:*   

> ...But the user has to send the mail somewhere, and that will be their real, original mail server. That is where you can catch unauthenticated users. It does not have to be on the system where DSPAM is running.

 

Ok, I'll look into bridging the two machines for smtp auth, as they're currently linked for many other things (the dspam db already runs on the mailserver, as dspam_virtual_uids is a sql view of the mailserver's db of equivalent values)... smtp auth must be in there somewhere.

 *steveb wrote:*   

> 
> 
> No! This has nothing to do with my question. Please check whether you have the DSPAM preference table in your MySQL database. Then execute this and tell me if you get a match:
> 
> ```
> ...

 

aaah, ok. Yes, it's enabled, and yes, I've got a dspam_preferences table; currently it's holding:

```
+-----+---------------------+------------+

| uid | preference          | value      |

+-----+---------------------+------------+

| 197 | trainingMode        | TOE        | 

| 197 | spamAction          | quarantine | 

| 197 | signatureLocation   | headers    | 

| 197 | spamSubject         |            | 

| 197 | statisticalSedation | 5          | 

| 197 | enableBNR           | on         | 

| 197 | optOut              | off        | 

| 197 | optIn               | off        | 

| 197 | showFactors         | on         | 

| 197 | enableWhitelist     | on         | 

...(next uid, same stuff)

```

----------

## steveb

Okay. Let's create a group called global_merged_user and use that group as merged parent for all users on your server:

```
dspam_admin add preferences global_merged_user "enableBNR" = "on"

dspam_admin add preferences global_merged_user "enableWhitelist" = "off"

dspam_admin add preferences global_merged_user "optIn" = "on"

dspam_admin add preferences global_merged_user "optOut" = "off"

dspam_admin add preferences global_merged_user "showFactors" = "off"

dspam_admin add preferences global_merged_user "signatureLocation" = "headers"

dspam_admin add preferences global_merged_user "spamAction" = "tag"

dspam_admin add preferences global_merged_user "spamSubject" = ""

dspam_admin add preferences global_merged_user "statisticalSedation" = "5"

dspam_admin add preferences global_merged_user "trainingMode" = "TOE"

echo "global_merged_user:merged:*">>/var/spool/dspam/group
```

It is important that we DISABLE whitelisting here. Whitelisting is okay in general, but not for the merged group user: enabling it would only influence our training, and we don't want that. We also don't need extras like a changed subject line. We just want to use this user for better accuracy, that's all, so we disable every flag that isn't needed.

Okay. Now let's train that group/user with the TREC 2006 training data:

```
mkdir -p /tmp/spam_training/

cd /tmp/spam_training/

wget --referer=http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/foo06 http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/trec06p.tgz

tar xzf trec06p.tgz

rm -f trec06p.tgz

cd ./trec06p/full/

dspam_train global_merged_user -i index

cd /tmp/spam_training/

rm -rf ./trec06p

wget --referer=http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/foo06 http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/trec06c.tgz

tar xzf trec06c.tgz

rm -f trec06c.tgz

cd ./trec06c/full/

dspam_train global_merged_user -i index

cd /tmp/spam_training/

rm -rf ./trec06c
```

Okay... Now let's clean the data from unimportant tokens:

```
dspam_clean -s0 -p0 global_merged_user
```

And now let's train it with the SA training corpus:

```
mkdir -p /tmp/spam_training/

cd /tmp/spam_training/

for foo in 20030228_easy_ham 20030228_easy_ham_2 20030228_hard_ham 20030228_spam 20050311_spam_2;do wget http://spamassassin.apache.org/publiccorpus/${foo}.tar.bz2;tar xjf ./${foo}.tar.bz2;rm -f ./${foo}.tar.bz2;done

dspam_train global_merged_user ./spam_2 ./easy_ham_2

dspam_train global_merged_user ./spam_2 ./easy_ham

dspam_train global_merged_user ./spam_2 ./hard_ham

rm -rf ./spam ./spam_2 ./easy_ham_2 ./hard_ham ./easy_ham

cd /tmp/spam_training/
```

Clean the data of unimportant tokens again:

```
dspam_clean -s0 -p0 global_merged_user
```

I think you get the point. If you need more stuff (like more links to spam/ham corpora) then let me know.

// SteveB

----------

## DNAspark99

wow, thanks a lot, it took a while to process all that (it still is running through it actually)

my next question is regarding aliases, and how best to handle them. Plugging postfix into the existing db may or may not work, I'm not certain what format postfix requires if it's looking to mysql for its aliases... any suggestions?

----------

## steveb

I don't understand your question with the aliases. What exactly do you want to do? Or can you rephrase the question? (English is not my native language).

Where do you today maintain your users for Postfix? In MySQL? LDAP? Other?

// SteveB

----------

## DNAspark99

Ok I figured out why my aliases weren't working... alias_maps vs virtual_alias_maps, doh! :p

I guess my next immediate challenge is straightening out 'real users' vs the aliases, so that user preferences and quarantines are kept sane...currently user dirs are created for the aliases as well, and I know I can change this with dspam_admin or one of the other tools, but I've gotta find a way to automate that...

----------

## steveb

Do I understand that right?

Assume your email is DNAspark99@example.com and you have aliases belonging to DNAspark99@example.com. For example, you have DNA.spark99@example.com, DNAspark.99@example.com and DNAspark@example.com:

DNAspark99@example.com -> DNAspark99@example.com

DNA.spark99@example.com -> DNAspark99@example.com

DNAspark.99@example.com -> DNAspark99@example.com

DNAspark@example.com -> DNAspark99@example.com

In your current setup DSPAM would create a new user directory for each of the above-mentioned addresses. Is that right? And you want to have just one, for DNAspark99@example.com only. Is that right?

Where are your mail users currently maintained (not the DSPAM users. I mean the real email users on the other system where you have SA running)? In a SQL database? In LDAP? In normal flat files? Somewhere else?

// SteveB

----------

## DNAspark99

I solved that particular issue - basically, dspam+postfix were unaware of aliases vs real accounts, so if an email was addressed to an alias, dspam would do a lookup of it in dspam_virtual_uids (which is basically a mysql view of the User table in the mail database on the real mailserver), and if the user was not found, it would be created in dspam_virtual_uids, a userdir would be created in /var/spool/dspam/data/$domain/$address, and the mail would then be handed off to the real mailserver, which would interpret the alias and deliver accordingly.

The creation of these entries in the db, and the resulting userdir creation for aliases, was mostly fixed by using virtual_alias_maps in postfix, set up to query the mail server's db for the aliases... so now, for any mail addressed to an alias, postfix+dspam 'know' it's an alias, break it apart into its real recipients, do a lookup in dspam_virtual_uids for each user and their individual settings, and deliver accordingly. I'm very happy with this behavior, as now aliases don't get their own user dir and settings; instead, each user can maintain their own personal preferences... fantastic... BUT...

This behavior of dspam creating users in dspam_virtual_uids is... undesirable. If a mail comes in to 'nonexistantuser@domain.com', and that address is NOT a real user OR an alias... dspam will go ahead and create the user+uid in dspam_virtual_uids, and set up a user dir for them, before passing the email through to the real mailserver, which then realizes the address doesn't exist, and creates a bounce announcing that fact.

Is there a way to limit dspam from creating a user if that user doesn't exist in dspam_virtual_uids? Not only is the creation of garbage user directories in /var/spool/dspam/data/$domain undesirable, but as the dspam_virtual_uids is actually a view to the real user table on the mailserver, the creation of 'fake' users there is polluting that real database!

I'm going to talk to one of our sql guys and see if he can remove write access for dspam to that _one_ table, and maybe that will stop the creation of the directories as well - but if there's an internal config setting that can limit dspam's ability to create non-existent users, that would be ideal...

----------

## steveb

I still don't get it. Where do you maintain your real users? In what storage? Is it MySQL? You avoid answering me that part.

How do you create those users? With a tool/frontend/scripts/whatever?

Could you hook into this process with DSPAM? If you can do that, then turn DSPAM into Opt-In mode and when creating a new user add him into the preferences as Opt-In. This will force DSPAM only to consider/filter those users/emails which have opted into DSPAM.

// SteveB

----------

## DNAspark99

Yea, the mailserver is actually an @mail (atmail) packaged solution with a web interface that stores everything in MySQL.

I'll look into plugging dspam into this to create opt-in users if removing write access to that single table doesn't solve the issue...

----------

## DNAspark99

Now, I'm no postfix expert, but I've been playing around with 'check_recipient_access' and the like, and as I suspected, postfix can be used to verify the recipient address and, if it doesn't exist, squash the email before it reaches dspam:

with (at minimum) these settings in main.cf:

```
smtpd_recipient_restrictions = permit_mynetworks,

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_users.cf

        reject

```

...and the sql statement from valid_users.cf being:

```
SELECT status FROM dspam_valid_users WHERE address = '%s'
```

...and the dspam_valid_users table looking something like:

```

+-------------------------+--------+

| address                 | status |

+-------------------------+--------+

| validuser1@mydomain.com | OK     |

+-------------------------+--------+

```

..this does indeed reject mail for invalid/non-existent users at the smtp level - BEFORE it gets passed to dspam - so no user dirs or uids are being created!! 
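To make the order of evaluation explicit, here is a minimal Python sketch of how a restriction list like the one above gets walked: each restriction either decides or passes, and the first decision wins. It is only an illustration, not how Postfix is implemented. The `MYNETWORKS` value is an assumption (it matches the `mynetworks` setting that shows up later in this thread), the dict is a stand-in for the `dspam_valid_users` lookup, and all function names are made up.

```python
import ipaddress

# Assumed value; stands in for the mynetworks setting in main.cf.
MYNETWORKS = [ipaddress.ip_network("10.100.100.0/24")]

# Stand-in for the dspam_valid_users table queried via valid_users.cf.
VALID_USERS = {"validuser1@mydomain.com": "OK"}


def permit_mynetworks(client_ip, rcpt):
    """Permit clients inside mynetworks, otherwise pass to the next check."""
    ip = ipaddress.ip_address(client_ip)
    return "PERMIT" if any(ip in net for net in MYNETWORKS) else "DUNNO"


def check_recipient_access(client_ip, rcpt):
    """A lookup hit returning OK permits; a miss passes to the next check."""
    return "PERMIT" if VALID_USERS.get(rcpt) == "OK" else "DUNNO"


def reject(client_ip, rcpt):
    """Final catch-all: anything that got this far is refused."""
    return "REJECT"


RESTRICTIONS = [permit_mynetworks, check_recipient_access, reject]


def evaluate(client_ip, rcpt):
    """Walk the list in order and return the first real decision."""
    for restriction in RESTRICTIONS:
        verdict = restriction(client_ip, rcpt)
        if verdict != "DUNNO":
            return verdict
    return "DUNNO"  # not reached here, because the list ends in reject


if __name__ == "__main__":
    print(evaluate("64.233.166.180", "validuser1@mydomain.com"))
    print(evaluate("64.233.166.180", "nonexistantuser@mydomain.com"))
```

The point of the trailing `reject` is that an address missing from the table never reaches dspam at all, which is why no uid or user dir gets created for it.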

So now I'm just going to fine-tune this a bit in the morning, get this table to use another sql view of the mailserver's active users table, duplicate a similar thing for the aliases, and I should finally be able to start deploying this for initial client consumption and feedback, yay!

Thanks again for all the help, ... I'm sure I'll have more questions pretty soon here anyways...lol  :Very Happy:  cheers

----------

## DNAspark99

Hrm, ok, ALMOST there... it seems that with smtpd_recipient_restrictions + check_recipient_access enabled, sure, it stops non-existent users at the smtp level... but I'm clearly missing something else, because domain aliases are not being interpreted now, where they were before... 

ok... let me try to explain. this may get confusing tho. I'm sort of thinking out loud here, and if any postfix gurus happen to read this and have a suggestion, great! :p

Using the following example context of required operation (which is working on the real mailserver just fine), there is:

```

domain1.com : primary domain

domain2.com : alias domain2 to domain1

aliasuser  --> user1

user1@domain1.com <- the real address

```

so, under this setup, the user can receive email at 4 different address combinations:

```

user1@domain1.com

user1@domain2.com

aliasuser@domain1.com

aliasuser@domain2.com 

```

Now, with the following settings defined in postfix's main.cf

```

virtual_mailbox_domains = mysql:/etc/postfix/mysql_configs/domains.cf

virtual_mailbox_maps = mysql:/etc/postfix/mysql_configs/atmail_users.cf

virtual_alias_maps = mysql:/etc/postfix/mysql_configs/atmail_aliases.cf

virtual_transport = lmtp:unix:/var/run/dspam/dspam.sock

```

This works as expected; it accepts mail for all 4 of those address+alias combinations, and in fact it breaks the aliases down into the 'real' address before handing it off to dspam, which means it's not creating an entry in dspam_virtual_uids or creating a userdir for the aliases, which is great; aliases are treated as such, and there is only one set of settings for the 'real user'...

however, as mentioned, email to 'nonexistantuser@domain1.com' (or domain2.com) can not be verified or looked up by dspam, so it goes ahead and creates a uid and userdir for that 'nonexistantuser', hands it off to the real mailserver, which of course then bounces the mail for 'unroutable address', as it should. 

With me so far?

To combat this behavior of dspam treating 'nonexistantuser' as a valid-not-yet-setup user, I realized postfix should be able to connect to the MySQL db on the real mailserver and verify the address on its own. And it can...sort of...

```

smtpd_recipient_restrictions = permit_mynetworks

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_alias.cf

        reject

```

So with this setup, which basically plugs into mysql, looks up the recipient in either the user table or the alias table, and if valid, it returns an 'OK' status and the mail can be delivered.. this is ideal... but there's one snag:

domain aliases are different, because they're not as direct as just pointing one address to another... instead they use domains, so the alias table does not have an entry for every possible address under an aliased domain... so, in effect, using my example above, since domain2.com is an alias to domain1.com, 'user1@domain2.com' simply does not exist in this table, nor would 'aliasuser@domain2.com'.

The mailserver itself has no issue with these, because it's obviously resolving the domain alias first, ...so this is what I need to figure out for postfix and the smtpd_recipient_restrictions check_recipient_access settings. 

I need it to check domain2.com and resolve it to domain1.com somehow. The weird thing is that without the smtpd_recipient_restrictions, it DOES this.. I've been reading docs on smtpd_recipient_restrictions and other directives, but haven't come across anything that's worked.

I suspect I may need to do something different with virtual_alias_maps or virtual_alias_domains, but so far, no luck, and the documentation is a bit obfuscated on this matter...

----------

## steveb

 *DNAspark99 wrote:*   

> Hrm, ok, ALMOST there... it seems that with smtpd_recipient_restrictions + check_recipient_access enabled, sure, it stops non-existent users at the smtp level... but I'm clearly missing something else, because domain aliases are not being interpreted now, where they were before... 
> 
> ok... let me try to explain. this may get confusing tho. I'm sort of thinking out loud here, and if any postfix gurus happen to read this and have a suggestion, great! :p
> 
> Using the following example context of required operation (which is working on the real mailserver just fine), there is:
> ...

 That is not the way check_recipient_access works. Postfix does multiple lookups when checking an access table: first the full address, then the domain part, and then the local part.
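As a rough illustration of that lookup order, here is a small Python sketch that lists the keys tried for one recipient. It is a simplification under stated assumptions: parent-domain lookups and address extensions are left out, and the function name is made up.

```python
def access_lookup_keys(address):
    """Return the lookup keys for one recipient, in the order described above.

    Simplified sketch: full address, then the domain part, then the local
    part written as "user@". Parent domains and address extensions are
    deliberately ignored.
    """
    local, _, domain = address.rpartition("@")
    return [f"{local}@{domain}", domain, f"{local}@"]


if __name__ == "__main__":
    for key in access_lookup_keys("user1@domain2.com"):
        print(key)
```

With a query of the form `WHERE address = '%s'`, each of these keys is substituted for `%s` in turn, so a table that only lists full addresses under domain1.com produces no match for any of the three.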

Could you be so nice and post the table structure and the content with the two example domains/usernames/aliases?

 *DNAspark99 wrote:*   

> The mailserver itself has no issue with these, because it's obviously resolving the domain alias first,

 Wrong! It does the full address first, then the domain, then the local part. But the domain alias gets resolved by the cleanup job.

 *DNAspark99 wrote:*   

> ...so this is what I need to figure out for postfix and the smtpd_recipient_restrictions check_recipient_access settings. 
> 
> I need it to check domain2.com and resolve it to domain1.com somehow. The weird thing is that without the smtpd_recipient_restrictions, it DOES this.. I've been reading docs on smtpd_recipient_restrictions and other directives, but haven't come across anything that's worked.

 Be nice and post the output of postconf -n.

 *DNAspark99 wrote:*   

> I suspect I may need to do something different with virtual_alias_maps or virtual_alias_domains, but so far, no luck, and the documentation is a bit obfuscated on this matter...

 So are you.  :Smile: 

Post more info. Post the structure of the MySQL tables you are using. Post the maps you use to access MySQL. etc... The more you post, the better we can help.

// SteveB

----------

## DNAspark99

LOL, pardon my obfuscation, it comes from my confusion :p

postconf -n :

```

command_directory = /usr/sbin

config_directory = /etc/postfix

daemon_directory = /usr/lib/postfix

debug_peer_level = 2

disable_vrfy_command = yes

home_mailbox = .maildir/

html_directory = /usr/share/doc/postfix-2.3.6/html

mail_owner = postfix

mailq_path = /usr/bin/mailq

manpage_directory = /usr/share/man

mydomain = gravit-e.ca

myhostname = spamstop.mydomain.com

mynetworks = 10.100.100.0/24

myorigin = spamstop.mydomain.com

newaliases_path = /usr/bin/newaliases

queue_directory = /var/spool/postfix

readme_directory = /usr/share/doc/postfix-2.3.6/readme

sample_directory = /etc/postfix

sendmail_path = /usr/sbin/sendmail

setgid_group = postdrop

smtpd_helo_required = yes

smtpd_recipient_restrictions = permit_mynetworks        

   check_recipient_access mysql:/etc/postfix/mysql_configs/valid_users.cf  

   check_recipient_access mysql:/etc/postfix/mysql_configs/valid_alias.cf  

   reject

transport_maps = hash:/etc/postfix/hashed_configs/transport     mysql:/etc/postfix/mysql_configs/domains.cf

unknown_local_recipient_reject_code = 550

virtual_alias_maps = mysql:/etc/postfix/mysql_configs/atmail_aliases.cf

virtual_mailbox_domains = mysql:/etc/postfix/mysql_configs/domains.cf

virtual_mailbox_maps = mysql:/etc/postfix/mysql_configs/atmail_users.cf

virtual_transport = lmtp:unix:/var/run/dspam/dspam.sock

```

important bits of the cf files:

domains.cf:

```
query = SELECT Hostname FROM Domains WHERE Hostname = '%s'
```

atmail_users.cf:

```
query = SELECT Account FROM Users WHERE Account='%s'
```

atmail_aliases.cf: 

```
query = SELECT AliasTo FROM MailAliases WHERE AliasName = '%s'
```

valid_users.cf:

```
query = SELECT "OK" AS status FROM dspam_valid_users WHERE address = '%s'
```

valid_alias.cf:

```
query = SELECT distinct "OK" AS status FROM dspam_valid_alias WHERE address = '%s'
```

The Tables:

1: atmail.Domains is just a list of ALL domains, including aliased ones, in the 'Hostname' field

```

+----------+--------------+------+-----+---------+-------+

| Field    | Type         | Null | Key | Default | Extra |

+----------+--------------+------+-----+---------+-------+

| Hostname | varchar(255) | NO   | PRI |         |       | 

+----------+--------------+------+-----+---------+-------+
```

2: atmail.Users : this table is basically all the info pertaining to the user's personal info; the ID and the Account (= email address) are the pertinent ones...

```
+-------------------+-----+

| Account           | id  |

+-------------------+-----+

| user1@domain1.com | 101 |
```

3: atmail.MailAliases : this table has the actual alias, 'AliasName', and the recipients it points to ('AliasTo'). An alias with multiple recipients simply gets multiple entries.

```
+------------+---------------------+------+-----+---------+----------------+

| Field      | Type                | Null | Key | Default | Extra          |

+------------+---------------------+------+-----+---------+----------------+

| AliasName  | varchar(200)        | YES  |     | NULL    |                | 

| AliasTo    | varchar(200)        | YES  |     | NULL    |                | 

+------------+---------------------+------+-----+---------+----------------+

+-----------------------+-------------------+

| AliasName             | AliasTo           |

+-----------------------+-------------------+

| @domain2.com          | @domain1.com      |

| aliasuser@domain1.com | user1@domain1.com |

| aliasuser@domain1.com | user2@domain1.com |

```

the dspam_valid_users and dspam_valid_alias tables, which are just used for 'check_recipient_access', are just views to the respective atmail tables - Users and MailAliases, so it's a really simple list of addresses, nothing more. If the address is found, the select statement returns 'OK', and the email goes through. 

The problem is - those statements won't find anything when they check user1@domain2.com, because that simply isn't in any of the tables... so the OK is not returned, and postfix rejects the mail

----------

## DNAspark99

ding ding ding, and the winner is... more SQL statements! 

(I'm no sql pro, but our programmers sure are handy to talk to about it)

The updated config now includes two more query files:

```

smtpd_recipient_restrictions = permit_mynetworks

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_alias.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_aliasdomain_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_aliasdomain_alias.cf

        reject

```

valid_aliasdomain_users.cf:

```
query = SELECT "OK" AS status FROM Users WHERE Account = CONCAT("%u", (SELECT AliasTo FROM MailAliases WHERE AliasName = '@%d') ) 
```

valid_aliasdomain_alias.cf:

```
query = SELECT distinct "OK" AS status FROM MailAliases WHERE AliasName = CONCAT("%u", (SELECT AliasTo FROM MailAliases WHERE AliasName = '@%d') ) 
```

basically, this builds a query that will do a lookup on the domain portion, and if it's got a domain alias, resolve that from domain2.com to domain1.com, then do a lookup on the translated/real address. 

A similar thing is required for 'aliasuser@domain2.com'... it's a little complicated, but the sql queries are fairly quick, it's efficient, and best of all....it works, for all 4 possible combinations, and refuses non-existent users. Looks like it's properly plugged in to our existing mailserver!
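For anyone who wants to poke at the logic without a mail server, here is a minimal Python sketch of the four lookups, run against an in-memory SQLite database. Several things are assumptions for the sake of a runnable example: SQLite instead of MySQL, `||` instead of `CONCAT`, named parameters instead of Postfix's `%s`/`%u`/`%d`, the base tables instead of the `dspam_valid_*` views, and only the sample rows from the example above.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory copy of the example Users/MailAliases data."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Users (Account TEXT);
        CREATE TABLE MailAliases (AliasName TEXT, AliasTo TEXT);
        INSERT INTO Users VALUES ('user1@domain1.com');
        INSERT INTO MailAliases VALUES ('@domain2.com', '@domain1.com');
        INSERT INTO MailAliases VALUES ('aliasuser@domain1.com', 'user1@domain1.com');
    """)
    return db


# One query per check_recipient_access map, in the same order as main.cf.
QUERIES = [
    # valid_users.cf: the address is a real account
    "SELECT 'OK' FROM Users WHERE Account = :addr",
    # valid_alias.cf: the address is an alias
    "SELECT DISTINCT 'OK' FROM MailAliases WHERE AliasName = :addr",
    # valid_aliasdomain_users.cf: translate the domain, then look for an account
    "SELECT 'OK' FROM Users WHERE Account = :user || "
    "(SELECT AliasTo FROM MailAliases WHERE AliasName = '@' || :domain)",
    # valid_aliasdomain_alias.cf: translate the domain, then look for an alias
    "SELECT DISTINCT 'OK' FROM MailAliases WHERE AliasName = :user || "
    "(SELECT AliasTo FROM MailAliases WHERE AliasName = '@' || :domain)",
]


def is_valid(db, rcpt):
    """Return True if any of the four lookups finds the recipient."""
    user, _, domain = rcpt.partition("@")
    params = {"addr": rcpt, "user": user, "domain": domain}
    return any(db.execute(q, params).fetchone() for q in QUERIES)


if __name__ == "__main__":
    db = make_db()
    for rcpt in ("user1@domain1.com", "user1@domain2.com",
                 "aliasuser@domain1.com", "aliasuser@domain2.com",
                 "nonexistantuser@domain2.com"):
        print(rcpt, "->", "OK" if is_valid(db, rcpt) else "reject")
```

When the domain has no alias entry, the inner SELECT yields NULL, the concatenation is NULL, and the comparison matches nothing, so the lookup simply falls through to the next one.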

ok NOW I can move on to the next round of testing... 

thanks again for all your help... so far  :Very Happy: 

----------

## DNAspark99

Ok, now back to dspam specific questions...

the 'analysis' page in the web-ui is displaying pics with no history in them, although the 'current' ratio is displayed - this basically appears as a single section of the 'ribbon' floating in the middle of the graph. Ideas?

----------

## DNAspark99

Ok, examining the image URL for the graph, it reveals the following data being passed to it:

```
/cgi-bin/graph.cgi?data=10_10_10&x_label=Hour+of+the+day
```

Looking at the src of graph.cgi ...

```

  my($spam, $nonspam, $period) = split(/\_/, $FORM{'data'});

  @spam_day = split(/\,/, $spam);

  @nonspam_day = split(/\,/, $nonspam);

  @period = split(/\,/, $period);

```

So it expects 'more' data separated by commas... ok, so as an example, I tried playing with random data plugged in to the data variable:

```
/cgi-bin/graph.cgi?data=77,42,63,17,87_22,15,04,72,16_09,10,11,12,01
```

and surprise, it starts to look proper. So tomorrow morning I'll begin looking at why dspam.cgi isn't passing enough data to graph.cgi .. I'm just wondering if anyone else has noticed this behavior - not like I changed any of the cgi scripts ... yet...
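The same splitting, redone as a small Python sketch to show what each form of the `data` parameter turns into. This only mirrors the four Perl lines quoted above; the function name is made up, and it assumes the parameter always has at least three `_`-separated fields.

```python
def parse_graph_data(data):
    """Split the data parameter the way the quoted graph.cgi lines do.

    The string holds three fields joined by "_" (spam, nonspam, period),
    and each field is a comma-separated series.
    """
    spam, nonspam, period = data.split("_")[:3]
    return spam.split(","), nonspam.split(","), period.split(",")


if __name__ == "__main__":
    # What dspam.cgi was sending: one point per series, hence the lone ribbon.
    print(parse_graph_data("10_10_10"))
    # The hand-made test URL: five points per series.
    print(parse_graph_data("77,42,63,17,87_22,15,04,72,16_09,10,11,12,01"))
```

So `10_10_10` parses cleanly, but each series has only a single value, which fits the single floating section seen in the graph.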

using www-apps/dspam-web-3.8.0

----------

## magic919

It's a fault in the webgui.  The file dspam.cgi

```

    $DATA{$hk}=join("_",

                join(",",@{$lst{spam}}    || [0]),

                join(",",@{$lst{nonspam}} || [0]),

                join(",",@{$lst{title}}   || [0]),

        );

```

Should be

```

    $DATA{$hk}=join("_",

                join(",",@{$lst{spam}}),

                join(",",@{$lst{nonspam}}),

                join(",",@{$lst{title}}),

        );

```

Then the graphs will work.
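The likely reason the original line misbehaves: Perl's `||` evaluates its left operand in scalar context, so `@{$lst{spam}} || [0]` yields the number of elements rather than the elements themselves. A rough Python imitation of the two behaviours follows; the function names are made up for illustration, and the empty-list case is left out.

```python
def join_with_or_fallback(values):
    """Imitate join(",", @list || [0]) for a non-empty list.

    The array is forced into scalar context, so only its element count
    survives and gets joined.
    """
    return str(len(values))


def join_plain(values):
    """Imitate join(",", @list): every element is kept."""
    return ",".join(values)


def build_data(spam, nonspam, title, joiner):
    """Assemble the data parameter from three series with the given joiner."""
    return "_".join(joiner(series) for series in (spam, nonspam, title))


if __name__ == "__main__":
    spam = ["77", "42", "63", "17", "87"]
    nonspam = ["22", "15", "04", "72", "16"]
    title = ["09", "10", "11", "12", "01"]
    print(build_data(spam, nonspam, title, join_with_or_fallback))
    print(build_data(spam, nonspam, title, join_plain))
```

With five values per series, the first form collapses to `5_5_5`, the same shape as the `10_10_10` seen in the image URL.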

Make sure you can preview messages ok.  That's another fault I've seen.

----------

## DNAspark99

Hey, that did the trick, sweet....saved me some hackin that it did...thanks!

and previewing seems to work fine

----------

## magic919

No probs.  Happened a few versions back and drove me mad as I run 5-6 servers with it.

Then they moved the mysql libs to /var/lib/dspam/ and messed me about again.

I'm watching the thread for any nuggets from Stevee.

----------

## steveb

This is fixed in the www-apps/dspam-web-3.8.0 ebuild. The patch is in the file patches/14_all_cgi-fixes.patch and has this part in it:

```
@@ -546,9 +550,9 @@

       }

     }

     $DATA{$hk}=join("_",

-               join(",",@{$lst{spam}}),

-               join(",",@{$lst{nonspam}}),

-               join(",",@{$lst{title}}),

+               join(",",@{$lst{spam}}    || [0]),

+               join(",",@{$lst{nonspam}} || [0]),

+               join(",",@{$lst{title}}   || [0]),

        );

   }
```

// SteveB

Last edited by steveb on Tue Sep 18, 2007 10:22 am; edited 1 time in total

----------

## steveb

 *magic919 wrote:*   

> I'm watching the thread for any nuggets from Stevee.

  :Very Happy:  Magic919! How can I help? Anything burning over there in UK?

I just had a massive attack on my infrastructure here in Switzerland this weekend. Damn spam bots. Lucky for me, I had added enough restrictions to Postfix and had fail2ban knocking them off the SMTP port for 240 to 300 seconds (depending on where they hit first). Without that I would probably have been offline. DSPAM, on the other hand, just works. But I am looking for alternatives. Not that I don't like DSPAM, but I think its future is not assured. Jonathan sold DSPAM to a company and I am afraid that we will again not see an update for a long time (like the last one from 3.6.8 to 3.8.0, which took one year). The DSPAM mailing list is more or less dead. Not much going on there. No patches, no news, nothing.

// SteveB

----------

## DNAspark99

 *steveb wrote:*   

> 
> 
> I just had a massive attack on my infrastructure here in Switzerland this weekend. Damn spam bots. Lucky for me, I had added enough restrictions to Postfix and had fail2ban knocking them off the SMTP port for 240 to 300 seconds (depending on where they hit first). Without that I would probably have been offline. DSPAM, on the other hand, just works. But I am looking for alternatives. Not that I don't like DSPAM, but I think its future is not assured. Jonathan sold DSPAM to a company and I am afraid that we will again not see an update for a long time (like the last one from 3.6.8 to 3.8.0, which took one year). The DSPAM mailing list is more or less dead. Not much going on there. No patches, no news, nothing.
> 
> // SteveB

 

Any recommended settings to help nail down postfix from spammers? I've got the following so far:

```

smtpd_helo_required = yes

disable_vrfy_command = yes

smtpd_recipient_restrictions = permit_mynetworks

        reject_invalid_hostname 

        reject_non_fqdn_hostname 

        reject_non_fqdn_sender 

        reject_non_fqdn_recipient 

        reject_unknown_sender_domain 

        reject_unknown_recipient_domain

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_alias.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_aliasdomain_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_aliasdomain_alias.cf

        reject

```

also, I notice dspam doesn't log much in the way of its results (at least without debug turned on)

How best to log + track attempted sources of spam?

----------

## magic919

 *DNAspark99 wrote:*   

> 
> 
> also, I notice dspam doesn't log much in the way of its results (at least without debug turned on)
> 
> How best to log + track attempted sources of spam?

 

I'll leave Steve to reply on Postfix, except to say make sure they can't HELO with your hostname or IP.  And I'll say have a look at the RABL in DSPAM.  I use it to log the IP of detected spam.  Then you can stick them in your firewall if you choose.

----------

## steveb

 *DNAspark99 wrote:*   

> also, I notice dspam doesn't log much in the way of its results (at least without debug turned on)
> 
> How best to log + track attempted sources of spam?

 Debug does not have to be turned on to see results from DSPAM. Turn on the TrackSources option in dspam.conf to see some results:

```
TrackSources spam nonspam virus
```

// SteveB

----------

## steveb

Phuuu... My Postfix configuration has a lot more options turned on. It would be too much to explain everything in detail. But basically I would suggest you to enable SPF checks, DKIM checks and DKIM signing, Greylisting (I would suggest SQLgrey), etc...

I would also disable delayed rejects and set a sleep (I use 5 seconds) at the EHLO/HELO stage. I would also block backscatter for invalid addresses (I would exclude the catch-all addresses from the valid addresses for this check). Then I would reject senders using your domain who are not authenticated (see the reject_sender_login_mismatch option in Postfix).

I would NOT use the RBL/RHBL feature from Postfix. Not that it is bad or not working, but the problem is that it is an all-or-nothing situation, and I cannot live with that. So I am using policyd-weight for a weighted check. It offers me much more control over blocking blacklisted IPs/domains.

I also use postfwd for complex rules/situations.

If you want, I could post the relevant part in this thread, but I beg you not to follow it blindly. It is not that the configuration is bad, but it is much better if you look at the configuration and learn something from it instead of blindly copying it. This does not mean that you cannot copy it 1 to 1. But I think you owe it to yourself to know what you are using on your infrastructure. And I will offer as much time as needed to explain everything. I would like you to understand, because when you understand you can evolve, and this has more value than any copied configuration.

// SteveB

----------

## DNAspark99

Thanks for the info. And yes, I'd much rather know what I'm implementing, it makes troubleshooting down the line a lot easier  :Smile: 

I'll look into it and let you know when I've got (the inevitable) questions...

----------

## steveb

Okay. Just let me know when and what I should post.

// SteveB

----------

## DNAspark99

well I thought I had the bases covered, but apparently not. There's actually a 5th address possibility I forgot about!

aliasuser@domain1.com --> externalmailuser@gmail.com

This sort of alias gets validated by my postfix/sql setup... mail gets passed to dspam... and since 'externalmailuser@gmail.com' isn't in dspam_virtual_uids, it gets thrown in, and a userdir is created .. gah!

so close, and yet, so far!...

dspam's opt-in / opt-out options... I don't suppose they can be configured to look to a SQL table, can they? This is probably going to require some black magic :p

edit: ok, discovered the 'Opt in' policy restricts creation of the users in dspam_virtual_uids, and you need to opt-in per-user in /var/spool/dspam/data/domain/user.dspam

still some issues I need to sort out here... like how to opt-in automatically for users

----------

## steveb

I have Opt-In on my setup. Every time I register a new user I Opt-In the user by either executing (look at the optIn on and optOut off statements):

```
/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} enableBNR on

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} enableWhitelist on

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} optIn on

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} optOut off

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} showFactors off

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} signatureLocation message

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} spamAction tag

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} spamSubject [SPAM]

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} statisticalSedation 5

/usr/bin/dspam_admin ch pref ${MY_MAIL_USER_NAME}@${MY_DOMAIN} trainingMode TOE
```

Or I add him/her with SQL statements directly into the MySQL DSPAM table.
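The per-user commands above could be wrapped in a small helper so that registering a user becomes a single call. This is only a sketch: the function name `dspam_optin_user` and the `DRY_RUN` switch are made up for illustration and are not part of DSPAM; the preference names and values are a subset of the ones listed above.

```sh
#!/bin/sh
# Hypothetical helper (not part of DSPAM): apply a fixed set of preferences
# to one user with dspam_admin. Set DRY_RUN=1 to only print the commands.
DSPAM_ADMIN="${DSPAM_ADMIN:-/usr/bin/dspam_admin}"

dspam_optin_user() {
    addr="$1"
    # One "preference value" pair per line, same values as in the list above.
    while read -r pref value; do
        [ -n "$pref" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$DSPAM_ADMIN ch pref $addr $pref $value"
        else
            # </dev/null keeps the command from consuming the here-document.
            "$DSPAM_ADMIN" ch pref "$addr" "$pref" "$value" </dev/null || return 1
        fi
    done <<EOF
optIn on
optOut off
spamAction tag
spamSubject [SPAM]
trainingMode TOE
EOF
}
```

Running `DRY_RUN=1 dspam_optin_user user1@domain1.com` first shows what would be executed before anything is written to the preferences.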

// SteveB

----------

## DNAspark99

Oddly, the optIn preference seems to be ignored from the database... but not the file in /var/spool/dspam/opt-in/domain/user

and with an alias that forwards from alias@domain1.com to user@externaldomain.com, postfix is resolving the alias and disallowing relay of the mail, since externaldomain.com is not in the vhosted domains list - isn't there a way to allow relaying once the recipient address (orig_to) is verified?

```

Sep 18 23:47:02 dspam postfix/lmtp[16659]: E328012804B: to=<user@externaldomain.com>, orig_to=<alias@domain1.com>, relay=spamstop.mydomain.com[/var/run/dspam/dspam.sock], delay=0.15, delays=0.02/0/0/0.13, dsn=5.3.0, status=bounced (host spamstop.mydomain.com[/var/run/dspam/dspam.sock] said: 530 5.3.0 <user@externaldomain.com> Fatal: 550 relay not permitted (in reply to end of DATA command))

```

EDIT: well, not sure if this is the ideal way, but I solved the relay issue by using:

```
relayhost=[IP of real mailserver]
```

and adjusting the mailserver to allow relaying from the dspam machine. It works, and because dspam's postfix is making strict checks on the validity of incoming addresses, it seems to be safe, only allowing mail 'through' if it's to a valid vhost address or alias...

----------

## magic919

 *steveb wrote:*   

> This is fixed in the www-apps/dspam-web-3.8.0 ebuild. The patch is in the file patches/14_all_cgi-fixes.patch and has this part in it:
> 
> ```
> @@ -546,9 +550,9 @@
> 
> ...

 

Wouldn't this patch be breaking it?  Looks like it adds the bits that I edit out to make it work.

----------

## DNAspark99

 *magic919 wrote:*   

> 
> 
> Wouldn't this patch be breaking it?  Looks like it adds the bits that I edit out to make it work.

 

Yea that made me look twice too - I examined the patch, and sure enough, that's what it's doing. Reversing that bit (as you suggested) got my graphs working again!

EDIT: Ok, more weird behavior! gah...this time it's really flaky behavior at that:

I got dspam to pay attention to database optIn by removing /var/spool/dspam/default.prefs file - don't ask me why it was ignoring the database settings with this file here. But if optIn by database works, GREAT, I can use MySQL triggers to create the setting to automatically opt-in for all valid users - which means I'm ooo so close to having a working smtp appliance with automated filtering for all valid users and rejecting connections for any others... BUT, with dspam, there's always _something_ , so it seems!...

Ok, I shoot user1@domain1.com an email, and this user's uid (648) has 'optIn on' in the dspam_preferences table, set via:

```
dspam_admin ch pref user1@domain1.com optIn on
```

I've got debug enabled right now, so I can see the loading of preferences:

```

29049: [09/19/2007 11:36:37] DSPAM Instance Startup

29049: [09/19/2007 11:36:37] input args: dspam --deliver=innocent 

29049: [09/19/2007 11:36:37] pass-thru args: /usr/bin/procmail 

29049: [09/19/2007 11:36:37] processing user user1@domain1.com

29049: [09/19/2007 11:36:37] uid = 0, euid = 0, gid = 0, egid = 1001

29049: [09/19/2007 11:36:37] loading preferences for user user1@domain1.com

29049: [09/19/2007 11:36:37] Loading preferences for uid 648

29049: [09/19/2007 11:36:37] Loading preferences for uid 0

29049: [09/19/2007 11:36:37] default preferences empty. reverting to dspam.conf preferences.

29049: [09/19/2007 11:36:37] Loading preferences from dspam.conf

29049: [09/19/2007 11:36:37] using /var/spool/dspam/opt-in/domain1.com/user1.dspam as path

29049: [09/19/2007 11:36:37] using /var/spool/dspam/opt-out/domain1.com/user1.nodspam as path

29049: [09/19/2007 11:36:37] adding user to merged group global_merged_user

...

```

Mail reaches the recipient and I can see dspam has done its work. woo!

However, after a while, a few minutes of idleness, something 'goes away'... and without any changes on my part, another email to the same user results in this: 

```

29049: [09/19/2007 11:41:41] DSPAM Instance Startup

29049: [09/19/2007 11:41:41] input args: dspam --deliver=innocent 

29049: [09/19/2007 11:41:41] pass-thru args: /usr/bin/procmail 

29049: [09/19/2007 11:41:41] processing user user1@domain1.com

29049: [09/19/2007 11:41:41] uid = 0, euid = 0, gid = 0, egid = 1001

29049: [09/19/2007 11:41:41] loading preferences for user user1@domain1.com

29049: [09/19/2007 11:41:41] default preferences empty. reverting to dspam.conf preferences.

29049: [09/19/2007 11:41:41] Loading preferences from dspam.conf

29049: [09/19/2007 11:41:41] using /var/spool/dspam/opt-in/domain1.com/user1.dspam as path

29049: [09/19/2007 11:41:41] using /var/spool/dspam/opt-out/domain1.com/user1.nodspam as path

...

```

And the mail goes through, UNTOUCHED by dspam - because suddenly dspam doesn't want to retrieve this user's UID, so it can't be bothered to look up if the user has optIn = on in his preferences... 

so now I'm suspecting dspam's connection to mysql (which is on a remote machine) is flaky...and there IS this in mail.log:

```

Sep 19 11:54:15 dspam dspam[29189]: unable to initialize tools context

Sep 19 11:54:15 dspam dspam[29189]: unable to initialize tools context

Sep 19 11:54:15 dspam dspam[29189]: unable to initialize tools context

```

Oddly, when OptIn was the default system-wide setting for dspam, I'd see similar messages, with the addition of the final line, but never much of it at this state, because I guess it would re-connect to mysql and retrieve user prefs, and deliver the message...

```

Sep 19 11:58:57 dspam dspam[29311]: unable to initialize tools context

Sep 19 11:58:57 dspam dspam[29311]: unable to initialize tools context

Sep 19 11:58:57 dspam dspam[29311]: unable to initialize tools context

Sep 19 11:58:57 dspam dspam[29311]: Unable to attach DSPAM context. Retrying.

```

EDIT2:

foreach connection in dspam.conf's 'MySQLConnectionCache' setting (mine is set to 10), there's a connection to mysql when dspam starts up:

```

netstat -anp | grep 3306

tcp        0      0 10.100.100.72:43294     10.100.100.73:3306      ESTABLISHED 30396/dspam

...

tcp        1      0 10.100.100.72:43294    10.100.100.73:3306      CLOSE_WAIT  30596/dspam         

```

once these connections time out according to mysql's 'wait_timeout', dspam is unable to re-initiate the connection... hence why dspam in opt-in mode suddenly 'stops' working when important prefs like opt-in are stored in that db ... hrm...dspam bug? :p

looks like I'm not the only one to come across this:

http://mailing-list.nuclearelephant.com/1958.html
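To watch this happen over time, the netstat output can be boiled down to a count per connection state. A minimal sketch, assuming the Linux net-tools column layout shown above; the function name `dspam_mysql_states` is made up here:

```sh
#!/bin/sh
# Summarise dspam's TCP connections to MySQL (port 3306) by state.
# Reads `netstat -anp` output on stdin; field positions assume the
# Linux net-tools layout (proto, recv-q, send-q, local, foreign, state, pid/program).
dspam_mysql_states() {
    awk '$1 ~ /^tcp/ && $5 ~ /:3306$/ && $7 ~ /\/dspam$/ { n[$6]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Typical use (needs root to see the owning process):
#   netstat -anp | dspam_mysql_states
```

Connections moving from ESTABLISHED to CLOSE_WAIT once the idle time passes 'wait_timeout' would be consistent with the behaviour described above.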

----------

## DNAspark99

only sane workaround at this time is to not rely on the dspam daemon. I've made a bug report to the dspam-dev list, hopefully someone's listening... 

In the meantime, /etc/postfix/master.cf is reverted to:

```

smtp      inet  n       -       n       -       -       smtpd

#   -o content_filter=lmtp:unix:/var/run/dspam/dspam.sock

    -o content_filter=dspam:

dspam     unix  -       n       n       -       10      pipe

    flags=Rhqu user=dspam argv=/usr/bin/dspam --deliver=innocent --user ${recipient} -i -f ${sender} -- ${recipient}

```

----------

## steveb

 *DNAspark99 wrote:*   

> 
> 
> ```
> 29049: [09/19/2007 11:36:37] DSPAM Instance Startup
> 
> ...

 Why procmail? Are you using procmail for processing mail?

 *DNAspark99 wrote:*   

> 
> 
> ```
> 29049: [09/19/2007 11:36:37] processing user user1@domain1.com
> 
> ...

 This surprises me. It is loading the preferences for uid 0. Who is uid 0?

 *DNAspark99 wrote:*   

> 
> 
> ```
> 29049: [09/19/2007 11:36:37] default preferences empty. reverting to dspam.conf preferences.
> ```
> ...

 What? Why is that empty? Ahh... I see. You removed /var/spool/dspam/default.prefs. Sorry. You should put it back.

 *DNAspark99 wrote:*   

> 
> 
> ```
> 29049: [09/19/2007 11:36:37] Loading preferences from dspam.conf
> 
> ...

 All okay here.

// SteveB

----------

## steveb

I don't think it is the MySQL connection. Could you post the output of:

```
dspam --version
```

// SteveB

----------

## steveb

 *magic919 wrote:*   

> Wouldn't this patch be breaking it?  Looks like it adds the bits that I edit out to make it work.

 

No. Including the || [0] inside the join would join the array (@{$lst{xxx}}) or the empty array ([0]) if the first array is not existing.

// SteveB

----------

## DNAspark99

 *steveb wrote:*   

> Why procmail? Are you using procmail for processing mail?

 

No, I'll comment out 'TrustedDeliveryAgent' from dspam.conf

 *steveb wrote:*   

> 
> 
>  *DNAspark99 wrote:*   
> 
> ```
> ...

 

There is no uid 0 - global_merged_user has a uid of 1 - I'm wondering if that's the 'default prefs' in the database it's looking to fall back on?

 *steveb wrote:*   

> 
> 
>  *DNAspark99 wrote:*   
> 
> ```
> ...

 

The existence of that file seems to override the users' optIn = on in the database (and yes I've tried altering the file itself). I'll play with that some more though

----------

## DNAspark99

 *steveb wrote:*   

> I don't think it is the MySQL connection. Could you post the output of:
> 
> ```
> dspam --version
> ```
> ...

 

```
dspam --version

DSPAM Anti-Spam Suite 3.8.0 (agent/library)

Copyright (c) 2002-2006 Jonathan A. Zdziarski

http://dspam.nuclearelephant.com

DSPAM may be copied only under the terms of the GNU General Public License,

a copy of which can be found with the DSPAM distribution kit.

Configuration parameters:  '--prefix=/usr' '--host=x86_64-pc-linux-gnu' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib' '--enable-long-usernames' '--enable-syslog' '--enable-domain-scale' '--enable-debug' '--enable-bnr-debug' '--enable-virtual-users' '--enable-preferences-extension' '--with-mysql-includes=/usr/include/mysql' '--with-mysql-libraries=/usr/lib64/mysql' '--with-storage-driver=hash_drv,mysql_drv' '--with-dspam-home=/var/spool/dspam' '--sysconfdir=/etc/mail/dspam' '--enable-daemon' '--disable-ldap' '--disable-clamav' '--with-dspam-group=dspam' '--with-dspam-home-group=dspam' '--with-dspam-mode=2511' '--with-logdir=/var/log/dspam' '--libdir=/usr/lib64' '--build=x86_64-pc-linux-gnu' 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu' 'CFLAGS=-march=opteron -O2 -pipe -Wl,-z,now' 'CXXFLAGS=-march=opteron -O2 -pipe -Wl,-z,now'

```

It really seems like the mysql connection is the issue - it works great, initially! Keep in mind, i'm using a remote MySQL connection (on the same LAN), and as mentioned, I can monitor the status of that connection in both mysql and netstat, and after exactly 2 minutes of inactivity (120 seconds is specified as mysql timeout on mysql server for good reason due to other apps that require it), the connection closes, and it is NOT re-opened by dspam automatically as needed (the way it should behave!), only on dspam restart. Non-daemon mode for dspam opens the connections on startup of every instance, and there is no problem with loading prefs. 

Oddly, I just tested with opt-out mode, and am finding even more weird behavior... it seems that after the timeout, lookup for user prefs fails, it's not able to find them (connect to the db)... but then it DOES re-connect to the db sometime after... but it's still basically ignoring the user-prefs either way... this is bizarre!

edit: the same behavior as the opt-out mode can be achieved with opt-in mode, by creating one of the /var/spool/dspam/opt-in/domain/user.dspam files - ie; it can now re-connect to the db, but only at some point after the user-prefs lookup...

----------

## DNAspark99

OK, FIXED! and proof that it is mysql connection issues not being re-established!

see http://dev.mysql.com/doc/refman/5.0/en/auto-reconnect.html:

 *Quote:*   

> 
> 
> in MySQL 5.0, auto-reconnect was enabled by default until MySQL 5.0.3, and disabled by default thereafter. The MYSQL_OPT_RECONNECT option is available as of MySQL 5.0.13.
> 
> 

 

Consequently, I'm running mysql-5.0.26

It took a LOT of trial and error, compiling, debugging, etc...but now.. the fix - untar the src of dspam:

around line 2491 of dspam-3.8.0/src/mysql_drv.c - added the following two lines:

```

  my_bool reconnect = 1;

  mysql_options(dbh, MYSQL_OPT_RECONNECT, &reconnect);

```

patch:

```

--- dspam-3.8.0.orig/src/mysql_drv.c    2006-09-21 11:25:19.000000000 -0700

+++ dspam-3.8.0/src/mysql_drv.c 2007-09-21 09:31:58.000000000 -0700

@@ -2488,7 +2488,10 @@

       ("_ds_init_storage: mysql_init: unable to initialize handle to database");

     goto FAILURE;

   }

-

+  /*fix for losing idle mysql connection after timeout*/

+  my_bool reconnect = 1;

+  mysql_options(dbh, MYSQL_OPT_RECONNECT, &reconnect); 

+  /*endfix*/

   if (hostname[0] == '/')

   {

     if (!mysql_real_connect (dbh, NULL, user, password, db, 0, hostname, 

```

configure with same opts as 'gentoo dspam' (from 'dspam --version' -but minus CFLAGS and CXXFLAGS reported, as that wouldn't let it compile by hand for me):

```

./configure --prefix=/usr --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --enable-long-usernames --enable-syslog --enable-domain-scale --enable-debug --enable-bnr-debug --enable-virtual-users --enable-preferences-extension --with-mysql-includes=/usr/include/mysql --with-mysql-libraries=/usr/lib64/mysql --with-storage-driver=hash_drv,mysql_drv --with-dspam-home=/var/spool/dspam --sysconfdir=/etc/mail/dspam --enable-daemon --disable-ldap --disable-clamav --with-dspam-group=dspam --with-dspam-home-group=dspam --with-dspam-mode=2511 --with-logdir=/var/log/dspam --libdir=/usr/lib64 --build=x86_64-pc-linux-gnu build_alias=x86_64-pc-linux-gnu host_alias=x86_64-pc-linux-gnu
```

*note* src MUST be built with same opts as 'gentoo' dspam, I encountered segfaults when using mysql_drv built with 'debug' in a dspam install that was built without it... as long as they match there doesn't seem to be an issue*

Then make the src... (but you don't need to install it, it actually won't put the sql libs in the proper place anyways)

just cp the sql driver relevant libs to the proper location by hand:

```

cp src/.libs/libmysql_drv.a /usr/lib/dspam/

cp src/.libs/libmysql_drv.la /usr/lib/dspam/

cp src/.libs/libmysql_drv.so.7.0.0 /usr/lib/dspam/

```

then restart dspam... 

and bingo! dspam can re-establish a lost connection to the remote database. FINALLY.

User prefs are looked up no matter what time it is or how long the connection has been idle. woo! 

Now I can start worrying about the other stuff... like automatically setting the db 'opt-in' preference for every user, including newly created ones...

but first... how to go about submitting this as a patch?  :Razz: 

I'm going to work on this as an overlay and try to get portage to auto-patch this into dspam...

----------

## steveb

The patch you posted will not work on every MySQL version. How about this?

dspam-3.8.0-r7.ebuild

```
--- /usr/portage/mail-filter/dspam/dspam-3.8.0-r6.ebuild        2007-09-10 06:06:01.000000000 +0200

+++ ./dspam-3.8.0-r7.ebuild     2007-09-21 21:46:21.126370750 +0200

@@ -1,6 +1,6 @@

 # Copyright 1999-2007 Gentoo Foundation

 # Distributed under the terms of the GNU General Public License v2

-# $Header: /var/cvsroot/gentoo-x86/mail-filter/dspam/dspam-3.8.0-r6.ebuild,v 1.1 2007/09/10 04:06:01 mrness Exp $

+# $Header: Exp $

 WANT_AUTOCONF="latest"

 WANT_AUTOMAKE="latest"

@@ -68,6 +68,12 @@

        EPATCH_SUFFIX="patch"

        epatch "${WORKDIR}"/patches

+       # Add MySQLReconnect option

+       epatch "${FILESDIR}"/${PN}-${PV}-mysql_reconnect.patch

+

+       # Fix domain blocklisting

+       epatch "${FILESDIR}"/${PN}-${PV}-blocklist.patch

+

        # Fix Lazy bindings

        append-flags $(bindnow-flags)
```

files/dspam-3.8.0-mysql_reconnect.patch

```
--- dspam-3.8.0/src/mysql_drv.c 2006-09-21 20:25:19.000000000 +0200

+++ dspam-3.8.0-new/src/mysql_drv.c     2007-09-21 22:12:17.092607928 +0200

@@ -2489,6 +2489,16 @@

     goto FAILURE;

   }

+#if MYSQL_VERSION_ID >= 50013

+  /* enable automatic reconnect for MySQL >= 5.0.13 */

+  snprintf(attrib, sizeof(attrib), "%sReconnect", prefix);

+  if (_ds_match_attribute(CTX->config->attributes, attrib, "true"))

+  {

+      my_bool reconnect = 1;

+      mysql_options(dbh, MYSQL_OPT_RECONNECT, &reconnect);

+  }

+#endif

+

   if (hostname[0] == '/')

   {

     if (!mysql_real_connect (dbh, NULL, user, password, db, 0, hostname,
```

files/dspam-3.8.0-blocklist.patch

```
--- src/dspam.c 2006-12-12 16:33:45.000000000 +0100

+++ src/dspam.c.new     2007-08-16 19:14:06.155370137 +0200

@@ -3774,7 +3774,7 @@

     char buf[256];

     if (heading) {

       char *dup = strdup(heading);

-      char *domain = strchr(dup, '@');

+      char *domain = strrchr(dup, '@');

       if (domain) {

         int i;

         for(i=0;domain[i] && domain[i]!='\r' && domain[i]!='\n'
```

Then in dspam.conf just add this to enable automatic reconnection to MySQL:

```
MySQLReconnect          true
```
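A quick way to confirm the option is actually active (and not just present as a comment) is to grep for an uncommented line. This is a sketch only: the function name is invented, and the option exists only in a build that carries the patch above. The config path in the usage line is taken from the `--sysconfdir` shown in the `dspam --version` output earlier.

```sh
#!/bin/sh
# Return 0 if an uncommented "MySQLReconnect true" line exists in the
# given dspam.conf, non-zero otherwise.
mysql_reconnect_enabled() {
    grep -Eq '^[[:space:]]*MySQLReconnect[[:space:]]+true[[:space:]]*$' "$1"
}

# Example:
#   mysql_reconnect_enabled /etc/mail/dspam/dspam.conf && echo "reconnect enabled"
```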

What do you think?

// SteveB

----------

## DNAspark99

Yea I figured my 'fix' was a bit too 'dirty', (I don't even know C! so you can imagine how tedious it is to troubleshoot this!) 

but I've been working on improving it and fixing up an ebuild to keep the fix 'sane'... but I'll use yours instead - it works, and is even a configurable option now...perfect, thanks!!!

Actually, I'm still trying to troubleshoot one more issue, maybe you can help:

the '2 first run notifications' bug... 

I believe somewhere around line 4008 - 4025 of src/dspam.c there needs to be some sort of check/creation performed on the /var/spool/dspam/data/domain/user dir before sending the firstrun notification and creating the firstrun file.

The current logic for a new user seems to run something like:

1st mail:

check for firstrun file: does not exist, send notification, create file (fails - no user directory!)

*then* creates userdir

2nd mail:

check for firstrun file: does not exist, send notification, create file (success! - no more firstrun notifications)

This is reproducible for every new user or if I remove their /var/spool/dspam/data/domain/user dir
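The ordering problem can be reproduced outside of DSPAM with a toy model in shell. To be clear, this is not DSPAM's code: the function, the `buggy`/`fixed` switch and the marker file name are all made up to illustrate the sequence described above.

```sh
#!/bin/sh
# Toy model of the first-run logic, not DSPAM's actual code.
# Prints "notify" whenever a notification would be sent.
firstrun_check() {
    stamp="$1"      # path to the per-user firstrun marker file (hypothetical name)
    mode="$2"       # "fixed" = create the parent directory before writing
    if [ ! -f "$stamp" ]; then
        echo "notify"
        if [ "$mode" = "fixed" ]; then
            mkdir -p "$(dirname "$stamp")"
        fi
        # Without the directory this write fails quietly, much like an
        # fopen() that returns NULL and is simply skipped.
        ( date +%s > "$stamp" ) 2>/dev/null || true
    fi
}
```

In "buggy" mode the first call prints `notify` but leaves no marker behind, so the second call (after the user directory exists) notifies again; in "fixed" mode only the first call notifies.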

Any insight you can offer?

----------

## steveb

dspam-3.8.0-r7.ebuild

```
--- /usr/portage/mail-filter/dspam/dspam-3.8.0-r6.ebuild        2007-09-10 06:06:01.000000000 +0200

+++ ./dspam-3.8.0-r7.ebuild     2007-09-21 23:45:04.199535000 +0200

@@ -1,6 +1,6 @@

 # Copyright 1999-2007 Gentoo Foundation

 # Distributed under the terms of the GNU General Public License v2

-# $Header: /var/cvsroot/gentoo-x86/mail-filter/dspam/dspam-3.8.0-r6.ebuild,v 1.1 2007/09/10 04:06:01 mrness Exp $

+# $Header: Exp $

 WANT_AUTOCONF="latest"

 WANT_AUTOMAKE="latest"

@@ -68,6 +68,15 @@

        EPATCH_SUFFIX="patch"

        epatch "${WORKDIR}"/patches

+       # Add MySQLReconnect option

+       epatch "${FILESDIR}"/${PN}-${PV}-mysql_reconnect.patch

+

+       # Fix domain blocklisting

+       epatch "${FILESDIR}"/${PN}-${PV}-blocklist.patch

+

+       # Fix non existent path for notification

+       epatch "${FILESDIR}"/${PN}-${PV}-notification_path.patch

+

        # Fix Lazy bindings

        append-flags $(bindnow-flags)

@@ -256,6 +265,12 @@

                        -i "${D}"/${CONFDIR}/dspam.conf

        fi

+       # Add MySQLReconnect option after MySQLCompress

+       if ! ( grep -iqe "^#*MySQLReconnect[[:space:]]" "${D}"/${CONFDIR}/dspam.conf ) ; then

+               sed -e "s:^\(\(#*\)MySQLCompress[\t ].*\):\1\n\2MySQLReconnect\t\ttrue:" \

+                       -i "${D}"/${CONFDIR}/dspam.conf

+       fi

+

        # installs the notification messages

        # -> The documentation is wrong! The files need to be in ./txt

        echo "Scanned and tagged as SPAM with DSPAM ${PV} by Your ISP.com">"${T}"/msgtag.spam
```

files/dspam-3.8.0-notification_path.patch

```
--- ./dspam-3.8.0/src/dspam.c   2006-12-12 16:33:45.000000000 +0100

+++ ./dspam-3.8.0-new/src/dspam.c       2007-09-22 00:29:55.025347433 +0200

@@ -4014,6 +4014,7 @@

       LOGDEBUG("sending firstrun.txt to %s (%s): %s",

                CTX->username, filename, strerror(errno));

       send_notice(ATX, "firstrun.txt", ATX->mailer_args, CTX->username);

+      _ds_prepare_path_for(filename);

       file = fopen(filename, "w");

       if (file) {

         fprintf(file, "%ld\n", (long) time(NULL));

@@ -4038,6 +4039,7 @@

       LOGDEBUG("sending firstspam.txt to %s (%s): %s",

                CTX->username, filename, strerror(errno));

       send_notice(ATX, "firstspam.txt", ATX->mailer_args, CTX->username);

+      _ds_prepare_path_for(filename);

       file = fopen(filename, "w");

       if (file) {

         fprintf(file, "%ld\n", (long) time(NULL));

@@ -4063,6 +4065,7 @@

       if (stat(qfile, &s)) {

         FILE *f;

+        _ds_prepare_path_for(qfile);

         f = fopen(qfile, "w");

         if (f != NULL) {

           fprintf(f, "%ld", (long) time(NULL));
```

// SteveB

Edit: Changed to prepare the path just before opening the file in write mode. Changed the ebuild to include a sed command that adds the MySQLReconnect option to dspam.conf.

----------

## DNAspark99

LOL, doh, I'm a hair too slow! I figured it out on my own tho  :Razz: 

This is the patch I came up with:

```

diff -Naur dspam-3.8.0.orig/src/dspam.c dspam-3.8.0/src/dspam.c

--- dspam-3.8.0.orig/src/dspam.c        2006-12-12 07:33:45.000000000 -0800

+++ dspam-3.8.0/src/dspam.c     2007-09-21 15:13:58.000000000 -0700

@@ -4014,6 +4014,9 @@

       LOGDEBUG("sending firstrun.txt to %s (%s): %s",

                CTX->username, filename, strerror(errno));

       send_notice(ATX, "firstrun.txt", ATX->mailer_args, CTX->username);

+      /*write firstrun file for new users after sending once*/

+      _ds_prepare_path_for(filename);

+      /*endfix*/

       file = fopen(filename, "w");

       if (file) {

         fprintf(file, "%ld\n", (long) time(NULL));

@@ -4038,6 +4041,9 @@

       LOGDEBUG("sending firstspam.txt to %s (%s): %s",

                CTX->username, filename, strerror(errno));

       send_notice(ATX, "firstspam.txt", ATX->mailer_args, CTX->username);

+      /*write firstspam file for new users after sending once*/

+      _ds_prepare_path_for(filename);

+      /*endfix*/

       file = fopen(filename, "w");

       if (file) {

         fprintf(file, "%ld\n", (long) time(NULL));

```

Pretty similar, eh?  :Razz:  (not bad for not knowing any C!)

I actually tried something similar to your placement first, but found it wasn't sending notifications at all, because it was creating the file before the test, ..so I moved it back to just after the firstrun notification is sent out... now it sends the msg, writes the file, and doesn't have any issues... works as desired..

thanks for all the help to get me this far!! 

ok, NOW I can start to use this... almost... just a few more tweaks to configs first  :Smile: 

----------

## steveb

Sorry. I did not test it. Fixed the above patch to take care of your last input.

// SteveB

----------

## DNAspark99

 *steveb wrote:*   

> Sorry. I did not tested it. Fixed the above patch to take care of your last input.
> 
> // SteveB

 

cool, thanks - will these be included in a future update? I've never submitted a bugfix before, not really sure how it's done?

----------

## steveb

 *DNAspark99 wrote:*   

> cool, thanks - will these be included in a future update?

 I will post it on b.g.o. I don't know if it will be included in the next or future releases. The creator of DSPAM has given away the product to another company. They don't do much. We will see... but I think on our side (Gentoo) Alin Năstac is very responsive and ultra fast in taking care of the DSPAM ebuild and any bugs/changes/whatever.

 *DNAspark99 wrote:*   

> I've never submitted a bugfix before, not really sure how it's done?

 A bug report for Gentoo? Or for DSPAM? For Gentoo you would need to go to b.g.o (bugs.gentoo.org) and post the bug. It is not difficult and it is a great way of contributing back to all Gentoo users.

// SteveB

----------

## steveb

 *DNAspark99 wrote:*   

> not bad for not knowing any C!

 No! No! Don't lower your achievement! You are doing and did a great job. Bravo!

// SteveB

----------

## DNAspark99

 *steveb wrote:*   

>  *DNAspark99 wrote:*   not bad for not knowing any C!
> 
> No! No! Don't lower your achievement! You are doing and did a great job. Bravo!
> 
> // SteveB

 

thanks.. for that... and for everything so far!

OK now that dspam finally seems to be running in the desired manner....on with the show!

Figure I'll keep documenting my various changes here since this thread is now the #1 google result for "dspam standalone mode", and who knows when I'll have to refer to it in the future  :Razz: 

Since my system is set up as 'opt in' (because with opt-out mode dspam was creating undesirable addresses in the db), I'm 'forcing' new and existing users to be filtered... here's how I went about that, if I haven't documented this already:

1: Create a 'view' to link dspam_virtual_uids to your existing users table: in my case, atmail.Users:

```
drop table dspam_virtual_uids;

CREATE VIEW dspam_virtual_uids AS SELECT id AS uid, Account AS username FROM atmail.Users;

```

now you can create that global_merged_user if needed...

2: Turn opt-in 'on' for all existing users (in my case, except uid 1, which is the global_merged_user, but you can neglect or alter the WHERE statement as needed):

```
INSERT INTO dspam_preferences SELECT uid, "optIn", "on" FROM dspam_virtual_uids WHERE uid != 1;
```

3: Create triggers to add 'optIn = on' for all new users and delete prefs for deleted users : 

```

-- use the existing mail database (not dspam's) and alter the table statements as needed

USE atmail;

CREATE TRIGGER DSpamUserInsert AFTER INSERT ON Users

FOR EACH ROW

INSERT INTO dspam.dspam_preferences SET uid = NEW.id, preference = 'optIn', value='on';

CREATE TRIGGER DSpamUserDelete AFTER DELETE ON Users

FOR EACH ROW

DELETE FROM dspam.dspam_preferences WHERE uid = OLD.id;

```
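After the bulk insert and the triggers are in place, it may be worth checking that no user slipped through without an 'optIn' row. A sketch, assuming the table and column names used in the statements above (`dspam_virtual_uids.uid`/`username`, `dspam_preferences.uid`/`preference`); the wrapper function name is made up:

```sh
#!/bin/sh
# Print a query listing users that have no 'optIn' row yet.
# Table and column names follow the statements above; adjust as needed.
missing_optin_sql() {
    cat <<'EOF'
SELECT v.uid, v.username
FROM dspam_virtual_uids AS v
LEFT JOIN dspam_preferences AS p
       ON p.uid = v.uid AND p.preference = 'optIn'
WHERE p.uid IS NULL;
EOF
}

# Example (database name "dspam" as used in the triggers above):
#   missing_optin_sql | mysql dspam
```

With the setup described here, uid 1 (the global_merged_user) is expected to show up in the result, since it was deliberately excluded from the bulk insert.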

Now since I'm sure there's bound to be some user who one day decides they don't like/want the filtering, I'll give them a way out that doesn't require my intervention - the web-ui:

/var/www/localhost/cgi-bin/configure.pl: add/edit the following line:

```

$CONFIG{'OPTMODE'}      = "OUT";

```

/var/www/localhost/cgi-bin/dspam.cgi : (yes this removes the supposed graph 'fix' - it never worked 'out of the box' but backing out the change got it working for me):

```

--- dspam.cgi.orig      2007-09-18 00:28:59.000000000 -0700

+++ dspam.cgi   2007-09-21 17:21:36.000000000 -0700

@@ -549,10 +549,10 @@

       }

     }

     $DATA{$hk}=join("_",

-               join(",",@{$lst{spam}}    || [0]),

-               join(",",@{$lst{nonspam}} || [0]),

-               join(",",@{$lst{title}}   || [0]),

-       );

+                join(",",@{$lst{spam}}),

+                join(",",@{$lst{nonspam}}),

+                join(",",@{$lst{title}}),

+       ); 

   }

 

   &output(%DATA);

@@ -580,6 +580,7 @@

 

     if ($FORM{'optOut'} ne "on") {

       $FORM{'optOut'} = "off";

+      $FORM{'optIn'} = "on";

     }

 

     if ($FORM{'showFactors'} ne "on") {

@@ -670,7 +671,7 @@

   }

 

   if ($CONFIG{'OPTMODE'} eq "OUT") {

-    $DATA{"OPTION"} = "<INPUT TYPE=CHECKBOX NAME=optOut " . $DATA{'C_OPTOUT'} . ">Disable DSPAM filtering<br>";

+    $DATA{"OPTION"} = "<INPUT TYPE=CHECKBOX NAME=optOut " . $DATA{'C_OPTOUT'} . "> <B>Disable</B> DSPAM filtering for <B>$username</B><br>";

   } elsif ($CONFIG{'OPTMODE'} eq "IN") {

     $DATA{"OPTION"} = "<INPUT TYPE=CHECKBOX NAME=optIn " . $DATA{'C_OPTIN'} . ">Enable DSPAM filtering<br>";

   } else {

```

There's a fix there to 'toggle' the opt in/out back and forth, and the rest is just formatting preferences. (I also changed templates/nav_preferences.html to move the 'Opt Out' checkbox to the very bottom, but you get the idea...)

----------

## DNAspark99

altered the following script to suit my needs; basically to remind the user, not the admin that their quarantine box may be holding email.

I'm going to cron this up every month or so... with 'mbox.stamp' as the notify extension, they won't be notified if there's been no new email quarantined since they last checked through the webui

report_spam_quarantine.sh:

```

#!/bin/bash

# report_spam_quarantine - Send email to each user with unexamined quarantined email.

#    Steve Pellegrin (spellegrin at convoglio dot com)

#

# History:

#    1.0   2005-January-1    Original code

#    1.1   2005-February-3   Works with large, domain and standard

#    1.2   2007-September-21 Remind users, not the admin

#

# Usage:

#    report_spam_quarantine

DATA=/var/spool/dspam/data                         # DSpam data directory

#ADMIN="root@localhost"                            # Who should get the summary report email (empty for none)

WEBUSER="dspam"                         # Should match the user that runs the DSpam CGI

TEMPFILE="/tmp/qTemp"                    # Used to construct message to send to ADMIN

NOTIFYEXT=".mbox.stamp"                      # File name extension for the notification file

MESSAGESUBJ="Reminder: You Have Quarantined Email" # Subject line for user email messages

MESSAGETXT=/etc/dspam/txt/quarantine_reminder.txt         # File that contains the email message text

MAILFROM="support@gravit-e.ca"                            # Apparent sender address

# Remove old temp file, if any

if [ -f ${TEMPFILE} ]; then

    rm ${TEMPFILE}

fi

# For each user mailbox...

for mboxFile in `find ${DATA} -follow -name "*mbox"`

do

    if [ -s $mboxFile ]; then

        # Extract the user name from the path

        # and generate the notification file name

        userPath="${mboxFile%/*}"

        user="${userPath##*/}"

        domainPath="${userPath%/*}"

        domain="${domainPath##*/}"

        notificationFile="$userPath/$user${NOTIFYEXT}"

        userEmail="$user@$domain"

        # Send notification if the mailbox has changed since the last notification.

        if [ $mboxFile -nt $notificationFile ]; then

            cat "${MESSAGETXT}" | /bin/mail "$userEmail" -s "${MESSAGESUBJ}" -a "From: ${MAILFROM}"

            echo "$userEmail has quarantined mail" >>${TEMPFILE}

            sudo -u ${WEBUSER} touch $notificationFile

        fi;

    fi;

done

# Send the report to the admin, if required.

#if [ -f ${TEMPFILE} ]; then

#    if [ -n ${ADMIN} ]; then

#        mail -s "Quarantine Report" ${ADMIN} <${TEMPFILE}

#    fi;

#    rm ${TEMPFILE}

#fi

```
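The "don't nag twice" behaviour in the script above comes down to the shell's `-nt` (newer than) file test. Here is a minimal sketch of just that check pulled into a function. `needs_reminder` is a hypothetical name, not part of the original script, and the comment about a missing stamp file assumes bash:

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): succeeds when the
# quarantine mbox is non-empty and has changed since the stamp was last touched.
needs_reminder() {
    local mbox="$1" stamp="$2"
    # -s : file exists and is not empty
    # -nt: mbox is newer than stamp (in bash, also true if stamp does not exist)
    [ -s "$mbox" ] && [ "$mbox" -nt "$stamp" ]
}
```

In the loop above this would read `if needs_reminder "$mboxFile" "$notificationFile"; then ... fi`, which should behave the same as the two nested `if` tests.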

----------

## steveb

Since you are posting documentation things here... allow me to post a small automated training script:

```
#!/bin/bash

##

## Author: SteveB <steeeeeveee@gmx.net>

##

echo "DSPAM mass training script"

if [ "${1}" == "" -o -z "${1}" ]

then

      echo "You need to specify the target DSPAM user for the training."

      echo "  Syntax: $(basename ${0}) <training_user>"

      exit 1

fi

_lockfile="/var/run/$(basename ${0} .sh).pid"

##

## Function to check if we have all needed tools

##

check_for_tools() {

      local myrc=0

      for foo in wget awk sed md5sum find tar dspam_train dspam_clean 7z

      do

          if ! which ${foo} >/dev/null 2>&1

          then

                echo "Command ${foo} not found!"

                myrc=1

          fi

      done

      return ${myrc}

}

cleanup() {

      rm -f "${_lockfile}"

      cd /tmp

      rm -rf /tmp/spam_training/

      trap - INT TERM EXIT

      exit ${?}

}

if ! check_for_tools ; then exit 1; fi

##

## Acquire lock file and start processing

##

if ( set -o noclobber; echo "$$" > "${_lockfile}") 2> /dev/null; then

      trap 'cleanup' INT TERM EXIT

      ##

      ## TREC 2006 corpus

      ##   --> http://plg.uwaterloo.ca/~gvcormac/treccorpus06/

      ##

      echo -n "Train TREC 2006 data [y/N] "

      while true

      do

          read -n 1 -s -p "" _answer

          [ "${_answer}" == "Y" -o "${_answer}" == "y" -o "${_answer}" == "N" -o "${_answer}" == "n" -o "${_answer}" == "" ] && echo && break

      done

      if [ "${_answer}" == "Y" -o "${_answer}" == "y" ]

      then

          mkdir -p /tmp/spam_training/

          cd /tmp/spam_training/

          wget -c --referer=http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/foo06 http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/trec06p.tgz

          tar xvzf trec06p.tgz

          rm -vf trec06p.tgz

          cd ./trec06p/full/

          dspam_train ${1} -i index

          cd /tmp/spam_training/

          rm -rf ./trec06p

          wget -c --referer=http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/foo06 http://plg.uwaterloo.ca/cgi-bin/cgiwrap/gvcormac/trec06c.tgz

          tar xvzf trec06c.tgz

          rm -vf trec06c.tgz

          cd ./trec06c/full/

          dspam_train ${1} -i index

          cd /tmp/spam_training/

          rm -rf ./trec06c

          dspam_clean -s0 -p0 ${1}

      fi

      ##

      ## SpamAssassin training corpus

      ##   --> http://spamassassin.apache.org/publiccorpus/

      ##

      echo -n "Train SpamAssassin training corpus data [y/N] "

      while true

      do

          read -n 1 -s -p "" _answer

          [ "${_answer}" == "Y" -o "${_answer}" == "y" -o "${_answer}" == "N" -o "${_answer}" == "n" -o "${_answer}" == "" ] && echo && break

      done

      if [ "${_answer}" == "Y" -o "${_answer}" == "y" ]

      then

          mkdir -p /tmp/spam_training/

          cd /tmp/spam_training/

          for foo in 20030228_easy_ham 20030228_easy_ham_2 20030228_hard_ham 20030228_spam.tar 20050311_spam_2

          do

                wget -c http://spamassassin.apache.org/publiccorpus/${foo}.tar.bz2

                tar xvjf ./${foo}.tar.bz2

                rm -vf ./${foo}.tar.bz2

          done

          mkdir -p ./empty

          dspam_train ${1} ./spam_2 ./easy_ham_2

          dspam_train ${1} ./empty ./easy_ham

          dspam_train ${1} ./empty ./hard_ham

          rm -rf ./spam_2 ./easy_ham_2 ./hard_ham ./easy_ham ./empty

          cd /tmp/spam_training/

          dspam_clean -s0 -p0 ${1}

      fi

      ##

      ## LingSpam corpus

      ##   --> http://www.iit.demokritos.gr/skel/i-config/downloads/

      ##

      echo -n "Train LingSpam corpus data [y/N] "

      while true

      do

          read -n 1 -s -p "" _answer

          [ "${_answer}" == "Y" -o "${_answer}" == "y" -o "${_answer}" == "N" -o "${_answer}" == "n" -o "${_answer}" == "" ] && echo && break

      done

      if [ "${_answer}" == "Y" -o "${_answer}" == "y" ]

      then

          mkdir -p /tmp/spam_training/

          cd /tmp/spam_training/

          wget -c http://www.iit.demokritos.gr/skel/i-config/downloads/lingspam_public.tar.gz

          tar xvzf ./lingspam_public.tar.gz

          rm -vf ./lingspam_public.tar.gz

          cd ./lingspam_public

          find . -name "*.txt.gz" -exec gunzip -v "{}" ";"

          mkdir -p ./{spam,ham}

          for foo in $(find . -regex "^.*/spmsg.*\.txt")

          do

                mv -fv ${foo} ./spam/$(md5sum ${foo}|awk '{print $1}')

          done

          for foo in $(find . -name "*.txt" -not -name "readme.txt")

          do

                mv -fv ${foo} ./ham/$(md5sum ${foo}|awk '{print $1}')

          done

          dspam_train ${1} ./spam ./ham

          cd /tmp/spam_training/

          rm -rf ./lingspam_public

          cd /tmp/spam_training/

          dspam_clean -s0 -p0 ${1}

      fi

      ##

      ## Preprocessed Enron corpus

      ##   --> http://www.iit.demokritos.gr/skel/i-config/downloads/

      ##

      echo -n "Train Enron corpus data [y/N] "

      while true

      do

          read -n 1 -s -p "" _answer

          [ "${_answer}" == "Y" -o "${_answer}" == "y" -o "${_answer}" == "N" -o "${_answer}" == "n" -o "${_answer}" == "" ] && echo && break

      done

      if [ "${_answer}" == "Y" -o "${_answer}" == "y" ]

      then

          mkdir -p /tmp/spam_training/

          cd /tmp/spam_training/

          for foo in 1 2 3 4 5 6

          do

                wget -c http://www.iit.demokritos.gr/skel/i-config/downloads/enron-spam/preprocessed/enron${foo}.tar.gz

                tar xvzf ./enron${foo}.tar.gz

                rm -vf ./enron${foo}.tar.gz

                dspam_train ${1} ./enron${foo}/spam ./enron${foo}/ham

                rm -rf ./enron${foo}

                dspam_clean -s0 -p0 ${1}

          done

          cd /tmp/spam_training/

      fi

      ##

      ## ASSP corpus

      ##   --> http://assp.sourceforge.net/

      ##

      echo -n "Train ASSP corpus data [y/N] "

      while true

      do

          read -n 1 -s -p "" _answer

          [ "${_answer}" == "Y" -o "${_answer}" == "y" -o "${_answer}" == "N" -o "${_answer}" == "n" -o "${_answer}" == "" ] && echo && break

      done

      if [ "${_answer}" == "Y" -o "${_answer}" == "y" ]

      then

          mkdir -p /tmp/spam_training/

          cd /tmp/spam_training/

          wget -c http://easynews.dl.sourceforge.net/sourceforge/assp/asspsmpl-0.1.tgz

          tar xvzf ./asspsmpl-0.1.tgz

          rm -vf ./asspsmpl-0.1.tgz

          dspam_train ${1} ./asspsmpl/spam ./asspsmpl/notspam

          rm -rf ./asspsmpl

          cd /tmp/spam_training/

          dspam_clean -s0 -p0 ${1}

      fi

      ##

      ## Untroubled SPAM corpus

      ##   --> http://untroubled.org/spam/

      ##

      echo -n "Train Untroubled SPAM corpus data [y/N] "

      while true

      do

          read -n 1 -s -p "" _answer

          [ "${_answer}" == "Y" -o "${_answer}" == "y" -o "${_answer}" == "N" -o "${_answer}" == "n" -o "${_answer}" == "" ] && echo && break

      done

      if [ "${_answer}" == "Y" -o "${_answer}" == "y" ]

      then

          mkdir -p /tmp/spam_training/

          cd /tmp/spam_training/

          wget -c --level=1 --cut-dirs=1 -r http://untroubled.org/spam/

          cd ./untroubled.org

          find . -not -name "*.7z" -exec rm -vf "{}" ";"

          mkdir -p ./{ham,spam}

          for foo in *.7z

          do

                7z e ${foo} -o./spam

                rm -vf ${foo}

          done

          dspam_train ${1} ./spam ./ham

          cd /tmp/spam_training/

          rm -rf ./untroubled.org

          dspam_clean -s0 -p0 ${1}

      fi

      ##

      ## Cleanup

      ##

      cleanup

fi
```

These are just some of the available spam/ham corpus archives. If someone needs more, let me know. I have some others, but most of them are pure spam corpora. Finding ham corpora is not so easy: most of them are either mailing lists or similar material. Pure email ham corpora are very rare, and most of them are in English. (I have some in other languages such as German, Swiss German, Italian, Russian and other Slavic languages. I could make the spam part available but not the ham part. Contact me directly if you need more info.)
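The y/N prompt loop that the script repeats before every corpus could be factored into one helper. This is only a sketch: `ask_yes_no` is a made-up name, and unlike the original `read -n 1 -s` it reads a whole line (so you have to press Enter), which keeps it working in shells without those `read` options.

```bash
#!/bin/bash
# Hypothetical helper: ask a y/N question, return 0 for yes and 1 for no.
# An empty line or end of input counts as "no", matching the [y/N] default.
ask_yes_no() {
    local answer
    while true
    do
        printf '%s [y/N] ' "$1"
        IFS= read -r answer || return 1
        case "${answer}" in
            [Yy]) return 0 ;;
            [Nn]|"") return 1 ;;
        esac
        # anything else: ask again
    done
}
```

Each section of the script would then start with something like `if ask_yes_no "Train TREC 2006 data"; then ... fi`.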

// SteveB

----------

## DNAspark99

Last night I turned it on for one of our own personal subdomains that gets a lot of spam; this is the first 'real' test of its performance as a mail gateway for us. It seems that it still requires some initial training per user, as several spams get through. Is that normal, or shouldn't the amount of training put in for the global_user help prevent this? (Or is this just 'new' spam that dspam has to learn?)

Also- with the training it failed on some messages, but only a very small percentage - I had to edit the index and remove the offending messages to get it to move on.

----------

## steveb

 *DNAspark99 wrote:*   

> Last night I turned it on for one of our own personal subdomains that gets a lot of spam; this is the first 'real' test of its performance as a mail gateway for us. It seems that it still requires some initial training per user, as several spams get through. Is that normal, or shouldn't the amount of training put in for the global_user help prevent this? (Or is this just 'new' spam that dspam has to learn?)

 DSPAM learns fast, but no training done in advance is as good as real-world training. The advance training was done with old spam, and new spam can be, and is, different. If you want fresh spam then have a look here (the new mass training script I posted includes this corpus as well). Just download the last 2 or 3 months and train DSPAM with it. This should ease the situation a bit, but it will not solve it. Your users will need to do some training. Not much, but they still need to do it. After a while most outside users will sooner or later end up in the DSPAM whitelist, and this will help as well. Anyway... I don't think your users will need to do much training. Maybe a handful of training sessions.

 *DNAspark99 wrote:*   

> Also- with the training it failed on some messages, but only a very small percentage - I had to edit the index and remove the offending messages to get it to move on.

 I know. The training script included with DSPAM is not the best and not rock solid. I have a patch for the situation you describe:

```
--- ./dspam-3.8.0/src/tools/dspam_train.in      2006-05-23 21:52:40.000000000 +0200

+++ ./dspam-3.8.0-new/src/tools/dspam_train.in  2007-09-24 21:53:29.386167177 +0200

@@ -123,7 +123,7 @@

     my($code, $cmd, $response);

     my($dir, $msg) = @_;

     print "[test: nonspam] " . substr($msg . " " x 32, 0, 32) .  " result: ";

-    $cmd = "$CONFIG{'DSPAM_BINARY'} --user $USER --deliver=summary < '$dir/$msg'";

+    $cmd = "$CONFIG{'DSPAM_BINARY'} --user $USER --deliver=summary --stdout < '$dir/$msg'";

     $response = `$cmd`;

     $code = "UNKNOWN";

@@ -131,10 +131,12 @@

         $code = $1;

     }

     if ($code eq "UNKNOWN") {

-        print "\n===== WOAH THERE =====\n";

-        print "I was unable to parse the result. Test Broken.\n";

-        print "======================\n";

-        exit(0);

+        # print "\n===== WOAH THERE =====\n";

+        # print "I was unable to parse the result. Test Broken.\n";

+        # print "======================\n";

+        # exit(0);

+        print "BROKEN result!!\n";

+        return;

     }

     if ($code eq "Innocent" || $code eq "Whitelisted") {

@@ -144,16 +146,21 @@

         my($signature) = "UNKNOWN";

         if ($response =~ /class="(\S+)"/i) {

             $class = $1;

+        } else {

+            print "BROKEN class!!\n";

+            return;

         }

         if ($response =~ /signature=(\S+)/i) {

             $signature = $1;

         } else {

-            print "\n===== WOAH THERE =====\n";

-            print "I was unable to find the DSPAM signature. Test Broken.\n";

-            print "======================\n";

-            print "\n$response\n";

-            exit(0);

+            # print "\n===== WOAH THERE =====\n";

+            # print "I was unable to find the DSPAM signature. Test Broken.\n";

+            # print "======================\n";

+            # print "\n$response\n";

+            # exit(0);

+            print "BROKEN signature!!\n";

+            return;

         }

         print "FAIL ($class)";

@@ -182,7 +189,7 @@

     my($dir, $msg) = @_;

     print "[test: spam   ] " . substr($msg . " " x 32, 0, 32) . " result: ";

-    $cmd = "$CONFIG{'DSPAM_BINARY'} --user $USER --deliver=summary < '$dir/$msg'";

+    $cmd = "$CONFIG{'DSPAM_BINARY'} --user $USER --deliver=summary --stdout < '$dir/$msg'";

     $response = `$cmd`;

     $code = "UNKNOWN";

@@ -190,29 +197,36 @@

         $code = $1;

     }

     if ($code eq "UNKNOWN") {

-        print "\n===== WOAH THERE =====\n";

-        print "I was unable to parse the result. Test Broken.\n";

-        print "======================\n";

-        exit(0);

+        # print "\n===== WOAH THERE =====\n";

+        # print "I was unable to parse the result. Test Broken.\n";

+        # print "======================\n";

+        # exit(0);

+        print "BROKEN result!!\n";

+        return;

     }

-    if ($code eq "Spam") {

+    if ($code eq "Spam" || $code eq "Blacklisted") {

         print "PASS";

     } else {

         my($class) = "UNKNOWN";

         my($signature) = "UNKNOWN";

         if ($response =~ /class="(\S+)"/i) {

             $class = $1;

+        } else {

+            print "BROKEN class!!\n";

+            return;

         }

         if ($response =~ /signature=(\S+)/i) {

             $signature = $1;

         } else {

-            print "\n===== WOAH THERE =====\n";

-            print "I was unable to find the DSPAM signature. Test Broken.\n";

-            print "======================\n";

-            print "\n$response\n";

-            exit(0);

+            # print "\n===== WOAH THERE =====\n";

+            # print "I was unable to find the DSPAM signature. Test Broken.\n";

+            # print "======================\n";

+            # print "\n$response\n";

+            # exit(0);

+            print "BROKEN signature!!\n";

+            return;

         }

         print "FAIL ($class)";
```

I will include the patches this week in a new bug report at bugs.gentoo.org. I am waiting here to see if you have more problems before posting the bug report.

// SteveB

----------

## steveb

For the fun of it... I tested http://untroubled.org/spam/2007-09.7z with my merged group user and so far on the whole corpus I only had 4 errors. I have not before trained my merged group with that corpus. But I have to confess that I do inject daily around 100 to 200 fresh spam/ham messages into my merged group. Probably that is the reason why the accuracy is so high.

I have multiple levels where I fight spam. DSPAM is just the last level. Having 4 errors on 22'915 unconfirmed spam mails on that last level is nothing.

My average spam level is below 5%. On normal days I am around 1% to 2% spam compared to the total inbound mail.

My reject level for today is at around 90% relative to allowed mail (if I count rejected + allowed as the total, the rejected mails are around 47% today). Basically, today for every 19 messages reaching my system I allowed 10 and rejected 9. And for today the stats tell me that I am at around 1% spam, which means that for every 100 allowed messages there is 1 spam mail tagged by DSPAM. And according to the stats there is not a single FP/FN today :) I find this amazing.
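To make the two percentages concrete, here is the arithmetic as a small sketch. The `percent` helper is just an illustration (it is not taken from any stats tool), and the 9 rejected / 10 allowed figures are the ones from the paragraph above:

```bash
#!/bin/bash
# percent PART WHOLE: print PART as a percentage of WHOLE, rounded to an integer
percent() {
    awk -v p="$1" -v w="$2" 'BEGIN { printf "%.0f\n", 100 * p / w }'
}

rejected=9
allowed=10

percent "$rejected" "$allowed"                  # rejected relative to allowed -> 90
percent "$rejected" $((rejected + allowed))     # rejected relative to all inbound -> 47
```

So the same day can be described as "90% rejected" or "47% rejected", depending on what you divide by.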

I try to reject as much as possible but only as much as needed. My users can easily handle up to 5% spam mail. This is not much for them. At least DSPAM tags the spam mails correctly. My FP/FN rate is ultra low (below 1%, in the 0.x% range).

But it took me some time to get there. The problem I have is that I have a lot of different domains. Each of them has its own requirements, and yet I have to fulfill them. So I added to each of my levels the possibility for either a domain user or the domain owner (a domain owner can enforce policies on his domain which are valid for all users in that domain) to change settings. Each of them can turn the various checks on or off, or influence them.

And I think this is the key for successful filtering. Roll out good defaults (for the lazy users) but allow power users to change their settings. And let DSPAM mature. DSPAM is like a good wine. Take care that it runs and it will by itself get better and better the more time passes.

// SteveB

----------

## DNAspark99

Encountering the odd issue of internal emails being blocked - I've read up and from what I understand, there's no way to 'automatically' whitelist a domain or address, is there? Some of our clients are getting the dspam notifications themselves quarantined!

----------

## steveb

 *DNAspark99 wrote:*   

> Encountering the odd issue of internal emails being blocked

 Could you reroute the internal mails to not pass over DSPAM? Can you programmatically identify what mails are internal?

 *DNAspark99 wrote:*   

> I've read up and from what I understand, there's no way to 'automatically' whitelist a domain or address, is there?

 That is not true. You can whitelist them. The problem is that DSPAM takes the whole FROM header line for whitelisting. You could whitelist addresses but not domains.

Allow me quickly to explain how DSPAM does the whitelisting. The whole FROM line is used to assemble the token for whitelisting. Let's take this line as an example:

```
From: John Doe <john.doe@example.com>
```

DSPAM would produce for that the following token:

```
mail ~ # dspam_crc "From*John Doe <john.doe@example.com>"

TOKEN: 'From*John Doe <john.doe@example.com>' CRC: 12034071734753829144

mail ~ #
```

As you see DSPAM takes the From line by replacing the ": " with a "*" and then calculates the token for it. The token would be in this case "12034071734753829144". Then DSPAM does query the DSPAM database for that token and the DSPAM user (let's assume the DSPAM user has UID 4):

```
SELECT * FROM dspam_token_data WHERE uid=4 AND token='12034071734753829144';

+-----+----------------------+-----------+---------------+------------+

| uid | token                | spam_hits | innocent_hits | last_hit   |

+-----+----------------------+-----------+---------------+------------+

|   4 | 12034071734753829144 |         0 |          3286 | 2007-09-30 |

+-----+----------------------+-----------+---------------+------------+
```

Then DSPAM reads the preferences of the DSPAM user and looks for the parameter "whitelistThreshold". If it does not find preferences for the DSPAM user, it uses the global preference for that value (which is normally 5). If the DSPAM user (uid 4 in our example) has more than whitelistThreshold innocent hits for that token, then the token (the From line) is considered whitelisted.
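The threshold rule as described here can be written down as a tiny check. This is a simplified sketch of the description above, not DSPAM's actual code (DSPAM may look at more than the innocent hit count), and `is_whitelisted` is a made-up name:

```bash
#!/bin/bash
# Hypothetical illustration of the rule described above:
# a From token counts as whitelisted when its innocent_hits
# exceed whitelistThreshold (5 when no preference is set).
is_whitelisted() {
    local innocent_hits="$1" threshold="${2:-5}"
    [ "$innocent_hits" -gt "$threshold" ]
}
```

With the row from the query above, `is_whitelisted 3286 5` succeeds, while a sender with exactly 5 innocent hits would not be whitelisted yet.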

The problem with the way DSPAM does whitelisting is that the same sender can show up with several completely valid From lines, and each one produces a different token. For example:

```
mail ~ # dspam_crc "From*John Doe <john.doe@example.com>"

TOKEN: 'From*John Doe <john.doe@example.com>' CRC: 12034071734753829144

mail ~ # dspam_crc "From*<john.doe@example.com>"

TOKEN: 'From*<john.doe@example.com>' CRC: 7585973361526758504

mail ~ # dspam_crc "From*JOHN DOE <john.doe@example.com>"

TOKEN: 'From*JOHN DOE <john.doe@example.com>' CRC: 15502441436662596355

mail ~ # dspam_crc "From*<JOHN.DOE@EXAMPLE.COM>"

TOKEN: 'From*<JOHN.DOE@EXAMPLE.COM>' CRC: 12665003408387760854

mail ~ #
```

Each of them is valid and probably from the same sender but DSPAM is handling them differently since for DSPAM they are not the same address/token.
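If you want to check tokens for your own senders, the header-to-token-string step described above (replace the first ": " with "*") can be sketched like this. `whitelist_token_string` is a hypothetical helper name; only `dspam_crc` itself comes with DSPAM:

```bash
#!/bin/bash
# Hypothetical helper: turn a raw From header line into the string that
# dspam_crc is fed in the examples above ("From: x" becomes "From*x").
whitelist_token_string() {
    # only the first ": " is replaced; the rest of the line is left untouched
    printf '%s\n' "$1" | sed 's/: /*/'
}

whitelist_token_string "From: John Doe <john.doe@example.com>"
whitelist_token_string "From: <JOHN.DOE@EXAMPLE.COM>"
```

The output of each call could then be passed on as `dspam_crc "$(whitelist_token_string "$header")"`. Since nothing is normalised, different spellings of the same sender stay different strings and therefore get different CRCs.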

 *DNAspark99 wrote:*   

> Some of our clients are getting the dspam notifications themselves quarantined!

 This is strange. What output do you get from this command:

```
grep "^From\:" /etc/mail/dspam/txt/*.txt
```

// SteveB

----------

## DNAspark99

 *steveb wrote:*   

> Could you reroute the internal mails to not pass over DSPAM? Can you programmatically identify what mails are internal?
> 
> 

 

Actually I just tried this; adding/replacing the relevant sections as per the basic dspam howto

dspam.conf :

```
DeliveryHost        127.0.0.1

DeliveryPort        10025

DeliveryIdent       localhost

DeliveryProto       SMTP
```

/etc/postfix/master.cf:

```

127.0.0.1:10025 inet    n       -       n       -       -       smtpd

  -o smtpd_authorized_xforward_hosts=127.0.0.0/8

  -o smtpd_client_restrictions=

  -o smtpd_helo_restrictions=

  -o smtpd_sender_restrictions=

  -o smtpd_recipient_restrictions=permit_mynetworks,reject

  -o mynetworks=127.0.0.0/8

  -o receive_override_options=no_unknown_recipient_checks

```

...which resulted in an infinite loop of undeliverable mail. I'm not too sure why just yet; I panicked and reset the configs. I'll play with this more after hours. I think this may be a good way to go, since mail sent from the command line of 'localhost' (the dspam machine) is also being run through dspam, which is probably not required, and changing that may (hopefully!) solve the issue where dspam notifications are being quarantined for some users...

 *steveb wrote:*   

> 
> 
>  *DNAspark99 wrote:*   Some of our clients are getting the dspam notifications themselves quarantined! This is strange. What output do you get from this command:
> 
> ```
> ...

 

I've got the 'From:' set to 'support@mydomain.com'

----------

## steveb

hmmm... Please post your whole main.cf and master.cf. This would help.

As for support@mydomain.com: is the domain really mydomain.com, or did you just use mydomain.com for this post while the configuration actually contains your real domain name?

// SteveB

----------

## DNAspark99

heh, yea, when I say 'mydomain.com' I mean 'my real work domain name'... in other words, it's legit. I'm actually wondering if I should generate a few of these firstrun messages and feed them to dspam_train for the global_merged_user until it's 'certain' they're ham?

well, here's the current running/working config:

$ cat /etc/postfix/master.cf | grep -v ^#

```
smtp      inet  n       -       n       -       -       smtpd

   -o content_filter=lmtp:unix:/var/run/dspam/dspam.sock

dspam     unix  -       n       n       -       10      pipe

    flags=Rhqu user=dspam argv=/usr/bin/dspam --deliver=innocent --user ${recipient} -i -f ${sender} -- ${recipient}

pickup    fifo  n       -       n       60      1       pickup

cleanup   unix  n       -       n       -       0       cleanup

qmgr      fifo  n       -       n       300     1       qmgr

tlsmgr    unix  -       -       n       1000?   1       tlsmgr

rewrite   unix  -       -       n       -       -       trivial-rewrite

bounce    unix  -       -       n       -       0       bounce

defer     unix  -       -       n       -       0       bounce

trace     unix  -       -       n       -       0       bounce

verify    unix  -       -       n       -       1       verify

flush     unix  n       -       n       1000?   0       flush

proxymap  unix  -       -       n       -       -       proxymap

smtp      unix  -       -       n       -       -       smtp

relay     unix  -       -       n       -       -       smtp

        -o fallback_relay=

showq     unix  n       -       n       -       -       showq

error     unix  -       -       n       -       -       error

discard   unix  -       -       n       -       -       discard

local     unix  -       n       n       -       -       local

virtual   unix  -       n       n       -       -       virtual

lmtp      unix  -       -       n       -       -       lmtp

anvil     unix  -       -       n       -       1       anvil

scache    unix  -       -       n       -       1       scache

maildrop  unix  -       n       n       -       -       pipe

  flags=DRhu user=vmail argv=/usr/local/bin/maildrop -d ${recipient}

old-cyrus unix  -       n       n       -       -       pipe

  flags=R user=cyrus argv=/usr/lib/cyrus/deliver -e -m ${extension} ${user}

cyrus     unix  -       n       n       -       -       pipe

  flags=hu user=cyrus argv=/usr/lib/cyrus/deliver -e -r ${sender} -m ${extension} ${user}

virt-cyrus     unix  -       n       n       -       -       pipe

  flags=hu user=cyrus argv=/usr/lib/cyrus/deliver -e -r ${sender} -m ${recipient} ${user}

uucp      unix  -       n       n       -       -       pipe

  flags=Fqhu user=uucp argv=uux -r -n -z -a$sender - $nexthop!rmail ($recipient)

ifmail    unix  -       n       n       -       -       pipe

  flags=F user=ftn argv=/usr/lib/ifmail/ifmail -r $nexthop ($recipient)

bsmtp     unix  -       n       n       -       -       pipe

  flags=Fq. user=foo argv=/usr/local/sbin/bsmtp -f $sender $nexthop $recipient

```

$ cat /etc/postfix/main.cf | grep -v ^# | grep -v ^$

```
queue_directory = /var/spool/postfix

command_directory = /usr/sbin

daemon_directory = /usr/lib/postfix

mail_owner = postfix

myhostname = spamstop.mydomain.com

mydomain = mydomain.com

myorigin = spamstop.mydomain.com

unknown_local_recipient_reject_code = 550

mynetworks = 10.200.100.0/24

relayhost = [10.200.100.73]

debug_peer_level = 2

debugger_command =

         PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin

         xxgdb $daemon_directory/$process_name $process_id & sleep 5

sendmail_path = /usr/sbin/sendmail

newaliases_path = /usr/bin/newaliases

mailq_path = /usr/bin/mailq

setgid_group = postdrop

html_directory = /usr/share/doc/postfix-2.3.6/html

manpage_directory = /usr/share/man

sample_directory = /etc/postfix

readme_directory = /usr/share/doc/postfix-2.3.6/readme

home_mailbox = .maildir/

dspam_destination_recipient_limit = 1

virtual_mailbox_domains = mysql:/etc/postfix/mysql_configs/domains.cf

virtual_mailbox_maps = mysql:/etc/postfix/mysql_configs/atmail_users.cf

virtual_alias_maps = mysql:/etc/postfix/mysql_configs/atmail_aliases.cf

virtual_transport = lmtp:unix:/var/run/dspam/dspam.sock

alias_maps = hash:/etc/mail/aliases

smtpd_helo_required = yes

disable_vrfy_command = yes

smtpd_recipient_restrictions = permit_mynetworks

        reject_invalid_hostname 

        reject_non_fqdn_hostname 

        reject_non_fqdn_sender 

        reject_non_fqdn_recipient 

        reject_unknown_sender_domain 

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_alias.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_aliasdomain_users.cf

        check_recipient_access mysql:/etc/postfix/mysql_configs/valid_aliasdomain_alias.cf

        reject

```

Actually in posting this just now (and getting a clear look at my config without all the spacing and comments) I suspect the line virtual_transport = lmtp:unix:/var/run/dspam/dspam.sock may have had something to do with the infinite mail loop when using the other config... but that's just my current suspicion... 

(those differences being:)

```
--- master.cf   2007-10-01 09:46:17.000000000 -0700

+++ master.cf.test      2007-10-01 09:46:11.000000000 -0700

@@ -8,10 +8,22 @@

 # ==========================================================================

 #smtp      inet  n       -       n       -       -       smtpd

 smtp      inet  n       -       n       -       -       smtpd

-   -o content_filter=lmtp:unix:/var/run/dspam/dspam.sock

+#   -o content_filter=lmtp:unix:/var/run/dspam/dspam.sock

 #    -o content_filter=dspam:

-dspam     unix  -       n       n       -       10      pipe

-    flags=Rhqu user=dspam argv=/usr/bin/dspam --deliver=innocent --user ${recipient} -i -f ${sender} -- ${recipient}

+#dspam     unix  -       n       n       -       10      pipe

+#    flags=Rhqu user=dspam argv=/usr/bin/dspam --deliver=innocent --user ${recipient} -i -f ${sender} -- ${recipient}

+#

+#

+dspam     unix  -       -       n       -       10      lmtp

+#

+127.0.0.1:10025 inet    n       -       n       -       -       smtpd

+  -o smtpd_authorized_xforward_hosts=127.0.0.0/8

+  -o smtpd_client_restrictions=

+  -o smtpd_helo_restrictions=

+  -o smtpd_sender_restrictions=

+  -o smtpd_recipient_restrictions=permit_mynetworks,reject

+  -o mynetworks=127.0.0.0/8

+  -o receive_override_options=no_unknown_recipient_checks

 #

 #dspam-ham     unix   -      n       n       -        -      pipe

 #   flags=Rhq user=dspam:mail argv=/usr/bin/dspam
```

dspam.conf

```
-DeliveryHost        10.200.100.73

-DeliveryPort        25

-DeliveryIdent       spamstop.mydomain.com

-DeliveryProto     SMTP

+DeliveryHost        127.0.0.1

+DeliveryPort        10025

+DeliveryIdent       localhost

+DeliveryProto       SMTP

```

----------

## DNAspark99

one more thing on a slightly unrelated note, regarding the global database for global_merged_user - I'm not entirely sure if mail coming in is processed against all the training that's been done - I do have the following:

$ cat /var/spool/dspam/group 

```
global_merged_user:merged:*

```

However, it's a little difficult to tell if this is working; new accounts seem to have to train their own spam initially.

Reading the relay.txt it does have this blurb:

```

GLOBAL DATABASES

If you're thinking about going with a global database, I strongly recommend

using merged groups + toe instead of a single global group. To do this, just

follow the README directions for setting one up and leave everything the way

it is. If, however, you insist on a single global group, you'll need to make

one change to dspam.conf to accomodate this configuration. Add

--user [globaluser] to your ServerParameters property. This will cause all

mail to be processed using this user, but will still deliver using the

recipient information.

```

Currently my dspam.conf has this:

```
ServerParameters        "--deliver=innocent"

```

I'm wondering, should I add the "--user global_merged_user", or is this overruled with the group file?

----------

## steveb

 *DNAspark99 wrote:*   

> I'm wondering, should I add the "--user global_merged_user", or is this overruled with the group file?

 No! Don't add that if you don't want the global_merged_user to be the ONLY user in DSPAM. If you want each user to have his own tokens, then leave the setup the way you have it right now.

// SteveB

----------

## steveb

 *DNAspark99 wrote:*   

> /etc/postfix/master.cf:
> 
> ```
> smtp      inet  n       -       n       -       -       smtpd
> 
> ...

 

Yes. This is a problem. You filter twice and then later in dspam.conf you have the loop. First you have a content filter (DSPAM) on SMTP (port 25) and then you deliver virtually with DSPAM and DSPAM then again injects the mail back on the SMTP port. This is the loop.

So basically you do this:

--> message --> port 25 --> content filter (DSPAM) --> processing of message --> delivery of message to 10.200.100.73 on port 25 --> loop

I think it would be best if you could write down who should deliver the mail. Do you want DSPAM to deliver the mail? Do you want Postfix to deliver the mail?

Does this box (I assume it has IP 10.200.100.73) have more than one IP? Does it have an external and an internal IP?

If I am not mistaken, you receive the mail on another system and the DSPAM box is just responsible for tagging the message and then sending it on to another system. Is that right?

--> box 1 receives internet mail --> box 2 (IP 10.200.100.73) receives mail and processes it with DSPAM --> box 3? box 1? who is then getting the mail in the next hop? Or is this hop the final destination?

// SteveB

----------

## DNAspark99

[firewall] -> [box1: dspam] -> [box2: atmail]

the firewall is doing its job, restricting port access based on function (there's several web servers, db servers, file servers, subnets and whatnot behind there)

box1 and box2 are on the same subnet, 10.200.100.0

box1 is my dspam box, box2 is the actual mailserver.  10.200.100.72 and 10.200.100.73 respectively

my current config works rather well, just the odd case where 'quarantine reminders' generated by a script are getting caught up and marked by dspam.. It doesn't happen often, and even cron messages from the system itself were occasionally tagged as well - although they are frequent enough that they now appear to be whitelisted, so perhaps I just need to wait for more of these messages to be generated so they're whitelisted as well... I don't think there's much benefit to be gained in this area by altering the way dspam+postfix deliver messages, is there?

A more interesting issue is the occasional occurrence in the CGI where the 'Type' and 'Additional Info' of a message don't match : 'SPAM + delivered' or 'Good + Quarantined' -  the 'additional info' on these is usually the opposite of whatever dspam actually did with these messages, although this is rare - *maybe* 1 out of 100 seem to have this occur

----------

## steveb

 *DNAspark99 wrote:*   

> [firewall] -> [box1: dspam] -> [box2: atmail]

 Okay. I see now. How about something like this?

main.cf:

```
virtual_transport = lmtp:unix:/var/run/dspam/dspam.sock
```

master.cf:

```
smtp      inet  n       -       n       -       -       smtpd
```

dspam.conf:

```
DeliveryHost             10.200.100.72
DeliveryPort             25
DeliveryIdent            FQDN of your system
DeliveryProto            SMTP
ServerMode               auto
ServerDomainSocketPath   "/var/run/dspam/dspam.sock"
```

With this, Postfix receives the mail and hands virtual mail to DSPAM over LMTP; DSPAM then delivers the mail over SMTP to 10.200.100.72.

btw: Where are you doing anti virus scanning?

btw2: For the system messages you could send them directly to "smtp:[10.200.100.72]" with a simple transport map. Or you could make Postfix listen on 127.0.0.1 (assuming you send your locally generated system messages over localhost) and then deliver anything arriving there directly to 10.200.100.72:

master.cf:

```
127.0.0.1:smtp      inet  n       -       n       -       -       smtpd
  -o local_transport=smtp:[10.200.100.72]
  -o virtual_transport=smtp:[10.200.100.72]
  -o default_transport=smtp:[10.200.100.72]
```

// SteveB

----------

## DNAspark99

Actually that's not too far off from what I'm doing now... it works, so I won't fiddle with the setup more than necessary, and it's just the odd 'reminder' email quarantined that's the issue here, so I'll play with setting up a 'clean' smtp on localhost for delivery of these messages, that should avoid the issue altogether, if I understand it correctly...

AV scanning (clamav) is actually done on the real mailserver  - I've debated integrating clamav+dspam together, but it's a bit redundant, no? Currently, not all clients are using dspam, and they do have the option to opt-out, so leaving AV on the mailserver itself makes sense - it's 'guaranteed' to process ALL users mail this way.

Here's another question for ya: say we've got alias1 pointing to user1 + user2 @somedomain.com - it seems that some obvious spam sent to that alias1 gets caught for user1 but let through for user2 - is this just a matter of more training required for user2??

Thanks again for everything so far!

----------

## steveb

 *DNAspark99 wrote:*   

> is this just a matter of more training required for user2??

 Yes. User2 needs more training (or your global merged group/user needs more training).

// SteveB

----------

## DNAspark99

so for any sort of 'pattern' of spam that gets through for user2, it would be a good idea to not just re-train for that user, but capture and store this email to feed into dspam_train? aaaaaaaah, ok

----------

## steveb

Have you trained DSPAM with your OWN ham/spam messages? If not: Why not?

Have you thought about setting up honey traps for spam and feeding the captured stuff into DSPAM (into the merged group, maybe using inoculation)?

Have you thought about inoculating outbound mails as ham into DSPAM (into the merged group)?
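For the honey trap idea, a minimal sketch of how the captured mail could be fed in. It assumes one message per file (e.g. a Maildir 'new' directory); the helper name `inoculate_spam_dir`, the `DSPAM_BIN` override and the honeypot path are made up for illustration, and whether you feed the merged user or individual users is up to your setup.

```shell
#!/bin/sh
# Sketch: feed every message file in DIR to dspam as spam for USER,
# using the inoculation source. Prints how many messages dspam accepted.
# inoculate_spam_dir and DSPAM_BIN are hypothetical names, not part of dspam.
inoculate_spam_dir() {
    dir=$1
    user=$2
    fed=0
    for msg in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory
        [ -f "$msg" ] || continue
        if "${DSPAM_BIN:-dspam}" --user "$user" --class=spam \
               --source=inoculation < "$msg" > /dev/null; then
            fed=$((fed + 1))
        fi
    done
    echo "$fed"
}

# Example use (honeypot path is an assumption):
# inoculate_spam_dir /home/honeypot/Maildir/new global_merged_user
```

Messages that dspam rejects are simply not counted, so comparing the printed number against the number of files gives a quick sanity check.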

// SteveB

----------

## DNAspark99

Well today, out of the blue, when the dspam.cron was run around 3am, something, somewhere, went horribly wrong, and mail delivery was interrupted for a few hours as dspam went sideways, mainly taking mysql with it. 

Still sorting through the pieces trying to figure out what went wrong with it; it appears to have corrupted the dspam_tokens table, completely halting normal mail delivery

dspam has been disabled in the meantime

----------

## steveb

What MySQL storage engine do you use for the DSPAM tables? MyISAM? InnoDB?

// SteveB

----------

## DNAspark99

I believe it's InnoDB. 

The issue was related to the token table optimization spilling out of memory on to disk - and /tmp being a small 100MB tmpfs filesystem. Once that was filled up by mysql's reworking of the table, it became corrupted, dspam broke, and mail stopped being delivered. This post highlights the issue we had with /tmp: 

http://www.sage.org/lists/sage-members-archive/2007/msg01016.html

Once we knew what caused it, it was easy enough to sort out, and steps have been taken to ensure this particular issue doesn't happen again :)
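For anyone wanting to guard against the same thing, here is a rough sketch of a pre-flight check that could run before the table optimization in the cron job. The helper name `has_free_kb` and the threshold in the example are assumptions; the directory to check is whatever MySQL uses for its temporary files on your box.

```shell
#!/bin/sh
# Sketch: succeed only if DIR has at least NEED_KB kilobytes available.
# has_free_kb is a hypothetical helper, not part of dspam or MySQL.
has_free_kb() {
    dir=$1
    need_kb=$2
    # df -P prints one POSIX-format line per filesystem;
    # with -k, column 4 is the available space in kilobytes
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR==2 {print $4}')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}

# Example use (threshold is an assumption; size it to your token table):
# has_free_kb /tmp 2097152 || { echo "not enough room in /tmp" >&2; exit 1; }
```

A directory that doesn't exist counts as "not enough space", so a typo in the path fails safe.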

----------

## DNAspark99

for documentation purposes:

I just encountered a user with an 8MB attachment having their mail bounced back from the dspam machine - despite the setting in dspam.conf:

```
MaxMessageSize 20971520
```

turns out it was a postfix limitation, setting the following in /etc/postfix/main.cf fixed it:

```
message_size_limit = 20971520
```
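Some background on why 8MB was enough to trip it: Postfix's default message_size_limit is 10240000 bytes, and an attachment grows by roughly a third once it's base64-encoded, so an 8MB file ends up over that default. A small sketch for checking that the two limits agree; the helper names and the dspam.conf path in the example are assumptions.

```shell
#!/bin/sh
# Sketch: print the MaxMessageSize value from a dspam.conf-style file.
# dspam_max_size is a hypothetical helper name.
dspam_max_size() {
    # Commented-out lines don't match; the last active setting wins
    awk '$1 == "MaxMessageSize" {v = $2} END {if (v != "") print v}' "$1"
}

# Succeeds if the Postfix limit is at least as large as the DSPAM limit.
limits_consistent() {
    postfix_limit=$1
    dspam_limit=$2
    [ "$postfix_limit" -ge "$dspam_limit" ]
}

# Example use on a live box (the dspam.conf path is an assumption):
# limits_consistent "$(postconf -h message_size_limit)" \
#                   "$(dspam_max_size /etc/mail/dspam/dspam.conf)" \
#     || echo "Postfix will reject mail that DSPAM would accept" >&2
```

The comparison is a plain numeric one, so it does not special-case a Postfix limit of 0.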

----------

## DNAspark99

Ok, and now another surprise: 

All of a sudden, trying to re-train messages as spam in the history page of the webui doesn't work?!

After some digging around, it appears the dspam_signature_data table is 1.1G in size, and has an mtime of 3:30ish am, so it hasn't been updated in nearly 9 hours?

As it turns out, I can successfully retrain anything prior to this 3:30am 'cut off' - the two messages closest to this are one at 3:11am and one at 3:52am - I can retrain the one at 3:11 but not the one at 3:52 - so for some reason, it appears dspam stopped writing to this table (the table takes manual inserts without issue)...so, wtf...odd

----------

## DNAspark99

 *DNAspark99 wrote:*   

> Ok, and now another surprise: 
> 
> All of a sudden, trying to re-train messages as spam in the history page of the webui doesn't work?!
> 
> After some digging around, it appears the dspam_signature_data table is 1.1G in size, and has an mtime of 3:30ish am, so it hasn't been updated in nearly 9 hours?
> ...

 

interestingly, running 'dspam_clean -s7 myuser@mydomain.com' seems to have (temporarily?) 'kick-started' dspam into working again - for any new message coming through, the table mtime now updates, and I can re-train on the new message. Anything previous to this though, is untrainable, since its signature apparently never got written to the table. Now the question is... what happened there, and why? And how long until it happens again?
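To catch it sooner next time, something along these lines could watch the mtime from cron. This is only a sketch: `is_stale` is a made-up helper, the path is a placeholder for whichever file you were checking the mtime on, and `-mmin` needs GNU or BSD find.

```shell
#!/bin/sh
# Sketch: succeed if FILE was last modified more than MINUTES minutes ago.
# is_stale is a hypothetical helper name.
is_stale() {
    file=$1
    minutes=$2
    # find prints the path only when the mtime test matches;
    # a missing file prints nothing and so counts as "not stale"
    [ -n "$(find "$file" -mmin +"$minutes" 2>/dev/null)" ]
}

# Example use from cron (path and threshold are assumptions):
# is_stale /path/to/dspam_signature_data-file 60 \
#     && echo "dspam_signature_data not written to in over an hour" >&2
```

On a quiet mail server a long gap between writes can be perfectly normal, so the threshold needs to match your traffic.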

----------

## steveb

Normally /etc/cron.daily/dspam.cron should do the cleaning. Could it be that your crond is not processing /etc/cron.daily/dspam.cron?

// SteveB

----------

## DNAspark99

Yea, dspam.cron IS running, I just modified it to put out an email notification after it runs the dspam_clean, something odd is going on... it's fixed now, but I'll have to keep an eye on it

----------

## DNAspark99

Well, all of a sudden, without warning or change, DSPAM has stopped working reliably. Messages are incredibly delayed, if they come through at all, and if they do come through, they're often repeated, as if they're 'stuck' in the queue. I have no idea. It's almost 3am and I've been dealing with this for several hours now, trying to figure it out. Still no luck, so I've had to disable dspam as the MX for now. 

Problem seems similar to this: 

https://forums.gentoo.org/viewtopic-t-619112-postdays-0-postorder-asc-highlight-dspam-start-50.html

Updated to latest (3.8.0-r11) dspam, didn't seem to fix anything. 

I think dspam is crapping out on certain messages, but will have to troubleshoot this in the morning.

----------

## DNAspark99

well so far, it looks like there was a single (large: 4.5MB of actual msg body!) message 'clogging' the queue, by strangling dspam. 

I suspect having:

```
MaxMessageSize 20971520
```

 (20MB!)

may have been a bad idea. But it's not like dspam was taking up a bunch of CPU to process this, it seemed to just quietly choke & die on it, without actually segfaulting or anything. But at the time, dspam process would not stop cleanly with '/etc/init.d/dspam restart', the pid would have to be killed by hand. (kill -s 9)

Still mitigating the issue, but one thing I changed last night in the confusion was the config from using dspam.sock, to 127.0.0.1:10023 (and postfix to use lmtp:127.0.0.1:10023), although I suspect I could change this back now that the suspect message has been DELETED from the queue. The queue has been placed on hold and all traffic re-routed to the main MX for the time being, as I'm working on re-queueing the 1600 some-odd messages that have been held captive since sometime yesterday. They are now being delivered, and without that offending message in the queue, the issue seems to have subsided.

note: once on hold, I've scripted up the following quick-n-dirty script to more easily manage processing of the queue - this helped in isolation of the 'offending' message:

```
#!/usr/bin/perl
#
# Script to re-queue X amount of msgs placed on hold, default is 10
# In this way, if a msg is 'stuck' or breaking dspam, it can be determined
# which one is the culprit, as it will be within X of the first few msgs
# as listed in 'ls /var/spool/postfix/hold | head -X', where X is the number
#
# Requeued msgs may sit in the queue until postfix decides to retry,
# but a 'postqueue -f' should flush these newly requeued msgs instantly
use strict;
use warnings;

my $default = 10;
my $count   = defined $ARGV[0] ? $ARGV[0] : $default;

# Accept only a positive whole number (anchored at both ends)
unless ($count =~ /^[0-9]+$/ && $count > 0) {
        print qq|$count is NOT a valid number\n|;
        exit 1;
}

# Grab the queue summary line from 'postqueue -p' before requeueing
my $start_q = `postqueue -p | grep Requests`;
chomp $start_q;

my @msg_list = `ls /var/spool/postfix/hold/ | head -$count`;
chomp @msg_list;

my $requeued = 0;
foreach my $msg_id (@msg_list) {
        print qq|Requeueing $msg_id\n|;
        # List form of system() avoids passing the queue ID through a shell
        $requeued++ if system('postsuper', '-r', $msg_id) == 0;
}

# Report what was actually requeued, which may be fewer than requested
print qq|\n$requeued items requeued for delivery\n|;
print qq|$start_q\n|;

if ($count == $default) {
        print qq|\nUsage TIP: '$0 X' will requeue 'X' messages at once\n|;
}
```

----------

## opimon

Hi,

Excuse my English... it's not very good.

I have installed DSPAM as a relay on my server.

Why don't you use the FallbackDomain option?

You insert @example.com into your dspam_preferences table and you don't have any problems with mail aliases...

I haven't understood what Opt In and Opt Out are...

----------

