[c5c522c] | 1 | .\" $Id: dspam.1,v 1.20 2011/06/28 00:13:48 sbajic Exp $ |
---|
| 2 | .\" -*- nroff -*- |
---|
| 3 | .\" |
---|
| 4 | .\" dspam3.9 |
---|
| 5 | .\" |
---|
| 6 | .\" Authors: Jonathan A. Zdziarski <jonathan@nuclearelephant.com> |
---|
| 7 | .\" Stevan Bajic <stevan@bajic.ch> |
---|
| 8 | .\" |
---|
| 9 | .\" Copyright (C) 2002-2012 DSPAM Project |
---|
| 10 | .\" All rights reserved |
---|
| 11 | .\" |
---|
| 12 | .TH DSPAM 1 "Aug 14, 2010" "DSPAM" "DSPAM" |
---|
| 13 | |
---|
| 14 | .SH NAME |
---|
| 15 | dspam \- DSPAM Anti-Spam Agent |
---|
| 16 | |
---|
| 17 | .SH SYNOPSIS |
---|
| 18 | .na |
---|
| 19 | .B dspam |
---|
| 20 | [\c |
---|
| 21 | .BI \--mode= teft|toe|tum|notrain|unlearn\c |
---|
| 22 | ] |
---|
| 23 | [\c |
---|
| 24 | .BI \--user \ user1 |
---|
| 25 | user2\ ...\ userN\c |
---|
| 26 | ] |
---|
| 27 | [\c |
---|
| 28 | .BI \--feature= noise|no,tb=N,whitelist|wh\c |
---|
| 29 | ] |
---|
| 30 | [\c |
---|
| 31 | .BI \--class= spam|innocent\c |
---|
| 32 | ] |
---|
| 33 | [\c |
---|
| 34 | .BI \--source= error|corpus|inoculation\c |
---|
| 35 | ] |
---|
| 36 | [\c |
---|
| 37 | .BI \--profile= PROFILE\c |
---|
| 38 | ] |
---|
| 39 | [\c |
---|
| 40 | .BI \--deliver= spam,innocent|nonspam,summary,stdout\c |
---|
| 41 | ] |
---|
| 42 | [\c |
---|
| 43 | .BI \--help\c |
---|
| 44 | ] |
---|
| 45 | [\c |
---|
| 46 | .BI \--version\c |
---|
| 47 | ] |
---|
| 48 | [\c |
---|
| 49 | .BI \--process\c |
---|
| 50 | ] |
---|
| 51 | [\c |
---|
| 52 | .BI \--classify\c |
---|
| 53 | ] |
---|
| 54 | [\c |
---|
| 55 | .BI \--signature= signature\c |
---|
| 56 | ] |
---|
| 57 | [\c |
---|
| 58 | .BI \--stdout\c |
---|
| 59 | ] |
---|
| 60 | [\c |
---|
| 61 | .BI \--debug\c |
---|
| 62 | ] |
---|
| 63 | [\c |
---|
| 64 | .BI \--daemon\c |
---|
| 65 | ] |
---|
| 66 | [\c |
---|
| 67 | .BI \--nofork\c |
---|
| 68 | ] |
---|
| 69 | [\c |
---|
| 70 | .BI \--client\c |
---|
| 71 | ] |
---|
| 72 | [\c |
---|
| 73 | .BI \--rcpt\-to \ recipient\-address(es)\c |
---|
| 74 | ] |
---|
| 75 | [\c |
---|
| 76 | .BI \--mail\-from= sender\-address\c |
---|
| 77 | ] |
---|
| 78 | [\c |
---|
| 79 | .BI passthru\-delivery\-arguments\fR\c |
---|
| 80 | ] |
---|
| 81 | |
---|
| 82 | .ad |
---|
| 83 | .SH DESCRIPTION |
---|
| 84 | .LP |
---|
| 85 | .B The DSPAM agent |
---|
| 86 | provides a direct interface to mail servers for command\-line |
---|
| 87 | spam filtering. The agent can masquerade as the mail server's local delivery |
---|
| 88 | agent and will process any email passed to it. The agent will then call whatever |
---|
| 89 | delivery agent was specified at compile time or quarantine/tag/drop messages |
---|
| 90 | identified as spam. The DSPAM agent can function locally or as a proxy. It |
---|
| 91 | is also responsible for processing classification errors so that DSPAM can |
---|
| 92 | learn from its mistakes. |
---|
| 93 | |
---|
| 94 | .SH OPTIONS |
---|
| 95 | .LP |
---|
| 96 | .ne 3 |
---|
| 97 | .TP |
---|
| 98 | .BI \--user \ user1\fR\ user2\ ...\ userN\c |
---|
| 99 | Specifies the destination users of the incoming message. In most cases this is |
---|
| 100 | the local user on the system; however, some implementations may call for virtual |
---|
| 101 | usernames, specific to DSPAM, to be assigned. The agent processes an |
---|
| 102 | incoming message once for each user specified. If the message is to be |
---|
| 103 | delivered, the $u (or %u) parameters of the argument string will be interpolated |
---|
| 104 | for the current user being processed. |
---|
| 105 | |
---|
| 106 | .ne 3 |
---|
| 107 | .TP |
---|
| 108 | .BI \--mode= toe|tum|teft|notrain|unlearn\c |
---|
| 109 | Configures the training mode to be used for this process, overriding any defaults in |
---|
| 110 | dspam.conf or the preference extension: |
---|
| 111 | |
---|
| 112 | .B teft |
---|
| 113 | : Train\-Everything. Trains on all messages processed. This is a very thorough training |
---|
| 114 | approach and should be considered the standard training approach for most users. TEFT |
---|
| 115 | may, however, prove too volatile on installations with extremely high per\-user traffic, |
---|
| 116 | or prove not very scalable on systems with extremely large user\-bases. In the event |
---|
| 117 | that TEFT is proving ineffective, one of the other modes is recommended. |
---|
| 118 | |
---|
| 119 | .B toe |
---|
| 120 | : Train\-on\-Error. Trains only on a classification error, once the user's metadata has |
---|
| 121 | matured to 2500 innocent messages. This training mode is much less resource intensive, |
---|
| 122 | as only occasional metadata writes are necessary. It is also far less volatile than |
---|
| 123 | the TEFT mode of training. One drawback, however, is that TOE only learns when DSPAM |
---|
| 124 | has made a mistake \- which means the data is sometimes too static, and unable to "ease |
---|
| 125 | into" a different type of behavior. |
---|
| 126 | |
---|
| 127 | .B tum |
---|
| 128 | : Train\-until\-Mature. This training mode is a hybrid between the other two training modes |
---|
| 129 | and provides a great balance between volatility and static metadata. TuM will train on a |
---|
| 130 | per\-token basis only tokens which have had fewer than 25 "hits" on them, unless an error |
---|
| 131 | is being retrained in which case all tokens are trained. This training mode provides a |
---|
| 132 | solid core of stable tokens to keep accuracy consistent, but also allows for dynamic |
---|
| 133 | adaptation to any new types of email behavior a user might be experiencing. |
---|
| 134 | |
---|
| 135 | .B notrain |
---|
| 136 | : No training. Do not train the user's data, and do not keep totals. This should only be |
---|
| 137 | used in cases where you want to process mail for a particular user (based on a group, for |
---|
| 138 | example), but don't want the user to accumulate any learning data. |
---|
| 139 | |
---|
| 140 | .B unlearn |
---|
| 141 | : Unlearn original training. Use this if you wish to unlearn a previously learned message. |
---|
| 142 | Be sure to specify |
---|
| 143 | .B \--source=error |
---|
| 144 | and |
---|
| 145 | .B \--class |
---|
| 146 | set to the classification under which the |
---|
| 147 | message was originally learned. If not using TrainPristine, this will require the original |
---|
| 148 | signature from training. |
---|
| 149 | |
---|
| 150 | .ne 3 |
---|
| 151 | .TP |
---|
| 152 | .BI \--feature= noise|no,whitelist|wh,tb=N\c |
---|
| 153 | Specifies the features that should be activated for this filter instance. The following |
---|
| 154 | features may be used individually or combined using a comma as a delimiter: |
---|
| 155 | |
---|
| 156 | .B (no)ise |
---|
| 157 | : Bayesian Noise Reduction (BNR). Bayesian Noise Reduction kicks in at 2500 innocent |
---|
| 158 | messages and provides an advanced progressive noise logic to reduce Bayesian Noise |
---|
| 159 | (wordlist attacks) in spams. See http://www.zdziarski.com/papers/bnr.html for more |
---|
| 160 | information. |
---|
| 161 | |
---|
| 162 | .B (tb)\=N |
---|
| 163 | : Sets the training loop buffering level. Training loop buffering is the amount of |
---|
| 164 | statistical sedation performed to water down statistics and avoid false positives |
---|
| 165 | during the user's training loop. The training buffer sets the buffer sensitivity, |
---|
| 166 | and should be a number between 0 (no buffering whatsoever) and 10 (heavy buffering). |
---|
| 167 | The default is 5, half of what previous versions of DSPAM used. To avoid dulling |
---|
| 168 | down statistics at all during the training loop, set this to 0. |
---|
| 169 | |
---|
| 170 | .B (wh)itelist |
---|
| 171 | : Automatic whitelisting. DSPAM will keep track of the entire "From:" line for each |
---|
| 172 | message received per user, and automatically whitelist messages from senders with more |
---|
| 173 | than 20 innocent messages and zero spams. Once the user reports a spam from the sender, |
---|
| 174 | automatic whitelisting will be deactivated for that sender. Since DSPAM |
---|
| 175 | uses the entire "From:" line, and not just the sender's email address, automatic |
---|
| 176 | whitelisting is a very safe approach to improving accuracy especially during initial |
---|
| 177 | training. |
---|
| 178 | |
---|
| 179 | .B NOTE: |
---|
| 180 | : None of the present features are necessary when the source is "error", because the |
---|
| 181 | original training data is used from the signature to retrain, instantiating whatever |
---|
| 182 | features (such as whitelisting) were active at the time of the initial classification. |
---|
| 183 | Since BNR is only necessary when a message is being classified, the |
---|
| 184 | .B \--feature |
---|
| 185 | flag can be safely omitted from error source calls. |
---|
| 186 | |
---|
| 187 | .ne 3 |
---|
| 188 | .TP |
---|
| 189 | .BI \--class= spam|innocent\c |
---|
| 190 | Identifies the disposition (if any) of the message being presented. This flag |
---|
| 191 | should be used when a misclassification has occurred, when the user is |
---|
| 192 | corpus\-feeding a message, or when an inoculation is being presented. This |
---|
| 193 | flag should not be used for standard processing. This flag must be used in |
---|
| 194 | conjunction with the |
---|
| 195 | .B \--source |
---|
| 196 | flag. Omitting this flag causes DSPAM to determine the disposition of the message on |
---|
| 197 | its own (the standard operating mode). |
---|
| 198 | |
---|
| 199 | .ne 3 |
---|
| 200 | .TP |
---|
| 201 | .BI \--source= error|corpus|inoculation\c |
---|
| 202 | Where |
---|
| 203 | .B \--class |
---|
| 204 | is used, the source of the classification must also be provided. The source |
---|
| 205 | tells dspam how to learn the message being presented: |
---|
| 206 | |
---|
| 207 | .B error |
---|
| 208 | : The message being presented was a message previously misclassified by DSPAM. When |
---|
| 209 | \'error\' is provided as a source, DSPAM requires that the DSPAM signature be present |
---|
| 210 | in the message, and will use the signature to recall the original training metadata. |
---|
| 211 | If the signature is not present, the message will be rejected. In this source mode, |
---|
| 212 | DSPAM will also decrement each token's previous classification's count as well as |
---|
| 213 | the user totals. |
---|
| 214 | |
---|
| 215 | You should use error only when DSPAM has made an error in classifying the message, |
---|
| 216 | and should present the modified version of the message with the DSPAM signature when |
---|
| 217 | doing so. |
---|
| 218 | |
---|
| 219 | .B corpus |
---|
| 220 | : The message being presented is from a mail corpus, and should be trained as a new |
---|
| 221 | message, rather than re\-trained based on a signature. The message's full headers and |
---|
| 222 | body will be analyzed and the correct classification will be incremented, without |
---|
| 223 | its opposite being decremented. |
---|
| 224 | |
---|
| 225 | You should use corpus only when feeding messages in from a corpus. |
---|
| 226 | |
---|
| 227 | .B inoculation |
---|
| 228 | : The message being presented is in pristine form, and should be trained as an |
---|
| 229 | inoculation. Inoculations are a more intense mode of training designed to cause DSPAM |
---|
| 230 | to train the user's metadata repeatedly on previously unknown tokens, in an attempt to |
---|
| 231 | vaccinate the user from future messages similar to the one being presented. You should |
---|
| 232 | use inoculation only on honeypots and the like. |
---|
| 233 | |
---|
| 234 | .ne 3 |
---|
| 235 | .TP |
---|
| 236 | .BI \--profile= PROFILE\c |
---|
| 237 | Specify a storage profile from dspam.conf. The storage profile selected will be used |
---|
| 238 | for all database connectivity. See dspam.conf for more information. |
---|
| 239 | |
---|
| 240 | .ne 3 |
---|
| 241 | .TP |
---|
| 242 | .BI \--deliver= spam,innocent|nonspam,summary,stdout\c |
---|
| 243 | Tells |
---|
| 244 | .B DSPAM |
---|
| 245 | to deliver the message if its result falls within the criteria specified. For example, |
---|
| 246 | .B \--deliver=innocent |
---|
| 247 | or |
---|
| 248 | .B \--deliver=nonspam |
---|
| 249 | will cause DSPAM to only deliver the message if its classification has been determined |
---|
| 250 | as innocent. Providing |
---|
| 251 | .B \--deliver=innocent,spam |
---|
| 252 | or |
---|
| 253 | .B \--deliver=nonspam,spam |
---|
| 254 | will cause DSPAM to deliver the message regardless of its classification. This flag |
---|
| 255 | provides a significant amount of flexibility for nonstandard implementations, where |
---|
| 256 | false positives may not be delivered but spam is, and so on. |
---|
| 257 | |
---|
| 258 | .B summary |
---|
| 259 | : Deliver (to stdout) a summary identical to the output of message classification: |
---|
| 260 | |
---|
| 261 | X\-DSPAM\-Result: User; result="Innocent"; class="Innocent"; probability=0.0000; confidence=1.00; signature=4b11c532158749980119923 |
---|
| 262 | |
---|
| 263 | .B stdout |
---|
| 264 | : Is a shortcut for |
---|
| 265 | .B \--deliver=innocent,spam --stdout |
---|
| 266 | |
---|
| 267 | .ne 3 |
---|
| 268 | .TP |
---|
| 269 | .B \--stdout \c |
---|
| 270 | If the message is indeed deemed "deliverable" by the |
---|
| 271 | .B \--deliver |
---|
| 272 | flag, this flag will cause DSPAM to deliver the message to stdout, rather than the |
---|
| 273 | configured delivery agent. |
---|
| 274 | |
---|
| 275 | .ne 3 |
---|
| 276 | .TP |
---|
| 277 | .B \--process\c |
---|
| 278 | Tells |
---|
| 279 | .B DSPAM |
---|
| 280 | to process the message. This is the default behavior, and the flag is implied unless |
---|
| 281 | .B \--classify |
---|
| 282 | is used. |
---|
| 283 | |
---|
| 284 | .ne 3 |
---|
| 285 | .TP |
---|
| 286 | .BI \--classify\c |
---|
| 287 | Tells |
---|
| 288 | .B DSPAM |
---|
| 289 | to only classify the message, and not perform any writes to the user's |
---|
| 290 | data or attempt to deliver/quarantine the message. The results of a |
---|
| 291 | classification are printed to stdout in the following format: |
---|
| 292 | |
---|
| 293 | X\-DSPAM\-Result: User; result="Spam"; probability=1.0000; confidence=0.80 |
---|
| 294 | |
---|
| 295 | .B NOTE |
---|
| 296 | : The output of the classification is specific to a user's own data, and |
---|
| 297 | does not include the output of any groups they might be affiliated with, |
---|
| 298 | so it is entirely possible that the message would be caught as spam by a |
---|
| 299 | group the user belongs to, and appear as innocent in the output of a |
---|
| 300 | classification. To get the classification for the |
---|
| 301 | .B group |
---|
| 302 | , use the group name as the user instead of an individual. |
---|
| 303 | |
---|
| 304 | .ne 3 |
---|
| 305 | .TP |
---|
| 306 | .BI \--signature= signature\c |
---|
| 307 | If only the signature is available for training, and not the entire message, the |
---|
| 308 | .B \--signature |
---|
| 309 | flag may be used to feed the signature into DSPAM and forego |
---|
| 310 | the reading of stdin. DSPAM will process the signature with whatever |
---|
| 311 | commandline classification was specified. |
---|
| 312 | |
---|
| 313 | .B NOTE |
---|
| 314 | : This should only be used with |
---|
| 315 | .B \--source=error |
---|
| 316 | |
---|
| 317 | .ne 3 |
---|
| 318 | .TP |
---|
| 319 | .BI \--debug\c |
---|
| 320 | If |
---|
| 321 | .B DSPAM |
---|
| 322 | was compiled with |
---|
| 323 | .B \--enable\-debug |
---|
| 324 | then using |
---|
| 325 | .B \--debug |
---|
| 326 | will turn on debugging messages. |
---|
| 327 | |
---|
| 328 | .ne 3 |
---|
| 329 | .TP |
---|
| 330 | .BI \--daemon\c |
---|
| 331 | If |
---|
| 332 | .B DSPAM |
---|
| 333 | was compiled with |
---|
| 334 | .B \--enable\-daemon |
---|
| 335 | then using |
---|
| 336 | .B \--daemon |
---|
| 337 | will cause DSPAM to enter daemon mode, where it will listen for DSPAM clients to |
---|
| 338 | connect and actively service requests. |
---|
| 339 | |
---|
| 340 | .ne 3 |
---|
| 341 | .TP |
---|
| 342 | .BI \--nofork\c |
---|
| 343 | If |
---|
| 344 | .B DSPAM |
---|
| 345 | was compiled with |
---|
| 346 | .B \--enable\-daemon |
---|
| 347 | then using |
---|
| 348 | .B \--nofork |
---|
| 349 | will cause DSPAM to not fork the daemon into the background when using the |
---|
| 350 | .B \--daemon |
---|
| 351 | switch. |
---|
| 352 | |
---|
| 353 | .ne 3 |
---|
| 354 | .TP |
---|
| 355 | .BI \--client\c |
---|
| 356 | If |
---|
| 357 | .B DSPAM |
---|
| 358 | was compiled with |
---|
| 359 | .B \--enable\-daemon |
---|
| 360 | then using |
---|
| 361 | .B \--client |
---|
| 362 | will cause DSPAM to act as a client and attempt to connect to the DSPAM server specified in |
---|
| 363 | the client's configuration within dspam.conf. If client behavior is desired, this option |
---|
| 364 | .B must |
---|
| 365 | be specified, otherwise the agent simply operates as self\-contained and processes |
---|
| 366 | the message on its own, eliminating any benefit of using the daemon. |
---|
| 367 | |
---|
| 368 | .ne 3 |
---|
| 369 | .TP |
---|
| 370 | .BI \--rcpt\-to \ recipient\-address(es)\c |
---|
| 371 | If |
---|
| 372 | .B DSPAM |
---|
| 373 | will be configured to deliver via LMTP or SMTP, this flag may be used to define the |
---|
| 374 | RCPT TOs which will be used for the delivery of each user specified with |
---|
| 375 | .BR \--user . |
---|
| 376 | If no recipients are provided, the RCPT TOs will match the username. |
---|
| 377 | |
---|
| 378 | .B NOTE |
---|
| 379 | : The recipient list should always be balanced with the user list, or empty. |
---|
| 380 | Specifying an unbalanced number of recipients to users will result in undefined |
---|
| 381 | behavior. |
---|
| 382 | |
---|
| 383 | .ne 3 |
---|
| 384 | .TP |
---|
| 385 | .BI \--mail\-from= sender\-address\c |
---|
| 386 | If |
---|
| 387 | .B DSPAM |
---|
| 388 | will be configured to deliver via LMTP or SMTP, this flag will set the MAIL FROM sent on |
---|
| 389 | delivery of the message. The default MAIL FROM depends on how the message was originally |
---|
| 390 | relayed to DSPAM. If it was relayed via the commandline, an empty MAIL FROM will be |
---|
| 391 | used. If it was relayed via LMTP, the original MAIL FROM will be used. |
---|
| 392 | |
---|
| 393 | .SH EXIT VALUE |
---|
| 394 | .LP |
---|
| 395 | .ne 3 |
---|
| 396 | .PD 0 |
---|
| 397 | .TP |
---|
| 398 | .B 0 |
---|
| 399 | Operation was successful. |
---|
| 400 | .ne 3 |
---|
| 401 | .TP |
---|
| 402 | .B other |
---|
| 403 | Operation resulted in an error. If the error involved an error in calling the |
---|
| 404 | delivery agent, the exit value of the delivery agent will be returned. |
---|
| 405 | .PD |
---|
| 406 | |
---|
| 407 | .SH COPYRIGHT |
---|
| 408 | Copyright \(co 2002\-2012 DSPAM Project |
---|
| 409 | .br |
---|
| 410 | All rights reserved. |
---|
| 411 | .br |
---|
| 412 | |
---|
| 413 | For more information, see http://dspam.sourceforge.net. |
---|
| 414 | |
---|
| 415 | .SH SEE ALSO |
---|
| 416 | .BR dspam_admin (1), |
---|
| 417 | .BR dspam_clean (1), |
---|
| 418 | .BR dspam_crc (1), |
---|
| 419 | .BR dspam_dump (1), |
---|
| 420 | .BR dspam_logrotate (1), |
---|
| 421 | .BR dspam_merge (1), |
---|
| 422 | .BR dspam_stats (1), |
---|
| 423 | .BR dspam_train (1) |
---|
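
The result line printed by `--classify` and `--deliver=summary` is a header name followed by semicolon-separated fields: first the user, then `key=value` pairs. Below is a minimal Python sketch of how a wrapper script might parse it. It assumes the layout matches the sample line shown in the `--deliver` description; the function name `parse_dspam_result` is hypothetical and not part of DSPAM.

```python
def parse_dspam_result(line):
    """Parse an X-DSPAM-Result line into (user, fields).

    Assumes the layout of the sample in this page:
    'X-DSPAM-Result: <user>; key=value; key="value"; ...'
    """
    header, sep, rest = line.partition(":")
    if not sep or header.strip() != "X-DSPAM-Result":
        raise ValueError("not an X-DSPAM-Result line")

    # The first field is the user; the remaining ones are key=value pairs.
    parts = [p.strip() for p in rest.split(";") if p.strip()]
    if not parts:
        raise ValueError("missing user field")
    user = parts[0]

    fields = {}
    for part in parts[1:]:
        key, eq, value = part.partition("=")
        if not eq:
            continue  # skip anything that is not key=value
        fields[key.strip()] = value.strip().strip('"')
    return user, fields


# Sample line copied from the --deliver=summary description.
sample = ('X-DSPAM-Result: User; result="Innocent"; class="Innocent"; '
          'probability=0.0000; confidence=1.00; '
          'signature=4b11c532158749980119923')
user, fields = parse_dspam_result(sample)
print(user, fields["result"], fields["signature"])
```

All values are returned as strings. A caller that needs numbers can convert `probability` and `confidence` with `float()`. Fields missing from the shorter `--classify` sample (such as `class` and `signature`) are simply absent from the dictionary.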