1 | .\" $Id: dspam.1,v 1.20 2011/06/28 00:13:48 sbajic Exp $ |
---|
2 | .\" -*- nroff -*- |
---|
3 | .\" |
---|
4 | .\" dspam3.9 |
---|
5 | .\" |
---|
6 | .\" Authors: Jonathan A. Zdziarski <jonathan@nuclearelephant.com> |
---|
7 | .\" Stevan Bajic <stevan@bajic.ch> |
---|
8 | .\" |
---|
9 | .\" Copyright (C) 2002-2012 DSPAM Project |
---|
10 | .\" All rights reserved |
---|
11 | .\" |
---|
12 | .TH DSPAM 1 "Aug 14, 2010" "DSPAM" "DSPAM" |
---|
13 | |
---|
14 | .SH NAME |
---|
15 | dspam \- DSPAM Anti-Spam Agent |
---|
16 | |
---|
17 | .SH SYNOPSIS |
---|
18 | .na |
---|
19 | .B dspam |
---|
20 | [\c |
---|
21 | .BI \--mode= teft|toe|tum|notrain|unlearn\c |
---|
22 | ] |
---|
23 | [\c |
---|
24 | .BI \--user \ user1 |
---|
25 | user2\ ...\ userN\c |
---|
26 | ] |
---|
27 | [\c |
---|
28 | .BI \--feature= noise|no,tb=N,whitelist|wh\c |
---|
29 | ] |
---|
30 | [\c |
---|
31 | .BI \--class= spam|innocent\c |
---|
32 | ] |
---|
33 | [\c |
---|
34 | .BI \--source= error|corpus|inoculation\c |
---|
35 | ] |
---|
36 | [\c |
---|
37 | .BI \--profile= PROFILE\c |
---|
38 | ] |
---|
39 | [\c |
---|
40 | .BI \--deliver= spam,innocent|nonspam,summary,stdout\c |
---|
41 | ] |
---|
42 | [\c |
---|
43 | .BI \--help\c |
---|
44 | ] |
---|
45 | [\c |
---|
46 | .BI \--version\c |
---|
47 | ] |
---|
48 | [\c |
---|
49 | .BI \--process\c |
---|
50 | ] |
---|
51 | [\c |
---|
52 | .BI \--classify\c |
---|
53 | ] |
---|
54 | [\c |
---|
55 | .BI \--signature= signature\c |
---|
56 | ] |
---|
57 | [\c |
---|
58 | .BI \--stdout\c |
---|
59 | ] |
---|
60 | [\c |
---|
61 | .BI \--debug\c |
---|
62 | ] |
---|
63 | [\c |
---|
64 | .BI \--daemon\c |
---|
65 | ] |
---|
66 | [\c |
---|
67 | .BI \--nofork\c |
---|
68 | ] |
---|
69 | [\c |
---|
70 | .BI \--client\c |
---|
71 | ] |
---|
72 | [\c |
---|
73 | .BI \--rcpt\-to \ recipient\-address(es)\c |
---|
74 | ] |
---|
75 | [\c |
---|
76 | .BI \--mail\-from= sender\-address\c |
---|
77 | ] |
---|
78 | [\c |
---|
79 | .BI passthru\-delivery\-arguments\fR\c |
---|
80 | ] |
---|
81 | |
---|
82 | .ad |
---|
83 | .SH DESCRIPTION |
---|
84 | .LP |
---|
85 | .B The DSPAM agent |
---|
86 | provides a direct interface to mail servers for command\-line |
---|
87 | spam filtering. The agent can masquerade as the mail server's local delivery |
---|
88 | agent and will process any email passed to it. The agent will then call whatever |
---|
89 | delivery agent was specified at compile time or quarantine/tag/drop messages |
---|
90 | identified as spam. The DSPAM agent can function locally or as a proxy. It |
---|
91 | is also responsible for processing classification errors so that DSPAM can |
---|
92 | learn from its mistakes. |
---|
93 | |
---|
94 | .SH OPTIONS |
---|
95 | .LP |
---|
96 | .ne 3 |
---|
97 | .TP |
---|
98 | .BI \--user \ user1\fR\ user2\ ...\ userN\c |
---|
99 | Specifies the destination users of the incoming message. In most cases this is |
---|
100 | the local user on the system; however, some implementations may call for virtual |
---|
101 | usernames, specific to DSPAM, to be assigned. The agent processes an |
---|
102 | incoming message once for each user specified. If the message is to be |
---|
103 | delivered, the $u (or %u) parameters of the argument string will be interpolated |
---|
104 | for the current user being processed. |
---|
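As a rough illustration of the per-user interpolation described above, the following Python sketch substitutes $u and %u in a delivery argument string once for each user. The function names and the sample argument string "-d %u" are hypothetical, chosen only for this example; DSPAM's own substitution logic may differ in detail.

```python
def interpolate_user(arg_template: str, user: str) -> str:
    """Replace both the $u and %u placeholders with the current user."""
    return arg_template.replace("$u", user).replace("%u", user)


def delivery_args_per_user(arg_template: str, users: list[str]) -> list[str]:
    """Produce one interpolated argument string per user, in order."""
    return [interpolate_user(arg_template, user) for user in users]


# Hypothetical delivery-agent argument string, processed for two users.
args = delivery_args_per_user("-d %u", ["alice", "bob"])
```

Each entry in args corresponds to one pass over the message, mirroring the statement that the agent processes the message once per user.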
105 | |
---|
106 | .ne 3 |
---|
107 | .TP |
---|
108 | .BI \--mode= toe|tum|teft|notrain|unlearn\c |
---|
109 | Configures the training mode to be used for this process, overriding any defaults in |
---|
110 | dspam.conf or the preference extension: |
---|
111 | |
---|
112 | .B teft |
---|
113 | : Train\-Everything. Trains on all messages processed. This is a very thorough training |
---|
114 | approach and should be considered the standard training approach for most users. TEFT |
---|
115 | may, however, prove too volatile on installations with extremely high per\-user traffic, |
---|
116 | or prove not very scalable on systems with extremely large user\-bases. In the event |
---|
117 | that TEFT is proving ineffective, one of the other modes is recommended. |
---|
118 | |
---|
119 | .B toe |
---|
120 | : Train\-on\-Error. Trains only on a classification error, once the user's metadata has |
---|
121 | matured to 2500 innocent messages. This training mode is much less resource intensive, |
---|
122 | as only occasional metadata writes are necessary. It is also far less volatile than |
---|
123 | the TEFT mode of training. One drawback, however, is that TOE only learns when DSPAM |
---|
124 | has made a mistake \- which means the data is sometimes too static, and unable to "ease |
---|
125 | into" a different type of behavior. |
---|
126 | |
---|
127 | .B tum |
---|
128 | : Train\-until\-Mature. This training mode is a hybrid between the other two training modes |
---|
129 | and provides a great balance between volatility and static metadata. TuM will train on a |
---|
130 | per\-token basis only tokens which have had fewer than 25 "hits" on them, unless an error |
---|
131 | is being retrained in which case all tokens are trained. This training mode provides a |
---|
132 | solid core of stable tokens to keep accuracy consistent, but also allows for dynamic |
---|
133 | adaptation to any new types of email behavior a user might be experiencing. |
---|
134 | |
---|
135 | .B notrain |
---|
136 | : No training. Do not train the user's data, and do not keep totals. This should only be |
---|
137 | used in cases where you want to process mail for a particular user (based on a group, for |
---|
138 | example), but don't want the user to accumulate any learning data. |
---|
139 | |
---|
140 | .B unlearn |
---|
141 | : Unlearn original training. Use this if you wish to unlearn a previously learned message. |
---|
142 | Be sure to specify |
---|
143 | .B \--source=error |
---|
144 | and |
---|
145 | .B \--class |
---|
146 | to the classification the |
---|
147 | message was originally learned under. If not using TrainPristine, this will require the original |
---|
148 | signature from training. |
---|
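The training modes above can be summarized as a small decision sketch. This is only one reading of the text, not DSPAM's implementation: the function names are hypothetical, and the sketch assumes that TOE trains on every message until the 2500-innocent maturity point is reached, which the description implies but does not state outright.

```python
MATURITY_INNOCENT = 2500  # TOE maturity threshold given in the text
TUM_HIT_LIMIT = 25        # TuM per-token hit limit given in the text


def trains_message(mode: str, innocent_count: int, is_error: bool) -> bool:
    """Decide whether a message is trained at all under the given mode."""
    if mode == "notrain":
        return False
    if mode == "teft":
        return True
    if mode == "toe":
        # Assumption: before maturity everything is trained; afterwards
        # only classification errors are.
        return is_error or innocent_count < MATURITY_INNOCENT
    if mode == "tum":
        # TuM trains messages, but filters per token (see below).
        return True
    raise ValueError(f"unknown training mode: {mode}")


def tum_trains_token(token_hits: int, is_error: bool) -> bool:
    """Under TuM, train a token only while it has fewer than 25 hits,
    unless an error is being retrained, in which case train all tokens."""
    return is_error or token_hits < TUM_HIT_LIMIT
```

The unlearn mode is left out because it reverses earlier training rather than deciding whether to train.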
149 | |
---|
150 | .ne 3 |
---|
151 | .TP |
---|
152 | .BI \--feature= noise|no,whitelist|wh,tb=N\c |
---|
153 | Specifies the features that should be activated for this filter instance. The following |
---|
154 | features may be used individually or combined using a comma as a delimiter: |
---|
155 | |
---|
156 | .B (no)ise |
---|
157 | : Bayesian Noise Reduction (BNR). Bayesian Noise Reduction kicks in at 2500 innocent |
---|
158 | messages and provides an advanced progressive noise logic to reduce Bayesian Noise |
---|
159 | (wordlist attacks) in spams. See http://www.zdziarski.com/papers/bnr.html for more |
---|
160 | information. |
---|
161 | |
---|
162 | .B (tb)\=N |
---|
163 | : Sets the training loop buffering level. Training loop buffering is the amount of |
---|
164 | statistical sedation performed to water down statistics and avoid false positives |
---|
165 | during the user's training loop. The training buffer sets the buffer sensitivity, |
---|
166 | and should be a number between 0 (no buffering whatsoever) and 10 (heavy buffering). |
---|
167 | The default is 5, half of what previous versions of DSPAM used. To avoid dulling |
---|
168 | down statistics at all during the training loop, set this to 0. |
---|
169 | |
---|
170 | .B (wh)itelist |
---|
171 | : Automatic whitelisting. DSPAM will keep track of the entire "From:" line for each |
---|
172 | message received per user, and automatically whitelist messages from senders with more |
---|
173 | than 20 innocent messages and zero spams. Once the user reports a spam from the sender, |
---|
174 | automatic whitelisting will automatically be deactivated for that sender. Since DSPAM |
---|
175 | uses the entire "From:" line, and not just the sender's email address, automatic |
---|
176 | whitelisting is a very safe approach to improving accuracy especially during initial |
---|
177 | training. |
---|
178 | |
---|
179 | .B NOTE: |
---|
180 | : None of the present features are necessary when the source is "error", because the |
---|
181 | original training data is used from the signature to retrain, instantiating whatever |
---|
182 | features (such as whitelisting) were active at the time of the initial classification. |
---|
183 | Since BNR is only necessary when a message is being classified, the |
---|
184 | .B \--feature |
---|
185 | flag can be safely omitted from error source calls. |
---|
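The automatic whitelisting rule described above (more than 20 innocent messages and zero spams, keyed on the entire "From:" line) might be sketched as follows. The class and method names are hypothetical, and the in-memory dictionary stands in for whatever per-user storage DSPAM actually uses.

```python
from collections import defaultdict

WHITELIST_MIN_INNOCENT = 20  # a sender needs MORE than this many innocents


class AutoWhitelist:
    """Track per-sender counts keyed on the whole From: line."""

    def __init__(self) -> None:
        self.counts = defaultdict(lambda: {"innocent": 0, "spam": 0})

    def record(self, from_line: str, is_spam: bool) -> None:
        """Count one message from this exact From: line."""
        self.counts[from_line]["spam" if is_spam else "innocent"] += 1

    def is_whitelisted(self, from_line: str) -> bool:
        """Whitelist only senders with >20 innocents and no spam at all."""
        seen = self.counts.get(from_line)
        if seen is None:
            return False
        return seen["spam"] == 0 and seen["innocent"] > WHITELIST_MIN_INNOCENT
```

Because the key is the full header value rather than the bare address, a forged message that copies only the address but not the display name is counted as a different sender.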
186 | |
---|
187 | .ne 3 |
---|
188 | .TP |
---|
189 | .BI \--class= spam|innocent\c |
---|
190 | Identifies the disposition (if any) of the message being presented. This flag |
---|
191 | should be used when a misclassification has occurred, when the user is |
---|
192 | corpus\-feeding a message, or when an inoculation is being presented. This |
---|
193 | flag should not be used for standard processing. This flag must be used in |
---|
194 | conjunction with the |
---|
195 | .B \--source |
---|
196 | flag. Omitting this flag causes DSPAM to determine the disposition of the message on |
---|
197 | its own (the standard operating mode). |
---|
198 | |
---|
199 | .ne 3 |
---|
200 | .TP |
---|
201 | .BI \--source= error|corpus|inoculation\c |
---|
202 | Where |
---|
203 | .B \--class |
---|
204 | is used, the source of the classification must also be provided. The source |
---|
205 | tells dspam how to learn the message being presented: |
---|
206 | |
---|
207 | .B error |
---|
208 | : The message being presented was a message previously misclassified by DSPAM. When |
---|
209 | \'error\' is provided as a source, DSPAM requires that the DSPAM signature be present |
---|
210 | in the message, and will use the signature to recall the original training metadata. |
---|
211 | If the signature is not present, the message will be rejected. In this source mode, |
---|
212 | DSPAM will also decrement each token's previous classification's count as well as |
---|
213 | the user totals. |
---|
214 | |
---|
215 | You should use error only when DSPAM has made an error in classifying the message, |
---|
216 | and should present the modified version of the message with the DSPAM signature when |
---|
217 | doing so. |
---|
218 | |
---|
219 | .B corpus |
---|
220 | : The message being presented is from a mail corpus, and should be trained as a new |
---|
221 | message, rather than re\-trained based on a signature. The message's full headers and |
---|
222 | body will be analyzed and the correct classification will be incremented, without |
---|
223 | its opposite being decremented. |
---|
224 | |
---|
225 | You should use corpus only when feeding messages in from a corpus. |
---|
226 | |
---|
227 | .B inoculation |
---|
228 | : The message being presented is in pristine form, and should be trained as an |
---|
229 | inoculation. Inoculations are a more intense mode of training designed to cause DSPAM |
---|
230 | to train the user's metadata repeatedly on previously unknown tokens, in an attempt to |
---|
231 | vaccinate the user from future messages similar to the one being presented. You should |
---|
232 | use inoculation only on honeypots and the like. |
---|
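The difference between the error and corpus sources comes down to whether the opposite counter is decremented. The sketch below illustrates that bookkeeping only; the function names and data layout are hypothetical, counters are clamped at zero as an assumption, and inoculation is omitted because the text does not specify how many times it retrains.

```python
from collections import Counter


def opposite(cls: str) -> str:
    """Return the other classification."""
    return "innocent" if cls == "spam" else "spam"


def learn(token_counts: dict, totals: Counter, tokens: list, cls: str, source: str) -> None:
    """Update per-token counts and user totals for one message.

    error:  increment cls and decrement the previous (opposite) class.
    corpus: increment cls only; the opposite class is left untouched.
    """
    if source not in ("error", "corpus"):
        raise ValueError(f"unsupported source in this sketch: {source}")
    other = opposite(cls)
    for token in tokens:
        counts = token_counts.setdefault(token, Counter())
        counts[cls] += 1
        if source == "error" and counts[other] > 0:
            counts[other] -= 1  # undo the earlier, mistaken training
    totals[cls] += 1
    if source == "error" and totals[other] > 0:
        totals[other] -= 1
```

Using the error source on a message that was never misclassified would therefore subtract counts that were never added, which is why the text restricts it to genuine errors.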
233 | |
---|
234 | .ne 3 |
---|
235 | .TP |
---|
236 | .BI \--profile= PROFILE\c |
---|
237 | Specify a storage profile from dspam.conf. The storage profile selected will be used |
---|
238 | for all database connectivity. See dspam.conf for more information. |
---|
239 | |
---|
240 | .ne 3 |
---|
241 | .TP |
---|
242 | .BI \--deliver= spam,innocent|nonspam,summary,stdout\c |
---|
243 | Tells |
---|
244 | .B DSPAM |
---|
245 | to deliver the message if its result falls within the criteria specified. For example, |
---|
246 | .B \--deliver=innocent |
---|
247 | or |
---|
248 | .B \--deliver=nonspam |
---|
249 | will cause DSPAM to only deliver the message if its classification has been determined |
---|
250 | as innocent. Providing |
---|
251 | .B \--deliver=innocent,spam |
---|
252 | or |
---|
253 | .B \--deliver=nonspam,spam |
---|
254 | will cause DSPAM to deliver the message regardless of its classification. This flag |
---|
255 | provides a significant amount of flexibility for nonstandard implementations, where |
---|
256 | false positives may not be delivered but spam is, and so on. |
---|
257 | |
---|
258 | .B summary |
---|
259 | : Deliver (to stdout) a summary identical to the output of message classification: |
---|
260 | |
---|
261 | X\-DSPAM\-Result: User; result="Innocent"; class="Innocent"; probability=0.0000; confidence=1.00; signature=4b11c532158749980119923 |
---|
262 | |
---|
263 | .B stdout |
---|
264 | : A shortcut for |
---|
265 | .B \--deliver=innocent,spam \--stdout |
---|
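To make the delivery criteria concrete, here is a minimal Python sketch of how a --deliver value could be matched against a classification result. The helper names are hypothetical; the sketch treats nonspam as a synonym for innocent and expands stdout to innocent,spam as described above, and it does not model the summary output or the actual hand-off to a delivery agent.

```python
def parse_deliver(value: str) -> set:
    """Split a --deliver value into a set of lower-case criteria."""
    criteria = {item.strip().lower() for item in value.split(",") if item.strip()}
    if "nonspam" in criteria:
        # nonspam is documented as an alternative spelling of innocent
        criteria.discard("nonspam")
        criteria.add("innocent")
    if "stdout" in criteria:
        # stdout is documented as a shortcut for innocent,spam plus --stdout
        criteria |= {"innocent", "spam"}
    return criteria


def should_deliver(deliver_value: str, result: str) -> bool:
    """Return True if a message with this result falls within the criteria."""
    return result.lower() in parse_deliver(deliver_value)
```

For example, should_deliver("innocent", "Spam") is false, so a spam message would be held back under that setting.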
266 | |
---|
267 | .ne 3 |
---|
268 | .TP |
---|
269 | .B \--stdout \c |
---|
270 | If the message is indeed deemed "deliverable" by the |
---|
271 | .B \--deliver |
---|
272 | flag, this flag will cause DSPAM to deliver the message to stdout, rather than the |
---|
273 | configured delivery agent. |
---|
274 | |
---|
275 | .ne 3 |
---|
276 | .TP |
---|
277 | .B \--process\c |
---|
278 | Tells |
---|
279 | .B DSPAM |
---|
280 | to process the message. This is the default behavior, and the flag is implied unless |
---|
281 | .B \--classify |
---|
282 | is used. |
---|
283 | |
---|
284 | .ne 3 |
---|
285 | .TP |
---|
286 | .BI \--classify\c |
---|
287 | Tells |
---|
288 | .B DSPAM |
---|
289 | to only classify the message, and not perform any writes to the user's |
---|
290 | data or attempt to deliver/quarantine the message. The results of a |
---|
291 | classification are printed to stdout in the following format: |
---|
292 | |
---|
293 | X\-DSPAM\-Result: User; result="Spam"; probability=1.0000; confidence=0.80 |
---|
294 | |
---|
295 | .B NOTE |
---|
296 | : The output of the classification is specific to a user's own data, and |
---|
297 | does not include the output of any groups they might be affiliated with, |
---|
298 | so it is entirely possible that the message would be caught as spam by a |
---|
299 | group the user belongs to, and appear as innocent in the output of a |
---|
300 | classification. To get the classification for the |
---|
301 | .B group |
---|
302 | , use the group name as the user instead of an individual. |
---|
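A script consuming the classification output needs to split the result line into fields. The following Python sketch parses the format shown above; the function name is hypothetical, and the parser assumes only the layout visible in the example (a user name followed by semicolon-separated key=value pairs).

```python
def parse_dspam_result(line: str) -> dict:
    """Parse an X-DSPAM-Result line into a dict of string fields."""
    name, _, rest = line.partition(":")
    if name.strip() != "X-DSPAM-Result":
        raise ValueError("not an X-DSPAM-Result line")
    parts = [part.strip() for part in rest.split(";")]
    fields = {"user": parts[0]}  # the first field is the bare user name
    for part in parts[1:]:
        if not part:
            continue
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip().strip('"')
    return fields


parsed = parse_dspam_result(
    'X-DSPAM-Result: User; result="Spam"; probability=1.0000; confidence=0.80'
)
```

All values are kept as strings; a caller that needs numbers can convert probability and confidence with float().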
303 | |
---|
304 | .ne 3 |
---|
305 | .TP |
---|
306 | .BI \--signature= signature\c |
---|
307 | If only the signature is available for training, and not the entire message, the |
---|
308 | .B \--signature |
---|
309 | flag may be used to feed the signature into DSPAM and forego |
---|
310 | the reading of stdin. DSPAM will process the signature with whatever |
---|
311 | commandline classification was specified. |
---|
312 | |
---|
313 | .B NOTE |
---|
314 | : This should only be used with |
---|
315 | .BR \--source=error . |
---|
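As an illustration of signature-only retraining, the sketch below assembles an argument list from the flags documented on this page. It only builds the list and does not run anything; the function name is hypothetical, and the user name in the usage line is a made-up value.

```python
def build_retrain_argv(user: str, cls: str, signature: str) -> list:
    """Build a dspam argument list that retrains a misclassified message
    from its signature alone, so no message needs to be read from stdin."""
    if cls not in ("spam", "innocent"):
        raise ValueError("class must be 'spam' or 'innocent'")
    return [
        "dspam",
        "--user", user,
        f"--class={cls}",
        "--source=error",  # --signature should only be used with this source
        f"--signature={signature}",
    ]


# Signature value taken from the summary example earlier on this page.
argv = build_retrain_argv("alice", "spam", "4b11c532158749980119923")
```

The resulting list could be passed to subprocess.run on a host where dspam is installed and configured.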
316 | |
---|
317 | .ne 3 |
---|
318 | .TP |
---|
319 | .BI \--debug\c |
---|
320 | If |
---|
321 | .B DSPAM |
---|
322 | was compiled with |
---|
323 | .B \--enable\-debug |
---|
324 | then using |
---|
325 | .B \--debug |
---|
326 | will turn on debugging messages. |
---|
327 | |
---|
328 | .ne 3 |
---|
329 | .TP |
---|
330 | .BI \--daemon\c |
---|
331 | If |
---|
332 | .B DSPAM |
---|
333 | was compiled with |
---|
334 | .B \--enable\-daemon |
---|
335 | then using |
---|
336 | .B \--daemon |
---|
337 | will cause DSPAM to enter daemon mode, where it will listen for DSPAM clients to |
---|
338 | connect and actively service requests. |
---|
339 | |
---|
340 | .ne 3 |
---|
341 | .TP |
---|
342 | .BI \--nofork\c |
---|
343 | If |
---|
344 | .B DSPAM |
---|
345 | was compiled with |
---|
346 | .B \--enable\-daemon |
---|
347 | then using |
---|
348 | .B \--nofork |
---|
349 | will cause DSPAM to not fork the daemon into the background when using the |
---|
350 | .B \--daemon |
---|
351 | switch. |
---|
352 | |
---|
353 | .ne 3 |
---|
354 | .TP |
---|
355 | .BI \--client\c |
---|
356 | If |
---|
357 | .B DSPAM |
---|
358 | was compiled with |
---|
359 | .B \--enable\-daemon |
---|
360 | then using |
---|
361 | .B \--client |
---|
362 | will cause DSPAM to act as a client and attempt to connect to the DSPAM server specified in |
---|
363 | the client's configuration within dspam.conf. If client behavior is desired, this option |
---|
364 | .B must |
---|
365 | be specified; otherwise the agent simply operates in self\-contained mode and processes |
---|
366 | the message on its own, eliminating any benefit of using the daemon. |
---|
367 | |
---|
368 | .ne 3 |
---|
369 | .TP |
---|
370 | .BI \--rcpt\-to \ recipient\-address(es)\c |
---|
371 | If |
---|
372 | .B DSPAM |
---|
373 | will be configured to deliver via LMTP or SMTP, this flag may be used to define the |
---|
374 | RCPT TOs which will be used for the delivery of each user specified with |
---|
375 | .BR \--user . |
---|
376 | If no recipients are provided, the RCPT TOs will match the username. |
---|
377 | |
---|
378 | .B NOTE |
---|
379 | : The recipient list should always be balanced with the user list, or empty. |
---|
380 | Specifying an unbalanced number of recipients to users will result in undefined |
---|
381 | behavior. |
---|
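Since an unbalanced recipient list leads to undefined behavior, a wrapper script may want to check the lists before invoking the agent. The following Python sketch shows one way to do that; the function name is hypothetical, and raising an error on a mismatch is a choice made for this example rather than something DSPAM does.

```python
def pair_recipients(users: list, rcpt_to: list = None) -> list:
    """Pair each --user value with its RCPT TO address.

    With no recipients given, each RCPT TO defaults to the username.
    A non-empty recipient list must match the user list in length.
    """
    if not rcpt_to:
        return [(user, user) for user in users]
    if len(rcpt_to) != len(users):
        raise ValueError(
            f"{len(rcpt_to)} recipients for {len(users)} users: lists must balance"
        )
    return list(zip(users, rcpt_to))
```

Calling pair_recipients(["alice", "bob"], ["alice@example.com"]) raises ValueError instead of letting a mismatched list reach the agent.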
382 | |
---|
383 | .ne 3 |
---|
384 | .TP |
---|
385 | .BI \--mail\-from= sender\-address\c |
---|
386 | If |
---|
387 | .B DSPAM |
---|
388 | will be configured to deliver via LMTP or SMTP, this flag will set the MAIL FROM sent on |
---|
389 | delivery of the message. The default MAIL FROM depends on how the message was originally |
---|
390 | relayed to DSPAM. If it was relayed via the commandline, an empty MAIL FROM will be |
---|
391 | used. If it was relayed via LMTP, the original MAIL FROM will be used. |
---|
392 | |
---|
393 | .SH EXIT VALUE |
---|
394 | .LP |
---|
395 | .ne 3 |
---|
396 | .PD 0 |
---|
397 | .TP |
---|
398 | .B 0 |
---|
399 | Operation was successful. |
---|
400 | .ne 3 |
---|
401 | .TP |
---|
402 | .B other |
---|
403 | Operation resulted in an error. If the error involved an error in calling the |
---|
404 | delivery agent, the exit value of the delivery agent will be returned. |
---|
405 | .PD |
---|
406 | |
---|
407 | .SH COPYRIGHT |
---|
408 | Copyright \(co 2002\-2012 DSPAM Project |
---|
409 | .br |
---|
410 | All rights reserved. |
---|
411 | .br |
---|
412 | |
---|
413 | For more information, see http://dspam.sourceforge.net. |
---|
414 | |
---|
415 | .SH SEE ALSO |
---|
416 | .BR dspam_admin (1), |
---|
417 | .BR dspam_clean (1), |
---|
418 | .BR dspam_crc (1), |
---|
419 | .BR dspam_dump (1), |
---|
420 | .BR dspam_logrotate (1), |
---|
421 | .BR dspam_merge (1), |
---|
422 | .BR dspam_stats (1), |
---|
423 | .BR dspam_train (1) |
---|