.\" $Id: dspam_train.1,v 1.10 2011/06/28 00:13:48 sbajic Exp $
.\" -*- nroff -*-
.\"
.\" dspam_train3.9
.\"
.\" Authors: Jonathan A. Zdziarski <jonathan@nuclearelephant.com>
.\"          Stevan Bajic <stevan@bajic.ch>
.\"
.\" Copyright (C) 2002-2012 DSPAM Project
.\" All rights reserved
.\"
.TH dspam_train 1 "Apr 17, 2010" "DSPAM" "DSPAM"

.SH NAME
dspam_train \- train a corpus of mail

.SH SYNOPSIS
.na
.B dspam_train
[\c
.BI username\fR\c
]
[\c
.BI \--client\fR\c
]
[\c
.BI \-i\ \fR\c
index|\c
.BI spam_corpus\fR\c
\ \c
.BI nonspam_corpus\fR\c
]

.ad
.SH DESCRIPTION
.LP
.B dspam_train
is used to train and test a corpus of mail (in maildir or MBOX format). This
tool presents each message to DSPAM for classification and then retrains
only if the message was classified incorrectly. This provides close to
real\-world training and should be used to build pretrained databases. Upon
execution, the tool automatically determines the ratio of spam to nonspam and
trains based on that ratio, so that messages from both corpora are interleaved
in proportion. This tool can also be used as a test jig to measure the
efficiency and accuracy of DSPAM against a particular corpus in a given
configuration.

.SH OPTIONS
.LP

.ne 3
.TP
.BI \--client
If specified, DSPAM is used in client\-server mode.

.ne 3
.TP
.BI username
Specifies the user to train. If omitted, the current user name is used.

.ne 3
.TP
.BI \-i\fR\ index
Use an index file instead of the usual spam_corpus and nonspam_corpus.

.B index
: Path to the index file, which has the following format on each line:
.br
[class] [path to message]

.ne 3
.TP
.BI spam_corpus
Specifies either the pathname of the directory containing the corpus of spam,
with each message in a separate file (e.g. maildir format), or the path to a
mailbox in the traditional Unix MBOX format.

.ne 3
.TP
.BI nonspam_corpus
Specifies either the pathname of the directory containing the corpus of
nonspam, with each message in a separate file, or the path to a mailbox in
the traditional Unix MBOX format.

.SH EXIT VALUE
.LP
.ne 3
.PD 0
.TP
.B 0
Operation was successful.
.ne 3
.TP
.B other
Operation resulted in an error.
.PD

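.SH EXAMPLES
.LP
The index file format described under OPTIONS can be produced with a small
shell helper. The following is a minimal sketch and not part of DSPAM: the
function name build_index, the class labels and every path are assumptions
chosen for illustration. This page does not list the accepted class values,
so verify them against your DSPAM version.
.nf

```shell
# Hypothetical helper (not shipped with DSPAM).
# Prints one "class path" line for each regular file found directly
# inside a directory, matching the "[class] [path to message]" layout.
# Usage: build_index CLASS DIRECTORY
build_index() {
    class=$1
    dir=$2
    for msg in "$dir"/*; do
        # Skip subdirectories and the literal pattern left by an empty glob.
        [ -f "$msg" ] || continue
        echo "$class $msg"
    done
}
```

.fi
.LP
A possible use, again with assumed class labels, paths and user name:
.nf

build_index spam /corpus/spam > train.index
build_index ham /corpus/ham >> train.index
dspam_train testuser \-i train.index

.fi
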
.SH COPYRIGHT
Copyright \(co 2002\-2012 DSPAM Project
.br
All rights reserved.
.br

For more information, see http://dspam.sourceforge.net.

.SH SEE ALSO
.BR dspam (1),
.BR dspam_admin (1),
.BR dspam_clean (1),
.BR dspam_crc (1),
.BR dspam_dump (1),
.BR dspam_logrotate (1),
.BR dspam_merge (1),
.BR dspam_stats (1)