$Id: cssclean.txt,v 1.01 2011/07/11 23:46:05 sbajic Exp $

HASH_DRV NIGHTLY MAINTENANCE

The tool for nightly maintenance, dspam_clean, does not work with hash_drv:
it does not do any cleaning.

You have to do the cleaning yourself. There are two steps:

 - first, you should remove old signature files. These files are located
   in the user.sig directory and have the extension .sig. They are needed
   only for DSPAM retraining, so you can remove them once they are older
   than two weeks.
 - second, you should purge the databases. They are stored in the user.css
   files and contain a set of tokens, each with two counters, SPAM and
   NONSPAM, which count how many times the token appeared in spam and in
   innocent mail.

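The first step could be scripted roughly as follows. This is only a sketch:
the function name and the sig_dir argument are hypothetical, and it assumes
that a file's modification time is a good enough measure of a signature's
age.

```python
import os
import time


def remove_old_signatures(sig_dir, max_age_days=14, now=None):
    """Delete *.sig files in sig_dir that are older than max_age_days.

    Age is taken from the file modification time (an assumption).
    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    with os.scandir(sig_dir) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file() or not entry.name.endswith(".sig"):
            continue
        if entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Calling remove_old_signatures() on a user's user.sig directory once a night
keeps roughly two weeks of signatures available for retraining.
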
There is a special tool for purging them, cssclean:

  cssclean [file.css] {heavy}

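For a nightly job, one command line per database is needed. The sketch below
only builds those command lines; it does not run anything. The helper name
and the data_dir argument are hypothetical, and it assumes that the optional
second argument is the literal word heavy, as the synopsis above suggests.

```python
import glob
import os


def cssclean_commands(data_dir, heavy=False):
    """Return one cssclean argv list for every *.css file under data_dir."""
    pattern = os.path.join(data_dir, "**", "*.css")
    commands = []
    for css_file in sorted(glob.glob(pattern, recursive=True)):
        argv = ["cssclean", css_file]
        if heavy:
            argv.append("heavy")  # assumed spelling of the option
        commands.append(argv)
    return commands
```

Each returned list can be passed to subprocess.run(argv, check=True) from a
cron job, provided cssclean is on the PATH.
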
Cssclean keeps its own counter for each token. The counter is incremented
on every cleaning run, so if you run cssclean every night it works like a
timestamp. Whenever DSPAM uses a token, it resets that counter, so cssclean
knows for how many cleaning runs each token has gone unused.

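The counter behaviour described above can be illustrated with a small
simulation. It is a toy model, not the cssclean code: the token names are
made up, and it assumes that a reset sets the counter back to zero.

```python
def nightly_run(idle):
    """One cssclean run: every token's idle counter goes up by one."""
    return {token: count + 1 for token, count in idle.items()}


def token_used(idle, token):
    """DSPAM used the token: its idle counter is reset (assumed to zero)."""
    updated = dict(idle)
    updated[token] = 0
    return updated


idle = {"viagra": 0, "meeting": 0}
for night in range(1, 6):
    idle = nightly_run(idle)
    if night == 3:
        # "meeting" shows up in a message after the third nightly run
        idle = token_used(idle, "meeting")

print(idle)  # {'viagra': 5, 'meeting': 2}
```

After five nights, the token that was never seen again has a counter of 5,
while the one used after the third run has a counter of only 2.
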
Cssclean removes tokens which are:
 - not used for 15 cleans and ( NONSPAM + SPAM <= 1 ), or
 - not used for 15 cleans and NONSPAM is equal or almost equal to SPAM, or
 - not used for 60 cleans and ( NONSPAM*2 + SPAM < 5 ), or
 - not used for 120 cleans.

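The four rules above can be written as a single predicate. This is a sketch
under stated assumptions, not the actual cssclean source: "not used for N
cleans" is read as an idle counter of at least N, and "almost equal" is
modelled with a hypothetical tolerance parameter, since this document does
not say how close the two counters must be.

```python
def should_remove(idle_runs, nonspam, spam, tolerance=1):
    """Return True if the token matches one of the four removal rules.

    tolerance is an assumed definition of "almost equal".
    """
    rarely_seen = nonspam + spam <= 1
    balanced = abs(nonspam - spam) <= tolerance
    if idle_runs >= 120:
        return True
    if idle_runs >= 60 and nonspam * 2 + spam < 5:
        return True
    if idle_runs >= 15 and (rarely_seen or balanced):
        return True
    return False
```

For example, a token with NONSPAM=2 and SPAM=0 survives 59 idle runs but is
removed at 60, because 2*2 + 0 = 4 is less than 5.
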
With the special option heavy, cssclean is stricter and removes
tokens for which:
 - NONSPAM + SPAM <= 1, or
 - NONSPAM is equal or almost equal to SPAM.
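
The heavy rules can be sketched the same way. The assumptions are labelled
here as well: the idle counter is taken to be ignored in heavy mode, "almost
equal" is again a hypothetical tolerance, and the token values are invented
for the example.

```python
def heavy_should_remove(nonspam, spam, tolerance=1):
    """Heavy-mode rules only; the idle counter is assumed to be ignored."""
    return nonspam + spam <= 1 or abs(nonspam - spam) <= tolerance


# token -> (NONSPAM, SPAM); made-up values
tokens = {"free": (0, 1), "hello": (7, 8), "invoice": (12, 1)}
kept = {
    token: counts
    for token, counts in tokens.items()
    if not heavy_should_remove(*counts)
}
print(kept)  # {'invoice': (12, 1)}
```

Only the token that clearly leans towards one class is kept; the rarely seen
one and the evenly balanced one are dropped.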