$Id: cssclean.txt,v 1.01 2011/07/11 23:46:05 sbajic Exp $

HASH_DRV NIGHTLY MAINTENANCE

The nightly maintenance tool, dspam_clean, does not work with hash_drv:
it does not clean anything.

You have to do the cleaning yourself. There are two steps:

- First, remove old signature files. They are located in the user.sig
  directory and have the extension .sig. They are needed only for DSPAM
  retraining, so you can remove them once they are older than two weeks.
- Second, purge the databases. They are stored in the user.css files and
  contain a set of tokens, each with two counters, SPAM and NONSPAM, which
  record how many times the token appeared in spam and in innocent mail.
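
The first step can be sketched in Python. This is only a sketch and not part
of DSPAM: the function name remove_old_signatures, its parameters, and the idea
of walking one directory tree are assumptions made for illustration. Age is
judged by file modification time, and the 14-day limit is the two weeks
mentioned above.

```python
import os
import time

TWO_WEEKS = 14 * 24 * 60 * 60  # two weeks, in seconds


def remove_old_signatures(sig_root, max_age=TWO_WEEKS, now=None, dry_run=False):
    """Delete *.sig files under sig_root that are older than max_age seconds.

    Hypothetical helper, not a DSPAM API. Returns the list of paths that
    were removed (or, with dry_run=True, that would have been removed).
    """
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(sig_root):
        for name in filenames:
            if not name.endswith(".sig"):
                continue  # leave .css databases and anything else alone
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > max_age:
                if not dry_run:
                    os.remove(path)
                removed.append(path)
    return removed
```

Running it once with dry_run=True lists what would be deleted without touching
anything. The directory to pass depends on where your installation keeps the
user.sig directories.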

There is a dedicated tool for purging them, cssclean:

    cssclean [file.css] {heavy}
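
Since cssclean takes one file per call, a nightly job has to call it once for
every .css file. The sketch below only builds the command lines and leaves
running them to the caller. It reads the usage line above as a required file
argument followed by an optional literal word "heavy", which is an assumption.
The helper name cssclean_commands and its parameters are hypothetical.

```python
import os


def cssclean_commands(css_dir, heavy=False, binary="cssclean"):
    """Build one cssclean command line per *.css file found in css_dir.

    Hypothetical helper. ASSUMPTION: heavy mode is selected by passing the
    literal word "heavy" after the file name, as the usage line suggests.
    """
    commands = []
    for name in sorted(os.listdir(css_dir)):
        if not name.endswith(".css"):
            continue
        command = [binary, os.path.join(css_dir, name)]
        if heavy:
            command.append("heavy")
        commands.append(command)
    return commands
```

Each returned list can be handed to subprocess.run(command, check=True) from a
cron job once you have confirmed the argument form against your cssclean build.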

Cssclean keeps its own counter for each token. The counter is incremented on
every cleaning run, so if you run cssclean every night it works like a
timestamp measured in nights. Whenever DSPAM uses a token, it resets that
token's counter, so cssclean knows for how many runs each token has gone unused.
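
The counter mechanism can be pictured with a toy model. This is an in-memory
illustration, not the on-disk format of the .css files. The function names and
the token names are made up, and "reset" is assumed to mean "set to zero".

```python
def cleaning_run(unused):
    """One cssclean run: every token's counter grows by one."""
    for token in unused:
        unused[token] += 1


def token_used(unused, token):
    """DSPAM used the token: its counter is reset (assumed: to zero)."""
    unused[token] = 0


unused = {"offer": 0, "meeting": 0}
for _night in range(3):
    cleaning_run(unused)       # three nights pass, nothing is used
token_used(unused, "meeting")  # a message containing "meeting" arrives
cleaning_run(unused)           # fourth night
print(unused)  # {'offer': 4, 'meeting': 1}
```

After four nightly runs, "offer" has gone unused for four runs, while "meeting"
shows only the one run since it was last used.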

Cssclean removes tokens that have:
- not been used for 15 cleans and ( NONSPAM + SPAM <= 1 ), or
- not been used for 15 cleans and NONSPAM equal or almost equal to SPAM, or
- not been used for 60 cleans and ( NONSPAM*2 + SPAM < 5 ), or
- not been used for 120 cleans.
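
The four rules above can be written as a single predicate. This is a sketch
under two assumptions that the text does not settle: "not used for N cleans"
is read as a counter of at least N, and "almost equal" is read as a difference
of at most tolerance (default 1). The real thresholds in cssclean may differ.

```python
def should_remove(unused, nonspam, spam, tolerance=1):
    """Return True if the default rules listed above would drop the token.

    unused    -- cleaning runs since DSPAM last used the token
    nonspam   -- the token's NONSPAM counter
    spam      -- the token's SPAM counter
    tolerance -- ASSUMPTION: largest difference still treated as "almost equal"
    """
    almost_equal = abs(nonspam - spam) <= tolerance
    if unused >= 120:
        return True
    if unused >= 60 and nonspam * 2 + spam < 5:
        return True
    if unused >= 15 and (nonspam + spam <= 1 or almost_equal):
        return True
    return False
```

For example, should_remove(60, 0, 4) is True because 0*2 + 4 < 5, while the
same token at 59 unused runs is kept: its counters sum to more than 1 and are
not close to each other.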

With the special option heavy, cssclean is stricter and removes
tokens for which:
- NONSPAM + SPAM <= 1, or
- NONSPAM is equal or almost equal to SPAM
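
The effect of the heavy conditions can be tried out on a few sample counters.
As before, this is a sketch: "almost equal" is assumed to mean a difference of
at most tolerance (default 1), and the token names are invented. The text does
not say whether the age-based rules still apply alongside, so only the two
heavy conditions are modelled here.

```python
def heavy_removes(nonspam, spam, tolerance=1):
    """Heavy-mode conditions only: no minimum number of unused runs.

    tolerance -- ASSUMPTION: largest difference still treated as "almost equal"
    """
    return nonspam + spam <= 1 or abs(nonspam - spam) <= tolerance


# (NONSPAM, SPAM) counters for some made-up tokens
tokens = {"rare": (1, 0), "balanced": (7, 8), "hammy": (40, 2), "spammy": (0, 25)}
kept = sorted(
    name for name, (nonspam, spam) in tokens.items()
    if not heavy_removes(nonspam, spam)
)
print(kept)  # ['hammy', 'spammy']
```

Tokens seen at most once and tokens that appear about equally often in spam
and in innocent mail are dropped at once. Tokens that lean clearly one way
survive.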