REVIEW: "Ending Spam", Jonathan A. Zdziarski
Posted on 19.01.2006 17:05:30 by rslade
BKENDSPM.RVW 20051029
"Ending Spam", Jonathan A. Zdziarski, 2005, 1-59327-052-6,
U$39.95/C$53.95
%A Jonathan A. Zdziarski
%C 555 De Haro Street, Suite 250, San Francisco, CA 94107
%D 2005
%G 1-59327-052-6
%I No Starch Press
%O U$39.95/C$53.95 415-863-9900 fax 415-863-9950 info@nostarch.com
%O http://www.amazon.com/exec/obidos/ASIN/1593270526/robsladesinterne
http://www.amazon.co.uk/exec/obidos/ASIN/1593270526/robsladesinte-21
%O http://www.amazon.ca/exec/obidos/ASIN/1593270526/robsladesin03-20
%O Audience s+ Tech 3 Writing 2 (see revfaq.htm for explanation)
%P 287 p.
%T "Ending Spam"
The preface states that the book is for those seriously interested in
spam identification technologies, and concentrates on Bayesian and
related statistical filtering.
Part one is an introduction to spam filtering. Chapter one reviews
the history of spam, although many of the early entries are simply
annoyances or chain letters rather than the commercial or fraudulent
items considered under the banner today, and the author does not seem
to realize that 419 scams predated email by a considerable margin. A
look at the development of spam filtering (excluding Bayesian) is
presented in chapter two, along with some non-filtering approaches. Bayesian
analysis is explained in chapter three, and the statistical filtering
basis is outlined in chapter four.
The fundamental actuarial core is expanded in part two. Chapter five
covers message coding. Tokenization, chunking characters into
identifiable items, is examined in chapter six. Tricks spammers use
to evade filters, and the solutions for finding spam despite the
deceptions, are outlined in chapter seven. Storage and performance
issues raised by the data rules required by statistical filters are
addressed in chapter eight. Chapter nine looks at aspects of scaling
to systems supporting large numbers of users.
Part three deals with advanced concepts in statistical filtering.
Chapter ten delves into testing which, because of the individual and
adaptive nature of Bayesian filtering, presents unique challenges.
Tokenization is revisited in chapter eleven, in more advanced forms.
Markovian discrimination, with its examination of stateful entities,
is explained in chapter twelve. Since the book has by this point noted
many kinds of features, chapter thirteen explores ways to reduce the
items used (and data required) while maintaining accuracy.
Collaborative rule-building with other users, groups, or systems is
reviewed in chapter fourteen.
As the preface implies, this is *not* a book for users who just want
to install POPFile (although that and other programs are explored in
an appendix). For those who are seriously involved in managing and
developing spam filtering, however, the book does provide very useful
advice, pointers, and research.
copyright Robert M. Slade, 2005 BKENDSPM.RVW 20051029
--
======================
rslade@vcn.bc.ca slade@victoria.tc.ca rslade@sun.soci.niu.edu
============= for back issues:
[Base URL] site http://victoria.tc.ca/techrev/
or mirror http://sun.soci.niu.edu/~rslade/
CISSP refs: [Base URL]mnbksccd.htm
Security Dict.: [Base URL]secgloss.htm
Book reviews: [Base URL]mnbk.htm
Review mailing list: send mail to techbooks-subscribe@egroups.com
or techbooks-subscribe@topica.com