Return-Path: list-managers-owner@GreatCircle.COM
Received: from relay3.UU.NET (relay3.UU.NET [192.48.96.8]) by leibniz.math.psu.edu (8.6.12/8.6.9) with ESMTP id OAA02226 for <barr@math.psu.edu>; Tue, 17 Oct 1995 14:46:57 -0400
Received: from miles.greatcircle.com by relay3.UU.NET with ESMTP 
	id QQzlwl24468; Tue, 17 Oct 1995 14:45:45 -0400
Received: (majordom@localhost) by miles.greatcircle.com (8.6.9/Miles-950430-1) id LAA22532 for list-managers-outgoing; Tue, 17 Oct 1995 11:32:04 -0700
Received: from maytag.graphics.cornell.edu (MAYTAG.GRAPHICS.CORNELL.EDU [128.84.247.157]) by miles.greatcircle.com (8.6.9/Miles-950430-1) with SMTP id LAA22517 for <list-managers@greatcircle.com>; Tue, 17 Oct 1995 11:31:58 -0700
Received: from localhost by maytag.graphics.cornell.edu; (5.65/1.1.8.2/07Nov94-0649PM)
	id AA04243; Tue, 17 Oct 1995 14:30:16 -0400
Message-Id: <9510171830.AA04243@maytag.graphics.cornell.edu>
X-Mailer: exmh version 1.6gamma 3/31/95
To: list-managers@GreatCircle.COM
Subject: news-to-mail gateways and spam
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 17 Oct 95 14:30:15 -0400
From: Mitch Collinsworth <mkc@graphics.cornell.edu>
X-Mts: smtp
Sender: list-managers-owner@GreatCircle.COM
Precedence: bulk


Hi all,

Those of you who run gateways between mailing lists and newsgroups are
well aware of the problems created by news spammers, who are much more
numerous and prolific than e-mail spammers.  The spam goes through the
gateway to the mailing list the second it arrives.  The cancel message
from one of the spam cancellers comes along some time later and cleans
up your news spool, but can't do anything for the mailing list since
the spam has already gone out.

What to do?  Moderate your mailing list or the news-to-mail gateway?
An acceptable solution for some, but not for many others.  One idea
I've heard raised is to insert a delay in the gateway to allow time
for the cancel message to catch the spam before it crosses the gateway.
That of course slows everything down, but if you can live with that
then maybe it's ok.  It probably slows things less on average than
moderating the gateway or the entire list.  And you really only need
to do it on the news-to-mail side of the gateway.

Below is a perl script I just put together to do this for INN.  The
real question now is how much delay is sufficient.  I suppose that's
partly a tradeoff against how much delay you're willing to insert into
the process.  I'm interested in feedback on this point.

I'm thinking of making the delay longer on weekends than during the
week, partly because the big news spammers seem to hit more often on
weekends and the spam cancellers are probably away from their posts more
then.  I'm also sure my list members are away from the net more on
weekends, so extra delay should be less objectionable then.
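
A minimal sketch of that weekend idea, not part of the script below.
The sub name delay_for and the 4-hour weekend figure are assumptions made
up for illustration; the weekday figure matches $DELAY_TIME in the script.
It also assumes the day of the week is taken from the time the article was
received, in the local time zone.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical values: only the weekday figure comes from the script.
my $WEEKDAY_DELAY = 7200;	# 2 hours, same as $DELAY_TIME
my $WEEKEND_DELAY = 14400;	# 4 hours, a guess

# delay_for: return the minimum delay in seconds for one article
#	input: $time_rcvd  # epoch time the article was received
sub delay_for {
	my ($time_rcvd) = @_;
	my $wday = (localtime($time_rcvd))[6];	# 0 = Sunday ... 6 = Saturday
	return ($wday == 0 || $wday == 6) ? $WEEKEND_DELAY : $WEEKDAY_DELAY;
}

# Example: print the delay that would apply to an article received now.
print delay_for(time), "\n";
```

In the script the test against $DELAY_TIME could then become a test
against delay_for($time_rcvd).  Since the queue is processed in order and
stops at the first entry that has not waited long enough, a late Sunday
article would also hold back the first few Monday articles for a while.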

-Mitch

-----------

#!/usr/local/bin/perl

# delayfeed: Take an INN file feed and delay for a specified time period.
# After delay invoke a program feed.  Intended purpose is for delaying a
# news --> mail gateway to allow time for spam cancellations to catch up
# before the spam traverses the gateway.
#
# NB: This is NOT a news --> mail gateway program.  This program is a
# feed delaying tool that can be inserted between innd and an existing
# news --> mail gateway program.

# Installation:
#
# 1. Assuming your existing INN news --> mail gateway newsfeed entry looks
#    something like this:
#
#	# News-Mail gateway
#	my-gw:!*,rec.foo.bar,news.bar.foo\
#		:Tp:/var/news/bin/News2Mail
#
#    ... which pipes each incoming article in the specified groups to the
#    named News2Mail, you should change the feed to look something like
#    this:
#
#	# Anti-spam gateway
#	my-gw:!*,rec.foo.bar,news.bar.foo\
#		:Tf,Wtf:
#
#    This will create a file feed in your spool/out.going directory.  The
#    feed file will contain time and pathname for each article received.
#    Like so:
#
#	813939153 /var/spool/news/rec/foo/bar/2327
#	813941888 /var/spool/news/news/bar/foo/8412
#	813943494 /var/spool/news/news/bar/foo/8413
#
# 2. Adjust configuration information below appropriately.
#
# 3. Add a crontab entry something like this:
#
#	8,28,48 * * * * su news -c "( /usr/local/scripts/delayfeed )"
#	(If you have a more modern cron, adjust syntax accordingly.
#	The point is to make it run as the news user.)
#
# 4. You will probably want to add a cron script to rotate the log files
#    to prevent them from growing indefinitely.


# configuration info
$FEED = 'my-gw';			# name of feed from newsfeeds file
$NEWS_SPOOL = '/var/spool/news';	# where the news spool is
$CTLINND = '/var/news/bin/ctlinnd';	# path to ctlinnd
$N2M = '/var/news/bin/News2Mail';	# path to n2m script
$LOG = '/var/log/news/antispamlog';	# where to put our log file
$DELAY_TIME = 7200;			# minimum delay time in seconds


$tmpfile = "$FEED.bch$$";		# temporary file for batching the feed
$savefile = "$FEED.save";		# temporary file for rewriting queue
$workfile = "$FEED.delay";		# queue of delaying messages

# open log file
open(LOG,">>$LOG") || die "Can't open $LOG: $!";
chop($date = `date`);
print LOG "*** $0 run at $date\n";

# move new feed data from inn's feed file to a temporary batch file
chdir("$NEWS_SPOOL/out.going") ||
	&errdie("Can't chdir to $NEWS_SPOOL/out.going");
rename($FEED,$tmpfile) || &errdie("Can't rename $FEED to $tmpfile: $!");

# signal innd to flush output and start new feed file
system "$CTLINND flush $FEED";

# append new feed data to existing article delay queue
open(BATCH,$tmpfile) || &errdie("Can't open $tmpfile");
open(WORK,">>$workfile") || &errdie("Can't open $workfile");
while(<BATCH>) {
	print WORK $_;
}
close BATCH;
close WORK;
unlink $tmpfile;

# Loop through the delay queue, feed files that have delayed sufficiently.
# Stop when either eof or reach an entry with insufficient accumulated delay.
# Entries that are still waiting are copied to a save file, which then
# replaces the queue, so articles already handled are not fed again.
$time_now = time;
open(WORK,$workfile) || &errdie("Can't open $workfile");
open(SAVE,">$savefile") || &errdie("Can't open $savefile");
while(<WORK>) {
	($time_rcvd,$path) = split(' ',$_);
	$delayed = $time_now - $time_rcvd;
	print LOG "Delayed $delayed:\t$path\t";
	if ($delayed > $DELAY_TIME) {
		if (-e $path) {
			print LOG "still exists - FEEDING NOW\n";
			system "$N2M <$path &";
		} else {
			print LOG "CANCELLED OR EXPIRED\n";
		}
	} else {
		print LOG "delaying rest of queue some more\n";

		# keep the current and all remaining articles in the queue
		print SAVE $_;
		while (<WORK>) {
			print SAVE $_;
		}
	}
}
close SAVE;
close WORK;

# Replace the old queue with the saved entries.  This also empties the
# queue when every entry was old enough to be handled on this run.
rename($savefile,$workfile) ||
	&errdie("Can't rename $savefile to $workfile");

close LOG;
exit;

# errdie: put a death message in the logfile and die.
#	input: $message  # an error message
sub errdie {
	local($message) = @_;
	print LOG "$message\n";
	die $message;
}
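
Installation step 4 suggests a cron script to rotate the log.  One way
that might look is sketched here.  The sub name rotate_log and the choice
of keeping 7 old generations are assumptions for illustration, not part of
delayfeed; the log path is the $LOG value from the configuration section.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# rotate_log: move $log to $log.1, $log.1 to $log.2 and so on,
# keeping at most $keep old generations.
#	input: $log   # path of the log file
#	       $keep  # number of old generations to keep
sub rotate_log {
	my ($log, $keep) = @_;
	return unless -e $log;			# nothing to rotate yet
	unlink("$log.$keep") if -e "$log.$keep";	# drop the oldest
	for (my $i = $keep - 1; $i >= 1; $i--) {
		next unless -e "$log.$i";
		rename("$log.$i", "$log." . ($i + 1))
			or die "Can't rename $log.$i: $!";
	}
	rename($log, "$log.1") or die "Can't rename $log: $!";
}

rotate_log('/var/log/news/antispamlog', 7);
```

Because delayfeed opens its log in append mode on every run, it will
simply create a fresh log file the next time it runs.  Run the rotation
from the news user's crontab at a minute when delayfeed is not scheduled.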

