Search Engine Spammers, beware:

Here is the Call for Papers for the Fourth International Workshop on Adversarial Information Retrieval on the Web (AIRWeb 2008), to be held on April 22, 2008, in Beijing, China.

As in AIRWeb 2007, there will be a Web Spam Challenge. Let’s call it “Ethical Spamming”, à la “Ethical Hacking”. Indeed, to understand the spammer/hacker mentality you need to either act like one under controlled conditions or have been one in a previous life, so to speak.

Once again, I’ve been appointed a member of the Program Committee. To help promote the event, I’m reproducing their Call for Papers below.

Adversarial Information Retrieval addresses tasks such as gathering, indexing, filtering, retrieving and ranking information from collections wherein a subset has been manipulated maliciously. On the Web, the predominant form of such manipulation is “search engine spamming” or spamdexing, i.e., malicious attempts to influence the outcome of ranking algorithms, aimed at getting an undeserved high ranking for some items in the collection.

We solicit both full and short papers on any aspect of adversarial information retrieval on the Web. Particular areas of interest include, but are not limited to:

Link spam
Content spam
Comment spam
Spam-oriented blogging
Click fraud detection
Reverse engineering of ranking algorithms
Web content filtering
Advertisement blocking
Stealth crawling
Malicious tagging

Proceedings of the workshop will be included in the ACM Digital Library. Full papers are limited to 8 pages; work-in-progress papers will be permitted 4 pages.

Web Spam Challenge
Last year we introduced a novel element at the workshop: a Web Spam Challenge for testing web spam detection systems. We will be holding the Web Spam Challenge again this year, using the WEBSPAM-UK2007 collection for Web Spam Detection, which we anticipate will be released in early January 2008.

The collection includes a large set of web pages, a web graph, and human-provided labels for a set of hosts. We will also provide a set of features extracted from the contents and links in the collection, which may be used by the participant teams in addition to any automatic technique they choose to use.

We ask that participants of the Web Spam Challenge submit predictions (normal/spam) for all unlabeled hosts in the collection. Predictions will be evaluated and results will be announced at the AIRWeb 2008 workshop.

More information will be posted to  

15 February 2008: E-mail intention to submit a workshop paper (optional, but helpful)
22 February 2008: Deadline for workshop paper submissions
14 March 2008: Notification of acceptance of workshop papers
31 March 2008: Camera-ready copy due
31 March 2008: Challenge submissions due
22 April 2008: Date of workshop

Organizers and Program Committee

Carlos Castillo, Yahoo! Research
Kumar Chellapilla, Microsoft Live Labs
Dennis Fetterly, Microsoft Research

Program Committee:

Einat Amitay, IBM
András Benczúr, Hungarian Academy of Sciences
Paul-Alexandru Chiri, Uni Hannover
James Caverlee, Texas A&M University
Gordon Cormack, University of Waterloo
Nick Craswell, Microsoft Research
Matt Cutts, Google
Brian Davison, Lehigh University
Ludovic Denoyer, University Paris 6
Aaron D’Souza, Google
Edel García, Mi
Natalie Glance, Nielsen BuzzMetrics
Antonio Gulli,
Zoltán Gyöngyi, Stanford University
Monika Henzinger, Google
Pranam Kolari, Yahoo! Applied Research
Mark Manasse, Microsoft Research
Marc Najork, Microsoft Research
Alexandros Ntoulas, Microsoft Search Labs
Jan Pedersen, Yahoo! Search
Erik Selberg, Amazon
Torsten Suel, Polytechnic University
Mike Thelwall, University of Wolverhampton
Baoning Wu, Snap
Tao Yang,