As a program committee member of the W3C AIRWEB07 Workshop to be held in Banff, Canada, I was asked by its organizers (Dr. Brian Davison, Lehigh University; Dr. Carlos Castillo, Yahoo! Research – Spain; Dr. Kumar Chellapilla, Microsoft Live Labs) to give updates regarding the conference, which I did.

Since this is the new home of IR Thoughts and some of you might be new readers, I am reposting the program, complete with links to the presentations.

AIRWEB is a special workshop dedicated to investigating and testing research dealing with all kinds of adversarial information retrieval practices. The way I see it, this includes all sorts of spam and trickery, such as:

  1. search engine spam and optimization (e.g., keyword spam, questionable SEO practices, etc.)
  2. crawling the web without detection
  3. link-bombing (a.k.a. Google-bombing)
  4. comment spam, referrer spam
  5. blog spam (a.k.a. splogs)
  6. malicious tagging (a.k.a. magging)
  7. reverse engineering of ranking algorithms
  8. advertisement blocking
  9. web content filtering
  10. other practices designed to deceive search engines and the public

The irony of publishing papers on these subjects is that they are available to the public, including spammers. Thus, the so-called Search Engines War gets more interesting, since we all end up playing a cat-and-mouse game.

To illustrate, a wannabe spammer might learn a few things by checking some of the papers below, while a researcher interested in fighting spam might find new ways to improve an anti-spam technique.

Here is the list of papers on Adversarial IR on the Web (AIRWEB) that survived peer review by the PC members on their own merits.

Spammers, for good or bad: you have been served with the papers.

Session 1: Temporal and Topological Factors (8:30-10:00)

Splog Detection Using Self-similarity Analysis on Blog Temporal Dynamics
Yu-Ru Lin, Hari Sundaram, Yun Chi, Junichi Tatemura and Belle Tseng
Improving Web Spam Classification using Rank-time Features
Krysta Svore, Qiang Wu, Chris Burges and Aaswath Raman
Improving Web Spam Classifiers Using Link Structure (S)
Qingqing Gan and Torsten Suel
Transductive Link Spam Detection
Dengyong Zhou, Christopher Burges and Tao Tao (presenter: Krysta Svore)

Session 2: Link Farms (10:30-12:00)

Using Spam Farm to Boost PageRank
Ye Du, Yaoyun Shi and Xin Zhao
Extracting Link Spam using Biased Random Walks from Spam Seed Sets
Baoning Wu and Kumar Chellapilla
A Large-Scale Study of Link Spam Detection by Graph Algorithms (S)
Hiroo Saito, Masashi Toyoda, Masaru Kitsuregawa and Kazuyuki Aihara
Measuring Similarity to Detect Qualified Links
Xiaoguang Qi, Lan Nie and Brian Davison

Session 3: Tagging, P2P and Cloaking (13:30-15:00)

Combating Spam in Tagging Systems
Georgia Koutrika, Frans Effendi, Zoltán Gyöngyi, Paul Heymann and Hector García-Molina
New Metrics for Reputation Management in P2P Networks
Debora Donato, Mario Paniccia, Maddalena Selis, Carlos Castillo, Giovanni Cortese and Stefano Leonardi
Computing Trusted Authority Scores in Peer-to-Peer Web Search Networks
Josiane Xavier Parreira, Debora Donato, Carlos Castillo and Gerhard Weikum
A Taxonomy of JavaScript Redirection Spam
Kumar Chellapilla and Alexey Maykov

Session 4: Web Spam Challenge (15:30-17:00)

Web Spam Detection via Commercial Intent Analysis (S)
András A. Benczúr, István Bíró, Károly Csalogány and Tamás Sárlós
Five 10-minute presentations from the participating teams
Presentation of the Evaluation Results (15 minutes)
Closing Remarks (5 minutes)