Sneak Preview of IR Watch
July 31, 2007
The current issue, IRW-2007-08, should be out within the next few days. Association Rules Part 2 discusses how association rule mining techniques from market basket analysis can be applied to Web mining.

Here is a 2004 paper on Association Thesaurus Construction for Interactive Query Expansion based on Association Rule Mining
The article discusses basic association rule mining concepts like support, confidence, and pruning as we described in Association Rules Part 1 (July issue of IR Watch - The Newsletter). BTW, read Part 2 in the August issue.
Here is a list of IR conferences scheduled for August, 2007:
I came across a relatively old paper authored by researchers at Indiana University and the Institute for Human and Machine Cognition: Assessing Conceptual Similarity to Support Concept Mapping
Indeed, it seems like a great concept.
Omaya Sosa Pascual over at El Nuevo Dia published (07/22/07) an Interview with Vint Cerf, while he was visiting Puerto Rico during the ICANN Public Meeting. Like the newspaper, the interview is in Spanish. I had the privilege of attending the talk Dr. Cerf delivered at the Law School of the University of Puerto Rico.
I have discussed AND and EXACT searches many times, but did you know the following?
In addition to enclosing search terms in double quotes (“like this”), in some search engines one can invoke a shortcut to an EXACT search by using certain characters that serve as sequence connectors. These work the same way double quotes do. The most common is the hyphen; e.g.
SEOmoz has a great discussion on why at times search engines don’t return relevant results; that is, why some results perceived by users as being not relevant to their information needs (queries) are ranked high by search engines.
Some bloggers at SEOmoz attribute this in part to precision and recall issues. We have covered these topics on different occasions, so let's revisit some points along those lines.
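As a quick refresher, here is a minimal sketch of how precision and recall are commonly computed for a single query. It assumes the standard set-based definitions (precision = relevant retrieved / retrieved; recall = relevant retrieved / relevant). The function name and the document IDs are hypothetical, made up only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set-based definitions."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only
retrieved = ["d1", "d2", "d3", "d4"]  # what the engine returned
relevant = ["d2", "d4", "d7"]         # what the user actually needed

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case half of the returned results are relevant, yet one relevant document is never retrieved, which is the kind of trade-off the discussion refers to.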
Here is a list of IR conferences, scheduled for the rest of July in The Netherlands.
I just came across this article
http://seo-and-google.blogspot.com/2007/07/5-tips-to-effective-seo-keyword.html
by Valerie DiCarlo and honestly don’t know where these marketers learn all these misconceptions regarding LSI. Perhaps she has been misled as well by the usual suspects and is just trying to make some honest comments she truly believes. Unfortunately, most of her claims are incorrect. I’m commenting on her lines, one by one.
Glasgow Summer School on Multimedia Semantics, organized by the Information Retrieval Group at Glasgow University, is in full swing now (July 15-21, 2007)
Thomas Richard Lynam has extensively researched a variant of ITF called Redundant ITF (RITF). His 2002 master’s thesis, “Exploitation of Redundant Inverse Term Frequency”, is a must-read for anyone interested in the topic. His thesis is available as PDF and PostScript.
The justification for using RITF is as follows.
In my previous post I explained to a reader the difference between inverse term frequency (ITF) and inverse document frequency (IDF), but did not provide practical applications. This post is to explain what ITF is good for. Like IDF, ITF is a global weight measure; i.e., Gi = ITF. Combined with a local weight measure (Lij), it can be used to compute an overall weight. Local weights can be defined in many different ways. Here is one definition:
Good question from a reader:
Hi, I have heard the expression inverse document frequency, but recently I came across a paper mentioning inverse term frequency. Can you clarify these?
Sure.
Consider a set, D, consisting of n documents:
D = {d1, d2…dn}
and a set, T, consisting of m unique terms extracted from D:
T = {t1, t2…tm}
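To make the two sets concrete, here is a minimal sketch that builds T from a small D and then computes IDF for each term. The three documents are hypothetical, tokenization is plain whitespace splitting, and the IDF formula used, log(n / n_i) with n_i the number of documents containing term i, is the common textbook form, assumed here because this excerpt does not show the definitions themselves. ITF is left out for the same reason.

```python
import math


def build_term_set(documents):
    """Collect the unique terms (the set T) found across all documents in D."""
    terms = set()
    for doc in documents:
        terms.update(doc.lower().split())  # naive whitespace tokenization
    return sorted(terms)


def inverse_document_frequency(documents, terms):
    """Compute IDF_i = log(n / n_i) for every term (assumed textbook form)."""
    n = len(documents)
    tokenized = [set(doc.lower().split()) for doc in documents]
    idf = {}
    for term in terms:
        # n_i: number of documents that contain the term at least once
        df = sum(1 for tokens in tokenized if term in tokens)
        idf[term] = math.log(n / df)
    return idf


# Hypothetical document set D with n = 3
D = [
    "shipment of gold",
    "delivery of silver",
    "shipment of silver arrived",
]

T = build_term_set(D)                 # the m unique terms
idf = inverse_document_frequency(D, T)

for term in T:
    print(f"{term:10s} idf={idf[term]:.3f}")
```

A term that appears in every document, such as "of" here, gets an IDF of zero, while a term found in a single document gets the highest weight.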
Harry Collier over at Infonortics emailed me yesterday this Call for Papers:
The Call for Papers for next year’s (April 2008) Search Engine Meeting in Boston has now been released. Offers of presentations are being invited for consideration. Absolute deadline for submission of offers is October 18, 2007 but the organizers stress that, with presentations limited to only 20 over the two days of the meeting, it is advisable to make contact as early as possible.

Association Rules based on co-occurrence can be used to address relationships like: Customers buying X tend to buy Y. These can be used to support business-related services such as marketing promotions, inventory, and CRM programs. Learn how by reading the July 2007 issue of IR Watch - The Newsletter.
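As a rough illustration of a rule like "customers buying X tend to buy Y", the sketch below computes support and confidence for a rule X -> Y over a handful of baskets. It assumes the usual definitions: support is the fraction of transactions containing both X and Y, and confidence is support(X and Y) divided by support(X). The baskets and function names are hypothetical, not taken from the newsletter.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    count = sum(1 for t in transactions if itemset <= set(t))
    return count / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent): support(X u Y) / support(X)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical market baskets, for illustration only
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

s = support(baskets, {"bread", "milk"})
c = confidence(baskets, {"bread"}, {"milk"})
print(f"support(bread, milk)     = {s:.2f}")
print(f"confidence(bread -> milk) = {c:.2f}")
```

Rules whose support or confidence falls below chosen thresholds are typically pruned; the thresholds themselves depend on the application.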
I’m putting the final touches on this month’s issue of IRW, which is running late, for reasons all subscribers know by now. It should be out tomorrow.
Amazing how many are still perpetuating so many misconceptions about “LSI tools”. Here is another example, forwarded to me by Melissa Fach, one of several SEOs that are discovering how many “LSI-based” SEO lies are out there thanks to the usual suspects:
http://courtneytuttle.com/2007/07/05/taking-seo-to-the-next-level-lsi/
Mike Grehan finished his great ClickZ column of June 11, 2007, “SEO Is Dead. Long Live, er, the Other SEO”, as follows:
“I’ve run out of space again. I’ll come back to the stupidity of the latent semantic indexing issue in my next column.”
As mentioned, the July issue of IR Watch is running late due to the backend changes we made last week to our main site (http://www.miislita.com). If you are a subscriber, IRW should arrive in your inbox in a few days. This issue is dedicated to Market Basket Analysis and Keyword Research. Some portions are adapted from Tan, Steinbach, and Kumar’s book “Introduction to Data Mining”.
Data mining query logs? Then these research papers might interest you.
Dependence or Independence Day? Ask regular citizens.
Meanwhile, how about some IR papers on the dependence/independence of query relevance feedback?
Here is a list of really interesting papers on the subject from Michael Ortega-Binderberger’s group:
We have finally completed some backend changes to http://www.miislita.com. Sorry for the inconvenience. These changes were necessary, but have the effect of delaying the publication of the July issue of IR Watch.