<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>IR Thoughts</title>
	<atom:link href="http://irthoughts.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://irthoughts.wordpress.com</link>
	<description>Thoughts on Information Retrieval &#38; Data Mining</description>
	<lastBuildDate>Wed, 25 Nov 2009 13:04:05 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='irthoughts.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/b50f2f199631fcb269aa9a1b8b9bcda4?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>IR Thoughts</title>
		<link>http://irthoughts.wordpress.com</link>
	</image>
			<item>
		<title>IRW:2009-11: Subnetting and Security</title>
		<link>http://irthoughts.wordpress.com/2009/11/25/irw2009-11-subnetting-and-security/</link>
		<comments>http://irthoughts.wordpress.com/2009/11/25/irw2009-11-subnetting-and-security/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 13:04:05 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Internet Engineering]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1161</guid>
		<description><![CDATA[
The current issue of IRW arrived to subscribers last week. Look what non-subscribers missed.
Featured article: Subnetting and Security
&#8220;The purpose of subnetting is to break large networks into smaller networks called subnets by borrowing bits from the host portion of an IP address. This is done by using a variable length subnet mask. As the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1161&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img src="http://www.miislita.com/irw/subnetting.gif"/></p>
<p>The current issue of IRW arrived to subscribers last week. Look what non-subscribers missed.</p>
<p>Featured article: Subnetting and Security</p>
<p>&#8220;The purpose of subnetting is to break large networks into smaller networks called subnets by borrowing bits from the host portion of an IP address. This is done by using a variable length subnet mask. As the IP addresses of host computers on each subnet are masked by the network address these are invisible to those outside the network. In this sense, subnetting benefits security. Conversely, the cost of misconfigured subnets creates Internet vulnerabilities. For those working at the intersection of Information Security, IP address masking and subnetting might be relevant when it comes to analyzing spoofing, penetration hacking, and subnet advertisements. Accordingly, in this issue of the newsletter we present a straightforward approach to the art of subnetting.&#8221; </p>
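<p>As a rough illustration of the bit-borrowing idea in the excerpt, here is a minimal Python sketch using the standard <code>ipaddress</code> module. The 192.168.1.0/24 network, the choice of two borrowed host bits, and the helper name <code>split_network</code> are assumptions made for this example; they are not taken from the newsletter.</p>
<pre>
```python
import ipaddress


def split_network(cidr, borrowed_bits):
    """Split a network into 2**borrowed_bits equal subnets.

    Hypothetical helper for illustration only.
    """
    network = ipaddress.ip_network(cidr)
    # Each borrowed host bit lengthens the prefix by one and doubles
    # the number of subnets.
    return list(network.subnets(prefixlen_diff=borrowed_bits))


# Example network and bit count are assumptions, not from the newsletter.
subnets = split_network("192.168.1.0/24", 2)
for subnet in subnets:
    # num_addresses includes the network and broadcast addresses,
    # so a classic IPv4 subnet has two fewer usable host addresses.
    print(subnet, subnet.netmask, subnet.num_addresses - 2)
```
</pre>
<p>Borrowing two bits lengthens the prefix from /24 to /26 and yields four subnets with 62 usable host addresses each.</p>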
<p>Enjoy it.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/11/25/irw2009-11-subnetting-and-security/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Remembering Mike Muuss</title>
		<link>http://irthoughts.wordpress.com/2009/11/20/remembering-mike-muuss/</link>
		<comments>http://irthoughts.wordpress.com/2009/11/20/remembering-mike-muuss/#comments</comments>
		<pubDate>Fri, 20 Nov 2009 15:17:08 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1151</guid>
		<description><![CDATA[
Nine years ago the great Mike Muuss, inventor of PING and many other groundbreaking software tools, was killed in a car accident. Almost a decade later we pay respect to his memory in the current issue of IR Watch (to be delivered a bit late, in a few days). The Who is Who column of [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1151&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://ftp.arl.mil/~mike/ivy-mike.gif" alt="Mike Muuss" /></p>
<p>Nine years ago the great Mike Muuss, inventor of PING and many other groundbreaking software tools, was killed in a car accident. Almost a decade later we pay respect to his memory in the current issue of IR Watch (to be delivered a bit late, in a few days). The Who is Who column of IRW features the following:</p>
<p style="text-align:left;">Michael John Muuss (October 16, 1958 &#8211; November 20, 2000), a multi-talented computer wizard who helped lay the foundations for the modern-day Internet, was killed at the age of 42 in an automobile accident near his home in Havre de Grace. He was returning home from a restaurant when his car was involved in a multivehicle pileup on Interstate 95. Nine years after his death, we pay our respects to his memory. The following are excerpts from his obituary (from The Baltimore Sun Company: <a href="http://www.ping127001.com/pingpage/muuss.htm">http://www.ping127001.com/pingpage/muuss.htm</a>).</p>
<p>A graduate of the Johns Hopkins University, Muuss spent his entire career at the U.S. Army Research Laboratory (ARL) at Aberdeen Proving Ground, where he established a reputation as an enthusiastic problem-solver who did groundbreaking work in areas ranging from computer networks to graphics.</p>
<p>He is best known for inventing PING, one of the most widely used IP address retrieval and diagnostic tools for computer networks in the world and used in almost all PCs.</p>
<p>Contrary to popular opinion/urban legends, the PING name was not intended to be a ping pong analogy or to stand for Packet Internet Grouper. According to his own words (<a href="http://ftp.arl.mil/~mike/ping.html">http://ftp.arl.mil/~mike/ping.html</a>):</p>
<ul>
<li>“I named it after the sound that a sonar makes, inspired by the whole principle of echo-location.”</li>
<li>“From my point of view PING is not an acronym standing for Packet InterNet Grouper, it&#8217;s a sonar analogy.”</li>
</ul>
<p>In the early 1980s, Muuss helped lay the technological foundation that would transform the ARPANET, then an obscure military computer network created in 1969 by the Department of Defense, into the modern-day Internet (<a href="http://ftp.arl.mil/~mike/">http://ftp.arl.mil/~mike/</a>).</p>
<p>Muuss also created BRL-CAD, a program that allowed the military to create sophisticated 3-D models. Over the years, BRL-CAD has become one of the Army&#8217;s most-licensed technologies and is used to model everything from tanks to brain tumors.</p>
<p>His work in computer security landed him a cameo appearance in Clifford Stoll&#8217;s 1989 hacker classic &#8220;The Cuckoo&#8217;s Egg,&#8221; a nonfiction thriller about the hunt for an international band of computer criminals. Muuss was also known for tracking down crackers.</p>
<p style="text-align:left;">In 1990, he was one of the government&#8217;s key witnesses in the case against Robert Tappan Morris, whose &#8220;Morris Worm&#8221; in 1988 nearly brought down the Internet.<br />
(Case Sentence: <a href="http://www.nytimes.com/1990/05/05/us/computer-intruder-is-put-on-probation-and-fined-10000.html?scp=2&amp;sq=robert+tappan+morris&amp;st=nyt">http://www.nytimes.com/1990/05/05/us/computer-intruder-is-put-on-probation-and-fined-10000.html?scp=2&amp;sq=robert+tappan+morris&amp;st=nyt</a><br />
Case Appeal: <a href="http://morrisworm.larrymcelhiney.com/morris_appeal.txt">http://morrisworm.larrymcelhiney.com/morris_appeal.txt</a>).</p>
<p>In 1999, at the age of 41, Muuss was given the Research and Development Achievement Award, the Army&#8217;s highest civilian award for scientific accomplishments. Before his death, he assembled an impressive review on the <em>History of Computing Information</em> (<a href="http://ftp.arl.army.mil/~mike/comphist/">http://ftp.arl.army.mil/~mike/comphist/</a>).</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/11/20/remembering-mike-muuss/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://ftp.arl.mil/~mike/ivy-mike.gif" medium="image">
			<media:title type="html">Mike Muuss</media:title>
		</media:content>
	</item>
		<item>
		<title>IP Packet Fragmentation, MTU, and MSS Tutorials</title>
		<link>http://irthoughts.wordpress.com/2009/11/16/ip-packet-fragmentation-mtu-and-mss-tutorials/</link>
		<comments>http://irthoughts.wordpress.com/2009/11/16/ip-packet-fragmentation-mtu-and-mss-tutorials/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 20:09:22 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1147</guid>
		<description><![CDATA[Two new tutorials on Internet Engineering are available now from Mi Islita.com:
IP Packet Fragmentation Tutorial [pdf]
MTU and MSS Tutorial [pdf]
Both are based on lecture material provided in class during the IE Part 1 course. These can be used as reference material for other courses such as Network Security, Internet Engineering, Internet Architecture, etc. 
The first [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1147&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Two new tutorials on Internet Engineering are available now from Mi Islita.com:</p>
<p><a href="http://www.miislita.com/internet-engineering/ip-packet-fragmentation-tutorial.pdf">IP Packet Fragmentation Tutorial</a> [pdf]</p>
<p><a href="http://www.miislita.com/internet-engineering/mtu-mss-tutorial.pdf">MTU and MSS Tutorial</a> [pdf]</p>
<p>Both are based on lecture material provided in class during the IE Part 1 course. These can be used as reference material for other courses such as Network Security, Internet Engineering, Internet Architecture, etc. </p>
<p>The first one is an introduction to packet fragmentation analysis while the second one delves into experimental techniques on maximum transmission unit and maximum segment size calculations.</p>
<p>The tutorials can be used to understand Ping of Death, Fragmentation Offset Attacks, Tiny Fragmentation Attacks, Firewall Fragmentation Attacks and other forms of hacks.</p>
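<p>The arithmetic behind MTU, MSS, and fragment offsets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 20-byte IPv4 header and a 20-byte TCP header with no options, and the hypothetical helper names <code>mss_for_mtu</code> and <code>fragment</code>. It is not taken from the tutorials themselves.</p>
<pre>
```python
IP_HEADER = 20   # bytes; assumes an IPv4 header without options
TCP_HEADER = 20  # bytes; assumes a TCP header without options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IP_HEADER - TCP_HEADER


def fragment(payload_len, mtu):
    """Split an IP payload into (offset, data_len, more_fragments) tuples.

    The offset is expressed in 8-byte units, as in the IPv4 header field.
    """
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    fragments = []
    offset = 0
    while payload_len - offset > max_data:
        fragments.append((offset // 8, max_data, True))
        offset += max_data
    # The last fragment carries the remainder and clears the MF flag.
    fragments.append((offset // 8, payload_len - offset, False))
    return fragments


print(mss_for_mtu(1500))
print(fragment(4000, 1500))
```
</pre>
<p>With a 1500-byte MTU this gives an MSS of 1460 bytes, and a 4000-byte IP payload splits into fragments carrying 1480, 1480, and 1040 data bytes at offsets 0, 185, and 370 (in 8-byte units).</p>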
<p>Enjoy it.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/11/16/ip-packet-fragmentation-mtu-and-mss-tutorials/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Global Terrorism Database</title>
		<link>http://irthoughts.wordpress.com/2009/11/03/global-terrorism-database/</link>
		<comments>http://irthoughts.wordpress.com/2009/11/03/global-terrorism-database/#comments</comments>
		<pubDate>Tue, 03 Nov 2009 05:00:06 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Homeland Security]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1143</guid>
		<description><![CDATA[If you are into homeland security oriented data mining, this post is for you.
The University of Maryland has a Global Terrorism Database (GTD; http://www.start.umd.edu/gtd/) with information on over 80,000 terrorist attacks that intelligence researchers can tap into.
GTD is an open-source database including information on terrorist events around the world from 1970 through 2007 (with annual [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1143&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If you are into homeland security oriented data mining, this post is for you.</p>
<p>The University of Maryland has a Global Terrorism Database (GTD; <a href="http://www.start.umd.edu/gtd/">http://www.start.umd.edu/gtd/</a>) with information on over 80,000 terrorist attacks that intelligence researchers can tap into.</p>
<p>GTD is an open-source database including information on terrorist events around the world from 1970 through 2007 (with annual updates planned for the future). Unlike many other event databases, the GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 80,000 cases.</p>
<p>You can search by keywords or browse by region, country, perpetrator, weapon, attack, or target.</p>
<p>It also has advanced search capabilities. To perform an advanced search, select the options you want within each category. If you leave a category with no options checked, the search includes all content from that category. For example, selecting Algeria from the &#8220;Country&#8221; list restricts your search to incidents in Algeria, while leaving it blank searches all countries.</p>
<p>Incident searches can be restricted to specific years using several pull-down menus.</p>
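<p>The filtering rule described above (a category with nothing selected places no restriction on the results) can be sketched as follows. This is not the GTD&#8217;s actual implementation or API; the function name <code>advanced_search</code>, the field names, and the sample records are all made-up placeholders for illustration.</p>
<pre>
```python
def advanced_search(records, **selected):
    """Keep records that match every category with a non-empty selection.

    Hypothetical sketch; not the real GTD search logic.
    """
    results = []
    for record in records:
        # An empty selection places no restriction on that category.
        if all(not choices or record.get(field) in choices
               for field, choices in selected.items()):
            results.append(record)
    return results


# Made-up placeholder records, not real GTD data.
records = [
    {"id": 1, "country": "Algeria", "year": 1995},
    {"id": 2, "country": "Peru", "year": 1995},
    {"id": 3, "country": "Algeria", "year": 2001},
]

# Restrict by country only.
algeria_only = advanced_search(records, country={"Algeria"})
# Leave the country blank and restrict by year.
all_countries_1995 = advanced_search(records, country=set(), year={1995})
print([r["id"] for r in algeria_only])
print([r["id"] for r in all_countries_1995])
```
</pre>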
<p>I tested it by querying [puerto rico] and was indeed able to obtain incident records related to Los Macheteros. The answer set, however, included results not relevant to the island of Puerto Rico.</p>
<p>The database is pretty small, but it can come in handy at times. I will definitely use it in one of my next graduate courses on search engine architectures.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/11/03/global-terrorism-database/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>2009-10-IRW: Network Connectivity</title>
		<link>http://irthoughts.wordpress.com/2009/10/29/2009-10-irw-network-connectivity/</link>
		<comments>http://irthoughts.wordpress.com/2009/10/29/2009-10-irw-network-connectivity/#comments</comments>
		<pubDate>Thu, 29 Oct 2009 16:49:45 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1140</guid>
		<description><![CDATA[
The October issue of the IRW newsletter is out, though late because of my academic duties. It should arrive in subscribers&#8217; inboxes today (or tomorrow at the latest). Sorry for the inconvenience.
In the featured article, I included material from one of my grad exams. The QA column covers IP-to-MAC conversions. The Who is Who section features [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1140&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img src="http://www.miislita.com/irw/network-connectivity.gif" alt="network connectivity" /></p>
<p>The October issue of the IRW newsletter is out, though late because of my academic duties. It should arrive in subscribers&#8217; inboxes today (or tomorrow at the latest). Sorry for the inconvenience.</p>
<p>In the featured article, I included material from one of my grad exams. The QA column covers IP-to-MAC conversions. The Who is Who section features Van Jacobson, creator of UDP-based traceroute and of many other tools.</p>
<p>Hopefully, the November issue will not be as delayed. BTW, it will feature the late Mike Muuss, inventor of Ping, plus a fast-track guide to the art of subnetting.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/10/29/2009-10-irw-network-connectivity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/network-connectivity.gif" medium="image">
			<media:title type="html">network connectivity</media:title>
		</media:content>
	</item>
		<item>
		<title>FIRE: Forum for Information Retrieval Evaluation</title>
		<link>http://irthoughts.wordpress.com/2009/10/14/fire-forum-for-information-retrieval-evaluation/</link>
		<comments>http://irthoughts.wordpress.com/2009/10/14/fire-forum-for-information-retrieval-evaluation/#comments</comments>
		<pubDate>Wed, 14 Oct 2009 17:26:24 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Conferences]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1132</guid>
		<description><![CDATA[Ellen Voorhees, Director of TREC at NIST.Gov, sent me this Call for Participation, reproduced below to facilitate its dissemination:
CALL FOR  PARTICIPATION
FIRE
(Forum for Information Retrieval Evaluation)
Workshop
DAIICT, Gandhinagar,  India
19-21 February  2010
http://www.isical.ac.in/~fire
The  success of TREC, CLEF, and NTCIR has clearly established the importance
of  building reusable, large-scale standard test collections in  Information
Access research. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1132&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Ellen Voorhees, Director of TREC at NIST.Gov, sent me this Call for Participation, reproduced below to facilitate its dissemination:</p>
<blockquote><p>CALL FOR  PARTICIPATION</p>
<p>FIRE<br />
(Forum for Information Retrieval Evaluation)<br />
Workshop<br />
DAIICT, Gandhinagar,  India<br />
19-21 February  2010</p>
<p><a href="http://www.isical.ac.in/%7Efire">http://www.isical.ac.in/~fire</a></p>
<p>The  success of TREC, CLEF, and NTCIR has clearly established the importance<br />
of  building reusable, large-scale standard test collections in  Information<br />
Access research. The aim of FIRE is to encourage research in  Indian language<br />
Information Access by creating a similar platform for Indian  languages that<br />
provides the data and a common forum for comparing models and  techniques.</p>
<p>The Tasks:<br />
==========<br />
1) Ad-hoc monolingual  document retrieval in Bengali, Hindi and Marathi.</p>
<p>2) Ad-hoc cross-lingual  document retrieval<br />
- documents in Bengali, Hindi, Marathi, and  English,<br />
- queries in Bengali, Hindi, Marathi, Tamil, Telugu and  English.<br />
- Bengali and Hindi topics will also be transliterated and made  available<br />
in Roman script. Adhoc monolingual task participants are  encouraged to<br />
submit runs using these queries as well.</p>
<p>3)  Retrieval and classification from mailing lists and forums.<br />
This is a  pilot task being offered by IBM India Research Lab.</p>
<p>4)  Ad-hoc  Wikipedia-entity retrieval from news documents<br />
- Entities mined from  English Wikipedia<br />
- Query documents from English news website<br />
This  is a pilot task being offered by Yahoo! Labs, Bangalore.</p>
<p>Important Dates:<br />
================<br />
Ad-hoc monolingual and cross-lingual document retrieval:<br />
Training data release: Aug 15 &#8217;09<br />
Test data release: Nov 01 &#8217;09<br />
Ad-hoc run submission: Nov 25 &#8217;09<br />
Results released: Feb 01 &#8217;10</p>
<p>Retrieval and classification from mailing lists and forums:<br />
Training data release: Oct 16 &#8217;09<br />
Test data release: Nov 01 &#8217;09<br />
Run submission: Nov 25 &#8217;09<br />
Results declared: Feb 01 &#8217;10</p>
<p>Ad-hoc Wikipedia-entity retrieval from news documents:<br />
Training data release: Oct 15 &#8217;09<br />
Test data release: Nov 01 &#8217;09<br />
Run submission: Nov 25 &#8217;09<br />
Results declared: Feb 01 &#8217;10</p>
<p>Task Co-ordinators:<br />
===================<br />
Ad-hoc  retrieval:<br />
Pushpak Bhattacharyya (<a href="mailto:pb@cse.iitb.ac.in">pb@cse.iitb.ac.in</a>)<br />
IIT Bombay<br />
Dipasree  Pal (<a href="mailto:dipasree_t@isical.ac.in">dipasree_t@isical.ac.in</a>)<br />
ISI  Kolkata</p>
<p>Retrieval and classification from mailing lists and  forums:<br />
Debapriyo Majumdar (<a href="mailto:debapriyo@in.ibm.com">debapriyo@in.ibm.com</a>)<br />
IBM India  Research Lab<br />
Ayan Bandyopadhyay (<a href="mailto:ayan_t@isical.ac.in">ayan_t@isical.ac.in</a>)<br />
ISI  Kolkata</p>
<p>Ad-hoc Wikipedia-entity retrieval from news documents:<br />
Ashwin  Tengli (<a href="mailto:ashwint@yahoo-inc.com">ashwint@yahoo-inc.com</a>)<br />
Yahoo! Labs,  Bangalore<br />
Pabitra Mitra (<a href="mailto:pabitra@cse.iitkgp.ernet.in">pabitra@cse.iitkgp.ernet.in</a>)<br />
IIT  Kharagpur</p>
<p>Overall co-ordinators:<br />
Prasenjit Majumder (<a href="mailto:p_majumder@daiict.ac.in">p_majumder@daiict.ac.in</a>)<br />
DAIICT,  Gandhinagar<br />
Mandar Mitra (<a href="mailto:mandar@isical.ac.in">mandar@isical.ac.in</a>)<br />
ISI  Kolkata</p>
<p>International Advisory Committee for  FIRE:<br />
==========================================<br />
Amit Singhal, Google  Fellow, USA<br />
Carol Peters, ISTI-CNR, Italy<br />
Christian Fluhr, CEA,  France<br />
Donna Harman, National Institute of Standards and Technology,  USA<br />
Doug Oard, University of Maryland, USA<br />
Ee Peng Lim, Nanyang  Technological University, Singapore<br />
Ellen Voorhees, National Institute of  Standards and Technology, USA<br />
Fabrizio Sebastiani, ISTI-CNR, Italy<br />
Gareth  Jones, Dublin City University, Ireland.<br />
Hsin-Hsi Chen, National Taiwan  University, Taipei, Taiwan<br />
Hwee Tou Ng, National University of Singapore,  Singapore<br />
Iadh Ounis, University of Glasgow, UK<br />
Ian Soboroff, National  Institute of Standards and Technology, USA<br />
Jacques Savoy, University of  Neuchatel, Switzerland<br />
James Allan, University of Massachusetts Amherst,  USA<br />
Krishna Kummamuru, IBM Research Lab, India<br />
Mark Sanderson, University  of Sheffield, UK<br />
Mun Kew Leong, Institute for Infocomm Research,  Singapore<br />
Norbert Fuhr, University of Duisburg, Germany<br />
Noriko Kando,  National Institute of Informatics, Japan<br />
Paul McNamee, Johns Hopkins  University, USA<br />
Prabhakar Raghavan, Yahoo! Research Labs, USA<br />
Ricardo  Baeza-Yates, Yahoo! Research Labs, Spain<br />
Stephen Robertson, Microsoft  Research, Cambridge, UK<br />
Sung Hyon Myaeng, KAIST, South Korea<br />
Tat-Seng  Chua, National University of Singapore, Singapore<br />
Tetsuya Sakai, Microsoft  Research Asia, Beijing</p></blockquote>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/10/14/fire-forum-for-information-retrieval-evaluation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>DNS Intelligence</title>
		<link>http://irthoughts.wordpress.com/2009/10/13/dns-intelligence/</link>
		<comments>http://irthoughts.wordpress.com/2009/10/13/dns-intelligence/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 14:03:24 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Graduate Courses]]></category>
		<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1130</guid>
		<description><![CDATA[Today&#8217;s Internet Engineering Part 1 course lecture will be on DNS Intelligence and how we can use DNS records to understand virus and worm attacks as well as remote network topologies. Quite handy these days.
Please check Lecture 8
]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Today&#8217;s Internet Engineering Part 1 course lecture will be on DNS Intelligence and how we can use DNS records to understand virus and worm attacks as well as remote network topologies. Quite handy these days.</p>
<p>Please check <a href="http://irthoughts.wordpress.com/2009/09/21/internet-engineering-i-course-lectures/">Lecture 8</a></p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/10/13/dns-intelligence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>The Danger of Microsoft: Data Lost</title>
		<link>http://irthoughts.wordpress.com/2009/10/11/the-danger-of-microsoft-data-lost/</link>
		<comments>http://irthoughts.wordpress.com/2009/10/11/the-danger-of-microsoft-data-lost/#comments</comments>
		<pubDate>Sun, 11 Oct 2009 12:14:16 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Human-Computer Interaction]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1123</guid>
		<description><![CDATA[According to a TechCrunch report of 10-10-09, a crash at Microsoft&#8217;s Danger servers resulted in the loss of all user personal data, and they don&#8217;t have a backup! 
The report says:
T-Mobile and Danger, the Microsoft-owned subsidiary that makes the Sidekick, has just announced that they’ve likely lost all user data that was being stored on Microsoft’s [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>According to a TechCrunch report of 10-10-09, <a href="http://www.techcrunch.com/2009/10/10/t-mobile-sidekick-disaster-microsofts-servers-crashed-and-they-dont-have-a-backup/">a crash at Microsoft&#8217;s Danger servers resulted in the loss of all user personal data and they don&#8217;t have a backup</a>! </p>
<p>The report says:</p>
<blockquote><p>T-Mobile and Danger, the Microsoft-owned subsidiary that makes the Sidekick, has just announced that they’ve likely lost all user data that was being stored on Microsoft’s servers due to a server failure. That means that any contacts, photos, calendars, or to-do lists that haven’t been locally backed up are gone.</p></blockquote>
<p>And there is no backup for the data. Really smart, Microsoft people. That says a lot!</p>
<p>This is gonna be in an information security textbook near you. How about in a textbook on Human-Computer <strong>No</strong>-Interaction?</p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/10/11/the-danger-of-microsoft-data-lost/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>All About Email Headers</title>
		<link>http://irthoughts.wordpress.com/2009/10/06/all-about-email-headers/</link>
		<comments>http://irthoughts.wordpress.com/2009/10/06/all-about-email-headers/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 19:22:11 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Graduate Courses]]></category>
		<category><![CDATA[Internet Engineering]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1118</guid>
		<description><![CDATA[If you are enrolled in the IE-Part 1 course, here is some reference material on Email Headers for today&#8217;s lecture:
Exposing email headers
http://www.abs-comptech.com/EmailHeaders.htm 
Tracking the source of email spam
http://www.rahul.net/falk/mailtrack.html 
How to read email headers
http://www.emailaddressmanager.com/tips/header.html 
Reading the email header
http://antivirus.about.com/od/windowsbasics/a/emailheaders.htm 
Reading email headers
http://www.tinhat.com/email/read_email_headers.html 
Spamlinks: Reading email headers
http://spamlinks.net/track-trace-headers.htm 
ACCC: Reading Email Headers
http://www.uic.edu/depts/accc/newsletter/adn29/headers.html 
E-mail Headers and SMTP Commands
http://www.avolio.com/columns/E-mailheaders.html 
All About [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If you are enrolled in the IE-Part 1 course, here is some reference material on Email Headers for today&#8217;s lecture:</p>
<p>Exposing email headers<br />
<a href="http://www.abs-comptech.com/EmailHeaders.htm">http://www.abs-comptech.com/EmailHeaders.htm</a></p>
<p>Tracking the source of email spam<br />
<a href="http://www.rahul.net/falk/mailtrack.html">http://www.rahul.net/falk/mailtrack.html</a></p>
<p>How to read email headers<br />
<a href="http://www.emailaddressmanager.com/tips/header.html">http://www.emailaddressmanager.com/tips/header.html</a></p>
<p>Reading the email header<br />
<a href="http://antivirus.about.com/od/windowsbasics/a/emailheaders.htm">http://antivirus.about.com/od/windowsbasics/a/emailheaders.htm</a></p>
<p>Reading email headers<br />
<a href="http://www.tinhat.com/email/read_email_headers.html">http://www.tinhat.com/email/read_email_headers.html</a></p>
<p>Spamlinks: Reading email headers<br />
<a href="http://spamlinks.net/track-trace-headers.htm">http://spamlinks.net/track-trace-headers.htm</a></p>
<p>ACCC: Reading Email Headers<br />
<a href="http://www.uic.edu/depts/accc/newsletter/adn29/headers.html">http://www.uic.edu/depts/accc/newsletter/adn29/headers.html</a></p>
<p>E-mail Headers and SMTP Commands<br />
<a href="http://www.avolio.com/columns/E-mailheaders.html">http://www.avolio.com/columns/E-mailheaders.html</a></p>
<p>All About Email Headers<br />
<a href="http://www.stopspam.org/index.php?option=com_content&amp;view=article&amp;id=45&amp;Itemid=56">http://www.stopspam.org/index.php?option=com_content&amp;view=article&amp;id=45&amp;Itemid=56</a></p>
<p>Security Optimization Strategies in the Workplace<br />
<a href="http://www.miislita.com/searchito/security-optimization-strategies.html">http://www.miislita.com/searchito/security-optimization-strategies.html</a></p>
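<p>As a rough illustration of what the references above walk through, here is a minimal Python sketch that pulls the Received lines out of a raw message using the standard-library email module. The sample message, host names, and IP addresses are made up for illustration (example.com names and documentation-range addresses), and the function name received_hops is a placeholder. Real headers vary by mail server and can be forged, so treat this only as a starting point.</p>

```python
from email import message_from_string

# Hypothetical raw message; hosts and IPs are placeholders, not real traffic.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [198.51.100.7])
    by mail.example.com with ESMTP id ABC123;
    Tue, 06 Oct 2009 15:20:05 -0400
Received: from sender-pc (unknown [192.0.2.44])
    by mx.example.net with SMTP id XYZ789;
    Tue, 06 Oct 2009 15:20:01 -0400
From: alice@example.net
To: bob@example.com
Subject: Header demo

Body text.
"""


def received_hops(raw):
    """Return the Received headers in travel order (origin first)."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received", [])
    # Each relay prepends its own Received line, so the list is reversed
    # to read from the sending host towards the final mail server.
    # Folded header lines are collapsed into a single line of text.
    return [" ".join(hop.split()) for hop in reversed(hops)]


if __name__ == "__main__":
    for number, hop in enumerate(received_hops(RAW_MESSAGE), start=1):
        print(number, hop)
```

<p>Reading the hops from the origin forward makes it easier to spot the first relay that does not belong to the claimed sender.</p>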
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/10/06/all-about-email-headers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Email Protocols</title>
		<link>http://irthoughts.wordpress.com/2009/10/05/email-protocols/</link>
		<comments>http://irthoughts.wordpress.com/2009/10/05/email-protocols/#comments</comments>
		<pubDate>Mon, 05 Oct 2009 11:31:12 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Graduate Courses]]></category>
		<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1115</guid>
		<description><![CDATA[If you are a student enrolled in the Internet Engineering I graduate course, check the Lecture 7 update.
We will be covering email protocols such as SMTP, POP3, and IMAP. The exercise section covers email header intelligence and email crawlers. 
]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If you are a student enrolled in the Internet Engineering I graduate course, check the <a href="http://irthoughts.wordpress.com/2009/09/21/internet-engineering-i-course-lectures/">Lecture 7</a> update.</p>
<p>We will be covering email protocols such as SMTP, POP3, and IMAP. The exercise section covers email header intelligence and email crawlers. </p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/10/05/email-protocols/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>DNS Configuration</title>
		<link>http://irthoughts.wordpress.com/2009/09/28/dns-configuration/</link>
		<comments>http://irthoughts.wordpress.com/2009/09/28/dns-configuration/#comments</comments>
		<pubDate>Mon, 28 Sep 2009 14:15:42 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Graduate Courses]]></category>
		<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1108</guid>
		<description><![CDATA[If you are a student enrolled in the Internet Engineering I graduate course, check the Lecture 6 update.
I will be covering DNS configuration files. For the hands-on exercise section, we will be using nslookup commands to look up all relevant records of remote Web domains.
Use nslookup/? to display the command-line options help
Use nslookup followed [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If you are a student enrolled in the <strong>Internet Engineering I</strong> graduate course, check the <a href="http://irthoughts.wordpress.com/2009/09/21/internet-engineering-i-course-lectures/">Lecture 6</a> update.</p>
<p>I will be covering DNS configuration files. For the hands-on exercise section, we will be using nslookup commands to look up all relevant records of remote Web domains.</p>
<p>Use <strong>nslookup/?</strong> to display the command-line options help<br />
Use <strong>nslookup</strong> followed by <strong>?</strong> on the next line (at the interactive prompt) to display the commands help<br />
To quit nslookup, press <strong>Ctrl+C</strong>, or type <strong>quit</strong> or <strong>exit</strong>.</p>
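<p>For the non-interactive mode, one way to script such lookups is to build the nslookup command line from Python. This is only a sketch: the helper names build_nslookup_command and run_lookup, the list of record types, and the example.com domain are placeholders chosen for illustration, and it assumes an nslookup that accepts the -type= option.</p>

```python
import subprocess

# Record types commonly queried when surveying a domain (illustrative list).
RECORD_TYPES = ("A", "NS", "MX", "SOA", "TXT")


def build_nslookup_command(domain, record_type="A", server=None):
    """Build a non-interactive nslookup command as an argument list."""
    record_type = record_type.upper()
    if record_type not in RECORD_TYPES:
        raise ValueError("unsupported record type: " + record_type)
    command = ["nslookup", "-type=" + record_type, domain]
    if server is not None:
        # An optional last argument names the DNS server to query.
        command.append(server)
    return command


def run_lookup(domain, record_type="A", server=None):
    """Run nslookup and return its text output (needs nslookup and a network)."""
    command = build_nslookup_command(domain, record_type, server)
    result = subprocess.run(command, capture_output=True, text=True, timeout=15)
    return result.stdout


if __name__ == "__main__":
    # Only print the commands; call run_lookup() to actually query DNS.
    for rtype in RECORD_TYPES:
        print(" ".join(build_nslookup_command("example.com", rtype)))
```

<p>Passing the command as a list rather than a single string avoids shell quoting problems when the domain comes from user input.</p>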
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/09/28/dns-configuration/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Migrating from IPv4 to IPv6: The Next Nightmare?</title>
		<link>http://irthoughts.wordpress.com/2009/09/24/migrating-from-ipv4-to-ipv6-the-next-nightmare/</link>
		<comments>http://irthoughts.wordpress.com/2009/09/24/migrating-from-ipv4-to-ipv6-the-next-nightmare/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 20:56:51 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1105</guid>
		<description><![CDATA[Two weeks ago, the venerable Vinton Cerf urged the Internet community to migrate from IPv4 to IPv6. According to Cerf, co-designer of the TCP/IP protocols, IPv4 will run out of addresses next year or in early 2011.
However, there is a problem.
Back in March, an alleged Fatal Flaw for IPv6: it&#8217;s Not Backwards Compatible was reported.
Both [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Two weeks ago, the venerable <a href="http://www.computerworld.com/s/article/9138187/Internet_pioneer_Cerf_urges_IPv6_migrations?taxonomyId=71">Vinton Cerf urged the Internet community to migrate from IPv4 to IPv6</a>. According to Cerf, co-designer of the TCP/IP protocols, IPv4 will run out of addresses next year or in early 2011.</p>
<p>However, there is a problem.</p>
<p>Back in March, an alleged <a href="http://www.cio.com/article/486610/Fatal_Flaw_for_IPv6_it_s_Not_Backwards_Compatible?page=1&amp;taxonomyId=1413">Fatal Flaw for IPv6: it&#8217;s Not Backwards Compatible</a> was reported.</p>
<p>Both news items are equally intriguing.</p>
<p>IPv6 migration: Your Next Nightmare?</p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/09/24/migrating-from-ipv4-to-ipv6-the-next-nightmare/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Internet Engineering I: Course Lectures</title>
		<link>http://irthoughts.wordpress.com/2009/09/21/internet-engineering-i-course-lectures/</link>
		<comments>http://irthoughts.wordpress.com/2009/09/21/internet-engineering-i-course-lectures/#comments</comments>
		<pubDate>Mon, 21 Sep 2009 12:19:05 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Graduate Courses]]></category>
		<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1100</guid>
		<description><![CDATA[The following are the lecture and exercise topics covered in the PUPR.edu core graduate course Internet Engineering, Part I. Students enrolled in the course might want to revisit this post as it will be updated.
Lecture 0
History of the Internet &#38; Search Engines
Internet Basics
Lecture 1
RFCs (Request for Comments)
Network Types
IP (Internet Protocol)
Exercise 1 &#8211; RFCs, Network types, [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>The following are the lecture and exercise topics covered in the PUPR.edu core graduate course <strong>Internet Engineering</strong>, Part I. Students enrolled in the course might want to revisit this post as it will be updated.</p>
<p><strong>Lecture 0</strong></p>
<p>History of the Internet &amp; Search Engines</p>
<p>Internet Basics</p>
<p><strong>Lecture 1</strong></p>
<p>RFCs (Request for Comments)</p>
<p>Network Types</p>
<p>IP (Internet Protocol)</p>
<p>Exercise 1 &#8211; RFCs, Network types, IP calculations</p>
<p><strong>Lecture 2</strong></p>
<p>OSI Reference Model</p>
<p>ARP</p>
<p>ICMP</p>
<p>Exercise 2 &#8211; IP-MAC Mapping, Prompt Commands (arp, ipconfig, nslookup)</p>
<p><strong>Lecture 3</strong></p>
<p>Man-in-the-Middle ARP Attacks</p>
<p>IGMP</p>
<p>IP Packets</p>
<p>Exercise 3 &#8211; Broadcast &amp; Multicast IPs, Prompt Commands (netstat, ping, tracert, ipconfig, arp, nslookup)</p>
<p><strong>Lecture 4</strong></p>
<p>Fragmentation Offset</p>
<p>FO Overlapping Attacks</p>
<p>FO Gap Attacks</p>
<p>Tiny FO Attacks</p>
<p>TCP Protocol &amp; Buffers</p>
<p>Exercise 4 &#8211; TCP buffers, Congestion Windows, Advertised Windows</p>
<p><strong>Lecture 5</strong></p>
<p>PING</p>
<p>PING of Death</p>
<p>Smurfing</p>
<p>TRACEROUTE-based Intelligence</p>
<p>Exercise 5 &#8211; Prompt Commands (arp, ipconfig, nslookup, netstat, ping, tracert)</p>
<p><strong>Lecture 6</strong></p>
<p>BIND &amp; Windows DNS (Domain Name System) servers</p>
<p>Internet backbone root servers</p>
<p>Configuration Files</p>
<p>DNS Configuration Errors</p>
<p>Forward Lookup (Zone) Files</p>
<p>Reverse Lookup Files</p>
<p>Exercise 6 &#8211; Prompt Commands (interactive/non-interactive nslookup modes)</p>
<p><strong>Lecture 7</strong></p>
<p>SMTP</p>
<p>POP3</p>
<p>IMAP</p>
<p>Email Headers</p>
<p>Exercise 7 &#8211; Email Intelligence.</p>
<p><strong>Lecture 8</strong></p>
<p>DNS Intelligence</p>
<p>Using DNS records to understand Virus &amp; Worm Attacks</p>
<p>Network Topology Intelligence from DNS records</p>
<p>Exercise 8 &#8211; DNS Intelligence</p>
<p><strong>Lecture 9</strong></p>
<p>General Review</p>
<p>Practice Test</p>
<p><strong>Lecture 10</strong></p>
<p>Final Exam, Oct 27</p>
<p><strong>Course Grading System</strong></p>
<p>8 out of 9 hands-on exercises count (the worst exercise grade is dropped)<br />
1st partial exam = average of the first 4 of the best 8 exercise grades<br />
2nd partial exam = average of the last 4 of the best 8 exercise grades<br />
The average of these two is the same as adding up the best 8 grades and dividing by 8. This result amounts to 75% of the total grade (course letter grade score).</p>
<p>The Final Exam amounts to 25% of the total grade.</p>
<p>After that, course letter grade is curved as shown below.</p>
<p>A (100-89%)<br />
B (88-77%)<br />
C (76-60%)<br />
D (59-50%)<br />
F (49-0%)</p>
<p>where</p>
<p>course letter grade score = (sum of best 8 exercise grades/8)*(0.75) + (final exam grade)*(0.25)</p>
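<p>The formula above can be sketched in a few lines of Python. The function names and the sample grades are made up for illustration, and treating the lower bound of each band as the cutoff (so that a fractional score such as 88.5 falls in B) is an assumption, since the bands are listed only as whole percentages.</p>

```python
def course_score(exercise_grades, final_exam):
    """Combine 9 exercise grades and the final exam into the course score."""
    if len(exercise_grades) != 9:
        raise ValueError("expected 9 exercise grades")
    # Drop the single worst exercise and average the remaining 8.
    best_eight = sorted(exercise_grades, reverse=True)[:8]
    exercise_average = sum(best_eight) / 8
    # Exercises weigh 75% and the final exam 25% of the total.
    return exercise_average * 0.75 + final_exam * 0.25


def letter_grade(score):
    """Map a score to a letter; lower-bound cutoffs are an assumption."""
    if score >= 89:
        return "A"
    if score >= 77:
        return "B"
    if score >= 60:
        return "C"
    if score >= 50:
        return "D"
    return "F"


if __name__ == "__main__":
    grades = [90, 85, 100, 70, 95, 80, 60, 88, 92]  # made-up sample grades
    score = course_score(grades, final_exam=84)
    print(score, letter_grade(score))
```

<p>With these sample grades the 60 is dropped, the remaining eight average 87.5, and the final score works out to 86.625.</p>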
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/09/21/internet-engineering-i-course-lectures/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>2009-9-IRW: TCP/IP Practice Exam</title>
		<link>http://irthoughts.wordpress.com/2009/09/12/tcp-ip-practice-exa/</link>
		<comments>http://irthoughts.wordpress.com/2009/09/12/tcp-ip-practice-exa/#comments</comments>
		<pubDate>Sat, 12 Sep 2009 18:24:05 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1093</guid>
		<description><![CDATA[
The current issue of IR Watch is out. Sorry it was a bit delayed. The featured article is a practice exam on TCP/IP that I&#8217;m giving to students enrolled in my Internet Engineering I graduate course.
The test was designed to review what students have learned during the first five lectures. Students need to describe about [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://www.miislita.com/irw/tcp-ip-exam.gif" alt="TCP/IP Review Test" /></p>
<p>The current issue of IR Watch is out. Sorry it was a bit delayed. The featured article is a practice exam on TCP/IP that I&#8217;m giving to students enrolled in my <strong>Internet Engineering I</strong> graduate course.</p>
<p>The test was designed to review what students have learned during the first five lectures. Students need to describe about 10 TCP/IP-related vulnerability/hacking practices. So the test is also a great jump start for those interested in such weaknesses.</p>
<p>I have included an Excel goodie for making IP conversions (IPv4/hexadecimal/decimal equivalent/binary) as well as some material from Tim Berners-Lee&#8217;s 1989 WWW proposal.</p>
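<p>For readers without Excel, the same kind of conversion can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet from the newsletter; the function name <code>ipv4_forms</code> and the sample address are assumptions made for the example.</p>

```python
import ipaddress


def ipv4_forms(dotted):
    """Return decimal, hexadecimal, and binary forms of a dotted-quad IPv4 address."""
    addr = ipaddress.IPv4Address(dotted)  # raises ValueError on malformed input
    value = int(addr)                     # 32-bit decimal equivalent
    return {
        "dotted": str(addr),
        "decimal": value,
        "hex": format(value, "08X"),
        # addr.packed is the 4-byte form; each byte becomes one 8-bit group
        "binary": ".".join(format(octet, "08b") for octet in addr.packed),
    }


# Hypothetical sample address, chosen only for illustration.
print(ipv4_forms("192.168.1.10"))
```

<p>The decimal equivalent is simply the four octets read as one 32-bit number, which is why the conversion can be reversed by passing that number back to <code>ipaddress.IPv4Address</code>.</p>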
<p>Enjoy it.</p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1093/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1093/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1093/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1093/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1093/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1093/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1093/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1093/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1093/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1093/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1093&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/09/12/tcp-ip-practice-exa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/tcp-ip-exam.gif" medium="image">
			<media:title type="html">TCP/IP Review Test</media:title>
		</media:content>
	</item>
		<item>
		<title>New Graduate Courses</title>
		<link>http://irthoughts.wordpress.com/2009/09/01/new-graduate-courses/</link>
		<comments>http://irthoughts.wordpress.com/2009/09/01/new-graduate-courses/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 13:17:05 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Graduate Courses]]></category>
		<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1087</guid>
		<description><![CDATA[As PUPR students know by now, the AIRWeb and Internet Engineering courses have been consolidated into a single course called Internet Engineering I (IE-I), which meets on Tuesdays.
This was a decision made strictly by the administration. Twelve graduate students are enrolled, a big number for a grad course. We are now in the fourth week [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1087&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>As PUPR students know by now, the AIRWeb and Internet Engineering courses have been consolidated into a single course called <strong>Internet Engineering I (IE-I), </strong>which meets on Tuesdays.</p>
<p>This was a decision made strictly by the administration. Twelve graduate students are enrolled, a big number for a grad course. We are now in the fourth week of IE-I and I can tell you it is a lot of fun.</p>
<p>This coming Winter semester I&#8217;m scheduled to teach a new grad course called <strong>Advanced Search Engine Architecture (ASEA). </strong>Both IE-I and ASEA are hands-on. This means students need to get their hands dirty, not just learn the theory.</p>
<p>What we are trying to accomplish in IE-I is to understand how hackers and spammers use Internet architectures at the level of TCP/IP and Search Engines to game the system. I&#8217;ll open a special blog category for it during the week.</p>
<p>The first lecture (Lecture 1) was briefly summarized in the August 2009 issue of IR Watch. BTW, tonight&#8217;s lecture (Lecture 4) covers the following:</p>
<p>IP Protocol (MAC and IP Mapping)</p>
<p>ICMP Protocol</p>
<p>ARP Hacking Attacks</p>
<p>ICMP Hacking Attacks</p>
<p>Firewall Fragmentation Offset Attacks</p>
<p>Meanwhile, ASEA is an expanded version of the Search Engine Architecture (SEA) course I&#8217;ve taught before. Students interested in registering can search this blog for the SEA category and check what we have covered in the past. This will give them an idea of what to expect from the Advanced SEA course. One thing I&#8217;m planning to do differently is to build an inverted index from scratch using AJAX. The most recent version of Terrier will also be used for testing/benchmarking experiments.</p>
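<p>For those who have not seen one, an inverted index maps each term to the documents that contain it. The sketch below is written in Python rather than AJAX and is not the course implementation; the names <code>build_inverted_index</code> and <code>search</code>, the tokenizer, and the three sample documents are all assumptions made for illustration.</p>

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenizer (an assumption): lowercase runs of letters and digits.
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)


def search(index, query):
    """Return the sorted doc ids that contain every query term (boolean AND)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    postings = [set(index.get(term, {})) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical mini-collection, for illustration only.
docs = {
    1: "TCP/IP and search engines",
    2: "Search engine architecture",
    3: "IP fragmentation",
}
index = build_inverted_index(docs)
print(search(index, "search"))
print(search(index, "search ip"))
```

<p>Storing term frequencies in the postings, rather than bare document ids, keeps the door open for term weighting later on.</p>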
<p>Last but not least, the September issue of IRW will be a bit delayed.</p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1087/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1087/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1087/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1087/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1087/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1087/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1087/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1087/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1087/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1087/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1087&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/09/01/new-graduate-courses/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>TCP/IP: The paper that started it all</title>
		<link>http://irthoughts.wordpress.com/2009/08/25/tcpip-the-paper-that-started-it-all/</link>
		<comments>http://irthoughts.wordpress.com/2009/08/25/tcpip-the-paper-that-started-it-all/#comments</comments>
		<pubDate>Tue, 25 Aug 2009 05:00:44 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Internet Engineering]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1085</guid>
		<description><![CDATA[Here is the paper that started it all: A Protocol for Packet Network Intercommunication, by Vint Cerf and Bob Kahn.
       <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1085&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Here is the paper that started it all: <a href="http://www.cs.princeton.edu/courses/archive/fall06/cos561/papers/cerf74.pdf">A Protocol for Packet Network Intercommunication</a>, by Vint Cerf and Bob Kahn.</p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1085/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1085/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1085/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1085/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1085/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1085/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1085/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1085/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1085/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1085/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1085&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/08/25/tcpip-the-paper-that-started-it-all/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>2009-8-IRW: Internet &amp; Search Engines: Early Days</title>
		<link>http://irthoughts.wordpress.com/2009/08/14/2009-8-irw-internet-search-engines-early-days/</link>
		<comments>http://irthoughts.wordpress.com/2009/08/14/2009-8-irw-internet-search-engines-early-days/#comments</comments>
		<pubDate>Fri, 14 Aug 2009 12:34:10 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Newsletters]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1083</guid>
		<description><![CDATA[
The current issue of IRW is already out &#8211;a bit delayed due to reasons previously mentioned. Enjoy it.
       <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1083&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://www.miislita.com/irw/internet-search-engines.gif" alt="Internet &amp; Search Engines" /></p>
<p>The current issue of IRW is already out &#8211;a bit delayed due to reasons previously mentioned. Enjoy it.</p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1083/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1083/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1083/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1083/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1083/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1083/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1083/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1083/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1083/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1083/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1083&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/08/14/2009-8-irw-internet-search-engines-early-days/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/internet-search-engines.gif" medium="image">
			<media:title type="html">Internet &#38; Search Engines</media:title>
		</media:content>
	</item>
		<item>
		<title>Vector Notation</title>
		<link>http://irthoughts.wordpress.com/2009/08/10/vector-notation/</link>
		<comments>http://irthoughts.wordpress.com/2009/08/10/vector-notation/#comments</comments>
		<pubDate>Mon, 10 Aug 2009 16:00:06 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Latent Semantic Indexing]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1079</guid>
		<description><![CDATA[I&#8217;ve been asked what the standard notation for vectors is. I normally use loose notation, unless I need to write or review a formal piece, in which case I follow the APS style. See also here.
A vector should be represented by a letter, in boldface or with a right arrow on top.
A caret should be [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1079&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve been asked what the standard notation for vectors is. I normally use loose notation, unless I need to write or review a formal piece, in which case I follow the <a href="http://forms.aps.org/author/vecnot-prae.pdf">APS style</a>. See also <a href="http://forms.aps.org/author/h7vectors.pdf">here</a>.</p>
<p>A vector should be represented by a letter, in boldface or with a right arrow on top.</p>
<p>A caret should be used to indicate a unit vector.</p>
<p>An inner product should be indicated by placing a dot between two letters representing vectors.</p>
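<p>As a quick illustration of the three rules above, here is how they might be typeset in LaTeX. The symbols a, b and the dimension n are placeholders chosen for this example, not taken from the APS documents.</p>

```latex
% A vector: boldface letter, or a letter with a right arrow on top
\mathbf{a} \quad \text{or} \quad \vec{a}

% A unit vector: marked with a caret
\hat{\mathbf{a}} = \frac{\mathbf{a}}{|\mathbf{a}|}

% An inner product: a dot between the two vectors
\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i
```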
<p>Note that Dirac Notation is a different animal.</p>
<p>The <a href="http://forms.aps.org/author/styleguide.pdf">APS Style Guide</a> has additional guidelines.</p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1079/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1079/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1079/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1079/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1079/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1079/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1079/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1079/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1079/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1079/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1079&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/08/10/vector-notation/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Thesaurus as a Complex Network</title>
		<link>http://irthoughts.wordpress.com/2009/08/06/thesaurus-as-a-complex-network/</link>
		<comments>http://irthoughts.wordpress.com/2009/08/06/thesaurus-as-a-complex-network/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 14:04:35 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[IR Tutorials]]></category>
		<category><![CDATA[Latent Semantic Indexing]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1070</guid>
		<description><![CDATA[I came across Thesaurus as a complex network, a fascinating 2003 paper written by Adriano de Jesus Holanda, Ivan Torres Pisa, Osame Kinouchi, Alexandre Souto Martinez and Evandro Eduardo Seron Ruiz from Universidade de São Paulo, Brazil, in which they model thesauri using graph theory. The abstract reads:
&#8220;A thesaurus is one, out of many, possible representations [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1070&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I came across <a href="http://arxiv.org/PS_cache/cond-mat/pdf/0312/0312586v1.pdf">Thesaurus as a complex network</a>, a fascinating 2003 paper written by Adriano de Jesus Holanda, Ivan Torres Pisa, Osame Kinouchi, Alexandre Souto Martinez and Evandro Eduardo Seron Ruiz from Universidade de São Paulo, Brazil, in which they model thesauri using graph theory. The abstract reads:</p>
<p>&#8220;A thesaurus is one, out of many, possible representations of term (or word) connectivity. The terms of a thesaurus are seen as the nodes and their relationship as the links of a directed graph. The directionality of the links retains all the thesaurus information and allows the measurement of several quantities. This has lead to a new term classification according to the characteristics of the nodes, for example, nodes with no links in, no links out, etc. Using an electronic available thesaurus we have obtained the incoming and outgoing link distributions. While the incoming link distribution follows a stretched exponential function, the lower bound for the outgoing link distribution has the same envelope of the scientific paper citation distribution proposed by Albuquerque and Tsallis [1]. However, a better fit is obtained by simpler function which is the solution of Ricatti’s differential equation. We conjecture that this differential equation is the continuous limit of a stochastic growth model of the thesaurus network. We also propose a new manner to arrange a thesaurus using the “inversion method”.&#8221;</p>
<p>The study is important because it provides an interesting look at word relationships. They have identified an underlying power law, which in my opinion is worth investigating to see whether it lies at the core of semantic relationships.</p>
<p>They briefly mentioned the limitations of LSA:</p>
<p>&#8220;However, LSA has been criticized as a poor approach for predicting semantic neighborhood&#8221;.</p>
<p>Indeed, LSA (or LSI) does not necessarily describe or predict semantics, as originally thought. In my view, LSA/LSI itself is a misnomer. Research references can be provided to support this view.</p>
<p>I do have one additional comment on the paper. In it, LSA is described as a PCA technique. The authors write:</p>
<p>&#8220;Another interesting way to treat data is the Latent Semantic Analysis (LSA) [5] which deals with word covariance in a corpus. LSA is a principal component analysis (PCA) technique , i.e., the covariance matrix is diagonalized and from the most important eigenvalues (around 300) the eigenvectors are considered to span an Euclidean vector space.&#8221;</p>
<p>This might not be entirely accurate. Let&#8217;s see why:</p>
<p>1. PCA was invented by Karl Pearson in 1901, so it is more than half a century older than Golub and Kahan&#8217;s SVD algorithm, which was published in 1965. See G. Golub and W. Kahan, J. SIAM Numer. Anal. Ser. B, Vol. 2, No. 2 (1965).</p>
<p>2. In 1988 Dumais et al. applied Golub&#8217;s SVD to text and called that LSA (LSI). See Proceedings of the Conference on Human Factors in Computing Systems, CHI. 281-286, Dumais, S. T., Furnas, G. W., Landauer, T. K., Deerwester, S. &amp; Harshman, R. (1988). See also, Improving information retrieval using Latent Semantic Indexing. Proceedings of the 1988 annual meeting of the American Society for Information Science. Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W., &amp; Beck, L. (1988).</p>
<p>3. In LSA (LSI) the SVD algorithm can be applied to matrices that are not necessarily populated with covariance values.</p>
<p>4. It was only later realized that SVD can be applied to a covariance matrix to obtain the PCA components.</p>
<p>5. See the <a href="http://www.miislita.com/information-retrieval-tutorial/pca-spca-tutorial.pdf">PCA &amp; SPCA Tutorial</a></p>
<p style="text-align:left;">6. PCA is not LSI. See <a href="http://irthoughts.wordpress.com/2007/05/05/pca-is-not-lsi/">http://irthoughts.wordpress.com/2007/05/05/pca-is-not-lsi/</a></p>
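<p>The distinction in points 3 and 4 can be written out explicitly. The following is a sketch in standard textbook notation, not notation taken from the paper: X is assumed to be an n-by-m data matrix with one observation per row, X_c is X with each column mean subtracted, and A is a term-document matrix.</p>

```latex
% PCA through SVD: decompose the column-centered data matrix
X_c = U \Sigma V^{T}

% The covariance matrix is then diagonalized by the same V
C = \frac{1}{n-1} X_c^{T} X_c = V \, \frac{\Sigma^{2}}{n-1} \, V^{T}

% so the principal axes are the columns of V, with eigenvalues
\lambda_i = \frac{\sigma_i^{2}}{n-1}

% LSI: truncated SVD of the term-document matrix itself, with no centering
A \approx A_k = U_k \Sigma_k V_k^{T}
```

<p>The two procedures share the SVD machinery but start from different matrices, which is why describing LSA as a PCA technique is not entirely accurate.</p>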
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1070/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1070/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1070/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1070/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1070/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1070/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1070/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1070/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1070/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1070/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1070&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/08/06/thesaurus-as-a-complex-network/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Random Notes Before School Starts</title>
		<link>http://irthoughts.wordpress.com/2009/08/03/random-notes-before-school-starts/</link>
		<comments>http://irthoughts.wordpress.com/2009/08/03/random-notes-before-school-starts/#comments</comments>
		<pubDate>Mon, 03 Aug 2009 17:43:41 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Latent Semantic Indexing]]></category>
		<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Vector Space Models]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1066</guid>
		<description><![CDATA[1. The current issue of IR Watch will be out over the weekend&#8211;a bit delayed due to getting ready for school, preparing lessons and research projects. If things go as expected, my academic schedule will be a bit busy between teaching and research at two different universities.
2. I&#8217;m researching for a manuscript that deals with [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1066&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>1. The current issue of IR Watch will be out over the weekend&#8211;a bit delayed due to getting ready for school, preparing lessons and research projects. If things go as expected, my academic schedule will be a bit busy between teaching and research at two different universities.</p>
<p>2. I&#8217;m researching for a manuscript that deals with affine transformations applied to several IR problems. It expands on Vector Space Theory and allows one to think out of the &#8220;term-document&#8221; box. Great stuff.</p>
<p>3. Here is a great grad project in ppt format: <a href="http://www.cse.iitb.ac.in/~rohitashwa/btp/Final%20Presentation/Semantically%20Motivated%20Information%20Retrieval.ppt">Semantically Motivated Information Retrieval</a>. I thank its author for referencing my <a href="http://www.miislita.com/information-retrieval-tutorial/singular-value-decomposition-fast-track-tutorial.pdf"> SVD Fast Track Tutorial</a>.</p>
<p>4. Talking about semantically motivated, sentiment analysis, spam, etc&#8230; Funny how some folks in the SEO world like to damage the reputation of others without presenting any evidence. This time the trolls took on Kim Krause Berge ( <a href="http://cre8pc.com/archives/1489">http://cre8pc.com/archives/1489</a> ). I have always admired Kim&#8217;s work, consider her a usability icon, and had the privilege of meeting her back in 2005. I was surprised to see these folks having a field day at her expense at Rand&#8217;s site. Kim, I feel your pain. However, more than one SEO forum/blog has lost credibility by allowing these folks, most of whom think they can be socially &#8220;ranked&#8221; by attacking whoever is at the &#8220;top&#8221;. The fact is that most trolls are paper tigers that go into hiding at the first Cease &amp; Desist or defamation lawsuit.</p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1066/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1066/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1066/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1066/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1066/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1066/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1066/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1066/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1066/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1066/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1066&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/08/03/random-notes-before-school-starts/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Data Mining Poetry</title>
		<link>http://irthoughts.wordpress.com/2009/07/28/data-mining-poetry/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/28/data-mining-poetry/#comments</comments>
		<pubDate>Tue, 28 Jul 2009 13:50:25 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1058</guid>
		<description><![CDATA[I am intrigued with the subject of data mining poetry. This is an interesting topic for a grad student thesis since:
Querying search engines in EXACT mode for &#8220;data mining poetry&#8221; returns a small answer set.
Unlike other types of content, all words, including those considered stopwords, might matter; i.e., these must be counted, as they might act as content-bearing [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1058&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I am intrigued with the subject of data mining poetry. This is an interesting topic for a grad student thesis since:</p>
<p>Querying search engines in EXACT mode for &#8220;data mining poetry&#8221; returns a small answer set.</p>
<p>Unlike other types of content, all words, including those considered stopwords, might matter; i.e., these must be counted, as they might act as content-bearing terms. Thus, there is no such thing as a stopword in poetry.</p>
<p>Word statistics (e.g., word counts per line) and specific tokens matter, unless we talk about so-called free-style poetry.</p>
<p>Meter makes poetry suitable for building language-specific and writing style-specific parsers.</p>
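<p>As a starting point, the word statistics mentioned above can be sketched in Python. This is a minimal illustration, not a poetry parser; the function name <code>line_stats</code>, the punctuation handling, and the two-line sample text (made up for this example) are assumptions. Note that stopwords are deliberately kept.</p>

```python
from collections import Counter


def line_stats(poem):
    """Return per-line word counts and token frequencies, keeping stopwords."""
    # Split into non-empty lines, then split each line on whitespace.
    lines = [line.split() for line in poem.splitlines() if line.strip()]
    words_per_line = [len(tokens) for tokens in lines]
    # Lowercase and trim common punctuation; no stopword list is applied.
    freq = Counter(
        token.lower().strip(".,;:!?") for tokens in lines for token in tokens
    )
    return words_per_line, freq


# Made-up two-line sample, for illustration only.
poem = "The sum of the parts\nis the whole of the heart"
counts, freq = line_stats(poem)
print(counts)
print(freq.most_common(3))
```

<p>In this sample the most frequent token is a stopword, which is exactly the kind of signal a conventional stopword filter would throw away.</p>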
<p>Any help will be greatly appreciated. Meanwhile, here are some relevant links:</p>
<p><a href="http://students.cec.wustl.edu/~jcd1/parser">poetry parser, anyone?</a></p>
<p><a href="http://www.google.com/search?q=poema materotico">Poema Materotico</a> &#8211; This is a blog-popular poem written in Spanish</p>
<p><a href="http://www.google.com/search?q=poemas matematicos">poemas matematicos</a> &#8211; Spanish resources</p>
<p><a href="http://www.google.com/search?q=mathematical poems">mathematical poems</a></p>
<p><a href="http://www.google.com/search?q=mathematical poetry">mathematical poetry</a></p>
  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/irthoughts.wordpress.com/1058/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/irthoughts.wordpress.com/1058/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/irthoughts.wordpress.com/1058/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/irthoughts.wordpress.com/1058/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/irthoughts.wordpress.com/1058/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/irthoughts.wordpress.com/1058/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/irthoughts.wordpress.com/1058/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/irthoughts.wordpress.com/1058/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/irthoughts.wordpress.com/1058/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/irthoughts.wordpress.com/1058/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1058&subd=irthoughts&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/28/data-mining-poetry/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Hacking the Cloud: Getting Google&#8217;s User Data by Hacking Twitter</title>
		<link>http://irthoughts.wordpress.com/2009/07/17/hacking-the-cloud-getting-googles-user-data-by-hacking-twitter/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/17/hacking-the-cloud-getting-googles-user-data-by-hacking-twitter/#comments</comments>
		<pubDate>Fri, 17 Jul 2009 13:12:02 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Homeland Security]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1052</guid>
		<description><![CDATA[A day ago Michael Arrington&#8217;s TechCrunch published excerpts from &#8220;leaked&#8221; documents stolen from the Google Apps account of a Twitter employee, which included over 300 confidential files meant for &#8220;internal&#8221; Twitter consumption. &#8220;Hacker Croll&#8221; sent TechCrunch a zip file with 310 private files from inside Twitter.
(http://www.techtree.com/India/News/Leaked_Documents_Twitter_TechCrunch_Faceoff/551-104503-643.html).
It appears HC essentially used a cracker tool of some [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=1052&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>A day ago Michael Arrington&#8217;s TechCrunch published excerpts from &#8220;leaked&#8221; documents stolen from the Google Apps account of a Twitter employee, which included over 300 confidential files meant for &#8220;internal&#8221; Twitter consumption. &#8220;Hacker Croll&#8221; sent TechCrunch a zip file with 310 private files from inside Twitter.<br />
(<a href="http://www.techtree.com/India/News/Leaked_Documents_Twitter_TechCrunch_Faceoff/551-104503-643.html">http://www.techtree.com/India/News/Leaked_Documents_Twitter_TechCrunch_Faceoff/551-104503-643.html</a>).</p>
<p>It appears HC essentially used a cracking tool of some sort to guess weak passwords by brute force. Once inside the first security ring, &#8230;</p>
<p>Cloud Programs: A Web Vulnerability Paradise for Hackers</p>
<p>Twitter relies heavily on cloud-based apps (Web-centric programs such as Google Docs or Web-based e-mail), and these services are becoming increasingly interconnected. Even social Web apps are beginning to share data: Facebook Connect and Google Friend Connect, for example, let you log in to multiple sites with a simple Facebook or Google account, raising the vulnerability of your entire online identity.<br />
(<a href="http://www.switched.com/2009/07/17/twitter-employee-accounts-hacked-business-documents-leaked/">http://www.switched.com/2009/07/17/twitter-employee-accounts-hacked-business-documents-leaked/</a>)</p>
<p>The documents released by the hacker seem to be pretty significant. The &#8220;problem&#8221; is that if a Google Apps email account is compromised, so are the shared Calendar, Docs, Contacts, Wikis (Sites), etc. tied to it.<br />
(<a href="http://www.pcworld.com/article/168572/google_apps_security_questioned_after_twitter_leak.html">http://www.pcworld.com/article/168572/google_apps_security_questioned_after_twitter_leak.html</a>)</p>
<p>This might be a good case study for students planning to take the <a href="http://www.miislita.com/courses/airweb-web-spam-syllabus.pdf">AIR Web: Web Spam and Internet Vulnerability course.</a></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/17/hacking-the-cloud-getting-googles-user-data-by-hacking-twitter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Google Voice: A Call to Rule all Voice-based Services</title>
		<link>http://irthoughts.wordpress.com/2009/07/16/google-voice-a-call-to-rule-all-voice-based-services/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/16/google-voice-a-call-to-rule-all-voice-based-services/#comments</comments>
		<pubDate>Thu, 16 Jul 2009 05:00:28 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1045</guid>
		<description><![CDATA[Google yesterday announced Google Voice for Android and BlackBerry, a mobile service that brings voicemail transcriptions, the ability to call and text with your Voice number, and cheap international dialing to your mobile phone. It is like one number to rule them all!
According to the Google Mobile Blog:
&#8220;The Google Voice app integrates seamlessly with your [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Google yesterday announced Google Voice for Android and BlackBerry, a mobile service that brings voicemail transcriptions, the ability to call and text with your Voice number, and cheap international dialing to your mobile phone. It is like one number to rule them all!</p>
<p>According to the Google Mobile Blog:</p>
<p>&#8220;The Google Voice app integrates seamlessly with your phone&#8217;s native address book, making it even easier to call or text with your Voice number. Voicemail transcriptions are now available, and the app will highlight individual words during playback just like your favorite karaoke song. It also lets you take advantage of Google Voice&#8217;s low-priced international call rates, starting at only $0.02/minute.&#8221;</p>
<p>The service is expected to eventually serve as glue for other Google services like Gmail, Web search, YouTube, SMS, etc.</p>
<p>This is great news, considering that, as mentioned in the current issue of IRW, texting is a new playground for data miners. Imagine, then, a similar mining playground involving voice!</p>
<p><a href="http://googlemobile.blogspot.com/2009/07/google-voice-for-android-and-blackberry.html">http://googlemobile.blogspot.com/2009/07/google-voice-for-android-and-blackberry.html</a></p>
<p>See also</p>
<p><a href="http://searchengineland.com/google-voice-for-mobile-one-number-to-rule-them-all-22417">http://searchengineland.com/google-voice-for-mobile-one-number-to-rule-them-all-22417</a></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/16/google-voice-a-call-to-rule-all-voice-based-services/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>The Most Influential Paper that Gerard Salton Never Wrote</title>
		<link>http://irthoughts.wordpress.com/2009/07/15/the-most-influential-paper-that-gerard-salton-never-wrote/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/15/the-most-influential-paper-that-gerard-salton-never-wrote/#comments</comments>
		<pubDate>Wed, 15 Jul 2009 05:00:05 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Vector Space Models]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1026</guid>
		<description><![CDATA[It is surprising how even serious information retrieval researchers and journals cite papers that were never written!
This is the thesis of David Dubin&#8217;s great 2004 article
The Most Influential Paper Gerard Salton Never Wrote
Dubin wrote:
&#8220;In giving credit to Salton for the vector model, a number of authors cite an overview paper titled &#8220;A Vector Space [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>It is surprising how even serious information retrieval researchers and journals cite papers that were never written!</p>
<p>This is the thesis of David Dubin&#8217;s great 2004 article<br />
<a href="http://www.ideals.uiuc.edu/bitstream/handle/2142/1697/Dubin748764.pdf?sequence=2">The Most Influential Paper Gerard Salton Never Wrote</a></p>
<p>Dubin wrote:</p>
<p>&#8220;In giving credit to Salton for the vector model, a number of authors cite an overview paper titled &#8220;A Vector Space Model for Information Retrieval,&#8221; which some show as published in the JASIS in 1975 and others as published in the Communications of the Association for Computing Machinery (CACM) in 1975. In fact, no such article was ever published, and citations to it usually represent a confusion of two 1975 articles (Salton, Wong, &amp; Yang, 1975; Salton, Yang, &amp; Yu, 1975), neither of which were overviews of the VSM as it is generally understood (see section 5 below). Some of Salton&#8217;s own colleagues have been guilty of this mistake: both Cardie et al. and Singhal cite the CACM version, for example (Singhal, 2001; Cardie, Ng, Pierce, &amp; Buckley, 2000). The paper is even cited in a few of the very last articles on which Salton is listed as a coauthor (Singhal, Salton, Mitra, &amp; Buckley, 1996; Singhal &amp; Salton, 1995). These papers were published close to or shortly after the time of his death, and so the errors cannot be blamed on Salton (remembered by his colleagues as a very careful and meticulous writer).&#8221;</p>
<p>Somehow far too many IR researchers misquote the title of Salton&#8217;s 1975 paper, &#8220;A vector space model for <strong>automatic indexing</strong>&#8221;. This causes digital libraries to create a spurious record attached to many cross-referenced articles.</p>
<p>I searched Google for <a href="http://www.google.com.pr/search?sourceid=navclient&amp;aq=0h&amp;oq=%22&amp;ie=UTF-8&amp;rlz=1T4GGLJ_enPR268PR268&amp;q=%22a+vector+space+model+for+information+retrieval%22+salton">&#8220;a vector space model for information retrieval&#8221; + salton</a> and indeed there are many reputable publications and researchers citing a paper that was never published! What a shame.</p>
<p>That says a lot about the researchers, editors, and reviewers who never bothered to check the accuracy of the references.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/15/the-most-influential-paper-that-gerard-salton-never-wrote/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Centering Data With Excel</title>
		<link>http://irthoughts.wordpress.com/2009/07/10/centering-data-with-excel/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/10/centering-data-with-excel/#comments</comments>
		<pubDate>Fri, 10 Jul 2009 16:58:47 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1020</guid>
		<description><![CDATA[The QA column of the current issue of IR Watch Newsletter has a great question that might help IR, CS, and stats students.
Q: Centering Data with Excel- In Excel, how do you center a data set?
 A: To center a data set, use the STANDARDIZE function, which converts x values into z-scores; i.e. 
z = (x [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>The QA column of the current issue of IR Watch Newsletter has a great question that might help IR, CS, and stats students.</p>
<p><strong>Q:</strong> <strong>Centering Data with Excel</strong>- In Excel, how do you center a data set?</p>
<p><strong>A:</strong> To center a data set, use the STANDARDIZE function, which converts <strong><em>x</em></strong> values into <strong><em>z-scores</em></strong>. Besides centering the values on the mean, this also scales them by the standard deviation; <em>i.e.</em></p>
<p><strong><em>z = (x – a)/s</em></strong></p>
<p>where <strong><em>a</em></strong> and <strong><em>s</em></strong> are, respectively, the arithmetic mean and the standard deviation of the column. The formula shown after the table uses STDEV, so <strong><em>s</em></strong> is the sample standard deviation. The following table emulates an Excel spreadsheet.</p>
<table style="text-align:center;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top"> </td>
<td valign="top">
<p align="center"><strong>A</strong></p>
</td>
<td valign="top">
<p align="center"><strong>B</strong></p>
</td>
<td valign="top">
<p align="center"><strong>C</strong></p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>1</strong></p>
</td>
<td valign="top">
<p align="center"><strong><em>Age, x(A)</em></strong></p>
</td>
<td valign="top">
<p align="center"><strong><em>Weight, x(W)</em></strong></p>
</td>
<td valign="top">
<p align="center"><strong><em>Height, x(H)</em></strong></p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>2</strong></p>
</td>
<td valign="top">
<p align="center">64</p>
</td>
<td valign="top">
<p align="center">57</p>
</td>
<td valign="top">
<p align="center">8</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>3</strong></p>
</td>
<td valign="top">
<p align="center">71</p>
</td>
<td valign="top">
<p align="center">59</p>
</td>
<td valign="top">
<p align="center">10</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>4</strong></p>
</td>
<td valign="top">
<p align="center">53</p>
</td>
<td valign="top">
<p align="center">49</p>
</td>
<td valign="top">
<p align="center">6</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>5</strong></p>
</td>
<td valign="top">
<p align="center">67</p>
</td>
<td valign="top">
<p align="center">62</p>
</td>
<td valign="top">
<p align="center">11</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>6</strong></p>
</td>
<td valign="top">
<p align="center">55</p>
</td>
<td valign="top">
<p align="center">51</p>
</td>
<td valign="top">
<p align="center">8</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>7</strong></p>
</td>
<td valign="top">
<p align="center">58</p>
</td>
<td valign="top">
<p align="center">50</p>
</td>
<td valign="top">
<p align="center">7</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>8</strong></p>
</td>
<td valign="top">
<p align="center">77</p>
</td>
<td valign="top">
<p align="center">55</p>
</td>
<td valign="top">
<p align="center">10</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>9</strong></p>
</td>
<td valign="top">
<p align="center">57</p>
</td>
<td valign="top">
<p align="center">48</p>
</td>
<td valign="top">
<p align="center">9</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>10</strong></p>
</td>
<td valign="top">
<p align="center">56</p>
</td>
<td valign="top">
<p align="center">42</p>
</td>
<td valign="top">
<p align="center">10</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>11</strong></p>
</td>
<td valign="top">
<p align="center">51</p>
</td>
<td valign="top">
<p align="center">42</p>
</td>
<td valign="top">
<p align="center">6</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>12</strong></p>
</td>
<td valign="top">
<p align="center">76</p>
</td>
<td valign="top">
<p align="center">61</p>
</td>
<td valign="top">
<p align="center">12</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>13</strong></p>
</td>
<td valign="top">
<p align="center">68</p>
</td>
<td valign="top">
<p align="center">57</p>
</td>
<td valign="top">
<p align="center">9</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>14</strong></p>
</td>
<td valign="top"> </td>
<td valign="top"> </td>
<td valign="top"> </td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>15</strong></p>
</td>
<td valign="top">
<p align="center"><strong><em>z(A)</em></strong></p>
</td>
<td valign="top">
<p align="center"><strong><em>z(W)</em></strong></p>
</td>
<td valign="top">
<p align="center"><strong><em>z(H)</em></strong></p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>16</strong></p>
</td>
<td valign="top">
<p align="center">0.14</p>
</td>
<td valign="top">
<p align="center">0.62</p>
</td>
<td valign="top">
<p align="center">-0.44</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>17</strong></p>
</td>
<td valign="top">
<p align="center">0.92</p>
</td>
<td valign="top">
<p align="center">0.92</p>
</td>
<td valign="top">
<p align="center">0.61</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>18</strong></p>
</td>
<td valign="top">
<p align="center">-1.09</p>
</td>
<td valign="top">
<p align="center">-0.55</p>
</td>
<td valign="top">
<p align="center">-1.49</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>19</strong></p>
</td>
<td valign="top">
<p align="center">0.47</p>
</td>
<td valign="top">
<p align="center">1.36</p>
</td>
<td valign="top">
<p align="center">1.14</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>20</strong></p>
</td>
<td valign="top">
<p align="center">-0.86</p>
</td>
<td valign="top">
<p align="center">-0.26</p>
</td>
<td valign="top">
<p align="center">-0.44</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>21</strong></p>
</td>
<td valign="top">
<p align="center">-0.53</p>
</td>
<td valign="top">
<p align="center">-0.40</p>
</td>
<td valign="top">
<p align="center">-0.97</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>22</strong></p>
</td>
<td valign="top">
<p align="center">1.59</p>
</td>
<td valign="top">
<p align="center">0.33</p>
</td>
<td valign="top">
<p align="center">0.61</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>23</strong></p>
</td>
<td valign="top">
<p align="center">-0.64</p>
</td>
<td valign="top">
<p align="center">-0.70</p>
</td>
<td valign="top">
<p align="center">0.09</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>24</strong></p>
</td>
<td valign="top">
<p align="center">-0.75</p>
</td>
<td valign="top">
<p align="center">-1.58</p>
</td>
<td valign="top">
<p align="center">0.61</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>25</strong></p>
</td>
<td valign="top">
<p align="center">-1.31</p>
</td>
<td valign="top">
<p align="center">-1.58</p>
</td>
<td valign="top">
<p align="center">-1.49</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>26</strong></p>
</td>
<td valign="top">
<p align="center">1.47</p>
</td>
<td valign="top">
<p align="center">1.21</p>
</td>
<td valign="top">
<p align="center">1.67</p>
</td>
</tr>
<tr>
<td valign="top">
<p align="center"><strong>27</strong></p>
</td>
<td valign="top">
<p align="center">0.58</p>
</td>
<td valign="top">
<p align="center">0.62</p>
</td>
<td valign="top">
<p align="center">0.09</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align:left;">Rows 2 – 13 contain the data set x(A), x(W), and x(H). In rows 16 – 27 the set was centered by typing the following formula in cell A16:</p>
<p style="text-align:left;"> =STANDARDIZE(A2,AVERAGE(A$2:A$13),STDEV(A$2:A$13))</p>
<p style="text-align:left;"> Pasting this formula in cells A16 through C27 centers the data set. That was easy!</p>
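<p style="text-align:left;">For readers who want to check the table outside Excel, here is a minimal Python sketch of the same calculation. The function name <em>standardize</em> and the three column lists are illustrative assumptions, not part of the newsletter; the sketch uses the sample standard deviation (n – 1 in the denominator), which is what STDEV computes.</p>

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores, z = (x - a) / s, for one column of data.

    a is the arithmetic mean and s is the sample standard deviation
    (n - 1 denominator), mirroring Excel's AVERAGE and STDEV.
    """
    a = mean(values)
    s = stdev(values)
    return [(x - a) / s for x in values]


# Columns A, B, and C of the spreadsheet (rows 2 to 13).
age = [64, 71, 53, 67, 55, 58, 77, 57, 56, 51, 76, 68]
weight = [57, 59, 49, 62, 51, 50, 55, 48, 42, 42, 61, 57]
height = [8, 10, 6, 11, 8, 7, 10, 9, 10, 6, 12, 9]

# Print each standardized column rounded to two decimals, as in rows 16 to 27.
for name, column in (("z(A)", age), ("z(W)", weight), ("z(H)", height)):
    print(name, [round(z, 2) for z in standardize(column)])
```

<p style="text-align:left;">Rounded to two decimals, the first values it prints for z(A), z(W), and z(H) are 0.14, 0.62, and -0.44, matching row 16 of the table.</p>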
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/10/centering-data-with-excel/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>IRW-7-2009: Data Mining Texting</title>
		<link>http://irthoughts.wordpress.com/2009/07/06/irw-7-2009-data-mining-texting/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/06/irw-7-2009-data-mining-texting/#comments</comments>
		<pubDate>Mon, 06 Jul 2009 16:17:11 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1008</guid>
		<description><![CDATA[
The current issue of the IRW newsletter is out.
Featured Article:
Data Mining Texting
TTMD OMG MOS CU
&#8220;My parents send email, I text.” This illustrates the obvious: a digital divide between parents and teens. While parents are busy replying to email or blogging at the most, their kids probably are busy developing their own language to alert their [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://www.miislita.com/irw/data-mining-texting.gif" alt="data mining texting" /></p>
<p>The current issue of the IRW newsletter is out.</p>
<p>Featured Article:</p>
<p><strong>Data Mining Texting</strong><br />
<em>TTMD OMG MOS CU</em></p>
<blockquote><p>&#8220;My parents send email, I text.” This illustrates the obvious: a digital divide between parents and teens. While parents are busy replying to email or blogging at the most, their kids probably are busy developing their own language to alert their peers when mom or dad is trying to figure out what they are texting about. Did you know that MOS  CU means ‘Mother over shoulder’. ‘See you’. And how about PW CUL? (‘Parents watching. See you later’).</p></blockquote>
<p>Indeed&#8230; Texting is not just for teens:</p>
<blockquote><p>Texting not only is revolutionizing the way businesses are being conducted in 2009, but is an emerging data mining playground. The number of behavioral patterns in connection with texting is on the rise at different diffusion fronts: from sexting and sextcasting (transmission of conversations, videos, photos with sexual content) to dealing (transmission of conversations in connection with illegal drug activities), to encoding conversations about Wall Street transactions, industrial espionage, and so forth.</p></blockquote>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/06/irw-7-2009-data-mining-texting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/data-mining-texting.gif" medium="image">
			<media:title type="html">data mining texting</media:title>
		</media:content>
	</item>
		<item>
		<title>Random notes prior to 4th July weekend</title>
		<link>http://irthoughts.wordpress.com/2009/07/03/random-notes-prior-to-4th-july-weekend/</link>
		<comments>http://irthoughts.wordpress.com/2009/07/03/random-notes-prior-to-4th-july-weekend/#comments</comments>
		<pubDate>Fri, 03 Jul 2009 13:21:37 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[Newsletters]]></category>
		<category><![CDATA[SEO Myths]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1006</guid>
		<description><![CDATA[As the 4th of July weekend approaches, here are some notes before heading off to planet Oblivious.
1. Yesterday we had an interesting business entrepreneur meeting with the CIO of the Government of Puerto Rico at El Palacio Rojo, Fortaleza.
2. IRW should be out by Monday. Main article: Data Mining Texting.
3. Only monkeys still believe in KD Myths. Ha, [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>As the 4th of July weekend approaches, here are some notes before heading off to planet Oblivious.</p>
<p>1. Yesterday we had an interesting business entrepreneur meeting with the CIO of the Government of Puerto Rico at El Palacio Rojo, Fortaleza.</p>
<p>2. IRW should be out by Monday. Main article: Data Mining Texting.</p>
<p>3. Only <a href="http://www.bloggeries.com/forum/seo-search-engine-optimization/13517-free-keyword-counter-articles.html">monkeys</a> still believe in KD Myths. Ha, Ha.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/07/03/random-notes-prior-to-4th-july-weekend/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Official: MIC Puerto Rico</title>
		<link>http://irthoughts.wordpress.com/2009/06/23/official-mic-puerto-rico/</link>
		<comments>http://irthoughts.wordpress.com/2009/06/23/official-mic-puerto-rico/#comments</comments>
		<pubDate>Tue, 23 Jun 2009 15:32:13 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=1000</guid>
		<description><![CDATA[Back in April, I mentioned that Microsoft would be co-launching the Microsoft Innovation Center (MIC) of Puerto Rico with Interamerican University of Puerto Rico, Metropolitan Campus.
Well, tomorrow is the official inauguration. The university has generously provided me with lab and office space to start an interesting research project within the MIC building. This is exciting news. I cannot [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Back in April, I mentioned that Microsoft would be co-launching the <a href="http://irthoughts.wordpress.com/2009/04/29/microsoft-inter-metro-to-co-launch-a-mic/">Microsoft Innovation Center (MIC)</a> of Puerto Rico with Interamerican University of Puerto Rico, Metropolitan Campus.</p>
<p>Well, tomorrow is the official inauguration. The university has generously provided me with lab and office space to start an interesting research project within the MIC building. This is exciting news. I cannot comment much about the project, except to say that it is at the interface of search engines, social networks, and information security.</p>
<p>It looks like I will have my hands full between working at two universities, blogging, and doing consulting work.</p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/06/23/official-mic-puerto-rico/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>IR Videos in Spanish</title>
		<link>http://irthoughts.wordpress.com/2009/06/22/ir-videos-in-spanish/</link>
		<comments>http://irthoughts.wordpress.com/2009/06/22/ir-videos-in-spanish/#comments</comments>
		<pubDate>Mon, 22 Jun 2009 05:00:55 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[IR Tutorials]]></category>
		<category><![CDATA[Latent Semantic Indexing]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=991</guid>
		<description><![CDATA[I normally do not put my lecture notes (ppt, pdf, videos) online. However, there are two public talks that the event organizers taped. Both last over 1 hour and are in Spanish, but with slides in English. Here are the links. The quality of the videos is so-so.
Since the videos were made available a few months after the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=991&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I normally do not put my lecture notes (ppt, pdf, videos) online. However, there are two public talks that the event organizers taped. Both last over 1 hour and are in Spanish, but with slides in English. Here are the links. The quality of the videos is so-so.</p>
<p>Since the videos were made available a few months after the events, they are not properly dated. I have included the actual dates of the events below. If you don&#8217;t know Spanish, you are out of luck.</p>
<p>1. Understanding Search Engines (Entendiendo a los Buscadores), University of Puerto Rico, Bayamon, 4-23-2008</p>
<p><a href="http://video.google.com/videoplay?docid=-653964730907023811">http://video.google.com/videoplay?docid=-653964730907023811</a></p>
<p>This one lasts about two hours. The audience consisted of grad students and researchers. Unfortunately, the audio and the slides are out of sync by about one slide. If you can cope with this, I hope you like it.</p>
<p>2. Demystifying LSI (Desmitificando LSI), OJOBuscador Congress, Madrid, Spain, 3-09-2007.</p>
<p><a href="http://www.ojotube.com/videos/congreso-ojobuscador-2007-ponencia-desmitificando-lsi-de-dr-e-garcia/">http://www.ojotube.com/videos/congreso-ojobuscador-2007-ponencia-desmitificando-lsi-de-dr-e-garcia/</a><a href="http://lo-mas-buscado-en-google.guca.es/2008/03/26/lsi-latent-semantic-indexing/"></a></p>
<p>This one lasts over an hour. Since it was for a non-scientific audience (mostly Spanish SEOs), I tried to talk very slowly.</p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/06/22/ir-videos-in-spanish/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>What is a Similarity Matrix?</title>
		<link>http://irthoughts.wordpress.com/2009/06/16/what-is-a-similarity-matrix/</link>
		<comments>http://irthoughts.wordpress.com/2009/06/16/what-is-a-similarity-matrix/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 14:23:57 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[IR Quizzes]]></category>
		<category><![CDATA[Latent Semantic Indexing]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=986</guid>
		<description><![CDATA[Sooner or later, CS students, particularly those in IR, will need to deal with similarity matrices.
In simple terms, any matrix M that exhibits the following five characteristics is a similarity matrix.
Squaredness = M must have the same number of rows and columns.
Non-Negativity = all elements of M must be real, non-negative numbers.
Boundedness = all elements [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=986&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Sooner or later, CS students, particularly those in IR, will need to deal with similarity matrices.</p>
<p>In simple terms, any matrix <strong>M</strong> that exhibits the following five characteristics is a similarity matrix.</p>
<p><strong>Squaredness</strong> = <strong>M</strong> must have the same number of rows and columns.<br />
<strong>Non-Negativity</strong> = all elements of <strong>M</strong> must be real, non-negative numbers.<br />
<strong>Boundedness</strong> = all elements of <strong>M</strong> must take values between 0 and 1, inclusive.<br />
<strong>Reflexivity</strong> = all diagonal elements of <strong>M</strong> (i.e., from top left to bottom right) must equal 1.<br />
<strong>Symmetry</strong> = each ij element must be identical to the corresponding ji element.</p>
<p>A matrix that fails to exhibit any one of these characteristics is not a similarity matrix.</p>
<p>Accordingly, some matrices found in the LSI literature, whose elements have been referred to as similarities, are not similarity matrices because they do not conform to the above definition.</p>
<p>Note: This information will help those who took the <a href="http://irthoughts.wordpress.com/2009/05/13/ir-quiz-matrices/">IR Quiz on Matrices</a> realize how well they did.</p>
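<p>The five criteria above can be checked mechanically. What follows is only a minimal sketch in plain Python, assuming the matrix is given as a list of rows; the function name <strong>is_similarity_matrix</strong> and the tolerance <strong>tol</strong> are illustrative choices, not part of the definition.</p>

```python
def is_similarity_matrix(M, tol=1e-9):
    """Return True if M (a list of rows) meets all five criteria listed above."""
    n = len(M)
    # Squaredness: every row must have as many entries as there are rows.
    if n == 0 or any(len(row) != n for row in M):
        return False
    for i in range(n):
        # Reflexivity: every diagonal element must equal 1.
        if abs(M[i][i] - 1.0) > tol:
            return False
        for j in range(n):
            # Non-negativity and boundedness: each element lies in [0, 1].
            if not (0 <= M[i][j] <= 1):
                return False
            # Symmetry: element ij must match element ji.
            if abs(M[i][j] - M[j][i]) > tol:
                return False
    return True


print(is_similarity_matrix([[1.0, 0.5], [0.5, 1.0]]))  # all five criteria hold
print(is_similarity_matrix([[1.0, 2.0], [2.0, 1.0]]))  # fails boundedness
print(is_similarity_matrix([[1.0, 0.2], [0.3, 1.0]]))  # fails symmetry
```

<p>The first call prints True and the other two print False, since failing a single criterion is enough to disqualify a matrix.</p>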
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/06/16/what-is-a-similarity-matrix/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Computing Co-Occurrence Matrices with Excel</title>
		<link>http://irthoughts.wordpress.com/2009/06/05/computing-co-occurrence-matrices-with-excel/</link>
		<comments>http://irthoughts.wordpress.com/2009/06/05/computing-co-occurrence-matrices-with-excel/#comments</comments>
		<pubDate>Fri, 05 Jun 2009 14:37:46 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=976</guid>
		<description><![CDATA[The QA column of the current issue of IR Watch &#8211; The Newsletter features the following question:
Question: In Excel, how do you convert a term-document occurrence matrix into a term-term or document-document co-occurrence matrix?
Answer:
Let A be a matrix populated with term occurrences (frequencies).
Let A^T be its transpose.
Then, T = AA^T is a term-term co-occurrence matrix, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=976&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>The QA column of the current issue of IR Watch &#8211; The Newsletter features the following question:</p>
<p>Question: In Excel, how do you convert a term-document occurrence matrix into a term-term or document-document co-occurrence matrix?</p>
<p>Answer:</p>
<p>Let <strong>A</strong> be a term-document matrix (terms as rows, documents as columns) populated with term occurrences (frequencies).<br />
Let <strong>A<sup>T</sup> </strong>be its transpose.</p>
<p>Then, <strong>T = AA<sup>T</sup></strong> is a term-term co-occurrence matrix, and <strong>D = A<sup>T</sup>A </strong>is a document-document co-occurrence matrix.</p>
<p>The following table emulates an Excel spreadsheet.</p>
<table style="text-align:center;" border="1" cellspacing="0" cellpadding="0" width="276">
<tbody>
<tr>
<td width="20" valign="bottom">
<p align="center"> </p>
</td>
<td width="64" valign="bottom">
<p align="center">A</p>
</td>
<td width="64" valign="bottom">
<p align="center">B</p>
</td>
<td width="64" valign="bottom">
<p align="center">C</p>
</td>
<td width="64" valign="bottom">
<p align="center">D</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">1</td>
<td width="64" valign="bottom"><strong> A =</strong></td>
<td width="64" valign="bottom">
<p align="center">d1</p>
</td>
<td width="64" valign="bottom">
<p align="center">d2</p>
</td>
<td width="64" valign="bottom">
<p align="center">d3</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">2</td>
<td width="64" valign="bottom">
<p align="right">t1</p>
</td>
<td width="64" valign="bottom">
<p align="center">0</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">0</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">3</td>
<td width="64" valign="bottom">
<p align="right">t2</p>
</td>
<td width="64" valign="bottom">
<p align="center">0</p>
</td>
<td width="64" valign="bottom">
<p align="center">0</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">4</td>
<td width="64" valign="bottom">
<p align="right">t3</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">5</td>
<td width="64" valign="bottom">
<p align="right"> </p>
</td>
<td width="64" valign="bottom">
<p align="center"> </p>
</td>
<td width="64" valign="bottom">
<p align="center"> </p>
</td>
<td width="64" valign="bottom">
<p align="center"> </p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">6</td>
<td width="64" valign="bottom">
<p align="center"><strong>T = AA<sup>T</sup></strong></p>
</td>
<td width="64" valign="bottom">
<p align="center">t1</p>
</td>
<td width="64" valign="bottom">
<p align="center">t2</p>
</td>
<td width="64" valign="bottom">
<p align="center">t3</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">7</td>
<td width="64" valign="bottom">
<p align="right">t1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">0</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">8</td>
<td width="64" valign="bottom">
<p align="right">t2</p>
</td>
<td width="64" valign="bottom">
<p align="center">0</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">9</td>
<td width="64" valign="bottom">
<p align="right">t3</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">3</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">10</td>
<td width="64" valign="bottom">
<p align="right"> </p>
</td>
<td width="64" valign="bottom">
<p align="center"> </p>
</td>
<td width="64" valign="bottom">
<p align="center"> </p>
</td>
<td width="64" valign="bottom">
<p align="center"> </p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">11</td>
<td width="64" valign="bottom">
<p align="center"><strong>D = A<sup>T</sup>A</strong></p>
</td>
<td width="64" valign="bottom">
<p align="center">d1</p>
</td>
<td width="64" valign="bottom">
<p align="center">d2</p>
</td>
<td width="64" valign="bottom">
<p align="center">d3</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">12</td>
<td width="64" valign="bottom">
<p align="right">d1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">13</td>
<td width="64" valign="bottom">
<p align="right">d2</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">2</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
</tr>
<tr>
<td width="20" valign="bottom">14</td>
<td width="64" valign="bottom">
<p align="right">d3</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">1</p>
</td>
<td width="64" valign="bottom">
<p align="center">2</p>
</td>
</tr>
</tbody>
</table>
<p>In the table, <strong>T</strong> was computed by selecting the destination range (<strong>B7:D9</strong>), entering in its first cell (<strong>B7</strong>) the formula <strong>=MMULT(B2:D4,TRANSPOSE(B2:D4))</strong>, pressing the <strong>F2</strong> key, and then pressing <strong>Ctrl+Shift+Enter</strong> to commit it as an array formula.</p>
<p>Similarly, <strong>D</strong> was computed by selecting the destination range (<strong>B12:D14</strong>), entering in its first cell (<strong>B12</strong>) the formula <strong>=MMULT(TRANSPOSE(B2:D4),B2:D4)</strong>, pressing the <strong>F2</strong> key, and then pressing <strong>Ctrl+Shift+Enter</strong>.</p>
<p>That was easy!</p>
<p>Note that neither of these is a similarity matrix. Can you tell why?</p>
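<p>For readers without Excel at hand, the same two products can be sketched in plain Python. This is only an illustration using the 3 x 3 matrix <strong>A</strong> from the table; the helper names <strong>transpose</strong> and <strong>matmul</strong> are my own and simply play the roles of TRANSPOSE and MMULT.</p>

```python
def transpose(M):
    """Swap rows and columns, like Excel's TRANSPOSE."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows, like Excel's MMULT."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


# Term-document matrix from the table: rows t1..t3, columns d1..d3.
A = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]

T = matmul(A, transpose(A))  # term-term co-occurrence matrix
D = matmul(transpose(A), A)  # document-document co-occurrence matrix

print(T)  # [[1, 0, 1], [0, 1, 1], [1, 1, 3]]
print(D)  # [[1, 1, 1], [1, 2, 1], [1, 1, 2]]
```

<p>The printed values match rows 7 to 9 and rows 12 to 14 of the table.</p>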
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/06/05/computing-co-occurrence-matrices-with-excel/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>IRW-2009-6:Hackers: Taxonomy &amp; Writing Styles</title>
		<link>http://irthoughts.wordpress.com/2009/06/01/irw-2009-6hackers-taxonomy-writing-styles/</link>
		<comments>http://irthoughts.wordpress.com/2009/06/01/irw-2009-6hackers-taxonomy-writing-styles/#comments</comments>
		<pubDate>Mon, 01 Jun 2009 13:56:18 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Homeland Security]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=973</guid>
		<description><![CDATA[
The current issue of IRW should reach subscribers&#8217; inboxes during the day or, at the latest, tomorrow.
In this issue:

Featured article: Hackers: Taxonomy and Writing Styles
Due to the increasing interest in developing Information Retrieval and Data Mining courses that intersect with Information Security, this issue of the newsletter covers a brief taxonomy of hackers and [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=973&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://www.miislita.com/irw/hackers.gif" alt="hackers" /></p>
<p>The current issue of IRW should reach subscribers&#8217; inboxes during the day or, at the latest, tomorrow.</p>
<p>In this issue:</p>
<ul>
<li>Featured article: Hackers: Taxonomy and Writing Styles<br />
Due to the increasing interest in developing Information Retrieval and Data Mining courses that intersect with Information Security, this issue of the newsletter covers a brief taxonomy of hackers and their writing styles.</li>
<li>QA: Excel Matrix Multiplications: How to convert a term-document occurrence matrix into a term-term or document-document co-occurrence matrix?</li>
<li>Vacuum Tubes &amp; Transistors Historical</li>
<li>Who is Who in IR: Thomas K. Landauer</li>
<li>Top CS Departments: Dartmouth College</li>
<li>Outstanding Graduate Theses</li>
<li>Calls and Events</li>
<li>IR Blogs</li>
<li>and more&#8230;</li>
</ul>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/06/01/irw-2009-6hackers-taxonomy-writing-styles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/hackers.gif" medium="image">
			<media:title type="html">hackers</media:title>
		</media:content>
	</item>
		<item>
		<title>On Term Repetition and Local Models</title>
		<link>http://irthoughts.wordpress.com/2009/05/27/on-term-repetition-and-local-models/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/27/on-term-repetition-and-local-models/#comments</comments>
		<pubDate>Wed, 27 May 2009 17:08:34 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[SEO Myths]]></category>
		<category><![CDATA[Vector Space Models]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=968</guid>
		<description><![CDATA[I&#8217;m putting together a piece on several local term weight models. It should be ready in a few weeks.
It is a research paper that can be used as a tutorial. It describes a systematic approach for the derivation of any kind of local term weighting model. Students can use it as a recipe for proposing their [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=968&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;m putting together a piece on several local term weight models. It should be ready in a few weeks.</p>
<p>It is a research paper that can be used as a tutorial. It describes a systematic approach for the derivation of any kind of local term weighting model. Students can use it as a recipe for proposing their own candidate models.</p>
<p>The article touches on some aspects of the problem of trusting models that lack attenuation. Here is one snippet on the subject:</p>
<p>&lt;last nail in KD coffin  style=&#8221;intensity:100%;&#8221;&gt;</p>
<p>&#8220;It should be stressed that term repetition does not necessarily satisfy users’ queries, nor is it evidence of:</p>
<p><em><strong>Pertinence (P)</strong></em>; e.g., that a term repeated x times is x times more pertinent to the document.</p>
<p><strong><em>Aboutness (A)</em></strong>; e.g., that the document is x times more about the term.</p>
<p><strong><em>Importance (I)</em></strong>; i.e., that there is a term-document relationship of <em>pertinence </em>and<em> aboutness</em>.</p>
<p><strong><em>Relevance (R)</em></strong>; i.e., that a document repeating a term x times is x times more relevant.</p>
<p>Accordingly, fulfilling such <em>‘PAIR criteria’</em> on a regular basis is hard to accomplish with any model that lacks attenuation.&#8221;</p>
<p>&lt;/last nail in KD coffin&gt;</p>
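<p>To make the idea of attenuation concrete, here is a small Python sketch comparing a raw term-frequency weight with a logarithmically dampened one. The log model is only an assumption chosen for illustration, as one familiar attenuated local weight; it is not necessarily one of the models covered in the paper, and the function names are hypothetical.</p>

```python
import math


def raw_tf(f):
    """Local weight with no attenuation: grows linearly with frequency f."""
    return float(f)


def log_tf(f):
    """Log-attenuated local weight: 1 + ln(f) for f > 0, otherwise 0."""
    return 1.0 + math.log(f) if f > 0 else 0.0


# Compare how each weight responds as a term is repeated more often.
for f in (1, 2, 10, 100):
    print(f, raw_tf(f), round(log_tf(f), 3))
```

<p>Under the raw model, repeating a term 100 times makes its weight 100 times larger; under the log model the weight grows far more slowly, so sheer repetition buys much less.</p>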
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/27/on-term-repetition-and-local-models/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Defining Data Mining and Database</title>
		<link>http://irthoughts.wordpress.com/2009/05/25/defining-data-mining-and-database/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/25/defining-data-mining-and-database/#comments</comments>
		<pubDate>Mon, 25 May 2009 16:00:58 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=959</guid>
		<description><![CDATA[What is the (^H^H^H) best definition for data mining and database? It depends on who you ask and in which context.
According to Section 126 of the USA Patriot Act,
(1) DATA-MINING- The term `data-mining&#8217; means a query or search or other analysis of one or more electronic databases, where
(A) at least one of the databases was [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=irthoughts.wordpress.com&blog=1041983&post=959&subd=irthoughts&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>What is the (^H^H^H) best definition for data mining and database? It depends on who you ask and in which context.</p>
<p>According to <a href="http://thomas.loc.gov/cgi-bin/cpquery/?&amp;dbname=cp109&amp;sid=cp109QWRIU&amp;refer=&amp;r_n=hr333.109&amp;item=&amp;sel=TOC_124051&amp;">Section 126 of the USA Patriot Act</a>,</p>
<blockquote><p>(1) <strong>DATA-MINING</strong>- The term `data-mining&#8217; means a query or search or other analysis of one or more electronic databases, where</p>
<p>(A) at least one of the databases was obtained from or remains under the control of a non-Federal entity, or the information was acquired initially by another department or agency of the Federal Government for purposes other than intelligence or law enforcement;</p>
<p>(B) the search does not use personal identifiers of a specific individual or does not utilize inputs that appear on their face to identify or be associated with a specified individual to acquire information; and</p>
<p>(C) a department or agency of the Federal Government is conducting the query or search or other analysis to find a pattern indicating terrorist or other criminal activity.</p>
<p>(2) <strong>DATABASE-</strong> The term `database&#8217; does not include telephone directories, information publicly available via the Internet or available by any other means to any member of the public, any databases maintained, operated, or controlled by a State, local, or tribal government (such as a State motor vehicle database), or databases of judicial and administrative opinions.</p></blockquote>
<p>Asking the government and a KDDM researcher this question, and then using LSI to cluster the results, can be a futile exercise.</p>
<p>It is like asking President Obama and Vice President Cheney to agree on: &#8220;What is Torture?&#8221;</p>
  </div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/25/defining-data-mining-and-database/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>When Noise is a Good Thing.</title>
		<link>http://irthoughts.wordpress.com/2009/05/22/when-noise-is-a-good-thing/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/22/when-noise-is-a-good-thing/#comments</comments>
		<pubDate>Fri, 22 May 2009 14:03:07 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Latent Semantic Indexing]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=951</guid>
		<description><![CDATA[Today, a reader (name removed to protect confidentiality) asked me:
My name is **** ****. I working as a junior research fellow in a project in India. I red the SVD techniques from the web page http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-3-full-svd.html#right-eigenvectors. I found it is quite satisfactory for me. Now I can understand how SVD works. But I have a [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Today, a reader (name removed to protect confidentiality) asked me:</p>
<blockquote><p>My name is **** ****. I working as a junior research fellow in a project in India. I red the SVD techniques from the web page <a href="http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-3-full-svd.html#right-eigenvectors">http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-3-full-svd.html#right-eigenvectors</a>. I found it is quite satisfactory for me. Now I can understand how SVD works. But I have a query as follows.</p>
<p>query:</p>
<p>As mentioned in this tutorial that we have arrange these eigen-values in descending order. Cold you please tell me if I put these values in ascending order or arbitrary what will be wrong with the SVD.</p>
<p>Looking forward your early kind response.</p>
<p>Thanking you.</p>
<p>With best regards.</p>
<p>*******</p></blockquote>
<p>My answer follows.</p>
<p>It depends on what you are trying to address.</p>
<p>SVD is used to identify singular values, each interpreted as the weight of one dimension. When SVD is used as a dimensionality reduction technique, the N largest singular values are normally retained and the smaller ones are discarded. The largest singular values capture most of the information in the original data set, so retaining them is a noise minimization approach.</p>
<p>If the retention criterion is reversed (the smaller singular values are retained), the noisier dimensions are kept, so the reconstructed matrix will be a matrix of the hidden (latent) data noise. This is a noise maximization approach.</p>
<p>If the retention criterion is based on a random selection, the resultant reconstructed matrix might be one representing a data structure with randomized noise.</p>
<p>These scenarios depend on the original data under examination. </p>
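<p>The three retention scenarios can be written down compactly. The following is only a sketch in standard SVD notation; the symbols u_i, v_i, r, k, A_k and B_k are my own labels and are not taken from the tutorial. Note that the descending order is a convention: reordering the triplets (singular value, left vector, right vector) consistently leaves the sum unchanged, so what matters is which terms are retained, not the order in which they are listed.</p>

```latex
% Compact SVD of a rank-r matrix A, singular values sorted by convention
A = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{T},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r

% Retain the k largest singular values (noise minimization)
A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T},
\qquad \lVert A - A_k \rVert_F^2 = \sum_{i=k+1}^{r} \sigma_i^2

% Retain the k smallest singular values instead (noise maximization)
B_k = \sum_{i=r-k+1}^{r} \sigma_i \, u_i v_i^{T},
\qquad \lVert A - B_k \rVert_F^2 = \sum_{i=1}^{r-k} \sigma_i^2
```

<p>Because the k largest squared singular values sum to at least as much as any other choice of k of them, the reconstruction error of A_k is never larger than that of B_k. A random selection of k terms falls somewhere between the two.</p>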
<p>In Image Compression, these approaches have already been explored. If the goal is a stability study and not just SVD dimensionality reduction, &#8220;the ratio between the highest singular value and the lowest singular value of the Jacobian matrix quantifies the spread of the Jacobian’s singular values, which in practice, reflects the extent of the solution’s instability with respect to small changes in the observation&#8221; (<a href="http://www.mathcs.emory.edu/~horesh/publications/thesis/thesis_all_in_one21.pdf">Horesh&#8217;s Thesis</a>).</p>
<p>Having said all that, we should not regard noise in a data set as something that must be discarded at all costs.</p>
<p>This is intimately linked with the so-called Inverse Problem. Incorporating noise and <em>a priori</em> SVD information can provide the complete information in a linear sense. Qianqian Fang has a beautiful PPT presentation &#8220;<a href="http://bbs.dartmouth.edu/~fangq/Presentation/RIP2003/LookClosertoInverseProblem.ppt">Look Closer to Inverse Problem</a>&#8221; on the subject. If you want to visualize the MATRIX Problem, this presentation is for you.</p>
<p>I&#8217;m thinking of putting together a tutorial on the Singular Value Expansion algorithm (SVE), if I ever find the time.</p>
<p>I hope this helps.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/22/when-noise-is-a-good-thing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Ethical Hacking: An Oxymoron, a Misnomer, or Both?</title>
		<link>http://irthoughts.wordpress.com/2009/05/18/ethical-hacking-an-oxymoron-a-misnomer-or-both/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/18/ethical-hacking-an-oxymoron-a-misnomer-or-both/#comments</comments>
		<pubDate>Mon, 18 May 2009 12:51:28 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=944</guid>
		<description><![CDATA[According to a report from the British Computer Society (BCS) covering a Security Panel Strategic Forum, &#8220;ethical hacking&#8221; is an oxymoron.
The report highlights do&#8217;s and don&#8217;ts when it comes to defining terms like &#8220;hacker&#8221;, &#8220;ethical hacking&#8221;, &#8220;penetration tester&#8221;, &#8220;white/black hats&#8221;, and derivative terms. These labels are frequently used in the IT industry. The report also [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>According to a report from the <a href="http://www.bcs.org/upload/pdf/ethical-hacking.pdf">British Computer Society</a> (BCS) covering a Security Panel Strategic Forum, &#8220;ethical hacking&#8221; is an oxymoron.</p>
<p>The report highlights do&#8217;s and don&#8217;ts when it comes to defining terms like &#8220;hacker&#8221;, &#8220;ethical hacking&#8221;, &#8220;penetration tester&#8221;, &#8220;white/black hats&#8221;, and derivative terms. These labels are frequently used in the IT industry. The report also underscores which terms should not be used by schools offering IT courses.</p>
<p>The problem with defining and redefining such labels is that there will always be others disagreeing with/circumventing said definitions.</p>
<p>For instance, in the December 1986 issue of MicroTimes, Bob Bickford wrote:</p>
<p>&#8220;A Hacker is any person who derives joy from discovering ways to circumvent limitations.&#8221;</p>
<p>If we accept this definition, then a person who <strong>doesn&#8217;t</strong> derive any joy from discovering ways to circumvent limitations <strong>is not</strong> a hacker. By the same token, a cheating spouse, an SEO, a spammer, a politician, a mobster, or a kid trying to get some candy from mom is a hacker, as long as each of them enjoys finding a way around the limits.</p>
<p>I am taking this extreme, off-topic interpretation to illustrate the problem of semantics when it comes to defining things.</p>
<p>Whether you agree or disagree, partially or totally, with the report, it is a good read. For sure it will be a good piece for students planning to take my AIRWeb graduate course.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/18/ethical-hacking-an-oxymoron-a-misnomer-or-both/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Google Accused of Conversion-Inflation Syndication Fraud</title>
		<link>http://irthoughts.wordpress.com/2009/05/15/google-accused-of-conversion-inflation-syndication-fraud/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/15/google-accused-of-conversion-inflation-syndication-fraud/#comments</comments>
		<pubDate>Fri, 15 May 2009 14:35:24 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Marketing Research]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=939</guid>
		<description><![CDATA[According to Ben Edelman, Google is engaged in a conversion-inflation syndication fraud.
These tactics are nothing new.
In the feature article of the November 2008 issue of IR Watch, &#8220;Fraudulent Web Analytics &#8211; Engineering the Fraud&#8221;, we covered how in-the-middle mechanisms are part of Web Analytic Frauds and Business Collusion Schemes.
As in man-in-the-middle attacks found in information [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>According to Ben Edelman, Google is engaged in a <a href="http://www.benedelman.org/news/051309-1.html">conversion-inflation syndication fraud</a>.</p>
<p>These tactics are nothing new.</p>
<p>In the feature article of the November 2008 issue of IR Watch, <strong>&#8220;Fraudulent Web Analytics &#8211; Engineering the Fraud&#8221;</strong>, we covered how <em>in-the-middle mechanisms</em> are part of Web Analytic Frauds and Business Collusion Schemes.</p>
<p>As in man-in-the-middle attacks found in information security settings, the underlying goal is the same: the crafting of deceptive intermediary events.</p>
<p>Expect a PR damage-control campaign soon from the useful idiots/moles.</p>
<p>What is next? A class action lawsuit?</p>
<p>Still, I have a little taste of satisfaction in my mouth when crooks disguised as advertisers/search marketers are gamed. Gaming the gamers: life&#8217;s ironies!</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/15/google-accused-of-conversion-inflation-syndication-fraud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>IR Quiz: Matrices</title>
		<link>http://irthoughts.wordpress.com/2009/05/13/ir-quiz-matrices/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/13/ir-quiz-matrices/#comments</comments>
		<pubDate>Wed, 13 May 2009 12:15:59 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[IR Quizzes]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=933</guid>
		<description><![CDATA[Explain and give an example of each of the following matrices used in IR:
1. Term-document occurrence matrix.
2. Term-term cooccurrence matrix.
3. Term-term correlation matrix.
4. Term-term similarity matrix.
5. Term-term coweights matrix.
6. Term-term distance matrix (*).
7. Covariance matrix (*).
 
(*) PS. I forgot to list these other matrices.
]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Explain and give an example of each of the following matrices used in IR:</p>
<p>1. Term-document occurrence matrix.</p>
<p>2. Term-term cooccurrence matrix.</p>
<p>3. Term-term correlation matrix.</p>
<p>4. Term-term similarity matrix.</p>
<p>5. Term-term coweights matrix.</p>
<p>6. Term-term distance matrix (*).</p>
<p>7. Covariance matrix (*).</p>
<p> </p>
<p>(*) PS. I forgot to list these other matrices.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/13/ir-quiz-matrices/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Vector Normalization with Excel &#8211; Part II</title>
		<link>http://irthoughts.wordpress.com/2009/05/07/vector-normalization-with-excel-part-ii/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/07/vector-normalization-with-excel-part-ii/#comments</comments>
		<pubDate>Thu, 07 May 2009 12:06:56 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[IR Tutorials]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=928</guid>
		<description><![CDATA[Back in March, we explained how to normalize column vectors with Excel. But what about normalizing row vectors? This question is addressed in the current QA column of IRW. I think it might be useful to share the answer with readers, since many of them are students struggling with similar questions. So, here we go.
The following table [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Back in March, we explained <a href="http://irthoughts.wordpress.com/2009/03/04/vector-normalization-with-excel/">how to normalize column vectors with Excel</a>. But what about normalizing row vectors? This question is addressed in the current QA column of IRW. I think it might be useful to share the answer with readers, since many of them are students struggling with similar questions. So, here we go.</p>
<p>The following table emulates an Excel array consisting of three columns (A, B, and C) and six rows (1-6).</p>
<table border="0" cellspacing="0" cellpadding="0" width="178">
<col span="1" width="37"></col>
<col span="1" width="47"></col>
<col span="1" width="48"></col>
<col span="1" width="46"></col>
<tbody>
<tr>
<td width="37" height="20"> </td>
<td width="47">A</td>
<td width="48">B</td>
<td width="46">C</td>
</tr>
<tr>
<td height="20">1</td>
<td>1</td>
<td>2</td>
<td>3</td>
</tr>
<tr>
<td height="20">2</td>
<td>4</td>
<td>5</td>
<td>6</td>
</tr>
<tr>
<td height="20">3</td>
<td>7</td>
<td>8</td>
<td>9</td>
</tr>
<tr>
<td height="20">4</td>
<td>0.27</td>
<td>0.53</td>
<td>0.80</td>
</tr>
<tr>
<td height="20">5</td>
<td>0.46</td>
<td>0.57</td>
<td>0.68</td>
</tr>
<tr>
<td height="20">6</td>
<td>0.50</td>
<td>0.57</td>
<td>0.65</td>
</tr>
</tbody>
</table>
<p style="text-align:left;">Rows 1, 2, and 3 are row vectors. Rows 4, 5, and 6 are the corresponding normalized vectors, also known as unit vectors because their length is 1. To compute these, do as follows:</p>
<p>1. In cell A4, enter the formula =A1/(SQRT(SUMSQ($A1:$C1))). The result should match the value shown for this cell in the table.</p>
<p>2. Copy this formula, select cells A5 and A6, and paste the formula into them.</p>
<p>3. Finally, copy cells A4 through A6 at once, select the remaining empty cells of the array, i.e., cells B4 through C6, and paste the formulas into them.</p>
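<p>For readers who prefer to check the numbers outside Excel, the steps above can be sketched in a few lines of Python. This is only an illustration; the function name normalize_rows is my own choice, and the zero-vector check is an assumption added for safety, since the Excel formula would return a division error in that case.</p>

```python
import math


def normalize_rows(rows):
    """Return each row scaled to unit Euclidean length."""
    unit_rows = []
    for row in rows:
        # Same quantity as SQRT(SUMSQ($A1:$C1)) in the Excel formula.
        length = math.sqrt(sum(x * x for x in row))
        if length == 0:
            raise ValueError("cannot normalize a zero vector")
        unit_rows.append([x / length for x in row])
    return unit_rows


# Rows 1, 2, and 3 of the table.
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for unit in normalize_rows(rows):
    print(" ".join(f"{x:.2f}" for x in unit))
```

<p>Rounded to two decimals, the printed rows agree with rows 4, 5, and 6 of the table.</p>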
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/07/vector-normalization-with-excel-part-ii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>NSA/DHS Designates PUPR as a CAE</title>
		<link>http://irthoughts.wordpress.com/2009/05/05/nsadhs-designates-pupr-as-a-cae/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/05/nsadhs-designates-pupr-as-a-cae/#comments</comments>
		<pubDate>Tue, 05 May 2009 12:40:22 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Homeland Security]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=916</guid>
		<description><![CDATA[As blogged yesterday, the current issue of IRW should reach subscribers&#8217; inboxes today. The Top CS Departments column features Polytechnic University of Puerto Rico, where I teach graduate courses. As mentioned a few days ago, PUPR has been designated a CAE. This is great news that is making a splash across academic centers within the U.S., the [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>As blogged yesterday, the current issue of IRW should reach subscribers&#8217; inboxes today. The Top CS Departments column features Polytechnic University of Puerto Rico, where I teach graduate courses. As mentioned a few days ago, PUPR has been designated a CAE. This is great news that is making a splash across academic centers within the U.S., the Caribbean Region, and Latin America whose mission is research relevant to homeland security.</p>
<p>Associate Director for Computer Science, Dr. Alfredo Cruz, sent me an official announcement, which I am reproducing.</p>
<blockquote><p>Polytechnic University of Puerto Rico (PUPR) is Designated National Center of Academic Excellence in Information Assurance Education by NSA and DHS. PUPR was recently designated as a National Center of Academic Excellence in Information Assurance Education (CAE/IAE) by the National Security Agency (NSA) and the Department of Homeland Security (DHS) on April 22, 2009. The goal of these centers is to reduce the vulnerability of the national information infrastructure by promoting higher education and research in Information Assurance (IA) and Security through the development of a growing number of professionals with IA expertise in various related disciplines. PUPR will be recognized as the first institution in Puerto Rico to be designated as a CAE/IAE on June 3, 2009 in Seattle, Washington. Dr. Alfredo Cruz from the Department of Electrical &amp; Computer Engineering and Computer Science will be present to receive the designation. He is the Director of the Center of Information Assurance for Research and Education (CIARE) at PUPR. Dr. Cruz is the person responsible for this designation. PUPR is of the very few Hispanic serving institution (HSI) in the Nation to receive this designation, and to become one of the first 100 institutions nationwide; this is a very special recognition. This designation requires that the President of the United States send the Governor of Puerto Rico a certification that should be handed to the president of PUPR designating the Institution as a CAE/IAE at a National level. The Congress and all the respective Congressional Committees are also notified.</p>
<p>Some of the benefits of the CAE/IAE designation are:<br />
• PUPR will receive formal recognition from the U.S. Government as well as opportunities for prestige and publicity for our roll in securing the Nation’s information systems.<br />
• This designation increases collaboration opportunities between designated and aspiring institutions at local and national levels. This includes internships, faculty and student exchange, research, and publications, among other activities.<br />
• With this designation as a CAE/IAE PUPR can obtain scholarships that can help outstanding students to pursue graduate studies in IA, enabling them to work with the Federal Government or other federal institutions and agencies.<br />
• PUPR can compete and benefit from proposal calls (RFP) that are specifically for designated CAE/IAE institutions. These proposals offer millions of dollars from the DoD, NSF, NSA and “Homeland Security”, among others, for research and infrastructure.<br />
• Student scholarships offered under the NSF&#8217;s Scholarship for Service (SFS) program. The SFS scholarship offers the following:<br />
&#8211;2-year scholarship, includes 8K stipend (12K for graduate students), plus tuition and nominal room and board expenses.<br />
&#8211;Paid summer internship in a federal agency.<br />
&#8211;Placement in federal government at the end of the scholarship period.</p></blockquote>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/05/nsadhs-designates-pupr-as-a-cae/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>IRW: RIA Vulnerabilities</title>
		<link>http://irthoughts.wordpress.com/2009/05/04/irw-ria-vulnerabilities/</link>
		<comments>http://irthoughts.wordpress.com/2009/05/04/irw-ria-vulnerabilities/#comments</comments>
		<pubDate>Mon, 04 May 2009 14:09:50 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Newsletters]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=910</guid>
		<description><![CDATA[
The current issue of IRW should reach subscribers&#8217; inboxes tomorrow.
In this issue:
Feature article: RIA Vulnerabilities
This issue of the newsletter discusses how hackers might be exploiting Web vulnerabilities found in Rich Internet Applications (RIAs). As mentioned in our previous issue, some RIAs are based on Adobe’s technologies like Flash, Flex, or AIR. Some are designed to [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://www.miislita.com/irw/ria-vulnerabilities.gif" alt="" /></p>
<p>The current issue of IRW should reach subscribers&#8217; inboxes tomorrow.</p>
<p>In this issue:</p>
<p>Feature article: RIA Vulnerabilities</p>
<blockquote><p>This issue of the newsletter discusses how hackers might be exploiting Web vulnerabilities found in Rich Internet Applications (RIAs). As mentioned in our previous issue, some RIAs are based on Adobe’s technologies like Flash, Flex, or AIR. Some are designed to be run online or offline. Their rising popularity has attracted developers and marketers, and -as expected- hackers and spammers.</p></blockquote>
<p>QA: Excel Vector Normalization: How do I convert a row vector into a unit vector?<br />
Who is Who in IR: C.J. van Rijsbergen<br />
Top CS Departments: Polytechnic University of Puerto Rico<br />
Historical Notes: ENIAC Computer<br />
Outstanding Graduate Theses<br />
Calls and Events<br />
Research Blogs<br />
and more&#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/05/04/irw-ria-vulnerabilities/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/ria-vulnerabilities.gif" medium="image" />
	</item>
		<item>
		<title>No-Caching is Spammers&#8217; Best Friend</title>
		<link>http://irthoughts.wordpress.com/2009/04/30/no-caching-is-spammers-best-friend/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/30/no-caching-is-spammers-best-friend/#comments</comments>
		<pubDate>Thu, 30 Apr 2009 14:10:44 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=905</guid>
		<description><![CDATA[Today I feel like giving spammers a piece of advice that will force both sides to raise the bar in the &#8220;us versus them&#8221; Spam War. Think of this as a love-hate relationship.
C&#8217;mon spammers, I know you can do better. Don&#8217;t make it so easy for us IR folks to neutralize your tactics. Heh, heh.
At the recent AIRWeb [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Today I feel like giving a piece of advice to spammers, in the hope of raising the bar in the &#8220;us versus them&#8221; of the Spam War. Think of this as a love-hate relationship.</p>
<p>C&#8217;mon spammers, I know you can do better. Don&#8217;t make our IR lives so easy when it comes to neutralizing your tactics. Heh, heh.</p>
<p>At the recent AIRWeb workshop, Brian Davison presented the paper <a href="http://airweb.cse.lehigh.edu/2009/papers/p1-dai.pdf">Looking into the Past to Better Classify Web Spam</a>, which was well received by the referees and the audience.</p>
<p>Wannabe spammers, if you are really committed to spamdexing, at least know the how-tos. Don&#8217;t leave a temporal fingerprint of your web presence. Try this:</p>
<p>1. Prevent online resources such as the Wayback Machine and commercial search engines from caching your web pages.</p>
<p>2. Use No-Cache and No-Archive.</p>
<p>3. Switch hosts whenever you can.</p>
<p>4. Constantly mutate your link structure.</p>
<p>5. Don&#8217;t profile yourself with easy-to-detect or predictable honeypots, link swapping, strongly connected component structures, etc.</p>
<p>Why give this advice? Check the current AIRWeb &#8220;gems&#8221;.</p>
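<p>For readers on the IR side of the fence, tip 2 is easy to check for. The sketch below is only an illustration and is not taken from the AIRWeb paper: it assumes the No-Archive request is made through a standard robots <code>meta</code> tag, and the names <code>RobotsMetaParser</code> and <code>blocks_archiving</code> are hypothetical.</p>

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def blocks_archiving(html):
    """Return True if the page asks crawlers not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


page = '<html><head><meta name="robots" content="noindex, NoArchive"></head></html>'
print(blocks_archiving(page))
```

<p>A page that sets this flag leaves no cached copy behind, so a crawler could record the flag itself as one more feature. Whether that feature actually helps a classifier is an open question here, not a result.</p>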
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/30/no-caching-is-spammers-best-friend/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Microsoft, Inter-Metro to Co-Launch a MIC</title>
		<link>http://irthoughts.wordpress.com/2009/04/29/microsoft-inter-metro-to-co-launch-a-mic/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/29/microsoft-inter-metro-to-co-launch-a-mic/#comments</comments>
		<pubDate>Wed, 29 Apr 2009 13:33:20 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[IR Tools]]></category>
		<category><![CDATA[Marketing Research]]></category>
		<category><![CDATA[Miscellaneous]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=900</guid>
		<description><![CDATA[This afternoon, Microsoft, in partnership with The Interamerican University of Puerto Rico, Metropolitan Campus (Inter-Metro), will announce that they are officially co-launching the Microsoft Innovation Center (MIC) of Puerto Rico.
This will be the first MIC in the region. A two-story building has been fitted out within the Inter-Metro campus for this project. As a member of the [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>This afternoon, Microsoft, in partnership with The Interamerican University of Puerto Rico, Metropolitan Campus (Inter-Metro), will announce that they are officially co-launching the Microsoft Innovation Center (MIC) of Puerto Rico.</p>
<p>This will be the first MIC in the region. A two-story building has been fitted out within the Inter-Metro campus for this project. As a member of the MIC steering committee, I have been invited to the presentation by the President, Manuel J. Fernos.</p>
<p>They have also provided me with office and lab space in the MIC building to put together the Internet Business Development Center (IBDC). The objective of the MIC is the development and commercialization of ecommerce-related software tools. Emphasis will be given to egovernment and ebusiness solutions.</p>
<p>It looks like I will split my schedule among being the IBDC principal investigator, attending MIC meetings, doing research at Inter-Metro, teaching at PUPR, and writing IRWs. This is exciting news. Let&#8217;s see how things go, especially with the other great news that PUPR&#8217;s ECE&amp;CS department has been accredited by NSA as a CAE.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/29/microsoft-inter-metro-to-co-launch-a-mic/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>AIRWeb 2009 Proceedings</title>
		<link>http://irthoughts.wordpress.com/2009/04/28/airweb-2009-proceedings/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/28/airweb-2009-proceedings/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 15:13:00 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[AIRWeb Course]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=893</guid>
		<description><![CDATA[Here are the AIRWeb 2009 proceedings papers, available at http://airweb.cse.lehigh.edu/2009/proceedings.html
OK, SEOs, Spammers, and Hackers: start your engines and let the fun begin.
If you are a PUPR graduate student and are planning to take my AIR course, it might be a good idea to start browsing through these &#8220;gems&#8221;. Also check the previous AIRWeb proceedings.
Invited [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Here are the AIRWeb 2009 proceedings papers, available at <a href="http://airweb.cse.lehigh.edu/2009/proceedings.html">http://airweb.cse.lehigh.edu/2009/proceedings.html</a></p>
<p>OK, SEOs, Spammers, and Hackers: start your engines and let the fun begin.</p>
<p>If you are a PUPR graduate student and are planning to take my AIR course, it might be a good idea to start browsing through these &#8220;gems&#8221;. Also check the previous AIRWeb proceedings.</p>
<h4>Invited Talks</h4>
<p class="paper">The Potential for Research and Development in Adversarial Information Retrieval — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/Davison-AIRWeb2009-Keynote.pdf">slides</a></p>
<p><span class="authors">Brian D. Davison </span></p>
<p class="paper">Web Spam Challenges: Looking Backward and Forward — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/castillo-challenges.pdf">slides</a></p>
<p><span class="authors">Carlos Castillo</span></p>
<h4>Temporal Analysis</h4>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p1-dai.pdf">Looking into the Past to Better Classify Web Spam</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/Dai- LookingintothePasttoBetterClassifyWeb.pdf">slides</a></p>
<p><span class="authors">Na Dai, Brian D. Davison and Xiaoguang Qi</span></p>
<div class="abstract">Web spamming techniques aim to achieve undeserved rankings in<br />
search results. Research has been widely conducted on identifying<br />
such spam and neutralizing its influence. However, existing spam<br />
detection work only considers current information. We argue that<br />
historical web page information may also be important in spam<br />
classification. In this paper, we use content features from historical<br />
versions of web pages to improve spam classification. We use<br />
supervised learning techniques to combine classifiers based on<br />
current page content with classifiers based on temporal features.<br />
Experiments on the WEBSPAM-UK2007 dataset show that our<br />
approach improves spam classification F-measure performance by<br />
30% compared to a baseline classifier which only considers current<br />
page content.</div>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p9-chung.pdf">A Study of Link Farm Distribution and Evolution Using a Time Series of Web Snapshots</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/airweb2009_chung.pdf">slides</a></p>
<p><span class="authors">Young-joo Chung, Masashi Toyoda and Masaru Kitsuregawa</span></p>
<div class="abstract">In this paper, we study the overall link-based spam structure<br />
and its evolution which would be helpful for the development<br />
of robust analysis tools and research for Web spamming as a<br />
social activity in the cyber space. First, we use strongly connected<br />
component (SCC) decomposition to separate many<br />
link farms from the largest SCC, so called the core. We<br />
show that denser link farms in the core can be extracted by<br />
node filtering and recursive application of SCC decomposition<br />
to the core. Surprisingly, we can find new large link<br />
farms during each iteration and this trend continues until at<br />
least 10 iterations. In addition, we measure the spamicity<br />
of such link farms. Next, the evolution of link farms is examined<br />
over two years. Results show that almost all large<br />
link farms do not grow anymore while some of them shrink,<br />
and many large link farms are created in one year.</div>
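<p>The first step named in this abstract, SCC decomposition, can be sketched on a toy graph. This is a minimal illustration under my own assumptions, not the authors&#8217; code: it uses Kosaraju&#8217;s algorithm (the abstract does not say which algorithm was used), and the node names are made up.</p>

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm: return the SCCs of a directed graph as sets."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: record nodes in order of DFS completion on the original graph.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in reverse completion order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


# Toy web graph: a three-page link ring, a two-page ring, and one leaf page.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"),
             ("c", "d"), ("d", "e"), ("e", "d"), ("e", "f")]
for component in strongly_connected_components(toy_edges):
    print(sorted(component))
```

<p>On a real crawl, the recursive refinement described in the abstract would re-run this decomposition on the largest component after filtering nodes. That part is not shown here.</p>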
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p17-erdelyi.pdf">Web Spam Filtering in Internet Archives</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/erdelyi-timeline-spam- pres.pdf">slides</a></p>
<p><span class="authors">Miklós Erdélyi, András A. Benczúr, Julien Masanes and Dávid Siklósi</span></p>
<div class="abstract">While Web spam is targeted for the high commercial value of top-ranked<br />
search-engine results, Web archives observe quality deterioration<br />
and resource waste as a side effect. So far Web spam filtering<br />
technologies are rarely used by Web archivists but planned in the<br />
future as indicated in a survey with responses from more than 20<br />
institutions worldwide. These archives typically operate on a modest<br />
level of budget that prohibits the operation of standalone Web<br />
spam filtering but collaborative efforts could lead to a high quality<br />
solution for them.<br />
In this paper we illustrate spam filtering needs, opportunities and<br />
blockers for Internet archives via analyzing several crawl snapshots<br />
and the difficulty of migrating filter models across different<br />
crawls via the example of the 13 .uk snapshots performed<br />
by UbiCrawler that include WEBSPAM-UK2006 and WEBSPAM-UK2007.</div>
<h4>Content Analysis</h4>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p21-martinez-romo.pdf">Web Spam Identification Through Language Model Analysis</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/juaner09airweb-pres.pdf">slides</a></p>
<p><span class="authors">Juan Martinez-Romo and Lourdes Araujo</span></p>
<div class="abstract">This paper applies a language model approach to different<br />
sources of information extracted from a Web page, in order<br />
to provide high quality indicators in the detection of<br />
Web Spam. Two pages linked by a hyperlink should be<br />
topically related, even though this were a weak contextual<br />
relation. For this reason we have analysed different sources<br />
of information of a Web page that belongs to the context of<br />
a link and we have applied Kullback-Leibler divergence on<br />
them for characterising the relationship between two linked<br />
pages. Moreover, we combine some of these sources of information<br />
in order to obtain richer language models. Given<br />
the different nature of internal and external links, in our<br />
study we also distinguished these types of links getting a<br />
significant improvement in classification tasks. The result<br />
is a system that improves the detection of Web Spam on<br />
two large and public datasets such as WEBSPAM-UK2006 and<br />
WEBSPAM-UK2007.</div>
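<p>To make the Kullback-Leibler step concrete, here is a small sketch that compares the text around a link with the text of the page it points to. It is only an illustration under stated assumptions, not the authors&#8217; system: whitespace tokenization, unigram models, and additive smoothing with a hypothetical <code>alpha</code> parameter are my choices, and the sample strings are invented.</p>

```python
import math
from collections import Counter


def unigram_model(text, vocabulary, alpha=0.01):
    """Additively smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def kl_divergence(source_text, target_text, alpha=0.01):
    """KL(P_source || P_target) in bits; 0 means identical distributions."""
    vocabulary = set(source_text.lower().split()) | set(target_text.lower().split())
    p = unigram_model(source_text, vocabulary, alpha)
    q = unigram_model(target_text, vocabulary, alpha)
    return sum(p[w] * math.log2(p[w] / q[w]) for w in vocabulary)


anchor = "cheap flights to london"
related_page = "book cheap flights to london and paris"
unrelated_page = "buy pills online casino poker"

# A topically related target should diverge less from the anchor text.
print(kl_divergence(anchor, related_page) < kl_divergence(anchor, unrelated_page))
```

<p>The smoothing matters: without it, any word that appears in the source but not in the target would make the divergence infinite.</p>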
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p29-katayama.pdf">An Empirical Study on Selective Sampling in Active Learning for Splog Detection</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/Katayam-active_learning_blog_spam.pdf">slides</a></p>
<p><span class="authors">Taichi Katayama, Takehito Utsuro, Yuuki Sato, Takayuki Yoshinaka, Yasuhide Kawada and Tomohiro Fukuhara</span></p>
<div class="abstract">This paper studies how to reduce the amount of human supervision<br />
for identifying splogs / authentic blogs in the context<br />
of continuously updating splog data sets year by year.<br />
Following the previous works on active learning, against the<br />
task of splog / authentic blog detection, this paper empirically<br />
examines several strategies for selective sampling in<br />
active learning by Support Vector Machines (SVMs). As a<br />
confidence measure of SVMs learning, we employ the distance<br />
from the separating hyperplane to each test instance,<br />
which have been well studied in active learning for text classification.<br />
Unlike those results of applying active learning<br />
to text classification tasks, in the task of splog / authentic<br />
blog detection of this paper, it is not the case that adding<br />
least confident samples performs best.</div>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p37-biro.pdf">Linked Latent Dirichlet Allocation in Web Spam Filtering</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/Siklosi- LinkedLDA.pdf">slides</a></p>
<p><span class="authors">István Bíró, Dávid Siklósi, Jácint Szabó and András Benczúr</span></p>
<div class="abstract">Latent Dirichlet allocation (LDA) (Blei, Ng, Jordan 2003)<br />
is a fully generative statistical language model on the content<br />
and topics of a corpus of documents. In this paper<br />
we apply an extension of LDA for web spam classification.<br />
Our linked LDA technique takes also linkage into account:<br />
topics are propagated along links in such a way that the<br />
linked document directly influences the words in the linking<br />
document. The inferred LDA model can be applied for<br />
classification as dimensionality reduction similarly to latent<br />
semantic indexing. We test linked LDA on the WEBSPAM-UK2007<br />
corpus. By using BayesNet classifier, in terms of<br />
the AUC of classification, we achieve 3% improvement over<br />
plain LDA with BayesNet, and 8% over the public link features<br />
with C4.5. The addition of this method to a log-odds<br />
based combination of strong link and content baseline classifiers<br />
results in a 3% improvement in AUC. Our method<br />
even slightly improves over the best Web Spam Challenge<br />
2008 result.</div>
<h4>Social Spam</h4>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p41-markines.pdf">Social Spam Detection</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/Markines-social_spam.pdf">slides</a></p>
<p><span class="authors">Benjamin Markines, Ciro Cattuto and Filippo Menczer</span></p>
<div class="abstract">The popularity of social bookmarking sites has made them prime<br />
targets for spammers. Many of these systems require an administrator’s<br />
time and energy to manually filter or remove spam. Here<br />
we discuss the motivations of social spam, and present a study<br />
of automatic detection of spammers in a social tagging system.<br />
We identify and analyze six distinct features that address various<br />
properties of social spam, finding that each of these features provides<br />
for a helpful signal to discriminate spammers from legitimate<br />
users. These features are then used in various machine learning<br />
algorithms for classification, achieving over 98% accuracy in detecting<br />
social spammers with 2% false positives. These promising<br />
results provide a new baseline for future efforts on social spam. We<br />
make our dataset publicly available to the research community.</div>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p49-neubauer.pdf">Tag Spam Creates Large Non-Giant Connected Components</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/airweb_neubauer.pdf">slides</a></p>
<p><span class="authors">Nicolas Neubauer, Robert Wetzker and Klaus Obermayer</span></p>
<div class="abstract">Spammers in social bookmarking systems try to mimic<br />
bookmarking behaviour of real users to gain the attention<br />
of other users or search engines. Several methods have been<br />
proposed for the detection of such spam, including domain specific<br />
features (like URL terms) or similarity of users to<br />
previously identified spammers. However, as shown in our<br />
previous work, it is possible to identify a large fraction of<br />
spam users based on purely structural features. The hypergraph<br />
connecting documents, users, and tags can be decomposed<br />
into connected components, and any large, but non-giant<br />
components turned out to be almost entirely inhabited<br />
by spam users in the examined dataset. Here, we test<br />
to what degree the decomposition of the complete hypergraph<br />
is really necessary, examining the component structure<br />
of the induced user/document and user/tag graphs.<br />
While the user/tag graph&#8217;s connectivity does not help in<br />
classifying spammers, the user/document graph&#8217;s connectivity<br />
is already highly informative. It can however be augmented<br />
with connectivity information from the hypergraph.<br />
In our view, spam detection based on structural features, like<br />
the one proposed here, requires complex adaptation strategies<br />
from spammers and may complement other, more traditional<br />
detection approaches.</div>
<h4>Spam Research Collections</h4>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p53-jones.pdf">Nullification Test Collections for Web Spam and SEO</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/Jones- Nullification_test_collections_for_web_spam_an.pdf">slides</a></p>
<p><span class="authors">Timothy Jones, David Hawking, Ramesh Sankaranarayana and Nick Craswell</span></p>
<div class="abstract">Research in the area of adversarial information retrieval has<br />
been facilitated by the availability of the UK-2006/UK-2007<br />
collections, comprising crawl data, link graph, and spam labels.<br />
However, research into nullifying the negative effect<br />
of spam or excessive search engine optimisation (SEO) on<br />
the ranking of non-spam pages is not well supported by<br />
these resources. Nor is the study of cloaking techniques<br />
or of click spam. Finally, the domain-restricted nature of a<br />
.uk crawl means that only parts of link-farm icebergs may<br />
be visible in these crawls. We introduce the term nullification<br />
which we define as &#8220;preventing problem pages from<br />
negatively affecting search results&#8221;. We show some important<br />
differences between properties of current .uk-restricted<br />
crawls and those previously reported for the Web as a whole.<br />
We identify a need for an adversarial IR collection which is<br />
not domain-restricted and which is supported by a set of<br />
appropriate query sets and (optimistically) user-behaviour<br />
data. The billion-page unrestricted crawl being conducted<br />
by CMU (web09-bst) and which will be used in the 2009<br />
TREC Web Track is assessed as a possible basis for a new<br />
AIR test collection. We discuss the pros and cons of its scale,<br />
and the feasibility of adding resources such as query lists to<br />
enhance the utility of the collection for AIR research.</div>
<p class="paper"><a class="pdf" href="http://airweb.cse.lehigh.edu/2009/papers/p61-benczur.pdf">Web Spam Challenge Proposal for Filtering in Archives</a> — <a class="presc" href="http://airweb.cse.lehigh.edu/2009/slides/erdelyi- challenge-position-pres.pdf">slides</a></p>
<p><span class="authors">András A. Benczúr, Miklós Erdélyi, Julien Masanes and Dávid Siklósi</span></p>
<div class="abstract">In this paper we propose new tasks for a possible future Web Spam<br />
Challenge motivated by the needs of the archival community. The<br />
Web archival community consists of several relatively small institutions<br />
that operate independently and possibly over different top<br />
level domains (TLDs). Each of them may have a large set of historic<br />
crawls. Efficient filtering would hence require (1) enhanced<br />
use of the time series of domain snapshots and (2) collaboration by<br />
transferring models across different TLDs. Corresponding Challenge<br />
tasks could hence include the distribution of crawl snapshot<br />
data for feature generation as well as classification of unlabeled<br />
new crawls of the same or even different TLDs.</div>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/28/airweb-2009-proceedings/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Marketing Professor Kills Three, Hurts Two</title>
		<link>http://irthoughts.wordpress.com/2009/04/26/marketing-professor-kills-three-hurts-two/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/26/marketing-professor-kills-three-hurts-two/#comments</comments>
		<pubDate>Sun, 26 Apr 2009 15:23:27 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Marketing Research]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=882</guid>
		<description><![CDATA[George M. Zinkhan III, from Terry College of Business at the University of Georgia, allegedly went on a killing rampage, killing his ex-wife and two others and injuring two more.
According to his university page (accessible at the time of writing), Zinkhan is the Coca-Cola Company Professor in the Department of Marketing and Distribution. Zinkhan is well known in [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://www.terry.uga.edu/profiles/?person_id=457">George M. Zinkhan III</a>, from Terry College of Business at the University of Georgia, allegedly went on a killing rampage, killing his ex-wife and two others and injuring two more.</p>
<p>According to his university page (accessible at the time of writing), Zinkhan is the Coca-Cola Company Professor in the Department of Marketing and Distribution. Zinkhan is well known in academic marketing research circles, having served as editor of the Journal of the Academy of Marketing Science.</p>
<p>His <a href="http://www.scribd.com/doc/14643257/zinkhanvitae">40-page CV</a> reveals he conducted extensive research on Marketing and Net Advertising.</p>
<p>In 2008 he was part of an <a href="http://www.newcommreview.com/?p=1104">American Marketing Association</a> committee that redefined marketing. The new definition reads:</p>
<blockquote><p>&#8220;Marketing is the activity, set of institutions, and processes for creating, communicating, delivering, and exchanging offerings that have value for customers, clients, partners, and society at large.&#8221;</p></blockquote>
<p>According to the AMA committee,</p>
<blockquote><p>&#8220;Marketing is no longer a function &#8212; it is an educational process.&#8221;</p></blockquote>
<p>Zinkhan published extensively with <a href="http://academic.udayton.edu/yuepan/resume.html">Yue Pan</a>, associate professor of marketing, University of Dayton. He published on the concept of Netvertising (&#8220;Netvertising Characteristics, Opportunities and Challenges: A Research Agenda,&#8221; International Journal of Internet Marketing &amp; Advertising, 1(3), 283-299.). According to their <a href="http://inderscience.metapress.com/app/home/contribution.asp?referrer=parent&amp;backto=issue,4,6;journal,13,15;linkingpublicationresults,1:110872,1">abstract</a>:</p>
<blockquote><p>&#8220;Netvertising, or &#8220;advertising on the internet&#8221;, is attracting much attention from advertising and marketing researchers. However, surprisingly little is known about its new features as compared to other forms of advertising and the implications of the new medium for advertisers. Here, we focus on the following issues: the opportunities and challenges associated with internet advertising; the differences of netvertising from other forms of communication; banner ads – the most popular type of netvertising. Applying this framing perspective, we propose a research agenda for the study of netvertising.&#8221;</p></blockquote>
<p>Netvertising is something search marketers do under various out-of-thin-air theories and naming conventions.</p>
<p>Read more about the <a href="http://www.allbusiness.com/marketing-advertising/advertising-internet-advertising/330948-1.html">Netvertising Image Communication Model (NICM)</a>.</p>
<p>That was then. Today Zinkhan&#8217;s name is associated with a Negative Image on the Net. It is only a matter of time before others dissociate themselves from such an image. Life&#8217;s ironies!</p>
<p><a href="http://www.ajc.com/news/content/metro/stories/2009/04/25/zinkhan_professor_shoot.html">He didn&#8217;t seem to fit</a> the academic stereotype.</p>
<p>Unfortunately, as in any profession, some people cannot cope with their personal misfortunes and end up doing bad things.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/26/marketing-professor-kills-three-hurts-two/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Hackers Hit Pentagon</title>
		<link>http://irthoughts.wordpress.com/2009/04/22/hackers-hit-pentagon/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/22/hackers-hit-pentagon/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 05:00:59 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[AIRWeb Course]]></category>
		<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Newsletters]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=875</guid>
		<description><![CDATA[It happened again: Thanks to Web vulnerabilities, hackers were able to hit the Pentagon. 
According to CNN (http://www.cnn.com/2009/US/04/21/pentagon.hacked/), 
Thousands of confidential files on the U.S. military&#8217;s most technologically advanced fighter aircraft have been compromised by unknown computer hackers over the past two years, according to senior defense officials.
The Internet intruders were able to gain access to data [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:left;">It happened again: Thanks to Web vulnerabilities, hackers were able to hit the Pentagon. </p>
<p style="text-align:left;">According to CNN <strong>(<a href="http://www.cnn.com/2009/US/04/21/pentagon.hacked/">http://www.cnn.com/2009/US/04/21/pentagon.hacked/</a>), </strong></p>
<blockquote><p>Thousands of confidential files on the U.S. military&#8217;s most technologically advanced fighter aircraft have been compromised by unknown computer hackers over the past two years, according to senior defense officials.</p>
<p>The Internet intruders were able to gain access to data related to the design and electronics systems of the Joint Strike Fighter through computers of Pentagon contractors in charge of designing and building the aircraft, according to the officials, who did not want to be identified because of the sensitivity of the issue.</p>
<p>In addition to files relating to the aircraft, hackers gained entry into the Air Force&#8217;s air traffic control systems, according to the officials. Once they got in, the Internet hackers were able to see such information as the locations of U.S. military aircraft in flight.</p></blockquote>
<p style="text-align:left;">This news is quite relevant to my Fall 2009 Web Vulnerability graduate course (<a href="http://www.miislita.com/courses/airweb-web-spam-syllabus.pdf">http://www.miislita.com/courses/airweb-web-spam-syllabus.pdf</a>)</p>
<p style="text-align:left;">By the way, Dr. Alfredo Cruz, Associate Director of the CS Department at PUPR.edu and also a colleague and friend, called me two days ago with some great news: the department has been accredited for 2009-2014 as a National Center of Academic Excellence in Information Assurance Education. Soon it will be listed with the other members of this exclusive &#8220;club&#8221; on the National Security Agency web site (<a href="http://www.nsa.gov/ia/academic_outreach/nat_cae/institutions.shtml">http://www.nsa.gov/ia/academic_outreach/nat_cae/institutions.shtml</a>).</p>
<p style="text-align:left;">An official press release and a formal presentation before the pertinent authorities are being coordinated for the next few weeks.</p>
<p style="text-align:left;">The next issue of IR Watch &#8211; The Newsletter will provide additional coverage of this exciting news.</p>
<p style="text-align:left;">I have tied these two news items together in a single post to underscore the need for IR/data mining courses at the intersection with Information Security, which is precisely the mission statement of IRW, now reaching more than 300 investigators/research centers.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/22/hackers-hit-pentagon/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>McAfee Report: Email Spam and the Environment</title>
		<link>http://irthoughts.wordpress.com/2009/04/16/mcafee-report-email-spam-and-the-environment/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/16/mcafee-report-email-spam-and-the-environment/#comments</comments>
		<pubDate>Thu, 16 Apr 2009 12:57:34 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[AIRWeb Course]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=873</guid>
		<description><![CDATA[According to a McAfee report,
Until now, spam&#8217;s impact has been measured in time, money, and aggravation. It turns out there is a massive environmental impact as well. McAfee recently commissioned climate-change consultant ICF International and spam expert Richi Jennings to calculate the environmental impact of spam. The results that came back were startling: The energy [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>According to a McAfee report,</p>
<blockquote><p>Until now, spam&#8217;s impact has been measured in time, money, and aggravation. It turns out there is a massive environmental impact as well. McAfee recently commissioned climate-change consultant ICF International and spam expert Richi Jennings to calculate the environmental impact of spam. The results that came back were startling: The energy consumed in transmitting and deleting spam is equivalent to the electricity used in 2.4 million U.S. homes, with greenhouse gas (GHG) emissions equivalent to 3.1 million passenger cars(<a href="http://resources.mcafee.com/content/NACarbonFootprintSpam">http://resources.mcafee.com/content/NACarbonFootprintSpam</a>)</p></blockquote>
<p>I first learned about these findings through ABC. Essentially,</p>
<blockquote><p>Anything powered by electricity also emits greenhouse gases. McAfee researchers say each junk e-mail emits 0.3 grams of the greenhouse gas carbon dioxide (CO2). That may not sound like much, but when you consider the volume of global annual spam, it all adds up. (<a href="http://abcnews.go.com/Technology/GlobalWarming/story?id=7343518&amp;page=1">http://abcnews.go.com/Technology/GlobalWarming/story?id=7343518&amp;page=1</a>).</p></blockquote>
<p>Following that reasoning, spamdexing search engines, like any other adversarial information retrieval (AIR) practice, adds insult to injury, as do many other things that come to mind.</p>
<p>I will tell that to students of my Fall 2009 AIRWeb Course.</p>
<p>Hmm, shocking: AIR vs. Environment.</p>
<p>I never thought about such an obvious connection.  <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/16/mcafee-report-email-spam-and-the-environment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Why IDF is Expressed Using Logs</title>
		<link>http://irthoughts.wordpress.com/2009/04/15/why-idf-is-expressed-using-logs/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/15/why-idf-is-expressed-using-logs/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 16:04:42 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[IR Tutorials]]></category>
		<category><![CDATA[SEO Myths]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=865</guid>
		<description><![CDATA[Recently a well-known SEO (name withheld) asked me about some aspects of IDF (Inverse Document Frequency). Below are three of his questions.
I am partially reproducing and editing my responses in case they help other SEOs with similar questions.
Questions 1 and 3 are related so I will answer both now. After that, I will answer question 2.
1) Why [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:left;">Recently a well-known SEO (name withheld) asked me about some aspects of IDF (Inverse Document Frequency). Below are three of his questions.</p>
<p style="text-align:left;">I am partially reproducing and editing my responses in case they help other SEOs with similar questions.</p>
<blockquote><p>Questions 1 and 3 are related so I will answer both now. After that, I will answer question 2.</p>
<p>1) Why is a log function used for calculating IDF?<br />
3) Would it be accurate to describe IDF as &#8220;the ratio of documents in a collection to documents in that collection with a given term&#8221;? I&#8217;m guessing your answer would be, IDF is the [LOG of " the ratio of documents in a collection to documents in that collection with a given term"]? Which brings us back to question, I guess? hehe</p>
<p>These are recurring questions that students have asked me before. Logs are used because of two assumptions frequently made in most IR models; i.e.</p>
<p>I. that scoring functions are additive.<br />
II. that terms are independent.</p>
<p>While in some models II might not be present, both (I and II) play well with logs, since logs turn products into sums and are therefore additive too.</p>
<p>These functions, and the reason for using logs, are explained in the recent RSJ-PM Tutorial <a href="http://www.miislita.com/information-retrieval-tutorial/information-retrieval-probabilistic-model-tutorial.pdf">http://www.miislita.com/information-retrieval-tutorial/information-retrieval-probabilistic-model-tutorial.pdf</a></p>
<p>Document Frequency (DF) is defined as d/D, where d is the number of documents containing a given term and D is the size of the collection of documents. If we take logs we obtain log(d/D).</p>
<p>But since usually D &gt; d, log(d/D) gives a negative value. To get rid of the negative sign, we simply invert the ratio inside the log expression. Taking logs also compresses the scale of values, so that very large and very small quantities can be compared smoothly. The resulting quantity, log(D/d), is conveniently called Inverse Document Frequency.</p>
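<p>As a minimal sketch of the sign flip described above, assuming base-10 logs and a hypothetical collection of D = 1000 documents (neither is specified in this discussion, and the function names are illustrative only):</p>
<pre>
```python
import math

D = 1000  # hypothetical collection size (assumption)


def log_df(d, D):
    """log(d/D): zero or negative, because d never exceeds D."""
    return math.log10(d / D)


def idf(d, D):
    """log(D/d): the same magnitude with the sign flipped."""
    return math.log10(D / d)


# d = number of documents containing the term (made-up values)
for d in (1, 10, 100, 1000):
    print(d, round(log_df(d, D), 3), round(idf(d, D), 3))
```
</pre>
<p>Rare terms (small d) get large IDF values, while a term present in every document gets an IDF of zero.</p>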
<p>Now going back to d/D, this is an estimate of the probability p that a given event has occurred. Let the presence of a term in a document be that event. If terms are independent, it must follow that for any two events, A and B</p>
<p>p(AB) = p(A)p(B).</p>
<p>Taking logs we can write</p>
<p>log[p(AB)] = log[p(A)]+ log[p(B)]</p>
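<p>It is easy to show that for two terms, where d1 and d2 are the numbers of documents containing each term and d12 is the number of documents containing both,</p>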
<p>log(d12/D) = log(d1/D) + log(d2/D)</p>
<p>Inverting and using the definition of IDF we end up with</p>
<p>IDF12 = IDF1 + IDF2</p>
<p>validating assumption I: that IDF, as a scoring function, is additive.</p>
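<p>A quick numerical check of this result, as a sketch under stated assumptions: base-10 logs, hypothetical counts, and exact independence, so that d12/D = (d1/D)(d2/D).</p>
<pre>
```python
import math

D = 10000   # hypothetical collection size
d1 = 500    # hypothetical number of documents containing term 1
d2 = 200    # hypothetical number of documents containing term 2

# Under independence p(AB) = p(A)p(B), so d12/D = (d1/D) * (d2/D).
d12 = d1 * d2 / D


def idf(d, D):
    """Inverse Document Frequency, log(D/d), in base 10 (assumption)."""
    return math.log10(D / d)


idf1, idf2, idf12 = idf(d1, D), idf(d2, D), idf(d12, D)
print(round(idf1, 5), round(idf2, 5), round(idf12, 5))
print(math.isclose(idf12, idf1 + idf2))
```
</pre>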
<p>That is, the IDF of a two-term query is the sum of the individual IDF values. However, this is only valid if terms are independent of one another. If terms are not independent we would have two possibilities; i.e.,</p>
<p>p(AB) &gt; p(A)p(B)</p>
<p>or</p>
<p>p(AB) &lt; p(A)p(B)</p>
<p>and we cannot say that the IDF of a two-term query (e.g., a phrase) is the sum of the individual IDF values. Assuming otherwise, as many SEOs do in order to promote some dumb keyword research tools, is plain snake oil.</p>
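<p>To illustrate with made-up numbers (base-10 logs assumed): if two terms tend to appear together, as the words of a phrase do, the number of documents containing both is larger than independence predicts, and the IDF of the pair falls short of the sum of the individual values.</p>
<pre>
```python
import math

D = 10000            # hypothetical collection size
d1, d2 = 500, 200    # hypothetical document counts for each term
d12_observed = 150   # hypothetical count of documents containing both terms


def idf(d, D):
    """Inverse Document Frequency, log(D/d), in base 10 (assumption)."""
    return math.log10(D / d)


d12_expected = d1 * d2 / D           # count predicted by independence
idf_sum = idf(d1, D) + idf(d2, D)    # what additivity would predict
idf_pair = idf(d12_observed, D)      # IDF from the observed joint count
print(d12_expected, round(idf_sum, 3), round(idf_pair, 3))
```
</pre>
<p>Here the observed joint count is well above the expected one, so adding the two IDF values overstates the IDF of the pair.</p>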
<p>2) What do you mean by &#8216;discriminatory power&#8217; in the phrase &#8220;IDF is a measure of the discriminatory power of a term in a<br />
collection.&#8221;</p>
<p>This is a legacy idea from Robertson and Sparck Jones. The discriminatory power of a term (aka term specificity) implies that terms used too frequently are not good discriminators between documents. If a term is used in too many documents, its ability to discriminate between documents is poor. By contrast, rare terms are assumed to be good discriminators since they appear in few documents.</p></blockquote>
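<p style="text-align:left;">A toy illustration of term specificity, assuming base-10 logs and a made-up four-document collection: a word found in every document gets an IDF of zero, while a word found in a single document gets the highest value.</p>
<pre>
```python
import math

# Hypothetical mini-collection, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the stock market fell",
    "the market for cat food grew",
]
D = len(docs)
doc_sets = [set(doc.split()) for doc in docs]


def idf(term):
    """log(D/d), where d counts the documents containing the term."""
    d = sum(term in words for words in doc_sets)
    return math.log10(D / d)


for term in ("the", "cat", "market", "dog"):
    print(term, round(idf(term), 3))
```
</pre>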
<p style="text-align:left;">The RSJ-PM Tutorial mentioned above was written to kill for good some misconceptions regarding IDF. In it we explain why Robertson and Sparck Jones consider IDF a particular case of the RSJ weight in the absence of relevance information.</p>
<p style="text-align:left;">In a nutshell, IDF is a collection-wide estimate, and as such the information on whether documents containing the query terms are relevant is unknown. Similarly, whether documents not containing the query terms are relevant is unknown, and this often remains unresolved when we just look at the collection-wide ratios d/D and d/(D &#8211; d). All we can say is that relevant documents might have a higher probability of containing query terms than documents from the collection as a whole. But we could make such an assertion without resorting to IDF as well.</p>
<p style="text-align:left;">Web documents are often about multiple topics. Many documents aggregate content from dissimilar sources (news headlines, RSS, blogs, etc.), and their content might change over time. The mere mention of a term (regardless of repetition) is not proof of its relevance or of its importance with respect to the topics discussed in a document.</p>
<p style="text-align:left;">Thus, the idea that we can assess whether terms are relevant to a document by simply comparing IDF values misses the whole point and defeats the purpose for which the RSJ-PM model and many of its variants (e.g., BM25) were developed.</p>
<p style="text-align:left;">I hope this helps to clear up some SEO misconceptions on the topic.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/15/why-idf-is-expressed-using-logs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>Finally SEOs are getting the LSI Myth!</title>
		<link>http://irthoughts.wordpress.com/2009/04/09/finally-seos-are-getting-the-lsi-myth/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/09/finally-seos-are-getting-the-lsi-myth/#comments</comments>
		<pubDate>Thu, 09 Apr 2009 17:47:47 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Latent Semantic Indexing]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=854</guid>
		<description><![CDATA[If you search this blog (IRThoughts) for LSI or visit its Latent Semantic Indexing category you will find many posts wherein SEO LSI Myths are debunked. Prior to this WordPress blog I used to maintain a personal blog wherein SEO myths regarding LSI were also debunked.
Over the years, many realized they were taken by the usual [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:left;">If you search this blog (IRThoughts) for LSI or visit its Latent Semantic Indexing category you will find many posts wherein SEO LSI Myths are debunked. Prior to this WordPress blog I used to maintain a personal blog wherein SEO myths regarding LSI were also debunked.</p>
<p style="text-align:left;">Over the years, many realized they were taken in by the usual agents of misinformation, at least when it comes to &#8220;SEO LSI&#8221; and &#8220;LSI-Friendly&#8221; documents.</p>
<p style="text-align:left;">Recently, I found traffic coming from a blog discussion about a video (<a href="http://www.stomperblog.com/warning-advanced-seo-technique-does-not-work/">http://www.stomperblog.com/warning-advanced-seo-technique-does-not-work/</a>) wherein LSI in relation to Google is debunked.</p>
<p style="text-align:left;">The video also discusses one flavor of LSI; i.e., one wherein the weights are tf-IDF weights. This flavor does not incorporate relevance information or entropy information, as some other LSI variants do.</p>
<p style="text-align:left;">The video does a good job of debunking LSI Myths. However, it makes at least one factually incorrect argument about how the SVD algorithm works.</p>
<p style="text-align:left;">The video gives an example implying that SVD works by reducing a large set of words to a few words, such that, for example, thousands of words are reduced to, let's say, 300 words. This is incorrect, and it is certainly not a trivial flaw.</p>
<p style="text-align:left;">SVD does not work by reducing a vocabulary, but by reducing dimensions, and there are as many dimensions as singular values. This is why it is called a dimensionality-reduction algorithm and not a vocabulary-reduction algorithm. I should stress that an LSI Space is not like a Term Space, wherein each term is a dimension such that there is a 1:1 correspondence.</p>
<p style="text-align:left;">In LSI, the SVD algorithm is used to reduce the dimensions of a matrix; that is, the number of singular values retained.</p>
<p style="text-align:left;">For instance in our SVD and LSI Tutorial series at</p>
<p style="text-align:left;"><a href="http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-5-lsi-keyword-research-co-occurrence.html">http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-5-lsi-keyword-research-co-occurrence.html</a></p>
<p style="text-align:left;">we present an LSI example problem consisting of many words and few initial dimensions, such that for the initial matrix</p>
<p style="text-align:left;">#words &gt;&gt; # initial dimensions</p>
<p style="text-align:left;">More specifically, we used 11 words and 3 dimensions.</p>
<p style="text-align:left;">After truncation, we ended up with 11 words and 2 dimensions.</p>
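<p style="text-align:left;">The point can be sketched with NumPy. The three toy documents below are an assumption made for illustration (they happen to contain 11 distinct words and are not necessarily the tutorial's data); what matters is that truncation changes the number of dimensions, not the number of words.</p>
<pre>
```python
import numpy as np

# Toy documents (assumption for illustration): 11 distinct words, 3 documents.
docs = [
    "shipment of gold damaged in a fire",
    "delivery of silver arrived in a silver truck",
    "shipment of gold arrived in a truck",
]
terms = sorted({w for d in docs for w in d.split()})

# Term-document count matrix: one row per word, one column per document.
A = np.array([[d.split().count(t) for d in docs] for t in terms], dtype=float)

# Thin SVD: A = U diag(s) Vt, with one singular value per dimension.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                            # dimensions kept after truncation
term_coords = U[:, :k] * s[:k]   # every word survives, now with k coordinates
doc_coords = Vt[:k, :].T         # every document, with k coordinates

print(A.shape, len(s), term_coords.shape, doc_coords.shape)
```
</pre>
<p style="text-align:left;">The vocabulary keeps its 11 rows before and after truncation; only the number of columns (dimensions) drops from 3 to 2.</p>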
<p style="text-align:left;">Other than this, the video is fun to watch, but it ends up as an introductory promotion for another SEO proposal.</p>
<p style="text-align:left;"> PS.</p>
<p style="text-align:left;">After reviewing the video several times, I unfortunately found that it makes another incorrect argument.</p>
<p style="text-align:left;">When arguing that Google might not use LSI, the video claims that LSI has to return the same results when word variants such as plurals and tenses are used. This might be the case if stemming is heavily used in an LSI implementation, but stemming is not a requirement for implementing LSI at all.</p>
<p style="text-align:left;">When stemming is not implemented, the SVD reduction will certainly return different results, since the variants are entered into the original term-document matrix, and undergo decomposition, as different tokens.</p>
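<p style="text-align:left;">A small sketch of this point, using made-up documents and a deliberately naive, hypothetical stemmer that only strips a trailing "s": without stemming, a word and its plural occupy separate rows of the term-document matrix, so any decomposition treats them as different tokens.</p>
<pre>
```python
docs = ["link building", "links building", "building links"]  # made-up examples


def crude_stem(token):
    """Hypothetical, deliberately naive stemmer: strip one trailing 's'."""
    return token[:-1] if token.endswith("s") else token


def term_doc_rows(docs, normalize=None):
    """Return {term: [count in doc 0, count in doc 1, ...]}."""
    rows = {}
    for j, doc in enumerate(docs):
        for token in doc.split():
            key = normalize(token) if normalize else token
            rows.setdefault(key, [0] * len(docs))[j] += 1
    return rows


raw = term_doc_rows(docs)                  # no stemming: separate rows
stemmed = term_doc_rows(docs, crude_stem)  # stemming: variants share one row

print(sorted(raw.items()))
print(sorted(stemmed.items()))
```
</pre>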
<p style="text-align:left;">The video also misses where the power of LSI comes from: higher-order co-occurrence connectivity paths hidden (latent) in the original matrix. Terms do not have to be synonyms, related terms, or derivative forms of one another for these hidden paths to be observed in LSI.</p>
<p style="text-align:left;">Terms need not be related to end up clustered with LSI, either. It is the hidden co-occurrence patterns that are behind the clustering. For example, in our SVD and LSI tutorial above, we intentionally used stopwords and zero synonyms/related terms, and these ended up in their corresponding clusters without necessarily being semantically related. This simple example shows that in LSI the SVD algorithm produces an output based on crunching numbers, not on making sense of meaning or intelligence, and it contradicts the widespread opinion that LSI works at the level of meaning. </p>
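<p style="text-align:left;">The notion of a higher-order co-occurrence path can be sketched without running SVD at all. In the made-up example below, two terms never share a document, yet they are linked through a third term that co-occurs with both. This only illustrates what a second-order path is; it does not reproduce how SVD exploits such paths.</p>
<pre>
```python
docs = ["alpha beta", "beta gamma"]  # hypothetical toy documents
terms = sorted({w for d in docs for w in d.split()})

# Binary term-document matrix: rows are terms, columns are documents.
A = [[1 if t in d.split() else 0 for d in docs] for t in terms]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# First-order co-occurrence: C[i][j] counts documents shared by terms i and j.
At = [list(col) for col in zip(*A)]
C = matmul(A, At)
for i in range(len(terms)):
    C[i][i] = 0  # ignore a term's co-occurrence with itself

# Second-order paths: C2[i][j] counts two-step links i - k - j.
C2 = matmul(C, C)

i, j = terms.index("alpha"), terms.index("gamma")
print(C[i][j], C2[i][j])
```
</pre>
<p style="text-align:left;">Nothing in this calculation depends on what the words mean; only the pattern of co-occurrence counts.</p>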
<p style="text-align:left;">I have to conclude that while the video is intended to debunk LSI SEO myths (a noble effort), it uses incorrect arguments and hearsay lines from around the Web. Debunking hearsay with more hearsay: what a shame.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/09/finally-seos-are-getting-the-lsi-myth/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>
	</item>
		<item>
		<title>IRW Newsletter: Web &amp; Data Mining with RIAs</title>
		<link>http://irthoughts.wordpress.com/2009/04/08/irw-newsletter-web-data-mining-with-rias/</link>
		<comments>http://irthoughts.wordpress.com/2009/04/08/irw-newsletter-web-data-mining-with-rias/#comments</comments>
		<pubDate>Wed, 08 Apr 2009 13:39:57 +0000</pubDate>
		<dc:creator>E. Garcia</dc:creator>
				<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Newsletters]]></category>

		<guid isPermaLink="false">http://irthoughts.wordpress.com/?p=847</guid>
		<description><![CDATA[
The current issue of IRW should be in subscribers' inboxes today or tomorrow, at the latest.
In this issue of the newsletter we cover Rich Internet Applications (RIAs) and how these can be used for Web/Data Mining. A RIA is a browser-independent application that can be compiled and run from the desktop.
In this issue:
Featured article: Web [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p style="text-align:center;"><img class="aligncenter" src="http://www.miislita.com/irw/rias.gif" alt="RIAs" /></p>
<p>The current issue of IRW should be in subscribers' inboxes today or tomorrow, at the latest.</p>
<p>In this issue of the newsletter we cover Rich Internet Applications (RIAs) and how these can be used for Web/Data Mining. A RIA is a browser-independent application that can be compiled and run from the desktop.</p>
<p>In this issue:</p>
<p>Featured article: Web &amp; Data Mining with RIAs<br />
QA: Recommended RIAs<br />
Who is Who in IR: Bruce Croft<br />
Top CS Departments: UMass, Amherst<br />
Historical Notes: John von Neumann and Bugs<br />
Outstanding Graduate Theses<br />
Calls and Events<br />
Research Blogs<br />
and more&#8230;</p>
<p>IRW currently reaches a fine audience of university and government researchers and their labs. If you are a graduate student or IR practitioner and want to be known within this exclusive circle, submit a short article (2-3 pages, IRW format, free from marketing and sales pitches) for consideration.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://irthoughts.wordpress.com/2009/04/08/irw-newsletter-web-data-mining-with-rias/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/2d26d7051f681fdbb28379876c940a32?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">irthoughts</media:title>
		</media:content>

		<media:content url="http://www.miislita.com/irw/rias.gif" medium="image">
			<media:title type="html">RIAs</media:title>
		</media:content>
	</item>
	</channel>
</rss>