IR Thoughts

~ Thoughts on Information Retrieval, Data Mining, and Search Engines

Category Archives: Graduate Courses

DNS Intelligence

13 Tuesday Oct 2009

Posted by egarcia in Graduate Courses, Internet Engineering

Today’s Internet Engineering Part 1 course lecture will be on DNS Intelligence and how we can use DNS records to understand virus and worm attacks as well as remote network topologies. Quite handy these days.

Please check Lecture 8

All About Email Headers

06 Tuesday Oct 2009

Posted by egarcia in Graduate Courses, Internet Engineering, Spam

If you are enrolled in the IE-Part 1 course, here is some reference material on Email Headers for today’s lecture:

Exposing email headers

http://www.abs-comptech.com/EmailHeaders.htm

Tracking the source of email spam

http://www.rahul.net/falk/mailtrack.html

How to read email headers

http://www.emailaddressmanager.com/tips/header.html

Reading the email header

http://antivirus.about.com/od/windowsbasics/a/emailheaders.htm

Reading email headers

http://www.tinhat.com/email/read_email_headers.html

Spamlinks: Reading email headers

http://spamlinks.net/track-trace-headers.htm

ACCC: Reading Email Headers

http://www.uic.edu/depts/accc/newsletter/adn29/headers.html

E-mail Headers and SMTP Commands

http://www.avolio.com/columns/E-mailheaders.html

All About Email Headers

http://www.stopspam.org/index.php?option=com_content&view=article&id=45&Itemid=56

Security Optimization Strategies in the Workplace

http://www.miislita.com/searchito/security-optimization-strategies.html

Email Protocols

05 Monday Oct 2009

Posted by egarcia in Graduate Courses, Internet Engineering

If you are a student enrolled in the Internet Engineering I graduate course, check the Lecture 7 update.

We will be covering email protocols such as SMTP, POP3, and IMAP. The exercise section covers email headers intelligence and email crawlers.
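As a rough idea of what email header intelligence can look like in practice, here is a minimal Python sketch that lists the Received headers of a message in the order the message traveled. The raw message, host names, and IP addresses are made up for illustration, and received_hops is a hypothetical helper name, not something taken from the lecture material.

```python
from email import message_from_string

# Hypothetical raw message: hosts, addresses, and dates are made-up examples.
RAW = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby mail.example.org with ESMTP; Tue, 6 Oct 2009 10:00:02 -0400
Received: from sender-pc (unknown [198.51.100.23])
\tby mx.example.net with SMTP; Tue, 6 Oct 2009 10:00:00 -0400
From: alice@example.net
To: bob@example.org
Subject: Hello

Body text.
"""


def received_hops(raw):
    """Return the Received headers in travel order (origin first)."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received", [])
    # Each relay prepends its own Received header, so the topmost header
    # is the last hop; reverse the list to read from origin to destination.
    # split/join collapses the folded (multi-line) header whitespace.
    return [" ".join(hop.split()) for hop in reversed(hops)]


hops = received_hops(RAW)
for number, hop in enumerate(hops, start=1):
    print(number, hop)
```

Keep in mind that Received headers below the first trusted relay can be forged, so a trace like this is a starting point rather than proof of origin.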

DNS Configuration

28 Monday Sep 2009

Posted by egarcia in Graduate Courses, Internet Engineering

If you are a student enrolled in the Internet Engineering I graduate course, check the Lecture 6 update.

I will be covering DNS configuration files. For the hands-on exercise section, we will be using nslookup commands to look up the relevant records of remote Web domains.

Use nslookup /? to display the help for command-line options.
Type nslookup, press Enter, and then type ? on the next line to list the interactive commands.
To quit nslookup, press Ctrl+C, or type quit or exit.
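For those who prefer to script the non-interactive mode, here is a minimal Python sketch that only builds the nslookup command lines for a few record types. The function name nslookup_command and the example.com domain are assumptions for illustration. Nothing is executed here, so no network access is needed.

```python
def nslookup_command(domain, record_type="A", server=None):
    """Build the argument list for a non-interactive nslookup query.

    Follows the general form: nslookup [-option] name [server]
    """
    cmd = ["nslookup", f"-type={record_type}", domain]
    if server:
        # Optional DNS server to send the query to.
        cmd.append(server)
    return cmd


# Print one command line per record type of interest.
for record_type in ("A", "MX", "NS", "SOA", "TXT"):
    print(" ".join(nslookup_command("example.com", record_type)))
```

To actually run a query, the returned list could be passed to subprocess.run(cmd, capture_output=True, text=True). The output format differs between Windows and Unix-like systems, so any parsing would need to be adapted.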

Internet Engineering I: Course Lectures

21 Monday Sep 2009

Posted by egarcia in Graduate Courses, Hacking, Internet Engineering

The following are the lecture and exercise topics covered in the PUPR.edu core graduate course Internet Engineering, Part I. Students enrolled in the course might want to revisit this post as it will be updated.

Lecture 0

History of the Internet & Search Engines

Internet Basics

Lecture 1

RFCs (Request for Comments)

Network Types

IP (Internet Protocol)

Exercise 1 – RFCs, Network types, IP calculations

Lecture 2

OSI Reference Model

ARP

ICMP

Exercise 2 – IP-MAC Mapping, Prompt Commands (arp, ipconfig, nslookup)

Lecture 3

Man-in-the-Middle ARP Attacks

IGMP

IP Packets

Exercise 3 – Broadcast & Multicast IPs, Prompt Commands (netstat, ping, tracert, ipconfig, arp, nslookup)

Lecture 4

Fragmentation Offset

FO Overlapping Attacks

FO Gap Attacks

Tiny FO Attacks

TCP Protocol & Buffers

Exercise 4 – TCP buffers, Congestion Windows, Advertised Windows

Lecture 5

PING

PING of Death

Smurfing

TRACEROUTE-based Intelligence

Exercise 5 – Prompt Commands (arp, ipconfig, nslookup, netstat, ping, tracert)

Lecture 6

BIND & Windows DNS (Domain Name System)

Internet backbone root servers

Configuration Files

DNS Configuration Errors

Forward Lookup (Zone) Files

Reverse Lookup Files

Exercise 6 – Prompt Commands (interactive/non-interactive nslookup modes)

Lecture 7

SMTP

POP3

IMAP

Email Headers

Exercise 7 – Email Intelligence.

Lecture 8

DNS Intelligence

Using DNS records to understand Virus & Worm Attacks

Network Topology Intelligence from DNS records

Exercise 8 – DNS Intelligence

Lecture 9

General Review

Practice Test

Lecture 10

Final Exam, Oct 27

Course Grading System

8 out of 9 hands-on exercises count (the worst exercise grade is dropped)
1st partial exam = average of the first 4 of the best 8 exercise grades
2nd partial exam = average of the last 4 of the best 8 exercise grades
The average of these two is the same as adding up the best 8 grades and dividing by 8. This result amounts to 75% of the total grade (course letter grade score).

The Final Exam amounts to 25% of the total grade.

After that, course letter grade is curved as shown below.

A (100-89%)
B (88-77%)
C (76-60%)
D (59-50%)
F (49-0%)

where

course letter grade score = (sum of best 8 exercise grades/8)*(0.75) + (final exam grade)*(0.25)
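The formula above can be sketched in a few lines of Python. The grades in the example are made up, and the function names are hypothetical. One assumption to note: the letter bands are given as whole numbers, so this sketch treats the lower bound of each band as a cutoff, meaning a fractional score such as 88.5 falls in the B band.

```python
def course_score(exercise_grades, final_exam):
    """Score = (sum of best 8 exercise grades / 8) * 0.75 + final exam * 0.25."""
    # Sorting in descending order and keeping 8 drops the worst of 9 grades.
    best_eight = sorted(exercise_grades, reverse=True)[:8]
    return sum(best_eight) / 8 * 0.75 + final_exam * 0.25


def letter_grade(score):
    """Map a score to a letter using the band lower bounds as cutoffs.

    Assumption: fractional scores between bands fall into the lower band.
    """
    for letter, floor in (("A", 89), ("B", 77), ("C", 60), ("D", 50)):
        if score >= floor:
            return letter
    return "F"


# Made-up example: nine exercise grades and a final exam grade.
exercises = [90, 85, 100, 70, 95, 80, 60, 88, 92]
score = course_score(exercises, final_exam=84)
print(score, letter_grade(score))
```

In this example the 60 is dropped, the best 8 grades add up to 700, and the score works out to (700/8)*(0.75) + (84)*(0.25) = 86.625, which lands in the B band.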

New Graduate Courses

01 Tuesday Sep 2009

Posted by egarcia in Graduate Courses, Hacking, Spam

As PUPR students know by now, the AIRWeb and Internet Engineering courses have been consolidated into a single course called Internet Engineering I (IE-I), which meets on Tuesdays.

This was a decision made strictly by the administration. Twelve graduate students are enrolled, a big number for a grad course. We are now in the fourth week of IE-I, and I can tell you that it is a lot of fun.

This coming Winter semester I’m scheduled to teach a new grad course called Advanced Search Engine Architecture (ASEA). Both IE-I and ASEA are hands-on. This means students need to get their hands and feet wet, not just learn the theory.

What we are trying to accomplish in IE-I is to understand how hackers and spammers use Internet architectures at the level of TCP/IP and Search Engines to game the system. I’ll open a special blog category for it during the week.

The first lecture (Lecture 1) was briefly summarized in the August 2009 issue of IR Watch. BTW, tonight’s lecture (Lecture 4) covers the following:

IP Protocol (MAC and IP Mapping)

ICMP Protocol

ARP Hacking Attacks

ICMP Hacking Attacks

Firewall’s Fragmentation Offset Attacks

Meanwhile, ASEA is an expanded version of the Search Engine Architecture (SEA) course I’ve taught before. Students interested in registering can search this blog for the SEA category and check what we have covered in the past. This will give them an idea of what to expect from the Advanced SEA course. One thing I’m planning to do differently is to build an inverted index from scratch using AJAX. The most recent version of Terrier will also be used for testing and benchmarking experiments.

Last but not least, the September issue of IRW will be a bit delayed.

AIRWeb Course Announcement

02 Thursday Apr 2009

Posted by egarcia in AIRWeb Course, Graduate Courses

During the Fall of 2009, I will be teaching

Adversarial Information Retrieval on the Web: A Graduate Course on Web Spam and Internet Vulnerabilities

This is a new full-semester graduate course to be offered at Polytechnic University of Puerto Rico. It is based on the material presented at the annual AIRWeb Workshops. KDDM graduate students are encouraged to enroll. An early announcement and preliminary syllabus are available at

http://www.miislita.com/courses/airweb-web-spam-syllabus.pdf

BTW, on November 5, 2008, PUPR became the first academic institution in the Caribbean to be certified by the Committee on National Security Systems (CNSS). Additional information is available at http://www.pupr.edu/ias.html

Their goal is to become a Center of Academic Excellence in Information Assurance Education (CAE/IAE). This is great news. Nationwide, how many universities do you know that are in such an exclusive “club”?

Sneak Preview of IRW: Graduate Research

01 Friday Aug 2008

Posted by egarcia in Graduate Courses, Machine Learning, Marketing Research, Theses

The current issue of IRW, Graduate Students Research, is out. It consists of short abstracts of research conducted by graduate students.

In this issue:

Introduction
Genetic Algorithms, K-Means, and Fuzzy C-Means
Word Association Patterns
U-Site Search Engine Interface
Enhancement of a U-Site Search Engine Interface
News, Research, and Events
Terms of Use and Copyright

The next issue will go back to its how-to mode.

IR Quiz

18 Wednesday Jun 2008

Posted by egarcia in Graduate Courses, IR Tutorials, Machine Learning

Here is a question I included in the final examination of the Search Engines Architecture course. I have modified it slightly. It might serve as a little quiz for non-IR readers:

A collection consists of 500 documents, some of which mention the keywords k1 and/or k2: 100 mention k1, 200 mention k2, 70 mention both k1 and k2, and 25 mention the term sequence k1 k2. Calculate the number of results for the following queries, first assuming term independence and then assuming term dependence. If a value cannot be calculated from the data provided, write NC (‘Not Computable’).

1. k1 NOT k2

2. k2 NOT k1

3. k1 OR k2 (unconditional OR)

4. k1 OR k2 (conditional OR)

5. NOT k1

6. NOT k2

7. NOT (k1 AND k2)

8. k1 AND k2 NOT (k1 k2)

9. EF-Ratio of the k1 k2 terms sequence

10. c12-index of the k1 k2 terms sequence

11. c12-index of k1 AND k2

12. IDF of k1

13. IDF of k2

14. IDF of k1 AND k2

15. IDF of k1 k2 terms sequence

Total possible score: 15 points for correct term-independence results and 15 points for correct term-dependence results.

Grading Yourself: A (100 – 90), B (89 – 80), C (79 – 70), D (69 – 60), F (59 – 0)

Correct answers will be given during the week.
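While you wait, here is a minimal Python sketch of the kind of arithmetic involved. It is not the answer key. It covers only plain set arithmetic on the observed counts, the co-occurrence count one would expect if the two terms were independent, and IDF taken as log10(N/n). The base-10 logarithm and the variable names are assumptions. It does not attempt the conditional OR, EF-Ratio, or c12-index items.

```python
import math

N = 500      # documents in the collection
n1 = 100     # documents mentioning k1
n2 = 200     # documents mentioning k2
n12 = 70     # documents mentioning both k1 and k2 (observed)
n_seq = 25   # documents mentioning the k1 k2 sequence

# Set arithmetic on the observed counts.
observed = {
    "k1 NOT k2": n1 - n12,
    "k2 NOT k1": n2 - n12,
    "k1 OR k2 (inclusion-exclusion)": n1 + n2 - n12,
    "NOT k1": N - n1,
    "NOT k2": N - n2,
    "NOT (k1 AND k2)": N - n12,
    "k1 AND k2 NOT (k1 k2)": n12 - n_seq,
}

# If the terms were independent, P(k1 AND k2) = P(k1) * P(k2),
# so the expected co-occurrence count would be n1 * n2 / N.
expected_n12 = n1 * n2 / N


def idf(n, total=N):
    """Inverse document frequency as log10(total / n); the base is an assumption."""
    return math.log10(total / n)


for query, count in observed.items():
    print(query, count)
print("expected k1 AND k2 under independence:", expected_n12)
print("IDF(k1):", round(idf(n1), 4), "IDF(k2):", round(idf(n2), 4))
```

Comparing the observed 70 co-occurrences with the 40 expected under independence is one way to see why the two assumptions lead to different results.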

 

Search Engines Architecture Week 10

16 Friday May 2008

Posted by egarcia in Graduate Courses, Search Engines Architecture Course

Week 10 Agenda

Lecture Session

Other Inverted Index Architectures
Divide-and-Conquer Strategies for Fast Indexing and Searching

Lab Session

Lectures and Lab Review

Final Examination Notes

Next week we have the final examination. This is an open book exam, with theory and practice sections.

To answer the test you need:

#2 pencil.
Calculator.
Working version of Terrier.
Tools developed during the course: parser, crawler, url and query normalizers, stemmer, etc.
Laptop (or a PC will be supplied to you).
