Archive for the ‘Homeland Security’ Category

Verizon, FCC, and the C Block Competition

March 24, 2008

Now that the B and C blocks of the spectrum have been allotted by the FCC, things are set, ready, go for open mobile broadband networks in the U.S., broadband IR, and, yes, a whole new hacking space. It’s a matter of time. The C Block hacking competition is coming. Never ignore what can be done with such a new playground.

I wonder how the FCC is going to enforce regulations on the 22-MHz portion of the spectrum already handed to Verizon. http://www.pcworld.com/article/id,143705-c,industrynews/article.html

Meanwhile, IR research centered on open broadband networks is needed, and so are search engines.

Search Engines for Penetration Testing

February 21, 2008

Well, I’m getting ready for my talk this afternoon at the University of Turabo. I’ve organized the talk in three parts:

Part 1: Spam and Fraud through Search Engines

Part 2: Gathering Intelligence through Search Engines

Part 3: Identity Theft through Search Engines

A disclaimer will be necessary to indicate that the information to be presented is for educational purposes only.

This is gonna be a nice one. I hope to see old friends.

Web Mining, Search Engines, and Information Security

February 15, 2008

This Thursday the 21st I’ll be presenting the following talk before the faculty of the University of Turabo, Gurabo, PR:

Web Mining, Search Engines, and Information Security

I hope to see old friends there. Here is the abstract of my talk:

Web Mining is a research area of Data Mining wherein the Web is the “database” and search engines are the “user interface”. End-users can resort to search engines for all sorts of things. For instance, marketers can use search engines to gain traffic by ranking Web pages high for specific queries, hence enhancing the online presence of businesses, products, and services (search engine optimization, SEO). Spammers can inundate search engine indexes to deceive searchers (spamdexing). Hackers can attempt to rank high documents that lead to security risks (hacketers, hacketering) or use all forms of injection (links, forms, scripts, redirections, etc). Terrorists and criminals can use search engines to commit all sorts of crime-enabling activities, for instance, by stealing private information like SSNs, passwords, and student and user IDs, gaining access to “private” documentation, stalking people, etc.

This talk covers these and other aspects of search engines: the Good, the Bad, and the Ugly. The speaker will then talk about his own research projects in the area of Web Mining, Search Engines, and Intelligence. A disclaimer will be necessary to indicate that the information to be presented is for educational purposes only.

Web Mining Week 9

January 28, 2008

Week 9 Agenda

Intelligence Searching for Penetration Testers (PPT Presentation)
Searching for Terrorist Threats and Identity Thefts, the SSN Way (PPT Presentation)
Mining VIN numbers, Email Headers, and other Undocumented Commands (PPT Presentation)

Required Reading Material

Provided during lecture.

Thesis: DNIDS Using the CSI-KNN Algorithm

January 4, 2008

Here is a great 2007 MS thesis by Liwei (Vivian) Kuang of the School of Computing, Queen’s University, Kingston, Ontario, Canada: DNIDS: A Dependable Network Intrusion Detection System Using the CSI-KNN Algorithm

I’m happy she cited my Cosine Similarity Tutorial.

Part of the abstract states: “In this thesis, we propose a Dependable Network Intrusion Detection System (DNIDS) based on the Combined Strangeness and Isolation measure K-Nearest Neighbor (CSI-KNN) algorithm. The DNIDS can effectively detect network intrusions while providing continued service even under attacks. The intrusion detection algorithm analyzes different characteristics of network data by employing two measures: strangeness and isolation. Based on these measures, a correlation unit raises intrusion alerts with associated confidence estimates. In the DNIDS, multiple CSI-KNN classifiers work in parallel to deal with different types of network traffic. An intrusion-tolerant mechanism monitors the classifiers and the hosts on which the classifiers reside and enables the IDS to survive component failure due to intrusions. As soon as a failed IDS component is discovered, a copy of the component is installed to replace it and the detection service continues.”

“We evaluate our detection approach over the KDD’99 benchmark dataset. The experimental results show that the performance of our approach is better than the best result of the KDD’99 contest winner. In addition, the intrusion alerts generated by our algorithm provide graded confidence that offers some insight into the reliability of the intrusion detection. To verify the survivability of the DNIDS, we test the prototype in simulated attack scenarios. In addition, we evaluate the performance of the intrusion-tolerant mechanism and analyze the system reliability.”

Resources on the Dark Web

December 14, 2007

A few days ago I reported on the Dark Web Project.

There is one section of that paper that reads (emphasis added):

“IV. Presentations in Seminars or Conferences (PowerPoint) – Password protected; please send request via email and provide a brief explanation of your interest.”

Clicking on the links that follow that statement triggers a history.go(-1) JavaScript call, which simply sends the browser back one page in its history. The source of the document shows a script that asks for the password (which is given as “ailab”), along with the following partial paths to the documents:

publications/conf/WriteprintsandInkBlots.pdf
publications/conf/data%20mining%20webometric%20analysis%203aug05.pdf
publications/conf/SeminarGroupAuthorship.pdf
publications/conf/comparative_03_25_05.pdf
publications/conf/Dark%20Web%20200502.pdf
publications/conf/AASlidesMod.pdf
publications/conf/WebForum0712_2007.ppt
publications/conf/SpecializedContent_2007.ppt
publications/conf/ClearGuidance_2006.ppt

Other than stepping back through the end user’s browser history, I’m not sure what this “password protected” feature accomplishes, since simply prepending http://ai.arizona.edu/research/terror/ to the above paths allows one to access and download the documents anyway.
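To make the point concrete, here is a minimal Python sketch of how a base URL and the partial paths combine into full addresses. It uses only the base URL and two of the paths quoted above; the variable names are mine, and the script only builds strings, it makes no network requests.

```python
from urllib.parse import urljoin

# Base URL quoted in the post; the trailing slash matters for urljoin.
BASE = "http://ai.arizona.edu/research/terror/"

# Two of the partial paths listed in the page source (copied as-is).
partial_paths = [
    "publications/conf/WriteprintsandInkBlots.pdf",
    "publications/conf/Dark%20Web%20200502.pdf",
]

# A relative path resolved against the base gives the full address.
full_urls = [urljoin(BASE, path) for path in partial_paths]

for url in full_urls:
    print(url)
```

Since the paths sit in the page source, a client-side password prompt hides nothing; real protection has to be enforced on the server.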

The article also points to the following great resources:

Reid, E. and Chen, H., “Mapping the Contemporary Terrorism Research Domain.” International Journal of Human-Computer Studies, 65, Pages 42-56, 2007.

Qin, J., Zhou, Y., Reid, E., Lai, G., Chen, H., “Analyzing Terror Campaigns on the Internet: Technical Sophistication, Content Richness, and Web Interactivity,” International Journal of Human-Computer Studies, 65, Pages 71-84, 2007.

H. Chen and F. Wang, “Artificial Intelligence for Homeland Security,” IEEE Intelligent Systems, Special Issue on Artificial Intelligence for National and Homeland Security, pp. 12-16, September/October 2005.

A. Abbasi and H. Chen, “Applying Authorship Analysis to Extremist-Group Web Forum Messages,” IEEE Intelligent Systems, Special Issue on Artificial Intelligence for National and Homeland Security, pp. 67-75, September/October 2005.

Zhou, Y., Reid, E., Qin, J., Lai, G., Chen, H., “U.S. Domestic Extremist Groups on the Web: Link and Content Analysis,” IEEE Intelligent Systems, Special Issue on Artificial Intelligence for National and Homeland Security, pp. 44-51, September/October 2005.

A. Abbasi and H. Chen, “Visualizing Authorship for Identification,” In Proceedings of the Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics (ISI 2006), San Diego, CA, USA, May 23-24, 2006.

J. Wang, T. Fu, H. Lin, and H. Chen, “A Framework for Exploring Gray Web Forums: Analysis of Forum-Based Communities in Taiwan,” In Proceedings of the Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics (ISI 2006), San Diego, CA, USA, May 23-24, 2006.

Y. Zhou, J. Qin, G. Lai, E. Reid, and H. Chen, “Exploring the Dark Side of the Web: Collection and Analysis of U.S. Extremist Online Forums,” In Proceedings of the Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics (ISI 2006), San Diego, CA, USA, May 23-24, 2006.

A. Salem, E. Reid, and H. Chen, “Content Analysis of Jihadi Extremist Groups’ Videos,” In Proceedings of the Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics (ISI 2006), San Diego, CA, USA, May 23-24, 2006.

J. Xu, H. Chen, Y. Zhou, and J. Qin, “On the Topology of the Dark Web of Terrorist Groups,” In Proceedings of the Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics (ISI 2006), San Diego, CA, USA, May 23-24, 2006.

Zhou, Y., Qin, J., Lai, G., Reid E. and Chen, H., “Building Knowledge Management System for Researching Terrorist Groups on the Web,” Proceedings of the AIS Americas Conference on Information Systems (AMCIS 2005), Omaha, NE, USA, August 11-14, 2005.

“Mapping the Contemporary Terrorism Research Domain: Researchers, Publications, and Institutions Analysis,” ISI Conference 2005, Atlanta, GA, May, 2005.

Reid, E., Qin, J., Zhou, Y., Lai, G., Sageman, M., Weimann, G., and Chen, H., “Collecting and Analyzing the Presence of Terrorists on the Web: A Case Study of Jihad Websites,” IEEE International Conference on Intelligence and Security (ISI 2005), Atlanta, Georgia, 2005.

Chen, H., Qin, J., Reid, E., Chung, W., Zhou, Y., Xi, W., Lai, G., Bonillas, A. and Sageman, M., “The Dark Web Portal: Collecting and Analyzing the Presence of Domestic and International Terrorist Groups on the Web,” Proceedings of the 7th International Conference on Intelligent Transportation Systems (ITSC), Washington D.C., October 3-6, 2004.

IRSeek, Polymorphic JavaScript, and Hacketers

December 6, 2007

According to a DarkReading report, IRSeek is a start-up designed to target hackers and their anonymous IRC chat activities. Hacking the hackers?

The report states:

“Hackers favor IRC because it allows them to protect their identities and cover their tracks. But a new search engine startup called IRSeek is now calling those features into question…”

“This could all be bad news for hackers, who don’t want their conversations indexed or searchable by nickname. While they could partially beat the system by simply changing their nicknames frequently, hackers may eventually feel that IRSeek threatens their anonymity, and ultimately, their privacy.”

Here is more on the topic.

Well, this can be fun to watch/test for those who conduct Web Mining for security purposes.

Meanwhile, according to a CNN report, search engine-based hacking attacks are on the rise and becoming a preferred targeting method. This includes link-based spam and polymorphic JavaScript scripts, also referred to as “Polyscripts”, sometimes combined with dark marketing practices. Here is a Top 10 List to watch.

1. Phishing
2. Malicious link injections through forums, blogs to rank high in search engines.
3. Attackers use Web’s ‘weakest links’ to launch attacks.
4. Compromised Web sites will surpass number of created malicious sites.
5. Cross-platform Web attacks.
6. Web 2.0-based attacks.
7. Polymorphic JavaScripts, designed to evade anti-virus scanners.
8. Data concealment methods.
9. Key hacker groups.
10. Vishing and voice spam.

Hackers + Spammers + Crook marketers/SEOs = What A Killer Combination. Compromised sites ranking high means trapping more users in the mess. I wonder how many of the folks from the SEO sphere are involved and making a few bucks. The usual suspects?

Perhaps not all are real SEOs, but as we say in Spanish: “Ante la duda, saluda.”

Here is a nice one: Hacking Duke University to rank high via link injection

And somehow related, how about cracking passwords with Google?

Welcome to a new breed on the rise:

Hacketers = Hackers + Marketers

PS. I coined the name after noticing with the Levenshtein Edit Distance Calculator that only two edits are required to turn hacketers into marketers.

http://www.miislita.com/searchito/levenshtein-edit-distance.html
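For those who prefer to check the two-edit claim offline, here is a minimal Python sketch of the standard dynamic-programming Levenshtein distance. It is not the code behind the calculator linked above; the function name is my own.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # previous[j] = distance between the prefix of a processed so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of a into an empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("hacketers", "marketers"))  # h -> m and c -> r, so 2
```

Both words have nine letters and differ only in the first and third positions, which is where the two substitutions come from.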

Heh, heh. Apparently “peer” pressure forced IRSeek to shut down. Nevertheless, it is still a great concept. I wonder how many of these mole projects are in place all over the Web. Check the whole deadpool story here:

http://www.techcrunch.com/2007/12/03/fastest-deadpool-ever-irseek-shuts-down/#comment-1813205

http://www.irseek.com/blog/

Dark Web Project and Web Mining

November 13, 2007

Prof. Chen of the University of Arizona has a fascinating project on Web Mining applied to Homeland Security called the Dark Web Project, over at http://ai.arizona.edu/research/terror/index.htm

The project is funded by NSF, DHS, CNRI, and the Library of Congress.

From their site:

“The AI Lab Dark Web project is a long-term scientific research program that aims to study and understand the international terrorism (Jihadist) phenomena via a computational, data-centric approach. We aim to collect “ALL” web content generated by international terrorist groups, including web sites, forums, chat rooms, blogs, social networking sites, videos, virtual world, etc.”

“We have developed various multilingual data mining, text mining, and web mining techniques to perform link analysis, content analysis, web metrics (technical sophistication) analysis, sentiment analysis, authorship analysis, and video analysis in our research.”

“The approaches and methods developed in this project contribute to advancing the field of Intelligence and Security Informatics (ISI). Such advances will help related stakeholders to perform terrorism research and facilitate international security and peace.”

“It is our belief that we (US and allies) are facing the dire danger of losing the “The War on Terror” in cyberspace (especially when many young people are being recruited, incited, infected, and radicalized on the web) and we would like to help in our small (computational) way.”

Random Notes and LauraMansfield

September 12, 2007

These are some late random notes. Sorry for the delay.

1. I am putting together a research project for a graduate student. The topic is quite interesting: homeland security. While researching the topic I came across the LauraMansfield.com site. Mansfield’s site is a goldmine of information, especially for those interested in co-occurrence and word association research applied to the terrorist knowledge domain.

2. I am reviewing a graduate thesis in which logistic regression is used for data mining medical claims. The thesis topic is quite interesting. The manuscript needs some rework, though.

3. I am reading bits and pieces of an old paper on the non-transitive nature of Jaccard’s Coefficient and a proposed indirect similarity measure.
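On note 3, a quick Python sketch shows what non-transitivity means for Jaccard’s Coefficient. The three term sets are toy data I made up for illustration, and this is not the indirect similarity measure proposed in that paper.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard's Coefficient: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical term sets: A overlaps B, B overlaps C, but A and C share nothing.
doc_a = {"data", "mining"}
doc_b = {"mining", "security"}
doc_c = {"security", "terrorism"}

print(jaccard(doc_a, doc_b))  # 1 shared term out of 3 distinct terms
print(jaccard(doc_b, doc_c))  # 1 shared term out of 3 distinct terms
print(jaccard(doc_a, doc_c))  # no shared terms, so 0.0
```

A is similar to B and B is similar to C, yet A and C have zero similarity. That gap is what an indirect measure would try to bridge.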

Data Mining and Reports on Terrorism

August 22, 2007

I’m researching the topic of Data Mining (KDD) and Terrorism Information Awareness (TIA) for a graduate course and came across a great old resource:

Data Mining

It is an oldie, but the important part is the references.

It may interest IRs conducting similar research.

Here is another great resource:

Data Mining and Homeland Security

DARPA Agent Markup Language (DAML)

August 1, 2007

The DARPA Agent Markup Language (DAML) site has tons of tools and resources that CS/IR graduate students and SEM/SEO practitioners with some IR knowledge can use for data mining purposes. These can help with nice experiments, from ontology-based keyword discovery to the construction of crawlers (or at least with really learning how these actually work).

Here is a list of resources.
