In the current issue of IRW, “Constrained Co-Occurrence Searches”, we described cc searching in its two variants: proximity searching and adjacency searching. The difference between these two ways of searching was explained and illustrated with a few examples.

A case was made against the indiscriminate use of the NEAR, proximity, and adjacency searching expressions. A 2005 cc searching algorithm proposed by a research group from the Office of Naval Research (ONR) was also investigated.

In addition, we compared Google’s tilde operator with cc searching. Contrary to SEO opinions, the former is not an LSI operator but is used to look up synonyms; the latter allows users to discover on-topic, in-context terms.

In our tests we have found that discovery performance improves when cc searches are combined with Google commands like allintitle: and allinurl:, as in

allintitle: “car*insurance”

car rental insurance
car driver insurance
car accident insurance
car motor insurance
car dealers insurance

allinurl: “car*insurance”
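Queries of this form are easy to generate for a list of term pairs. The following is a minimal Python sketch, not part of the IRW issue; the function name cc_query and its parameters are hypothetical, and it only builds the query text without contacting any search engine. It uses straight quotes rather than the typographic quotes shown above.

```python
def cc_query(first, last, gap=1, operator=None):
    """Build a cc query string such as "car*insurance".

    first, last: the two search terms.
    gap: number of asterisks placed between them (an assumed knob).
    operator: optional command prefix, e.g. "allintitle" or "allinurl".
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    phrase = '"' + first + "*" * gap + last + '"'
    if operator:
        return operator + ": " + phrase
    return phrase


# Build the two queries used in this post.
for op in ("allintitle", "allinurl"):
    print(cc_query("car", "insurance", operator=op))
```

Keeping the operator as a parameter makes it simple to run the same term pair against titles and URLs and compare the answer sets.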


To expand the text window, add a sequence of asterisks like this:


car insurance and home insurance
car rental loss damage insurance
car home and business insurance
car insurance young driver insurance
car life and commercial insurance

This allows users to retrieve documents in which the search terms are separated by at least three terms. To limit the search window to exactly three terms, the ONR algorithm has been suggested. The IRW issue discusses some advantages and limitations of this algorithm.
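The difference between "at least three" and "exactly three" intervening terms can be checked on retrieved titles or snippets. The sketch below is not the ONR algorithm, which is not reproduced in this post; it is a simple post-filter under our own assumptions: whitespace tokenization, case-insensitive matching, and the hypothetical helper names gap_sizes and within_window.

```python
def gap_sizes(text, first, last):
    """Count the terms between each occurrence of `first` and every later `last`."""
    tokens = text.lower().split()
    gaps = []
    for i, token in enumerate(tokens):
        if token != first:
            continue
        for j in range(i + 1, len(tokens)):
            if tokens[j] == last:
                gaps.append(j - i - 1)
    return gaps


def within_window(text, first, last, n, exact=False):
    """True if some pair is separated by at least n terms (or exactly n if exact)."""
    gaps = gap_sizes(text, first, last)
    if exact:
        return any(g == n for g in gaps)
    return any(g >= n for g in gaps)


titles = [
    "car rental loss damage insurance",      # from the list above
    "car rental insurance",                  # from the earlier list
    "car rental loss and damage insurance",  # hypothetical, four terms apart
]
for title in titles:
    print(title,
          within_window(title, "car", "insurance", 3),
          within_window(title, "car", "insurance", 3, exact=True))
```

In this sketch the third, hypothetical title passes the "at least three" test but fails the "exactly three" test, which is the distinction the paragraph above draws.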

Possible applications include SERP snippet optimization, keyphrase discovery, contextual targeting of terms, and advanced EF-Ratio calculations, among others. It is clear that Web mining of answer sets is possible. On-topic analysis is here to stay.

Subscribe to IRW and stay ahead of the curve. Learn about research that normally does not reach the mainstream.