I’m building a client-side suite of text mining tools for extracting intelligence from text files, Web pages, and email documents. It comes in four versions: basic, intermediate, advanced, and pro. The basic version provides the following reports:
linearization – markup removal
tokenization – punctuation removal
filtration – stopwords removal
sorting – randomness removal
deduplication – dupes removal
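To make the five steps concrete, here is a minimal Python sketch of how they might chain together. It is an illustration only, not the suite's actual code: the function names, the tiny stopword list, and the regex-based markup stripping are all assumptions made for this example.

```python
import re
from html import unescape

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is", "in"}


def linearize(text):
    """Markup removal: drop tags, decode entities, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", unescape(no_tags)).strip()


def tokenize(text):
    """Punctuation removal: lowercase, keep runs of letters, digits, apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def filter_stopwords(tokens):
    """Stopword removal: keep only tokens not in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def sort_tokens(tokens):
    """Randomness removal: put tokens in alphabetical order."""
    return sorted(tokens)


def deduplicate(tokens):
    """Dupes removal: keep the first occurrence of each token, preserving order."""
    return list(dict.fromkeys(tokens))


def basic_report(text):
    """Run all five steps and return the output of each one."""
    linear = linearize(text)
    tokens = tokenize(linear)
    filtered = filter_stopwords(tokens)
    ordered = sort_tokens(filtered)
    unique = deduplicate(ordered)
    return {
        "linearization": linear,
        "tokenization": tokens,
        "filtration": filtered,
        "sorting": ordered,
        "deduplication": unique,
    }
```

Calling `basic_report("<p>The cat &amp; the dog chased the cat.</p>")` returns one entry per step, ending with the deduplicated list `['cat', 'chased', 'dog']`. A regex is a crude way to strip markup; a real HTML parser would be the safer choice for messy Web pages.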
The intermediate version is like the basic, but scores the weights of surviving terms using a variety of weight scales. It also generates specific term weight reports.
The advanced version is like the intermediate, but creates a matrix out of the results. It also creates a report.
The pro version is like the intermediate, but can be instructed to extract specific business intelligence data. Many reports are created, depending on what you are gathering.
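As a rough idea of what scoring surviving terms on several weight scales could look like, here is a small Python sketch. The four scales shown (raw count, relative frequency, log-scaled count, and count normalized by the most frequent term) are common choices picked for illustration; they are assumptions, not necessarily the scales the tool uses, and `term_weights` is a hypothetical name.

```python
import math
from collections import Counter


def term_weights(tokens):
    """Score each term on a few common weight scales (assumed, for illustration)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    max_count = max(counts.values(), default=0)
    return {
        term: {
            "raw": count,                     # plain occurrence count
            "relative": count / total,        # share of all surviving tokens
            "log": 1 + math.log(count),       # dampens very frequent terms
            "max_norm": count / max_count,    # scaled to the top term
        }
        for term, count in counts.items()
    }
```

The input should be the filtered token list before deduplication, since the counts are what carry the weight. For example, `term_weights(["cat", "dog", "chased", "cat"])` gives "cat" a raw weight of 2 and a relative weight of 0.5.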
All good, and a great pet project, but I wasn’t sure what to call it. I wanted a catchy name with some pizzazz, nothing slangy or too techie.
A few minutes of brainstorming led me to resort to all kinds of metaphors.
Then I realized that conducting data mining for business intelligence is no different from paparazzi chasing down their superstar victims. So along came this name: