World Search Engines and Directories



A new Minerazzi miner: World Search Engines and Directories (

Find top generic and specialty search engines and directories from around the world. Search them by region, country, name, industry, or topic.

Once you find a result, either mine its URL or visit it to keep searching.

Query Examples

Try the following queries with this miner:

[africa search engine], [Australia], [Yahoo], [travel], [news], etc.


A Cosine Similarity Tool and Companion Tutorial


On Cosine Similarity

Cosine similarity is commonly used in data mining and information retrieval as a measure of the resemblance between data sets; that is, of how alike they are. It is an important concept used in Vector Space Theory and affine models.

While there are many tools and tutorials on the subject out there, what is often missing from them is a clear explanation of the underlying meaning and nature of the variables involved.

Did you know that centering data sets by subtracting the corresponding variable means can and will impact the angle between them, and therefore the corresponding cosine similarity? Did you know that said change can be used to assess whether the variables are orthogonal, uncorrelated, both, or neither? Do you know what a cosine similarity of zero actually means?

All these and similar questions are addressed with our cosine similarity tool and companion tutorial. Access them now at



To use the tool, simply enter two data sets and select how they are delimited. Then choose whether to compute their cosine similarity using the data as given (raw mode) or after subtracting each set's mean (centered mode). To interpret the results from either mode, read the companion tutorial.
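The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code; the function names and the sample data are made up for the example. Centered cosine similarity is the same quantity as the Pearson correlation coefficient, which is why comparing both modes tells orthogonal data sets apart from uncorrelated ones.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length data sets (raw mode)."""
    if len(x) != len(y):
        raise ValueError("data sets must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def center(values):
    """Subtract the mean of a data set from each of its values."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def centered_cosine_similarity(x, y):
    """Cosine similarity after centering (equals the Pearson correlation)."""
    return cosine_similarity(center(x), center(y))


# Hypothetical sample data: centering changes the angle between the sets.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
raw = cosine_similarity(x, y)                 # 28/30, about 0.933
centered = centered_cosine_similarity(x, y)   # about 0.6

# Orthogonal in raw mode, yet perfectly negatively correlated once centered.
raw_orth = cosine_similarity([1, 0], [0, 1])                 # 0.0
centered_orth = centered_cosine_similarity([1, 0], [0, 1])   # about -1.0

print(raw, centered, raw_orth, centered_orth)
```

The last pair shows that a raw cosine similarity of zero means the data sets are orthogonal, which is not the same as being uncorrelated.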

The CRAN Miner: Find R web packages and more.



The CRAN Miner is available now at

Search for R web packages from the Comprehensive R Archive Network. Recrawl or search inside a result to find R resources relevant to a given R package.

R is a free software environment for statistical computing and graphics. It is one of the preferred programming languages of big data researchers.

Some news



Just a few news items from Minerazzi:

1. All miners now give users the option of including or excluding results that match parts of a word (substring matching). This is useful for matching plurals, word roots, or word derivatives. Try one now at
2. Slowly but steadily, new tools and tutorials are being added to the platform.
3. We are currently building a new miner for accessing the Comprehensive R Archive Network (CRAN).
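As a rough illustration of the substring matching option mentioned in the first item, the sketch below contrasts whole-word and substring matching using Python's `re` module. It is not how the miners are implemented; the `matches` function, its `allow_substrings` flag, and the sample titles are all hypothetical.

```python
import re


def matches(query, text, allow_substrings):
    """Return True if `query` occurs in `text` (case-insensitive).

    With allow_substrings=False the query must appear as a whole word;
    with allow_substrings=True it may also appear inside a longer word.
    """
    pattern = re.escape(query)
    if not allow_substrings:
        pattern = r"\b" + pattern + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Hypothetical result titles.
titles = ["Statistical journals", "Journal of Statistics", "Stat tools"]

whole = [t for t in titles if matches("journal", t, allow_substrings=False)]
partial = [t for t in titles if matches("journal", t, allow_substrings=True)]

print(whole)    # ['Journal of Statistics']
print(partial)  # ['Statistical journals', 'Journal of Statistics']
```

With substring matching on, the plural "journals" is also matched, which is the behavior described above for plurals and word derivatives.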

JStatMiner – A new miner at Minerazzi



JStatMiner is a new miner built with Minerazzi and available at

Use it to mine top statistical journals from around the world!

Whether you are a researcher, librarian, teacher, or student, you can now have easy access to a huge collection of popular and hard-to-find statistical journals.

Statpacks: A New Miner



This is a new miner built with Minerazzi. Use it to find all kinds of statistical tools and software packages for research, teaching, or business. Search by company or product name.


Find Stata, SAS, SPSS, or similar software solutions. Access Mac, Microsoft, or IBM statistical packages, or links pointing to rare research tools.


Available now at Visit for other equally interesting miners.

The Standardizer: A New Tool at Minerazzi



This tool transforms a data set into z-scores and one/two-tail percentiles.

The tool also computes central tendency and dispersion measures like means, medians, standard deviations, variances, coefficients of variation, and ranges.
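The calculations described above can be sketched with Python's standard `statistics` module. This is an illustration under stated assumptions, not the tool's actual code: it assumes the sample standard deviation (n - 1 denominator), takes the one-tail percentile to be the area of the standard normal curve below z, and takes the two-tail percentile to be the central area between -|z| and +|z|. The tool may define these differently, and the function names and data are made up.

```python
import statistics
from statistics import NormalDist


def standardize(data):
    """Return (value, z-score, one-tail %, two-tail %) for each value."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    rows = []
    for x in data:
        z = (x - mean) / sd
        one_tail = 100 * std_normal.cdf(z)                 # area below z
        two_tail = 100 * (2 * std_normal.cdf(abs(z)) - 1)  # area within +/-|z|
        rows.append((x, z, one_tail, two_tail))
    return rows


def describe(data):
    """Central tendency and dispersion measures for a data set."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return {
        "mean": mean,
        "median": statistics.median(data),
        "stdev": sd,
        "variance": statistics.variance(data),
        # The coefficient of variation is undefined for a zero mean.
        "cv": sd / mean if mean != 0 else float("nan"),
        "range": max(data) - min(data),
    }


# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]
rows = standardize(data)
summary = describe(data)

for x, z, one_tail, two_tail in rows:
    print(f"{x:4}  z={z:6.3f}  one-tail={one_tail:6.2f}%  two-tail={two_tail:6.2f}%")
print(summary)
```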

Available now at

The Self-Weighting Model Tutorial: Parts 1 and 2



This is a two-part tutorial on The Self-Weighting Model (SWM), available at

In part 1, we show how the model provides a solution to the problem of computing valid averages from non-additive quantities. In part 2, we derive the model and show how it could be used for a broad range of engineering, science, information retrieval, and data mining problems where conditional weighted means must be computed.

For other tutorials, visit

A Quantile-Quantile Tutorial

We have restored our quantile-quantile tutorial from our previous site; it is now available at

This is an Excel-based tutorial.

Quantile analysis by means of constructing quantile-quantile plots (q-q plots) is a technique for determining if different data sets originate from populations with a common distribution.

If the common distribution is the normal distribution, a q-q plot can be used to test for normality. The technique is applicable to a wide range of data mining and engineering problems.
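The points of a normal q-q plot can be sketched in Python as follows. The tutorial itself is Excel-based, so this is only an illustration of the idea and not its method. It assumes plotting positions of the form (i - 0.5)/n, one of several conventions in use; the function name and sample data are made up.

```python
from statistics import NormalDist


def normal_qq_points(data):
    """Pair each ordered value with a theoretical standard normal quantile.

    Assumes plotting positions p_i = (i - 0.5) / n for i = 1..n.
    """
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # always strictly between 0 and 1
        points.append((std_normal.inv_cdf(p), value))
    return points


# Hypothetical sample.
sample = [4.9, 5.6, 5.1, 6.3, 5.8, 5.4, 6.0, 5.2]
points = normal_qq_points(sample)

for theoretical, observed in points:
    print(f"{theoretical:7.3f}  {observed:5.1f}")
```

Plotting the observed values against the theoretical quantiles and checking how closely the points follow a straight line gives the visual check for normality described above.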