Excel File for Quantile-Quantile Analysis



This is an Excel .xlsx file for reproducing Table 1 of our tutorial on Quantile-Quantile Plots. Anyone with Excel installed can now play with and explore this simple technique for determining whether a data set is normally distributed.
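For readers without Excel, the same idea can be sketched in a few lines of Python. This is only an illustration, not a copy of the spreadsheet: the plotting position (i - 0.5)/n, the function names, and the sample values are our assumptions here, and the tutorial's Table 1 may use a different convention.

```python
from statistics import NormalDist, mean

def qq_points(data):
    """Return (theoretical z-score, observed value) pairs for a normal Q-Q plot."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: (i - 0.5) / n for rank i = 1..n.
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(zs, xs))

def pearson_r(pairs):
    """Correlation between theoretical and observed quantiles."""
    zs, xs = zip(*pairs)
    mz, mx = mean(zs), mean(xs)
    sxy = sum((z - mz) * (x - mx) for z, x in pairs)
    szz = sum((z - mz) ** 2 for z in zs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / (szz * sxx) ** 0.5

sample = [4.1, 5.3, 4.8, 6.0, 5.1, 4.5, 5.6, 4.9]  # made-up values
points = qq_points(sample)
print(round(pearson_r(points), 3))
```

If the points fall close to a straight line, that is, if the correlation is close to 1, the data are compatible with a normal distribution.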

To download the Excel file, access the most recent update of the tutorial, available at


We also removed a few stray “)” typos that had gone undetected in previous copies.

Have a great Q-Q day!🙂

Text Streamer Tool



The Text Streamer is a new tool available at


Streamline text by removing non-printable or encoded characters and multiple spaces.

The tool converts non-printable characters, including tabs, returns, and newlines, as well as multiple spaces, into single spaces. Users can opt to remove all encodes, that is, characters encoded in percent (%), decimal, or hexadecimal notation.
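The behavior described above can be approximated with two regular expressions. This is a rough sketch and not the tool's source code: the function name `streamline`, the exact encode patterns (percent-encoding such as %20 and numeric entities such as &#160; or &#xA0;), and the choice to replace each encode with a space rather than delete it outright are all our assumptions.

```python
import re

# Assumed encode patterns: percent-encoding (%20), decimal entities (&#160;),
# and hexadecimal entities (&#xA0;). The tool's actual rules may differ.
ENCODES = re.compile(r"%[0-9A-Fa-f]{2}|&#[0-9]+;|&#[xX][0-9A-Fa-f]+;")

# Runs of control characters (tabs, returns, newlines, ...) and whitespace.
NON_PRINTABLE = re.compile(r"[\x00-\x1f\x7f\s]+")

def streamline(text, remove_encodes=False):
    """Collapse non-printable characters and repeated spaces into single spaces."""
    if remove_encodes:
        # Assumption: swap each encode for a space so adjacent words do not merge.
        text = ENCODES.sub(" ", text)
    return NON_PRINTABLE.sub(" ", text)

print(streamline("a\t\tb  \r\nc"))
print(streamline("one%20two&#160;three&#xA0;four", remove_encodes=True))
```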

To use the tool, just enter your input text and submit the form. To remove all encodes, check the form checkbox. Click the output text to select it, then copy and paste it as you usually would.

It comes in handy for users who need to copy and paste streamlined (plain) text from one file type to another, or post it through HTML forms on blogs, discussion forums, social network sites, or any other site for that matter.


BM25F Model Tutorial



We have restored, expanded, and updated our tutorial on the BM25 Extension to Multiple Weighted Fields Model, best known as BM25F. It is now available at


Active links were also added to the References section.

Enjoy it.

Nobel Prize Laureates Miner



This is a new Minerazzi.com miner, available now at


Use it to find resources relevant to laureates of the Nobel Prize. Search by laureates, country, discipline, or field. Find Nobel Prize Laureates in Chemistry, Physics, and other fields.



What is a Precision Matrix?



For completeness, we have added the following content to the Exercises section of the Matrix Inverter tool available at


and mentioned in the post


The following information was found online (Quora, 2013; StackExchange, 2013a, 2013b).

Let Σ be a covariance matrix and Σ⁻¹ its inverse, commonly referred to as the precision matrix.

With Σ, one observes the unconditional covariance (and hence correlation) between a variable i and a variable j by reading off the (i,j)-th entry.

It may be the case that the two variables are correlated but do not directly depend on each other, and that another variable k explains their correlation. By computing Σ⁻¹ we can examine whether the variables are partially correlated or conditionally independent.

Σ⁻¹ displays information about the partial correlations of variables. A partial correlation describes the correlation between variables i and j once you condition on all other variables. If i and j are conditionally independent, then the (i,j)-th element of Σ⁻¹ will equal zero. If the data follow a multivariate normal distribution, then the converse is also true: a zero element implies conditional independence.

In general, Σ⁻¹ is a measure of how tightly clustered the variables are around the mean (diagonal elements) and the extent to which they do not co-vary with the other variables (non-diagonal elements). The higher the diagonal elements, the tighter the variables are clustered around the mean.
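A small worked example may help. The 3 x 3 covariance matrix below is made up for illustration: variables 0 and 2 are correlated, but only through variable 1. The helper names `invert` and `partial_correlation` are hypothetical, and the partial correlation is computed with the standard formula -p_ij / sqrt(p_ii p_jj), where p denotes the entries of the precision matrix.

```python
import math
from fractions import Fraction as F

def invert(matrix):
    """Invert a small nonsingular matrix exactly with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[F(v) for v in row] + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlation(precision, i, j):
    """Correlation between i and j after conditioning on all other variables."""
    return -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])

# Made-up covariance matrix: variables 0 and 2 have covariance 1/4.
sigma = [[F(3, 4), F(1, 2), F(1, 4)],
         [F(1, 2), F(1),    F(1, 2)],
         [F(1, 4), F(1, 2), F(3, 4)]]

precision = invert(sigma)
for row in precision:
    print([str(v) for v in row])
print(partial_correlation(precision, 0, 1))
print(partial_correlation(precision, 0, 2))
```

In this example the covariance between variables 0 and 2 is nonzero, yet the corresponding element of the precision matrix is zero, so their partial correlation vanishes once variable 1 is taken into account.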

In my opinion, this is the simplest explanation of the subject that I have found so far. So there you have a good application for our Matrix Inverter tool.


Matrix Inverter: A Matrix Inversion Tool



This is a new tool, available now at


The tool inverts a square matrix using Gauss-Jordan Elimination.

If the input matrix is non-invertible, a matrix filled with zeroes is returned. This serves as a crude signal that the inversion failed.

A non-invertible square matrix, also called singular or degenerate, is one whose determinant is zero.
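The approach can be sketched as follows. This is not the tool's source code: the function name `invert_or_zeros`, the partial pivoting step, and the tolerance `tol` used to decide that a pivot is zero are our assumptions, added to keep the floating-point arithmetic reasonably stable.

```python
def invert_or_zeros(matrix, tol=1e-12):
    """Invert a square matrix by Gauss-Jordan elimination.

    Returns a matrix of zeroes when the input is (numerically) singular.
    """
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting (assumed): use the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            return [[0.0] * n for _ in range(n)]  # crude "non-invertible" signal
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

print(invert_or_zeros([[4, 7], [2, 6]]))   # determinant is 10
print(invert_or_zeros([[1, 2], [2, 4]]))   # determinant is 0
```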

The tool can be used to double check calculations of small matrices or as a demo resource.

Have a nice invertible day. 🙂

Box-Cox Power Transformations Tool



This is a new statistical and data mining tool, available now at


It greatly simplifies the work of those dealing with data transformation problems.

Enjoy it.

About the tool:

  • This tool lets you transform a data set by applying one or more Box-Cox Power Transformations. The research articles given in the References section of the tool cover this topic.
  • To use the tool, enter one data set value per line. End each line by hitting the Enter key so that the values are recognized as individual entries.
  • To apply multiple transforms, check the preset field.
  • To apply a single transform, uncheck the preset field and enter a p value (p ∈ [-2,+2]).
  • Submit or reset form as needed.
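For reference, here is a minimal sketch of what such a transformation computes, assuming the classic Box-Cox form (x^p - 1)/p for p ≠ 0 and ln(x) for p = 0. The tool may instead use simple powers, and the preset list of p values below is our guess, so treat both as assumptions.

```python
import math

# Assumed preset powers; the tool's own preset list may differ.
PRESET_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2)

def box_cox(x, p):
    """Classic Box-Cox transformation of a single positive value."""
    if x <= 0:
        raise ValueError("Box-Cox transformations require positive values")
    if p == 0:
        return math.log(x)
    return (x ** p - 1) / p

def transform(data, powers=PRESET_POWERS):
    """Apply one or more transformations; returns {p: transformed values}."""
    return {p: [box_cox(x, p) for x in data] for p in powers}

data = [1.0, 2.0, 4.0, 8.0]                # made-up values
single = transform(data, powers=(0.5,))    # a single transform
multiple = transform(data)                 # all preset transforms
print(single[0.5])
print(sorted(multiple))
```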

10-02-2016 Update:

We added a new feature to the tool so it now lets users return all non-negative transforms.

OKAPI BM25 Tutorial



We have restored, refined, and updated this tutorial and added some historical background.


This is a light tutorial on OKAPI BM25, a Best Match model where local weights are computed as parameterized frequencies and global weights as RSJ weights. Local weights are based on a 2-Poisson model and on the verbosity and scope hypotheses, and global weights on the Robertson-Spärck-Jones Probabilistic Model.
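As a rough illustration of how the two kinds of weights combine, here is a sketch of the commonly cited form of BM25. The function names, the default values k1 = 1.2 and b = 0.75, and the toy numbers are assumptions made for this example; the tutorial discusses the model and its parameters in detail.

```python
import math

def rsj_idf(N, n):
    """Global weight: RSJ weight without relevance information.

    N is the number of documents and n the number containing the term.
    """
    return math.log((N - n + 0.5) / (n + 0.5))

def bm25_local(tf, dl, avdl, k1=1.2, b=0.75):
    """Local weight: term frequency, saturated by k1 and length-normalized by b."""
    norm = 1 - b + b * dl / avdl
    return tf * (k1 + 1) / (tf + k1 * norm)

def bm25_score(query_terms, doc_tf, dl, avdl, N, df, k1=1.2, b=0.75):
    """Sum of local weight times global weight over the matching query terms."""
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        if tf and term in df:
            score += bm25_local(tf, dl, avdl, k1, b) * rsj_idf(N, df[term])
    return score

# Toy example: one document of average length in a 1000-document collection.
doc_tf = {"okapi": 3, "catalogue": 1}
df = {"okapi": 10, "catalogue": 200}
print(bm25_score(["okapi", "catalogue"], doc_tf, dl=100, avdl=100, N=1000, df=df))
```

Note that the local weight saturates: no matter how large the term frequency gets, it never exceeds k1 + 1.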


In the early 1980s, Gillian Venner, Nathalie Mitev, and Stephen Walker (1985, 1987) conducted research that led to the design and evaluation of online public access catalogs (OPACs) at the Polytechnic of Central London (PCL).

The project's initial phases spanned from November 1982 to May 1985. The prototype was named OKAPI (Online Keyword Access to Public Information). As Mitev (1985) wrote:

“Designing an online public access catalogue [OPAC]: Okapi, a catalogue on a local area network [LAN] is the final report of a two-year research project ‘Microprocessor networking in libraries’ which was funded by the British Library and the Department of Trade and Industry, and based at the Polytechnic of Central London.”

“The aim was to produce an OPAC on a LAN, that would be readily usable without training or experience, without sacrificing effectiveness or being tedious for experienced users.”

“The result was a functioning prototype OPAC called Okapi, which has a number of distinctive features: use is eased by coloured keys and a lack of jargon; the system uses search decision trees to select a suitable action at each stage of a search, and it performs automatic Boolean and hyper-Boolean functions where appropriate. The OPAC was installed and evaluated in one of the Polytechnic site libraries.”

Want more? Read the tutorial at


Mayaro Virus (MAYV) Miner



This is a new Minerazzi.com miner that is available now at


Research the scientific literature for the Mayaro Virus (MAYV). Read research and news from CDC, NIH, WHO, and other sources. Search by location, site, or health organization. Recrawl search results to build your own curated collection on MAYV.

This is an emerging disease with symptoms similar to those of Chikungunya (CHIKV), but stronger. It is now moving into the Caribbean and, soon, to Puerto Rico and Florida.


Probabilistic Model Tutorial



This is an updated version of a tutorial on the Robertson-Spärck-Jones Probabilistic Model.

It is available now at


The model computes global weights, known as RSJ weights, based on Independence Assumptions and Ordering Principles for probable relevance. The model subsumes IDF and IDFP as RSJ weights in the absence of relevance information.
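To make this concrete, here is a sketch of the RSJ weight in its usual form with the 0.5 corrections. The function and variable names are ours: N is the number of documents, n the number containing the term, R the number of known relevant documents, and r the number of those that contain the term.

```python
import math

def rsj_weight(N, n, R=0, r=0):
    """RSJ weight with the usual 0.5 corrections.

    With R = r = 0 (no relevance information) this reduces to
    log((N - n + 0.5) / (n + 0.5)), an IDF-like weight.
    """
    odds_relevant = (r + 0.5) / (R - r + 0.5)
    odds_non_relevant = (n - r + 0.5) / (N - n - R + r + 0.5)
    return math.log(odds_relevant / odds_non_relevant)

N, n = 1000, 10                        # made-up collection statistics
print(rsj_weight(N, n))                # no relevance information
print(math.log((N - n) / n))           # log((N - n)/n), for comparison
print(math.log(N / n))                 # log(N/n), for comparison
print(rsj_weight(N, n, R=5, r=4))      # 4 of 5 known relevant documents contain the term
```

For a term that is rare in the collection, the three IDF-like values are close to one another, and the weight increases when most of the known relevant documents contain the term.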

Enjoy it.

09-26-2016 Update: A new section was added to the tutorial before the Conclusion section. References were added accordingly. A few lines were edited.

PS: I corrected the original publication date to read “Published: 03-30-2009” which is the correct date. My fault.