If you are a chemist, biodesigner, or a researcher working in other fields, eventually you may need to fit a paired data set to a polynomial regression model. You could use software to do that, or build your own solution. This tutorial is aimed at those interested in the latter. Access it now at
Three different methods for implementing polynomial regression are described. Teachers and students might benefit from the tutorial since the calculations can be done with spreadsheet software such as Excel, by writing a computer program, or with a programmable calculator.
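For readers who want to build their own solution, here is a minimal sketch of one common approach: solving the least-squares normal equations directly. It is not necessarily one of the three methods covered in the tutorial, and the function name polyfit_normal_equations is made up for this illustration.

```python
def polyfit_normal_equations(xs, ys, degree):
    """Return least-squares coefficients [b0, b1, ..., bk], lowest power first.

    Illustrative sketch only: builds the normal equations (X^T X) b = X^T y,
    where X[i][j] = xs[i] ** j, and solves them by Gaussian elimination.
    """
    n = degree + 1
    # Entries of X^T X are sums of powers of x; entries of X^T y are sums of y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


# Paired data generated from y = 1 + 2x + 3x^2, so a degree-2 fit should recover it.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = polyfit_normal_equations(xs, ys, degree=2)
```

The normal equations become ill-conditioned for high degrees or widely spread x values, so this form is best suited to small classroom-sized problems.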
The Puerto Rico Statistics Institute (Instituto de Estadísticas de Puerto Rico) has a nice introductory manual on R, written by Dr. Orville M. Disdier. Check it out at
This is Part 2 of a tutorial series on the nonadditivity of correlation coefficients. This time we discuss Fisher r-to-Z and Z-to-r transformations and the risks of arbitrarily implementing these.
Misusing the transformations can have a detrimental impact on significance testing, confidence intervals, reported standard errors of correlations, and meta-analysis.
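As a quick reminder of what the two transformations compute, here is a minimal sketch using the standard definitions Z = atanh(r) and r = tanh(Z). The sample correlations and the unweighted averaging are assumptions made only to show that averaging in r space and averaging in Z space give different answers; the sketch does not claim that either average is appropriate for a given study.

```python
import math


def r_to_z(r):
    """Fisher r-to-Z: Z = atanh(r) = 0.5 * ln((1 + r) / (1 - r)), for -1 < r < 1."""
    return math.atanh(r)


def z_to_r(z):
    """Fisher Z-to-r: r = tanh(Z), the inverse of r_to_z."""
    return math.tanh(z)


# Hypothetical correlation coefficients, chosen for illustration only.
rs = [0.2, 0.9]

# Arithmetic mean taken directly on the r values.
naive_mean = sum(rs) / len(rs)

# Mean taken in Z space and mapped back to r (unweighted, for illustration).
z_mean = z_to_r(sum(r_to_z(r) for r in rs) / len(rs))

print(naive_mean, z_mean)  # the two values differ
```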
This is Part 1 of a tutorial series on the nonadditivity of correlation coefficients. We demonstrate why it is not possible to arithmetically add, subtract, and average Pearson’s r or Spearman’s rs.
The article is available at
07-04-2017 update: On page 1, the line for Beta1 should read “is the slope of a simple linear regression model”. My fault. Fixed today along with a few other minor issues.
Here is a Python-based search engine with an implementation inspired by one of our papers at the old Mi Islita.com site, now a search engine on Puerto Rico.
This is an Excel .xlsx file for reproducing Table 1 of our tutorial on Quantile-Quantile Plots. Now anyone with Excel installed can play and explore this simple technique aimed at determining if a data set is normally distributed.
To download the Excel file, access the most recent update of the tutorial, available at
We also removed a few extra “)” typos that went undetected in previous copies.
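For those who prefer code to a spreadsheet, the Quantile-Quantile idea can be sketched as follows. This is not a reproduction of Table 1: the sample data, the function name qq_points, and the plotting position (i - 0.5) / n are assumptions, and other plotting positions are in common use.

```python
from statistics import NormalDist, mean, stdev


def qq_points(data):
    """Return (theoretical z quantile, standardized observed value) pairs.

    Sketch only: uses the plotting position p = (i - 0.5) / n, one of
    several conventions, and the sample mean and standard deviation.
    """
    xs = sorted(data)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n
        points.append((standard_normal.inv_cdf(p), (x - m) / s))
    return points


# Hypothetical data set; points lying close to a straight line suggest normality.
sample = [4.1, 5.0, 5.2, 5.9, 6.8]
for theoretical, observed in qq_points(sample):
    print(round(theoretical, 3), round(observed, 3))
```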
We have restored, expanded, and updated our tutorial on the BM25 Extension to Multiple Weighted Fields Model, best known as BM25F. It is now available at
Active links were also added to the References section.
It greatly simplifies the work of those dealing with data transformation problems.
Enjoy it.
About the tool:
This tool lets you transform a data set by applying one or more Box-Cox Power Transformations. The research articles given in the References section of the tool cover this topic.
To use the tool, enter one data set value per line. End each line by hitting the Enter key so the values are recognized as individual entries.
To apply multiple transforms, check the preset field.
To apply a single transform, uncheck the preset field and enter a p value (p ∈ [-2,+2]).
Submit or reset the form as needed.
10-02-2016 Update:
We added a new feature to the tool so it now lets users return all non-negative transforms.
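For reference, the textbook one-parameter Box-Cox transform can be sketched as below. The tool's exact formula and its preset powers are not stated here, so the form used and the PRESET_POWERS list are assumptions for illustration; only the range p ∈ [-2,+2] comes from the instructions above.

```python
import math

# Hypothetical preset powers within [-2, +2]; the tool's actual presets may differ.
PRESET_POWERS = [-2, -1, -0.5, 0, 0.5, 1, 2]


def box_cox(x, p):
    """Textbook Box-Cox transform: (x**p - 1) / p for p != 0, ln(x) for p == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if not -2 <= p <= 2:
        raise ValueError("p is expected to lie in [-2, +2]")
    if p == 0:
        return math.log(x)
    return (x ** p - 1.0) / p


def transform_all(data, powers=PRESET_POWERS):
    """Apply every power in `powers` to the data set; returns {p: transformed list}."""
    return {p: [box_cox(x, p) for x in data] for p in powers}


# Hypothetical data set, one value per entry as in the tool.
data = [1.0, 4.0, 9.0]
single = [box_cox(x, 0.5) for x in data]  # single transform with p = 0.5
multiple = transform_all(data)            # one transformed list per preset power
```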
We have restored, refined, and updated this tutorial and added some historical background.
Abstract
This is a light tutorial on OKAPI BM25, a Best Match model where local weights are computed as parameterized frequencies and global weights as RSJ weights. Local weights are based on a 2-Poisson model and the verbosity and scope hypotheses, and global weights on the Robertson-Spärck-Jones Probabilistic Model.
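To make the local-times-global structure concrete, here is a minimal sketch of a BM25 score in its commonly cited form. The parameter defaults k1 = 1.2 and b = 0.75, the function names, and the toy collection statistics are assumptions for illustration, and the tutorial's notation may differ.

```python
import math


def bm25_local(f, dl, avdl, k1=1.2, b=0.75):
    """Local weight: a saturating, length-normalized function of term frequency f.

    dl is the document length and avdl the average document length.
    k1 and b are tuning parameters; the defaults here are assumed values.
    """
    K = k1 * ((1 - b) + b * dl / avdl)
    return f * (k1 + 1) / (f + K)


def rsj_global(N, n):
    """Global weight: RSJ weight with no relevance information.

    N is the number of documents and n the number containing the term.
    """
    return math.log((N - n + 0.5) / (n + 0.5))


def bm25_score(query_terms, doc_tf, dl, avdl, N, df):
    """Sum of local * global weights over the query terms."""
    return sum(
        bm25_local(doc_tf.get(t, 0), dl, avdl) * rsj_global(N, df[t])
        for t in query_terms
    )


# Toy statistics, hypothetical: 1000 documents, one document of average length.
score = bm25_score(
    query_terms=["okapi", "bm25"],
    doc_tf={"okapi": 2},
    dl=100,
    avdl=100,
    N=1000,
    df={"okapi": 10, "bm25": 5},
)
```

The local weight can never exceed k1 + 1, however often a term repeats, and at equal term frequency a longer-than-average document receives a smaller weight than a shorter one.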
Introduction
In the early 80s, Gillian Venner, Nathalie Mitev, and Stephen Walker (1985, 1987) conducted research work that led to the design and evaluation of online public access catalogs (OPACs) at the Polytechnic of Central London (PCL).
The project's initial phases spanned from November 1982 to May 1985. The prototype was named OKAPI (Online Keyword Access to Public Information). As Mitev (1985) wrote:
“Designing an online public access catalogue [OPAC]: Okapi, a catalogue on a local area network [LAN] is the final report of a two-year research project ‘Microprocessor networking in libraries’ which was funded by the British Library and the Department of Trade and Industry, and based at the Polytechnic of Central London.”
“The aim was to produce an OPAC on a LAN, that would be readily usable without training or experience, without sacrificing effectiveness or being tedious for experienced users.”
“The result was a functioning prototype OPAC called Okapi, which has a number of distinctive features: use is eased by coloured keys and a lack of jargon; the system uses search decision trees to select a suitable action at each stage of a search, and it performs automatic Boolean and hyper-Boolean functions where appropriate. The OPAC was installed and evaluated in one of the Polytechnic site libraries.”
This is an updated version of a tutorial on the Robertson-Spärck-Jones Probabilistic Model.
It is available now at
The model computes global weights, known as RSJ weights, based on Independence Assumptions and Ordering Principles for probable relevance. The model subsumes IDF and IDFP as RSJ weights in the absence of relevance information.
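A minimal sketch of the RSJ weight in its commonly cited form with 0.5 corrections is shown below. The symbols N, n, R, and r and the function name are assumptions for illustration, and the tutorial's notation may differ. Setting R = r = 0 shows how the weight reduces to an IDF-like expression when no relevance information is available.

```python
import math


def rsj_weight(N, n, R=0, r=0):
    """RSJ weight with 0.5 corrections.

    N: documents in the collection
    n: documents containing the term
    R: documents known to be relevant
    r: relevant documents containing the term
    """
    relevant_odds = (r + 0.5) / (R - r + 0.5)
    nonrelevant_odds = (n - r + 0.5) / (N - n - R + r + 0.5)
    return math.log(relevant_odds / nonrelevant_odds)


# Hypothetical collection: 1000 documents, 10 of which contain the term.
without_relevance = rsj_weight(N=1000, n=10)          # log((N - n + 0.5) / (n + 0.5))
with_relevance = rsj_weight(N=1000, n=10, R=10, r=8)  # term is frequent among relevant docs
```

Note that without relevance information the weight turns negative once a term appears in more than half of the documents.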
Enjoy it.
09-26-2016 Update: A new section was added to the tutorial before the Conclusion section. References were added accordingly. A few lines were edited.
PS: I corrected the publication date to read “Published: 03-30-2009”, which is the correct date. My fault.