The Almost Binary Heuristic



Yet another bond order calculation heuristic that still fails.

I describe this new heuristic, The Almost Binary Heuristic, in the “What is computed?” section of the bond order calculator at

The tool itself is described in the Bond Order Calculator Tool post at

The Almost Binary Heuristic is aimed at computing bond orders of diatomic species having up to 20 electrons in a straightforward manner. The heuristic can be used to reproduce the results of our bond order calculator tool.

I’ve included the PHP script that generates the so-called “phone number” trick for computing bond orders of diatomic species with up to 20 electrons.

Feel free to copy/rewrite the code with your favorite programming language or use it to build your own bond order calculator tool. Just please keep the credit lines in place. 🙂
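For readers who want a baseline to compare against, here is a minimal Python sketch of the textbook molecular orbital filling scheme. It is not the Almost Binary Heuristic or the PHP script mentioned above; the level table `MO_LEVELS` and the function name `bond_order` are illustrative assumptions.

```python
# Textbook molecular-orbital filling for diatomics with up to 20 electrons.
# Each entry is (capacity, is_bonding). The relative order of the pi(2p) and
# sigma(2p) levels does not affect the bond order, because both are bonding.
MO_LEVELS = [
    (2, True),   # sigma 1s
    (2, False),  # sigma* 1s
    (2, True),   # sigma 2s
    (2, False),  # sigma* 2s
    (4, True),   # pi 2p
    (2, True),   # sigma 2p
    (4, False),  # pi* 2p
    (2, False),  # sigma* 2p
]


def bond_order(total_electrons: int) -> float:
    """Return (bonding - antibonding) / 2 for the given electron count."""
    if not 0 <= total_electrons <= 20:
        raise ValueError("expected between 0 and 20 electrons")
    bonding = antibonding = 0
    remaining = total_electrons
    for capacity, is_bonding in MO_LEVELS:
        placed = min(capacity, remaining)  # fill the level as far as possible
        if is_bonding:
            bonding += placed
        else:
            antibonding += placed
        remaining -= placed
    return (bonding - antibonding) / 2
```

Under this scheme, `bond_order(14)` (N2) gives 3.0, `bond_order(16)` (O2) gives 2.0, and `bond_order(15)` (NO) gives 2.5.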


Vector Space Explorer Tool



The Vector Space Explorer (VSE) Tool is a new tool from Minerazzi, available now at

VSE is aimed at exploring combinations of local (e.g., FREQ, ATF1,…) and global (e.g., IDF, IDFP,…) term weighting schemes for documents and queries. All kinds of combinations can be easily explored.
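To make the local/global distinction concrete, here is a minimal Python sketch that multiplies a local weight (raw term frequency, FREQ) by an optional global weight. The global weight used here is the textbook IDF, log(N/df), which is an assumption and may differ from the exact formula VSE uses; the function name `term_weights` is hypothetical.

```python
import math
from collections import Counter


def term_weights(docs, use_idf=True):
    """Return one {term: weight} dict per document.

    Local weight: raw term frequency (FREQ).
    Global weight: log(N / df) when use_idf is True, else 1.0.
    The IDF form is the textbook one and is assumed, not taken from VSE.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: freq * (math.log(n_docs / df[term]) if use_idf else 1.0)
            for term, freq in counts.items()
        })
    return weights
```

With `use_idf=False` the weights are plain term counts. With `use_idf=True`, a term that appears in every document gets a weight of zero.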

The tool lists results in decreasing order of cosine similarity scores, with or without implementing stopwords removal and parametric corrections.

VSE was developed for computer science students and those interested in information retrieval systems so they can learn how IR systems work.

First-time users may want to try the examples provided by pressing the Try This Example button from the tool. It is possible to cycle through the examples by pressing this button repeatedly. Some of the examples list top titles obtained by querying commercial search engines.

Accepting the default settings instructs VSE to remove stopwords and implement the FREQ Model, also known as the Term Count Model. This is one of the simplest vector space implementations where term weights are mere raw frequencies (term counts).
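The default behavior can be sketched in a few lines of Python: remove stopwords, use raw term counts as weights, and rank documents by cosine similarity in decreasing order. The stopword list and the function names are illustrative assumptions, not VSE's own.

```python
import math
from collections import Counter

# Tiny illustrative stopword list; a real tool would use a much longer one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}


def term_counts(text, remove_stopwords=True):
    """FREQ / Term Count Model: the weight of a term is its raw count."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs, remove_stopwords=True):
    """Return (score, document) pairs in decreasing order of similarity."""
    q = term_counts(query, remove_stopwords)
    scored = [(cosine(q, term_counts(d, remove_stopwords)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

For example, `rank("cat", ["the cat sat", "the dog barked", "cat and cat"])` places "cat and cat" first, "the cat sat" second, and "the dog barked" last.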

Revamping the Cosine Similarity Calculator Tool



One of the most interesting problems in data mining and cluster analysis relates to the transformation of similarities into distances without breaking the triangle inequality condition for a distance metric.

Some of the transformations found in the literature are based on heuristics and tricks of the trade, or based on assumptions applicable to a given knowledge domain. This topic is discussed in our tutorial on distance and similarity.

We have incorporated into our Cosine Similarity Calculator a simple methodology that easily transforms cosine similarities into distances while obeying the requirements for a distance metric. It all boils down to mean-centering the variables.
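One way to see why mean-centering helps is sketched below in Python. The post does not spell out the exact transformation the calculator applies, so the formula used here, the Euclidean distance between mean-centered unit vectors, d = sqrt(2(1 - cos)), is an assumption chosen because it is a true metric. All function names are hypothetical.

```python
import math


def centered_unit(v):
    """Mean-center a vector, then scale it to unit length."""
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0:
        raise ValueError("a constant vector has no direction after centering")
    return [x / norm for x in centered]


def centered_cosine(u, v):
    """Cosine similarity of the mean-centered vectors."""
    a, b = centered_unit(u), centered_unit(v)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(u, v):
    """Assumed transformation: sqrt(2 * (1 - cos)).

    This equals the Euclidean distance between the centered unit vectors,
    so it inherits the triangle inequality from Euclidean distance.
    """
    c = centered_cosine(u, v)
    return math.sqrt(max(0.0, 2.0 * (1.0 - c)))  # clamp tiny rounding errors
```

The distance ranges from 0 (centered cosine of 1) to 2 (centered cosine of -1).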

Check our revamped and improved Cosine Similarity Calculator tool now at

RAR Parser | An RSS, ATOM, and RDF Parser



The RAR Parser is a tool that lets you read RSS, ATOM, and RDF news feeds, without subscribing. It is available at

A practical example: By submitting the MIT Health Sciences and Technology news feed URL, the following news items relevant to COVID-19 were obtained, among others. Results might change as time evolves.

MIT scientist helps build Covid-19 resource to address shortage of face mask

MIT initiates mass manufacture of disposable face shields for Covid-19 response

An experimental peptide could block Covid-19

3 Questions: The risks of using 3D printing to make personal protective equipment
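For those curious about how headlines like these can be pulled from a feed, here is a minimal Python sketch using only the standard library. It is not the RAR Parser's code; the function names are hypothetical, and it assumes the feed XML has already been fetched into a string.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def feed_titles(xml_text):
    """Return the titles of all <item> (RSS, RDF) or <entry> (Atom) elements."""
    root = ET.fromstring(xml_text)
    titles = []
    for element in root.iter():
        if local_name(element.tag) in ("item", "entry"):
            # Take the first non-empty <title> child of the item or entry.
            for child in element:
                if local_name(child.tag) == "title" and child.text:
                    titles.append(child.text.strip())
                    break
    return titles


if __name__ == "__main__":
    sample = (
        "<rss><channel><title>Sample feed</title>"
        "<item><title>An experimental peptide could block Covid-19</title></item>"
        "</channel></rss>"
    )
    print(feed_titles(sample))
```

Matching on the local tag name lets the same loop handle RSS 2.0 items, namespaced RDF items, and Atom entries, while skipping the channel-level title.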

Coronavirus (COVID-19) Miner



Just launched: The Coronavirus COVID-19 Miner

Resources will be added to the index as the coronavirus pandemic evolves. Find databases, research articles, and resources relevant to the coronavirus (COVID-19).

Build your own curated collection by extracting links from search results as well as scripts, contacts, and other types of data from the same results.

You can also use this miner’s news channel to find news, stories, alerts, updates, and more from the CDC and other trusted sources.

The Ideal Gas Law Oracle



Most online calculators reduce the user experience to returning results in response to some input data. As a tools developer, I know this quite well.

I’ve been asking myself, “Why not use a different approach and build calculators that behave as oracles?” By an oracle I mean a black box that converts the input data into a user’s question (the query) and the output (the response) into the answer to the question.

To truly behave as an oracle, said tool should also take care of most of the tasks a user is expected to do. The tool should also “react” to mistakes made by a user.

With that in mind, here is my first attempt at turning an online calculator into an oracle-like tool: The Ideal Gas Law Oracle.

This one even takes care of significant figures and unit conversions. Chemistry teachers and students might find it useful. For instance, teachers can use the tool to add content to lecture notes, quizzes, and tests. Students can use it to double-check exercise results from homework and textbooks.
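The general idea can be sketched in Python as follows. This is not the Oracle's code, only a rough illustration under stated assumptions: the function `ask_ideal_gas` and its helpers are hypothetical, units are fixed to atm, L, mol, and K, and the gas constant is taken as 0.082057 L·atm/(mol·K). The function works out which quantity is being asked for, reacts to invalid input, and rounds the answer to the requested number of significant figures.

```python
import math

R = 0.082057  # gas constant in L·atm/(mol·K)

UNITS = {"P": "atm", "V": "L", "n": "mol", "T": "K"}


def round_sig(value, sig_figs):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - exponent)


def celsius_to_kelvin(t_celsius):
    """Simple unit conversion helper for temperatures given in Celsius."""
    return t_celsius + 273.15


def ask_ideal_gas(P=None, V=None, n=None, T=None, sig_figs=3):
    """Solve PV = nRT for the single quantity left as None.

    Returns (name, value, unit). Raises ValueError on invalid input.
    """
    given = {"P": P, "V": V, "n": n, "T": T}
    unknown = [name for name, value in given.items() if value is None]
    if len(unknown) != 1:
        raise ValueError("leave exactly one of P, V, n, T unspecified")
    if any(value is not None and value <= 0 for value in given.values()):
        raise ValueError("P, V, n, and T (in kelvin) must all be positive")
    target = unknown[0]
    if target == "P":
        result = n * R * T / V
    elif target == "V":
        result = n * R * T / P
    elif target == "n":
        result = P * V / (R * T)
    else:  # target == "T"
        result = P * V / (R * n)
    return target, round_sig(result, sig_figs), UNITS[target]
```

For example, `ask_ideal_gas(P=1.0, n=1.0, T=celsius_to_kelvin(0.0))` returns `("V", 22.4, "L")`, the familiar molar volume at 0 °C and 1 atm.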

Weighted Averages of Correlation Coefficients



There is a great discussion on weighted averages of correlation coefficients at

My most recent comments there are given below.

“The main reason for not averaging correlation coefficients in the arithmetic sense follows.”

“Correlation coefficients cannot be averaged in the arithmetic sense as they are not additive in the arithmetic sense. This is due to the fact that a correlation coefficient is a cosine, and cosines are not additive. This can be understood by mean-centering a paired data set and computing the cosine similarity between the vectors representing the variables involved.”

“If a paired data set violates the bivariate normality assumption (often overlooked, as Seifert correctly asserted), that worsens the picture. However, even if it doesn’t violate bivariate normality, the computed average is a mathematically invalid exercise. If a meta-analysis study is based on these averages, the results can be easily challenged on these grounds.”

“Sample-size weighting is a good start, as Seifert asserted. We can certainly do better. We may compute self-weighted averages from one, more than one, or all of the constituent terms of a correlation coefficient, to account for different types of variability information present in the paired data, which otherwise might be ignored by simply sample-size weighting or applying Fisher Transformations. Which self-weighting scheme to use depends on the source of variability information to be considered.”
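The claim that a correlation coefficient is a cosine can be checked numerically. The Python sketch below computes Pearson's r from its usual definition and compares it with the cosine similarity of the mean-centered vectors; the function names and sample data are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def center(v):
    """Subtract the mean from every component."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0, 1.0, 4.0, 3.0, 6.0]
    # The two values agree up to floating-point rounding.
    print(pearson_r(x, y), cosine(center(x), center(y)))
```

As for non-additivity, consider cosines of 1 and 0, which correspond to angles of 0° and 90°. Their arithmetic mean is 0.5, the cosine of 60°, whereas the mean angle is 45°, whose cosine is about 0.707.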

On the Internet of Senses and Cyborg Organoids



In a previous post on the Internet of Senses, the bell was rung. The human-electronic interfaces and hypersenses revolution is only a matter of time.

Cyborg organoids are real and are here. Below is one bit of the blueprint:

“Cyborg Organoids: Implantation of Nanoelectronics via Organogenesis for Tissue-Wide Electrophysiology”.

Links relevant to this research follow:

697664.full.pdf

“Cyborg” Human Organ Grown in a Dish

Watch the beating of a cyborg heart here:

Internet of Senses Miner



Are you ready for the Internet of Senses? Many reports set 2030 as the definitive year for it, though blueprints of sorts are already circulating. Whether or not you are ready, you can find relevant resources on the topic in the Internet of Senses Miner, which we have just launched. It is a small corpus that we hope to improve as time goes by.

Hypersenses are here to stay, along with new technologies that move away from keyboard searching and social networks and toward mind retrieval and cyborgsocials. That looks like the next marketing frontier. New technologies, academic research opportunities, and grants are just around the corner. Don’t be left behind.
