Some nice features were added today to the Image Crawler in response to requests from current users. Thank you for the feedback.
Some nice features added to the Image Crawler
12 Friday Apr 2013
Posted in Data Mining, IR Tools, Programming
11 Thursday Apr 2013
Posted in Data Mining, IR Tools, Marketing Research, Programming
The Images Crawler has arrived at Mi Islita.com. An easy way to view images from Web documents. Use it to view images from newspapers, forums, social networks, etc. Enjoy it!
08 Monday Apr 2013
Posted in Data Mining, IR Tools, Miscellaneous, News, Programming, Software
Puerto Rico Daily News & Image Searches. Driving traffic to Puerto Rico’s best media sites. The fastest way to find news and images relevant to Puerto Rico. Coming soon to http://www.miislita.com
I think this can be applied to many knowledge domains without repeating the mistakes of similar services across the Web. For now, baby steps.
26 Tuesday Mar 2013
Our popular tool, The Web Crawler, is back! This new iteration of the tool is a lot faster because it is based on a different strategy: extracting sets of HREFs and then refining these to get URLs that qualify for status checks. So the tool also works as a link checker.
Another advantage of the above strategy is this:
A set of HREFs may contain information about absolute and relative URLs, visible and hidden links, internal and external file paths, email addresses, css files, local javascript calls, and anchors (#). A subset of HREFs can also be used as pointers to anchor text information. So, a set of HREFs can be more informative than a mere set of links or URLs as it subsumes both.
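The extract-then-refine idea above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function names `extractHrefs` and `refineToUrls`, the regular expression, and the filtering rules are all assumptions made for the example.

```javascript
// Hypothetical helper: pull raw href values out of an HTML string.
// A regex is enough for a sketch; a real crawler would use an HTML parser.
function extractHrefs(html) {
  const hrefs = [];
  const pattern = /href\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    hrefs.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return hrefs;
}

// Hypothetical refinement step: keep only values that resolve to
// http(s) URLs, so they qualify for a status check.
function refineToUrls(hrefs, baseUrl) {
  const urls = new Set(); // a Set removes duplicates
  for (const href of hrefs) {
    const value = href.trim();
    if (value === "" || value.startsWith("#")) continue; // in-page anchors
    let resolved;
    try {
      resolved = new URL(value, baseUrl); // resolves relative paths
    } catch (err) {
      continue; // not parseable as a URL
    }
    // Drops mailto:, javascript: and other non-web schemes.
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") continue;
    resolved.hash = ""; // same page with a different fragment: check once
    urls.add(resolved.href);
  }
  return [...urls];
}
```

Note that the raw HREF set is kept separate from the refined URL list, so the email addresses, anchors, and script calls dropped in the second step remain available for other kinds of analysis.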
22 Friday Mar 2013
Posted in Data Mining, IR Tools, Marketing Research, Programming, Scripts, Software
We have added to our email crawler
(http://www.miislita.com/email-crawler/email-crawler.php)
the following features:
1. A User Tracking Session (just find the link and click on it) to view current user data.
2. Search for user email addresses in the top search engines and social networks
Give it a try.
We plan to add the tracking session feature to all our pages. This feature is now visible to give you an idea of how it works, but it can be made invisible to users. Geo and search data can be added in a snap.
Why pay monthly fees when you can have your own tracking service, customized to your needs?
01 Friday Mar 2013
Posted in Data Mining, Programming, Software
The Unicode Miner, a tool for data mining characters and symbols from the Unicode System, is now available.
27 Wednesday Feb 2013
Posted in Data Mining, Programming
We are building a nice Unicode mining tool.
BTW, here is a list of nice Unicode myths.
22 Friday Feb 2013
Posted in Data Mining, Fractal Geometry, IR Tools, Programming
With The Color Miner, we have programmatically reconstructed the classic Windows 16-color VGA palette with a few basic algorithmic rules.
We also found that when these rules are iterated on the 16-color VGA palette, the result converges to a 42-color palette, as given below.

Advantages?
The algorithms utilized allow one to:
For additional information and to verify these findings, visit The Color Miner page.
10 Sunday Feb 2013
Posted in Data Mining, Programming
I was struggling to convert a long two-column (A and B) Excel data set into PHP associative array format, and then found out how easy this is with the built-in CHAR() function. This is what I did.
1. For the data in the A1 and B1 cells, I simply entered the following formula into the C1 cell and bingo!
=CHAR(34)&A1&CHAR(34)&CHAR(61)&CHAR(62)&CHAR(34)&B1&CHAR(34)&CHAR(44)
where the character codes 34, 61, 62, and 44 render the double quote, equals sign, greater-than sign, and comma, respectively.
2. Then I just pasted the formula in the remaining cells of column C.
3. Finally, I pasted the result into a php text file, between the parentheses of
$arr=array( );
and removed the last comma from the last array element.
I know there is an easier way by writing a short macro, but this worked for me just fine.
I hope this saves others some time.
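For comparison, the same conversion can be sketched in a few lines of script. This is a minimal sketch and not part of the Excel recipe: the names `phpQuote` and `toPhpArray` are made up for the example, and the rows are assumed to be already available as key/value pairs. Unlike the CHAR() formula, it also escapes double quotes, backslashes, and dollar signs, which would otherwise break a PHP double-quoted string.

```javascript
// Hypothetical helper: wrap a value in double quotes and escape the
// characters that are special inside a PHP double-quoted string.
function phpQuote(value) {
  return '"' + String(value).replace(/[\\"$]/g, "\\$&") + '"';
}

// rows: array of [key, value] pairs, i.e. the A and B columns.
// Joining with commas means there is no trailing comma to remove by hand.
function toPhpArray(rows, name = "arr") {
  const lines = rows.map(
    ([key, value]) => "  " + phpQuote(key) + "=>" + phpQuote(value)
  );
  return "$" + name + "=array(\n" + lines.join(",\n") + "\n);";
}

// Example call with made-up data; paste the printed text into a PHP file.
console.log(toPhpArray([["pr", "Puerto Rico"], ["q", 'say "hi"']]));
```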
03 Monday Dec 2012
Posted in Data Mining, Programming, Scripts
I’m working on a data mining framework library and hit an old problem revisited many times across the Web: how to display HTML code on a page. Thanks to JQuery Syntax Highlighter and similar editors, this is easy to do.
This prompted me to search for an even simpler solution, one without all the nice features of these editors that just does the work of showing code. I came up with the following strategy, tested on recent versions of IE, Opera, Chrome, Safari, and Firefox. I haven’t tested it with older browser versions and cannot claim it will work with those. More tests are necessary to meet other programming/validation requirements.
1. Place HTML code into a hidden textarea.
2. Show it with the following javascript one-line snippet
document.getElementById('id').innerHTML="<pre><code>"+document.getElementsByTagName('textarea')[index].innerHTML+"</code></pre>";
Optionally, both the textarea and snippet can be placed right before the closing body tag.
In the snippet, id is the identifier given to the container tag used to show the HTML code. This container can be a division placed anywhere in the body section of the document and styled with CSS to your heart’s content. The index parameter is the index value of the textarea; i.e. [0] if there is only one textarea in the page, [1] if it is the second textarea, and so forth.
When the textarea is replaced by a noscript tag, the snippet seems to work with the above browsers, except with Firefox (particularly if the HTML code itself contains script tags).
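The textarea trick appears to work because reading innerHTML from a textarea returns its text with the markup characters already turned into entities. As a rough sketch of the same idea without the hidden textarea, assuming the code to display is available as a string, the escaping can be done by hand. The names `escapeHtml` and `toCodeBlock` are hypothetical and not part of the original snippet.

```javascript
// Replace the characters that HTML treats as markup with entities.
// The ampersand must be handled first so entities are not escaped twice.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap the escaped code in the same <pre><code> pair used by the snippet.
function toCodeBlock(source) {
  return "<pre><code>" + escapeHtml(source) + "</code></pre>";
}

// In a browser, the result would be assigned to the container, e.g.:
// document.getElementById('id').innerHTML = toCodeBlock(source);
```

This trades the one-line convenience for not depending on a hidden form element, which may matter if other scripts on the page count textareas by index.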
PS.
If you care about semantics and SEO, simply move the <pre> and <code> tags from the snippet to the division container. Then add the id to the code tag instead of to the division container, like this:
<div><pre><code id='id'></code></pre></div>
This makes the snippet even shorter, like this:
document.getElementById('id').innerHTML=document.getElementsByTagName('textarea')[index].innerHTML;
This isolates programming from presentation, which is always a good practice.
PS2. I used single quotes in the snippet, so if you simply copy and paste it, it won’t raise an error.