Archive for the ‘Miscellaneous’ Category

Random notes prior to 4th July weekend

July 3, 2009

As the 4th of July weekend approaches, here are some notes before heading off to planet oblivion.

1. Yesterday we had an interesting business entrepreneur meeting with the CIO of the Government of Puerto Rico at El Palacio Rojo, Fortaleza.

2. IRW should be out by Monday. Main article: Data Mining Texting.

3. Only monkeys still believe in KD Myths. Ha, Ha.

Microsoft, Inter-Metro to Co-Launch a MIC

April 29, 2009

This afternoon, Microsoft, in partnership with The Interamerican University of Puerto Rico, Metropolitan Campus (Inter-Metro), will announce that they are officially co-launching the Microsoft Innovation Center (MIC) of Puerto Rico.

This will be the first MIC in the region. A two-story building has been fitted out within the Inter-Metro campus for this project. As a member of the MIC steering committee, I have been invited to the presentation by President Manuel J. Fernos.

They have also provided me with office and lab space in the MIC building to put together the Internet Business Development Center (IBDC). The objective of the MIC is the development and commercialization of ecommerce-related software tools. Emphasis will be given to egovernment and ebusiness solutions.

It looks like I will split my schedule between being the IBDC principal investigator, MIC meetings, doing research at Inter-Metro, teaching at PUPR, and writing IRWs. This is exciting news. Let’s see how things go, especially with the other great news that PUPR’s ECE&CS department has been accredited by NSA as a CAE.

Matrix Multiplication in Excel

January 7, 2009

The QA section of the current issue of our IRW Newsletter has a practical piece of knowledge.

Question

In Excel, how do you multiply any two matrices M1 and M2 to get a third one M3?

Answer

I assume you know how to define an array in Excel (i.e., the first and last cell of a selected rectangular region defines an array).

Thus, let M1 be a matrix of r1 rows and c1 columns and M2 be a matrix of r2 rows and c2 columns. Let c1 = r2; that is, M1 must have as many columns as M2 has rows. Multiplying M1 times M2 results in a matrix M3 of r1 rows and c2 columns (square only when r1 = c2); i.e.

M1 * M2 = M3

To carry out this operation in EXCEL, do this:

1. Open a spreadsheet and enter M1 and M2 numerical data. Next, select an array region (r1 x c2) by dragging your mouse on the spreadsheet. This will be M3.

2. Type =MMULT(Array1, Array2) in the (fx) field, where Array1 and Array2 are the cell ranges holding M1 and M2. Next, press F2.

3. Finally, press Ctrl + Shift + Enter and witness the magical black box of EXCEL.

An example is provided in Figure 1 of the newsletter.
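To double-check a spreadsheet result outside Excel, the same product can be sketched in a few lines of plain Python. This is only an illustration: the function name mmult is hypothetical (borrowed from Excel's MMULT), the sample matrices are made up, and the sketch assumes the usual rule that M1 has as many columns as M2 has rows.

```python
def mmult(m1, m2):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    r1, c1 = len(m1), len(m1[0])
    r2, c2 = len(m2), len(m2[0])
    if c1 != r2:
        raise ValueError("columns of M1 must equal rows of M2")
    # M3 has r1 rows and c2 columns; each cell is a row-by-column dot product.
    return [
        [sum(m1[i][k] * m2[k][j] for k in range(c1)) for j in range(c2)]
        for i in range(r1)
    ]


m1 = [[1, 2, 3],
      [4, 5, 6]]          # 2 rows x 3 columns (made-up data)
m2 = [[7, 8],
      [9, 10],
      [11, 12]]           # 3 rows x 2 columns (made-up data)

m3 = mmult(m1, m2)
print(m3)                 # [[58, 64], [139, 154]]
```

Entering the same two arrays in a spreadsheet and applying MMULT over a 2 x 2 region should give the same four values.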

From time to time, more complex math operations will be described in future issues of IRW.

It is that time of the Year

December 16, 2008

It is that time of the year:

Layoffs:

It is that time of the year when companies give pink slips. Yahoo layoffs started early this month. Sun Micro slashed 6,000, Sony 8,000, and CBS plans to send pink slips at TV.com, MP3.com, CNET, and Gamespot.com.

http://www.newsoxy.com/cbs/article11488.html

True colors showing:

It is that time of the year when companies show their true colors. According to WSJ, search engines and ISPs are betraying network bandwidth neutrality.

http://online.wsj.com/article/SB122929270127905065.html

SSN Myths

December 8, 2008

From time to time we hear of some urban legends and myths in connection with social security numbers (SSNs).

One myth has it that SSNs label citizens based on their race or origins. Another myth is that a number can be decoded to spell out names. Let’s debunk these myths.

Regarding the first myth, according to the SS Administration site (http://www.ssa.gov/history/ssnmyth.html):

“Apparently due to the fact that the middle digits of the SSN are referred to as the “group number,” some people have misconstrued this to mean that the “group number” refers to racial groupings. So a myth goes around from time-to-time that encoded in a person’s SSN is a key to their race. This simply is not true.”

“As should be clear from the explanation of the SSN numbering scheme, the “group number” refers only to the numerical groups 01-99. For filing purposes, the “area numbers” are broken down into these numerical subgroups. So, for example, for area numbers starting with 527 there would be 99 subgroups, one for every number starting with 527-01, and one for every number starting with 527-02, and so on. This was done back in 1936 because in that era there were no computers and all the records were stored in filing cabinets. The early program administrators needed some way to organize the filing cabinets into sub-groups, to make them more manageable, and this is the scheme they came up with.”

“So the “group number” has nothing whatever to do with race.”

Still, some folks like this Google user heard that the fifth digit of an SSN is odd for whites, but even for African-Americans and minorities. Not true.

Regarding the second myth, some have claimed that flipping an SSN might closely spell or encode a name, word, message, etc.

For instance, in February 2008 Google won the Dylan Stephen Jayne v. Google Founders lawsuit. Jayne claimed that his social security number, turned upside down, spelled ‘Google’. He was seeking $5 billion in compensation.

The United States Court of Appeals for the Third Circuit, on appeal from the United States District Court for the Middle District of Pennsylvania (PDF), dismissed the case and ruled in favor of Google:

“As explained by the District Court, Google and its founders are not state actors, and Jayne’s allegation concerning his coded social security number does not constitute a violation of the Constitution or federal law. We also agree that any amendment of the complaint would be futile.”

I don’t know about you, but to me, based on pure speculation and the font family used, ‘Google’ flipped upside down resembles 216009.

But, there is a problem: this sequence can appear anywhere in a candidate SSN (beginning, end, etc).

True, we can narrow down the possible sequences since, according to the SSA site, the middle two digits of a valid SSN cannot be ‘00’. Even so, three more digits are needed to complete a 9-digit sequence. Can you guess how to obtain these?

Still, this guessing exercise does not amount to a case. When it comes to guessing/gaming, you have the right to guess/game all you want to guess/game.

Now, for those who believe in things like Numerology, Kabbalah, etc., 216009 can be reduced to 18 (2+1+6+0+0+9) and then to 9 (1+8). Upside down, 9 resembles a 6 or a G, which is the first letter in Game, Gaming, Google, and God.
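For the curious, the reduction is nothing more than repeated digit summing. A minimal Python sketch follows; the function names are hypothetical and chosen only for illustration.

```python
def digit_sum(n):
    """Add up the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))


def reduce_to_single_digit(n):
    """Return every intermediate value while summing digits down to one digit."""
    steps = [n]
    while n >= 10:
        n = digit_sum(n)
        steps.append(n)
    return steps


print(reduce_to_single_digit(216009))   # [216009, 18, 9]
```

Any number can be pushed through the same loop, which is part of why this kind of "decoding" proves nothing.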

I have placed this post in the SEO Myth category just because the underlying nature of the above myths resembles the dumb nature of many of the myths promoted by SEOs.

Don’t yang about it. Goodbye, Yang.

November 19, 2008

Finally, Yahoo!’s CEO Jerry Yang is gone. Unfortunately, statements like the following show how disconnected from reality he was:

“despite the external environment we face, the fact remains that yahoo! is now a significantly different company that is stronger in many ways than it was just 18 months ago.”

http://news.cnet.com/8301-13578_3-10100809-38.html

Ha, Ha.

The only thing strong about Yahoo! is that:

they are a strong sale.

they are a strong target.

they are strongly looking forward to pre-Christmas layoffs.

I have seen this so often: make a buy offer to a competitor. If he refuses to sell, just wait and buy it later for less. Microsoft is good at playing this game.

Link Sellers from 1995

June 20, 2008

I was looking for the oldest evidence of marketing firms formally selling links and came across this one from 1995 that predates Google and most current search engines. Back then the Internet-on-a-Disk newsletter was hot.

Their November 1995 issue (http://bubl.ac.uk/archive/journals/ioad/n1395.htm) reports this:

A NEW KIND OF ADVERTISING — Webconnect http://www.worldata.com/webcon.htm These folks act as hyperlink brokers. They have signed up hundreds of Web sites. They go to potential advertisers and offer them a package deal. For $X per month, you can have hyperlinks to your Web site from Y Web sites which attract the kinds of audiences you want to appeal to. The revenue is shared with the Web sites, which have the right to refuse any advertiser they don’t feel is appropriate for them.

They contacted us about a month ago, and now already we have our first “advertiser” — The Encyclopedia Britannica. For including a hypertext link to their site (with a little graphic), we receive $45 a month. That’s not bad considering the run our entire Web site on free space that we get with our $29 a month SLIP account with TIAC. So our one advertiser more than pays for our Internet access and our Web space. And the advertiser is a company we’re glad to help promote – they have a site that we have wanted to point to anyway as an important educational resource (http://www.eb.com)

*********************************
“HIT-VITATIONS” – WHAT’S GOING ON? AND HOW DO YOU PLAY THIS GAME? by Richard Seltzer, B&R Samizdat Express
I never expected that blatant commerical advertising would work on the Internet. The medium is much better suited for providing detailed information to people who want it, when they want it, and how they want it. Surprisingly, some of the much travelled on-ramp sites like Netscape are showing impressive results from “hyper- banner” advertising. I recently spoke with Kathleen Gilroy of Kathleen Gilroy Associates, a distance education company in Cambridge, Mass.. In exchange for sponsorship of an Internet training program, she got a hyperlinked “banner” on the Netscape site. The result was 500,000 hits on her Web site in the first month (http://www.kga.com).

Well, if you learn anything from dealing with the Internet and human behavior there, it’s that you’ve got to expect the unexpected and adjust quickly to change.

So is advertising “in” now? Is that the way to go?

I’ve heard people comparing hits or visits at a Web site to responses to a direct mail campaign. That seems far-fetched — not the right ballpark, not the right order of magnitude in terms of predicting audience behavior.

The first-time visitor who clicks to your site by way of a hyper-banner does so on random impulse. You’ve generated some street traffic by making it easy for people to impulsively move in your direction from some other site — a click costs the user little time and almost no effort — little thinking is involved — curiosity is enough.

When you buy an ad on television or in a newspaper, you are buying an opportunity to catch the attention of an established audience. When you buy a hyper-banner on the Internet, you buy an opportunity to induce people to come to your site and be (at least once) part of your audience. You have not yet begun to catch their attention.

A reminder and invitation to check a website (not a direct ad for a product or service) is a step or two removed from traditional advertising. It is audience acquisition for another program.

Once they “hit” your site, you have an opportunity to catch their interest, to provide them with useful information or an enjoyable experience or a discussion with people of like mind. You have earned a chance to give them good reason to come back again and again to your site. If, at that point, you simply shove a blatant ad in their face or ask them to fill out a long form before you let them see or do anything else, you could be throwing away that opportunity.

In other words, a hyper-banner is a “hit-vitation,” an invitation to hit another site. And the success of this approach does not mean that blatant advertising is thriving on the Internet.

In the Hit-vitation business, you are in do-it-yourself mode. Your Web site is the equivalent of a publication or a broadcast station – run by you. You need to build an audience — by serving an audience — before you can expect to get results. And raw hits – randomly gleaned from pointers and paid-for banner links — are not an audience, they are just an opportunity to build an audience.

Generating hits by way of hyperlink invitations is analogous to acquiring a list of prospects for one-time direct-mail use. These people have not yet even seen, much less read, an ad or marketing material, and the vast majority, once at your site, will do the equivalent of throwing your marketing material in the wastebasket. In other words, this is a step removed from direct mail responses, and marketers should set their expectations of results accordingly.

At this point in the evolution of commerce on the Internet, the experience of the user with a Web site is simply too complex to reduce to statistics. For the long term, success should be measured not by hits or visits but by some index of user loyalty — how likely they are to retun again and again. For today, remember that if you pay for a banner/link, you are sending out invitations to anyone and everyone to click on over to your site and take a look. And what that’s worth to you depends on what you have at your site — how useful and compelling people find it.

I still believe that the most interesting opportunities on the Internet are likely to come from serving audiences rather than selling advertising.

In my ideal model, you provide a place where people can interact with one another about matters of common interest; you provide related free information and useful pointers; and once you have built an audience and interact with those people regularly, you begin to provide them with services and products which they need. The better you serve them, the more likely you are to be successful. And in this mode very small operations could be very profitable and very beneficial as well.

****************************************************
REACTIONS TO “HIT-VITATIONS”
by Tom Camp, camp@zeke.enet.dec.com

Some interesting thoughts regarding “hit-vitation”. Another way to view these interesting “sign-posts” is from the perspective of someone driving down a city street loaded with signs for organizations (e.g. churchs, clubs, etc.), businesses (stores, commercial sites, etc.) and leisure activities (theatres, parks, amusements, etc.).

The Internet allows individuals to return to first days of driving (a.k.a. teenagers) when “cruising” in and of itself was compelling. While cruising, we looked at all the signs. They were new, exciting and had never been seen from the drivers seat. A great way to just enjoy ourselves as we thrilled at the freedom. Computers prior to the Internet didn’t allow us much freedom, you know. We saw the same view of office applications and accounting programs, spreadsheets, lists, etc.

When we first drove our cars, we may have driven by those signs thousands of times and driven into a few parking lots and browsed in some stores. Slowing to check things out, talking with people on the sidewalk – just enjoying the thrill. The places we checked out had a high degree of relationship to our interests.

As we matured though, driving became routine and lost some of its thrill and excitment. We went from one place to another because we had a purpose. Sometimes that purpose was to browse or loose ourselves for few hours in a Mall or store that we liked, but most often it was guided by a very specific purpose. When driven by such a purpose, every red light, yellow light, traffic jam and small yellow volkswagon in front of us proved a maddening distraction. Eventually we stop only where we have a purpose.

Much of what we’re pursuing with the Internet today is an attempt to match our Internet content and services to purposes which people find compelling in their lives. For the consumer market this will not be an easy task. The business user will benefit significantly in the short term for all the reasons you’ve described before.

Obviously, we need to understand more about the habits and effects of maturation of the Internet driver. I know I still act like a teenager sometimes, clicking and clicking and clicking… But when I’m looking for specific information on a company or a product – I want it NOW (one click away). Long delays (regardless what the cause) drive me to look for a horn to blow or some gesture to make at some faceless Webmaster in the sky. I maintain my hot list and constantly scribble URLs to avoid those long lines.

As a marketeer, I know there is power in this new medium. Measuring its effectiveness will keep us all employeed for many years to come. I agree with your concept that return is important. But as in life on the road, for some sites how often is not so important — pure hits may be. The type of site is a critical component in measuring how successful it is. For example, a site which provides information for a specific event might be effectively measured on total hits, while a commercial site offering a variety of information over time might be better measured by some combination of new hits and returns.

Over time we’ll see an evolution of sites and a maturity of users. As with any new market, niches will evolve that we can’t anticipate today and specialized services will develop to meet these needs. For us the challenge is to keep looking to identify these trends and help characterize them, measure their success and build (as we say) compelling solutions.

Just a few thoughts…

**********************************************************
HOW DO YOU DEFINE SUCCESS? THE REAL RESULTS DIRECTORY

People who run Web sites have many different objectives — from making the world a better place to live, to building a business or both — and hence they have very different definitions of success and methods of trying to achieve it. If you run a Web site and believe that it brings you results, send email to samizdat@samizdat.com and ask for “results.txt”, and we’ll send back a questionnaire. We’ll gather the responses (no hobbies and personal pages please — just sites designed to produce results), and we’ll make them available to all for free on our Web site at http://www.tiac.net/users/samizdat/results.html (Remember, we’re just getting started. There’s not much to see yet.)

We hope that by sharing our experiences we can help one another make better use of this strange and exciting new medium. And at the same time, this is a vehicle for those who run Web sites to let people know what they’re doing and why, and why people should visit.

We’re calling this project “Real Results: The directory of successful Web sites.” Please spread the word.

I am not claiming these are the earliest link brokers, and they probably aren’t.

If anyone has pointers to earlier link brokers, let me know, as there is nothing new under the Sun. Nevertheless, it is great fun reading about how online marketing components started. I always learn something new by reading or lecturing on pieces of online history.

Send me pointers to the original sources, not articles some marketer wrote or compiled about the history of the Internet. I am putting together material for a new lecture titled: The History of Online Marketing.

It is time for Yang to leave Yahoo!

June 19, 2008

After Yahoo’s CEO Jerry Yang blew it with Microsoft and before the likely August proxy fight between the Icahn and Yang sides, more executives are leaving the company. Flickr’s founders Caterina Fake and Stewart Butterfield have joined the exodus of senior Yahoo managers.

Also missing in action are:

Usama Fayyad, Chief Data Officer
Jeff Weiner, President of Yahoo’s Network division
Jeremy Zawodny, Software Developer/blogger and well known in SEO/SEM conferences

Meanwhile, Yahoo! is making the same dumb mistake many businesses made in the ’90s: enslaving its revenue model to the revenue model of others (Google, in this case).

Google is effectively controlling Yahoo!’s revenue stream channels and getting away with it. Great.

Schofield reports that TechCrunch is compiling a Then-Now table of former Yahoo! executives.

Update: TechCrunch reports rumors that three more executives are leaving Yahoo! These are:

Vish Makhijani, the SVP and General Manager of Search
Brad Garlinghouse, head of Communications & Communities at Yahoo (Mail, Groups, Messenger, Flickr, and Zimbra)
Qi Lu, EVP of engineering for the Search and Advertising Technology Group and chief architect of the Panama platform.

Wait! It is getting worse!

Today TechCrunch announced that Joshua Schachter, founder of Delicious, is also leaving Yahoo!.

And in a new twist, Information Week reports that Jason Zajac, general manager of social media at Yahoo is leaving.

 

Microsoft: Putting Dollar Value to Searches

May 22, 2008

According to VNUNET.com,

Microsoft has introduced a service which offers ad-funded cash rebates to customers who search for and buy products.

The Live Search cashback portfolio includes more than 10 million product offers from more than 700 merchants.

Early adopters of the service include eBay, Barnes & Noble, Overstock.com, Sears and Zappos.com.

The cost-per-action (CPA) model, in which advertisers pay each time a click results in a sale, is one of the best advertising models around. Combine this with a cash rebate plan for consumers and you have a win-win revenue model.

Those who were born to hate Microsoft, that giant of the software world, will surely find something wrong with this or any other move from Microsoft, simply because it comes from Bill Gates. I disagree with these folks, many of whom are eager to justify in their minds “the other Microsoft”; that is, the one of the search world: Google.

I remember when GoTo (later Overture) put a dollar value to searches. Many SEOs and average users swore not to use a search engine that allows competitors and advertisers to buy their way to the top. Look around now. Google jumped in front of the parade “a la Microsoft” and a few lawsuits later we are where we are.

History repeats itself. Who will jump in front of the parade now?

Cell Phone Spam

May 19, 2008

Cell phone spam: Hum. Nothing new, but it is more prevalent than ever.

Yesterday a local newspaper (El Nuevo Dia) featured the article “Los spams ahora atacan a los celulares” (“Spam now attacks cell phones”), in which a few local sources were interviewed on the subject.

Unfortunately they all seem to miss the point.

Telephone companies are undoubtedly making money from spam, and quite a lot. So, why kill the money making machine? Duh!

Don’t just take my word for it. Look around for a second opinion like this one:

Verizon Won’t Help You Filter Out SMS Spam Because It Makes Them Money

If that is not enough, then check why

Angry Customers Sue T-Mobile Over Texting Charges.

Indeed, cell phone spam “is the perfect storm of annoying attributes. It audibly interrupts your life like telemarketing”.

Search Engines Cache in the Times of Drug Busts

May 7, 2008

One nice thing about modern search engines is that they give users access to cached pages. These are old versions of pages that reside, often precompressed, in a specific section of the search engine’s architecture.

Unless the owner or administrator of a site instructs search engines (via metadata or a robots text file) not to cache a document, old versions will be available to end users via the cache command or via a cache link next to a search result.

This feature comes in handy for those who use search engines for intelligence purposes. A lot of useful information can be found by searching for cached documents. At the same time, once-glorious old pages can become unwanted.

Ask San Diego State University’s Marketing and Communication Department. Out of embarrassment, they just removed the document listed at http://advancement.sdsu.edu/marcomm/features/2006/compact.html in which they feature a role model student (Kenneth Ciaccio), who was arrested yesterday on charges in connection with an on-campus drug bust operation.

The page is still showing up in Google’s cache and reflects badly on SDSU and its Compact for Success program. To access this in Google just do a search and click the cache link or enter in the query box cache:url where url is the address of the above document.
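As an illustration of the metadata route, site owners commonly use a robots meta tag carrying the noarchive directive. The Python sketch below checks a page's HTML for that directive using only the standard library; the class and function names and the sample markup are hypothetical, and how a given search engine honors the tag is up to that engine.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def asks_not_to_cache(html):
    """True if the page carries a robots meta tag with 'noarchive'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


# Made-up sample pages for illustration.
page_a = '<html><head><meta name="robots" content="noarchive"></head></html>'
page_b = '<html><head><title>Old glorious page</title></head></html>'

print(asks_not_to_cache(page_a))   # True
print(asks_not_to_cache(page_b))   # False
```

A page like the second one, with no such directive, is the kind that can linger in a cache after the live copy is gone.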

 

Pink Keywords: Optimization of Resumes and Job Applications

May 5, 2008

The current slump in the US and PR economy, with so many local employers giving pink slips, makes me think of the importance of pink keywords.

These are keywords one would use to optimize resumes and job applications.

Now more than ever, recruiters, middle management, and HR departments need to look through zillions of resumes, looking for specific clues in the form of pinky keywords. This means that resumes and job applications must be optimized for such terms.

http://career-advice.monster.com/resume-writing-basics/Keyword-Challenge/home.aspx

The best way of finding good pinky keywords consists in selling employers back their own crappy ads and job offers; that is, scanning employment ads, job offerings, and classifieds relevant to the target position one is interested in and then using the target terms in your own resume. Another thing one can do is expand these with related or contextual terms; of course, using only those that match your own experience and skills.

I see here an opportunity for ethical SEO companies to provide a valuable and noble service: Pinky Optimization. At the same time I see an opportunity for crooked SEOs and spammers to prey on other people’s misfortune. Since many in the SEO sphere have been disposed of by fat cats and sold(soul)-outs, these folks are also job searching. Life’s ironies.
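The scanning step can be roughed out in Python. This is only a sketch under stated assumptions: the stopword list is a tiny hand-made one, the ads and resume are invented, and the function names are hypothetical. It collects terms that show up in at least two ads and reports which of them the resume does not yet mention.

```python
import re
from collections import Counter

# Tiny hand-made stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def common_ad_terms(ads, min_ads=2):
    """Terms that appear in at least min_ads different ads."""
    counts = Counter()
    for ad in ads:
        counts.update(set(tokenize(ad)))   # count each term once per ad
    return {term for term, n in counts.items() if n >= min_ads}


def missing_from_resume(ads, resume, min_ads=2):
    """Recurring ad terms that the resume does not contain yet."""
    return sorted(common_ad_terms(ads, min_ads) - set(tokenize(resume)))


ads = [
    "Seeking analyst with SQL and Excel skills",
    "Data analyst: SQL, reporting, Excel",
    "Junior analyst, reporting experience required",
]
resume = "Analyst experienced in Excel modeling"

print(missing_from_resume(ads, resume))   # ['reporting', 'sql']
```

The output is a list of candidates, not a shopping list: add a term only if it matches real experience.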

SEOs – Desperate Seeking Clients

April 24, 2008

From time to time I receive unsolicited emails from SEOs offering me their services, to list my site in the major search engines and directories. They often send template-like automatic messages (“Dear website owner”) and appear not to even bother to check if recipients need the service.

These SEOs often look desperate and sound like snakeoil sellers and crooks. They even claim to be better than other SEOs.

They often pitch the same crap:

  • “I recently visited your site” (Really? Why then send this crap?).
  • “you are not listed in the top search engines and directories” (Really? How do they know?).
  • “we can increase your traffic by X astronomical amount” (Really? Could you double X for me, please?).
  • “we can help you get top rankings in Google” (Really? For which keywords?).
  • “our link building program” (Really? Read here link exchange and link spam).
  • “we have proprietary crap, blah, blah, …” (Really? Sell it or get a patent!).

I just received one such email last night, even though my site is known in the IR/SEO spheres, has been listed for many years in the top search engines and directories, and ranks well.

Dear website owner,

I visited your website and noticed that you are not listed in many of the major search engines and directories. If our company can increase your traffic up to 500% by getting you top ranking results on the search engines such as Google would you be interested? We specialize in link building content writing and programming. We have proprietary techniques that work better and are less expensive than any other SEO firm.

Please let me send you a proposal and show you how we can make your website profitable.

Sincerely,

Christian Frank

2060 AVENIDA DE LOS ARBOLES, STE D
THOUSAND OAKS,
CA 91362-1361 – USA

This is the type of company that gives a black eye to the SEO industry. If SEOs send you this type of crap, I feel your pain. Stay away from their businesses or whatever they claim or seem to offer.

Few Rants: Microsoft, a Conference, and a Database Site

April 11, 2008

I normally don’t rant at this blog about trivial stuff in life since this blog is about IR and search engine research. Today I feel like I want to make an exception. So let’s see how I can tie a few rants about silly every-day things to search engines.

Rant 1

I bought the Home and Student version of Microsoft Office ($122, through Costco). The learning curve started. I tried to open its case by just pulling off the red tab as suggested. The red tab detached from the case and still there was no “open Sesame”. I then tried different things until I decided to slice the clear seal at the top of the case with a knife and voilà! Nothing like a Puerto Rican solution for a “Made in Puerto Rico” Windows Vista product! Duh!

So the recipe is: (1) get a knife, (2) slice the seal, and (3) pull the case indentations toward your right with your fingers. The inside case should open.

Out of curiosity I wanted to know if others out there struggled with the design of the case. I ended up googling for how to open windows office case and found this site which discussed the very same problem and the very same solution. I realized I was not alone.

There are now dozens of sites like this one that show users this dumb “how-to”. Many are complaining about the “brilliant” design of the box, which is just a usability and accessibility nightmare.

Read what others at the aforementioned site are commenting. Some there commented that they ended up searching for:

open office 2007 box
open vista box
Office Packaging “how to open”
open microsoft office box
how to open MS office 2007 box

Something on the product design side is wrong when soooo many have to Google for just how to open the damn case of a Microsoft product, or of any product for that matter. Something is wrong when Microsoft lab rats have to explain online how to open the annoying case.

Rant 2

There is a local conference on information security I was invited to. Down the organization pipeline, something is wrong with a conference when its organizers have to chase potential presenters one week before the event. I pass and wish them good luck.

Rant 3

There is a local company that created a database-driven site for the upcoming Elections. The problem: how to get politicians and average users to know how to use the technology. Also, the site already needs to be redesigned so it can rank high and gain traffic from search engine users.

All of these kind of belong to the Land of Duh.

Searchmageddon: Microsoft to Buy Yahoo!

February 1, 2008

As we mentioned a few days ago,

http://irthoughts.wordpress.com/2008/01/23/microsofts-black-cloud-on-yahoos-seo-tag-clouds/ ,

Microsoft is finally buying Yahoo! Check here:

http://online.wsj.com/article/SB120186786283735047.html?mod=hpp_us_whats_news

Like Jeremy Zawodny, I predict many old-timers will walk rather than work for Bill Gates, while duplicate positions will be eliminated. With a thousand already being terminated at Yahoo!, more will soon follow.

The Final Search Battle is coming and its name is SEARCHMAGEDDON.

“And I saw from the mouth of the operating system, and the mouth of the database and the mouth of the paid advertisers, three unclean spirits… the spirits of marketers, spammers, and hackers working signs, and they go forth to the kings of the whole network, to gather them to battle against the great day of the Almighty GoOGLE. And Stanford shall gather them together into a place which in digital business is called SEARCHMAGEDDON.” –Internet 2:1, 2008.

Microsoft’s Black Cloud on Yahoo! & SEO Tag Clouds

January 23, 2008

From time to time rumors spread of the black cloud of Microsoft over Yahoo!; i.e., of Microsoft buying Yahoo!. This time things are less cloudy, especially now that Yahoo! is about to cut jobs.

Early this year, Jeremy Zawodny from Yahoo! wrote:

“Sure, there would be cultural problems, integration challenges, and many people who’d likely walk. But at the end of the day, Microsoft would end up with a much larger set of online services, a better advertising network, and people who know how to build, brand, and market web stuff that people actually use.”

Talking about clouds:

A student asked me about some SEOs claiming that text tag clouds are a kind of LSI technology.

Pure nonsense coming from many SEOs, as usual.

These clouds are easy to construct. No LSI is needed:

1. Sort terms from a document or lookup list by frequency.
2. Normalize the frequencies so they fall within the [0, 1] interval.
3. Use the normalized frequencies as parameters that set the font sizes.

For pizzazz, store the terms in an array to be sorted or randomized, and/or use some CSS.

We can do the same with hit counts assigned to blog categories, links, etc. No special technology is needed.
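The three steps above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the function name `buildTagCloud`, the min-max normalization, and the pixel range are hypothetical choices, not something prescribed by the steps themselves.

```javascript
// Hypothetical helper: turns raw term counts into font sizes for a tag cloud.
// Assumption: min-max normalization; dividing by the maximum would also work.
function buildTagCloud(counts, minPx, maxPx) {
  // Step 1: sort terms by frequency, highest first.
  var terms = Object.keys(counts).sort(function (a, b) {
    return counts[b] - counts[a];
  });
  if (terms.length === 0) {
    return [];
  }
  var max = counts[terms[0]];
  var min = counts[terms[terms.length - 1]];
  var range = max - min;
  return terms.map(function (term) {
    // Step 2: normalize to the [0, 1] interval (use 1 if all counts are equal).
    var weight = range === 0 ? 1 : (counts[term] - min) / range;
    // Step 3: map the normalized weight linearly onto a font size.
    var fontSize = Math.round(minPx + weight * (maxPx - minPx));
    return { term: term, weight: weight, fontSize: fontSize };
  });
}

// Made-up counts, purely for illustration.
var cloud = buildTagCloud({ search: 10, lsi: 2, seo: 6 }, 12, 32);
```

Each entry can then be written out as a span whose inline style carries the computed font size. The same function works unchanged for hit counts per blog category or per link.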

Until Next Year

December 28, 2007

Well, this was an incredible year.

I participated in several international conferences, changed ISPs, and went back to teaching at the graduate school and to conducting academic research; I also gained new friends from all over the world.

Next year I have several conferences and activities to take care of, a new graduate course to teach next Spring, titled Search Engines Architecture, and a few consulting projects.

The IRW Newsletter should arrive in subscribers' inboxes early: today.

I’m taking a few days off. Until Next Year.

Cheers,

Dr. E. Garcia

Random Notes

October 3, 2007

1. IRW will run late due to my academic duties. The topic: Genetic Algorithms. It covers recent advances in the field and dozens of videos relevant to GA.

2. My vector analyzer and binary calculator is ready, but I need to double-check its accuracy. It accepts vectors on the order of 10,000 elements. Cool!

3. Thank you to those using the Levenshtein Edit Distance Tool and for your suggestions.

4. The course on Web Mining and Business Intelligence is getting ready.

5. Current academic duties are limiting my time. Sorry. I need to be away from the web. Stick with me.

Random Notes and LauraMansfield

September 12, 2007

These are some late random notes. Sorry for the delay.

1. I am putting together a research project for a graduate student. The topic is quite interesting: homeland security. While researching the topic I came across the LauraMansfield.com site. Mansfield’s site is a goldmine of information, especially for those interested in co-occurrence and word association research applied to the terrorist knowledge domain.

2. I am reviewing a graduate thesis in which logistic regression is used for data mining medical claims. The thesis topic is quite interesting. The manuscript needs some rework, though.

3. I am reading bits and pieces of an old paper on the non-transitive nature of Jaccard’s Coefficient and a proposed indirect similarity measure.

Random Notes

August 30, 2007

I’m putting the final touches on IR Watch, now in its first year of publication. I started the project a year ago. Thank you for your support.

Tomorrow I will post a sneak preview of the September issue. This one is about research conducted at the Office of Naval Research in the area of search modes. If you are a keyword researcher you need to read this issue.

I’m also researching a large repository of obscure databases, accessible through ftp. If you are a KDD researcher, you will love to know about these.

JavaScript Tips

August 21, 2007

This is not a post about an IR topic, but since at some point IR projects resort to programming, I believe the post is relevant to this blog, especially when many IR tools used in a classroom demonstration setting are written in JavaScript.

I’m going through Douglas Crockford’s great video/ppt presentations on JavaScript via http://101out.com/js.php. There are many things average programmers don’t know about JavaScript, the most misunderstood programming language on the Planet. For those not familiar with Crockford, a few years ago he pioneered the right way of writing JavaScript. Haven’t heard of JSON?

He gives many great tips in those videos and ppt slides. Here are some of them:

Tip #1

//Instead of

if(a==null) {...} //which does coercion

//do this:

if(a===null){...}

//Also, instead of != use !==
//Avoid == and != altogether in your code. The === operator does no type coercion: it is true only if both operands have the same type and the same value. For objects, that means both operands refer to the same object.
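A short sketch of the difference between the two operators, using made-up variable names. Note one consequence of the swap in Tip #1: a == null is also true when a is undefined, while a === null is not, so the two tests are not interchangeable when undefined is a possible value.

```javascript
var a; // declared but never assigned, so its value is undefined

var looseNull = (a == null);   // true: == treats undefined and null as equal
var strictNull = (a === null); // false: undefined and null are different types

var looseZero = (0 == "");     // true: the empty string is coerced to 0
var strictZero = (0 === "");   // false: a number is never === a string

var obj1 = { id: 1 };
var obj2 = { id: 1 };
var sameObject = (obj1 === obj1);   // true: same reference
var equalContent = (obj1 === obj2); // false: two distinct objects
```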

Tip #2

//Instead of

if(a){return a.member;}else{return a;}

//do this, which is shorter:

return a && a.member;
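To see why the short form is equivalent, here is a minimal sketch wrapped in a hypothetical function named getMember. The && operator returns its left operand when that operand is falsy, and its right operand otherwise.

```javascript
// Hypothetical wrapper around the guard idiom from Tip #2.
function getMember(a) {
  // If a is falsy (null, undefined, 0, "", NaN, false), a itself is returned.
  // Otherwise the value of a.member is returned.
  return a && a.member;
}

var fromObject = getMember({ member: 5 }); // 5
var fromNull = getMember(null);            // null, with no TypeError thrown
var fromUndefined = getMember(undefined);  // undefined
```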

Tip #3

//Use || to set default values
//do this, which requires less typing:

var last=input||nr_items;

//if input is truthy, last is input, otherwise set last to nr_items
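One caveat worth a sketch: || falls back to the default for every falsy value, including 0 and the empty string. The function names below are hypothetical; the second one shows an explicit test for the case where 0 is a legitimate input.

```javascript
// Hypothetical helper using the || default idiom from Tip #3.
function pickLast(input, nr_items) {
  var last = input || nr_items;
  return last;
}

var given = pickLast(5, 20);           // 5: input is truthy
var missing = pickLast(undefined, 20); // 20: falls back to the default
var zero = pickLast(0, 20);            // 20: 0 is falsy, so the default wins

// Hypothetical variant for when 0 or "" must be accepted as real input.
function pickLastStrict(input, nr_items) {
  return input === undefined ? nr_items : input;
}

var zeroKept = pickLastStrict(0, 20);  // 0
```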

Tip #4

//Statements can have labels. Break statements can refer to labels. Use labels only on do, for, switch, and while.

//do this

loop: for(;;)
{
//do something
if(...){break loop;}
//do something
}
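Labels pay off mostly with nested loops, where a plain break would leave only the inner one. A minimal runnable sketch, with a hypothetical function findPair and a label named outer:

```javascript
// Hypothetical example: find the first position of target in a 2-D array.
function findPair(matrix, target) {
  var found = null;
  outer: for (var i = 0; i < matrix.length; i++) {
    for (var j = 0; j < matrix[i].length; j++) {
      if (matrix[i][j] === target) {
        found = [i, j];
        break outer; // leaves both loops at once
      }
    }
  }
  return found;
}

var hit = findPair([[1, 2], [3, 4]], 3);  // [1, 0]
var miss = findPair([[1, 2], [3, 4]], 9); // null
```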

There are more great tips, but it is better if you assimilate these at your own pace. Time to use literals more often. So,

//it is time to use {} instead of new Object() and [] instead of new Array().
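A brief sketch of the literal forms, with made-up variable names. Besides being shorter, the array literal avoids a well-known surprise: new Array called with a single number sets the length instead of storing that number.

```javascript
var point = { x: 1, y: 2 };  // object literal
var items = ["a", "b"];      // array literal

var viaConstructor = new Array(3); // length 3, with no elements stored
var viaLiteral = [3];              // length 1, holding the number 3
```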

For code conventions for the JavaScript programming language, visit

http://javascript.crockford.com/code.html

I must agree with him that most JavaScript code on the web is crap.

When Local Relevancy is Irrelevant to Locals

August 10, 2007

When is local relevancy irrelevant to locals? In other words, when is local not important to locals?

That depends on whom you ask. For instance, at times news relevant to a location is not known to locals because of manipulation by media moguls. When globals know more than locals about local news you know that something is not working right.

(more…)

Minerazzi: What’s in a name?

May 22, 2007

I’m building a client-side suite of text mining tools for extracting intelligence from text files, Web pages, and email documents. It comes in four versions: basic, intermediate, advanced, and pro. The basic version provides the following reports:

(more…)

Eigenvectors and Reggaeton Music = Eiggaeton

May 21, 2007

Eigenvectors and eigenvalues come in pairs; that is why we use the term eigenpair. Some have asked me about practical applications of eigenpairs. So here goes this post.

Did you know the connection between eigenvectors and Reggaeton Music (or music in general)? How about eigenvectors and bridges, car designers, speakers, architecture, or oil companies?

(more…)

The New Iteration of Mi Islita

May 15, 2007

Today I uploaded the new iteration of the Mi Islita.com site.

I’ve added or updated the following resource pages:

IR Thoughts Archives – A sample of posts from this blog, powered by a homemade AJAX reader and regexps.

IR Calls – A list of conferences and industry events we recommend you attend.

IR Tutorials – Tutorials on Vector Space and LSI Models, Matrix Algebra, and more.

Educational Links – Graduate theses and research projects referencing Mi Islita.

Marketing Links – Search engine marketing articles referencing Mi Islita.

(more…)

IR Thoughts New Home

April 30, 2007

Welcome to the new home of IR Thoughts, the Mi Islita.com blog about news, papers, and theses relevant to information retrieval, data mining, and search engine technologies. This blog replaces the version over at http://www.miislita.com.

Why a new home for IR Thoughts?

(more…)