
February 01, 2006  |  Google lessons  |  640 hit(s)

I recently was doing some "research" that consisted of plugging searches into Google.[1] As it turned out, I was getting results, but because I didn't understand Google search operators well, I was not getting all the results I could have.

I was looking for all words that ended in dango, like fandango. My initial search was therefore *dango -- an assumption based on 20 years of looking for files in DOS and Windows. But according to someone better informed than me, "the asterisk only works in Google as a wild-card character replacing a full word, so searching on *dango won't find fandango or blogdango." The search did find lots and lots of hits on the standalone word dango, which is a Japanese dumpling, and innumerable sites devoted to Hana Yori Dango, a Japanese manga series. The search *dango -hana -yori didn't yield better results. And I couldn't figure out how to search for the particle -dango and exclude the standalone word dango.
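For comparison, here is roughly what I was hoping to express, written as a regular expression in Python and run over a bit of local text rather than over Google (which, as noted, does not accept this kind of pattern). The sample sentence is made up for illustration, and the pattern is only one plausible way to say "ends in dango, but isn't just dango."

```python
import re

# Hypothetical sample text, invented for illustration; not real search results.
sample = "We danced a fandango, ate dango, and read blogdango about Hana Yori Dango."

# \w+ demands at least one character before "dango", so the standalone
# word is skipped; \b pins the match to whole-word boundaries.
suffix_only = re.compile(r"\b\w+dango\b", re.IGNORECASE)

matches = suffix_only.findall(sample)
print(matches)  # ['fandango', 'blogdango']
```

The `\w+` is what does the work of excluding the dumpling: a bare `*dango` in the DOS sense would also match the standalone word.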

All of this leads me to a couple of things. One is to ask whether Google does or ever will offer really sophisticated searches -- for example, searches that can be expressed as regular expressions. The second is to wonder where the heck I can learn really detailed information about search strategies. (Hey, I know! I'll google it!)

A workaround that people seem to use is to search Google Groups. This seems to succeed, but I don't quite get why.

[1] Hey, I kept re-searching. Haha. The re- prefix here is an intensifier, like in remark and (from Spanish) refried beans.

Seth   01 Feb 06 - 9:14 PM

While not well-suited to this particular issue (but coming under the category of "Useful nonetheless"), I find that _I_ can never remember all of the syntaxen for all of Google's specialized flags (definitions, search by file type, search within one domain, etc.).

Luckily, there's Soople (http://www.soople.com/), which also has the benefit of being easy to remember.

Yaron   05 Feb 06 - 11:57 AM

The fact that search engines don't do stemming is constantly annoying, I agree.
The only one I found that currently claims to support stemming, and regular expressions, is Exalead ( http://www.exalead.com ).
But it's extremely quirky. The regex syntax they have is very limited, and it works strangely. Trying to search for words ending in dango (using their syntax /.*dango/ ) would return just a few hundred results, while limiting it to words starting with the letter f ( /f.*dango/ ) will return a few thousand, with the expected fandango included... I'm not entirely sure, based on their results, what it is that their engine does.
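
To show why those counts look odd, here is a small Python sketch. It assumes ordinary regular-expression semantics, which may well not be what Exalead implements, and the word list is hypothetical. Under that assumption, anything matching f.*dango also matches .*dango, so the narrower pattern should never return more results than the broader one.

```python
import re

# Hypothetical word list, standing in for an index of words.
words = ["fandango", "blogdango", "dango", "fango", "tango"]

any_prefix = re.compile(r".*dango")   # anything (or nothing) before "dango"
f_prefix = re.compile(r"f.*dango")    # same, but must start with "f"

# fullmatch requires the whole word to fit the pattern.
broad = {w for w in words if any_prefix.fullmatch(w)}
narrow = {w for w in words if f_prefix.fullmatch(w)}

print(sorted(broad))     # ['blogdango', 'dango', 'fandango']
print(sorted(narrow))    # ['fandango']
print(narrow <= broad)   # True: the narrow set is contained in the broad one
```

Note that .*dango also matches the standalone word dango, since .* can match nothing at all.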