Archive for the 'Search' Category

Turkey flights

Sunday, February 10th, 2008

This morning’s Observer column

It’s the metaphors and similes that get me. It’s a shotgun marriage, declared one commentator, ‘with Google holding the gun’. Putting Microsoft and Yahoo together, said another, was like trying to produce an eagle from an alliance of two turkeys.

This is unfair. Microsoft isn’t a turkey, but a profitable, boring mastodon that entertains fantasies about being able to fly. Yahoo, for its part, is an ageing hippy who invented hang-gliding but aspired to fly 747s and then discovered that he wasn’t very good at it. The mastodon hopes that by employing the hippy it will learn to hang-glide. The hippy’s feelings about the whole deal are plain for all to see…

Update: The NYT (and lots of other sources) claim that the Yahoo board has decided to reject the Microsoft bid, on the grounds that it undervalues the company. Ho!

If this is true then what’s likely to happen is that (a) some big Yahoo shareholders will revolt and (b) Microsoft will wage a proxy war with the aim of replacing the Yahoo board at the next AGM. This one will run and, er, ruin.

Google’s loss is the Digger’s Gain

Friday, February 1st, 2008

I always thought the MySpace/Google deal was a work of genius — for Rupert Murdoch. It’s beginning to look as though I was right.

The stock market may be fretting over Google’s disappointing earnings, but somewhere Rupert Murdoch is smiling.

One of the weaknesses that Google’s management highlighted in its conference call was advertising on social networks. The company said its traffic acquisition cost, the money it pays to sites on which it places ads, rose in the fourth quarter because of required minimum payments it must make to certain sites.

“We have found that social networking inventory is not monetizing as well as we would like,” said George Reyes, Google’s chief financial officer, implying that the sites on which the minimum payments are due were social networks. By far, the largest social network on which Google sells ads is MySpace, which is owned by Mr. Murdoch’s News Corp. In 2006, Google agreed to a three-year deal to sell ads on MySpace, committing to pay a minimum of $900 million.

People involved in that deal said that Google never assumed that it would earn its $900 million back from that deal, but it appears to be losing even more than it had expected.

Social Search

Friday, February 1st, 2008

From Technology Review

Now a company called Delver, which presented at Demo earlier this week, is working on a search engine that uses social-network data to return personalized results from the larger Web.

Liad Agmon, CEO of Delver, says that the site connects information about a user’s social network with Web search results, “so you are searching the Web through the prism of your social graph.” He explains that a person begins a search at Delver by typing in her name. Delver then crawls social-networking websites for widely available data about the user–such as a public LinkedIn profile–and builds a network of associated institutions and individuals based on that information. When the user enters a search query, results related to, produced by, or tagged by members of her social network are given priority. Lower down are results from people implicitly connected to the user, such as those relating to friends of friends, or people who attended the same college as the user. Finally, there may be some general results from the Web at the bottom. The consequence, says Agmon, is that each user gets a different set of results from a given query, and a set quite different from those delivered by Google…
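The tiered ordering described in that excerpt can be sketched in a few lines. This is only an illustration of the idea, not Delver's actual method: the `Result` type, the `author` field and the `friends` and `friends_of_friends` sets are all hypothetical names invented for the example, and it assumes the social graph has already been built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    url: str
    author: str  # hypothetical: the person who produced or tagged the page


def tier(result, friends, friends_of_friends):
    """Return 0 for direct contacts, 1 for implicit ones, 2 for the general Web."""
    if result.author in friends:
        return 0
    if result.author in friends_of_friends:
        return 1
    return 2


def rank(results, friends, friends_of_friends):
    # sorted() is stable, so the original order is kept within each tier.
    return sorted(results, key=lambda r: tier(r, friends, friends_of_friends))


results = [
    Result("a.example", "stranger"),
    Result("b.example", "carol"),
    Result("c.example", "bob"),
]
ranked = rank(results, friends={"bob"}, friends_of_friends={"carol"})
```

Because the ordering depends on who the searcher knows, two users issuing the same query would see the same pages in a different order, which is the point Agmon is making.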

Xerox Enters Search Market

Monday, June 25th, 2007

From TechCrunch

Xerox announced its entry into the search market this week with FactSpotter, document search software that is claimed to go beyond conventional keyword search.

FactSpotter is text mining software that combines a linguistic engine that allows users to make queries in everyday language. FactSpotter looks for the keywords contained in a query along with the context those words have.

According to Xerox, FactSpotter is capable of combing through almost any document regardless of the language, location, format or type; take advantage of the way humans think, speak and ask questions; and discriminate the results highlighting just a handful of relevant answers instead of returning thousands of unrelated responses…

Sounds interesting. But…

FactSpotter will not be coming to a browser near anyone, anytime shortly. Xerox plans to launch FactSpotter next year as part of the paid Xerox Litigation Service platform and has no plans for a wider or public release.

Snap search

Tuesday, March 13th, 2007

From Technology Review

Searching for information on your cell phone by typing keywords can be cumbersome. But now researchers at Microsoft have developed a software prototype called Lincoln that they hope will make Web searches easier. According to Larry Zitnick, a Microsoft researcher who works on the project, phones equipped with the software could, for example, access online movie reviews by snapping pictures of movie posters or DVD covers and get product information from pictures of advertisements in magazines or on buses.

“The main thing we want to do is connect real-world objects with the Web using pictures,” says Zitnick. “[Lincoln] is a way of finding information on the Web using images instead of keywords.”

The software works by matching pictures taken on phones with pretagged pictures in a database. It provides the best results when the pictures are of two-dimensional objects, such as magazine ads or DVD covers, Zitnick says. (See the accompanying chart to find out how compatible certain pictures are with Lincoln.) Currently, the database contains pictures of DVD covers that link to movie reviews uploaded by Microsoft researchers. However, anyone can contribute his or her pictures and links to the database, and Zitnick hopes that people will fill it with pictures and links to anything from information about graffiti art to scavenger-hunt clues. Right now, Lincoln can only be downloaded for free using Internet Explorer 6 and 7, and it can only run on smart phones equipped with Windows Mobile 5.0 and PocketPCs.

Human-assisted search

Saturday, February 3rd, 2007

From Technology Review

The Web has grown orders of magnitude bigger since the founding of Google, and neither the company nor its competitors have come up with new automatic search algorithms as seemingly magical or game changing as PageRank. Now some entrepreneurs believe it’s time to replace the algorithmic search engine with humans.

ChaCha, a free advertising-supported service launched last year by former MIT AI Lab research scientist Scott Jones and software entrepreneur Brad Bostic, doesn’t exactly give up on the concept of computerized search. Web wanderers in search of answers are free to settle for the algorithmic results served up by ChaCha’s own search engine. But the site’s real calling card is its collection of 29,000 human guides, who earn $5 to $10 per hour working with users in live chat sessions to locate the Web’s best answers to their queries.

Web services that tap the brainpower of real humans are all the rage. Many now-familiar sites such as Digg and Wikipedia depend on the “wisdom of the crowd”–users who contribute, edit, and collectively rank information items. But newer ventures depend on individuals. Yahoo Answers, where anyone may submit a question and anyone else may respond, has proved immensely popular, attracting more than 60 million users (despite the varying quality of the site’s answers). More recently, Amazon’s Mechanical Turk, a marketplace where individuals can earn small amounts for completing “simple tasks that people do better than computers,” in Amazon’s words, has provoked much discussion among followers of the user-centered Web 2.0 movement…

The User Is Not Broken

Thursday, August 24th, 2006

My colleague Gill Needham, with whom I am working on an exciting new course called Beyond Google, sent me this, described by its (librarian) author as “a meme masquerading as a manifesto”. Excerpt:

All technologies evolve and die.

Every technology you learned about in library school will be dead someday.

You fear loss of control, but that has already happened. Ride the wave.

You are not a format. You are a service.

The OPAC is not the sun. The OPAC is at best a distant planet, every year moving farther from the orbit of its solar system.

The user is the sun.

The user is the magic element that transforms librarianship from a gatekeeping trade to a services profession.

The user is not broken…

WorldCat

Saturday, June 24th, 2006

Wow! Something I should have known about: WorldCat.

WorldCat is the world’s largest bibliographic database, the merged catalogs of thousands of OCLC member libraries. Built and maintained collectively by librarians, WorldCat itself is not an OCLC service that is purchased, but rather provides the foundation for many OCLC services and the benefits they provide.

I’ve just used it to look up a rare book and it told me which libraries in my part of the world have a copy.

The Alexa story

Tuesday, December 13th, 2005

John Battelle, author of an excellent book on search, has a hyperbolic post on his blog. It begins like this…

Every so often an idea comes along that has the potential to change the game. When it does, you find yourself saying - “Sheesh, of course that was going to happen. Why didn’t I predict it?” Well, I didn’t predict this happening, but here it is, happening anyway.

In short, Alexa, an Amazon-owned search company started by Bruce Gilliat and Brewster Kahle (and the spider that fuels the Internet Archive), is going to offer its index up to anyone who wants it. Alexa has about 5 billion documents in its index - about 100 terabytes of data. It’s best known for its toolbar-based traffic and site stats, which are much debated and, regardless, much used across the web.

OK, step back, and think about that. Anyone can use Alexa’s index, to build anything. But wait, there’s more. Much more…

It’s all done with web services. And it might indeed be significant because it could enable small but ingenious players to get into the search market.