Archive for July, 2007

What is a search engine: first of three now live

Monday, July 30th, 2007

The first installment of the three-part series I’ve been working on with Charles Knight of AltSearchEngines and Nitin Karandikar of Software Abstractions has just gone live! Make sure to have a look at Nitin’s excellent dissection of what constitutes search engines—he really did a thought-provoking job of proposing a definition for something we all understand instinctively yet have trouble articulating.

My contribution, ‘What isn’t a search engine?’ will appear on AltSearchEngines tomorrow, and Charles’ piece, ‘What is an alt search engine?’ on Wednesday.

Defining relevance in search: Part III

Monday, July 30th, 2007

In yesterday’s post, I discussed the human connection in relevance. This post deals with the discovery angle, which basically says that relevance can be expected or unexpected.

Discovery is a place where we can find true delight in life. It may be gratifying to find what we’re already searching for. But it is superb to stumble across the perfect thing when we didn’t even know we needed it.

Exceptional design always encompasses discovery. Nobody knew they wanted an iPod before it existed. Nobody knew they wanted a fax machine or a rubber garlic peeler thingy. Once people saw these things, though, or realized what they could do with them, they couldn’t imagine ever having lived without them.

In search, discovery can make the job of the engine easier and more difficult at the same time. Easier because results aren’t limited to what already exists in the conscious mind of the user, but more difficult because the search engine has to know the mind of the searcher better than he knows his own mind.

Just as the human connection has its own set of qualities or expressions, so too does discovery. I’ll touch on a few here, and I invite you to contribute to this ever-broadening definition.

  1. Happenstance discovery
    You’re walking down the street and the perfect pair of shoes catches your eye. You’re reading the business section of the New York Times and spy a headline about someone you went to high school with.

    Happenstance discovery is entirely unintentional on both sides of the equation. Yes, the shoe store is hoping their shoes will catch your eye, but it’s not really your eye they’re hoping for, is it? After all, they don’t know you, didn’t know you’d be walking by that shop at that time, don’t even know if you wear shoes at all. And a newspaper is putting out content they want people to be interested in, but they’re not trying to reconnect classmates, at least not overtly.

    Happenstance discovery is the ‘Can you believe it? What are the chances?’ type discovery. (That’s what I say to my sweetheart when I come home with a new pair of shoes.)

  2. Passive discovery
    Passive discovery is intentional on the part of the person or company serving up the content, but one-sided. The user may know it’s happening, but doesn’t participate in its creation. Passive discovery is what’s behind most recommendation engines: ‘Did they buy a big screen TV? Sell them some carpet! 42% of people who bought a big screen TV changed their carpet within six months!’

  3. Collaborative discovery
    Collaborative discovery is possibly the most powerful form of discovery. Essentially, collaborative discovery is when the user participates in the process:

    I am going to take your hand, and you and I are going to walk in that direction. Neither of us knows exactly where we will end up, but I know you, and you know yourself, and together we will make decisions along the way that will allow us to find experiences we can’t begin to imagine.

    How can it still be discovery if you are participating, you may ask? Doesn’t that mean you’re aware of what’s going on? Well… no, not really. Haven’t you ever chosen a dish off a Chinese menu without having the foggiest idea of what you were ordering?

    Collaborative discovery is when the intimate knowledge you have of yourself comes together with someone else’s intimate knowledge of content. If you don’t truly understand yourself, how can you define what you’re after? And if the person on the other end doesn’t comprehend the intricacies of what she’s offering, how will she know what will suit you personally?

    Collaborative discovery is where VortexDNA lives.

What I noticed as I was writing this piece is that all of these qualities that I’m attributing to relevance are interconnected. It may seem that there’s some repetition, for example, between the assumed interest I spoke about yesterday and the delivered discovery I spoke about today. The truth is that we’re weaving a web here, bringing together tightly woven concepts to create a cohesive understanding of relevance in search.

What do you think? Do you think I’m making arbitrary distinctions here? Or do you see some value in breaking relevance down into its composite parts? I’d be delighted to hear from you.

Defining relevance in search: Part II

Sunday, July 29th, 2007

A few days ago, I wrote a post called Defining relevance in search. In it, I suggested that relevance is not a single thing, but a set of qualities:

  1. The human connection: relevant results connect to the searcher.
  2. The discovery angle: relevance can be expected or unexpected.
  3. The subjective nature: the degree of relevance changes from person to person and moment to moment.
  4. The measurement conundrum: the degree of relevance occurs along a spectrum that makes it impossible to achieve 100%.

My blogbuddy David Berkowitz responded with some astute elaborations:

With the human connection, I think it’d be interesting to divide that into other categories. For instance, there’s the human connection of explicit interest - ‘I know I want a new car, so ads about cars are relevant to me.’ Then there’s the connection of assumed interest - ‘I know I want a new minivan, so it’s true that I’m also in the market for baby furniture.’ Then there’s the interest that technology unearths - ‘I know I want a new car, but since I happened to visit a number of sites for vacation packages to South America, then I also am in the market for deals on hotels in Ecuador.’ In each instance, it’s relevant to the human, but for different reasons.

I agree with David, and I think it’s worth taking the opportunity to further elaborate on each of these qualities and their potential manifestations. When I started writing this piece, I realized that we can go into a fair bit of detail here, so I’ve decided to devote a post to each; this topic will continue in a multi-part series.

The human connection
As I stated in the earlier post, relevance isn’t relevance unless it’s connected to a person. It only exists in the eyes of the beholder. A search result might match every keyword, but it doesn’t become relevant until the searcher says, ‘Yes! That’s what I want!’

To me, this is the first and foremost principle of relevance. If we don’t get this bit right, nothing else matters. If we get this bit right, we can tolerate a whole lot of ambiguity in the rest of what we do.

  1. Core interest
    Core interest is what VortexDNA is about. This is the stuff you are interested in because it connects with who you are at the deepest level. This is the stuff you find online that makes you feel like somebody knows you inside and out, or the stuff that makes you want to forward it to everyone you know.

    The premise behind VortexDNA is that this type of connection can lead to all of the others. If you can tap into core interest, every other effort will be more powerful because it will be based on the person, the individual.

    If you don’t mind, David, I’ll use your list as a starting point for the other ways in which the human connection can play out.

  2. Explicit interest
    Explicit interest is search at its most basic and its most obvious. I want a book. I know its title. I go to Amazon and I enter the title of the book. It’s completely overt and, therefore, the easiest to deliver.

    Just because it’s the easiest, though, doesn’t mean it’s easy. Included in the category of explicit search is search where the user knows exactly what she wants but doesn’t know how to ask for it. Maybe you heard a song and remember just a snatch of lyrics, no idea of the title or the artist. Maybe you’re trying to find who said a particular quote, but don’t realize that you’ve got the words the wrong way around. There are all sorts of challenges with explicit interest, but this has been the first line of attack for search engines and, therefore, the area in which they’ve gotten the most skilled.

  3. Assumed interest
    Assumed interest has its roots in brick-and-mortar merchandising efforts. Shopping for beer? You’ll be happy to find potato chips in the same aisle. Buying a new printer? Surely you’ll want to stock up on ink cartridges at the same time. Complementary offerings are the manifestation of assumed interest. Like explicit interest, these connections tend to be evident or become evident through user behavior.

  4. Suggested interest
    I’m referring to David’s ‘interest that technology unearths’ as ‘suggested interest’. This is interest that the user was unaware of, but that can be extrapolated and considered likely based on other behavior.

    Suggested interest is where Amazon excels—offering connections between things that don’t seem interrelated but that experience has shown are. Customers who bought this book also bought that one. Customers who like popcorn makers also like feather duvets. (I don’t know if that last one’s true in general terms—it’s certainly true for me!)

    This type of interest is of huge importance to recommendation technologists and ecommerce sites. It is one of the ways in which vendors can begin to tap into the buried desires of the public. The market for stuff that people know they want is yay big; the market for stuff that people don’t yet know they want is many times larger. As my design guru friend Dorenda Britten says, you have to be able to look into the future if you want to satisfy your customers.
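To make the ‘customers who bought this also bought that’ idea concrete, here is a minimal sketch that counts how often two items turn up in the same order. It is only an illustration under stated assumptions: the order data is invented, the function name `also_bought` is hypothetical, and this is not a description of how Amazon or any real recommendation engine works.

```python
from collections import Counter, defaultdict
from itertools import permutations


def also_bought(orders):
    """Map each item to a Counter of the items that shared an order with it."""
    co_purchases = defaultdict(Counter)
    for order in orders:
        # set() ignores duplicate items within a single order
        for item, other in permutations(set(order), 2):
            co_purchases[item][other] += 1
    return co_purchases


# Invented sample data, for illustration only
orders = [
    ["popcorn maker", "feather duvet"],
    ["popcorn maker", "feather duvet", "novel"],
    ["novel", "bookmark"],
]

co = also_bought(orders)

# The item most often bought alongside a popcorn maker, with its count
print(co["popcorn maker"].most_common(1))
```

A raw count like this favors items that sell well everywhere, so a real system would presumably normalize by how popular each item is overall; the sketch leaves that out to keep the idea visible.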

I’ll discuss discovery, subjectivity and measurement over the rest of the week, but I’d also be delighted to delve further into the human connection: what have we missed? What have we overstated? Do you agree that the human connection is the first principle of search, or do you think the primary focus should be on something else? I’ll be looking forward to hearing from you.

Don’t miss a special three-part series!

Saturday, July 28th, 2007

I’ve been working on an interesting project recently with Charles Knight from AltSearchEngines and Nitin Karandikar from Software Abstractions. Essentially, we’ve asked ourselves the question, “What is a search engine?” and collaborated on a three-part series that can serve as a starting point towards answering that question.

The series will address the following topics:

  • What is a search engine?
    by Nitin
  • What isn’t a search engine?
    my contribution
  • What is an alternative search engine?
    Charles’ area of expertise

You’ll be able to see the series on AltSearchEngines next week; Charles is releasing a piece a day from Monday to Wednesday. Charles and Nitin are pretty sharp, so I’d say it’s well worth a look. I’ll put up a link to it once it’s live :-)

Dinner, anyone?

Friday, July 27th, 2007

Nick Gerritsen and Branton Kenton-Dau, directors of VortexDNA, are heading off the island: they’re going to America. Want to meet them? Have someone you think they should meet? Shoot me an email: kaila@vortexdna.com.

As you may have already figured out, VortexDNA is based in New Zealand. That means we are geographically a long way from most of the potential users of the technology. For that reason, and because we think we are better at developing technology than marketing it, the company has decided to spin off two of our applications: e-commerce and a revolutionary search/ad platform.

From the big picture perspective, I think our technology is on the right track. Marc Andreessen recently summed up the three characteristics of the Internet from his perspective:

Giving people the ability to communicate in many new ways — making geography finally irrelevant.

Giving people the ability to express themselves in many new ways — the impact of the printing press, magnified a millionfold.

Giving people the ability to create their own worlds for everything they care about – and connecting with everyone else who shares the same interests, goals, and dreams.

VortexDNA is built around that third characteristic: giving people the ability to create their own worlds for everything they care about, and connecting with everyone else who shares the same interests, goals, and dreams. Our technology actually takes that concept a bit further, allowing people to connect with anyone and anything based on their purpose and values.

When we started, our first aim was to validate the technology. Thanks to the mywebDNA plug-in, we’ve been able to verify that we can dramatically improve the relevance of search results. Now the question becomes, ‘What do we do with that ability?’

From a commercial perspective, the first two applications we will spin off are:

  1. E-commerce recommendation technology: enabling e-commerce sites to increase revenue by serving up more relevant content and recommendations.
  2. Revolutionary search/ad platform: empowering users to make their web world relevant to them.

We’re interested in speaking with people and companies who might want to be involved in these spinoffs. If you’ve been following us for a while, and want to talk to us, know someone else who would, or just want to get together for dinner, please let us know; add a comment below or send me an email. Those of you who have ever been in a startup will understand how grateful I will be for your input.

Thanks for reading; we’ll see you in the States!

SezWho and distributed systems

Wednesday, July 25th, 2007

In a recent post on SezWho, Richard MacManus from Read/Write Web highlighted one of the comment ranker’s advantages:

Note that Greg’s SezWho profile can be utilized over other sites too - i.e. his profile is not centered around R/WW, but around Greg himself. In other words, it is a distributed system that can be used across multiple sites.

Why is this important? For the same reason that single sign on is important. For the same reason that RSS feeds are important:

The value of the filter being wrapped around the user rather than around a website can’t, in my opinion, be overestimated. Last month I described mywebDNA as providing the user with Relevant Eyes that you can take with you anywhere you go. At the risk of being insufferably self-congratulatory, I think vision is a useful metaphor.

If you have poor vision, do you think it would be more effective to wear glasses or to adjust the focus of everything around you? What if your vision was different from the person next to you? You couldn’t adjust your surroundings; you’d have to adjust your eyes, with glasses or contact lenses. The ability to carry that filter with you eliminates your dependence on the world around you being in focus.

When I go from one website to another, I still have the same pair of eyes, the same values, the same interests. I am still the same person. Doesn’t it make sense that I should carry the same filter with me wherever I go? Why should I go to one site and be ranked one way and then have to start from zero on another site?

When you think about two or three sites, perhaps it doesn’t seem so daunting. But what if you’re visiting 20 sites a day, or 200?

Your day might look like this: you start with a site like del.icio.us to find some interesting sites. Then you visit each one, signing into a different system (TypeKey or some such) whenever you want to comment. Of the sites with ranking systems, you may comment a lot on one and be highly respected there, and not at all on another, where your comments aren’t worth anything.

Doesn’t the SezWho model make more sense? Build your reputation and take it with you. Your activities on one site follow you to another. You are the same person on ReadWriteWeb as you are on the VortexDNA blog. Isn’t this more akin to the way the offline world works?
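As a rough illustration of ‘build your reputation and take it with you’, the sketch below keys ratings by person rather than by website, so a score earned on one site counts on the next. Everything in it is an assumption made for the example: the class name `PortableReputation`, the site names, the 1-to-5 scale and the simple averaging are hypothetical, and none of it describes how SezWho is actually built.

```python
from collections import defaultdict


class PortableReputation:
    """Toy reputation store keyed by person, not by website."""

    def __init__(self):
        # user -> list of (site, score) pairs gathered across all sites
        self._ratings = defaultdict(list)

    def rate(self, user, site, score):
        """Record a rating that a comment by `user` received on `site`."""
        self._ratings[user].append((site, score))

    def score(self, user):
        """Average of every rating the user has earned, whatever the site."""
        ratings = self._ratings.get(user, [])
        if not ratings:
            return None  # a newcomer has no reputation yet
        return sum(s for _, s in ratings) / len(ratings)

    def sites(self, user):
        """Sites where the user has been rated, in alphabetical order."""
        return sorted({site for site, _ in self._ratings.get(user, [])})


# Hypothetical usage: one commenter, rated on two different sites
rep = PortableReputation()
rep.rate("greg", "site-one.example", 4)
rep.rate("greg", "site-one.example", 5)
rep.rate("greg", "site-two.example", 3)

print(rep.score("greg"))  # one score that follows Greg from site to site
print(rep.sites("greg"))
```

The design choice worth noticing is the dictionary key: because it is the user and not the site, a third site can read the same score without Greg starting from zero.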

What do you think is the barrier to implementing these distributed systems?