Web Genome Project on Mozilla Add-ons!

May 10th, 2009

The Web Genome Project has gotten a great response since its launch in March, and this week we’re proud to announce a significant development: the Web Genome Project Firefox add-on is now available on the official Mozilla add-on site, addons.mozilla.org!

Gaining acceptance into the Mozilla public directory is a gratifying stamp of credibility for the Web Genome Project and our movement to create a virtual topography of the web.

We’re hugely grateful to you: for downloading the add-on, for writing a review of it on Mozilla, and for being a part of the Web Genome Project. Tell your friends and help us get to 10,000,000 links on the WGP!

Online privacy infractions threaten civil liberties

April 30th, 2009

Remember the first time you saw Barack Obama?

If you’re like most Americans, it was roughly four and a half years ago at the Democratic National Convention. I refer you now to one line in particular of that historic speech:

“We worship an awesome God in the Blue States, and we don’t like federal agents poking around in our libraries in the Red States.”

There is a reason the confidentiality of library records is sacrosanct: it is because the use of them for government intelligence virtually guarantees an imposition on civil liberties.

When we think about going to the library and checking out a John Grisham or a Stephen King, it’s hard to imagine what all the fuss is about. But imagine instead that you’re interested in religion and you check out a Bible or a religious reference book. Now imagine that instead of a Bible, you check out the Koran.

There are millions of versions of this scenario. You love planes and you check out a book to see how jetliners work — now imagine you’re of Middle Eastern descent. You’re fascinated with serial killers. Your friend David recommends Devil in the White City. You’re a student of human behavior and pick up a copy of The Lucifer Effect. Any one of these situations could imply suspicious activity — and, in more than 99% of cases, that suspicion would be dead wrong.

In the book Free Expression and Censorship In America, Herbert Foerstel describes the FBI’s attempts to monitor communist activity through the library system:

At [the University of Maryland, College Park], the agents asked librarians to report on anyone with a “foreign-sounding name or foreign accent” who used the libraries. Such a characterization would fit the majority of students and faculty on most American campuses, yet librarians were asked to monitor reference questions and on-line literature searches, including searches of [the National Technical Information Service], in order to establish the subject interests of these suspicious foreigners. All of this surveillance was conducted despite the fact that the UMCP libraries contained no classified materials, and their collections were presumably open to anyone. When the university complained about the surveillance, an FBI representative claimed that the libraries should feel no obligation to protect the access and privacy rights of noncitizens.

This backstory is why I was glad to see that a court is allowing a lawsuit against Blockbuster to proceed. The lawsuit is backlash from Blockbuster’s participation in Facebook’s ill-conceived Beacon program, which shared user purchase activity across the social network.

Just like libraries, it may seem that the potential harm from this program is minimal. You rent a copy of Wild Things, and the next thing you know your out-of-town girlfriend spots it on your News Feed and you’re having to explain yourself. But just like library books, movies can be an indication of who we are. Unfortunately, they are symptoms that point in a million different directions — symptoms that carry with them a potential for misinterpretation as tempting as a serpent’s apple.

We are eternally trying to find the right balance between freedom and security. Thankfully, books and other media coexist with speech firmly on the ‘freedom’ side of the line. Let’s keep it that way.

What do you think?

There is no ‘right’ brain

April 16th, 2009

Alex Madison and Lisa Harmon from Email Insider have written two articles (one and two) on the shift from valuing left-brain attributes to valuing right-brain attributes. The pieces were inspired by the new Daniel H. Pink book, A Whole New Mind: Why Right Brainers Will Rule The Future.

Say Madison and Harmon, ‘In his innovative book… Daniel H. Pink argues that our world has shifted from “left brain” dominance to the reign of right-brain thinkers: designers, inventors, teachers and storytellers. He deems this era “The Conceptual Age.”’

Bravo. I’m delighted that empathy, play and meaning are coming into their own. Likewise, it’s about time businesses recognized the importance of design, story and symphony. Bring on the right brain!

At the same time, I find it interesting that the book is (probably intentionally) titled, ‘A Whole New Mind’. I haven’t read the book, so I’m not presuming to rebut its contents; instead, I’d like to explore a bit our human tendency to polarize.

When we polarize, we seek an extreme. We reduce the world to distinct categories, and then we elect from those categories: left or right, male or female, conventional or organic. We succumb to the ‘tyranny of the OR’ described by Collins and Porras in Built to Last.

The world we live in provides ample fuel for this tendency. It obligingly splits itself up into night and day, north and south, up and down. It practically begs us to choose sides.

If we pay close attention, though, we start to notice what philosophers and poets and gurus have observed throughout the millennia: that no thing exists without its opposite. Without night, day is meaningless; without north, south is meaningless; without up, down is meaningless. Yin contains yang and yang contains yin. Our world is the wholeness that contains all of our extremes.

The left brain — that logical, rational, emotionless creature — is what gives us the power to analyze, to reason, to plan, to calculate. It allows us to pay our bills, buy our houses, send our kids to college. None of these things are bad things.

The right brain — that creative, playful, feeling creature — is what gives us the power to explore, to dream, to invent, to transform. It allows us to find meaning, intuit connections, appreciate beauty. None of these things are bad things.

I prefer to live a life in which I can appreciate beauty AND pay the bills. I prefer to live a life in which I can plan ahead AND experience spontaneous joy. I don’t believe these things are mutually exclusive.

The research done by VortexDNA, whose technology powers the Web Genome Project, shows that companies that pay equal attention to all their stakeholders (customers, staff, shareholders, community and society) consistently outperform companies that have a disproportionate focus on any subset. I would argue that the same need for equal attention exists for individuals, and that we use our brains to greatest effect when we use them whole-mindedly.

It’s wrong to say that the right brain is more important or the left brain is more important. The only ‘right’ brain is the whole brain.

Do you use your whole brain?

The Numerati and The Web Genome Project: Kindred Concepts

March 31st, 2009

I’m about halfway through Stephen Baker’s book The Numerati, and I get more excited with every page. It’s as if Baker had written a treatise on The Web Genome Project and what we’re all about — including making the case for a prediction model that doesn’t rely on historical data.

From the introduction:

The exploding world of data, as we’ll see, is a giant laboratory of human behavior. It’s a test bed for the social sciences, for economic behavior and psychology. Researchers at companies such as Microsoft and Yahoo are busy hiring scientists from fields as diverse as medicine and linguistics to help them grapple with the bits of our lives that are pouring in. These streams of digital data don’t recognize ancient boundaries. They’re defined by algorithms, not disciplines. They can easily cross-fertilize. This means that psychologists, economists, biologists, and computer scientists can collaborate as never before, all of them sifting for answers through countless details of our lives. Jack Einhorn, the chief scientist at a New York media start-up called Inform Technologies, predicts that the great discoveries of the twenty-first century will come from finding patterns in vast archives of data. “The next Jonas Salk will be a mathematician,” he says, “not a doctor.”

Baker goes on to explore the many ways in which people are being modeled and mapped, and in which mathematics is being used to predict human behavior. So far, though, the scenarios all fall under what we’d now think of as ‘traditional’ behavioral modeling: look at what you’ve done, and use it to predict what you’ll do. In some cases, the connection may be correlative rather than causative (the example he gives is that romantic-movie watchers are more inclined to click on ads for car rentals), but the net result is the same.

…math-based predictions rely on patterns of past behavior. Let’s say I fly to Taiwan tomorrow and purchase 200 Michelin tires with my credit card. Within minutes, MasterCard will be calling my house in New Jersey, asking if that’s really me on an Asian spree. My buying patterns and those of card thieves are etched into their system.

These models obviously have their place, but they also have limitations. I don’t know anyone who complains when a credit card company monitors their account activity and throws up a flag when something looks abnormal. On the other hand, I don’t know anyone who would be happy if their credit card activity were given to other companies in order to better target sales offers.

Once our data is out in the world, its uses and movements become largely disconnected from us and our ability to grant permission. Like derivative mortgages, the data takes on a life of its own, independent from the individual who generated it — and there’s something about that that just doesn’t sit well with most of us.

So, yes, mathematical modeling is where we are and where we’re going. Mathematical models that predict behavior without tracking individual histories? Even better.

Have you read The Numerati? What did you think of it? And do you see the connection with the Web Genome Project as well?

A talk by Hal Varian, Google’s Chief Economist

March 20th, 2009

Professor Hal Varian, Chief Economist at Google

I had the privilege this week of attending a lecture by Professor Hal Varian, Chief Economist for Google. Varian discussed the advent of computer-mediated transactions and how they transform our business practices.

He raised several interesting points: historical (in a pre-literate and pre-numerate era, how could people shipping barrels of olive oil have any confidence that the amount of oil that left was the same amount that arrived?), logistical (computer-mediated transactions enable more and more complex contractual arrangements), and conceptual (behavioral targeting, etc.).

This last, conceptual, is a big thing for Google these days, since they’ve been in the behavioral targeting business for all of two weeks. It’s also where Varian started to get into Web Genome Project territory. I found one thing he said particularly interesting:

In general, people have no problems with the intended use of data (more relevant content, etc.). What people are worried about is the unintended use of data (AOL’s massive data spill, etc.). The problem, therefore, is not so much a privacy problem, but rather a security problem.

That’s a pretty interesting comment, and it certainly rings true to me. “I don’t want Google knowing all this stuff about me,” people say. “Who knows what they’re going to do with it? What if somebody unscrupulous gets their hands on it?”

The core proposition of the Web Genome Project is personalisation with privacy. In light of Varian’s comments, however, it’s worth revisiting that proposition, because in fact it’s much stronger than that. The WGP model means that no clickstream or historical data is ever collected in the first place. If a thief were to break in, the vault would be empty; there’s just nothing there. So the model actually eliminates the entire question of privacy. It doesn’t much matter whether I can keep your data private if I don’t have any data on you to begin with.

Gratifying stuff from someone who’s earned his stripes. What are your thoughts about privacy vs. security?

Official Launch of the Web Genome Project

March 10th, 2009

Today is our official launch of the Web Genome Project, complete with press release distribution. I’ve included the release below; if you know anyone who might be interested in what we’re doing, by all means feel free to pass it along. Thanks for visiting!

Web Genome Project Launches Movement to Map the Internet

VortexDNA, Christchurch, NZ March 11, 2009. The Web Genome Project (WGP), designed to revolutionize the way we understand and interact with the Internet, launched today with an interactive search engine at www.webgenomeproject.org.

The WGP allows each individual a totally private way to find personally relevant content on the Web.

Each page on the Web has a distinct personality and flavor — as does each person who surfs the Web. The WGP dynamically and continuously calculates a numerical profile — a ‘genome’ — for web pages, based on the aggregate genomes of their visitors.

Visitors to webgenomeproject.org can use the tool to compare search results to a ‘filter genome’. They can adjust the filter to see how different genomes affect the order of search results, and they can also create their own genomes.
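
For readers who want a concrete picture of the two paragraphs above, here is a minimal sketch in Python. It is not VortexDNA's algorithm, which the release does not describe. It assumes, purely for illustration, that a genome is a short list of numbers, that a page genome is the running average of its visitors' genomes, and that "comparing to a filter genome" means sorting by Euclidean distance. The names `PageGenome`, `add_visitor` and `rank_by_filter` are hypothetical.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class PageGenome:
    """Running average of visitor genomes for one page (hypothetical model)."""
    values: list[float] = field(default_factory=list)
    visits: int = 0

    def add_visitor(self, visitor: list[float]) -> None:
        """Fold one visitor genome into the page's aggregate."""
        if self.visits == 0:
            self.values = list(visitor)
        else:
            # Incremental mean: only the aggregate is kept,
            # never the individual visitor genomes.
            self.values = [
                v + (x - v) / (self.visits + 1)
                for v, x in zip(self.values, visitor)
            ]
        self.visits += 1


def rank_by_filter(pages: dict[str, PageGenome],
                   filter_genome: list[float]) -> list[str]:
    """Order page URLs from closest to farthest from the filter genome.

    Pages that have no genome yet are left out of the ranking.
    """
    scored = [url for url, genome in pages.items() if genome.visits > 0]
    return sorted(scored, key=lambda url: dist(pages[url].values, filter_genome))


# Example with made-up two-number genomes and placeholder URLs.
pages = {"a.example": PageGenome(), "b.example": PageGenome()}
pages["a.example"].add_visitor([1.0, 0.0])
pages["a.example"].add_visitor([0.0, 1.0])
pages["b.example"].add_visitor([0.0, 0.0])
ranking = rank_by_filter(pages, [0.6, 0.6])
```

Adjusting the filter genome and calling `rank_by_filter` again reorders the same results, which is the behaviour the release describes. The running average is one simple way to keep a page profile current without storing who visited.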

The Web Genome Project has been well received in the search industry. Charles Knight, Editor of the popular blog AltSearchEngines, said, “I downloaded the extension and gave it a spin… the WGP was spot on – and then some!” Mark Cramer, the CEO of SurfCanyon, shared Knight’s sentiments, saying, “I like it… I can see this becoming viral.”

As genomes get generated for more and more pages, they create a virtual topography of the Web. Individuals can use this topography to find sites that share their genomes.

Anyone can contribute to this virtual topography by installing the WGP extension, completing a short survey to create an initial genome, and then using the Web the way they normally do.

Individual genomes are based on a predictive algorithm from VortexDNA. They’re not personally identifying in any way, are not unique to the user, and don’t contain any demographic or historical information.

“There are more than 108 million websites on the World Wide Web,” says Branton Kenton-Dau, VortexDNA’s CEO. “The WGP is an attempt to make sense of it all, so everyone can enjoy the Internet more without being followed around online or having their clickstreams tracked.”

The WGP’s stated goal is to generate genomes for ten million web pages. So far more than half a million pages have associated genomes.

- END -

ABOUT THE WEB GENOME PROJECT
The Web Genome Project is a global movement to map the Web and make sense of its billions of pages. Its aim is to give us the ability to tune into the content we’re most interested in at any given time.

ABOUT VortexDNA
VortexDNA offers a unique system for profiling users without retaining personal information, and the ability to map and codify that profile. Its predictive modeling algorithm has applications for online services, insurance, and health care.