VentureBeat: Open data is the future of web discovery

  • ifij775 · 3 months ago
    Isn't "important tweets" an oxymoron, or am I just a moron?

    Chris
    http://worstiphoneapps.blogspot.com
  • NMN · 3 months ago
    Nice post. It's an interesting concept, but would they be able to follow through and optimize the potential that this idea has? http://AppUseful.com
  • jackabraham · 3 months ago
    Toolbar data is by far the most powerful data on the web. It would be amazing if developers had access to it. It could literally be used to make every single website more personalized, interesting and fun. But Google won't give it up because they recognize this and want to use the data to improve their search rankings by measuring bounce rate.

    It really bothers me how Google justifies collecting the data, too. No normal person cares about knowing the PageRank of any page they visit on the web. Most people don't even know what PageRank is. It's a Google "Do Evil" trick that is paying big dividends for them at the web's expense.
  • eugeneshteyn · 3 months ago
    Excellent and timely post. The increase in smartphone use increases the amount and scope of "toolbar" data that can be collected by the hosting browser and the operating system. The software would know which applications you use the most, which data sources you prefer, what type of entertainment you like, how much and when you talk to your friends and colleagues, etc. A lot of opportunities for customizing service, and a lot of opportunities for privacy abuse.
  • Guna Deivendran · 3 months ago
    Very interesting post. You make good points. Having the data is powerful, and if companies learned to share it and build tools to meet users' demands, we would be far ahead. But because data is so powerful, many are reluctant to share it. They all want to win. The issue here is that real entrepreneurs can't get hold of the data to do a gap analysis, so they can't fill the gap with a new product. Twitter-like tools are going to make it easier to understand where people are, but it is going to take time.

    -Guna
  • facebook-719131173 · 3 months ago
    Doug,
    Your article, “Open Data is the Future of Web Discovery,” is spot on, and both you and those interviewed bring up many of the complex challenges and opportunities available today. The article brings up the search and discovery problems that TipTop (http://feeltiptop.com/) is solving by combining the best of homegrown semantic algorithms and user-generated content published on Twitter.
    The creators of TipTop hope to revolutionize the role that the Internet plays in people’s everyday lives by helping them find the people and the information that matters to them in a matter of seconds. TipTop isn’t a replacement for other search engines – it’s another concept entirely! TipTop helps people connect with other people to exchange ideas and experiences, rather than only find out factual information.
    I'm sure the founder of TipTop, Shyam Kapur (shyam@tiptopbest.com), would be happy to share his thoughts on using “toolbar data” and its value in the world of search and discovery as well.

    Thanks for putting all of these complexities in perspective,
    Greg Martin
  • paulhallett · 3 months ago
    Thought-provoking. One of the examples you give here - finding hot/trending restaurants - represents an interesting opportunity to mix social + local... Please take a look at this Twitter app, released today in public beta:

    http://www.schmap.com/picks

    (A way to browse/filter Twitter-trending restaurants in San Francisco, New York and 11 other top Twitter cities.)
  • laurence01 · 3 months ago
    I recently came across your blog and have been reading along. I thought I would leave my first comment. I don't know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.

    Sara


    http://pianotutorial.net
  • ajolie · 3 months ago
    Why don't you offer a single-page, text-only format when you do longer articles? This is really simple to do (dump the text to a page), and it makes the article so much easier to read.
  • marccanter · 3 months ago
    Thank you sir

    This is excellent - your theories ring solid and whole.

    Now let's get them implemented!
  • Rory · 3 months ago
    Though perhaps not as germane to real-time trending of web behavior, bookmarks (arguably a subset of toolbar data) provide a highly curated data source for analyzing individual user interests and their individualized semantic structuring of those interests (i.e. how individuals categorize and associate various topics). By analyzing this data across many users, Xmarks (http://www.xmarks.com), a bookmark hosting and sharing service, provides one of the best web recommendation engines I've seen. The service includes a widget in the address bar which shows related websites. The accuracy/relevance of the recommendations it provides is better than other discovery services, in my opinion, because it is based upon highly curated data. Bookmarking is a far more structured and precise "clickstream" than link sharing or clicks.
  • Afsheen · 3 months ago
    Doug,
    Great article! One of the critical, recurring themes in propelling this space forward is collaboration. When Facebook opened up its API, connecting through Facebook became prominent not only on sites with a lot of traction like Plaxo but also on sprawling services like Covet. This obviously bridged the gap for many other companies to attract and retain users. Similarly, as you mention in the conclusion of your article, it’s going to require a heavyweight with significant amounts of data to make toolbar data available for wider use. I think this in itself will also require significant collaboration on the standards for what is considered private data and what is not, along with how it is pushed out in the context of social search – but as with the first launch of the mini-feed on Facebook, it might take users some time to adjust.
  • Mark Drummond · 3 months ago
    Toolbar data (as you define it) is indeed valuable. But sending that data from each user's client to some centralized data management system is a process fraught with privacy concerns.

    So, at Wowd we take a different approach... no toolbar data comes off the local machine, but the user still benefits from web search ranking, recommendations, etc., that use anonymous site "vote" data.

    Roughly, it works like this.

    I'm a member of the Wowd network, and I visit a site. My local Wowd client first checks to see if that web page is publicly available – meaning, can other people out there on the web see the same page that I’m seeing.

    If I’m visiting a public page, then the site is nominated for inclusion in the Wowd index. No personally identifiable information leaves my machine!

    By visiting the site I'm simply and implicitly voting for it. That's all. The indexing of the publicly available site is done from another machine in the Wowd network, not mine.

    So users get all the benefits of a system that understands toolbar / click-stream data, without having to actually share a single scrap of personally identifiable information. The results are better for users, and the privacy is better too.

    More information about the Wowd approach at http://blog.wowd.com/
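
For readers who want to see the flow in Mark Drummond's comment spelled out, here is a minimal sketch of the idea, not Wowd's actual implementation. The names `VoteIndex`, `record_visit`, and `is_public` are hypothetical and exist only for illustration. The sketch assumes the public-page check is a function supplied by the caller and that the only thing ever sent to the shared index is the URL itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class VoteIndex:
    """Hypothetical shared index mapping a URL to its anonymous vote count."""
    votes: Dict[str, int] = field(default_factory=dict)

    def nominate(self, url: str) -> None:
        # Only the URL arrives here; no user or machine identifier is passed.
        self.votes[url] = self.votes.get(url, 0) + 1


def record_visit(url: str, is_public: Callable[[str], bool], index: VoteIndex) -> bool:
    """Cast an implicit vote for `url` if the page is publicly visible.

    Returns True if a vote was cast, False if the page was kept local.
    """
    if not is_public(url):
        return False  # non-public pages are never reported
    index.nominate(url)
    return True


# Usage with a stand-in public-page check (an assumption for this example).
public_pages = {"http://example.com/a"}
index = VoteIndex()
record_visit("http://example.com/a", public_pages.__contains__, index)
record_visit("http://example.com/private", public_pages.__contains__, index)
```

The design point the sketch tries to capture is that `nominate` takes a URL and nothing else, so the index can count votes without learning who cast them. How a real client decides that a page is public, and how indexing is handed to another machine, is not covered here.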