Entropy and the Future of the Web

January 19, 2010


Inspired by this post by Chris Dixon, I summarized my thoughts on the future of the web in a single tweet:

The fundamental question that will shape the future of the web is how we deal with entropy.


The disorder of the web thrives on both content and the connections between content. Today’s approach to the web of tomorrow depends on how we address this issue. The figure below shows what we can expect from different combinations of low and high entropy in the two layers.

Entropy in the content layer reflects the degree of internal disorder. If we choose to lower content entropy by adding relevant metadata or structure, we’ll realize the semantic web. If we don’t, content will remain unorganized and we’ll end up in the noisy web.

Entropy in the connection layer expresses disorder in the network of content. By defining meaningful relations between content elements, we decrease connection entropy, which leads to the synaptic web. Should we leave connections in their ad-hoc state, we’ll arrive in the unorganized web.

The study of web entropy becomes interesting when we take a look at the intersections of these domains.

  • Semantic – synaptic: The most organized, ideal form of the web. Content and connections are thoroughly described, transparent and machine readable. Example: linked data.
  • Semantic – unorganized: Semantic content loosely connected throughout the web. Most blog posts have the valid semantic structure of documents; however, they’re connected by hyperlinks that say nothing about their relation (say, whether a blog entry extends, reflects on or debates the linked one).
  • Noisy – synaptic: Organizes high entropy content by connecting relevant elements via meaningful relations. Among others, tagging, filtering, recommendation engines and content mapping fall into this domain.
  • Noisy – unorganized: Sparse network of unstructured content. This is the domain we’ve known for one and a half decades, where keyword-based indexing and search still dominate the web. If it continues to develop in this direction, then technologies such as linguistic parsing and topic identification will definitely come into play in the future.
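
The four domains above can be read as a simple two-by-two lookup. Here is a minimal sketch of that reading, not a way to actually measure web entropy: the numeric scores in the range 0 to 1, the 0.5 threshold, and the names Domain and classify are all assumptions made for illustration, since the post defines no numerical measure.

```python
from enum import Enum


class Domain(Enum):
    SEMANTIC_SYNAPTIC = "semantic - synaptic"
    SEMANTIC_UNORGANIZED = "semantic - unorganized"
    NOISY_SYNAPTIC = "noisy - synaptic"
    NOISY_UNORGANIZED = "noisy - unorganized"


def classify(content_entropy: float,
             connection_entropy: float,
             threshold: float = 0.5) -> Domain:
    """Map two hypothetical entropy scores (0 = ordered, 1 = disordered)
    to one of the four domains. The threshold is an arbitrary assumption."""
    low_content = content_entropy < threshold        # semantic vs. noisy
    low_connection = connection_entropy < threshold  # synaptic vs. unorganized

    if low_content and low_connection:
        return Domain.SEMANTIC_SYNAPTIC
    if low_content:
        return Domain.SEMANTIC_UNORGANIZED
    if low_connection:
        return Domain.NOISY_SYNAPTIC
    return Domain.NOISY_UNORGANIZED


if __name__ == "__main__":
    # A well-structured blog post reached through plain, untyped hyperlinks.
    print(classify(content_entropy=0.2, connection_entropy=0.9).value)
```

In this reading, lowering either score below the threshold corresponds to spending effort on one layer: metadata and structure for content, typed relations for connections.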

Which one?

The question is obvious: which domain represents the optimal course to take? Based on the domains’ descriptions, semantic – synaptic seems to be the clear choice. But we’re discussing entropy here, and from thermodynamics we know that entropy grows in systems that are prone to spontaneous change, and that order is restored only at the cost of energy and effort.

Ultimately, the question comes down to this: are we going to fight entropy or not?

Bringing the semantic web into existence is an enormous task. To me, overcoming people’s reluctance to adopt metadata and semantic formats is unimaginable. The synaptic web seems more feasible, as the spread of social media already indicates. But in the end, what matters is which domain or combination of domains will be popular among early adopters. The rest will follow.

10 Responses to “Entropy and the Future of the Web”

  1. interesting post.
    how can you extract relationships between terms using LSA?


    • Dan Stocker Says:

      I’m no expert in NLP, but from what I understood, different words occurring in the same close context suggest synonymy, for instance, and that is detected via the term-document matrix.

  2. well I am very skeptical about it.
    first question, how do you determine a context?
    And let’s say that you talk about programming java in your car:
    “I am writing a program in java while driving in my car”.
    Then, you will get a close relationship between program, java, car and driving.
    I believe that machines cannot do the job instead of human beings.
    Only electrochemical brains know the relationship between words….
    Words have been invented by humans not computers….

    • Dan Stocker Says:

      We’re on the same page on this one. I, too, am convinced that NLP is not sufficient to bring about true understanding of content, and that human intelligence will play the key role in the semantic web. However, I also think that even inaccurate relationships detected by some algorithm may generate good leads for humans to follow up, so that we’re not just shooting in the dark.

  3. yes, but it seems that we have been relying on the same paradigm for too long. time to shift to another one.

  4. […] Entropy and the Future of the Web « Collective Web (tags: entropy internet informtion-overload synaptic-web) […]

  5. Gunar Says:

    Having done my undergrad in Physics I totally get this – it is a nice infographic that summarizes a number of interactions. I like the concept of synaptic web to describe the action of the semantic web.

    Reminds me that we humans exist because we are designed to fight entropy – at least for 80 years or so.

    The other thing to consider is that the energy and effort to keep systems organized is decreasing and becoming more democratized. High energy systems such as a central index (aka Google, aka humans) may succumb to more distributed, yet coordinated systems (aka Linked Data, aka bacteria).

    Cool post!

  6. Greg Satell Says:

    Don’t mean to be a jackass, but if “we” are going to decide to “fight” entropy then “we” will have to self organize (i.e. fight our own collective entropy).

    It’s almost like the famous ontological argument where the question presupposes the answer.

    • Dan Stocker Says:

      I agree that the question is a bit biased and leaning towards the “not fighting” end. But that’s only because I don’t see much sense in fighting entropy rather than making sense of it.

  7. Hiteshwar Azad Says:

    sir the most important thing is that “can we reduce the entropy at any layer in our web??” If yes then i think we have the most supersonic search engine in the world.
