Four Pillars: Four more themes before the next recap

Yes, it’s nearly time for the next recap. Tempus f. and all that jazz, but I hope to complete the recap before I go on vacation with the family in early August. So I thought I’d share a few things buzzing around inside my head, see what you think.

The first theme is about client-side and server-side software and how they’re evolving. More and more, as web services and SOA and virtualisation become part of our lives, we get the opportunity to look at what happens at client and server level with a slightly different perspective. Some old problems go away, and new ones emerge.

What I’m mulling over is this: As client installs become thinner and smaller from an end-user application perspective, we may get two significant benefits. One, we can make real progress on (client) platform and device agnosticism, with sharply reduced rollout costs and longer device lives, and even lower maintenance costs; and two, we have this by-product of real diversity at the (client) device level, a diversity that acts as a natural brake on virus propagation.

The second theme is about caching versus long-tail. A lot of the arguments about net neutrality tend to focus on “Someone must pay for all the upgrades we must do, in order to let all of you download all these videos that are going to clog up the tubes and make sure Senator Stevens never receives his internet”, while the real arguments may be about something else altogether: see Doc’s recent post on the subject, and Gordon’s follow-up.

What I’m mulling over is this: There’s a lot of talk about some form of local “neighbourhood” caching to solve the “problem” of video downloads (while happily skipping over the forced asymmetry with respect to uploads); I’ve even heard tell of trucks being deployed as mobile mega-caches. But cache what? I thought there was a very long tail of things people watched, as Chris Anderson quite clearly demonstrated. The caching discussions I’ve seen all tend to assume that the concept of “hits” will remain, which obviously makes caching useful. But I can’t reconcile the long-tail argument with the cache argument. [That’s one more reason for me to stay Confused].

The third theme is that of customer information versus DRM. Dick Hardt et al have done wonders in educating all of us about “It’s the customer’s data, stupid”. And Doc and Steve Gillmor et al have done similar wonders in getting us to understand attention and intention. So we’ve got to grips with the idea that the customer owns his/her intentions, purchasing behaviour, preferences, the lot.

What I’m mulling over is this: What happens to “content” and DRM hawks if the customer says no, you can’t have my data, it’s an invasion of my privacy? Aren’t those behaviours and profiles and clickstreams worth much more (to the content “owner”) than the apparent loss of revenue as a result of no DRM? What would the content “owners” do if someone suddenly turned the tap off? A sort of You Can’t Mine My Data Because the Data’s Mine.

The fourth theme is about synthetic worlds and their value to enterprises, particularly if Second Life met Tivo. You’ve already seen me get started on some aspects of this.

Blogging is provisional, it’s a sharing of nascent thoughts and ideas and kernels and snowballs, trying to see what happens if enough eyeballs see the thoughts and ideas. So, before I do the next recap, I wanted to get your opinions on these themes, see where I’m going wrong.

11 thoughts on “Four Pillars: Four more themes before the next recap”

  1. Synthetic World will be *the* battleground for the enterprise talent hunt. Coz:
    1) That is where Gen M hangs out.
    2) And that is where mashups bloom.
    3) There will be easy ways of mapping the traits needed to synthetic-world karma indicators like game score, reputation, etc.

  2. Hi JP, I was reminded of you this morning as I read a friend’s mail. I know you used to have a thing with numbers and their properties. And now I think you would be one with the spirit of: “Computer Science is no more about computers than astronomy is about telescopes.” EW Dijkstra, and “The very crudities of the first attack on a significant problem… are more illuminating than all the pretty elegance of the standard texts which has been won at the cost of perhaps centuries of finicky polishing.” Eric Temple Bell, about mathematics. My friend, Abhijit, believes he has come upon an important discovery in mathematics / numbers, with far-reaching application implications. His paper is at:
    http://abhijit.info/tristate/tristate.html You might be interested. Thanks. Best, chutki

  3. Perhaps there is another – related – strand (maybe not enough to be a ‘theme’) and that is the immediacy of Web Services/SOA/SaaS/Web2.0

    Users are already disgruntled with the concept of ‘install’, rebelling against the idea of ‘implement’.

    With the user base suffering from a combination of Customer Attention Deficit Disorder and the need for instant gratification, how does the Enterprise Software market (big code) respond?

  4. Abhijit’s Balanced ternary paper brought back fond memories of my time with the Trinity Mathematical Society http://www.srcf.ucam.org/tms/archive.php when I restored the tradition of presenting the accounts in reverse duodecimal, which had the digits -6 .. +6. This had two main advantages, one being that there was no distinction between addition and subtraction, and the second being that they were even more incomprehensible than normal accounts!
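    Conversion to such a balanced base is mechanical; an even base like 12 with digits -6 .. +6 is slightly redundant (one more digit than it strictly needs), but that does no harm. A minimal Python sketch, with function names invented for illustration:

```python
def to_balanced_base(n, base=12):
    """Represent integer n in balanced base-`base`, using digits
    in the range -(base//2) .. +(base//2). Most significant first."""
    if n == 0:
        return [0]
    half = base // 2
    digits = []
    while n != 0:
        d = n % base          # ordinary remainder, 0 .. base-1
        if d > half:          # fold into the balanced digit range
            d -= base
        digits.append(d)
        n = (n - d) // base
    return digits[::-1]

def from_balanced_base(digits, base=12):
    """Evaluate a balanced-base digit list back to an integer."""
    value = 0
    for d in digits:
        value = value * base + d
    return value
```

    Negating a number just flips the sign of every digit, which is why addition and subtraction collapse into a single operation.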

    Donald Knuth took the idea further, with the quater-imaginary system http://en.wikipedia.org/wiki/Quater-imaginary_base with base 2i. Every complex number can be represented with just the digits 0, 1, 2, 3 without a sign.

    For even more esoteric discussions to take to the pub, what is God’s own numbering system? It must be able to span from the Planck length (1.616 x 10^-35 m) to the diameter of the universe (perhaps 1.5 x 10^27 m), a ratio of about 9.3 x 10^61 times. That requires 206 bits of precision or 130 trits or 103 quadrits or … And when wrap-around occurs we have the cyclic universe theory :-)
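    For what it’s worth, recomputing the digit counts from the two lengths quoted above takes one line per base:

```python
import math

planck_length = 1.616e-35       # metres
universe_diameter = 1.5e27      # metres, the figure quoted above

ratio = universe_diameter / planck_length   # about 9.3 x 10^61

# smallest number of base-b digits needed to span that many scales
bits     = math.ceil(math.log(ratio, 2))    # 206
trits    = math.ceil(math.log(ratio, 3))    # 130
quadrits = math.ceil(math.log(ratio, 4))    # 103
```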

  5. Oh come on JP, you know the caching is to reduce the bandwidth costs of the short fat head and not the long thin tail.

    This works because the economics/maths of the long tail is (surely) that the integral of the tail in the ‘attention curve’ is less than that in the head, while, for the profit curve, it can be the other way around.

  6. :-) I take the Fifth….
    Surely the short fat head is more suited to BitTorrenting than the tail, which still leaves the caching question open. And for the sake of this argument I am considering all video to be time-shiftable…. So: if the bandwidth issue is all about video, and downloads at that, and there’s a 2% short fat head, and BitTorrent can do that in its sleep given the “hits” nature, and all video is time-shiftable, then what exactly is left for a cache to do?

    Lots of assumptions. All I’m trying to do is take a no-taboos view, no anchors or frames.

  7. Ah yes, the so-called ‘view from nowhere’.

    If you haven’t yet seen it you may enjoy:
    http://www.amazon.com/gp/product/0195056442/002-0429603-4396814?v=glance&n=283155

    Dream the Cartesian dream if you like ;-)

    Possibly the courtroom of public opinion demands more though?

    Talking of peers… Of course, there is a sense in which an extra peer on BitTorrent is just another (active) cache, and any active cache is a peer in a multimaster (asynchronously) replicated environment.

    If you want to restrict it to time-shiftable stuff then that’s a good assumption IMHO.

    So the question is: what is the impact of adding new peers to a content delivery network? New peers decrease load on the old peers (the servers), but they only decrease network traffic if getting the data from the new peer costs less than getting it from the old peer; in other words, if the new peer is nearer in the network topology than the old peer. I.e. peer = cache.
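    As a toy illustration of that point (all node names and hop counts below are made up): a client fetches from whichever peer is nearest, so a new peer only reduces traffic if it sits closer in the topology than the peers you already had.

```python
# Toy model: cost of serving a client = network distance to the
# nearest peer holding the content. Names and hop counts are made up.

def fetch_cost(client, peers, distance):
    return min(distance(client, p) for p in peers)

hops = {
    ("client", "origin"): 10,       # distant origin server
    ("client", "isp_cache"): 2,     # nearby cache, i.e. a close peer
    ("client", "remote_peer"): 12,  # a peer even further than the origin
}
distance = lambda a, b: hops[(a, b)]

cost_origin_only = fetch_cost("client", ["origin"], distance)                 # 10 hops
cost_near_peer   = fetch_cost("client", ["origin", "isp_cache"], distance)    # 2 hops
cost_far_peer    = fetch_cost("client", ["origin", "remote_peer"], distance)  # still 10
```

    The far peer still offloads the origin server, but it moves no less traffic across the network.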

    For guys like Warner, reducing server load makes sense of course:
    http://gigaom.com/2006/05/08/bittorrent-snags-warner-brothers/

    For anyone shovelling traffic around, adding a new peer may not help unless the traffic routes elsewhere. Moreover in this case time-shifting does not help except in (possibly) suggesting some traffic shaping heuristics.

    Net net: the benefits of caching/peering etc do not look like a solution in themselves, though clearly they are a necessary step towards one. There is a report on this which has some useful data in it. I am going to ask the owner if they can put it on the web and provide a link.

    Finally, re your 2% short fat head number. Fair assumption but the distribution is surely *so* aggressively Zipf (eg http://www.useit.com/alertbox/zipf.html) that the short fat head constitutes most of the volume? Consider eg the release of a new ‘mega’ movie. How much traffic do we expect in week 1?

    What I’m saying is that the long tail may contribute relatively little to the ‘caching’ problem.
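    A toy Zipf calculation backs this up (the catalogue size and exponent are assumptions, not data):

```python
# How much of total demand does the top 2% of titles capture under
# a Zipf popularity curve? Catalogue size and exponent are assumed.

N = 1_000_000                                       # assumed catalogue size
weights = [1.0 / rank for rank in range(1, N + 1)]  # Zipf, exponent 1
total = sum(weights)

head = int(N * 0.02)                                # top 2% of titles
head_share = sum(weights[:head]) / total            # roughly 0.73
```

    Even with a modest exponent of 1, the top 2% carries roughly 73% of demand; a steeper exponent concentrates it further.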

  8. What I took from Chris Anderson is that the head is nearer 50% than 2% (although I’m only half way through The Long Tail).

    The other problem with caching this stuff is recognising it. You can see a future where P2P traffic is encrypted so ISPs can’t recognise it and throttle it back. If it’s encrypted, it can’t be cached.

  9. OK here’s the promised link:

    http://www.gpbullhound.com/research/documents/SectorReportP2PJuly2006.pdf

    This report looks at the data explosion and provides a little data and some estimates. Scary stuff.

    Nic – good point on the 50% vs 2%. But do you mean that in the same sense as eg “The top 2% of the sites contribute 50% of the volume”? If so then I think we all mean the same thing, though I would personally hazard a guess that for downloadable movies it might be 1% contributing more than 75% of volume. Don’t know the numbers though.

    I think you are also right that ISPs will most likely be active participants in the solution to the load problem. That said it *is* possible to combine the benefits of encryption with caching. In several different ways, in fact.
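    One of those ways, sketched very loosely here, is convergent encryption: derive the key from the content itself, so identical plaintexts yield identical ciphertexts, which a cache can recognise and deduplicate without ever holding a key. The keystream construction below is a toy for illustration, not a real cipher.

```python
import hashlib

def convergent_encrypt(data: bytes) -> bytes:
    """Toy convergent encryption: key = SHA-256(plaintext), so the same
    content always encrypts to the same bytes (hence cacheable).
    Illustrative only; not a production cipher."""
    key = hashlib.sha256(data).digest()
    keystream = b""
    counter = 0
    while len(keystream) < len(data):
        keystream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, keystream))

# Two users encrypting the same chunk produce identical ciphertext,
# so a shared cache can serve it without seeing anyone's keys.
c1 = convergent_encrypt(b"same movie chunk")
c2 = convergent_encrypt(b"same movie chunk")
```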

Let me know what you think