Freewheeling on “Filtering on the way out”

I said I would post further on David Weinberger’s Four Strategic Principles as outlined in his new book, Everything is Miscellaneous.

David’s first principle is to filter information on the way out, not on the way in. I’m still working on it, masticating it, there’s some work involved, but I like the early flavours I can taste. So I thought I’d share with you the kind of stuff that went through my head when I saw that sentence and read what followed. Humour me.

1. In order to filter on the way in, we need to have filters, filters which can act as anchors and frames and thereby corrupt the flow of information. We’ve learnt a lot about anchors and frames and their effect on predilections and prejudices and decision-making. With David’s first principle, we reduce the risk of this bias entering our classification processes too early.

2. I think it was economist Mihaly Polanyi who talked about things that we know we know, things that we know we don’t know and things that we don’t know we don’t know. Again, filtering on the way in prevents us gathering the things that we don’t know we don’t know.

3. The act of filtering is itself considered necessary to solve a scale problem. We can’t process infinite volumes of things. But maybe now it’s okay to be a digital squirrel, given the trends in the costs of storage. [Sometimes I wonder why we ever delete things, since we can now store snapshots every time something changes. We need never throw away information]. Filtering on the way out becomes something that happens in a natural-selection way, based on people using some element of information, tagging it, collaboratively filtering it.

4. I like the idea (proposed by David) of there being no need to throw stuff away. You just have to not-find it. If you can’t find it you might as well have thrown it away, and if it all costs the same then who cares? Reminds me of the Douglas Adams definition of flying: jumping off a tall building and missing.

5. Collecting information this way is fine, but it has no value unless someone tends to it, someone looks after it. So maybe I shouldn’t be thinking ‘not-find’ and instead I should find ways of incentivising people to clean up their information. Maybe there is a Silent Spring for information. I somehow like thinking of bad DRM and proprietary tools, methods, structures and standards as weeds that strangle the life out of good information. But then I would, wouldn’t I? Walled gardens have the worst sort of weeds.
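For the programmers among you, here is a minimal sketch of what filtering on the way out might look like. It is my own toy illustration, not anything from David's book: the `Pile` class and its `add`, `tag` and `find` methods are hypothetical names. Everything is kept on the way in, and the only filtering happens at read time, driven by whatever tags people have attached.

```python
from collections import defaultdict


class Pile:
    """Toy store that rejects nothing on the way in (hypothetical example)."""

    def __init__(self):
        self.items = []               # every item ever added, in arrival order
        self.tags = defaultdict(set)  # tag -> set of item ids carrying that tag

    def add(self, text):
        """Store an item unconditionally and return its id."""
        self.items.append(text)
        return len(self.items) - 1

    def tag(self, item_id, *labels):
        """Attach any number of labels to an item after the fact."""
        for label in labels:
            self.tags[label].add(item_id)

    def find(self, *labels):
        """Filter on the way out: return items that carry all given labels."""
        if not labels:
            return list(self.items)
        ids = set.intersection(*(self.tags.get(label, set()) for label in labels))
        return [self.items[i] for i in sorted(ids)]


pile = Pile()
dewey = pile.add("note on Dewey")
photo = pile.add("holiday photo")
pile.tag(dewey, "library", "classification")
pile.tag(photo, "family")
```

Untagged items are never deleted; they are simply not found by a tagged query, which is the "not-find" idea in point 4.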

Just musing. Comments welcome.

On content and kingdoms

I’m a weird kind of guy. I stopped reading David Weinberger’s blog for a week or so because I wanted to be able to finish reading his new book without seeing any of the reviews, and to write a review “unburdened by theory” and uninfluenced by what others said.

So it was only today that I came across his pointer to Andrew Hinton’s piece on Architectures for Conversation (ii): What Communities of Practice can mean for Information Architecture.

And in that piece, I came across this wonderful quote from Cory:

Conversation is king. Content is just something to talk about.

But that’s almost an aside. Andrew’s presentation is well worth a detailed look, and will be something I will blog on in some depth sometime soon. Currently I’m just enjoying catching up with what he’s been writing.

Everything is Miscellaneous

Everything Is Miscellaneous…is the title of David Weinberger’s new book. It’s a must-read, go get it now. David is a friend, someone I have immense respect for, but don’t let my bias come in the way. Go buy the book and read it for yourself.

What is it about? I won’t make the mistake of classifying it — otherwise I might as well not have read the book….. So think of these as tag-descriptors:

  • It’s a paean to the power of the digital world
  • It’s a lesson in the challenges of information discovery and retrieval
  • It’s a history of tabulation and classification, sequinned with great anecdotes
  • It’s a sequel to Small Pieces Loosely Joined; or maybe the Cluetrain Manifesto; or maybe The Social Life of Information
  • It’s a series of blog posts on a common set of themes
  • It’s a welcome addition to my library
  • It’s what you make it

And no, it’s not a solution to the Mid-East crisis or Global Warming. It’s a book. It’s a very good book. And it is all about information, which is one reason why I love it.

David takes us on a fascinating journey through the history, geography and science of classifying information, interspersed with his wry sense of humour (e.g. defending the state of the space under the average bed: “There isn’t a part of our homes that is truly unordered, except perhaps under our beds, and for many of us even that is the site of the spontaneous ordering of dust into bunnies.” Or the way he describes Mendeleev as “unburdened by theory”).

While doing this, he keeps drawing both parallels and differences between the two prior physical orders of collection and classification and the new, emerging digital order. Anecdotes are plentiful, covering plants, species, elements, books and even subjects themselves.

Anyone who is serious about the digital world would do well to read the book; anyone interested in information should read the book; anyone who is interested in taxonomy and ontology must study the book.

As his arguments come to a crescendo, David espouses four new strategic principles, each of which deserves a set of posts in itself:

  • Filter on the way out
  • Put each leaf on as many branches as possible
  • Everything is metadata and everything can be a label
  • Give up control

I found much to fascinate me, and I am currently going through my third “very slow” read. There are tidbits for everyone: the description of the arguments between Panizzi and Carlyle should stir memories for everyone who’s ever been involved in a “we will define a data structure for everything” project; the description of Schachter’s insights compresses a great deal of learning into a very small space; the paragraphs devoted to the Linnean Society HQ have an H.W. Fowler-like sense of humour: “It makes sense to bury first- and second-order organisations such as [Linnaean classification] and the Bettmann Archive. Specimens made of atoms are fragile and need protection.”

It’s a good place to go to if you want to understand more about items as diverse, yet related, as tagging, collaborative filtering, listmania, “statistically interesting phrases” and so on.

One of the more intriguing ideas David comes up with is espoused in the following sentence: “Because it can’t be fixed, the Dewey Decimal System is caught in a problem endemic to large classification systems tied to the physical world.” Until I got under the hood of that sentence, I never really accepted the notion of “legacy classification” as being a meaningful problem. Reminds me of the problems in shifting between Julian and Gregorian calendars….or why QWERTY remains in use….

I was particularly taken with the stories related to S.R. Ranganathan and his Five Laws of Library Science (a term, incidentally, that he is credited with first using). Ranganathan’s Laws are:

  • Books are for use
  • Every reader his/her books
  • Every book its readers
  • Save the time of the reader; save the time of the library staff
  • The library is a growing organism

When I first saw that, something strange stirred in me. I could imagine my maternal grandfather, Dr SV Anantakrishnan, saying just that, right down to the brusque to-the-point-ness. I was therefore completely unsurprised to find out that Ranganathan was, like my grandfather, also a Professor at Madras Christian College (where I holidayed, with my grandfather, every summer from 1961 to 1971 or so). So I will find out everything I can about the man who gave the world Colon Classification!

I was also intrigued by the way David made me understand something else that is happening, symptomatically shown in the way Wikipedia articles increase in length while Britannica articles shorten. I see something very opensource about that, and will comment in detail later.

For the unconvinced, here are some of my favourite quotes from the book:

We have to get rid of the idea that there’s a best way of organising the world.

The solution to the overabundance of information is more information.

How we draw lines can have dramatic effects on who has power and who does not.

The real problem is that any map of knowledge assumes that knowledge has a geography, that it has a top-down view, that it has a shape.

It’s not who is right and who is wrong. It’s how different points of view are negotiated, given context and embodied with passion and interest. Individuals thinking out loud have weight, and authority and expertise are losing some of their gravity.

It’s not what you know, and it’s not even who you know. It’s how much knowledge you give away. Hoarding knowledge diminishes your power.

Go buy the book. Even better, go read it.

Customer emancipation

Regular readers of this blog would know how much I care about the Three Is: doing the right thing about the Internet, Intellectual Property Rights and Identity. The weird thing about these issues is that they create conversation on both sides of the work-life fence. And, for some reason, they don’t attract the dogma and intolerance that characterise many political and religious conversations.

I dislike many of the terms used in these conversations: a perfect example is “content”, a word whose sound reminds me of fresh chalk squeaking on a glass-fronted blackboard. Now, one of the commonest phrases in which I hear that appalling word used is the following:

Content is king

And when I gently enquire of the speaker “Over what kingdom?” the usual answer I get is somewhere along these lines:

You don’t get it, do you? The content-owner rules, he owns the customer

The people who say that are right about one thing. I definitely do not get it. People who choose to call themselves content-owners and pipe-owners (another term I deeply dislike despite Senator Stevens’ attempts) start squabbling over “ownership” of the customer. Over the years, I’ve seen this manifest itself even within organisations, where power magically descends upon those who “own” the customer.

Pfui.

None of us owns the customer. If anything, the customer owns us. We seem to be taking a long time to understand this and to learn from it.

Musing about the ROI of IT

Yup, it’s time for another Very Provisional Post.

There’s something I don’t get about IT and ROI. Something fundamental. And that thing is: How can we possibly use the tools of a very old paradigm to solve the problems of a very new paradigm?

I guess this is something I’ve been musing about for fifteen years, after reading Paul Strassmann’s The Business Value of Computing.

I guess this is something I’ve wrestled with every time I’ve had to stand up and be counted during budget rounds at the various institutions I’ve worked in. And I’ve been in many such rounds, particularly since 2001, where the tone of the budget discussion was “Go South, Young Man”. And I wasn’t that young either.

I guess it is what was at the back of my mind when I read Nicholas Carr’s article in the Harvard Business Review in 2003, when I read his book a year later, and even when I spent time discussing various aspects of the issue with Andrew McAfee.

I guess I’m getting stupider as I grow older. You see, what gets me is this:

Ever since I read the Strassmann oeuvre, I’ve watched computing grow more distributed, more networked; I’ve seen a move towards more “enterprise architecture”, more middleware, more platforms. I’ve watched a substantial increase in complexity.

This increase in complexity manifests itself in many ways:

  • requirements capture has gotten harder as we made the historical silos merge and coalesce
  • estimation has gotten harder, since everything now connects with everything else
  • testing has gotten harder, particularly regression and end-to-end testing
  • delivery has gotten harder and slower as silo spaghetti entangled us
  • fault replication has gotten harder, and as a consequence so has bug-fixing
  • and everything has gotten harder as the enterprise boundaries began to extend and even disappear

As IT professionals, we’ve recognised this and tried to simplify the chessboard, exchanging pawns, pieces and even queens:

  • using component architecture and reuse to speed up delivery
  • using publish-subscribe bus architectures and adapter frameworks to reduce the number of interfaces
  • using time-boxing to ease requirements gathering
  • using fast iteration models to make the gathering process more accurate
  • using increasing standardisation and rationalisation to simplify all this
  • using consolidation, virtualisation and service orientation to derive at least a modicum of value out of Moore’s Law during all this
  • using agile methods in general to speed up all of this

I’ve watched all this happen, watched us learn. But.

During all this time, I haven’t really seen changes in the way we account for our IT investments and expenditures. I’ve seen papers about changes, particularly those suggesting a move towards option theory; I’ve seen articles about such changes: I particularly liked the SMR proposition of Big Bets, Options and No-Regrets Moves. I’ve taken part in long arguments about the processes we use to price and value investments in IT.

But, unlike the IT environment during that period, I haven’t really seen changes in the way we measure the ROI of IT. Just 50-year-old lipstick on 500-year-old pigs.

This was a problem in 1987. A bigger problem in 1997. And it’s an absolute killer in 2007.

You see, we’ve moved on. There have been various convergences, convergences of standards, of techniques, even of devices. The opensource community has had its effect, commoditising aggressively up the stack. We’ve seen telephony become software, we’ve seen the disaggregation and reaggregation of hardware, software and services. [Much of my disagreement with Carr is about timing, not direction.]

Today we have a new challenge. What Doc Searls calls The Because Effect.

In the past, we could claim there was a direct causal relationship between the investments made in IT and the returns, positive or negative. We had siloed systems so we somehow managed to shoe-horn what we did into 15th century mindsets. As everything became more connected, we couldn’t find the causal relationships any more, so we started wondering whether Strassmann et al were right. Yet we knew they couldn’t be; we could sense the productivity gains, the cycle time gains, the quality gains, even if they were later sacrificed. After all, there were many sacrificial altars: vendor lock-in, vendor bloat, the politics of projects, the tragedies of e-mail and spreadsheet, the system of professions.

Last week I was at a conference where there was much discussion about agile methods, and the issue of agile-versus-cumbersome-accounting came up. You know something? I’ve yet to work in a place where people were happy with the finance system. Ever. This, despite finance being one of the first places to be “automated”. I don’t wonder why, I know why. Just ask Sig.

Now things will get harder still. The Because Effect is something we live with already. We make money with X because of Y. X and Y aren’t unknowns we’re solving for. In many cases, Y is a commoditising infrastructure which enables or disables our ability to derive value out of X, the edge application.

Using traditional ROI techniques, we may drive investment away from both X and Y over time, as we continue the shoe-horning madness. That’s why I read what McAfee and Brynjolfsson researched, why I read what Carr researched. Our measurement tools aren’t up to the job. And the consequences could be tragic.
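To make the worry concrete, here is a toy calculation. Every number in it is invented purely for illustration, and the helper `roi` and the variable names are my own assumptions rather than anyone's actual accounting method. It shows how a project-by-project ROI view can make the infrastructure Y look like a pure loss, even though X earns its return only because Y exists.

```python
def roi(gain, cost):
    """Simple return on investment: net gain divided by cost."""
    return (gain - cost) / cost


# Invented figures, for illustration only.
cost_x = 100.0            # spend on X, the edge application
cost_y = 200.0            # spend on Y, the commoditising infrastructure
revenue_x_with_y = 450.0  # what X earns, but only because Y is in place
revenue_y_direct = 0.0    # Y bills nobody directly

# Assessed separately, X looks brilliant and Y looks like a write-off.
roi_x = roi(revenue_x_with_y, cost_x)
roi_y = roi(revenue_y_direct, cost_y)

# Assessed together, the combined investment pays back modestly.
roi_joint = roi(revenue_x_with_y + revenue_y_direct, cost_x + cost_y)

print(roi_x, roi_y, roi_joint)
```

On these made-up numbers a budget round that looks at Y alone would cut it, and X's return would go with it. The sketch says nothing about how to apportion value between X and Y, which is exactly the part I don't think our tools handle.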

Just musing. And looking forward to the comments and flames.