Four pillars: The disaggregation and reaggregation of search

Brendon Mclean tipped me the wink on Splunk, a search engine explicitly for logs and message queues and database transactions and the like: “IT information”. Some time ago Chris Locke had told me about Krugle, a search engine for finding code and related technical documentation. Dohop, from Iceland, concentrates on building the best travel search engine. Dibdabdoo is all about hand-finished web laundering for kids, using human judgement to validate kid-friendly content.

So. While Google go serious on Appliance and One Box, and people like FAST come at the enterprise in a different way, there are people spending time and energy building specialised search. And I’m still trying to work out why.

So I tried to see what attributes search could have. For example:

  • The space being covered: a disk or server or many of them, at one’s desk, behind a firewall, everywhere, the web proper.
  • The type of thing being covered: text or file or image or music or whatever, as narrow or as broad as needed.
  • The way the space is covered or indexed or checked for changes.
  • The way the searcher interacts with the engine and the engine with the searcher, including personalisation and relevance heuristics.

Early search was all about the space being covered and the way it was covered or made relevant. And as I understand more about the Splunks and Krugles of this world, the bulk of today’s innovation seems to be about the “type of thing being covered”, with a little bit on the interaction between searcher and engine. iTunes search became Spotlight this way, I guess.

I wonder. I promised Steve Patrick and Phil Dawes I would never start a “semantic web project” at the bank, because our own internal equivalent of industry body and standards body and vendor would kick into overdrive to kill it every which way, a sort of natural antibody ever-present in large organisations. What I wanted instead was a Steven Johnson emergence.

Maybe this, the emergence of the Krugles and the Splunks, is how some parts of the semantic web will come to be. The data that Tim Berners-Lee wants to see migrating to the web may not always get there via standards like RDF, however hard we try. Because standards are meat and drink to lock-in specialists, about as meaningful and as useful as governments and regulators in preventing lock-in.

But a million different Krugles and Splunks covering different areas deeply and doing it in such a way that information ecosystems can evolve? Some sort of high-cohesion-loose-coupling approach to layered search. Open on standards and agnostic on platforms and open source in approach. [Open source free as in freedom, not as in gratis, in case people think otherwise]. Guerilla and emergent in business model and approach. Maybe.
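To make the high-cohesion-loose-coupling idea a little more concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: `make_engine` and `federated_search` are made-up names, the "engines" are just substring matchers over a handful of strings, and none of it reflects how Splunk, Krugle or anyone else actually works. The only point is that each specialised engine knows its own patch deeply, and the layer on top knows nothing about any of them beyond "give me a query, get back results".

```python
from typing import Callable, Dict, List

# Hypothetical contract: a specialised engine is any function from query to results.
SearchEngine = Callable[[str], List[str]]


def make_engine(name: str, documents: List[str]) -> SearchEngine:
    """Build a toy engine that does case-insensitive substring matching."""
    def search(query: str) -> List[str]:
        q = query.lower()
        return [f"{name}: {doc}" for doc in documents if q in doc.lower()]
    return search


def federated_search(engines: Dict[str, SearchEngine], query: str) -> List[str]:
    """Ask every registered engine and concatenate results in registration order."""
    results: List[str] = []
    for engine in engines.values():
        results.extend(engine(query))
    return results


# Two made-up specialised engines, each covering one narrow "type of thing".
engines = {
    "logs": make_engine("logs", ["disk error on server 7", "login ok"]),
    "code": make_engine("code", ["def handle_error(): ...", "class Parser"]),
}

hits = federated_search(engines, "error")
for hit in hits:
    print(hit)
```

Adding a third engine means adding one entry to the dictionary; nothing else has to change, which is the loose coupling part.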

I could be talking absolute tosh, but isn’t that partly what blogs are for? To start the snowballs rolling and to see what happens. If I have to go by the progress made by the zillions of standards bodies in IT, I’d rather back the guerilla approach.

Tim Berners-Lee in the New Scientist

I was reading the New Scientist over the weekend, and came across this interview with Tim Berners-Lee. [My apologies, but unless you have a subscription the magazine won’t let you get past the article stub].

I’ll paraphrase what the interview said; any and all errors and misinterpretations are mine and mine alone. Where I have quoted directly from the article, this has been made clear.

  • Web was about putting documents and images online; semantic web is about putting data online.
  • We can publish articles and papers now, but not the underlying data. We need the data.
  • To publish this data we need a mark-up language for data. So we created RDF.
  • RDF lets you put data on the web and make connections so we have one big database.
  • When we free this data magical things can and will happen.
  • Some get the power of this; many don’t; the life sciences guys are good at getting it.
  • Privacy and data protection are issues, but nowhere near as much as people make out.
  • Web did not fulfil potential for showing the “how”, stayed on the “what”.
  • As HTML became a truly powerful presentation medium, looking improved and editing died.
  • Blogs and wikis are helping change that, though we have much to learn about social software.
  • “We have to learn about how people like to make groups and learn about the social systems involved in collaborations as well as the technical side of things”
  • “The internet was designed not to care what was done with it. It just moved packets of information from one place to another: the fundamental properties that make the internet work could not be held to ransom”
  • “The internet is all about division between layers”
  • “The web tries not to prefer one sort of information over another”
  • “The web needs to be the way it is to work”
  • “Before the web, and even now, a lot of the systems were being designed to be completely consistent. The way we’ve traditionally done that is to make top-down hierarchical systems, whether in organisations or in programming. This has always been considered a good thing. The maxims of top-down, structured programming are ‘information-hiding’ so that modules don’t see into each other but are black boxes tied together at the edges.
  • “The maxim of the web, however, is if you have something important, give it a label and then people will link to it.
  • “… by trying to constrain ourselves to use hierarchical systems, we’ve reached the limit of scale”
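
The RDF points in my paraphrase (put data on the web, make connections, end up with one big database) are easier to picture with a tiny example. RDF describes data as subject, predicate, object statements; the sketch below mimics that with plain Python tuples rather than real RDF syntax or any RDF library. The datasets, the `ex:` labels and the helper names `match` and `two_hop` are all invented by me for illustration, not taken from the interview.

```python
from typing import List, Optional, Tuple

# A statement is a (subject, predicate, object) triple of labels.
Triple = Tuple[str, str, str]

# Two hypothetical datasets, published separately by people who never met.
lab_data: List[Triple] = [("ex:gene42", "ex:encodes", "ex:proteinA")]
clinic_data: List[Triple] = [("ex:proteinA", "ex:implicatedIn", "ex:diseaseX")]

# "One big database": because both use the same label for the protein,
# merging is nothing more than putting the statements side by side.
graph: List[Triple] = lab_data + clinic_data


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the given parts; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def two_hop(graph: List[Triple], start: str) -> List[Tuple[Triple, Triple]]:
    """Follow every link out of `start`, then every link out of what it points to."""
    return [(first, second)
            for first in match(graph, s=start)
            for second in match(graph, s=first[2])]


paths = two_hop(graph, "ex:gene42")
for first, second in paths:
    print(first, "->", second)
```

Neither dataset on its own connects the gene to the disease; the connection only appears once the two are merged. That, as I understand it, is the "magical things" argument in miniature.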

Lots of good stuff. More later.

Not quite Four Pillars: Using technology to remember things or find lost things

I was intrigued to see this story about an RFID-enabled purse that lets you know what’s not in it. While the specific story is unnecessarily sexist, the principle has potential. RFID-enabled checklists.
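
Stripped of the hardware, an RFID-enabled checklist is just a set difference: what should be there, minus what the reader can see. A minimal sketch in Python, assuming the reader hands back a set of tag labels (the `missing_items` function and the item names are mine, made up for illustration; I have no idea how the purse in the story actually does it):

```python
from typing import List, Set


def missing_items(checklist: Set[str], tags_seen: Set[str]) -> List[str]:
    """Return, in alphabetical order, checklist items the reader did not detect."""
    return sorted(checklist - tags_seen)


# Hypothetical checklist and a hypothetical scan result.
checklist = {"keys", "phone", "wallet"}
tags_seen = {"phone", "wallet", "lipstick"}

missing = missing_items(checklist, tags_seen)
print(missing)
```

Items that are present but not on the checklist are simply ignored, which is what you want from something that only nags you about what you forgot.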

And it made me think about something else.

I’ve lost an iPod nano and an iPod shuffle. At home. I know they’re both there somewhere. But where I know not. I am less worried about these two iPods gone astray; they will resurface sometime. But wouldn’t it be nice to have a way of finding your (submerged) next-generation iPod? Is there a way already?

Supernova and unconversations about unconferences

I’m mildly confused by all this kerfuffle about Supernova 2006, apparently kicked off by Marc Canter’s comments on his blog. I don’t know Marc, and I do know Kevin, and I intend to be at Supernova again this year. [Disclosure: I have been on panels at Supernova before, and cannot rule out being on one again some day].

I do not understand all the arguments, and don’t claim to be an expert on any of this. I am perplexed as to how Kevin can be accused of Having the Same Old Faces at the same time as Not Inviting Some Of the Same Old Faces. I do not believe Esther Dyson bought her right to speak by CNET being a sponsor. I do not think Skype was a large company when Niklas spoke two years ago.

But maybe it’s me, and I’m confused. Of Calcutta.

All this made me think of conferences, why I go, what I expect to get out of them, which ones I go to. And it made me think of all this in the context of the way we connect and co-create today.

And here’s my take:

  • There are no audiences any more. It is better to call them communities. Gone are the days when people spouted pap from the front and people lapped up the pap in the back. Today good conferences are conversations. Active and participative.
  • There are no speakers any more. It is better to call them moderators. Moderators with some stories and some tools, but moderators nevertheless.
  • Conferences have become rites of passage, ritual meetings of communities and subcommunities. So there is always an element of Same Old Faces, and an element of Missing Same Old Faces, and an element of New Faces we’ve never heard of.
  • Community conversations take place before, during and after the ritual meetings. In many shapes and forms. Including if necessary at unconferences across the road. This is not a big deal.
  • Yesterday’s on-the-edge ritual meetings are tomorrow’s establishment programmes. We already live in a world where Skype and Amazon and Google are called “Incumbent to Watch” in the Next Net 25 by BusinessWeek. So maybe Supernova and PC Forum and O’Reilly are already establishment. And reboot is moving there. And geek dinners and barcamps and unconferences are tomorrow’s establishment. Plus ça change…

So I’m looking forward to saying hi to some of the same old faces; meeting some new ones; listening to some new stories and occasionally some old ones as well. And learning more about what it means to be at a conference in this day and age.

Especially for people who fly in from places other than the US, people like me, the Same Old Faces argument doesn’t wash. I’m looking forward to meeting Amy Jo Kim again, even though she was at Supernova last year. I think she has forgotten more about communities than I know. I’m looking forward to meeting Esther Dyson again, having missed PC Forum. I guess she sees a few Same Old Faces on her travels. I’m looking forward to finding out how Saul Klein is doing, if it’s the ex-Firefly guy via some DVD rental outfit in between. Because I want to know more about collaborative filtering.

And I’m looking forward to meeting Marc Canter for the first time in Amsterdam before that :-)

Butler, Ribstein and Sarbanes-Oxley

[Now how on earth did I move from Technorati rankings to Sarbanes-Oxley in one Saturday step? Easy when you know how. File Not Found to SOx via 404….]

The latest Economist, in an article entitled The Trial of Sarbanes-Oxley, reminded me of this document. It’s written by an economist and a law professor and well worth a read for those who are interested in such esoteric things. But then I’m told Einstein never wore SOx……

One paragraph in the Economist article stood out to me.

“Much of the blame for this should be pinned on accounting firms, which, despite being seen by the public as big offenders in the Enron and WorldCom scandals, have emerged as the big beneficiaries from SOX. According to Joe Grundfest, a former SEC commissioner, the audit industry has several incentives to ‘push Section 404 compliance to a point of socially inefficient hyper-vigilance’. To avoid further damage to their reputations, and to minimise the risk that they will be sued over accounting irregularities, audit firms are adopting the most prudent possible interpretations of the Section 404 rules – rules that are vague and open to argument. And, as Mr Grundfest points out, the ‘more onerous the requirements of Section 404, the more money the audit profession can earn’ by selling its services.”

Again, for those who are interested, please read Michael Power’s pamphlet The Risk Management of Everything, where he pretty much predicts the SOx debacle in style. Note to myself: must arrange to have lunch with Prof Power again soon. [An aside: I bought the pamphlet after reading a synopsis of his PD Leake lecture in 2004. Then, the only way to get the document was via Demos. Now Demos itself points you to Amazon, with no difference in price or conditions. Interesting.]