A few days ago I wrote a post about how I found Gyorgy Faludy’s Learn By Heart This Poem Of Mine. I’d been looking for the poem for a very long time, without knowing author, title or first line. Yet it happened. Because of the blogosphere.

Now I want to be able to do something else. This is very, very provisional. Somewhere in my head, I place this poem in the same treasured collection as WB Yeats’s An Irish Airman Foresees His Death and Dylan Thomas’s And Death Shall Have No Dominion. Something to do with the lilts and cadences and metre and scansion and hauntingness and je-ne-sais-quoi. It doesn’t matter whether I am right or wrong in this grouping; that’s a very personal thing. What matters is whether we can use the power of many and group selection and wisdom of crowds and collaborative filtering to come up with something like this: if you liked poems A and B then you are likely to like poem C.

Sounds easy, but I haven’t seen anything that does that. Is it because there’s no market? Maybe there are too few dinosaurs like me who like poetry. Is it because it already exists, but I haven’t seen it? Possibly. Someone out there will correct me.

Or maybe it’s because this is not easy. Collaborative filtering is possible when there is a clear and deep and liquid market where transactions are done, so that access/acquisition of items can be represented in a correlated manner. People who did this also did this. Is it possible when there is no central market? Collaborative filtering is possible when the items are homogeneous in nature. Are poems sufficiently homogeneous? Is homogeneity a necessary condition?
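To make “people who did this also did this” concrete, here is a minimal sketch of the kind of thing I have in mind. Everything in it is an assumption for illustration: the reader names, their lists of liked poems, the placeholder titles and the `recommend` function are all hypothetical, and the scoring (counting how much another reader’s list overlaps with mine) is just one simple way to do co-occurrence filtering, not a claim about how any real service works.

```python
from collections import Counter

def recommend(reader_likes, my_poems, top_n=3):
    """Suggest poems I have not listed, ranked by overlap with other readers' lists."""
    mine = set(my_poems)
    scores = Counter()
    for poems in reader_likes.values():
        overlap = len(mine & poems)  # how many of my poems this reader also likes
        if overlap == 0:
            continue  # no shared taste, so this reader's list tells us nothing
        for poem in poems - mine:
            scores[poem] += overlap  # weight each suggestion by shared taste
    # Sort by score (highest first), then by title so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [poem for poem, _ in ranked[:top_n]]

# Hypothetical data: made-up readers and the poems each one has listed.
reader_likes = {
    "ann": {"An Irish Airman Foresees His Death",
            "And Death Shall Have No Dominion",
            "Learn By Heart This Poem Of Mine"},
    "bob": {"An Irish Airman Foresees His Death",
            "Learn By Heart This Poem Of Mine"},
    "cat": {"And Death Shall Have No Dominion", "Some Other Poem"},
    "dan": {"Some Other Poem", "Yet Another Poem"},
}

my_poems = {"An Irish Airman Foresees His Death",
            "And Death Shall Have No Dominion"}

suggestions = recommend(reader_likes, my_poems)
print(suggestions)
```

Note that nothing here looks at the text of a poem, only at names and titles on people’s lists. Whether lists like these would ever exist in sufficient numbers is exactly the market question above.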

Can I create one, just using names and titles and links? And a folksonomic description basis? With not a dram of DRM in sight?

Just wondering. Any ideas out there?

  1. The Poetry Foundation does a lot of work making poetry more available; perhaps they would consider this as a project. Each poetry lover builds their own list of well-liked poems (referencing a database of publicly-available poetry), and new visitors can see the other poems chosen by those who like ‘their’ poems. They already operate a ‘Poetry Tool’; this would extend and personalize it.

    In the early days of the Internet the MIT Media Lab did a great job of this, and that’s what spawned the collaborative filtering tools on Amazon etc. None of the commercial instances rival (IMHO) the simplicity and utility of the MIT seed (started, I think, by Pattie Maes). Perhaps you need a nonprofit enterprise like a university or foundation to cut through the commercial motivation.

    Lest you think the Poetry Foundation is nothing beyond a clique of impoverished versifiers, it’s worth noting that they are the beneficiaries of the US$100 million bequest of Ruth Lilly, the pharmaceuticals heiress. They can do this! I don’t know anything about the Foundation beyond what’s on their web site but their mission is clear.

  2. There’s always a market, it’s just we tend not to capture transactions that have no direct monetary value. Google probably do, and could make the association.
    The ‘attention market’ is where the advertising dollars get spent…

  3. JP, speaking from the authority of the 37 musical compositions on my resume that supplement my more conventional publication record, I would propose that the path to recognizing both categories of and links between “creative artifacts” (such as poems) leads through an understanding of the social networks of the creative artists themselves. Randall Collins has already pursued this line of thought in the domain of philosophy in his wonderful book, THE SOCIOLOGY OF PHILOSOPHIES. I know you are already deep into one thick book; but these 1000+ pages are a must-read for anyone interested in the impact of social relations on creative thought.

    Very few creative minds work in isolation. (Presumably, you know the Pound canto about what it was like living with Yeats when Yeats was composing a poem.) I know of only two “isolationist” examples, neither of whom is a poet: Nathaniel Hawthorne and William Faulkner. (Faulkner’s dismal stint within the Hollywood system only reinforced his need for such isolation.) My guess is that the social networks of poets are not that different from the social networks of philosophers that Collins analyzed, except, perhaps, that they may not be limited to a single domain, such as poetry. Consider, for example, the social network that formed around Serge Diaghilev at the beginning of the twentieth century, which included composers, painters, and poets, as well as the choreographers and dancers of the Ballets Russes.

  4. Hello… I’ve finally discovered this CIO blog and am just wondering why there are so few? I’ve been trawling the web and have found eight so far. Why are ‘information’ officers so reluctant to blog? Does anyone know of any more? Sorry to hijack this comment arena.

  5. Hey Michelle, it’s not a hijack. Blogs are conversations. Between friends. You don’t tell a friend “stop going off-topic”, those are the nightmares of bulletin boards run by control freaks.

  6. Michelle: there are a few more than eight – I’m in the process of building a database* for linguistic analysis right now, not just of CIO blogs, obviously, but of corporate blogs in general. My impression is that many executives prefer internal blogs and leave the public ones to their PR departments.

    * actually it’s what linguists call a corpus. See

    JP: the problem I have with the concept of a “folksonomy of poems” is that because poems are purely textual, it is difficult to reach a consensus regarding any form of unified tagging system. I know that you were talking about filtering based on personal preference and not tagging (“other people who like this poem also like…”) but, as you point out yourself, poems are very heterogeneous material and subject to very different interpretations over a period of time (heck, otherwise my professors would be out of business). The other thing is that, at least to me, coincidental discovery is a part of the whole experience of art. When I discover a new independent rock band on the web, the feeling that I have just dug up some obscure musical treasure that others may not have noticed yet is part of the culture hunt. Not that that makes the idea of a social poetry network in any way less appealing. I just like the surprise involved in finding something that, judging by my own taste profile, I’m not even supposed to like.

  7. So does that mean CIO blogs are useful purely for internal communication? What about developing an alternative CIO community where CIOs can swap problems and solutions and so on?
    How did you guys come to read this?

    Sorry to bombard, I’m writing an article on this… so I suppose this is sort of an interview via blog.

  8. Cornelius, I share your difficulty with a “folksonomy of poems;” but I would like to take issue with one point in your argument, which is your claim that “poems are purely textual.” What my background has taught me is that the essence of (almost?) every poem lies in PERFORMANCE; the text is little more than an attempt to represent that performance in a static form. (The same is true of music. The music is in the performance, not in the printed notation the performer may be using.) If you accept my premise, then JP’s problem becomes even harder, if not impossible. Having worked as a performing arts critic, I have some vague intuitions about both categories of and links between performances; but, because performances are dynamic processes, I am not sure I can articulate those intuitions very well. They are certainly far more elusive than the methods we can engage for “document management” (for example).

  9. Point taken, Stephen. What I meant was that the surface form of a poem is the text, but I absolutely agree that only a performance completes the overall concept of a poem.

    “If you accept my premise, then JP’s problem becomes even harder, if not impossible […] because performances are dynamic processes, I am not sure I can articulate those intuitions very well.”

    The problem is magnified by the fact that, even if the intuitions can somehow be verbalized, people may still interpret the review of a poem or piece of music very differently. Language is a lot better for describing concrete physical things than non-tangible works of art.

  10. I tend to agree with what the two of you are saying, but I can’t give up on the idea. Reading a good book is no different from reading a good poem. The reader experiences something as part of the interaction with the text. That experience is potentially not unique, and could be shared.

    We can get bogged down in trying to label that experience with too many terms, or too complex an ontology. What folksonomies do is to seek to keep the tags simple, and this simplicity tends to confound all of us who expect that it won’t work.

    I still think there is scope for collaborative filtering, in conjunction with folksonomies, to provide a way for people to share the performances.

  11. Cornelius, you are quite right about the limitations of language. Unless I am mistaken, it was Gertrude Stein who said that the only way to review a work of art is with another work of art. I took that seriously enough that, when I had to review a new recording of a piece by John Cage that had been composed by chance techniques, I prepared a text composed by similar chance techniques! (My editor had a lot of fun with it and ran it as I had composed it.) For an example by an author with better credentials, you can check out Gary Shteyngart’s review of a new edition of OBLOMOV for the Book Review section of this Sunday’s NEW YORK TIMES. It is already on the Web at:

    So you see, JP, the experience of reading CAN be shared, even in the general sense of “reading” that includes other media artifacts; but the path to sharing does not lead through tags, whether they come from folksonomies or more sophisticated taxonomies. The act of sharing involves the creation of some form of DISCOURSE (albeit, in my own example, a postmodern form); and, in spite of the feeble attempts of AI types like Patrick Winston and Boris Katz, I believe that the reduction of discourse to any structure of tags (even an unordered set) is a futile effort. This is why we read reviews in the first place. It is why I continue to subscribe to THE NEW YORK REVIEW OF BOOKS and why my RSS feeds include other sources of reviews. As I said in my contrarian response to your admiration of Generation M, there is no “Superhighway to Being in the World;” and that includes no instant sharing of experience, particularly when that experience is an aesthetic one!

  12. Not quite following you, Stephen. I accept that the “reduction of discourse to any structure of tags is a futile effort”. What I don’t get is why “the act of sharing involves the creation of some form of discourse”.

    I say this in a very specific context, the phenomenon exemplified by or flickr or librarything or even OPML. I believe implicitly and explicitly in a “pattern matching” approach to collaborative filtering. If enough people shared meta-information about what they were interested in (in terms of books, music, poetry, art, even wine), then we could look for patterns. People who liked this also liked that.

    This is a form of sharing. It may or may not be what you consider to be discourse. But it exists, and it works for many people.

    I think what matters here is that it’s happening and it’s real.

  13. JP, what you may not be following is a distinction that I am making in terminology, if not ontology. The distinction is that “putting stuff out there” (with or without metadata) is not the same as sharing. Put another way, I use the word “sharing” as it applies to CONTENT, rather than ARTIFACTS.

    A couple of years ago, Gilman Louie, the first CEO of In-Q-Tel, gave a PARC Forum in which he talked about the problem of “connecting the dots” in the United States intelligence community. He went on a moderately long riff about why “sharing” did not work, much of which came down to the fact that most of the staffers thought that sharing was simply a matter of making more artifacts available to more people. The result, as you might imagine, was that more people were getting flooded with more artifacts; and, consequently, less actual CONTENT was being shared.

    This distinction leads me to a possible corollary of that precept of Gertrude Stein that I cited for Cornelius (the only way to review a work of art is with another work of art). Because content is strongly embedded in a discourse that expresses it (whether it is a poem, a photograph, or an article in NATURE), the only way that content can be shared is through discourse. That is what happens when we write reviews (at least good ones); and we seem to agree that neither the underlying process nor the result of that process can be reduced to metadata structures. Now, if all that concerns you are the artifacts themselves, then we can just agree on our terminological differences and leave it there; but if you are more concerned about the content, then I may have to draw on more of my own discourse talents to persuade you!

  14. When I used to give “executive breakfast” talks for the English-language Fuji Xerox sectors, I would often caution my audience about getting trapped in the word games of others. Those traps are riskiest when the words are nouns! Like Warren McCulloch used to do, I would beg you to look where I am pointing rather than try to bite my finger! Otherwise, it may make more sense for me to abandon my mission of trying to wake you from your positivist slumbers!

  15. I have been playing with the idea that it might be possible to combine the preference dataset from say LastFM with LoveFilm (European Netflix), and maybe Homechoice or an IPTV service provider to generate a cross media collaborative filtering engine.

    That could also make collaborative filtering viable for less popular forms of media, like poetry.

    I don’t know if the maths is possible, but intuitively I suspect it is.

    As for the debate above I am with you – all that is necessary is to share a pattern and make suggestions of content. Being overly precise on the nature of the content isn’t necessary.

  16. Relax, Stephen, I wasn’t playing semantic bite-your-finger. I probably didn’t phrase it very well, given I was switching jobs at the weekend, but I meant my question seriously. Everything I know about collaborative filtering is about looking for shared patterns. And the patterns themselves are based on fairly rudimentary information, yet the collaboratively filtered outputs are of value.

    So do keep the comments coming; I haven’t yet pored through all your posts, especially those that need signing in to Yahoo 360, but I do read them, try and digest them and learn from them.

  17. I think Stephen and JP are looking at the problem from different angles. Stephen is making the case that distributing content (or artifacts, as he specifies) is not the same as sharing, because sharing has a discursive dimension and takes place in a social context. JP argues that common patterns of preference can be discovered when users tag their favorite media: you like X, a million people who also like X love Y, thus you may also like it.

    Hmmm. I’ve been thinking about how to reconcile these two approaches for the last 30 minutes or so, and for tonight I admit defeat (long, long day). Maybe we can come up with a grand solution together?

  18. :-) thanks Cornelius. 90% of me agrees with Stephen’s view, so we are not that far apart. But there is a small voice that says that collaborative filtering works mainly because of the simplicity of the patterns involved. So I too am stuck. A monkey with its hand in the jar.

  19. At the risk of violating my own admonition against word games, I suspect that my primary reason for bridling up at the word “pattern” is BECAUSE of its simplicity. One path towards the sort of “grand solution” to which Cornelius aspires may be through Gerald Edelman’s concept of “perceptual categorization.” This concept was first discussed at length in Edelman’s NEURAL DARWINISM and progressed to the primary LEITMOTIF in his study of consciousness, THE REMEMBERED PRESENT.

    My major reason for proposing the shift in the terminology is that, while there are a variety of objective criteria that can be used to identify and define patterns, perceptual categorization puts the human subject (the perceiver) squarely in the middle of the loop. If sharing is to be an intersubjective activity (and what else could it be in any practical setting?), then we cannot abstract the subject out of the picture (at least, with my own personal convictions, I cannot). If we further follow Edelman’s lead, we also encounter some interesting properties of perceptual categories and how “wet brains” deal with them.

    Most important is that perceptual categories are fluid. Edelman firmly rejects the idea that any part of the brain is implementing anything like a store-and-retrieve memory system. Rather, categorizing is something the brain is always doing (probably even when we are dreaming); and a lot of that categorizing is REcategorizing.

    Equally important is the brain evidence Edelman has mustered that demonstrates that different parts of the brain deal with categories in space and time, respectively. Most pattern theories tend to assume that patterns in time are the same as patterns in space, because you can just include a time dimension as one of your “spatial” axes. However, when you bring human subjects into the picture, time is not just “another dimension.” We have known this since Aristotle (read his separate treatises on physics and memory to give your own gray cells a real jolt); and we are just beginning to discover how this plays out in our brains.

    This then takes me to my third point: As I previously asserted, the discursive dimension is a dimension of PERFORMANCE. Because it is a dimension of performance, it is a TEMPORAL dimension and therefore firmly requires temporal perceptual categorization. (At this point you need to shift from Aristotle on physics and memory over to his “Poetics!”)

    So, if we can deal with what JP has been calling patterns of preference with tools of social software, we can probably close off the 10% of our disagreement! I actually believe that we CAN do this, but I have been having one hell of a time selling this conviction. One thing is for sure, it can expand the value of social software far beyond the pastime of sharing poetry!

