In earlier posts towards the tail end of last year and early this year, I committed to writing a number of posts on filtering. The background is simple:
- soon, everything and everyone will be connected
- that includes people, devices, creatures, inanimate objects, even concepts (like a tweet or a theme)
- at the same time, the cost of sensors and actuators is dropping at least as fast as compute and storage
- so that means everything and everyone can now publish status and alerts of pretty much anything
- there’s the potential for a whole lotta publishing to happen
- which in turn means it’s firehose time
- so we need filters
- which is why the stream/filter/drain approach is becoming more common
- and which is why I want to spend time on all this during 2014, starting with the filter
So here goes.
1. Filters should be built such that they are selectable by subscriber, not publisher
There should be no publisher-level filters. Allow the firehose to happen. We know how to solve the firehose. What we don’t know is how to solve a much bigger problem: what to do when there are filters at publisher level. Once you allow this, the first thing that happens is that an entry point is created for bad actors to impose some form of censorship. In some cases it will be governments, sometimes overtly, sometimes covertly; at other times it will be traditional forces of the media; it may be generals of the army or captains of industry. The nature of the bad actor is irrelevant; what matters is that a back door has been created, one that can be used to suppress reports about a particular event/location/topic/person. If we keep making sure that it’s not easy to filter at publisher level, the bad actor is left with the strain of large-scale filtering of firehoses. Not easy.
2. Filters should intrinsically be dynamic, not static
In keeping with the firehose that’s being filtered, the act of filtering should itself be one of flows and not stocks. There is a place for canned filters, to support trend analysis, pattern recognition, predictive analytics. But the norm should be that the subscriber can reset filters anytime without any loss of time or value.
3. Filters should have inbuilt “serendipity” functionality
Have you ever chosen “random” when presented with a choice of things to look at, to listen to, to read, to follow? It’s a simple insurance policy to take out in order to avoid digital bigotry or heretical thinking or tunnel vision or herd instinct or groupthink or whichever other buzzphrase a la mode excites you. You must have something that takes you outside the pattern of what you do normally. And you must be able to switch that something on at will. The StumbleUpon approach is useful, but since you can “train” it you run the risk of “filter bubble” in Eli Pariser terms. In fact any publisher-level filter can create a filter bubble. Which, at its worst, allows someone else to determine what you can see/touch/listen to/engage with.
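As a rough sketch of a serendipity switch, assuming Python: the subscriber’s filter is any yes/no function, and a `wildcards` setting (a made-up name, as is `with_serendipity`) lets through that many randomly chosen items the filter would otherwise have rejected. Setting it to zero switches serendipity off. This is one possible reading of the principle, not a description of how any existing service works.

```python
import random
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def with_serendipity(
    items: Iterable[T],
    accept: Callable[[T], bool],
    wildcards: int = 0,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Return the items the filter accepts, plus some random rejects.

    `wildcards` is a hypothetical knob: how many rejected items to let
    through at random. Zero means serendipity is switched off.
    """
    rng = rng or random.Random()
    pool = list(items)
    matched = [item for item in pool if accept(item)]
    rejected = [item for item in pool if not accept(item)]
    # Never ask for more wildcards than there are rejected items.
    picks = rng.sample(rejected, min(wildcards, len(rejected)))
    return matched + picks
```

Because the random picks come only from what the filter rejected, they cannot be “trained” by the subscriber’s past choices.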
4. Filters should be interchangeable, exchangeable, even tradeable
I should be able to give someone else my collection of filters; similarly, someone should be able to give me their filter set. Their transient filter set, nothing permanent as I said earlier. The idea is that one person is given an opportunity to engage with the firehoses of the digital universe while “walking in someone else’s shoes”. So I should be able to view news as if I were a 21-year-old Iranian. Not by selecting the publisher-side filter for “21-year-old Iranians” but by being able to exchange filters with a real live person who has those characteristics. Again, we need to watch for static, hierarchical filters and avoid them like the plague.
5. The principal filters should be by choosing a variable and a value (or range of values) to include or exclude
The variable could be anything. Place. Time. Person. Group. Topic. Temperature. Degree of wetness. Humidity. Blood pressure. Relative density. Weight. State. Number or count. Size. Type. Part number. SIC or NACE code. Tag. Hashtag. Label. Length. Material. Language. Species. Duration. Anything and everything. And filtered again, if needed, by the associated value. Hotter than. Lighter than. Higher than. Containing. And then filtered again for inclusion or exclusion.
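The variable / value / include-or-exclude idea can be sketched in a few lines of Python. Everything named here (`Rule`, `COMPARISONS`, `apply_rules`, the field names in the example) is illustrative and assumed, not taken from any real product. Each rule picks a variable, a comparison and a value, and says whether matching items are to be included or excluded.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

Item = Dict[str, Any]

# Hypothetical comparison names standing in for "hotter than", "containing", etc.
COMPARISONS: Dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda actual, wanted: actual == wanted,
    "greater_than": lambda actual, wanted: actual > wanted,
    "less_than": lambda actual, wanted: actual < wanted,
    "containing": lambda actual, wanted: wanted in actual,
}


@dataclass(frozen=True)
class Rule:
    variable: str         # e.g. "place", "temperature", "tag"
    comparison: str       # a key of COMPARISONS
    value: Any
    include: bool = True  # False turns the rule into an exclusion

    def matches(self, item: Item) -> bool:
        # An item that lacks the variable cannot match the rule.
        if self.variable not in item:
            return False
        return COMPARISONS[self.comparison](item[self.variable], self.value)


def apply_rules(items: Iterable[Item], rules: List[Rule]) -> List[Item]:
    """Keep items that match every include rule and no exclude rule."""
    includes = [rule for rule in rules if rule.include]
    excludes = [rule for rule in rules if not rule.include]
    return [
        item
        for item in items
        if all(rule.matches(item) for rule in includes)
        and not any(rule.matches(item) for rule in excludes)
    ]
```

For instance, `[Rule("temperature", "greater_than", 30), Rule("place", "equals", "Kolkata", include=False)]` reads as “hotter than 30, but not from Kolkata”. Since the rules are plain data held by the subscriber, they can be reset at any time.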
6. Secondary filters should then be about routing
This is where the concepts behind If This Then That come into their own. The universe that IFTTT represents is one of conditional filtering and routing. The filtered information, having passed the conditions set, needs to go somewhere. Devices now form part of the world of filters. A person who has a laptop, tablet, phone and wearable does not want the same filters for each, the same notifications to each. For one thing, the social conventions for each form factor are different; for a second, the readability and “actionability” will differ as well. So we will use IFTTT and similar constructs to filter by notification type, intensity, device, perhaps even recipient time of day and location.
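A minimal sketch of conditional routing in Python follows. It borrows only the “if this then that” shape; it does not use or imitate the IFTTT service’s actual API. The names `Route` and `route`, the device labels and the item fields (`kind`, `intensity`, `hour`) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]
Context = Dict[str, Any]  # recipient-side facts such as local hour or location


@dataclass
class Route:
    device: str                                 # where the item goes
    condition: Callable[[Item, Context], bool]  # the "if this" part


def route(item: Item, context: Context, routes: List[Route]) -> List[str]:
    """Return every device whose condition the item satisfies."""
    return [r.device for r in routes if r.condition(item, context)]


# Hypothetical per-device rules for one subscriber.
ROUTES = [
    # Only urgent items should interrupt at the wrist.
    Route("wearable", lambda item, ctx: item["intensity"] == "urgent"),
    # The phone stays quiet overnight, whatever arrives.
    Route(
        "phone",
        lambda item, ctx: item["intensity"] in ("urgent", "normal")
        and 7 <= ctx["hour"] < 22,
    ),
    # Long reads wait for a bigger screen.
    Route("laptop", lambda item, ctx: item["kind"] == "long_read"),
]
```

With these rules an urgent alert at 23:00 reaches only the wearable, while a long read at any hour waits on the laptop.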
7. Network-based filters, “collaborative filtering”, should then complete the set
Collaborative filtering is also critical. Show me the tweets that are trending with my friends that I haven’t seen yet. Let me know the restaurants frequented by people in my network who like spicy food and who’ve posted on TripAdvisor about those restaurants in the last six months; make it relevant to my location and the current time.
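The first of those requests, “trending with my friends that I haven’t seen yet”, can be sketched in Python as below. The function name `trending_unseen`, the `(person, item)` engagement pairs and the `min_friends` threshold are all assumptions for illustration; real collaborative filtering would also weigh recency, location and how much each friend’s taste matters.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def trending_unseen(
    engagements: Iterable[Tuple[str, str]],
    friends: Set[str],
    seen: Set[str],
    min_friends: int = 2,
) -> List[str]:
    """Rank items by how many distinct friends engaged with them.

    `engagements` is a stream of (person, item_id) pairs, e.g. retweets
    or likes. Items the subscriber has already seen are dropped, as are
    items with fewer than `min_friends` distinct friends behind them.
    """
    # Count each friend at most once per item, and ignore strangers.
    distinct = {
        (person, item) for person, item in engagements if person in friends
    }
    counts = Counter(item for _, item in distinct)
    ranked = sorted(
        (
            (item, n)
            for item, n in counts.items()
            if n >= min_friends and item not in seen
        ),
        key=lambda pair: (-pair[1], pair[0]),  # most friends first, then by id
    )
    return [item for item, _ in ranked]
```

The friend list and the seen set both belong to the subscriber, so this remains a subscriber-side filter.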
So that’s a starter set, seven principles that inform me when I think about these things. I shall expand on each in days to come. In the meantime, keep your observations, advice, questions and comments flowing, choosing whichever means or channel you prefer. Comment here. Respond to the link in Twitter, Facebook, Google+, LinkedIn. WhatsApp me if you want. Talk to me via @jobsworth. If you don’t like any of these, then I suppose you can email me via [email protected] but be warned that I look at email rarely and that too only under duress.
Fascinating read and a great start into 2014, JP, thank you for this. I can see an interesting dynamic being present in the “filter exchange” – both publishers and subscribers could learn from each other, good and bad. Here’s a depiction: https://dl.dropboxusercontent.com/u/69746264/filters.png
Thank you Joachim, will take a look at the depiction. I’ve been keen on filter exchanges for over a decade now, and tried different ways of doing it. More later. Glad you liked the post.
wow. lovely post, thank you.
The process sounds like a job.
I can see “Filterer” in a “Top 10 Jobs for the 2020s” listicle.
Good post JP. The question I keep coming back to is whether or not filters must be algorithmic, or if we can have some combo of algorithm + people. I’ve written a few posts on this myself, but still don’t fully have my head around it.
I think that Neil Perkin’s experiment with fraggl is a really interesting example of what filters could be…
@tim I tend to think of “filtering” as something that machines can do, while “curating” is what only humans can do. “Machines filter, humans curate” is one of my mantras. I’ve written about it before, will do so again. Also will look at Neil’s experiments, thanks for the pointer. Neil’s a good guy.
Ha, I was just about to mention the curation side but your last comment completely covers that.
Annoying as it is, Spotify’s new “interface” seems to attempt to follow your guidelines (“people in your area like..”, “you played this, so try that..”, “this is popular..”, “try this, we think you will like it”), effectively a set of curated dynamic filters. Or if you like, a canned set of dynamic filters.
You sort of imply @MagicRecs and similar for collaborative filtering, though I think this just stresses paranoia or FOMO (people you know are acting in groups for reasons you may not be aware of – what’s wrong with you?) as filters displaying human trends are still too clumsy.
@DE I’d be interested in your opinion on what I cover in the collaborative filtering/network-based value bit. It’s not really MagicRecs or just “people who did A also did B”. I think of “social” as a powerful filter. My network alerts me on what is signal and what is noise. The DM, the @message, the RT, the Like, the +1, the recommendation, each of these helps separate signal from noise. Human interventions where my network does the curation. And then a second human intervention as I place different weights on individuals within the network. The classic Firefly-style collaborative filters are just some mechanical grist to the mill, augmenting the value provided by the human network. More later.
Not sure if you recall this related discussion on high/low/bandpass filters and Bernard’s brilliant idea of a fractioning column (click on the image and scroll right for a visual).. https://plus.google.com/u/0/+JoachimStroh/posts/LnGi2qXvsni
As I have occasionally, I might parallel blog, as these are certainly pertinent issues.
Excellent post JP! One of the main areas where I’d like to see filters (in enterprise software) improved is dynamically understanding context. For example, if there is a meeting on my calendar at 10am, then I’d love to be able to generate a stream that includes the people in the meeting (whether I follow them or not) and posts tagged with topics similar to what the meeting is about. That would provide a quick overview of what the attendees are talking about (may or may not be related to the meeting) as well as what’s being talked about around the topic (which may or may not be by the attendees). Imagine how much that could help meeting preparation! Here is my presentation on this http://www.slideshare.net/alanlepo/things-id-like-social-software-vendors-to-focus-on
Just let me know, perhaps via Twitter, in case I miss it. Interested in where your thoughts lead.
It’s not in any way her best article, but Laurie’s article gives a little thought to why the first point is important – the evils of publisher-based filtering.
Interesting. I’ve consistently violated your mantra then! (see this, for example: http://timkastelle.org/blog/2010/04/five-forms-of-filtering/) For me, the interesting part is where algorithms and people interact…
@tim you say “where algorithms and people interact”. I prefer “where humans act, augmented by machines” …. but we probably mean the same thing. I use filter in machine contexts and curate in human, that’s just a personal preference. And when it comes to mantras, they’re best when broken :-)
I can live with that. :-)
It’s a shame the legislation and policy makers in this country don’t fully understand the potential ramifications of publisher-level filters!
@colmjude not just in this country. Organisations that derive power from control will always prefer publisher level filters.
For this to bear fruit, you need the following. 1. A strong brand name. 2. A smart messaging service. 3. Personal Wires. 4. A search/discovery sandbox. 5. A smart connected mobile platform.
With 3, I believe that the relevance of the Desktop World Wide Web is in decline. New Mobile World Wide Wires, synced with Sensors, Location, Context, Apps, Cloud Data & Social Streams are the future. Mobile Personal Wires run and shared by People, Businesses, Places & IoT Data, will be the digital DNA of 2, 4 & 5.
Twitter may have 3 of the 5, but can someone make the special sauce of 1-5 a reality?
Have all the ingredients of 1-5. All I need now is a Master Chef.
Thank you very much for this insight. Filtering is key to making sense of the deluge of information. I have done this with my website http://www.marketprophit.com, which filters, organizes, and helps curate information in the finance vertical.
thanks harish, will take a look shortly
Finding this blog has been a godsend. Joachim I love the text in your graphic that reads “show me your filters and i show you who you are”. I’m in the research phase of developing a service/product that ideally would enable people to “understand their filters” and their relation to the publishers, and perhaps the filters of publishers themselves. These 7 principles have been important in shaping the conceptual and ethical considerations of the direction of my research. Thank you!
@john glad you like it, John. And even gladder that the value you’re obtaining is in the conversations and comments rather than just the blog post.