News Stories & Interaction

To convey the internal structure of complex objects, illustrators create exploded views: they expose hidden parts and show the global structure while revealing local semantic relationships.
Stories are such objects. I’m making a tool to illustrate that.

In “The Future Of Information” Ted Nelson states: “The News” is a perplex, unless somebody edits it for you, in which case it has a point of view. He defines a perplex as information of the following form: a tangle of items and relations; of facts, partial facts, beliefs, statements and views which can contradict each other in many different ways.

Our experience with a news story is rarely a direct one, so our point of view on a story is driven by habit and convention rather than formed through first-hand experience. To understand such a story you have to understand not only the elements that compose it, but also the relationships between them. Understanding means deconstruction and reconstruction.

For illustrative purposes please watch this interactive 3D exploded view diagram demonstration:

Think of a sub-assembly as being an article covering that portion of a story, partially exposing (with cutaways) some inner parts, etc. To start to understand the story you need to deconstruct those sub-assemblies and explore how they fit together.

This kind of interaction is essentially navigation in an information space. This is passive reading.

Active reading would be questioning assumptions, considering alternatives, questioning the trustworthiness of the authors and their sources. An active reader will reconstruct his critical point of view for deeper understanding.

A passive reader needs nothing more than information software; an active one needs at least manipulation software. (I’m using Bret Victor’s categories for software from Magic Ink, 2006.)

In the context of the prototype I’m working on I will discuss first the information software part (deconstruction, navigation) and not yet the manipulation part, which is the hardest to design.

Information Software

News stories and articles are not physical objects. A 3D representation of them would be a projection of the concepts within, with some of their quality dimensions (see Chapter 1 of Peter Gärdenfors’ ‘Conceptual Spaces: The Geometry of Thought’) mapped to a 3D space our brain can understand and navigate through. Don’t worry, for the rest of this post we’ll stay in flatland.

We interact with a collection of articles in a 2D space (on paper and on screen), but usually those articles are aligned to one dimension only: the temporal dimension (“latest”) or frequently a composite of time and importance (“new & noteworthy”).

Marcos Weskamp employed a treemap visualisation to convey importance in his Newsmap.jp project. The temporal dimension was compressed to three intervals: less than 10 minutes ago, more than 10 minutes ago, and more than 1 hour ago.

In Newsmap, the size of each cell is determined by the number of related articles inside each news cluster, as computed by the Google News aggregator.

Solely by clustering, labelling the cluster with the top article in it (title labels) and using the volume of the cluster to convey importance you get a better view of which articles (sub-assemblies) are critical in understanding a story.
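The volume-to-area idea can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical mapping from cluster labels to article counts and a plain slice layout (vertical strips); it is not claimed to be the layout Newsmap itself uses.

```python
def slice_layout(clusters, width, height):
    """Split a width x height rectangle into vertical strips whose areas
    are proportional to the number of articles in each cluster."""
    total = sum(clusters.values())
    cells = []
    x = 0.0
    # Largest clusters first, so the most covered part of the story leads.
    ordered = sorted(clusters.items(), key=lambda kv: kv[1], reverse=True)
    for label, count in ordered:
        w = width * count / total
        cells.append({"label": label, "x": x, "y": 0.0, "w": w, "h": height})
        x += w
    return cells


# Hypothetical cluster sizes, for illustration only.
clusters = {"debt ceiling vote": 60, "senate negotiations": 30, "market reaction": 10}
cells = slice_layout(clusters, 1000, 500)
for cell in cells:
    print(cell["label"], round(cell["w"]))
```

A squarified layout would give more readable cells, but the principle is the same: area stands in for the volume of coverage.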

Now, if you go past volume and have access to the actual multi-dimensional space used for clustering, you can show not only the volume but also how things relate to each other, through adjacency and distance. To convey this you need to represent the cells as convex polygons: you need to partition the space not into rectangles (a rectangle tessellation) but into a Voronoi tessellation, as Peter Gärdenfors explains in Chapter 3 of his book.
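To make adjacency and distance concrete, here is a rough sketch of a Voronoi partition on a small grid. It assumes the cluster centroids have already been projected to 2D; the coordinates below are invented, and a real Voronoi treemap would compute polygons rather than label grid points.

```python
import math


def voronoi_grid(centroids, size):
    """Label each point of a size x size grid with its nearest centroid."""
    return [
        [min(centroids, key=lambda name: math.dist((x, y), centroids[name]))
         for x in range(size)]
        for y in range(size)
    ]


def adjacent_pairs(grid):
    """Pairs of clusters whose cells touch horizontally or vertically."""
    pairs = set()
    for y, row in enumerate(grid):
        for x, name in enumerate(row):
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if ny < len(grid) and nx < len(row) and grid[ny][nx] != name:
                    pairs.add(frozenset((name, grid[ny][nx])))
    return pairs


# Hypothetical 2D positions for three cluster centroids.
centroids = {"debt": (2, 2), "senate": (7, 2), "obama": (2, 7)}
grid = voronoi_grid(centroids, 10)
pairs = adjacent_pairs(grid)
```

Each cell is simply the set of points closer to its centroid than to any other, which is why neighbouring cells correspond to neighbouring concepts.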

Here is a Voronoi treemap of the search results for “debt senate obama” (here the clusters are labeled with centroid labels rather than title ones)

Please note that this is more than a tag cloud generated through automated tagging. Apart from having similarity measurements, cluster labelling yields labels on collections, while automated tagging (and entity extraction) is usually applied to individual articles and then used to aggregate the collection. Such collections can be explored through faceted navigation; a good example of a faceted browser is Exhibit.

A similar tool for visualising news was NewsMaps.com (now defunct, see a 1999 article on it); its underlying ThemeScape software is now part of a Thomson (Reuters) Innovation analytics and visualisation solution. Note that individual, less important articles were shown as dots on the map.

 

What about the temporal dimension?

Craig Mod in Everymoment Now (which was focused on the 2008 US elections) explores the concept of news over time, ‘behind the fold’:

“If stories ‘above the fold’ are important and those ‘below the fold’ secondary, then Everymoment Now is looking ‘behind the fold’; (fig 1) that is laterally, through time. Moments connected with an event both above and below the fold are brought next to each other to gain insight into the now.

People, places and events in the context of media are like stocks — they peak and bottom out over time. They have a media history. There are two insights to be culled from this:

  1. Being able to see these spikes (variance) in news coverage over time will illuminate potentially hidden patterns, stories or correlations.
  2. Providing a simple, clean and intuitive interface to this data will allow quick access to and assessment of these patterns and stories.”
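The idea of spotting spikes in a media history can be sketched as follows. This is only an illustration, assuming a hypothetical list of daily mention counts for one person, place or event; the threshold and the z-score test are assumptions, not something Everymoment Now is known to use.

```python
from statistics import mean, pstdev


def spikes(series, threshold=1.5):
    """Indexes of days whose count is more than `threshold` standard
    deviations above the mean of the series."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(series) if (v - mu) / sigma > threshold]


def sparkline(series):
    """Render the series as a one-line text sparkline."""
    bars = "▁▂▃▄▅▆▇█"
    top = max(series) or 1
    return "".join(bars[round(v / top * (len(bars) - 1))] for v in series)


# Hypothetical mentions per day over one week.
mentions = [2, 3, 2, 12, 3, 2, 1]
spike_days = spikes(mentions)
line = sparkline(mentions)
print(line, spike_days)
```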

Here is a screenshot from Everymoment Now taken a while ago, note the timeline at the left and the people, places and events timelines at the right shown as sparklines.

Everymoment Now is the best faceted browsing interface I have ever seen with regard to integrating a temporal dimension.

All of the above examples are about navigating a story space, and one of the navigational problems is that you lose that spatial context when you navigate to an article because the article will open in a new tab, window or even replace the view you had over that space.

There are several ways to maintain that context and its dimensional navigation. One is to employ a zooming user interface. Another is to present the article (usually stripped of irrelevant clutter and navigation) in an overlay which ideally only partially obscures the space behind it and which allows navigation that is dimensionally coherent with the space beneath. A good example is Flipboard:

It is interesting to note that both Craig Mod and Marcos Weskamp are at Flipboard.

 

Introducing PLESPER

After submitting The Perplex & Other Stories to the MoJo (Mozilla + Journalism) People-Powered News challenge, I had to find a name for the project, a name that, when explained, would give me at least a coherent vision (I have odd naming concerns). So I started deconstructing the word perplex:

I was more concerned with getting an available Twitter account than a domain name. PLEXPER was taken and PLEXPR was not appealing, so I needed to find something that could be connected somehow with the idea of making sense.

  • An esper refers to an individual capable of telepathy and other similar paranormal abilities. The term was apparently coined by Alfred Bester in his 1950 short story “Oddy and Id” and is derived from the abbreviation ESP for extrasensory perception. (Esper, Wikipedia)
  • In Ridley Scott’s Blade Runner, there is a scene featuring a device called an “ESPER” which is used to manipulate photographs.

 

If you’ve seen my sketches you may have noticed that I aim for a zoomable user interface (ZUI) over a tiled representation of a story space (similar to a treemap).

The interaction would be similar to Andreas Pihlström’s Grid-A-Licious (actually, I used to have a local WordPress with the first Grid-A-Licious theme as a place to post ideas). Imagine the interaction like Grid-A-Licious, but based on the Isotope layout with Zoomooz zooming. And in ZUI style, zooming in would make more details available.

For filtering I will use a simple faceted search, with facets obtained via the OpenCalais entity extraction service, and search and clustering via Apache Lucene and one of the open source Carrot2 algorithms, with some Semantic Vectors magic.
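The filtering side of a faceted search is simple enough to sketch without any of those services. The snippet below assumes the entities have already been extracted and attached to each article as a hypothetical `facets` dictionary; the field names and sample articles are invented for illustration.

```python
from collections import Counter


def facet_counts(articles, facet):
    """How many articles carry each value of the given facet."""
    return Counter(
        value for article in articles for value in article["facets"].get(facet, [])
    )


def filter_by(articles, **selected):
    """Keep the articles that match every selected facet value."""
    return [
        article for article in articles
        if all(value in article["facets"].get(facet, [])
               for facet, value in selected.items())
    ]


# Hypothetical articles with already extracted facets.
articles = [
    {"title": "A", "facets": {"person": ["Obama"], "topic": ["debt"]}},
    {"title": "B", "facets": {"person": ["Obama"], "topic": ["senate"]}},
    {"title": "C", "facets": {"topic": ["debt"]}},
]
topics = facet_counts(articles, "topic")
matches = filter_by(articles, person="Obama", topic="debt")
```

The counts drive the facet list shown to the reader, and each selection narrows the set of tiles left in the story space.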

All this is about navigating around articles; the next level is to go inside articles and dissect the sub-assemblies further. Apart from highlighting the entities and facts detected by OpenCalais, I want to be able to find which other articles are related to a single paragraph of the current one, and which other articles or bits of articles may connect as a storyline.

I was thinking of using n-grams as a similarity measurement between the paragraphs of the selected articles. But this has nothing to do with those paragraphs’ underlying meaning.
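A small sketch shows both the approach and its weakness. It assumes word bigrams compared with the Jaccard coefficient, which is one common choice among many; the sample sentences are invented.

```python
import re


def word_ngrams(text, n=2):
    """Set of word n-grams in the text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=2):
    """Share of n-grams the two texts have in common (0.0 to 1.0)."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


a = "The Senate passed the debt bill"
b = "The Senate passed the debt bill on Tuesday"
c = "Lawmakers approved raising the borrowing limit"

print(jaccard(a, b))  # high: the wording overlaps
print(jaccard(a, c))  # zero: similar story, no shared bigrams
```

The first and third sentences could describe the same event, yet they share no bigrams, so the surface measure says nothing about meaning.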

At the EuroLAN 2005 Summer School “The Multilingual Web,” I met Prof. Martin Kay and we had several brilliant conversations. One was a follow-up on his lecture on machine translation: how can you tell that document B is a translation of document A? The basic idea was that B is a translation of A if it tells the same story.

Now what makes paragraph B similar to paragraph A? If both are factual then both answer the same question!

There is a lot of research in question answering, but I’m not interested in answering questions but in answers to the same question and for my purposes I might not even need the question.

In ‘Question Generation via Overgenerating Transformations and Ranking,’ Michael Heilman and Noah Smith created a framework for generating a ranked set of fact-based questions. I tested it on a sample text and it is very promising, especially for my navigational needs.

A problem which arises from chopping articles into paragraphs is increased ambiguity, not only from taking statements out of context but also from unresolved references (endophoric ones becoming exophoric).

For example you find (via search, etc.) a tweet like (not the perfect example, but it shows the concept):

“The pair went stroke for stroke for almost the entire race, the biggest gap between them just the 0.65-second lead Friis established with 100m to go.” http://bit.ly/ndszRa

 

Here “them” is an exophoric reference. Following the link will give you the whole article, and that’s information overload. You should be able to optionally see (in a tooltip, etc.) the minimum amount of context needed to understand it, and only in the worst case be shown the whole article with this phrase highlighted.

How can this be addressed? By silently retrieving the linked article, identifying the quoted phrase and running the whole article (or a larger window around the quoted phrase) through a coreference resolution system. For example through ARKref you get:

Imagine that all you need for them is to show ‘Adlington and Friis’ in a tooltip.
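The step before coreference resolution, finding the quoted phrase and cutting a window around it, can be sketched like this. The sentence splitter is deliberately naive, the article text is an invented stand-in, and the call to a coreference system such as ARKref is left out because its interface is not described here.

```python
import re


def context_window(article, quote, before=2):
    """Return the sentence containing the quote plus up to `before`
    preceding sentences, or an empty list if the quote is not found."""
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    for i, sentence in enumerate(sentences):
        if quote in sentence:
            return sentences[max(0, i - before):i + 1]
    return []


# Invented stand-in for the retrieved article text.
article = (
    "Adlington and Friis were level at the halfway mark. "
    "Neither swimmer could break away. "
    "The pair went stroke for stroke for almost the entire race."
)
window = context_window(article, "The pair went stroke for stroke", before=1)
```

The resulting window is what would be handed to the coreference resolver, so that only a short span has to be processed instead of the whole article.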

This is still navigation, passive reading. What about active reading?

Manipulation Software & Communication Software

I’ve already said that active reading would be questioning assumptions, considering alternatives, questioning the trustworthiness of the authors and their sources. An active reader will reconstruct his critical point of view for deeper understanding.

Being a lone active reader is not fun, therefore all of the above actions should be shareable at various stages in their lifecycle and mixable with the actions of others.

The first problem to be resolved with current articles is that of granular addressability. Articles have URLs, but those point to the whole. Fragment identifiers (the part after the # in a URL) can be used to address elements in the article when the article contains them. Alternative ways of processing fragment identifiers can also be used:

All these addressing methods are brittle: their association with the content won’t always survive editing. One option is to store enough context with such a link to be able to re-attach it, as in Robust Intra-document Locations.
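Storing context with a link can be illustrated with a much simplified sketch: keep the selected text together with a little of what precedes and follows it, then use that context to pick the right occurrence after the page has changed. This is an assumption-laden toy, not the algorithm from Robust Intra-document Locations, and the field names are invented.

```python
def make_anchor(text, start, end, context=10):
    """Record the selected text plus a few characters on either side."""
    return {
        "exact": text[start:end],
        "prefix": text[max(0, start - context):start],
        "suffix": text[end:end + context],
    }


def reattach(text, anchor):
    """Find the anchored passage in (possibly edited) text.
    Returns (start, end) of the best matching occurrence, or None."""
    exact = anchor["exact"]
    best, best_score = None, -1
    pos = text.find(exact)
    while pos != -1:
        end = pos + len(exact)
        # One point for each side of the stored context that still matches.
        score = int(text[:pos].endswith(anchor["prefix"]))
        score += int(text[end:].startswith(anchor["suffix"]))
        if score > best_score:
            best, best_score = (pos, end), score
        pos = text.find(exact, pos + 1)
    return best


original = "Markets rose. The vote passed in the Senate."
start = original.index("The vote passed")
anchor = make_anchor(original, start, start + len("The vote passed"))

edited = "The vote passed last year too. Update: " + original
found = reattach(edited, anchor)
```

In the edited text the phrase occurs twice; the stored prefix and suffix are what let the link land on the second occurrence rather than the first.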

What could manipulation software for an active reader engaged with a community look like? The simplest example is Storify, where you can reconstruct a story, an argument, etc. from such addressable bits. Moreover, the URLs for those bits can be used to make assertions and connections to other bits, like these ones from the CITO ontology.

And since those assertions and connections are not absolute but represent the reader’s own view and reconstruction, the URLs on which those assertions are made should be distinguishable from the other readers’ URLs while pointing to the same bit (the usual solution is an extra level of indirection).

For example, my bit.ly shortened link to the Laughing Man Wikipedia page is bitly.com/qPJMMt while the aggregate link is bit.ly/nHhv4p. My assertions would be made on my shortened link and not on the aggregate or the long link. Other readers’ assertions would be made on different shortened links with the same aggregate, etc. (PURLZ is the most likely to offer such metadata services in the future.)
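The extra level of indirection can be sketched as a toy in-memory shortener. Everything here is hypothetical (the class, the hashing scheme, the example URL and reader names); it only illustrates that each reader gets a distinct link to the same target, and that assertions hang off the reader's link rather than the aggregate.

```python
import hashlib


class Shortener:
    """Toy shortener: one aggregate code per URL, one personal code per
    (reader, URL) pair, with assertions stored on personal codes."""

    def __init__(self):
        self.targets = {}     # short code -> long URL
        self.assertions = {}  # personal code -> list of (relation, other code)

    def _code(self, *parts):
        return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()[:7]

    def aggregate(self, url):
        code = self._code(url)
        self.targets[code] = url
        return code

    def personal(self, reader, url):
        code = self._code(reader, url)
        self.targets[code] = url
        return code

    def assert_relation(self, code, relation, other_code):
        self.assertions.setdefault(code, []).append((relation, other_code))

    def resolve(self, code):
        return self.targets[code]


shortener = Shortener()
url = "https://example.org/some-article"
mine = shortener.personal("reader-a", url)
theirs = shortener.personal("reader-b", url)
shared = shortener.aggregate(url)
# A relation name in the spirit of CiTO, attached to my link only.
shortener.assert_relation(mine, "disagreesWith", theirs)
```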

Those bit.ly links also have public pages on the link itself, like bitly.com/oEBEMO+, and bit.ly is already surfacing conversations from Twitter there. A shortener that knows more about how to interpret such assertions could display how other bits are connected and allow you to engage with those assumptions, etc. (yielding other links for your statements towards the same aggregate).

(I described another collaboration option in a following post. It is not an alternative; the two will weave together.)

Note that such URL shortening and alternative fragment identifiers would be part of the browser (as extensions, and maybe natively in the future); there is no specific requirement for existing sites to support them.

They will work over old pages on the web, they will work over Twitter, Facebook, G+ and whatever will be invented, they will work with deprecated CMS systems, with newsroom software, etc. because they are only made of what the web is made of: links.

 

The way to get the reader to be active is not to provide more fancy social networks and shiny buttons. It is to get the passive reading experience so right that the passive reader will turn into an active one because he finally formed an opinion.

2 thoughts on “News Stories & Interaction”

  1. Juan Gonzalez

    I’m so glad I kept reading past the 500-word limit… Found the Blade Runner trivia and some great references about javascript libraries that I could use for my one prototype. You don’t mind, do you?