News Stories & Interaction

To convey the internal structure of complex objects, illustrators create exploded views that expose hidden parts, showing the global structure while exposing local semantic relationships.
Stor­ies are such objects. I’m mak­ing a tool to illus­trate that.

In “The Future of Information,” Ted Nelson states that “The News” is a perplex, unless somebody edits it for you, in which case it has a point of view. He calls a perplex information of the following form: a tangle of items and relations; of facts, partial facts, beliefs, statements and views which can contradict each other in many different ways.

Our experience of a news story is rarely direct, so our point of view on a story is driven by habit and convention rather than formed first hand. To understand such a story you have to understand not only the elements that compose it, but also the relationships between them. Understanding means deconstruction and reconstruction.

For illus­trat­ive pur­poses please watch this inter­act­ive 3D exploded view dia­gram demonstration:

Think of a sub-​assembly as being an art­icle cov­er­ing that por­tion of a story, par­tially expos­ing (with cut­aways) some inner parts, etc. To start to under­stand the story you need to decon­struct those sub-​assemblies and explore how they fit together.

This kind of inter­ac­tion is essen­tially nav­ig­a­tion in an inform­a­tion space. This is pass­ive read­ing.

Act­ive read­ing would be ques­tion­ing assump­tions, con­sid­er­ing altern­at­ives, ques­tion­ing the trust­wor­thi­ness of the authors and their sources. An act­ive reader will recon­struct his crit­ical point of view for deeper understanding.

A passive reader needs nothing more than information software; an active one needs at least manipulation software. (I’m using Bret Victor’s categories of software from Magic Ink, 2006.)

In the con­text of the pro­to­type I’m work­ing on I will dis­cuss first the inform­a­tion soft­ware part (decon­struc­tion, nav­ig­a­tion) and not yet the manip­u­la­tion part, which is the hard­est to design.

Inform­a­tion Software

News stories and articles are not physical objects, so a 3D representation of them would be a projection of the concepts within, with some of their quality dimensions (see Chapter 1 of Peter Gärdenfors’ ‘Conceptual Spaces: The Geometry of Thought’) mapped to a 3D space our brain can understand and navigate through. Don’t worry, for the rest of this post we’ll stay in flatland.

We inter­act with a col­lec­tion of art­icles in a 2D space (on paper and on screen), but usu­ally those art­icles are aligned to one dimen­sion only: the tem­poral dimen­sion (“latest”) or fre­quently a com­pos­ite of time and import­ance (“new & noteworthy”).

Mar­cos Weskamp employed a tree­map visu­al­isa­tion to con­vey import­ance in his News​map​.jp pro­ject. The tem­poral dimen­sion was com­pressed to three inter­vals: less than 10 minutes ago, more than 10 minutes ago, and more than 1 hour ago.

In Newsmap, the size of each cell is determined by the number of related articles inside each news cluster computed by the Google News aggregator.
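As a rough illustration of sizing cells by cluster volume, here is a minimal sketch of the simplest possible treemap layout: vertical strips whose areas are proportional to article counts. This is not Newsmap's actual layout algorithm, and the cluster names and counts are made up for the example.

```python
def slice_layout(cluster_sizes, width, height):
    """Split a width x height rectangle into vertical strips.

    Each strip's area is proportional to its cluster's article count.
    Returns a list of (label, x, y, w, h) tuples.
    """
    total = sum(cluster_sizes.values())
    cells, x = [], 0.0
    for label, count in cluster_sizes.items():
        w = width * count / total
        cells.append((label, x, 0.0, w, float(height)))
        x += w
    return cells


# Hypothetical clusters: label -> number of related articles.
clusters = {"debt ceiling": 6, "senate vote": 3, "markets": 1}
cells = slice_layout(clusters, width=100, height=50)
for cell in cells:
    print(cell)
```

A real treemap would also nest clusters and try to keep cells close to square, but the proportional-area idea is the same.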

Solely by clus­ter­ing, labelling the cluster with the top art­icle in it (title labels) and using the volume of the cluster to con­vey import­ance you get a bet­ter view of which art­icles (sub-​assemblies) are crit­ical in under­stand­ing a story.

Now, if you go past volume and have access to the actual multi-dimensional space used for clustering, you can show not only the volume but also how things relate to each other, through adjacency and distance. To convey this you need to represent the cells as convex polygons: you need to partition the space not into rectangles (a rectangular tessellation) but into a Voronoi tessellation, as Peter Gärdenfors explains in Chapter 3 of his book.
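A Voronoi cell is, by definition, the set of points closer to its own site than to any other site. So deciding which cell a point falls in is just a nearest-site lookup. The sketch below assumes 2D coordinates and Euclidean distance; the labels and positions are invented, and a real clustering space would have many more dimensions.

```python
import math


def nearest_prototype(point, prototypes):
    """Return the label of the prototype closest to point (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(point, prototypes[label]))


# Hypothetical cluster centroids projected onto a 2D plane.
prototypes = {"debt": (0.0, 0.0), "senate": (4.0, 0.0), "obama": (0.0, 4.0)}

print(nearest_prototype((3.0, 1.0), prototypes))
print(nearest_prototype((0.5, 3.0), prototypes))
```

Drawing the actual polygons needs a computational geometry library, but adjacency and distance between cells follow directly from where the sites are placed.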

Here is a Voronoi treemap of the search results for “debt senate obama” (here the clusters are labelled with centroid labels rather than title ones).

Please note that this is more than a tag cloud generated through automated tagging. Apart from providing similarity measurements, cluster labelling yields labels on collections, while automated tagging (and entity extraction) is usually applied to individual articles and then used to aggregate the collection. Such collections can be explored through faceted navigation; a good example of a faceted browser is Exhibit.

A similar tool for visualising news was NewsMaps.com (now defunct, see a 1999 article on it); its underlying ThemeScape software is now part of a Thomson (Reuters) Innovation analytics and visualisation solution. Note that individual, less important articles were shown as dots on the map.

What about the tem­poral dimension?

Craig Mod in Every­mo­ment Now (which was focused on the 2008 US elec­tions) explores the concept of news over time, ‘behind the fold’:

“If stories ‘above the fold’ are important and those ‘below the fold’ secondary, then Everymoment Now is looking ‘behind the fold’; (fig 1) that is laterally, through time. Moments connected with an event both above and below the fold are brought next to each other to gain insight into the now.

People, places and events in the con­text of media are like stocks — they peak and bot­tom out over time. They have a media his­tory. There are two insights to be culled from this:

  1. Being able to see these spikes (vari­ance) in news cov­er­age over time will illu­min­ate poten­tially hid­den pat­terns, stor­ies or cor­rel­a­tions.
  2. Provid­ing a simple, clean and intu­it­ive inter­face to this data will allow quick access to and assess­ment of these pat­terns and stories.”

Here is a screenshot from Everymoment Now taken a while ago. Note the timeline on the left and the people, places and events timelines on the right, shown as sparklines.
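To give a feel for how such a sparkline condenses a media history, here is a minimal text-only sketch that maps daily mention counts to block characters. The counts are invented, and this says nothing about how Everymoment Now itself renders its timelines.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(counts):
    """Map each count to one block character, scaled to the series range."""
    lo, hi = min(counts), max(counts)
    span = (hi - lo) or 1  # avoid division by zero for a flat series
    return "".join(BARS[(c - lo) * (len(BARS) - 1) // span] for c in counts)


# Hypothetical number of articles mentioning one person, per day.
mentions_per_day = [0, 1, 1, 7, 3, 0]
line = sparkline(mentions_per_day)
print(line)
```

The spike on the fourth day stands out at a glance, which is the kind of variance the quote above is talking about.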

Everymoment Now is the best faceted browsing interface I have ever seen with regard to integrating a temporal dimension.

All of the above examples are about nav­ig­at­ing a story space, and one of the nav­ig­a­tional prob­lems is that you lose that spa­tial con­text when you nav­ig­ate to an art­icle because the art­icle will open in a new tab, win­dow or even replace the view you had over that space.

There are several solutions to maintain that context and its dimensional navigation. One is to employ a zooming user interface; another is to present the article (usually stripped of irrelevant clutter and navigation) in an overlay which ideally only partially obscures the space behind and which allows navigation that is dimensionally coherent with the space beneath. A good example is Flipboard:

It is inter­est­ing to note that both Craig Mod and Mar­cos Weskamp are at Flipboard.

Intro­du­cing PLESPER

After submitting The Perplex & Other Stories to the MoJo (Mozilla + Journalism) People-Powered News challenge, I had to find a name for the project, a name that when explained would give me at least a coherent vision. (I have odd naming concerns.) So I started deconstructing the word perplex:

I was more concerned with getting an available Twitter account than a domain name. PLEXPER was taken and PLEXPR was not appealing, so I needed to find something that could be connected somehow with the idea of making sense.

  • An esper refers to an indi­vidual cap­able of tele­pathy and other sim­ilar paranor­mal abil­it­ies. The term was appar­ently coined by Alfred Bester in his 1950 short story “Oddy and Id” and is derived from the abbre­vi­ation ESP for extra­sens­ory per­cep­tion. (Esper, Wiki­pe­dia)
  • In Ridley Scott’s Blade Runner, there is a scene featuring a device called an “ESPER” which is used to manipulate photographs.

If you’ve seen my sketches you may have noticed that I aim for a zoomable user interface (ZUI) over a tiled representation of a story space (similar to a treemap).

The interaction would be similar to Andreas Pihlström’s Grid-A-Licious (actually I used to have a local WordPress with the first Grid-A-Licious theme as a place to post ideas). Imagine the interaction like Grid-A-Licious but based on Isotope layout with Zoomooz zooming. And in ZUI style, zooming in would make more details available.

For filtering I will use a simple faceted search with facets obtained via the OpenCalais entity extraction service, search and clustering via Apache Lucene and one of the open source Carrot2 algorithms, and some Semantic Vectors magic.
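The faceted part can be sketched independently of any of those services. The example below assumes the entities have already been extracted and attached to each article; it does not call OpenCalais, Lucene or Carrot2, and the article data and the facet names "person" and "place" are made up.

```python
from collections import Counter

# Hypothetical articles with entities already extracted into facets.
articles = [
    {"title": "Senate passes debt bill",
     "facets": {"person": {"Obama"}, "place": {"Washington"}}},
    {"title": "Markets react to vote",
     "facets": {"person": set(), "place": {"New York"}}},
    {"title": "Obama signs bill",
     "facets": {"person": {"Obama"}, "place": {"Washington"}}},
]


def filter_by_facets(items, selected):
    """Keep items whose facets contain every selected (facet, value) pair."""
    return [a for a in items
            if all(value in a["facets"].get(facet, set())
                   for facet, value in selected)]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(v for a in items for v in a["facets"].get(facet, set()))


selected = [("person", "Obama")]
matches = filter_by_facets(articles, selected)
print([a["title"] for a in matches])
print(facet_counts(matches, "place"))
```

Recomputing the counts on the filtered set is what lets a faceted browser show how many results each further refinement would leave.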

All this is about navigating around articles; the next level is to go inside articles and dissect the sub-assemblies further. Apart from highlighting the entities and facts detected by OpenCalais, I want to be able to find what other articles are related to a single paragraph from the current one, and what other articles or bits of articles may connect as a storyline.

I was thinking of using n-grams as a similarity measurement between the paragraphs of the selected articles. But this has nothing to do with those paragraphs’ underlying meaning.
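One common way to turn n-grams into a similarity score is the Jaccard index over the two paragraphs' n-gram sets. The sketch below assumes word bigrams and a naive tokenizer; both choices are mine, for illustration only.

```python
import re


def word_ngrams(text, n=2):
    """Return the set of lowercase word n-grams in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=2):
    """Jaccard similarity of the two texts' n-gram sets, in [0, 1]."""
    ga, gb = word_ngrams(a, n), word_ngrams(b, n)
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


score = jaccard("The senate passed the bill", "The senate rejected the bill")
print(score)
```

The two sample sentences say opposite things yet still share a third of their bigrams, which shows how a surface measure like this ignores meaning.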

In 2005, at the EuroLAN 2005 Summer School “The Multilingual Web,” I met Prof. Martin Kay and we had several brilliant conversations. One was a follow-up to his lecture on machine translation, about how you can tell that document B is a translation of document A. The basic idea was that B is a translation of A if it tells the same story.

Now what makes paragraph B similar to paragraph A? If both are factual, then both answer the same question!

There is a lot of research in question answering, but I’m interested not in answering questions but in finding answers to the same question, and for my purposes I might not even need the question.

In ‘Ques­tion Gen­er­a­tion via Overgen­er­at­ing Trans­form­a­tions and Rank­ing,’ Michael Heil­man and Noah Smith cre­ated a frame­work for gen­er­at­ing a ranked set of fact-​based ques­tions. I tested it on a sample text and it is very prom­ising, espe­cially for my nav­ig­a­tional needs.

A problem which arises from chopping articles into paragraphs is increased ambiguity, not only from taking statements out of context but also from unresolved references (endophoric ones becoming exophoric).

For example, you find (via search, etc.) a tweet like this one (not the perfect example, but it shows the concept):

“The pair went stroke for stroke for almost the entire race, the biggest gap between them just the 0.65-second lead Friis established with 100m to go.” http://bit.ly/ndszRa

“Them” is an exophoric reference. Following the link will give you the whole article, and that’s information overload. You should be able to optionally see (in a tooltip, etc.) the minimum amount of context needed to understand it, and in the worst case the whole article with this phrase highlighted.

How can this be addressed? By silently retriev­ing the linked art­icle, identi­fy­ing the quoted phrase and run­ning the whole art­icle (or a lar­ger win­dow around the quoted phrase) through a core­fer­ence res­ol­u­tion sys­tem. For example through ARKref you get:

Ima­gine that all you need for them is to show ‘Adling­ton and Friis’ in a tooltip.
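The first steps of that pipeline (locating the quoted phrase in the retrieved article and cutting a larger window around it) can be sketched as follows. The coreference resolution step itself is not shown, the sentence splitter is deliberately naive, and the article text is invented sample text, not the linked article.

```python
import re


def context_window(article, quote, sentences_before=1):
    """Return the sentence containing quote plus a few preceding sentences.

    Returns None if the quote is not found within a single sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    for i, sentence in enumerate(sentences):
        if quote in sentence:
            start = max(0, i - sentences_before)
            return " ".join(sentences[start:i + 1])
    return None


# Invented sample text standing in for the silently retrieved article.
article = ("Adlington and Friis met in the final. "
           "The pair went stroke for stroke. "
           "The gap between them stayed small.")
window = context_window(article, "The pair went stroke for stroke")
print(window)
```

The returned window is what would be handed to a coreference resolver, so that “the pair” or “them” can be replaced by the names in a tooltip.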

This is still nav­ig­a­tion, pass­ive read­ing. What about act­ive reading?

Manip­u­la­tion Soft­ware & Com­mu­nic­a­tion Software

I’ve already said that act­ive read­ing would be ques­tion­ing assump­tions, con­sid­er­ing altern­at­ives, ques­tion­ing the trust­wor­thi­ness of the authors and their sources. An act­ive reader will recon­struct his crit­ical point of view for deeper understanding.

Being a lone act­ive reader is not fun, there­fore all of the above actions should be share­able at vari­ous stages in their life­cycle and mix­able with the actions of others.

The first problem to be resolved with current articles is granular addressability. Articles have URLs, but those point to the whole; fragment identifiers (the part after the # in a URL) can be used to address elements in the article when the article contains them. Alternative fragment identifier processing can also be used:

All these addressing methods are brittle: their association with the content won’t always survive editing. One of the options is to store enough context with such a link to be able to re-attach it, as in Robust Intra-document Locations.
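A minimal sketch of the "store enough context" idea: keep the selected text together with a little text before and after it, then search for the most specific combination that still exists in the edited document. This is my own simplification, not the algorithm from Robust Intra-document Locations, and the field names are hypothetical.

```python
def make_anchor(text, start, end, context=20):
    """Record a selection together with the text just before and after it."""
    return {
        "exact": text[start:end],
        "prefix": text[max(0, start - context):start],
        "suffix": text[end:end + context],
    }


def reattach(text, anchor):
    """Return the start offset of the selection in (possibly edited) text, or -1."""
    exact, prefix, suffix = anchor["exact"], anchor["prefix"], anchor["suffix"]
    # Try the most specific match first, then progressively weaker ones.
    for needle, offset in ((prefix + exact + suffix, len(prefix)),
                           (prefix + exact, len(prefix)),
                           (exact + suffix, 0),
                           (exact, 0)):
        pos = text.find(needle)
        if pos != -1:
            return pos + offset
    return -1


original = "The vote passed. The vote was close. Markets fell."
start = original.index("The vote was")
anchor = make_anchor(original, start, start + len("The vote"))

edited = "Breaking: " + original  # the page changed after the link was made
print(reattach(edited, anchor))
```

A plain character offset would now point at the wrong place, and searching for the selected words alone would hit the first sentence; the stored context picks out the second occurrence.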

What could manipulation software for an active reader engaged with a community look like? The simplest example is Storify, where you can reconstruct a story, an argument, etc. from such addressable bits. Moreover, the URLs for those bits can be used to make assertions and connections to other bits, like these ones from the CITO ontology.

Since those assertions and connections are not absolute and represent the reader’s view (a reconstruction), the URLs on which those assertions are made should be distinguishable from other readers’ URLs while pointing to the same bit (the usual solution is an extra level of indirection).

For example, my bit.ly shortened link to the Laughing Man Wikipedia page is bitly.com/qPJMMt while the aggregate link is bit.ly/nHhv4p. My assertions would be made on my shortened link and not on the aggregate or the long link. Another reader’s assertions would be made on a different shortened link with the same aggregate, etc. (PURLZ is the most likely to offer such metadata services in the future.)
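The extra level of indirection can be sketched as a toy in-memory shortener: one aggregate key per target, one personal key per reader and target, with assertions attached to personal keys only. The class, method names, key formats and the example URL are all hypothetical; this is not bit.ly's or PURLZ's API.

```python
import itertools


class Shortener:
    """Toy shortener: per-reader keys that share one aggregate key per target."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.aggregate = {}   # target URL -> aggregate key
        self.personal = {}    # (reader, target URL) -> personal key
        self.assertions = {}  # personal key -> list of statements

    def shorten(self, reader, target):
        """Return (personal key, aggregate key) for this reader and target."""
        if target not in self.aggregate:
            self.aggregate[target] = f"agg{next(self._ids)}"
        key = (reader, target)
        if key not in self.personal:
            self.personal[key] = f"p{next(self._ids)}"
        return self.personal[key], self.aggregate[target]

    def assert_on(self, personal_key, statement):
        """Attach a reader's statement to their own link, not the aggregate."""
        self.assertions.setdefault(personal_key, []).append(statement)


s = Shortener()
url = "https://example.org/article#paragraph-3"  # placeholder target
mine, agg_a = s.shorten("me", url)
theirs, agg_b = s.shorten("another reader", url)
s.assert_on(mine, "disputes https://example.org/other-article")
print(mine, theirs, agg_a == agg_b)
```

Both readers land on the same bit, yet each statement stays attributable to the reader who made it.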

Those bit.ly links also have public pages on the link itself, like bitly.com/oEBEMO+, and bit.ly is already surfacing conversations from Twitter there. A shortener that knows more about how to interpret such assertions could display how other bits are connected and allow you to engage with those assumptions, etc. (yielding other links for your statements towards the same aggregate).

(I described another collaboration option in a following post. It is not an alternative; the two will weave together.)

Note that such URL shortening and alternative fragment identifiers would be part of the browser (as extensions, and maybe natively in the future); there is no specific requirement for existing sites to support them.

They will work over old pages on the web, they will work over Twit­ter, Face­book, G+ and whatever will be inven­ted, they will work with deprec­ated CMS sys­tems, with news­room soft­ware, etc. because they are only made of what the web is made of: links.

The way to get the reader to be act­ive is not to provide more fancy social net­works and shiny but­tons. It is to get the pass­ive read­ing exper­i­ence so right that the pass­ive reader will turn into an act­ive one because he finally formed an opinion.