CiteULike is a free online bibliography manager. Register and you can start organising your references online.
Tags

PhC: Multiresolution Visualization and Exploration of Text Corpora with Parallel Hierarchical Coordinates

by: K. Selçuk Candan, Luigi Di Caro, Maria L. Sapino
ACM Trans. Intell. Syst. Technol., Vol. 3, No. 2. (February 2012), doi:10.1145/2089094.2089098  Key: citeulike:11238862

Formatted Citation


Show HTML

Likes (beta)

This copy of the article hasn't been liked by anyone yet.

View FullText article


Abstract

The high-dimensional nature of the textual data complicates the design of visualization tools to support exploration of large document corpora. In this article, we first argue that the Parallel Coordinates (PC) technique, which can map multidimensional vectors onto a 2D space in such a way that elements with similar values are represented as similar poly-lines or curves in the visualization space, can be used to help users discern patterns in document collections. The inherent reduction in dimensionality during the mapping from multidimensional points to 2D lines, however, may result in visual complications. For instance, the lines that correspond to clusters of objects that are separate in the multidimensional space may overlap each other in the 2D space; the resulting increase in the number of crossings would make it hard to distinguish the individual document clusters. Such crossings of lines and overly dense regions are significant sources of visual clutter, thus avoiding them may help interpret the visualization. In this article, we note that visual clutter can be significantly reduced by adjusting the resolution of the individual term coordinates by clustering the corresponding values. Such reductions in the resolution of the individual term-coordinates, however, will lead to a certain degree of information loss and thus the appropriate resolution for the term-coordinates has to be selected carefully. Thus, in this article we propose a controlled clutter reduction approach, called Parallel hierarchical Coordinates (or PhC), for reducing the visual clutter in PC-based visualizations of text corpora. We define visual clutter and information loss measures and provide extensive evaluations that show that the proposed PhC provides significant visual gains (i.e., multiple orders of reductions in visual clutter) with small information loss during visualization and exploration of document collections.


Visual Text Analytics - Fall 2012's tags for this article

Citations (CiTO)

No CiTO relationships defined

X There is 1 review Average rating 3.0

X Find related articles from these CiteULike users

X Find related articles with these CiteULike tags

X Posting History


X Export records

Privacy Statement | Terms & Conditions
CiteULike organises scholarly (or academic) papers or literature and provides bibliographic (which means it makes bibliographies) for universities and higher education establishments. It helps undergraduates and postgraduates. People studying for PhDs or in postdoctoral (postdoc) positions. The service is similar in scope to EndNote or RefWorks or any other reference manager like BibTeX, but it is a social bookmarking service for scientists and humanities researchers.