University of Bergen | Faculty of Mathematics and Natural Sciences | Department of Informatics | Visualization Group

Quantifying and Comparing Features in High-Dimensional Datasets

Harald Piringer, Wolfgang Berger, Helwig Hauser

In: Proceedings of the International Conference on Information Visualisation (IV 2008), July 2008

Abstract

Linking and brushing is a proven approach to analyzing multi-dimensional datasets in the context of multiple coordinated views. Nevertheless, most of the respective visualization techniques offer only qualitative visual results. Many user tasks, however, also require precise quantitative results, such as those offered by statistical analysis. Building on the Rank-by-Feature Framework, this paper describes a joint visual and statistical approach for guiding the user through a high-dimensional dataset by ranking dimensions (1D case) and pairs of dimensions (2D case) according to statistical summaries. While the original Rank-by-Feature Framework is limited to global features, the most important novelty here is the concept of considering local features, i.e., data subsets defined by brushing in linked views. The ability to compare subsets to other subsets and subsets to the whole dataset in the context of a large number of dimensions significantly extends the benefits of the approach, especially in later stages of an exploratory data analysis. A case study illustrates the workflow by analyzing keyword counts for classifying e-mails as spam or non-spam.
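
To make the 1D case concrete, the following is a minimal sketch of ranking dimensions by a statistical summary that compares a brushed subset to the whole dataset. It is not the paper's implementation: the function name `rank_dimensions`, the chosen summary (absolute difference of means, scaled by the whole dataset's standard deviation), and the toy keyword counts are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def rank_dimensions(rows, brushed, names):
    """Rank dimensions by how far the brushed subset's mean lies from the
    whole dataset's mean, in units of the whole dataset's standard deviation.

    Hypothetical helper for illustration; the summary statistic is an
    assumption, not the measure used in the paper.
    """
    scores = []
    for j, name in enumerate(names):
        whole = [row[j] for row in rows]
        subset = [row[j] for row, selected in zip(rows, brushed) if selected]
        spread = pstdev(whole)
        if spread > 0 and subset:
            score = abs(mean(subset) - mean(whole)) / spread
        else:
            # A constant dimension or an empty brush cannot separate anything.
            score = 0.0
        scores.append((name, score))
    # Highest score first: the dimension where the subset deviates most.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data (invented): keyword counts per e-mail for three keywords.
names = ["free", "meeting", "the"]
rows = [
    [5, 0, 1],
    [4, 1, 0],
    [0, 3, 1],
    [1, 4, 0],
]
# The "brush": a boolean mask selecting the first two e-mails.
brushed = [True, True, False, False]

ranking = rank_dimensions(rows, brushed, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the brush is just a boolean mask over rows. Comparing two subsets instead of a subset and the whole dataset would only change which two value lists feed the summary.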

Published

Proceedings of the International Conference on Information Visualisation (IV 2008)

BibTeX

@inproceedings{piringer08comparing,
 author = {Harald Piringer and Wolfgang Berger and Helwig Hauser},
 title = {Quantifying and Comparing Features in High-Dimensional Datasets},
 booktitle = {Proceedings of the International Conference on Information Visualisation (IV 2008)},
 abstract = {Linking and brushing is a proven approach to analyzing multi-dimensional 
	datasets in the context of multiple coordinated views. Nevertheless, most of the respective 
	visualization techniques offer only qualitative visual results. Many user tasks, however, 
	also require precise quantitative results, such as those offered by statistical analysis.
	Building on the Rank-by-Feature Framework, this paper describes a joint visual 
	and statistical approach for guiding the user through a high-dimensional dataset by ranking
	dimensions (1D case) and pairs of dimensions (2D case) according to statistical summaries. 
	While the original Rank-by-Feature Framework is limited to global features, the most 
	important novelty here is the concept of considering local features, i.e., data subsets defined 
	by brushing in linked views. The ability to compare subsets to other subsets and subsets to 
	the whole dataset in the context of a large number of dimensions significantly extends the 
	benefits of the approach, especially in later stages of an exploratory data analysis. 
	A case study illustrates the workflow by analyzing keyword counts for classifying 
	e-mails as spam or non-spam.},
 location =   {London, UK},
 year = {2008},
 pages = {240--245},
 month = jul,
 url = {http://dx.doi.org/10.1109/IV.2008.17},
 publisher = {IEEE Computer Society},
 address = {Washington, DC, USA}
}

 Last Modified: Jean-Paul Balabanian, 2013-05-29