We present a comprehensive framework for evaluating line chart smoothing methods under a variety of visual analytics tasks. Line charts are commonly used to visualize a series of data samples. When the number of samples is large, or the data are noisy, smoothing can be applied to make the signal more apparent. However, a wide variety of smoothing techniques is available, and the effectiveness of each depends upon both the nature of the data and the visual analytics task at hand. To date, the visualization community lacks a summary work for analyzing and classifying the various smoothing methods available. In this paper, we establish a framework based on 8 measures of line smoothing effectiveness tied to 8 low-level visual analytics tasks. We then analyze 12 methods coming from 4 commonly used classes of line chart smoothing: rank filters, convolutional filters, frequency domain filters, and subsampling. The results show that while no method is ideal for all situations, certain methods, such as Gaussian filters and topology-based subsampling, perform well in general. Other methods, such as low-pass cutoff filters and Douglas-Peucker subsampling, perform well for specific visual analytics tasks. Almost as importantly, our framework demonstrates that several methods, including the commonly used uniform subsampling, produce low-quality results and should therefore be avoided if possible.
LineSmooth: An Analytical Framework for Evaluating the Effectiveness of Smoothing Techniques on Line Charts
P. Rosen, G.J. Quadri
IEEE Transactions on Visualization and Computer Graphics (VAST 2020)
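To make three of the method classes named in the abstract above concrete, the following is a minimal sketch of a rank (median) filter, a Gaussian convolutional filter, and uniform subsampling. It is not the paper's implementation: the function names, the default parameters, and the boundary handling (truncating the window at the ends and renormalizing the Gaussian weights) are assumptions made for illustration.

```python
import math
from statistics import median


def median_filter(values, radius=1):
    """Rank filter: replace each sample by the median of its neighborhood.

    Assumption: the window is truncated at the ends of the series.
    """
    n = len(values)
    return [median(values[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def gaussian_filter(values, sigma=1.0):
    """Convolutional filter: weighted average with Gaussian weights.

    Assumption: the kernel spans 3 sigma, and weights that fall outside
    the series are dropped and the rest renormalized.
    """
    radius = max(1, math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < n:
                acc += w * values[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


def uniform_subsample(values, step):
    """Subsampling: keep every `step`-th sample, starting with the first."""
    return values[::step]


if __name__ == "__main__":
    noisy = [1, 1, 9, 1, 1, 2, 8, 2, 2]
    print(median_filter(noisy, radius=1))
    print(gaussian_filter(noisy, sigma=1.0))
    print(uniform_subsample(noisy, step=3))
```

In this sketch the median filter removes an isolated spike entirely, the Gaussian filter spreads it across its neighbors, and uniform subsampling keeps or drops it depending only on where it falls relative to the sampling step.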
Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a deciding role in the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on 4 factors (distribution size of clusters, number of points, size of points, and opacity of points) that influence cluster identification in scatterplots. From these parameters, we have constructed 2 models, a distance-based model and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.
Modeling the Influence of Visual Density on Cluster Perception in Scatterplots Using Topology
G.J. Quadri, P. Rosen
IEEE Transactions on Visualization and Computer Graphics (InfoVis 2020)
Line charts are commonly used to visualize a series of data values. When the data are noisy, smoothing is applied to make the signal more apparent. Conventional methods used to smooth line charts, e.g., subsampling or filters such as median, Gaussian, or low-pass, each optimize for different properties of the data. These properties generally do not include retaining peaks (i.e., local minima and maxima) in the data, which is an important feature for certain visual analytics tasks. We present TopoLines, a method for smoothing line charts using techniques from Topological Data Analysis. The design goal of TopoLines is to maintain prominent peaks in the data while minimizing any residual error. We evaluate TopoLines for 2 visual analytics tasks by comparing it to 5 popular line smoothing methods with data from 4 application domains.
TopoLines: Topological Smoothing for Line Charts
P. Rosen, A. Suh, C. Salgado, M. Hajij
EuroVis Short Papers
Peer review is a widely utilized pedagogical feedback mechanism for engaging students, which has been shown to improve educational outcomes. However, we find limited discussion and empirical measurement of peer review in visualization coursework. In addition to engagement, peer review provides direct and diverse feedback and reinforces recently-learned course concepts through critical evaluation of others’ work. In this paper, we discuss the construction and application of peer review in a computer science visualization course, including: projects that reuse code and visualizations in a feedback-guided, continual improvement process and a peer review rubric to reinforce key course concepts. To measure the effectiveness of the approach, we evaluate student projects, peer review text, and a post-course questionnaire from 3 semesters of mixed undergraduate and graduate courses. The results indicate that course concepts are reinforced with peer review—82% reported learning more because of peer review, and 75% of students recommended continuing it. Finally, we provide a road-map for adapting peer review to other visualization courses to produce more highly engaged students.
Leveraging Peer Feedback to Improve Visualization Education
Z. Beasley, A. Friedman, L. Piegl, P. Rosen
IEEE Pacific Visualization Symposium (PacificVis)
Graphs are commonly used to encode relationships among entities, yet their abstractness makes them difficult to analyze. Node-link diagrams are popular for drawing graphs, and force-directed layouts provide a flexible method for node arrangements that use local relationships in an attempt to reveal the global shape of the graph. However, clutter and overlap of unrelated structures can lead to confusing graph visualizations. This paper leverages the persistent homology features of an undirected graph as derived information for interactive manipulation of force-directed layouts. We first discuss how to efficiently extract 0-dimensional persistent homology features from both weighted and unweighted undirected graphs. We then introduce the interactive persistence barcode used to manipulate the force-directed graph layout. In particular, the user adds and removes contracting and repulsing forces generated by the persistent homology features, eventually selecting the set of persistent homology features that most improve the layout. Finally, we demonstrate the utility of our approach across a variety of synthetic and real datasets.
Persistent Homology Guided Force-Directed Graph Layouts
A. Suh, M. Hajij, B. Wang, C. Scheidegger, P. Rosen
IEEE Transactions on Visualization and Computer Graphics (InfoVis)
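The abstract above mentions extracting 0-dimensional persistent homology features from an undirected graph. One standard way to do this for a weighted graph is a union-find pass over the edges in order of increasing weight, sketched below. This is an illustration under stated assumptions, not the paper's algorithm: it assumes every vertex is born at filtration value 0, that an edge enters the filtration at its weight, and the function name and the `(weight, u, v)` edge format are hypothetical.

```python
def zero_dim_persistence(num_vertices, weighted_edges):
    """Return 0-dimensional persistence bars as (birth, death) pairs.

    Assumptions: vertices are labeled 0..num_vertices-1 and are all born
    at 0; each edge is a (weight, u, v) tuple that appears at `weight`.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Walk up to the root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    bars = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # Two components merge here, so one of them dies at `weight`.
            parent[root_u] = root_v
            bars.append((0.0, weight))
        # Otherwise the edge closes a cycle and has no 0-dimensional effect.

    # Components that never merge persist forever.
    survivors = len({find(v) for v in range(num_vertices)})
    bars.extend((0.0, float("inf")) for _ in range(survivors))
    return bars


if __name__ == "__main__":
    edges = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2)]
    print(zero_dim_persistence(4, edges))
```

Under these assumptions, the edges that produce finite bars are exactly those of a minimum spanning forest, which is why a single sorted pass suffices.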
Reproducibility has been increasingly encouraged by communities of science in order to validate experimental conclusions, and replication studies represent a significant opportunity for vision scientists wishing to contribute new perceptual models, methods, or insights to the visualization community. Unfortunately, the notion of replication of previous studies does not lend itself to how we communicate research findings. Simply put, studies that re-conduct and confirm earlier results do not hold any novelty, a key element of the modern research publication system. Nevertheless, savvy researchers have discovered ways to produce replication studies by embedding them into other sufficiently novel studies. In this position paper, we define three methods (re-evaluation, expansion, and specialization) for embedding a replication study into a novel published work. Within this context, we provide a non-exhaustive case study on replications of Cleveland and McGill’s seminal work on graphical perception. As it turns out, numerous replication studies have been carried out based on that work, which have both confirmed prior findings and shed new light on our understanding of human perception. Finally, we discuss why publishing a true replication study should be avoided, while providing suggestions for how vision scientists and others can still use replication studies as a vehicle for producing visualization research publications.
You Can’t Publish Replication Studies (and How to Anyways)
G. Quadri, P. Rosen
VIS x Vision Workshop at IEEE VIS
The current generation of radio and millimeter telescopes, particularly the Atacama Large Millimeter Array (ALMA), offers enormous advances in observing capabilities. While these advances represent an unprecedented opportunity to facilitate scientific understanding, the increased complexity in the spatial and spectral structure of these ALMA data cubes leads to challenges in their interpretation. In this paper, we perform a feasibility study for applying topological data analysis and visualization techniques never before tested by the ALMA community. Through techniques based on contour trees, we seek to improve upon existing analysis and visualization workflows of ALMA data cubes, in terms of accuracy and speed in feature extraction. We review our application development process in building effective analysis and visualization capabilities for astrophysicists. We also summarize effective design practices by identifying domain-specific needs of simplicity, integrability, and reproducibility, in order to best target and service the large astrophysics community.
Using Contour Trees in the Analysis and Visualization of Radio Astronomy Data Cubes
P. Rosen, A. Seth, B. Mills, A. Ginsburg, J. Kamenetzky, J. Kern, C.R. Johnson, B. Wang
Topological Methods in Data Analysis and Visualization (TopoInVis)
Dimensionality reduction is an integral part of data visualization. It is a process that obtains a structure-preserving low-dimensional representation of high-dimensional data. Two common criteria can be used to achieve a dimensionality reduction: distance preservation and topology preservation. Inspired by recent work in topological data analysis, we seek a dimensionality reduction technique that achieves the criterion of homology preservation, a specific version of topology preservation. Specifically, we are interested in using topology-inspired manifold landmarking and manifold tearing to aid such a process, and in evaluating their effectiveness.
Homology-Preserving Dimensionality Reduction via Manifold Landmarking and Tearing
L. Yan, Y. Zhao, P. Rosen, C. Scheidegger, B. Wang
Visualization in Data Science (VDS at IEEE VIS 2018)
Topological data analysis is an emerging area in exploratory data analysis and data mining. Its main tool, persistent homology, has become a popular technique to study the structure of complex, high-dimensional data. In this paper, we propose a novel method using persistent homology to quantify structural changes in time-varying graphs. Specifically, we transform each instance of the time-varying graph into a metric space, extract topological features using persistent homology, and compare those features over time. We provide a visualization that assists in time-varying graph exploration and helps to identify patterns of behavior within the data. To validate our approach, we conduct several case studies on real-world data sets and show how our method can find cyclic patterns, deviations from those patterns, and one-time events in time-varying graphs. We also examine whether a persistence-based similarity measure, used as a graph metric, satisfies a set of well-established, desirable properties for graph metrics.
Visual detection of structural changes in time-varying graphs using persistent homology
Mustafa Hajij, Bei Wang, Carlos Scheidegger, Paul Rosen
IEEE Pacific Visualization Symposium (PacificVis) 2018
Parallel coordinates plots (PCPs) are a well-studied technique for exploring multi-attribute datasets. In many situations, users find them a flexible method to analyze and interact with data. Unfortunately, using PCPs becomes challenging as the number of data items grows large or when multiple trends within the data mix in the visualization. The resulting overdraw can obscure important features. A number of modifications to PCPs have been proposed, including using color, opacity, smooth curves, frequency, density, and animation to mitigate this problem. However, these modified PCPs tend to have their own limitations in the kinds of relationships they emphasize. We propose a new data-scalable design for representing and exploring data relationships in PCPs. The approach exploits the point/line duality property of PCPs and a local linear assumption of data to extract and represent relationship summarizations. This approach simultaneously shows relationships in the data and the consistency of those relationships. Our approach supports various visualization tasks, including mixed linear and nonlinear pattern identification, noise detection, and outlier detection, all in large data. We demonstrate these tasks on multiple synthetic and real-world datasets.
DSPCP: A data scalable approach for identifying relationships in parallel coordinates
H Nguyen, P Rosen
IEEE Transactions on Visualization and Computer Graphics 24 (3), 1301-1315
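The point/line duality mentioned in the DSPCP abstract above can be illustrated with a small worked example. With two parallel axes placed at horizontal positions 0 and d, a data point (a, b) is drawn as the segment from (0, a) to (d, b). If several data points satisfy b = m·a + c with m ≠ 1, all of their segments (extended to full lines) pass through the single point (d / (1 − m), c / (1 − m)). The sketch below checks this numerically; the function names and the axis spacing `d` are assumptions for illustration, and it shows only the duality property, not the DSPCP method itself.

```python
def dual_point(m, c, d=1.0):
    """Point where the PCP lines of all data points on b = m*a + c meet.

    Assumption: the two axes sit at horizontal positions 0 and d.
    """
    if m == 1:
        raise ValueError("lines with slope 1 map to parallel PCP lines")
    return (d / (1.0 - m), c / (1.0 - m))


def pcp_height(a, b, t, d=1.0):
    """Height at horizontal position t of the PCP line for data point (a, b)."""
    return a + (b - a) * t / d


if __name__ == "__main__":
    m, c = -1.0, 2.0
    t, h = dual_point(m, c)
    for a in (0.0, 1.0, 3.0):
        b = m * a + c
        # Every PCP line reaches the same height h at horizontal position t.
        print(a, b, pcp_height(a, b, t), h)
```

For a negative slope the meeting point lies between the two axes, which matches the familiar crossing pattern that negatively correlated data produce in a parallel coordinates plot.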