The current generation of radio and millimeter telescopes, particularly the Atacama Large Millimeter Array (ALMA), offers enormous advances in observing capabilities. While these advances represent an unprecedented opportunity to facilitate scientific understanding, the increased complexity in the spatial and spectral structure of these ALMA data cubes leads to challenges in their interpretation. In this paper, we perform a feasibility study for applying topological data analysis and visualization techniques never before tested by the ALMA community. Through techniques based on contour trees, we seek to improve upon existing analysis and visualization workflows of ALMA data cubes, in terms of accuracy and speed in feature extraction. We review our application development process in building effective analysis and visualization capabilities for the astrophysicists. We also summarize effective design practices by identifying domain-specific needs of simplicity, integrability, and reproducibility, in order to best target and service the large astrophysics community.
Using Contour Trees in the Analysis and Visualization of Radio Astronomy Data Cubes
P Rosen, A Seth, B Mills, A Ginsburg, J Kamenetzky, J Kern, CR Johnson, B Wang
Topological Methods in Data Analysis and Visualization (TopoInVis)
This evidence-based practice paper employs a data-driven, explainable, and scalable approach to the development and application of an online peer review system in computer science and engineering courses. Crowd-sourced grading through peer review is an effective evaluation methodology that 1) allows the use of meaningful assignments in large or online classes (e.g. assignments other than true/false, multiple choice, or short answer), 2) fosters learning and critical thinking in a student evaluating another’s work, and 3) provides a defendable and non-biased score through the wisdom of the crowd. Although peer review is widely utilized, to the authors’ best knowledge, the form and associated grading process have never been subjected to data-driven analysis and design. We present a novel, iterative approach by first gathering the most appropriate review form questions through intelligent data mining of past student reviews. During this process, key words and ideas are gathered for positive and negative sentiment dictionaries, a flag word dictionary, and a negate word dictionary. Next, we revise our grading algorithm using simulations and perturbation to determine robustness (measured by standard deviation within a section). Using the dictionaries, we leverage sentiment gathered from review comments as a quality assurance mechanism to generate a crowd comment “grade”. This grade supplements the weighted average of other review form sections. The result of this semi-automated, innovative process is a peer assessment package (intelligently-designed review form and robust grading algorithm leveraging crowd sentiment) based on actual student work that can be used by an educator to confidently assign and grade meaningful open-ended assignments in any size class.
Designing Intelligent Review Forms for Peer Assessment: A Data-Driven Approach
Z Beasley, L Piegl, P Rosen
ASEE Annual Conference & Exposition, 2019
We use persistent homology along with the eigenfunctions of the Laplacian to study similarity amongst triangulated 2-manifolds. Our method relies on studying the lower-star filtration induced by the eigenfunctions of the Laplacian. This gives us a shape descriptor that inherits the rich information encoded in the eigenfunctions of the Laplacian. Moreover, the similarity between these descriptors can be easily computed using tools that are readily available in Topological Data Analysis. We provide experiments to illustrate the effectiveness of the proposed method.
Mesh Learning Using Persistent Homology on the Laplacian Eigenfunctions
Y Zhang, H Liu, P Rosen, M Hajij
International Geometry Summit (poster), 2019
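As a rough illustration of the lower-star filtration mentioned in the abstract above, the following sketch computes 0-dimensional persistence pairs for a scalar function defined on the vertices of a mesh's edge graph. It is not the authors' implementation: the function name `lower_star_h0`, the toy path graph, and the hand-picked vertex values (standing in for one Laplacian eigenfunction) are assumptions made purely for illustration.

```python
import math

def lower_star_h0(values, edges):
    """0-dimensional persistence pairs of the lower-star filtration.

    values: one scalar per vertex (hypothetical stand-in for an eigenfunction).
    edges:  list of (u, v) vertex index pairs from the mesh's edge graph.
    An edge enters the filtration at the larger of its two endpoint values.
    """
    n = len(values)
    parent = list(range(n))  # union-find forest; each root is its component's minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    by_entry = sorted(edges, key=lambda e: max(values[e[0]], values[e[1]]))
    for u, v in by_entry:
        t = max(values[u], values[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge closes a cycle; no change in H0
        if values[ru] > values[rv]:
            ru, rv = rv, ru  # make ru the older (lower-valued) root
        parent[rv] = ru  # elder rule: the younger component dies at t
        if values[rv] < t:  # skip zero-persistence pairs
            pairs.append((values[rv], t))

    # Components that never merge are essential classes.
    roots = sorted({find(i) for i in range(n)})
    pairs.extend((values[r], math.inf) for r in roots)
    return pairs

# Toy example: a path graph with five vertices.
toy_values = [0, 3, 1, 4, 2]
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
toy_pairs = lower_star_h0(toy_values, toy_edges)
```

Each pair records the value at which a connected component is born (a local minimum) and the value at which it merges into an older component. A descriptor of the kind the abstract describes would presumably also involve higher-dimensional features and several eigenfunctions, which this sketch does not attempt.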
Assessing the quality of 3D printed models before they are printed remains a challenging problem, particularly when considering point cloud-based models. This paper introduces an approach to quality assessment, which uses techniques from the field of Topological Data Analysis (TDA) to compute a topological abstraction of the eventual printed model. Two main tools of TDA, Mapper and persistent homology, are used to analyze both the printed space and empty space created by the model. This abstraction enables investigating certain qualities of the model, with respect to print quality, and identifies potential anomalies that may appear in the final product.
Inferring Quality in Point Cloud-based 3D Printed Objects using Topological Data Analysis
This paper introduces an extension to our previous papers to handle anomalies in the point-based object slicing method. The anomalies handled are point, line, and plane touch cases, as well as overlaps. These anomalies can cause major problems in any intersection procedure, yet they are seldom discussed, let alone handled. It turns out that the point-based approach is capable of handling these special cases with minor extensions.
Handling Anomalies in Object Slicing for 3-D Printing
W Oropallo, L Piegl, P Rosen, K Rajab
Computer-Aided Design and Applications, 2019
Dimensionality reduction is an integral part of data visualization. It is a process that obtains a structure-preserving low-dimensional representation of high-dimensional data. Two common criteria can be used to achieve a dimensionality reduction: distance preservation and topology preservation. Inspired by recent work in topological data analysis, we are on the quest for a dimensionality reduction technique that achieves the criterion of homology preservation, a specific version of topology preservation. Specifically, we are interested in using topology-inspired manifold landmarking and manifold tearing to aid such a process and evaluate their effectiveness.
Homology-Preserving Dimensionality Reduction via Manifold Landmarking and Tearing
L Yan, Y Zhao, P Rosen, C Scheidegger, B Wang
Visualization in Data Science (VDS at IEEE VIS 2018)
Topological data analysis is an emerging area in exploratory data analysis and data mining. Its main tool, persistent homology, has become a popular technique to study the structure of complex, high-dimensional data. In this paper, we propose a novel method using persistent homology to quantify structural changes in time-varying graphs. Specifically, we transform each instance of the time-varying graph into a metric space, extract topological features using persistent homology, and compare those features over time. We provide a visualization that assists in time-varying graph exploration and helps to identify patterns of behavior within the data. To validate our approach, we conduct several case studies on real-world data sets and show how our method can find cyclic patterns, deviations from those patterns, and one-time events in time-varying graphs. We also examine whether a persistence-based similarity measure, used as a graph metric, satisfies a set of well-established, desirable properties for graph metrics.
Visual detection of structural changes in time-varying graphs using persistent homology
Mustafa Hajij, Bei Wang, Carlos Scheidegger, Paul Rosen
IEEE Pacific Visualization Symposium (PacificVis) 2018
Parallel coordinates plots (PCPs) are a well-studied technique for exploring multi-attribute datasets. In many situations, users find them a flexible method to analyze and interact with data. Unfortunately, using PCPs becomes challenging as the number of data items grows large or multiple trends within the data mix in the visualization. The resulting overdraw can obscure important features. A number of modifications to PCPs have been proposed, including using color, opacity, smooth curves, frequency, density, and animation to mitigate this problem. However, these modified PCPs tend to have their own limitations in the kinds of relationships they emphasize. We propose a new data scalable design for representing and exploring data relationships in PCPs. The approach exploits the point/line duality property of PCPs and a local linear assumption of data to extract and represent relationship summarizations. This approach simultaneously shows relationships in the data and the consistency of those relationships. Our approach supports various visualization tasks, including mixed linear and nonlinear pattern identification, noise detection, and outlier detection, all in large data. We demonstrate these tasks on multiple synthetic and real-world datasets.
DSPCP: A data scalable approach for identifying relationships in parallel coordinates
H Nguyen, P Rosen
IEEE Transactions on Visualization and Computer Graphics 24 (3), 1301-1315
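The point/line duality that the abstract above relies on can be checked numerically. In the standard parallel-coordinates construction, a sample (x, y) is drawn as a segment from (0, x) on the first axis to (d, y) on the second, and all samples lying on the line y = m*x + b (with m != 1) produce segments that pass through the single point (d / (1 - m), b / (1 - m)). The sketch below verifies that property only; it is not the DSPCP method itself, and the helper names `dual_point` and `segment_height`, along with the axis spacing `d`, are assumptions for illustration.

```python
def dual_point(m, b, d=1.0):
    """Point shared by the PCP segments of all samples on y = m*x + b.

    d is the horizontal spacing between the two parallel axes (assumed).
    """
    if m == 1:
        # Slope 1 gives parallel segments, i.e. a dual point at infinity.
        raise ValueError("slope 1 has no finite dual point")
    return (d / (1.0 - m), b / (1.0 - m))

def segment_height(x, y, t, d=1.0):
    """Height at horizontal position t of the segment from (0, x) to (d, y)."""
    return x + (t / d) * (y - x)

# Three samples on the line y = -2x + 3.
m, b = -2.0, 3.0
t, h = dual_point(m, b)
heights = [segment_height(x, m * x + b, t) for x in (-1.0, 0.0, 2.5)]
```

For a negative slope the shared point falls between the two axes, and every entry of `heights` agrees with `h` up to floating-point error, which is why a tight crossing between adjacent axes signals a negative linear relationship.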
We study the topological construction called Mapper in the context of simply connected domains, in particular on images. The Mapper construction can be considered as a generalization of contour, split, and join trees on simply connected domains. A contour tree on an image domain assumes the height function to be a piecewise linear Morse function. This is a rather restrictive class of functions and does not allow us to explore the topology of most real-world images. The Mapper construction avoids this limitation by assuming only continuity of the height function, allowing this construction to robustly deal with a significantly larger set of images. We provide a customized construction for Mapper on images, give a fast algorithm to compute it, and show how to simplify the Mapper structure in this case. Finally, we provide a simple procedure that guarantees the equivalence of Mapper to contour, join, and split trees on a simply connected domain.
The Shape of an Image: A Study of Mapper on Images
Alejandro Robles, Mustafa Hajij, and Paul Rosen
International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP) 2018
Scalar fields are used to describe a variety of data from photographs, to laser scans, to x-ray, CT or MRI scans of machine parts and are invaluable for a variety of tasks, such as fatigue detection in parts. Analyzing scalar fields can be quite challenging due to their size, complexity, and the need to understand both local and global details in context. Join trees are a data structure used to capture the geometric properties of scalar fields, including local minima, local maxima, and saddle points. Unfortunately, computing these trees is expensive, and their incremental construction makes parallel computation nontrivial. We introduce an approach that combines three strategies, pruning, spatial-domain parallelization, and value-domain parallelization, to parallelize join tree construction using OpenCL. The resulting implementation shows a significant speedup, making computation of trees on large data practical on even modest commodity hardware.
A hybrid solution to parallel calculation of augmented join trees of scalar fields in any dimension
P Rosen, J Tu, LA Piegl
Computer-Aided Design and Applications 15 (4), 610-618
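To make the incremental construction mentioned in the abstract above concrete, here is a minimal serial sketch of an augmented join tree built with a union-find sweep. It is not the paper's hybrid parallel OpenCL approach. The function name `augmented_join_tree`, the adjacency-list input, and the choice to sweep from low to high values (conventions differ between authors) are assumptions for illustration.

```python
def augmented_join_tree(values, neighbors):
    """Serial union-find sweep producing augmented join tree edges.

    values:    one scalar per vertex.
    neighbors: neighbors[i] lists the vertices adjacent to vertex i
               (grid or mesh connectivity in any dimension).
    Returns (lower, upper) edges; every vertex appears as a tree node.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: (values[i], i))  # ties broken by index
    parent = list(range(n))  # union-find forest
    head = list(range(n))    # most recently swept vertex of each component
    seen = [False] * n
    tree_edges = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for v in order:
        seen[v] = True
        for u in neighbors[v]:
            if not seen[u]:
                continue  # u lies above the current sweep level
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # already in the same component
            # Attach the component below v; v becomes the merged head.
            tree_edges.append((head[ru], v))
            parent[ru] = rv
            head[rv] = v
    return tree_edges

# Toy example: a 1D scalar field on a path of five vertices.
field = [0, 3, 1, 4, 2]
path = [[j for j in (i - 1, i + 1) if 0 <= j < len(field)]
        for i in range(len(field))]
jt_edges = augmented_join_tree(field, path)
```

Each vertex depends on the components formed by all lower-valued vertices, so the sweep is inherently sequential. In this toy field, vertices 1 and 3 each join two components and therefore act as saddles in the tree.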