Progressive Data Analysis

Our civilization is collecting data at a pace never seen before. While data analysis has made tremendous progress in scalability over the last decade, this progress has only benefited “confirmatory” analysis and model-based computation; progress in data exploration has lagged behind. The main reason is that, to remain efficient during exploration, humans need a rapid feedback loop, with responses within about 10 seconds. However, when data grows larger or algorithms become more complex, bounding the latency is not possible with existing computation paradigms. To address this problem, recent research has proposed an approach called Progressive Visual Analytics (PVA). Instead of performing the whole computation in one long step, PVA quickly generates estimates of the results and updates them continuously, allowing the analyst to 1) monitor progress, usually with data visualizations, 2) steer algorithms by interactively adjusting parameters while the computation is running, and 3) control the process (start, stop, resume).
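The idea of quick estimates that are refined over time can be illustrated with a minimal Python sketch. It is not taken from ProgressiVis or from any of the papers listed below; the names `Estimate` and `progressive_mean` are hypothetical, and the standard error it reports is only meaningful under the assumption that rows arrive in random order.

```python
import math
import random
from typing import Iterator, NamedTuple, Sequence


class Estimate(NamedTuple):
    rows_seen: int     # number of rows that contributed so far
    fraction: float    # progress, between 0 and 1
    mean: float        # current estimate of the mean
    std_error: float   # standard error of that estimate


def progressive_mean(values: Sequence[float],
                     chunk_size: int = 1000) -> Iterator[Estimate]:
    """Yield a refined estimate of the mean after each chunk of rows."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        n += len(chunk)
        total += sum(chunk)
        total_sq += sum(v * v for v in chunk)
        mean = total / n
        # Sample variance, clamped at 0 to absorb rounding errors.
        variance = max((total_sq - n * mean * mean) / (n - 1), 0.0) if n > 1 else 0.0
        yield Estimate(n, n / len(values), mean, math.sqrt(variance / n))


random.seed(0)
data = [random.gauss(50.0, 10.0) for _ in range(100_000)]  # synthetic, random order

for estimate in progressive_mean(data, chunk_size=5_000):
    # Monitor: each partial result could update a visualization.
    print(f"{estimate.fraction:4.0%}  mean ~ {estimate.mean:.2f} "
          f"+/- {estimate.std_error:.2f}")
    # Control: stop as soon as the estimate is judged good enough.
    if estimate.std_error < 0.05:
        break
```

Because the computation is a generator, the caller decides when to ask for the next refinement and when to stop, which covers monitoring and control. Steering would additionally require changing parameters such as `chunk_size` between iterations.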

Our objective is to explore this new field of research. We study it from multiple angles, on both the human side and the machine side.

Positions

We have a post-doctoral position on the topic. If you are interested, contact Jean-Daniel Fekete.

Current Topics

Studies on Requirements for Progressive Visual Analytics

We study the cognitive, perceptual, and HCI issues raised by progressive data analysis.

Software Infrastructure

We develop ProgressiVis, a toolkit for progressive data analysis in Python.

Progressive Data Structure and Algorithms

We study the data structures suited to progressive data analysis, and how to transform standard data analysis algorithms into their progressive counterparts.
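As a small illustration of such a transformation, the sketch below places a standard one-shot histogram next to a progressive counterpart that accepts data chunk by chunk and can report an extrapolated estimate at any time. The names `batch_histogram` and `ProgressiveHistogram` are hypothetical and do not reflect the ProgressiVis API; the extrapolation assumes that the rows seen so far are representative of the whole dataset.

```python
import random
from bisect import bisect_right
from typing import Iterable, List, Sequence


def batch_histogram(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Standard version: one pass over all the data, one final answer."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v < edges[-1]:          # bins are half-open: [low, high)
            counts[bisect_right(edges, v) - 1] += 1
    return counts


class ProgressiveHistogram:
    """Progressive version: keeps partial counts that grow with each chunk."""

    def __init__(self, edges: Sequence[float], total_rows: int) -> None:
        self.edges = list(edges)
        self.total_rows = total_rows           # assumed to be known in advance
        self.counts = [0] * (len(self.edges) - 1)
        self.rows_seen = 0

    def update(self, chunk: Iterable[float]) -> None:
        """Fold one more chunk of rows into the partial counts."""
        for v in chunk:
            self.rows_seen += 1
            if self.edges[0] <= v < self.edges[-1]:
                self.counts[bisect_right(self.edges, v) - 1] += 1

    def estimate(self) -> List[float]:
        """Extrapolate the partial counts to the full dataset size."""
        if self.rows_seen == 0:
            return [0.0] * len(self.counts)
        scale = self.total_rows / self.rows_seen
        return [c * scale for c in self.counts]


rng = random.Random(1)
data = [rng.uniform(0.0, 100.0) for _ in range(10_000)]   # synthetic data
edges = [0.0, 25.0, 50.0, 75.0, 100.0]

hist = ProgressiveHistogram(edges, total_rows=len(data))
for start in range(0, len(data), 1_000):
    hist.update(data[start:start + 1_000])
    print(hist.rows_seen, [round(c) for c in hist.estimate()])
```

Once every chunk has been processed, the progressive counts match the batch result exactly; the difference is that a usable estimate is available after the first chunk.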

Publications

  • Ameya Patil, Gaëlle Richer, Christopher Jermaine, Dominik Moritz, Jean-Daniel Fekete. Studying Early Decision Making with Progressive Bar Charts. IEEE Transactions on Visualization and Computer Graphics, Institute of Electrical and Electronics Engineers, In press, ⟨10.1109/TVCG.2022.3209426⟩. ⟨https://hal.inria.fr/hal-03738461v2⟩
  • Xin Chen, Jian Zhang, Chi-Wing Fu, Jean-Daniel Fekete, Yunhai Wang. Pyramid-based Scatterplots Sampling for Progressive and Streaming Data Visualization. IEEE Transactions on Visualization and Computer Graphics, Institute of Electrical and Electronics Engineers, In press, ⟨10.1109/TVCG.2021.3114880⟩. ⟨https://hal.inria.fr/hal-03360776⟩
  • Leilani Battle, Philipp Eichmann, Marco Angelini, Tiziana Catarci, Giuseppe Santucci, Yukun Zheng, Carsten Binnig, Jean-Daniel Fekete, Dominik Moritz. Database Benchmarking for Supporting Real-Time Interactive Querying of Large Data. SIGMOD ’20 - International Conference on Management of Data, Jun 2020, Portland, OR, United States. pp.1571-1587, ⟨10.1145/3318464.3389732⟩. ⟨https://hal.inria.fr/hal-02556400⟩
  • Jaemin Jo, Jinwook Seo, Jean-Daniel Fekete. PANENE: A Progressive Algorithm for Indexing and Querying Approximate k-Nearest Neighbors. IEEE Transactions on Visualization and Computer Graphics, Institute of Electrical and Electronics Engineers, 2020, 26 (2), pp.1347-1360. ⟨10.1109/TVCG.2018.2869149⟩. ⟨https://hal.inria.fr/hal-01855672⟩
  • Jean-Daniel Fekete, Danyel Fisher, Arnab Nandi, Michael Sedlmair. Progressive Data Analysis and Visualization. Oct 2018, Wadern, Germany. Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2019, ⟨10.4230/DagRep.8.10.1⟩. ⟨https://hal.inria.fr/hal-02090121⟩
  • Cagatay Turkay, Nicola Pezzotti, Carsten Binnig, Hendrik Strobelt, Barbara Hammer, Daniel A. Keim, Jean-Daniel Fekete, Themis Palpanas, Yunhai Wang, Florin Rusu. Progressive Data Science: Potential and Challenges. 2019. ⟨https://hal.inria.fr/hal-01961871⟩
  • Jaemin Jo, Jinwook Seo, Jean-Daniel Fekete. A Progressive k-d tree for Approximate k-Nearest Neighbors. Workshop on Data Systems for Interactive Analysis (DSIA), Oct 2017, Phoenix, United States. ⟨hal-01650272⟩
  • Sriram Karthik Badam, Niklas Elmqvist, Jean-Daniel Fekete. Steering the Craft: UI Elements and Visualizations for Supporting Progressive Visual Analytics. Computer Graphics Forum, Wiley, 2017, Eurographics Conference on Visualization (EuroVis 2017), 36 (3), pp.12. ⟨10.1111/cgf.13205⟩. ⟨hal-01512256⟩
  • Emanuel Zgraggen, Alex Galakatos, Andrew Crotty, Jean-Daniel Fekete, Tim Kraska. How Progressive Visualizations Affect Exploratory Analysis. IEEE Transactions on Visualization and Computer Graphics, Institute of Electrical and Electronics Engineers, 2017, ⟨10.1109/TVCG.2016.2607714⟩. ⟨hal-01377896⟩
  • Jean-Daniel Fekete, Romain Primet. Progressive Analytics: A Computation Paradigm for Exploratory Data Analysis. 2016. ⟨arXiv:1607.05162⟩. ⟨hal-01361430⟩
  • Jean-Daniel Fekete. ProgressiVis: a Toolkit for Steerable Progressive Analytics and Visualization. 1st Workshop on Data Systems for Interactive Analysis, Oct 2015, Chicago, United States. pp.5, 2015, ⟨http://www.interactive-analysis.org/⟩. ⟨hal-01202901⟩

Contact

For more information, contact Jean-Daniel Fekete.