Reimagining the Viewer Experience with Alternative Visualization-Presenter Relationships (2025)

Ji Won Chung (Brown University), Tongyu Zhou (Brown University), Ivy Chen (Brown University), Kevin Hsu (Brown University), Ryan A. Rossi (Adobe Research), Alexa Siu (Adobe Research), Shunan Guo (Adobe Research), Franck Dernoncourt (Adobe Research), James Tompkin (Brown University), and Jeff Huang (Brown University)


Abstract.

Traditional data presentations typically split the presenter and visualization into two separate spaces, the 3D world and a 2D screen, enforcing visualization-centric stories. To create a more human-centric viewing experience, we establish a more equitable relationship between the visualization and the presenter through our InfoVids. These infographics-inspired informational videos are crafted to redefine relationships between the presenter and visualizations. As we design InfoVids, we explore how the use of layout, form, and interactions affects the viewer experience. We compare InfoVids against their baseline 2D ‘slides’ equivalents across 9 metrics with 30 participants and provide practical, long-term insights from an autobiographical perspective. Our mixed methods analyses reveal that this paradigm reduced viewer attention splitting, shifted the focus from the visualization to the presenter, and led to more interactive, natural, and engaging full-body data performances for viewers. Ultimately, InfoVids helped viewers re-imagine traditional dynamics between the presenter and visualizations.

Mobile AR, Data Visualization-based AR, Interactions

copyright: none; journal year: 2025; price: 15.00; CCS concepts: Computing methodologies → Artificial intelligence; Computing methodologies → Machine learning; Mathematics of computing → Approximation algorithms; Information systems → Data mining

Figure 1 (teaser).

\Description

[Teaser Figure]Shows a drawn person holding up a smartphone and pointing to the phone screen. The phone screen is enlarged to show three images on top of a film roll to emphasize it is a moving image or a video. In the left image, a presenter is standing on a football field with her left arm pointing upward towards the sky. Above her head, a 3D rendered plane is approaching. In the middle image, the presenter stands in the same position and the plane gets closer to her. In the rightmost figure, the presenter is standing to the right of the image, letting the plane hover next to her. On top of the 3D plane, 3D models of seats appear. Most of them are blue seats, but some seats near the front of the plane are highlighted in red.

1. Introduction

Now I’m going to try something I’ve never done before:
Animating the data in real space.

Prof. Hans Rosling

Data visualizations convey both the data and the ideas behind them. As Rosling demonstrates in his widely viewed presentation 200 Countries, 200 Years, 4 minutes (Rosling, 2011), how you narrate the story behind the data can be just as important as the visualization itself, or even more so, in conveying the ideas. Rosling’s body and the data co-exist in the video space, letting him synchronize his body movements with animated 2D visualizations. His passion is evident as he kneels and gesticulates to demonstrate rises in life expectancy.

Despite Rosling’s captivating performance in 2011, modern visualization presentations are typically designed to separate presenters from the content to draw viewers’ attention towards the visualizations. In many commercial teleconferencing tools (e.g. Zoom, Teams, Google Meet), presenters are compartmentalized in a little box in the corner to maximize the view of the visualizations. Such visualization-centric formats implicitly tell the viewer to prioritize the visualization over the presenter. However, establishing an imbalanced relationship is a design choice and comes with trade-offs. The viewer misses the human element, an opportunity to fully connect with the presenter, as the presenter’s body language and expressions are minimized. Imagine Rosling’s performance restricted to a little box: would we, as viewers, feel the same level of passion and engagement?

What if we offer viewers an alternative presentation paradigm that by design establishes an equitable relationship between the visualization and presenter? Rather than having the two compete for screen space, what if we integrate the visualization within the same space of the presenter, allowing the viewer to see both within the same frame? How would this shift impact the way viewers engage with the content, the presenter, and the overall experience compared to more traditional formats? And in the process of designing and evaluating such presentations, what can we learn about designing for alternative visualization presentation systems?

To answer these questions, we conduct an exploratory investigation of the viewer experience with four custom-constructed technology probes (Hutchinson et al., 2003), InfoVids. These are informational videos inspired by ‘shorts’ and infographics, iteratively designed over the span of four months. Each InfoVid shows the viewer a unique spatial arrangement of the visualization and presenter, while ensuring the viewer has a full view of both at all times.

To understand the potential advantages and challenges of InfoVids from the viewer’s perspective, we ask 30 viewers in public to compare an InfoVid with its ‘baseline’ equivalent. The latter includes the same content as the InfoVid, but adopts the format of a more traditional videoconferencing presentation with visualizations. The baseline is not used to evaluate whether our probes are ‘better’, but as a method to help viewers articulate the differences between the two formats, which may be difficult with the probe alone given that viewers are encountering a new medium. Viewers compare the two based on eight metrics that may influence the viewing experience: perceived presenter immersion, engagement, co-presence with the visualization, natural body movement, enjoyability, storytelling, information understanding, and the viewer’s attention split between the presenter and the visualization. We conduct a semi-structured interview to understand which elements of the InfoVid affected their viewing experience. Finally, we share the lessons learned from our viewer evaluations and the process of designing the InfoVids from an autobiographical perspective.

Our exploratory investigation with InfoVids reveals valuable design implications for alternative visualization presentation systems. First, we learn that the simple design condition of having both the presenter and the visualization in the same frame affects the viewing experience by compelling the presentation designer to be more conscious of the relationship between the visualization and the presenter, an overlooked factor when the presenter is sidelined to a small box. In addition, while most viewers find InfoVids more enjoyable and pay more attention to the presenter over the visualization than in the baseline, we discover how varying social expectations among the presenter, visualization, and viewer influence the viewing experience. Lastly, situating the presenter and the visualization in the same space enables viewers to consider the presenter’s body as connected to the flow of the story, an experience that is otherwise absent when the two are not co-located.

2. Related Work

2.1. Different Ways to Communicate Visual Data

Data is more than just numbers—it tells a story, a message through the numbers. To tell an effective story, we need the right tools. Different tools change how we present information to the viewer. To interactively explore data stories, a viewer can use web-based visualizations such as D3.js (Bostock et al., 2011), Vega (Satyanarayan et al., 2016), or PortalInk (Zhou et al., 2024b). To unfold a story in real-time with the viewers, the presenter can draw a story using SketchStory (Lee et al., 2013). To visually captivate viewers and convey a singular, simple message to them, we can create infographics (Lo et al., 2022; Li and Moacdieh, 2014; Lu et al., 2020; Zhou et al., 2024a).

More recently, as tools have evolved, a new paradigm for the viewing experience has emerged. Information now disseminates on social media platforms via short-form videos, or ‘shorts’. To engage viewers, ‘shorts’ situate the presenter in the same space as the augmented virtual content, allow the presenter to interact directly with the virtual elements, and even use hard cuts and quick transitions to capture viewer attention (Wang et al., 2019; Wang, 2020; Hassoun et al., 2023; Yang et al., 2023). This format stands apart from ‘data videos’ (Amini et al., 2015, 2018; Sallam et al., 2022), as it places the presenter themselves at the core of the video’s narration and storytelling. In addition to the presenter’s voice, their body language and facial expressions are central to the viewing experience.

Such ‘shorts’ aim to engage day-to-day viewers and inform them in a short amount of time. They often employ a more casual, even playful, narrative style and visualizations (e.g. memes, GIFs) and are not as formal as traditional presentations with visualizations (Wang, 2020; Zhu et al., 2020). They are the infographic equivalent of videos; both use visual embellishments and focus on delivering simple and engaging data narratives to reach a wider audience (Bateman et al., 2010; Harrison et al., 2015; Li and Moacdieh, 2014). Inspired by such ‘shorts’, we design new informational videos, or InfoVids, that integrate presenters and visualizations in the same frame.

The question is, how do viewers respond to the integration of visualizations into ‘shorts’, a new layout different from traditional presentation styles? The literature has a limited understanding of these new forms of presentation and of the factors that impact the viewer experience. This is, in part, because AR data visualization tools have traditionally focused on using augmented reality to explore multidimensional scientific data (Luo et al., 2021; Hubenschmid et al., 2021; Yang et al., 2020; Satriadi et al., 2022; Tong et al., 2023), enhance human information processing capabilities (Rajaram and Nebeling, 2022; Chen et al., 2020, 2023), or enhance immersion with the data (Cordeil et al., 2019; Sicat et al., 2018; Langner et al., 2021; Chen et al., 2019). They do not explore how the viewing experience is affected when AR is leveraged to convey information in more casual presentation settings—this is the very gap in knowledge this paper seeks to fill.

2.2. Implicit Relationships Defined by Presenter-Visualization Interactions

To transform and enhance the viewing experience, previous works have explored innovative ways a presenter could interact with visual graphics. Both ChalkTalk (Perlin et al., 2018) and Saquib et al. (Saquib et al., 2019) allow the presenter to interact with 2D graphics and sketches. CLIO (Davis et al., 2023) and Reality Talk (Liao et al., 2022) demonstrate a new viewing experience by allowing the presenter to interact with 2D visuals through voice and hand gestures. Hall et al. design their own hand gesture language and experiment with different layouts of the visualization to ensure the scientific integrity of the visualizations for the viewers (Hall et al., 2022).

Other works investigate how to enhance and transform the viewing experience by coupling or binding visualizations with the presenter’s body. BodyVis engages viewers with anatomy visualizations by using a wearable e-textile shirt (Norooz et al., 2015). Within the realm of AR, MagicMirror (Subramonyam, 2015) and mirracle (Blum et al., 2012) both overlay medical or biodata visualizations over the presenter’s body. RealitySketch uses AR to analyze different physical motions, including those of people (Suzuki et al., 2020). In other cases, the coupling lends itself to more creative and artistic presentations. HandAvatar enables the presenter to create non-humanoid puppet performances by binding virtual animals to the hand (Jiang et al., 2023). Pei et al.’s Hand Interfaces framework uses hands to generate an associated virtual object such as wands and healing potions (Pei et al., 2022).

Such interactions not only transform the viewing experience; more importantly, they also implicitly define the relationship between the presenter and the visualization. Interactions establish who can affect what, and the parties involved in an interaction imply the nature of the relationship between them. For example, the presenter’s ability to move around a visualization means the visualization is secondary, subject to the presenter’s will. Understanding the nature of the relationship is crucial because the relationship itself can convey a message to the viewers. However, this aspect is overlooked in previous works, and this paper aims to explore the factors that shape such relationships and understand how they may affect the viewing experience.

3. Designing InfoVids

To inform the development of the tools needed to build and design four case InfoVids, we engaged in an iterative design process over the course of nine months. Each InfoVid serves as a technology probe, used to ‘find out about the unknown’, and we adopt this approach to explore how we can ‘challenge pre-existing ideas and influenc[e] future’ (Hutchinson et al., 2003) visualization presentation technologies. This section outlines the development of our iterations and the lessons learned throughout the process. Because this design process required interdisciplinary knowledge of augmented reality, visualizations, and presentations—an intersectional expertise not commonly found among participants—we chose to draw our insights from an autobiographical perspective (Neustaedter and Sengers, 2012; Desjardins and Ball, 2018; Zhou et al., 2024b; Huang and Qian, 2023) to offer a deeper understanding of the challenges and decisions involved in the process.

3.1. Designing for an Equitable Relationship

Our primary aim was to design a format that establishes an equitable relationship between the presenter and the visualization, from both the presenter’s and the viewer’s perspective. To achieve this, we first integrated the spatial dimensions of the visualizations with those of the presenter. Drawing from previous works (Saquib et al., 2019; Liao et al., 2022; Hall et al., 2022; Davis et al., 2023; Perlin et al., 2018; Rosling, 2011), we overlaid a 2D visualization of bar charts and scatter plots onto the same screen as the presenter using augmented reality (ARKit on an iPhone). While, from the viewer’s perspective, this setup placed the presenter at the center of the frame and resolved the visual divide between the presenter and the visualization (e.g. how the presenter is compartmentalized in a box in teleconferencing systems), we observed from our own experiences acting as presenters that this design still did not fully establish an equitable relationship.

3.1.1. Redefining Relationships Using the Third Dimension

The flat 2D form of the visualization required that we, as the presenters, compromise our movements in 3D space for the visualizations. When we took a viewer’s perspective and reviewed recordings of ourselves presenting with this design, we frequently observed ourselves ‘miming’. As presenters, we were moving around the virtual visualization and using awkward body language only in the horizontal dimension. We never utilized the third, z-dimension, as we might in a real-life presentation in 3D space. These awkward body movements are noticeable distractions, or ‘breaks in presence’ (Slater and Steed, 2000; Slater et al., 2003), that disrupt the viewer experience. In other words, integrating spaces was not enough to establish an equitable relationship; we were compromising bodily freedom and still prioritizing the visualization by design.

To create an equitable relationship, we had to also ensure that the interactions between the visualizations and presenter appeared balanced — the presenter’s body movements should not look compromised to the viewer. Following our lesson from the previous iteration, we introduced three-dimensionality to the form of the visualization. Our objective was to understand whether having the visualization occupy the same physical space as the presenter would reduce awkwardness in body movements and interactions. While the added third dimension allowed for more staging flexibility and enabled us, as the presenters, to move beyond the constraints of a 2D plane, we still found ourselves walking and moving around the visualizations. The visualization continued to affect our interactions. However, we realized that this was not the crux of the issue.

3.1.2. Teeter the Balance with “Wearable” Visualizations

The imbalance continued because the interaction was not bidirectional — the visualization affected our interactions as presenters, but not vice versa. Thus, to equalize their relationship and send the message to viewers that the two were of equal standing, we incorporated gestures that allowed the presenter to control the visualization, as informed by previous works (Hall et al., 2022; Liao et al., 2022; Suzuki et al., 2020; Davis et al., 2023). As a result, we now had two primary directions of interaction: the presenter could affect the visualization and vice versa.

However, this process raised another question: could we design an even more equitable interaction, where control was not unidirectional but mutual, or simultaneously bidirectional? Referencing previous works on body-augmented visualizations (Norooz et al., 2015; Blum et al., 2012; Subramonyam, 2015; Jiang et al., 2023; Pei et al., 2022; Saquib et al., 2019; Fribourg et al., 2021), we found that this was indeed possible if we allowed the presenter to ‘wear’ the visualizations, or attach them directly to the body like costumes from theater. The visualization could influence the presenter by restricting certain body movements (similar to how different costumes can affect an actor’s movements), while the presenter simultaneously influenced the visualization through their movements (costumes move as the actor moves).

3.2. A Tool to Implement InfoVids

To offer viewers InfoVids that place the presenter and the visualization on an equal footing, we needed to consider their spatial layout, form, and interactions. Because there were no off-the-shelf tools that would allow us to create such InfoVids, we developed a new tool called the Body Object Model (BOM) to help create our probes. While the tool itself is a means to an end, and not our contribution, we discuss it because it plays a key role in the process of creating our probes, InfoVids. Further implementation details are in the Appendix.

Figure 2.

\Description

[Body Object Model] Three arrows stem from the self-facing camera to show where the data for generating the Face Node, Body Joints, and Hand Joints come from. One of these arrows shows how ARKit, coded in red, generates a red Face Node; another how VNDetectHumanBodyPose3DRequest generates light blue body joints; and light orange for VNDetectHumanHandPoseRequest generating hand joints. The face node, body joints, and hand joints are encapsulated in a box called ‘body anchors’. There is a purple ‘VisNode’ box which generates a purple circle of either ‘Scatter Plot, Bar Chart, or Map Nodes’. There is a purple double arrow that connects the ‘body anchors’ with the ‘VisNode’ with the caption ‘Bind(BodyNode, SceneNode), Binding VisNode to BodyAnchors’ to show that you can bind a VisNode with body anchors. Under the box, a diagram demonstrates where the nodes are anchored to the body and the hands. The faceNode is on the center of the face; the blue nodes are at the top of the head, three points across the left, center, and right of the shoulders, the elbows, the spine, the wrists, the left, center, and right hip, the knees, and the ankles. The hand joints have one on the wrist and the rest have four for each joint of the fingers. There is an arrow from ‘Body Joints’ to ‘Skeleton Tracker’ to demonstrate that the skeleton tracker manages those joints. There is another from ‘Hand Joints’ to ‘Hand Tracker’ for a similar reason. There is a double arrow stemming from ‘Hand Tracker’ to produce a green box of ‘Gesture States’. This green box is essentially a table of 9 gesture icons and their associated descriptions of what those hand gestures mean: ‘Thumb Left’, ‘Thumb Right’, ‘Palm Out’, ‘PalmBack’, ‘OpenHand’, ‘Pinching’, ‘Fist’, ‘Pointing’, and ‘Unknown’. There is then another double arrow from ‘Gesture States’ to a grayish blue box called ‘Action Sequence’ to demonstrate that action sequences are created using ‘Gesture States’.
There are 5 different action sequences, and each of them shows how the action is made using the gesture states (hence: gesture icon, arrow, gesture icon). The first one is a ‘Come Hither’ gesture, which demonstrates a beckoning motion. ‘Push Forward’ is a series of ‘PalmOut’. ‘Clench Fist’ is the user having their ‘Palm Out’ and then making a ‘Fist’. ‘Appear’ has the user make a ‘Fist’ and then ‘Open Hand’. The last one is an unknown state represented by a question mark. There is an arrow connecting Gesture States to Action Sequence. Then there is the gray ‘VisHandler’ box, with the description ‘Scripting Visualization Interactions with the Body’. Each of ‘FacialCue Tracker’, ‘Skeleton Tracker’, and ‘Hand Tracker’ has an arrow connecting to ‘VisHandler’. The ‘VisHandler’ also has a purple double-arrow line to ‘VisNode’ to demonstrate that you can use it to bind gestures with the VisNode.

We implemented the Body Object Model using ARKit, an augmented reality package, on the iPhone, as this enabled us to merge the virtual visualizations with the physical space of the presenter. To create a presentation where we could control the presenter-visualization relationship, we implemented three components: the VisNode, BodyAnchors, and the VisHandler (Figure 2). VisNodes are containers for 3D visualizations. BodyAnchors represent locations on the presenter’s body. The tracking system provides the locations of the VisNode and BodyAnchors in 3D space. The VisHandler allows us to script the interactions between the visualization (VisNode) and the presenter (BodyAnchor).
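To make the three components concrete, here is a minimal, hypothetical sketch of how a VisNode, a BodyAnchor, and a VisHandler could fit together. The component names follow the paper, but the actual system is implemented in Swift on ARKit; every type definition and the binding logic below are our illustrative assumptions, not the authors’ code.

```python
from dataclasses import dataclass

@dataclass
class VisNode:
    """Container for a 3D visualization placed in the shared space."""
    name: str
    position: tuple = (0.0, 0.0, 0.0)  # (x, y, z) in world space

@dataclass
class BodyAnchor:
    """A tracked location on the presenter's body (e.g. head, torso, wrist)."""
    joint: str
    position: tuple

class VisHandler:
    """Scripts visualization-body interactions. Here: a 'wearable' binding
    that re-places the VisNode relative to a BodyAnchor on every tracking
    update, so the visualization follows the presenter."""

    def __init__(self, offset=(0.0, 0.0, 0.0)):
        self.offset = offset  # fixed offset from the anchor

    def update(self, vis: VisNode, anchor: BodyAnchor) -> None:
        # Re-anchor the visualization at the body joint plus the offset.
        vis.position = tuple(a + o for a, o in zip(anchor.position, self.offset))
```

Under this sketch, binding a map VisNode to a torso BodyAnchor (as in WalmartVis) means that each new torso position reported by the tracker drags the map along with the presenter.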

These building blocks enable the design of presentations with equitable relationships between the visualization and the presenter through the following three features:

  1. F1. Use of 3D Physical Space → C1. ARKit and the segmentation of the presenter from the physical background allow the designer to integrate 3D visualizations and place them in different 3D locations. These new setups enhance the sense of depth and open new creative ways to integrate the physical surroundings with the presenter. The added z-dimension provides more spatial flexibility than 2D spaces, enabling more stagings in which the presenter’s body does not compete for space with the visualization.

  2. F2. Body-Vis Attachments → C2. The performance designer can attach a VisNode to a BodyAnchor, almost like a wearable costume, and use the presenter’s body as an integral part of the performance. These body-vis attachments enable simultaneously bidirectional interactions.

  3. F3. Body-Vis Interactions → C3. The presenter can attach actions to VisNodes and use their body to control or interact with them.
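As an illustration of F3, the gesture states and action sequences described in Figure 2 (e.g. ‘Clench Fist’ as PalmOut followed by Fist, and ‘Appear’ as Fist followed by OpenHand) can be read as a small sequence matcher over the hand tracker’s output. The gesture and action names below come from the figure; the matching logic itself is a hypothetical sketch, not the authors’ implementation (which runs in Swift on ARKit).

```python
from enum import Enum, auto

class Gesture(Enum):
    PALM_OUT = auto()
    PALM_BACK = auto()
    OPEN_HAND = auto()
    PINCHING = auto()
    FIST = auto()
    POINTING = auto()
    UNKNOWN = auto()

# Action sequences from Figure 2, expressed as ordered gesture patterns.
ACTION_SEQUENCES = {
    "Clench Fist": [Gesture.PALM_OUT, Gesture.FIST],
    "Appear": [Gesture.FIST, Gesture.OPEN_HAND],
}

def detect_action(stream):
    """Return the first action whose pattern ends the recognized gesture
    stream, or None if no action sequence has completed."""
    for name, pattern in ACTION_SEQUENCES.items():
        if stream[-len(pattern):] == pattern:
            return name
    return None
```

A detected action would then trigger the scripted VisNode behavior (e.g. making a bubble appear), closing the loop between the hand tracker and the visualization.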

3.3. Creating Four InfoVids

With the tools to create InfoVids in place, we then explored how different combinations of form and interactions (F1–F3) would affect the viewer’s experience. We designed four case InfoVids with varying conditions (C1–C3). To understand how the merged spaces of the visualizations and presenters affected the viewing experience, we created, for each InfoVid, a baseline equivalent representing a 2D videoconferencing-style presentation. To investigate how the three-dimensionality of the visualizations would affect the viewing experience, we made two InfoVids with visualizations that looked two-dimensional (InjuryVis, WalmartVis) and two with three-dimensional visualizations (AirplaneVis, NapoleonVis). To understand how different types of interactions may affect the viewer, we strategically created one InfoVid where the visualization induced the presenter to move around it (AirplaneVis), one where it was evident that the presenter moved the visualization (NapoleonVis), and one where the presenter and visualization were bound to each other both physically and through interactions (InjuryVis).

The content for each InfoVid was selected with these conditions in mind. We narrowed our search scope to infographics (Network, 2020; Friendly, 2023; Graham, 2024), because they are known for their simple and engaging data narratives (Bateman et al., 2010; Harrison et al., 2015; Li and Moacdieh, 2014). This eliminated the need to craft a captivating storyline for the viewers from scratch. To refine the design of our InfoVids and ensure that they were engaging from a viewer’s perspective, we conducted internal critiques with 15 HCI researchers over 4 months with alternative designs (Tohidi et al., 2006) before evaluating them with viewers in public.

The following paragraphs provide a brief description of each InfoVid, the reason behind its selection, and which elements reflect the conditions C1–C3. Additionally, we describe how we made the baseline, or each InfoVid’s equivalent in a videoconferencing format. A summary of how the conditions C1–C3 are applied is available in Table 1.

Table 1. Conditions applied to each visualization (✓ = applies, - = does not apply).

Visualization Name | C1. Visualizations Use 3D Physical Space | C2. Body-Vis Attachments for Simultaneous, Bidirectional Interactions | C3. Unilateral Body-Vis Interactions by the Presenter
Baseline           | -  | -   | -
AirplaneVis        | ✓  | -   | -
NapoleonVis        | ✓  | -   | ✓
InjuryVis          | -  | ✓   | ✓
WalmartVis         | -  | ✓ * | ✓

\Description

[Condition Table] Shows a table with four columns and five rows, each row with a visualization name. Under the first column, ‘Physical Surroundings Contextualizes Visualization’ (Criteria 1), Baseline is unchecked, AirplaneVis is checked, WalmartVis is unchecked, InjuryVis is unchecked, and NapoleonVis is checked. Under ‘Performer Body Contextualizes Visualization’ (Criteria 2), Baseline is unchecked, AirplaneVis is unchecked, WalmartVis is unchecked, InjuryVis is checked, and NapoleonVis is unchecked. Under ‘Performer Body-Vis Binding Enhances Narrative’ (Criteria 3), Baseline is unchecked, AirplaneVis is unchecked, WalmartVis is checked with one asterisk, InjuryVis is checked, and NapoleonVis is checked with a double asterisk.

AirplaneVis (C1)

Based on a commercial airplane crash infographic (Graham, 2024), this uses a 3D airplane and the physical surroundings (C1), such as the sky, to accentuate the sense of depth and to present a scenario where the presenter must step to the side of the visualization because it occupies 3D space (Figure 1).

Figure 3.

\Description

[NapoleonVis] Shows two sections. One section has a blue banner named ‘Baseline Slides Version’; beneath it are four horizontal slideshow images labeled with blue labels that mark 1, 2, 3, and 4. Each blue-labeled image shows on the left the NapoleonVis, a tall wall of animatable orange virtual dots representing the size of the French army, and on the right a vertical video space for the presenter. The background of the NapoleonVis is gray. From images 1 through 4, the presenter gestures away from the screen, moves towards the left, then right, and eventually back to the center. The animation shows a portion of the army moving back, then the entire army moving left and right, and lastly the army lying horizontal on the ground, colored in white to represent casualties. The second section has a purple banner named ‘InfoVids Version’; beneath it are four vertical images labeled with purple labels that mark 1, 2, 3, and 4. The presenter moves around from images 1 through 4 and is standing on a field outside. In image 1, the presenter stands on the right-hand side of the NapoleonVis and gestures away from it. A portion of the army moves further away from the camera. The label reads ‘Gestures Army to March Away to Emphasize Depth’. In image 2, the presenter shifts towards the left while holding her fist; the label reads ‘Remaining Army is Bound to Presenter’s Face Location’. In image 3, the presenter shifts towards the right, and the NapoleonVis follows her body. The label reads ‘Presenter Moves to Right and Army Follows Presenter’. In image 4, the presenter shifts back to the center; the armies are rotated by 90 degrees, now lying flat on the ground, colored in white to represent casualties. The label reads ‘Presenter Explains the Dire Consequences of the Outdoors March’.

Figure 4.

\Description

[InjuryVis] Shows two sections. One section has a blue banner named ‘Baseline Slides Version’; beneath it are four horizontal slideshow images labeled with blue labels that mark 1, 2, 3, and 4. Each blue-labeled image shows a gray diagram of a person against a white background on the left, and a vertical video space for the presenter on the right. The presenter moves around slightly from images 1 through 4. In image 1, the figure is unlabelled. Starting from image 2, a half-transparent red circle appears on the knee of the figure. It is labeled ‘Knee: 18’. In image 3, a larger red circle appears around the head of the figure, labeled ‘Head: 34’. The presenter is pointing towards their head with their left index finger. In the final image, the presenter puts their arm down and the visualization stays the same, indicating the end of the performance. The second section has a purple banner named ‘InfoVids Version’; beneath it are four vertical images labeled with purple labels that mark 1, 2, 3, and 4. In each image, the presenter is shown full body standing in a room. In image 1, the presenter appears to be explaining. The caption reads ‘Narrates High School Football is a Dangerous Sport’. In image 2, the presenter gestures to their left knee with their left hand, and a half-transparent red circle appears at their knee, with the label ‘Knee: 18’. The circle is about the size of the presenter’s head. The caption reads ‘Gesture to the Knee Makes Red Bubble Appear at Knee’. In image 3, the presenter uses their left index finger to point at their head. Another red circle appears around the head. It is extremely big, about 5 times the size of the presenter’s head. The caption reads ‘Pointing to the Head Makes Larger Bubble Appear at Head’. In image 4, the presenter puts their left arm down. The caption reads ‘Radius Difference Shows Football Injures the Head More than Knee’.

Figure 5.

\Description

[WalmartVis] Shows two sections. One section has a blue banner named ‘Baseline Slides Version’; beneath it are four horizontal slideshow images labeled with blue labels that mark 1, 2, 3, and 4. Each blue-labeled image shows a rectangular US map with a white background on the left, and a vertical video space for the presenter on the right. The presenter gets closer to the camera from images 1 through 4. In image 1, the US map is empty. Starting from image 2, blue dots showing Walmart locations in the US begin to appear. Between images 2 and 4, the number of blue stores increases as they spread out from the South towards the West Coast. The second section has a purple banner named ‘InfoVids Version’; beneath it are four vertical images labeled with purple labels that mark 1, 2, 3, and 4. In each image, the presenter is shown from the knees up. A rectangular US map with a white background is attached to the torso of the presenter. In image 1, the caption reads ‘U.S. Map Bound to Presenter Torso’. In image 2, the presenter is using their left hand to trigger the animation, which makes the date 1981 appear on the map as well as little blue dots representing store locations. The caption reads ‘Gestures to Emphasize Start of Growth in Central U.S.’. In image 3, the caption reads ‘Steps Forward to Show Detailed View of Spread along Coasts’, with the map dated 1993. In image 4, the date on the map is 2006 and the caption reads ‘Finished Animated Spread Shows Full Spread of Walmart’.

NapoleonVis (C1, C3)

This references Charles Minard’s depiction of the catastrophic 1812 march across Russia and Numberphile’s presentation (Kosara and Mackinlay, 2013; Grime, 2015; Friendly, 2023). The 3D army and the physical surroundings, the grass, reinforce the outdoor setting of the event as the army parts along the z-axis (Figure 3, Frame 1). Mid-performance (Figure 3, Frames 2 & 3), the presenter uses their full body to move back and forth with the French army to simulate the French following the Russians (C3).

InjuryVis (C2, C3)

A direct homage to the New York Times infographic (Network, 2020), InjuryVis fuses the presenter and the visualization with multiple body-vis bindings, making the presenter inseparable from the visualization and an integral narrative device of it (C2). The presenter points to their own body (C3) to demonstrate the dangers of football. Translucent red bubbles appear on the presenter’s ankle and head to indicate the number of injuries at that region of the body (Figure 4).

WalmartVis (C2, C3)

The animated geoplot, a replica of Bostock’s D3.js animation (Bostock, 2023), is a controlled counterexample. It is crafted to understand when body-vis bindings are useful or harmful for viewers. The animated map is bound to the presenter’s torso but, unlike in InjuryVis, serves little narrative purpose.

Baseline

The baseline presents the equivalent of each InfoVid as a 2D videoconferencing-style presentation with slides. Comparing the baseline with the InfoVids allows us to understand the effects of the new stagings enabled by the added flexibility of 3D space (F1). We minimized confounding factors between the baseline and InfoVids by using the same take of the performance and scripting a common body language that worked for both formats. Given the limited screen size of a mobile phone, we chose a horizontal video orientation for the baseline so that the presenter’s upper body, including their face and hands, would remain visible. To make aesthetically pleasing layouts, we used a video composition technique called the rule of thirds. We summarize this in Figure 6. However, making a baseline format that enables a fair comparison between the baseline and the InfoVids using the same performance takes is challenging and warrants a detailed discussion of its own. Therefore, we discuss the trade-offs and the reasons behind our choices in the Appendix.

Figure 6. \Description

[Making the Baseline Video Format]Shows the three sections representing steps of creating Baseline Video Format. First section is labeled ‘Two Phones Simultaneously Record Performance’. Two iPhone interfaces are shown, each with a presenter standing in a room. One screen has an AR overlay and the other does not. The one without AR overlay is connected to the second section via an arrow, with the label, ‘No AR Overlay’. The second section is labeled ‘3:1 Ratio to Maximize View of Slides and presenter’. It shows a horizontal iPhone screen. The leftmost 3/4 of the screen is occupied by a white slide with a gray figure and a semi-transparent red circle around the knee. The other 1/4 is occupied by the video taken from the no AR overlay iPhone video from section 1. On this video there is a semi-transparent orange overlay and black grid lines that separates the video into 9 sections. The three rows of grids are labeled ‘Eyes above the Line’, ‘Centered Head’ and ‘Space for Features’. The presenter’s eyes are lined up within the first row of grids, etc. The third section is labeled ‘Animation and Slides Synced Post-Hoc with presenter Movements’. It shows four images, the top two are screenshots from the video with AR Overlay, taken at different time frames. The bottom two images are screenshots from the synced up slideshow performance.

4. A Comparative Probing with InfoVids

We conducted street interviews to evaluate the InfoVids with members of the public. To poll diverse people, we followed Denning et al.’s methodology—interviews were conducted at three different cafes and a mall because of their ‘reasonable throughput of traffic’ and ability to ‘attract different demographics’ (Denning et al., 2014). Two interviewers worked in parallel, each for 10 hours, asking potential participants whether they would like to participate in a 15-minute evaluation of visual performances in exchange for a pastry. Demographic information was not collected to encourage interviewees to respond more readily and to provide honest feedback. Out of respect for people’s time and space, we only approached people who appeared to be waiting or not preoccupied. A total of 74 people were approached, of which 44 declined to participate. Subsequent analyses are based on the 30 participants who voluntarily consented to participate.

4.1. Interview Procedure

Participants were shown 4 pairs of performances on a smartphone. Each pair showed the same topic, once in a videoconferencing format and once as an InfoVid. As shorthand, we called the baseline condition the ‘slides’ version and the latter the ‘non-slides’ one. Both videos had 886×1920 resolution and a maximum length of 35 seconds. To minimize ordering and recency bias, we took random permutations of the 4 pairs of videos and randomly chose which version of the presentation was shown first.
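The counterbalancing described above can be sketched as follows — a minimal illustration of the randomization, not the study’s actual tooling; the function and topic names are ours:

```python
import random

def make_schedule(pairs, seed=None):
    """Randomly permute the video pairs and, within each pair, randomly
    pick whether the 'slides' baseline or the 'non-slides' InfoVid plays
    first (an illustrative sketch of the counterbalancing described above)."""
    rng = random.Random(seed)
    order = list(pairs)
    rng.shuffle(order)  # random permutation of the 4 topics
    schedule = []
    for topic in order:
        first = rng.choice(["slides", "non-slides"])
        second = "non-slides" if first == "slides" else "slides"
        schedule.append((topic, first, second))
    return schedule

# Example: one participant's viewing order.
schedule = make_schedule(
    ["AirplaneVis", "NapoleonVis", "InjuryVis", "WalmartVis"], seed=7)
```

Seeding is only for reproducibility of the example; in a live study each participant would get a fresh permutation.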

After each pair of viewings, participants filled out an anonymous online 9-question survey on a laptop. Question orderings were randomized to mitigate ordering bias. We asked the following 9 questions in the survey:

  1. Which presentation makes the presenter look more immersed with the visualization?

  2. Which presentation makes the presenter look more engaged with the visualizations?

  3. Which presentation makes it more believable that the presenter is in the same room as the visualizations?

  4. In which presentation does the presenter look more natural, body movement-wise?

  5. Which presentation is more enjoyable to watch?

  6. Which presentation style is better for telling a story?

  7. Which presentation style is better for understanding the information?

  8. For the NON-SLIDES version, did you view the presenter or the visualization more?

  9. For the SLIDES version, did you view the presenter or the visualization more?

The questions are designed as a two-alternative forced choice test (2AFC), but participants were presented with a 6-point Likert scale to discourage random selection bias. The analysis, however, treats the answers as a 2AFC, as this was the intended framework from the onset of the study design. After all 4 surveys, we conducted a short semi-structured interview to ask what elements in the performances affected their decisions. The interviewer transcribed notes in real-time to record the participant responses.
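Collapsing the 6-point Likert responses into the 2AFC framing might look like the following — a plausible mapping we assume for illustration, since the paper does not spell out the exact coding:

```python
def collapse_to_2afc(likert_response):
    """Collapse a 6-point Likert response (1-6) into a binary 2AFC choice.
    Assumed mapping for illustration: 1-3 leans toward the baseline
    'slides' version, 4-6 toward the 'non-slides' InfoVid."""
    if not 1 <= likert_response <= 6:
        raise ValueError("expected a 6-point Likert response")
    return "slides" if likert_response <= 3 else "non-slides"
```

Because the scale has no neutral midpoint, every response resolves to one of the two alternatives.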

4.2. Method of Analyses

There were four major parts to the mixed methods analyses. First, to analyze the overall efficacy of InfoVids for each visualization, we investigated participants’ overall preferences between the baseline and the InfoVid and between the presenter and visualization. For each of the 9 survey questions for a given visualization, a binomial test was conducted (α = 0.05), because each question was designed as a 2AFC test with 2 possible outcomes. The null hypothesis H0 assumed both outcomes were equally likely (p = 0.5). Rejecting the null hypothesis signified that the observed preference for one outcome or the other was not random and was statistically significant.
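The exact binomial test under H0: p = 0.5 can be computed with the standard library alone — a sketch equivalent to what `scipy.stats.binomtest` reports; with p = 0.5 the distribution is symmetric, so the two-sided p-value doubles the smaller tail:

```python
from math import comb

def binomial_test_two_sided(k, n):
    """Exact two-sided binomial test against H0: p = 0.5, as used for
    each 2AFC survey question. k is the vote count for one outcome
    out of n total votes."""
    m = min(k, n - k)
    if m == n - m:  # k is exactly n/2: every outcome is at least this extreme
        return 1.0
    tail = sum(comb(n, i) for i in range(m + 1)) / 2 ** n
    return 2 * tail

# E.g., 26 of 30 viewers picking one option is highly significant:
p = binomial_test_two_sided(26, 30)  # ~6e-5, reported as 0.000 in Figure 7
```

A 17-to-13 split, by contrast, comes nowhere near α = 0.05, which is why several survey questions show no significance marker.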

Next, to understand whether the different survey results among visualization types were significant, we ran Friedman’s chi-square test for each question. The Friedman test (α = 0.05) was used because the same viewers ordinally rated each question using a 6-point Likert scale under 4 different conditions. The H0 in this case assumes there is no difference in the medians, i.e., that the results across different visualization types were comparable. To further investigate whether a pair of groups differed significantly from each other, we ran the Wilcoxon signed-rank test when the distribution of differences between the two groups was approximately normal; otherwise, we ran the weaker sign test.
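The Friedman statistic itself is straightforward to compute by hand — rank each subject’s ratings across the k conditions (average ranks for ties), then compare rank sums. The sketch below omits the tie-correction factor for brevity; in practice `scipy.stats.friedmanchisquare` also returns the chi-square p-value:

```python
def friedman_statistic(blocks):
    """Friedman chi-square statistic for n subjects (rows) rating k
    conditions (columns), using average ranks for ties. Simplified
    sketch: the tie-correction factor is omitted."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        # average-rank the k ratings within this subject
        order = sorted(range(k), key=lambda j: row[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1  # average of the tied rank positions
            for t in range(i, j + 1):
                ranks[order[t]] = avg
            i = j + 1
        for col in range(k):
            rank_sums[col] += ranks[col]
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
```

With perfectly consistent rankings across subjects the statistic is maximal; with identical ratings everywhere it is zero.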

We also analyzed how the new staging format introduced by InfoVids affects viewer attention to the presenter and the visualization prop. For each visualization type, we asked participants twice whether they viewed the presenter or the visualization more—once for the InfoVid and once for the baseline. For each visualization, we compared whether the InfoVid made significant differences in visualization-presenter attention from the baseline. As we had fewer than 50 participants and paired, binary outcomes (visualization/presenter), we conducted Fisher’s exact test with α = 0.05 and summarize the results in Figure 8. To evaluate how many viewers switched their focus more to the presenter with InfoVids, in comparison to the baseline, we also analyzed the directionality of change using contingency tables.
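Fisher’s exact test on a 2×2 table can likewise be computed exactly with the standard library — a sketch matching what `scipy.stats.fisher_exact` computes: sum the hypergeometric probabilities of all tables with the same margins that are no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test on a 2x2 contingency table
    [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def hypergeom(x):  # P(first cell = x) given the fixed margins
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = hypergeom(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # small tolerance guards against floating-point ties
    return sum(p for x in range(lo, hi + 1)
               if (p := hypergeom(x)) <= p_obs * (1 + 1e-9))

# E.g., a balanced table is far from significant:
p = fisher_exact_two_sided([[3, 1], [1, 3]])  # ~0.486
```

The contingency tables in Figure 8 (e.g., how many viewers watched the visualization in the baseline but the presenter in the InfoVid) slot directly into this function.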

Lastly, to understand the factors that affected the viewer’s decisions, we thematically analyzed the semi-structured interviews using deductive coding(Braun and Clarke, 2006). Given the short interviews, only three iterations were needed to generate the following 9 themes: captivating, aesthetics, distracting, attention split, helpful for information understanding, breaks in presence, body language and interaction, body as context, and situational context.

Figure 7. \Description

[]: Shows four charts of view preference of Baseline vs InfoVids between the four sets of videos, AirplaneVis, NapoleonVis, InjuryVis and WalmartVis. Underneath the heading, ‘Viewer Preference of Baseline vs InfoVids’, it reads ‘Statistical significance of (a = 0.05) from binomial test indicated by *’. For each chart, the y axis is ‘Number of Votes by Viewers’, going from 0 to 30, and the x axis categories are the questions asked, including ‘Presenter Immersion with Vis’, ‘Presenter Engagement with Vis’, ‘Presenter in Same Room with Vis’, ‘Natural Body Movement’, ‘Enjoyable to Watch’, ‘Storytelling’ and ‘Information Understanding’. For each chart, Baseline (Slides) votes, in the form of bars, are represented by sky blue, and InfoVids (Non-Slides) votes are represented by purple bars. In the first chart, ‘AirplaneVis’, Presenter Immersion with Vis scores a Baseline of 1 and InfoVids of 28, with a p value of 0.000. Presenter Engagement with Vis scores a Baseline of 5 and InfoVids of 25, with a p value of 0.000. Presenter in Same Room with Vis scores a Baseline of 4 and InfoVids of 26, with a p value of 0.000. Natural Body Movement scores a Baseline of 6 and InfoVids of 24, with a p value of 0.001. Enjoyable to Watch scores a Baseline of 10 and InfoVids of 20. Storytelling scores a Baseline of 8 and InfoVids of 22, with a p value of 0.016. Information Understanding scores a Baseline of 13 and InfoVids of 17. In the second chart, ‘NapoleonVis’, Presenter Immersion with Vis scores a Baseline of 1 and InfoVids of 28, with a p value of 0.000. Presenter Engagement with Vis scores a Baseline of 4 and InfoVids of 26, with a p value of 0.000. Presenter in Same Room with Vis scores a Baseline of 4 and InfoVids of 26, with a p value of 0.000. Natural Body Movement scores a Baseline of 5 and InfoVids of 25, with a p value of 0.000. Enjoyable to Watch scores a Baseline of 6 and InfoVids of 24, with a p value of 0.001.
Storytelling scores a Baseline of 10 and InfoVids of 20. Information Understanding scores a Baseline of 14 and InfoVids of 16. In the third chart, ‘InjuryVis’, presenter Immersion with Vis scores a Baseline of 1 and InfoVids of 28, with a p value of 0.000. Presenter Engagement with Vis scores a Baseline of 7 and InfoVids of 23, with a p value of 0.005. Presenter in Same Room with Vis scores a Baseline of 5 and InfoVids of 25, with a p value of 0.000. Natural Body Movement scores a Baseline of 12 and InfoVids of 18. Enjoyable to Watch scores a Baseline of 15 and InfoVids of 15. Storytelling scores a Baseline of 13 and InfoVids of 17. Information Understanding scores a Baseline of 20 and InfoVids of 10. In the fourth chart, ‘WalmartVis’, presenter Immersion with Vis scores a Baseline of 12 and InfoVids of 18. Presenter Engagement with Vis scores a Baseline of 11 and InfoVids of 19. Presenter in Same Room with Vis scores a Baseline of 8 and InfoVids of 22, with a p value of 0.016. Natural Body Movement scores a Baseline of 21 and InfoVids of 9, with a p value of 0.043. Enjoyable to Watch scores a Baseline of 25 and InfoVids of 5, with a p value of 0.000. Storytelling scores a Baseline of 26 and InfoVids of 4, with a p value of 0.000. Information Understanding scores a Baseline of 27 and InfoVids of 3, with a p value of 0.000.

5. Findings and Lessons Learned from the Viewer’s Perspective

By asking participants to compare among different InfoVids and between the baseline and the InfoVids, we aim to understand the following questions: (1) How do InfoVids affect the viewing experience compared to traditional videoconferencing formats? (2) Does the merging of spaces make viewers focus on the presenter more than the visualization (survey Q8-9)? (3) How do the different conditions (C1–C3) in the InfoVids affect their viewing experiences (survey Q1-Q7)? We hope the answers to these questions can better guide the future design and presentation of InfoVids-like visualizations.

According to survey results, InfoVids enhance the stories and, in some cases, make the experience more enjoyable for viewers than the baseline. From the rationale given, viewers perceived the presenter as using more natural body movements and looking more immersed, engaged, and physically present with the visual props (Figure 7). While viewers primarily focused on the visualizations in the baseline, InfoVids made viewers focus on the presenter more by blending the presenter into the same space as the visual props (Figure 8).

Body Language is All You Need: InfoVids Enable Engaging and Enjoyable Full-Body Performances

The results suggest that InfoVids were more engaging and enjoyable to viewers because the layout, designed to foster equal standing between the visualization and presenter, encouraged use of the full body within the blended space. While the baseline had the same animations and performance takes, synced to the presenter’s movements, the presenter’s body was cut off, making the presentation visualization-centric by design. Participants “couldn’t really see her [the presenter’s] whole body” (P8). The baseline format prevented viewers from fully seeing the presenter’s actions directly related to the visualizations, such as pointing to different parts of the plane.

However, by co-locating the presenter and visualization in the same 3D space (C1), we changed the relationship between them and could send the message to viewers that the presenter’s body holds more meaning than in traditional presentations. The presenter’s body, scripted to move to the side for a crashing plane in AirplaneVis, emphasizes that the plane is taking up the presenter’s 3D space. The presenter’s body is the visualization itself in InjuryVis and a symbol for the Russians that the French were following in NapoleonVis. As P14, a strong advocate for InfoVids, states:

Even for WalmartVis, body language is all you need; you don’t need different languages—French, English, Spanish—it’s understandable for anyone it’s eye grabbing and it just pulls your attention and makes you focus on the presenter all the time and it’s engaging.

The body-vis bindings in NapoleonVis (C3) and InjuryVis (C2) emphasized the use of the full body by enhancing direct interactivity with the visualizations. As a result, participants “liked how the soldiers [from NapoleonVis] were going back and forth and [that] the presenter was moving with them” (P1). Some felt it “connected the story with the movement” (P22) of the presenter. InjuryVis was engaging because the “presenter is pointing at their own body” (P25) and “refer[ring] to her own body” (P23), and some were even “wow[ed]” (P29). As P17 states:

The body one, it was really appropriate—what better way to talk about the body than to use the body?

Even AirplaneVis (C1) was preferred over the baseline because of “the way the presenter moved to the side and occupied the space of the frame…made it more cohesive” (P17). Moving to the side of the frame was scripted but did not involve any direct body-vis bindings or interactions (C2, C3). Thus, this indicates that the presenter’s full body on its own acts as a critical narrative device and that occluding the body in the baseline makes the presentations less engaging.

We corroborate this sentiment with the quantitative findings (Figure 7). Viewers found presenters in AirplaneVis, NapoleonVis, and InjuryVis to look more natural and to be better at storytelling than in the baseline. Furthermore, the majority of viewers found the AirplaneVis and NapoleonVis InfoVids more enjoyable to watch than the baseline.

Figure 8.

\Description

Shows four bar charts and four contingency tables for the four sets of visualizations. Under the heading, ‘Viewer Attention of presenter vs Visualization by Presentation Type’, it reads ‘Statistical significance of (a = 0.05) from binomial test indicated by *’ and ‘Statistical significance of (a = 0.05) for differences in visualization-presenter attention between InfoVids and Baseline from Fisher’s exact test indicated by **’. The presenter is indicated by green and the visualization is indicated by yellow. For each bar chart, the y axis is Number of Votes by Viewers ranging from 0 to 30, and the x axis categories are Baseline and InfoVids. For AirplaneVis, the bar chart shows that Baseline scores a presenter score of 4, a Visualization score of 26*, and a p value of 0.000. InfoVids scores a presenter score of 17 and a Visualization score of 13. The p value for differences between Baseline and InfoVids is 0.001. For NapoleonVis, the bar chart shows that Baseline scores a presenter score of 6, a Visualization score of 25*, and a p value of 0.001. InfoVids scores a presenter score of 18 and a Visualization score of 12. The p value for differences between Baseline and InfoVids is 0.003. For InjuryVis, the bar chart shows that Baseline scores a presenter score of 9, a Visualization score of 21*, and a p value of 0.043. InfoVids scores a presenter score of 22*, a Visualization score of 8, and a p value of 0.016. The p value for differences between Baseline and InfoVids is 0.002. For WalmartVis, the bar chart shows that Baseline scores a presenter score of 5, a Visualization score of 25*, and a p value of 0.000. InfoVids scores a presenter score of 20 and a Visualization score of 10. The p value for differences between Baseline and InfoVids is 0.000. Underneath the bar charts are four contingency tables. Each table has a horizontal heading ‘InfoVids’ spanning two columns. The first column is labeled ‘presenter’ and the second column is labeled ‘Vis’. The vertical heading is ‘Baseline’, spanning two rows. The first row is labeled ‘Vis’ and the second ‘Pres.’, short for presenter. For AirplaneVis, the value at row 1 column 1 is 16, the value at row 1 column 2 is 10, the value at row 2 column 1 is 1, and the value at row 2 column 2 is 3. For NapoleonVis, the value at row 1 column 1 is 17, the value at row 1 column 2 is 7, the value at row 2 column 1 is 1, and the value at row 2 column 2 is 5. For InjuryVis, the value at row 1 column 1 is 17, the value at row 1 column 2 is 4, the value at row 2 column 1 is 5, and the value at row 2 column 2 is 4. For WalmartVis, the value at row 1 column 1 is 19, the value at row 1 column 2 is 6, the value at row 2 column 1 is 1, and the value at row 2 column 2 is 4.

InfoVids Reduce Split Viewer Attention and Increase Attention on the Presenter

Viewers preferred InfoVids because the blending of props and presenters reduced the need to split their attention between them. Multiple participants (P8, P12, P15, P20, P22, P24, P30) found the equitable, spatial layout of InfoVids “easier to follow along—see both the presenter and the visualizations” (P15), while the slides version felt like “two videos were fighting for their attention” (P20). Even for participants who “prefer seeing data rather than people, [it was] good to see person and data at the same time” (P25). Thus, the new layout not only changed the relationship dynamics between the visualizations and presenters for viewers but was also, at times, preferred.

Furthermore, more than half the participants switched from focusing on the visualizations to the presenter with InfoVids (WalmartVis 19/30, InjuryVis 17/30, NapoleonVis 17/30, AirplaneVis 16/30; Figure 8). These results are non-obvious because many of the visualizations, other than InjuryVis, were not directly overlaid on the presenter at all times—the airplane was detached from the presenter, the French army was beside and in front of the presenter much of the time, and the map for WalmartVis was located at the torso.

At the same time, comparing WalmartVis and InjuryVis shows that how we integrate visualizations with presenters also affects the viewer’s experience. We designed WalmartVis as a controlled counterexample to showcase instances of improper integration of presenter and visualization. This was to understand when visualizations should not be combined with the presenter and to assess whether participants would recognize these issues.

As expected, viewers significantly preferred the baseline WalmartVis over the InfoVid version in all measures. A significant number of viewers thought the presenter’s body movement looked unnatural (WalmartVis 21/30, Figure 7), and for some even “awkward” (P10). The map was not only “distracting” (P20) but also “not very acceptable when you’re trying to learn” (P4). Consequently, WalmartVis is the only performance in which viewers significantly preferred the baseline over the InfoVid on the metrics of enjoyability, storytelling, and information understanding. However, our results indicate that the merging of spaces itself did not warrant such negative reactions: viewers still perceived the presenter in WalmartVis as immersed, engaged, and co-existing in the same space as the visualization. Given that many viewers reacted positively to InjuryVis, we also know that viewers are not necessarily opposed to the act of attaching visualizations to the body. Thus, the results suggest that viewers are affected by how the presenter uses the props.

These findings indicate a need to develop new tools that help us investigate viewer attention in 3D presentations. Such tools can help presentation designers understand how to optimize 3D space and orient visualization props in relation to presenters to enhance storytelling. Similar to how studies investigate visual attention and flow in static information layouts with eye-tracking (Papoutsaki et al., 2017; Bylinskii et al., 2022; Borkin et al., 2015; Lu et al., 2020), more research needs to be conducted to fully understand what components split viewer attention between the presenter and the visualization when they co-exist on the same 2D screen and in the same 3D space. These visual design patterns can then inform how to create authoring tools that help with strategic blocking to enhance the narrative of the performance.

Merging Spaces Necessitates New Social Engineering Considerations

While most people enjoyed InfoVids, some did not, as their expectations of presentations did not align with their pre-existing mental models. Some viewers did not like the NapoleonVis InfoVid because their “brain associates [presentations] with a more academic setting” (P17). These participants believed historical data should be presented in lecture-hall settings, not as an InfoVid. P19 believed that for any presentation,

The [presenter] is not essential to understanding information, human connection isn’t that necessary.

While we anticipated that the new relationship dynamics introduced by InfoVids would impact viewers, we did not foresee that these relational dynamics would conflict with viewers’ pre-existing ones. Thus, these results indicate that a change in viewers’ mental models and norms of social acceptability around data presentations may first be needed before InfoVids can be fully accepted by a broader audience.

In addition, we find that visual preferences also affected viewer expectations. The AR elements, while enjoyed by many, for some “[were] disorienting… clarity of the visual separation [by the slides] made it more immersive as a learning experience” (P21). While the red bubbles of InjuryVis were effective storytelling devices for most, others preferred the baseline because they thought “the red graphic overtook too much of the body” (P26) and that the presenter looked “comical when it’s moving” (P19). This explains why InjuryVis tied with the baseline on the metric of enjoyability and why the baseline was preferred for information understanding.

Thus, future tools should guide designers on how to design visualizations in relation to the presenter so that they are not too distracting to the performance. These tools should also ensure that the presenter does not compete with the visualization and muddy the message. As P12 points out:

Maybe in [certain] situations…you should be aware of the applicability [to] all kinds of people…[if] it focuses more on the presenter themselves and their identity, maybe there could be tension in that.

Similar to how character appearance in game design and face filters affect the self-portrayal and identity of individuals in mixed reality (Fribourg et al., 2021; Morris et al., 2023; Birk and Mandryk, 2013; Chung et al., 2023), future authoring systems for visual performances should consider the social and visual context that the presenter brings into the performance. Otherwise, mismatches between presentation content and the context the presenter brings will lead to discomfort. While these were considerations we did not anticipate, the positive reception of the AirplaneVis, NapoleonVis, and InjuryVis InfoVids indicates that, with proper social engineering, InfoVids offer a unique experience that traditional videoconferencing formats cannot provide.

6. Lessons Learned from Designing InfoVids: An Autobiographical Perspective

In this section, we discuss lessons learned from the findings and from our nine-month experience designing InfoVids. While we cannot claim generalizability, as Neustaedter and Sengers state (Neustaedter and Sengers, 2012), we include insights from an autobiographical standpoint because they provide practical, long-term insights (Huang and Qian, 2023; Zhou et al., 2024b; Desjardins and Ball, 2018) for future visualization presentation tools.

Changing Presenter-System-Viewer Dynamics

Figure 9. \Description

Shows the relationships in a triangular form among three actors: the non-user spectator, the system, and the user. Between user and system there is a double arrow with the label ‘present paper, revisiting traditional hci’ to emphasize the focus of this paper. A double line connects the ‘non-user spectator’ with the ‘user’ with the label ‘Dalsgaard et al.’ to show what Dalsgaard et al. emphasized. Similarly, a double line runs from the ‘non-user spectator’ to the middle of the system-user line with the label ‘Reeves et al.’. There is also a self-looped arrow on the user with the caption ‘spectating the self’ to emphasize that the user is also a spectator of themselves.

To articulate the experience of spectators in emerging public performances, Reeves et al. (Reeves et al., 2005) formalize a classification and taxonomy for performative interfaces. Dalsgaard et al. (Dalsgaard and Hansen, 2008) later extend Reeves et al.’s framework to emphasize the user-spectator relationship over the spectator-system and the traditionally investigated user-system relationships. In spirit, both Reeves et al. and Dalsgaard et al. describe the emerging role of spectators as active participants affecting the user’s performance.

However, as we filmed ourselves as the presenter while simultaneously viewing the augmented visualizations overlaid on our body, we found that the division between the spectator and the presenter blurred. The relationship between the presenter and viewer was being redefined. As viewers, we could both perform and view ourselves in real time with the self-facing camera. This suggests that similar presentation systems found on streaming platforms, such as Twitch, TikTok, and Discord, and videoconferencing platforms, such as Skype, Teams, and Zoom, have also introduced new user-system-spectator dynamics.

These setups, however, cannot be fully explained by pre-existing frameworks of user-system-spectator relationships (Reeves et al., 2005; Dalsgaard and Hansen, 2008). As we illustrate in Figure 9, we may need to update pre-existing performance theories to account for self-facing cameras and streaming (Lu et al., 2018; Lottridge et al., 2017; Cheung and Huang, 2011), which have muddied the division between the user and the spectator in the opposite direction from that suggested by Reeves et al. and Dalsgaard et al. (Reeves et al., 2005; Dalsgaard and Hansen, 2008). Thus, as virtual and physical spaces start to blend, our probe with InfoVids prompts future work to investigate the emerging tensions not only between presenter and visualization, but also among the self-spectating presenter, system, and viewer.

Dynamic Presenter-Visualization Relationships

We also learn that InfoVids do more than blend space—they change the presenter’s relationship with the visualization and influence the message conveyed to viewers. At times, the presenter blends in with and becomes one with the prop. For example, the presenter’s full body acts as the visual highlight in AirplaneVis to indicate where the viewer should look. Presenters are the context and information in InjuryVis and the initiator of an engaging visual animation in NapoleonVis.

However, these relationships can also be dynamic and evolve over time. As we used augmented reality to visually place data in real-life settings, the data became physicalized (Jansen et al., 2015). However, unlike physical data visualizations, which remain relatively static, the properties of virtual data can be dynamic—they can change in size, location, material, and style. As a result, the relationship between visualization and presenter can constantly change. Visualizations that change in size change how they occupy space and ultimately alter how a presenter can interact with them. All of these factors redefine the visualization’s relationship with the presenter and transform the types of presentations a viewer can experience (VEGJ, 2023).

On the other hand, such changing affordances can be cognitively tiring to maintain. While the presenter has a view of their augmented self with the visualizations, the visualizations are not tangible like physical objects. The intangible form and changing affordances (Norman, 2013) warrant different interactions from the presenter (Brooks, 1988; Kister et al., 2017; Gong et al., 2023), which means presenters have to memorize how to interact with the visualizations. Even though the presentations were brief, a single presenter had to present four different InfoVids. This required context switching and memorizing various details and body language for each presentation, all of which can be mentally taxing.

Yet, this lesson prompts us to consider how future technologies can better support extended and multiple presentations, scenarios that are often overlooked in prior work. One possibility is cue systems customized for the presenter but hidden from the viewers. For example, in the process of designing InfoVids, we started to develop a system that triggered animations based on the presenter’s spatial location. This freed up the hands, reduced the need to memorize different interactions, and allowed the presenter to command the stage more. All of these components can enhance the viewer’s experience, as a less fatigued presenter can positively impact the quality and engagement of the presentation. And as our findings show, if executed well, InfoVids can be more engaging than traditional formats for the viewer.

7. Conclusion

We approached our investigation of a more equitable presentation paradigm with InfoVids from two distinct angles: the perspective of the viewers, and an autobiographical perspective as long-term designers and users of InfoVids. Our study findings suggest that viewers may find InfoVids to be a more engaging and preferable format than traditional 2D presentations, if the relationship of the visualization to the presenter is appropriately considered. In the process of designing and implementing InfoVids, we learned how spatial layout, form, and interactions affect how viewers perceive presenter-visualization relationships. We hope our insights into the process of making InfoVids will inform future data performance systems that nurture a new generation of data presenters who dance with data to tell sophisticated stories for a broader audience, carrying on the legacy of Prof. Hans Rosling.

References

  • Amini etal. (2015)Fereshteh Amini, Nathalie HenryRiche, Bongshin Lee, Christophe Hurter, and Pourang Irani. 2015.Understanding data videos: Looking at narrative visualization through the cinematography lens. In Proceedings of the 33rd Annual ACM conference on human factors in computing systems. 1459–1468.
  • Amini etal. (2018)Fereshteh Amini, NathalieHenry Riche, Bongshin Lee, Jason Leboe-McGowan, and Pourang Irani. 2018.Hooked on data videos: assessing the effect of animation and pictographs on viewer engagement. In Proceedings of the 2018 international conference on advanced visual interfaces. 1–9.
  • Bateman etal. (2010)Scott Bateman, ReganL Mandryk, Carl Gutwin, Aaron Genest, David McDine, and Christopher Brooks. 2010.Useful junk? The effects of visual embellishment on comprehension and memorability of charts. In Proceedings of the SIGCHI conference on human factors in computing systems. 2573–2582.
  • Birk and Mandryk (2013)Max Birk and ReganL Mandryk. 2013.Control your game-self: effects of controller type on enjoyment, motivation, and personality in game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 685–694.
  • Blum etal. (2012)Tobias Blum, Valerie Kleeberger, Christoph Bichlmeier, and Nassir Navab. 2012.mirracle: An augmented reality magic mirror system for anatomy education. In 2012 IEEE Virtual Reality Workshops (VRW). IEEE, 115–116.https://doi.org/10.1109/VR.2012.6180909
  • Borkin etal. (2015)MichelleA Borkin, Zoya Bylinskii, NamWook Kim, ConstanceMay Bainbridge, ChelseaS Yeh, Daniel Borkin, Hanspeter Pfister, and Aude Oliva. 2015.Beyond memorability: Visualization recognition and recall.IEEE transactions on visualization and computer graphics 22, 1 (2015), 519–528.
  • Bostock (2023)Mike Bostock. Accessed in 2023.Walmart’s Growth.https://observablehq.com/@d3/walmarts-growth
  • Bostock etal. (2011)Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. 2011.D3 data-driven documents.IEEE transactions on visualization and computer graphics 17, 12 (2011), 2301–2309.
  • Braun and Clarke (2006)Virginia Braun and Victoria Clarke. 2006.Using thematic analysis in psychology.Qualitative research in psychology 3, 2 (2006), 77–101.
  • Brooks (1988)F.P. Brooks. 1988.Grasping Reality through Illusion—interactive Graphics Serving Science. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Washington, D.C., USA) (CHI ’88). 1–11.https://doi.org/10.1145/57167.57168
  • Buchenau and Suri (2000)Marion Buchenau and JaneFulton Suri. 2000.Experience prototyping. In Proceedings of the 3rd conference on Designing interactive systems: processes, practices, methods, and techniques. 424–433.
  • Bylinskii etal. (2022)Zoya Bylinskii, Lore Goetschalckx, Anelise Newman, and Aude Oliva. 2022.Memorability: An Image-Computable Measure of Information Utility.Springer International Publishing, Cham, 207–239.https://doi.org/10.1007/978-3-030-81465-6_8
  • Chen etal. (2023)Junjie Chen, Chenhui Li, Sicheng Song, and Changbo Wang. 2023.iARVis: Mobile AR Based Declarative Information Visualization Authoring, Exploring and Sharing. In 2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR). 11–21.
  • Chen etal. (2019)Zhutian Chen, Yijia Su, Yifang Wang, Qianwen Wang, Huamin Qu, and Yingcai Wu. 2019.Marvist: Authoring glyph-based visualization in mobile augmented reality.IEEE transactions on visualization and computer graphics 26, 8 (2019), 2645–2658.
  • Chen etal. (2020)Zhutian Chen, Wai Tong, Qianwen Wang, Benjamin Bach, and Huamin Qu. 2020.Augmenting static visualizations with paparvis designer. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–12.
  • Cheung and Huang (2011)Gifford Cheung and Jeff Huang. 2011.Starcraft from the stands: understanding the game spectator. In Proceedings of the SIGCHI conference on human factors in computing systems. 763–772.
  • Chung etal. (2023)JiWon Chung, XiyuJenny Fu, Zachary Deocadiz-Smith, MalteF Jung, and Jeff Huang. 2023.Negotiating Dyadic Interactions through the Lens of Augmented Reality Glasses. In Proceedings of the 2023 ACM Designing Interactive Systems Conference. 493–508.
  • Cordeil etal. (2019)Maxime Cordeil, Andrew Cunningham, Benjamin Bach, Christophe Hurter, BruceH Thomas, Kim Marriott, and Tim Dwyer. 2019.IATK: An immersive analytics toolkit. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). IEEE, 200–209.
  • Dalsgaard and Hansen (2008)Peter Dalsgaard and LoneKoefoed Hansen. 2008.Performing perception—staging aesthetics of interaction.ACM Transactions on Computer-Human Interaction (TOCHI) 15, 3 (2008), 1–33.
  • Davis etal. (2023)JoshUrban Davis, Paul Asente, and Xing-Dong Yang. 2023.Multimodal Direct Manipulation in Video Conferencing: Challenges and Opportunities. In Proceedings of the 2023 ACM Designing Interactive Systems Conference. 1174–1193.
  • Denning etal. (2014)Tamara Denning, Zakariya Dehlawi, and Tadayoshi Kohno. 2014.In situ with bystanders of augmented reality glasses: Perspectives on recording and privacy-mediating technologies. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2377–2386.https://doi.org/10.1145/2556288.2557352
  • Desjardins and Ball (2018)Audrey Desjardins and Aubree Ball. 2018.Revealing tensions in autobiographical design in HCI. In proceedings of the 2018 designing interactive systems conference. 753–764.
  • Fribourg etal. (2021)Rebecca Fribourg, Etienne Peillard, and Rachel Mcdonnell. 2021.Mirror, mirror on my phone: Investigating dimensions of self-face perception induced by augmented reality filters. In 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 470–478.
  • Friendly (2023)Michael Friendly. Accessed in 2023.Minard’s Graphic Works.https://www.datavis.ca/gallery/re-minard.php
  • Gong etal. (2023)Weilun Gong, Stephanie Santosa, Tovi Grossman, Michael Glueck, Daniel Clarke, and Frances Lai. 2023.Affordance-Based and User-Defined Gestures for Spatial Tangible Interaction. In Proceedings of the 2023 ACM Designing Interactive Systems Conference. 1500–1514.
  • Graham (2024)Tim Graham. Accessed in 2024.The Safest Seat to Sit In On a Plane is….https://flowingdata.com/2008/05/20/the-safest-seat-to-sit-in-on-a-plane-is/
  • Grime (2015)James Grime. 2015.The Greatest Ever Infographic - Numberphile.https://www.youtube.com/watch?v=3T7jMcstxY0&ab_channel=Numberphile
  • Hall etal. (2022)BrianD Hall, Lyn Bartram, and Matthew Brehmer. 2022.Augmented chironomia for presenting data to remote audiences. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology. 1–14.
  • Harrison etal. (2015)Lane Harrison, Katharina Reinecke, and Remco Chang. 2015.Infographic aesthetics: Designing for the first impression. In Proceedings of the 33rd Annual ACM conference on human factors in computing systems. 1187–1190.
  • Hassoun etal. (2023)Amelia Hassoun, Ian Beacock, Sunny Consolvo, Beth Goldberg, PatrickGage Kelley, and DanielM Russell. 2023.Practicing Information Sensibility: How Gen Z Engages with Online Information. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–17.
  • Huang and Qian (2023)Jeff Huang and Jing Qian. 2023.irchiver: A Full-Resolution Personal Web Archive for Users and Researchers. In Proceedings of the 2023 Conference on Human Information Interaction and Retrieval. 449–453.
  • Hubenschmid etal. (2021)Sebastian Hubenschmid, Johannes Zagermann, Simon Butscher, and Harald Reiterer. 2021.Stream: Exploring the combination of spatially-aware tablets with augmented reality head-mounted displays for immersive analytics. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–14.
  • Hutchinson etal. (2003)Hilary Hutchinson, Wendy Mackay, Bo Westerlund, BenjaminB Bederson, Allison Druin, Catherine Plaisant, Michel Beaudouin-Lafon, Stéphane Conversy, Helen Evans, Heiko Hansen, etal. 2003.Technology probes: inspiring design for and with families. In Proceedings of the SIGCHI conference on Human factors in computing systems. 17–24.
  • Jansen etal. (2015)Yvonne Jansen, Pierre Dragicevic, Petra Isenberg, Jason Alexander, Abhijit Karnik, Johan Kildal, Sriram Subramanian, and Kasper Hornbæk. 2015.Opportunities and challenges for data physicalization. In proceedings of the 33rd annual acm conference on human factors in computing systems. 3227–3236.
  • Jiang etal. (2023)Yu Jiang, Zhipeng Li, Mufei He, David Lindlbauer, and Yukang Yan. 2023.HandAvatar: Embodying Non-Humanoid Virtual Avatars through Hands. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–17.
  • Kister etal. (2017)Ulrike Kister, Konstantin Klamka, Christian Tominski, and Raimund Dachselt. 2017.GraSp: Combining Spatially-aware Mobile Devices and a Display Wall for Graph Visualization and Interaction.Computer Graphics Forum 36, 3 (2017), 503–514.
  • Kosara and Mackinlay (2013)Robert Kosara and Jock Mackinlay. 2013.Storytelling: The Next Step for Visualization.Computer 46, 5 (2013), 44–50.https://doi.org/10.1109/MC.2013.36
  • Langner etal. (2021)Ricardo Langner, Marc Satkowski, Wolfgang Büschel, and Raimund Dachselt. 2021.Marvis: Combining mobile devices and augmented reality for visual data analysis. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–17.
  • Lee etal. (2013)Bongshin Lee, RubaiatHabib Kazi, and Greg Smith. 2013.SketchStory: Telling more engaging stories with data through freeform sketching.IEEE transactions on visualization and computer graphics 19, 12 (2013), 2416–2425.
  • Li and Moacdieh (2014)Huiyang Li and Nadine Moacdieh. 2014.Is “chart junk” useful? An extended examination of visual embellishment. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol.58. Sage Publications Sage CA: Los Angeles, CA, 1516–1520.
  • Liao etal. (2022)Jian Liao, Adnan Karim, ShiveshSingh Jadon, RubaiatHabib Kazi, and Ryo Suzuki. 2022.RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology. 1–12.
  • Lo etal. (2022)Leo Yu-Ho Lo, Ayush Gupta, Kento Shigyo, Aoyu Wu, Enrico Bertini, and Huamin Qu. 2022.Misinformed by visualization: What do we learn from misinformative visualizations?Computer Graphics Forum 41, 3 (2022), 515–525.
  • Lottridge etal. (2017)Danielle Lottridge, Frank Bentley, Matt Wheeler, Jason Lee, Janet Cheung, Katherine Ong, and Cristy Rowley. 2017.Third-wave livestreaming: teens’ long form selfie. In Proceedings of the 19th international conference on human-computer interaction with mobile devices and services. 1–12.
  • Lu etal. (2020)Min Lu, Chufeng Wang, Joel Lanir, Nanxuan Zhao, Hanspeter Pfister, Daniel Cohen-Or, and Hui Huang. 2020.Exploring Visual Information Flows in Infographics. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). 1–12.https://doi.org/10.1145/3313831.3376263
  • Lu etal. (2018)Zhicong Lu, Haijun Xia, Seongkook Heo, and Daniel Wigdor. 2018.You watch, you give, and you engage: a study of live streaming practices in China. In Proceedings of the 2018 CHI conference on human factors in computing systems. 1–13.
  • Luo etal. (2021)Weizhou Luo, Eva Goebel, Patrick Reipschläger, MatsOle Ellenberg, and Raimund Dachselt. 2021.Exploring and slicing volumetric medical data in augmented reality using a spatially-aware mobile device. In 2021 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). IEEE, 334–339.
  • Luo and Tang (2008)Yiwen Luo and Xiaoou Tang. 2008.Photo and video quality evaluation: Focusing on the subject. In Computer Vision–ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part III 10. Springer, 386–399.
  • Morris etal. (2023)MargaretE Morris, DanielaK Rosner, PaulaS Nurius, and HadarM Dolev. 2023.“I Don’t Want to Hide Behind an Avatar”: Self-Representation in Social VR Among Women in Midlife. In Proceedings of the 2023 ACM Designing Interactive Systems Conference (Pittsburgh, PA, USA) (DIS ’23). 537–546.https://doi.org/10.1145/3563657.3596129
  • Network (2020)TheLearning Network. 2020.What’s going on in this graph? — high-school sports injuries.https://www.nytimes.com/2020/01/23/learning/whats-going-on-in-this-graph-high-school-sports-injuries.html
  • Neustaedter and Sengers (2012)Carman Neustaedter and Phoebe Sengers. 2012.Autobiographical design in HCI research: designing and learning through use-it-yourself. In Proceedings of the Designing Interactive Systems Conference. 514–523.
  • Norman (2013)Don Norman. 2013.The design of everyday things: Revised and expanded edition.Basic books, New York, New York.
  • Norooz etal. (2015)Leyla Norooz, MatthewLouis Mauriello, Anita Jorgensen, Brenna McNally, and JonE Froehlich. 2015.BodyVis: A new approach to body learning through wearable sensing and visualization. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. 1025–1034.
  • Papoutsaki etal. (2017)Alexandra Papoutsaki, James Laskey, and Jeff Huang. 2017.Searchgazer: Webcam eye tracking for remote studies of web search. In Proceedings of the 2017 conference on conference human information interaction and retrieval. 17–26.
  • Pei etal. (2022)Siyou Pei, Alexander Chen, Jaewook Lee, and Yang Zhang. 2022.Hand interfaces: Using hands to imitate objects in AR/VR for expressive interactions. In Proceedings of the 2022 CHI conference on human factors in computing systems. 1–16.
  • Perlin etal. (2018)Ken Perlin, Zhenyi He, and Karl Rosenberg. 2018.Chalktalk: A Visualization and Communication Language–As a Tool in the Domain of Computer Science Education.arXiv:1809.07166[cs.HC]
  • Rajaram and Nebeling (2022)Shwetha Rajaram and Michael Nebeling. 2022.Paper trail: An immersive authoring system for augmented reality instructional experiences. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1–16.
  • Reeves etal. (2005)Stuart Reeves, Steve Benford, Claire O’Malley, and Mike Fraser. 2005.Designing the spectator experience. In Proceedings of the SIGCHI conference on Human factors in computing systems. 741–750.
  • Rosling (2011)Hans Rosling. 2011.Hans Rosling’s 200 Countries, 200 Years, 4 Minutes.https://www.youtube.com/watch?v=jbkSRLYSojo&t=54s&ab_channel=BBC
  • Sallam etal. (2022)Samar Sallam, Yumiko Sakamoto, Jason Leboe-McGowan, Celine Latulipe, and Pourang Irani. 2022.Towards design guidelines for effective health-related data videos: An empirical investigation of affect, personality, and video content. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1–22.
  • Saquib etal. (2019)Nazmus Saquib, RubaiatHabib Kazi, Li-Yi Wei, and Wilmot Li. 2019.Interactive body-driven graphics for augmented video performance. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
  • Satriadi etal. (2022)KadekAnanta Satriadi, Jim Smiley, Barrett Ens, Maxime Cordeil, Tobias Czauderna, Benjamin Lee, Ying Yang, Tim Dwyer, and Bernhard Jenny. 2022.Tangible globes for data visualisation in augmented reality. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1–16.
  • Satyanarayan etal. (2016)Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and Jeffrey Heer. 2016.Vega-lite: A grammar of interactive graphics.IEEE transactions on visualization and computer graphics 23, 1 (2016), 341–350.
  • Sicat etal. (2018)Ronell Sicat, Jiabao Li, Junyoung Choi, Maxime Cordeil, Won-Ki Jeong, Benjamin Bach, and Hanspeter Pfister. 2018.DXR: A toolkit for building immersive data visualizations.IEEE transactions on visualization and computer graphics 25, 1 (2018), 715–725.
  • Slater etal. (2003)Mel Slater, Andrea Brogni, and Anthony Steed. 2003.Physiological responses to breaks in presence: A pilot study. In Presence 2003: The 6th annual international workshop on presence, Vol.157. Citeseer.
  • Slater and Steed (2000)Mel Slater and Anthony Steed. 2000.A virtual presence counter.Presence 9, 5 (2000), 413–434.
  • sqadia.com (2024)sqadia.com. Accessed in 2024.Introduction to Anatomy SUBDIVISIONS — Made Easy for Medical Students.https://www.youtube.com/watch?v=q6fQf6VLDOY
  • Subramonyam (2015)Hariharan Subramonyam. 2015.SIGCHI: magic mirror-embodied interactions for the quantified self. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. 1699–1704.
  • Suzuki etal. (2020)Ryo Suzuki, RubaiatHabib Kazi, Li-Yi Wei, Stephen DiVerdi, Wilmot Li, and Daniel Leithinger. 2020.Realitysketch: Embedding responsive graphics and visualizations in AR through dynamic sketching. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology. 166–181.
  • Tohidi etal. (2006)Maryam Tohidi, William Buxton, Ronald Baecker, and Abigail Sellen. 2006.Getting the right design and the design right. In Proceedings of the SIGCHI conference on Human Factors in computing systems. 1243–1252.
  • Tong etal. (2023)Wai Tong, Zhutian Chen, Meng Xia, Leo Yu-Ho Lo, Linping Yuan, Benjamin Bach, and Huamin Qu. 2023.Exploring interactions with printed data visualizations in augmented reality.IEEE Transactions on Visualization and Computer Graphics 29 (2023), 418 – 428.Issue 1.
  • VEGJ (2023)DAVOD VEGJ. Last accessed in 2023.BLOCKING 101 How directors tell stories with movement.https://dramatics.org/blocking-101/
  • Wang (2020)Yunwen Wang. 2020.Humor and camera view on mobile short-form video apps influence user experience and technology-adoption intent, an example of TikTok (DouYin).Computers in human behavior 110 (2020), 106373.
  • Wang etal. (2019)Zezhong Wang, Shunming Wang, Matteo Farinella, Dave Murray-Rust, Nathalie HenryRiche, and Benjamin Bach. 2019.Comparing effectiveness and engagement of data comics and infographics. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
  • Yang etal. (2023)Saelyne Yang, Sangkyung Kwak, Juhoon Lee, and Juho Kim. 2023.Beyond Instructions: A Taxonomy of Information Types in How-to Videos. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–21.
  • Yang etal. (2020)Yalong Yang, Tim Dwyer, Kim Marriott, Bernhard Jenny, and Sarah Goodwin. 2020.Tilt map: Interactive transitions between choropleth map, prism map and bar chart in immersive environments.IEEE Transactions on Visualization and Computer Graphics 27, 12 (2020), 4507–4519.
  • Zhou etal. (2024a)Tongyu Zhou, Jeff Huang, and Gromit Chan. 2024a.Epigraphics: Message-Driven Infographics Authoring. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’24).https://doi.org/10.1145/3613904.3642172
  • Zhou etal. (2024b)Tongyu Zhou, JoshuaKong Yang, VivianHsinyueh Chan, JiWon Chung, and Jeff Huang. 2024b.PortalInk: 2.5D Visual Storytelling with SVG Parallax and Waypoint Transitions. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (Pittsburgh, PA, USA) (UIST ’24). 16.https://doi.org/10.1145/3654777.3676376
  • Zhu etal. (2020)Chengyan Zhu, Xiaolin Xu, Wei Zhang, Jianmin Chen, and Richard Evans. 2020.How health communication via Tik Tok makes a difference: A content analysis of Tik Tok accounts run by Chinese provincial health committees.International journal of environmental research and public health 17, 1 (2020), 192.

Appendix A How to Create InfoVids with the Body Object Model

The Body Object Model (BOM) treats the full-body as a series of nestable anchor points, similar to HTML tags, in which the designer can nest, or bind, visual props to the body. This nested structure serves two purposes. First, it reinforces the idea that the visualization is an element that can be nested, or blended in, with the presenter, effectively blurring the boundaries between the visualization and the presenter. Second, it forces the designer of the visual performance to define the visualizations in relation to the presenter, with a parent-child hierarchy, making the presenter the primary focus even in its language. We implement BOM with Swift and ARKit as they naturally align with the nestable tree structure that we aim to achieve.
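To make the nesting concrete, the following is a minimal, self-contained Swift sketch of the BOM idea: anchors form a tree rooted at the presenter’s body, and visual props are bound as children of an anchor. The names (a VisNode-style prop bound via bind) follow the paper’s terminology, but the class layout is our illustrative assumption, not the actual ARKit/SceneKit implementation.

```swift
// Illustrative sketch of the Body Object Model (BOM): body anchors form
// a nestable tree (like HTML tags), and props bind as children so every
// prop is defined relative to the presenter. Hypothetical class names;
// the real system builds on ARKit/SceneKit nodes.

class BOMNode {
    let name: String
    private(set) var children: [BOMNode] = []
    weak var parent: BOMNode?

    init(name: String) { self.name = name }

    // Nest a child node under this anchor (parent-child hierarchy).
    func bind(_ child: BOMNode) {
        child.parent = self
        children.append(child)
    }

    // Path from the root (the presenter's body) down to this node,
    // showing that the prop is expressed relative to the presenter.
    func path() -> String {
        var parts = [name]
        var node = parent
        while let p = node {
            parts.append(p.name)
            node = p.parent
        }
        return parts.reversed().joined(separator: "/")
    }
}

// The presenter's body is the root; anchors and props nest beneath it.
let body = BOMNode(name: "body")
let rightHand = BOMNode(name: "rightHand")
let barChart = BOMNode(name: "barChartVis")
body.bind(rightHand)
rightHand.bind(barChart)
print(barChart.path())  // body/rightHand/barChartVis
```

The parent-child hierarchy mirrors the second purpose stated above: a prop cannot be declared without naming the body anchor it hangs from.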

The iPhone’s TrueDepth and ARKit software detect and segment the presenter’s face from the background within 3 m of the device’s front camera. This provides a relative, proxy depth from the presenter’s face, enabling us to incorporate three-dimensional form into the visualizations.

Using the self-facing camera, ARKit, VNDetectHumanBodyPose3DRequest, and VNDetectHumanHandPoseRequest respectively generate the body anchors: the face node, body joints, and hand joints. All body anchors inherit from SCNNode, and the node positions are overlaid on the diagram of the person and hand. Body anchors are the locations to which a designer can bind a VisNode using bind. The face, body, and hand joints are managed by the FacialCueTracker, SkeletonTracker, and HandTracker respectively. HandTracker also manages GestureStates, which are then used to build an ActionSequence. The system provides 10 GestureStates and 5 ActionSequences. Using VisHandler, the designer scripts interactions between a VisNode and the body.
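As a sketch of how GestureStates might compose into an ActionSequence, the pure-Swift example below fires a callback once an ordered pattern of gestures has been observed. The enum cases and matching logic are illustrative assumptions; the system’s actual 10 GestureStates and 5 ActionSequences are not reproduced here.

```swift
// Hedged sketch: GestureStates composing into an ActionSequence that
// triggers a visualization animation. Enum cases and matching logic
// are illustrative, not the system's actual definitions.

enum GestureState: Equatable {
    case openPalm, pinch, point, fist
}

// An ActionSequence fires its callback once its ordered pattern of
// gesture states has been observed, then resets.
struct ActionSequence {
    let pattern: [GestureState]
    let onComplete: () -> Void
    private var progress = 0

    init(pattern: [GestureState], onComplete: @escaping () -> Void) {
        self.pattern = pattern
        self.onComplete = onComplete
    }

    mutating func observe(_ state: GestureState) {
        if state == pattern[progress] {
            progress += 1
            if progress == pattern.count {
                onComplete()
                progress = 0
            }
        } else {
            // Restart matching; the mismatched gesture may begin a new run.
            progress = (state == pattern[0]) ? 1 : 0
        }
    }
}

var triggered = false
var seq = ActionSequence(pattern: [.openPalm, .pinch]) { triggered = true }
seq.observe(.openPalm)
seq.observe(.pinch)
print(triggered)  // true
```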

The prototype is a proof-of-concept tool for experts to design and perform InfoVids. Using this system requires multidisciplinary knowledge of AR, visualization, and performance. Before programming, the designer needs to design a simple narrative for the visualizations. They then detail how they will stage the interactions between the visualizations and the presenter, much like how blocking is done in theater (VEGJ, 2023). Finally, after creating the body-vis bindings, all interactions and gestures that trigger a visualization are coded. Based on our nine-month experience with the system, we find that gestures are dictated by the narrative the designer wishes to convey and need to flexibly adapt to suit each unique presentation (Sec. 6). Thus, there is currently no fixed gestural language; this is a deliberate design choice.

After designing the InfoVid, the smartphone is positioned such that the front-facing camera faces the presenter. The phone screen provides a mirrored view of the presenter with the visualizations such that they can see how they are engaging with the visualizations in real-time, much like AR filters. This setup allows us to evaluate our research questions with InfoVids that have genuine interactions from the presenter(Buchenau and Suri, 2000), unlike Wizard-of-Oz styled setups which use post-processing to overlay the augmentations.

Appendix B Baseline Video Format

This section outlines how we made the baseline, a 2D videoconferencing-style presentation with slides, as comparable as possible to the InfoVids presentation. From our external critiques, we found two major factors to consider when making the baseline.

First, different takes, or renditions of the performance, and the author’s biased knowledge of the system generated different performances. Minute changes in facial expressions, enthusiasm, and dialogue changed the perceived narrative. To account for this, we hired an external actor with no prior knowledge of the study to perform in the videos and ensured that both versions of the presentations used the same take of the performance.

Next, the differences in presenter body movements caused by the added depth were too subtle. If the differences were hard for actively critiquing researchers to find, they would be even less discernible to a non-researcher. To make the differences more apparent, we chose a more traditional 2D videoconferencing-style presentation as the baseline, referencing studies of previous interactive visual systems (Davis et al., 2023; Lee et al., 2013).

We controlled for the actor’s body language, the layout, and the visualizations and their interactive effects with the presenter. The overall process is summarized in Figure 6.

B.1. Scripting for a Common Body Language

To incorporate feedback from the critique, we recorded the performance with two iPhone 13 minis simultaneously. As demonstrated in Figure 6, this setup provided two videos of the same performance: one InfoVid, and one with no overlaid visualizations for the baseline.

The setup to control for confounding factors, however, introduced new problems. Simultaneous recording required pre-planning so that the body language could be shared across both presentations. For example, an actor looking directly up towards or at the plane would look awkward and aberrant in a typical slides presentation without the plane, because the actor would be looking into a void (Figure 10). Thus, to minimize such distractions for the viewer, we provided a script that used a common body language to preserve semantic meaning in both. We instructed the actor to (1) use large, open hand movements with no specific hand or finger gestures and (2) face the camera. As depicted in Figure 10, these modifications detracted from InfoVid benefits and made the presenter look less immersed, but allowed for fairer comparisons with the baseline.

B.2. Layouts to Preserve Presenter Body Language and Maximize Visualization View

Because we are interested in the effects of perceived performer body language with the visualizations using InfoVids, we chose a videoconferencing layout that would clearly capture presenter body language and the visualizations for the baseline. We referenced pre-existing videoconferencing formats in-the-wild and disregarded formats that removed or minimized the size of the presenter to make the visualization the main view.

While it is possible to stage a visual performance where the full body is shown, we opted for a version that only showed the upper torso for a few reasons. First, we wanted the baseline to reflect a realistic problem that commonly happens in videoconferencing systems: the cut-off of body language caused by the ‘boxing out’ of the presenter. In addition, we needed to use the same take of the performance for the experiment, but some body movements while using InfoVids were impossible to translate to a 2D setting even with a scripted common body language. For example, InfoVids encouraged full-body movements such as walking back and forth in 3D space or pointing to different parts of the plane, but these movements obstructed the narrative in a 2D videoconferencing setting. However, we kept the upper half of the torso so that viewers could still see the performer’s hand gestures and facial expressions.

We also controlled for how much the presenter’s body occupied the video frame across different videos. We chose a horizontal baseline layout that placed slides to the left of the presenter at a 3:1 ratio (sqadia.com, 2024) to maximize views of both the presenter and the visualization. We also applied the rule of thirds, a composition technique often used in professional videos to direct viewer attention to the presenter (Luo and Tang, 2008). As the orange overlay shows (Figure 6), we divided the presenter video horizontally and vertically into three parts and placed the presenter’s face just above the top horizontal line while leaving some space above the head. If the presenter moved around in the video, we zoomed in or translated the video with the presenter to maintain the rule of thirds, using post-hoc video editing. The end effect is comparable to auto zoom-in techniques that follow a moving presenter in videoconferencing.
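The layout arithmetic above can be sketched in a few lines of Swift. The 1920x1080 frame size is an assumption for the example, and the struct and helper names are ours, not part of the study materials.

```swift
// Illustrative sketch of the baseline layout math: a 3:1 horizontal
// split (slides : presenter) and a rule-of-thirds guide for placing
// the presenter's face. Frame size and names are assumptions.

struct Rect {
    var x, y, w, h: Double
}

// Split the frame so slides take 3/4 of the width and the presenter 1/4.
func baselineLayout(frameW: Double, frameH: Double) -> (slides: Rect, presenter: Rect) {
    let slidesW = frameW * 3.0 / 4.0  // 3:1 ratio
    let slides = Rect(x: 0, y: 0, w: slidesW, h: frameH)
    let presenter = Rect(x: slidesW, y: 0, w: frameW - slidesW, h: frameH)
    return (slides, presenter)
}

// Rule of thirds: the top horizontal guide of a 3x3 grid over the
// presenter video, which the face sits just above.
func upperThirdLine(of r: Rect) -> Double {
    return r.y + r.h / 3.0
}

let (slides, presenter) = baselineLayout(frameW: 1920, frameH: 1080)
print(slides.w, presenter.w, upperThirdLine(of: presenter))
// 1440.0 480.0 360.0
```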

However, the horizontal setup is a trade-off: we exchanged control over phone orientation to maximize screen real estate. Using the vertical orientation of the phone limits the size of the slides and the presenter. A horizontal orientation of the InfoVid version would cut off half the presenter’s body and unfairly disadvantage InfoVids, which enable full-body performances. On the other hand, as seen in Figure 6, the horizontal layout enables a full view of the slides and a sizable view of the presenter and their hand gestures for the baseline. While this setup clips the lower body and, at times, some of the presenter’s hands, these formats are not atypical in videoconferencing. Thus, the horizontal layout is a compromise that was necessary for a more equitable comparison between the InfoVids and the baseline.

B.3. Post-hoc Synchronization

In a typical slide presentation, the performer does not have direct control over the visualizations; they can only sync their movements with them. However, because one of the benefits of InfoVids is the performer’s ability to directly control the visualization, we synced the animations with the presenter’s movements to simulate a well-rehearsed presentation. Using Adobe Premiere Pro, we post-hoc synced the video of the presenter with no virtual overlays to the timings of the animations, gesture triggers, and performer’s movements in the InfoVids. To control for effects caused by different visualization props, we also used the same visualization props and animation effects as the InfoVids, except for InjuryVis, where the presenter’s body was replaced by a silhouette. These choices dilute InfoVids’ advantage of binding visualizations to the body, as many presentations in the wild do not sync meticulously with the visualizations, but they reduce confounds for the study.

Reimagining the Viewer Experience with Alternative Visualization-Presenter Relationships (10)

\Description

Shows four images side by side. There are four labels on the top, in the order of ‘Original’, ‘Modified for Study’, ‘Original’, ‘Modified for Study’. Each ‘Original’ tag is light gray and each ‘Modified for Study’ tag is gray. For each image underneath the four labels, the performer is seen standing outside on a field. Underneath the first label, ‘Original’, the performer is looking up towards the sky while a 3D plane enters from the top of the image. The caption reads, ‘Head Tilt to the Sky Triggers Plane Entrance’. Underneath the second label, ‘Modified for Study’, the performer is gesturing towards the sky with the left arm, pointing towards the sky. The caption reads, ‘Large Hand Gesture Triggers Plane Entrance’. Under the third label, ‘Original’, the presenter is standing towards the right of an enlarged plane with colored seats, looking at the plane while tilting their body towards the plane. The caption reads, ‘presenter Looks Directly at the Plane’. Under the fourth label, ‘Modified for Study’, the presenter faces forward and uses their right arm to point towards the enlarged plane. The caption reads, ‘Presenter Faces Forward while Pointing to the Plane’.
