Exploring Standard and Novel Interactions on a Multi-Touch-Enhanced Mouse
Mahmoud Sadeghi (https://orcid.org/0000-0002-5475-7960) and David Ahlström (https://orcid.org/0000-0002-9553-1685)
Interactive Systems, University of Klagenfurt, Austria
{mahmoud.sadeghi, david.ahlstroem}@aau.at
Keywords:
Multi-Touch Surfaces, Mouse Interactions (Point-and-Click, Dragging), Window-Switching.
Abstract:
The mouse is the primary pointing device in desktop computing. Its shape and form have advanced over decades,
but its functionality is still limited to basic interactions. Inspired by early work, we envision multi-touch enhanced
mouse input. With two user studies, we first show that our simple multi-touch mouse prototype – a smartphone
attached to the base of a mouse – compares well to a standard mouse in point-and-click and dragging tasks. In a
third study, we explore the potential of multi-touch enhanced mouse input and show that with Depth-Drag, a new
multi-touch interaction, users are up to 47% faster when navigating and moving objects between stacked desktop
windows than when they use a regular mouse.
1 INTRODUCTION
The computer mouse remains the primary input device
for desktop computing, continually evolving through
technical and ergonomic advancements to accommodate
a broad range of tasks, from basic icon selection to com-
plex image editing. Researchers have explored ways to
enhance the mouse via hardware (Kim et al., 2020; Son
et al., 2023) or software modifications (Ahlström et al., 2006; Grossman and Balakrishnan, 2005). Several
projects introduced innovative designs that expand
the mouse’s input capabilities with additional sen-
sors (Martin and Raisamo, 2004; Perelman et al., 2015;
Villar et al., 2009; Shi et al., 2009; Yang et al., 2010;
Saidi et al., 2017). However, most fail to convincingly
demonstrate equivalence with the standard mouse in
fundamental desktop interactions, such as point-and-
click or dragging, thereby limiting their potential as
viable alternatives.
Drawing inspiration from touchpads that allow
for multi-finger gestures, we revisit the concept of
having a multi-touch surface on a mouse. Benko et al.
(2010) provide valuable insights into the design space
of multi-touch mice, and the LensMouse (Yang et al.,
2010) project inspired our prototype.
Our multi-touch mouse (MTM) prototype (Figure
1a) is straightforward: a regular smartphone attached to
the base of a standard mouse. This enables multi-touch
interactions via the touchscreen while allowing for
exact cursor pointing via the mouse base. In two studies,
we first demonstrate how MTM compares to a standard
mouse in a point-and-click task and four different
dragging tasks. After that, we introduce the Depth-Drag
interaction on MTM (Figure 1b-d). Depth-Drag enables
users to seamlessly move desktop items between stacked
interface layers, such as between overlapping windows,
content layers in graphic applications, or tabs in tabbed
views. As such, Depth-Drag could serve to eliminate
the often cumbersome rearrangement of windows, and
it could also reduce reliance on the ‘invisible’ clipboard.
In our third study, we evaluate the efficiency of Depth-
Drag in a common file-moving scenario and compare
its performance to the corresponding mouse-based
interactions.
Our contributions are: 1) we present MTM, a
multi-touch enhanced mouse prototype; 2) we devise
and assess ways to perform the fundamental desktop
interactions: point-and-click and dragging; 3) we
introduce and evaluate Depth-Drag, a novel interaction
tailored for touch-enhanced pointing devices, and 4) we
sketch ideas and scenarios benefiting from multi-touch
enhanced mouse-like devices, such as MTM.
2 BACKGROUND & PROTOTYPE
Given its importance, the computer mouse has been a
focal point of substantial research. Previous studies have
primarily aimed to 1) improve current mouse technology
(ergonomics, hardware, software), 2) reduce reliance
Figure 1: a) The MTM prototype combines cursor movements and touch gestures. With the Depth-Drag interaction, the user quickly moves objects between stacked windows using b) a tap-click to select the source window, then c) a two-finger swipe to grab the desired object, and d) while holding it, a two-finger swipe up or down to navigate to the destination window, where the object is dropped using a finger liftoff.
on the mouse, or 3) enhance the mouse with extra
hardware to provide add-on functionality. Researchers
have investigated alterations to the mouse’s physical
characteristics, such as its shape (Peters and Ivanoff,
1999), size (Hourcade et al., 2007), weight (Conroy
et al., 2022), or the position of its sensor (Kim et al.,
2020), to assess their impact on pointing performance.
Software-based solutions involve adjusting pointing
targets on the screen (McGuffin and Balakrishnan,
2002) and optimizing cursor behavior (Ahmad et al.,
2014; Casiez et al., 2008; Grossman and Balakrishnan,
2005). Researchers have also explored ways to remove
the physical mouse by emulating its functionality via
finger gestures on the computer’s side (Mistry and Maes,
2011) or mid-air gestures above the keyboard (Taylor
et al., 2014; Ortega and Nigay, 2009; Ultraleap, 2013).
The third major theme in the mouse literature involves
enhancing the mouse with additional hardware to
broaden its functionality. Several projects have explored
incorporating various sensors to add one or more
extra degrees of freedom to the mouse. Examples
include the VideoMouse (Hinckley et al., 1999), the
TrackMouse (Martin and Raisamo, 2004), and the
Gesture Mouse (Franz et al., 2016). Other projects
have departed from mouse-like designs, presenting
sophisticated prototypes with additional degrees of
freedom, such as the GlobeFish (Froehlich et al., 2006),
the Roly-Poly mouse (Perelman et al., 2015), and the
TDome (Saidi et al., 2017).
Several projects have focused on adding touch
sensors to the mouse. Hinckley and Sinclair (1999)
presented two early prototypes and other mouse pro-
totypes with touch-input have followed (e.g., Villar
et al., 2009; Bartindale et al., 2010; Benko et al., 2010).
Nowadays, a few commercial mice, such as Apple’s
Magic Mouse (2022) and Microsoft’s Arc Mouse (2022)
offer multi-touch sensing. However, these devices still
rely on physical buttons for clicking and dragging. We
focus on performing these primary interactions using
touch gestures only. Researchers have also extended the
mouse beyond a mere input device, using a touchscreen
to provide both touch input and visual output (Yang
et al., 2010; Saidi et al., 2017). We revisit the idea
of a multi-touch mouse by employing a touchscreen-
enhanced prototype, inspired by the work of Benko et al.
(2010) and the LensMouse project (Yang et al., 2010).
Benko et al. presented one of the first studies
of how to merge multi-touch input on a mouse with
cursor movements into purposeful interactions. Their
exploration revealed inherent challenges in designing
effective multi-touch mouse interactions. Taking a
visual route, Yang et al. introduce the LensMouse
and demonstrate how offloading desktop content to its
touchscreen for easy finger manipulation can speed
up desktop work. While both Benko et al. and Yang
et al. propose promising ideas for enhancing desktop
interactions through multi-touch or touchscreen inte-
gration on the mouse, they do not address the execution
of fundamental mouse interactions on such a device.
Addressing this gap, our paper focuses on designing
and evaluating point-and-click and drag interactions
on a mouse-like device equipped with a touchscreen
instead of mechanical click-buttons. We hope to save similar mouse-related projects much of the laborious work of devising touch-input actions that translate to fundamental mouse interactions. Additionally, we seek
to motivate future research by highlighting opportunities
and use cases.
Our multi-touch mouse prototype (MTM) consists
of an off-the-shelf smartphone atop a mouse base. The
mouse base with its sensor provides precise cursor
positioning, while the smartphone enables multi-touch
input and visual output. We chose this simple approach
over more complex methods, like mounting and wiring
a lighter touch surface onto a mouse. We did not use
devices such as Apple Magic Mouse or Microsoft’s Arc
Mouse, as they lack an SDK or supported method for
developing interactions. For our first version, we used
the base of a corded optical mouse (Logitech B100)
and a Samsung Galaxy A3 smartphone (2016 model,
Android OS version 7), mounted at a slight ‘upward’
angle facing the user (similar to the LensMouse). We
deactivated the hardware buttons at the bottom of the
phone, and our software ignored touches within the
bottom 5 cm of the screen to prevent unintended touch-
events while gripping the device. We used this version
in our first two studies on fundamental interactions
(Study 1 point-and-click, Study 2 dragging). Informed
by participants’ feedback, we transitioned to a second
version for our last study. Depicted in Figure 1a, the
updated version has a wireless mouse base (LeadsaiL
optical mouse) and a smaller smartphone (the 2017
Samsung Galaxy A3). More importantly, we mounted
the phone at a 'downward' angle, sloping away from
the user for a more comfortable grip and smoother
finger gestures.
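The dead-zone filtering can be sketched in a few lines of code. The following Python snippet is an illustrative approximation only (the screen height and DPI are placeholder values, and the prototype itself runs as an Android app):

```python
# Illustrative sketch of MTM's touch dead-zone: touches within the bottom
# 5 cm of the phone screen are ignored so the gripping palm cannot trigger
# touch events. Values below are placeholders, not the prototype's specs.

MM_PER_INCH = 25.4
DEAD_ZONE_MM = 50.0  # bottom 5 cm of the screen is ignored


def make_dead_zone_filter(screen_height_px: int, screen_dpi: float):
    """Return a predicate that accepts only touches above the dead zone."""
    px_per_mm = screen_dpi / MM_PER_INCH
    cutoff_y = screen_height_px - DEAD_ZONE_MM * px_per_mm  # y grows downwards

    def accept(touch_y_px: float) -> bool:
        return touch_y_px < cutoff_y

    return accept


# Hypothetical example: a 1280 px tall screen at 312 dpi.
accept = make_dead_zone_filter(screen_height_px=1280, screen_dpi=312.0)
print(accept(200))    # True  -> forwarded to gesture recognition
print(accept(1250))   # False -> ignored (palm region)
```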
We conducted the point-and-click and dragging
studies on a MacBook Pro (macOS version 11–13)
connected to an external 27-inch Apple monitor
(2560 × 1440 pixels). The laptop's monitor was turned
off. We conducted the Depth-Drag study (Study 3) on
the laptop's monitor (2560 × 1600 pixels, 13.3 inches
diagonal) since smaller screens typically necessitate the
layering of windows. We compared participants’ task
performance using the MTM prototype and a standard
mouse, utilizing the same model as the base for MTM.
Our goal is not to exhaustively analyze every potential difference in participants' task performance between the mouse and MTM. Instead, our
studies focus on fundamental point-and-click and drag-
ging interactions as proof-of-concept tests, aimed at
collecting data from tasks as realistic as possible and
covering diverse desktop scenarios. Our analysis em-
phasizes the overall performance differences between
the devices across various task variations.
3 STUDY 1: POINT-AND-CLICK
We start with point-and-click. Taking the familiar
tap-action used on touchscreens, MTM has a Tap-Click
to select: the user touches the MTM’s screen with one
finger (‘mouse button’ press) and swiftly lifts it off
(‘mouse button’ release). We use duration and distance
thresholds to distinguish a Tap-Click from the idle
resting of one finger (or more) on the screen or from
swipe gestures. Tap-Click is triggered if the duration
between the finger-down and finger-up events is less
than 200 ms and the touch movement is shorter than
1 mm. If more than one finger rests on the screen,
Tap-Click must be executed with the leftmost finger
for a left-click. For a right-click, at least two fingers
must touch the screen, and the rightmost must issue the
Tap-Click. Extensive experimenting helped us finalize
these threshold values. Pilot tests led us to use a short vibration (via the smartphone's vibration motor)
to signal a successful click. We compared using MTM
with Tap-Click to a standard mouse in point-and-click
tasks.
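To make the Tap-Click logic concrete, the following Python sketch shows one possible implementation of the thresholds and finger-ordering rules described above. It is an illustrative approximation only (the prototype runs as an Android app, and the data structure here is invented):

```python
# Illustrative sketch of Tap-Click classification (not the prototype's code).
from dataclasses import dataclass
from typing import List, Optional

TAP_MAX_DURATION_MS = 200   # finger-down to finger-up must be shorter
TAP_MAX_TRAVEL_MM = 1.0     # allowed finger movement during the tap


@dataclass
class Touch:
    finger_id: int
    x_mm: float                # horizontal position on the touch surface
    down_ms: float             # time of the finger-down event
    up_ms: Optional[float]     # None while the finger is still down
    travel_mm: float           # distance moved since finger-down


def classify_tap(lifted: Touch, on_surface: List[Touch]) -> Optional[str]:
    """Return 'left-click', 'right-click', or None for the lifted finger."""
    if lifted.up_ms is None:
        return None
    if lifted.up_ms - lifted.down_ms >= TAP_MAX_DURATION_MS:
        return None            # too slow: probably a resting finger
    if lifted.travel_mm >= TAP_MAX_TRAVEL_MM:
        return None            # moved too far: probably a swipe

    others = [t for t in on_surface if t.finger_id != lifted.finger_id]
    if all(lifted.x_mm <= t.x_mm for t in others):
        return "left-click"    # leftmost finger tapped (also when no others)
    if others and all(lifted.x_mm >= t.x_mm for t in others):
        return "right-click"   # rightmost of two or more fingers tapped
    return None
```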
3.1 Participants and Study Task
Twelve right-handed university students (4 female)
aged 22 to 43 years (mean 32.6, SD 5.5) participated.
Ten reported using a mouse, and two reported using a
touchpad as their main pointing device.
Participants completed 120 point-and-click trials
with both a mouse and Tap-Click in counterbalanced
order. Each trial involved clicking inside a green start
circle (diameter 16 mm), then moving the cursor to
click inside a blue target circle. Target sizes (2, 4,
10, or 20 mm) and distances (75, 100, or 300 mm)
were chosen to represent common macOS interface
elements such as window buttons, dialog buttons, and
application icons. The longest distance (300 mm)
represented less frequent, long-range clicks (the study monitor was 600 mm wide), while the shorter distances (75 and 100 mm) represented more common, shorter-range clicks.
Timing started with a release event inside the start
(i.e., mouse button release and finger-up with Tap-
Click), and ended with a second release event anywhere
outside the start. Start-target pairs were positioned,
with the start to the left of the target in 50% of the trials
(random order). We instructed participants to complete
trials as fast and as accurately as possible.
Before the timed trials, participants adjusted pre-
ferred cursor speed in the macOS settings and performed
20 practice trials per technique. We enforced breaks
between blocks of 12 trials (covering all size-distance
combinations in random order). To start the next block,
participants had to simultaneously press two keys at the
opposite ends of the keyboard, marked with blue and
red stickers. This mimicked the transition between
keyboard and mouse use in typical desktop scenarios.
A session lasted about one hour and ended with a
short questionnaire, which included the participant’s
preference and rating of how easy it was to use each
technique (from 1 to 10, 10 being very easy).
3.2 Results
The two techniques were on par regarding errors:
overall, participants clicked outside the target in 78
trials (5%) when using the mouse and in 81 (6%) trials
with Tap-Click.
To analyze the trial time, we removed all error
trials. A successful trial lasted on average 1.01 s (SD 0.3) with the mouse and 1.27 s (SD 0.4) with Tap-Click (t(11) = 13.30, p < 0.001). This indicates that
participants quickly adapted to MTM’s alternative click
mechanisms with minimal practice. While Tap-Click
had error rates similar to the mouse, it was slightly
slower. Out of 12 participants, four preferred Tap-Click
over the mouse. However, the mouse scored an average
rating of 8.96 (SD 0.8), and Tap-Click scored 7.12 (SD 1.8). A Wilcoxon test showed that the difference was
significant (Z = 2.40, p < 0.05).
4 STUDY 2: DRAGGING
Dragging with a mouse involves three steps: grabbing
an object by pressing the mouse button, moving the
mouse to the desired location, and releasing the button.
The same steps apply to MTM, with two alternatives
explored: Double-Tap-Hold (DTH) and Two-Finger-
Swipe (TFS). DTH is similar to dragging on trackpads.
To grab an object, the user makes a 'one-and-a-half' tap with one finger on the MTM surface, i.e., keeping
the finger on the surface after the second touch. Like
Tap-Click in Study 1, duration and distance thresholds
identify the first tap in DTH. Exceeding the thresholds
cancels DTH. After a correct grab, the user moves the
cursor to the destination while keeping the finger on the
touch surface. Lifting the finger releases the object.
Multiple fingers can rest on the surface, but the leftmost
finger must perform the actions.
With Two-Finger-Swipe (TFS), the user grabs
an object by swiping downward with two fingers
simultaneously and keeping them at the end position.
As with Tap-Click in Study 1, we employ swipe
distance and direction thresholds to identify correct
swipes in TFS. Any threshold violations cancel TFS.
The user releases the object at the destination by lifting
off both fingers from the surface. Lifting only one
finger does not release the object.
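The grab-and-release logic of Two-Finger-Swipe can be summarized as a small state machine. The Python sketch below is an illustrative approximation; the minimum swipe distance is an assumed placeholder, not the threshold used in the study:

```python
# Illustrative state machine for Two-Finger-Swipe (TFS) dragging.
from typing import Dict, Set

SWIPE_MIN_DOWNWARD_MM = 8.0   # assumed placeholder for the grab threshold


class TwoFingerSwipeDrag:
    def __init__(self) -> None:
        self.fingers_down: Set[int] = set()
        self.downward_travel: Dict[int, float] = {}
        self.dragging = False

    def finger_down(self, finger_id: int) -> None:
        self.fingers_down.add(finger_id)
        self.downward_travel[finger_id] = 0.0

    def finger_move(self, finger_id: int, dy_mm: float) -> None:
        # Positive dy_mm means the finger moved downwards (towards the user).
        self.downward_travel[finger_id] = (
            self.downward_travel.get(finger_id, 0.0) + dy_mm
        )
        if (not self.dragging
                and len(self.fingers_down) == 2
                and all(self.downward_travel.get(f, 0.0) >= SWIPE_MIN_DOWNWARD_MM
                        for f in self.fingers_down)):
            self.dragging = True   # object under the cursor is grabbed

    def finger_up(self, finger_id: int) -> bool:
        """Return True when the grabbed object is released (dropped)."""
        self.fingers_down.discard(finger_id)
        if self.dragging and not self.fingers_down:
            self.dragging = False  # both fingers lifted: drop the object
            return True
        return False               # lifting only one finger keeps the grab
```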
Dragging serves various purposes. Some tasks,
like moving files between folders, demand minimal
precision. Others, such as adjusting slider knobs or
highlighting text, require finer cursor control. High
precision throughout cursor movement is crucial for
tasks like tracing shapes in drawing applications. Our
study incorporates three different tasks to cover these
variations, while the fourth exemplifies combining
touch-input actions into interactions not possible with a
mouse and its buttons.
4.1 Participants and Study Tasks
Twelve right-handed university students (6 female)
aged 22 to 40 years (mean 29.3, SD 4.8) participated.
All used a mouse as their main pointing device. Four
had prior experience using MTM from Study 1.
Figure 2: The four dragging tasks in the study: a) the box task, b) the bar task, c) the tunnel task, and d) the peek task.
The box task in Figure 2a simulates situations
requiring minimal precision during grabbing, drag-
ging, and dropping, such as dragging icons between
windows or moving windows. Participants move a blue
square (representing the object) to a much larger gray
destination area. The chosen object sizes (Figure 2a)
represent a medium-sized file icon in macOS and the
height of a floating video playback window (the largest
draggable object we found in macOS). Movement is
either straight or diagonal. The software randomly
selects one of the intercardinal compass directions for
diagonal trials and, for straight movement trials, one
of the cardinal directions. The object-destination pair
appears at random positions.
Timing starts when the cursor enters the object the
last time before a grab event, and ends at object release
while the cursor is inside the destination. Each block
contains eight trials (2 object sizes × 2 destination sizes × 2 movement categories) in random order. Each participant completed 5 blocks for a total of 40 trials with each technique, mouse, DTH, and TFS (counterbalanced order).
The bar task (Figure 2b) simulates situations that
require precise object grabbing and dropping, such as
aligning window borders or highlighting text. A small
blue rectangle represents a narrow interface object (a
window border or the space between words), and a
gray rectangle represents an exact destination. The
object sizes correspond to the spacing between words
set in Arial 12pt and the size of a slider knob in macOS.
To represent a high-precision task, the destination’s
constraining dimension is 1mm larger than the width or
height of the largest object. The object-destination pair
appears at a random position on the screen. Vertical
trials had a 50 mm distance, while horizontal trials, due to a coding error, had a 73–78 mm distance.
Timing begins when the cursor last enters the
object before a grab event and ends when the object is
released inside the destination. Incorrect trials (when the object is dropped outside the destination) are repeated randomly within the current block. Each block contains eight trials (2 object sizes × 2 destination sizes × 2 movement axes). Each participant completed 5 blocks,
totaling 40 trials with each technique, mouse, DTH, and
TFS (counterbalanced order).
The tunnel task (Figure 2c) represents scenarios
where precision is required in cursor traversal, such as
tracing objects in drawing applications. The tunnel is
either horizontal or vertical and appears randomly on
the screen. Timing starts when the cursor enters the
tunnel and ends upon exit. Each participant completed
five blocks of four trials (2 tunnel widths × 2 movement axes) with each technique, mouse, DTH, and TFS
(counterbalanced order).
The peek task (Figure 2d) illustrates how a multi-touch surface provides opportunities to devise new interactions for common desktop tasks. Users often need to temporarily resize a window to view content hidden behind it or to see more of its content. To peek at
content, the user grabs the window border and drags it
until the desired content appears. At this point, the user
can release the border to keep the new window size or
drag the border back to its original location.
As shown in Figure 2d, the peek task consists of
three steps: 1) the participant grabs the blue object
(the window border), 2) drags it towards the opposite
side of the yellow curtain (the window), which shrinks
as the cursor moves, and 3) stops dragging when the
green stop area becomes visible (revealing the content
beneath). The participant then returns the blue object to
the gray start area, either by dragging it back when
using the mouse or by issuing an upward swipe with
TFS or DTH. Timing starts when the cursor last enters
the object before a grab event and ends when the object
is released. The drag movement is either horizontal or
vertical. The scene appeared at a random position on the
screen. If the green stop area is never visible throughout
the trial, the trial is repeated at a random position in the
current block. Each participant completed five blocks
of four trials (2 sizes of the stop area × 2 movement
axes) with each technique.
We organized a session (lasting about one hour) in
four parts. All participants began with the bar task,
followed by the tunnel task, then the box task, and
finished with the peek task, which required familiariza-
tion with the revert interaction. Before the timed trials,
participants set their preferred cursor speed. Before
beginning a new task type, we instructed the participant
on performing the task with each technique. After that,
the participant completed as many practice trials as
needed to learn the current technique for the task. We
counterbalanced technique orders among participants
within each task type. After completing each task type,
participants answered a short questionnaire and rated
each technique’s ease of use in that task (on a 1 to 10
scale, 10 being very easy).
4.2 Results
Table 1 presents the results for the dragging tasks.
As in our previous analyses, we focus on an ‘overall’
comparison of the devices. For the time analysis, we
removed the error trials and outliers in each unique task
constellation. When a learning effect was observed
with any of the techniques, we excluded the affected
blocks for all techniques. We ensured the distribution of
the remaining data was close to normal before applying
an RM-ANOVA.
In the box task, the error rate was 0.5% (mouse),
4.7% (DTH), and 0.8% (TFS). We observed a learning
effect in the first block for all techniques, hence excluded
it from analysis. Error rates in the bar task were
4.4% (mouse), 7.5% (DTH), and 3.7% (TFS), with
no learning effect observed. For the tunnel task, we
excluded trials where participants crossed a tunnel
border. This occurred in 19.2% (mouse), 20.7% (DTH)
and 18.3% (TFS) of the trials. We saw that the mean trial
time in the first block with TFS was longer than in
the other blocks, indicating a learning effect. The first
block was excluded for all techniques.
The error rate in the peek task was 1.0% (mouse),
11.5% (DTH), and 7.2% (TFS). We noticed a longer
mean trial time with DTH in the first block. We also
saw decreasing times in the last two blocks with TFS,
and a drop after the second block with the mouse.
Thus, we only used data from the last two blocks.
After the RM-ANOVA (Table 1), a Bonferroni post-hoc test showed that only the mouse and DTH differed significantly.
The results show that the mouse outperformed MTM
techniques in the quick box task, which required mini-
mal precision. While the mouse had a slight advantage
in the precise bar task, the MTM techniques matched its
performance in the tunnel task. The peek task, introduc-
ing a novel MTM interaction, had high error rates but
was regarded as fun and practical by participants. With
practice, users could become more comfortable with the
reverting gesture and better integrate MTM interactions
like peek into desktop workflows.
5 DEPTH-DRAG
Table 1: Analysis results of time and rating of the four dragging tasks.
Task | Metric | DTH | TFS | Mouse | Statistical Significance
Box | Mean Time | 1.16 s (SD 0.2) | 1.16 s (SD 0.1) | 0.57 s (SD 0.1) | F(2,22) = 248.69, p < 0.001, η² = 0.96
Box | Mean Rate | 8.25 (SD 1.4) | 8.50 (SD 0.8) | 9.75 (SD 0.6) | χ²(2) = 11.53, p < 0.05
Bar | Mean Time | 2.1 s (SD 0.4) | 2.3 s (SD 0.3) | 1.4 s (SD 0.2) | F(2,22) = 4.35, p < 0.05, η² = 0.75
Bar | Mean Rate | 7.50 (SD 1.7) | 7.33 (SD 1.6) | 9.45 (SD 0.8) | χ²(2) = 12.04, p < 0.05
Tunnel | Mean Time | 1.50 s (SD 0.8) | 1.50 s (SD 0.7) | 1.52 s (SD 0.9) | No significant difference found
Tunnel | Mean Rate | 8.08 (SD 1.4) | 7.75 (SD 1.7) | 8.83 (SD 1.3) | No significant difference found
Peek | Mean Time | 2.22 s (SD 0.6) | 2.17 s (SD 0.5) | 1.96 s (SD 0.4) | F(2,22) = 5.77, p < 0.05, η² = 0.34
Peek | Mean Rate | 8.5 (SD 1.0) | 7.8 (SD 1.2) | 9.4 (SD 0.9) | No significant difference found

Desktop and mouse users often have numerous windows open simultaneously, as many tasks require access to
various applications and their content. Studies have
shown that having eight or more windows open is
common and that window-switching is frequent and
time-consuming (Smith et al., 2003; Hutchings et al.,
2004; Warr et al., 2016). Ideally, with all windows
visible, window-switching becomes a trivial point-and-
click task. However, limited screen space, particularly
on laptops, often leads to overlapping or maximized
windows stacked in layers. This makes direct switching
with a single click impossible.
Most desktop systems provide dedicated window-
switching mechanisms, such as keyboard shortcuts
(Alt/Cmd+Tab), taskbars with icons for open applica-
tions (or windows), and hotkeys or ‘hot screen corners’
that display clickable proxies for open windows. The
importance of effective window-switching is evident
in extensive previous research on novel approaches
to window-switching (e.g., Smith et al., 2003; Warr
et al., 2016; Xu and Casiez, 2010; Tak et al., 2009,
2011; Sato et al., 2020) and window management (e.g.,
Tashman, 2006; Ahlström et al., 2009; Zeidler et al.,
2013; Tashman and Edwards, 2012). Here, we see
opportunities for a multi-touch-enhanced mouse and designed an MTM interaction for situations that involve copying or cutting and pasting content between windows. With
many windows open, such situations often demand ex-
tensive window-switching (to find the desired windows)
or cumbersome re-arrangement of windows (to make
the content visible). Typical scenarios include copying
text from a web page to a text document, pasting cell
content between spreadsheets, and moving files between
file browser windows.
Depth-Drag is simple. It partly relies on the taskbar
and works as follows. Assume a file-moving scenario
where the user wants to copy files from different file
browser windows (i.e., folders) into one common
destination folder (open in its own file browser window).
All windows are maximized and stacked, with the
destination window at the top of the stack. The user
first moves the cursor to click (with a tap on the touch
surface) the desired window icon in the taskbar to
activate the corresponding file browser window, which
hosts the desired file. With the window active, the user
moves the cursor to the desired file icon and ‘grabs’
it by moving two fingers simultaneously in a short
downward swipe gesture over the touch surface. At
the end of the swipe, when the fingers depart from the
touch surface, the second window in the stack (i.e., the
file browser window displaying the destination folder)
is automatically brought to the front, and the grabbed
file is dropped inside.
A second version of Depth-Drag uses MTM’s visual
output capability (through its touchscreen). We call it
Depth-Drag-Visual. With the visual version, the user
taps with two fingers on the MTM to display a grid
with the window icons from the taskbar on the MTM’s
touchscreen, as visualized far right in Figure 3. A tap
on the desired icon forwards the corresponding window
to the front, and the user can continue the task the same
way as when using the non-visual Depth-Drag. This
way, the user can save long cursor movements to and
from the taskbar.
The Depth-Drag interaction also allows for naviga-
tion through layers of stacked windows – hence, its
name. If the user briefly pauses after the downward
two-finger swipe and slides them upwards without
lifting off the fingers, hidden windows in the stack
appear up front, one replacing the other as the user
slides further. Sliding down again brings back the
previously shown windows, one after the other, while
the user continues downwards.
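The layer navigation can be thought of as mapping the accumulated two-finger slide distance onto an index into the window stack. The Python sketch below is an illustrative approximation; the window names and the per-step slide distance are assumptions, not values from the prototype:

```python
# Illustrative sketch of Depth-Drag's layer navigation (not the prototype's code).
from typing import List

STEP_MM = 15.0   # assumed slide distance needed to move one window deeper


class DepthDragNavigator:
    def __init__(self, window_stack: List[str]) -> None:
        self.stack = window_stack       # index 0 = frontmost window
        self.index = 0                  # window currently brought to the front
        self._accumulated_mm = 0.0

    def slide(self, dy_mm: float) -> str:
        """Slide the two fingers while holding the grabbed object.

        Negative dy_mm = sliding up (dig deeper into the stack),
        positive dy_mm = sliding down (return towards the front).
        """
        self._accumulated_mm += dy_mm
        steps = int(-self._accumulated_mm // STEP_MM)
        self.index = max(0, min(len(self.stack) - 1, steps))
        return self.stack[self.index]   # window now shown up front

    def lift_off(self) -> str:
        """Lifting both fingers drops the object into the front window."""
        return self.stack[self.index]


# Hypothetical usage with three stacked windows:
nav = DepthDragNavigator(["Destination", "Source A", "Source B"])
print(nav.slide(-20.0))   # 'Source A' (slid up one step into the stack)
print(nav.slide(-20.0))   # 'Source B' (deeper in the stack)
print(nav.slide(+20.0))   # 'Source A' (sliding down returns)
```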
6 STUDY 3: DEPTH-DRAG
We invited 12 persons (6 female) to participate in a
study that compared Depth-Drag, Depth-Drag-Visual,
and a standard mouse in a file-moving task, similar to
the scenario described above. They were aged 24 to 56 years (mean 36.3, SD 8.4). All used a mouse as their
main pointing device. Four had used MTM in one of
the earlier studies.
Figure 3: To cut-and-paste with the mouse in Study 3, the user: a) clicks on the 'S' icon in the taskbar to activate the source window, then b) right-clicks on the target item and selects 'Cut' from the context menu, then clicks on the 'D' icon to activate the destination window, and c) right-clicks and selects 'Paste', which d) moves the target item to the destination window.
6.1 Study Task and Procedure
Before each trial, the study software displays a “Start
Trial” button in a maximized window. Clicking the
button loads the first task screen and starts timing.
Figure 3 depicts the remaining task screens. The taskbar
at the bottom displays 11 icons, with the window icons 'D' (destination) and 'S' (source) in random positions. The
first screen (Figure 3a) shows the maximized destination
window containing several identically colored square
items (number and color vary per trial). The participant
needs to navigate to the source window to cut a target
item (marked with a star) and paste it back into the
destination window. With the mouse or Depth-Drag, the
participant clicks (one-finger tap in Depth-Drag) on the
‘S’ icon in the taskbar. When using Depth-Drag-Visual,
the taskbar is hidden. The participant taps with two
fingers on the MTM touchscreen to display a grid
representation of the taskbar (Figure 3, far right). A
one-finger tap on the ‘S’ icon activates the source
window (Figure 3b). It contains 5 to 45 items arranged
in a grid (the number varied randomly between trials; their size was always 22×22 mm).
With the Depth-Drag techniques, participants acti-
vate the source window, move the cursor to the target
item, perform a two-finger downward swipe to ‘select’
it, and finish by lifting off both fingers. The liftoff acti-
vates the destination window and pastes the ‘grabbed’
item, ending the trial (and timing). With the mouse, the
task involves using the context menu to cut and paste
the target item. We acknowledge that expert users often
cut and paste using keyboard shortcuts. However, we
aim to strictly compare the two pointing devices.
Any deviation from this sequence of actions (e.g.,
selecting the wrong icon in the taskbar or grabbing the
wrong item in the source window) marks the trial as an
error and ends it. Error trials were reloaded at random
positions within the current block.
Each participant completed two series of three trial
blocks, each ending after 20 error-free trials. A series
included one block per technique, with the order coun-
terbalanced across participants. Participants received
instructions and practice trials with all techniques
before starting with the first series of timed blocks:
Tech-1 (practice+trials), Tech-2 (practice+trials), and
Tech-3 (practice+trials). Participants answered NASA
TLX (Hart and Staveland, 1988) questions for each tech-
nique after its block of trials in the second series: Tech-1
(trials+NASA-TLX), Tech-2 (trials+NASA-TLX), and
Tech-3 (trials+NASA-TLX).
6.2 Results: Error Rate and Trial Time
Participants made errors in 54, 87, and 86 trials with the
mouse, Depth-Drag, and Depth-Drag-Visual, resulting
in error rates of 10.1%, 15.3%, and 15.2%, respectively.
We removed error trials from subsequent analyses.
Participants could quickly pick up both Depth-Drag
techniques. However, they were notably slow in the
first timed trial with each technique and in Trial 21
(starting the second series) due to adjusting to another
technique. Hence, we excluded Trial 1 and Trial 21
from all participants in our time analyses and confirmed
the approximate normality of the remaining data.
The overall mean times were 6.67 s (SD 0.5) with the mouse, 3.52 s (SD 0.3) with Depth-Drag, and 4.59 s (SD 0.6) with Depth-Drag-Visual. An RM-ANOVA revealed a significant effect for technique (F(2,22) = 219.50, p < 0.001, η² = 0.95), with Bonferroni-adjusted post-hoc tests confirming significant differences between all pairs (all ps < 0.001).
Participants were 23% faster with Depth-Drag than
with Depth-Drag-Visual, and they were 47% faster with
Depth-Drag compared to the mouse. They were 31%
faster with Depth-Drag-Visual than with the mouse.
While we anticipated better performance with the Depth-
Drag techniques, the magnitude of the improvement
exceeded expectations.
6.3 Workload Ratings
Comparing our participants’ subjective workload ratings
across the six NASA-TLX dimensions (mental demand, physical demand, temporal demand, effort, frustration, and performance) revealed no significant differences for
most dimensions among the three techniques. However,
a Friedman test showed a significant difference in the
mental demand ratings (χ² = 7.70, p < 0.05). Post-hoc Wilcoxon signed-rank tests with Bonferroni adjustment identified the difference between Depth-Drag-Visual and the mouse (Z = 2.54, p < 0.05), with Depth-Drag-Visual receiving higher mental workload ratings.
This result resonates with participants’ feedback and
comments during the study. Almost all participants
noted that frequently switching visual attention between
the desktop monitor and MTM’s touchscreen was
mentally taxing: glancing at the ‘mouse hand’ felt
unnatural and challenging to remember. A few even voiced frustration over the need to shift focus constantly.
This opinion, however, did not manifest itself in the
NASA-TLX frustration ratings.
7 DISCUSSION
We first discuss and highlight our three studies’ main
results and conclusions, along with thoughts on im-
proving the tested MTM interactions. After that, we
present our ideas regarding possible future routes for
our multi-touch-enhanced mouse.
7.1 Findings
Given the mouse’s unmatched ubiquity, the results of
our first two studies on fundamental point-and-click
and dragging interactions did not surprise us.
Point-and-click is the most basic and rapid mouse
interaction and an office worker points and clicks
countless times during a workday. Accordingly, there is
not much room for improvement. However, the Tap-
Click we used in Study 1 is also rapid, and many people
are skilled in tapping on their smartphones. We suspect
that a more ergonomic multi-touch device than our
MTM prototype can be as effective in point-and-click as a mouse.
In the box and bar dragging tasks, the mouse
had several specific advantages that contributed to
its significant performance lead over MTM. First,
the point-and-click study revealed that the Tap-Click
gesture on the MTM did not match the mouse click.
As expected, an MTM gesture (Double-Tap-Hold) that is more complicated than Tap-Click performs much worse than
the simpler mouse technique (press to grab). Second,
grabbing with MTM required a short mental pause
from participants to perform the correct grab technique,
while pressing the mouse button to grab is the least
demanding action on the mouse. Lastly, the mouse
allowed the users to grab the object while passing over
it, a capability not replicated with MTM. The tunnel
task results demonstrate that, when carefully maneuvering the two devices, differences in weight or shape did not influence the trial time.
The results of the peek task were the most unex-
pected, as we had assumed the revert gesture would provide
a clear advantage to MTM. Upon closer examina-
tion, we found that the revert gesture in Double-Tap-
Hold introduced additional cognitive load compared
to Two-Finger-Swipe. While swiping up to revert
in Two-Finger-Swipe aligns with the natural move-
ment of the fingers, performing the same gesture in
Double-Tap-Hold requires a different cognitive pro-
cess. Additionally, the physical positioning of the
fingers during the revert gesture posed challenges for
participants, who expressed this issue with statements
such as “The revert gesture was not intuitive” and “It
was confusing whether to swipe up or lift the active
finger.” The results did not show a difference between
the mouse and Two-Finger-Swipe, and we believe
performance with this technique can reach its potential
after more practice and familiarization with the revert
gesture. Our observations support this: some
participants paused briefly in the stop area to recall the
revert gesture.
Participants were 31% faster with Depth-Drag-
Visual than with the mouse. We expected to see better
performance with the two Depth-Drag techniques than
with the mouse. The MTM provides a direct drag
using one gesture, while the mouse requires two steps.
However, we did not expect such large differences.
The notable time advantage of Depth-Drag over Depth-Drag-Visual is intriguing. Since both techniques employ identical dragging gestures, the time difference stems from the window-switching methods. Although Depth-Drag-Visual saved the travel to the taskbar, the special gesture to activate the taskbar grid, looking at the MTM screen, and selecting an icon significantly impeded participants. Participants also expressed discomfort and annoyance with Depth-Drag-Visual due to the constant need to switch visual
focus between the monitor and the MTM screen.
With our studies, we have shown that it is possible
to combine sequences of touch gestures performed
on top of a mouse into interactions mapped to the
fundamental desktop interactions, point-and-click and
dragging. Our Depth-Drag interaction also demonstrates
that sequences of multi-touch gestures can be tailored
into new interactions that are impossible to perform
with a standard mouse and its two buttons.
7.2 Future Work
Our ideas for future work include exploring ‘bounded’
touch interactions, scroll and pan-and-zoom function-
alities, and mode switching.
Bounded Touch Interactions. The current MTM
interactions we tested can all be performed at any
position on the screen (excluding the bottom part,
which we deactivated to avoid unintended contact with
the palm of the hand). Nevertheless, having invisible
tapping areas on the surface opens up yet another design
space for a multi-buttoned mouse with customizable
functionalities for the buttons. This idea expands on the
LensMouse’s (Yang et al., 2010) feature to host visual
contextual buttons but without the drawback of forcing
the user to look at the visual display on the pointing
device (which contributed to the mediocre performance
of Depth-Drag-Visual in our third study). We plan
to investigate the use of invisible buttons and how
accurately users can tap different areas on the screen
without looking. We predict seeing good results since
people are good at relying on their muscle memory.
Leveraging the touch surface’s physical boundaries
is another promising way to further expand the input
vocabulary with interactions starting from the edge
of the surface, commonly seen on smartphones and
smartwatches (Roth and Turner, 2009; Serrano et al.,
2013; Ahn et al., 2018).
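As a simple illustration of such invisible tapping areas, the Python sketch below maps eyes-free tap positions to named regions; the region names and bounds are invented for illustration only:

```python
# Illustrative hit-testing of 'invisible button' regions on the touch surface.
from typing import Dict, Optional, Tuple

# Regions as axis-aligned rectangles in millimetres: (x0, y0, x1, y1).
# Names and bounds are invented placeholders.
INVISIBLE_BUTTONS: Dict[str, Tuple[float, float, float, float]] = {
    "back":    (0.0,  0.0, 20.0, 30.0),   # left strip of the surface
    "forward": (40.0, 0.0, 60.0, 30.0),   # right strip of the surface
}


def hit_test(x_mm: float, y_mm: float) -> Optional[str]:
    """Map an eyes-free tap position to an invisible button, if any."""
    for name, (x0, y0, x1, y1) in INVISIBLE_BUTTONS.items():
        if x0 <= x_mm <= x1 and y0 <= y_mm <= y1:
            return name
    return None


print(hit_test(10.0, 15.0))   # 'back'
print(hit_test(30.0, 15.0))   # None (outside all regions)
```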
Scroll, Pan, and Zoom. We plan to study scrolling,
panning, and zooming with our multi-touch mouse.
These interactions are commonly used for navigating
content, such as text documents, maps, or images.
Scroll, pan, and zoom can be designed as multi-touch
gestures performed anywhere on the MTM touch
surface. Moreover, we see great potential in binding
these interactions to specific regions or edges of the
surface. For instance, while vertically swiping along
the edge of the MTM could invoke zoom, swiping
elsewhere could scroll. We also envision using surface
regions to configure interactions: swiping on the left
side enables quick long-distance scrolling (using a
high-gain transfer function), whereas swipes on the
right side are used for high-precision positioning (using
a function with a low gain).
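A minimal sketch of such region-dependent transfer functions, with assumed gain values and an assumed split of the surface into halves, could look as follows:

```python
# Illustrative region-bound scrolling gains (values are assumptions only).
SURFACE_WIDTH_MM = 60.0   # hypothetical touch-surface width
HIGH_GAIN = 20.0          # left half: fast, long-distance scrolling
LOW_GAIN = 2.0            # right half: slow, high-precision scrolling


def scroll_amount(swipe_x_mm: float, swipe_dy_mm: float) -> float:
    """Convert a vertical swipe into scroll output, depending on swipe side."""
    gain = HIGH_GAIN if swipe_x_mm < SURFACE_WIDTH_MM / 2 else LOW_GAIN
    return gain * swipe_dy_mm


print(scroll_amount(10.0, 3.0))   # left side:  60.0
print(scroll_amount(50.0, 3.0))   # right side:  6.0
```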
Mode Switching. Multi-touch capabilities provide
another area for exploration. We envision utilizing
gestures performed with a secondary finger, such as
the middle finger, to switch between modes while
performing interactions with the index finger. For
example, in a spreadsheet application, the user can
swipe ‘in’ from the right bezel with the middle finger to
copy the formula in a cell. While holding the finger on
the screen, the user can move the cursor to different
cells and repeat a paste gesture with the index finger.
8 CONCLUSION
We have presented a multi-touch-enhanced mouse proto-
type. Through two user studies, we have demonstrated
that our prototype, a smartphone attached to the base of
a mouse, performs comparably to a standard mouse
in point-and-click and dragging tasks where the user
issues touch gestures instead of clicking with hardware
buttons. Our third study highlights the potential of
joining multi-touch with precise cursor positioning.
Through a combination of finger movements on the
touch surface, the Depth-Drag interaction allows users
to quickly move objects between layers of desktop win-
dows, such as when moving files between file browser
windows or when moving text between documents.
With little practice, our participants were up to 47%
faster with Depth-Drag than with a regular mouse.
Our research on the fundamental desktop interactions
for touch-based mouse input provides the necessary
foundation to support and push development further in
this direction.
REFERENCES
Ahlström, D., Großmann, J., Tak, S., and Hitz, M. (2009).
Exploring new window manipulation techniques. In
OzCHI ’09, pages 177–183. ACM Press.
Ahlström, D., Hitz, M., and Leitner, G. (2006). An evaluation
of sticky and force enhanced targets in multi-target
situations. In NordiCHI’06, pages 58–67. ACM.
Ahmad, B. I., Langdon, P. M., Bunch, P., and Godsill, S. J.
(2014). Probabilistic intentionality prediction for target
selection based on partial cursor tracks. In UAHCI 2014,
pages 427–438. Springer.
Ahn, S., Lee, J., Park, K., and Lee, G. (2018). Evaluation of
edge-based interaction on a square smartwatch. Int. J.
Hum. Comput. Stud., 109:68–78.
Apple Inc. (2022). Magic mouse-white multi-touch surface.
Bartindale, T., Harrison, C., Olivier, P., and Hudson, S. E.
(2010). Surfacemouse: Supplementing multi-touch
interaction with a virtual mouse. In TEI ’11, page
293–296. ACM Press.
Benko, H., Izadi, S., Wilson, A. D., Cao, X., Rosenfeld, D.,
and Hinckley, K. (2010). Design and evaluation of
interaction models for multi-touch mice. In Graph.
Interface, pages 253–260. Canadian Human-Computer
Communications Society.
Casiez, G., Vogel, D., Balakrishnan, R., and Cockburn, A.
(2008). The impact of control-display gain on user
performance in pointing tasks. Hum.-Comput. Interact.,
23(3):215–250.
Conroy, E., Toth, A. J., and Campbell, M. J. (2022). The
effect of computer mouse mass on target acquisition
performance among action video gamers. Appl. Ergon.,
99:103637.
Franz, J., Menin, A., and Nedel, L. (2016). Lossless multi-
tasking: Using 3d gestures embedded in mouse devices.
In IEEE SVAR, pages 109–116. IEEE.
Froehlich, B., Hochstrate, J., Skuk, V., and Huckauf, A. (2006).
The globefish and the globemouse: Two new six degree
of freedom input devices for graphics applications. In
CHI ’06, pages 191–199. ACM Press.
Grossman, T. and Balakrishnan, R. (2005). The bubble cursor:
Enhancing target acquisition by dynamic resizing of the
cursor’s activation area. In CHI ’05, page 281–290.
ACM Press.
Hart, S. G. and Staveland, L. E. (1988). Development of
nasa-tlx (task load index): Results of empirical and
theoretical research. In Adv. Psychol., volume 52, pages
139–183. Elsevier.
Hinckley, K. and Sinclair, M. (1999). Touch-sensing input
devices. In CHI ’99, pages 223–230. ACM Press.
Hinckley, K., Sinclair, M., Hanson, E., Szeliski, R., and
Conway, M. (1999). The videomouse: A camera-based
multi-degree-of-freedom input device. In UIST ’99,
pages 103–112. ACM Press.
Hourcade, J. P., Crowther, M., and Hunt, L. (2007). Does
mouse size affect study and evaluation results? a study
comparing preschool children’s performance with small
and regular-sized mice. In IDC ’07, pages 109–116.
ACM Press.
Hutchings, D. R., Smith, G., Meyers, B., Czerwinski, M., and
Robertson, G. (2004). Display space usage and window
management operation comparisons between single
monitor and multiple monitor users. In AVI ’04, pages
32–39. ACM Press.
Kim, S., Lee, B., Van Gemert, T., and Oulasvirta, A. (2020).
Optimal sensor position for a computer mouse. In CHI
’20, pages 1–13. ACM Press.
Martin, B. and Raisamo, R. (2004). Trackmouse: A new
solution for 2+2d interactions. In NordiCHI ’04, pages
89–92. ACM Press.
McGuffin, M. and Balakrishnan, R. (2002). Acquisition of
expanding targets. In CHI ’02, pages 57–64. ACM.
Microsoft Corporation (2022). Microsoft arc mouse.
Mistry, P. and Maes, P. (2011). Mouseless: A computer mouse
as small as invisible. In CHI EA ’11, pages 1099–1104.
ACM Press.
Ortega, M. and Nigay, L. (2009). Airmouse: Finger gesture for
2d and 3d interaction. In Human-Computer Interaction
INTERACT 2009, pages 214–227. Springer.
Perelman, G., Serrano, M., Raynal, M., Picard, C., Derras, M.,
and Dubois, E. (2015). The roly-poly mouse: Designing
a rolling input device unifying 2d and 3d interaction. In
CHI ’15, pages 327–336. ACM Press.
Peters, M. and Ivanoff, J. (1999). Performance asymmetries
in computer mouse control of right-handers, and left-
handers with left- and right-handed mouse experience.
J. Motor Behav., 31(1):86–94.
Roth, V. and Turner, T. (2009). Bezel swipe: Conflict-free
scrolling and multiple selection on mobile touch screen
devices. In CHI ’09, pages 1523–1526. ACM.
Saidi, H., Serrano, M., Irani, P., and Dubois, E. (2017).
Tdome: A touch-enabled 6dof interactive device for
multi-display environments. In CHI ’17, pages 5892–
5904. ACM Press.
Sato, K., Imada, S., Mazume, S., and Nakashima, M. (2020).
A mechanism of window switching prediction based
on user operation history to facilitate multitasking. In
Barolli, L., Nishino, H., Enokido, T., and Takizawa, M.,
editors, Adv. Networked-Based Inf. Syst., pages 154–166.
Springer.
Serrano, M., Lecolinet, E., and Guiard, Y. (2013). Bezel-tap
gestures: Quick activation of commands from sleep
mode on tablets. In CHI ’13, pages 3027–3036. ACM.
Shi, K., Subramanian, S., and Irani, P. (2009). Pressuremove:
Pressure input with mouse movement. In INTERACT
2009, pages 25–39. Springer.
Smith, G., Baudisch, P., Robertson, G., Czerwinski, M.,
Meyers, B., Robbins, D., and Andrews, D. (2003).
GroupBar: The TaskBar evolved. In OzCHI ’03, pages
34–43. ACM Press.
Son, S., Jung, J., Ham, A., and Lee, G. (2023). Touchwheel:
Enabling flick-and-stop interaction on the mouse wheel.
Int. J. Hum.-Comput. Inter., 0(0):1–13.
Tak, S., Cockburn, A., Humm, K., Ahlström, D., Gutwin,
C., and Scarr, J. (2009). Improving window switch-
ing interfaces. In INTERACT 2009, pages 187–200.
Springer.
Tak, S., Scarr, J., Gutwin, C., and Cockburn, A. (2011).
Supporting window switching with spatially consistent
thumbnail zones: Design and evaluation. In INTERACT
2011, pages 331–347. Springer.
Tashman, C. (2006). Windowscape: a task oriented window
manager. In UIST ’06, pages 77–80. ACM Press.
Tashman, C. and Edwards, W. K. (2012). Windowscape:
Lessons learned from a task-centric window manager.
ACM Trans. Comput.-Hum. Interact., 19(1):8:1–8:33.
Taylor, S., Keskin, C., Hilliges, O., Izadi, S., and Helmes,
J. (2014). Type–hover–swipe in 96 bytes: A motion
sensing mechanical keyboard. In CHI ’14, pages
1695–1704. ACM Press.
Ultraleap (2013). Leap motion controller.
Villar, N., Izadi, S., Rosenfeld, D., Benko, H., Helmes, J.,
Westhues, J., Hodges, S., Ofek, E., Butler, A., Cao, X.,
et al. (2009). Mouse 2.0: Multi-touch meets the mouse.
In UIST ’09, pages 33–42. ACM Press.
Warr, A., Chi, E. H., Harris, H., Kuscher, A., Chen, J.,
Flack, R., and Jitkoff, N. (2016). Window shopping: A
study of desktop window switching. In CHI ’16, pages
3335–3338. ACM Press.
Xu, Q. and Casiez, G. (2010). Push-and-pull switching:
Window switching based on window overlapping. In
CHI ’10, pages 1335–1338. ACM Press.
Yang, X.-D., Mak, E., McCallum, D., Irani, P., Cao, X., and
Izadi, S. (2010). Lensmouse: Augmenting the mouse
with an interactive touch display. In CHI ’10, pages
2431–2440. ACM Press.
Zeidler, C., Lutteroth, C., and Weber, G. (2013). An evaluation
of stacking and tiling features within the traditional
desktop metaphor. In Kotzé, P., Marsden, G., Lindgaard,
G., Wesson, J., and Winckler, M., editors, INTERACT
2013, pages 702–719. Springer.