Introduction

It is important to distinguish our own actions and outcomes from those of other individuals in order to successfully interact with the external world and prepare for potential threats. This ability to attribute the origin of an action to oneself may be based on computational models of motor control1,2. An internal forward model may predict the sensory consequences of motor commands, and this predicted sensory information is compared with actual sensory information3,4,5. When predicted and actual sensory consequences match, these sensory events are interpreted as self-generated and an individual will experience a sense of agency for those events6,7,8. Recent investigations have demonstrated that a sense of agency is eliminated when predicted and actual movements are asynchronous9,10, while it is elicited by active hand movements10,11,12. Incongruent positioning of one's hand does not eliminate this sense of agency10; however, the matching process, during which predicted and actual sensory information are compared, remains unknown.

Here, we present a novel illusion, created when individuals watch their own and another's hand motion alternately from a first-person perspective, which successfully extends observers' sense of agency to others' movements (see also Movie S1; in this movie, two observers' hand movement images are switched at various speeds and temporal intervals). In this illusion, observers perceive their own movements and those of others as a single movement. Observers know that they are not performing a united motion, because the motion contains oscillation caused by the switching hand images. Although some researchers have reported that a sense of agency is the result of matching predicted and actual actions11,13, this illusion can arise even with a discrepancy between predicted and actual actions. On the other hand, the appearance of the object moving in synchronization with the observer's motion should not affect the sense of agency, since even the appearance of an avatar like a mouse cursor can induce a user's sense of agency. Thus, the impact of this illusion, which is observed by two individuals, lies in the extension of the sense of agency toward movement that differs from an observer's own predicted movement, rather than toward an object whose appearance is different from the observer's own hand.

Therefore, in the present study, we conducted quantification experiments of our illusion using visuo-motor congruent and incongruent movements to exclude the hand appearance factor. First, in addition to objectively demonstrating this illusion, we show that it arises even if congruent and incongruent movements, considered as obstructive factors for creating a sense of agency, are rapidly switched. Next, we investigated the position where participants perceived their hand while they experienced this illusion.

Results

We first demonstrated this illusion by using a perceptual judgment task, where a sense of agency was subjectively evaluated. Nine participants watched their own arm and hand movements (‘congruent’) and those of an ‘incongruent’ movement (to control for differing hand appearances, participants' own finger movements were recorded before the experiment and these movements were displayed as the incongruent movements) using a head-mounted camera affixed to a desk. The display showed real-time images captured by the camera, which was fixed to the same position as participants' eyes. Participants observed their own congruent and incongruent movements under 4 conditions. In the ‘switching’ condition, where we expected to create the illusion, participants' congruent and incongruent hand images were viewed alternately at 133-ms and 267-ms intervals respectively in each cycle (Fig. 1d). These intervals were determined during preliminary observations (see the next reaching experiment). The other conditions were set up as controls for comparison: ‘only congruent’, where only participants' own real-time movement was displayed (Fig. 1a); ‘only incongruent’, where only the incongruent movement was displayed (Fig. 1b); and ‘blending’, where both movements were superimposed and displayed simultaneously (Fig. 1c).
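The timing of the ‘switching’ condition can be illustrated with a minimal sketch. It assumes that each cycle begins with the congruent image and that the switch is instantaneous; neither detail is stated in the text, and the function names are hypothetical.

```python
def displayed_image(t_ms, congruent_ms=133, incongruent_ms=267):
    """Return which hand image is on screen t_ms milliseconds after onset."""
    cycle_ms = congruent_ms + incongruent_ms  # 400 ms with the default values
    phase = t_ms % cycle_ms
    # Assumption: every cycle starts with the congruent image.
    return "congruent" if phase < congruent_ms else "incongruent"


def congruent_fraction(congruent_ms=133, incongruent_ms=267):
    """Proportion of each cycle during which the congruent image is shown."""
    return congruent_ms / (congruent_ms + incongruent_ms)


# Number of full cycles within the 8-s viewing period of the judgment task.
n_cycles = 8000 / (133 + 267)
```

With these values the congruent image occupies roughly one third of each cycle, so the incongruent movement is visible for most of the viewing period.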

Figure 1

Experimental and control conditions for displaying congruent and incongruent movements.

Photos beside each panel are examples of participants' view during the perceptual judgement (left) and reaching experiments (right).

For each condition, participants were asked to answer to what extent they felt a sense of agency using two forced-choice questions. Question (A) was used to assess whether a participant believed the movement of the displayed hand to be a single movement, while question (B) assessed whether a participant believed that the displayed movement was initiated by himself (see also Methods).

First, a participant put his hands on the table, palms open (Fig. 1). A beeping sound initiated the experiment, which started displaying images of finger movements and allowed participants to begin their own movements. Participants could only move their fingers; any finger movements were allowed. After 8 s, a black background was presented and participants were instructed to answer the questions by pressing the keyboard as soon as possible.

In the ‘switching’ condition, participants formed two groups: 6 participants who might see the congruent and incongruent movements as a single movement (the group that reported seeing the illusion) and 3 participants who saw the movements as belonging to separate people. This result indicates a degree of ambivalence with our illusion; despite knowing that two different movements are being made, some participants still perceived a single continuous movement intuitively. The ‘single movement’ group in the ‘switching’ condition showed no significant differences in their beliefs that the whole movement was their own between the ‘switching’ and ‘only congruent’ conditions. This is despite the fact that the switching condition displayed participants' incongruent movements that were not spatio-temporally identical to their own (Fig. 2a). On the other hand, in the switching condition for the ‘separate movements’ group, participants perceived their own movement within these two movements, just as in the ‘blending’ condition (Fig. 2b). This suggests that participants assigned agency to only one of the displayed movements. One might argue that the ‘single movement’ group was simply focusing their attention on the time window featuring congruent movements, since congruent hand movements are presumably more salient than incongruent ones. A recent study actually suggests that a visual motion stimulus which is consistent with observers' own hand motion tends to reach awareness more readily than an inconsistent visual stimulus14. To further test this possibility, we conducted an experiment whereby a participant judged whether he/she perceived continuity between congruent and incongruent hand movements in various switching conditions (see supplementary material). Fig. S1 shows that all participants perceived continuity between congruent and incongruent hand movements in the switching condition in which participants' congruent and incongruent hand images were viewed alternately at 133 ms and 267 ms intervals respectively in each cycle. Combined with this result, it seems that the ‘single movement’ group in the perceptual judgment experiment saw the congruent and incongruent movements as a single and continuous movement.

Figure 2

Perceptual judgment experimental results.

Error bars denote SD. Participants were classified into two groups. One (Group A) denotes participants who saw a single movement in the switching condition and another (Group B) denotes participants who saw two movements in the switching condition. (a) Shows ‘the conditional probabilities of stating a feeling of agency in the perceived movement when seeing a single movement’ in each condition among Group A (n = 6). This probability was interpreted as ‘not calculated’ if the probability that those participants stated seeing a single movement was below 20%. A one-way analysis of variance on the mean of these conditional probabilities was significant (F(2,10) = 125.61, P < .01). Ryan's method indicated that the probability in the ‘only incongruent’ condition (mean = 10.44, SD = 16.92) was significantly smaller than that in the ‘only congruent’ condition (mean = 99.17, SD = 1.86, P < .01) and the ‘switching’ condition (mean = 90.60, SE = 8.38, P < .01). There was no significant difference between the ‘only congruent’ and ‘switching’ conditions (P = 0.20). (b) Shows ‘the conditional probabilities of stating a feeling of agency in perceived movements when seeing two movements’ in each condition among Group B (n = 3). This probability was interpreted as ‘not calculated’ if the probability that those participants stated seeing two movements was below 20%. An ANOVA indicated that the mean probabilities were significantly different (χ2(1) = 4.35, P < .05). The probability in the ‘blending’ condition was greater than in the ‘switching’ condition (Scheffe's method, P < .05).

Our results indicate that alternating the images not only made some participants perceive their own congruent and incongruent movements as a single movement but also made them attribute a sense of agency to an incongruent movement. However, in the blending condition, participants appeared to recognize these as two different movements, suggesting that a sense of agency can be attributed to a single entity; thus, participants already have a sense of agency for their own congruent hand while ignoring the incongruent hand. Conversely, participants might not have felt any agency for either hand because they saw two hands moving. Indeed, our results indicate that alternating images extends participants' sense of agency to incongruent movements. However, at what position/location do participants perceive their hand?

To investigate the above question, participants were asked to state their hand position by performing a ballistic open-loop pointing action. Their own hand and a fake hand were placed at different positions on a table and then a reaching target appeared to the front. Participants had to judge where to point based on where they believed their hand to be in relation to the target. If participants did not see the two movements as a single movement in the ‘switching’ condition, they would point towards the target from the position of their actual hand. We used the same conditions mentioned above to test this possibility.

The same apparatus was used in this experiment and the fake hand movement was created by shifting real-time images of participants' own hands seven degrees to the left or right (hereinafter referred to as the ‘incongruent’ position, Fig. 1). First, participants were required to oscillate their hand from side to side within 200 mm (22 deg) at 2 Hz for 4 s without seeing anything (a black screen on the display). After that, the hand movement seen during another 4 s corresponded to one of the four visual conditions shown in Fig. 1. Next, participants stopped moving and placed their hand at the centre of their desk. The hand image disappeared and a black background with a white square target was presented on a display 50 mm away from the centre of their own and the incongruent hand. Participants needed to reach for the target with their index finger as soon as possible without viewing the hand(s). If participants believed their hand to be in its actual position, they would reach towards the right when the incongruent hand was on the right and towards the left if they believed their hand to be in the incongruent hand position.
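The relation between the angular and metric values quoted above can be checked with a short sketch. It assumes simple flat-plane geometry at the 500-mm viewing distance given in the Apparatus section; because the hands rested on a table rather than on a fronto-parallel plane, the results are approximations only, and the function names are hypothetical.

```python
import math

VIEWING_DISTANCE_MM = 500.0  # viewing distance reported in the Apparatus section


def extent_to_angle_deg(extent_mm, distance_mm=VIEWING_DISTANCE_MM):
    """Visual angle subtended by an extent centred on the line of sight."""
    return math.degrees(2.0 * math.atan(extent_mm / (2.0 * distance_mm)))


def angle_to_offset_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """Lateral offset on the viewed plane for an angle measured from straight ahead."""
    return distance_mm * math.tan(math.radians(angle_deg))


oscillation_deg = extent_to_angle_deg(200.0)  # close to the 22 deg quoted in the text
shift_mm = angle_to_offset_mm(7.0)            # about 61 mm under this geometry
```

Under these assumptions, the seven-degree image shift corresponds to a lateral displacement of roughly 61 mm between the congruent and incongruent hand positions.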

First, we conducted two preliminary experiments. In the first preliminary experiment, we tested whether participants could precisely point to the target based on the above procedure. After participants viewed their own hand movements displayed at the congruent position, the hand image disappeared and one of three targets was displayed in front of the index finger (right front, left front or just front; see Fig. 3). The reaching positions of 6 participants are shown in Fig. 3. A one-way analysis of variance on the reaching endpoints for the horizontal (x-axis) direction was significant (F(2,10) = 89.18, P < .01). Ryan's method indicated that the reaching endpoints to each of the 3 directional targets were significantly different between right front (mean = 50.77, SE = 5.23, P < .01), left front (mean = −40.34, SE = 7.44, P < .01) and just front (mean = 5.76, SE = 5.07, P < .01) target positions. These results show that participants could distinguish each of the 3 directional targets by their pointing responses and that participants' reaching endpoints in the horizontal direction can be a valid index of their own hand position.

Figure 3

Preliminary reaching task results.

(a) Example of pointing movement trajectories. All start-points were completed. (b) Mean reaching endpoints (n = 6). Error bars denote SD.

We next conducted the second preliminary experiment with a single participant to capture the relative landscape of the illusion according to the different conditions and duration parameters as noted above. We then performed the same experiment with 10 new participants. Participants' subjective hand positions before the reaching action were estimated according to the trajectory at which they pointed at the target. The distributions of the calculated starting points on the horizontal direction in all conditions were investigated (Fig. 4a).

Figure 4

Reaching task results.

(a) Example pointing movement trajectories; all endpoints were completed. In this example, the incongruent hand was presented 7 degrees to the right of the participant's hand. (b) Second preliminary experimental results. A heat map of the absolute start point distance from the target in the x-axis direction (n = 1). The histogram indicates the distribution of the horizontal start point distances in relation to the target in each viewing condition. The ‘blending’ condition was interpreted as the condition in which the display times of participants' own and incongruent movements in each cycle were extremely small.

Interestingly, in the ‘switching’ condition where the illusion arises, the participant reached for the target from directly in between the positions of his own congruent and incongruent hands (Fig. 4b [iii]). The distribution of the estimated starting points became a single peak, which indicates the participant could not discriminate between his own congruent and incongruent hand and perceived the two to be a single hand placed almost equidistantly between his actual hand position and the incongruent position. The illusion seems to be sensitive to duration parameters. The switching conditions with different parameters (i.e. lower or higher alternation rate) do not produce the illusion. The participant reached from both hand positions (Fig. 4b [ii] and [iv]). This can be interpreted as the participant being unable to discriminate between his own and incongruent hands and he arbitrarily identified either hand as his own and started the reaching action from whichever position he had perceived his hand to be in. This resulted in a bimodal distribution. In the ‘only congruent’, ‘only incongruent’ and ‘blending’ conditions, the distribution became a single peak whose average is around either of one's own congruent or incongruent hand position.

To more precisely evaluate the motion responses in different conditions causing different cognitive effects, we performed the same experiments with 4 conditions (Fig. 1) using 10 participants. A histogram of the 10 participants (Figs. 5a and b) reproduced the single-peak distribution of the starting points in between their own and incongruent hands in the switching condition, which cannot be seen in other conditions. A within-subjects analysis of variance indicated that the mean starting position of the pointing movement was significantly different between viewing conditions (F(3,27) = 122.03, P < .01). The mean starting position in the ‘switching’ condition (mean = −8.34, SE = 3.26) was different from the ‘only congruent’ (mean = −38.53, SE = 2.42), ‘only incongruent’ (mean = 23.32, SE = 2.32) and ‘blending’ (mean = −19.88, SE = 3.02) conditions (Ryan's method, P < .01). These results show that participants perceived that their hand position was in between congruent and incongruent hand positions when these hand movement images were displayed alternately.
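To make ‘in between’ concrete, the reported condition means can be expressed on a normalised scale running from the ‘only congruent’ mean (0) to the ‘only incongruent’ mean (1). The index below is an illustrative device rather than part of the original analysis; the function name is hypothetical and the values are the means reported above.

```python
# Mean starting positions (mm) reported for the main reaching experiment.
MEANS_MM = {
    "only congruent": -38.53,
    "only incongruent": 23.32,
    "switching": -8.34,
    "blending": -19.88,
}


def relative_position(start_mm, congruent_mm, incongruent_mm):
    """0 = at the 'only congruent' mean, 1 = at the 'only incongruent' mean."""
    return (start_mm - congruent_mm) / (incongruent_mm - congruent_mm)


separation_mm = MEANS_MM["only incongruent"] - MEANS_MM["only congruent"]
switching_index = relative_position(
    MEANS_MM["switching"], MEANS_MM["only congruent"], MEANS_MM["only incongruent"]
)
blending_index = relative_position(
    MEANS_MM["blending"], MEANS_MM["only congruent"], MEANS_MM["only incongruent"]
)
```

On this scale the ‘switching’ mean lies at about 0.49, close to the midpoint, whereas the ‘blending’ mean lies at about 0.30, nearer to the congruent hand.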

Figure 5

Histograms of the estimated start points in the horizontal direction for pointing movements (n = 10).

The target's position was zero. (a) Histogram of the start points in the ‘only congruent’, ‘only incongruent’ and ‘switching’ conditions. (b) Histogram of the start points in the ‘switching’ and ‘blending’ conditions.

Does our illusion require physical hand motion? During basic observation, a motionless hand abolished our illusion. This result is compatible with another study that reported passive movement (i.e. no motor command) abolished a sense of agency10. Thus, we conducted a reaching experiment without hand movements in a switching condition and estimated participants' hand position. The experimental procedure was the same as the above reaching experiment except participants were instructed to stop moving their hand during the final 4 s. Viewing included a congruent condition and three switching conditions. The intervals of congruent and incongruent images in each cycle were 100 ms and 200 ms, 133 ms and 267 ms and 167 ms and 333 ms; these were chosen from near the best parameter for the emergence of our illusion (Fig. 4b).

A histogram of 6 participants (Fig. 6) suggests that motionless hands eliminate reaching movements from the centre between two hands regardless of visual condition. Moreover, the mean of reaching start points for the switching condition with hand premotion in the above experiment (Fig. 5) was significantly smaller than that of the switching conditions without hand premotion (100 ms vs. 200 ms condition: t(14) = 3.44, P < .01; 133 ms vs. 267 ms condition: t(14) = 4.31, P < .01; 167 ms vs. 333 ms condition: t(14) = 4.25, P < .01). These results suggest that our illusion required physical hand motion. Additionally, a within-subjects analysis of variance indicated that the mean starting position of the pointing movement was significantly different between viewing conditions (F(3,15) = 13.465, P < .01). Ryan's method indicated that the mean starting position in the three switching conditions (100 ms vs. 200 ms condition: mean = −25.36, SE = 2.90; 133 ms vs. 267 ms condition mean = −29.12, SE = 3.00; 167 ms vs. 333 ms condition: mean = −29.81, SE = 3.58) is significantly different from that in the congruent condition (mean = −39.37, SE = 4.44) at P <.01. This result suggested that the switching of hand image might also contribute to our illusion.

Figure 6

Histograms of the estimated start points in the horizontal direction for pointing movements in the hand motionless experiment (n = 6).

The target's position was zero.

Discussion

The present study reported a novel illusion whereby alternating images allow participants to perceive their own congruent and incongruent movements as a single movement. This influenced participants to attribute a sense of agency to that incongruent movement. Although it has been said that a sense of agency requires congruency between predicted and actual movements, our illusion suggests that a sense of agency can be perceived for an incongruent movement made by uniting congruent and incongruent movements. This illusion offers new insights into the mechanism underlying integration of predicted and observed movements on the judgment of agency.

How can the results of the reaching experiments be explained? Perceptual responses are based on body image (i.e. perceived bodily representations) and reaching actions are based on body schemas (i.e. bodily representations related to the control of actions15,16,17,18,19). The body image can accommodate multiple hands since it has been shown that a rubber hand can be incorporated as a third, supernumerary hand20 and that the rubber hand illusion can be induced for two rubber hands simultaneously21. On the other hand, Newport et al. reported that when only the movements of two fake hands, identical to participants' own movements, were displayed at positions shifted to the left and right of participants' hands simultaneously, reaching movements started from one of the two hand positions; thus, our body schema cannot accommodate multiple hands22. Our ‘blending’ condition reproduced these results; however, results from the ‘switching’ condition suggest that participants' body schemas extended to include both the fake and real hands, as participants reached from between the real and perceived positions. For this to happen, multisensory integration regarding the incongruent hand is required23,24,25,26, as well as both a sense of ownership6,27 and agency11,28,29. Therefore, the results of both the perceptual judgement and reaching experiments can be explained by the mechanism underlying the integration of predicted and observed movements. We suggest that the visual continuity between congruent and incongruent hands partially disrupted proprioceptive integration, thus extending participants' body schema and agency in both experiments.

How, then, does alternating the images create this illusion? The visual stream segregation (VASS30) illusion gives some hints. With the VASS illusion, participants view two lamps being switched on/off at different speeds. At slower speeds, one light appears to be moving between two separate lamps, but at higher speeds, two lamps are perceived. In our experiment, if images of both movements are switched at high speeds (e.g. 200 ms/cycle), participants perceived both their own and incongruent hands simultaneously; with decreasing speeds, participants' own hands and the incongruent hand were perceived as one movement (‘switching’ condition; Fig. 2). This perceived continuity might be what extended participants' sense of agency.

To briefly test this hypothesis, we conducted an experiment whereby a participant judged whether he perceived continuity between congruent and incongruent hand movements in various switching conditions (see supplementary material). Fig. S1 shows that the participant perceived continuity as the switching frequency decreased. Comparing this with the reaching experiment's results (heat map in Fig. 4b), it appears that the switching frequency at which continuity began to be perceived corresponded to that at which pointing began from in between the congruent and incongruent hand positions. It is likely that our illusion requires perceiving continuity between congruent and incongruent hands (that is, apparent motion between two hands) and one united motion, as in the VASS30. However, as switching frequency becomes slower, while the sense of continuity is kept, start points of reaching movements are further from the centre between congruent and incongruent hands. This may be because a longer time for observing either congruent or incongruent hand movements assists participants in determining which hand belongs to them. Furthermore, while reaching movements start at around the centre of two hands when there is a 1:2 ratio between the display time of the image of the hand congruent with proprioceptive signals and that of the incongruent hand image (see Fig. 4b), perceived continuity tends to be greatest when the proportion approaches 1:1 (see Fig. S1). Why does this difference exist? Perhaps when the participants enacted their reaching movement, the image of a hand congruent with proprioceptive signals might be more readily interpreted as one's own hand than an image that is not congruent with proprioceptive signals, since the reaching movement requires detecting their hand. A recent study14 indicating that a visual motion which is congruent with an observer's hand movement tends to reach awareness more easily than an incongruent movement supports this possibility. Thus, the congruent and incongruent hand images are balanced in this switching parameter by suppressing the congruent hand image, which might be more easily detected. On the other hand, when participants judge the continuity between the hands that are congruent and incongruent with proprioceptive signals, detecting their own hand might not be required.

In conclusion, our results show that alternating images influenced participants to perceive their own congruent and incongruent movements as a single and continuous movement. This arose through the perception of apparent motion, which led participants to attribute a sense of agency to incongruent movements.

Methods

All participants had normal or corrected-to-normal vision, were right-handed and gave written informed consent. The study protocol was approved by the local ethics research committee at the University of Osaka, Japan and was performed in accordance with ethical standards outlined by the Declaration of Helsinki.

Apparatus

Our experimental system consisted of mirrors, cameras (Point Grey Research, Firefly MV) and a display (head-mounted display, HMD: eMargin, z-800). The layout was designed to precisely share first-person perspectives. Positioned at the eyes, cameras captured what subjects saw while not wearing the HMD. The captured images were sent to a PC, displayed on the HMD and viewed binocularly. Views were easily swapped or blended by exchanging or modifying images using the PC. The HMD's field of view was 32° horizontally and 24° vertically, with an 800 × 600-pixel resolution. The viewing distance was 500 mm. This system had a 75-ms delay before displaying the synthesized images in all conditions. Index finger position was measured using an electromagnetic sensor (Ascension, trakSTAR, 240 Hz).

Perceptual judgement experiment

Nine males (an experimenter and 8 naïve participants) participated. Twenty trials were conducted per condition. The instructions used in this experiment were (A) ‘If you perceive the displayed movement(s) as a single movement, press this key. If you perceive the displayed movement(s) as two different movements, press that key’; and (B) ‘Regarding the movement(s) in the first question (A), if you have no sense of having made the movement(s) yourself, press this key, or if you have the sense that you made the movement(s) yourself, press that key’. The conditional probabilities of (B) given (A) were calculated. Participants were instructed to move their fingers freely regardless of viewing condition.
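The conditional probability of (B) given (A) might be computed along the following lines. This is a sketch under stated assumptions: the trial encoding and function name are hypothetical, the example responses are invented for illustration, and the 20% exclusion rule is taken from the Figure 2 legend.

```python
def conditional_agency(trials, answer_a, min_a_rate=0.20):
    """Percentage of trials with a stated sense of agency (B) among trials
    where question (A) received answer_a.

    trials: list of (a, b) pairs; a is "single" or "two", b is True when the
    participant reported having made the movement(s) himself.
    Returns None ('not calculated') when answer_a was given on fewer than
    min_a_rate of all trials.
    """
    matching = [b for a, b in trials if a == answer_a]
    if not trials or len(matching) / len(trials) < min_a_rate:
        return None
    return 100.0 * sum(matching) / len(matching)


# Hypothetical responses for one participant in one condition (20 trials).
example_trials = (
    [("single", True)] * 17 + [("single", False)] * 1 + [("two", False)] * 2
)
p_single = conditional_agency(example_trials, "single")  # 17 of 18 trials
p_two = conditional_agency(example_trials, "two")        # too few trials
```

In this invented example, ‘two movements’ was reported on only 10% of trials, so the corresponding conditional probability is treated as ‘not calculated’.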

Reaching experiment

Six males (an experimenter and 5 naïve participants) participated in the first preliminary experiment, an experimenter participated in the second preliminary experiment, 10 males (an experimenter and 9 naïve participants) participated in the main experiment and 6 males (an experimenter and 5 naïve participants) participated in the hand motionless experiment.

To control the speed of the left-and-right motions, an auditory metronome was set at 120 beats per minute. Each trial lasted 8 s, during which subjects moved their fingers left and right in time with the beats. Only in the hand motionless experiment did participants stop their hand motion during the final 4 s. Participants' view was blacked out during the first 4 s, and one of the visual conditions was shown during the remaining 4 s. When participants stopped, a 0.27° white square target was displayed for 0.5 s. Participants were instructed to move their hand side to side in accordance with the beat during the first 4 s (hand motionless experiment) or the full 8 s (the other experiments).

During the first preliminary experiment, 10 trials were conducted per condition. During the other experiments, 20 trials were conducted per condition. Ten trials displayed the incongruent hand on the left and 10 on the right. To account for when the incongruent hand was on different sides, we reversed the values of the pointing movements (making them negative) in cases where the incongruent hand was displayed to the left of participants' hands.
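The sign reversal described above can be sketched as follows, so that the incongruent hand always lies on the positive side after alignment. The function names and example values are hypothetical.

```python
def align_to_incongruent_right(x_mm, incongruent_side):
    """Flip the sign of a horizontal value when the incongruent hand was on the left."""
    if incongruent_side not in ("left", "right"):
        raise ValueError("incongruent_side must be 'left' or 'right'")
    return -x_mm if incongruent_side == "left" else x_mm


def mean_aligned(trials):
    """Mean of aligned values; trials is a list of (x_mm, incongruent_side) pairs."""
    aligned = [align_to_incongruent_right(x, side) for x, side in trials]
    return sum(aligned) / len(aligned)


# Hypothetical start points (mm): both trials lie 10 mm towards the participant's
# real hand, once with the incongruent hand on the right and once on the left.
example_mean = mean_aligned([(-10.0, "right"), (10.0, "left")])
```

After alignment, trials from both sides can be pooled, and negative values consistently indicate positions on the side of the participant's real hand.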

To capture the relative landscape of our illusion in different temporal parameters (Fig. 4b), a cubic spline interpolation was applied to the start points data using an interpolation function in MATLAB (MathWorks). The interval grid was 50 ms in the interpolation because conditions where reaching movements start from the more centralised position between the two hands are sampled at approximately 50-ms intervals.
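A one-dimensional sketch of cubic spline interpolation onto a 50-ms grid is given below. It is not the MATLAB routine used in the study: the boundary condition (a natural spline, with zero second derivative at both ends) is an assumption, the function name is hypothetical, the sampled values are invented, and the actual heat map was interpolated over two duration parameters rather than one.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing. Assumption: natural boundary conditions.
    """
    n = len(xs) - 1
    if n < 1 or len(ys) != len(xs):
        raise ValueError("need at least two points and equal-length inputs")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("xs must be strictly increasing")

    # Right-hand side of the tridiagonal system for the quadratic coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]

    # Forward sweep of the tridiagonal solve.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]

    # Back substitution for the per-interval polynomial coefficients.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the interval containing x, clamped to the first or last interval.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return evaluate


# Hypothetical samples: display duration (ms) against absolute start point distance (mm).
durations_ms = [100, 200, 300, 400]
distances_mm = [40.0, 12.0, 5.0, 30.0]
spline = natural_cubic_spline(durations_ms, distances_mm)
grid_ms = list(range(100, 401, 50))          # 50-ms interpolation grid
interpolated = [spline(t) for t in grid_ms]
```

The interpolant passes through every sampled point; values between samples depend on the chosen boundary condition and should be read as a smoothed visualisation rather than as measurements.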