Introduction

It has been demonstrated that the baroreceptors in the deeper layer of the human skin can endure and sense a wide range of external pressures (1 Pa−300 kPa)1,2, while also enabling ultrafast detection (30–250 Hz)3,4. More advanced artificial skin must sense mechanical stimuli through fast, transient processing similar to that of biological skin, detecting the position, intensity and mode of various static and dynamic stimuli, including pressing, tapping, scratching and rubbing, to support more intelligent human-machine interaction3,5,6,7,8,9. The goal of mimicking or even surpassing human skin in intricate tactile perception has motivated interest in iontronic skin10,11,12,13,14,15,16,17 and, more recently, in iontronic devices with a fast-sensing readout interface18,19,20,21,22.

Iontronic skin (I-skin), a pressure sensing modality of great interest, has a nanoscale electrical double layer (EDL) capacitance formed at the interface between the ionic dielectric layer and the electrodes. The change in contact area between the ion layer and the electrode under pressure stimulation induces a significant response in the EDL capacitance. As shown in Supplementary Table 1, compared to other advanced pressure sensors, iontronic sensors exhibit prominent advantages such as high sensitivity, low noise, strong interference resistance, and static response10,23,24,25. These qualities have attracted widespread research attention to iontronic skins. A single device with only one pressure point can measure pressure over only a limited area. To achieve a spatiotemporal tactile response like that of human skin, a high-density I-skin array is required. However, developing an I-skin array that offers ultrafast, crosstalk-free readout of the large capacitance response generated over a wide pressure range, comparable to human skin, remains a major challenge. In particular, a sharp increase in the number of tactile units imposes significant constraints on data acquisition bandwidth, which severely impairs the array’s ability to achieve rapid and precise tactile sensing3,26.

One of the main reasons is that most iontronic arrays currently rely on Time Division Multiplexing (TDM) for data acquisition, which measures the capacitance at each row-column address in sequence within each measurement cycle6,27,28,29. The serial readout of TDM enables shared conductor routing between multiple sensors and simplifies the wiring of large arrays30,31, but it also suffers from drawbacks such as readout delay32,33, electrical crosstalk and signal disturbance caused by continuous channel switching34,35, and complex sequential channel control circuits36,37. Existing solutions to improve the performance of TDM-based I-skin arrays include the use of high-speed electronic components38, optimization of the switching sequence36, and circuits with compensation algorithms17. However, these methods increase the complexity of the architecture design and the measurement power consumption, and occupy additional component space. In summary, the serial readout architecture based on TDM is fundamentally constrained by its low readout rate (<100 Hz)5,17,30,31,39.

Inspired by the signal coding of large-scale mechanoreceptors in the human tactile system, spike trains coding is expected to be an alternative to TDM-based arrays that realizes parallel data processing and avoids readout delay40,41. Spike trains coding has been demonstrated as an effective approach to encode the capacitance of iontronic sensor pixels into unified electrical spikes, offering a promising avenue for more efficient electronic skin technologies3,21,26,42. Unfortunately, implementing this mechanism requires a substantial number of additional components, such as transistor oscillators or analog-to-digital converters, which adds complexity to the device structure and poses challenges in terms of circuit crosstalk between sensor units. Most critically, this architecture can encode only a limited range of capacitance responses, which restricts tactile perception to a limited pressure range (<20 kPa)3,21,26. In summary, the development of an iontronic array architecture with a wide pressure sensing range and ultrafast detection capabilities remains a major challenge.

In this work, we propose FC-AIMS, an ultrafast readout architecture for artificial ion mechanoreceptor skin (overview shown in Fig. 1a). This platform consists of an artificial ion mechanoreceptor skin (AIM-skin) array (8\(\times\)8 pixels in a 9 cm \(\times \,\) cm area) equipped with an ultrafast readout architecture based on frequency-coded multiplexing. The AIM-skin is assembled into a flexible array from the individual bionic flexible artificial ion mechanoreceptors (AIMs) proposed in previous work39. The AIM-skin’s dielectric layer is a flexible nanofiber membrane with [BMIM+] [TFSI-] and PVDF-HFP composite core layers embedded in a thermoplastic polyurethane (TPU) shell layer (Fig. 1b). Upon activation by external mechanical stimuli, the [BMIM+] [TFSI-] ions, which are in a resting state, are displaced from the core layers and migrate towards the shell layer. The migration of ions under pressure causes a significant change in the interface of the EDLs, resulting in a high-sensitivity capacitive response (>1.2 \({{\rm{kPa}}}^{-1}\) in 0.12–1880 kPa). Furthermore, the FC-AIMS platform performs rapid frequency encoding and decoding in a field-programmable gate array (FPGA) (Fig. 1c). The platform can quickly read out spatiotemporal tactile information (within 2 ms), aiming to efficiently assist robots in dynamic tactile perception and interaction with the support of deep learning (Fig. 1d). Each row of AIM-skin possesses a distinct encoded frequency signal, thereby facilitating the parallel measurement and transmission of multiline arrays (Fig. 2a). The orthogonal characteristics of these frequencies enable multiple channels to be efficiently propagated through a single capacitance-voltage (C/V) converter to the decoder, facilitating parallel readout.
In contrast to the prevalent readout interface based on TDM, our approach eliminates switching delay, mitigates the temporal loss inherent to TDM, and circumvents crosstalk in the circuit (Fig. 2b). In addition, compared with the spike trains coding architecture, it offers a wider pressure detection range while avoiding dependence on complex integrated back-end interface electronics. Furthermore, we integrated the FC-AIMS platform with deep learning techniques to achieve precise dynamic tactile perception characterized by high spatiotemporal responsiveness and crosstalk suppression (Fig. 2c). We devised a spatiotemporal 3D residual convolution network (Res3D) that classifies ten types of dynamic touch gestures with an accuracy of up to 89.3%. Consequently, our study provides a solution for achieving tactile sensors with high spatiotemporal resolution, which holds significant promise for applications in robot control and companion robot interaction.

Fig. 1: Concept of the Frequency Coded Artificial Ion Mechanoreceptor Skin (FC-AIMS) platform.

a Schematic diagram illustrating how the FC-AIMS platform enables real-time dynamic touch interaction in intelligent robots through high spatiotemporal tactile perception. b Sensing mechanism of AIM-skin. Under mechanical stimulation, the nanofibers deform due to external force, leading to the release of ions from the core layer and their migration to the surface of the shell layer, resulting in a significant capacitance response signal. c The frequency-coding architecture uses different frequencies to encode and decode data channels, enabling efficient and ultrafast readout. d The intelligent sensing system employs deep learning techniques to mimic the brain’s information fusion and perception recognition processes, enabling advanced cognitive capabilities.

Fig. 2: The development process of the FC-AIMS platform.

a The tactile perception process under frequency-coding, which involves encoding tactile channels using different frequencies to represent different stimulation positions, allowing for high-speed and accurate perception of tactile pressure signals. b Comparison diagram of readout interface for sensor array. Compared with the prevalent switching circuit based on TDM, our work avoids switching delay, time division multiplexing and crosstalk of circuit. c Dynamic touch gesture recognition is achieved on the FC-AIMS platform using the Res3D network architecture. Tactile frames, spatiotemporal sequence data within a certain time length; Res3D network, spatiotemporal feature extraction based on multilevel convolution residual blocks; Recognition, the decision-making layer carries out gesture classification.

Results

Fabrication of AIM-skin array

The human tactile system can respond sensitively and quickly to a wide range of external mechanical stimuli. In our previous work, inspired by the mechano-transduction mechanism of human mechanoreceptor Merkel cells, we fabricated an AIM with high sensitivity and high-pressure resolution in an ultrawide linear range39. However, the pressure information that can be provided by a single artificial mechanoreceptor is limited, as there is only one stress-sensing point. Here, we package multiple AIM units into an 8\(\times\)8 flexible artificial ion mechanoreceptor skin (AIM-skin) array to achieve broader tactile sensing capabilities. Figure 3a illustrates the hierarchical structure of AIM-skin, with the middle active layer using the ionic nanofiber membrane reported in our previous work. The membrane embeds nanofibers composed of [BMIM+] [TFSI-] and PVDF-HFP composite in a matrix of TPU, as shown in the scanning electron microscopy (SEM) image of Fig. 3b, left. The sandpaper templating method, which is used to create hierarchical structures, has been shown to significantly enhance the sensor’s compressibility, imparting high sensitivity and a broader pressure response range43,44. Flexible array electrodes with abundant surface microstructures made of polydimethylsiloxane (PDMS) are fabricated using the sandpaper templating and magnetron sputtering technique. The surface microstructure of the electrode was observed by SEM (Fig. 3b, right). The detailed preparation process is presented in Supplementary Fig. 8 and the experimental procedures section. The active layer is placed between two electrode arrays, and the surface is protected by a PI film. The optical photograph of the AIM-skin is shown in Fig. 3b, below.

Fig. 3: Structure of the artificial ion mechanoreceptor skin (AIM-skin) array and characterization of tactile stimulus response performance.

a Schematic diagram of the hierarchical structure of the AIM-skin array. b SEM images and optical photograph of the AIM-skin array. High-magnification SEM image of ionic nanofiber (left). SEM image of a silver-sputtered micro structured PDMS electrode that utilizes sandpaper as a template for creating microstructures (right). Optical photographs of the AIM-skin array, showing that the skin can be greatly bent (below). c Schematic diagram of the capacitance response mechanism for AIM-skin array. d The capacitance response curve of the AIM-skin array under loading. Inset: the pressure machine applies 0.12–1880 kPa pressure load to the target node of the sensor array. e 500 cycles of endurance testing under a pressure load of 40 kPa on the AIM-skin unit. Target position: The pressure is loaded onto the measurement node. Adjacent location: Pressure is applied to adjacent nodes of the measurement node.

The design of AIM-skin is inspired by the mechanical transduction mechanism of ions passing through membranes in mechanosensory cells39. As shown in Fig. 3c, the PVDF-HFP matrix of the core layer has a high dipole moment and generates ion-dipole interactions with the imidazolium cations [BMIM+]. The TPU shell layer separates the ionic region from the electrode interface. With the synergistic effect of ion-dipole interaction and ion region separation, almost no ions form an EDL at the interface in the absence of external stimuli. Under external mechanical stimulation, ions break away from the core layer and migrate to the interface between the active layer and the electrodes under the electric field force to form an EDL (Supplementary Fig. 9 and Supplementary Fig. 10). Simultaneously, the microstructure on the electrode surface deforms under pressure, further increasing the surface area of the EDL and enabling the device to achieve superior sensitivity (Supplementary Fig. 11). The EDL places positive and negative charges extremely close together, at a separation far smaller than that in traditional capacitive sensors10,14. As a result, the iontronic array exhibits an exceptionally high capacitance response signal. Additionally, the response process of AIM-skin demonstrates a high degree of reversibility and recoverability (Supplementary Fig. 12).

We measured the \((\Delta C/{C}_{0})\)-\(P\) curve of an AIM-skin unit and calculated its sensitivity in different pressure ranges. The AIM-skin exhibits an excellent linear response to pressure loads in the range of 0.12–1880 kPa (Fig. 3d), with a lower limit sufficient to monitor weak external touch stimuli. The sensitivity is 3.9 \({{\rm{kPa}}}^{-1}\) within 0.12–380 kPa, drops slightly to 2.2 \({{\rm{kPa}}}^{-1}\) in 380–1000 kPa, and finally remains at 1.2 \({{\rm{kPa}}}^{-1}\) in 1000–1880 kPa. The time-dependent relative capacitance change (\(\Delta C/{C}_{0}\)) over five successive loading and unloading cycles at peak pressures of 40 kPa, 200 kPa, 360 kPa, 520 kPa, and 680 kPa demonstrates that the sensor is stable under various dynamic loading tests (Supplementary Fig. 13). To further verify durability, reproducibility and crosstalk immunity, we measured the (\(\Delta C/{C}_{0}\)) response of a single unit on AIM-skin at a peak pressure of 40 kPa over 500 cycles (Fig. 3e). In the target-position test, the pressure is loaded on the measured unit; in the adjacent-position test, pressure is applied to a unit adjacent to the measured unit to reveal the influence of crosstalk. When pressure was applied to the target unit, the (\(\Delta C/{C}_{0}\)) response of the AIM-skin unit was stable and regular, with an almost constant baseline capacitance. When pressure was applied to adjacent units, the physical crosstalk capacitance response of the AIM-skin unit remained below about 0.4%. The AIM-skin responds to bending, but it maintains good flexibility and mechanical stability (Supplementary Fig. 14). These results demonstrate that the AIM-skin array can collect tactile signals reliably under the large capacitance response generated within a wide pressure range, making it a suitable candidate for artificial skin with high spatiotemporal response.
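
The three reported sensitivity regimes can be turned into a rough pressure-to-response map. The sketch below is illustrative only: it assumes that the response is piecewise linear, continuous at the 380 kPa and 1000 kPa breakpoints, and zero at the 0.12 kPa lower limit. None of these assumptions is stated in the measurements, and the function name is hypothetical.

```python
# Reported sensitivities (kPa^-1) and the upper edge of each regime (kPa).
SEGMENTS = [(380.0, 3.9), (1000.0, 2.2), (1880.0, 1.2)]
P_MIN = 0.12  # lower limit of the reported pressure range (kPa)


def relative_capacitance_change(pressure_kpa):
    """Estimate dC/C0 by accumulating sensitivity * pressure span per regime.

    Assumption (not from the source): the curve is continuous and the
    response is zero at P_MIN.
    """
    if not P_MIN <= pressure_kpa <= SEGMENTS[-1][0]:
        raise ValueError("pressure outside the reported 0.12-1880 kPa range")
    response, lower = 0.0, P_MIN
    for upper, sensitivity in SEGMENTS:
        span = min(pressure_kpa, upper) - lower
        if span <= 0:
            break
        response += sensitivity * span
        lower = upper
    return response


for p in (40, 380, 1000, 1880):
    print(p, "kPa ->", round(relative_capacitance_change(p), 1))
```

Under these assumptions the map is monotonic, so it could be inverted to estimate pressure from a measured \(\Delta C/{C}_{0}\); a calibration against the measured curve in Fig. 3d would be needed for real use.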

Design of the FC-AIMS platform

In the biological afferent sensory system, skin receptors are read out at high speed to achieve a fast readout rate (30–250 Hz)3,4, and a large number of afferents are required for high spatial resolution (\(141/{{\rm{cm}}}^{2}\) in the fingertip and \(25/{{\rm{cm}}}^{2}\) in the palm)40,45. Furthermore, human mechanoreceptors can perceive stimuli over a wide range of pressure (1 Pa–300 kPa)46. However, as the number of artificial tactile receptors increases, high-speed acquisition over a wide pressure range becomes a major challenge. In this study, we design the Frequency Coded Artificial Ion Mechanoreceptor Skin (FC-AIMS), in which the AIM-skin array is equipped with an ultrafast readout architecture based on frequency-coded multiplexing, making high spatiotemporal tactile perception possible.

In the above-mentioned experiments, the AIM-skin array exhibited excellent sensitivity and an ultrawide pressure range in detecting pressure stimuli. As a passive sensor, AIM-skin characterizes the pressure value by measuring the large capacitance between its two electrodes. It can be arranged in a row/column addressable matrix due to its two terminals to achieve frequency-coded multiplexing detection (Fig. 4a). The AC-based capacitance measurement method, which has been verified to be used in high-precision and high-frequency capacitance measurement systems47, is utilized for multi frequency-coded multiplexing in our design.

Fig. 4: Architecture of frequency-coding system.

a Scheme of frequency coding in the AIM-skin array encompasses multi-frequency synchronous excitation coding and spatial pressure decoding. b The detailed structure of a C/V converter, which is responsible for amplifying and stacking multi-channel encoded signals. c Quadrature demodulator (middle) is used to extract the signal of interest frequency (right) from the mixed encoded signal (left). d Response diagram of 10 g weight on 13 kHz coded column. When pressure is applied to the encoding node, the energy of the corresponding coded frequency undergoes a significant increase. e The spectrum diagram shows the output signal without any applied pressure. f The spectrum diagram shows the output signal when a 10 g weight is applied to the 13 kHz coded column.

The excitation voltage signals with different frequencies, generated by the Direct Digital Synthesis technique, are applied to the column electrodes, and the row electrodes are connected to the capacitance-to-voltage (C/V) converters (Fig. 4b). The voltage \({V}_{i}(t)\), with its own unique frequency, is applied to the column electrode \(i\) and is expressed as follows:

$${V}_{i}\left(t\right)={A}_{0}\exp \left(j\left(2\pi {f}_{i}t+{\varphi }_{0}\right)\right)$$
(1)

where \({A}_{0}\) is the voltage amplitude (200 mV), \({\varphi }_{0}\) is the fixed initial phase, and \(t\) is the time. The frequency in our example is \({f}_{i}=(10+1\times i)\) kHz, where \(i\) is the column electrode number, so the encoding frequency interval between adjacent array electrodes is 1 kHz. A single C/V converter, responsible for amplifying and stacking the encoded signals, is shown in Fig. 4b, where \({{\rm{OPA}}}_{{\rm{n}}}\) is an operational amplifier, the capacitance units to be measured are at different column addresses, and \({R}_{f}\) and \({C}_{f}\) are the feedback resistance and capacitance. The output voltage \({V}_{o}^{{\prime} }(t)\) of the C/V converter on each row interface can be expressed as:

$${V}_{o}^{{\prime} }\left(t\right)=\mathop{\sum }\limits_{i=0}^{7}-\frac{{R}_{f}}{{\rm{j}}{\omega }_{i}{C}_{f}{R}_{f}+1}\cdot {\rm{j}}{\omega }_{i}{C}_{i}\cdot {V}_{i}\left(t\right)$$
(2)

where \({V}_{o}^{{\prime} }\left(t\right)\) is the output signal of the C/V converter on row interface number \(o\), \({C}_{i}\) is the capacitance of the unit at column \(i\) of that row, and \({\omega }_{i}=2\pi {f}_{i}\) is the angular frequency of the excitation source. In this case, the feedback resistance and capacitance are selected as 20 \({\rm{k}}\Omega\) and 40 pF, which respond well within the sensor variation range (Supplementary Note 1)48. Therefore, the voltage \({V}_{o}^{{\prime} }\) after the C/V converter on row \(o\) is the superposition of the signals from the capacitor units on each column. As depicted in Fig. 4c, each column excitation signal uses a different coding frequency, and the frequency-domain representation of the superposed row signal is displayed on the left-hand side. The output of the C/V converter can be sampled by an analog-to-digital converter (ADC) and demodulated by a digital quadrature demodulator, a widely used scheme that yields both the amplitude and the phase of the signal. In the digital quadrature demodulator (Supplementary Note 2), the channel is divided into several frequency-orthogonal subchannels to realize parallel transmission of data49; that is, the signal at the frequency of interest is demodulated from the mixed modulation signal (Fig. 4c).
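
To make Eq. (2) concrete, the following sketch evaluates the per-column term of the C/V converter using the stated values (\({R}_{f}\) = 20 kΩ, \({C}_{f}\) = 40 pF, \({A}_{0}\) = 200 mV, \({f}_{i}=(10+i)\) kHz). The node capacitance of 100 pF is an arbitrary illustrative value rather than a measured one, and the function names are hypothetical.

```python
import math

R_F = 20e3      # feedback resistance (ohm), from the text
C_F = 40e-12    # feedback capacitance (F), from the text
A_0 = 0.2       # excitation amplitude (V), from the text


def column_frequency_hz(i):
    """Excitation frequency of column i, f_i = (10 + i) kHz."""
    return (10 + i) * 1e3


def cv_gain(c_sensor, freq_hz):
    """Complex gain of one term in Eq. (2): -R_f * jwC_i / (jw C_f R_f + 1)."""
    jw = 1j * 2 * math.pi * freq_hz
    return -R_F * jw * c_sensor / (jw * C_F * R_F + 1)


C_EXAMPLE = 100e-12  # hypothetical node capacitance (F), for illustration only
for i in range(8):
    f = column_frequency_hz(i)
    amplitude_mv = abs(cv_gain(C_EXAMPLE, f)) * A_0 * 1e3
    print(f"column {i}: {f / 1e3:.0f} kHz, output amplitude {amplitude_mv:.2f} mV")
```

With these component values, \({\omega }_{i}{C}_{f}{R}_{f}\) stays between roughly 0.05 and 0.09 over 10–17 kHz, so each term is nearly proportional to \({\omega }_{i}{C}_{i}\): the output amplitude at a given coded frequency scales almost linearly with the capacitance of the corresponding node.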

When a stimulus acts on a frequency-encoded position in FC-AIMS, the corresponding encoded signal responds quickly under parallel readout. Figure 4d illustrates the change in the output of the corresponding row electrode when a 10 g weight is applied to a column node encoded at 13 kHz: the power at 13 kHz changes rapidly and significantly when force is applied to that node. Benefiting from the frequency coding on the AIM-skin array, we can not only quickly distinguish the magnitude of the pressure but also identify the spatial location of the pressed nodes. Figure 4e depicts the spectrum when there is no pressure, with the amplitude at the coding frequencies being less than 1 mV. In contrast, Fig. 4f shows that when a 10 g weight is placed on the 13 kHz encoded channel, the amplitude at 13 kHz rises as high as 57 mV, while the amplitudes of the other frequency-encoded channels barely change. The circuit crosstalk between nodes under the parallel zero-potential mechanism (PZPM) also remains very weak (<0.4%) (PZPM circuit anti-crosstalk theoretical analysis, Supplementary Note 3). Compared to traditional TDM-based arrays (crosstalk ~75%)17 and spike trains coding arrays (spike deviation <1%)3, our FC-AIMS therefore demonstrates excellent crosstalk suppression capability. We also demonstrate that AIM-skin exhibits effective responses across the entire range of encoding frequencies (Analysis of the response under frequency encoding, Supplementary Note 4). In summary, the FC-AIMS platform can effectively distinguish different sensing channels and suppress circuit crosstalk, enabling high-speed parallel tactile sensing.

Spatiotemporal resolution of the FC-AIMS platform

Due to the frequency-coding architecture used in the FC-AIMS platform for parallel data demodulation, the sensory system in this study can also meet ultrafast readout rate requirements as the number of tactile units in space increases. Compared with the iontronic array based on TDM and spike trains coding, our platform has advantages such as high-speed parallel readout, crosstalk suppression, wide sensing range, and simple device structure (Supplementary Table 2). In addition, the FC-AIMS platform has broad universality for iontronic skin arrays and has the potential to become the most widely used high spatiotemporal sensing solution for artificial skin.

The quadrature demodulator is responsible for demodulating the signal of interest from the mixed modulation signals to obtain the capacitance value on the corresponding node. In parallel demodulation, the spatial resolution of the FC-AIMS system under dynamic stimulation is determined by the spatial density of the sensory units, while the readout rate is determined by the quadrature demodulation time. Figure 5a shows the performance test of the quadrature demodulator, in which sine waves with frequencies from 10 to 17 kHz and an amplitude of 20 mV are superimposed and a noise signal with an amplitude of 20 mV is added. The signal superposition test circuit is shown in Supplementary Fig. 15. The target frequency for demodulation is set to 13 kHz. The detailed structure and parameters of the digital quadrature demodulator can be found in Supplementary Fig. 16. Figure 5a shows that the demodulation time of the quadrature demodulator (\({{\rm{T}}}_{{\rm{r}}}\) = 2 ms) is sufficient to meet the requirements for fast and accurate amplitude measurement. Therefore, the FC-AIMS platform achieves an ultrafast readout rate compared to existing artificial skin arrays based on TDM5,17,30,31,39 (Fig. 5b). At the same time, compared to the spike trains coding architecture3,21,26, our device exhibits an ultrawide pressure response range.
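
The demodulator test described above can be imitated numerically. The sketch below is a generic mix-and-average quadrature demodulator written for illustration; it is not the FPGA design of Supplementary Fig. 16. The 200 kHz sampling rate, the uniform noise model, and the reading that each of the eight tones has a 20 mV amplitude are all assumptions.

```python
import math
import random

FS = 200_000          # sampling rate (Hz): an assumption, not from the source
WINDOW_S = 2e-3       # averaging window, matched to the reported T_r = 2 ms
TONES_HZ = [(10 + i) * 1e3 for i in range(8)]   # 10-17 kHz coded channels
AMPLITUDE = 0.020     # 20 mV per tone (assumed reading of the test description)


def mixed_signal(n_samples, noise_amplitude, rng):
    """Sum of the eight coded sine waves plus uniform noise."""
    samples = []
    for n in range(n_samples):
        t = n / FS
        value = sum(AMPLITUDE * math.sin(2 * math.pi * f * t) for f in TONES_HZ)
        value += rng.uniform(-noise_amplitude, noise_amplitude)
        samples.append(value)
    return samples


def quadrature_amplitude(samples, target_hz):
    """Mix with cos/sin references at target_hz and average over the window."""
    n_total = len(samples)
    in_phase = quadrature = 0.0
    for n, x in enumerate(samples):
        phase = 2 * math.pi * target_hz * n / FS
        in_phase += x * math.cos(phase)
        quadrature += x * math.sin(phase)
    # The factor 2/N converts the averaged products back to a peak amplitude.
    return 2 * math.hypot(in_phase, quadrature) / n_total


rng = random.Random(0)
n_window = round(FS * WINDOW_S)            # 400 samples in 2 ms
clean = mixed_signal(n_window, 0.0, rng)
noisy = mixed_signal(n_window, 0.020, rng)
print("13 kHz amplitude, no noise  :", quadrature_amplitude(clean, 13e3))
print("13 kHz amplitude, with noise:", quadrature_amplitude(noisy, 13e3))
print("9 kHz (unused) amplitude    :", quadrature_amplitude(clean, 9e3))
```

In this sketch, a 2 ms window contains a whole number of cycles of every tone on the 1 kHz grid, so the other seven channels average out and only the 13 kHz component contributes to the estimate. Whether the same reasoning determined the authors' choice of demodulation time is not stated in the source.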

Fig. 5: Spatiotemporal response of the FC-AIMS platform.

a Performance experiment of quadrature demodulator, which verifies the demodulation accuracy and demodulation time of mixed frequency signal. b Readout rates and pressure ranges of the FC-AIMS platform (this work) compared with the skin of human hand and the reported artificial sensor arrays (TDMA-based sensors, Spike trains coding sensors). c The measurement relative error of capacitors with different values. Each capacitor is continuously measured 1000 times. d 3D visualization of single-touch and multi-touch sensing in real-time. e Visualization of pressure distribution within 10 s under dynamic scratching gesture stimulation.

The accuracy of capacitance measurement refers to the closeness between the measured value and the true value of the capacitance being measured. To verify the accuracy of the capacitance measurement system, experiments were conducted using standard capacitors of varying values (Description of standard capacitance measurement experiment, Supplementary Note 5). The accuracy of the system is expressed by the relative error, defined as:

$${e}_{r}=\frac{\left|{C}_{m}-{C}_{e}\right|}{{C}_{e}}\times 100 \%$$
(3)

where \({C}_{m}\) and \({C}_{e}\) are the capacitance value measured by the system and the expected nominal capacitance value, respectively. Figure 5c plots the relative measurement error for capacitors of different values; the central square blocks represent the mean relative errors calculated from 1000 consecutive measurements, and the error bars indicate the uncertainty. The capacitance to be measured ranges from 30 pF to 4.7 nF, and the relative error is less than 6%. Furthermore, we plot the measurement curve of the capacitance response under the frequency-coding architecture (Supplementary Fig. 17). These results demonstrate that the FC-AIMS platform can accurately sense capacitance signals under dynamic stimulation, making it a suitable candidate for an artificial skin with high-precision response. Figure 5d depicts the spatial response of the FC-AIMS platform under human touch, visualizing the pressure distribution when the sensor is touched with one, two, and three fingers (Supplementary Movie 1).
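
The relative-error metric of Eq. (3) and its summary over repeated measurements can be sketched as follows. The readings are invented for illustration and are not measured data. Using the sample standard deviation as the spread is an assumption, since the source does not specify how the uncertainty in Fig. 5c is computed.

```python
from statistics import mean, stdev


def relative_error_percent(measured, nominal):
    """Eq. (3): |C_m - C_e| / C_e * 100."""
    return abs(measured - nominal) / nominal * 100.0


def summarize(measurements, nominal):
    """Mean and sample standard deviation of the relative error over repeats."""
    errors = [relative_error_percent(m, nominal) for m in measurements]
    return mean(errors), stdev(errors)


# Hypothetical readings (pF) of a 100 pF reference capacitor; not measured data.
readings_pf = [101.2, 99.1, 102.4, 98.7, 100.9]
mean_err, spread = summarize(readings_pf, 100.0)
print(f"mean relative error {mean_err:.2f} %, spread {spread:.2f} %")
```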

To further characterize the parallel measurement performance of the AIM-skin array units, a multi-point synchronous touch device was designed (Supplementary Fig. 18a, left). The indenter for the multi-touch experiments was 3D-printed from a resin material (Black resin, JLC) and installed at the front end of the linear motor as the loading head for the touch test. Each indenter tip was designed as a hemisphere with a diameter of 5 mm, and the tips were arranged in a \(4\times 4\) grid with a tip spacing of 10 mm to form the multi-touch indenter (Supplementary Fig. 18a, right). In the multi-point touch experiment, the linear motor reciprocates at a constant speed, and the FC-AIMS platform records the synchronous response of the sensor units in real time (see Supplementary Fig. 18b for the synchronous response of the array sensor under multi-point touch).

We have demonstrated that the FC-AIMS platform can respond to pressure distributions under both static and dynamic stimuli owing to its high-speed parallel measurement and suppression of circuit crosstalk. To test touch detection, a single finger was slid across the sensor; the dynamic response of the sensor nodes on the sliding path is illustrated in Supplementary Fig. 19. The high-speed, real-time pressure response of the AIM-skin array units allows effective characterization of the dynamic spatial position and pressure of touch stimuli. We also tested the response of the FC-AIMS platform to dynamic gesture stimuli. As depicted in Fig. 5e, the sensor recorded the spatial pressure distribution at an interval of 2 ms while the AIM-skin surface was dynamically scratched. The results demonstrate that the sensors can achieve effective dynamic tactile perception, which can meet the requirements of future robot tactile interactions.

Establishment of spatiotemporal tactile recognition model

With the continuous development of robots and artificial intelligence, human-robot interaction (HRI) with high spatiotemporal response can bring more immersive experience to human beings. The robot skin is capable of perceiving various dynamic touch stimuli from the external world, which can facilitate interactive feedback. Incorporating tactile sensors with neural networks and deep learning can create a powerful tool for robotic perception and recognition, allowing robots to solve interactive problems without an explicit model50.

To achieve dynamic touch gesture recognition on the FC-AIMS platform, we developed a custom AIM-skin data acquisition system that can simultaneously collect tactile information from the sensors and images of touch (Fig. 6a). In the system, volunteers make specific dynamic touch gestures based on their personal habits to collect data for model training. We constructed a custom AIM-skin dataset containing 800 data samples distributed among ten touch gesture categories (Fig. 6b). We recruited 20 participants (14 males and 6 females) to take part in the touch experiments. As an artificial skin, our sensor can capture dynamic tactile information in the experiment, which is related to the defined dynamic touch gesture (Supplementary Movie 2). Each subject performed ten different dynamic touch gestures, with each gesture being recorded four times for data acquisition, and the duration of each touch was 10 s. The gestures of hit, stroke, rub, tap, poke, press, scratch, pat, circle, and put were performed by the 20 subjects (See Supplementary Table 3 for detailed description of touch gestures). Therefore, a total of 800 touch gesture samples were collected in the dataset. In summary, the FC-AIMS platform can effectively record the dynamic stimuli of different touch gestures, which serves as the foundation for intelligent perception in robots.

Fig. 6: Architecture of dynamic touch gesture recognition based on the FC-AIMS platform.

a The AIM-skin data acquisition system is responsible for synchronously collecting tactile information from AIM-skin sensors and capturing images of touch gestures. b The left photograph depicts the scene for collecting dynamic touch gestures using the FC-AIMS platform. (i) AIM-skin is installed on plush toy robot to capture tactile information. (ii) Cameras are used to capture touch gesture images. (iii) The screen displays a touch pressure distribution map. The right photographs display ten distinct dynamic touch gestures utilized for tactile perception. c The internal structure of the basic residual blocks that make up Res3D. d Visualizing the tactile spatiotemporal features extracted by Res3D within the 800 samples in the AIM-skin dataset using t-SNE dimensionality reduction. e Classification test confusion matrix with the test dataset (3 participants, 120 samples); each row and column represent an instance in a predicted class and a true class, respectively, and the diagonal values represent correct predicted results. The color bar represents the accuracy of the prediction. f Control of the robotic hand based on dynamic tactile perception.

The above spatiotemporal tactile information is combined with deep learning to enable recognition of different touch gestures. An efficient deep 3D Residual ConvNet architecture (which we term Res3D)51, built on the PyTorch framework and shown in Supplementary Fig. 20, was constructed for spatiotemporal feature learning; it adopts ResNet52 as the backbone network by expanding 2D convolutions into 3D convolutions. Res3D is composed of multiple basic residual blocks (as shown in Fig. 6c), which include shortcut connections that bypass signals between layers. Inside each residual block are two convolution layers with a kernel size of \(3\times 3\times 3\), after which batch normalization and ReLU layers are added. The learned tactile output of ResNet can be viewed as transferable semantic features, forming a 512D vector that is used as the input of the fully connected neural network for final decision learning. The training, validation and test samples were randomly selected from the 20 participants in the AIM-skin dataset at a ratio of 15:2:3.
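
One way to read the 15:2:3 ratio is as a split by participant, which is consistent with the test set of 3 participants and 120 samples mentioned in the caption of Fig. 6e. The sketch below illustrates that reading; the function name, the seed, and the participant numbering are hypothetical, and the authors' actual splitting procedure is not described in detail.

```python
import random


def split_participants(participant_ids, ratio=(15, 2, 3), seed=0):
    """Randomly assign whole participants to train/validation/test sets."""
    ids = list(participant_ids)
    if sum(ratio) != len(ids):
        raise ValueError("ratio must account for every participant")
    random.Random(seed).shuffle(ids)
    n_train, n_val, _ = ratio
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])


SAMPLES_PER_PARTICIPANT = 10 * 4   # ten gestures, four recordings each
train, val, test = split_participants(range(1, 21))
for name, group in (("train", train), ("validation", val), ("test", test)):
    print(name, len(group), "participants,",
          len(group) * SAMPLES_PER_PARTICIPANT, "samples")
```

Splitting by participant rather than by sample keeps all recordings of a given person in one subset, so the test accuracy reflects generalization to people the model has not seen.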

The classification accuracy of touch gestures is used as a quantitative index to evaluate dynamic touch recognition. Meanwhile, t-SNE embedding is used to display the change of feature probability distribution of different touch gestures53, and the effectiveness of the feature extraction model is evaluated. The accuracy of touch gesture recognition is evaluated as follows.

$${\mathrm{Accuracy}}=\frac{{TP}+{TN}}{{TP}+{TN}+{FP}+{FN}}$$
(4)

where the true positive (\({TP}\)), true negative (\({TN}\)), false positive (\({FP}\)) and false negative (\({FN}\)) counts are obtained by comparing the model outputs with the labels. The confusion matrix is shown in Supplementary Table 4. The final classification accuracy of the Res3D learning architecture reaches 89.3% for touch gesture recognition. The classification results obtained using this competitive neural network demonstrate that the dynamic tactile sensing strategy based on our sensors is capable of learning the spatiotemporal features of touch and can effectively recognize different touch gestures. Meanwhile, the t-SNE visualization of the spatiotemporal features extracted by Res3D from the 800 samples in the AIM-skin dataset is shown in Fig. 6d, which demonstrates the effectiveness of the feature extraction process. Each point in the t-SNE visualization represents the tactile information of one dynamic touch gesture. The points are projected from the 512D spatiotemporal features extracted by Res3D into two dimensions.
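As an illustration of how Eq. (4) relates to a multi-class confusion matrix, the following Python sketch computes the overall accuracy (diagonal sum over total) and, for one class treated one-vs-rest, the TP, TN, FP and FN counts that enter Eq. (4). The 3-class matrix is made-up illustrative data, not the values in Supplementary Table 4, and the function names are hypothetical. Rows are taken as predicted classes and columns as true classes, following the convention stated for Fig. 6e.

```python
def overall_accuracy(cm):
    """Fraction of samples lying on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def one_vs_rest_counts(cm, k):
    """Return (TP, TN, FP, FN) for class k; rows = predicted, columns = true."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fp = sum(cm[k]) - tp                  # predicted k, true class differs
    fn = sum(row[k] for row in cm) - tp   # true class k, predicted otherwise
    tn = total - tp - fp - fn
    return tp, tn, fp, fn


def binary_accuracy(tp, tn, fp, fn):
    """Eq. (4): (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up 3-class example (not data from the paper).
example_cm = [
    [10, 1, 0],
    [2, 9, 1],
    [0, 2, 11],
]

print("overall accuracy:", overall_accuracy(example_cm))
print("class 0 one-vs-rest accuracy:",
      binary_accuracy(*one_vs_rest_counts(example_cm, 0)))
```

In the multi-class case the overall accuracy is the quantity usually reported; the one-vs-rest form is shown only to make the link to the four counts in Eq. (4) explicit.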

Subsequently, we evaluated the trained Res3D model using a separate test dataset consisting of 120 samples from three participants. Figure 6e shows the confusion matrix for this test dataset, obtained by comparing the predicted results with the true labels; each row and column represents an instance in a predicted class and a true class, respectively, and the diagonal values represent correctly predicted results. The results indicate that, although there is some confusion between categories, the overall classification accuracy reaches 89.3%. Additionally, Supplementary Fig. 21 displays the classification accuracy of dynamic touch gestures for each participant based on Res3D. To meet the practical requirements of robot interaction, the recognition model should achieve the best possible accuracy within a short perception time. To optimize the recognition accuracy within a limited perception time, we segmented the time series into different lengths for both model training and testing. As depicted in Supplementary Fig. 22, we applied equal-length segmentation with different numbers of segments to the original 10-s time series. As the number of segments increases, and each segment therefore becomes shorter, the recognition accuracy decreases significantly. The model is considered to have good perceptual ability when the segment length is no shorter than 2 s (division into five equal parts). The results show that the FC-AIMS platform is capable of perceiving dynamic spatiotemporal stimuli and recognizing different dynamic touch gestures using deep learning and neural networks (Supplementary Movie 2). As shown in Fig. 6f, we further integrated backend interactive control based on dynamic tactile perception. In our application demonstration, we used touch gestures as input commands to control the motion of the robotic hand (Supplementary Fig. 23). This tactile-based interactive control emphasizes the importance of our ultrafast readout and crosstalk-suppressed tactile sensing system for real-time interaction with robots (Supplementary Movie 3).
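The equal-length segmentation of the 10-s recordings can be sketched as follows. This is a minimal illustration that assumes each recording holds 5000 frames (the temporal input size given in the Methods), so that five equal parts correspond to 2-s segments of 1000 frames each. The function name and the particular segment counts tried are hypothetical; the counts actually used are those of Supplementary Fig. 22.

```python
def split_equal(frames, n_segments):
    """Split a sequence of frames into n_segments contiguous, equal-length pieces."""
    if len(frames) % n_segments != 0:
        raise ValueError("number of frames must be divisible by n_segments")
    seg_len = len(frames) // n_segments
    return [frames[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


TOTAL_FRAMES = 5000   # temporal length of one recording (from the Methods)
DURATION_S = 10.0     # each gesture recording lasts 10 s

# Stand-in for 5000 frames of 8 x 8 tactile data (frame indices only).
recording = list(range(TOTAL_FRAMES))

for n in (1, 2, 5, 10):   # illustrative segment counts
    segments = split_equal(recording, n)
    print(f"{n} segment(s): {len(segments[0])} frames, {DURATION_S / n:.1f} s each")
```

Each segment would then be treated as an independent training or test sample carrying the label of its parent recording.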

Discussion

High-speed readout and crosstalk suppression of haptic feedback in a highly sensitive iontronic array over an ultrawide pressure range are a valuable complement to dynamic touch perception with high spatiotemporal resolution, enabling robot skin to truly sense complex external stimuli. Here, we reported the FC-AIMS platform, an artificial ion mechanoreceptor skin array (8\(\times\)8 pixels in a 9 cm \(\times\) 9 cm area) equipped with an ultrafast readout architecture based on frequency-coded multiplexing, which demonstrates real-time perception of spatiotemporal tactile information (within 2 ms). We fabricated the flexible AIM-skin array from biomimetic ion nanofiber membranes, which enable a highly sensitive response of charged ions to stimuli (>1.2 \({{\rm{kPa}}}^{-1}\) in 0.12–1880 kPa). Meanwhile, we designed a frequency-coding architecture with a parallel zero-potential mechanism (PZPM) to suppress crosstalk (<0.4%) between sensing units and achieve high precision (relative error < 6%) in capacitance measurements. Furthermore, the FC-AIMS platform can be easily adapted for use with other types of iontronic sensor arrays, providing flexibility and versatility in various applications.

Compared to other state-of-the-art iontronic skins, our device offers several advantages, including an ultrafast readout rate, an ultrawide pressure range, high-precision measurement, and a simple device structure3,16,28. Our work meets the requirements for an ultrafast readout rate even as the spatial density of the array increases, while also ensuring electrical crosstalk suppression. This opens up an advanced and feasible path for the development of large-area artificial skin with ultrafast response capabilities. We demonstrated a dynamic tactile recognition task with a plush toy robot fitted with our sensor, which can perceive different dynamic touch gestures with the assistance of neural networks and deep learning. The approach could form the starting point of a touch-based information channel in many emerging applications, including robotic tactile sensing, emotional touch perception, sensory feedback from prosthetics and advanced human-machine interfaces3,54,55,56,57.

Methods

Preparation of microstructured electrode array

Polydimethylsiloxane (PDMS, Sylgard 184, Dow Corning Co., Ltd) and curing agent were mixed at a mass ratio of 10:1 and defoamed in a vacuum drying oven. The degassed PDMS was spin-coated onto commercially available sandpaper (P1200) stencils at 500 r/min and baked in an oven at 80 °C for 2 h. Finally, the cured PDMS was peeled from the sandpaper surface. An Ag layer (200 nm) was deposited on this PDMS substrate by magnetron sputtering through a mask, yielding the final flexible electrode array of eight strips, each 8.5 cm long and 5 mm wide.

Preparation of ionic nanofiber membrane

The active layer in the iontronic array uses a core-shell structured ionic nanofiber membrane as reported in our previous work. In the first step, an 18 wt% thermoplastic polyurethane (TPU, 1065A, Great Eastern Resins Industrial Co., Ltd.) solution as the shell layer precursor and a 10 wt% PVDF-HFP (Sigma-Aldrich) solution as the core layer precursor were prepared. The solvent for both was a mixture of 60% N,N-dimethylformamide (DMF, Macklin) and 40% acetone (Siron Technology Co., Ltd.). In the second step, ionic liquid ([BMIM][TFSI], Langang Greenchem) with a mass 0.5 times that of the PVDF-HFP was added to the core layer precursor solution with continued stirring. Once the shell and core layer precursor solutions were completely clear, transparent and uniform, they were electrospun using coaxial needles. The voltage between the needle and the aluminum receiver drum during electrospinning was 18 kV and the distance was 10 cm. The feed rates of the core and shell layer solutions were 0.15 ml/h and 0.3 ml/h, respectively. The ambient temperature during electrospinning was controlled at 30 °C. Finally, the obtained ionic nanofiber membrane was cut to 9 cm \(\times\) 9 cm.

Fabrication of iontronic arrays

The prepared ionic nanofiber membrane was used as the active layer of the iontronic array, sandwiched between the patterned upper and lower electrode arrays. The two electrode arrays were oriented perpendicular to each other. Finally, the entire sensor array was encapsulated with polyimide tape. The array was connected to the readout architecture via leads.

Experiment measurement and characterization on sensor performance

An LCR meter (E4980AL, KEYSIGHT) was used to measure the capacitance (all capacitance measurements were performed at 100 mV and 1 kHz unless otherwise specified). The pressure applied to the sensors was controlled and recorded by a tensiometry machine (LD23.501, LSD). A lock-in amplifier (MFLI, Zurich Instruments) was used for aliasing signal acquisition and spectrum visualization. A linear motor (LinMot.Linear Guides B, NTI AG) was used for the multi-point dynamic touch experiments.

Preparation of the frequency-coding architecture

An AD9833 was used to generate the sinusoidal excitation voltage signal. Its output frequency and phase are programmable by software, and a resolution of 0.1 Hz can be achieved at a clock frequency of 25 MHz. The microcontroller (STM32F103C8T6) uses SPI to control eight AD9833 chips, providing stable sinusoidal outputs at different frequencies as the excitation sources for frequency coding (Supplementary Fig. 24). The amplitude of the output voltage is regulated by an inverting proportional amplifier at the back end. The entire sinusoidal excitation system was integrated on a homemade printed circuit board.
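The quoted frequency resolution follows from the AD9833's 28-bit frequency register: the output frequency is the register value multiplied by f_MCLK / 2^28, which is about 0.093 Hz at 25 MHz. The Python sketch below computes the register word for a target frequency and the frequency actually produced. The eight example frequencies are hypothetical placeholders, since the excitation frequencies used in this work are not stated here, and the helper names are illustrative only.

```python
MCLK_HZ = 25_000_000   # AD9833 master clock stated in the text
FREQ_REG_BITS = 28     # width of the AD9833 frequency register


def frequency_resolution(mclk_hz=MCLK_HZ):
    """Smallest output frequency step in Hz: f_MCLK / 2^28."""
    return mclk_hz / 2 ** FREQ_REG_BITS


def tuning_word(f_out_hz, mclk_hz=MCLK_HZ):
    """Nearest 28-bit frequency register value for the requested output frequency."""
    word = round(f_out_hz * 2 ** FREQ_REG_BITS / mclk_hz)
    if not 0 <= word < 2 ** FREQ_REG_BITS:
        raise ValueError("requested frequency is outside the register range")
    return word


def actual_frequency(word, mclk_hz=MCLK_HZ):
    """Output frequency in Hz produced by a given register value."""
    return word * mclk_hz / 2 ** FREQ_REG_BITS


# Hypothetical excitation frequencies, one per AD9833 chip.
EXAMPLE_FREQS_HZ = [10_000 + 1_000 * i for i in range(8)]

print(f"resolution: {frequency_resolution():.4f} Hz")
for f in EXAMPLE_FREQS_HZ:
    w = tuning_word(f)
    print(f"{f} Hz -> word {w} -> {actual_frequency(w):.3f} Hz")
```

Because the register value is rounded to the nearest integer, the generated frequency differs from the requested one by at most half the resolution step.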

An OPA388 was used to construct the capacitance-to-voltage (C/V) converter. As shown in Fig. 4b, the rows of the sensor array are connected to the inputs of the C/V converters. A change in the capacitance of a given node changes the output voltage amplitude at the corresponding frequency, allowing the pressure change at each tactile node to be located (see Supplementary Note 1 for the design of the C/V converter).

The AD7606, a 16-bit ADC chip for analog voltage signal acquisition, provides synchronous sampling of eight voltage channels at a sampling rate of up to 200 kSPS. As shown in Fig. 4a, the voltage output of each C/V converter is connected to a single voltage acquisition channel of the AD7606 to achieve synchronous acquisition of eight voltage signals. The entire synchronous acquisition system was integrated on a homemade printed circuit board.

An FPGA (Xilinx-ZYNQ-7020) was used as the digital quadrature demodulator for synchronous parallel demodulation. The Programmable Logic (PL) of the Xilinx-ZYNQ-7020 reads the digitized voltage signals from the AD7606 in parallel, then demodulates them and solves for the amplitude of the signal at each frequency to obtain the capacitance values at the different nodes (see Supplementary Note 2 for the design theory of the digital quadrature demodulator). The Processing System (PS) of the Xilinx-ZYNQ-7020 sends the processed signal values to the host computer through the network port or serial port. See Supplementary Fig. 25 for the hardware composition of the signal acquisition system.
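The principle of quadrature demodulation under frequency coding can be illustrated with a short floating-point sketch in Python; it is not the FPGA implementation, whose design is given in Supplementary Note 2. The sketch assumes a 2-ms window at the 200 kSPS sampling rate (400 samples) and eight hypothetical excitation frequencies chosen so that each completes an integer number of cycles in the window, which makes the components mutually orthogonal. One row signal is synthesized as a sum of eight sinusoids whose amplitudes stand in for the node capacitances, and each amplitude is then recovered by multiplying with in-phase and quadrature references and averaging.

```python
import math

FS_HZ = 200_000    # ADC sampling rate stated in the text (200 kSPS)
N_SAMPLES = 400    # assumed 2-ms demodulation window at 200 kSPS


def demodulate(samples, freq_hz, fs_hz=FS_HZ):
    """Return the amplitude of the freq_hz component by I/Q demodulation."""
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, x in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * k / fs_hz
        i_sum += x * math.cos(phase)   # in-phase product
        q_sum += x * math.sin(phase)   # quadrature product
    # Averaging acts as the low-pass filter; the factor 2 restores the amplitude.
    return 2.0 * math.hypot(i_sum / n, q_sum / n)


# Hypothetical column excitation frequencies and resulting amplitudes on one row.
freqs_hz = [10_000 + 1_000 * c for c in range(8)]
true_amps = [0.10 * (c + 1) for c in range(8)]
phases = [0.3 * c for c in range(8)]   # arbitrary, unknown to the demodulator

# Row signal: superposition of all eight frequency-coded components.
row_signal = [
    sum(a * math.sin(2.0 * math.pi * f * k / FS_HZ + p)
        for a, f, p in zip(true_amps, freqs_hz, phases))
    for k in range(N_SAMPLES)
]

recovered = [demodulate(row_signal, f) for f in freqs_hz]
for f, a, r in zip(freqs_hz, true_amps, recovered):
    print(f"{f} Hz: true {a:.3f}, recovered {r:.3f}")
```

Because the magnitude of the I/Q pair is used, the recovered amplitude does not depend on the phase shift introduced by the analog front end, and all eight components of a row are obtained from the same window of samples.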

Digital integrated circuit design

A parallel digital quadrature demodulator based on FPGA is designed to calculate the capacitance of array nodes under frequency coding. The digital circuit design structure can be seen in the Supplementary Fig. 20. The multi-channel parallel demodulation can meet the high-speed response of the sensor array.

Dynamic touch gesture recognition experiments

The sensor array was installed on the surface of the plush toy robot. After the signals from the AIM-skin array tactile units are conditioned by the synchronous demodulation circuit described above, the data are packaged and sent to the computer through the network port for processing. Each data frame contains the 64 signal values of the 8\(\times\)8 tactile sensor array. In addition, customized interactive control software was written in Python (Python 3.9.0) on the computer to analyze, calculate and save the measurement data. Four NVIDIA T4 Tensor Core GPUs were used for model inference.

A total of 20 volunteers of different genders and ages (20–30 years old, 14 men and 6 women) participated in the dynamic touch recognition experiment. As shown in Fig. 6b, we defined ten types of gestures. Before each gesture, the screen in front of the participant played a video of the action together with its label. After hearing the start prompt tone, participants performed the dynamic touch action continuously for 10 s. The strength and position of the movements followed each subject’s personal habits. Twenty participants completed ten types of gestures, and each gesture was repeated four times to obtain the total dataset (800 sets in total). The above experiments were conducted with the full informed consent of the volunteers.
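The dataset size and the 15:2:3 split can be checked with a few lines of Python. This sketch assumes the split is made at the participant level, which is consistent with the test set of 3 participants and 120 samples; the random seed, the function names and the (participant, gesture, repeat) indexing are hypothetical.

```python
import random

N_PARTICIPANTS = 20
N_GESTURES = 10
N_REPEATS = 4


def split_participants(seed=0):
    """Randomly assign participant IDs to train/validation/test at 15:2:3."""
    ids = list(range(N_PARTICIPANTS))
    random.Random(seed).shuffle(ids)
    return ids[:15], ids[15:17], ids[17:]


def samples_for(participants):
    """List one (participant, gesture, repeat) index per recorded sample."""
    return [(p, g, r)
            for p in participants
            for g in range(N_GESTURES)
            for r in range(N_REPEATS)]


train_ids, val_ids, test_ids = split_participants()
train, val, test = (samples_for(ids) for ids in (train_ids, val_ids, test_ids))
print(len(train), len(val), len(test), len(train) + len(val) + len(test))
```

Splitting by participant rather than by sample keeps every test recording from a person the model has not seen during training.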

Spatiotemporal stimulus recognition model training and optimization

We implemented the Res3D network in the PyTorch framework to perform dynamic touch gesture classification for the highly spatiotemporally responsive I-skin system. The Res3D model has a 3D-ResNet18 architecture, as shown in Supplementary Fig. 20. We use an input of \(5000\times 8\times 8\), because a sufficiently long frame sequence contains more dynamic gesture information. Two downsampling convolution layers with stride \(10\times 1\times 1\) reduce the input from \(5000\times 8\times 8\) to \(50\times 8\times 8\). In Supplementary Fig. 20, downsampling strides are denoted as \(t\times s\), where \(t\) and \(s\) are the temporal and spatial strides, respectively, and dotted lines are residual connections with downsampling. Furthermore, to prevent overfitting, we apply regularization to the model: we set the “weight_decay” parameter in the optimizer to 0.01 and add a Dropout layer after the model’s fully connected layers with the parameter “p” set to 0.1. We also utilize data augmentation with the following settings: random horizontal flipping (probability of 50%), random rotation (−30° to +30°), and the addition of Gaussian noise (mean = 0, standard deviation = 0.1). To train the Res3D model, we use Adam on 4 GPUs with a mini-batch size of 60 examples. The initial learning rate is 0.001; it is divided by 10 every 20 iterations, and training ends at 50 iterations.
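Two of the numerical settings above can be made concrete with a small framework-independent Python sketch: the temporal length after the two stride-10 downsampling layers, and the step-decay learning-rate schedule. It assumes that each strided layer divides the temporal length exactly by its stride (that is, kernel size and padding are chosen accordingly) and that the decay is applied at iterations 20 and 40; the function names are hypothetical.

```python
def temporal_length(n_frames, strides=(10, 10)):
    """Temporal length after strided layers, assuming exact division by each stride."""
    for s in strides:
        if n_frames % s != 0:
            raise ValueError("length is not divisible by the stride")
        n_frames //= s
    return n_frames


def learning_rate(iteration, base_lr=0.001, step=20, factor=10.0):
    """Step decay: divide the learning rate by `factor` every `step` iterations."""
    return base_lr / factor ** (iteration // step)


N_ITERATIONS = 50
schedule = [learning_rate(i) for i in range(N_ITERATIONS)]

print("temporal length:", temporal_length(5000))
for i in (0, 19, 20, 39, 40, 49):
    print(f"iteration {i}: lr = {schedule[i]:.0e}")
```

Under these assumptions the learning rate is 1e-3 for iterations 0 to 19, 1e-4 for iterations 20 to 39 and 1e-5 for the final ten iterations.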

A 2D convolutional network (C2D) and a 3D convolutional network (C3D) can also be used for tactile feature learning58. The C2D and C3D network architectures are shown in Supplementary Fig. 26. Applying 2D convolution to a touch data volume (multiple frames flattened into one plane) results in an image. In contrast, applying 3D convolution to a touch data volume results in another volume, preserving the spatiotemporal information of the input signal. Compared to the C2D network (accuracy = 74.5%) and the C3D network (accuracy = 78.6%), Res3D extracts spatiotemporal features for touch gesture classification more effectively and achieves a higher recognition accuracy.