arising from L. Salmela et al. Nature Machine Intelligence https://www.nature.com/articles/s42256-021-00297-z (2021)

With their internal memory, recurrent neural networks can be used to learn and predict time-dependent behaviours. In their recent work, Salmela et al.1 present a recurrent neural network architecture to learn and predict complex nonlinear propagation in an optical fibre based on the input pulse intensity profile in the time domain. Here, we extend their model to the case of spatiotemporal nonlinear propagation for an arbitrary number of modes in graded-index multimode fibres. In addition to the original work’s focus on predicting the temporal evolution of pulses, we show that the method is applicable for modelling and predicting spatial beam propagation incorporating nonlinear mode coupling.

The demonstrated method of Salmela et al.1 can be an alternative solution to time-consuming and computationally heavy nonlinear pulse propagation simulations. In essence, the method can accurately reproduce the complex nonlinear evolution governed by the nonlinear Schrödinger equation (NLSE) by employing long short-term memory (LSTM) nodes in an artificial neural network. Such a network architecture is capable of modelling sequential dependencies. Salmela et al.1 tested their model for pulse compression and ultra-broadband supercontinuum generation. They were able to accurately predict temporal and spectral evolutions of ultrashort pulses in a highly nonlinear fibre. Using the same neural network architecture, we trained the network to predict the spatiotemporal evolution of ultrashort pulses. In this study, we hypothesized that their recurrent neural network might be suitable to predict the spatiotemporal field evolution, given that it successfully predicted temporal dynamics governed by the same NLSE that also describes the spatial domain. Given that the NLSE is also applicable to other physical systems, it may be possible to apply the same approach to a generic, normalized form of the NLSE, for example, in Bose−Einstein condensation, hydrodynamics and plasma physics2.

Spatiotemporal nonlinearities and simulations

The study in ref. 1 focuses on a single-mode fibre with spectral or temporal nonlinear evolution of pulses along the propagation axis by computing 2-dimensional (2D, with 1 spatial coordinate + 1 time coordinate) simulations. Further, the authors apply their method to a step-index multimode fibre by computing the propagation of five modes of the investigated fibre, following a similar (1 + 1)D simulation and incorporating mode coupling by a matrix product, calculated from the overlap integrals of the modes of interest. In this study, we change the medium from a step-index to a graded-index multimode fibre and compute 4D, that is, (3 + 1)D (3 spatial coordinates + 1 time coordinate), simulations in which the interaction of all the available modes of the fibre arises naturally, since all the contributing spatiotemporal degrees of freedom in the NLSE are included.

With relatively low modal dispersion and periodic self-imaging, graded-index multimode fibres are of significant interest for nonlinear optics, imaging and telecommunications studies. In recent years, various interesting nonlinear dynamics such as spatiotemporal instability3,4, dispersive wave generation5, graded-index solitons6,7, self-beam cleaning8, nonlinear pulse compression9, and supercontinuum generation10,11 have been reported. In addition to the aforementioned single-pass dynamics, spatiotemporal mode-locked lasers12,13,14 have been realized, thanks to the low-modal-dispersion pulse propagation in graded-index multimode fibres. Using a spatial light modulator, learning and controlling nonlinear optical dynamics in graded-index multimode fibres was demonstrated by modifying the spatial properties of the intense pump pulse15,16. Recently, spatiotemporal nonlinear interactions in a graded-index multimode fibre were introduced as an optical computing engine, which performed well on a range of machine learning tasks, from classifying COVID-19 X-ray lung images and speech recognition to predicting age from images of faces17.

Numerical analysis is required to understand the underlying complex spatiotemporal dynamics of pulse propagation in a multimode fibre. The main challenge in multimode fibre simulations is the addition of spatial degrees of freedom. In a single-mode fibre simulation, only the time-domain grid needs to be established, and propagation can then be implemented, for instance, by split-step Fourier simulations, which have a low computational cost because only 1D Fourier transforms are computed at every step. In multimode fibres, there exist multiple propagating modes with different spatial distributions. Describing a pulse therefore requires a sampling grid in the two transverse dimensions X and Y in addition to time. As a result, 3D Fourier transforms (two spatial dimensions in the transverse plane and one dimension in time) must be computed at every step taken along the propagation direction Z to provide a (3 + 1)D simulation. This is computationally costly and time-consuming. To overcome the computational load of (3 + 1)D beam propagation simulations, mode-resolved simulation methods based on pre-calculated nonlinear mode coupling have been proposed18,19. However, mode-resolved simulations are time-efficient only for fewer than 10 modes, and a low number of modes may not give an accurate picture of the spatiotemporal nonlinear propagation in a fibre of more than 200 modes. In this regard, the work by Salmela et al.1 enables a faster computation scheme once the neural network is trained20.
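
To make the cost argument concrete, the following is a minimal sketch of the 1D symmetric split-step Fourier method for a single-mode NLSE. It is not the TD-BPM code used in this study: the function name split_step_nlse, the normalized units, the sign convention stated in the docstring and the soliton test case are all assumptions chosen for illustration.

```python
import numpy as np

def split_step_nlse(a0, dt, dz, n_steps, beta2=-1.0, gamma=1.0):
    """Symmetric split-step Fourier propagation of a 1D complex envelope.

    Assumed convention (normalized units):
        dA/dz = -1j*(beta2/2)*d2A/dt2 + 1j*gamma*|A|**2*A
    """
    omega = 2 * np.pi * np.fft.fftfreq(a0.size, d=dt)  # angular-frequency grid
    # Dispersion over half a step, applied in the frequency domain.
    half_disp = np.exp(0.25j * beta2 * omega**2 * dz)
    a = a0.astype(complex)
    for _ in range(n_steps):
        a = np.fft.ifft(half_disp * np.fft.fft(a))      # half dispersion step
        a = a * np.exp(1j * gamma * np.abs(a)**2 * dz)  # full nonlinear step
        a = np.fft.ifft(half_disp * np.fft.fft(a))      # half dispersion step
    return a

# Hypothetical check: a fundamental soliton sech(t) should keep its shape.
t = np.linspace(-20.0, 20.0, 256, endpoint=False)
a_in = 1.0 / np.cosh(t)
a_out = split_step_nlse(a_in, dt=t[1] - t[0], dz=0.01, n_steps=100)
```

Each step here costs two 1D transforms of length Nt. In a (3 + 1)D scheme the same loop structure would presumably transform the full (x, y, t) grid at every step (for example with np.fft.fftn), so the cost per step grows with Nx·Ny·Nt rather than Nt.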

In our study, we first tested the neural network presented in ref. 1 on a dataset of numerically computed fibre outputs generated with the (3 + 1)D split-step Fourier method, which considers the interaction of all available fibre modes. We call this the time-dependent beam-propagation method (TD-BPM). We implemented a graphics processing unit (GPU)-parallelized TD-BPM in Python to generate the dataset. To remain faithful to the original approach, we integrated the intensity of the TD-BPM outputs over the spatial domain to obtain only the time-domain evolution. The performance of the network in the time domain (but with spatially integrated modes) is illustrated in the ‘Temporal results’ section. The ‘Spatial results’ section shows how well the network predicts the intensity profile along the propagation from the time-integrated data. Owing to the network architecture, the data is fed to the network after a dimension reduction by time-averaging or space-averaging. Nevertheless, the spatiotemporal effects are still imprinted on the reduced data, and each recurrent neural network (RNN) model is able to capture them.
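
The dimension reduction described above can be sketched as follows. This is an illustrative assumption about the data layout, not the study's code: the field is taken to be a complex array indexed as (x, y, t) on a uniform grid, and the function names and grid spacings are hypothetical.

```python
import numpy as np

def temporal_profile(field, dx, dy):
    """Integrate |E|^2 over the transverse plane: (nx, ny, nt) -> (nt,)."""
    return np.sum(np.abs(field) ** 2, axis=(0, 1)) * dx * dy

def spatial_profile(field, dt):
    """Integrate |E|^2 over time: (nx, ny, nt) -> (nx, ny)."""
    return np.sum(np.abs(field) ** 2, axis=2) * dt

# Hypothetical complex field on a small grid, for illustration only.
rng = np.random.default_rng(0)
field = rng.normal(size=(8, 8, 16)) + 1j * rng.normal(size=(8, 8, 16))
dx = dy = 0.5
dt = 0.1
p_t = temporal_profile(field, dx, dy)   # input for the temporal network
p_xy = spatial_profile(field, dt)       # input for the spatial network
```

Both reductions discard the phase and one set of coordinates, but they preserve the total pulse energy, which is one simple consistency check on the reduced data.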

Results

Temporal results

The datasets generated by the aforementioned TD-BPM contain 1,000 examples of spatiotemporal nonlinear propagation of femtosecond pulses (see Supplementary Discussion 1 for details). Following the original work and using the sample code, with a small modification to increase the number of nodes in each layer from 250 to 500, we trained and tested spectral and temporal nonlinear propagation in a graded-index multimode fibre. Each dataset is split into 950 propagation samples for training and 50 propagation samples for testing. During the training, at each epoch, the training data is split randomly in a 9 to 1 ratio to generate the validation set; this procedure is repeated for every training process in this study. The TD-BPM-generated data is first converted to logarithmic scale and normalized. The evolutions of the mean absolute error metric for training the networks are presented in Supplementary Discussion 4.
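
A minimal sketch of the preprocessing and data split described above is given here. The exact normalization is not stated in the text, so the clipped decibel range floor_db and both function names are assumptions. The split is drawn once here, whereas the text describes re-drawing the validation set at each epoch.

```python
import numpy as np

def to_normalized_db(intensity, floor_db=-55.0):
    """Map linear intensity to [0, 1] on a clipped decibel scale.

    floor_db is an assumed dynamic range, not a value from the study.
    """
    ratio = intensity / intensity.max()
    # Clip before the logarithm so that zero intensity maps to floor_db.
    db = 10.0 * np.log10(np.maximum(ratio, 10.0 ** (floor_db / 10.0)))
    return (db - floor_db) / (-floor_db)

def split_indices(n_samples, n_test, val_fraction, rng):
    """Random train/validation/test index split."""
    idx = rng.permutation(n_samples)
    test, rest = idx[:n_test], idx[n_test:]
    n_val = int(round(val_fraction * rest.size))
    return rest[n_val:], rest[:n_val], test

rng = np.random.default_rng(0)
# 1,000 samples: 50 for testing, the remaining 950 split 9 to 1.
train_idx, val_idx, test_idx = split_indices(1000, 50, 0.1, rng)
scaled = to_normalized_db(np.array([1.0, 0.1, 1e-3, 0.0]))
```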

Similar to the work by Salmela et al.1, we tested the recurrent neural network for stepwise and complete propagation predictions in the frequency and time domains but in a multimode fibre. The best performance of the neural network is observed for stepwise predictions. The stepwise performances of the network for spectral and temporal data are presented in Supplementary Fig. 1 and Supplementary Fig. 2. For the complete propagation predictions, using only the injected pulse profile leads to accumulated errors; however, as shown in Figs. 1 and 2, the difference between the TD-BPM (ground truth) and the predictions is small and in an acceptable range.
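
The difference between stepwise and complete propagation prediction can be illustrated with a short autoregressive loop. This is a sketch under stated assumptions: predict_step is a stand-in for a trained recurrent network, the first window is assumed to be the input profile repeated window times (the text does not specify the initialization), and the toy model is purely hypothetical.

```python
import numpy as np

def rollout(predict_step, input_profile, n_steps, window=10):
    """Autoregressive prediction of a full propagation from the input profile.

    predict_step: callable mapping a (window, n_features) array to the next
    (n_features,) profile; it stands in for a trained recurrent network.
    Assumption: the first window is the input profile repeated `window` times.
    """
    history = np.tile(input_profile, (window, 1))
    outputs = []
    for _ in range(n_steps):
        nxt = predict_step(history)
        outputs.append(nxt)
        # Slide the window: drop the oldest step and append the new prediction,
        # so any prediction error is fed back into later steps.
        history = np.vstack([history[1:], nxt])
    return np.array(outputs)

# Toy stand-in model (hypothetical): each step halves the latest profile.
toy_model = lambda w: 0.5 * w[-1]
evolution = rollout(toy_model, np.ones(4), n_steps=3, window=3)
```

In a stepwise prediction the window would instead be filled with ground-truth profiles at every step, which is why errors do not accumulate in that mode.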

Fig. 1: An example of the spectral intensity evolution of a high-power femtosecond pulse in a graded-index multimode fibre.

a, Schematic of data generation and training pipeline. b, Time-dependent beam propagation (ground truth). c, Recurrent neural network (RNN)-predicted pulse propagation. d, Difference between the ground truth and the prediction. e–g, Time-dependent beam propagation simulation results and the recurrent neural network predicted results at different propagation lengths. The recurrent neural network predictions use only the injected pulse intensity profile as input. The colour bars show intensity in decibels and z0 is the self-imaging period of the fibre.

Fig. 2: An example of temporal intensity evolution of a high-power femtosecond pulse in a graded-index multimode fibre.

a, Time-dependent beam propagation (ground truth). b, Recurrent neural network predicted pulse propagation. c, Difference between the ground truth and the prediction. d–f, Time-dependent beam propagation simulation results and the recurrent neural network predicted results at different propagation lengths. The recurrent neural network predictions use only the injected pulse intensity profile as input. The colour bars show intensity in linear scale, which is normalized over the whole dataset. z0 is the self-imaging period of the fibre.

Spatial results

The dataset is generated by integrating the outputs of TD-BPM in the time domain to generate spatial-domain-only intensity distributions. A graded-index fibre with a 50-µm core diameter, supporting 240 modes at 1,030 nm wavelength, is digitally created. A total of 1,000 different propagation cases are generated by having different spatial excitations at the fibre input. LP01, LP02, LP03, LP11, LP12, and LP21 modes are superposed with random coefficients while keeping the peak power fixed at 1 GW to encourage nonlinear inter-modal coupling within a short fibre length that is chosen to be ten times the self-imaging period of the graded-index multimode fibre (see Supplementary Discussion 1 for further details on data generation). We note that the launched field is limited to 6 modes. However, the energy can couple into the higher-order modes of the graded-index fibre (240 available modes) upon propagation due to mode coupling. The dataset is divided into training data (950 samples) and testing data (50 samples). The data is down-sampled to 32 by 32 pixels on the transverse x and y axes and 120 steps on the z axis. The 2D spatial information is converted to a 1D array of 1,024 elements to employ the original network architecture that accepts 1D intensity profiles. The LSTM and dense layer node number is set to 1,000 and the window size, which is the number of previous steps fed to the LSTM, is set to 15 (instead of the 10 used by Salmela et al.1). In Fig. 3, the prediction of the trained network on a test sample is shown, with the X−Z propagation profile and X−Y transverse profiles of the first, middle and last steps, along with the corresponding TD-BPM results that serve as the ground truth.
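
The flattening and windowing described above can be sketched as follows. The function name make_windows and the random stand-in sample are hypothetical, and the sketch assumes that only steps with a full window of 15 predecessors are used as training targets, which the text does not specify.

```python
import numpy as np

def make_windows(evolution, window=15):
    """Build (input window, next step) training pairs from one propagation.

    evolution: (n_z, ny, nx) intensity frames along z.
    Returns X with shape (n_z - window, window, ny*nx) and
            y with shape (n_z - window, ny*nx).
    Assumption: only steps with a full window of predecessors are used.
    """
    n_z = evolution.shape[0]
    flat = evolution.reshape(n_z, -1)  # each 2D frame becomes a 1D array
    X = np.stack([flat[i:i + window] for i in range(n_z - window)])
    y = flat[window:]
    return X, y

# Hypothetical stand-in for one down-sampled TD-BPM sample (120 x 32 x 32).
sample = np.random.default_rng(0).random((120, 32, 32))
X, y = make_windows(sample, window=15)
```

With 120 z-steps and a window of 15, each propagation yields 105 training pairs, each input being a sequence of 15 flattened frames of 1,024 elements.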

Fig. 3: An example spatial intensity evolution of a 1-GW femtosecond pulse in a graded-index multimode fibre.

a, Time-dependent beam propagation (ground truth) in X−Z mid-plane. b, Recurrent neural network prediction of pulse propagation in X−Z mid-plane. c, Difference between the ground truth and the prediction of the propagation profiles. d–f, The time-dependent beam propagation simulation result and the recurrent neural network prediction result after the first z-step in the transverse plane (X−Y) and the relative difference, respectively. g–i, Similar plots to those in d–f at half the fibre length. j–l, Similar plots to those in d–f at the output plane (last propagation step). The recurrent neural network predictions use only the injected pulse intensity profile from the test data as input. The colour bars show intensity in linear scale, which is normalized over the whole dataset. All the fields are up-sampled for better visualization and z0 is the self-imaging period of the fibre.

Discussion

During our study, we compared the runtime of the TD-BPM with the training and inference times of the recurrent neural network architecture. The required training time for the recurrent neural network is comparable to the data-generation time of the TD-BPM, which is around 50 min for 1,000 samples (the technical details are provided in Supplementary Discussion 1). On the other hand, as anticipated, inference with the recurrent neural network is more than 40 times faster than TD-BPM for single-pass pulse propagation with graphics-card-based parallel processing on an Nvidia Tesla V100 GPU.

Temporal results show that the network successfully infers the time evolution of a pulse. Since a different simulation method (TD-BPM instead of mode-resolved, as used in the original work) and medium (graded-index multimode fibre instead of single mode and step-index multimode fibre, as used in the original work) are chosen in this study, we can state that the proposed architecture is capable of grasping the NLSE-governed dynamics without relying on a particular method to generate training data. Owing to the selected pulse parameters (duration and central wavelength), the dataset contains supercontinuum generation from self-phase modulation and spatiotemporal instability3,4. As shown in the ‘Results’ section, the neural network can predict the separate and combined spatiotemporal instability peaks, which occur at around 632 nm and 768 nm, respectively, remarkably well.

The proposed architecture is able to predict spatial propagation reasonably well, as demonstrated in Fig. 3. However, a significant amount of error is also present, which is higher than the error obtained in temporal-only predictions. In Supplementary Discussion 3, we investigated a simpler spatial scenario where the input distribution is fixed as a doughnut shape and the pulse power is varied on the order of MW to have relatively mild nonlinear interactions. This scenario yielded a lower mean absolute error than the results provided in Fig. 3, where the spatial distribution of the input field is varied and pulse power is set to GW to enable more nonlinear interactions. This comparison hints that a degradation in the performance of prediction occurs as the variations within the dataset and the strength of nonlinear interaction increase. The main cause of this performance issue may be the intensity-only nature of the implemented neural network architecture. Physically, the field evolution is a product of the intensity and phase changes in time and space. However, in this architecture, the network is forced to learn the nonlinear propagation of a complex field without having the phase information. Even so, the network mimics the overall propagation trend, which is quite an achievement given the fact that half of the information required is not provided. This is because in the dataset the complex field evolution is generated by including the effects of all the dimensional degrees of freedom. The dimensional reduction by time-averaging and taking the squared norm to convert the complex field into intensity does not completely erase the trace of the complex higher-dimensional field evolution because these traces manifest themselves in the intensity evolution. Verification of this point can be found in Supplementary Discussion 5, where the recurrent neural network is trained with input fields that vary only in phase.

Another important factor is the resilience to under-sampling. Considering the low discretization in the inference, it is straightforward to say that the recurrent neural network is more flexible in terms of sampling constraints. However, this advantage is a result of the training phase, where the data is generated with appropriately sampled simulation frames. The accurate data is then down-sampled and provided to the network. In the case of under-sampled training data that contains sudden pixel-to-pixel jumps, the trained network fails to model the NLSE and yields unrelated predictions for the propagation.

Future directions

There are two main directions in which to expand the scope of the proposed recurrent neural network architecture: introducing the spatial and temporal characteristics together in the network, instead of decoupling the space and time information of the pulse, and developing a network capable of handling complex fields. Neural network architectures that deal with complex fields have already been presented, such as a neural network that decomposes the output field into LP modes21,22. With a similar scheme, the network could accept transverse complex fields and the time-domain information in a (2 + 1)D fashion to perform the nonlinear evolution step by step in the propagation direction. With such augmented dimensionality, 2D and 3D convolutional layers could replace the fully connected layers before and after the LSTM, because light propagation is governed by convolution with a diffraction kernel.