Introduction

Since the launch of Sputnik in 1957, approximately 13,000 satellites have been deployed in the Low Earth orbit1. With collisions, explosions and fragmentations, the number of space debris sized less than 1 cm is now estimated to be more than 1,000,0001. The US Space Surveillance Network has a catalogue of about 30,000 resident space objects only1. Since 97% of space debris is not tracked, they pose a greater threat to active satellites as well as future space missions.

With this large number of space debris, the risk of in-orbit collision has increased, and the number of collision events has seen a rise in the past decade. Collision in space can be destructive, for example, Iridium-Cosmos collision2 of 2009, or non-destructive, for example, Canadarm2 of the International Space Station hit by a small piece of space debris in 20213. Another incident which was catalogued as a non-destructive collision was in 2013 when Blits satellite was possibly hit by un-tracked space debris4. In non-destructive collision events, fragmentation does not happen. While there has been exhaustive research on modelling destructive collisions5,6, on risk assessment due to small space debris7, and reconnecting fragments in space to their parent satellite body8, the extraction of trajectory information of untracked space debris observing a non-destructive collision has not been studied extensively.

Recently, Harsha et al.9 formulated a space debris position, velocity and mass estimation problem considering the position and velocity deviation of an active satellite as observation with non-destructive and elastic collision assumption. The performance of the classical estimation methods as well as deep learning (DL) based approaches, were examined for this problem. It was observed that the position, velocity, and mass estimation performance of the Ensemble Neural Network-based technique are similar to those of classical methods. It should be noted that the elastic collision assumption in the above preliminary study was idealistic. The above estimation problem becomes complex when a more rigorous inelastic collision model is considered where the momentum is conserved, but the kinetic energy is not conserved. Hence, it is of interest to examine the trajectory estimation performance of both classical and ML-based methods under the inelastic collision assumption.

Deep Neural Network (DNN)-based techniques have already been proposed for asteroid exploration, spacecraft rendezvous and terrain navigation10. The convolutional neural network (CNN) has been proposed for pose estimation of uncooperative spacecraft11 and for object detection and tracking in space12. ESA’s Hera mission is planning to carry out image processing of double asteroid system Dimorphos-Didymos using CNN13 and demonstrate optical navigation. This mission will also investigate the impact of kinetic impactor released by NASA’s Double asteroid redirection test (DART) mission using deep learning techniques14.

Deep learning techniques have also been demonstrated to carry out initial screening of conjunction assessment among 170 million space-object pairs15, which otherwise is a computation intensive process in the domain of Space Situational Awareness. Automatic collision avoidance maneuver have been proposed using ML algorithms16 combining with Dempster-Shafer theory of evidence. Deep Neural network achieve satisfactory performance when trained with large amounts of data. However, these standard neural network-based methods inevitably face the challenge of drawing conclusions and making decisions under partial information, where the data acquisition is costly and scarce while dealing with complex and non-linear physical systems17. In addition, the solution of the traditional neural network does not guarantee adherence to the underlying physical properties in problems involving physical systems. To address these issues, the Physics informed Neural network (PINN) was proposed17. PINN18,19 uses the laws governing the system dynamics in the loss function as prior information. Note that the training of a neural network essentially implies finding the set of weights and biases for all the nodes in the network, which minimises the loss function. Hence, the inclusion of the system dynamics based on physical laws in the loss function act as a constraint for the output while training the neural network using the training data set. The PINNs can be used as surrogates in learning models used for autonomous navigation, uncertainty quantification and any real-time application that need inference20.

In this article, we have studied the problem of tracking unknown space debris, observing an inelastic and non-destructive collision event. Additionally, we assumed the space debris was in a stable orbit before the collision. Under the inelastic collision assumption, the coefficient of restitution21 is an unknown parameter in addition to the variables that need to be estimated under the elastic collision assumption. We have examined the performance of the classical estimation technique and various DNN and PINN-based methods for estimating the position, velocity, and mass of the space debris and the coefficient of restitution for the inelastic collision for the above-mentioned problem using 8235 collision simulations for training and 330 collision simulations for testing. The inelastic collisions were simulated using the model described by Schwager et al.22 The performance of the DNN and PINN-based methods are compared with the classical method for the inelastic and non-destructive collision. Towards tracking unknown space debris, we have made the following key contributions:

  1. 1.

    Formulation of space debris tracking problem observing a non-destructive and inelastic collision: We have posed the unknown space debris tracking problem after an inelastic event as a position, velocity, mass, and the coefficient of restitution estimation problem considering the active satellite position and velocity deviation as observations.

  2. 2.

    Formulation of an appropriate physics loss function and application of PINN: We have formulated a physics loss function for the above problem and trained a PINN using the developed loss function to solve the above estimation problem.

  3. 3.

    A velocity sampling method for simulating random in-orbit collision events: For the training of DNN and PINN-based models as well for testing the performance of the ML-based approaches and the classical approach, we simulated a total of 8565 inelastic collision events. Note that, for a collision event, the active satellite position and the space debris position will be approximately the same at the time of the collision, while the velocities will be different. However, any random velocity vector at the given position does not necessarily result in a stable elliptical orbit. We have proposed a velocity sampling method to generate the velocity vector of space debris for a given active satellite position at the time of the collision, which guarantees an elliptical orbit with a periapsis above a given minimum allowable altitude from the mean sea level.

The rest of the article is organised as follows: We have formally defined the problem in the next section, and then discussed the classical estimation method and various DNN-based approaches and provided the physics loss function for the PINN for solving the problem. Then we have discussed the collision data generation method, including the formulation of the velocity sampling algorithm and simulation of collision in space. Next, we have compared the performance of the classical estimation technique as well as the gradient boosting, DNN and PINN-based methods for solving the estimation problem. We conclude the article by consolidating our observations and delineating the future research direction.

Problem definition

Consider a satellite and space debris undergoing an inelastic collision in space as depicted in Fig. 1. Our objective is to find the trajectory of the space debris by observing the position and velocity deviation of the active satellite after the collision.

Figure 1
figure 1

Illustration of a collision in space.

Let \({\varvec{r}}_{sat}\) and \({\varvec{v}}_{sat}\) be the position and velocity of the satellite, and \({\varvec{r}}_{d}\) and \({\varvec{v}}_{d}\) be the position and velocity of the space debris. We assume satellite and space debris are spherical for ease of the analysis. Let \(t_c^-\) and \(t_c^+\) denote the time before collision and time after collision respectively. At the time of the collision, the position of the satellite and the position of debris can be written as

$$\begin{aligned}{} & {} {\varvec{r}}_{sat}(t_c^+) = {\varvec{r}}_{sat}(t_c^-) \end{aligned}$$
(1)
$$\begin{aligned}{} & {} {\varvec{r}}_{d}(t_c^+) = {\varvec{r}}_{d}(t_c^-) = {\varvec{r}}_{sat}(t_c^-) - (\rho _{sat} + \rho _{d}) \hat{{\varvec{v}}}_{rel} \end{aligned}$$
(2)

where \(\rho _{sat}\) and \(\rho _{d}\) are radii of the satellite and debris, respectively and \({\varvec{\hat{v}}}_{rel}\) is the unit vector along the relative velocity of debris with respect to the satellite and is given by

$$\begin{aligned} {\varvec{\hat{v}}}_{rel} = \frac{{\varvec{v}}_{sat}(t_c^-) - {\varvec{v}}_{d}(t_c^-) }{|| {\varvec{v}}_{sat}(t_c^-) - {\varvec{v}}_{d}(t_c^-) || } \end{aligned}$$
(3)

The velocity of the satellite and debris after an inelastic collision is21

$$\begin{aligned}{} & {} {\varvec{v}}_{sat}(t_c^+) = \frac{m_{sat}{\varvec{v}}_{sat}(t_c^-) + m_{d}{\varvec{v}}_{d}(t_c^-) + \epsilon m_{d} ({\varvec{v}}_{d}(t_c^-) - {\varvec{v}}_{sat}(t_c^-))}{m_{sat} + m_{d}} \end{aligned}$$
(4)
$$\begin{aligned}{} & {} {\varvec{v}}_{d}(t_c^+) = \frac{m_{d}{\varvec{v}}_{d}(t_c^-) + m_{sat}{\varvec{v}}_{sat}(t_c^-) + \epsilon m_{sat} ({\varvec{v}}_{sat}(t_c^-) - {\varvec{v}}_{d}(t_c^-))}{m_{sat} + m_{d}} \end{aligned}$$
(5)

where \({\varvec{v}}_{sat}(t_c^-)\) indicates velocity of satellite just before the collision and \({\varvec{v}}_{sat}(t_c^+)\) indicates velocity of satellite just after the collision and \(\epsilon\) is the coefficient of restitution22 of the inelastic collision. Then change in velocity of the active satellite can be written as

$$\begin{aligned} \Delta {\varvec{v}}_{sat}(t_c^+) = {\varvec{v}}_{sat}(t_c^+) - {\varvec{v}}_{sat}(t_c^-) \end{aligned}$$
(6)

Define the measurement \({\varvec{Z}}\) as

$$\begin{aligned} {\varvec{Z}} = \begin{bmatrix} {\varvec{r}}_{sat}(t_c^+) \\ \Delta {\varvec{v}}_{sat}(t_c^+) \end{bmatrix} + \omega (t_c^+) = \begin{bmatrix} {\varvec{r}}_{sat}(t_c^+) \\ {\varvec{v}}_{sat}(t_c^+) - {\varvec{v}}_{sat}(t_c^-) \end{bmatrix} + \omega (t_c^+) \end{aligned}$$
(7)

where \(\omega (t_c^+)\) is noise in measurement. As the active satellite is being tracked, \({\varvec{v}}_{sat}(t_c^+)\) and \({\varvec{v}}_{sat}(t_c^-)\) can be measured, and therefore measurement \({\varvec{Z}}\) is available. Using Eqs. (2), (4) and (5), \({\varvec{Z}}\) can be written as

$$\begin{aligned} {\varvec{Z}} = f({\varvec{r}}_{d}(t_c^-), {\varvec{v}}_{d}(t_c^-), m_{d}, \epsilon ) + \omega (t_c^+) \end{aligned}$$
(8)

Obtaining \({\varvec{r}}_{d}(t_c^-), {\varvec{v}}_{d}(t_c^-), m_{d}\) and \(\epsilon\) from \({\varvec{Z}}\) is essentially an estimation problem. Using the estimated parameters, \({\varvec{v}}_d(t_c^+)\) can be obtained and thus the trajectory of the space debris after the collision can be computed.

Classical approach

One can estimate \({\varvec{r}}_{d}(t_c^-), {\varvec{v}}_{d}(t_c^-), m_{d}\) and \(\epsilon\) from observation \({\varvec{Z}}\) using the Least Squared Estimation (LSE):

$$\begin{aligned} \widehat{{\varvec{r}}}_{d}(t_c^-), \widehat{{\varvec{v}}}_{d}(t_c^-), {\widehat{m}}_{d}, {\widehat{\epsilon }}= \underset{{{\varvec{r}}_{d}(t_c^-), {\varvec{v}}_{d}(t_c^-), m_{d}, \epsilon }}{\arg \min } ({\varvec{\Delta Z}}^T{\varvec{\Delta Z}}) \end{aligned}$$
(9)

where \(\widehat{{\varvec{r}}}_{d}(t_c^-), \widehat{{\varvec{v}}}_{d}(t_c^-), {\widehat{m}}_{d}, {\widehat{\epsilon }}\) are the estimate of the desired quantities and

$$\begin{aligned} {\varvec{\Delta Z}} = {\varvec{Z}} - f(\widehat{{\varvec{r}}}_{d}(t_c^-), \widehat{{\varvec{v}}}_{d}(t_c^-), {\widehat{m}}_{d}, {\widehat{\epsilon }}) \end{aligned}$$
(10)

Note that the above estimation problem formulation does not consider any mass constraints for the debris. However, we are particularly interested in the untracked debris, which is of small size and essentially has a very small mass. The inclusion of the mass constraint transforms the original problem into a constrained optimization problem, which can be solved using the Lagrange Multiplier approach9. The Lagrangian \(L\) for this constrained optimization problem can be defined as

$$\begin{aligned} {L} = {\varvec{\Delta Z}}^T {\varvec{\Delta Z}} + \lambda _1(1 - m_{d}) + \lambda _2(m_{d} - \delta m) \end{aligned}$$
(11)

which includes the mass constraint \(\delta m<m_{d}<1~kg\). Here, \(\lambda _1\) and \(\lambda _2\) are Lagrange multipliers. The position, velocity, mass and coefficient of restitution of the debris can be estimated by solving

$$\begin{aligned} \begin{bmatrix}\frac{\partial {L}}{\partial {\varvec{r}}_{d}}&\frac{\partial {L}}{\partial {\varvec{v}}_{d}}&\frac{\partial {L}}{\partial m_{d}}&\frac{\partial {L}}{\partial \epsilon }\end{bmatrix}&= {\varvec{0}} \end{aligned}$$
(12)
$$\begin{aligned} \begin{bmatrix}\frac{\partial {L}}{\partial \lambda _1}&\frac{\partial {L}}{\partial \lambda _2}\end{bmatrix}&= {\varvec{0}} \end{aligned}$$
(13)

There are various techniques available to solve this minimisation problem. We have used the gradient descent approach to solve the problem numerically.

Deep Learning-based approaches

An alternate approach for estimating the position, velocity, mass and coefficient of restitution of the debris after a collision can be based on a DNN model. Using (8) one can write

$$\begin{aligned} E[{\varvec{r}}_{d}(t_c^-), {\varvec{v}}_{d}(t_c^-), m_{d}, \epsilon ] = f^{-1}({\varvec{Z}}) \end{aligned}$$
(14)

considering \({\varvec{\omega }}(t_c^+)\) as a zero mean noise vector. Essentially the trained DNN model approximates the inverse function \(f^{-1}(\cdot )\). The DNN model can be trained using simulated collision data. We consider \({\varvec{Z}}\) as the input to the DNN and \({\varvec{\Delta r}}, {\varvec{\Delta v}}, m_{d}, \epsilon\) as the output of the DNN, where

$$\begin{aligned} {\varvec{\Delta r}} = \begin{bmatrix}\Delta x_d \\ \Delta y_d\\ \Delta z_d\end{bmatrix} = {\varvec{r}}_{d}(t_c^-) - {\varvec{r}}_{sat}(t_c^-) \end{aligned}$$
(15)

and

$$\begin{aligned} {\varvec{\Delta v}} = \begin{bmatrix}\Delta v_{xd} \\ \Delta v_{yd}\\ \Delta v_{zd}\end{bmatrix} = {\varvec{v}}_{d}(t_c^-) - {\varvec{v}}_{sat}(t_c^-) \end{aligned}$$
(16)

We consider two distinct DNN architectures for this problem. The first architecture is the usual DNN with an input layer with 6 inputs, multiple hidden layers and an output layer with 8 outputs as shown in Fig. 2. The second architecture comprises of 8 parallel DNNs with 6 inputs, multiple hidden layers and 1 output. Essentially, each of these parallel DNNs learns one of the 8 outputs. The outputs of these 8 parallel DNNs are connected to another DNN with 8 inputs and 8 outputs. The last DNN block ensures learning the dependency of each output variable with other output variables. The architecture is shown in Fig. 3.

Figure 2
figure 2

Deep Neural Network (DNN) architecture.

Figure 3
figure 3

Stacked Deep Neural Network (StackDNN) architecture.

We refer to this architecture as the stacked Deep Neural Network (StackDNN). For both architectures, Mean Squared Error (MSE) loss function is used for training the models. We define the MSE loss for this problem as

$$\begin{aligned}&{L}_{MSE}(\widehat{{\varvec{r}}}_{d}, \widehat{{\varvec{v}}}_{d}, \hat{m}_{d}, {\hat{\epsilon }})\nonumber \\&=\frac{1}{N}\sum _{i=1}^N\left[ ({\varvec{r}}_{d_i} - \widehat{{\varvec{r}}}_{d_i})^T({\varvec{r}}_{d_i} - \widehat{{\varvec{r}}}_{d_i}) + ({\varvec{v}}_{d_i} - \widehat{{\varvec{v}}}_{d_i})^T({\varvec{v}}_{d_i} - \widehat{{\varvec{v}}}_{d_i}) + (m_{d_i} - \hat{m}_{d_i})^2 + (\epsilon _i - \hat{\epsilon _i})^2\right] \end{aligned}$$
(17)

where \((\cdot )_{d_i}\) and \(\widehat{(\cdot )}_{d_i}\) denote the output data for the \(i\textrm{th}\) collision event and the corresponding predicted output data using the DNN. N is the number of collision events used to generate the training data.

Physics informed neural network

Having discussed the DNN-based method for estimating the position, velocity, mass and coefficient of restitution of debris after the collision, it is essential to note that the estimated position and velocity must be the solution to the differential equation that governs the motion of the debris. Based on Newton’s Law of Gravitation, the debris dynamics can be written as

$$\begin{aligned} \begin{bmatrix} \dot{{\varvec{r}}}_{d}\\ \dot{{\varvec{v}}}_{d}\end{bmatrix} = \begin{bmatrix}{\varvec{v}}_{d}\\ -\frac{\mu }{r_{d}^3}{\varvec{r}}_{d}\end{bmatrix} \end{aligned}$$
(18)

However, DNN-based solutions do not guarantee the above constraint. The Physics Informed Neural Network18 preserves the laws of physics in the solution by incorporating a physics loss in the loss function used to train the DNN. Using (18), we defined the physics loss for the PINN as

$$\begin{aligned}&{L}_{phy}(\widehat{{\varvec{r}}}_{d}, \widehat{{\varvec{v}}}_{d})\nonumber \\&= \frac{1}{N}\sum _{i=1}^N\left[ \left( \frac{\mu }{r_{d_i}^3}{\varvec{r}}_{d_i} - \frac{\mu }{\hat{r}_{d_i}^3}\widehat{{\varvec{r}}}_{d_i}\right) ^T\left( \frac{\mu }{r_{d_i}^3}{\varvec{r}}_{d_i} - \frac{\mu }{\hat{r}_{d_i}^3}\widehat{{\varvec{r}}}_{d_i}\right) + \left( {\varvec{v}}_{d_i} - \widehat{{\varvec{v}}}_{d_i}\right) ^T\left( {\varvec{v}}_{d_i} - \widehat{{\varvec{v}}}_{d_i}\right) \right] \end{aligned}$$
(19)

for the problem under consideration. The significance of this loss function is that the minimisation of \({\mathcal {L}}_{phy}(\widehat{{\varvec{r}}}_{d}, \widehat{{\varvec{v}}}_{d})\) minimises the difference between the acceleration computed using the output of the training data and the PINN output, and the difference between the velocity computed using the output of the training data and the PINN output. As a result, this loss function minimises the distance between the derivative of the vector \(\left[ {\varvec{r}}_{d}~~{\varvec{v}}_{d}\right] ^T\) computed using the training data, and the derivative of the same computed using the PINN predicted output. Since the velocity loss is already included in \({L}_{phy}(\widehat{{\varvec{r}}}_{d}, \widehat{{\varvec{v}}}_{d})\) we define the MSE loss for the PINN as

$$\begin{aligned} {L}_{MSEP}(\widehat{{\varvec{r}}}_{d}, \hat{m}_{d}, {\hat{\epsilon }}) =\frac{1}{N}\sum _{i=1}^N\left[ ({\varvec{r}}_{d_i} - \widehat{{\varvec{r}}}_{d_i})^T({\varvec{r}}_{d_i} - \widehat{{\varvec{r}}}_{d_i}) + (m_{d_i} - \hat{m}_{d_i})^2 + (\epsilon _i - \hat{\epsilon _i})^2\right] \end{aligned}$$
(20)

The complete loss function for the PINN is

$$\begin{aligned} {L}_{PINN} = {L}_{phy}(\widehat{{\varvec{r}}}_{d}, \widehat{{\varvec{v}}}_{d}) + {L}_{MSEP}(\widehat{{\varvec{r}}}_{d}, \hat{m}_{d}, {\hat{\epsilon }}) \end{aligned}$$
(21)

We use this loss function in both DNN and StackDNN for performance comparison.

Collision data generation

Having discussed the DNN, StackDNN, PINN, and StackPINN approaches for tracking the debris after an inelastic collision event, let us now turn to the preparation of the training data. Publicly available data on inelastic and non-destructive collision events are very few and not sufficient to train the neural networks. Hence simulated collision data are required for training the deep learning models.

The position and velocity information of active satellites can be obtained from publicly available catalogues. However, the position and velocity of the debris which collides with an active satellite at a given epoch must be simulated. The simulated debris must be in a complete orbit, i.e. the orbit eccentricity must be less than 1. Additionally, to form a stable orbit, the periapsis of the debris cannot be at a very low altitude.

The debris position is given by (2) at the time of the collision. From (2), we can approximate that the debris position as nearly equal to the active satellite position because \(r_{sat}\gg \rho _{sat}, \rho _{d}\). Corresponding to this position, infinitely many velocity vectors are possible for which the debris will be on a complete orbit. However, the magnitude and the direction of the velocity vector can not be any arbitrary magnitude and direction. In the next subsection, we describe a method of sampling the velocity of debris for a given position vector in space and a given minimum periapsis altitude.

Velocity sampling

Consider r is the norm of the active satellite as well as the debris position vector \({\varvec{r}}\) at the time of the collision, h is the norm of the angular momentum vector \({\varvec{h}}\) of the colliding debris, e is the eccentricity of the debris orbit, \(\phi\) is the true anomaly of the debris during the collision, \(R_e\) is the radius of the Earth and \(a_{min}\) minimum allowable periapsis altitude. If the semi-major axis for the debris orbit is a, then the periapsis \(r_p = a(1 - e)\). Also

$$\begin{aligned} \frac{h^2}{\mu } = r_p(1 + e) \end{aligned}$$
(22)

and

$$\begin{aligned} r = \frac{h^2}{\mu }\frac{1}{1 + e\cos {\phi }} \end{aligned}$$
(23)

Considering \(R_{min} = R_e + a_{min}\) as the minimum allowable distance from the earth centre

$$\begin{aligned} r_p = \frac{r(1 + e\cos {\phi })}{1 + e}>R_{min} \end{aligned}$$
(24)

then

$$\begin{aligned} 0<e< \frac{r - R_{min}}{R_{min} - r\cos {\phi }}<1 \end{aligned}$$
(25)

and

$$\begin{aligned} \cos ^{-1}\frac{2R_{min}-r}{r}<\phi < 2\pi - \cos ^{-1}\frac{2R_{min}-r}{r} \end{aligned}$$
(26)

Fig. 4 shows the variation of maximum possible eccentricity with the true anomaly.

Figure 4
figure 4

Maximum eccentricity vs. True anomaly.

Figure 5
figure 5

Cross-section of the intersection of the unit sphere and planes (34) and (35).

Now, the position of the debris at the time of collision is

$$\begin{aligned} {\varvec{r}}_{d} = {\varvec{r}}= r\begin{bmatrix}r_{ux}\\ r_{uy}\\ r_{uz}\end{bmatrix} \end{aligned}$$
(27)

and velocity of the debris is

$$\begin{aligned} {\varvec{v}}_{d} = v\begin{bmatrix} v_{ux}\\ v_{uy}\\ v_{uz} \end{bmatrix} \end{aligned}$$
(28)

where \(\begin{bmatrix}r_{ux}&r_{uy}&r_{uz}\end{bmatrix}^T\) and \(\begin{bmatrix}v_{ux}&v_{uy}&v_{uz}\end{bmatrix}^T\) are the position and velocity unit vectors respectively, and

$$\begin{aligned} v = \sqrt{\frac{\mu }{r}\left( 2 - \frac{1-e^2}{1+e\cos {\phi }}\right) } \end{aligned}$$
(29)

The angle between \({\varvec{r}}\) and \({\varvec{v}}\) is \(\theta\)23

$$\begin{aligned} \theta = \sin ^{-1}\frac{1 + e\cos {\phi }}{\sqrt{1+2e\cos {\phi }+ e^2}} \end{aligned}$$
(30)

The angular momentum vector of the debris is

$$\begin{aligned} {\varvec{h}}&= {\varvec{r}}\times {\varvec{v}}\nonumber \\&= rv\begin{bmatrix}r_{uy}v_{uz} - r_{uz}v_{uy}\\ r_{uz}v_{ux} - r_{ux}v_{uz}\\ r_{ux}v_{uy} - r_{uy}v_{ux}\end{bmatrix} \end{aligned}$$
(31)

and

$$\begin{aligned} h = rv\sin {\theta } \end{aligned}$$
(32)

then the cosine of the inclination i of the orbit

$$\begin{aligned} \cos {i} = \frac{rv(r_{ux}v_{uy} - r_{uy}v_{ux})}{rv\sin {\theta }} \end{aligned}$$
(33)

Consequently, the velocity unit vector needs to satisfy the following:

$$\begin{aligned} r_{ux}v_{ux} + r_{uy}v_{uy} +r_{uz}v_{uz} = \cos {\theta } \end{aligned}$$
(34)
$$\begin{aligned} - r_{uy}v_{ux} + r_{ux}v_{uy} + 0v_{uz} = \sin {\theta }\cos {i} \end{aligned}$$
(35)
$$\begin{aligned} \sqrt{v_{ux}^2 + v_{uy}^2 + v_{uz}^2} = 1 \end{aligned}$$
(36)

Geometrically, the condition for forming a complete orbit is - the line of intersection of planes (34) and (35) must lie within the unit sphere defined by (36). We can write (35) in the normal form:

$$\begin{aligned} - \frac{r_{uy}}{\sqrt{r_{ux}^2+r_{uy}^2}}v_{ux} + \frac{r_{ux}}{\sqrt{r_{ux}^2+r_{uy}^2}}v_{uy} + 0\cdot v_{uz} = \frac{\sin {\theta }\cos {i}}{\sqrt{r_{ux}^2+r_{uy}^2}} \end{aligned}$$
(37)

Since \(\sqrt{r_{ux}^2 + r_{uy}^2 + r_{uz}^2} = 1\), (34) is already in the normal form. Observing (34) and (37), one can deduce that the angle between the two planes is \(90^\circ\). Then from the cross-sectional geometry shown in Fig. 5, the criteria for the intersection of the two planes within the unit sphere is

$$\begin{aligned} \cos ^2{\theta } + \frac{\sin ^2{\theta }\cos ^2{i}}{r_{ux}^2 + r_{uy}^2}\le 1 \end{aligned}$$
(38)

Then

$$\begin{aligned} -\sqrt{\frac{r_{ux}^2 + r_{uy}^2}{\sin ^2{\theta }(1-\cos ^2{\theta })}}\le \cos {i}\le \sqrt{\frac{r_{ux}^2 + r_{uy}^2}{\sin ^2{\theta }(1-\cos ^2{\theta })}} \end{aligned}$$
(39)

Using the ranges of \(\phi\), e and \(\cos {i}\) given by (26), (25) and (39) one can independently sample these quantities and subsequently solve for \(v_{ux}\), \(v_{uy}\) and \(v_{uz}\) from (34), (35) and (36). Note that the solution will result in two unit vectors for velocity. Finally, the velocity magnitude can be calculated using (29). Algorithm 1 describes the velocity sampling steps.

Algorithm 1
figure a

Velocity sampling

Figure 6
figure 6

Gabbard chart for the generated velocity samples.

Figure 7
figure 7

Relative velocities for the sampled space debris velocity with respect to the active satellite.

Using this algorithm, we have sampled 2000 velocity vectors for an active satellite position at

$$\begin{aligned} {\varvec{r}} = \begin{bmatrix}3664.75536654&3893.60049258&5062.87536598\end{bmatrix}^T km \end{aligned}$$
(40)

and a minimum perigee altitude of 300 km. Figure 6 shows that the generated samples have perigees at altitudes higher than 300 km. Fig. 7 shows the relative velocities for the samples with respect to the active satellite. It can be deduced from Fig. 7 that collision cannot occur from any arbitrary direction for a given position vector in space.

Simulation of the collision

We have generated inelastic collision observations considering the collision model described in the “Problem Definition” section. We have used Python 3.7 with astropy24 and poliastro25 libraries for orbit simulations of the active satellites as well as the debris.

We have considered 5 collision epochs \(t_c\) each at 1200 UTC, starting from April 29, 2023, to May 3, 2023. We consider 1647 satellite orbits from the Starlink constellation and 66 satellite orbits from the LEMUR constellation for generating collision simulations. The Two Line Element (TLE) data for each constellation were obtained from www.spacetrack.org on April 27, 2023. Using TLE data, we have calculated the time difference \(\Delta t\) between the collision time \(t_c\) and the the TLE time, and propagated the active satellites. The orbits are propagated considering J2 and J3 zonal harmonics and the exponential atmospheric drag model.

For the debris simulations, we sampled the mass of each untracked debris from a normal distribution with 0.5 kg mean and varying standard deviations of (0 kg, 0.001 kg, 0.002 kg, 0.003 kg, 0.004 kg, 0.005 kg, 0.006 kg). For simplicity, satellites and debris are considered spherical objects of radius 0.5 m and 0.1 m, respectively, and the mass of the satellite is 100 times that of sampled mass of debris. The position of the debris \({{\varvec{r}}_{{\varvec{d}}}}\) is considered to be the same as \({\varvec{r}}_{{\varvec{sat}}}\), whereas the velocity of the debris \({\varvec{v}}_{{\varvec{d}}}\) is obtained using algorithm 1. The minimum perigee altitude for the debris corresponding to the Starlink satellites was set at 250 km, and for LEMUR, it was 200 km. The Gabbard plots for the debris generated for the Starlink and LEMUR constellations are shown in Fig. 8.

Figure 8
figure 8

Gabbard plot of space debris.

Figure 9
figure 9

Coefficient of restitution (\(\epsilon\)) distribution using Gaussian mixture.

Coefficient of restitution (\(\epsilon\)) values for each of the collisions have been sampled from a Gaussian mixture distribution, as shown in Fig. 9. It consists of 4 Gaussian distributions with 0.015 standard deviations sampled around uniformly distributed means ranging from 0.5 to 1 where the mean has been assumed based on the \(\epsilon\) values of the satellite materials (6061-T6 Aluminium, Stainless Steel 304, Nitronic 60A and Titanium)21.

Inelastic collision simulations are generated using the position, velocity, mass and coefficient of restitution values for the respective satellite and debris pairs in Eqs. (4) and (5). Post-collision at time \(t_c^+\), the observations are generated using Eq. (7). Here, the noise \(\omega (t_c^+)\) signifies the tracking uncertainty of the active satellite. We consider varying standard deviation (0m, 100 m, 200 m, 300 m, 400 m, 500 m) for position and (0 m/s, 50 m/s, 100 m/s, 150 m/s, 200 m/s, 250 m/s) for velocity in each axis.

A total of 8235 collision events were simulated for the orbits of the Starlink constellation using Param Pravega Super Computer. 66 collision events with 6 varying noise factors were simulated for the LEMUR constellation using a local workstation. For each collision, we have recorded \({\varvec{Z}}\) and \({\varvec{r}}_{d}\), \({\varvec{v}}_{d}\), \(m_{d}\) and \(\epsilon\). We consider the collision data corresponding to the orbits of the Starlink constellation as the training data and the same corresponding to the orbits of the LEMUR constellation as the test data.

Results and discussion

We used the training data described in the previous section for training the DNN, PINN, StackDNN, and StackPINN models. We trained four different models with 2, 4, 6, and 8 hidden layers for each of the ML models, with 5 nodes in each hidden layer. Rectified Linear Unit (ReLU) is selected as the activation function for each node. We have used adaptive moment (adam) optimizer with the learning rate equal to 1e−3 for all the models. All the models are trained for 50 epochs with a batch size equal to 32. We have also trained the Extreme Gradient Boosting (XGBoost)26 regressor model with the same training data.

The training is performed in a workstation with 8 Intel Xeon processor cores and 62GB of RAM. The training of all the ML models is done using 5-fold cross-validation for different sizes of training data. All the ML models are trained with 8235 samples and 4000 samples (All results analysed during the current study are available in the collisionAI repository, https://github.com/HarshaSSL/collisionAI.). Figure 10 shows the change in training loss with respect to training epochs for the DNN and the PINN models trained using 8235 samples. Similarly, Fig. 11 shows the change in training loss with respect to training epochs for the StackDNN and the StackPINN models trained using 8235 samples. It is observed in most of the folds, for models with an increased number of hidden layers, the training epochs decreases for the training loss to converge. In StackDNN and StackPINN architectures (Fig. 3), the input to stacked NN is obtained from already trained parallel models. This reduces the initial loss value to the order of \(10^1\) when compared to DNN and PINN models’ initial loss value which is of the order \(10^7\).The results for models trained using 4000 samples can be found in the supplementary information.

After analysing the performance of the trained models, we pick the best-performing model based on the lowest RMSE for the validation data during the 5-fold cross-validation. Inference is done using the selected model for the prediction of position, velocity, mass, and coefficient of restitution of inelastic collisions of the test dataset generated using LEMUR satellites.

Figure 10
figure 10

Training loss for DNN and PINN models in 5-Fold Cross-validation using 8235 samples and 50 epochs.

Figure 11
figure 11

Training loss for StackDNN and StackPINN models in 5-Fold Cross-validation using 8235 samples and 50 epochs.

Figure 12
figure 12

Error comparison of DNN and PINN models trained using 8235 samples and 50 epochs for multiple noise variations in LEMUR Satellites collision data.

Figure 13
figure 13

Error comparison of StackDNN and StackPINN models trained using 8235 samples and 50 epochs for multiple noise variations in LEMUR satellites inelastic collision data.

Figure 14
figure 14

Error comparison of Lagrange method and XGBoost models trained using 8235 samples for multiple noise variations in LEMUR Satellites inelastic collision data.

Using the observation \({\varvec{Z}}\) for all 66 test collision events with 6 different noise standard deviations as mentioned in section “Simulation of the collision” in the selected ML models, \({\varvec{r}}_{d}(t_c^-)\), \({\varvec{v}}_{d}(t_c^-)\), \(m_{d}\) and \(\epsilon\) were estimated. Additionally, the same variables were estimated using the Lagrange multiplier method for the same set of observations. We recorded the error in estimation for performance comparison.

Figure 12 shows the box plots of the position, velocity, mass and coefficient of restitution estimation errors recorded using selected DNN and PINN models. The position and velocity estimation errors for all the methods are represented by the Euclidian norm of the x, y, and z axes in the ECI frame. It can be observed that for position estimation error, the median and the interquartile range increase for the test cases with increasing noise standard deviation. A similar trend in the interquartile range can be found in the mass estimation error plots . In position, velocity and coefficient of restitution error estimation plots, it is observed that the performance of DNN and PINN models with 4, 6, and 8 hidden layers does not improve compared to 2 layers. However, in mass estimation error plots, DNN and PINN with 8 layers provide marginally better results as these models with more hidden layers are better function approximators27.

Figure 13 shows the box plots of the position, velocity, mass, and coefficient of restitution estimation errors recorded using selected StackDNN and StackPINN models. The results follow a similar trend as the DNN and PINN models. However, StackDNN and StackPINN provides marginally better mass estimation using 2 layers. Figure 14 shows the box plots of estimation errors recorded using Lagrange and XGBoost models. The results of the Lagrange method and XGBoost models follow a similar trend to that of DNN and PINN models for different noise standard deviations. However, the median and interquartile range for the estimation errors of XGBoost models is the least among all the models. The median and interquartile range of the estimation errors of the Lagrange method are the highest among all the methods.We have found through experiments that the output is sensitive to position noise standard deviation compared to the standard deviation for velocity and mass noises\(^{1}\).

The summary of 5-Fold cross-validation of the models with 2 hidden layers is shown in Table 1. The table shows the mean ± standard deviation of the Root Mean Squared Error (RMSE) for the validation data of the 5-fold cross-validation analysis. We have observed that the mean RMSE across all the training is higher when trained with smaller datasets. It is also observed that the StackPINN produces lesser variation in the RMSE when the smaller dataset is used.

Table 1 RMSE tabular summary for 5-fold cross validation of training the models with different number of samples.

From the above discussion, it can be discerned that, for 2 hidden layers, the PINN improves the space debris position, mass, and coefficient of restitution estimation performance significantly compared to the Lagrange multiplier-based and DNN-based methods. The performance of the DNN becomes similar to the PINN for higher numbers of hidden layers. However, the results show no significant advantage to increasing the number of layers for the problem under consideration. In addition, the StackDNN and the StackPINN performances are similar for the different numbers of hidden layers and comparable to the PINN model’s performance with 2 hidden layers. However, the stacked architecture is 9 different Neural Networks, increasing the complexity of the implementation.

The training time required for different folds during 5-fold cross-validation, inference time for the selected models, and stored memory for reusability for each selected model with 2 hidden layers is shown in Table 2. XGBoost model requires 4000 KB of memory, which is 100 times more than the memory required to store PINN models.

Table 2 Tabular results of models consisting of time taken for training during 5-Fold cross validation, inference time for predicting and evaluating the LEMUR collision data (Nominal Noise) and their respective stored memory for reusability.
Table 3 Median error for the test dataset.
Table 4 Interquartile range of the errors for the test dataset.

Tables 3 and 4 summarize the inference of the selected trained models along with the Lagrange method for the LEMUR collision dataset. From these tables, we can conclude that the PINN with 2 hidden layers provides better performance to resource requirement trade-off for tracking untracked space debris from a non-destructive and inelastic collision with an active satellite than the DNN, Stacked neural networks, and classical methods. We have found that the XGBoost method provides better results than the DNN and the PINN. However, this method requires approximately 100 times more memory than the DNN and PINN-based techniques. As a result, the PINN-based model would be more suitable for on-board deployment on the active spacecraft. Additionally, the XGBoost is a black box model, whereas the PINN is intrinsically explainable due to the physics-based loss function.

Conclusion

We have formulated a problem of space debris position, velocity, mass, and coefficient of restitution estimation from a non-destructive inelastic collision event under the assumption that the occurrence of the collision event is known. We have shown that this problem can be posed as a function approximation problem; hence, an ML model can approximate the function. We proposed a loss function based on Newton’s law of gravitation to design a PINN to solve this problem. We have also presented an algorithm for selecting the velocity of space debris for a collision simulation. Using simulated collision data, we have trained the DNN, PINN, StackDNN, and StackPINN with varied numbers of hidden layers and the XGBoost model. We have also compared the estimation performance of these ML-based methods with the Lagrange multiplier-based optimization technique.

The results demonstrate that the PINN with 2 hidden layers performs better estimation than the DNN with 2 hidden layers and the classical method. Additionally, the DNN performance becomes similar to the PINN when the number of hidden layers increases and when Stacked neural network architecture is used. However, this increase in the model complexity does not improve the estimation performance compared to the PINN with 2 hidden layers. It is also observed that the XGBoost models provide better estimation performance than the neural network-based counterparts. It is noteworthy to mention that the XGBoost model is a black-box model, and as a consequence, the output of the model is not explainable. On the other hand, the PINN, being trained using a loss function derived from the laws of physics, provides explainability of the outputs. In addition, the XGBoost model requires significantly higher memory than the proposed PINN. However, it is anticipated that model-agnostic explanation techniques, for example, SHapley Additive exPlanations (SHAP)28 can be utilised to explain the XGBoost model in this context and can be deployed on the ground or onboard spacecraft leveraging the state-of-the-art embedded systems. A physics-informed approach for the explainability of the gradient boosting technique29 can also be explored for tracking space debris by observing a collision.

The problem presented in the article can be further extended to the estimation of space debris position, velocity, mass, and coefficient of restitution using a sequence of tracking data of the active satellite. The Recurrent Neural Network (RNN) or transformer may be suitable for this case. However, it will require the generation of a large number of sequences of tracking data for each collision to train such models. RNNs, Transformers, and other such models tend to have significantly more inference times, model sizes, and complexity. Further research is necessary to study the performance of these models in the problem under consideration. It is anticipated that the application of the PINN can be further extended to trajectory extraction of the debris after a break-up event. In addition, the velocity sampling formulation can be used to derive a velocity probability density function for a given position in space, which can be used to design new filters for selecting candidate space debris for collision assessment as well as enhancing the collision probability computation.