
Quantifying the uniqueness of professional tennis players

ArtificialSeed team

Goal

This project poses a simple question: how unique are the movements of top tennis players, and can we measure that from match footage turned into motion data? Our current approach covers only four players (Novak Djokovic, Rafael Nadal, Daniil Medvedev, and Roger Federer) and learns a compact “map” of their movement patterns. If the map places a player’s sequences in a tight, well-separated region, we treat that as a sign of uniqueness. The task for the model is straightforward classification: given a short motion sequence, predict which of the four players it came from. Below, we show how well this works on held-out validation data, where it struggles, and what a two-dimensional projection of the latent space looks like.

Data

Because our approach is built on sequences of players’ motions, we first need to turn raw match videos into “digitized” motion. For this purpose, we use the Skinned Multi-Person Linear model (SMPL), a parametric 3D model of the human body. It represents body shape with blend shapes and poses the mesh with standard linear blend skinning, which makes it practical for animation and analysis. The body is driven by a skeleton with 24 joints, and joint rotations are typically represented in axis-angle form. Each frame of a motion sequence is a 144-dimensional vector built from SMPL outputs:

  • 72 joint-location values: 24 joints × 3 coordinates (x, y, z).
  • 72 joint-rotation values: 24 joints × 3 values in axis-angle form.

A training example is a time series of these 144-D frames for a single player. This keeps the input grounded in body mechanics: where the joints are and how they rotate over time. Overall, our dataset consists of 1000+ such sequences across the four players (Novak Djokovic, Rafael Nadal, Roger Federer, and Daniil Medvedev), extracted from publicly available recordings of their matches.
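As a rough illustration of the frame layout described above, the sketch below packs per-joint locations and axis-angle rotations into a 144-D vector and stacks frames into a sequence. The ordering (locations first, then rotations), the helper names `build_frame` and `build_sequence`, and the 30-frame clip length are assumptions made for this example; the random arrays stand in for real SMPL outputs.

```python
import numpy as np

NUM_JOINTS = 24  # SMPL skeleton size, as stated in the text


def build_frame(joint_xyz, joint_axis_angle):
    """Concatenate joint locations and axis-angle rotations into one 144-D frame.

    Assumed layout: 72 location values first, then 72 rotation values.
    """
    joint_xyz = np.asarray(joint_xyz, dtype=np.float32)
    joint_axis_angle = np.asarray(joint_axis_angle, dtype=np.float32)
    expected = (NUM_JOINTS, 3)
    if joint_xyz.shape != expected or joint_axis_angle.shape != expected:
        raise ValueError("expected two arrays of shape (24, 3)")
    return np.concatenate([joint_xyz.ravel(), joint_axis_angle.ravel()])


def build_sequence(xyz_per_frame, rot_per_frame):
    """Stack per-frame vectors into a (T, 144) time series."""
    frames = [build_frame(p, r) for p, r in zip(xyz_per_frame, rot_per_frame)]
    return np.stack(frames)


# Placeholder data standing in for SMPL outputs of one clip.
rng = np.random.default_rng(0)
T = 30  # hypothetical clip length
sequence = build_sequence(
    rng.normal(size=(T, NUM_JOINTS, 3)),
    rng.normal(size=(T, NUM_JOINTS, 3)),
)
print(sequence.shape)  # (30, 144)
```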

Model

Each motion clip is fed to a sequence encoder–decoder that compresses the clip into one latent vector and reconstructs it back to frames. A classifier head attached to the latent vector predicts the player. In parallel, we fit one Gaussian per player over the latent codes. The training objective is the sum of three parts: a reconstruction loss (keeps the decoded motion faithful to the input), a KL regularizer (keeps the latent distribution well-shaped around a unit normal), and a classification loss (pulls codes toward the correct player’s region).
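The three-part objective can be sketched in NumPy as follows. This is only an illustration under stated assumptions: mean squared error for reconstruction, a diagonal-Gaussian posterior for the KL term, softmax cross-entropy for classification, and the weights `beta` and `gamma` (both 1.0 here, which reduces to a plain sum) are choices made for the example, not details confirmed by the text.

```python
import numpy as np


def reconstruction_loss(x, x_hat):
    """Mean squared error between input and decoded frames (assumed metric)."""
    return float(np.mean((x - x_hat) ** 2))


def kl_to_unit_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), averaged over the batch."""
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(kl))


def classification_loss(logits, labels):
    """Softmax cross-entropy over the four player classes (assumed loss)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))


def total_loss(x, x_hat, mu, log_var, logits, labels, beta=1.0, gamma=1.0):
    """Sum of the three terms; beta and gamma are hypothetical weights."""
    return (
        reconstruction_loss(x, x_hat)
        + beta * kl_to_unit_normal(mu, log_var)
        + gamma * classification_loss(logits, labels)
    )
```

With a perfect reconstruction, a posterior equal to the unit normal, and uniform logits over four classes, the first two terms vanish and the total equals ln 4, the cross-entropy of a uniform guess.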

What the training curves show:

  • Training losses drop quickly at the start and then level off; reconstruction stays low and steady, KL declines gradually, and classification falls the most early on.

Image

  • Validation losses follow a similar trend, with a short bump mid-training before settling. The train and validation curves stay close, with no obvious sign of overfitting.

Image

  • Validation classification metrics (accuracy, macro precision/recall/F1-score) climb together and flatten near the top, indicating balanced performance across the four classes once the model converges.

Image
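For reference, the validation metrics mentioned above can be computed as in the sketch below. Macro averaging takes the unweighted mean of the per-class scores, so each of the four players counts equally regardless of how many clips they have. The function name `macro_metrics` and the convention of scoring an undefined ratio as 0 are assumptions for this example.

```python
import numpy as np


def macro_metrics(y_true, y_pred, num_classes=4):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = int(np.sum((y_pred == c) & (y_true == c)))
        fp = int(np.sum((y_pred == c) & (y_true != c)))
        fn = int(np.sum((y_pred != c) & (y_true == c)))
        # Undefined ratios (no predictions or no samples) are scored as 0.
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        "precision": float(np.mean(precisions)),
        "recall": float(np.mean(recalls)),
        "f1": float(np.mean(f1s)),
    }
```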

Representation of the Latent Space

To visualize the latent space, we first map each motion clip to a latent vector. We then fit one Gaussian per player (mean and covariance) on the training codes. At inference, we compute the Mahalanobis distance from a new clip’s code to each Gaussian and pick the closest player. For display only, we project the codes onto the first two PCA components. Note that the classification itself works in the full latent space!
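A minimal sketch of the per-player Gaussian fit and the nearest-Gaussian rule is shown below. The Mahalanobis distance from a code z to a class with mean μ and covariance Σ is sqrt((z − μ)ᵀ Σ⁻¹ (z − μ)). The function names, the small ridge term `reg` added to keep the covariance invertible, and the synthetic 8-D codes are assumptions for illustration only.

```python
import numpy as np


def fit_class_gaussians(codes, labels, reg=1e-6):
    """Fit a mean and inverse covariance per class on training codes."""
    stats = {}
    for c in np.unique(labels):
        z = codes[labels == c]
        mean = z.mean(axis=0)
        # reg is a hypothetical ridge term that keeps the matrix invertible.
        cov = np.cov(z, rowvar=False) + reg * np.eye(z.shape[1])
        stats[int(c)] = (mean, np.linalg.inv(cov))
    return stats


def mahalanobis(z, mean, cov_inv):
    """Mahalanobis distance from one code to one class Gaussian."""
    d = z - mean
    return float(np.sqrt(d @ cov_inv @ d))


def predict(z, stats):
    """Return the class whose Gaussian is closest to z."""
    return min(stats, key=lambda c: mahalanobis(z, *stats[c]))


# Synthetic stand-in for latent codes: four well-separated 8-D clusters.
rng = np.random.default_rng(0)
centers = 10.0 * np.eye(4, 8)
codes = np.concatenate([c + rng.normal(size=(50, 8)) for c in centers])
labels = np.repeat(np.arange(4), 50)

stats = fit_class_gaussians(codes, labels)
print(predict(centers[2], stats))  # 2
```

Unlike a plain Euclidean distance to the class mean, this rule accounts for the shape of each cluster, so an elongated cluster tolerates larger offsets along its long axis.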

Image

What the figure shows:

  • The training distribution appears as faint points. Each class is summarized by its cluster mean (star) and two confidence ellipses: solid for 1 standard deviation, dashed for 2.
  • Inline labels next to the means show the player name and the number of training sequences for that player.
  • Validation predictions are drawn on top: open circles mark correct classifications; X marks show errors.
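One way such a plot could be prepared is sketched below: project the codes onto two principal components, then derive each class ellipse from the 2D covariance of its projected points. Whether the original figure fits the ellipses in the projected plane or projects the full-space covariances is not stated, so fitting in 2D is an assumption here, as are the function names.

```python
import numpy as np


def pca_2d(codes):
    """Project codes onto their first two principal components."""
    mean = codes.mean(axis=0)
    centered = codes - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:2]
    return centered @ components.T, mean, components


def ellipse_params(points_2d, n_std=1.0):
    """Center, full width, full height, and angle (degrees) of an n-sigma ellipse."""
    center = points_2d.mean(axis=0)
    cov = np.cov(points_2d, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    eigvals = np.maximum(eigvals, 0.0)      # guard against tiny negatives
    major = eigvecs[:, 1]                   # direction of largest variance
    angle = float(np.degrees(np.arctan2(major[1], major[0])))
    width = float(2.0 * n_std * np.sqrt(eigvals[1]))
    height = float(2.0 * n_std * np.sqrt(eigvals[0]))
    return center, width, height, angle


# Placeholder codes for one class; real latent codes would be used instead.
rng = np.random.default_rng(0)
projected, _, _ = pca_2d(rng.normal(size=(200, 16)))
center, width, height, angle = ellipse_params(projected, n_std=1.0)
```

The returned width and height are full axis lengths, which matches what `matplotlib.patches.Ellipse` expects; calling `ellipse_params` with `n_std=2.0` gives the dashed 2σ contour.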

What stands out:

  • Four well-separated clusters: Djokovic bottom-left, Nadal bottom-right (clearly isolated), Federer near the center-top, and Medvedev at the top with a slightly elongated ellipse.
  • Most validation points land inside the 1σ contours of their class. The few mistakes sit near the edges or drift toward a neighboring cluster (e.g., an error near the Federer region and a couple near the Djokovic boundary).

Conclusion

This project demonstrates that the “uniqueness” of movement can be quantified as distance and separation in the latent space of a model trained to classify players and reconstruct their motions. On our four-player set (Djokovic, Nadal, Medvedev, and Federer), the model forms compact, well-separated clusters, as the two-dimensional PCA projection above shows.