The Science Behind Movement – Understanding The Technical Aspects of Motion Capture Systems

Image Credits – Roger Clark, Arthur Morgan in Red Dead Redemption 2

How do animators get from the image on the left to the one on the right?

On the left is Roger Clark, the actor who provided the voice and motion capture performance for the character on the right, Arthur Morgan, in Rockstar Games’ 2018 video game Red Dead Redemption 2.

The game was praised for its detail, with players immersing themselves in a world populated with realistic characters. Helping to build this sense of realism was Rockstar’s use of motion capture.

This transformative technology features heavily in the entertainment industry but has far-reaching applications from medicine to sports – it’s even being used in manufacturing to help with safety.

Motion capture has become far more accessible in recent years. Hobbyists can even capture data using nothing more than a smartphone and an app.

However, high-quality motion capture used by studios and industry professionals is a highly technical process. This post helps you understand the process.

Some Keywords

Before we dive into the technical aspects and processes of motion capture, these are the keywords necessary for understanding the technology:

Motion Capture (MoCap): A technique that records and digitises movement to animate digital character models in 3D.

Facial Motion Capture: Capturing facial movements to animate expressions on CGI models.

Optical MoCap: A type of motion capture that uses cameras to track the movement of markers or shapes on an actor’s body.

Non-Optical MoCap: Motion capture that doesn’t rely on cameras, instead using sensors like inertial or magnetic devices to track movement.

Marker-Based System: A motion capture system where actors wear suits with reflective or LED markers that cameras track to capture movement.

Markerless System: Motion capture systems that use software to track body movements without the need for physical markers on the actor.

Inertial Sensors: Devices used in non-optical MoCap systems that measure movement through acceleration and rotational data.

Calibration: Configuring the MoCap system to track movements in the designated space accurately. 

Noise Reduction: A step in processing mo-cap data to remove irrelevant or extraneous information that doesn’t represent actual movement.

Gap Filling: Estimating and inputting data for moments where markers were occluded, or sensor data was missing.

Data Smoothing: Refining motion capture data to remove jagged or abrupt movements for more fluid animation.

Rigging: Creating a digital skeleton for a CGI model, to which the motion capture data is applied.

Keyframe Animation: The process of manually setting specific frames in an animation sequence to control movement, often used with mo-cap data.

Real-Time Feedback: A feature of advanced mo-cap systems that allows the immediate viewing of captured movements as they are performed.

Synchronisation: The process of aligning MoCap data with CGI elements, ensuring consistency and timing within the animated environment.

Retargeting: Adapting motion capture data from one model (usually a human actor) to a different digital character model (like a fantasy monster).

But How Does Motion Capture Work?

The point of motion capture is to capture movement in a digital format. Specialised hardware (cameras, suits, markers and sensors) and software combine to record an actor’s performance and translate it into data that can drive a digital character.

Types of Motion Capture Systems

Motion capture systems are broadly classified into two categories: Optical and Non-optical.

Optical Motion Capture:

Marker-based Systems: The most recognisable type of motion capture. In these systems, actors wear suits dotted with reflective markers. Cameras positioned around the capture area record the movement of these markers, and software then translates this data into a digital skeleton. 

Markerless Systems: A more recent innovation, markerless systems use advanced algorithms to track body movements without requiring specialised suits or markers. These systems analyse the visual data from the actors and discern motion. This approach offers more freedom in movement but can be less precise than marker-based systems.

Non-optical Motion Capture:

Inertial Systems: These systems use sensors attached to the actor’s body to measure the inertial forces generated during movement. Unlike optical systems, inertial motion capture does not rely on cameras or external references. It’s highly portable and can be used in various settings, even outside a traditional studio. However, it may require more post-processing to achieve the same level of detail as optical systems.
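A toy sketch of the idea behind inertial capture: the orientation of a body part can be estimated by integrating the gyroscope’s angular-rate samples over time, and uncorrected sensor bias is exactly what produces the drift mentioned above. All numbers here are illustrative, not from any real sensor.

```python
def integrate_gyro(rates_dps, dt, bias_dps=0.0):
    """Sum angular-rate samples (deg/s) into a rotation angle (degrees)."""
    angle = 0.0
    for rate in rates_dps:
        angle += (rate - bias_dps) * dt  # remove the estimated bias first
    return angle

# One second of a steady 90 deg/s turn, sampled at 100 Hz, but with a
# small +0.5 deg/s sensor bias baked into every reading:
readings = [90.5] * 100

print(integrate_gyro(readings, dt=0.01))                # drifts past 90 degrees
print(integrate_gyro(readings, dt=0.01, bias_dps=0.5))  # close to 90 once corrected
```

Real systems fuse gyroscope, accelerometer and sometimes magnetometer data to keep this drift in check, which is part of the extra post-processing mentioned above.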

The Hardware

Cameras and Sensors

Cameras and sensors capture the positional data of markers or the actor’s body, which is then processed to create a digital representation of the movement. 

Frame Rate: Motion capture cameras typically operate at higher frame rates than standard video cameras. 

Common frame rates for mo-cap cameras range from 60 frames per second (fps) to 120 fps, and some advanced systems can go up to 1000 fps or more.

These higher frame rates are preferred as they capture more data points per second, resulting in smoother and more detailed motion capture. This is especially important for fast or complex movements, such as capturing a fight scene in a video game.
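To make the frame-rate trade-off concrete, here is a back-of-the-envelope calculation of how many position samples a camera records during a fast movement. The 100 ms punch duration is an assumed figure for illustration only.

```python
def samples_captured(motion_duration_s, fps):
    """How many frames a camera records during a movement of the given length."""
    return int(motion_duration_s * fps)

punch_duration = 0.1  # assume a fast punch lasts roughly 100 ms
for fps in (30, 60, 120, 240):
    print(f"{fps} fps -> {samples_captured(punch_duration, fps)} samples")
```

At 30 fps the whole punch is described by only a handful of data points, while 240 fps yields enough samples to reconstruct the arc of the movement smoothly.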

Resolution: Typical resolutions for MoCap cameras range from 1 to 4 megapixels. However, in motion capture, the spatial resolution (how accurately the camera can detect the position of a marker in 3D space) is more critical.

Higher-resolution cameras can capture more subtle movements and better handle complex scenarios with many markers or actors.

It’s important to note that the choice of frame rate and resolution depends on the project’s specific needs. High-speed movements (like sports or action sequences) require higher frame rates for accurate data capture, while detailed facial expressions may need cameras with higher spatial resolution.

Optical Cameras: Used predominantly in optical MoCap systems, these cameras are designed to capture a high volume of data with great accuracy. In marker-based systems, they don’t capture images; they track the reflective or LED markers placed on the actor. 

The number of cameras varies with the complexity of the motion being captured and the volume of the capture space.
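The reason multiple cameras are needed is triangulation: each camera only sees the direction to a marker, and the marker’s position is recovered where those sight lines intersect. The sketch below is a deliberately reduced 2D version with two cameras (real systems solve this in 3D across many cameras at once).

```python
import math

def triangulate(cam_a, angle_a, cam_b, angle_b):
    """Intersect two rays given camera positions and bearing angles (radians)."""
    ax, ay = cam_a
    bx, by = cam_b
    dax, day = math.cos(angle_a), math.sin(angle_a)  # direction of ray A
    dbx, dby = math.cos(angle_b), math.sin(angle_b)  # direction of ray B
    # Solve A + t*dA = B + s*dB for t via the 2D cross product:
    denom = dax * dby - day * dbx
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return (ax + t * dax, ay + t * day)

# A marker sits at (2, 3); cameras at the origin and at (10, 0)
angle_a = math.atan2(3, 2)       # bearing reported by camera A
angle_b = math.atan2(3, 2 - 10)  # bearing reported by camera B
print(triangulate((0, 0), angle_a, (10, 0), angle_b))  # recovers roughly (2.0, 3.0)
```

This is also why marker occlusion matters: if fewer than two cameras can see a marker, there are no rays to intersect and the position for that frame is lost.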

Infrared Cameras: Often used with reflective markers, infrared cameras can accurately capture motion in various lighting conditions.

Inertial Sensors: Central to inertial MoCap systems, these sensors measure the orientation and acceleration of the body part they are attached to. They are often used in environments where optical capture is challenging or where the actor needs a more natural setting.


Markers

In marker-based systems, markers are what the cameras track to capture movement. Therefore, markers are placed at key anatomical points on the actor’s body, such as joints or the tips of limbs. 

Placement is key – if the markers aren’t in the right place, it will affect how the performance looks in post-production.

Reflective Markers: Commonly used in optical systems, these spherical markers reflect light back to the cameras, making them easily trackable. They are lightweight and come in various sizes to suit different capture requirements.

LED Markers: In some optical systems, LED markers emit light to provide a consistent signal for the cameras to track. They’re useful in situations where lighting conditions may not be stable.

Suit Technology

MoCap suits are designed to be flexible, comfortable, and conducive to accurate data capture. They have specific points for marker attachment or are embedded with sensors (in the case of inertial systems). Inertial suits typically embed around 17 sensors, while optical full-body marker sets often use 40 or more markers to ensure full coverage.

The material of the suits is typically lightweight and non-restrictive, allowing for a full range of motion.

Capturing Data

Capturing motion data is the initial and most crucial phase in MoCap. If the data isn’t captured correctly, it will require either costly reshoots or time-consuming manual keyframing in post-production.

To help prevent any issues, MoCap systems employ multiple cameras set up around the capture space. Even small-scale projects can use between 6 and 12 cameras, while large productions can use upwards of 100.

The next step is calibration, which aligns the system’s spatial understanding with the real world. The process defines the capture space and ensures the cameras are synced and oriented correctly.

You also need to calibrate the actor with the system: the actor performs a set of range-of-motion exercises so the system can identify each marker and confirm the cameras are capturing everything.
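One small part of calibration can be sketched in code: many optical systems have the operator wave a wand of known length through the volume, and the system uses it to recover the real-world scale of its coordinates. Real calibration also solves each camera’s position, orientation and lens distortion; this simplified example (with an assumed 500 mm wand) recovers scale only.

```python
import math

KNOWN_WAND_LENGTH_MM = 500.0  # assumed physical length of the calibration wand

def scale_from_wand(p1, p2):
    """Millimetres per system unit, from one observation of the wand's endpoints."""
    observed = math.dist(p1, p2)  # endpoint distance in the system's raw units
    return KNOWN_WAND_LENGTH_MM / observed

# The cameras report the wand endpoints in arbitrary internal units:
scale = scale_from_wand((0.0, 0.0, 0.0), (2.5, 0.0, 0.0))
print(scale)  # 200.0 mm per internal unit
```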

Once calibration is completed, the system connects all the markers on the actor. On-screen, this looks like a 3D skeleton. The skeleton is what will be used to animate the digital character.

Then, it’s time to capture the data. Several factors can affect the quality of the capture:


Environmental Factors: Lighting conditions, background interference, and the physical setup can significantly impact data quality. Optical systems, in particular, require controlled lighting to ensure accurate marker tracking.

Marker Occlusion: In optical systems, markers can be obscured from the camera’s view, leading to gaps in the data. This issue requires careful planning of camera positions and sometimes post-capture data reconstruction.

Sensor Accuracy: In inertial systems, ensuring the accuracy and consistency of sensor data is crucial. Factors like sensor drift or external magnetic interference can affect data integrity.

Processing Raw Data

Before the data is used in animation, it’s important to filter out any issues like noise or errors. This refinement process includes:

Noise Reduction: Eliminating extraneous data that doesn’t represent actual movement.

Gap Filling: In cases of marker occlusion or sensor error, missing data points are estimated and filled in to maintain continuity in motion.

Data Smoothing: Smoothing out jerky or unnatural movements to ensure the motion looks fluid and lifelike.
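The steps above can be sketched on a single 1D marker coordinate: linear interpolation fills an occluded frame, and a moving average serves as a simple stand-in for both noise reduction and smoothing. Production pipelines work on full 3D trajectories with far more robust filters; this is only a minimal illustration.

```python
def fill_gaps(track):
    """Linearly interpolate over None values (occluded frames)."""
    track = list(track)
    for i, v in enumerate(track):
        if v is None:
            prev = next(j for j in range(i - 1, -1, -1) if track[j] is not None)
            nxt = next(j for j in range(i + 1, len(track)) if track[j] is not None)
            frac = (i - prev) / (nxt - prev)
            track[i] = track[prev] + frac * (track[nxt] - track[prev])
    return track

def smooth(track, window=3):
    """Moving average over a small window: basic noise reduction and smoothing."""
    half = window // 2
    out = []
    for i in range(len(track)):
        chunk = track[max(0, i - half):i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

noisy = [0.0, 1.1, None, 2.9, 4.2]  # one occluded frame
filled = fill_gaps(noisy)
print(filled)          # the gap is replaced by an interpolated value
print(smooth(filled))  # then the whole track is smoothed
```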

Integration with Animation

Integrating mo-cap data with CGI is a meticulous and often time-consuming process that involves mapping the captured movements onto computer-generated characters. This begins with importing cleaned and refined MoCap data into animation software, where it is synchronised with the CGI elements.

One key aspect of integration is ensuring that the MoCap data aligns perfectly with the CGI environment. Animators need to adjust the digital character’s scale, orientation, and positioning within the virtual scene to ensure consistency and believability.

Rigging and Keyframe Animation

Rigging is the term for creating the digital skeleton that will be animated using the MoCap data. This skeleton acts as a framework to which the CGI model is attached. 

It’s crucial that the rig accurately represents the range of motion and physical constraints of the real-world subject to maintain realism. 

Once the rig is created, animators map the captured skeleton onto the digital one. If the target (digital) skeleton has different proportions from the source, retargeting ensures the movements still match.
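A very reduced sketch of retargeting: joint rotations are copied across as-is, while the root translation is scaled by the ratio of the characters’ heights, so a smaller character covers proportionally less ground. Real retargeting handles per-bone proportions, foot contacts and more; the data layout here is hypothetical.

```python
def retarget(frames, source_height, target_height):
    """Copy joint rotations; scale root translation by the size ratio."""
    scale = target_height / source_height
    retargeted = []
    for frame in frames:
        retargeted.append({
            "rotations": frame["rotations"],  # joint angles transfer directly
            "root_pos": tuple(c * scale for c in frame["root_pos"]),
        })
    return retargeted

actor_frames = [
    {"rotations": {"knee": 45.0}, "root_pos": (0.0, 0.0, 1.0)},
    {"rotations": {"knee": 60.0}, "root_pos": (0.0, 0.5, 1.0)},
]

# Retarget a 1.8 m actor's motion onto a 0.9 m fantasy character
print(retarget(actor_frames, 1.8, 0.9))
```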

Animators then use keyframe animation to enhance or modify the data. This might involve exaggerating movements, altering the timing, or adding movements where no MoCap data is available. 

The blend of MoCap and keyframe animation allows for a balance between realistic movement and stylised, artistic expression.

The ultimate goal of integrating mo-cap with CGI is to achieve a level of realism convincing to the viewer. This involves accurate movements and the portrayal of subtle nuances like weight, momentum, and emotional expression through motion.

This presents several challenges for animators:

Maintaining Character Integrity: Ensuring the CGI character’s movements align with its physical attributes and the intended portrayal can be challenging. For instance, a character with a larger build will move differently than a slender one.

Facial Animation: Integrating facial expressions captured through MoCap with the body movements requires meticulous attention to detail to ensure synchronicity and emotional congruence.

Environmental Interaction: Simulating realistic interactions between the CGI character and its environment (like touching or lifting objects) requires carefully coordinating MoCap data and CGI physics.

The Finished Product

Motion capture is a crucial part of 3D animation. It helps to cut down on animation time (it’s faster to record the movements than manually keyframe them) and delivers more realistic movements.

As an artistic tool, it helps create stunning games and films like Red Dead Redemption 2. Elsewhere, motion capture serves professionals in a range of industries, whether they’re coaching a sports team or developing treatments for patients. Motion capture apps like ours are making this amazing technology accessible to animators on every budget.

Image of the Performit Live app mock-up

Ready to bring your unique characters to life with the magic of motion capture?

Get the Performit Live app now and start creating unforgettable animations. Let Performit Live help you tell your story. 
