How to Build a Motion Recognition Engine

We all know it's difficult to build software that understands motion. However, there have been significant advances in the field that are making it more approachable for many engineers. For those who are developing a motion-enabled product, here is one framework that we use at Kiwi to develop an effective motion recognition engine.


One thing we've learned is that different categories of motions should be approached differently. Specifically, there are two very distinct categories of motions: Activity and Movement.


First up is activity recognition. An activity is a state the user is in, for example sitting, standing, walking, or running. Here, we typically measure how many times an activity occurred or how long it lasted.
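As a rough illustration of the "how many times" and "how long" measurements, here is a minimal sketch. It assumes a classifier has already produced one activity label per sensor sample; the function name `summarize_activity` and the fixed sample period are hypothetical choices for this example, not part of the Kiwi engine.

```python
from itertools import groupby


def summarize_activity(labels, sample_period_s):
    """Return {activity: (bout_count, total_seconds)} from per-sample labels.

    A "bout" is an uninterrupted run of the same label.
    """
    summary = {}
    for activity, run in groupby(labels):
        run_length = sum(1 for _ in run)
        count, seconds = summary.get(activity, (0, 0.0))
        summary[activity] = (count + 1, seconds + run_length * sample_period_s)
    return summary


# Hypothetical label stream sampled every 0.5 seconds.
labels = ["sitting"] * 4 + ["walking"] * 6 + ["sitting"] * 2 + ["running"] * 3
print(summarize_activity(labels, sample_period_s=0.5))
# {'sitting': (2, 3.0), 'walking': (1, 3.0), 'running': (1, 1.5)}
```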



Movement recognition is a little more complex. Here you want to capture specific movements, not states. The range of possible movements is practically unlimited: a bat swing, torquing a wrench, an arm raise, and so on. The questions to answer when measuring a movement are typically the following: when did the movement happen, how was it performed, and what information from it is useful?
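To make the "when" and "how" questions concrete, the sketch below segments a stream of accelerometer samples into discrete movement events. It is only one simple way to do this: the magnitude threshold, the minimum quiet gap, and the name `detect_movements` are assumptions made for illustration.

```python
import math


def detect_movements(samples, sample_rate_hz, threshold=1.5, min_gap_s=0.5):
    """Return a list of (start_s, end_s, peak_magnitude) movement events.

    samples: iterable of (x, y, z) accelerations, assumed to be in g.
    An event ends once the magnitude has stayed below `threshold`
    for at least `min_gap_s` seconds.
    """
    min_gap = int(min_gap_s * sample_rate_hz)
    events = []
    start = last_active = None
    peak = 0.0
    for i, (x, y, z) in enumerate(samples):
        magnitude = math.sqrt(x * x + y * y + z * z)
        if magnitude >= threshold:
            if start is None:
                start, peak = i, magnitude
            else:
                peak = max(peak, magnitude)
            last_active = i
        elif start is not None and i - last_active >= min_gap:
            events.append((start / sample_rate_hz, last_active / sample_rate_hz, peak))
            start = None
    if start is not None:  # stream ended while a movement was still open
        events.append((start / sample_rate_hz, last_active / sample_rate_hz, peak))
    return events


# Hypothetical 10 Hz stream: rest, a short swing, rest, another swing.
rest, swing = (0.0, 0.0, 1.0), (2.0, 0.0, 0.0)
samples = [rest] * 5 + [swing] * 3 + [rest] * 10 + [swing] * 2 + [rest] * 2
print(detect_movements(samples, sample_rate_hz=10))
# [(0.5, 0.7, 2.0), (1.8, 1.9, 2.0)]
```

The start and end times answer "when", while the peak magnitude is one example of "how"; a real engine would likely extract richer information from each segment.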


The Kiwi Framework

We've broken down the Kiwi framework into four parts: data collection, analysis type, personalization, and machine learning. Once you know whether you're doing activity recognition or movement recognition, you can approach these parts with more clarity.


1. Collect data

The cornerstone of any motion engine? A rich data set. Investing resources in collecting plenty of relevant data will increase the effectiveness of your motion recognition engine. The data can be broken into three types, all necessary for creating a motion engine:


Clean data - The basis for creating a model with different motions. This data is composed of moderated, consistent repetitions of a motion, with enough rest time between repetitions.


Natural data - Important for creating a model applicable in the real world. This data helps filter out false positives. In the real world, data is never clean: users differ from one another, a single user performs the same motion inconsistently, and unrelated movement happens between motions.


Garbage data - This data can be used to parameterize the motion models. Think of it as movements the user may perform that should not be recognized as the target motion. For example, if we want to track only the number of jabs during a boxing session, we should also collect data on other punch types (hooks, uppercuts, etc.) to avoid misclassification.


[Figure: Data collection types. A visual representation of these data types.]

2. Pattern-Based Analysis or Feature-Based Analysis?

There are two techniques you can use depending on which type of activities and motions are being tracked: pattern-based or feature-based. Pattern-based analysis is relatively simple to use and is perfect for large and distinctive motions (such as stick movements and ball pitching).
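One common way to do pattern-based analysis is to slide a recorded template over the incoming signal and score the similarity at each position. The sketch below uses normalized cross-correlation for the score. That choice is an assumption for illustration (dynamic time warping is another frequent option), and the names `normalized_correlation` and `find_pattern` are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0
    return sum(p * q for p, q in zip(da, db)) / denom


def find_pattern(signal, template, min_score=0.9):
    """Return (index, score) of the best template match, or None if too weak."""
    n = len(template)
    best = None
    for i in range(len(signal) - n + 1):
        score = normalized_correlation(signal[i:i + n], template)
        if best is None or score > best[1]:
            best = (i, score)
    if best is not None and best[1] >= min_score:
        return best
    return None


# Hypothetical template from clean data, and a signal holding a larger copy of it.
template = [0.0, 1.0, 3.0, 1.0, 0.0]
signal = [0.1, 0.0, 0.1, 0.0, 2.0, 6.0, 2.0, 0.0, 0.1, 0.0]
print(find_pattern(signal, template))  # best match starts at index 3
```

Because the score is normalized, the same shape performed with more or less force still matches, which suits large, distinctive motions.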


Feature-based analysis is a bit more sophisticated. It draws on various analysis techniques, such as digital signal processing and statistical analysis. It becomes a powerful approach when recognizing small motions and less distinctive motions.
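As a small example of what a feature might look like, the sketch below summarizes one window of sensor data with a few statistical features and one signal-processing feature (the dominant frequency, found with a naive discrete Fourier transform). The feature set and the name `extract_features` are assumptions chosen for illustration; a production engine would likely use an FFT library and many more features.

```python
import cmath
import math
import statistics


def extract_features(window, sample_rate_hz):
    """Summarize one window of samples as a small feature dictionary."""
    n = len(window)
    mean = statistics.fmean(window)
    centered = [v - mean for v in window]

    # Naive DFT over the positive-frequency bins; keep the strongest one.
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power

    return {
        "mean": mean,
        "std": statistics.pstdev(window),
        "peak_to_peak": max(window) - min(window),
        "dominant_freq_hz": best_bin * sample_rate_hz / n,
    }


# Hypothetical one-second window at 50 Hz containing a small 2 Hz oscillation.
window = [1.0 + 0.5 * math.sin(2 * math.pi * 2 * t / 50) for t in range(50)]
features = extract_features(window, sample_rate_hz=50)
print(features["dominant_freq_hz"])  # 2.0
```

Feature vectors like this can then be fed to a statistical classifier, which is what lets the approach separate motions whose raw shapes look similar.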


3. Personalization

Each individual has different footprints for the same motion. The recognition engine should account for all the variations.


But be warned: an engine that is too tolerant of variation may produce false positives. You need to fine-tune it and find what works best for your use case. This is where natural and garbage data are extremely handy, because they help you develop a filter based on the collected data.
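One simple form such a filter can take is a score threshold tuned against collected data. The sketch below assumes each recorded repetition has already been given a similarity score by the engine, and picks the threshold that maximizes recall minus false-positive rate (Youden's J statistic). The function name `pick_threshold` and the example scores are hypothetical.

```python
def pick_threshold(target_scores, garbage_scores):
    """Return (threshold, recall, false_positive_rate) that best separates
    target-motion scores from garbage-motion scores.

    A motion is accepted when its score is >= threshold.
    """
    best = None
    for threshold in sorted(set(target_scores) | set(garbage_scores)):
        recall = sum(s >= threshold for s in target_scores) / len(target_scores)
        fpr = sum(s >= threshold for s in garbage_scores) / len(garbage_scores)
        quality = recall - fpr  # Youden's J statistic
        if best is None or quality > best[0]:
            best = (quality, threshold, recall, fpr)
    return best[1:]


# Hypothetical scores for one user's jabs (target) and hooks (garbage).
jab_scores = [0.92, 0.88, 0.95, 0.81, 0.90]
hook_scores = [0.60, 0.72, 0.78, 0.55, 0.65]
print(pick_threshold(jab_scores, hook_scores))  # (0.81, 1.0, 0.0)
```

Running this per user, on that user's own repetitions, is one way to account for individual variation without loosening the engine for everyone.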

4. Machine Learning

Now that we know how to account for personalization, we need to understand how each person changes. Over time, each user's motions will change and develop, especially in use cases like sports analytics. Make sure to have a layer in the engine where the models can be incrementally trained over time.
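As a very small stand-in for incremental training, the sketch below nudges a per-user motion template toward each newly confirmed repetition using an exponential moving average. This is an assumption made for illustration, not a description of how the Kiwi engine learns; the name `update_template` and the learning rate are hypothetical.

```python
def update_template(template, new_repetition, learning_rate=0.1):
    """Blend a confirmed repetition into the stored template.

    A higher learning_rate adapts faster but is more sensitive to
    a single unusual repetition.
    """
    if len(template) != len(new_repetition):
        raise ValueError("template and repetition must have the same length")
    return [
        (1 - learning_rate) * old + learning_rate * new
        for old, new in zip(template, new_repetition)
    ]


# Hypothetical stored template and a stronger, newly confirmed repetition.
template = [0.0, 1.0, 3.0, 1.0, 0.0]
repetition = [0.0, 2.0, 4.0, 2.0, 0.0]
print(update_template(template, repetition, learning_rate=0.5))
# [0.0, 1.5, 3.5, 1.5, 0.0]
```

The same idea carries over to statistical models that support online updates: train on a batch first, then keep refining with each user's new, verified data.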

As the famous computer science idiom goes, “garbage in, garbage out.” It all starts from a rich data set - your model will reflect the quality of the data collected. Once the data is collected and organized, the rest will follow.