Bayesian Inference: An application to Kinect data

by Javier Rico Reche (jvirico@gmail.com)

Introduction

In this work we classify body positions taken from skeletal movement sequences recorded with a Kinect device.

First, we train two Bayesian models, Naïve Bayes and a Linear Gaussian Model; the latter accounts for dependencies between the positions of different parts of the skeleton.

Second, we classify a set of new instances with both methods. Finally, we evaluate the results by estimating the accuracy with Stratified K-Fold Cross-Validation.
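The evaluation step can be illustrated with a minimal sketch of stratified K-fold splitting. The helper names stratified_k_fold and accuracy are hypothetical and are not taken from the project code; the sketch only assumes that each instance has one class label and that every fold should keep roughly the same class proportions.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class proportions."""
    rng = random.Random(seed)

    # Group the instance indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle each class and deal its indices round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold is used once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def accuracy(true_labels, predicted_labels):
    """Fraction of instances whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

The accuracy estimate would then be the mean of the per-fold accuracies, each computed on a test fold with a model trained on the remaining folds.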

Data

We use a fraction of the Kinect Gesture Dataset from Microsoft Research Cambridge, which consists of sequences of human movements, represented as body-part locations, together with the gesture that a system should recognize. Although the original dataset contains more than 6 hours of recordings of 30 people, our subset has 2045 instances of body positions for 4 classes: 'arms lifted', 'right arm extended to one side', 'crouched' and 'right arm extended to the front'.

The data is provided as a Matlab .mat file and loaded in Python with the scipy.io.loadmat function.

It is organized in three data structures:

data: a 3D array (20x3x2045) with the 3D coordinates of the 20 joints for each of the 2045 instances.
labels: a vector (2045x1) with the class label for each instance.
individuals: a vector (2045x1) with the identifier of the individual for each instance.
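As a rough illustration of this layout, the sketch below builds synthetic arrays with the shapes listed above and shows how one instance, one class, and a flat 60-feature view can be extracted. The random values and the label encodings (1 to 4 for classes, 1 to 30 for individuals) are assumptions made only for the example; they are not the real recordings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins with the documented shapes (placeholder values only).
data = rng.normal(size=(20, 3, 2045))               # joints x coordinates x instances
labels = rng.integers(1, 5, size=(2045, 1))         # assumed encoding: classes 1..4
individuals = rng.integers(1, 31, size=(2045, 1))   # assumed encoding: people 1..30

# One instance: the (x, y, z) coordinates of all 20 joints.
skeleton = data[:, :, 0]                            # shape (20, 3)

# All instances of one class for a single joint.
class_mask = labels.ravel() == 1
joint0_class1 = data[0][:, class_mask]              # shape (3, n_class_1)

# Flat view with one row per instance and 60 features (joint-major order):
# feature index = 3 * joint + coordinate.
X = data.reshape(60, -1).T                          # shape (2045, 60)
```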

Model Object

The parameters are stored in an object of class model as the one below.

class model:
    def __init__(self, G=None):
        self.connectivity = G
        self.class_priors = []
        self.jointparts   = []

# Auxiliary classes to store
# information about the parameters,
# depending on the type of model used.
class NBJoint:
    def __init__(self):
        self.means  = []
        self.sigmas = []
        
class LGMJoint:
    def __init__(self):
        self.HC_means  = []
        self.HC_sigmas = []
        self.betas     = []
        self.sigmas    = []

Models

Naïve Bayes

We assume strong (naïve) independence between the features given the class. Each of the 60 variables (20 joints x 3 coordinates), all continuous measurements, is modeled with a Gaussian distribution.
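Under this independence assumption, the score of a class is the log prior plus the sum of the per-feature Gaussian log densities, and the predicted class is the one with the highest score. The sketch below is a minimal illustration with hypothetical helper names (gaussian_log_pdf, naive_bayes_log_scores) and a toy example with 3 features instead of 60; it is not the project's own classification routine.

```python
import math


def gaussian_log_pdf(x, mean, sigma):
    """Log density of a univariate Gaussian with standard deviation sigma."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mean) ** 2 / (2 * sigma ** 2))


def naive_bayes_log_scores(x, class_priors, means, sigmas):
    """Return log P(c) + sum_i log N(x_i | mean_ci, sigma_ci) for every class c."""
    scores = []
    for prior, class_means, class_sigmas in zip(class_priors, means, sigmas):
        score = math.log(prior)
        # Independence lets the joint likelihood factor into a sum of logs.
        for value, mean, sigma in zip(x, class_means, class_sigmas):
            score += gaussian_log_pdf(value, mean, sigma)
        scores.append(score)
    return scores


# Toy example: two equally likely classes, three features each.
priors = [0.5, 0.5]
means = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
sigmas = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

scores = naive_bayes_log_scores([0.9, 1.1, 0.8], priors, means, sigmas)
predicted = scores.index(max(scores))   # index of the highest-scoring class
```

Working with log densities avoids the numerical underflow that multiplying 60 small probabilities would cause.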

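import numpy as np

def fit_gaussian(X):
    """Return the mean and standard deviation of the observations in X."""
    mean = np.mean(X)
    sigma = np.std(X)  # population standard deviation (ddof=0)

    return (mean, sigma)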

To estimate the parameters (mean and standard deviation) of a set of observations we use the function fit_gaussian.
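A short usage sketch follows, with fit_gaussian repeated so the example runs on its own. The sample values are made up for illustration. Note that np.std divides by N by default (ddof=0); passing ddof=1 would give the sample estimate instead, which is shown only for comparison.

```python
import numpy as np


def fit_gaussian(X):
    """Return the mean and standard deviation of the observations in X."""
    return (np.mean(X), np.std(X))


# Made-up observations of a single coordinate of a single joint.
samples = np.array([1.0, 2.0, 3.0, 4.0])

mean, sigma = fit_gaussian(samples)
variance = sigma ** 2                      # the variance is the squared sigma

# For comparison only: the sample standard deviation divides by N - 1.
sample_sigma = np.std(samples, ddof=1)
```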