Motion capture technique

Published: November 9, 2015

1. Background of the research

Motion capture has come to play an important role in production environments because it gives easy access to animation assets. Many applications depend on motion capture data for realistic human motion. As more motion data becomes available, motion databases grow larger and larger, and the limited storage available for such massive data has become a problem. A lossy compression algorithm for large motion databases has therefore been proposed. The method can compress 1080 MB of data into 35.5 MB with little visible degradation: although the data has been compressed, the compressed motion is perceptually very close to the original.
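As a quick check of what those two sizes imply, the arithmetic can be written out as below. Only the two sizes come from the text; the variable names are illustrative.

```python
# Sizes quoted in the text, in megabytes.
original_mb = 1080.0
compressed_mb = 35.5

ratio = original_mb / compressed_mb          # how many times smaller the result is
saved = 1.0 - compressed_mb / original_mb    # fraction of storage saved

print(f"compression ratio: about {ratio:.1f}:1")
print(f"storage saved: about {saved:.1%}")
```

In other words, the compressed database is roughly one thirtieth of the original size.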

1.1. Compression of Motion Capture Databases

Compression of audio and video is an important problem, and many good solutions have been proposed. Previous work on animation compression mainly focuses on compressing animated meshes. Principal component analysis (PCA) can be used to compress shapes or animations, and clustered PCA (CPCA) is an effective way of compressing high-dimensional data. Sattler et al. introduced a clustered-PCA-based approach for compressing mesh animations. Typical human and animal motion has three important properties: the degrees of freedom are correlated with each other, they have temporal coherence, and a database contains many copies of similar-looking motions. For motion capture, these degrees of freedom are typically the character's global position and orientation and a set of joint angles.

In human and animal motion, there are many correlations between joint actions. These correlations are especially clear for a repetitive motion like walking. For example, as the right foot steps forward, the left arm swings forward; likewise, when the hip angle has a certain value, the knee angle is most likely to fall within a certain range.

2. The main idea of the proposed algorithm

The proposed algorithm assumes that the motion database is a single long motion, that the motion is sampled at regular intervals, and that the number of degrees of freedom does not change. The motion is represented as a vector-valued function M(t) whose components are the degrees of freedom at time t. For compression, the motion database is split into clips of k consecutive frames, where k is a compression parameter.
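The splitting step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and padding the last clip by repeating the final frame is an assumption, since the source does not say how leftover frames are handled.

```python
def split_into_clips(frames, k):
    """Split a list of frames into consecutive clips of k frames each.

    Assumption: a final partial clip is padded by repeating its last frame.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    clips = []
    for start in range(0, len(frames), k):
        clip = list(frames[start:start + k])
        while len(clip) < k:
            clip.append(clip[-1])   # pad the leftover clip
        clips.append(clip)
    return clips

# Toy motion M(t): 10 frames, each holding 3 degrees of freedom.
motion = [[float(t), 2.0 * t, 0.5 * t] for t in range(10)]
clips = split_into_clips(motion, k=4)
print(len(clips), "clips of", len(clips[0]), "frames")
```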

To compress the motion representation, the degrees of freedom are first converted into a positional representation, and the algorithm then works with joint positions rather than joint angles because joint positions behave more linearly. With angles, for instance, a halfway orientation between a and b is not necessarily (a+b)/2. The joint angles can be recovered directly from the global positions of 3 distinct, known points in the local coordinate frame of each bone. This allows the algorithm to work with virtual marker positions (3 for each bone).
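The remark about (a+b)/2 can be illustrated with a single rotation angle in degrees. This is a deliberate simplification (the source deals with full 3D joint orientations), and both function names are hypothetical.

```python
import math

def naive_midpoint(a_deg, b_deg):
    """Arithmetic mean of two angles, ignoring wrap-around."""
    return (a_deg + b_deg) / 2.0

def circular_midpoint(a_deg, b_deg):
    """Halfway angle along the shorter arc between a and b, in [0, 360)."""
    a, b = math.radians(a_deg), math.radians(b_deg)
    # Average the two unit vectors, then convert back to an angle.
    x = math.cos(a) + math.cos(b)
    y = math.sin(a) + math.sin(b)
    return math.degrees(math.atan2(y, x)) % 360.0

naive = naive_midpoint(350.0, 10.0)        # points the opposite way
halfway = circular_midpoint(350.0, 10.0)   # close to 0 (equivalently 360)
print(naive, halfway)
```

Positions do not wrap around in this way, which is one reason a positional representation is easier to fit and average.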

For a smooth representation, cubic Bezier curves are fitted to this positional representation using least squares.

For every virtual marker, 4 * 3 = 12 numbers are stored per clip: the x, y, and z coordinates of the 4 control points of a cubic Bezier curve.
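A least-squares Bezier fit of one marker trajectory can be sketched with NumPy as below. The uniform mapping of the k frames onto the curve parameter t in [0, 1] is an assumption, and the function names are hypothetical.

```python
import numpy as np

def bernstein_basis(k):
    """(k, 4) matrix of cubic Bernstein polynomials at k uniform samples in [0, 1]."""
    t = np.linspace(0.0, 1.0, k)
    return np.stack([(1 - t) ** 3,
                     3 * t * (1 - t) ** 2,
                     3 * t ** 2 * (1 - t),
                     t ** 3], axis=1)

def fit_cubic_bezier(points):
    """Least-squares fit of 4 control points to a (k, 3) marker trajectory."""
    points = np.asarray(points, dtype=float)
    basis = bernstein_basis(len(points))
    control, *_ = np.linalg.lstsq(basis, points, rcond=None)
    return control                      # shape (4, 3): 12 numbers per marker

# Toy trajectory over one clip of k = 16 frames (cubic, so the fit is exact).
k = 16
t = np.linspace(0.0, 1.0, k)
trajectory = np.stack([t, t ** 2, t ** 3], axis=1)

control_points = fit_cubic_bezier(trajectory)
reconstructed = bernstein_basis(k) @ control_points
max_error = np.abs(reconstructed - trajectory).max()
print(control_points.size, "numbers stored; max error", max_error)
```

For real marker data the fit is only approximate, and the residual is part of the loss introduced by the compression.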

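For clustering and projection, each clip is treated as a point in a d-dimensional space, where d = 12 * 3 * (number of bones), that is, 12 numbers for each of the 3 virtual markers on every bone. The clips are clustered, and each clip is projected onto the basis of its cluster:

$$x_i' = P_c (x_i - m_c)$$

where $x_i$ is the vector of clip i, $m_c$ is the cluster mean, and $P_c$ is the basis matrix of cluster c.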

In big motion databases, some motions are very common. Clips of these common motions belong to the same cluster and have similar projected coordinates. The algorithm takes advantage of this observation by quantizing the projected coordinates. Because d is quite large, the dimensionality has to be reduced using clustered PCA (CPCA).
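The per-cluster projection and its inverse can be sketched with NumPy as follows. This is an illustration under assumptions: the clustering step itself is not shown (the clips are taken to belong to one cluster already), the basis is computed with an SVD, and all function names are hypothetical.

```python
import numpy as np

def fit_cluster_basis(X, n_components):
    """Return the mean m_c and basis P_c for the clips X (rows) of one cluster.

    P_c has shape (n_components, d); its rows are orthonormal principal directions.
    """
    m_c = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - m_c, full_matrices=False)
    return m_c, vt[:n_components]

def project(x, m_c, P_c):
    """Projected coordinates: P_c (x - m_c)."""
    return P_c @ (x - m_c)

def reconstruct(x_proj, m_c, P_c):
    """Approximate clip vector: P_c^T x' + m_c."""
    return P_c.T @ x_proj + m_c

# Toy cluster: 50 clips in d = 36 dimensions (one bone) lying on a 2D subspace.
rng = np.random.default_rng(0)
d = 12 * 3 * 1
X = rng.normal(size=(50, 2)) @ rng.normal(size=(2, d)) + 5.0

m_c, P_c = fit_cluster_basis(X, n_components=2)
x_proj = project(X[0], m_c, P_c)
x_back = reconstruct(x_proj, m_c, P_c)
error = np.abs(x_back - X[0]).max()
print(d, "numbers reduced to", x_proj.size, "with max error", error)
```

Because the toy data has only two underlying directions, two coordinates reproduce it almost exactly; real clips would need more components and would leave a residual.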

The projected coordinates are quantized, and the resulting integer coordinates are compressed using entropy encoding. These integers have low entropy because the coordinates are densely clustered.
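The effect of quantization on entropy can be sketched as below. The step size, the sample values, and the function names are illustrative assumptions; the entropy encoder itself is not implemented here, and the Shannon entropy is only used to estimate how few bits per symbol such an encoder could need.

```python
import math
from collections import Counter

def quantize(values, step):
    """Map each value to the index of its nearest multiple of step."""
    return [round(v / step) for v in values]

def dequantize(codes, step):
    """Undo the quantization (up to an error of at most step / 2)."""
    return [c * step for c in codes]

def entropy_bits(symbols):
    """Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Densely clustered projected coordinates (toy values around 0 and 1).
coords = [0.02, -0.01, 0.03, 0.98, 1.01, 0.00, 0.01, 1.02]
step = 0.25
codes = quantize(coords, step)
restored = dequantize(codes, step)
print(codes, f"{entropy_bits(codes):.3f} bits per symbol")
```

Only two distinct integers survive, so less than one bit per coordinate is needed on average in this toy case.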

For environmental contacts, the x, y, and z coordinates of the virtual markers on the feet are each represented as a 1D signal and transformed into frequency space using the Discrete Cosine Transform (DCT). The resulting coefficients are then quantized to a finite number of bits.
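The DCT step can be sketched in plain Python as below. The orthonormal DCT-II variant, the uniform quantizer, the 8-bit width, and the toy foot-height signal are all assumptions made for illustration; the source does not specify them.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a 1D signal."""
    n = len(signal)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        s += sum(c * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k, c in enumerate(coeffs) if k > 0)
        out.append(s)
    return out

def quantize_to_bits(coeffs, bits):
    """Uniformly quantize coefficients to signed integers of the given bit width."""
    peak = max(abs(c) for c in coeffs) or 1.0
    step = peak / (2 ** (bits - 1) - 1)
    return [round(c / step) for c in coeffs], step

# Toy signal: height of one foot marker over 12 frames (a single step).
foot_height = [0.0, 0.0, 0.01, 0.05, 0.12, 0.15, 0.12, 0.05, 0.01, 0.0, 0.0, 0.0]
codes, step = quantize_to_bits(dct(foot_height), bits=8)
decoded = idct([c * step for c in codes])
max_err = max(abs(a - b) for a, b in zip(foot_height, decoded))
print("max reconstruction error:", max_err)
```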

Three parameters control the trade-off between compression accuracy and compressed size.

k : The number of frames in a clip. The bigger this number is, the smoother the reconstructed signal will be. The optimal range is 16 to 32 frames.

T : The smaller this number is, the more coefficients will be needed for each clip. This value is related to the number of basis vectors in the matrix Pc.

Number of clusters for CPCA : A large number of clusters may create extra overhead.

In decompression, entropy decoding is performed first and the quantization of the cluster coordinates $x_i'$ is undone; the Bezier control points are then obtained as $x_i = P_c^T x_i' + m_c$.

These Bezier curves are resampled to get the 3D marker positions. The virtual markers on the feet are decompressed by performing the inverse DCT. Because the compression is lossy, the feet may not reach the decoded contact positions, so inverse kinematics (IK) is applied to satisfy the decoded foot positions. Lossy compression may also introduce discontinuities between adjacent clips, so a continuous trajectory is obtained by merging the clips with C1 continuity.
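The resampling step can be sketched as below: a decoded cubic Bezier curve is evaluated once per frame of the clip. The uniform spacing of frames over the parameter range [0, 1] is an assumption, the function names are hypothetical, and the IK and C1 merging steps are not covered.

```python
def bezier_point(control, t):
    """Evaluate a cubic Bezier curve with 4 (x, y, z) control points at t in [0, 1]."""
    weights = ((1 - t) ** 3,
               3 * t * (1 - t) ** 2,
               3 * t ** 2 * (1 - t),
               t ** 3)
    return tuple(sum(w * p[axis] for w, p in zip(weights, control))
                 for axis in range(3))

def resample(control, k):
    """Return k evenly spaced samples of the curve, one per frame of the clip."""
    if k < 2:
        raise ValueError("k must be at least 2")
    return [bezier_point(control, i / (k - 1)) for i in range(k)]

# Decoded control points of one virtual marker (toy values).
control = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 2.0, 0.0), (3.0, 0.0, 0.0)]
frames = resample(control, k=16)
print(frames[0], frames[-1])
```

The first and last samples coincide with the first and last control points, which is where adjacent clips meet and where any discontinuity would show up.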

3. Result of the algorithm

To evaluate the result, the research uses the root mean squared (RMS) error, defining closeness as the distance between the skin vertices of the compressed motion and those of the original.

Mc(t) : the compressed version of a motion

M(t) : the original version of a motion

n : the number of frames
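The exact formula is not reproduced in this text, so the following is only a plausible reading of these symbols: the squared distance between Mc(t) and M(t) is averaged over the n frames and the square root is taken. The function name and the toy data are hypothetical.

```python
import math

def rms_error(original, compressed):
    """RMS distance between two motions given as lists of per-frame vectors.

    Assumption: each frame is a flat list of skin-vertex coordinates, and the
    squared Euclidean distance per frame is averaged over the n frames.
    """
    n = len(original)
    if n == 0 or n != len(compressed):
        raise ValueError("motions must be non-empty and have the same length")
    total = 0.0
    for m, mc in zip(original, compressed):
        total += sum((a - b) ** 2 for a, b in zip(m, mc))
    return math.sqrt(total / n)

# Toy example: 2 frames, one vertex each.
M = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
Mc = [[0.0, 0.0, 0.3], [1.0, 0.4, 0.0]]
error = rms_error(M, Mc)
print(f"RMS error: {error:.4f}")
```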

As the compressed size goes down, the RMS error increases. When the compressed size is small, this method performs better than the baseline methods.

The table compares the compression methods at the same visual quality. The last column covers lossless compression methods.

4. Strengths and Limitations

Strengths

- Compression and decompression are fast. Decompression runs about 7 times faster than real time.

- It compresses massive motion databases without significant visual degradation.

- It generates the “knee pop”.

Limitation

- The major limitation of this method is that the environmental contacts must be known.