The motion.capture.free.fr website is dedicated to the Cheap Mocap Project. The goal of this project was to build a fully functional motion capture system using 'cheap' webcams.
You can see a step-by-step explanation of the methods used, and how the project was organized.
Beginning of the project : May 21, 2007
At the beginning of the first week, we didn't know which libraries or tools to use, and we had big gaps in our knowledge of 3D geometry and stereo reconstruction. We read Peter Sturm's very well made course on Computer Vision, and we familiarized ourselves with Visual C++ and OpenCV. Starting from nothing except examples from the internet, one of the hard parts was organizing our software and its development. We began by creating the software backbone : classes and methods.
We thought we could write our own calibration tool using the OpenCV methods (there is a method that detects corners on a chessboard, which can be very useful for calibration). But this is a long process and we wanted to spend most of our time on the motion capture itself, so we decided to use MatLab to calibrate our cams. That's why we built a function that imports the results from a MatLab file into our software.
For several days we experimented with ways to control the video stream of our webcams from our laptops. We wrote a simple program that could open the configuration panel of the webcams and take shots. We used it to take shots of a calibration pattern, which was a chessboard.
Meanwhile, we began to think about image processing algorithms able to detect white spots in a low-quality picture. The algorithms and methods are detailed on the image processing page.
Originally we didn't know whether we would be able to build real-time software, so we tried to assess the bandwidth of the USB ports and the capacity of our laptops. The results were disappointing : to get a real-time stream we had to plug one cam per USB controller, that is to say two cams maximum per laptop. What's more, we had a poor frame rate using the OpenCV capture methods, and given that the image processing algorithms need a lot of time to process a picture, we concluded that real-time capture was impossible on a single computer.
Luckily we found another way to capture the video stream : it's called CvCam and it's part of OpenCV. We also tried to spread the load over several computers.
This led us to the network. With one cam per client laptop (the eyes) and a server (the brain), we were able to get all the pictures we needed within 33 ms (the maximum cam frame rate was 30 FPS). The clients, once connected to the server, wait for a capture request, then take a shot, find the white spots, and send the coordinates of these spots to the server. The network also allows the captures to be synchronised. More details on the network page.
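To illustrate how little data a client has to send for each capture, here is a minimal sketch of packing the spot coordinates into a buffer. The message layout (a frame id, a spot count, then pairs of floats), the Spot struct and the packSpots/unpackSpots names are assumptions made for this example, not the actual protocol of the project. It also assumes that client and server share the same byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 2D spot as found by a client (pixel coordinates).
struct Spot { float x; float y; };
static_assert(sizeof(Spot) == 8, "Spot is expected to be two packed floats");

// Hypothetical wire format: [uint32 frame id][uint32 spot count][count * Spot].
std::vector<unsigned char> packSpots(std::uint32_t frameId, const std::vector<Spot>& spots) {
    const std::uint32_t count = static_cast<std::uint32_t>(spots.size());
    std::vector<unsigned char> buf(8 + spots.size() * sizeof(Spot));
    std::memcpy(&buf[0], &frameId, 4);
    std::memcpy(&buf[4], &count, 4);
    if (!spots.empty()) {
        std::memcpy(&buf[8], &spots[0], spots.size() * sizeof(Spot));
    }
    return buf;
}

// Server side: returns false if the buffer does not match the announced size.
bool unpackSpots(const std::vector<unsigned char>& buf, std::uint32_t& frameId,
                 std::vector<Spot>& spots) {
    if (buf.size() < 8) return false;
    std::uint32_t count = 0;
    std::memcpy(&frameId, &buf[0], 4);
    std::memcpy(&count, &buf[4], 4);
    if (buf.size() != 8 + static_cast<std::size_t>(count) * sizeof(Spot)) return false;
    spots.resize(count);
    if (count > 0) {
        std::memcpy(&spots[0], &buf[8], count * sizeof(Spot));
    }
    return true;
}
```

With this layout, 5 spots fit in a 48-byte message per client and per frame.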
After setting up the network, we ran some tests on its speed and bandwidth. Given that very little data travels over it, we were finally ready for real-time capture.
The first thing we needed now was a way to test what we had done. We added an OpenGL rendering thread on the server side to visualize the scene (only the cameras at the moment).
We tried different methods to process the images : we took shots of models dressed in black and wearing white spots, and tested our algorithms on them. In the end the best approach was to set the parameters when launching the program : we gave up on fully automatic configuration because the webcams differ, the lighting changes ...
Due to bugs in OpenCV's CvCam, we separated the software into two parts : a calibration part and a capture part. Why? We need one thread to run the network engine and another to get the video stream, yet the CvCam method that we invoke to start the capture creates its own thread, and we cannot put it in another thread. The solution we found : we start the network in the main thread, then we start the capture, which creates its own thread that we can no longer control (or stop) without exiting the program. Because we need a pause between calibration and capture, we decided to split the software into two parts.
Another main issue was the initialisation of the scene. We need to track the white spots and to identify them, because the server receives only unidentified 2D coordinates and therefore needs a starting point to track each spot. Our idea was to introduce an initialisation step in which the model spreads his arms and stays still. Then we click on the spots, in a predefined order, on each client. We set up a small tracking system so that the client keeps the spots identified. When the server sends the capture start request, every client sends its ordered data and the initialisation sequence is over.
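One simple way for a client to keep its spots identified from one frame to the next is a greedy nearest-neighbour match. The sketch below is only an illustration of that idea, not necessarily the tracking we implemented; Point2 and matchSpots are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Point2 { double x; double y; };

// 'previous' holds the labelled spots of the last frame (index = label, in the
// order the user clicked them). 'detections' holds the unlabelled spots of the
// new frame. Returns, for each label, the index of the closest detection not
// already taken, or -1 if no detection is left for it.
std::vector<int> matchSpots(const std::vector<Point2>& previous,
                            const std::vector<Point2>& detections) {
    std::vector<int> match(previous.size(), -1);
    std::vector<bool> used(detections.size(), false);
    for (std::size_t i = 0; i < previous.size(); ++i) {
        double best = std::numeric_limits<double>::max();
        for (std::size_t j = 0; j < detections.size(); ++j) {
            if (used[j]) continue;
            const double dx = detections[j].x - previous[i].x;
            const double dy = detections[j].y - previous[i].y;
            const double d2 = dx * dx + dy * dy;  // squared distance is enough to compare
            if (d2 < best) {
                best = d2;
                match[i] = static_cast<int>(j);
            }
        }
        if (match[i] >= 0) used[match[i]] = true;
    }
    return match;
}
```

A greedy match like this is cheap but can swap two labels when spots cross each other, which is one reason the initial pose with spread arms helps.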
We began the 3D reconstruction by coding the methods that reproject a 3D point onto a 2D cam picture and reconstruct a 3D point from several (minimum 2) shots taken by different cams. We implemented a set of mathematical tools to work on matrices and vectors.
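The two operations can be sketched as follows, assuming an ideal pinhole camera without lens distortion. The Camera and Ray structs and the midpoint triangulation are assumptions chosen for the example (the midpoint method is one common choice, not necessarily the one in our code), and the rays are assumed to have been built already from the calibrated cameras.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 add(const Vec3& a, const Vec3& b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 sub(const Vec3& a, const Vec3& b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 scale(const Vec3& a, double s) { return Vec3{a.x * s, a.y * s, a.z * s}; }
static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical pinhole camera: focal length f and principal point (cx, cy) in
// pixels; R (row-major) and t map world to camera coordinates: Xc = R * Xw + t.
struct Camera { double f, cx, cy; double R[3][3]; Vec3 t; };

// Reproject a 3D world point onto the image plane.
// Returns false when the point lies behind the camera.
bool project(const Camera& c, const Vec3& X, double& u, double& v) {
    Vec3 p;
    p.x = c.R[0][0] * X.x + c.R[0][1] * X.y + c.R[0][2] * X.z + c.t.x;
    p.y = c.R[1][0] * X.x + c.R[1][1] * X.y + c.R[1][2] * X.z + c.t.y;
    p.z = c.R[2][0] * X.x + c.R[2][1] * X.y + c.R[2][2] * X.z + c.t.z;
    if (p.z <= 0.0) return false;
    u = c.f * p.x / p.z + c.cx;
    v = c.f * p.y / p.z + c.cy;
    return true;
}

// A viewing ray in world coordinates: camera centre plus direction through a pixel.
struct Ray { Vec3 origin, dir; };

// Midpoint triangulation: find the closest points on the two rays and return
// the middle of the segment joining them. Returns false for parallel rays.
bool triangulate(const Ray& r1, const Ray& r2, Vec3& X) {
    const Vec3 w0 = sub(r1.origin, r2.origin);
    const double a = dot(r1.dir, r1.dir);
    const double b = dot(r1.dir, r2.dir);
    const double c = dot(r2.dir, r2.dir);
    const double d = dot(r1.dir, w0);
    const double e = dot(r2.dir, w0);
    const double denom = a * c - b * b;
    if (std::fabs(denom) < 1e-12) return false;
    const double s = (b * e - c * d) / denom;  // parameter along r1
    const double t = (a * e - b * d) / denom;  // parameter along r2
    const Vec3 p1 = add(r1.origin, scale(r1.dir, s));
    const Vec3 p2 = add(r2.origin, scale(r2.dir, t));
    X = scale(add(p1, p2), 0.5);
    return true;
}
```

With noisy webcam measurements the two rays almost never intersect exactly, which is why the sketch returns the midpoint of the shortest segment between them rather than an intersection.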
A lot of testing and debugging was needed to get a working program. We recalculated the calibration data : MatLab actually returns the coordinates of the chessboard in the camera system, whereas we thought they were the camera coordinates in the chessboard system. After a day of brain-breaking reflection about coordinate system changes, we managed to get a very good position and orientation for the cameras.
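The change of coordinate system comes down to inverting a rigid transform. If the calibration gives a rotation R and a translation t such that X_cam = R * X_board + t, then the camera orientation in the chessboard system is the transpose of R and the camera position is -R^T * t. A minimal sketch, where the Pose struct and the invert function are names invented for the example:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Rigid transform X_out = R * X_in + t, with R stored row-major.
struct Pose { double R[3][3]; Vec3 t; };

// Invert "chessboard expressed in the camera system" to get
// "camera expressed in the chessboard system".
// The returned translation is the camera centre in chessboard coordinates.
Pose invert(const Pose& boardInCamera) {
    Pose inv;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            inv.R[i][j] = boardInCamera.R[j][i];  // the inverse of a rotation is its transpose
        }
    }
    const Vec3& t = boardInCamera.t;
    inv.t.x = -(inv.R[0][0] * t.x + inv.R[0][1] * t.y + inv.R[0][2] * t.z);
    inv.t.y = -(inv.R[1][0] * t.x + inv.R[1][1] * t.y + inv.R[1][2] * t.z);
    inv.t.z = -(inv.R[2][0] * t.x + inv.R[2][1] * t.y + inv.R[2][2] * t.z);
    return inv;
}
```

A quick sanity check is that the camera centre, once transformed back with the original R and t, lands on the origin of the camera system.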
We tried our system without tracking, with only 1 ball, and it seems to work fine : the moves of the model in black are well represented in the OpenGL scene. We had to deal with a difference between OpenGL and OpenCV : OpenCV takes the pixel (0,0) in the top left corner, so the Y-axis is descending, but in OpenGL it is ascending. Now everything is all right !
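For pixel rows, the conversion between the two conventions is a simple flip. A minimal sketch, assuming an image whose origin is at the top and a target frame whose origin is at the bottom (flipY is a name invented for the example):

```cpp
#include <cassert>

// Convert a row index from a top-origin image (Y grows downwards)
// to a bottom-origin frame (Y grows upwards), for an image of the given height.
int flipY(int yTop, int imageHeight) {
    assert(yTop >= 0 && yTop < imageHeight);
    return imageHeight - 1 - yTop;
}
```

For a 480-line image, row 0 becomes row 479, and applying the flip twice gives back the original row.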
We are now working on the 3D tracking, using epipolar lines, in order to use our system with several balls.
We have written several versions of the tracking system : the first uses the epipolar constraint to evaluate whether 2 rays intersect, and the other uses the actual distances between the rays.
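The distance-based test can be sketched as follows: two spots seen by two cams are candidates for the same ball when their viewing rays pass close to each other. The code treats the rays as infinite lines, and the Ray struct, the raysMatch function and its tolerance parameter are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static double norm(const Vec3& a) { return std::sqrt(dot(a, a)); }

// A viewing ray: camera centre plus a non-zero direction through the detected spot.
struct Ray { Vec3 origin, dir; };

// Shortest distance between the two supporting lines of the rays.
double rayDistance(const Ray& a, const Ray& b) {
    const Vec3 n = cross(a.dir, b.dir);  // common perpendicular direction
    const Vec3 w = sub(b.origin, a.origin);
    const double nn = norm(n);
    if (nn < 1e-12) {
        // Parallel lines: distance from b.origin to line a.
        return norm(cross(w, a.dir)) / norm(a.dir);
    }
    return std::fabs(dot(w, n)) / nn;
}

// Two rays are accepted as seeing the same ball if they pass within
// 'tolerance' (same unit as the calibration, hypothetical parameter).
bool raysMatch(const Ray& a, const Ray& b, double tolerance) {
    return rayDistance(a, b) <= tolerance;
}
```

The tolerance has to be tuned to the calibration error: too small and real balls are rejected, too large and unrelated rays are accepted as matches.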
We conducted some experiments using 2, 3 and 4 cams, with 1 to 5 balls. That worked fine with 2 cams, but 2 cams are not enough when we have to deal with occlusions. With more than 2 cams, the number of "ghost" spots increases dramatically, because many rays intersect even where there is no spot, so the motion is not smooth : the tracking system sees a lot of balls and has to choose which of them match the spots in the previous position. This seems to be unstable, and we are now trying to find a way to improve it.
- What actually works :
We have a whole system that works fine with 2 cams, that is to say : the network, the calibration, the image processing, the 3D reconstruction, the tracking and the visualization.
- What problems we have encountered :
Weird behavior of VC++ with the network in debug mode : the accept() method doesn't work in debug mode, and we had some problems with it in release mode as well, depending on the scope (local or global) of some variables, for instance our Scene, which sometimes caused a crash of the network ...
The keyboard management that we tried to include in the select() call, using the STDIN stream, doesn't work on Windows.
The USB bandwidth is too weak to use several cams at the same time.
The HighGui capture methods provide too low a frame rate, so we needed to use CvCam.
We still need to use MatLab for the calibration because we chose to focus on the motion capture itself, but OpenCV has some tools that could be used to do the calibration ourselves in a further development.
We also had a problem with the matrices returned by MatLab because we misunderstood their meaning.
3D Reconstruction & Tracking :
We have some problems when we use 3 cams or more, due to the instability of the geometry that we use, as explained previously.
Image Processing :
Histogram equalization of the pictures didn't work well enough, so we switched to a manual configuration.
We barely managed to control the CvCam capture thread. The exit of the program is quite "dirty" because of that.
During the filming :
We had some lighting issues because of the "cheap" aspect of the project : we had to use sunlight because the lighting of the black room where we worked came from the ceiling and cast inconvenient shadows.
CONCLUSION
During this month we had to build a whole system from scratch, covering all the aspects detailed on this website. We also built our own algorithms for every phase of the process. We wanted to build a cheap mocap system, which is therefore unstable when using a lot of cameras, because the errors and imperfections of our cameras, which are cheap webcams, add up.
Nevertheless the system is quite satisfying with a few cams, considering that we had only 1 month to do all the work needed.
What equipment you need:
To perform a motion capture with our software you need :
- at least 2 webcams (Res. 640*480, 30fps)
- one or more computers (with several computers, prefer RJ45 cables and a switch, because we need a low ping for real time)
- an all-black outfit (trousers, sweatshirt, hood)
- white ping pong balls or something comparable
- a black room or large pieces of dark fabric
- a printed chessboard
Your cams must see only the dark part of the room. Try not to place them at the same height and orient them differently.
For the light, we used softened light (from the cloudy sky). Avoid direct light such as spotlights or neon tubes. Try to have homogeneous light.
First of all you need to register the proxytrans.ax file : open a console in the directory of this file and type regsvr32 proxytrans.ax
There are 2 phases : calibration and the motion capture itself.
Launch CalibrServer on the server computer, and CalibrClient on the others. Then follow the calibration instructions.
Once this step is done, DON'T MOVE THE CAMS (or you'll have to re-calibrate).
Launch CheapMocapServer and one CheapMocapClient per cam on the computers concerned (if one computer drives more than 1 cam, copy the executable and calibration files into a separate folder for each cam). Start by requesting the calibration settings with "C". You should see an OpenGL window appear. You can already move around the scene with the arrow keys and the mouse.
On the server, then launch camera initialisation with "I". On the clients, set up the cams and turn off all automatic settings. Set the fps to 30. Close the 2 settings windows.
Now it's time for the clients to set the threshold and erosion values. Use the sliders to do so. You need to see only the white spots corresponding to the balls (any other white pixel would disturb all the computation).
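To show what these two settings do, here is a minimal sketch of a binary threshold followed by a 3x3 erosion on a grayscale image. It is written in plain C++ for illustration and is not the OpenCV-based code of the clients; the Image type and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major grayscale image, width * height bytes.
typedef std::vector<unsigned char> Image;

// Pixels brighter than 'level' become white (255), all others black (0).
Image threshold(const Image& gray, unsigned char level) {
    Image out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = gray[i] > level ? 255 : 0;
    }
    return out;
}

// One pass of 3x3 erosion: a pixel stays white only if its 8 neighbours are
// white too. Border pixels are set to black. Small isolated white specks
// disappear, large blobs such as the balls shrink by one pixel.
Image erode3x3(const Image& bin, int width, int height) {
    assert(bin.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    Image out(bin.size(), 0);
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            bool keep = true;
            for (int dy = -1; dy <= 1 && keep; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    if (bin[(y + dy) * width + (x + dx)] == 0) {
                        keep = false;
                        break;
                    }
                }
            }
            out[y * width + x] = keep ? 255 : 0;
        }
    }
    return out;
}
```

Raising the threshold removes grey areas such as skin or folds in the fabric, while each extra erosion pass removes small bright noise at the cost of making the balls appear smaller.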
Final part for the client : spot initialisation. On the real image, click on each ball you see (in the same order if possible) and hit "Enter". When all clients are done, hit "S" on the server to start capturing. You can now see the balls moving in the OpenGL window.
Our system is not perfect and can still be improved :
We developed the network system in C, in the main functions of our programs, but this could be improved by treating the network engine as objects. We wanted to minimize the number of threads, but it would actually be "cleaner" to put this whole part in a separate thread, using classes and methods to handle it.
As we said previously, the calibration is done with MatLab, but OpenCV provides several methods to facilitate the calibration, such as cvFindChessboardCorners() and cvCalibrateCamera().
With greater control over the OpenCV and CvCam threads, we could have built a single program that handles every phase of the motion capture, including the calibration process.
Image Processing :
One improvement in this section would be to optimize the algorithms, for instance by processing only parts of the images rather than the whole image to detect the spots. We could also take into account the size of the spots given the distance separating the cameras from the actor.
The greatest improvements can be made in this part. We built our own tracking algorithms, but there are certainly better algorithms for this job that work in a different way, for instance the Lucas Kanade method. We chose 3D tracking, but 2D tracking may be efficient and lighter.
We could also use the Wiimote to narrow the zone where we search for the 3D spots.
One improvement for using our system with external animation software would be exporting the computed data in a format that could be used by Maya, for example.
Our software runs in console mode and is not user friendly, so a graphical UI, built with Qt for instance, could make it simpler and more attractive.
Here you can get our open-source Cheap Mocap program and its features.
Theoretical materials on 3D-Reconstruction:
Because of socket handling, we offer only a Windows-compatible release.
The project source code in a VC++ format: cheapMocap_Source.zip
Unfortunately MatLab is not free software. You can get more information here.
Configure your engine:
You have to modify some of your environment variables :
Right-click on "My Computer" and choose Properties. Then, under Advanced, click on the "Environment variables" button.
In the top frame, click on new. Create 3 variables in that order:
|QT||YourQtPath (for instance C:\Qt\4.2.3)|
|OPENCV||YourOpenCVpath (for instance C:\Program Files\OpenCV)|
Now let's take care of VC++.
In Tools>options, under project&solution>VC++ directories, we have to add some targets:
|Binaries||$(QT)\bin;$(OPENCV)\bin;C:\Program Files\Microsoft Platform SDK\Bin|
|Include||$(QT)\include;C:\Program Files\Microsoft Platform SDK\Include|
|Libraries||$(QT)\lib;$(OPENCV)\lib;C:\Program Files\Microsoft Platform SDK\lib|