how to get started in Linux with the Microsoft Kinect - Stony Brook ...
R and T: the rotation and translation that take the
projected point from the world coordinate frame into a
coordinate frame referenced to the camera.
Figure 6: Depth data as a flat array.
Figure 4: (top) The camera intrinsic matrix and the
combined R-T matrix, (bottom) the pinhole model
and stereo calibration [4].
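As a minimal sketch of how the intrinsic matrix and the combined R-T matrix of Fig. 4 act on a point, the snippet below projects a world point to pixel coordinates with the pinhole model. The function name `project_point` and the numeric values in `EXAMPLE_K` are assumptions made for illustration only; they are not calibration results from this paper, and lens distortion is ignored.

```python
def project_point(point_world, K, R, T):
    """Project a 3D world point to pixel coordinates (pinhole model, no distortion)."""
    # Move the point into the camera frame: X_cam = R * X_world + T
    x_cam = [
        sum(R[i][j] * point_world[j] for j in range(3)) + T[i]
        for i in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point lies on or behind the camera plane")
    # Apply the intrinsic matrix K, then divide by the depth (perspective division)
    u = (K[0][0] * x_cam[0] + K[0][1] * x_cam[1] + K[0][2] * x_cam[2]) / x_cam[2]
    v = (K[1][1] * x_cam[1] + K[1][2] * x_cam[2]) / x_cam[2]
    return u, v


# Hypothetical intrinsics (focal lengths and principal point in pixels),
# chosen only to make the example concrete.
EXAMPLE_K = [
    [580.0, 0.0, 320.0],
    [0.0, 580.0, 240.0],
    [0.0, 0.0, 1.0],
]
IDENTITY_R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ZERO_T = [0.0, 0.0, 0.0]

# A point on the optical axis projects onto the principal point.
center_pixel = project_point([0.0, 0.0, 1.0], EXAMPLE_K, IDENTITY_R, ZERO_T)
print(center_pixel)
```

With an identity rotation and zero translation the camera frame coincides with the world frame, so the sketch reduces to the intrinsic projection alone.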
2D and 3D Data
A first analysis of the depth data treats it as if
it were two-dimensional. For each pixel in the depth
image, we think of its position within the image as
its (x, y) coordinates plus a grayscale value. This
last value corresponds to the depth of the scene in
front of that pixel, i.e. it represents the pixel's
z-coordinate. The depth map is returned as a flat array
that contains 307,200 (640 times 480) integers arranged
in a single linear stack, Fig. 6.
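To make the flat layout concrete, here is a small sketch of how an (x, y) pixel position maps to an index in the 307,200-element array. It assumes the array is stored row by row (row-major order), which the text does not state explicitly; the helper names `depth_at` and `to_rows` are hypothetical, and the array is filled with synthetic values rather than real sensor data.

```python
WIDTH, HEIGHT = 640, 480  # depth image size given in the text


def depth_at(depth, x, y, width=WIDTH):
    """Return the depth value at pixel (x, y) of a flat, row-major array."""
    return depth[y * width + x]


def to_rows(depth, width=WIDTH):
    """Split the flat array into a list of image rows."""
    return [depth[i:i + width] for i in range(0, len(depth), width)]


# Synthetic stand-in for a depth frame: each entry holds its own index.
fake_depth = list(range(WIDTH * HEIGHT))

print(len(fake_depth))               # number of entries in the flat array
print(depth_at(fake_depth, 10, 2))   # index of column 10 in row 2
print(len(to_rows(fake_depth)))      # number of rows after reshaping
```

Because each synthetic entry stores its own index, `depth_at` returns exactly the flat position of the requested pixel, which makes the mapping easy to check by hand.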
Figure 7: Depth data with point clouds.
Outline and Next Steps
This paper is intended as an introduction to setting
up and calibrating the Microsoft Kinect on Linux,
specifically on the Fedora distribution. The framework
described here supports the development of many powerful
applications in computational photography and
computer vision. The results of my project, which are
beyond the scope of this paper, explore some of these
applications and can be obtained from the git repository [8].
References
[1] http://www.microsoft.com/en-us/kinectforwindows/
[2] http://www.primesense.com/
[3] http://www.asus.com/Multimedia/Xtion PRO/
Figure 5: Transformations between 2D and 3D data.
Once we convert all the two-dimensional grayscale
pixels into three-dimensional points in space, we have
a point cloud, i.e. many disconnected points floating
near each other in three-dimensional space in a way
that corresponds to the arrangement of the objects
and people in front of the Kinect, Fig. 7.
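The conversion from depth pixels to a point cloud can be sketched by inverting the pinhole projection for every pixel. This is only an illustration under stated assumptions: the depth values are taken to be already expressed in metric units, a value of zero is treated as "no reading", the array is row-major, and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical parameters that would come from calibration. The function name `depth_to_point_cloud` is likewise made up for this example.

```python
def depth_to_point_cloud(depth, width, height, fx, fy, cx, cy):
    """Back-project a flat, row-major depth array into a list of (x, y, z) points."""
    points = []
    for v in range(height):          # image row
        for u in range(width):       # image column
            z = depth[v * width + u]
            if z <= 0:
                # Assumption: non-positive depth marks a pixel without a measurement.
                continue
            # Invert the pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny synthetic 2x2 depth image with one missing reading (the 0.0 entry).
tiny_depth = [1.0, 0.0,
              2.0, 2.0]
cloud = depth_to_point_cloud(tiny_depth, width=2, height=2,
                             fx=2.0, fy=2.0, cx=0.5, cy=0.5)
for point in cloud:
    print(point)
```

Each surviving pixel yields one unconnected point, so the size of the cloud equals the number of pixels with a valid depth reading.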
[4] Hacking the Kinect, Apress, 2012
[5] Making Things See, 2012
[6] http://www.openni.org/
[7] http://openkinect.org
[8] https://bitbucket.org/steinkich/kinect-hacks-and-projects
[9] http://opencv.org/
[10] http://pointclouds.org/