The following are multiview stereo data sets captured in our lab. Each data set consists of 24 images, camera parameters, and extracted apparent contours of a single rigid object. Image resolutions range from 1400x1300 pixels to 2000x1800 pixels depending on the data set. For calibration, we used the "Camera Calibration Toolbox for Matlab" by Jean-Yves Bouguet to estimate both the intrinsic and the extrinsic camera parameters. All the images have been corrected to remove radial and tangential distortion. For contour extraction, we first used Photoshop to segment the foreground from each image at pixel level. Second, we used the segmentation results to initialize the apparent contour(s) of the object. Last, we applied a B-spline snake to extract the apparent contours at sub-pixel level.
Images are provided in JPEG format. Camera parameters are provided in the same format as that of the "Camera Calibration Toolbox for Matlab". Apparent contours are provided in our own format, which we hope is fairly easy to interpret. Please see the other sections at the bottom of this website for more details. Unfortunately, we do not have ground truth for all the data sets, but we believe that we can develop some photometric ways to evaluate the reconstruction results. This page is still under construction, and we keep updating the contents and adding more data sets as we capture them.
Note: we also provide visual hull data sets.
- Toy Dinosaur
- Toy Mummy
- Action Figure (Morpheus)
- Action Figure (Predator)
- Action Figure (Warrior)
- Action Figure (Soldier)
- Human Skull Cast
- Human Skull Cast
- Human Skull Cast - Pose 1
- Human Skull Cast - Pose 2
We captured the above data sets in our lab using 3 fixed cameras (Canon EOS 1D Mark II) and a motorized turntable (please see a picture below). We followed these 3 steps to acquire the data.
1. Place the 3 cameras at different heights (or 1 at the top and 2 at almost the same height).
2. Put an object on the motorized turntable, and take pictures while rotating the table. The table rotates approximately 45 degrees each time. We rotate it 8 times, and each camera takes 8 images. In total, we obtain 8 x 3 = 24 images.
3. Replace the object with a calibration board (checkerboard pattern), and take pictures of the board while rotating the table 8 times in exactly the same way.
Note that although we have tried to make the incoming light as diffuse and uniform as possible, the lighting conditions with respect to the object are different in each image.

As described in the previous section, each data set was acquired by 3 cameras, and each camera took 8 pictures. Image files 00.jpg through 07.jpg were taken by the first camera, and hence these 8 pictures share the same intrinsic camera parameters. Similarly, image files 08.jpg through 15.jpg were taken by the second camera, and image files 16.jpg through 23.jpg by the third camera. There is a camera parameter file for each camera: camera0.m, camera1.m, and camera2.m for the first, second, and third camera, respectively. Each file stores the intrinsic camera parameters of the corresponding camera and the extrinsic camera parameters of the associated 8 images. For example, camera0.m contains the intrinsic camera parameters of the first camera and the extrinsic camera parameters of images 00.jpg through 07.jpg. The format of the camera parameter file is the same as that of the "Camera Calibration Toolbox for Matlab". Please see the toolbox webpage for more detailed information. For convenience, we have computed the projection matrix from the camera parameters and added the matrix to the file. Also note that marginal background regions have been clipped away to reduce the size of the input images while keeping the object fully visible in each image. The principal point has been modified accordingly.
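The file layout and the projection matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `fc`, `cc`, and `alpha_c` follow the usual conventions of the calibration toolbox and may not match the variable names actually stored in camera0.m, and all function names and the crop offset are hypothetical. The provided files already contain the projection matrix, so this is needed only if you want to recompute it.

```python
import numpy as np

IMAGES_PER_CAMERA = 8  # from the text: each of the 3 cameras takes 8 pictures


def camera_for_image(image_index):
    """Map an image number (0..23) to (camera file name, slot within that file)."""
    if not 0 <= image_index < 3 * IMAGES_PER_CAMERA:
        raise ValueError("image index must be in 0..23")
    camera, slot = divmod(image_index, IMAGES_PER_CAMERA)
    return f"camera{camera}.m", slot


def intrinsic_matrix(fc, cc, alpha_c=0.0):
    """Build K from focal lengths fc, principal point cc and skew alpha_c.

    Assumption: these follow the toolbox's usual parameterization.
    """
    return np.array([
        [fc[0], alpha_c * fc[0], cc[0]],
        [0.0, fc[1], cc[1]],
        [0.0, 0.0, 1.0],
    ])


def shift_principal_point(cc, crop_origin):
    """Principal point after clipping, where crop_origin becomes pixel (0, 0)."""
    return (cc[0] - crop_origin[0], cc[1] - crop_origin[1])


def projection_matrix(K, R, T):
    """Compose the 3x4 projection matrix P = K [R | T]."""
    R = np.asarray(R, dtype=float)
    T = np.asarray(T, dtype=float).reshape(3, 1)
    return K @ np.hstack([R, T])


def project(P, X):
    """Project a 3D point X to pixel coordinates with P."""
    x = P @ np.append(np.asarray(X, dtype=float), 1.0)
    return x[:2] / x[2]
```

For example, `camera_for_image(10)` returns `("camera1.m", 2)`, meaning 10.jpg uses the intrinsics in camera1.m and the third set of extrinsics in that file.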
An apparent contour is represented in our data sets as a chain of points (a piecewise linear structure) and is provided in a simple format. Each file starts with a single-line header and an integer giving the number of apparent contours in the corresponding image, followed by the data sections of the apparent contours. Note that a single image can have multiple apparent contours. The data section of an apparent contour starts with an integer giving the number of points in the contour, followed by their 2D image coordinates. Points are listed in counterclockwise order for apparent contours that enclose a foreground image region. Similarly, points are listed in clockwise order for apparent contours that enclose a background image region (holes). Since every object in our data sets is a single connected component, each image has exactly one apparent contour of the first kind, which is given as the first apparent contour in every file, but can have multiple holes.
In some images, especially those of very complicated objects, there are too many holes in a single image, and we could not extract all of them. However, we believe that this is not a critical issue, because a good visual hull can still be constructed, and our reconstruction algorithm had no problems using such visual hulls. If you build a visual hull and are not satisfied with the outcome, you can extract the silhouettes yourself. Another thing you may want to try is to build a visual hull from the apparent contours we provide, project the visual hull back onto each image, and extract the boundaries of its projection. Missing holes can be detected from these boundaries; furthermore, you may be able to extract more accurate silhouettes starting from them. In those cases, we would really appreciate it if you could send us the more accurate data files.
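The contour file layout described above (a header line, the contour count, then a point count and coordinates for each contour) could be read roughly as follows. This is a minimal sketch, assuming that values are whitespace-separated and that each point is stored as an x y pair; the function names are hypothetical, and the actual files should be checked before relying on it. Because the sign of the enclosed area depends on whether the y axis points up or down, the sketch identifies holes by comparing their orientation with that of the first contour, which the text says is always the outer one.

```python
def parse_contours(text):
    """Parse one contour file into a list of contours, each a list of (x, y)."""
    lines = text.strip().splitlines()
    tokens = " ".join(lines[1:]).split()  # skip the single-line header
    pos = 0
    num_contours = int(tokens[pos])
    pos += 1
    contours = []
    for _ in range(num_contours):
        num_points = int(tokens[pos])
        pos += 1
        values = [float(t) for t in tokens[pos:pos + 2 * num_points]]
        if len(values) != 2 * num_points:
            raise ValueError("file ended before all points were read")
        pos += 2 * num_points
        contours.append(list(zip(values[0::2], values[1::2])))
    return contours


def signed_area(points):
    """Shoelace formula; the sign flips when the point order is reversed."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def split_outer_and_holes(contours):
    """Return the first (outer) contour and all contours of opposite orientation."""
    outer = contours[0]
    outer_positive = signed_area(outer) > 0
    holes = [c for c in contours[1:] if (signed_area(c) > 0) != outer_positive]
    return outer, holes
```

A contour other than the first that has the same orientation as the outer one would contradict the description above, so it is worth flagging such cases when loading a file.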
We have performed the following 3 tests to check the accuracy of the camera parameters.
- First, we need to make sure that the behavior of the turntable is repeatable, because the pictures of an object and the pictures of the calibration grid were not taken at the same time. Note that we do not care whether the rotation angle is exactly 45 degrees, but we want the rotation angle to be the same every time. We confirmed that the rotation is repeatable as follows.
  - Put a sheet of textured paper on the turntable, and take a picture from a fixed camera.
  - Rotate the table (approximately) -45 degrees.
  - Rotate the table (approximately) 45 degrees.
  - Take a picture.
  - Rotate the table (approximately) -45 x 2 = -90 degrees.
  - Rotate the table (approximately) 45 x 2 = 90 degrees.
  - Take a picture.
  - And so on.
  The resulting images look identical.
- Second, we take pairs of (radially and tangentially undistorted) images and draw a number of epipolar lines in them. We can check the accuracy of the camera parameters by using frontier points, or salient image features that the epipolar lines pass through. We put pairs of images with epipolar lines here for one of our objects. Please compare the 2 images whose file names are ??_??_0.jpg and ??_??__1.jpg. We believe that the epipolar lines go through the same image features or are off by a pixel or two at most.
- Last, we can use the reconstruction produced by our algorithm for the check, that is, look at the alpha-blended surface textures backprojected from different images. Backprojected textures are consistent with each other only when the camera parameters and the geometry of the object are correct. We observed that the backprojected textures are consistent even for surface structures that are a few pixels long. The following 2 pairs of images are examples of such alpha-blended textures. In each pair, the left picture shows consistent alpha-blended textures, while the right one shows inconsistent textures.
[Images: two pairs, each showing consistent alpha-blended textures (left) and inconsistent alpha-blended textures (right)]
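The epipolar-line test described above can be reproduced from any two projection matrices. The sketch below is not our own test code; it uses the standard construction F = [e2]x P2 P1^+, where P1^+ is the pseudo-inverse of the first projection matrix and e2 is the image of the first camera centre in the second view. The function names are hypothetical, and the inputs are assumed to be the 3x4 projection matrices of two undistorted images.

```python
import numpy as np


def fundamental_from_projections(P1, P2):
    """Return F such that x2^T F x1 = 0 for corresponding homogeneous pixels."""
    # The camera centre of view 1 is the right null vector of P1.
    _, _, vt = np.linalg.svd(P1)
    centre1 = vt[-1]
    e2 = P2 @ centre1  # epipole in view 2
    e2_cross = np.array([
        [0.0, -e2[2], e2[1]],
        [e2[2], 0.0, -e2[0]],
        [-e2[1], e2[0], 0.0],
    ])
    return e2_cross @ P2 @ np.linalg.pinv(P1)


def epipolar_line(F, x1):
    """Line (a, b, c) in view 2, with a*u + b*v + c = 0, for pixel x1 in view 1."""
    return F @ np.array([x1[0], x1[1], 1.0])


def distance_to_line(line, x2):
    """Distance in pixels from pixel x2 to the line (a, b, c)."""
    a, b, c = line
    return abs(a * x2[0] + b * x2[1] + c) / np.hypot(a, b)
```

Picking a salient feature in one image, computing its epipolar line in another, and measuring `distance_to_line` to the same feature there gives a numerical counterpart of the visual check; with accurate parameters the distance should be within a pixel or two.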
We include below sample results of our 3D photography algorithms, both on our own data sets and on data sets from other sources. These are described in articles that are (or will shortly be) under submission. The confidentiality policy of the corresponding conferences prevents us from posting these articles until they are accepted for publication. In the meantime, please contact us directly for more details.
We thank Jodi Blumenfeld and Steve Leigh, Department of Anthropology at the University of Illinois at Urbana-Champaign, for providing us with the skulls. We also thank Steve Sullivan and ILM for providing a data set for this project. This work was supported in part by the Beckman Institute and the National Science Foundation under ITR grant IIS-0312438.
Send any comments to Y. Furukawa (yfurukaw -at- uiuc.edu). Last updated 2 May 2006.