A Search-Classify Approach for Cluttered Indoor Scene Understanding

ACM Transactions on Graphics 2012  (Proceedings of SIGGRAPH Asia 2012)

Liangliang Nan¹       Ke Xie¹       Andrei Sharf²

¹Shenzhen VisuCA Key Lab/SIAT         ²Ben Gurion University

Figure 1: A raw scan of a highly cluttered indoor scene is given (left). Applying our search-classify method, we segment the scene into meaningful objects (middle: chairs (blue) and tables (purple)), followed by a template deform-to-fit reconstruction (right).


We present an algorithm for recognition and reconstruction of scanned 3D indoor scenes. 3D indoor reconstruction is particularly challenging due to object interference, occlusion and overlap, which yield incomplete yet very complex scene arrangements. Since it is hard to assemble scanned segments into complete models, traditional methods for object recognition and reconstruction would be inefficient. We present a search-classify approach which interleaves segmentation and classification in an iterative manner. Using a robust classifier, we traverse the scene and gradually propagate classification information. We reinforce classification by a template fitting step which yields a scene reconstruction. We deform-to-fit templates to classified objects to resolve classification ambiguities. The resulting reconstruction is an approximation which captures the general scene arrangement. Our results demonstrate successful classification and reconstruction of cluttered indoor scenes, captured in just a few minutes.


Figure 2: A zoom into the cluttered region of Figure 1 (left) reveals that accurate segmentation and classification are challenging, even for human perception. We initially over-segment the scene (mid-left) and search-classify meaningful objects in the scene (mid-right), which are then reconstructed by templates (right), overcoming the high clutter.

Figure 3: Search-classify overview. Left-to-right, starting from a raw scan, we randomly select a patch triplet with high classification likelihood value (green). In each iteration we grow by adding one neighbor patch (red). We do not add patches that decrease classification likelihood (chair bar yielding 0.88). Finally, we deform templates to fit points and reinforce classification.

Figure 7: Visualization of graph traversal and classification. Left-to-right, from a graph defined on initial patches, we select an initial object seed with above-threshold classification confidence (mid-left). We traverse the graph in directions where classification confidence increases (number value, also blue color intensity). In the rightmost figure, we show a neighboring patch (table-side) causing a steep decrease in classification confidence, hence we do not add it to the object.
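The traversal described in Figures 3 and 7 can be illustrated with a minimal sketch of greedy region growing over a patch adjacency graph. This is not the authors' implementation: the function name `grow_object`, the `seed_threshold` parameter, the toy patch graph and the Jaccard-style `toy_likelihood` are all hypothetical stand-ins for the paper's patches and trained classifier. The sketch also assumes a patch is added only when it strictly increases the likelihood, and that the best such neighbor is taken in each iteration.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Set, Tuple

Patch = str  # hypothetical patch identifier


def grow_object(
    seed: Iterable[Patch],
    neighbors: Dict[Patch, Set[Patch]],
    likelihood: Callable[[FrozenSet[Patch]], float],
    seed_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Optional[Tuple[Set[Patch], float]]:
    """Grow a patch set from a seed, adding one neighbor patch per iteration."""
    region = set(seed)
    score = likelihood(frozenset(region))
    if score < seed_threshold:
        return None  # the seed is not confident enough to start an object

    while True:
        # Candidates are graph neighbors of the region that are not yet in it.
        frontier = {q for p in region for q in neighbors.get(p, set())} - region
        best_patch, best_score = None, score
        for q in sorted(frontier):
            s = likelihood(frozenset(region | {q}))
            if s > best_score:  # only keep patches that increase the likelihood
                best_patch, best_score = q, s
        if best_patch is None:
            return region, score  # every remaining neighbor lowers the score
        region.add(best_patch)
        score = best_score


# Toy stand-in for a classifier: overlap with a known "chair" patch set.
CHAIR = frozenset({"seat", "back", "leg1", "leg2"})


def toy_likelihood(patches: FrozenSet[Patch]) -> float:
    return len(patches & CHAIR) / len(patches | CHAIR)


GRAPH = {
    "seat": {"back", "leg1", "leg2", "table_side"},
    "back": {"seat"},
    "leg1": {"seat"},
    "leg2": {"seat"},
    "table_side": {"seat"},
}

result = grow_object({"seat", "back", "leg1"}, GRAPH, toy_likelihood)
```

In this toy scene the seed triplet scores 0.75, adding `leg2` raises the score to 1.0, and `table_side` is rejected because it would lower the score to 0.8, mirroring the table-side patch in the rightmost panel of Figure 7.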

Figure 12: Two examples of final classification likelihoods and template fitting values. The mid-left figure shows the classified scene with likelihood values per object. The mid-right figure shows the three best-fitting templates for each object with their average matching distance.
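As a rough illustration of ranking templates by an average matching distance, the sketch below scores each template by the mean distance from every object point to its nearest template point and keeps the three lowest scores. This definition of the distance is an assumption, since the page does not give the paper's exact formula, and the sketch performs no deformation: templates are compared as fixed point sets. The names `average_matching_distance` and `best_templates` and the sample data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def average_matching_distance(obj: Sequence[Point], template: Sequence[Point]) -> float:
    """Mean distance from each object point to its nearest template point."""
    return sum(min(math.dist(p, q) for q in template) for p in obj) / len(obj)


def best_templates(
    obj: Sequence[Point],
    templates: Dict[str, Sequence[Point]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k templates with the smallest average matching distance."""
    scored = [(name, average_matching_distance(obj, pts)) for name, pts in templates.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


# Hypothetical scanned object and template library (tiny point sets).
scan = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
library = {
    "template_a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "template_b": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    "template_c": [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    "template_d": [(0.0, 0.0, 5.0)],
}

ranking = best_templates(scan, library)
```

The brute-force nearest-point search is quadratic in the number of points; a spatial index such as a k-d tree would be the usual choice for real scans.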

Figure 14: Seven results of our search-classify method showing the raw input scan (left), initial over-segmentation into piecewise smooth patches (mid-left), classification result (mid-right), with color hue representing the object class and color intensity representing the final likelihood value, and template-based reconstruction (right).


We thank the reviewers for their valuable comments. This work was supported in part by NSFC (61272327, 61003190, 61232011), National 863 Program (2012AA011802, 2011AA010503), Guangdong Science and Technology Program (2011B050200007), Shenzhen Science and Technology Foundation (JC201005270340A, CXB201104220029A, JC201005270329A), Israel Science Foundation (ISF) and European IRG FP7.

title = {A Search-Classify Approach for Cluttered Indoor Scene Understanding},
author = {Liangliang Nan and Ke Xie and Andrei Sharf},
journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2012)},
volume = {31},
month = {11},
year = {2012},
