Mixing Modalities of 3D Sketching and Speech for Interactive Model Retrieval in Virtual Reality


Sketch and speech are intuitive interaction modalities that convey complementary information and have been used independently for 3D model retrieval in virtual environments. While sketching has been shown to be an effective retrieval method, not all collections are easily navigable using this modality alone. We design a new database of 3D chairs in which each part (arms, legs, seat, back) is colored, making the collection challenging to query by sketch alone. To overcome this, we implement a multi-modal interface for querying 3D model databases within a virtual environment. We base the sketch modality on the state of the art in 3D sketch retrieval and process the voice input through a Wizard-of-Oz style experiment, avoiding the complexities of natural language processing systems, which frequently require fine-tuning to be robust to accents. We conduct two user studies and show that hybrid search strategies emerge from the combination of interactions, exploiting the advantages provided by both modalities.

 

 


Dataset



Our dataset contains 16,200 chairs generated from 45 different shapes extracted from ShapeNet. Each shape is segmented into 4 parts (seat, back, legs, and arms). We assign each part a color from a fixed set of 6, using permutations without repetition so that no color appears twice on the same chair. This yields 6 × 5 × 4 × 3 = 360 colorings per shape, and 45 × 360 = 16,200 chairs in total. A minimal sketch of this enumeration is given below.
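
The following is a minimal illustrative sketch of how the colorings can be enumerated, assuming only the part and color counts stated above; the color names, function names, and any file layout are hypothetical and not part of the released dataset.

from itertools import permutations

PARTS = ["seat", "back", "legs", "arms"]            # the 4 segmented parts per chair
COLORS = ["red", "green", "blue", "yellow",          # fixed palette of 6 colors
          "purple", "orange"]                        # (names are illustrative)

def enumerate_colorings():
    """Yield every assignment of 4 distinct colors to the 4 parts."""
    for combo in permutations(COLORS, len(PARTS)):   # 6 * 5 * 4 * 3 = 360 per shape
        yield dict(zip(PARTS, combo))

if __name__ == "__main__":
    colorings = list(enumerate_colorings())
    num_shapes = 45                                  # base shapes from ShapeNet
    print(len(colorings))                            # 360
    print(num_shapes * len(colorings))               # 16200 chairs in total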

 

  1. chairs_database.zip (313 MB)
  2. features_dictionary.xlsx (0.3 MB)
  3. segmented_chairs.unitypackage (98 MB)

 


Paper

 


Bibtex