Multimodal analysis of user behavior and browsed content under different image search intents.

Soleymani M, Riegler M, Halvorsen P.

International Journal of Multimedia Information Retrieval, Volume 7, Issue 1, pp. 29–41.

The motivation or intent behind a search for content may vary between users and use cases. Knowledge of these underlying objectives can therefore be important for returning appropriate search results, and studies of user search intent are emerging in information retrieval to understand why a user is searching for a particular type of content. In the context of image search, our work targets automatic recognition of users' intent at an early stage of a search session. We designed seven different search scenarios under the intent conditions of finding items, re-finding items, and entertainment. We then collected facial expressions, physiological responses, eye gaze, and implicit user interactions from 51 participants who performed the seven search tasks on a custom-built image retrieval platform, and analyzed the users' spontaneous and explicit reactions under the different intent conditions. Finally, we trained machine learning models to predict users' search intent from the visual content of the visited images, the user interactions, and the spontaneous responses. Our experimental results show that, after fusing the visual and user interaction features, our system achieved an F1 score of 0.722 when classifying the three intent classes in a user-independent cross-validation. Eye gaze and implicit user interactions, including mouse movements and keystrokes, are the most informative features for intent recognition. In summary, the most promising results are obtained from modalities that can be captured unobtrusively and online, which demonstrates the potential of including intent-based methods in multimedia retrieval platforms.
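The evaluation protocol described in the abstract (fusing feature groups, then classifying three intent classes under user-independent cross-validation) could be sketched as follows. This is a minimal illustration with synthetic data and an assumed random-forest classifier; the paper's actual features, feature dimensions, and model choices are not specified here. Grouping folds by participant ID is what makes the validation user-independent.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GroupKFold, cross_val_predict

# Synthetic stand-in data: 51 participants x 7 search tasks (as in the study).
# Feature dimensions and the classifier below are illustrative assumptions.
rng = np.random.default_rng(0)
n_users, n_tasks = 51, 7
n = n_users * n_tasks
visual = rng.normal(size=(n, 16))        # placeholder visual-content features
interaction = rng.normal(size=(n, 8))    # placeholder gaze/mouse/keystroke features

# Early fusion: concatenate the two feature groups per sample.
X = np.hstack([visual, interaction])
y = rng.integers(0, 3, size=n)           # 3 classes: find / re-find / entertainment
groups = np.repeat(np.arange(n_users), n_tasks)  # participant ID per sample

# User-independent CV: no participant appears in both train and test folds.
cv = GroupKFold(n_splits=5)
pred = cross_val_predict(
    RandomForestClassifier(random_state=0), X, y, cv=cv, groups=groups
)
macro_f1 = f1_score(y, pred, average="macro")
```

With the study's real multimodal features, this style of evaluation is what yields the reported F1 of 0.722; on the random data above the score is near chance, which is the expected baseline for three balanced classes.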