Semantic video search using natural language queries
Recent advances in computer vision and artificial intelligence have enabled the automatic extraction of metadata from video. This metadata can be represented using an RDF/OWL ontology, which encodes scene objects and their relationships in an unambiguous and well-formed manner, and the encoded data can then be queried using SPARQL. However, SPARQL has a steep learning curve and cannot be used directly by a general user for video content search. In this paper, we propose a method that bridges this gap by automatically translating a user-provided natural language query into an ontology-based SPARQL query for semantic video search. The proposed method consists of three major steps. First, a semantically labeled training corpus of natural language query sentences is used to learn a Semantic Stochastic Context-Free Grammar (SSCFG). Second, given a user-provided natural language query sentence, the Earley-Stolcke parsing algorithm is used to determine the maximum-likelihood semantic parse of the sentence; this parse infers the semantic meaning of each word in the query, from which the SPARQL query is constructed. Third, the SPARQL query is executed to retrieve relevant video segments from the RDF/OWL video content database. The method is evaluated by running natural language queries on surveillance videos from maritime and land-based domains, though the framework itself is general and extensible to video search in other domains.
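The second step, mapping semantically labeled words to a SPARQL query, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the semantic labels (OBJECT, ACTION, LOCATION) and the ontology property names (ex:hasType, ex:performsAction, ex:occursIn) are hypothetical placeholders for whatever the learned SSCFG and the video ontology actually define.

```python
# Hypothetical sketch: build an ontology-based SPARQL query from the
# output of a semantic parser, i.e. a list of (word, semantic_label)
# pairs. Labels and property names are illustrative assumptions, not
# the paper's actual ontology vocabulary.

def build_sparql(tagged_words):
    """Translate semantically tagged words into a SPARQL SELECT query.

    tagged_words: list of (word, label) pairs; words whose label is
    None (articles, prepositions, etc.) carry no semantic role and
    contribute no triple pattern.
    """
    patterns = []
    for word, label in tagged_words:
        if label == "OBJECT":
            patterns.append(f'?event ex:hasType "{word}" .')
        elif label == "ACTION":
            patterns.append(f'?event ex:performsAction "{word}" .')
        elif label == "LOCATION":
            patterns.append(f'?event ex:occursIn "{word}" .')
    body = "\n  ".join(patterns)
    return (
        "PREFIX ex: <http://example.org/video#>\n"
        "SELECT ?event WHERE {\n  " + body + "\n}"
    )

# Example: a semantic parse of "boat approaching the dock"
query = build_sparql([
    ("boat", "OBJECT"),
    ("approaching", "ACTION"),
    ("the", None),
    ("dock", "LOCATION"),
])
print(query)
```

In a full system, the resulting query string would be executed against the RDF/OWL video content store with a SPARQL engine; here the point is only that each semantically labeled word contributes one triple pattern, so the query structure follows directly from the parse.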