An effective and versatile keyword search engine on heterogeneous data sources
We present EASE, an effective and versatile keyword search engine that lets users easily access heterogeneous data, comprising unstructured, semi-structured and structured data, without having to learn XPath/XQuery or SQL. EASE addresses a challenge in keyword search that has been neglected in the literature: how to process keyword queries over heterogeneous data efficiently and adaptively. To provide this capability, EASE models unstructured, semi-structured and structured data as graphs, summarizes the graphs, and constructs graph indices rather than relying on traditional inverted indices alone. On top of these, EASE adopts an extended inverted index to facilitate keyword-based search and employs a novel ranking mechanism to enhance search effectiveness.
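To make the general idea concrete, the following is a minimal sketch, not EASE's actual index or ranking. It assumes that documents, XML elements and relational tuples are all represented as text-bearing nodes of one undirected graph, that a plain keyword-to-node inverted index is kept alongside the graph, and that an answer is a node from which every query keyword can be reached within a fixed number of hops. The class `DataGraph`, the `radius` parameter, the hop-count score and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class DataGraph:
    """Toy unified graph over heterogeneous data (illustrative only)."""

    def __init__(self):
        self.text = {}                  # node id -> text content
        self.adj = defaultdict(set)     # undirected adjacency lists
        self.index = defaultdict(set)   # keyword -> ids of nodes containing it

    def add_node(self, node_id, text):
        self.text[node_id] = text
        for token in text.lower().split():
            self.index[token].add(node_id)

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)

    def within(self, center, radius):
        """Breadth-first search: hop distance of every node within `radius`."""
        dist = {center: 0}
        queue = deque([center])
        while queue:
            u = queue.popleft()
            if dist[u] == radius:
                continue  # do not expand beyond the radius
            for v in self.adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    def search(self, query, radius=2):
        """Return (cost, center) pairs, lowest cost first.

        A center qualifies if every keyword occurs in some node within
        `radius` hops; cost is the sum of the nearest-hit distances
        (an assumed stand-in score, not the paper's ranking function).
        """
        keywords = query.lower().split()
        results = []
        for center in self.text:
            dist = self.within(center, radius)
            cost = 0
            for kw in keywords:
                hits = [dist[n] for n in self.index.get(kw, ()) if n in dist]
                if not hits:
                    break  # this keyword is unreachable from the center
                cost += min(hits)
            else:
                results.append((cost, center))
        return sorted(results)


# Hypothetical sample data mixing three kinds of sources.
g = DataGraph()
g.add_node("tuple:author1", "alice smith")                # relational tuple
g.add_node("tuple:paper1", "graph index keyword search")  # relational tuple
g.add_node("xml:section1", "ranking mechanism")           # XML element
g.add_node("doc:1", "inverted index tutorial")            # plain document
g.add_edge("tuple:author1", "tuple:paper1")               # foreign-key link
g.add_edge("tuple:paper1", "xml:section1")                # reference link

print(g.search("alice ranking", radius=1))
```

In this sketch the query "alice ranking" matches no single node, but the paper tuple connects a node containing each keyword, so it is the only center returned at radius 1. Exhaustively running a search from every node is affordable only on toy data; precomputed graph indices of the kind the abstract describes are what would make this practical at scale.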