Abstract
This editorial summarizes the content of the special issue of the Semantic Web Journal on question answering for linked data.
The Data Web is now a reality for a large number of experts. With more than 10,000 datasets published according to the Linked Data principles and more than 150 billion facts
The papers accepted in this special issue all present innovative approaches to question answering on Linked Data. The paper [8] addresses the problem of building a knowledge base of rules for question answering systems. To this end, the authors introduce an intermediate representation for questions that can be used across languages. The approach is applied to Vietnamese and achieves high accuracy. This paper offers an alternative to the many state-of-the-art systems that focus on English and rely on the large number of NLP tools available for that language to generate question parses and the corresponding answers. It can thereby potentially lead the way towards novel approaches for question answering.
The authors of [4] address the problem of finding approximate answers for SPARQL 1.1 queries. Solving this problem is of central importance when dealing with zero-result queries or queries with unsatisfactory results. The authors present a framework that generates relaxations incrementally, making the idea of relaxation theoretically amenable to interactive applications.
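To illustrate what generating relaxations incrementally can mean in practice, the following minimal Python sketch lazily enumerates relaxed versions of a basic graph pattern by replacing constants with fresh variables, least-relaxed first. It is not the framework of [4]: the triple representation, the `relaxations` function, the `?relaxedN` variable naming and the "replace a constant with a variable" strategy are all assumptions made for this example.

```python
from itertools import combinations
from typing import Iterator, List, Tuple

# Hypothetical representation: a triple pattern is a (subject, predicate, object)
# tuple of strings; terms starting with "?" are variables.
Triple = Tuple[str, str, str]


def is_variable(term: str) -> bool:
    return term.startswith("?")


def relaxations(patterns: List[Triple]) -> Iterator[List[Triple]]:
    """Lazily yield relaxed queries, ordered by the number of constants replaced.

    Assumption: the input contains no variable already named ?relaxedN.
    """
    # Positions (pattern index, term index) of all constants in the query.
    constants = [
        (i, j)
        for i, triple in enumerate(patterns)
        for j, term in enumerate(triple)
        if not is_variable(term)
    ]
    for k in range(1, len(constants) + 1):
        for chosen in combinations(constants, k):
            relaxed = [list(triple) for triple in patterns]
            for n, (i, j) in enumerate(chosen):
                relaxed[i][j] = f"?relaxed{n}"  # fresh variable replaces the constant
            yield [(s, p, o) for s, p, o in relaxed]


query = [("?film", "dbo:director", "dbr:Stanley_Kubrick")]
first = next(relaxations(query))
```

Because the function is a generator, an interactive application could request one relaxation at a time and stop as soon as a relaxed query returns satisfactory results.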
GFMed [7] shows how designing specialized question answering systems can lead to high-performance question answering for the biomedical domain. The approach presented in this work relies on a controlled vocabulary, which allows generating SPARQL queries when coupled with a corresponding grammar. Once again, language independence is addressed, as the approach is evaluated on Romanian and English.
The authors of [5] address the same problem as the aforementioned paper but rely on a different approach. Here, the authors use natural-language processing techniques to generate abstractions of questions, which are converted into SPARQL query templates. The templates are then instantiated and executed. The approach is shown to perform well on benchmark data and suggests that this way of mapping languages remains a viable option for building question answering systems.
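The template step described above can be sketched as follows. This is an illustrative Python example only, not the system of [5]: the template text, the slot names `entity` and `property`, the `instantiate` helper and the example DBpedia IRIs are assumptions, and the execution step against a SPARQL endpoint is left out.

```python
from string import Template

# Hypothetical template for questions of the form "What is the <property> of <entity>?"
QUERY_TEMPLATE = Template(
    "SELECT DISTINCT ?answer WHERE { $entity $property ?answer . }"
)


def instantiate(template: Template, entity_iri: str, property_iri: str) -> str:
    """Fill the template slots with IRIs obtained from the question analysis."""
    return template.substitute(
        entity=f"<{entity_iri}>",
        property=f"<{property_iri}>",
    )


sparql = instantiate(
    QUERY_TEMPLATE,
    "http://dbpedia.org/resource/Berlin",
    "http://dbpedia.org/ontology/populationTotal",
)
print(sparql)
```

Keeping the template separate from the slot fillers means the same query shape can be reused for any question that maps to the same abstraction.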
The paper [1] addresses the problem of information reconciliation for question answering. It exploits the distributed nature of the Linked Data Web to collect and integrate the information necessary to answer questions. In particular, the authors use a framework based on argumentation theory for the reconciliation and are able to provide explanations for their results. The reconciliation approach is applied to DBpedia and used to create a dataset that subsumes all its chapters and that can be used for better question answering. This paper shows how improved data quality can lead to better Semantic Web applications.
While the papers in this special issue represent a significant advance over the state of the art, current surveys [6] suggest that there is still a long way to go before achieving highly accurate question answering on RDF data. Amongst the most important challenges lies the problem of multilinguality, which remains particularly hard to tackle for languages with only few linguistic resources. Achieving user-friendly runtimes on complex queries is also still the subject of ongoing work and demands improved storage and indexing solutions for the Linked Data Web. Domain-specific questions (e.g., procedural, temporal, spatial and statistical questions) demand different types of processing, as dedicated semantics are needed to map the model of the natural language used to formulate the query onto a formal language such as SPARQL. More diverse and intelligent natural-language interfaces, such as dialog and recommender systems for the Linked Data Web, complete this non-exhaustive set of possible improvements for the future.
