Collective intelligence is most often understood as group intelligence that arises from the intelligences of the group members. On the one hand, Newell [19] defined an intelligent collective as a social system that is capable of acting, even approximately, as a single rational agent. On the other hand, Lévy [11] understood collective intelligence as an intelligence that emerges from the collaboration and competition of many individuals; an intelligence that seemingly has a mind of its own. These definitions refer to cognitive systems. From the computational and artificial intelligence point of view, we can think of a collective as a set of autonomous units working on some common task, for example, a multi-agent system. We can say that the collective is intelligent if it can make use of the intelligences of its members to solve some problem, for example, a decision-making problem. Thus, Computational Collective Intelligence (CCI) should provide computational methods that are based on aspects of collective intelligence and that use computational techniques to solve such problems. Regarding knowledge engineering, computational collective intelligence provides methods and techniques for determining the knowledge of a collective as a whole on the basis of the knowledge of its members. The need for processing collective knowledge is increasing quickly because of the very fast development of the Internet, social networks and distributed databases. Knowledge originating from autonomous sources on the same subject is very often inconsistent. Therefore, inconsistency processing and integration computing are very important.
Knowledge engineering [9] plays a relevant role in CCI since its techniques are needed to represent individual information. In fact, there are studies advocating that scientific knowledge is essentially collective knowledge [23]. Along this line, recent studies state that it is very important that different individuals provide orthogonal, highly unrelated, and possibly contradictory knowledge to the collective. In other words, “the higher the inconsistency, the better the quality of collective knowledge” [20]. Another relevant interaction with knowledge engineering is ontology matching and integration [27], a fundamental activity for collecting and processing information in CCI. Since it is well known that this process is both time- and resource-consuming, it is necessary to provide advanced architectures and platforms to reduce these costs [15] and good algorithms and frameworks to integrate ontologies efficiently [16, 21].
Although machine learning and data mining are two independent fields of work, there are frequent interactions between them [18]. Interestingly enough, CCI uses machine learning and data mining in its solutions, but there are also numerous applications of CCI to improve machine learning and data mining processes. A first issue concerns the representation of data in machine learning, given that efficient algorithms rely strongly on well-structured data. Indeed, the research community organizes contests to compare existing approaches and to identify future challenges [7] and, as a result, new highly efficient methods that beat previous learning methods are developed to improve representation learning [30]. Given the huge amount of data generated by current information systems, it is essential to use good machine learning algorithms to discover knowledge in these vast datasets. In order to reduce the amount of useful data, new techniques take advantage of the improvement in performance provided by parallel algorithms to process distributed data [31].
Learning from data streams is an area of increasing importance [14], and new approaches are needed to perform data stream mining and, again because of the huge amount of processed data, to classify the received data according to similarity with good performance [17, 22]. Data stream mining uses prediction models to deal with historical data. However, these models lose accuracy if old data are not frequently updated with current data, so it is necessary to deal with concept drift and adapt to it [5]. In particular, to cope with this problem when classifying streamed data, new methods must take into account temporal dependencies and splitting criteria based on misclassifications [24, 33].
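To make the effect of concept drift concrete, the following is a minimal sketch under stated assumptions, not the method of any of the cited works. It assumes a toy stream of categorical labels, a hypothetical classifier that predicts the majority label of a recent window, and a simple rule that flags drift when the recent error rate exceeds a threshold. The names WindowedMajorityClassifier and run_stream, and all parameter values, are illustrative only.

```python
from collections import Counter, deque


class WindowedMajorityClassifier:
    """Toy model: predicts the most frequent label among recent examples."""

    def __init__(self, window=50):
        self.recent = deque(maxlen=window)

    def predict(self):
        # No prediction is possible before the first example is seen.
        if not self.recent:
            return None
        return Counter(self.recent).most_common(1)[0][0]

    def update(self, label):
        self.recent.append(label)

    def reset(self):
        self.recent.clear()


def run_stream(labels, window=50, error_window=20, threshold=0.5):
    """Test-then-train loop; returns the positions where drift was flagged."""
    model = WindowedMajorityClassifier(window)
    errors = deque(maxlen=error_window)
    drift_points = []
    for i, label in enumerate(labels):
        prediction = model.predict()
        if prediction is not None:
            errors.append(prediction != label)
        # Flag drift once the error rate over a full window exceeds the threshold,
        # then discard the old data so the model adapts to the new concept.
        if len(errors) == error_window and sum(errors) / error_window > threshold:
            drift_points.append(i)
            model.reset()
            errors.clear()
        model.update(label)
    return drift_points


if __name__ == "__main__":
    stream = ["a"] * 100 + ["b"] * 100  # the concept changes at position 100
    print(run_stream(stream))
```

With the stream above, the sketch flags a single drift shortly after the change at position 100 and then recovers, whereas a model that never discards old data would keep predicting the outdated label for much longer.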
Another area of interest, highly related to the issue of processing huge amounts of unstructured data, is text processing and information retrieval. This line of research is currently very active because of the ubiquity of social networks, an obvious area of interest for CCI. Information extraction from tweets is especially challenging because traditional natural language algorithms cannot be used to process them. In particular, named entity recognition (NER) is difficult to perform because single tweets, being short, do not provide enough information, and novel clustering techniques must be used [13]. Interestingly enough, similar problems with NER appear in other disciplines, most notably in Medicine, where active learning methods are successfully used [3]. In fact, clustering methods are widely used to detect communities in social networks, where members involved in similar social objects are grouped [32], although some researchers argue that random factors need to be taken into account, so that statistical fuzzy approaches are more suitable [12].
Intelligent information systems are very useful in areas where vast amounts of heterogeneous, and usually unstructured, information must be processed. These systems are gaining popularity in Medicine, in particular when used to evaluate the quality of health care systems [1] and monitor the progress of patients [29]. In the latter case, it is necessary to take into account the special nature of vital signs so that the best performance in predicting patient conditions can be obtained when using a fuzzy model.
Intelligent database systems play a role in many different areas. In particular, they can be used in Computer Science to improve existing methodologies. This is the case when databases are used to help in the replication of experiments in Software Engineering [2, 6]. This is an area where, again, statistical information is frequently used to construct robust methods [10]. This is especially relevant if datasets are non-normal.
Decision support systems are the next kind of system that we consider within the topics covered in this special issue. They are particularly relevant in the context of CCI if we consider them in the scope of decision making and knowledge engineering, most notably in the context of decisional DNA [25], a structure suited to obtaining knowledge in decision-making processes, and virtual engineering objects [26].
We finish this brief overview of the field with the application of computer vision techniques to video surveillance and object detection. This is a line of work where huge amounts of data must be adequately processed and analyzed. Since data are observed and/or collected from distributed locations and intelligent decisions must emerge, in particular when an imminent danger is recognized, this field is within the scope of CCI. The number of cameras installed in public areas is steadily increasing, with a consequent increase in the number of images to be processed. One of the current concerns is the detection of abandoned objects because of the threat that they can represent. Therefore, new robust and efficient algorithms are being developed, taking into account varying circumstances such as lighting changes [28]. Similarly, the detection of pedestrians, as well as the objects that they carry, is an active line of work [4, 8] where probabilistic approaches are widely used to decide whether the pedestrian is carrying potentially dangerous artifacts.
The aim of this special issue is to present to the research community a comprehensive collection of articles including the most relevant and recent achievements in the broad field of Collective Intelligent Information and Database Systems. We have been able to cover most methodological, theoretical and practical aspects of Collective Intelligence, understood as the form of intelligence that emerges from the collaboration and competition of many individuals (artificial and/or natural), and its relation to databases. This special issue includes, in particular, extended and revised versions of papers selected from the 2016 edition of the ACIIDS conference and the 2015 edition of the ICCCI conference. In addition, we called for high-quality, up-to-date contributions in the broad field of Collective Intelligent Information and Database Systems.
The topics of interest for the special issue are those at the intersection of the ICCCI and ACIIDS conference series. The papers in this special issue are distributed according to the following specific categories:
Knowledge engineering and semantic web.
Text processing and information retrieval.
Machine learning and data mining.
Social networks and recommender systems.
Agent and multi-agent systems.
Intelligent information systems.
Database systems and software engineering.
Decision support and control systems.
Computer vision techniques.
After a careful reviewing process, we have selected 40 papers to make up this special issue. All submitted papers, including the extended versions of conference papers, were peer-reviewed and selected on the basis of quality and relevance to the special issue.
We would like to thank the authors of the submitted papers for their interest in the special issue and the high quality of their contributions. They are the most important element in making this a relevant and interesting scientific work. We would also like to thank the members of the Guest Editorial Board, and their subreviewers, because their careful work and dedication have been fundamental for the success of this special issue. The list of members can be found at https://sites.google.com/site/sejifs2016/guest-editorial-board. Finally, we would like to thank Van Du Nguyen (Wroclaw University of Technology) for his help with the web site.