Abstract
Keywords
1. Introduction
Current practice in service robot research should be placed in the context of particular research groups: the background and interests of the group and its members, the group's research network, the explicit agenda of short-, medium- and long-term goals, the preferred tools and methodologies, and the group's practice and experience. It is also necessary to consider whether the focus is academic research and advancing the state of the art, the development of human resources, or the development of commercial applications. In addition, it is important to consider how the effort is directed and how productivity is assessed: whether the aim is to produce robotic devices and algorithms or fully operational service robots, and whether the output is measured in journal and conference papers, patents and utility designs, or research corpora. Academic productivity can also be assessed in terms of the doctoral and master's dissertations produced within the context of the group, and also through demos, formal evaluations and competitions, such as RoboCup@Home. These dimensions define a very large space in which research efforts can be placed, and although the stated purpose may be to develop ‘service robots’, different groups may be doing very different things.
In practice, most service robot research and development groups are focused on particular specialities, like navigation [1–5], manipulation [6–10], vision [11–15], robot planning and coordination [16–22], audio [23–25], control and signal processing [26, 27], operating systems [28, 29], speech and language processing [30, 31], machine learning [32–37], human-robot interaction [38–41] and artificial intelligence [42], among others, that constitute the service robot's supporting or enabling technologies. A detailed review and systematic comparison of the different approaches in the specialities mentioned - and possibly others - is a huge undertaking that is beyond the scope of this paper. However, what we would like to highlight at this point is that groups are strong in their specific disciplines, but they simply integrate the other technologies needed in the construction of the robot as a whole, and the development of the robot itself is mostly a question of “implementation”. From this perspective, the robot “emerges” as a side effect of the various functionalities but is not the proper object of research; somewhat paradoxically, it merely provides a context for making progress in the supporting technologies. The state of the field is hence not coherent: it makes communication and interaction between groups difficult, fosters highly unbalanced development and, although there may be a large number of contributions to the supporting disciplines, current practice prevents the clear development of service robots as an objective in itself.
The present state of the field is due, at least in part, to the lack of a clear and explicit concept of a service robot that is shared by the community and - consequently - of guidelines on how to articulate such a concept in particular research efforts. This paper is concerned with the analysis of such a concept and its impact on the development of service robots in general. In section 2, we place the problem in the context of system levels, and argue that there is a service robot level corresponding to the knowledge level in Newell's system levels [43] as well as to the computational theory in Marr's system levels hierarchy [44]. We argue that a higher-level specification of the function of a service robot provides context and coherence to the enabling technologies, producing a virtuous cycle that fosters progress in the field of service robots, and also promotes advances in enabling technologies directed specifically to service robots, and hence progress in the discipline as a whole. We also propose placing a ceiling on the tasks that can be performed by machines with current technology and adopt the practical task and domain-independent hypotheses for service robots (as presented in [45]), after the corresponding hypotheses for dialogue systems [46], and pose that the conceptual model for a service robot needs to refer to such a notion and hypotheses.
In section 3, we discuss how the concept of a service robot can be articulated in practice through the explicit definition of a specific conceptual model for particular service robots. This consists of a highly abstract specification of what the robot does from the point of view of human users in terms of a set of behaviours at the level of the task and a behaviours composition mechanism. The conceptual model also permits the introduction of an explicit notion of a
The abstract specification with an instantiation of the conceptual model is illustrated in section 5. For this, we use the notion of a dialogue model for the specification of behaviours, the SitLog programming language for the specification and interpretation of the task structure [45] and the IOCA architecture [47]. The static and dynamic composition modes are exemplified with the
The main contribution of the present paper is in the articulation of a conceptual model for a service robot in which the specification of generic tasks is stated at a functional level that is oriented to the human user. This level consists of the specification of the robot's competence and is distinguished from the algorithmic and implementation levels that are commonly the focus of robotics research and which determine the robot's performance. The paper is concluded in section 7, with an overall reflection on the framework and methodology for the development of service robots in diverse application domains and the impact of the conceptual model in the field as a whole.
2. Concept of a service robot
Service robots are the product of implementation efforts and ‘emerge’ from the integration of diverse technologies, which are in turn supported by system software and utilities of different sorts. Questions about the design and implementation of perception and action algorithms can be stated explicitly in terms of specific functionalities and constraints, and the resulting devices can be assessed in relation to such specifications. However, the question of what it is that we do when we design and build a service robot is somehow more difficult to answer. To grasp this point, we draw an analogy between service robots research and the design and construction of automobiles of the standard sort. The car comes from the integration of a number of enabling structures and systems, mainly the body shell, the engine, the transmission system, the suspension system, the steering system, the brakes and the electrical equipment. Each of these technologies is the product of a research and development field with its own questions, practices and traditions; however, the car as a unit has a functional definition which consists of transporting people with some range of specific needs (e.g., sports cars, family cars, etc.), and this functional definition determines and regulates the specifications for the particular structures and systems enabling such functionality. Hence, the evolution of automobile technology can be seen as the product of a virtuous cycle between the function and the enabling technologies.
The relation between the design object and the enabling technologies can be thought of in terms of the notion of system levels. This notion is familiar to the philosophy of science, in which more specific sciences ‘reduce’ to more general ones, like chemistry, which reduces to physics. Each discipline has its own focal phenomena, which are described by a set of general laws and a specialized vocabulary of theoretical terms, but at a particular level of abstraction that is relevant to the phenomena of interest. The reduction proper involves a mapping of the laws and theoretical terms between the corresponding theories. These notions were applied to computing systems by Newell [43]. Newell's paper was motivated by the lack of a clear or common understanding of what knowledge was at the time, even though there was a very significant effort devoted to the construction of knowledge-based systems - very much like the current situation in service robots research.
System levels in Newell's sense involve independent layers with a well-defined input, output and transfer function, which can be thought of as systems in themselves or else can be used in the construction of higher levels. Newell's levels for computational systems are the physical level, the device level, the electronic circuit level, the logic circuit level, the transfer-register level (i.e., computer architecture), the symbol level (i.e., programming languages) and, on top of this hierarchy, the knowledge level. An important distinction introduced by Newell was that all levels but the knowledge level reduce to the next level down in the hierarchy, in the sense that a computer program written in a programming language can be mapped down into the computer architecture, or a logical circuit can be mapped directly into its implementation in an electronic circuit. The knowledge level, for its part, cannot be so reduced. For this reason, the knowledge level is not only at the top of the system levels hierarchy but also has a particular quality that makes it altogether different from all the other system levels. Function is stated at the knowledge level and it stands apart from all supporting or enabling technologies; for this reason, we think of the car as something that emerges from its constituent parts but which cannot be reduced to them. We can pose the same distinction for the service robot and think of a functional specification of what the robot does from the point of view of people, a specification that can be distinguished from the robot's mechanisms and systems.
An alternative to Newell's system levels, although somehow from a different perspective, was introduced by Marr [44], who distinguished between three different levels that he called the computational theory, the representation and algorithm, and the hardware implementation.
Correspondence between Newell's, Marr's and Service Robot's System Levels
Newell's, Marr's and the present notion of system levels should be distinguished from actual computer architectures, like subsumption architectures [47], cognitive architectures [48] or layered architectures [18], as these are mostly orthogonal notions. We also need to consider the limitations of current technology in relation to open tasks that can be performed by people in natural environments. It is clear that human higher mental functions, like language, vision and memory, and also intentional motor behaviour, like walking or grasping objects, for instance, are much more complex than the functions that can be performed by the most sophisticated current machines, and that a full understanding of these functionalities is far removed from our current state of knowledge. Hence, it is necessary to place a reasonable limit on the things we can do with service robots. For this, we adopt the practical dialogues and domain-independent hypotheses suggested by Allen for dialogue systems [46], and pose the corresponding practical task and domain-independent hypotheses for service robots [45].
3. Conceptual model and task structure
We proceed now to discuss how the concept of a service robot described above can be articulated in practice. For this, we abstract over hardware devices and their associated algorithms, and focus on the functionalities that these provide from the point of view of the human user. We refer to each basic functionality in this set as a behaviour.
There is a very large range of possibilities regarding the selection and specification of behaviours and composition mechanisms, with their corresponding properties, and we here pose that the conceptual model of a particular service robot consists of this particular choice. Hence, the field of service robots can be construed in terms of the study of behaviours and composition mechanisms with their theoretical properties and empirical validation. In a sense, all service robots have a conceptual model; however, the more explicit this is, the better the properties of the robot are understood and capitalized upon by its designers and users.
Furthermore, the catalogue of the robot's abilities and its composition mechanisms define the robot's competence and permit us to ask explicitly what it is that the robot can in principle do. However, abstract specifications can be implemented with different algorithms and computational devices, and physical robots with the same conceptual model may perform differently with different implementations. In this regard, Chomsky's distinction between linguistic competence and performance [49] can be applied to the field of service robots research: while “the task grammar” defined by a set of basic behaviours and composition mechanisms states the robot's competence, the robot's actual performance depends upon the particular choice of enabling technologies, system software and physical devices.
3.1 Task specification language
The explicit representation and interpretation of the task structure requires a specialized programming language so that final applications can be developed and tested in reasonable time and with reasonable effort. Such a language should have enough expressive power to state basic and composite tasks in a declarative way, and should also be rich enough to allow for the expression of content and control information; it should also have abstraction capabilities to express complex behaviours in a simple way. The design and implementation of such specification and programming languages is another area of service robots research, and significant efforts in this regard are already apparent ([45, 50–55]).
3.2 Task structure and behaviours
Task structure and behaviours naturally define two layers of functionality, as illustrated in Figure 1. The upper layer stands for a composition of behaviours that constitute the task structure of a particular application, and the lower layer corresponds to the set of basic behaviours constituting the robot's native capabilities. For instance, the

Application and Behaviours Levels
Basic behaviours in turn are structured objects defined in terms of other behaviours, as illustrated by the hierarchies of
However, independently of its internal structure, a behaviour is also an atomic unit that can be a part of the task structure directly. This is illustrated by the directed dotted lines connecting situations of the task structure in the upper layer with behaviours. From the perspective of the task structure, the behaviours
Finally, the behaviours in the bottom layer of Figure 1 are grouped in four main areas of functionality, which are
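The two-layer organization described above can be sketched in code. The following Python fragment is an illustrative sketch only, not part of the actual system: the behaviour names, the functionality areas and the sequential composition mechanism are hypothetical choices made for exposition.

```python
# Sketch of the two functional layers of Figure 1 as plain Python data.
# All behaviour names and area labels here are hypothetical examples.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Behaviour:
    name: str
    area: str                      # a functionality area, e.g. navigation
    run: Callable[[dict], str]     # executes the behaviour, returns a status

# Lower layer: the robot's native catalogue of basic behaviours.
CATALOGUE: Dict[str, Behaviour] = {
    "move":  Behaviour("move",  "navigation",   lambda ctx: "ok"),
    "find":  Behaviour("find",  "vision",       lambda ctx: "ok"),
    "ask":   Behaviour("ask",   "language",     lambda ctx: "ok"),
    "grasp": Behaviour("grasp", "manipulation", lambda ctx: "ok"),
}

# Upper layer: the task structure of an application is a composition
# (here simply a sequence) of behaviours drawn from the catalogue.
def run_task(task: List[str], ctx: dict) -> List[str]:
    return [CATALOGUE[b].run(ctx) for b in task]

statuses = run_task(["move", "find", "grasp", "move"], {})
```

The point of the sketch is only that the upper layer refers to behaviours as atomic units, while their internal structure and supporting algorithms remain hidden below the interface.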
3.3 Task management
The explicit representation of the task structure also allows the use of deliberative resources, knowledge-bases and task management strategies dynamically during the execution of a task. The task structure may also involve knowledge of constraints, like the time allowed to complete the task and the scores for achieving total and partial goals, as well as knowledge of the robot's own physical resources, like the number of hands and their state (i.e., holding or free) during the execution of the task, and the locations and distances between the places of the scenario, which may be collected dynamically. Additional
An explicit representation of the task structure also permits the identification of deliberative points where diagnosis, planning and decision-making may be particularly relevant. A particular deliberative situation occurs, for instance, once the robot has received an order in the
We also need to consider that the scenarios in which service robots are expected to perform are very noisy and that a large number of contingencies may arise along the way, so handling the time and other constraints is crucial in achieving the goals; hence, explicit task management is needed to supervise the process and make decisions along the way. The explicit representation of the task structure makes it possible to define such task management processes along with the deliberative inferences required to support it.
Task management is also relevant to handling faults due to either external or internal contingencies that may arise in the execution of behaviours. For instance, [56] presents an analysis of fault diagnosis within a logical framework using naive physics and an ontology for hypothesis generation and fault prevention. A more general kind of fault occurs when the robot goes out of context due to a mismatch between its expectations and the events in the world. In this latter situation, the robot may get back into context through an abductive inference in relation to a common sense theory about the states and actions that take place in the environment - like the home - including causal rules involving states and actions. Task management in this setting may be construed as a pipeline process involving fault detection, the formulation of a fault hypothesis through abduction in relation to the causal theory of home dynamics, hypothesis ranking, and the identification of possible courses of action through planning and decision-making. This pipeline is required, for instance, to handle dynamic scenarios where the robot cannot accomplish an explicit goal due to changes in the environment, as in the execution of Type 3 commands of the General Purpose Service Robot test of the RoboCup@Home competition, as discussed below.
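The pipeline just described can be illustrated schematically. The following Python sketch is hypothetical throughout: the causal rules, plausibility scores and recovery actions are invented for exposition and do not reproduce the authors' causal theory of home dynamics.

```python
# Sketch of the fault-handling pipeline: detection, abductive hypothesis
# generation, ranking, and replanning. Rules and scores are hypothetical.

# Naive causal theory: observed anomaly -> candidate causes with
# plausibility scores.
CAUSAL_RULES = {
    "nobody_in_room": [("people_moved_to_other_room", 0.7),
                       ("perception_error", 0.3)],
}

def detect_fault(expected, observed):
    # A fault is a mismatch between expectation and observation.
    return None if expected == observed else observed

def abduce(fault):
    # Rank candidate hypotheses by plausibility (best first).
    return sorted(CAUSAL_RULES.get(fault, []), key=lambda h: -h[1])

def plan(hypothesis):
    # Map the preferred hypothesis to a recovery course of action.
    actions = {"people_moved_to_other_room": ["search_other_rooms"],
               "perception_error": ["rescan_room"]}
    return actions[hypothesis]

fault = detect_fault("people_in_room", "nobody_in_room")
best, _ = abduce(fault)[0]
recovery = plan(best)
```

Under these assumed rules, the anomaly "nobody_in_room" yields the hypothesis that the people moved, and the recovery plan is to search the other rooms, which mirrors the Type 3 scenario discussed below.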
4. Dynamic task structure
There are scenarios in which the structure of the task is not available in advance and must be defined and executed dynamically. An instance of this situation is the
Situation: There is nobody in the living room, but there are people in the kitchen. The robot starts in the kitchen.
Command:
Commands of Type 1 are constituted by a sequence of basic commands, where each command expresses a basic intention or speech act, which in turn corresponds to a basic behaviour or sequence of behaviours that can be performed by the robot directly. The interpretation of this type of command requires no inference or problem solving beyond the parsing involved in mapping the command to its corresponding speech act, and the speech act to its corresponding behaviour or sequence of behaviours. For instance, the speech act move can be expressed by a number of different expressions like
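The two mappings involved in interpreting a Type 1 command can be sketched as two lookup steps. The vocabulary and the behaviour decomposition below are illustrative assumptions, not the competition grammar or the robot's actual parser.

```python
# Sketch of Type 1 interpretation: several surface expressions map to one
# speech act, and each speech act maps to a behaviour sequence.
# All entries are hypothetical examples.
SPEECH_ACTS = {
    "go to the kitchen":       ("move", "kitchen"),
    "navigate to the kitchen": ("move", "kitchen"),
    "walk to the kitchen":     ("move", "kitchen"),
}

BEHAVIOURS = {
    # The speech act 'move' expands into a sequence of basic behaviours.
    "move": lambda place: [("plan_path", place), ("follow_path", place)],
}

def interpret(command: str):
    act, arg = SPEECH_ACTS[command]      # command -> speech act
    return BEHAVIOURS[act](arg)          # speech act -> behaviour sequence

seq = interpret("walk to the kitchen")
```

Note that the three surface expressions all collapse into the single speech act move, which is the many-to-one relation between commands and speech acts discussed below.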
Commands of Type 2 are underspecified orders that need to be determined either through linguistic interaction with the human user or through conceptual inference (e.g., querying a conceptual taxonomy in the robot's knowledge-base), or else by a combination of these two strategies. In the present example, the command states that a snack must be carried to a table, but it does not specify which snack or which table. In order to accomplish this order, the robot may ask the user to specify such information (i.e., solve the task through linguistic interaction); alternatively, the robot may find a particular object through vision, query whether it is a snack in its knowledge-base, and take it to any table whose location might also be stored in the knowledge-base (i.e., solve the task through conceptual inference); a third strategy might consist of finding the snack and asking the user for the table (i.e., combining interaction and inference). In any case, once the references are determined, the Type 2 command is reduced to a Type 1 command, which can be executed directly as before.
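The conceptual-inference strategy for Type 2 commands can be sketched as a query against a small knowledge-base. The taxonomy, object names and locations below are hypothetical, and the sketch shows only the reduction of an underspecified order to a fully determined Type 1 command.

```python
# Sketch of Type 2 resolution by conceptual inference: underspecified
# arguments ("a snack", "a table") are resolved against a knowledge-base.
# The taxonomy and locations are invented examples.
KB = {
    "is_a": {"chips": "snack", "cookies": "snack", "milk": "drink"},
    "locations": {"table_1": (2.0, 3.5)},
}

def resolve(category: str) -> str:
    # Query the taxonomy for any instance of the requested category.
    for obj, cat in KB["is_a"].items():
        if cat == category:
            return obj
    raise LookupError(category)

def resolve_command(cmd):
    # ("bring", "a snack", "a table") -> fully determined Type 1 command.
    verb, obj_phrase, _dest_phrase = cmd
    obj = resolve(obj_phrase.removeprefix("a "))
    dest = next(iter(KB["locations"]))   # any known table location
    return (verb, obj, dest)

type1 = resolve_command(("bring", "a snack", "a table"))
```

In an actual system the same references could instead be fixed by asking the user, or by a mixture of interaction and inference; the sketch shows only the pure-inference branch.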
Commands of Type 3 also involve error detection and explicit task management. In the present example, the robot will go to the living room but will need to realize that nobody is there and execute a task management action, which might be to search for people in other locations. Error detection and task management are essential for robust behaviour, as robots need to be able to cope with a large number of contingencies that can appear during the execution of a task.
The higher the type of command, the higher the parsing effort; but assuming a robust and comprehensive parsing strategy, the increase in the difficulty of the three types of commands depends mostly on the need to engage in linguistic or visual interaction supported by inference, and also on the need to be aware of whether the actions performed are successful and to carry on with the appropriate actions. This may require explicit task management, involving diagnosis through abduction, planning and decision-making, considering any constraints and
Another consideration is that commands, speech acts and behaviours do not necessarily correspond univocally; although this may be the case for particular commands, this is not the case in general, and several commands may correspond to the same speech act, and a speech act may require the execution of several behaviours. In addition, the same command may correspond to different speech acts, and the context may be essential for resolving the ambiguity. A study of the sentence generator for the
The relation between speech acts and the corresponding behaviours is not one-to-one either. In this example, there are 11 speech acts but 25 behaviours, as shown in Figure 1; some speech acts need to be assembled out of several behaviours, and some behaviours do not correspond to a unique speech act. Indeed, the different kinds of relations between speech acts and behaviours mostly determine the type of command.
In addition to the behaviours, the
Commands of Type 1 state the basic case and relate a fully determined speech act to a particular behaviour, for instance:
In addition, a command may be interpreted in terms of behaviours and commands, as follows:
This shows that there is a feedback cycle between the linguistic (i.e., speech acts) and the behavioural component giving rise to complex behaviours, but with a simple and well-structured interpretation regime.
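This feedback cycle can be rendered as a small recursive interpreter: a unit is either an atomic behaviour, executed directly, or a command that expands into behaviours and further commands interpreted by the same regime. The expansion table below is a hypothetical example.

```python
# Sketch of the feedback cycle between commands and behaviours: a command
# may expand into basic behaviours and further commands, interpreted
# recursively. The expansion table is illustrative.
EXPANSIONS = {
    "serve_drink": ["find_drink", "grasp", "deliver"],  # behaviours + command
    "deliver":     ["move", "hand_over"],               # a further command
}

BASIC = {"find_drink", "grasp", "move", "hand_over"}

def interpret(unit: str):
    if unit in BASIC:                 # atomic behaviour: execute directly
        return [unit]
    seq = []
    for part in EXPANSIONS[unit]:     # command: expand and recurse
        seq.extend(interpret(part))
    return seq

plan = interpret("serve_drink")
```

The interpretation regime stays simple and well structured because the same rule applies at every level of the expansion.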
Commands of Type 2 involve the specification of the arguments of the speech act by means of conceptual inference, and possibly some linguistic or visual interaction strategy, for instance:
where
Commands of Type 3 add explicit task management to the behaviour so that appropriate actions can be taken in case performance errors occur, or in case the task cannot be executed in the actual scenario. To handle this type of command, a status argument must be included in the specification of all behaviours, for instance:
The error handling and task management processes check the status of each behaviour and decide whether to proceed with the task or take an appropriate task management action.
Commands of Type 1 and Type 2 should proceed straightforwardly according to the rules of the competition, as the status argument of the behaviours must always be ‘ok’; however, this argument in commands of Type 3 may have a different value, in which case the system must engage in a task management process involving diagnosis, planning and decision-making; that is, deliberative behaviour, in order to proceed with the task. The status arguments are defined for behaviours but not for speech acts, as these are the interpretation of the intentions expressed by the users, and hence the specification of the intended behaviour. In the present case, the robot ought to be able to make a diagnosis as to why there are no people in the room and come to the plausible conclusion that they have moved to another room, make a plan for visiting other rooms, and execute the
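The role of the status argument can be sketched as follows. This is an illustrative Python sketch, not the SitLog specification: the behaviour names, the world model and the recovery policy are hypothetical.

```python
# Sketch of the status argument for Type 3 commands: every behaviour
# reports a status, and the task manager checks it to decide whether to
# continue or to trigger a recovery action. All names are hypothetical.
def go_to(place, world):
    return "ok" if place in world["reachable"] else "blocked"

def find_people(place, world):
    return "ok" if world["people"].get(place) else "not_found"

RECOVERY = {"not_found": ["search_other_rooms"], "blocked": ["replan_path"]}

def execute(task, world):
    log = []
    for behaviour, arg in task:
        status = behaviour(arg, world)
        log.append(status)
        if status != "ok":               # deliberative branch: recover
            log.extend(RECOVERY[status])
            break
    return log

# Nobody is in the living room, but there are people in the kitchen.
world = {"reachable": {"living_room"}, "people": {"kitchen": ["Ana"]}}
trace = execute([(go_to, "living_room"), (find_people, "living_room")], world)
```

For Type 1 and Type 2 commands every status would be ‘ok’ and the trace would contain no recovery actions; in the scenario above, the failed find_people behaviour triggers the search of other rooms.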
Task management involving deliberative behaviour is, of course, quite complex and the conceptual model of the robot should include the required supporting inferential and knowledge-base resources, but once again the particular algorithms and implementation strategies are not part of the functional specification. In particular, the mapping from natural language statements to the corresponding speech acts is not a part of the conceptual model. This facility should of course be available, but speech and language processing are enabling technologies - as any other - and there are several possible strategies and implementations; hence, no particular mechanism should be assumed in the conceptual model.
In summary, the specification of the

Interaction-oriented Cognitive Architecture (IOCA)
A final remark is that applications whose task structure can be defined in advance through analysis - like the standard tests of the RoboCup@Home competition - can also be thought of as behaviour schemata that can be specified and performed as sequences of behaviours, interpreted and executed by the dynamic composition mechanisms, as illustrated here for the
5. An instantiation of the conceptual model
In this section, we briefly describe a particular instantiation of the conceptual model. The central aspect is the definition of a machine for the declarative specification and interpretation of the task structure of final applications, and an interaction-oriented cognitive architecture
IOCA is a cognitive architecture with three layers: 1) a reactive level, 2) an interpretation and action specification level, and 3) a representation and inference level, from bottom to top. In this respect it differs from subsumption architectures, like Brooks' [45], which reject representations, and also from multi-robot coordination architectures, which place a planning layer at the top as their main deliberative behaviour [18]; this discussion belongs to the algorithmic level of analysis advanced in the present paper.

Static Task Structure
SitLog's interpreter is written in Prolog, and SitLog programs closely follow Prolog's notation. Each dialogue model consists of a set of situations or information states, and a dialogue model has a diagrammatic representation as a graph of situations. A situation in turn consists of a set of expectation and action pairs, together with the situation that is reached when an expectation is met (i.e., when an expected state or event in the world is acknowledged through perceptual interpretation) and its associated action is performed, as well as other content and control information. Situations are represented through a list of attribute-value pairs, as shown above.
The symbols to the left of ==> are the attribute names, and the symbols to the right stand for their corresponding values, which can be variables or expressions through which the expectations, actions, next situations and control information are expressed. Each situation has an ID, a type, and input and output arguments; the type indicates the kind of modality that is involved in the perceptual act through which expectations are acknowledged (e.g., vision, language, etc.). The attribute prog has as its value a local program which is executed unconditionally when the situation is reached. If the type of situation is
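The interpretation cycle of a situation can be mimicked in a conceptual sketch. The following Python fragment is an analogue for exposition only; it does not reproduce SitLog's Prolog syntax, and all identifiers (situation names, expectations, actions) are invented.

```python
# Conceptual analogue of a SitLog situation: attribute-value pairs with
# expectation/action pairs and next situations; the interpreter runs the
# local program, matches an expectation and moves on. Names are invented.
SITUATIONS = {
    "greet": {
        "type": "language",
        "prog": lambda ctx: ctx.update(greeted=True),  # local program
        "arcs": [      # (expectation, action, next situation)
            ("hello", "say_hello", "take_order"),
            ("bye",   "say_bye",   "final"),
        ],
    },
}

def step(sit_id, perceived, ctx):
    sit = SITUATIONS[sit_id]
    sit["prog"](ctx)                          # executed unconditionally
    for expectation, action, nxt in sit["arcs"]:
        if expectation == perceived:          # expectation is met
            return action, nxt
    return "recovery", sit_id                 # mismatch: recovery protocol

ctx = {}
action, nxt = step("greet", "hello", ctx)
```

A mismatch between the perceived event and every expectation is the condition under which the recovery protocols mentioned below would be invoked.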
There are three main kinds of dialogue models standing for the static task structure: 1) the task structure of final applications, 2) the set of generic behaviours in the conceptual model, and 3) the recovery protocols that are invoked when the flow of interaction is interrupted due to a mismatch between the robot's expectations in the situation and the states and events in the world. DMs of Kind 1 are developed by final application programmers, who rely only on the set of behaviours defined in the conceptual model and the specification and programming facilities provided by SitLog. Dialogue Models of Kind 2, on the other hand, are a part of the specification of the conceptual model proper, and are developed by the robot's production team. Behaviours have an internal structure which is codified in SitLog and also an external aspect which depends upon the perception algorithms that support the interpretation of external information as well as upon the action algorithms that render the concrete actions performed by the robot through its actual physical devices. Lastly, the recovery protocols can be specific to final applications, and hence developed by application programmers, or they can be generic protocols that enrich the conceptual model and are developed by the robot's production team.

SitLog's specification of a General Purpose Service Robot
5.1 Example of a static task structure: The restaurant test
An instance of an application with a static task structure is the
5.2 Example of a dynamic task structure: The EGPSR test
We illustrate now an instance of the specification and interpretation of a dynamic task structure. For this, we present the DM for the interpretation and execution of the three types of commands of the
The amount of task management knowledge has a very large impact on the overall strength of the service robot and is indispensable in the execution of commands of Type 3, as without this kind of knowledge the robot could not recover from unexpected obstacles or arbitrary changes in the scenario. The current strategies use heuristics and schematic behaviours to deal with expected test scenarios, like the strategies
The present case studies for both the static and dynamic tasks are illustrations of how the conceptual model is implemented; however, the same conceptual model has been used in the implementation of the rest of the RoboCup@Home tests, providing additional support for the generality of the present approach.
6. Description of the robot Golem-II+
Golem-II+ is our in-house service robot, presented in Figure 5. It is based on a PeopleBot model, with several major hardware and software enhancements carried out in-house. Table 2 presents a brief summary of its hardware and the software libraries used.

The Golem-II+ Service Robot
Software Libraries used by the IOCA Modules and the Hardware of Golem-II+
7. Conclusions
In this paper, we have introduced and discussed a concept of a service robot. A service robot is an entity that is able to perform a number of basic behaviours and compose them in the execution of complex tasks. This concept is articulated in practice through the definition of an explicit conceptual model for particular service robots. We have discussed how the conceptual model and framework can be applied in general, and we have also illustrated the model with a particular implementation in the robot Golem-II+ with the IOCA architecture and the SitLog programming language, which have been developed within the context of the Golem Project. Video demonstrations of the examples discussed in this paper, as well as the actual SitLog code, can be accessed at
As an overall reflection, the present framework permits us to conceive of service robots research as a discipline of its own, consisting of the study and functional specification of useful behaviours from the point of view of human users and the ways in which these can be combined in the composition of complex tasks, including the design and implementation of specification languages for robot tasks. We propose that a clear demarcation between research into service robots and research into their enabling technologies and system software facilitates communication and collaboration, as individual researchers and groups will have a clearer idea of what the focus and system level are in relation to which their efforts are made. We also suggest that this demarcation of labour will foster a virtuous cycle between service robot research and enabling technologies, yielding progress in the field as a whole.
More generally, the definition of the conceptual model in terms of behaviours and composition mechanisms not only provides for a clear demarcation of application and robot development teams, but also determines the set of enabling technologies that are required to support a given set of behaviours. This latter specification is also functional, and the robot's design team must decide on particular algorithms and system design considerations. There may be a large range of technologies and implementations to choose from, and a given selection will have an impact on the robot's performance but not on its competence, as this is determined by the conceptual model. A final aspect of the design as a whole is the system software required to support the robot's architecture, which is the backbone of the robot. However, this is also an implementation decision that will have an impact on the robot's performance but not on its competence.
Finally, the conceptual model and a declarative language to state the task structure permit us to demarcate clearly the activities related to the development of the robot proper from the activities related to specifying the structure of a particular task or programming a particular application. In current practice, these two activities are heavily interwoven, as developers of enabling technologies and system programmers need to take into account specific aspects of particular applications. Moreover, application developers need to work with algorithms and system programming, making the development and testing of final applications rather difficult. As with cars, if a robot is offered to the general public, the focus of the robot-maker should be on what the robot can do in principle and how well it performs in practical tasks (e.g., how reliable and efficient it is), so that final applications can be easily specified and developed, and robots can be used in practice by human users.
