
Only released in EOL distros:  

Package Summary

Spatial World Model for Object Tracking

About

The spatial_world_model contains libraries, database configuration scripts, and ROS wrapper nodes to communicate with the Spatial World Model. The Spatial World Model is a persistent, spatial representation of the world and a robot's working memory. This includes tracking and storing relationships between physical objects, maps, robots, and boundaries to name a few. A PostgreSQL database is used in the back-end to maintain this information.

robot.png

How is the Spatial World Model Different from Other World Model Approaches?

The Spatial World Model provides a general representation, persistent storage, and querying of entities (objects) and actions (affordances) organized in 3D semantic maps. The Spatial World Model is intended to be used both by autonomous mechanisms (for object recognition, map building, affordance learning, etc.) and through direct annotation by human users. The Spatial World Model differs from existing approaches by considering only the spatial and physical properties of objects, and it forgoes broader conceptual and ontological knowledge. As such, the Spatial World Model aims to keep a modular separation between the spatial representation of the world and the specific inference mechanisms that could be used for AI and decision making. What the World Wide Web 1.0 and HTML did for 2D documents, the Spatial World Model aims to do for 3D objects in the physical world.

Example -- Map Annotation

As a basic example for the types of end-user interfaces that can be created with the Spatial World Model, we look at the Map annotation interface. Information on this interface can be found on the worldtoolsjs GitHub page. Below is a video demonstrating its capabilities:

Installation

Installation of the Spatial World Model requires two steps:

  1. Setting up the PostgreSQL database (either locally or on a remote server)

  2. Setting up the ROS nodes to communicate with the Spatial World Model

Installing the Spatial World Model Database

The following steps are written for Ubuntu 12.04 but apply to most Linux systems.

To begin, we must install PostgreSQL and the Python libraries that will talk to it. To do so, run the following command:

Next, we will need to create the actual database. To do so, execute the following:

It is never a good idea to use the default (root) user as the main user for the database. Therefore, we will create a new user that will be used solely with the new world model database. To do so, run the following:

You will then be prompted for a password.

Next, we will have to grant this new user permission to our database:

Finally, we are able to install the database schema. This is provided in a script found in the worldlib package. This script can be used to both install a new database and to update an existing one.
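The database, user, and permission steps above amount to three SQL statements. The following is a minimal sketch that only builds those statements as strings; the names world_model and world_model_user are hypothetical (this page does not fix them), and the schema script from worldlib is not covered here. The password is deliberately left out of the SQL; it can be set interactively afterwards, for example with the \password meta-command in psql.

```python
import re

# Conservative pattern for unquoted PostgreSQL identifiers (an assumption made
# for this sketch so that names can be placed into SQL text safely).
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def setup_statements(database, user):
    """Return the SQL for the create-database, create-user, and grant steps."""
    for name in (database, user):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        f"CREATE DATABASE {database};",
        f"CREATE USER {user};",
        f"GRANT ALL PRIVILEGES ON DATABASE {database} TO {user};",
    ]


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration.
    for statement in setup_statements("world_model", "world_model_user"):
        print(statement)
```

The statements would be run by a PostgreSQL superuser, for example from a psql session.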

Allowing Remote Connections (Optional)

In many cases, you will be installing the Spatial World Model database on a central server so that multiple clients (and robots) can talk to it. As of now, the ROS nodes communicate via SQL to the database; however, due to security risks this will eventually be changed. To allow remote connections, we must modify the PostgreSQL configuration files on the system. Using your choice of editor, modify /etc/postgresql/9.1/main/pg_hba.conf with root privileges and add the following line:

Next, modify /etc/postgresql/9.1/main/postgresql.conf with root privileges and add the following line:

Finally, restart the server:
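For the two configuration edits above, the general shape of the lines is defined by PostgreSQL itself: a pg_hba.conf host record has the columns type, database, user, address, and method, and postgresql.conf accepts a listen_addresses setting. The sketch below formats such lines; the database name, user name, and subnet are hypothetical, and md5 password authentication is an assumption rather than the setting this page originally prescribed.

```python
import ipaddress


def hba_line(database, user, cidr, method="md5"):
    """Format a pg_hba.conf 'host' record for the given client network."""
    # ip_network raises ValueError for malformed input or stray host bits.
    network = ipaddress.ip_network(cidr, strict=True)
    return f"host    {database}    {user}    {network}    {method}"


def listen_line(addresses="*"):
    """Format the postgresql.conf setting that selects the listening interfaces."""
    return f"listen_addresses = '{addresses}'"


if __name__ == "__main__":
    # Hypothetical values: restrict access to one lab subnet.
    print(hba_line("world_model", "world_model_user", "192.168.1.0/24"))
    print(listen_line())
```

Restricting the address column to the subnet the robots actually use, rather than allowing every address, limits the exposure that comes with accepting remote SQL connections.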

Installing the ROS Software

--- coming soon ---

Startup

--- coming soon ---

Implementation Goals and Design Decisions

The Spatial World Model project is currently in its infancy and under active development. Parts of the API are considered to be highly unstable. The following sections describe both the long-term implementation goals and the design decisions associated with the project.

Current Implementation

High-Level Overview

At its core, the Spatial World Model is designed to be a persistent, multi-robot model that keeps track of both the robot's working memory and the properties, affordances, and activities that can be associated with each object. To manage persistence, the world model is stored in a PostgreSQL database. At a high level, the Spatial World Model currently allows for two sets of entities: a WorldObjectInstance and a WorldObjectDescription.

objects.png

WorldObjectInstance

A WorldObjectInstance, defined in the WorldObjectInstance message, can be thought of as a robot's working memory. At a basic level, such an entity contains a relative pose in the world with associated tags and timestamps. These entities describe a particular, specific instance of an object in the world (e.g., the cup sitting on the desk in the conference room). Each instance is linked to a single WorldObjectDescription, which contains a set of spatial descriptors for the object (e.g., mesh, bounding box, point-cloud cluster, etc.). Below is a detailed explanation of the fields in the WorldObjectInstance. Note that some fields will be blank depending on the type of object or what you know about the world.

It should also be noted that things stored as object instances need not be physical objects in the traditional sense. It is appropriate and sometimes necessary to include things like maps, rooms, and robots in the world model. To learn about how some of these things are stored, refer to the following section describing listeners.
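To make the shape of an instance concrete, here is a minimal sketch in plain Python. It is not the actual WorldObjectInstance message definition: every field name below is hypothetical and chosen only to mirror the prose (a pose, tags, timestamps, and a link to one description).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def _now():
    return datetime.now(timezone.utc)


@dataclass
class Pose:
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    # Quaternion in (x, y, z, w) order; the default is the identity rotation.
    orientation: Tuple[float, float, float, float] = (0.0, 0.0, 0.0, 1.0)


@dataclass
class InstanceSketch:
    name: str
    frame_id: str                          # frame the pose is relative to
    pose: Pose = field(default_factory=Pose)
    tags: List[str] = field(default_factory=list)
    description_id: Optional[int] = None   # may be blank if nothing is known
    created: datetime = field(default_factory=_now)
    updated: datetime = field(default_factory=_now)

    def move_to(self, position):
        """Record a new position and refresh the update timestamp."""
        self.pose = Pose(position, self.pose.orientation)
        self.updated = _now()


# A physical object linked to a shared description (id 7 is made up).
cup = InstanceSketch("cup", "/conference_room", tags=["cup"], description_id=7)
# A non-physical entry: a map stored as an instance, with blank fields.
office_map = InstanceSketch("second_floor", "/map", tags=["map"])
```

Leaving description_id empty for the map reflects the note above that some fields stay blank depending on the kind of entity.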

WorldObjectDescription

The second implemented entity is the WorldObjectDescription. This entity, defined in the WorldObjectDescription message, contains spatial descriptors of objects in the world. These are shared models that are common to all instances of such an object (e.g., a 3D mesh of the object itself). Each description contains a set of tags and an array of actual Descriptors. The generality of the descriptor model allows for models to come from a variety of sources (point cloud segmentation, 3D model warehouses and databases) with few-to-no restrictions. The main idea is to associate an appropriate type and ref field with each descriptor to determine how the data should be treated. In a sense, the type field can be thought of as a non-standard MIME type (PNG, Collada, but also nav_msgs/OccupancyGrid as a type). Future goals of the project set out to create a standard set of accepted type fields. Below is a detailed description of the fields associated with a WorldObjectDescription.
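The role of the type field can be illustrated with a small dispatch table. This is a sketch under assumptions: the class names, the handler table, and the exact type strings are hypothetical, since no standard set of types exists yet.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Descriptor:
    type: str          # MIME-like hint for how to interpret the data
    ref: str           # where the data came from or can be found
    data: bytes = b""


@dataclass
class DescriptionSketch:
    name: str
    tags: List[str] = field(default_factory=list)
    descriptors: List[Descriptor] = field(default_factory=list)


# Hypothetical handlers keyed by type; real code would decode the payload.
HANDLERS: Dict[str, Callable[[Descriptor], str]] = {
    "image/png": lambda d: f"decode image from {d.ref}",
    "model/vnd.collada+xml": lambda d: f"load mesh from {d.ref}",
    "nav_msgs/OccupancyGrid": lambda d: f"deserialize map from {d.ref}",
}


def handle(descriptor):
    """Choose a treatment for a descriptor based on its type field."""
    handler = HANDLERS.get(descriptor.type)
    if handler is None:
        return f"unknown type {descriptor.type!r}: keep as opaque data"
    return handler(descriptor)


mug = DescriptionSketch(
    "mug",
    tags=["cup", "kitchen"],
    descriptors=[
        Descriptor("model/vnd.collada+xml", "models/mug.dae"),
        Descriptor("application/x-custom", "scan-42"),
    ],
)
```

Unknown types fall through to opaque storage, which matches the few-to-no-restrictions goal: a consumer that does not understand a descriptor can still pass it along.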

Design Decisions

The Database

The first design decision was to use a PostgreSQL database for storage. Given the highly relational components associated with the world model (e.g., WorldObjectInstance and WorldObjectDescription), it made sense to use a relational database over other types of databases. For efficiency in searching and storage, the database schema itself is broken into finer grains than the APIs allow for. It is intended that developers make use of these higher-level APIs when dealing with the Spatial World Model as opposed to making raw SQL queries.

Higher Level APIs

APIs.png

The current implementation includes several layers of APIs. As mentioned previously, it is not intended for a developer to use the world model by directly making SQL queries. At the lowest level, the worldlib Python API should be used. This level of the API is responsible for talking SQL to the world model database and is able to make basic insertion and search queries while maintaining the correct structure. This level of the API allows for non-ROS processes to make use of the world model (another future goal of the project). By using an SQL connection between this library and the database, remote connections can be made and a central database can be used (such as one hosted in the cloud). This, of course, requires your server to allow remote SQL connections which is not ideal. Therefore, future plans hope to create a server-side API to allow for remote queries (think REST as an example but this would require polling). With such an API in place, the interface between the robot or client and the database could be made with this new service.

The second level of the API is the actual ROS node itself. The world_model node makes use of the Python API to communicate with the database. This node then offers a series of action servers to allow ROS nodes to search and add to the world model. Conversion between database entities and ROS messages is made here. It is intended that within ROS, a listener framework is used as described below.

Furthermore, a JavaScript library is provided in worldlibjs to allow remote web clients to interact with the world model. This API currently uses rosjs and rosbridge_server to communicate with the world model; however, as discussed above, the eventual goal is to have a standard server-side API to communicate with directly instead of using ROS.

The Listener Idea

Within ROS, the intended use of the APIs into the world model was to create a series of what are being called listener nodes. Such nodes listen to a set of defined topics, make the appropriate inferences on the information, and update the world model accordingly. Below are three examples included in world_listeners.

The above are just examples of the types of listeners that can be created. An additional example could be a segmented object listener. Such a node could listen for any segmented objects found by the robot and update the world model accordingly.
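The listener pattern can be sketched without ROS. In the example below, an in-memory dictionary stands in for the world model and a plain method stands in for a topic callback; all class and method names are hypothetical. In a real node, a method like on_pose would be registered as the callback of a ROS subscriber, and the update would go through the world_model node rather than a dictionary.

```python
class InMemoryWorldModel:
    """Stand-in for the world model: instance name -> latest known state."""

    def __init__(self):
        self.instances = {}

    def upsert(self, name, pose, tags):
        # Insert the instance, or overwrite it if it is already known.
        self.instances[name] = {"pose": pose, "tags": sorted(set(tags))}


class RobotPoseListener:
    """Listens for pose updates and mirrors them into the world model."""

    def __init__(self, world_model, namespace):
        self.world_model = world_model
        self.namespace = namespace

    def on_pose(self, pose):
        # Inference step: a pose message on this topic implies that a robot
        # with this namespace exists at the reported location.
        self.world_model.upsert(self.namespace, pose, ["robot", self.namespace])


model = InMemoryWorldModel()
listener = RobotPoseListener(model, "myRobotName")
# Simulate two incoming messages; the second replaces the first.
listener.on_pose((0.0, 0.0, 0.0))
listener.on_pose((1.5, 2.0, 0.0))
```

Keeping the inference inside the listener means the world model itself stays a plain store, which is the separation the listener idea is after.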

Namespacing

To allow for multiple robots, a notion of namespacing must be kept. To support this feature early on, this information is currently held inside the tags of the instance. It is up to the developer to maintain this namespace. For example, the above listeners take an optional argument to define the namespace. If no namespace is given, it will default to the hostname of the machine the node is running on. In most cases, this is good enough since the hostname of the robot is usually a good namespace. Then, when searching for things like a particular robot, we can do a tag search for ["robot", "myRobotName"]. Future improvements should be made to make this clearer and enforce unique namespacing.
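A minimal sketch of the two ideas in this section, the hostname default and the tag search, using plain dictionaries in place of real instances. The function names are hypothetical; the assumption is that a tag search matches instances carrying every requested tag.

```python
import socket


def resolve_namespace(namespace=None):
    """Use the given namespace, or fall back to this machine's hostname."""
    return namespace if namespace else socket.gethostname()


def tag_search(instances, required_tags):
    """Return the instances whose tags include every required tag."""
    required = set(required_tags)
    return [inst for inst in instances if required.issubset(inst["tags"])]


instances = [
    {"name": "base", "tags": ["robot", "myRobotName"]},
    {"name": "base", "tags": ["robot", "otherRobot"]},
    {"name": "floor_plan", "tags": ["map", "myRobotName"]},
]

matches = tag_search(instances, ["robot", "myRobotName"])
```

Because the namespace is just another tag, nothing prevents two robots from sharing one by accident, which is the weakness the last sentence above points at.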

Future Implementation Goals

Object Instance Properties Database

One improvement to the current system would be to separate the properties array into its own database. The idea behind properties is to define relationships such as on or in between entities in the world model. Current thoughts are to point to entries within a graph database. By doing so, powerful search queries such as "give me all the objects inside the bedroom" or "is the book on my bookshelf?" can be written in an efficient way.
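The two example queries can be sketched with an in-memory index of relations. This is only an illustration of the query shape, not a graph database: the class and method names are hypothetical, and only direct relations are considered (an object on a shelf that is in the bedroom is not reported as in the bedroom).

```python
from collections import defaultdict


class RelationGraph:
    """Stores (subject, relation, target) facts, indexed by relation and target."""

    def __init__(self):
        self._edges = defaultdict(set)   # (relation, target) -> set of subjects

    def relate(self, subject, relation, target):
        self._edges[(relation, target)].add(subject)

    def subjects(self, relation, target):
        """Answer: which entities stand in this relation to the target?"""
        return sorted(self._edges.get((relation, target), set()))

    def holds(self, subject, relation, target):
        """Answer: does this specific fact hold?"""
        return subject in self._edges.get((relation, target), set())


graph = RelationGraph()
graph.relate("bed", "in", "bedroom")
graph.relate("bookshelf", "in", "bedroom")
graph.relate("book", "on", "bookshelf")

in_bedroom = graph.subjects("in", "bedroom")
book_on_shelf = graph.holds("book", "on", "bookshelf")
```

Indexing by (relation, target) makes both queries a single lookup, which is the kind of efficiency a dedicated graph store would provide at scale.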

Affordances and Activities

One large piece of the world model that is missing is the notion of affordances. The goal of the Spatial World Model is not only to keep track of particular instances of objects, but also to manage what types of actions can be taken on certain objects. For example, a door can be opened, a cup can be grasped, and a robot can grasp (assuming it has a gripper, of course). Furthermore, pre-conditions should also be stored here. For example, the cup must be on the table to be picked up (or any number of other conditions). This would rely on the implementation of the graph database described above. These types of attributes should be stored in a separate table in the database and linked to a particular WorldObjectDescription.

In addition to the affordances, a notion of activities must be stored as well. Such a structure would be used to figure out how to perform a given action on a given object. For example, if you wanted to use a pickup action on a coffee cup, the associated activity would be some action call to a grasping pipeline. Each activity can be thought of as a node with some kind of transition model incorporated to provide feedback and belief states. An updated diagram of the Spatial World Model would be the following:

worldmodel.png
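How affordances, pre-conditions, and activities could fit together is sketched below. Nothing here is implemented in the project: the class, the fact representation, and the placeholder grasp function are all hypothetical, and the transition model and belief states mentioned above are left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

Fact = Tuple[str, str, str]   # (subject, relation, target)


@dataclass
class Affordance:
    action: str
    # Each pre-condition is a (relation, target) the object must satisfy.
    preconditions: List[Tuple[str, str]]
    # The activity says how to carry out the action on a given object.
    activity: Callable[[str], str]

    def can_apply(self, obj: str, facts: Set[Fact]) -> bool:
        return all((obj, rel, target) in facts
                   for rel, target in self.preconditions)


def perform(affordance: Affordance, obj: str, facts: Set[Fact]) -> Optional[str]:
    """Run the activity if every pre-condition holds; otherwise do nothing."""
    if not affordance.can_apply(obj, facts):
        return None
    return affordance.activity(obj)


def call_grasp_pipeline(obj):
    # Placeholder for an action call to a grasping pipeline.
    return f"grasp requested for {obj}"


pickup = Affordance("pickup", [("on", "table")], call_grasp_pipeline)
facts = {("cup", "on", "table"), ("book", "on", "bookshelf")}

cup_result = perform(pickup, "cup", facts)
book_result = perform(pickup, "book", facts)
```

The pre-condition check reads the same (subject, relation, target) facts that a properties graph would hold, which is why this part depends on that graph being in place.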

World Object Instance Improvements

In addition to abstracting out the properties as defined above, several improvements are needed with respect to the instances. For one, belief states should be associated with most attributes. While the current pose does allow for this, beliefs about things such as timestamps are just as important.

A second improvement needed is the enforcement of namespaces. Checks should be made to make sure things are linked to a proper namespace entity (e.g., a robot should be linked to a map within its own namespace).

Thirdly, efforts should be made to standardize the tag set. While the listeners can help enforce this, care should be taken to make sure duplicate names for the same tag do not appear. Standardization helps with this.

World Object Description Improvements

As with the instances, the descriptions also need several improvements. One important feature is to standardize things such as the types, source JSON strings, and tags. A list containing all officially recognized types should be made and kept up to date.

A second major component is a cleanser process for the database. Currently, descriptions can be linked to multiple instances. This is the main idea behind the descriptions themselves. Additionally, these descriptions can potentially contain massive amounts of data (Collada models, for example). If there are no longer any instances linked to a given description, it should be removed not only from the database itself, but from the disk as well (since the large data portions are kept in PostgreSQL Large Objects). Care should be taken to ensure thread safety in the removal.
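The cleanser logic can be sketched in memory: find descriptions that no instance links to and remove them while holding a lock. The class and its fields are hypothetical stand-ins for the database tables. Against a real PostgreSQL database, the large-object payload would additionally need to be released (PostgreSQL provides lo_unlink for this), ideally inside the same transaction as the row deletion.

```python
import threading


class DescriptionStore:
    """In-memory stand-in for the description and instance tables."""

    def __init__(self):
        self._lock = threading.Lock()
        self.descriptions = {}   # description id -> payload
        self.links = {}          # instance id -> description id

    def link(self, instance_id, description_id):
        with self._lock:
            self.links[instance_id] = description_id

    def unlink(self, instance_id):
        with self._lock:
            self.links.pop(instance_id, None)

    def remove_orphans(self):
        """Delete descriptions that no instance references; return their ids."""
        # Holding the lock for the whole scan-and-delete keeps a concurrent
        # link() from attaching an instance to a description mid-removal.
        with self._lock:
            referenced = set(self.links.values())
            orphans = sorted(d for d in self.descriptions if d not in referenced)
            for description_id in orphans:
                del self.descriptions[description_id]
            return orphans


store = DescriptionStore()
store.descriptions = {1: b"mug mesh", 2: b"chair mesh"}
store.link("cup_a", 1)
store.link("cup_b", 1)
store.link("chair_a", 2)
store.unlink("chair_a")

removed = store.remove_orphans()
```

Description 1 survives because two instances still share it, which is the case the cleanser must never break.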

Server API

Perhaps the largest piece needed in the project is a more robust, efficient, and flexible server-side API for the world model. Currently, the worldlib Python API is used by the main ROS node and speaks SQL to the database. For many reasons, security being one, this is not ideal. Efforts should be made to create a server-side API that allows for multiple remote connections to interact with the world model. Not only would this still allow the robots to communicate with the world model, but clients could now connect directly to the world model instead of using rosbridge_server as a "proxy". While it may seem appropriate at first glance, this API should not be request-response based, as a REST API is. Instead, a more robust socket-level connection should be used to allow for bi-directional communication, which lets clients subscribe to changes in the world model without the need for polling. By standardizing a server-side interface, we can also create a more powerful query system. The protocol between clients and the server could include things like searching descriptions or descriptors without having to return the data associated with them. A diagram of the updated API levels is shown below.

NewAPIs.png
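Two properties asked of the future server API, push notifications instead of polling and searches that do not return the bulk data, can be illustrated in-process. This sketch has no networking at all: the class, its methods, and the event format are hypothetical, and a real server would deliver the same events over a socket connection.

```python
class WorldModelServerSketch:
    """In-process stand-in for a server with subscriptions and light searches."""

    def __init__(self):
        self._subscribers = []
        self._descriptions = {}   # id -> {"tags": [...], "data": bytes}

    def subscribe(self, callback):
        """Register a callable that is invoked on every change."""
        self._subscribers.append(callback)

    def add_description(self, description_id, tags, data):
        self._descriptions[description_id] = {"tags": list(tags), "data": data}
        # Push the change to every subscriber; nobody has to poll for it.
        for callback in self._subscribers:
            callback({"event": "added", "id": description_id})

    def search(self, tag, include_data=False):
        """Find descriptions by tag, omitting the payload unless requested."""
        results = []
        for description_id, entry in sorted(self._descriptions.items()):
            if tag in entry["tags"]:
                result = {"id": description_id, "tags": entry["tags"]}
                if include_data:
                    result["data"] = entry["data"]
                results.append(result)
        return results


server = WorldModelServerSketch()
received = []
server.subscribe(received.append)
server.add_description(1, ["cup"], b"large mesh payload")
server.add_description(2, ["chair"], b"another payload")

light = server.search("cup")
full = server.search("cup", include_data=True)
```

Returning only ids and tags by default keeps searches cheap when descriptions carry large models; a client fetches the payload only for the result it actually needs.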

Discussions and Contributions

Discussions and contributions are welcome! To get involved, check out the GitHub Issue Tracker for current feature requests and discussions.

Support

Please send bug reports to the GitHub Issue Tracker. Feel free to contact me at any point with questions and comments.




2024-12-28 18:39