
Object Generation, Inclusion, and Arrangement


For this work, make sure you switch to the robosuite more_objects branch:

  • git fetch --all # fetches remote branches
  • git branch # lists all branches
  • git checkout more_objects # switches to that branch

01 Motivation

A key goal of this work is to learn dexterous picking for a large set of diverse objects. One of the core lessons of the QT-OPT work is that if you want a robot that is going to be very good at picking anything, you need to work with a large number of objects, and those objects need to be very diverse.

What is a large number? QT-OPT used around 1000 objects in their experiments: about half for training and half for testing. Note: they did not test all 500 at the same time. They had bins with about 30 objects inside of them and they would swap them out across experiments.

To see whether our DRL algorithms work properly, it is important to verify them in simulation first. It is for this reason that we also need to simulate a large and diverse set of objects. What objects should we choose?

02 Classification

We have initially developed a classification system that involves 6 categories (weight, longitude, texture, geometry, density, and color), each with different levels. If we do not count the colors, there are a total of 144 classes. By considering 9 possible colors for each class, there are 1296 possible combinations. The following excel sheet defines an encoding for all possible combinations of these categories. It also includes 2 databases, one for real and one for synthetic objects.

The objects_categories_db.xlsx excel file is found in the ./docs/simulation/objects folder in this repo. You can see a similar but simpler example coming from the YCB database.

  1. Study the categorization of objects
  2. We will use variable names oXXXX for each object in the database, i.e. o0001 := 000000_can. This means that o0001 would correspond to a [light | short | smooth | hollow, rectangular | red] can.
  3. Note: any updates to our objects_categories_db.xlsx database should be pushed via git to this repo: docs/simulation/objects/object_categories_db.xlsx.
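To make the encoding concrete, here is a minimal sketch of a decoder. The level names and digit order below are placeholder assumptions; the authoritative encoding is the one in objects_categories_db.xlsx.

# Hypothetical sketch of the category encoding; the real level names and
# digit order are defined in objects_categories_db.xlsx, not here.
CATEGORY_LEVELS = {
    "weight":    ["light", "medium", "heavy"],
    "longitude": ["short", "medium", "long"],
    "texture":   ["smooth", "rough"],
    "geometry":  ["rectangular", "cylindrical", "spherical", "irregular"],
    "density":   ["hollow", "solid"],
    "color":     ["red", "orange", "yellow", "green", "blue",
                  "purple", "black", "white", "gray"],
}
# 3*3*2*4*2 = 144 classes without color; x9 colors = 1296 combinations.

def decode(code):
    """Map a 6-digit code (one digit per category) to attribute labels."""
    return {cat: levels[int(d)]
            for (cat, levels), d in zip(CATEGORY_LEVELS.items(), code)}

print(decode("000000"))  # {'weight': 'light', 'longitude': 'short', ...}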

03 Generation

Objects can be created from scratch via a 3D modeling tool like SolidWorks or Google Sketchup, or scanned from real objects (i.e. scan2mesh or similar). scan2mesh would be a nice option since it would allow the real object set and the synthetic object set to match, though extracting the correct dynamic properties would be a separate challenge. On the other hand, a program like SolidWorks might make that data more readily accessible.

Below are a number of considerations to follow:

  1. Mujoco Objects: Understand how Mujoco simulates objects. Study the documentation and prepare a detailed presentation. Resources: Mujoco Overview, Mujoco Modeling, Mujoco Models.

    In our project, we will use Robosuite.ai. Robosuite is a 'toolsuite' that builds on top of Mujoco to facilitate DRL programming. Robosuite already includes a number of objects in the toolsuite. We will use our own forked version to extend the existing object list. Below is a description of where you can find the XML files, .stl mesh files, and textures necessary to define objects in Mujoco.

  2. Existing Datasets: There are at least 3 existing datasets that contain files we can use to build Mujoco models.

  3. Automatic XML Generation: Investigate the possibility of coding a script that can generate the XML files on the fly given the existing mesh, texture file, and meta-data from SolidWorks, or at least partially automate the process.

There is an example from the YCB git repo using Python 3: see the create_ycb_sdf.py script. It is designed for Gazebo, but it gives an idea of how to do it.

The generation should not only create the XML file but also check that files are placed in the correct directories. The automation can also handle the 'inclusion' part, which is described next.

Roadmap:

  • Identify the best objects for our categorization,
  • Build Mujoco XML files from mesh, texture, dynamic data (friction, damping, stiffness), and geometric data (distance from center to top/bottom and center-to-edge).
  • Given the large number of objects, it is best to invest in an automated scripting tool that can quickly generate a large number of XML files automatically, as sketched below.
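Below is a rough sketch of what such a generator could look like, using only the Python standard library. The tag layout follows MuJoCo's MJCF format, but the file paths, default friction values, and function name are assumptions, not the final tool. Note that it also creates the output directory, which addresses the directory check mentioned above.

import os
import xml.etree.ElementTree as ET

def generate_object_xml(obj_id, mesh_file, texture_file, out_dir,
                        friction="1 0.005 0.0001"):
    """Emit a minimal MJCF file for one object (paths/values are placeholders)."""
    root = ET.Element("mujoco", model=obj_id)
    asset = ET.SubElement(root, "asset")
    ET.SubElement(asset, "mesh", name=obj_id + "_mesh", file=mesh_file)
    ET.SubElement(asset, "texture", name=obj_id + "_tex", type="2d", file=texture_file)
    ET.SubElement(asset, "material", name=obj_id + "_mat", texture=obj_id + "_tex")
    body = ET.SubElement(ET.SubElement(root, "worldbody"), "body", name="object")
    ET.SubElement(body, "geom", type="mesh", mesh=obj_id + "_mesh",
                  material=obj_id + "_mat", friction=friction)
    os.makedirs(out_dir, exist_ok=True)   # make sure the target directory exists
    out_path = os.path.join(out_dir, obj_id + ".xml")
    ET.ElementTree(root).write(out_path)
    return out_path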

04 Inclusion

Now that the XML files are available, how are they used in the code?

Please refer to the robosuite.ai documentation. The very first thing you will note is that the framework is divided into Modeling APIs and Simulation APIs. In this part of the project, we are only concerned with the Modeling APIs. Via robosuite, we will tell Mujoco every single detail of the simulation environment: what objects to load and where. So you must be very familiar with working with robosuite and its Mujoco interface.

Short Version:

  • Robosuite uses classes to model robots, objects, environments, and manipulation tasks. Each of these has a parent folder. I.e. robosuite/robots; robosuite/models/objects; robosuite/environments/manipulation/tasks.

  • We are interested in modifying and expanding the bin_picking.py task (a copy of pick_place.py) in this project.

  • At the top, we have import commands. There is a specific command at L09 where we import objects:

from robosuite.models.objects import (
    MilkObject,
    BreadObject,
    CerealObject,
    CanObject,
    MasterChefCanObject,
)
from robosuite.models.objects import (
    MilkVisualObject,
    BreadVisualObject,
    CerealVisualObject,
    CanVisualObject,
    MasterChefCanVisualObject,
)

These classes can be imported in this way because in ./robosuite/models/objects/__init__.py there is another import command that makes the classes available from the submodule xml_objects.py:

from .xml_objects import (
    BottleObject,
    CanObject,
    MasterChefCanObject,
    LemonObject,
    MilkObject,
    BreadObject,
    CerealObject,
    SquareNutObject,
    RoundNutObject,
    MilkVisualObject,
    BreadVisualObject,
    CerealVisualObject,
    CanVisualObject,
    MasterChefCanVisualObject,
    PlateWithHoleObject,
    DoorObject,
)

Finally, it is inside xml_objects.py that TWO classes, not ONE, are created for each object. Consider a "Can" object. The two classes for the can are CanObject and CanVisualObject. The former defines the proper physical object; the latter defines a visual fiducial that indicates a position in Mujoco.

class CanObject(MujocoXMLObject):
    """
    Coke can object (used in PickPlace)
    """

    def __init__(self, name):
        super().__init__(xml_path_completion("objects/can.xml"),
                         name=name, joints=[dict(type="free", damping="0.0005")],
                         obj_type="all", duplicate_collision_geoms=True)


class CanVisualObject(MujocoXMLObject):
    """
    Visual fiducial of coke can (used in PickPlace)
    Fiducial objects are not involved in collision physics.
    They provide a point of reference to indicate a position.
    """

    def __init__(self, name):
        super().__init__(xml_path_completion("objects/can-visual.xml"),
                         name=name, joints=None, obj_type="visual", duplicate_collision_geoms=True)

Instead of listing which objects should be imported by hand, we can automate this as follows:

from robosuite.models.objects import * 
mods = dir()                                                              # lists current modules
visualObjs  = [ item for item in mods if 'VisualObject' in item ]         # keep XXXVisualObject
objs        = [ item.replace('Visual','') for item in visualObjs ]        # remove Visual
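Continuing from the snippet above, a follow-up step (hypothetical, but using only standard Python) is to resolve the discovered names back into class objects so they can be instantiated:

import robosuite.models.objects as obj_module

# Resolve each discovered name back to its class object so it can be
# instantiated later (assumes every XXXVisualObject has a matching XXXObject).
visual_classes = [getattr(obj_module, name) for name in visualObjs]
object_classes = [getattr(obj_module, name) for name in objs]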

Challenges:

  • Entering classes manually for 1000+ objects in xml_objects.py is not practical.
  • We will write a script that builds the python code and appends the appropriate class definitions for each new object, as sketched below.
  • Use the ID from the classification (i.e. oXXXX) to facilitate working with all objects.
  • Note: for visual objects we are currently using oXXXXv, but this breaks compatibility with robosuite's nameVisualObject string convention. Consider keeping the same format (look at the preamble of picking.py and _reset_internal()).
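One possible shape for that script is simple string templating. The sketch below is an assumption of how it could work, not the final tool; a matching VisualObject template would be appended the same way.

CLASS_TEMPLATE = '''

class {cls}Object(MujocoXMLObject):
    """Auto-generated physical object class for {obj_id}."""

    def __init__(self, name):
        super().__init__(xml_path_completion("objects/{obj_id}.xml"),
                         name=name, joints=[dict(type="free", damping="0.0005")],
                         obj_type="all", duplicate_collision_geoms=True)
'''

def append_object_class(obj_id, target="robosuite/models/objects/xml_objects.py"):
    """Append a generated class definition for one object ID (e.g. 'o0001')."""
    with open(target, "a") as f:
        f.write(CLASS_TEMPLATE.format(cls=obj_id.capitalize(), obj_id=obj_id))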

Become familiar with the object class declaration method:

A good way to become familiar with how this happens is to debug.

  • Make sure you have set up robosuite and mujoco properly.
  • Use your favorite IDE (e.g. VS Code).
  • Go to ./robosuite/demos/demo_control.py
  • Place a breakpoint at L09 of bin_picking.py
  • Go back to demo_control.py and start a debug session with F5. Step/step-in as necessary.
  • Notice that at L47, import robosuite as suite will automatically call all the __init__.py scripts within those folders, making any imported classes available to the parent script.

05 Arrangement

Now that we can access the objects in code, we must adapt the modeling environment to place them in the world according to some strategy, regardless of whether we import 1, 10, or 30 objects. The code needs to compute the spatial coordinates of all these objects in the simulation environment to know where to place them. I.e., if you have a bin in front of the robot, given a strategy, determine where all the objects need to be placed. Different strategies choose positions differently. For example: (i) Dump: dump them all towards the center and let them fall on top of each other; (ii) Stack: neatly and densely stack objects in an organized way; (iii) Wall: line them up along the wall of the bin.

Warning: it is necessary to consider each object's radius/width so that we do not place two objects in such a way that they penetrate each other at initialization. If penetration happens, the simulator may produce aggressive movements of the objects.
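A simple guard against initial penetration is rejection sampling: accept a candidate XY position only if it is at least the sum of the two objects' radii away from every already-placed object. A minimal sketch (the bin dimensions and function name are placeholders):

import math
import random

def sample_positions(radii, bin_half_x=0.2, bin_half_y=0.25, max_tries=1000):
    """Sample non-penetrating XY positions inside a bin for objects with given radii."""
    placed = []  # list of (x, y, r)
    for r in radii:
        for _ in range(max_tries):
            x = random.uniform(-bin_half_x + r, bin_half_x - r)
            y = random.uniform(-bin_half_y + r, bin_half_y - r)
            # accept only if the candidate clears every placed object
            if all(math.hypot(x - px, y - py) >= r + pr for px, py, pr in placed):
                placed.append((x, y, r))
                break
        else:
            raise RuntimeError("Could not place object without penetration")
    return [(x, y) for x, y, _ in placed]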

Robosuite uses the task classes to establish all specific aspects of the environment. Head over to bin_picking.py.

Read through the docstring. Notice we have two bins (this can be modified as well). Then in the __init__() method we set the center of mass of those bins at some XYZ location.

    def __init__(
        ...
        bin1_pos=(0.1, -0.25, 0.8),
        bin2_pos=(0.1, 0.28, 0.8),
        ...

Task settings are then set and objects are given IDs. Here we will need to update the code to use our classification IDs as the dictionary keys.

If we were to use 30 objects, we would have to write a function that loads the list of our training objects, randomly selects 30 of them, and returns their names to fill in the keys of this dictionary (a sketch follows after the code below).

A couple of lines later, objects are assigned a name. We would use the same IDs as names here.

self.object_to_id = {"milk": 0, "bread": 1, "cereal": 2, "can": 3} 
...
self.obj_names = ["Milk", "Bread", "Cereal", "Can"]
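A hypothetical helper for the sampling function mentioned above could fill these structures as follows (the database list and helper name are assumptions):

import random

def sample_training_objects(db_names, num_obj_to_load=30):
    """Randomly pick num_obj_to_load object IDs (e.g. 'o0001') from the database."""
    return random.sample(db_names, num_obj_to_load)

# e.g. with a hypothetical 500-object training database:
train_db = [f"o{i:04d}" for i in range(1, 501)]
sampled = sample_training_objects(train_db)
object_to_id = {name: i for i, name in enumerate(sampled)}
obj_names = list(sampled)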

Similarly, we need to look throughout the entire class for places where fixed names like these are used and replace them with our updated returned names. Another place where fixed names appear is the _load_model(self) method of the class:

        for vis_obj_cls, obj_name in zip(
                (MilkVisualObject, BreadVisualObject, CerealVisualObject, CanVisualObject),
                self.obj_names,
        ):
            ...

        for obj_cls, obj_name in zip(
                (MilkObject, BreadObject, CerealObject, CanObject),
                self.obj_names,
        ):
            ...
Now we need to tell the code to adjust positions appropriately according to (A) the strategy and (B) the number of objects. The methods to update are:

  • _reset_internal(self)
  • _get_placement_initializer(self)
  • _setup_references(self)

We will assume three main strategies: organized (neat stacks of objects, typically boxes), jumbled (placed randomly anywhere in the bin), and wall (lined up along the wall). We can select a strategy in a fixed manner or randomly. We will use the following global list in picking.py to define these: object_reset_strategy_cases = ['organized', 'jumbled', 'wall', 'random']
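Strategy selection could then be as simple as the following sketch (the function name is an assumption):

import random

object_reset_strategy_cases = ['organized', 'jumbled', 'wall', 'random']

def resolve_reset_strategy(strategy):
    """Return a concrete strategy; 'random' draws one of the other three."""
    if strategy == 'random':
        return random.choice(object_reset_strategy_cases[:-1])
    return strategy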

Num objects to load and which objects to load:

  • num_obj_in_db = total number of objects offline
  • num_obj_to_load = how many objects to place in the bin at one time
  • num_objs = the number of objects we model for the Graph Neural Network. Out of the objects we load, should we model all of them? Not necessarily, if the computation requirements are too high; you may choose to model fewer objects than those available.

Then, whenever we reset the simulation, should we keep the same objects or load new ones? This will be controlled by self.object_randomization = object_randomization.

A final aspect is to load a new set of N training/testing objects every time we finish an experiment (episode) run. We will need to double-check this email and learn from how the QT-OPT paper did it.
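One possible shape for this episode-level reload, reusing the hypothetical sampling helper sketched earlier:

def on_episode_end(self):
    """Hypothetical hook: resample the object set when an episode finishes."""
    if self.object_randomization:
        # sample_training_objects and self.train_db_names are assumptions
        self.obj_names = sample_training_objects(self.train_db_names,
                                                 self.num_obj_to_load)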

Some things still need to be determined here. So far, I have developed the 'jumbled' strategy; you will have to help me extend it with the organized and random strategies.