Domain Randomization

The domain_randomization config controls randomization of the cameras, environment, robot base, table, walls, HDR lighting, and more.

Within domain_randomization, specify the camera settings with config_path and type (currently only fixed is supported):

cameras:
  config_path: configs/cameras/fixed_camera.yml
  type: fixed

There are two common camera config styles:

Example GenManip-style camera config:

camera_name:
  clipping_range_max: 10000.0
  clipping_range_min: 0.001
  exists: false
  focal_length: 5.0
  frequency: 60
  horizontal_aperture: 10.0
  name: camera_name
  orientation:
    - -0.0071865058262247095
    - -0.5792785229595211
    - -0.005107428795172445
    - 0.815081996576386
  position:
    - 0.28070902824401855
    - -0.02326910011470318
    - 1.6857764720916748
  prim_path: /camera_name
  resolution:
    - 1280
    - 720
  vertical_aperture: 5.625
  with_distance: true
  with_semantic: true
  with_bbox2d: true
  with_bbox3d: true
  with_motion_vector: true

where:

| Parameter | Description |
| --- | --- |
| clipping_range_max | Far clipping plane of the camera. Keep default. |
| clipping_range_min | Near clipping plane of the camera. Keep default. |
| exists | Whether the camera already exists in the scene. If false, it will be created under /Cameras<prim_path>; otherwise under /World/<scene_uuid><prim_path>. |
| focal_length | Focal length of the camera. |
| frequency | Capture frequency of the camera. Keep default. |
| horizontal_aperture | Horizontal aperture size; together with the focal length, it determines the horizontal field of view (FOV). |
| name | Name of the camera. |
| orientation | Camera orientation relative to its parent node (quaternion [x, y, z, w]). |
| position | Camera position relative to its parent node [x, y, z]. |
| prim_path | Prim path of the camera. |
| resolution | Output image resolution [width, height]. |
| vertical_aperture | Vertical aperture size; together with the focal length, it determines the vertical field of view (FOV). |
| with_distance | Whether to render depth information. |
| with_semantic | Whether to render semantic segmentation masks. |
| with_bbox2d | Whether to render 2D bounding boxes. |
| with_bbox3d | Whether to render 3D bounding boxes. |
| with_motion_vector | Whether to render motion vector information. |
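As a quick sanity check on how focal_length and the aperture values determine the field of view, the sketch below applies the standard pinhole relation FOV = 2 * atan(aperture / (2 * focal_length)) to the example values. The helper name fov_degrees is made up for this illustration and is not part of the framework.

```python
import math


def fov_degrees(aperture: float, focal_length: float) -> float:
    """Pinhole field of view in degrees; both arguments must share the same unit."""
    return math.degrees(2.0 * math.atan(aperture / (2.0 * focal_length)))


# Values taken from the example GenManip-style config.
focal_length = 5.0
horizontal_aperture = 10.0
vertical_aperture = 5.625

horizontal_fov = fov_degrees(horizontal_aperture, focal_length)
vertical_fov = fov_degrees(vertical_aperture, focal_length)

print(f"horizontal FOV: {horizontal_fov:.1f} deg")
print(f"vertical FOV:   {vertical_fov:.1f} deg")

# The aperture ratio should match the resolution ratio (1280 x 720),
# otherwise pixels would not be square.
aperture_ratio = horizontal_aperture / vertical_aperture
resolution_ratio = 1280 / 720
print(f"aperture ratio {aperture_ratio:.4f}, resolution ratio {resolution_ratio:.4f}")
```

With the example values the horizontal FOV is 90 degrees, and the aperture ratio equals the 16:9 resolution ratio.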

You can also provide camera_params ([fx, fy, cx, cy, width, height]) to override the focal length and aperture settings:

camera_params:
  - 387.2585754394531 # fx
  - 386.81646728515625 # fy
  - 324.53442382812 # cx
  - 244.35198974609375 # cy
  - 640 # width
  - 480 # height
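To see how these intrinsics relate to the focal length and aperture settings they override, the sketch below uses the pinhole relation fx = focal_length * width / horizontal_aperture (and likewise for fy). It is only an illustration: how the framework applies camera_params internally is not described here, the function names are hypothetical, and the principal point (cx, cy) is ignored.

```python
import math


def intrinsics_to_fov(fx, fy, width, height):
    """Field of view in degrees implied by pinhole intrinsics given in pixels."""
    hfov = math.degrees(2.0 * math.atan(width / (2.0 * fx)))
    vfov = math.degrees(2.0 * math.atan(height / (2.0 * fy)))
    return hfov, vfov


def equivalent_aperture(fx, fy, width, height, focal_length):
    """Aperture sizes that reproduce fx and fy for a chosen focal length.

    Derived from fx = focal_length * width / horizontal_aperture.
    The result is in the same unit as focal_length.
    """
    return focal_length * width / fx, focal_length * height / fy


# camera_params from the example: [fx, fy, cx, cy, width, height]
camera_params = [
    387.2585754394531,
    386.81646728515625,
    324.53442382812,
    244.35198974609375,
    640,
    480,
]
fx, fy, cx, cy, width, height = camera_params

hfov, vfov = intrinsics_to_fov(fx, fy, width, height)
# focal_length=5.0 is an arbitrary choice; only the aperture/focal ratio matters.
h_aperture, v_aperture = equivalent_aperture(fx, fy, width, height, focal_length=5.0)

print(f"FOV: {hfov:.1f} x {vfov:.1f} deg")
print(f"equivalent apertures at focal_length=5.0: {h_aperture:.3f}, {v_aperture:.3f}")
```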

Example Simbox-style camera config:

obs_camera:
  exists: false
  frequency: 60
  name: obs_camera
  camera_params:
    - 605.451
    - 605.137
    - 320.778
    - 255.816
  orientation: [0.83980864, 0.4807688, -0.10643996, -0.2285899]
  position:
    - -0.54772
    - -0.79264
    - 1.64032
  prim_path: /obs_camera
  resolution:
    - 640
    - 480
  pixel_size: 3.0
  f_number: 2.0
  focus_distance: 0.5
  with_distance: false
  with_semantic: false
  with_bbox2d: false
  with_bbox3d: false
  with_motion_vector: false
  camera_axes: usd

where:

| Parameter | Description |
| --- | --- |
| exists | Whether the camera already exists in the scene. If false, it will be created under /Cameras<prim_path>; otherwise under /World/<scene_uuid><prim_path>. |
| frequency | Capture frequency of the camera. Keep default. |
| name | Name of the camera. |
| camera_params | Intrinsic parameters of the camera [fx, fy, cx, cy]. |
| orientation | Camera orientation relative to its parent node (quaternion, scalar-first [w, x, y, z]). |
| position | Camera position relative to its parent node [x, y, z]. |
| prim_path | Prim path of the camera in the USD scene. |
| resolution | Output image resolution [width, height]. |
| pixel_size | Pixel size. Keep default. |
| f_number | Aperture number (f-stop). Keep default. |
| focus_distance | Focus distance. Keep default. |
| with_distance | Whether to render a depth map. |
| with_semantic | Whether to render semantic masks. |
| with_bbox2d | Whether to render 2D bounding boxes. |
| with_bbox3d | Whether to render 3D bounding boxes. |
| with_motion_vector | Whether to render motion vectors. |
| camera_axes | Camera coordinate convention (usually usd). Keep default. |
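The two config styles order the orientation quaternion differently: the GenManip-style example uses [x, y, z, w], while the Simbox-style example is scalar-first [w, x, y, z]. When moving a pose from one style to the other, the components must be reordered. The helpers below are a minimal sketch with made-up names; they are not framework functions.

```python
import math


def xyzw_to_wxyz(q):
    """Reorder a scalar-last quaternion [x, y, z, w] to scalar-first [w, x, y, z]."""
    x, y, z, w = q
    return [w, x, y, z]


def wxyz_to_xyzw(q):
    """Reorder a scalar-first quaternion [w, x, y, z] to scalar-last [x, y, z, w]."""
    w, x, y, z = q
    return [x, y, z, w]


def quaternion_norm(q):
    """Euclidean norm; a valid rotation quaternion has norm close to 1."""
    return math.sqrt(sum(c * c for c in q))


# Orientation from the GenManip-style example, stored as [x, y, z, w].
genmanip_q = [
    -0.0071865058262247095,
    -0.5792785229595211,
    -0.005107428795172445,
    0.815081996576386,
]
# Orientation from the Simbox-style example, stored as [w, x, y, z].
simbox_q = [0.83980864, 0.4807688, -0.10643996, -0.2285899]

genmanip_as_simbox = xyzw_to_wxyz(genmanip_q)
simbox_as_genmanip = wxyz_to_xyzw(simbox_q)

print("GenManip orientation in scalar-first order:", genmanip_as_simbox)
print("Simbox orientation in scalar-last order:  ", simbox_as_genmanip)
print("norms:", quaternion_norm(genmanip_q), quaternion_norm(simbox_q))
```

Checking the norm after a conversion is a cheap way to catch a truncated or mistyped quaternion, although it cannot detect a wrong component order.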

In most cases you can reuse one of the provided camera configs (e.g., fixed_camera_s2r_3L-align_twoObs.yml for Franka + Panda).

Random environment options example:

random_environment:
  has_wall: false
  hdr: false
  robot_base_position: false
  robot_eepose: false
  table_texture: false
  table_type: false
  wall_texture: false
  camera_randomization:
    realsense:
      max_translation_noise: 0.02
      max_orientation_noise: 2.5
    obs_camera:
      max_translation_noise: 0.05
      max_orientation_noise: 10.0

| Parameter | Description |
| --- | --- |
| has_wall | Whether to include surrounding walls. |
| hdr | Whether to randomize the dome light using HDR files (located in saved/assets/miscs/hdrs/*.exr). |
| robot_base_position | Whether to randomize the robot arm’s base position. |
| robot_eepose | Whether to randomize the robot arm’s end-effector pose. |
| table_texture | Whether to randomize the table texture (textures located in saved/assets/textures/*.jpg). |
| table_type | Whether to randomize the table model (InternUtopia tables located in saved/assets/object_usds/grutopia_usd/Table/tabl/*.usd). When table_type is enabled, table_texture will use materials from saved/assets/object_usds/grutopia_usd/Table/Materials/*.mdl. |
| wall_texture | Whether to randomize wall textures. |
| camera_randomization | Per-camera randomization parameters (max_translation_noise in meters, max_orientation_noise in degrees). |
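To give a feel for what the camera_randomization bounds mean, the sketch below draws one pose perturbation per camera. The sampling scheme is an assumption made for illustration (uniform per-axis translation, and a rotation about a random axis with a uniformly drawn angle); the framework's actual distribution is not documented here, and sample_camera_noise is a hypothetical helper.

```python
import math
import random


def sample_camera_noise(max_translation_noise, max_orientation_noise, rng):
    """Sample a (translation offset, rotation quaternion) perturbation.

    Assumed scheme, not the framework's documented behavior:
    - each translation axis is uniform in [-max_translation_noise, +max_translation_noise] (meters)
    - the rotation angle is uniform in [0, max_orientation_noise] (degrees) about a random axis
    The quaternion is returned scalar-first as [w, x, y, z].
    """
    translation = [
        rng.uniform(-max_translation_noise, max_translation_noise) for _ in range(3)
    ]

    # Draw a random unit axis from an isotropic Gaussian.
    while True:
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(a * a for a in axis))
        if length > 1e-9:
            break
    axis = [a / length for a in axis]

    half_angle = math.radians(rng.uniform(0.0, max_orientation_noise)) / 2.0
    s = math.sin(half_angle)
    quaternion = [math.cos(half_angle), axis[0] * s, axis[1] * s, axis[2] * s]
    return translation, quaternion


# Bounds copied from the example config.
camera_randomization = {
    "realsense": {"max_translation_noise": 0.02, "max_orientation_noise": 2.5},
    "obs_camera": {"max_translation_noise": 0.05, "max_orientation_noise": 10.0},
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
for camera_name, bounds in camera_randomization.items():
    offset, rotation = sample_camera_noise(
        bounds["max_translation_noise"], bounds["max_orientation_noise"], rng
    )
    print(camera_name, "offset [m]:", offset, "rotation [w, x, y, z]:", rotation)
```

Note that with per-axis sampling the total displacement can exceed max_translation_noise by up to a factor of sqrt(3); a scheme that bounds the displacement norm instead would behave differently.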

object_data_path appears in scaling tasks (e.g., object_data_path: objaverse_annotation_refined_container_selection) and points to a pickle file of object annotations (e.g., assets/objects/objaverse_annotation_refined_container_selection.pkl). Each annotation entry must include caption and scale; the remaining fields are optional and used by filter rules. Example script to build a minimal pkl:

import pickle

# `data` is the source annotation dict (object key -> annotation), loaded beforehand.
new_data = {}
for key in data.keys():
    new_data[key] = {}
    new_data[key]["caption"] = data[key]["caption"]  # required
    new_data[key]["scale"] = data[key]["scale"]  # [min, max] in meters, required
    # optional fields used by filter rules
    new_data[key]["color"] = []
    new_data[key]["materials"] = []
    new_data[key]["shape"] = []
    new_data[key]["category_list"] = []  # e.g., ['food', 'fruit', 'banana']
    new_data[key]["can_grasp"] = True
    new_data[key]["is_container"] = True

# Use a context manager so the file is flushed and closed after writing.
with open("assets/objects/objaverse_annotation_refined_pnp.pickle", "wb") as f:
    pickle.dump(new_data, f)
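Before pointing object_data_path at a new file, it can help to check that every entry carries the required fields. The sketch below is one way to do that; validate_annotations is a hypothetical helper, the sample entries are invented, and the check that scale is an ordered [min, max] pair is an assumption based on the comment in the script above.

```python
import pickle

REQUIRED_FIELDS = ("caption", "scale")


def validate_annotations(annotations):
    """Return a list of human-readable problems; an empty list means the dict looks usable."""
    problems = []
    for key, entry in annotations.items():
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{key}: missing required field '{field}'")
        scale = entry.get("scale")
        if scale is not None:
            is_pair = isinstance(scale, (list, tuple)) and len(scale) == 2
            if not is_pair or scale[0] > scale[1]:
                problems.append(f"{key}: scale should be [min, max] in meters, got {scale!r}")
    return problems


# Invented sample data; real keys and captions come from your own annotations.
sample = {
    "object_a": {"caption": "a yellow banana", "scale": [0.15, 0.20]},
    "object_b": {"caption": "a ceramic bowl"},  # scale is missing on purpose
}

# Round-trip through pickle in memory to mimic saving and loading the file.
loaded = pickle.loads(pickle.dumps(sample))
problems = validate_annotations(loaded)
for problem in problems:
    print(problem)
```

To check a file on disk, replace the in-memory round trip with pickle.load on the opened file.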

rewrite_instruction toggles automatic rewriting of the instruction string. If true, the instruction is generated automatically from a template such as:

put {obj10_name} to the {position0} of {obj20_name}, put {obj11_name} to the {position1} of {obj21_name} and put {obj12_name} to the {position2} of {obj22_name}
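To illustrate how such a template expands, the sketch below fills the placeholders with Python's str.format. This is only an assumption about the substitution mechanism, and the object names and positions are invented for the example; the framework derives the real values from the scene.

```python
import string

template = (
    "put {obj10_name} to the {position0} of {obj20_name}, "
    "put {obj11_name} to the {position1} of {obj21_name} and "
    "put {obj12_name} to the {position2} of {obj22_name}"
)

# List the placeholder names that the template expects.
placeholders = [
    field for _, field, _, _ in string.Formatter().parse(template) if field
]
print(placeholders)

# Invented values for illustration only.
values = {
    "obj10_name": "banana", "position0": "left", "obj20_name": "plate",
    "obj11_name": "apple", "position1": "top", "obj21_name": "bowl",
    "obj12_name": "cup", "position2": "right", "obj22_name": "tray",
}

instruction = template.format(**values)
print(instruction)
```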