Generate Your Own Data / Benchmark
In this chapter, we will walk through how to use GenManip to generate your own datasets or custom benchmarks.
You can directly edit USD scenes and write the corresponding Config file, then run GenManip to generate data. The accompanying video demonstrates the full process; we recommend watching it on YouTube in its original quality.
Data Generation vs. Benchmark Generation
Benchmark generation follows the same process as data generation. Typically, we first perform a closed-loop validation based on the current layout to ensure that our Oracle solver is capable of completing the task.
- If the validation succeeds, it means the task is solvable.
- If you are already confident that the task is solvable, you can set mode in the Config file to Benchmark to skip closed-loop validation and directly save the layout.
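The two cases above amount to a small decision rule. The following is a minimal sketch of that rule, not GenManip's actual code: the function name should_save_layout and the oracle_solves callback are hypothetical, and any mode value other than Benchmark is simply treated here as "validate first".

```python
from typing import Callable


def should_save_layout(mode: str, oracle_solves: Callable[[], bool]) -> bool:
    """Decide whether a sampled layout is kept (illustrative only)."""
    if mode == "Benchmark":
        # Closed-loop validation is skipped; the layout is saved directly.
        return True
    # Otherwise keep the layout only if the Oracle solver completes the task.
    return oracle_solves()


# Hypothetical usage: the lambdas stand in for a real closed-loop rollout.
print(should_save_layout("Benchmark", lambda: False))
print(should_save_layout("Validate", lambda: False))
```

In Benchmark mode the callback is never invoked, which is why that mode only makes sense for tasks you already know to be solvable.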
Generating Data
Once you have written the Config file, use the following commands to generate data:
# Iterate over each dictionary in demonstration_configs and generate num_episode data samples
python demogen.py -cfg configs/tasks/xxx.yml
# Iterate over each dictionary in demonstration_configs and render num_episode data samples
python render.py -cfg configs/tasks/xxx.yml
The generated data will be saved in:
saved/demonstrations/<task_name>/
Generating Test Cases
To generate evaluation test cases, run:
# Iterate over each dictionary in evaluation_configs and generate num_test test cases
python demogen.py -cfg configs/tasks/xxx.yml --eval
# Collect benchmark assets and package them into a GenManip Package
python standalone_tools/collect_benchmark_assets.py --asset_path saved/tasks/<task_name> --dataset_id <task_package> --upload_to_huggingface
The generated test cases will be saved in:
saved/tasks/GenManip-Package-{<task_name>}/
Parallel Execution Support
One important feature of GenManip is its ability to run at scale in parallel:
- Whether you are running demogen.py, render.py, or eval.py, you can launch multiple instances across different servers simultaneously.
- These programs use filesystem-based file locks and listdir synchronization to avoid conflicts and maintain consistent progress.
You can launch any number of processes across different servers, as long as they all share the same saved directory.
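To make the idea of file locks plus listdir synchronization concrete, here is a minimal sketch of how several workers could split episodes through a shared directory. It is not GenManip's implementation: the lock file naming, the function names, and the use of exclusive file creation are all assumptions made for illustration, and the guarantees of exclusive creation depend on the shared filesystem in use.

```python
import os
import tempfile
from typing import Optional


def lock_name(episode_id: int) -> str:
    # Hypothetical naming scheme, chosen only for this sketch.
    return f"episode_{episode_id:05d}.lock"


def try_claim(lock_dir: str, episode_id: int) -> bool:
    """Try to claim one episode; True only for the process that wins."""
    path = os.path.join(lock_dir, lock_name(episode_id))
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def claim_next(lock_dir: str, num_episode: int) -> Optional[int]:
    """Return the first unclaimed episode index, or None when all are taken."""
    os.makedirs(lock_dir, exist_ok=True)
    taken = set(os.listdir(lock_dir))  # snapshot of progress made by all workers
    for episode_id in range(num_episode):
        if lock_name(episode_id) in taken:
            continue  # another worker already owns this episode
        if try_claim(lock_dir, episode_id):
            return episode_id
    return None


if __name__ == "__main__":
    shared = tempfile.mkdtemp()  # stands in for the shared saved directory
    while (episode_id := claim_next(shared, 3)) is not None:
        print("claimed episode", episode_id)
```

The listdir snapshot lets a worker skip finished work cheaply, while the exclusive create settles races between workers that read the same snapshot. Because all state lives in the directory, a worker started later on another server picks up wherever the others left off.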