maniskill_dataset_converted_externally_to_rlds

Description:

Simulated Franka performing various manipulation tasks.

Homepage: https://github.com/haosulab/ManiSkill2
Source code: tfds.robotics.rtx.ManiskillDatasetConvertedExternallyToRlds
Versions:
- 0.1.0 (default): Initial release.
Download size: Unknown size
Dataset size: 151.05 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	30,213

Feature structure:

FeaturesDict({
    'episode_metadata': FeaturesDict({
        'episode_id': Text(shape=(), dtype=string),
        'file_path': Text(shape=(), dtype=string),
    }),
    'steps': Dataset({
        'action': Tensor(shape=(7,), dtype=float32, description=Robot action, consists of [3x end effector delta target position, 3x end effector delta target orientation in axis-angle format, 1x gripper target position (mimic for two fingers)]. For delta target position, an action of -1 maps to a robot movement of -0.1m, and action of 1 maps to a movement of 0.1m. For delta target orientation, its encoded angle is mapped to a range of [-0.1rad, 0.1rad] for robot execution. For example, an action of [1, 0, 0] means rotating along the x-axis by 0.1 rad. For gripper target position, an action of -1 means close, and an action of 1 means open.),
        'discount': Scalar(shape=(), dtype=float32, description=Discount if provided, default to 1.),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'language_embedding': Tensor(shape=(512,), dtype=float32, description=Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5),
        'language_instruction': Text(shape=(), dtype=string),
        'observation': FeaturesDict({
            'base_pose': Tensor(shape=(7,), dtype=float32, description=Robot base pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. The first three dimensions represent xyz positions in meters. The last four dimensions are the quaternion representation of rotation.),
            'depth': Image(shape=(256, 256, 1), dtype=uint16, description=Main camera Depth observation. Divide the depth value by 2**10 to get the depth in meters.),
            'image': Image(shape=(256, 256, 3), dtype=uint8, description=Main camera RGB observation.),
            'main_camera_cam2world_gl': Tensor(shape=(4, 4), dtype=float32, description=Transformation from the main camera frame to the world frame in OpenGL/Blender convention.),
            'main_camera_extrinsic_cv': Tensor(shape=(4, 4), dtype=float32, description=Main camera extrinsic matrix in OpenCV convention.),
            'main_camera_intrinsic_cv': Tensor(shape=(3, 3), dtype=float32, description=Main camera intrinsic matrix in OpenCV convention.),
            'state': Tensor(shape=(18,), dtype=float32, description=Robot state, consists of [7x robot joint angles, 2x gripper position, 7x robot joint angle velocity, 2x gripper velocity]. Angle in radians, position in meters.),
            'target_object_or_part_final_pose': Tensor(shape=(7,), dtype=float32, description=The final pose towards which the target object or object part needs to be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. An episode is considered successful if the target object or object part is manipulated to this pose.),
            'target_object_or_part_final_pose_valid': Tensor(shape=(7,), dtype=uint8, description=Whether each dimension of target_object_or_part_final_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_final_pose). "Invalid" means that there is no success check on the final pose of target object or object part in the corresponding dimensions.),
            'target_object_or_part_initial_pose': Tensor(shape=(7,), dtype=float32, description=The initial pose of the target object or object part to be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. This variable is used to specify the target object or object part when multiple objects or object parts are present in an environment),
            'target_object_or_part_initial_pose_valid': Tensor(shape=(7,), dtype=uint8, description=Whether each dimension of target_object_or_part_initial_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_initial_pose).),
            'tcp_pose': Tensor(shape=(7,), dtype=float32, description=Robot tool-center-point pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. Tool-center-point is the center between the two gripper fingers.),
            'wrist_camera_cam2world_gl': Tensor(shape=(4, 4), dtype=float32, description=Transformation from the wrist camera frame to the world frame in OpenGL/Blender convention.),
            'wrist_camera_extrinsic_cv': Tensor(shape=(4, 4), dtype=float32, description=Wrist camera extrinsic matrix in OpenCV convention.),
            'wrist_camera_intrinsic_cv': Tensor(shape=(3, 3), dtype=float32, description=Wrist camera intrinsic matrix in OpenCV convention.),
            'wrist_depth': Image(shape=(256, 256, 1), dtype=uint16, description=Wrist camera Depth observation. Divide the depth value by 2**10 to get the depth in meters.),
            'wrist_image': Image(shape=(256, 256, 3), dtype=uint8, description=Wrist camera RGB observation.),
        }),
        'reward': Scalar(shape=(), dtype=float32, description=Reward if provided, 1 on final step for demos.),
    }),
})
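
The scaling rule in the `action` description can be written out as a small helper. This is a minimal sketch based only on that description, not ManiSkill2's own controller code: `decode_action` and `DecodedAction` are hypothetical names, the inputs are assumed to lie in [-1, 1], and clipping the rotation norm to 1 before scaling is an assumption the description does not state.

```python
import math
from typing import NamedTuple, Sequence, Tuple

# Scaling constants taken from the 'action' feature description.
MAX_DELTA_POSITION_M = 0.1    # an action of +/-1 maps to +/-0.1 m
MAX_DELTA_ROTATION_RAD = 0.1  # encoded angle maps to [-0.1, 0.1] rad


class DecodedAction(NamedTuple):
    """Hypothetical container for an action expressed in physical units."""
    delta_position_m: Tuple[float, float, float]
    rotation_axis: Tuple[float, float, float]
    rotation_angle_rad: float
    gripper: float  # -1 means close, 1 means open


def decode_action(action: Sequence[float]) -> DecodedAction:
    """Convert a normalized 7-D action into meters, radians and a gripper command."""
    if len(action) != 7:
        raise ValueError(f"expected 7 action values, got {len(action)}")

    # First three entries: end-effector delta target position.
    dx, dy, dz = (MAX_DELTA_POSITION_M * float(a) for a in action[0:3])

    # Next three entries: delta target orientation in axis-angle format.
    # The vector direction is the axis and its norm is the encoded angle.
    rx, ry, rz = (float(a) for a in action[3:6])
    norm = math.sqrt(rx * rx + ry * ry + rz * rz)
    axis = (rx / norm, ry / norm, rz / norm) if norm > 0.0 else (0.0, 0.0, 0.0)
    # Assumption: encoded angles above 1 are clipped before scaling.
    angle = MAX_DELTA_ROTATION_RAD * min(norm, 1.0)

    # Last entry: gripper target position, passed through unchanged.
    return DecodedAction((dx, dy, dz), axis, angle, float(action[6]))
```

For instance, `decode_action([1, 0, 0, 1, 0, 0, -1])` describes a 0.1 m move along x, a 0.1 rad rotation about the x-axis, and a gripper close command.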

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
episode_metadata	FeaturesDict
episode_metadata/episode_id	Text		string	Episode ID.
episode_metadata/file_path	Text		string	Path to the original data file.
steps	Dataset
steps/action	Tensor	(7,)	float32	Robot action, consists of [3x end effector delta target position, 3x end effector delta target orientation in axis-angle format, 1x gripper target position (mimic for two fingers)]. For delta target position, an action of -1 maps to a robot movement of -0.1m, and an action of 1 maps to a movement of 0.1m. For delta target orientation, its encoded angle is mapped to a range of [-0.1rad, 0.1rad] for robot execution. For example, an action of [1, 0, 0] means rotating along the x-axis by 0.1 rad. For gripper target position, an action of -1 means close, and an action of 1 means open.
steps/discount	Scalar		float32	Discount if provided, default to 1.
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/language_embedding	Tensor	(512,)	float32	Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5
steps/language_instruction	Text		string	Language instruction.
steps/observation	FeaturesDict
steps/observation/base_pose	Tensor	(7,)	float32	Robot base pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. The first three dimensions represent xyz positions in meters. The last four dimensions are the quaternion representation of rotation.
steps/observation/depth	Image	(256, 256, 1)	uint16	Main camera depth observation. Divide the depth value by 2**10 to get the depth in meters.
steps/observation/image	Image	(256, 256, 3)	uint8	Main camera RGB observation.
steps/observation/main_camera_cam2world_gl	Tensor	(4, 4)	float32	Transformation from the main camera frame to the world frame in OpenGL/Blender convention.
steps/observation/main_camera_extrinsic_cv	Tensor	(4, 4)	float32	Main camera extrinsic matrix in OpenCV convention.
steps/observation/main_camera_intrinsic_cv	Tensor	(3, 3)	float32	Main camera intrinsic matrix in OpenCV convention.
steps/observation/state	Tensor	(18,)	float32	Robot state, consists of [7x robot joint angles, 2x gripper position, 7x robot joint angle velocity, 2x gripper velocity]. Angle in radians, position in meters.
steps/observation/target_object_or_part_final_pose	Tensor	(7,)	float32	The final pose towards which the target object or object part needs to be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. An episode is considered successful if the target object or object part is manipulated to this pose.
steps/observation/target_object_or_part_final_pose_valid	Tensor	(7,)	uint8	Whether each dimension of target_object_or_part_final_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_final_pose). "Invalid" means that there is no success check on the final pose of the target object or object part in the corresponding dimensions.
steps/observation/target_object_or_part_initial_pose	Tensor	(7,)	float32	The initial pose of the target object or object part to be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. This variable is used to specify the target object or object part when multiple objects or object parts are present in an environment.
steps/observation/target_object_or_part_initial_pose_valid	Tensor	(7,)	uint8	Whether each dimension of target_object_or_part_initial_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_initial_pose).
steps/observation/tcp_pose	Tensor	(7,)	float32	Robot tool-center-point pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. Tool-center-point is the center between the two gripper fingers.
steps/observation/wrist_camera_cam2world_gl	Tensor	(4, 4)	float32	Transformation from the wrist camera frame to the world frame in OpenGL/Blender convention.
steps/observation/wrist_camera_extrinsic_cv	Tensor	(4, 4)	float32	Wrist camera extrinsic matrix in OpenCV convention.
steps/observation/wrist_camera_intrinsic_cv	Tensor	(3, 3)	float32	Wrist camera intrinsic matrix in OpenCV convention.
steps/observation/wrist_depth	Image	(256, 256, 1)	uint16	Wrist camera depth observation. Divide the depth value by 2**10 to get the depth in meters.
steps/observation/wrist_image	Image	(256, 256, 3)	uint8	Wrist camera RGB observation.
steps/reward	Scalar		float32	Reward if provided, 1 on final step for demos.
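
The depth and intrinsic-matrix descriptions can be combined to recover a 3D point from a single depth pixel. The sketch below is illustrative rather than part of the dataset tooling: `depth_to_meters` and `backproject_pixel` are hypothetical names, and it assumes a zero-skew pinhole intrinsic matrix of the usual form [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] and that the stored depth is measured along the camera's optical axis. Neither assumption is confirmed by the feature descriptions.

```python
from typing import Sequence, Tuple

# From the depth feature description: divide raw values by 2**10 to get meters.
DEPTH_UNITS_PER_METER = 2 ** 10


def depth_to_meters(raw_depth: int) -> float:
    """Convert one raw uint16 depth value to meters."""
    return raw_depth / DEPTH_UNITS_PER_METER


def backproject_pixel(
    u: float,
    v: float,
    raw_depth: int,
    intrinsic: Sequence[Sequence[float]],
) -> Tuple[float, float, float]:
    """Map pixel (u, v) with a raw depth value to a point in the camera frame.

    The result uses the OpenCV convention (x right, y down, z forward).
    Assumes zero skew and depth measured along the optical axis.
    """
    fx, fy = intrinsic[0][0], intrinsic[1][1]  # focal lengths in pixels
    cx, cy = intrinsic[0][2], intrinsic[1][2]  # principal point in pixels
    z = depth_to_meters(raw_depth)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)
```

The same helpers apply to `wrist_depth` with `wrist_camera_intrinsic_cv`. Points obtained this way are in the camera frame; moving them to the world frame would additionally require the corresponding camera pose matrix.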

Supervised keys (see the as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):

Citation:

@inproceedings{gu2023maniskill2,
  title={ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills},
  author={Gu, Jiayuan and Xiang, Fanbo and Li, Xuanlin and Ling, Zhan and Liu, Xiqiang and Mu, Tongzhou and Tang, Yihe and Tao, Stone and Wei, Xinyue and Yao, Yunchao and Yuan, Xiaodi and Xie, Pengwei and Huang, Zhiao and Chen, Rui and Su, Hao},
  booktitle={International Conference on Learning Representations},
  year={2023}
}