maniskill_dataset_converted_externally_to_rlds

Description:

Simulated Franka performing various manipulation tasks

Homepage: https://github.com/haosulab/ManiSkill2
Source code: tfds.robotics.rtx.ManiskillDatasetConvertedExternallyToRlds
Versions:
- 0.1.0 (default): Initial release.
Download size: Unknown size
Dataset size: 151.05 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	30,213
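
As a minimal sketch of how the single 'train' split might be read, the snippet below wraps `tfds.load` in a generator. It assumes `tensorflow_datasets` is installed and that the dataset has already been prepared locally (roughly 151 GiB); the helper name `iter_episode_steps`, the default `train[:1]` slice, and the `data_dir` argument are illustrative choices, not part of this catalog entry.

```python
DATASET_NAME = "maniskill_dataset_converted_externally_to_rlds"


def iter_episode_steps(split="train[:1]", data_dir=None):
    """Yield (episode_id, step) pairs, with each step converted to NumPy.

    Hypothetical helper: assumes tensorflow_datasets is installed and the
    dataset is available under `data_dir` (or the default TFDS directory).
    """
    # Imported lazily so this module can be imported without TFDS installed.
    import tensorflow_datasets as tfds

    ds = tfds.load(DATASET_NAME, split=split, data_dir=data_dir)
    for episode in ds:
        episode_id = episode["episode_metadata"]["episode_id"].numpy().decode()
        # 'steps' is a nested tf.data.Dataset; as_numpy converts each step.
        for step in tfds.as_numpy(episode["steps"]):
            yield episode_id, step
```

Slicing the split (for example `train[:1]`) keeps a first experiment small; iterating the full split walks all 30,213 episodes.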

Feature structure:

FeaturesDict({
    'episode_metadata': FeaturesDict({
        'episode_id': Text(shape=(), dtype=string),
        'file_path': Text(shape=(), dtype=string),
    }),
    'steps': Dataset({
        'action': Tensor(shape=(7,), dtype=float32, description=Robot action, consists of [3x end effector delta target position, 3x end effector delta target orientation in axis-angle format, 1x gripper target position (mimic for two fingers)]. For delta target position, an action of -1 maps to a robot movement of -0.1m, and action of 1 maps to a movement of 0.1m. For delta target orientation, its encoded angle is mapped to a range of [-0.1rad, 0.1rad] for robot execution. For example, an action of [1, 0, 0] means rotating along the x-axis by 0.1 rad. For gripper target position, an action of -1 means close, and an action of 1 means open.),
        'discount': Scalar(shape=(), dtype=float32, description=Discount if provided, default to 1.),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'language_embedding': Tensor(shape=(512,), dtype=float32, description=Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5),
        'language_instruction': Text(shape=(), dtype=string),
        'observation': FeaturesDict({
            'base_pose': Tensor(shape=(7,), dtype=float32, description=Robot base pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. The first three dimensions represent xyz positions in meters. The last four dimensions are the quaternion representation of rotation.),
            'depth': Image(shape=(256, 256, 1), dtype=uint16, description=Main camera Depth observation. Divide the depth value by 2**10 to get the depth in meters.),
            'image': Image(shape=(256, 256, 3), dtype=uint8, description=Main camera RGB observation.),
            'main_camera_cam2world_gl': Tensor(shape=(4, 4), dtype=float32, description=Transformation from the main camera frame to the world frame in OpenGL/Blender convention.),
            'main_camera_extrinsic_cv': Tensor(shape=(4, 4), dtype=float32, description=Main camera extrinsic matrix in OpenCV convention.),
            'main_camera_intrinsic_cv': Tensor(shape=(3, 3), dtype=float32, description=Main camera intrinsic matrix in OpenCV convention.),
            'state': Tensor(shape=(18,), dtype=float32, description=Robot state, consists of [7x robot joint angles, 2x gripper position, 7x robot joint angle velocity, 2x gripper velocity]. Angle in radians, position in meters.),
            'target_object_or_part_final_pose': Tensor(shape=(7,), dtype=float32, description=The final pose towards which the target object or object part needs be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. An episode is considered successful if the target object or object part is manipulated to this pose.),
            'target_object_or_part_final_pose_valid': Tensor(shape=(7,), dtype=uint8, description=Whether each dimension of target_object_or_part_final_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_final_pose). "Invalid" means that there is no success check on the final pose of target object or object part in the corresponding dimensions.),
            'target_object_or_part_initial_pose': Tensor(shape=(7,), dtype=float32, description=The initial pose of the target object or object part to be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. This variable is used to specify the target object or object part when multiple objects or object parts are present in an environment),
            'target_object_or_part_initial_pose_valid': Tensor(shape=(7,), dtype=uint8, description=Whether each dimension of target_object_or_part_initial_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_initial_pose).),
            'tcp_pose': Tensor(shape=(7,), dtype=float32, description=Robot tool-center-point pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. Tool-center-point is the center between the two gripper fingers.),
            'wrist_camera_cam2world_gl': Tensor(shape=(4, 4), dtype=float32, description=Transformation from the wrist camera frame to the world frame in OpenGL/Blender convention.),
            'wrist_camera_extrinsic_cv': Tensor(shape=(4, 4), dtype=float32, description=Wrist camera extrinsic matrix in OpenCV convention.),
            'wrist_camera_intrinsic_cv': Tensor(shape=(3, 3), dtype=float32, description=Wrist camera intrinsic matrix in OpenCV convention.),
            'wrist_depth': Image(shape=(256, 256, 1), dtype=uint16, description=Wrist camera Depth observation. Divide the depth value by 2**10 to get the depth in meters.),
            'wrist_image': Image(shape=(256, 256, 3), dtype=uint8, description=Wrist camera RGB observation.),
        }),
        'reward': Scalar(shape=(), dtype=float32, description=Reward if provided, 1 on final step for demos.),
    }),
})
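
The 'action' description above implies a simple mapping from normalized values to physical units. The following is a rough sketch of that mapping in plain Python. The function name `decode_action` and the returned dictionary keys are hypothetical, and clipping encoded angles above 1 is an assumption that the feature description does not state.

```python
import math

POS_SCALE_M = 0.1    # an action of +/-1 maps to a movement of +/-0.1 m
ROT_SCALE_RAD = 0.1  # an encoded angle of 1 maps to a rotation of 0.1 rad


def decode_action(action):
    """Split a 7-D action into position delta, rotation, and gripper target."""
    if len(action) != 7:
        raise ValueError("expected a 7-dimensional action")

    # First three entries: end effector delta target position.
    delta_pos_m = [POS_SCALE_M * a for a in action[:3]]

    # Next three entries: axis-angle vector whose norm is the encoded angle.
    rot_vec = list(action[3:6])
    encoded_angle = math.sqrt(sum(r * r for r in rot_vec))
    if encoded_angle > 0.0:
        axis = [r / encoded_angle for r in rot_vec]
    else:
        axis = [0.0, 0.0, 0.0]  # no rotation requested

    # ASSUMPTION: encoded angles larger than 1 are clipped before scaling.
    angle_rad = ROT_SCALE_RAD * min(encoded_angle, 1.0)

    return {
        "delta_pos_m": delta_pos_m,
        "rot_axis": axis,
        "rot_angle_rad": angle_rad,
        "gripper_target": action[6],  # -1 means close, 1 means open
    }
```

For instance, an action whose rotation part is [1, 0, 0] decodes to a 0.1 rad rotation about the x-axis, matching the example in the feature description.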

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
episode_metadata	FeaturesDict
episode_metadata/episode_id	Text		string	Episode ID.
episode_metadata/file_path	Text		string	Path to the original data file.
steps	Dataset
steps/action	Tensor	(7,)	float32	Robot action, consists of [3x end effector delta target position, 3x end effector delta target orientation in axis-angle format, 1x gripper target position (mimic for two fingers)]. For delta target position, an action of -1 maps to a robot movement of -0.1m, and an action of 1 maps to a movement of 0.1m. For delta target orientation, its encoded angle is mapped to a range of [-0.1rad, 0.1rad] for robot execution. For example, an action of [1, 0, 0] means rotating along the x-axis by 0.1 rad. For gripper target position, an action of -1 means close, and an action of 1 means open.
steps/discount	Scalar		float32	Discount if provided, default to 1.
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/language_embedding	Tensor	(512,)	float32	Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5
steps/language_instruction	Text		string	Language Instruction.
steps/observation	FeaturesDict
steps/observation/base_pose	Tensor	(7,)	float32	Robot base pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. The first three dimensions represent xyz positions in meters. The last four dimensions are the quaternion representation of rotation.
steps/observation/depth	Image	(256, 256, 1)	uint16	Main camera Depth observation. Divide the depth value by 2**10 to get the depth in meters.
steps/observation/image	Image	(256, 256, 3)	uint8	Main camera RGB observation.
steps/observation/main_camera_cam2world_gl	Tensor	(4, 4)	float32	Transformation from the main camera frame to the world frame in OpenGL/Blender convention.
steps/observation/main_camera_extrinsic_cv	Tensor	(4, 4)	float32	Main camera extrinsic matrix in OpenCV convention.
steps/observation/main_camera_intrinsic_cv	Tensor	(3, 3)	float32	Main camera intrinsic matrix in OpenCV convention.
steps/observation/state	Tensor	(18,)	float32	Robot state, consists of [7x robot joint angles, 2x gripper position, 7x robot joint angle velocity, 2x gripper velocity]. Angle in radians, position in meters.
steps/observation/target_object_or_part_final_pose	Tensor	(7,)	float32	The final pose towards which the target object or object part needs be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. An episode is considered successful if the target object or object part is manipulated to this pose.
steps/observation/target_object_or_part_final_pose_valid	Tensor	(7,)	uint8	Whether each dimension of target_object_or_part_final_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_final_pose). "Invalid" means that there is no success check on the final pose of target object or object part in the corresponding dimensions.
steps/observation/target_object_or_part_initial_pose	Tensor	(7,)	float32	The initial pose of the target object or object part to be manipulated, consists of [x, y, z, qw, qx, qy, qz]. The pose is represented in the world frame. This variable is used to specify the target object or object part when multiple objects or object parts are present in an environment.
steps/observation/target_object_or_part_initial_pose_valid	Tensor	(7,)	uint8	Whether each dimension of target_object_or_part_initial_pose is valid in an environment. 1 = valid; 0 = invalid (in which case one should ignore the corresponding dimensions in target_object_or_part_initial_pose).
steps/observation/tcp_pose	Tensor	(7,)	float32	Robot tool-center-point pose in the world frame, consists of [x, y, z, qw, qx, qy, qz]. Tool-center-point is the center between the two gripper fingers.
steps/observation/wrist_camera_cam2world_gl	Tensor	(4, 4)	float32	Transformation from the wrist camera frame to the world frame in OpenGL/Blender convention.
steps/observation/wrist_camera_extrinsic_cv	Tensor	(4, 4)	float32	Wrist camera extrinsic matrix in OpenCV convention.
steps/observation/wrist_camera_intrinsic_cv	Tensor	(3, 3)	float32	Wrist camera intrinsic matrix in OpenCV convention.
steps/observation/wrist_depth	Image	(256, 256, 1)	uint16	Wrist camera Depth observation. Divide the depth value by 2**10 to get the depth in meters.
steps/observation/wrist_image	Image	(256, 256, 3)	uint8	Wrist camera RGB observation.
steps/reward	Scalar		float32	Reward if provided, 1 on final step for demos.
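
Several observation fields in the table need a small amount of decoding before use. The helpers below are an illustrative sketch in plain Python: converting raw depth to meters, splitting the 18-D state by the layout given in the table, and dropping pose dimensions flagged as invalid. All function names and dictionary keys are hypothetical and not part of the dataset API.

```python
DEPTH_SCALE = 2 ** 10  # raw uint16 depth units per meter
POSE_FIELDS = ("x", "y", "z", "qw", "qx", "qy", "qz")


def depth_to_meters(raw_depth):
    """Convert a raw depth value to meters.

    Also works element-wise on array types that support division.
    """
    return raw_depth / DEPTH_SCALE


def split_state(state):
    """Split the 18-D robot state into its four documented groups."""
    if len(state) != 18:
        raise ValueError("expected an 18-dimensional state")
    return {
        "joint_angles_rad": state[0:7],
        "gripper_position_m": state[7:9],
        "joint_velocities": state[9:16],
        "gripper_velocities": state[16:18],
    }


def valid_pose_dims(pose, valid):
    """Keep only the pose dimensions whose validity flag equals 1."""
    if len(pose) != 7 or len(valid) != 7:
        raise ValueError("pose and valid must both have 7 entries")
    return {
        name: value
        for name, value, ok in zip(POSE_FIELDS, pose, valid)
        if ok == 1
    }
```

The same `valid_pose_dims` helper applies to both `target_object_or_part_initial_pose` and `target_object_or_part_final_pose`, each paired with its own `*_valid` mask.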

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):

Citation:

@inproceedings{gu2023maniskill2,
  title={ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills},
  author={Gu, Jiayuan and Xiang, Fanbo and Li, Xuanlin and Ling, Zhan and Liu, Xiqiang and Mu, Tongzhou and Tang, Yihe and Tao, Stone and Wei, Xinyue and Yao, Yunchao and Yuan, Xiaodi and Xie, Pengwei and Huang, Zhiao and Chen, Rui and Su, Hao},
  booktitle={International Conference on Learning Representations},
  year={2023}
}