d4rl_mujoco_hopper

Description :

D4RL is an open-source benchmark for offline reinforcement learning. It provides standardized environments and datasets for training and benchmarking algorithms.

The datasets follow the RLDS format to represent steps and episodes.
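
As a rough illustration of the RLDS layout, each dataset element is an episode whose `steps` field is itself a sequence of step dictionaries. The sketch below assumes `tensorflow_datasets` is installed and imports it lazily; `summarize_episode` and `iter_hopper_episodes` are hypothetical helper names, not part of TFDS.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def summarize_episode(steps: Iterable[Dict[str, Any]]) -> Tuple[int, float]:
    """Return (number of steps, undiscounted sum of rewards) for one episode."""
    length = 0
    total_reward = 0.0
    for step in steps:
        length += 1
        total_reward += float(step["reward"])
    return length, total_reward


def iter_hopper_episodes(config: str = "v0-expert") -> Iterator[Dict[str, Any]]:
    """Yield episodes with NumPy steps; downloads the data on first use."""
    # Imported lazily so that summarize_episode works without TFDS installed.
    import tensorflow_datasets as tfds

    ds = tfds.load(f"d4rl_mujoco_hopper/{config}", split="train")
    for episode in ds:
        # episode["steps"] is a nested tf.data.Dataset of step dictionaries.
        yield {"steps": tfds.as_numpy(episode["steps"])}


if __name__ == "__main__":
    # Toy steps stand in for real data so the example runs offline.
    toy_steps = [{"reward": 1.0}, {"reward": 0.5}, {"reward": 0.25}]
    print(summarize_episode(toy_steps))
```

With the dataset available locally, `summarize_episode(episode["steps"])` can be applied to each item yielded by `iter_hopper_episodes()`.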

Config description : See more details about the task and its versions in https://github.com/rail-berkeley/d4rl/wiki/Tasks#gym
Homepage : https://sites.google.com/view/d4rl-anonymous
Source code : tfds.d4rl.d4rl_mujoco_hopper.D4rlMujocoHopper
Versions :
- 1.0.0 : Initial release.
- 1.1.0 : Added is_last.
- 1.2.0 (default): Updated to take into account the next observation.
Supervised keys (See as_supervised doc ): None
Figure ( tfds.show_examples ): Not supported.
Citation :

@misc{fu2020d4rl,
    title={D4RL: Datasets for Deep Data-Driven Reinforcement Learning},
    author={Justin Fu and Aviral Kumar and Ofir Nachum and George Tucker and Sergey Levine},
    year={2020},
    eprint={2004.07219},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
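
TFDS dataset names can carry a config and a version in the form `name/config:version`. The helper below is a small sketch that builds such a string for the versions listed above; `hopper_name` is a hypothetical helper, and the resulting string is meant to be passed to `tfds.load`.

```python
from typing import Optional

DATASET = "d4rl_mujoco_hopper"
KNOWN_VERSIONS = ("1.0.0", "1.1.0", "1.2.0")  # versions listed on this page


def hopper_name(config: str = "v0-expert", version: Optional[str] = None) -> str:
    """Build a TFDS name such as 'd4rl_mujoco_hopper/v2-medium:1.2.0'."""
    if version is not None and version not in KNOWN_VERSIONS:
        raise ValueError(f"unknown version {version!r}; expected one of {KNOWN_VERSIONS}")
    name = f"{DATASET}/{config}"
    return name if version is None else f"{name}:{version}"


if __name__ == "__main__":
    print(hopper_name("v2-medium", "1.2.0"))
    # With tensorflow_datasets installed, the string can be used as:
    #   import tensorflow_datasets as tfds
    #   ds = tfds.load(hopper_name("v2-medium", "1.2.0"), split="train")
```

Omitting the version lets TFDS pick the default one, which this page lists as 1.2.0.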

d4rl_mujoco_hopper/v0-expert (default config)

Download size : 51.56 MiB
Dataset size : 64.10 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	1,029
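
Since only a `'train'` split is provided, a held-out subset has to be carved out by the user. One option is the TFDS split-slicing syntax, which accepts absolute indices or percentages. The sketch below only builds the split strings; `train_slice` is a hypothetical helper name.

```python
from typing import Optional


def train_slice(start: Optional[int] = None, stop: Optional[int] = None, unit: str = "") -> str:
    """Build a TFDS split string such as 'train[:100]' or 'train[10%:20%]'.

    `unit` is '' for absolute episode indices or '%' for percentages.
    """
    if unit not in ("", "%"):
        raise ValueError("unit must be '' or '%'")
    if start is None and stop is None:
        return "train"
    lo = "" if start is None else f"{start}{unit}"
    hi = "" if stop is None else f"{stop}{unit}"
    return f"train[{lo}:{hi}]"


if __name__ == "__main__":
    # First 90% of the episodes for training, the rest for evaluation.
    print(train_slice(stop=90, unit="%"), train_slice(start=90, unit="%"))
```

The returned strings are intended for the `split` argument of `tfds.load`.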

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})
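
Offline RL algorithms usually consume transitions rather than raw steps. The following sketch pairs consecutive steps of one episode into `(observation, action, reward, next_observation, done)` tuples. It assumes the RLDS convention that the reward stored on a step results from that step's action, and that `is_terminal` marks a terminal observation; `steps_to_transitions` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Sequence, Tuple

Transition = Tuple[Any, Any, float, Any, bool]


def steps_to_transitions(steps: Sequence[Dict[str, Any]]) -> List[Transition]:
    """Pair consecutive steps into (obs, action, reward, next_obs, done) tuples.

    `done` is True only when the following step is flagged `is_terminal`, so an
    episode that ends by truncation (is_last without is_terminal) keeps done=False.
    """
    transitions = []
    for current, following in zip(steps, steps[1:]):
        transitions.append((
            current["observation"],
            current["action"],
            float(current["reward"]),
            following["observation"],
            bool(following["is_terminal"]),
        ))
    return transitions


if __name__ == "__main__":
    # Toy one-dimensional steps stand in for the (11,) observations and (3,) actions.
    toy = [
        {"observation": [0.0], "action": [1.0], "reward": 1.0, "is_terminal": False},
        {"observation": [1.0], "action": [2.0], "reward": 2.0, "is_terminal": False},
        {"observation": [2.0], "action": [0.0], "reward": 0.0, "is_terminal": True},
    ]
    for transition in steps_to_transitions(toy):
        print(transition)
```

The last step of an episode only contributes its observation as `next_obs`, which is why an episode of N steps yields N - 1 transitions in this sketch.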

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v0-medium

Download size : 51.74 MiB
Dataset size : 64.68 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	3,064

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v0-medium-expert

Download size : 62.01 MiB
Dataset size : 77.25 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	2,277

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v0-mixed

Download size : 10.48 MiB
Dataset size : 13.15 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	1,250

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v0-random

Download size : 51.83 MiB
Dataset size : 66.06 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	8,793

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v1-expert

Download size : 93.19 MiB
Dataset size : 608.03 MiB
Auto-cached ( documentation ): No
Splits :

Split	Examples
`'train'`	1,836

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 11), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(6,), dtype=float32),
            'qvel': Tensor(shape=(6,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})
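
The `policy` feature stores the weights of the network that generated the data. As a hedged sketch of how these tensors might be used, the code below runs a forward pass through `fc0`, `fc1`, and `last_fc`. It assumes each weight is laid out as (outputs, inputs), which matches the shapes above, and additionally assumes that `nonlinearity` is ReLU and that the output distribution is a tanh-squashed Gaussian; both should be checked against the `nonlinearity` and `output_distribution` strings in the data. The helper names are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence

Vector = List[float]


def linear(layer: Dict[str, Any], x: Sequence[float]) -> Vector:
    """Compute W @ x + b for a layer stored as {'weight': (out, in), 'bias': (out,)}."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(layer["weight"], layer["bias"])
    ]


def relu(x: Sequence[float]) -> Vector:
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in x]


def policy_mean_action(policy: Dict[str, Any], observation: Sequence[float]) -> Vector:
    """Deterministic action: tanh of the mean head after two ReLU hidden layers."""
    hidden = relu(linear(policy["fc0"], observation))
    hidden = relu(linear(policy["fc1"], hidden))
    mean = linear(policy["last_fc"], hidden)
    return [math.tanh(m) for m in mean]


if __name__ == "__main__":
    # Tiny stand-in policy (2 inputs, 2 hidden units, 1 output) instead of 11-256-256-3.
    toy_policy = {
        "fc0": {"weight": [[1.0, 0.0], [0.0, -1.0]], "bias": [0.0, 0.0]},
        "fc1": {"weight": [[1.0, 0.0], [0.0, 1.0]], "bias": [0.0, 0.0]},
        "last_fc": {"weight": [[1.0, 1.0]], "bias": [0.0]},
    }
    print(policy_mean_action(toy_policy, [0.5, 2.0]))
```

The `last_fc_log_std` head could be evaluated the same way with `linear(policy["last_fc_log_std"], hidden)` if a stochastic action is needed.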

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 11)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(3,)	float32
policy/last_fc/weight	Tensor	(3, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(3,)	float32
policy/last_fc_log_std/weight	Tensor	(3, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(6,)	float32
steps/infos/qvel	Tensor	(6,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v1-medium

Download size : 92.03 MiB
Dataset size : 1.78 GiB
Auto-cached ( documentation ): No
Splits :

Split	Examples
`'train'`	6,328

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 11), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(6,), dtype=float32),
            'qvel': Tensor(shape=(6,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 11)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(3,)	float32
policy/last_fc/weight	Tensor	(3, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(3,)	float32
policy/last_fc_log_std/weight	Tensor	(3, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(6,)	float32
steps/infos/qvel	Tensor	(6,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v1-medium-expert

Download size : 184.59 MiB
Dataset size : 230.24 MiB
Auto-cached ( documentation ): Only when shuffle_files=False (train)
Splits :

Split	Examples
`'train'`	8,163

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(6,), dtype=float32),
            'qvel': Tensor(shape=(6,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})
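
The `infos` fields expose the simulator state behind each observation. As a sketch under the assumption that this dataset uses the standard Gym Hopper observation layout (all of `qpos` except its first entry, followed by `qvel` clipped to [-10, 10]), the 11-dimensional observation can be rebuilt from the two 6-dimensional vectors. `hopper_observation` is a hypothetical helper, and whether the stored `infos` line up with the `observation` of the same step should be verified on the data.

```python
from typing import List, Sequence


def hopper_observation(qpos: Sequence[float], qvel: Sequence[float], clip: float = 10.0) -> List[float]:
    """Rebuild an 11-dim observation from 6-dim qpos and 6-dim qvel.

    Assumed layout: qpos without its first entry (the x position),
    followed by qvel clipped to [-clip, clip].
    """
    if len(qpos) != 6 or len(qvel) != 6:
        raise ValueError("expected qpos and qvel of length 6")
    clipped = [max(-clip, min(clip, float(v))) for v in qvel]
    return [float(p) for p in qpos[1:]] + clipped


if __name__ == "__main__":
    obs = hopper_observation([9.0, 1.0, 2.0, 3.0, 4.0, 5.0], [-20.0, 0.0, 1.0, 2.0, 3.0, 20.0])
    print(len(obs), obs)
```

Comparing the result with the stored `observation` on a few steps is a quick way to confirm or reject the assumed layout.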

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(6,)	float32
steps/infos/qvel	Tensor	(6,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v1-medium-replay

Download size : 55.65 MiB
Dataset size : 34.78 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	1,151

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float64),
        'discount': float64,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float64),
        'reward': float64,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(3,)	float64
steps/discount	Tensor		float64
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float64
steps/reward	Tensor		float64

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v1-full-replay

Download size : 183.32 MiB
Dataset size : 114.78 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	2,907

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float64),
        'discount': float64,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float64),
        'reward': float64,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(3,)	float64
steps/discount	Tensor		float64
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float64
steps/reward	Tensor		float64

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v1-random

Download size : 91.11 MiB
Dataset size : 130.73 MiB
Auto-cached ( documentation ): Only when shuffle_files=False (train)
Splits :

Split	Examples
`'train'`	45,265

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(6,), dtype=float32),
            'qvel': Tensor(shape=(6,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(6,)	float32
steps/infos/qvel	Tensor	(6,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v2-expert

Download size : 145.37 MiB
Dataset size : 390.40 MiB
Auto-cached ( documentation ): No
Splits :

Split	Examples
`'train'`	1,028

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 11), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 11)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(3,)	float32
policy/last_fc/weight	Tensor	(3, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(3,)	float32
policy/last_fc_log_std/weight	Tensor	(3, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v2-full-replay

Download size : 179.29 MiB
Dataset size : 115.04 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	3,515

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v2-medium

Download size : 145.68 MiB
Dataset size : 702.57 MiB
Auto-cached ( documentation ): No
Splits :

Split	Examples
`'train'`	2,187

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 11), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(3,), dtype=float32),
            'weight': Tensor(shape=(3, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 11)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(3,)	float32
policy/last_fc/weight	Tensor	(3, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(3,)	float32
policy/last_fc_log_std/weight	Tensor	(3, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v2-medium-expert

Download size : 290.43 MiB
Dataset size : 228.28 MiB
Auto-cached ( documentation ): Only when shuffle_files=False (train)
Splits :

Split	Examples
`'train'`	3,214

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v2-medium-replay

Download size : 72.34 MiB
Dataset size : 46.51 MiB
Auto-cached ( documentation ): Yes
Splits :

Split	Examples
`'train'`	2,041

Feature structure :

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):

d4rl_mujoco_hopper/v2-random

Download size : 145.46 MiB
Dataset size : 130.72 MiB
Auto-cached ( documentation ): Only when shuffle_files=False (train)
Splits :

Split	Examples
`'train'`	45,240

Feature structure :

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(6,), dtype=float64),
            'qvel': Tensor(shape=(6,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(11,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation :

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(3,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(6,)	float64
steps/infos/qvel	Tensor	(6,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(11,)	float32
steps/reward	Tensor		float32

Examples ( tfds.as_dataframe ):
