tf_agents.trajectories.to_transition
Create a transition from a trajectory or two adjacent trajectories.
tf_agents.trajectories.to_transition(
    trajectory: tf_agents.trajectories.Trajectory,
    next_trajectory: Optional[tf_agents.trajectories.Trajectory] = None
) -> tf_agents.trajectories.Transition
Note: If next_trajectory is not provided, tensors of trajectory are sliced along their second (time) dimension; for example:
time_steps.step_type = trajectory.step_type[:, :-1]
time_steps.observation = trajectory.observation[:, :-1]
next_time_steps.observation = trajectory.observation[:, 1:]
next_time_steps.step_type = trajectory.next_step_type[:, :-1]
next_time_steps.reward = trajectory.reward[:, :-1]
next_time_steps.discount = trajectory.discount[:, :-1]
Note that the reward and discount of time_steps are undefined and are therefore filled with zeros.
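The slicing above can be traced on a tiny example. The following is a minimal sketch in plain Python, not the TF-Agents implementation: ToyTrajectory, ToyTimeStep, slice_time, zeros_like, and toy_slices are hypothetical names introduced for illustration, and nested lists stand in for tensors of shape [B, T]. It covers only the fields listed in the note.

```python
from collections import namedtuple

# Hypothetical stand-ins for illustration; these are not the TF-Agents classes.
ToyTrajectory = namedtuple(
    "ToyTrajectory",
    ["step_type", "observation", "next_step_type", "reward", "discount"])
ToyTimeStep = namedtuple(
    "ToyTimeStep", ["step_type", "reward", "discount", "observation"])


def slice_time(batch, start, stop):
    """Slice each row of a [B, T] nested list along the time axis."""
    return [row[start:stop] for row in batch]


def zeros_like(batch):
    """Return a nested list of zeros with the same [B, T] shape."""
    return [[0.0 for _ in row] for row in batch]


def toy_slices(traj):
    """Apply the documented slicing for the next_trajectory=None case."""
    next_time_steps = ToyTimeStep(
        step_type=slice_time(traj.next_step_type, None, -1),
        reward=slice_time(traj.reward, None, -1),
        discount=slice_time(traj.discount, None, -1),
        observation=slice_time(traj.observation, 1, None))
    time_steps = ToyTimeStep(
        step_type=slice_time(traj.step_type, None, -1),
        reward=zeros_like(next_time_steps.reward),      # zero-filled
        discount=zeros_like(next_time_steps.discount),  # zero-filled
        observation=slice_time(traj.observation, None, -1))
    return time_steps, next_time_steps


# One episode (B=1) of three steps (T=3); step types 0, 1, 2 = FIRST, MID, LAST.
traj = ToyTrajectory(
    step_type=[[0, 1, 1]],
    observation=[[10, 11, 12]],
    next_step_type=[[1, 1, 2]],
    reward=[[1.0, 2.0, 3.0]],
    discount=[[1.0, 1.0, 0.0]])

time_steps, next_time_steps = toy_slices(traj)
print(time_steps.observation)       # [[10, 11]]
print(next_time_steps.observation)  # [[11, 12]]
print(next_time_steps.reward)       # [[1.0, 2.0]]
```

In this sketch a trajectory with T steps yields T - 1 transitions per batch entry, because the last step has no successor observation to pair with.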
Args
trajectory
An instance of Trajectory. The tensors in Trajectory must have shape [B, T, ...] when next_trajectory is None. discount is assumed to be a scalar float; hence the shape of trajectory.discount must be [B, T].
next_trajectory
(optional) An instance of Trajectory.
Returns
A tuple (time_steps, policy_steps, next_time_steps). The reward and discount fields of time_steps are filled with zeros because these cannot be deduced (please do not use them).
Raises
ValueError
if discount rank is not within the range [1, 2].
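To make the rank condition concrete, here is a minimal sketch in plain Python. It is not the library's validation code: nested_rank and check_discount_rank are hypothetical helpers, and nested lists stand in for tensors. Under these assumptions, a discount of shape [B] or [B, T] passes, while a scalar or a [B, T, 1] value is rejected.

```python
def nested_rank(value):
    """Rank of a rectangular nested list; a bare number has rank 0."""
    rank = 0
    while isinstance(value, list):
        rank += 1
        if not value:  # empty list: stop descending
            break
        value = value[0]
    return rank


def check_discount_rank(discount, min_rank=1, max_rank=2):
    """Hypothetical check mirroring the documented [1, 2] rank range."""
    rank = nested_rank(discount)
    if not min_rank <= rank <= max_rank:
        raise ValueError(
            "discount rank %d is outside [%d, %d]" % (rank, min_rank, max_rank))
    return rank


print(check_discount_rank([1.0, 0.9]))           # 1, shape [B]
print(check_discount_rank([[1.0, 0.9, 0.0]]))    # 2, shape [B, T]

try:
    check_discount_rank([[[1.0], [0.9]]])        # shape [B, T, 1], rank 3
except ValueError as err:
    print("rejected:", err)
```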
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License , and code samples are licensed under the Apache 2.0 License . For details, see the Google Developers Site Policies . Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.