Continuous Control
A2C
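import math

import torch


def cont_logprob(mu, var, actions):
    # Log-density of a diagonal Gaussian, summed over the action dimensions.
    var = var.clamp(min=1e-3)  # apply the same variance floor to both terms
    p1 = -((mu - actions) ** 2) / (2 * var)
    p2 = -torch.log(torch.sqrt(2 * math.pi * var))
    return (p1 + p2).sum(-1, keepdim=True)


def cont_entropy(var):
    # Entropy of a diagonal Gaussian, summed over the action dimensions.
    entropy = (torch.log(2 * math.pi * var) + 1) / 2
    return entropy.sum(-1)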
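One way to sanity-check these two helpers is to compare them against `torch.distributions.Normal`. The sketch below repeats the functions so it runs on its own, with the variance clamped once up front (an assumption of this sketch). The batch size, action dimension, and random inputs are hypothetical and chosen only for illustration.

```python
import math

import torch
from torch.distributions import Normal


def cont_logprob(mu, var, actions):
    # Diagonal-Gaussian log-density, summed over action dimensions.
    var = var.clamp(min=1e-3)  # assumption: one floor shared by both terms
    p1 = -((mu - actions) ** 2) / (2 * var)
    p2 = -torch.log(torch.sqrt(2 * math.pi * var))
    return (p1 + p2).sum(-1, keepdim=True)


def cont_entropy(var):
    # Diagonal-Gaussian entropy, summed over action dimensions.
    return ((torch.log(2 * math.pi * var) + 1) / 2).sum(-1)


torch.manual_seed(0)
batch, act_dim = 4, 2                      # hypothetical sizes
mu = torch.randn(batch, act_dim)           # policy means
var = torch.rand(batch, act_dim) + 0.1     # variances, kept above the clamp floor
actions = torch.randn(batch, act_dim)      # sampled actions

# Reference values from PyTorch's own Normal distribution (takes std, not var).
dist = Normal(mu, var.sqrt())
ref_logprob = dist.log_prob(actions).sum(-1, keepdim=True)
ref_entropy = dist.entropy().sum(-1)

logprob = cont_logprob(mu, var, actions)   # shape (batch, 1)
entropy = cont_entropy(var)                # shape (batch,)

logprob_matches = torch.allclose(logprob, ref_logprob, atol=1e-5)
entropy_matches = torch.allclose(entropy, ref_entropy, atol=1e-5)
```

Note that the log-probability keeps a trailing dimension of size 1 while the entropy does not, so the two may need reshaping before being combined in a loss.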
...
continuous_control.1597946816.txt.gz · Last modified: 2024/03/23 02:37 (external edit)