I want my model to output a single value. How can I constrain that value to the range (a, b)? For example, my code is:
import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, num_state_features):
        super(ActorCritic, self).__init__()
        # value
        self.critic_net = nn.Sequential(
            nn.Linear(num_state_features, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )
        # policy
        self.actor_net = nn.Sequential(
            nn.Linear(num_state_features, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state):
        value = self.critic_net(state)
        policy_mean = self.actor_net(state)
        return value, policy_mean
I want the policy output to be in the range (500, 3000). How can I do this?
(I have tried torch.clamp(), but it does not work well: once the raw output is outside the limits, the policy stays the same. For example, if the output drifts to -1000000, the clamped value stays at 500 forever, or takes a really long time to change, presumably because no gradient flows back through the clamp outside the range. The same is true for saturating functions like nn.Sigmoid().)
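To make the problem concrete, here is a minimal sketch of what I think is happening. It is not taken from my training code: the helper names grad_through, clamp_squash and sigmoid_squash are made up for illustration, and I am assuming the sigmoid variant is rescaled to (500, 3000) as low + (high - low) * sigmoid(x). It just compares the gradient that reaches the raw network output inside the range and far outside it.

```python
import torch

LOW, HIGH = 500.0, 3000.0  # target range from the question


def grad_through(squash, raw_value):
    """Return d(squash(x))/dx evaluated at x = raw_value (hypothetical helper)."""
    x = torch.tensor(raw_value, requires_grad=True)
    squash(x).backward()
    return x.grad.item()


def clamp_squash(x):
    # Hard clipping: the output is constant once x leaves [LOW, HIGH].
    return torch.clamp(x, LOW, HIGH)


def sigmoid_squash(x):
    # Assumed rescaling of a sigmoid onto (LOW, HIGH).
    return LOW + (HIGH - LOW) * torch.sigmoid(x)


# Gradient with respect to the raw (pre-squash) output of the actor.
clamp_grad_outside = grad_through(clamp_squash, -1_000_000.0)  # far below LOW
clamp_grad_inside = grad_through(clamp_squash, 1000.0)         # inside the range
sigmoid_grad_saturated = grad_through(sigmoid_squash, -50.0)   # saturated tail
sigmoid_grad_center = grad_through(sigmoid_squash, 0.0)        # steepest point

print("clamp, outside range:", clamp_grad_outside)
print("clamp, inside range: ", clamp_grad_inside)
print("sigmoid, saturated:  ", sigmoid_grad_saturated)
print("sigmoid, at center:  ", sigmoid_grad_center)
```

With clamp, the gradient is exactly zero once the raw output is outside the range, so nothing pushes it back. With the rescaled sigmoid, the gradient is nonzero but becomes vanishingly small in the tails, which would explain why the output takes so long to move.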