I'm learning RL and understand the basic actor-critic concept, but I'm confused about the technical details of how the critic actually influences the actor during training. Here's my current understanding: there are shared-weight and separate-weight actor-critic networks.
For shared weights, the actor and critic share an Encoder + Core (RNN). During backpropagation, the critic's loss updates the weights of the Encoder and the RNN, and the actor's loss also updates the Encoder (feature extractor) and the RNN, so the actor "learns" from the critic indirectly: the shared trunk is shaped by the gradients of both losses combined.
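In case it helps clarify what I mean, here's a rough PyTorch sketch of my mental model for the shared-weight case (the architecture, shapes, and loss weighting are all placeholders I made up for illustration, not a real implementation):

```python
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    """Shared Encoder + RNN core, with separate actor/critic heads on top."""
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)             # shared feature extractor
        self.core = nn.GRU(hidden, hidden, batch_first=True)  # shared RNN core
        self.actor_head = nn.Linear(hidden, n_actions)        # policy logits
        self.critic_head = nn.Linear(hidden, 1)               # state value

    def forward(self, obs_seq, h=None):
        feat = torch.relu(self.encoder(obs_seq))
        out, h = self.core(feat, h)
        return self.actor_head(out), self.critic_head(out), h

model = SharedActorCritic(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

obs = torch.randn(1, 8, 4)             # dummy rollout: (batch, time, obs_dim)
returns = torch.randn(1, 8, 1)         # placeholder returns
actions = torch.randint(0, 2, (1, 8))  # placeholder actions

logits, values, _ = model(obs)
dist = torch.distributions.Categorical(logits=logits)
advantage = (returns - values).detach()  # critic also informs actor via advantage
actor_loss = -(dist.log_prob(actions) * advantage.squeeze(-1)).mean()
critic_loss = (returns - values).pow(2).mean()

loss = actor_loss + 0.5 * critic_loss  # arbitrary 0.5 value-loss coefficient
opt.zero_grad()
loss.backward()  # gradients from BOTH losses flow into the shared encoder/core
opt.step()
```

So in this picture the single `loss.backward()` is where the critic's value error "leaks" into the features the actor sees.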
For separate weights, the actor and critic each have their own Encoder and RNN, so the weights are updated separately by their own losses, and the two networks don't affect each other through weights at all. Instead, the critic is used to calculate the advantage, and the advantage is then used in the actor's policy-gradient loss.
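And here's the corresponding sketch of my mental model for the separate-weight case (again, all names and shapes are made up; the point is just that the critic only touches the actor through the detached advantage):

```python
import torch
import torch.nn as nn

class Trunk(nn.Module):
    """One independent Encoder + RNN; the actor and critic each get their own copy."""
    def __init__(self, obs_dim, out_dim, hidden=128):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        self.core = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, obs_seq, h=None):
        feat = torch.relu(self.encoder(obs_seq))
        out, h = self.core(feat, h)
        return self.head(out), h

actor = Trunk(obs_dim=4, out_dim=2)   # outputs policy logits
critic = Trunk(obs_dim=4, out_dim=1)  # outputs state value
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=3e-4)

obs = torch.randn(1, 8, 4)
returns = torch.randn(1, 8, 1)
actions = torch.randint(0, 2, (1, 8))

# Critic update: its loss only ever touches the critic's own weights.
values, _ = critic(obs)
critic_loss = (returns - values).pow(2).mean()
critic_opt.zero_grad()
critic_loss.backward()
critic_opt.step()

# Actor update: the critic influences the actor ONLY through the advantage
# signal (a detached number), never through shared gradients.
with torch.no_grad():
    advantage = returns - critic(obs)[0]
logits, _ = actor(obs)
dist = torch.distributions.Categorical(logits=logits)
actor_loss = -(dist.log_prob(actions) * advantage.squeeze(-1)).mean()
actor_opt.zero_grad()
actor_loss.backward()
actor_opt.step()
```

Is that the right way to think about the difference between the two setups?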
Is my understanding correct? If not, could you explain the actual flow, point out any crucial details I'm missing, or refer me to somewhere I can gain a better understanding of this?
And in MARL settings, when should I use separate vs. shared weights? What are the key trade-offs?
Any pointers to papers or code examples would be super helpful!