One can hardly give a single, unambiguous answer to such a question, since test-driven development (TDD) in itself serves more than one purpose:
- As a process, through the act of writing a test before the implementation, one receives valuable feedback about the API and behaviour of the System Under Test (SUT). It frequently happens that the process unearths problems with one's original conceptual design. When that happens, an important step in TDD is to reset and try something else (Git has turned out to be quite helpful in this regard).
- Once the tests and the design congeal and agree, the process leaves behind a set of automated tests that protect against regressions.
While there's a certain exploratory quality to TDD, it doesn't follow that planning ahead is forbidden. One should, however, be ready to discard any plans if it turns out that the reality (as indicated by the (test) code) doesn't fit the plans. Thus, while an hour of planning can save days of coding, too much planning often turns out to be wasted.
From the OP, it's hard to tell what purpose TDD is intended to serve. Is it to gather feedback on API design ideas? Or is it to create an automated test suite? If the latter, what kind of errors should the tests prevent?
The OP makes it sound as though the code to be driven by tests is mostly mapping code. This kind of code can be tedious to write, but how error-prone is it? How costly is a defect? Do you even need to test it?
Let us assume, however, for the sake of argument, that you need to test-drive this code. What are your options?
Dynamic mocks do, indeed, come with problems as outlined in the OP. What else can one do?
A typical object-oriented design may follow a template like the following pseudocode:
GetFooFromAPI(bar)
apiBar = translate bar to API representation
apiFoo = invoke API with apiBar
foo = translate API foo to external representation
return foo
When designed this way, the API invocation looks like an integral part of the process. Many developers can't think of any other way of making such a design testable than replacing the API with some kind of Test Double.
And you can certainly do that, but if so, consider using state-based testing instead of interaction-based testing.
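To illustrate the difference, here's a minimal sketch in Python of state-based testing with a hand-written Fake instead of a dynamic mock. All the names (`FakeApi`, `get_foo_from_api`, and the dictionary shapes) are hypothetical stand-ins, not anything from the OP:

```python
class FakeApi:
    """In-memory stand-in for the real API; the test asserts on its state."""
    def __init__(self):
        self.received = []  # observable state, instead of recorded interactions

    def invoke(self, api_bar):
        self.received.append(api_bar)
        return {"foo": api_bar["bar"].upper()}  # canned, deterministic reply

def get_foo_from_api(api, bar):
    api_bar = {"bar": bar}         # translate bar to API representation
    api_foo = api.invoke(api_bar)  # invoke the API
    return api_foo["foo"]          # translate back to external representation

fake = FakeApi()
result = get_foo_from_api(fake, "ploeh")
assert result == "PLOEH"                     # assert on output...
assert fake.received == [{"bar": "ploeh"}]   # ...and on the Fake's state
```

Notice that the test verifies what ended up in the Fake, not which methods were called in which order; that makes it far less brittle than interaction-based verification.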
As is often the case, being aware of more than one programming paradigm can be helpful. In functional programming (FP), one would often model this kind of problem in a different way. Translation or mapping code can often be regarded as a set of parser/writer pairs, or serializer/deserializer pairs.
In FP, one may view the overall action of translating one way, invoking the API, and translating back as function composition:
GetFooFromAPI(bar) = apiFooToFoo(invokeAPI(barToAPIBar(bar)))
which one would write mathematically as
GetFooFromAPI = apiFooToFoo ∘ invokeAPI ∘ barToAPIBar
Westerners often find that right-to-left composition order counter-intuitive, so some languages offer other alternatives. In F#, for example, one might express the same idea as:
GetFooFromAPI = barToAPIBar >> invokeAPI >> apiFooToFoo
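The same left-to-right composition can be sketched in Python, which has no built-in `>>` operator for functions. The `compose` helper and the three stand-in functions below are hypothetical illustrations:

```python
from functools import reduce

def compose(*fns):
    """Left-to-right function composition: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, fn: fn(acc), fns, x)

bar_to_api_bar = lambda bar: {"bar": bar}                       # writer
invoke_api     = lambda api_bar: {"foo": api_bar["bar"] + "!"}  # stand-in API
api_foo_to_foo = lambda api_foo: api_foo["foo"]                 # parser

get_foo_from_api = compose(bar_to_api_bar, invoke_api, api_foo_to_foo)
assert get_foo_from_api("baz") == "baz!"
```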
The point of this digression is that this isn't just an abstract mathematical notion, but a real, practical technique supported by actual, useful programming languages.
In FP, one would often consider function composition as a 'given' - as something not warranting testing, since it's typically built into the language, or at least a general-purpose library function. Thus, what remains to be tested are the translations.
And since such mappings are typically pure functions, they are intrinsically testable.
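For example, a translation like the hypothetical `bar_to_api_bar` below can be exercised directly, with plain example-based assertions and no Test Double in sight:

```python
def bar_to_api_bar(bar):
    """Translate the domain representation to the API's wire format.
    The field names here are made up for illustration."""
    return {"bar_id": bar["id"], "bar_name": bar["name"].strip()}

# A pure function: same input, same output, nothing to mock.
assert bar_to_api_bar({"id": 7, "name": " quux "}) == \
       {"bar_id": 7, "bar_name": "quux"}
```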
One could test each individual mapping, or instead test each parser/writer pair as there-and-back-again properties.
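A there-and-back-again property can be sketched like this, here checked over a handful of hard-coded samples; a property-based testing library such as Hypothesis would generate the inputs instead. The mapping pair is hypothetical:

```python
def bar_to_api_bar(bar):
    """Writer: domain representation to API representation."""
    return {"bar_id": bar["id"], "bar_name": bar["name"]}

def api_bar_to_bar(api_bar):
    """Parser: API representation back to domain representation."""
    return {"id": api_bar["bar_id"], "name": api_bar["bar_name"]}

samples = [{"id": 1, "name": "a"}, {"id": 42, "name": "plugh"}]
for bar in samples:
    # The round trip must reproduce the original value.
    assert api_bar_to_bar(bar_to_api_bar(bar)) == bar
```

One such property covers both halves of the pair at once, which is often a better return on effort than testing each direction in isolation.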
In conclusion, there are many options. Choose the one that best solves the problem at hand. This requires first figuring out what that problem is. The 'real' problem is rarely 'how do I TDD this?', but rather a more high-level problem to which TDD may or may not be the answer.