
Question

Is it possible to replicate Marshmallow's dump_only feature using pydantic for FastAPI, so that certain fields are "read-only", without defining separate schemas for serialization and deserialization?

Context

At times, a subset of the attributes (e.g. id and created_date) for a given API resource are meant to be read-only: they should be ignored in the request payload during deserialization (e.g. when POSTing to a collection or PUTting to an existing resource) but still returned in the response body for those same requests.

Marshmallow provides a convenient dump_only parameter that requires only one schema to be defined for both serialization and deserialization, with the option to exclude certain fields from either operation.

Existing Solution

Most attempts I've seen to replicate this functionality within FastAPI (i.e. FastAPI docs, GitHub Issue, Related SO Question) tend to define separate schemas for input (deserialization) and output (serialization) and define a common base schema for the shared fields between the two.

Based on my current understanding of this approach, it seems a tad inconvenient for a few reasons:

  1. It requires the API developer to reserve separate namespaces for each schema, a problem that is exacerbated by following the practice of abstracting the common fields to a third "base" schema class.
  2. It results in the proliferation of schema classes in APIs that have nested resources, since each level of nesting requires a separate input and output schema.
  3. The OAS-compliant documentation displays the input/output schemas as separate definitions, when the consumer of that API only ever needs to be aware of a single schema since the (de)serialization of those read-only fields should be handled properly by the API.
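
On point 3, the OpenAPI Schema Object does define a readOnly keyword for this case: a property marked that way may appear in responses but should not be sent in requests. Here is a minimal sketch of what a single shared definition could look like, written as a plain Python dict. The QUESTION_SCHEMA name and the request_view helper are hypothetical, and neither is part of FastAPI or pydantic:

```python
import copy

# Hypothetical hand-written schema for a question. "readOnly" is the OpenAPI
# Schema Object keyword; everything else mirrors the example resource.
QUESTION_SCHEMA = {
    "type": "object",
    "required": ["text"],
    "properties": {
        "id": {"type": "integer", "readOnly": True},
        "created_date": {"type": "string", "format": "date-time", "readOnly": True},
        "text": {"type": "string"},
    },
}


def request_view(schema: dict) -> dict:
    """Return a copy of the schema without its readOnly properties."""
    view = copy.deepcopy(schema)
    view["properties"] = {
        name: prop
        for name, prop in view["properties"].items()
        if not prop.get("readOnly", False)
    }
    # Read-only names must not be required in a request either.
    view["required"] = [n for n in view.get("required", []) if n in view["properties"]]
    return view
```

With one definition like this, the documentation shows a single schema, and the request shape follows from it rather than being declared separately.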

Example

Say we're developing a simple API for a survey with the following models:

from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy import (
    func,
    Column,
    Integer,
    String,
    DateTime,
    ForeignKey,
)

Base = declarative_base()


class SurveyModel(Base):
    """Table that represents a collection of questions"""

    __tablename__ = "survey"

    # columns
    id = Column(Integer, primary_key=True, index=True)
    name = Column(String, nullable=False)
    created_date = Column(DateTime, default=func.now())

    # relationships
    questions = relationship("QuestionModel", backref="survey")


class QuestionModel(Base):
    """Table that contains the questions that comprise a given survey"""

    __tablename__ = "question"

    # columns
    id = Column(Integer, primary_key=True, index=True)
    survey_id = Column(Integer, ForeignKey("survey.id"))
    text = Column(String)
    created_date = Column(DateTime, default=func.now())

And we wanted a POST /surveys endpoint to accept the following payload in the request body:

{
    "name": "First Survey",
    "questions": [
        {"text": "Question 1"},
        {"text": "Question 2"}
    ]
}

And return the following in the response body:

{
    "id": 1,
    "name": "First Survey",
    "created_date": "2021-12-12T00:00:30",
    "questions": [
        {
             "id": 1,
             "text": "Question 1",
             "created_date": "2021-12-12T00:00:30"
        },
        {
             "id": 2,
             "text": "Question 2",
             "created_date": "2021-12-12T00:00:30"
        }
    ]
}

Is there an alternative way to make id and created_date read-only for both QuestionModel and SurveyModel other than defining the schemas like this?

from datetime import datetime
from typing import List

from pydantic import BaseModel


class QuestionIn(BaseModel):
    text: str

    class Config:
        extra = "ignore"  # ignores extra fields passed to schema


class QuestionOut(QuestionIn):
    id: int
    created_date: datetime


class SurveyBase(BaseModel):
    name: str

    class Config:
        extra = "ignore"  # ignores extra fields passed to schema

        
class SurveyOut(SurveyBase):
    id: int
    created_date: datetime


class SurveyQuestionsIn(SurveyBase):
    questions: List[QuestionIn]


class SurveyQuestionsOut(SurveyOut):
    questions: List[QuestionOut]
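
One way to cut down on the hand-written classes might be to derive each input shape from its output shape. The following is only a stdlib sketch of that idea using dataclasses, not a drop-in for the pydantic classes above. READ_ONLY and derive_input are hypothetical names, and the sketch assumes every resource uses the same read-only field names. pydantic's create_model might play a similar role, but that is an untested assumption here.

```python
from dataclasses import dataclass, fields, make_dataclass
from datetime import datetime

# Assumption: the same read-only field names apply to every resource.
READ_ONLY = {"id", "created_date"}


@dataclass
class QuestionOut:
    id: int
    created_date: datetime
    text: str


def derive_input(out_cls, name: str):
    """Build a dataclass with the fields of out_cls minus the read-only ones."""
    kept = [(f.name, f.type) for f in fields(out_cls) if f.name not in READ_ONLY]
    return make_dataclass(name, kept)


# Only QuestionOut is written by hand; QuestionIn is generated from it.
QuestionIn = derive_input(QuestionOut, "QuestionIn")
```

This does not handle nested fields such as questions, and it would still produce two schema definitions, so at best it addresses the first two points under "Existing Solution".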

Just for comparison, here would be the equivalent schema using marshmallow:

from marshmallow import Schema, fields


class Question(Schema):
    id = fields.Integer(dump_only=True)
    created_date = fields.DateTime(dump_only=True)
    text = fields.String(required=True)

class Survey(Schema):
    id = fields.Integer(dump_only=True)
    created_date = fields.DateTime(dump_only=True)
    name = fields.String(required=True)
    questions = fields.List(fields.Nested(Question))
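
To spell out the behaviour being asked for, independent of either library: a dump_only field is skipped on load and included on dump. Below is a minimal stdlib-only sketch of those semantics using dataclass field metadata. The DUMP_ONLY marker, loadable, and load_survey are hypothetical names for illustration. This is not how marshmallow or pydantic implement it, and it performs no type validation.

```python
from dataclasses import asdict, dataclass, field, fields
from datetime import datetime
from typing import List, Optional

# Hypothetical marker stored in each field's metadata.
DUMP_ONLY = {"dump_only": True}


@dataclass
class Question:
    text: str
    # Read-only fields stay None until the server assigns them.
    id: Optional[int] = field(default=None, metadata=DUMP_ONLY)
    created_date: Optional[datetime] = field(default=None, metadata=DUMP_ONLY)


@dataclass
class Survey:
    name: str
    questions: List[Question] = field(default_factory=list)
    id: Optional[int] = field(default=None, metadata=DUMP_ONLY)
    created_date: Optional[datetime] = field(default=None, metadata=DUMP_ONLY)


def loadable(cls, data: dict) -> dict:
    """Keep only the keys that name a field not marked dump_only."""
    allowed = {f.name for f in fields(cls) if not f.metadata.get("dump_only")}
    return {k: v for k, v in data.items() if k in allowed}


def load_survey(data: dict) -> Survey:
    """Deserialize a survey, silently ignoring dump_only keys at both levels."""
    clean = loadable(Survey, data)
    clean["questions"] = [
        Question(**loadable(Question, q)) for q in clean.get("questions", [])
    ]
    return Survey(**clean)


# The client-supplied ids are dropped on load, yet the keys appear on dump.
payload = {"id": 99, "name": "First Survey", "questions": [{"id": 7, "text": "Question 1"}]}
survey = load_survey(payload)
dumped = asdict(survey)
```

The point of the sketch is that a single class per resource carries both directions, which is what the pydantic version above needs two or three classes to express.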

References

  1. Marshmallow read-only/load-only fields
  2. Existing Stack Exchange Question
  3. Read-only fields issue on FastAPI repo
  4. FastAPI documentation on Schemas
  • Do you want a simpler alternative? The one you already have does not seem too complex. I think you can remove the Config, though. ignore is the default value for extra. Commented Dec 14, 2021 at 1:47
  • @HernánAlarcón Thanks for that note on the Config. I'm looking for a solution that doesn't require defining separate schema classes for inputs and outputs for the reasons listed under "Existing Solution". This solution isn't too complex for the toy example above but it requires 2-3x the number of classes as Marshmallow's approach with dump_only which becomes less manageable as the number of models increases. Commented Dec 15, 2021 at 1:12
  • I'm trying out FastAPI after many years of using marshmallow and I had the exact same "complaint" (great writeup btw!). I'm trying to just go with it and adjust my thinking on how this kind of thing is structured. It's a different separation of concerns, but I'm not yet convinced that it's "better". Commented Jun 28, 2022 at 23:04
  • What's wrong with using a custom validator to generate your id's as suggested in the #3 link? Also, your use of the term read-only is confusing as it makes me think you are trying to prevent modification of the field. Commented Dec 29, 2022 at 17:24
  • @GabrielG. Having a custom validator which generates the id is one option for setting the id value, but it still requires a separate pydantic class to return the id in the response body once it's been set. And that is actually what I mean by read-only. It is a common pattern within Marshmallow that they call dump_only which allows a value to be serialized but not deserialized: marshmallow.readthedocs.io/en/stable/… Commented Jan 9, 2023 at 4:12

1 Answer


According to pydantic.Field documentation:

  • init_var: Whether the field should be included in the constructor of the dataclass.
  • exclude: Whether to exclude the field from the model serialization.

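So, to exclude a field from "deserialization", which Pydantic handles in the BaseModel.__init__ method, it should be enough to set init_var to False. Note, however, that the documentation describes init_var in terms of dataclasses, so it may have no effect on a plain BaseModel.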

Rewriting your example in Pydantic would result in something like this:

from datetime import datetime

from pydantic import BaseModel, Field


class Question(BaseModel):
    id: int = Field(init_var=False)
    created_date: datetime = Field(init_var=False)
    text: str

class Survey(BaseModel):
    id: int = Field(init_var=False)
    created_date: datetime = Field(init_var=False)
    name: str
    questions: list[Question]

1 Comment

This would be an elegant solution, but it doesn't seem to work as expected. In this example the id field is still deserialized if I include it in the dictionary passed to the Question schema -- Question(**{"id": 1, "text": "hello"})
