Annotate JSON schema properties in Python with msgspec

Serhii Hrekov
software engineer, creator, artist, programmer, projects founder

To annotate JSON schema properties in Python using msgspec, you wrap a field's type in typing.Annotated and attach a msgspec.Meta object that carries the metadata and constraints. msgspec.field complements this by setting default values and renaming fields. Together they let you define a more detailed schema than the Python type hints alone, including documentation, default values, and validation rules.

Basic Annotations

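The simplest annotation combines a default value, set with msgspec.field, with a description passed to msgspec.Meta. Both are included in the generated schema.

from typing import Annotated

import msgspec

class User(msgspec.Struct):
    name: str
    age: Annotated[
        int, msgspec.Meta(description="The user's age. Must be a non-negative integer.")
    ] = msgspec.field(default=0)

# For a Struct, the schema is a $ref into $defs, so look the definition up there
schema = msgspec.json.schema(User)
age_schema = schema["$defs"]["User"]["properties"]["age"]
print(age_schema["type"], age_schema["default"])
# Output: integer 0
print(age_schema["description"])
# Output: The user's age. Must be a non-negative integer.

Here, the description argument of msgspec.Meta creates a description field in the JSON schema, providing documentation for API users. The description is documentation only; it does not enforce the non-negative rule by itself.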


Validation and Constraints

You can define constraints like minimum/maximum values, string length, and patterns by passing them to msgspec.Meta inside Annotated. These constraints are not just for documentation; they are enforced when decoding and are included in the generated schema.

from typing import Annotated

import msgspec

class Product(msgspec.Struct):
    product_id: Annotated[
        str,
        msgspec.Meta(
            pattern=r"^[A-Z]{3}-\d{5}$",
            description="A unique identifier for the product, e.g., 'ABC-12345'.",
        ),
    ] = msgspec.field(name="productId")  # Customize the JSON field name
    price: Annotated[
        float,
        # ge=0: greater than or equal to 0
        msgspec.Meta(ge=0, description="The product's price. Must be a non-negative value."),
    ]
    tags: Annotated[
        list[str],
        # max_length=5: restrict the number of items in the list
        msgspec.Meta(max_length=5, description="A list of keywords associated with the product."),
    ]

# The generated schema includes all constraints, keyed by the JSON field name
schema = msgspec.json.schema(Product)
properties = schema["$defs"]["Product"]["properties"]
print(properties["productId"]["pattern"])
# Output: ^[A-Z]{3}-\d{5}$
print(properties["tags"]["maxItems"])
# Output: 5

In this example, pattern, ge, and max_length are validation rules that a value must satisfy. In the schema they appear as the JSON Schema keywords pattern, minimum, and maxItems. If a value fails these checks during decoding, a msgspec.ValidationError is raised.


Advanced Use Cases

Renaming Fields

The name argument in msgspec.field allows you to define a different name for the property in the JSON payload than the name of the attribute in the Python class. This is useful for adhering to different naming conventions, such as camelCase in JSON and snake_case in Python.

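Aliases and Optional Fields

For optional fields, you can use default=None together with a type that allows None. msgspec.field does not take an aliases argument, however: each field maps to exactly one JSON key, which you set with name (1). Keys that match no field are ignored by default, so a payload that uses a different key leaves the field at its default.

import msgspec

class Order(msgspec.Struct):
    user_id: str = msgspec.field(name="userId")
    # This field can be null and is not required in the payload
    discount_code: str | None = msgspec.field(default=None, name="discountCode")

# A payload using the field's JSON name
payload = b'{"userId": "123", "discountCode": "DISCOUNT10"}'
order = msgspec.json.decode(payload, type=Order)
print(order)
# Output: Order(user_id='123', discount_code='DISCOUNT10')

# An unrecognized key such as "promo_code" is ignored, so the default applies
print(msgspec.json.decode(b'{"userId": "123", "promo_code": "DISCOUNT10"}', type=Order))
# Output: Order(user_id='123', discount_code=None)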

Sources

  1. Msgspec Documentation: msgspec.field