Recently at Qargo, we implemented functionality to convert Pydantic models to XML. This led us to xsdata and its extension xsdata-pydantic. While exploring its source code, I discovered an elegant pattern for storing additional metadata about XML structure (like whether a field is an element or attribute) directly on Pydantic fields.
This pattern has applications beyond XML serialization: you might use it for database column metadata, API documentation, custom validation rules, or any scenario where you need to attach extra information to your model fields. In this post, I’ll share two approaches for implementing this pattern: extending FieldInfo and using Python’s type annotations. The code examples assume Pydantic version 2.10.
Approach 1: Extending FieldInfo
The first approach, used by xsdata-pydantic, involves creating a custom FieldInfo class. This follows traditional object-oriented design by extending Pydantic’s FieldInfo with your custom attributes. Here’s how it works:
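from typing import Any

from pydantic import fields


class XMLFieldInfo(fields.FieldInfo):
    xml_metadata: dict[str, Any] | None

    # Declare the extra attribute as a slot so instances do not get a __dict__.
    __slots__ = ("xml_metadata",)

    def __init__(self, metadata: dict[str, Any] | None = None, **kwargs: Any):
        super().__init__(**kwargs)
        self.xml_metadata = metadata
        # Also expose the metadata in the generated JSON schema.
        # Note: this replaces any json_schema_extra passed in kwargs.
        self.json_schema_extra = {
            "xml_metadata": metadata,
        }


def XMLField(
    metadata: dict[str, Any] | None = None,
    **kwargs: Any,
) -> Any:
    # Factory returning Any, mirroring pydantic.Field.
    return XMLFieldInfo(
        metadata=metadata,
        **kwargs,
    )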
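Notice the use of __slots__ to maintain memory efficiency: FieldInfo itself defines __slots__, so declaring the new attribute the same way prevents the creation of a __dict__ for each field instance. The separate XMLField function serves as a factory, following the pattern of Pydantic’s own Field function. Because it returns Any, an assignment such as first_name: str = XMLField(...) still passes type checking.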
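The effect of __slots__ on a subclass is easy to check in isolation. The sketch below uses stand-in classes (Base, WithDict, and Slotted are hypothetical names, not Pydantic classes) and assumes, as is the case for FieldInfo, that the parent class defines __slots__ itself:

```python
class Base:
    """Stand-in for a slotted parent class such as FieldInfo."""

    __slots__ = ("default",)


class WithDict(Base):
    """Subclass without __slots__: every instance gets a __dict__."""


class Slotted(Base):
    """Subclass that declares its extra attribute as a slot."""

    __slots__ = ("xml_metadata",)


loose = WithDict()
loose.anything = 1  # allowed: stored in the per-instance __dict__

tight = Slotted()
tight.xml_metadata = {"type": "Element"}  # allowed: declared slot

has_dict_loose = hasattr(loose, "__dict__")
has_dict_tight = hasattr(tight, "__dict__")

try:
    tight.anything = 1  # not a declared slot, so this fails
    rejected = False
except AttributeError:
    rejected = True
```

Only the subclass that repeats the __slots__ declaration stays free of a per-instance __dict__, which is why XMLFieldInfo declares one for its single extra attribute.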
You can then use XMLField just like Pydantic’s built-in Field:
from pydantic import BaseModel


class Person(BaseModel):
    first_name: str = XMLField(
        metadata={
            "type": "Element",
            "required": True,
            "name": "first_name",
        },
        alias="first_name",
    )
Accessing the metadata is straightforward through the model’s field information, using the built-in model_fields:
first_name_field = Person.model_fields["first_name"]
print(first_name_field.xml_metadata)
# Output: {'type': 'Element', 'required': True, 'name': 'first_name'}
The Trade-off
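While this approach is clean and follows familiar OO patterns, it comes with a significant drawback: you lose typing and auto-completion support. Because XMLField forwards its arguments through **kwargs and returns Any, your IDE cannot complete or check the parameters you pass, and the metadata itself is an untyped dictionary, so a misspelled key goes unnoticed. This makes it easier to introduce bugs. Subclassing FieldInfo also ties your code to Pydantic’s internals, which is especially problematic when upgrading Pydantic versions, where the internal API might change.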
Approach 2: Using Type Annotations
The second approach leverages Python’s Annotated type, introduced in Python 3.9 (available via typing_extensions for earlier versions). This system allows you to attach metadata directly to type hints:
from dataclasses import dataclass
from typing import Annotated

from pydantic import BaseModel


@dataclass
class XMLMetadata:
    type: str
    required: bool
    name: str


class Person(BaseModel):
    first_name: Annotated[
        str,
        XMLMetadata(
            type="Element",
            required=True,
            name="first_name",
        ),
    ]
Using a dataclass for metadata provides structure and type safety. Pydantic automatically stores any annotations it doesn’t recognize as metadata on the field.
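The difference in safety shows up as soon as a key is misspelled. In this minimal sketch (the misspelled requird key is deliberate, and XMLMetadata is redefined so the snippet runs on its own), the dictionary accepts the typo silently while the dataclass rejects it at construction time:

```python
from dataclasses import dataclass


@dataclass
class XMLMetadata:
    type: str
    required: bool
    name: str


# A plain dict accepts any key, so the typo goes unnoticed.
loose_metadata = {"type": "Element", "requird": True, "name": "first_name"}

# The dataclass constructor only accepts its declared fields.
try:
    XMLMetadata(type="Element", requird=True, name="first_name")
    typo_caught = False
except TypeError:
    typo_caught = True
```

A static type checker should flag the same mistake before the code even runs, which is the main practical gain over a free-form dictionary.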
Retrieving the metadata requires iterating through the field’s metadata items:
first_name_field = Person.model_fields["first_name"]
xml_metadata = next(
    metadata_item
    for metadata_item in first_name_field.metadata
    if isinstance(metadata_item, XMLMetadata)
)
print(xml_metadata)
# Output: XMLMetadata(type='Element', required=True, name='first_name')
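If several parts of a codebase need this lookup, it can be wrapped in a small helper. The sketch below is one possible shape, not how xsdata-pydantic does it: find_metadata and to_xml are hypothetical names, XMLMetadata is declared frozen here as an extra precaution, and the "Attribute" type value is an assumption that extends the "Element" value used above. The helper returns None for fields without the requested metadata, and the result drives a tiny serializer built on the standard library’s xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Annotated, Optional, TypeVar

from pydantic import BaseModel

T = TypeVar("T")


@dataclass(frozen=True)
class XMLMetadata:
    type: str  # "Element" or "Attribute" (the latter is assumed for this sketch)
    required: bool
    name: str


class Person(BaseModel):
    id: Annotated[str, XMLMetadata(type="Attribute", required=True, name="id")]
    first_name: Annotated[
        str, XMLMetadata(type="Element", required=True, name="first_name")
    ]
    nickname: str = ""  # no XML metadata, so to_xml skips it


def find_metadata(
    model: type[BaseModel], field_name: str, kind: type[T]
) -> Optional[T]:
    """Return the first metadata item of the given type, or None if absent."""
    field = model.model_fields[field_name]
    return next((item for item in field.metadata if isinstance(item, kind)), None)


def to_xml(instance: BaseModel, root_name: str) -> str:
    """Write annotated fields as attributes or child elements of one root."""
    root = ET.Element(root_name)
    for field_name in type(instance).model_fields:
        meta = find_metadata(type(instance), field_name, XMLMetadata)
        if meta is None:
            continue
        value = str(getattr(instance, field_name))
        if meta.type == "Attribute":
            root.set(meta.name, value)
        else:
            ET.SubElement(root, meta.name).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(Person(id="42", first_name="Ada"), "Person")
print(xml_text)
# Output: <Person id="42"><first_name>Ada</first_name></Person>
```

Passing a default to next avoids the StopIteration that the bare generator expression raises when a field carries no XMLMetadata.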
Composing with Other Annotations
One powerful aspect of this approach is how well it composes with Pydantic’s existing annotation system:
from typing import Annotated

from pydantic import AfterValidator, BaseModel, Field


def validate_xml_name(value: str) -> str:
    # Rough check: Python identifier rules only approximate XML name rules.
    if not value.isidentifier():
        raise ValueError("Invalid XML element name")
    return value


class Document(BaseModel):
    root_element: Annotated[
        str,
        Field(min_length=1),
        AfterValidator(validate_xml_name),
        XMLMetadata(
            type="Element",
            required=True,
            name="root",
        ),
    ]
This co-location of all field-related information makes the schema immediately understandable.
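To check that the pieces do not interfere with each other, the following sketch (a self-contained variant of the model above, with xml_items as a hypothetical variable name and XMLMetadata declared frozen as a precaution) confirms that validation still runs and that the custom metadata remains retrievable alongside Pydantic’s own annotations:

```python
from dataclasses import dataclass
from typing import Annotated

from pydantic import AfterValidator, BaseModel, Field, ValidationError


@dataclass(frozen=True)
class XMLMetadata:
    type: str
    required: bool
    name: str


def validate_xml_name(value: str) -> str:
    if not value.isidentifier():
        raise ValueError("Invalid XML element name")
    return value


class Document(BaseModel):
    root_element: Annotated[
        str,
        Field(min_length=1),
        AfterValidator(validate_xml_name),
        XMLMetadata(type="Element", required=True, name="root"),
    ]


# Valid input passes both the length constraint and the custom validator.
doc = Document(root_element="invoice")

# Invalid input is still rejected by the validator.
try:
    Document(root_element="not a name")
    rejected = False
except ValidationError:
    rejected = True

# The custom metadata sits in the same list as Pydantic's own annotations.
field = Document.model_fields["root_element"]
xml_items = [item for item in field.metadata if isinstance(item, XMLMetadata)]
```

Filtering field.metadata by type is what keeps the custom items separate from the constraint and validator objects stored in the same list.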
Conclusion
Custom metadata in Pydantic fields opens up powerful patterns for building domain-specific frameworks. While the FieldInfo extension approach offers a familiar object-oriented pattern, the Annotated approach provides a better developer experience with full typing support and cleaner composition.
The key insight from exploring xsdata-pydantic is that Pydantic’s flexibility allows us to extend models with domain-specific information without compromising the core validation functionality. Whether you’re building XML serializers, ORM mappings, or API documentation generators, these patterns provide a solid foundation for metadata-driven development.