Getting Started
Mashumaro is a fast and well-tested serialization library built on top of Python dataclasses. It lets you convert dataclass instances to and from JSON, YAML, TOML, MessagePack, and plain dictionaries with minimal effort.
Installation
Install with pip:
pip install mashumaro
Mashumaro supports Python 3.9 through 3.14.
To install with optional format support, use extras:
pip install mashumaro[orjson] # Fast JSON via orjson
pip install mashumaro[yaml] # YAML support
pip install mashumaro[toml] # TOML support
pip install mashumaro[msgpack] # MessagePack support
# Or install everything at once:
pip install mashumaro[orjson,yaml,toml,msgpack]
Quick Example
Add serialization to any dataclass by inheriting a mixin:
from dataclasses import dataclass
from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class User(DataClassJSONMixin):
    name: str
    email: str
    age: int

# Serialize to JSON
user = User(name="Alice", email="alice@example.com", age=30)
json_str = user.to_json()

# Deserialize from JSON
restored = User.from_json(json_str)
assert restored == user

That's it. No schema definitions, no field mappings, no boilerplate.
Mixins vs Codecs
Mashumaro offers two fundamentally different approaches. See Supported Formats for all available formats.
Mixins
Mixins are ideal when you have a root dataclass model. Inherit a mixin and get to_ and from_ methods directly on your class:
from dataclasses import dataclass
from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class Config(DataClassJSONMixin):
    host: str
    port: int
    debug: bool = False

config = Config(host="localhost", port=8080)
data = config.to_json()   # serialize
Config.from_json(data)    # deserialize
Codecs
Codecs are more flexible. They can encode and decode any type, not just dataclasses:
from mashumaro.codecs.json import JSONDecoder, JSONEncoder
encoder = JSONEncoder(list[Config])
decoder = JSONDecoder(list[Config])
configs = [Config("host1", 80), Config("host2", 443)]
json_str = encoder.encode(configs)
restored = decoder.decode(json_str)

Codecs are especially useful when you need to serialize collections or non-dataclass types.
How It Works
Mashumaro analyzes your dataclass schema and generates optimized Python code for serialization and deserialization. This code is compiled once:
- Mixins: compiled at import time (or lazily on first call, see lazy_compilation)
- Codecs: compiled when the encoder/decoder is created
Every subsequent call runs this pre-compiled code directly, with no runtime type introspection.
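The "generate once, run many times" idea can be sketched with the standard library alone. This is only an illustration of the technique, not Mashumaro's actual generator; compile_to_dict and Point are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, fields


def compile_to_dict(cls):
    """Generate source for a to_dict specialised to cls and compile it once."""
    lines = ["def to_dict(self):", "    return {"]
    for f in fields(cls):
        # One literal dict entry per field: no introspection at call time.
        lines.append(f"        {f.name!r}: self.{f.name},")
    lines.append("    }")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace["to_dict"]


@dataclass
class Point:
    x: int
    y: int


# Attach the compiled function; later calls run the generated code directly.
Point.to_dict = compile_to_dict(Point)
point_dict = Point(1, 2).to_dict()
```

The field loop runs only while the function is being generated; the compiled to_dict is a plain dict literal over attribute reads.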
Supported Formats
Mashumaro has built-in support for multiple serialization formats. Each format comes with both a mixin class and a codec pair. For the full list of supported Python types, see Supported Types.
Basic Form (Dict)
The basic form converts dataclasses to and from plain Python dictionaries containing only primitive types (str, int, float, bool, list, dict).
Mixin:
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class MyModel(DataClassDictMixin):
    name: str
    value: int

obj = MyModel(name="test", value=42)
obj.to_dict()  # {'name': 'test', 'value': 42}
MyModel.from_dict({"name": "test", "value": 42})  # deserialize
Codec:
from mashumaro.codecs import BasicDecoder, BasicEncoder
encoder = BasicEncoder(list[MyModel])
decoder = BasicDecoder(list[MyModel])

DataClassDictMixin is the base class for all other format mixins. You don't need to inherit it separately if you're already using a format-specific mixin.
JSON
Standard Library
from mashumaro.mixins.json import DataClassJSONMixin

@dataclass
class MyModel(DataClassJSONMixin):
    name: str
    value: int

obj = MyModel(name="test", value=42)
json_str = obj.to_json()     # '{"name": "test", "value": 42}'
MyModel.from_json(json_str)  # deserialize
Codec:
from mashumaro.codecs.json import JSONDecoder, JSONEncoder
encoder = JSONEncoder(MyModel)
decoder = JSONDecoder(MyModel)
orjson
For better JSON performance, install via pip install mashumaro[orjson]:
from mashumaro.mixins.orjson import DataClassORJSONMixin

@dataclass
class MyModel(DataClassORJSONMixin):
    name: str
    value: int

obj = MyModel(name="test", value=42)
obj.to_json()   # returns str
obj.to_jsonb()  # returns bytes (faster)
Codec:
from mashumaro.codecs.orjson import ORJSONDecoder, ORJSONEncoder
encoder = ORJSONEncoder(MyModel)
decoder = ORJSONDecoder(MyModel)

The types datetime, date, time, and uuid.UUID are handled natively by orjson. You can configure orjson behavior with the orjson_options config option.
YAML
Requires pyyaml. Install via pip install mashumaro[yaml].
from mashumaro.mixins.yaml import DataClassYAMLMixin

@dataclass
class MyModel(DataClassYAMLMixin):
    name: str
    items: list[str]

obj = MyModel(name="test", items=["a", "b"])
yaml_str = obj.to_yaml()     # serialize
MyModel.from_yaml(yaml_str)  # deserialize
Codec:
from mashumaro.codecs.yaml import YAMLDecoder, YAMLEncoder
encoder = YAMLEncoder(MyModel)
decoder = YAMLDecoder(MyModel)
TOML
Requires tomli and tomli-w. Install via pip install mashumaro[toml].
from mashumaro.mixins.toml import DataClassTOMLMixin

@dataclass
class AppConfig(DataClassTOMLMixin):
    title: str
    debug: bool
Codec:
from mashumaro.codecs.toml import TOMLDecoder, TOMLEncoder
encoder = TOMLEncoder(AppConfig)
decoder = TOMLDecoder(AppConfig)

Fields with None values are omitted during TOML serialization because TOML does not support null values.
MessagePack
Requires msgpack. Install via pip install mashumaro[msgpack].
from mashumaro.mixins.msgpack import DataClassMessagePackMixin

@dataclass
class Event(DataClassMessagePackMixin):
    type: str
    payload: bytes

obj = Event(type="ping", payload=b"\x00\x01")
packed_bytes = obj.to_msgpack()    # serialize
Event.from_msgpack(packed_bytes)   # deserialize
Codec:
from mashumaro.codecs.msgpack import MessagePackDecoder, MessagePackEncoder
encoder = MessagePackEncoder(Event)
decoder = MessagePackDecoder(Event)
Supported Types
Mashumaro supports virtually every built-in Python type and many from typing-extensions. For types not listed here, you can add support via SerializableType or SerializationStrategy.
Generic Types from typing
List, Tuple, NamedTuple, Set, FrozenSet, Deque, Dict, OrderedDict, DefaultDict, TypedDict, Mapping, MutableMapping, Counter, ChainMap, Sequence
PEP 585 Generic Types (Python 3.9+)
list, tuple, set, frozenset, collections.deque, dict, collections.OrderedDict, collections.defaultdict, collections.abc.Mapping, collections.Counter, collections.ChainMap, collections.abc.Sequence, collections.abc.MutableSequence
Special Typing Primitives
Any, Optional, Union, TypeVar, TypeVarTuple, NewType, Annotated, Literal, LiteralString, Final, Self, Unpack, ReadOnly
Enumerations
Enum, IntEnum, StrEnum, Flag, IntFlag
Common Built-in Types
int, float, bool, str, bytes, bytearray
Datetime Types
datetime, date, time, timedelta, timezone, ZoneInfo
Path Types
PurePath, Path, PurePosixPath, PosixPath, PureWindowsPath, WindowsPath, os.PathLike
Other Built-in Types
uuid.UUID, decimal.Decimal, fractions.Fraction, ipaddress.IPv4Address, ipaddress.IPv6Address, ipaddress.IPv4Network, ipaddress.IPv6Network, re.Pattern
SerializableType Interface
If you have a custom class whose instances you want to serialize, implement the SerializableType interface for full control over conversion. For third-party types you can't modify, see SerializationStrategy instead.
Basic Usage
from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.types import SerializableType

class Airport(SerializableType):
    def __init__(self, code, city):
        self.code = code
        self.city = city

    def _serialize(self):
        return [self.code, self.city]

    @classmethod
    def _deserialize(cls, value):
        return cls(*value)

@dataclass
class Flight(DataClassDictMixin):
    origin: Airport
    destination: Airport

data = {"origin": ["JFK", "New York"], "destination": ["LAX", "Los Angeles"]}
flight = Flight.from_dict(data)
assert flight.to_dict() == data
Using Annotations
By default, _deserialize receives raw input data. Enable use_annotations=True for automatic type conversion:
class Itinerary(SerializableType, use_annotations=True):
    def __init__(self, flights):
        self.flights = flights

    def _serialize(self) -> list[Flight]:
        return self.flights

    @classmethod
    def _deserialize(cls, flights: list[Flight]):
        return cls(flights)

use_annotations=True must be explicitly passed. It will become the default in a future major release.
SerializationStrategy
When you need to customize serialization for a type you don't control, use SerializationStrategy. Unlike SerializableType, it doesn't require modifying the target class.
Basic Example
from dataclasses import dataclass, field
from datetime import datetime
from mashumaro import DataClassDictMixin, field_options
from mashumaro.types import SerializationStrategy

class FormattedDateTime(SerializationStrategy):
    def __init__(self, fmt):
        self.fmt = fmt

    def serialize(self, value):
        return value.strftime(self.fmt)

    def deserialize(self, value):
        return datetime.strptime(value, self.fmt)

@dataclass
class DateTimeFormats(DataClassDictMixin):
    short: datetime = field(
        metadata=field_options(
            serialization_strategy=FormattedDateTime("%d%m%Y%H%M%S")
        )
    )
    verbose: datetime = field(
        metadata=field_options(
            serialization_strategy=FormattedDateTime("%A %B %d, %Y, %H:%M:%S")
        )
    )
Registering in Config
Instead of attaching strategies to individual fields, register them globally via Config:
@dataclass
class MyModel(DataClassDictMixin):
    created_at: datetime
    updated_at: datetime

    class Config:
        serialization_strategy = {
            datetime: FormattedDateTime("%Y-%m-%d"),
        }

Strategies can also be applied via Dialects for context-dependent behavior.
Field Options
For per-field customization use the metadata parameter of dataclasses.field(). For global customization, see Config Options.
from dataclasses import dataclass, field
from mashumaro import field_options

@dataclass
class MyModel:
    x: int = field(metadata=field_options(...))
serialize
Change how a field is serialized. Accepts a callable or a string engine name:
@dataclass
class Event(DataClassDictMixin):
    created_at: datetime = field(
        metadata={
            "serialize": lambda v: v.strftime("%Y-%m-%d %H:%M:%S")
        }
    )

| Applicable Types | Engine | Description |
|---|---|---|
| NamedTuple | as_list, as_dict | Pack as lists (default) or dicts |
| Any | omit | Skip the field during serialization |
deserialize
Change how a field is deserialized:
| Applicable Types | Engine | Description |
|---|---|---|
| datetime, date, time | ciso8601, pendulum | Alternative datetime parsers |
| NamedTuple | as_list, as_dict | Unpack from lists or dicts |
alias
Assign a different name for deserialization. See also aliases in Config:
@dataclass
class Response(DataClassDictMixin):
    status_code: int = field(metadata=field_options(alias="statusCode"))
    error_message: str = field(metadata=field_options(alias="errorMessage"))

obj = Response.from_dict({"statusCode": 200, "errorMessage": ""})
pass_through
Pass a field value as-is without any transformation:
from mashumaro import pass_through

@dataclass
class Raw(DataClassDictMixin):
    data: MyCustomClass = field(
        metadata={
            "serialize": pass_through,
            "deserialize": pass_through,
        }
    )
Config Options
The inner Config class provides centralized control over serialization. Config options are inherited by subclasses. For per-field control, see Field Options.
from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig

class BaseModel(DataClassDictMixin):
    class Config(BaseConfig):
        debug = True

@dataclass
class ModelA(BaseModel):  # inherits debug = True from BaseModel.Config
    a: int
debug
Print the generated code for inspection.
serialization_strategy
Register strategies for specific types across all fields:
class Config(BaseConfig):
    serialization_strategy = {
        datetime: FormattedDateTime("%Y"),
    }
aliases
Define field aliases in one place. See also the alias field option:
class Config(BaseConfig):
    aliases = {"field_a": "FieldA", "field_b": "FieldB"}
serialize_by_alias
Serialize all aliased fields using their alias names.
omit_none
Skip fields with None values during serialization:
@dataclass
class Sparse(DataClassDictMixin):
    a: int | None = None
    b: str | None = None

    class Config(BaseConfig):
        omit_none = True

Sparse().to_dict()     # {}
Sparse(a=1).to_dict()  # {"a": 1}

You can also control this at call time via code_generation_options.
omit_default
Skip fields whose values equal the default.
namedtuple_as_dict
Serialize NamedTuple fields as dicts instead of lists.
dialect
Set a default dialect for this dataclass.
orjson_options
Set default options for orjson.dumps when using DataClassORJSONMixin. See Supported Formats for more on orjson.
discriminator
Configure discriminated unions. See the Discriminator section.
lazy_compilation
Defer code generation from import time to first use. Reduces import time for apps with many dataclasses.
sort_keys
Sort dictionary keys in serialization output.
forbid_extra_keys
Raise an error during deserialization if input contains unexpected keys.
Dialects
Dialects separate serialization schemes from data models. Unlike Config options, which are fixed when the class is defined, dialects can be switched at call time.
Defining a Dialect
from datetime import date, datetime
from mashumaro.dialect import Dialect
from mashumaro.types import SerializationStrategy

class DateTimeFmt(SerializationStrategy):
    def __init__(self, fmt):
        self.fmt = fmt

    def serialize(self, value):
        return value.strftime(self.fmt)

    def deserialize(self, value):
        return datetime.strptime(value, self.fmt).date()

class EthiopianDialect(Dialect):
    serialization_strategy = {
        date: DateTimeFmt("%d/%m/%Y"),
    }

class JapaneseDialect(Dialect):
    serialization_strategy = {
        date: DateTimeFmt("%Y年%m月%d日"),
    }
Using Dialects at Call Time
Enable dialect support via code generation options, then pass a dialect:
from mashumaro.config import BaseConfig, ADD_DIALECT_SUPPORT

@dataclass
class Entity(DataClassDictMixin):
    dt: date

    class Config(BaseConfig):
        code_generation_options = [ADD_DIALECT_SUPPORT]

entity = Entity(date(2021, 12, 31))
entity.to_dict(dialect=EthiopianDialect)
entity.to_dict(dialect=JapaneseDialect)
Dialect Options
| Option | Description |
|---|---|
| serialization_strategy | Type-to-strategy mapping |
| serialize_by_alias | Serialize fields by their aliases |
| omit_none | Skip None values |
| omit_default | Skip default values |
| namedtuple_as_dict | Serialize named tuples as dicts |
| no_copy_collections | Skip copying collections for performance |
Discriminator
Discriminated unions allow automatic subclass selection during deserialization based on a field value.
By a Field Value
from dataclasses import dataclass
from mashumaro import DataClassDictMixin
from mashumaro.types import Discriminator

@dataclass
class Event(DataClassDictMixin):
    type: str

    class Config:
        discriminator = Discriminator(field="type", include_subtypes=True)

@dataclass
class ClickEvent(Event):
    type = "click"
    x: int
    y: int

@dataclass
class KeyEvent(Event):
    type = "keypress"
    key: str

event = Event.from_dict({"type": "click", "x": 10, "y": 20})
assert isinstance(event, ClickEvent)
Without a Common Field
When subclasses don't share a discriminator field, Mashumaro tries each subclass and picks the one whose fields match:
@dataclass
class Event(DataClassDictMixin):
    class Config:
        discriminator = Discriminator(include_subtypes=True)
Custom Variant Tagger
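For more control over tag values, provide variant_tagger_fn. It receives a variant class and returns the tag value that selects that class, so tags can be derived from the class itself instead of a class attribute:
def event_tagger(cls):
    # Called with each variant class; returns the tag that selects it
    return cls.__name__.lower()

class Config:
    discriminator = Discriminator(
        field="type",
        include_subtypes=True,
        variant_tagger_fn=event_tagger,
    )
With Union Types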
Discriminators also work in Union type annotations using Annotated:
from typing import Annotated

@dataclass
class Container(DataClassDictMixin):
    event: Annotated[
        ClickEvent | KeyEvent,
        # include_supertypes=True considers the union members themselves
        Discriminator(field="type", include_supertypes=True),
    ]
Code Generation Options
Some features require additional generated code. These are opt-in via code_generation_options in Config.
Available Options
| Constant | Description |
|---|---|
| TO_DICT_ADD_OMIT_NONE_FLAG | Adds omit_none kwarg to to_* methods |
| TO_DICT_ADD_BY_ALIAS_FLAG | Adds by_alias kwarg to to_* methods |
| ADD_DIALECT_SUPPORT | Adds dialect kwarg to from_ and to_ methods |
| ADD_SERIALIZATION_CONTEXT | Adds context kwarg to to_* methods |
Example: omit_none
from mashumaro.config import BaseConfig
from mashumaro.config import TO_DICT_ADD_OMIT_NONE_FLAG

@dataclass
class Model(DataClassDictMixin):
    a: int
    b: str | None = None

    class Config(BaseConfig):
        code_generation_options = [TO_DICT_ADD_OMIT_NONE_FLAG]

Model(a=1).to_dict(omit_none=True)   # {"a": 1}
Model(a=1).to_dict(omit_none=False)  # {"a": 1, "b": None}

See also Dialects for the ADD_DIALECT_SUPPORT option.
Generic Dataclasses
Mashumaro fully supports generic dataclasses and variadic generic types.
Generic Inheritance
from typing import Generic, Mapping, TypeVar
from datetime import date

KT = TypeVar("KT")
VT = TypeVar("VT")

@dataclass
class GenericModel(Generic[KT, VT]):
    data: Mapping[KT, VT]

@dataclass
class StringDateModel(GenericModel[str, date], DataClassDictMixin):
    pass

StringDateModel.from_dict({"data": {"key": "2021-01-01"}})
Generic Dataclass in a Field Type
T = TypeVar("T")

@dataclass
class Wrapper(Generic[T], DataClassDictMixin):
    value: T

@dataclass
class Container(DataClassDictMixin):
    date_wrapper: Wrapper[date]
    str_wrapper: Wrapper[str]
GenericSerializableType
For advanced control over generic serialization, implement GenericSerializableType:
from mashumaro.types import GenericSerializableType

class DictWrapper(dict[KT, VT], GenericSerializableType):
    __packers__ = {date: lambda x: x.isoformat(), str: str}
    __unpackers__ = {date: date.fromisoformat, str: str}

    def _serialize(self, types):
        k_type, v_type = types
        k_conv = self.__packers__[k_type]
        v_conv = self.__packers__[v_type]
        return {k_conv(k): v_conv(v) for k, v in self.items()}

    @classmethod
    def _deserialize(cls, value, types):
        k_type, v_type = types
        k_conv = cls.__unpackers__[k_type]
        v_conv = cls.__unpackers__[v_type]
        return cls({k_conv(k): v_conv(v) for k, v in value.items()})
Serialization Hooks
Hooks let you intercept data at four stages of the lifecycle. They work with every serialization format.
Before Deserialization
@dataclass
class CaseInsensitiveModel(DataClassJSONMixin):
    name: str
    age: int

    @classmethod
    def __pre_deserialize__(cls, d):
        return {k.lower(): v for k, v in d.items()}

CaseInsensitiveModel.from_dict({"NAME": "Alice", "AGE": 30})
After Deserialization
@classmethod
def __post_deserialize__(cls, obj):
    if obj.score < 0:
        obj.score = 0
    return obj
Before Serialization
def __pre_serialize__(self):
    self._serialize_count += 1
    return self
After Serialization
def __post_serialize__(self, d):
    d.pop("password")
    return d
Hook Summary
| Hook | Type | Input | Output |
|---|---|---|---|
| __pre_deserialize__ | classmethod | dict | dict |
| __post_deserialize__ | classmethod | instance | instance |
| __pre_serialize__ | instance method | self | self |
| __post_serialize__ | instance method | dict | dict |
JSON Schema
Mashumaro can generate JSON Schema from your dataclasses for validation, documentation, and interoperability.
Building a Schema
from dataclasses import dataclass
from mashumaro.jsonschema import build_json_schema

@dataclass
class User:
    name: str
    age: int
    email: str | None = None

schema = build_json_schema(User)
print(schema.to_json())

Works with non-dataclass types too:
schema = build_json_schema(list[User])
Schema Constraints
Use Annotated with constraint annotations:
from typing import Annotated
from mashumaro.jsonschema.annotations import (
    Maximum, Minimum, MaxLength, MinLength, Pattern,
)

@dataclass
class Product:
    name: Annotated[str, MinLength(1), MaxLength(100)]
    price: Annotated[float, Minimum(0)]
    sku: Annotated[str, Pattern(r"^[A-Z]{2}\d{6}$")]
    quantity: Annotated[int, Minimum(0), Maximum(10000)]

Numeric: Minimum, Maximum, ExclusiveMinimum, ExclusiveMaximum, MultipleOf
String: MinLength, MaxLength, Pattern
Array: MinItems, MaxItems, UniqueItems, Contains, MinContains, MaxContains
Object: MinProperties, MaxProperties, DependentRequired
Plugins
from mashumaro.jsonschema.plugins import DocstringDescriptionPlugin
schema = build_json_schema(User, plugins=[DocstringDescriptionPlugin()])