dumps() function
- rapidjson.dumps(obj, *, skipkeys=False, ensure_ascii=True, write_mode=WM_COMPACT, indent=4, default=None, sort_keys=False, number_mode=None, datetime_mode=None, uuid_mode=None, bytes_mode=BM_UTF8, iterable_mode=IM_ANY_ITERABLE, mapping_mode=MM_ANY_MAPPING, allow_nan=True)
Encode the given Python obj instance into a JSON string.
- Parameters:
obj – the value to be serialized
skipkeys (bool) – whether invalid dict keys will be skipped
ensure_ascii (bool) – whether the output should contain only ASCII characters
write_mode (int) – enable particular pretty print behaviors
indent – indentation width or string to produce pretty printed JSON
default (callable) – a function that gets called for objects that can’t otherwise be serialized
sort_keys (bool) – whether dictionary keys should be sorted alphabetically
number_mode (int) – enable particular behaviors in handling numbers
datetime_mode (int) – how datetime, time and date instances should be handled
uuid_mode (int) – how UUID instances should be handled
bytes_mode (int) – how bytes instances should be handled
iterable_mode (int) – how iterable values should be handled
mapping_mode (int) – how mapping values should be handled
allow_nan (bool) – compatibility flag equivalent to number_mode=NM_NAN
- Returns:
A Python str instance.
skipkeys
If skipkeys is true (default: False), then dict keys that are not of a basic type (str, int, float, bool, None) will be skipped instead of raising a TypeError:
>>> dumps({(0,): 'empty tuple', True: 'a true value'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keys must be strings
>>> dumps({(0,): 'empty tuple', True: 'a true value'}, skipkeys=True)
'{}'
Note
skipkeys is a backward compatible alias of the new MM_SKIP_NON_STRING_KEYS mapping mode.
ensure_ascii
If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is:
>>> dumps('The symbol for the Euro currency is €')
'"The symbol for the Euro currency is \\u20AC"'
>>> dumps('The symbol for the Euro currency is €',
...       ensure_ascii=False)
'"The symbol for the Euro currency is €"'
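Both outputs above encode the same string; only the escaping differs. As a quick check that does not depend on python-rapidjson, the sketch below decodes the two JSON texts with the standard library json.loads (used here only as a neutral decoder, which is an assumption of this example) and compares the results.

```python
import json

# The two JSON documents shown above: escaped (ensure_ascii=True)
# and literal (ensure_ascii=False).
escaped = '"The symbol for the Euro currency is \\u20AC"'
literal = '"The symbol for the Euro currency is €"'

# A compliant JSON decoder maps both texts to the same Python string.
decoded_escaped = json.loads(escaped)
decoded_literal = json.loads(literal)
same_text = decoded_escaped == decoded_literal

# Only the escaped form is pure ASCII (str.isascii needs Python 3.7+).
escaped_is_ascii = escaped.isascii()
literal_is_ascii = literal.isascii()
```

The choice between the two is therefore about the transport or storage medium, not about the data that a consumer will read back.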
write_mode
The write_mode controls how python-rapidjson emits JSON: by default it is WM_COMPACT, which produces the most compact JSON representation:
>>> dumps([1, 2, {'three': 3, 'four': 4}])
'[1,2,{"four":4,"three":3}]'
With WM_PRETTY it will use RapidJSON’s PrettyWriter, with a default indent (see below) of four spaces:
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], write_mode=WM_PRETTY))
[
    1,
    2,
    {
        "four": 4,
        "three": 3
    }
]
With WM_SINGLE_LINE_ARRAY arrays will be kept on a single line:
>>> print(dumps([1, 2, 'three', [4, 5]], write_mode=WM_SINGLE_LINE_ARRAY))
[1, 2, "three", [4, 5]]
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], write_mode=WM_SINGLE_LINE_ARRAY))
[1, 2, {
    "three": 3,
    "four": 4
}]
indent
The indent parameter may be either a non-negative integer or a string: in the former case it specifies a number of spaces, while in the latter the string may contain zero or more ASCII whitespace characters (space, tab \t, newline \n and carriage-return \r), all equal (that is, "\n\t" is not accepted).
The integer number or the length of the string determines how many spaces (or how many of the characters composing the string) will be used to indent nested structures when the write_mode above is not WM_COMPACT, and it defaults to 4. Specifying a value different from None automatically sets write_mode to WM_PRETTY, unless write_mode is given explicitly.
By setting indent to 0 each array item (when write_mode is not WM_SINGLE_LINE_ARRAY) and each dictionary value will be followed by a newline. A positive integer means that each level will be indented by that many spaces:
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], indent=0))
[
1,
2,
{
"four": 4,
"three": 3
}
]
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], indent=2))
[
  1,
  2,
  {
    "four": 4,
    "three": 3
  }
]
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], indent=""))
[
1,
2,
{
"four": 4,
"three": 3
}
]
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], indent=" "))
[
 1,
 2,
 {
  "four": 4,
  "three": 3
 }
]
>>> print(dumps([1, 2, {'three': 3, 'four': 4}], indent="\t"))
[
	1,
	2,
	{
		"three": 3,
		"four": 4
	}
]
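The rule for acceptable indent values can be restated as a small validator. This is only a sketch of the rule as described above: normalize_indent is a hypothetical helper, not part of python-rapidjson, and the exception types it raises are assumptions rather than the library's actual behaviour.

```python
ASCII_WHITESPACE = " \t\n\r"

def normalize_indent(indent):
    """Return (character, count) for an indent value.

    Hypothetical helper that mirrors the documented rule; it is not
    the library's implementation.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(indent, int) and not isinstance(indent, bool):
        if indent < 0:
            raise ValueError("indent must be a non-negative integer")
        return (" ", indent)
    if isinstance(indent, str):
        if indent == "":
            # Empty string: newlines only, no indentation (like 0).
            return (" ", 0)
        first = indent[0]
        # Every character must be the same ASCII whitespace character.
        if first not in ASCII_WHITESPACE or indent != first * len(indent):
            raise ValueError(
                "indent string must repeat one ASCII whitespace character")
        return (first, len(indent))
    raise TypeError("indent must be an integer or a string")
```

Under this reading, normalize_indent(2) and normalize_indent("  ") describe the same layout, while "\n\t" is rejected because it mixes two different characters.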
default
The default argument may be used to specify a custom serializer for objects that are not otherwise handled. If specified, it should be a function that gets called for such objects and either returns a JSON encodable version of the object or raises a TypeError:
>>> class Point(object):
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...
>>> point = Point(1,2)
>>> dumps(point)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: <__main__.Point object at …> is not JSON serializable
>>> def point_jsonifier(obj):
...     if isinstance(obj, Point):
...         return {'x': obj.x, 'y': obj.y}
...     else:
...         raise ValueError('%r is not JSON serializable' % obj)
...
>>> dumps(point, default=point_jsonifier)
'{"y":2,"x":1}'
sort_keys
When sort_keys is true (default: False), the JSON representation of Python dictionaries is sorted by key:
>>> dumps(point, default=point_jsonifier, sort_keys=True)
'{"x":1,"y":2}'
Note
sort_keys is a backward compatible alias of the new MM_SORT_KEYS mapping mode.
number_mode
The number_mode argument selects different behaviors in handling numeric values.
By default non-numbers (nan, inf, -inf) will be serialized as their JavaScript equivalents (NaN, Infinity, -Infinity), because NM_NAN is on by default (NB: this is not compliant with the JSON standard):
>>> nan = float('nan')
>>> inf = float('inf')
>>> dumps([nan, inf])
'[NaN,Infinity]'
>>> dumps([nan, inf], number_mode=NM_NAN)
'[NaN,Infinity]'
By explicitly setting number_mode, or by using the compatibility option allow_nan, you can avoid that and obtain a ValueError exception instead:
>>> dumps([nan, inf], number_mode=NM_NATIVE)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Out of range float values are not JSON compliant
>>> dumps([nan, inf], allow_nan=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Out of range float values are not JSON compliant
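The allow_nan option has the same name as an argument of the standard library json.dumps, which is presumably the compatibility it refers to. The sketch below uses only the standard library to show that reference behaviour; it does not call python-rapidjson, and the spacing of the output differs because json uses ", " as its default item separator.

```python
import json

nan = float('nan')
inf = float('inf')

# By default the standard library also emits the JavaScript literals.
stdlib_default = json.dumps([nan, inf])

# With allow_nan=False it refuses them and raises ValueError.
try:
    json.dumps([nan, inf], allow_nan=False)
except ValueError as exc:
    stdlib_error = exc
else:
    stdlib_error = None
```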
Likewise, Decimal instances cause a TypeError exception:
>>> from decimal import Decimal
>>> pi = Decimal('3.1415926535897932384626433832795028841971')
>>> dumps(pi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Decimal(…) is not JSON serializable
while using NM_DECIMAL they will be serialized as their textual representation like any other float value:
>>> dumps(pi, number_mode=NM_DECIMAL)
'3.1415926535897932384626433832795028841971'
Yet another possible flag affects how numeric values are passed to the underlying RapidJSON library: by default they are serialized to their string representation by the module itself, so they are virtually of unlimited precision:
>>> dumps(123456789012345678901234567890)
'123456789012345678901234567890'
With NM_NATIVE their binary values will be passed directly instead: this is somewhat faster, but it is subject to the limits of the underlying C library long long and double types:
>>> dumps(123456789012345678901234567890, number_mode=NM_NATIVE)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: int too big to convert
These flags can be combined together:
>>> fast_and_precise = NM_NATIVE | NM_DECIMAL | NM_NAN
>>> dumps([-1, nan, pi], number_mode=fast_and_precise)
'[-1,NaN,3.1415926535897932384626433832795028841971]'
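The limits mentioned for native C types can be made concrete with plain Python, without calling python-rapidjson. This is a rough illustration that assumes a 64-bit long long and an IEEE 754 double, which is the common case but not something this page states.

```python
from decimal import Decimal

pi = Decimal('3.1415926535897932384626433832795028841971')

# Going through a double keeps only about 17 significant digits,
# which is why a textual representation is needed for Decimal values.
as_double = float(pi)
lost_digits = Decimal(repr(as_double)) != pi

# The integer used in the NM_NATIVE example above is far outside the
# range of a signed 64-bit integer (assumed size of long long).
big = 123456789012345678901234567890
LLONG_MAX = 2**63 - 1
LLONG_MIN = -2**63
fits_long_long = LLONG_MIN <= big <= LLONG_MAX
```

Here repr(as_double) is '3.141592653589793', so most of the digits of pi are gone, and fits_long_long is False.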
datetime_mode
By default date, datetime and time instances are not serializable:
>>> from datetime import datetime
>>> right_now = datetime(2016, 8, 28, 13, 14, 52, 277256)
>>> date = right_now.date()
>>> time = right_now.time()
>>> dumps({'date': date, 'time': time, 'timestamp': right_now})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: datetime(…) is not JSON serializable
When datetime_mode is set to DM_ISO8601 those values are serialized using the common ISO 8601 format:
>>> dumps(['date', date, 'time', time, 'timestamp', right_now],
...       datetime_mode=DM_ISO8601)
'["date","2016-08-28","time","13:14:52.277256","timestamp","2016-08-28T13:14:52.277256"]'
The right_now value is a naïve datetime (because it does not carry the timezone information) and is normally assumed to be in the local timezone, whatever your system thinks it is. When you know instead that your values, despite being naïve, are actually in the UTC timezone, you can use the DM_NAIVE_IS_UTC flag to inform RapidJSON about that:
>>> mode = DM_ISO8601 | DM_NAIVE_IS_UTC
>>> dumps(['time', time, 'timestamp', right_now], datetime_mode=mode)
'["time","13:14:52.277256+00:00","timestamp","2016-08-28T13:14:52.277256+00:00"]'
A variant is DM_SHIFT_TO_UTC, which shifts all datetime values to the UTC timezone before serializing them:
>>> from datetime import timedelta, timezone
>>> here = timezone(timedelta(hours=2))
>>> now = datetime(2016, 8, 28, 20, 31, 11, 84418, here)
>>> dumps(now, datetime_mode=DM_ISO8601)
'"2016-08-28T20:31:11.084418+02:00"'
>>> mode = DM_ISO8601 | DM_SHIFT_TO_UTC
>>> dumps(now, datetime_mode=mode)
'"2016-08-28T18:31:11.084418+00:00"'
With DM_IGNORE_TZ the timezone, if present, is simply omitted:
>>> mode = DM_ISO8601 | DM_IGNORE_TZ
>>> dumps(now, datetime_mode=mode)
'"2016-08-28T20:31:11.084418"'
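For readers more familiar with the datetime module, the three ISO 8601 outputs shown above for now correspond to the following standard library expressions. This is an analogy meant to clarify what each flag does to the value, not a description of how python-rapidjson computes it.

```python
from datetime import datetime, timedelta, timezone

here = timezone(timedelta(hours=2))
now = datetime(2016, 8, 28, 20, 31, 11, 84418, here)

# DM_ISO8601 alone: the value with its own UTC offset.
as_is = now.isoformat()

# DM_SHIFT_TO_UTC: the same instant, expressed in UTC.
shifted = now.astimezone(timezone.utc).isoformat()

# DM_IGNORE_TZ: the same wall-clock time, with the offset dropped.
ignored = now.replace(tzinfo=None).isoformat()
```

Note that shifted still denotes the same instant as as_is, whereas ignored discards information and can no longer be placed unambiguously on the timeline.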
Another one-way only alternative format is Unix time: with DM_UNIX_TIME, date, datetime and time instances are serialized as a number of seconds, respectively since the EPOCH for the first two kinds and since midnight for the latter:
>>> mode = DM_UNIX_TIME | DM_NAIVE_IS_UTC
>>> dumps([now, now.date(), now.time()], datetime_mode=mode)
'[1472409071.084418,1472342400.0,73871.084418]'
>>> unixtime = float(dumps(now, datetime_mode=mode))
>>> datetime.fromtimestamp(unixtime, here) == now
True
Combining it with DM_ONLY_SECONDS produces integer values instead, dropping microseconds:
>>> mode = DM_UNIX_TIME | DM_NAIVE_IS_UTC | DM_ONLY_SECONDS
>>> dumps([now, now.date(), now.time()], datetime_mode=mode)
'[1472409071,1472342400,73871]'
DM_UNIX_TIME can also be combined with DM_SHIFT_TO_UTC to obtain the timestamp of the corresponding UTC time:
>>> mode = DM_UNIX_TIME | DM_SHIFT_TO_UTC
>>> dumps(now, datetime_mode=mode)
'1472409071.084418'
As above, when you know that your naïve values are in the UTC timezone, you can use the DM_NAIVE_IS_UTC flag to get the right result:
>>> a_long_time_ago = datetime(1968, 3, 18, 9, 10, 0, 0)
>>> mode = DM_UNIX_TIME | DM_NAIVE_IS_UTC
>>> dumps([a_long_time_ago, a_long_time_ago.date(), a_long_time_ago.time()],
...       datetime_mode=mode)
'[-56472600.0,-56505600.0,33000.0]'
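The Unix time figures shown in this section can be reproduced with the standard library alone, which may help when checking what each number means. The sketch below is an interpretation of the documented outputs, not the library's code; in particular, treating a date as midnight UTC matches the DM_NAIVE_IS_UTC flag used in the examples.

```python
from datetime import datetime, timedelta, timezone

here = timezone(timedelta(hours=2))
now = datetime(2016, 8, 28, 20, 31, 11, 84418, here)

# datetime: seconds since the Unix epoch for that instant.
datetime_seconds = now.timestamp()

# date: midnight of that day, taken as UTC.
day = now.date()
date_seconds = datetime(day.year, day.month, day.day,
                        tzinfo=timezone.utc).timestamp()

# time: seconds elapsed since midnight on the wall clock.
t = now.time()
time_seconds = timedelta(hours=t.hour, minutes=t.minute, seconds=t.second,
                         microseconds=t.microsecond).total_seconds()

# DM_ONLY_SECONDS corresponds to dropping the fractional part.
whole_seconds = [int(datetime_seconds), int(date_seconds), int(time_seconds)]

# A value before the epoch, known to be in UTC, gives a negative number.
a_long_time_ago = datetime(1968, 3, 18, 9, 10, 0, 0, tzinfo=timezone.utc)
negative_seconds = a_long_time_ago.timestamp()
```

The time value is the only one that ignores the offset entirely: 73871 seconds is 20:31:11 on the local wall clock, not 18:31:11 UTC.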
uuid_mode
Likewise, to handle UUID instances there are two modes that can be specified with the uuid_mode argument, both of which use a string representation of their values:
>>> from uuid import uuid4
>>> random_uuid = uuid4()
>>> dumps(random_uuid)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: UUID(…) is not JSON serializable
>>> dumps(random_uuid, uuid_mode=UM_CANONICAL)
'"be576345-65b5-4fc2-92c5-94e2f82e38fd"'
>>> dumps(random_uuid, uuid_mode=UM_HEX)
'"be57634565b54fc292c594e2f82e38fd"'
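The two spellings match the ones the uuid module itself offers. The sketch below uses a fixed UUID (the value from the sample output above, since uuid4() is random) and only the standard library, to show that both forms identify the same value.

```python
from uuid import UUID

# Fixed value taken from the sample output, for reproducibility.
fixed_uuid = UUID('be576345-65b5-4fc2-92c5-94e2f82e38fd')

canonical = str(fixed_uuid)  # the text shown for UM_CANONICAL
compact = fixed_uuid.hex     # the text shown for UM_HEX

# Both spellings parse back to the same UUID.
round_trip_ok = UUID(canonical) == UUID(compact) == fixed_uuid
```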
bytes_mode
By default all bytes instances are assumed to be UTF-8 encoded strings, and acted on accordingly:
>>> ascii_string = 'ciao'
>>> bytes_string = b'cio\xc3\xa8'
>>> unicode_string = 'cioè'
>>> dumps([ascii_string, bytes_string, unicode_string])
'["ciao","cio\\u00E8","cio\\u00E8"]'
Sometimes you may prefer a different approach, explicitly disabling that behavior using the BM_NONE mode:
>>> dumps([ascii_string, bytes_string, unicode_string],
...       bytes_mode=BM_NONE)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: b'cio\xc3\xa8' is not JSON serializable
>>> my_bytes_handler = lambda b: b.decode('UTF-8').upper()
>>> dumps([ascii_string, bytes_string, unicode_string],
...       bytes_mode=BM_NONE, default=my_bytes_handler)
'["ciao","CIO\\u00C8","cio\\u00E8"]'
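One reason to disable the UTF-8 assumption is that the bytes may be arbitrary binary data rather than text. A possible handler for that case is sketched below; bytes_as_base64 is a hypothetical name chosen for this example, and Base64 is just one reasonable encoding, not something python-rapidjson prescribes.

```python
import base64

def bytes_as_base64(obj):
    """Hypothetical default callable: encode binary data as Base64 text."""
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(obj).decode('ascii')
    # Anything else is still unsupported.
    raise TypeError('%r is not JSON serializable' % (obj,))

# These two bytes are not valid UTF-8, yet they encode cleanly.
encoded = bytes_as_base64(b'\x00\xff')
decoded = base64.b64decode(encoded)
```

Following the pattern of my_bytes_handler above, it would be passed as dumps(value, bytes_mode=BM_NONE, default=bytes_as_base64); the consumer then has to know which strings to decode back.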
iterable_mode
By default a value that implements the iterable protocol gets encoded as a JSON array:
>>> from time import localtime, struct_time
>>> lt = localtime()
>>> dumps(lt)
'[2020,11,28,19,55,40,5,333,0]'
>>> class MyList(list):
...     pass
>>> ml = MyList((1,2,3))
>>> dumps(ml)
'[1,2,3]'
When that’s not appropriate, for example because you want to use a different way to encode them, you may set iterable_mode to IM_ONLY_LISTS:
>>> dumps(lt, iterable_mode=IM_ONLY_LISTS)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: <time.struct_time …> is not JSON serializable
>>> dumps(ml, iterable_mode=IM_ONLY_LISTS)
Traceback (most recent call last):
  ...
TypeError: [1, 2, 3] is not JSON serializable
and thus you can use the default argument:
>>> def ts_or_ml(obj):
...     if isinstance(obj, struct_time):
...         return {'__class__': 'time.struct_time', '__init__': list(obj)}
...     elif isinstance(obj, MyList):
...         return [i*2 for i in obj]
...     else:
...         raise ValueError('%r is not JSON serializable' % obj)
>>> dumps(lt, iterable_mode=IM_ONLY_LISTS, default=ts_or_ml)
'{"__class__":"time.struct_time","__init__":[2020,11,28,19,55,40,5,333,0]}'
>>> dumps(ml, iterable_mode=IM_ONLY_LISTS, default=ts_or_ml)
'[2,4,6]'
Obviously, in such a case the value returned by the default callable must not be or contain a tuple:
>>> def bad_timestruct(obj):
...     if isinstance(obj, struct_time):
...         return {'__class__': 'time.struct_time', '__init__': tuple(obj)}
...     else:
...         raise ValueError('%r is not JSON serializable' % (obj,))
>>> dumps(lt, iterable_mode=IM_ONLY_LISTS, default=bad_timestruct)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: (…) is not JSON serializable
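When the value built by a default callable comes from code that may hand back tuples, one way to stay within the constraint is to normalise it before returning. The listify helper below is a hypothetical sketch written for this page, not something provided by python-rapidjson.

```python
def listify(value):
    """Recursively replace tuples (and list subclasses) with plain lists.

    Hypothetical helper for use inside a default callable.
    """
    if isinstance(value, (tuple, list)):
        return [listify(item) for item in value]
    if isinstance(value, dict):
        return {key: listify(item) for key, item in value.items()}
    return value

# A payload like the one returned by bad_timestruct, with nested tuples.
payload = {'__class__': 'time.struct_time',
           '__init__': (2020, 11, 28, (19, 55, 40))}
safe_payload = listify(payload)
```

Returning listify(payload) instead of payload from the default callable would give a structure made only of plain lists and dicts.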
mapping_mode
By default a value that implements the mapping protocol gets encoded as a JSON object:
>>> from collections import Counter
>>> d = {"a":1,"b":2,"c":3}
>>> c = Counter(d)
>>> dumps([c, d])
'[{"a":1,"b":2,"c":3},{"a":1,"b":2,"c":3}]'
When that’s not appropriate, for example because you want to use a different way to encode them, you may set mapping_mode to MM_ONLY_DICTS:
>>> dumps(d, mapping_mode=MM_ONLY_DICTS)
'{"a":1,"b":2,"c":3}'
>>> dumps(c, mapping_mode=MM_ONLY_DICTS)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Counter(…) is not JSON serializable
and thus you can use the default argument:
>>> def counter(obj):
...     if isinstance(obj, Counter):
...         return {'__class__': 'collections.Counter', '__init__': dict(obj)}
...     else:
...         raise ValueError('%r is not JSON serializable' % obj)
>>> dumps(c, mapping_mode=MM_ONLY_DICTS, default=counter)
'{"__class__":"collections.Counter","__init__":{"a":1,"b":2,"c":3}}'
Obviously, in such a case the value returned by the default callable must not be or contain mappings other than plain dicts:
>>> from collections import OrderedDict
>>> def bad_counter(obj):
...     if isinstance(obj, Counter):
...         return {'__class__': 'collections.Counter', '__init__': OrderedDict(obj)}
...     else:
...         raise ValueError('%r is not JSON serializable' % (obj,))
>>> dumps(c, mapping_mode=MM_ONLY_DICTS, default=bad_counter)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: OrderedDict([('a', 1), ('b', 2), ('c', 3)]) is not JSON serializable
Normally, dumping a dictionary containing non-string keys raises a TypeError exception:
>>> dumps({-1: 'minus-one'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keys must be strings
With mapping_mode set to MM_COERCE_KEYS_TO_STRINGS, such keys will be converted to their string representation:
>>> dumps({-1: 'minus-one', True: "good", False: "bad", None: "ugly"},
...       mapping_mode=MM_COERCE_KEYS_TO_STRINGS)
'{"-1":"minus-one","True":"good","False":"bad","None":"ugly"}'
Alternatively, by providing a default function you can have finer control over how they should be encoded. For example the following mimics the default behaviour of the standard library json module:
>>> def mimic_stdlib_json(obj):
...     if isinstance(obj, dict):
...         result = {}
...         for key in obj:
...             if isinstance(key, str):
...                 # string keys are already valid, keep them untouched
...                 result[key] = obj[key]
...             elif key is True:
...                 result['true'] = obj[key]
...             elif key is False:
...                 result['false'] = obj[key]
...             elif key is None:
...                 result['null'] = obj[key]
...             elif isinstance(key, (int, float)):
...                 result[str(key)] = obj[key]
...             else:
...                 raise TypeError('keys must be str, int, float, bool or None')
...         return result
...     else:
...         raise ValueError('%r is not JSON serializable' % (obj,))
>>> dumps({True: 'good', False: 'bad', None: 'ugly'},
...       default=mimic_stdlib_json)
'{"true":"good","false":"bad","null":"ugly"}'
Warning
This can lead to an infinite recursion error, if the default function returns a dictionary that still contains non-string keys:
>>> dumps({True: 'vero', False: 'falso'},
...       default=lambda map: map)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded
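For reference, the standard library behaviour that the mimic_stdlib_json function above imitates can be observed directly with json.dumps. The sketch uses only the standard library; the output differs from the python-rapidjson results in spacing, because json uses ", " and ": " as its default separators.

```python
import json

# bool and None keys become the JSON literals true, false and null...
stdlib_output = json.dumps({True: 'good', False: 'bad', None: 'ugly'})

# ...while numeric keys become their string representation.
numeric_key = json.dumps({-1: 'minus-one'})
```

This is also where the two coercion styles diverge: MM_COERCE_KEYS_TO_STRINGS yields "True", "False" and "None" as shown earlier, while the standard library yields "true", "false" and "null".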