Package org.python.modules
Class _codecs
java.lang.Object
org.python.modules._codecs
This class corresponds to the Python _codecs module, which in turn lends its functions to the
codecs module (in Lib/codecs.py). It exposes the implementing functions of several codec families
called out in the Python codecs library Lib/encodings/*.py, where it is usually claimed that they
are bound "as C functions". Here, C evidently stands for "compiled" rather than implying
dependence on a particular implementation language. The actual transcoding methods often come
from the related codecs class.
Nested Class Summary
Nested Classes
Modifier and Type  Class
    Description
static class  EncodingMap
    Optimized charmap encoder mapping.
Constructor Summary
Constructors
_codecs()
Method Summary
Modifier and Type  Method
    Description
static PyTuple  ascii_decode(String str)
static PyTuple  ascii_decode(String str, String errors)
static PyTuple  ascii_encode(String str)
static PyTuple  ascii_encode(String str, String errors)
static PyObject  charmap_build(PyUnicode map)
static PyTuple  charmap_decode(String bytes)
    Equivalent to charmap_decode(bytes, null, null).
static PyTuple  charmap_decode(String bytes, String errors)
    Equivalent to charmap_decode(bytes, errors, null).
static PyTuple  charmap_decode(String bytes, String errors, PyObject mapping)
    Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
static PyTuple  charmap_decode(String bytes, String errors, PyObject mapping, boolean ignoreUnmapped)
    Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
static PyTuple  charmap_encode(String str)
    Equivalent to charmap_encode(str, null, null).
static PyTuple  charmap_encode(String str, String errors)
    Equivalent to charmap_encode(str, errors, null).
static PyTuple  charmap_encode(String str, String errors, PyObject mapping)
    Encoder based on an optional character mapping.
static PyObject  decode
    Decode bytes using the system default encoding (see codecs.getDefaultEncoding()).
static PyObject  decode
    Decode bytes using the codec registered for the encoding.
static PyObject  decode
    Decode bytes using the codec registered for the encoding.
static PyString  encode
    Encode unicode using the system default encoding (see codecs.getDefaultEncoding()).
static PyString  encode
    Encode unicode using the codec registered for the encoding.
static PyString  encode
    Encode unicode using the codec registered for the encoding.
static String  encode_UTF16(String str, String errors, int byteorder)
static PyTuple  escape_decode(String str)
static PyTuple  escape_decode(String str, String errors)
static PyTuple  escape_encode(String str)
static PyTuple  escape_encode(String str, String errors)
static PyTuple  latin_1_decode(String str)
static PyTuple  latin_1_decode(String str, String errors)
static PyTuple  latin_1_encode(String str)
static PyTuple  latin_1_encode(String str, String errors)
static PyTuple  lookup
static PyObject  lookup_error(PyString handlerName)
static PyTuple  raw_unicode_escape_decode
static PyTuple  raw_unicode_escape_decode(String str, String errors)
static PyTuple  raw_unicode_escape_encode
static PyTuple  raw_unicode_escape_encode(String str, String errors)
static void  register
static void  register_error(String name, PyObject errorHandler)
static PyObject  translateCharmap(PyUnicode str, String errors, PyObject mapping)
static PyTuple  unicode_escape_decode
static PyTuple  unicode_escape_decode(String str, String errors)
static PyTuple  unicode_escape_encode
static PyTuple  unicode_escape_encode(String str, String errors)
static PyTuple  unicode_internal_decode(String bytes)
    Deprecated.
static PyTuple  unicode_internal_decode(String bytes, String errors)
    Deprecated.
static PyTuple  unicode_internal_encode(String unicode)
    Deprecated.
static PyTuple  unicode_internal_encode(String unicode, String errors)
    Deprecated.
static PyTuple  utf_16_be_decode(String str)
static PyTuple  utf_16_be_decode(String str, String errors)
static PyTuple  utf_16_be_decode(String str, String errors, boolean final_)
static PyTuple  utf_16_be_encode(String str)
static PyTuple  utf_16_be_encode(String str, String errors)
static PyTuple  utf_16_decode(String str)
static PyTuple  utf_16_decode(String str, String errors)
static PyTuple  utf_16_decode(String str, String errors, boolean final_)
static PyTuple  utf_16_encode(String str)
static PyTuple  utf_16_encode(String str, String errors)
static PyTuple  utf_16_encode(String str, String errors, int byteorder)
static PyTuple  utf_16_ex_decode(String str)
static PyTuple  utf_16_ex_decode(String str, String errors)
static PyTuple  utf_16_ex_decode(String str, String errors, int byteorder)
static PyTuple  utf_16_ex_decode(String str, String errors, int byteorder, boolean final_)
static PyTuple  utf_16_le_decode(String str)
static PyTuple  utf_16_le_decode(String str, String errors)
static PyTuple  utf_16_le_decode(String str, String errors, boolean final_)
static PyTuple  utf_16_le_encode(String str)
static PyTuple  utf_16_le_encode(String str, String errors)
static PyTuple  utf_32_be_decode(String bytes)
    Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_be_decode(String bytes, String errors)
    Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_be_decode(String bytes, String errors, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_be_encode(String unicode)
    Encode a Unicode Java String as UTF-32 with big-endian byte order.
static PyTuple  utf_32_be_encode(String unicode, String errors)
    Encode a Unicode Java String as UTF-32 with big-endian byte order.
static PyTuple  utf_32_decode(String bytes)
    Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_decode(String bytes, String errors)
    Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_decode(String bytes, String errors, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_encode(String unicode)
    Encode a Unicode Java String as UTF-32 with byte order mark.
static PyTuple  utf_32_encode(String unicode, String errors)
    Encode a Unicode Java String as UTF-32 with byte order mark.
static PyTuple  utf_32_encode(String unicode, String errors, int byteorder)
    Encode a Unicode Java String as UTF-32 in specified byte order with byte order mark.
static PyTuple  utf_32_ex_decode(String bytes, String errors, int byteorder)
    Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention).
static PyTuple  utf_32_ex_decode(String bytes, String errors, int byteorder, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention).
static PyTuple  utf_32_le_decode(String bytes)
    Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_le_decode(String bytes, String errors)
    Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_le_decode(String bytes, String errors, boolean isFinal)
    Decode (perhaps partially) a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed.
static PyTuple  utf_32_le_encode(String unicode)
    Encode a Unicode Java String as UTF-32 with little-endian byte order.
static PyTuple  utf_32_le_encode(String unicode, String errors)
    Encode a Unicode Java String as UTF-32 with little-endian byte order.
static PyTuple  utf_7_decode(String bytes)
static PyTuple  utf_7_decode(String bytes, String errors)
static PyTuple  utf_7_decode(String bytes, String errors, boolean finalFlag)
static PyTuple  utf_7_encode(String str)
static PyTuple  utf_7_encode(String str, String errors)
static PyTuple  utf_8_decode(String str)
static PyTuple  utf_8_decode(String str, String errors)
static PyTuple  utf_8_decode(String str, String errors, boolean final_)
static PyTuple  utf_8_decode(String str, String errors, PyObject final_)
static PyTuple  utf_8_encode(String str)
static PyTuple  utf_8_encode(String str, String errors)
-
Constructor Details
-
_codecs
public _codecs()
-
-
Method Details
-
register
-
lookup
-
lookup_error
-
register_error
-
decode
Decode bytes using the system default encoding (see codecs.getDefaultEncoding()). Decoding errors raise a ValueError.
Parameters:
bytes - to be decoded
Returns:
Unicode string decoded from bytes
-
decode
Decode bytes using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). Decoding errors raise a ValueError.
Parameters:
bytes - to be decoded
encoding - name of encoding (to look up in codec registry)
Returns:
Unicode string decoded from bytes
-
decode
Decode bytes using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that decoding errors raise a ValueError.
Parameters:
bytes - to be decoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore")
Returns:
Unicode string decoded from bytes
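As a rough illustration of what an error policy changes, the sketch below uses only the JDK's java.nio.charset API rather than this class. The names ErrorPolicySketch and decodeAscii are hypothetical, only the "strict", "ignore" and "replace" policies are modelled, and IllegalArgumentException merely stands in for the ValueError described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: illustrates error policies with the JDK, not with _codecs itself.
final class ErrorPolicySketch {

    /** Decode ASCII bytes under the named policy ("strict", "ignore" or "replace"). */
    static String decodeAscii(byte[] input, String errors) {
        CodingErrorAction action;
        if (errors == null || errors.equals("strict")) {
            action = CodingErrorAction.REPORT;   // fail on the first bad byte
        } else if (errors.equals("ignore")) {
            action = CodingErrorAction.IGNORE;   // silently drop bad bytes
        } else if (errors.equals("replace")) {
            action = CodingErrorAction.REPLACE;  // substitute U+FFFD
        } else {
            throw new IllegalArgumentException("unknown error policy: " + errors);
        }
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(input)).toString();
        } catch (CharacterCodingException e) {
            // Stand-in for the ValueError raised under the strict policy.
            throw new IllegalArgumentException("decoding error", e);
        }
    }
}
```

With the input bytes 'a', 0xFF, 'b', the "ignore" policy drops the offending byte, "replace" substitutes U+FFFD, and "strict" (or null) fails.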
-
encode
Encode unicode using the system default encoding (see codecs.getDefaultEncoding()). Encoding errors raise a ValueError.
Parameters:
unicode - string to be encoded
Returns:
bytes object encoding unicode
-
encode
Encode unicode using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). Encoding errors raise a ValueError.
Parameters:
unicode - string to be encoded
encoding - name of encoding (to look up in codec registry)
Returns:
bytes object encoding unicode
-
encode
Encode unicode using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that encoding errors raise a ValueError.
Parameters:
unicode - string to be encoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore")
Returns:
bytes object encoding unicode
-
charmap_build
-
utf_8_decode
-
utf_8_decode
-
utf_8_decode
-
utf_8_decode
-
utf_8_encode
-
utf_8_encode
-
utf_7_decode
-
utf_7_decode
-
utf_7_decode
-
utf_7_encode
-
utf_7_encode
-
escape_decode
-
escape_decode
-
escape_encode
-
escape_encode
-
charmap_decode
Equivalent to charmap_decode(bytes, null, null). This method is here so the error and mapping arguments can be optional at the Python level.
Parameters:
bytes - sequence of bytes to decode
Returns:
decoded string and number of bytes consumed
-
charmap_decode
Equivalent to charmap_decode(bytes, errors, null). This method is here so the mapping argument can be optional at the Python level.
Parameters:
bytes - sequence of bytes to decode
errors - error policy
Returns:
decoded string and number of bytes consumed
-
charmap_decode
Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers). If the mapping is null or None, decode with latin-1 (essentially treating bytes as character codes directly).
Parameters:
bytes - sequence of bytes to decode
errors - error policy
mapping - to convert bytes to characters
Returns:
decoded string and number of bytes consumed
-
charmap_decode
public static PyTuple charmap_decode(String bytes, String errors, PyObject mapping, boolean ignoreUnmapped)
Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
Parameters:
bytes - sequence of bytes to decode
errors - error policy
mapping - to convert bytes to characters
ignoreUnmapped - if true, pass unmapped byte values as character codes [0..256)
Returns:
decoded string and number of bytes consumed
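The table-driven decoding described for charmap_decode can be pictured with a minimal, JDK-only sketch. It assumes the mapping is a plain Java String indexed by byte value (the real method accepts any PyObject container), it applies no error policy, and CharmapDecodeSketch is a hypothetical name.

```java
// Hypothetical sketch of table-driven decoding; not the Jython implementation.
final class CharmapDecodeSketch {

    /**
     * @param bytes          byte values held one per char (0..255)
     * @param table          table.charAt(b) is the character for byte b, or null for latin-1
     * @param ignoreUnmapped if true, bytes beyond the table pass through as character codes
     */
    static String decode(String bytes, String table, boolean ignoreUnmapped) {
        StringBuilder out = new StringBuilder(bytes.length());
        for (int i = 0; i < bytes.length(); i++) {
            int b = bytes.charAt(i);
            if (b > 0xff) {
                throw new IllegalArgumentException("not a byte value at index " + i);
            }
            if (table == null) {
                out.append((char) b);          // latin-1: byte value is the character code
            } else if (b < table.length()) {
                out.append(table.charAt(b));   // index the mapping by the unsigned byte value
            } else if (ignoreUnmapped) {
                out.append((char) b);          // unmapped byte passed through unchanged
            } else {
                throw new IllegalArgumentException("byte " + b + " has no mapping");
            }
        }
        return out.toString();
    }
}
```

In this sketch the number of bytes consumed is simply bytes.length(), since decoding either completes or throws.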
-
translateCharmap
-
charmap_encode
Equivalent to charmap_encode(str, null, null). This method is here so the error and mapping arguments can be optional at the Python level.
Parameters:
str - to be encoded
Returns:
(encoded data, size(str)) as a pair
-
charmap_encode
Equivalent to charmap_encode(str, errors, null). This method is here so the mapping can be optional at the Python level.
Parameters:
str - to be encoded
errors - error policy name (e.g. "ignore")
Returns:
(encoded data, size(str)) as a pair
-
charmap_encode
Encoder based on an optional character mapping. This mapping is either an EncodingMap of 256 entries, or an arbitrary container indexable with integers using __finditem__ and yielding byte strings. If the mapping is null, latin-1 (effectively a mapping of character code to the numerically-equal byte) is used.
Parameters:
str - to be encoded
errors - error policy name (e.g. "ignore")
mapping - from character code to output byte (or string)
Returns:
(encoded data, size(str)) as a pair
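The encoding direction can be sketched in the same spirit. This version assumes the mapping is a java.util.Map from character code to a single byte value, which is only one of the forms described above. It returns the encoded bytes alone rather than a pair, applies no error policy, and CharmapEncodeSketch is a hypothetical name.

```java
import java.util.Map;

// Hypothetical sketch of mapping-based encoding; not the Jython implementation.
final class CharmapEncodeSketch {

    /** Encode str to bytes (one per char, 0..255); a null mapping means latin-1. */
    static String encode(String str, Map<Integer, Integer> mapping) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); ) {
            int c = str.codePointAt(i);
            Integer b;
            if (mapping == null) {
                b = (c < 256) ? Integer.valueOf(c) : null;  // latin-1: code equals byte
            } else {
                b = mapping.get(c);                          // look up by character code
            }
            if (b == null) {
                throw new IllegalArgumentException("character U+"
                        + Integer.toHexString(c) + " has no mapping");
            }
            out.append((char) (b & 0xff));
            i += Character.charCount(c);
        }
        return out.toString();
    }
}
```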
-
ascii_decode
-
ascii_decode
-
ascii_encode
-
ascii_encode
-
latin_1_decode
-
latin_1_decode
-
latin_1_encode
-
latin_1_encode
-
utf_16_encode
-
utf_16_encode
-
utf_16_encode
-
utf_16_le_encode
-
utf_16_le_encode
-
utf_16_be_encode
-
utf_16_be_encode
-
encode_UTF16
-
utf_16_decode
-
utf_16_decode
-
utf_16_decode
-
utf_16_le_decode
-
utf_16_le_decode
-
utf_16_le_decode
-
utf_16_be_decode
-
utf_16_be_decode
-
utf_16_be_decode
-
utf_16_ex_decode
-
utf_16_ex_decode
-
utf_16_ex_decode
-
utf_16_ex_decode
-
utf_32_encode
Encode a Unicode Java String as UTF-32 with byte order mark. (Encoding is in platform byte order, which is big-endian for Java.)
Parameters:
unicode - to be encoded
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_encode
Encode a Unicode Java String as UTF-32 with byte order mark. (Encoding is in platform byte order, which is big-endian for Java.)
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_encode
Encode a Unicode Java String as UTF-32 in specified byte order with byte order mark.
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
byteorder - encoding "endianness" specified (in the Python -1, 0, +1 convention)
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_le_encode
Encode a Unicode Java String as UTF-32 with little-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_le_encode
Encode a Unicode Java String as UTF-32 with little-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_be_encode
Encode a Unicode Java String as UTF-32 with big-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
Returns:
tuple (encoded_bytes, unicode_consumed)
-
utf_32_be_encode
Encode a Unicode Java String as UTF-32 with big-endian byte order. No byte-order mark is generated.
Parameters:
unicode - to be encoded
errors - error policy name or null meaning "strict"
Returns:
tuple (encoded_bytes, unicode_consumed)
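To make the difference between the UTF-32 encoders concrete, here is a minimal JDK-only sketch that writes each code point as four bytes, optionally preceded by a byte-order mark. Utf32EncodeSketch and its two boolean flags are hypothetical stand-ins for the byte-order choices described above, and the sketch returns a byte[] rather than a tuple.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of UTF-32 encoding; not the Jython implementation.
final class Utf32EncodeSketch {

    /** Encode the code points of a UTF-16 Java String as UTF-32. */
    static byte[] encode(String unicode, boolean bigEndian, boolean withBom) {
        int count = unicode.codePointCount(0, unicode.length());
        ByteBuffer buf = ByteBuffer.allocate(4 * (count + (withBom ? 1 : 0)));
        buf.order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        if (withBom) {
            buf.putInt(0xFEFF);                 // byte-order mark, in the chosen order
        }
        for (int i = 0; i < unicode.length(); ) {
            int cp = unicode.codePointAt(i);    // joins a surrogate pair into one code point
            buf.putInt(cp);
            i += Character.charCount(cp);
        }
        return buf.array();
    }
}
```

Under these assumptions, encode("A", true, true) corresponds to the behaviour described for utf_32_encode(unicode), while encode("A", false, false) and encode("A", true, false) correspond to utf_32_le_encode and utf_32_be_encode respectively.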
-
utf_32_decode
Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
Returns:
tuple (unicode_result, bytes_consumed)
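The "Jython PyString convention" for the bytes argument is read here as meaning that each byte is carried as one char in the range 0..255 of a Java String. Assuming that reading is right, a conversion to and from byte[] could look like the sketch below. ByteStringSketch and its methods are hypothetical helpers built on the JDK's ISO-8859-1 charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helpers for the assumed "one byte per char" String convention.
final class ByteStringSketch {

    /** Wrap raw bytes in a String whose chars are the unsigned byte values. */
    static String toByteString(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    /** Recover the raw bytes, rejecting any char that is not a byte value. */
    static byte[] toByteArray(String bytes) {
        for (int i = 0; i < bytes.length(); i++) {
            if (bytes.charAt(i) > 0xff) {
                throw new IllegalArgumentException("not a byte value at index " + i);
            }
        }
        return bytes.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```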
-
utf_32_decode
Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed)
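The point of the isFinal flag and the "amount of input consumed" is incremental decoding: a caller may hand over data in arbitrary chunks and must carry any unconsumed tail into the next call. The sketch below shows that pattern for big-endian UTF-32 only, with no byte-order mark handling. IncrementalUtf32Sketch and feed are hypothetical names, not part of this class.

```java
import java.util.Arrays;

// Hypothetical incremental decoder for big-endian UTF-32; not the Jython implementation.
final class IncrementalUtf32Sketch {

    private byte[] pending = new byte[0];   // bytes left over from the previous call

    /** Decode as many whole 4-byte units as possible; keep the rest for the next call. */
    String feed(byte[] chunk, boolean isFinal) {
        byte[] data = new byte[pending.length + chunk.length];
        System.arraycopy(pending, 0, data, 0, pending.length);
        System.arraycopy(chunk, 0, data, pending.length, chunk.length);

        int consumed = data.length - data.length % 4;
        if (isFinal && consumed != data.length) {
            throw new IllegalArgumentException("truncated data");
        }
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < consumed; i += 4) {
            int cp = ((data[i] & 0xff) << 24) | ((data[i + 1] & 0xff) << 16)
                    | ((data[i + 2] & 0xff) << 8) | (data[i + 3] & 0xff);
            text.appendCodePoint(cp);       // throws if cp is not a valid code point
        }
        pending = Arrays.copyOfRange(data, consumed, data.length);
        return text.toString();
    }
}
```

Feeding the eight bytes of "AB" in chunks of three, three and two bytes yields "", "A" and "B" in turn, with the last call marked final.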
-
utf_32_le_decode
Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_le_decode
Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_le_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_be_decode
Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_be_decode
Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_be_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed)
-
utf_32_ex_decode
Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention). The endianness, if unspecified (=0), will be deduced from a byte-order mark and returned. (This codec entrypoint is used in that way in the utf_32.py codec, but only until the byte order is known.) When not defined by a BOM, processing assumes big-endian coding (Java platform default), but returns "unspecified". (The utf_32.py codec treats this as an error, once more than 4 bytes have been processed.) The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
byteorder - decoding "endianness" specified (in the Python -1, 0, +1 convention)
Returns:
tuple (unicode_result, bytes_consumed, endianness)
-
utf_32_ex_decode
Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention). The endianness will be that specified, will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). Or it may still be undefined if fewer than 4 bytes are presented. (This codec entrypoint is used in the utf-32 codec only until the byte order is known.) The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
Parameters:
bytes - to be decoded (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
byteorder - decoding "endianness" specified (in the Python -1, 0, +1 convention)
isFinal - if a "final" call, meaning the input must all be consumed
Returns:
tuple (unicode_result, bytes_consumed, endianness)
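The byte-order deduction described for utf_32_ex_decode can be sketched with the JDK alone. The version below follows the description above (BOM detection only when byteorder is 0, big-endian assumed otherwise, "unspecified" returned when no BOM is seen) but it is an illustration, not the Jython code. Utf32ExDecodeSketch and its Result holder are hypothetical, the input is a byte[] rather than a String, and no error policy is applied.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of UTF-32 decoding with byte-order detection.
final class Utf32ExDecodeSketch {

    /** Stand-in for the (unicode_result, bytes_consumed, endianness) tuple. */
    static final class Result {
        final String text;
        final int consumed;
        final int byteorder;   // -1 little-endian, 0 unspecified, +1 big-endian

        Result(String text, int consumed, int byteorder) {
            this.text = text;
            this.consumed = consumed;
            this.byteorder = byteorder;
        }
    }

    static Result decode(byte[] bytes, int byteorder, boolean isFinal) {
        int pos = 0;
        int order = byteorder;
        if (order == 0 && bytes.length >= 4) {
            // Read the first unit big-endian and look for a BOM in either orientation.
            int first = readInt(bytes, 0, ByteOrder.BIG_ENDIAN);
            if (first == 0x0000FEFF) {
                order = 1;
                pos = 4;
            } else if (first == 0xFFFE0000) {
                order = -1;
                pos = 4;
            }
        }
        // With no BOM and nothing specified, assume big-endian but still report 0.
        ByteOrder actual = (order < 0) ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        StringBuilder text = new StringBuilder();
        while (bytes.length - pos >= 4) {
            text.appendCodePoint(readInt(bytes, pos, actual));  // throws on invalid code point
            pos += 4;
        }
        if (isFinal && pos < bytes.length) {
            throw new IllegalArgumentException("truncated data");
        }
        return new Result(text.toString(), pos, order);
    }

    private static int readInt(byte[] bytes, int offset, ByteOrder order) {
        return ByteBuffer.wrap(bytes, offset, 4).order(order).getInt();
    }
}
```

In this sketch, passing byteorder = -1 with input that begins FF FE 00 00 leaves U+FEFF as the first decoded character, which mirrors the "zero-width non-breaking space" behaviour described for utf_32_le_decode.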
-
raw_unicode_escape_encode
-
raw_unicode_escape_encode
-
raw_unicode_escape_decode
-
raw_unicode_escape_decode
-
unicode_escape_encode
-
unicode_escape_encode
-
unicode_escape_decode
-
unicode_escape_decode
-
unicode_internal_encode
Deprecated. Legacy method to encode given unicode in CPython wide-build internal format (equivalent UTF-32BE).
unicode_internal_encode
Deprecated. Legacy method to encode given unicode in CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
unicode_internal_decode
Deprecated. Legacy method to decode given bytes as if CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
unicode_internal_decode
Deprecated. Legacy method to decode given bytes as if CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
-