Commit 9b122ad

Mention Punycode normalization, re-do fields as a table
1 parent 0954661 commit 9b122ad

1 file changed: +19 -54 lines changed

README.md

Lines changed: 19 additions & 54 deletions
@@ -230,13 +230,12 @@ address (domain names are case-insensitive), [Unicode "NFC"
 normalization](https://en.wikipedia.org/wiki/Unicode_equivalence) of the
 whole address (which turns characters plus [combining
 characters](https://en.wikipedia.org/wiki/Combining_character) into
-precomposed characters where possible and replaces certain Unicode
-characters (such as angstrom and ohm) with other equivalent code points
-(a-with-ring and omega, respectively)), replacement of [fullwidth and
+precomposed characters where possible, replacement of [fullwidth and
 halfwidth
 characters](https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms)
-in the domain part, and possibly other
-[UTS46](http://unicode.org/reports/tr46) mappings on the domain part.
+in the domain part, possibly other
+[UTS46](http://unicode.org/reports/tr46) mappings on the domain part,
+and conversion from Punycode to Unicode characters.

 (See [RFC 6532 (internationalized email) section
 3.1](https://tools.ietf.org/html/rfc6532#section-3.1) and [RFC 5895
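
The normalization steps named in this hunk (NFC on the local part, lowercasing and fullwidth/halfwidth replacement on the domain, and Punycode-to-Unicode conversion) can be roughly illustrated with the Python standard library alone. This is a minimal sketch, not the library's implementation: `sketch_normalize` and `decode_label` are hypothetical names, and NFKC is used here only as a stand-in for the UTS46 mappings, which cover more cases.

```python
import unicodedata


def decode_label(label: str) -> str:
    """Turn one Punycode ("xn--") domain label back into Unicode.

    Assumption: the label is well-formed; no IDNA validity checks are done.
    """
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def sketch_normalize(address: str) -> str:
    """Approximate the normalization steps described in the README text."""
    # Split on the last @-sign: local part before it, domain after it.
    local_part, _, domain = address.rpartition("@")
    # NFC: combine base characters plus combining characters where possible.
    local_part = unicodedata.normalize("NFC", local_part)
    # NFKC folds fullwidth/halfwidth forms (stand-in for UTS46 mappings);
    # domain names are case-insensitive, so lowercase as well.
    domain = unicodedata.normalize("NFKC", domain).lower()
    # Convert any Punycode labels to their Unicode form.
    domain = ".".join(decode_label(label) for label in domain.split("."))
    return f"{local_part}@{domain}"
```

Under these assumptions, `sketch_normalize("example@xn--bdk.life")` gives `example@ツ.life`, which matches the behaviour the commit describes for the `email` field.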
@@ -283,6 +282,10 @@ converted to IDNA ASCII Punycode). Also note that the `email` and `domain`
 fields provide a normalized form of the email address and domain name
 (casefolding and Unicode normalization as required by IDNA 2008).

+Calling `validate_email` with the ASCII form of the above email address,
+`example@xn--bdk.life`, returns the exact same information (i.e., the
+`email` field always will contain Unicode characters, not Punycode).
+
 For the fictitious address `ツ-test@joshdata.me`, which has an
 internationalized local part, the returned object is:

@@ -309,55 +312,17 @@ Return value
 When an email address passes validation, the fields in the returned object
 are:

-`email`: The canonical form of the email address, mostly useful for
-display purposes. This merely combines the `local_part` and `domain`
-fields (see below).
-
-`ascii_email`: If set, an ASCII-only form of the email address by replacing the
-domain part with [IDNA](https://tools.ietf.org/html/rfc5891)
-[Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt).
-This field will be present when an ASCII-only form of the email
-address exists (including if the email address is already ASCII). If
-the local part of the email address contains internationalized
-characters, `ascii_email` will be `None`. If set, it merely combines
-`ascii_local_part` and `ascii_domain`.
-
-`local_part`: The local part of the given email address (before the @-sign) with
-Unicode NFC normalization applied.
-
-`ascii_local_part`: If set, the local part, which is composed of ASCII characters only.
-
-`domain`: The canonical internationalized Unicode form of the domain part of the
-email address. If the returned string contains non-ASCII characters, either the
-[SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your
-mail relay will be required to transmit the message or else the
-email address's domain part must be converted to IDNA ASCII first: Use
-`ascii_domain` field instead.
-
-`ascii_domain`: The [IDNA](https://tools.ietf.org/html/rfc5891)
-[Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)-encoded
-form of the domain part of the given email address, as
-it would be transmitted on the wire.
-
-`smtputf8`: A boolean indicating that the
-[SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your
-mail relay will be required to transmit messages to this address
-because the local part of the address has non-ASCII characters (the
-local part cannot be IDNA-encoded). If `allow_smtputf8=False` is
-passed as an argument, this flag will always be false because an
-exception is raised if it would have been true.
-
-`mx`: A list of (priority, domain) tuples of MX records specified in the
-DNS for the domain (see [RFC 5321 section
-5](https://tools.ietf.org/html/rfc5321#section-5)). May be `None` if
-the deliverability check could not be completed because of a temporary
-issue like a timeout.
-
-`mx_fallback_type`: `None` if an `MX` record is found. If no MX records are actually
-specified in DNS and instead are inferred, through an obsolete
-mechanism, from A or AAAA records, the value is the type of DNS
-record used instead (`A` or `AAAA`). May be `None` if the deliverability check
-could not be completed because of a temporary issue like a timeout.
+| Field | Value |
+| -----:|-------|
+| `email` | The normalized form of the email address that you should put in your database. This merely combines the `local_part` and `domain` fields (see below). |
+| `ascii_email` | If set, an ASCII-only form of the email address by replacing the domain part with [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt). This field will be present when an ASCII-only form of the email address exists (including if the email address is already ASCII). If the local part of the email address contains internationalized characters, `ascii_email` will be `None`. If set, it merely combines `ascii_local_part` and `ascii_domain`. |
+| `local_part` | The local part of the given email address (before the @-sign) with Unicode NFC normalization applied. |
+| `ascii_local_part` | If set, the local part, which is composed of ASCII characters only. |
+| `domain` | The canonical internationalized Unicode form of the domain part of the email address. If the returned string contains non-ASCII characters, either the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit the message or else the email address's domain part must be converted to IDNA ASCII first: Use `ascii_domain` field instead. |
+| `ascii_domain` | The [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)-encoded form of the domain part of the given email address, as it would be transmitted on the wire. |
+| `smtputf8` | A boolean indicating that the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit messages to this address because the local part of the address has non-ASCII characters (the local part cannot be IDNA-encoded). If `allow_smtputf8=False` is passed as an argument, this flag will always be false because an exception is raised if it would have been true. |
+| `mx` | A list of (priority, domain) tuples of MX records specified in the DNS for the domain (see [RFC 5321 section 5](https://tools.ietf.org/html/rfc5321#section-5)). May be `None` if the deliverability check could not be completed because of a temporary issue like a timeout. |
+| `mx_fallback_type` | `None` if an `MX` record is found. If no MX records are actually specified in DNS and instead are inferred, through an obsolete mechanism, from A or AAAA records, the value is the type of DNS record used instead (`A` or `AAAA`). May be `None` if the deliverability check could not be completed because of a temporary issue like a timeout. |

 Assumptions
 -----------
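
The relationships between the fields in the new table (which values are combined, and when the ASCII forms are `None`) may be easier to follow as code. The sketch below is an illustration only: `FieldsSketch` and `derive_fields` are hypothetical names that do not exist in the library, the `mx` fields are left out because they depend on DNS, and the standard-library `idna` codec (IDNA 2003) is assumed as an approximation of the IDNA encoding the README refers to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldsSketch:
    """Hypothetical container mirroring the non-DNS fields in the table."""
    email: str
    ascii_email: Optional[str]
    local_part: str
    ascii_local_part: Optional[str]
    domain: str
    ascii_domain: str
    smtputf8: bool


def derive_fields(local_part: str, domain: str) -> FieldsSketch:
    """Derive the dependent fields from an already-normalized address."""
    # The domain can always be expressed in ASCII via IDNA Punycode.
    ascii_domain = domain.encode("idna").decode("ascii")
    # A non-ASCII local part cannot be IDNA-encoded, so SMTPUTF8 is needed.
    smtputf8 = not local_part.isascii()
    ascii_local_part = None if smtputf8 else local_part
    # An ASCII-only address exists only when the local part is ASCII.
    ascii_email = None if smtputf8 else f"{ascii_local_part}@{ascii_domain}"
    return FieldsSketch(
        email=f"{local_part}@{domain}",
        ascii_email=ascii_email,
        local_part=local_part,
        ascii_local_part=ascii_local_part,
        domain=domain,
        ascii_domain=ascii_domain,
        smtputf8=smtputf8,
    )
```

With these assumptions, `derive_fields("example", "ツ.life")` yields an `ascii_email` of `example@xn--bdk.life`, while `derive_fields("ツ-test", "joshdata.me")` sets `smtputf8` to true and leaves `ascii_email` as `None`.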