Commit 5cf49cf

Move README section on unsafe Unicode to a later section since it applies to both the local part and the domain part
1 parent a9a8a62 commit 5cf49cf

1 file changed: +19 -36 lines changed


README.md

Lines changed: 19 additions & 36 deletions
@@ -184,8 +184,12 @@ Internationalized email addresses
 The email protocol SMTP and the domain name system DNS have historically
 only allowed English (ASCII) characters in email addresses and domain names,
 respectively. Each has adapted to internationalization in a separate
-way, creating two separate aspects to email address
-internationalization.
+way, creating two separate aspects to email address internationalization.
+
+(If your mail submission library doesn't support Unicode at all, then
+immediately prior to mail submission you must replace the email address with
+its ASCII-ized form. This library gives you back the ASCII-ized form in the
+`ascii_email` field in the returned object.)

 ### Internationalized domain names (IDN)

@@ -208,6 +212,19 @@ email addresses, only English letters, numbers, and some punctuation
 (`._!#$%&'^``*+-=~/?{|}`) are allowed. In internationalized email address
 local parts, a wider range of Unicode characters are allowed.

+Email addresses with these non-ASCII characters require that your mail
+submission library and all the mail servers along the route to the destination,
+including your own outbound mail server, all support the
+[SMTPUTF8 (RFC 6531)](https://tools.ietf.org/html/rfc6531) extension.
+Support for SMTPUTF8 varies. If you know ahead of time that SMTPUTF8 is not
+supported by your mail submission stack, then you must filter out addresses that
+require SMTPUTF8 using the `allow_smtputf8=False` keyword argument (see above).
+This will cause the validation function to raise a `EmailSyntaxError` if
+delivery would require SMTPUTF8. If you do not set `allow_smtputf8=False`,
+you can also check the value of the `smtputf8` field in the returned object.
+
+### Unsafe Unicode characters are rejected
+
 A surprisingly large number of Unicode characters are not safe to display,
 especially when the email address is concatenated with other text, so this
 library tries to protect you by not permitting reserved, non-, private use,
@@ -226,40 +243,6 @@ with the normalized email address string returned by this library. This does not
 guard against the well known problem that many Unicode characters look alike
 (or are identical), which can be used to fool humans reading displayed text.

-Email addresses with these non-ASCII characters require that your mail
-submission library and the mail servers along the route to the destination,
-including your own outbound mail server, all support the
-[SMTPUTF8 (RFC 6531)](https://tools.ietf.org/html/rfc6531) extension.
-Support for SMTPUTF8 varies. See the `allow_smtputf8` parameter.
-
-### If you know ahead of time that SMTPUTF8 is not supported by your mail submission stack
-
-By default all internationalized forms are accepted by the validator.
-But if you know ahead of time that SMTPUTF8 is not supported by your
-mail submission stack, then you must filter out addresses that require
-SMTPUTF8 using the `allow_smtputf8=False` keyword argument (see above).
-This will cause the validation function to raise a `EmailSyntaxError` if
-delivery would require SMTPUTF8. That's just in those cases where
-non-ASCII characters appear before the @-sign. If you do not set
-`allow_smtputf8=False`, you can also check the value of the `smtputf8`
-field in the returned object.
-
-If your mail submission library doesn't support Unicode at all --- even
-in the domain part of the address --- then immediately prior to mail
-submission you must replace the email address with its ASCII-ized form.
-This library gives you back the ASCII-ized form in the `ascii_email`
-field in the returned object, which you can get like this:
-
-```python
-emailinfo = validate_email(email, allow_smtputf8=False)
-email = emailinfo.ascii_email
-```
-
-The local part is left alone (if it has internationalized characters
-`allow_smtputf8=False` will force validation to fail) and the domain
-part is converted to [IDNA ASCII](https://tools.ietf.org/html/rfc5891).
-(You probably should not do this at account creation time so you don't
-change the user's login information without telling them.)

 Normalization
 -------------
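
Note on the removed example: the ASCII-ization the README describes (local part left alone, domain part converted to IDNA ASCII) can be sketched with the Python standard library alone. This is only an illustration, not the library's implementation. `to_ascii_email` is a hypothetical helper, and Python's built-in `idna` codec follows the older IDNA 2003 rules rather than RFC 5891, so it may disagree with the library for some domains. In real code, the `ascii_email` field of the object returned by `validate_email` is the value to use.

```python
def to_ascii_email(address: str) -> str:
    """Hypothetical helper: return an ASCII-only form of an email address.

    The local part is left untouched. If it contains non-ASCII characters,
    delivery would need SMTPUTF8, so this sketch refuses the address.
    The domain part is converted with the stdlib "idna" codec
    (assumption: IDNA 2003 behaviour is good enough for illustration).
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    if not local.isascii():
        raise ValueError("local part is not ASCII; delivery would require SMTPUTF8")
    # Encode each domain label to its "xn--" (Punycode) form where needed.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


if __name__ == "__main__":
    print(to_ascii_email("user@bücher.example"))
```

As the removed README text cautions, a conversion like this is best applied immediately before mail submission rather than at account creation, so the user's stored login information is not changed silently.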
