Parse quoted-string local parts but by default keep them disallowed with better exception messages
People have opened issues several times about quoted local parts being incorrectly rejected. By parsing them, we know when they occur and can give a better error message, which should head off questions about it.
* When splitting the address into a local part and a domain part, detect a possible quoted-string local part (one with quoted @-signs) rather than giving an error message about multiple @-signs.
* Remove the surrounding quotes and un-escape the string before checking the syntax of the local part. Return the un-quoted and un-escaped string as the normalized local_part in the returned ValidatedEmail object if it's valid as an unquoted local part.
* Check for invalid characters in the quoted-string (per the spec and our additional Unicode character checks) and raise exceptions.
* Add a new option to accept quoted-string local parts which is off by default. When accepting them, apply Unicode normalization as per dot-atom internationalized addresses and apply minimal backslash escaping.
* Update tests.
See #54, #92.
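
The parsing steps listed above might be sketched roughly as follows. This is a minimal, ASCII-only illustration under stated assumptions, not the library's actual implementation: `split_address` and `normalize_local` are hypothetical helper names, and the Unicode checks, invalid-character checks, and improved exception messages are omitted.

```python
import re

# Characters permitted in an unquoted (dot-atom) local part, ASCII only.
# Assumption: this mirrors RFC 5322 "atext"; the library's own rules may differ.
_ATEXT = "A-Za-z0-9" + re.escape("!#$%&'*+/=?^_`{|}~-")
_DOT_ATOM = re.compile("[" + _ATEXT + "]+(?:\\.[" + _ATEXT + "]+)*\\Z")


def split_address(addr):
    """Split at the first @-sign that is not inside a quoted string."""
    in_quotes = False
    escaped = False
    for i, ch in enumerate(addr):
        if escaped:
            escaped = False          # this character was backslash-escaped
        elif ch == "\\" and in_quotes:
            escaped = True           # the next character is taken literally
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == "@" and not in_quotes:
            return addr[:i], addr[i + 1:]
    raise ValueError("no @-sign found outside of a quoted string")


def normalize_local(local):
    """Un-quote and un-escape a local part, re-quoting only when needed."""
    if len(local) >= 2 and local[0] == '"' and local[-1] == '"':
        # Remove the surrounding quotes and undo backslash escaping.
        inner = re.sub(r"\\(.)", r"\1", local[1:-1])
        if _DOT_ATOM.match(inner):
            return inner             # the quotes were unnecessary
        # Minimal escaping: only quotes and backslashes need a backslash.
        return '"' + re.sub(r'(["\\])', r"\\\1", inner) + '"'
    return local


local, domain = split_address('"a@b"@example.com')
print(local, domain)                      # "a@b" example.com
print(normalize_local('"john.doe"'))      # john.doe
```

Here the quoted @-sign stays inside the local part instead of triggering a multiple-@-signs error, and a quoted local part that would be valid unquoted is returned without its quotes.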
CHANGELOG.md: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ There are no significant changes to which email addresses are considered valid/i
 * The dnspython package is no longer required if DNS checks are not used, although it will install automatically.
 * NoNameservers and NXDOMAIN DNS errors are now handled differently: NoNameservers no longer fails validation, and NXDOMAIN now skips checking for an A/AAAA fallback and goes straight to failing validation.
 * Some syntax error messages have changed because they are now checked explicitly rather than as a part of other checks.
+* The quoted-string local part syntax (e.g. multiple @-signs, spaces, etc. if surrounded by quotes) is now parsed but not considered valid by default. Better error messages are now given for quoted-string syntax since it can be confusing for a technically valid address to be rejected, and a new allow_quoted_local option is added to allow these addresses if you really need them.
 * Some other error messages have changed to not repeat the email address in the error message.
 * The library has been reorganized internally into smaller modules.
 * The tests have been reorganized and expanded. Deliverability tests now mostly use captured DNS responses so they can be run off-line.
@@ -103,8 +104,8 @@ But when an email address is valid, an object is returned containing
 a normalized form of the email address (which you should use!) and
 other information.

-The validator doesn't permit obsoleted forms of email addresses that no
-one uses anymore even though they are still valid and deliverable, since
+The validator doesn't, by default, permit obsoleted forms of email addresses
+that no one uses anymore even though they are still valid and deliverable, since
 they will probably give you grief if you're using email for login. (See
 later in the document about that.)

@@ -134,6 +135,8 @@ The `validate_email` function also accepts the following keyword arguments
 require the
 [SMTPUTF8](https://tools.ietf.org/html/rfc6531) extension. You can also set `email_validator.ALLOW_SMTPUTF8` to `False` to turn it off for all calls by default.

+`allow_quoted_local=False`: Set to `True` to allow obscure and potentially problematic email addresses in which the part of the address before the @-sign contains spaces, @-signs, or other surprising characters when the local part is surrounded in quotes (so-called quoted-string local parts). In the object returned by `validate_email`, the normalized local part removes any unnecessary backslash-escaping and even removes the surrounding quotes if the address would be valid without them. You can also set `email_validator.ALLOW_QUOTED_LOCAL` to `True` to turn this on for all calls by default.
+
 `allow_empty_local=False`: Set to `True` to allow an empty local part (i.e.
 `@example.com`), e.g. for validating Postfix aliases.

@@ -288,6 +291,11 @@ and conversion from Punycode to Unicode characters.
 3.1](https://tools.ietf.org/html/rfc6532#section-3.1) and [RFC 5895
+Normalization is also applied to quoted-string local parts if you have
+allowed them by the `allow_quoted_local` option. Unnecessary backslash
+escaping is removed and even the surrounding quotes are removed if they
+are unnecessary.
+
 Examples
 --------

@@ -355,9 +363,9 @@ are:

 | Field | Value |
 | -----:|-------|
-|`email`| The normalized form of the email address that you should put in your database. This merely combines the `local_part` and `domain` fields (see below). |
+|`email`| The normalized form of the email address that you should put in your database. This combines the `local_part` and `domain` fields (see below). |
 |`ascii_email`| If set, an ASCII-only form of the email address by replacing the domain part with [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt). This field will be present when an ASCII-only form of the email address exists (including if the email address is already ASCII). If the local part of the email address contains internationalized characters, `ascii_email` will be `None`. If set, it merely combines `ascii_local_part` and `ascii_domain`. |
-|`local_part`| The local part of the given email address (before the @-sign) with Unicode NFC normalization applied. |
+|`local_part`| The normalized local part of the given email address (before the @-sign). Normalization includes Unicode NFC normalization and removing unnecessary quoted-string quotes and backslashes. If `allow_quoted_local` is True and the surrounding quotes are necessary, the quotes _will_ be present in this field. |
 |`ascii_local_part`| If set, the local part, which is composed of ASCII characters only. |
 |`domain`| The canonical internationalized Unicode form of the domain part of the email address. If the returned string contains non-ASCII characters, either the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit the message or else the email address's domain part must be converted to IDNA ASCII first: Use `ascii_domain` field instead. |
 |`ascii_domain`| The [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)-encoded form of the domain part of the given email address, as it would be transmitted on the wire. |
@@ -383,9 +391,9 @@ or likely to cause trouble:
 (except see the `test_environment` parameter above).
 * Obsolete email syntaxes are rejected:
 The "quoted string" form of the local part of the email address (RFC
-5321 4.1.2) is not permitted.
-Quoted forms allow multiple @-signs, space characters, and other
-troublesome conditions. The unusual [(comment) syntax](https://github.com/JoshData/python-email-validator/issues/77)
+5321 4.1.2) is not permitted unless `allow_quoted_local=True` is given
+(see above).
+The unusual ["(comment)" syntax](https://github.com/JoshData/python-email-validator/issues/77)
 is also rejected. The "literal" form for the domain part of an email address (an
 IP address in brackets) is rejected. Other obsolete and deprecated syntaxes are