Commit 99e5122

Rename the email field of ValidatedEmail to normalized to be clearer about its importance
1 parent bfa538f commit 99e5122

6 files changed, 66 additions and 55 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ There are no significant changes to which email addresses are considered valid/i
 * Some syntax error messages have changed because they are now checked explicitly rather than as a part of other checks.
 * The quoted-string local part syntax (e.g. multiple @-signs, spaces, etc. if surrounded by quotes) and domain-literal addresses (e.g. @[192.XXX...] or @[IPv6:...]) are now parsed but not considered valid by default. Better error messages are now given for these addresses since it can be confusing for a technically valid address to be rejected, and new allow_quoted_local and allow_domain_literal options are added to allow these addresses if you really need them.
 * Some other error messages have changed to not repeat the email address in the error message.
+* The `email` field on the returned `ValidatedEmail` object has been renamed to `normalized` to be clearer about its importance, but access via `.email` is also still supported.
 * The library has been reorganized internally into smaller modules.
 * The tests have been reorganized and expanded. Deliverability tests now mostly use captured DNS responses so they can be run off-line.
 * The __main__ tool now reads options to validate_email from environment variables.
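
Reviewer note: callers that have to run against releases on both sides of this rename could read the new attribute and fall back to the old one. The sketch below is illustrative only. `normalized_address` is a hypothetical helper, not part of email_validator, and the `SimpleNamespace` objects merely stand in for the value returned by `validate_email`.

```python
from types import SimpleNamespace


def normalized_address(info):
    """Return the normalized address from a ValidatedEmail-like object.

    Hypothetical helper: prefers the new `normalized` attribute and falls
    back to the older `email` attribute when `normalized` is absent.
    """
    try:
        return info.normalized
    except AttributeError:
        return info.email


# Stand-ins for objects returned before and after the rename.
old_style = SimpleNamespace(email="me@domain.com")
new_style = SimpleNamespace(normalized="me@domain.com")

print(normalized_address(old_style))
print(normalized_address(new_style))
```

Because the commit keeps `.email` working, a helper like this should only matter for code that also has to support releases that predate `normalized`.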

README.md

Lines changed: 18 additions & 20 deletions
@@ -65,11 +65,11 @@ try:
   # Check that the email address is valid. Turn on check_deliverability
   # for first-time validations like on account creation pages (but not
   # login pages).
-  validation = validate_email(email, check_deliverability=False)
+  emailinfo = validate_email(email, check_deliverability=False)

   # After this point, use only the normalized form of the email address,
   # especially before going to a database query.
-  email = validation.email
+  email = emailinfo.normalized

 except EmailNotValidError as e:

@@ -158,7 +158,7 @@ from email_validator import validate_email, caching_resolver
 resolver = caching_resolver(timeout=10)

 while True:
-  email = validate_email(email, dns_resolver=resolver).email
+  validate_email(email, dns_resolver=resolver)
 ```

 ### Test addresses
@@ -249,8 +249,8 @@ This library gives you back the ASCII-ized form in the `ascii_email`
 field in the returned object, which you can get like this:

 ```python
-valid = validate_email(email, allow_smtputf8=False)
-email = valid.ascii_email
+emailinfo = validate_email(email, allow_smtputf8=False)
+email = emailinfo.ascii_email
 ```

 The local part is left alone (if it has internationalized characters
@@ -266,7 +266,7 @@ Normalization

 The use of Unicode in email addresses introduced a normalization
 problem. Different Unicode strings can look identical and have the same
-semantic meaning to the user. The `email` field returned on successful
+semantic meaning to the user. The `normalized` field returned on successful
 validation provides the correctly normalized form of the given email
 address.

@@ -275,9 +275,9 @@ equivalent in domain names to their ASCII counterparts. This library
 normalizes them to their ASCII counterparts:

 ```python
-valid = validate_email("me@Domain.com")
-print(valid.email)
-print(valid.ascii_email)
+emailinfo = validate_email("me@Domain.com")
+print(emailinfo.normalized)
+print(emailinfo.ascii_email)
 # prints "me@domain.com" twice
 ```

@@ -321,7 +321,7 @@ For the email address `test@joshdata.me`, the returned object is:

 ```python
 ValidatedEmail(
-  email='test@joshdata.me',
+  normalized='test@joshdata.me',
   local_part='test',
   domain='joshdata.me',
   ascii_email='test@joshdata.me',
@@ -335,7 +335,7 @@ internationalized domain but ASCII local part, the returned object is:

 ```python
 ValidatedEmail(
-  email='example@ツ.life',
+  normalized='example@ツ.life',
   local_part='example',
   domain='ツ.life',
   ascii_email='example@xn--bdk.life',
@@ -345,20 +345,20 @@ ValidatedEmail(

 ```

-Note that the `email` and `domain` fields provide a normalized form of the
+Note that `normalized` and other fields provide a normalized form of the
 email address, domain name, and (in other cases) local part (see earlier
 discussion of normalization), which you should use in your database.

 Calling `validate_email` with the ASCII form of the above email address,
 `example@xn--bdk.life`, returns the exact same information (i.e., the
-`email` field always will contain Unicode characters, not Punycode).
+`normalized` field always will contain Unicode characters, not Punycode).

 For the fictitious address `ツ-test@joshdata.me`, which has an
 internationalized local part, the returned object is:

 ```python
 ValidatedEmail(
-  email='ツ-test@joshdata.me',
+  normalized='ツ-test@joshdata.me',
   local_part='ツ-test',
   domain='joshdata.me',
   ascii_email=None,
@@ -368,10 +368,8 @@ ValidatedEmail(
 ```

 Now `smtputf8` is `True` and `ascii_email` is `None` because the local
-part of the address is internationalized. The `local_part` and `email` fields
-return the normalized form of the address: certain Unicode characters
-(such as angstrom and ohm) may be replaced by other equivalent code
-points (a-with-ring and omega).
+part of the address is internationalized. The `local_part` and `normalized` fields
+return the normalized form of the address.

 Return value
 ------------
@@ -381,8 +379,8 @@ are:

 | Field | Value |
 | -----:|-------|
-| `email` | The normalized form of the email address that you should put in your database. This combines the `local_part` and `domain` fields (see below). |
-| `ascii_email` | If set, an ASCII-only form of the email address by replacing the domain part with [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt). This field will be present when an ASCII-only form of the email address exists (including if the email address is already ASCII). If the local part of the email address contains internationalized characters, `ascii_email` will be `None`. If set, it merely combines `ascii_local_part` and `ascii_domain`. |
+| `normalized` | The normalized form of the email address that you should put in your database. This combines the `local_part` and `domain` fields (see below). |
+| `ascii_email` | If set, an ASCII-only form of the normalized email address by replacing the domain part with [IDNA](https://tools.ietf.org/html/rfc5891) [Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt). This field will be present when an ASCII-only form of the email address exists (including if the email address is already ASCII). If the local part of the email address contains internationalized characters, `ascii_email` will be `None`. If set, it merely combines `ascii_local_part` and `ascii_domain`. |
 | `local_part` | The normalized local part of the given email address (before the @-sign). Normalization includes Unicode NFC normalization and removing unnecessary quoted-string quotes and backslashes. If `allow_quoted_local` is True and the surrounding quotes are necessary, the quotes _will_ be present in this field. |
 | `ascii_local_part` | If set, the local part, which is composed of ASCII characters only. |
 | `domain` | The canonical internationalized Unicode form of the domain part of the email address. If the returned string contains non-ASCII characters, either the [SMTPUTF8](https://tools.ietf.org/html/rfc6531) feature of your mail relay will be required to transmit the message or else the email address's domain part must be converted to IDNA ASCII first: Use `ascii_domain` field instead. |
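
Reviewer note: the normalization the README describes, including the angstrom and ohm examples that this commit drops from the prose, can be illustrated with the standard library alone. This is a sketch under assumptions, not the library's implementation. It uses NFC because the `local_part` row names NFC, and it uses Python's built-in `idna` codec, which follows the older IDNA 2003 rules rather than the RFC 5891 rules linked in the table, so results may differ for some domains.

```python
import unicodedata

# The angstrom sign (U+212B) and ohm sign (U+2126) normalize under NFC to
# a-with-ring (U+00C5) and Greek capital omega (U+03A9) respectively.
for ch in ("\u212b", "\u2126"):
    nfc = unicodedata.normalize("NFC", ch)
    print(f"U+{ord(ch):04X} -> U+{ord(nfc):04X}")

# An internationalized domain and its ASCII (Punycode) form.
domain = "ツ.life"
ascii_domain = domain.encode("idna").decode("ascii")
print(ascii_domain)  # xn--bdk.life, the same form shown in the README

# Decoding the ASCII form gives back the Unicode form.
round_trip = ascii_domain.encode("ascii").decode("idna")
print(round_trip == domain)
```

Two visually identical addresses can therefore differ as raw strings, which is why the README recommends storing only the normalized form.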

email_validator/exceptions_types.py

Lines changed: 13 additions & 5 deletions
@@ -22,13 +22,13 @@ class ValidatedEmail(object):
     and other information."""

     """The email address that was passed to validate_email. (If passed as bytes, this will be a string.)"""
-    original_email: str
+    original: str

     """The normalized email address, which should always be used in preferance to the original address.
     The normalized address converts an IDNA ASCII domain name to Unicode, if possible, and performs
     Unicode normalization on the local part and on the domain (if originally Unicode). It is the
     concatenation of the local_part and domain attributes, separated by an @-sign."""
-    email: str
+    normalized: str

     """The local part of the email address after Unicode normalization."""
     local_part: str
@@ -68,14 +68,22 @@ def __init__(self, **kwargs):
             setattr(self, k, v)

     def __repr__(self):
-        return f"<ValidatedEmail {self.email}>"
+        return f"<ValidatedEmail {self.normalized}>"
+
+    """For backwards compatibility, support old field names."""
+    def __getattr__(self, key):
+        if key == "original_email":
+            return self.original
+        if key == "email":
+            return self.normalized
+        raise AttributeError()

     """For backwards compatibility, some fields are also exposed through a dict-like interface. Note
     that some of the names changed when they became attributes."""
     def __getitem__(self, key):
         warnings.warn("dict-like access to the return value of validate_email is deprecated and may not be supported in the future.", DeprecationWarning, stacklevel=2)
         if key == "email":
-            return self.email
+            return self.normalized
         if key == "email_ascii":
             return self.ascii_email
         if key == "local":
@@ -97,7 +105,7 @@ def __eq__(self, other):
         if not isinstance(other, ValidatedEmail):
             return False
         return (
-            self.email == other.email
+            self.normalized == other.normalized
             and self.local_part == other.local_part
             and self.domain == other.domain
             and getattr(self, 'ascii_email', None) == getattr(other, 'ascii_email', None)
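
Reviewer note: the compatibility shim in this file relies on Python calling `__getattr__` only after normal attribute lookup has failed, so the new names are never routed through it. The sketch below reproduces the pattern in isolation. `Record` and `_OLD_NAMES` are hypothetical names used for illustration, and unlike the commit's version the sketch passes the missing name to `AttributeError`.

```python
class Record:
    """Hypothetical stand-in for a class whose fields were renamed."""

    # Old attribute name -> new attribute name.
    _OLD_NAMES = {"email": "normalized", "original_email": "original"}

    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

    def __getattr__(self, key):
        # Reached only when `key` was not found by normal lookup.
        new_name = self._OLD_NAMES.get(key)
        if new_name is not None:
            return getattr(self, new_name)
        raise AttributeError(key)


rec = Record(original="Me@Domain.com", normalized="Me@domain.com")
print(rec.email)           # resolved through the alias to rec.normalized
print(rec.original_email)  # resolved through the alias to rec.original
print(hasattr(rec, "missing"))
```

Unknown names still raise `AttributeError`, so `hasattr` and `getattr` with a default keep behaving as callers expect.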

email_validator/validate_email.py

Lines changed: 8 additions & 8 deletions
@@ -76,7 +76,7 @@ def validate_email(

     # Collect return values in this instance.
     ret = ValidatedEmail()
-    ret.original_email = email
+    ret.original = email

     # Validate the email address's local part syntax and get a normalized form.
     # If the original address was quoted and the decoded local part is a valid
@@ -113,7 +113,7 @@ def validate_email(
         ret.ascii_domain = domain_part_info["ascii_domain"]

     # Construct the complete normalized form.
-    ret.email = ret.local_part + "@" + ret.domain
+    ret.normalized = ret.local_part + "@" + ret.domain

     # If the email address has an ASCII form, add it.
     if not ret.smtputf8:
@@ -144,20 +144,20 @@ def validate_email(
     #
     # See the length checks on the local part and the domain.
     if ret.ascii_email and len(ret.ascii_email) > EMAIL_MAX_LENGTH:
-        if ret.ascii_email == ret.email:
+        if ret.ascii_email == ret.normalized:
             reason = get_length_reason(ret.ascii_email)
-        elif len(ret.email) > EMAIL_MAX_LENGTH:
+        elif len(ret.normalized) > EMAIL_MAX_LENGTH:
             # If there are more than 254 characters, then the ASCII
             # form is definitely going to be too long.
-            reason = get_length_reason(ret.email, utf8=True)
+            reason = get_length_reason(ret.normalized, utf8=True)
         else:
             reason = "(when converted to IDNA ASCII)"
         raise EmailSyntaxError(f"The email address is too long {reason}.")
-    if len(ret.email.encode("utf8")) > EMAIL_MAX_LENGTH:
-        if len(ret.email) > EMAIL_MAX_LENGTH:
+    if len(ret.normalized.encode("utf8")) > EMAIL_MAX_LENGTH:
+        if len(ret.normalized) > EMAIL_MAX_LENGTH:
             # If there are more than 254 characters, then the UTF-8
             # encoding is definitely going to be too long.
-            reason = get_length_reason(ret.email, utf8=True)
+            reason = get_length_reason(ret.normalized, utf8=True)
         else:
             reason = "(when encoded in bytes)"
         raise EmailSyntaxError(f"The email address is too long {reason}.")
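
Reviewer note: the second check in this hunk separates an address with too many characters from one that only exceeds the limit once encoded as UTF-8. The sketch below shows that distinction on its own. `too_long_reason` is a hypothetical helper, the limit of 254 is taken from the comments in the diff, and the returned strings are placeholders rather than the library's messages.

```python
EMAIL_MAX_LENGTH = 254  # assumed from the "254 characters" comments in the diff


def too_long_reason(address):
    """Return a short reason if `address` exceeds the limit, else None.

    Hypothetical helper that mirrors the UTF-8 length check: the byte
    length is tested first, then the character count decides the reason.
    """
    if len(address.encode("utf8")) <= EMAIL_MAX_LENGTH:
        return None
    if len(address) > EMAIL_MAX_LENGTH:
        return "too many characters"
    return "too long when encoded in bytes"


short = "me@example.com"
wide = "ツ" * 100 + "@example.com"        # 112 characters, 312 UTF-8 bytes
long_ascii = "a" * 250 + "@example.com"  # 262 characters

for addr in (short, wide, long_ascii):
    print(len(addr), len(addr.encode("utf8")), too_long_reason(addr))
```

The `wide` case is the one the `else` branch in the diff exists for: the character count is well under the limit, but each `ツ` takes three bytes in UTF-8.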

tests/test_main.py

Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@ def test_dict_accessor():
     input_email = "testaddr@example.tld"
     valid_email = validate_email(input_email, check_deliverability=False)
     assert isinstance(valid_email.as_dict(), dict)
-    assert valid_email.as_dict()["original_email"] == input_email
+    assert valid_email.as_dict()["original"] == input_email


 def test_main_single_good_input(monkeypatch, capsys):
@@ -24,7 +24,7 @@ def test_main_single_good_input(monkeypatch, capsys):
     stdout, _ = capsys.readouterr()
     output = json.loads(str(stdout))
     assert isinstance(output, dict)
-    assert validate_email(test_email, dns_resolver=RESOLVER).original_email == output["original_email"]
+    assert validate_email(test_email, dns_resolver=RESOLVER).original == output["original"]


 def test_main_single_bad_input(monkeypatch, capsys):
@@ -53,7 +53,7 @@ def test_bytes_input():
     input_email = b"testaddr@example.tld"
     valid_email = validate_email(input_email, check_deliverability=False)
     assert isinstance(valid_email.as_dict(), dict)
-    assert valid_email.as_dict()["email"] == input_email.decode("utf8")
+    assert valid_email.as_dict()["normalized"] == input_email.decode("utf8")

     input_email = "testaddr中example.tld".encode("utf32")
     with pytest.raises(EmailSyntaxError):
