Commit 022e8cd

Tests: Add test checking value of after_sentence_split feature
1 parent 31cef7d commit 022e8cd

1 file changed: +13 -0 lines changed

tests/preprocess/test_sentence_structure_features.py

Lines changed: 13 additions & 0 deletions
@@ -102,3 +102,16 @@ def test_detect_compound_sentence_multiple_splits(self):
             "2 medium-ripe tomatoes or 4 plum tomatoes or 8 to 10 cherry tomatoes"
         )
         assert p.sentence_structure.sentence_splits == [3, 7]
+
+    def test_after_sentence_split_feature(self):
+        """
+        Test that the or-number-size sequence is identified as split point.
+        """
+        p = PreProcessor("2 small carrots or 1 large carrot")
+
+        # Assert that only the tokens after 3 have after_sentence_split feature.
+        for i, token_features in enumerate(p.sentence_features()):
+            if i >= 3:
+                assert token_features.get("after_sentence_split", False)
+            else:
+                assert not token_features.get("after_sentence_split", False)
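
The behaviour this test pins down can be illustrated with a small standalone sketch. It is not the project's implementation: `after_split_flags` is a hypothetical helper, the whitespace tokenisation is a simplification of whatever `PreProcessor` does, and the rule "every token at or after the first split index is flagged" is an assumption inferred only from the assertions in the diff above.

```python
from __future__ import annotations


def after_split_flags(num_tokens: int, sentence_splits: list[int]) -> list[bool]:
    """Hypothetical helper: flag each token at or after the first split index.

    Assumption (inferred from the test, not from the library source): once a
    split point is found, it and all later tokens carry the feature.
    """
    if not sentence_splits:
        # No split detected, so no token is flagged.
        return [False] * num_tokens
    first_split = min(sentence_splits)
    return [i >= first_split for i in range(num_tokens)]


# Whitespace tokenisation is a stand-in for the real tokenizer.
tokens = "2 small carrots or 1 large carrot".split()
flags = after_split_flags(len(tokens), [3])  # split at "or", index 3

for i, (token, flag) in enumerate(zip(tokens, flags)):
    print(i, token, flag)
```

Under these assumptions, "2", "small" and "carrots" (indices 0 to 2) are unflagged, while "or" and everything after it (indices 3 to 6) are flagged, which mirrors the `i >= 3` branch in the test.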
