Commit 7a38e20

Update phishing_email_detection_gpt2.py
Try to fix issue with batch_size and dtype with string tokenization...
1 parent fdc4812 commit 7a38e20

1 file changed: +2 -1 lines changed
phishing_email_detection_gpt2.py

Lines changed: 2 additions & 1 deletion
@@ -217,7 +217,8 @@ def call(self, inputs):
 # if isinstance(inputs, tf.Tensor):
 # # Convert tensor to a list of strings
 # inputs = inputs.numpy().astype("U").tolist()
-# Tokenize each input string separately
+
+inputs = [x.decode('utf-8') for x in inputs.numpy()]
 tokenized = self.tokenizer(inputs,
 max_length=self.max_seq_length,
 padding='max_length',
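
Note on the added line: calling `.numpy()` on a `tf.string` tensor yields byte strings, which is why each element is decoded before it reaches the tokenizer. The list comprehension in the diff assumes a one-dimensional batch. If the layer receives inputs of shape `(batch, 1)`, each `x` would be an array rather than `bytes`, and `.decode` would fail. The sketch below is one possible way to handle both shapes. `decode_batch` is a hypothetical helper, not part of this commit, and it uses plain Python lists to stand in for the result of `inputs.numpy()`.

```python
def decode_batch(raw):
    """Return a flat list of str from a batch of UTF-8 byte strings.

    `raw` may be a list/tuple, a single bytes object, or anything exposing
    .tolist() (for example the NumPy array returned by .numpy() on a
    tf.string tensor). Hypothetical helper for illustration only.
    """
    if hasattr(raw, "tolist"):
        raw = raw.tolist()  # NumPy array -> (possibly nested) Python lists
    if isinstance(raw, bytes):
        return [raw.decode("utf-8")]  # scalar input: a single string
    out = []
    for item in raw:
        if isinstance(item, (list, tuple)):
            # Shape (batch, 1) or deeper: flatten recursively
            out.extend(decode_batch(item))
        elif isinstance(item, bytes):
            out.append(item.decode("utf-8"))
        else:
            out.append(str(item))  # already text (or other scalar)
    return out


# Stand-ins for what inputs.numpy() would contain
flat_batch = [b"Verify your account now", b"Meeting moved to 3pm"]
nested_batch = [[b"caf\xc3\xa9"], [b"hello"]]

print(decode_batch(flat_batch))
print(decode_batch(nested_batch))
```

The result is a plain `list` of `str`, which is the input form the tokenizer call in `call()` is given in this commit.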
