Description
When using model.evaluate(), the metric values displayed in the progress bar differ from the values returned by the method. There appears to be double averaging: the per-batch values are already running averages, and the progress bar then averages those averages again instead of reporting the true running mean.
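For intuition, here is a rough numeric sketch of the suspected mechanism (this is not the actual Keras progress-bar code, and the exact weighting inside it evidently differs, since this gives 3.25 rather than the displayed 3.4545). Averaging the per-step running means instead of the raw per-sample errors deflates the final figure:

import numpy as np

# Per-sample absolute errors for x = 1..10 against y = 0
errors = np.arange(1, 11, dtype=float)
# Running mean after each batch_size=1 step, i.e. the value a bar could display at that step
running_means = np.cumsum(errors) / np.arange(1, 11)
print(running_means[-1])     # 5.5  -> the true MAE, matching the returned value
print(running_means.mean())  # 3.25 -> an average of averages, systematically too low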
Code to reproduce:
import tensorflow as tf
import numpy as np

# Identity model: a Lambda layer that passes the input straight through to the output
model = tf.keras.Sequential([
    tf.keras.layers.Lambda(lambda x: x)
])

# Compile with MSE as the loss and MAE as the metric
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Dummy data: predictions will be 1..10, targets are all zeros
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.zeros_like(x)

results = model.evaluate(x, y, verbose=1, batch_size=1)
print("Evaluation results:", results)
Output:
10/10 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 17.8182 - mae: 3.4545
Evaluation results: [38.5, 5.5]
Expected behavior:
The metric values shown in the progress bar should match the final returned results, or the difference should be clearly documented if it is intentional.
Issue:
Progress bar shows loss: 17.8182, mae: 3.4545
Returned values are loss: 38.5, mae: 5.5
The returned values (38.5 and 5.5) are the correct ones; they match a manual calculation (see the check after this list)
The progress bar appears to be averaging batch values that are already running averages
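For reference, the returned numbers can be reproduced directly with NumPy (a minimal check using the same data as the reproduction script):

import numpy as np

x = np.arange(1, 11, dtype=float)
y = np.zeros_like(x)
print(np.mean((x - y) ** 2))   # 38.5 -> matches the returned loss (MSE)
print(np.mean(np.abs(x - y)))  # 5.5  -> matches the returned mae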
Environment:
TensorFlow 2.19
Additional notes:
This discrepancy can be confusing for users who rely on the progress bar metrics during evaluation.