evaluate() shows incorrect metric values in progress bar vs returned results #21301


Open

ielenik opened this issue May 18, 2025 · 1 comment
ielenik commented May 18, 2025

When using model.evaluate(), the metric values displayed in the progress bar differ from the values returned by the method. Double averaging appears to be happening: each batch value is already an average over that batch, and the progress bar then averages these per-batch averages again instead of keeping a single running mean over all samples.
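
To illustrate the suspected mechanism, here is a minimal NumPy sketch of how averaging already-averaged running values diverges from a true mean over all samples. This is only an illustration of the effect, not the actual progress-bar code; the resulting number does not exactly match the 3.4545 shown below, so the precise aggregation inside Keras may differ:

import numpy as np

# Per-batch MAE values for the reproduction below (batch_size=1,
# targets are zero, so each batch's MAE is simply |x_k|).
per_batch_mae = np.arange(1, 11, dtype=float)  # 1.0, 2.0, ..., 10.0

# Correct aggregation: one mean over all samples.
true_mean = per_batch_mae.mean()  # 5.5

# Double averaging: take the running mean shown at each step,
# then average those running means again.
running_means = np.cumsum(per_batch_mae) / np.arange(1, 11)
mean_of_means = running_means.mean()  # 3.25

print(true_mean, mean_of_means)  # 5.5 vs. 3.25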
Code to reproduce:

import tensorflow as tf
import numpy as np

# Identity model: passes the input straight through to the output
model = tf.keras.Sequential([
    tf.keras.layers.Lambda(lambda x: x)  # Lambda layer to pass input directly to output
])

# Compile the model with MAE as the metric
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Dummy data for evaluation
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.zeros_like(x)  # Dummy target values

results = model.evaluate(x, y, verbose=1, batch_size=1)
print("Evaluation results:", results)

Output:

10/10 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 17.8182 - mae: 3.4545
Evaluation results: [38.5, 5.5]

Expected behavior:
The metric values shown in the progress bar should match the final returned results (or at least be clearly documented if this difference is intentional).

Issue:

Progress bar shows loss: 17.8182, MAE: 3.4545

Returned values show loss: 38.5, MAE: 5.5

The correct values are the returned ones (38.5 and 5.5, respectively), since they match a manual calculation (see the check after this list)

The progress bar seems to be averaging already-averaged batch values
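
As a sanity check, the returned values can be re-derived directly with NumPy from the data in the reproduction above:

import numpy as np

x = np.arange(1, 11, dtype=float)  # same data as in the reproduction
y = np.zeros_like(x)

print("MSE:", np.mean((x - y) ** 2))   # 38.5 -> matches the returned loss
print("MAE:", np.mean(np.abs(x - y)))  # 5.5  -> matches the returned mae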

Environment:
TensorFlow 2.19

Additional notes:
This discrepancy can be confusing for users who rely on the progress bar metrics during evaluation.

@sonali-kumari1
Contributor

Hi @ielenik -

I have tested your code with the latest versions of Keras (3.9.2) and TensorFlow (2.19.0) in this gist, and I was able to reproduce the mismatch between the metric values shown in the progress bar and the values returned by the evaluate() method. However, when I tested with Keras (2.15.0) and TensorFlow (2.15.0) using this gist, the results were consistent between the progress bar and the final evaluation output. We will look into this and update you. Thanks!
