Hi, I'm trying to reproduce the section 'Format Data Part 2: Windowing dataset' here with the TensorFlow dataset-specific function tf.keras.preprocessing.timeseries_dataset_from_array(). The notebook uses its own windowing method; I tried the timeseries_dataset_from_array function instead.

The shape of my datasets looks suspicious: I was hoping for ((None, 7), (None,)).

The problem is that when I train my model, I get a very different MAE from the notebook: 890, versus the notebook's 568 when evaluating Model 1.

Does anyone see what I'm missing? Thank you for your help.
Replies: 1 comment 4 replies
Hey there,

Massive effort giving this a go.

I'd say the difference could be coming from when you create your train/test sets. You may have to make the split before you window. I'd also check where the starting indexes of your inputs and targets are.

Example 2 in the docs is the most similar to what you'd want to set up:

import numpy as np
import tensorflow as tf

data = np.arange(100)  # stand-in series (an assumption, swap in your own data)

input_data = data[:-10]  # <- inputs indexed
targets = data[10:]  # <- also indexed
dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    input_data, targets, sequence_length=10)
for batch in dataset:
    inputs, targets = batch
    assert np.array_equal(inputs[0], data[:10])  # First sequence: steps [0-9]
    assert np.array_equal(targets[0], data[10])  # Corresponding target: step 10
    break
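To make the "split before you window" point concrete, here is a rough NumPy-only sketch rather than the notebook's actual code. The window size of 7 and the single-step horizon are assumptions inferred from the expected shape ((None, 7), (None,)); the 80/20 split, the stand-in series, and the helper name make_windows are hypothetical.

```python
import numpy as np

WINDOW = 7   # assumed from the expected input shape (None, 7)
HORIZON = 1  # assumed: one target value per window


def make_windows(series, window=WINDOW, horizon=HORIZON):
    """Return (windows, labels); each label sits `horizon` steps after its window."""
    series = np.asarray(series)
    full = np.lib.stride_tricks.sliding_window_view(series, window + horizon)
    return full[:, :window], full[:, -1]


series = np.arange(100, dtype=np.float32)  # stand-in data, not real prices
split = int(0.8 * len(series))             # hypothetical 80/20 chronological split

# Option A: window the whole series first, then split the windows.
X_all, y_all = make_windows(series)
n_train = int(0.8 * len(X_all))
X_train_a, y_train_a = X_all[:n_train], y_all[:n_train]
X_test_a, y_test_a = X_all[n_train:], y_all[n_train:]

# Option B: split the raw series first, then window each part separately.
X_train_b, y_train_b = make_windows(series[:split])
X_test_b, y_test_b = make_windows(series[split:])

print("A test windows:", X_test_a.shape, "first label:", y_test_a[0])
print("B test windows:", X_test_b.shape, "first label:", y_test_b[0])
```

The two options give test sets of different sizes that start at different indexes, so a metric such as MAE computed on one is not directly comparable with the other.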
Edit: @remy-r has discovered the difference is because NumPy arrays are shuffled (shuffle=True) by default when using the fit() function - https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit

However, when using tf.data.Dataset objects, such as those created with tf.keras.utils.timeseries_dataset_from_array(), the shuffle parameter in fit() gets ignored.

This explains the different results. When tf.data.Dataset objects are shuffled manually with .shuffle(), the results start to line up.

See the comment below for more: #199 (reply in thread)
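As an illustration of the shuffling point, here is a minimal sketch, not the exact code from the thread. The stand-in series, WINDOW = 7, the batch size, and the seed are all placeholder assumptions. It shows two ways to get shuffled windows: the helper's own shuffle argument, or shuffling an existing dataset by hand. Note that calling .shuffle() on an already-batched dataset only reorders whole batches, so the sketch unbatches first to shuffle individual windows.

```python
import numpy as np
import tensorflow as tf

WINDOW = 7  # assumed window size
series = np.arange(100, dtype=np.float32)  # stand-in data

inputs = series[:-WINDOW]  # window i starts at series[i]
targets = series[WINDOW:]  # target for window i is series[i + WINDOW]

# Default behaviour: windows come out in chronological order (shuffle=False).
ordered = tf.keras.utils.timeseries_dataset_from_array(
    inputs, targets, sequence_length=WINDOW, batch_size=32)

# Option 1: let the helper shuffle the windows itself.
shuffled = tf.keras.utils.timeseries_dataset_from_array(
    inputs, targets, sequence_length=WINDOW, batch_size=32,
    shuffle=True, seed=42)

# Option 2: shuffle an existing dataset manually, window by window.
reshuffled = ordered.unbatch().shuffle(buffer_size=1000, seed=42).batch(32)

# Either shuffled dataset can then be passed to fit(), for example:
# model.fit(shuffled, epochs=...)  # fit() does not reshuffle a tf.data.Dataset
```

Shuffling only makes sense for the training data; the validation and test datasets can stay in chronological order.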