Why does tf.nn.sparse_softmax_cross_entropy_with_logits() return nan?
inp, tar = sess.run(el)
print(tar[:1])
[[912 0 53 145 0 155 45 50 15 48 924 225 912 0 235 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]]
dec_outputs = decoder(y, context_vector, hidden, batch_sz)[:1]
[]
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=dec_outputs))
loss_ = sess.run(loss, feed_dict={x: inp, y: tar})
Most likely the problem is not in the network architecture but in the data. Such an error can occur, for example, if one of the samples turns out to have zero length: the loss for that sample becomes NaN, the gradients become NaN, and the weights they flow through turn into NaN as well.
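A quick way to check this hypothesis (a sketch; the name tar and the assumption that 0 is the padding id come from the printed batch above) is to count the non-padding tokens in each target row:

import numpy as np

# number of real (non-padding) tokens per sample, assuming 0 is only used for padding
lengths = np.count_nonzero(tar, axis=1)
print("samples with no real tokens:", np.where(lengths == 0)[0])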
The run of zeros in the printed first target also looks suspicious. Usually 0 is just padding used to bring the sequence up to the required length, while OOV words and the end of the phrase are marked with separate codes/characters.
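If that is the case here, a common fix (a sketch, assuming 0 is used only as the padding id) is to exclude the padded positions from the loss instead of averaging over them:

# per-token cross-entropy, then zero out the padded positions before averaging
per_token = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=dec_outputs)
mask = tf.cast(tf.not_equal(y, 0), per_token.dtype)   # 1.0 for real tokens, 0.0 for padding
loss = tf.reduce_sum(per_token * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)

Separate codes for OOV words and for the end of the sequence should then be added to the vocabulary, so that 0 really means only padding.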