How to solve the batch_size problem?
import numpy as np
import tensorflow as tf

def construct_autoregressive_mask(target):
    """
    Args: Original target of word ids, shape [batch, seq_len]
    Returns: a mask of shape [batch, seq_len, seq_len].
    """
    # Static shape: batch_size is None if the batch dimension is unknown at graph-build time
    batch_size, seq_len = target.shape.as_list()
    print(batch_size)
    # Lower-triangular matrix of ones (including the diagonal)
    tri_matrix = np.zeros((seq_len, seq_len))
    tri_matrix[np.tril_indices(seq_len)] = 1
    mask = tf.convert_to_tensor(tri_matrix, dtype=tf.float32)
    masks = tf.tile(tf.expand_dims(mask, 0), (batch_size, 1, 1))  # copies
    return masks
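target.shape.as_list() returns the static shape, which contains None for any dimension (such as the batch size) that is not known when the graph is built. Use tf.shape(target) instead: it returns a tensor holding the actual shape at run time, so it works for a tensor whose size can change from step to step.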
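As a minimal sketch of that suggestion, the function from the question could be rewritten roughly like this. It assumes TensorFlow 2.x. Because seq_len is now also a tensor, the NumPy np.tril_indices step is swapped for tf.linalg.band_part, which is my substitution and not part of the original code. The wrapper name mask_for_any_batch is hypothetical and only there to simulate an unknown batch size.

```python
import tensorflow as tf

def construct_autoregressive_mask(target):
    """Lower-triangular mask of shape [batch, seq_len, seq_len]."""
    shape = tf.shape(target)                  # dynamic shape, known at run time
    batch_size, seq_len = shape[0], shape[1]
    ones = tf.ones(tf.stack([seq_len, seq_len]), dtype=tf.float32)
    # band_part(x, -1, 0) keeps the diagonal and everything below it
    mask = tf.linalg.band_part(ones, -1, 0)
    return tf.tile(tf.expand_dims(mask, 0), tf.stack([batch_size, 1, 1]))

# Hypothetical wrapper: the input signature leaves both dimensions unknown,
# so target.shape.as_list() would give [None, None] inside the graph.
@tf.function(input_signature=[tf.TensorSpec([None, None], tf.int32)])
def mask_for_any_batch(target):
    return construct_autoregressive_mask(target)

masks = mask_for_any_batch(tf.zeros((2, 3), dtype=tf.int32))
print(masks.shape)
```

The same traced function can then be called with batches of different sizes without being rebuilt.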
TensorFlow, by the way, broadcasts shapes automatically. If you add a tensor A of shape (K, L, M) to a tensor B of shape (1, L, M), TF implicitly repeats B K times along the first axis. Likewise, when adding tensors of shapes (K, 1) and (1, L), both are expanded and the result has shape (K, L): https://www.tensorflow.org/xla/broadcasting
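The broadcasting rule can be checked with a short sketch like the one below. The concrete sizes (K=4, L=3, M=3) and the variable names are illustrative assumptions. The last lines suggest that, for a mask like the one in the question, the explicit tf.tile may not be needed at all if the mask is only ever multiplied with or added to a batched tensor.

```python
import tensorflow as tf

a = tf.ones((4, 3, 3))                      # shape (K, L, M)
b = tf.ones((1, 3, 3))                      # shape (1, L, M)
total = a + b                               # b is broadcast along axis 0
print(total.shape)

col = tf.reshape(tf.range(2), (2, 1))       # shape (K, 1)
row = tf.reshape(tf.range(3), (1, 3))       # shape (1, L)
grid = col + row                            # both are broadcast
print(grid.shape)

# A single (1, seq_len, seq_len) mask applied to a whole batch, without tf.tile
scores = tf.ones((4, 3, 3))                 # illustrative batch of scores
mask = tf.linalg.band_part(tf.ones((3, 3)), -1, 0)
masked_scores = scores * mask[tf.newaxis, :, :]
print(masked_scores.shape)
```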