Please see this for a detailed example of how `tf.nn.conv2d_backprop_input` and `tf.nn.conv2d_backprop_filter` are used.
In `tf.nn`, there are 4 closely related 2D convolution functions:

- `tf.nn.conv2d`
- `tf.nn.conv2d_backprop_filter`
- `tf.nn.conv2d_backprop_input`
- `tf.nn.conv2d_transpose`
```python
def conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True,
           data_format="NHWC", name=None):
  r"""Computes a 2-D convolution given 4-D `input` and `filter` tensors.

  Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
  and a filter / kernel tensor of shape
  `[filter_height, filter_width, in_channels, out_channels]`, this op
  performs the following:

  1. Flattens the filter to a 2-D matrix with shape
     `[filter_height * filter_width * in_channels, output_channels]`.
  2. Extracts image patches from the input tensor to form a *virtual*
     tensor of shape
     `[batch, out_height, out_width, filter_height * filter_width * in_channels]`.
  3. For each patch, right-multiplies the filter matrix and the image
     patch vector.

  In detail, with the default NHWC format,

      output[b, i, j, k] =
          sum_{di, dj, q} input[b, strides[1] * i + di,
                                strides[2] * j + dj, q]
                          * filter[di, dj, q, k]

  Must have `strides[0] = strides[3] = 1`. For the most common case of the
  same horizontal and vertical strides, `strides = [1, stride, stride, 1]`.
  """
```
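To make the shape bookkeeping concrete, here is a minimal sketch of a plain forward call (assuming TensorFlow 1.x; the sizes are arbitrary choices for illustration, not from the original):

```python
import tensorflow as tf

# A batch of 2 images, 5x5 pixels, 3 channels (NHWC layout).
x = tf.random_normal([2, 5, 5, 3])
# A 3x3 filter mapping 3 input channels to 8 output channels (HWIO layout).
w = tf.random_normal([3, 3, 3, 8])

# With 'VALID' padding and stride 1: out_height = 5 - 3 + 1 = 3.
out = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')
print(out.shape)  # (2, 3, 3, 8)
```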
Given `out = conv2d(x, w)` and the output gradient `d_out`:

- Use `tf.nn.conv2d_backprop_filter` to compute the filter gradient `d_w`
- Use `tf.nn.conv2d_backprop_input` to compute the input gradient `d_x`
- `tf.nn.conv2d_backprop_input` can be implemented by `tf.nn.conv2d_transpose`
- All 4 functions above can be implemented by `tf.nn.conv2d`
- Actually, using TF's autodiff is the fastest way to compute gradients (see the sketch below)
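For the last point, here is a minimal sketch of the autodiff route (assuming TensorFlow 1.x graph mode; `f` is just an arbitrary scalar loss chosen for illustration):

```python
import tensorflow as tf

x = tf.random_normal([2, 5, 5, 3])   # input, NHWC
w = tf.random_normal([3, 3, 3, 8])   # filter, HWIO

out = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')
f = tf.reduce_sum(out)               # arbitrary scalar loss

# One call yields both gradients; internally, TF's registered gradient for
# Conv2D dispatches to conv2d_backprop_input and conv2d_backprop_filter.
d_x, d_w = tf.gradients(f, [x, w])
```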
Long Answer
Now, let's give an actual working code example of how to use the 4 functions above to compute `d_x` and `d_w` given `d_out`. This shows how `conv2d`, `conv2d_backprop_filter`, `conv2d_backprop_input`, and `conv2d_transpose` are related to each other.
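The snippets below assume a setup along the following lines (a sketch: the exact shapes and the loss `f` are illustrative assumptions; the shapes are chosen so that batch, in_channels, and out_channels all coincide, which the "manual" `conv2d` tricks below rely on in order to reinterpret activation tensors as filters):

```python
import tensorflow as tf

x_shape = [2, 3, 3, 2]   # [batch, in_height, in_width, in_channels]
w_shape = [2, 2, 2, 2]   # [filter_height, filter_width, in_channels, out_channels]
w_size = w_shape[0]      # filter side length, assuming a square filter
strides = [1, 1, 1, 1]

x = tf.random_normal(x_shape)
w = tf.random_normal(w_shape)

out = tf.nn.conv2d(x, w, strides=strides, padding='VALID')
f = tf.reduce_sum(out)            # any scalar function of out will do
d_out = tf.gradients(f, out)[0]   # the output gradient
```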
Computing `d_x` in 4 different ways:
```python
# Method 1: TF's autodiff
d_x = tf.gradients(f, x)[0]

# Method 2: manually using conv2d
d_x_manual = tf.nn.conv2d(input=tf_pad_to_full_conv2d(d_out, w_size),
                          filter=tf_rot180(w),
                          strides=strides,
                          padding='VALID')

# Method 3: conv2d_backprop_input
d_x_backprop_input = tf.nn.conv2d_backprop_input(input_sizes=x_shape,
                                                 filter=w,
                                                 out_backprop=d_out,
                                                 strides=strides,
                                                 padding='VALID')

# Method 4: conv2d_transpose
d_x_transpose = tf.nn.conv2d_transpose(value=d_out,
                                       filter=w,
                                       output_shape=x_shape,
                                       strides=strides,
                                       padding='VALID')
```
Computing `d_w` in 3 different ways:
```python
# Method 1: TF's autodiff
d_w = tf.gradients(f, w)[0]

# Method 2: manually using conv2d
d_w_manual = tf_NHWC_to_HWIO(tf.nn.conv2d(input=x,
                                          filter=tf_NHWC_to_HWIO(d_out),
                                          strides=strides,
                                          padding='VALID'))

# Method 3: conv2d_backprop_filter
d_w_backprop_filter = tf.nn.conv2d_backprop_filter(input=x,
                                                   filter_sizes=w_shape,
                                                   out_backprop=d_out,
                                                   strides=strides,
                                                   padding='VALID')
```
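Continuing the setup above, the agreement between the methods can be checked numerically, along the lines of what the linked scripts do (a sketch, assuming TF 1.x sessions):

```python
import numpy as np

with tf.Session() as sess:
    # Fetching all tensors in a single run ensures they share the same
    # random draw of x and w.
    d_x_vals = sess.run([d_x, d_x_manual, d_x_backprop_input, d_x_transpose])
    d_w_vals = sess.run([d_w, d_w_manual, d_w_backprop_filter])

for name, val in zip(['manual', 'backprop_input', 'transpose'], d_x_vals[1:]):
    print('d_x', name, np.allclose(d_x_vals[0], val))
for name, val in zip(['manual', 'backprop_filter'], d_w_vals[1:]):
    print('d_w', name, np.allclose(d_w_vals[0], val))
```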
Please see the linked scripts for the implementation of `tf_rot180`, `tf_pad_to_full_conv2d`, and `tf_NHWC_to_HWIO`. In the scripts, we check that the final output values of the different methods are the same; a NumPy implementation is also available.