---
title: "Understanding layer size in Convolutional Neural Networks"
description: "Filter size, padding, and stride explained"
author: "Bartosz Mikulski"
author_bio: "Principal AI Engineer & MLOps Architect. I bridge the gap between \"it works in a notebook\" and \"it works for 200 million users.\""
author_url: https://mikulskibartosz.name
author_linkedin: https://www.linkedin.com/in/mikulskibartosz/
author_github: https://github.com/mikulskibartosz
canonical_url: https://mikulskibartosz.name/understanding-layer-size-in-convolutional-neural-networks
---

What size will the output have after applying a convolutional or pooling layer? I used to have no idea. I could sort of imagine what happens when a filter is applied, but once padding was added and the stride increased, my imagination got lost.

If you have a similar problem, this article is for you. I am going to explain how the filter size influences the size of the next layer, how to use padding, and what happens when you use stride.

## Input

For the sake of an example, let’s use the following data as the input for the pooling layer. I am going to use max pooling because it is simple and makes a good example.

![Input](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/input.png)

## Filter

I decided to use a 3x3 filter. This means that the output shrinks by two columns and two rows. Why? Let’s look at the result of applying max pooling to the first set of cells.

![Cells selected by the 3x3 max pooling filter](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/pooling.png)

The maximal value of the selected cells is 7, so the output looks like this:

![Result (after one step) of the 3x3 filter applied to the example input](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/pooling_result.png)

Now, I have to move my filter. To keep things easy, the stride in the first example is 1, so let’s move the filter by one column.

![Cells selected by the 3x3 max pooling filter in the second step](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/pooling_2.png)

That gives me another value for the output:

![Result (after two steps)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/pooling_result_2.png)

If I continue applying the max pooling filter, I am going to end up with this result:

![Result (after all steps)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/pooling_result_all.png)

What happened? Each application of the pooling filter produces only one value, so when the stride is 1 and there is no padding, the output shrinks by:

```
number_of_lost_columns_or_rows = the_size_of_the_filter - 1
```

(in this example, 2 columns and 2 rows)
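
To check the rule above, here is a minimal sketch of max pooling with stride 1 and no padding in plain Python. The `max_pool` helper and the input values are made up for illustration (they are not the numbers from the figures); only the 7x7 input size and the 3x3 filter come from this article.

```python
def max_pool(grid, size):
    """Slide a size x size window over grid with stride 1 and no padding."""
    rows = len(grid) - size + 1  # the window fits in this many vertical positions
    cols = len(grid[0]) - size + 1  # and in this many horizontal positions
    return [
        [
            max(grid[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 7x7 input filled with arbitrary digits.
grid = [[(3 * r + 5 * c) % 10 for c in range(7)] for r in range(7)]
pooled = max_pool(grid, 3)
print(len(pooled), len(pooled[0]))  # 5 5
```

The output has 5 rows and 5 columns, which is the 7x7 input minus `3 - 1 = 2` rows and columns.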

## Padding

What if I want the output to have the same size as the input without changing the filter? I must add two columns and two rows to the input. With zero-padding, these extra rows and columns contain only zeros and form the border of the input: one row above, one row below, one column on the left, and one column on the right. Many libraries call this a padding of 1 because they count the zeros added on each side.

![Input with zero-padding (2x2)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/padding_input.png)

Now, when I apply the filter, it is going to select the following cells:

![Cells selected by the 3x3 max pooling filter](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/padding_pooling.png)

so the max pooling returns this:

![Result (after one step)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/padding_pooling_result.png)

In the next step, the filter selects those cells:

![Cells selected by the 3x3 max pooling filter in the second step](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/padding_step_2.png)

and the result looks like this:

![Result (after two steps)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/padding_step2_output.png)

When the filter gets applied to all cells, this is going to be the final result:

![Result (after all steps)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/padding_output_all.png)
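
A short sketch of the padding step, assuming the border is split evenly between the sides (one row or column of zeros on each side). The `zero_pad` helper is a hypothetical name, and the input is a placeholder 7x7 grid of ones.

```python
def zero_pad(grid, border):
    """Surround grid with `border` rows and columns of zeros on every side."""
    width = len(grid[0]) + 2 * border
    top = [[0] * width for _ in range(border)]
    middle = [[0] * border + list(row) + [0] * border for row in grid]
    bottom = [[0] * width for _ in range(border)]
    return top + middle + bottom


# Placeholder 7x7 input; the real values do not matter for the size check.
grid = [[1] * 7 for _ in range(7)]
padded = zero_pad(grid, 1)
print(len(padded), len(padded[0]))  # 9 9

# A 3x3 filter with stride 1 removes filter_size - 1 = 2 rows again.
print(len(padded) - 3 + 1)  # 7
```

The padded input is 9x9, so the two rows and columns lost to the 3x3 filter bring the output back to 7x7.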

## Stride

Let’s use the input without padding again, but this time with stride = 2.

In the first step, the filter selects these cells:

![Input](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/stride_input.png)

Then, it moves two columns to the right, so the second step selects the following cells:

![Cells selected by the 3x3 max pooling filter in the second step](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/stride_step2.png)

After those steps, the output contains:

![Result (after two steps)](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/stride_result.png)

If I continue, the filter fits in only three positions per row and per column. So, when the input is a 7x7 matrix and the 3x3 filter is applied with stride 2 and without padding, the output is a 3x3 matrix.
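
One way to see why the output is 3x3 is to list the columns where the left edge of the filter can land. This is only a sketch; `window_starts` is a hypothetical helper name.

```python
def window_starts(length, size, stride):
    """Positions where a window of `size` cells fits when it moves `stride` cells at a time."""
    return list(range(0, length - size + 1, stride))


print(window_starts(7, 3, 1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, 2))  # [0, 2, 4]
```

With stride 2, the filter starts at columns 0, 2, and 4, so each row of the output has three values. The same holds for the rows.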

## Formula

Things get complicated when the filter size, padding, and stride are combined. Fortunately, there is a formula that lets us calculate the size of the output.

![Formula](/images/2019-05-22-understanding-layer-size-in-convolutional-neural-networks/formula.png)

W — the width of the input<br/>
F_w — the width of the filter<br/>
P — padding<br/>
S_w — the horizontal stride<br/>
<br/>
H — the height of the input<br/>
F_h — the height of the filter<br/>
P — padding<br/>
S_h — the vertical stride<br/>
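
As a sketch, the calculation can be written as a small Python function. It assumes the commonly used form of the formula, `(W - F + 2P) / S + 1`, where the padding `P` counts the zeros added on each side and the division is rounded down. The `output_size` name and its parameters are made up for this example.

```python
def output_size(input_size, filter_size, padding=0, stride=1):
    """Size of one output dimension; `padding` is the number of zeros added per side."""
    return (input_size - filter_size + 2 * padding) // stride + 1


print(output_size(7, 3))             # 5 (3x3 filter, stride 1, no padding)
print(output_size(7, 3, padding=1))  # 7 (one row or column of zeros on each side)
print(output_size(7, 3, stride=2))   # 3 (stride 2, no padding)
```

The function handles one dimension at a time, so call it once with the width values and once with the height values. The three calls reproduce the sizes from the filter, padding, and stride sections; the two extra rows and columns from the padding section correspond to `padding=1` here.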