Home page

Deconstructing Deep Learning + δeviations

Drop me an email | RSS feed link : Click
Format : Date | Title
  TL; DR

Total posts : 78

View My GitHub Profile

Index page


To implement a faster conv we need padding, so here we will try to explore what that means and try to implement it.

The objective is to get the kernel to be the same size as the image and fill it with some value so as to be able to apply FFT to it.

We first import the packages we need, I am just using the image packages for visualization and then take a small image and a kernel (just for testing, we can scale it up later). We are also making the kernel a solid block of white as it is just easier to see since the img is random numbers.

using Images,ImageView, Plots,LinearAlgebra,Statistics
img = rand(Float32,50,50)
kernel = ones(Float32,15,15);

Constant padding

Okay now for constant padding. This means that we choose a value and then apply it to the figure.

Steps followed : 1. To save memory, let us first allocate an image of constants with the size of the image. We do this by making an array of ones and then element wise multiplying it by constant. (Note that the number should be in the range of 0-1 for a gray scale image) 2. Then we identify the center of the image 3. We then find out the space required by the kernel to fit in this array 4. Just set this space in the padded version = the kernel 5. Convert it to grayscale so we can plot it and see if it worked. 6. Note that we scale the constant between 0 and 1 using the sigmoid function

function pad_constant(img,kernel,constant)
    kernel_h, kernel_w = size(kernel)
    img_h, img_w = size(img)
    padded_kernel= ones(img_h,img_w).*(1/(1+exp(-constant)));
    pad_h, pad_w = size(padded_kernel)
    center_x,center_y = pad_w ÷2, pad_h ÷2
    tmp_x = center_x-(kernel_w÷2)
    tmp_y = center_y-(kernel_h÷2)
    padded_kernel[collect(tmp_x:tmp_x+kernel_w-1),collect(tmp_y:tmp_y+kernel_h-1)] = kernel;
    return Gray.(padded_kernel)

Since our kernel was white and we supplied a constant of .3(grayish), we get this ->


It works!!

Max padding



Min padding



Mean padding



Median padding