Deconstructing Deep Learning + δeviations

21 July 2020

by Subhaditya Mukherjee

Convert a video to low poly :) (See the images below if you don't know what that is.)

A quick citation first: most of the code is from here. I adapted it to videos and decided to explain the code because it was too cool not to.

Before we go on, this is a low poly image.

Looks really cool, doesn't it? So how do we go about making this, and then extend it to work with videos? Let us begin! (Note that this will be in Python.)

First let us get the libraries we need. cv2 and PIL are for loading and performing operations on images (cv2 and tqdm, a fancy progress bar, are imported later in the video section). colorsys converts between color spaces, and ctypes gives us access to the types from C. multiprocessing lets us run work in parallel, and sys gives us access to the system. As for scipy.spatial, that will be explained in due course.

```
import colorsys
import ctypes
from itertools import product
from multiprocessing import Process, Array
from PIL import Image, ImageDraw, ImageFilter
import random
from scipy.spatial import Delaunay
import sys
```

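After that, let us set some default parameters. The most important ones here are the point count, which sets how many random points are generated, and the edge ratio, which sets what fraction of the detected edge pixels is discarded. Together they determine how many triangles the output image has and how polygonized it looks.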

```
POINT_COUNT = 2000
EDGE_THRESHOLD = 172
EDGE_RATIO = .8
DARKENING_FACTOR = 35
SPEEDUP_FACTOR_X = 1
SPEEDUP_FACTOR_Y = 1
CONCURRENCY_FACTOR = 8
```

Now we need to generate random points. To do so, we first define two helper functions. One returns a fourth of the width and a fourth of the height. The other takes the smaller of the two dimensions and returns 1/16th of it. This is done to make the generation a bit more random but not distorted.

```
def get_point_propagation(width, height):
    return (width / 4, height / 4)

def get_point_distance(width, height):
    return min(width, height) / 16
```

Now for the actual point generation. We first use the previously defined functions to get a fourth of the size and 1/16th of the smaller dimension. Then, for every point that we need, we choose a random x and y coordinate: a random multiple of the point distance, shifted back by half of the propagation so that points can also land slightly outside the image. We then append each point to the point list.

```
def generate_random(im, points):
    prop_x, prop_y = get_point_propagation(*im.size)
    point_distance = get_point_distance(*im.size)
    for _ in range(POINT_COUNT):
        x = random.randrange(round((im.size[0] + prop_x) / point_distance)) * \
            point_distance - (prop_x / 2)
        y = random.randrange(round((im.size[1] + prop_y) / point_distance)) * \
            point_distance - (prop_y / 2)
        points.append([x, y])
```
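To get a feel for what this arithmetic produces, here is a small standalone sketch that runs the same formula on a bare frame size instead of a loaded image. The 1280x720 size, the seed, and the name sample_points are assumptions made purely for illustration.

```python
import random

def get_point_propagation(width, height):
    return (width / 4, height / 4)

def get_point_distance(width, height):
    return min(width, height) / 16

def sample_points(width, height, count, seed=0):
    """Same arithmetic as generate_random, for a bare (width, height)."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    prop_x, prop_y = get_point_propagation(width, height)
    step = get_point_distance(width, height)
    points = []
    for _ in range(count):
        x = rng.randrange(round((width + prop_x) / step)) * step - prop_x / 2
        y = rng.randrange(round((height + prop_y) / step)) * step - prop_y / 2
        points.append((x, y))
    return points

points = sample_points(1280, 720, 2000)
xs = [p[0] for p in points]
ys = [p[1] for p in points]
print("x range:", min(xs), max(xs))
print("y range:", min(ys), max(ys))
print(len(set(points)), "distinct grid positions")
```

For this size the spacing is 45 pixels, so every point sits on a 36 by 20 grid that starts 160 pixels left of and 90 pixels above the image. That also means the 2000 draws can land on at most 720 distinct positions, so many of them are duplicates.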

Now that we have the random points, we need to add points that lie on the edges in the image. Before we do this, we first write a function to get the grayscale value of a pixel. We need this to compare against a threshold and identify edges, because the three separate color intensities cannot be compared to a single threshold directly. The formula to return the grayscale value of a pixel is

\[0.2126 \cdot r + 0.7152 \cdot g + 0.0722 \cdot b\]

where r, g, b are the values of the red, green and blue intensities respectively. After that, we apply a sharpen filter and a find edges filter, which leaves the edges as bright pixels in the filtered image. We then iterate over the pixels and check whether the grayscale value exceeds our threshold. If it does, then we store away its coordinates, but only with a probability of 1 - EDGE_RATIO so that we do not keep every edge pixel. Note that the product function gives the Cartesian product of the two ranges, which means it yields every (x, y) pair in a single loop.

```
def get_grayscale(r, g, b):
    return 0.2126*r + 0.7152*g + 0.0722*b

def generate_edges(im, points):
    im_edges = im.filter(ImageFilter.SHARPEN).filter(ImageFilter.FIND_EDGES)
    for x, y in product(range(im.size[0] - 1), range(im.size[1] - 1)):
        if get_grayscale(*im_edges.getpixel((x, y))) > EDGE_THRESHOLD and \
                random.random() > EDGE_RATIO:
            points.append([x, y])
```
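The threshold-plus-random-keep step can be tried without any image library. The sketch below is a simplified stand-in, not the code above: it assumes a hand-made 100x100 grid of RGB tuples in place of the filtered image, and pick_edge_points is a hypothetical name.

```python
import random
from itertools import product

EDGE_THRESHOLD = 172
EDGE_RATIO = .8

def get_grayscale(r, g, b):
    return 0.2126*r + 0.7152*g + 0.0722*b

def pick_edge_points(pixels, rng):
    """pixels[y][x] is an (r, g, b) tuple from an already edge-filtered image."""
    height, width = len(pixels), len(pixels[0])
    picked = []
    for x, y in product(range(width), range(height)):
        bright = get_grayscale(*pixels[y][x]) > EDGE_THRESHOLD
        # Keep a bright pixel only about (1 - EDGE_RATIO) of the time
        if bright and rng.random() > EDGE_RATIO:
            picked.append([x, y])
    return picked

# Hypothetical "edge image": a single bright vertical line at x == 50
pixels = [[(255, 255, 255) if x == 50 else (0, 0, 0) for x in range(100)]
          for y in range(100)]
picked = pick_edge_points(pixels, random.Random(0))
print(len(picked), "of 100 edge pixels kept")
```

With EDGE_RATIO at .8, roughly one in five bright pixels survives, which keeps the triangle count manageable while still anchoring triangles to the edges.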

Now for the most important part, working out which polygon every pixel belongs to. This is part of the triangulation step. We first define a worker function to do it, and then we define another function that runs several workers in parallel to save time. Instead of working on the whole image at once, each worker handles its own subset of rows: it starts at the row matching its worker index and then jumps ahead by the number of workers. The previously defined speedup factors control how many pixels are skipped in each direction, which reduces compute time. For every pixel it visits, the worker uses the find_simplex method from scipy to find the simplex that contains (x, y). In two dimensions a simplex is simply a triangle, so this tells us which triangle the pixel falls in. We identify each pixel by its index and first mark its slot with a sentinel value that means "no triangle". If the point lies outside the triangulation, find_simplex returns -1 and we skip it (this is what the not ~t check does). Otherwise we read the r, g, b values of that point and pack them, together with the triangle ID, into a single 64-bit integer stored in a fixed-size shared array.

```
def triangulate_worker(im, triangles, colors, worker_index, worker_count):
    for x, y in product(range(SPEEDUP_FACTOR_X,
                              im.size[0] - 1,
                              SPEEDUP_FACTOR_X),
                        # Workers treat different rows
                        range(worker_index * SPEEDUP_FACTOR_Y,
                              im.size[1] - 1,
                              worker_count * SPEEDUP_FACTOR_Y)
                        ):
        # Cast to a Python int so the 32-bit shift below cannot overflow
        # a fixed-width NumPy integer
        t = int(triangles.find_simplex((x, y)).flat[0])
        pixel_index = y*im.size[0] + x
        # Sentinel meaning "no triangle" until proven otherwise
        colors[pixel_index] = 0xFFFF << 32
        # find_simplex returns -1 outside the triangulation, and ~(-1) == 0
        if not ~t:
            continue
        # Colors and triangle ID are encoded to integers for fixed-size
        # storage in the Array object through a 64-bit integer
        (r, g, b) = im.getpixel((x, y))
        colors[pixel_index] = ((t << 32) + (b << 16) + (g << 8) + r)
    return
```
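The packing of the triangle ID and the color into one integer is easy to check on its own. Below is a minimal sketch using the same bit layout as the worker; the helper names pack and unpack and the constant NO_TRIANGLE are hypothetical and do not appear in the original code.

```python
NO_TRIANGLE = 0xFFFF << 32  # the sentinel the worker writes before the lookup

def pack(t, r, g, b):
    """Pack a triangle ID and an RGB color into one integer."""
    return (t << 32) + (b << 16) + (g << 8) + r

def unpack(c):
    """Inverse of pack: returns (t, (r, g, b))."""
    t = (c & (0xFFFF << 32)) >> 32
    return t, (c & 0xFF, (c & 0xFF00) >> 8, (c & 0xFF0000) >> 16)

c = pack(1234, 200, 100, 50)
print(hex(c))
print(unpack(c))
print(unpack(NO_TRIANGLE)[0] == 0xFFFF)
```

One consequence of this layout is that only 16 bits of the triangle ID survive decoding and 0xFFFF is reserved as the sentinel, so it quietly assumes fewer than 65535 triangles.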

We now use the Delaunay triangulation. For a set of points P, the Delaunay triangulation DT(P) is a triangulation in which no point of P lies inside the circumcircle of any triangle. After we have that, we run our previous function in parallel and decode the packed values it wrote. Once we do that, we can return the triangles along with the list of colors collected for each triangle.

```
def triangulate(im, points):
    triangles = Delaunay(points)
    colors = Array(ctypes.c_uint64, im.size[0] * im.size[1], lock=True)
    jobs = []
    for i in range(CONCURRENCY_FACTOR):
        p = Process(target=triangulate_worker, args=(im, triangles, colors, i,
                                                     CONCURRENCY_FACTOR))
        jobs.append(p)
        p.start()
    for i in jobs:
        i.join()
    decoded_colors = [None] * len(triangles.simplices)
    # Color decoding
    for i, c in enumerate(colors):
        t = (c & 0xFFFF << 32) >> 32
        if t == 0xFFFF:
            continue
        if not decoded_colors[t]:
            decoded_colors[t] = []
        decoded_colors[t].append((c & 0xFF,
                                  (c & 0xFF00) >> 8,
                                  (c & 0xFF0000) >> 16))
    return (triangles, decoded_colors)
```
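The empty-circumcircle property that defines a Delaunay triangulation can be checked by hand on a tiny example. The sketch below is plain Python and is not how scipy computes the triangulation; the names in_circumcircle and is_delaunay and the four sample points are assumptions for illustration.

```python
def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    # Orientation of the triangle: positive for counter-clockwise
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    # Standard in-circle determinant, with coordinates taken relative to p
    (ax, ay), (bx, by), (cx, cy) = [(q[0] - p[0], q[1] - p[1]) for q in (a, b, c)]
    det = ((ax*ax + ay*ay) * (bx*cy - cx*by)
           - (bx*bx + by*by) * (ax*cy - cx*ay)
           + (cx*cx + cy*cy) * (ax*by - bx*ay))
    return det * orient > 0

def is_delaunay(points, triangles):
    """Check the empty-circumcircle property for a list of index triples."""
    for i, j, k in triangles:
        for n, p in enumerate(points):
            if n in (i, j, k):
                continue
            if in_circumcircle(points[i], points[j], points[k], p):
                return False
    return True

# A convex quadrilateral can be split along either diagonal
points = [(0, 0), (3, 0), (3, 1), (0, 2)]
print(is_delaunay(points, [(0, 1, 2), (0, 2, 3)]))  # True
print(is_delaunay(points, [(0, 1, 3), (1, 2, 3)]))  # False
```

Both splits are valid triangulations of the same four points, but only the first one keeps every circumcircle empty. Preferring that split is what tends to avoid long, thin triangles.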

Now all that is left before the video is to draw these triangles. To do that we iterate over the triangles, average the colors collected for each one, and darken the result by a random amount to make it look better. We convert the RGB colors to HSV (hue, saturation, value) space so that we can darken them by lowering the value. Once we do that we can draw the polygons using everything we have so far. Triangles that collected no colors are filled with black.

```
def draw(im, points, triangles, colors):
    d = ImageDraw.Draw(im)
    for t, t_colors in enumerate(colors):
        end = (0, 0, 0)
        if t_colors:
            avg = [round(sum(y) / len(y)) for y in zip(*t_colors)]
            # Random darkening of the triangles
            (h, s, v) = colorsys.rgb_to_hsv(avg[0], avg[1], avg[2])
            end = colorsys.hsv_to_rgb(h, s,
                                      v - random.random() * DARKENING_FACTOR)
        d.polygon([tuple(points[y]) for y in triangles.simplices[t]],
                  fill=tuple(map(lambda x: round(x), end)))
    return im

def main(im):
    points = []
    generate_random(im, points)
    generate_edges(im, points)
    triangles, colors = triangulate(im, points)
    draw(im, points, triangles, colors)
    return im
```
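The averaging and darkening inside draw can be tried in isolation with just the standard library. This is a sketch rather than the code above: average_color and darken are hypothetical helpers, the darkening amount is passed in instead of being random, and the clamp at 0 is an addition that the original does not have.

```python
import colorsys

DARKENING_FACTOR = 35

def average_color(t_colors):
    """Channel-wise mean of a list of (r, g, b) tuples."""
    return [round(sum(ch) / len(ch)) for ch in zip(*t_colors)]

def darken(rgb, amount):
    """Lower the HSV value by `amount` (clamped at 0) and return RGB ints."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    r, g, b = colorsys.hsv_to_rgb(h, s, max(v - amount, 0))
    return (round(r), round(g), round(b))

avg = average_color([(200, 100, 50), (100, 50, 0), (0, 150, 100)])
print(avg)
print(darken(avg, DARKENING_FACTOR))
print(darken((100, 100, 100), DARKENING_FACTOR))
```

Because the channels are passed in on the 0 to 255 scale, the value v comes back on that same scale while hue and saturation stay between 0 and 1. That is why a factor of 35 reads as a modest darkening rather than an extreme one.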

Now for the video. Any video is just a sequence of pictures, so we convert the video into pictures. Since I took the video from a movie, I first needed to cut out a part of the movie. The easiest way to do this is using ffmpeg. (This is a bash command.)

```
ffmpeg -i Avengers.Infinity.War.2018.720p.WEBRip.x264-\[YTS.AM\].mp4 -ss 01:45:10 -to 01:48:00 -c copy cropped.mp4
```

All this does is cut the movie between two timestamps and save it as a new file. After that we use OpenCV to read the video frame by frame and run the functions above on each frame. We also save each result as an image, and we will combine the images again using ffmpeg.

```
import cv2
from tqdm import tqdm
from PIL import Image

cap = cv2.VideoCapture("/media/subhaditya/DATA/ENTERTAIN/Movies/avengers infinity war/cropped.mp4")
# Ask OpenCV for the frame count instead of hard-coding it
frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
arra_images = []
for i in tqdm(range(frame_count)):
    ret, frame = cap.read()
    if not ret:
        break
    # OpenCV decodes frames as BGR, while PIL expects RGB
    arra_images.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
# Release everything if job is finished
cap.release()

def save_incr(x):
    # Run the low poly pipeline on a single frame
    return main(Image.fromarray(x))

for a in tqdm(range(len(arra_images))):
    # Zero-padded PNG names keep the frames in order for the shell glob
    save_incr(arra_images[a]).save(f"data/{a:05d}.png")
# Preview a single frame
main(Image.fromarray(arra_images[1]))
```

Done. For the last step, converting the images back into a video, we again use ffmpeg. (The leading ! runs the command from a notebook.)

```
!cat data/*.png | ffmpeg -f image2pipe -i - output.mp4
```
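One detail worth checking when frames are stored as numbered files: a shell glob such as data/*.png typically expands in lexicographic order, not numeric order. The sketch below uses Python's sorted as a stand-in for that ordering, and the file names are made up for illustration.

```python
# Unpadded names: "10.png" sorts before "2.png"
names = [f"{i}.png" for i in range(12)]
print(sorted(names))

# Zero-padded names sort in the same order as the frame numbers
padded = [f"{i:05d}.png" for i in range(12)]
print(sorted(padded) == padded)
```

If the frames were piped to ffmpeg in the first ordering, the output video would jump between frames, so padding the frame index to a fixed width is a cheap safeguard.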