Hoani.net

Go Visitor Pattern

2024-05-03T00:00:00+00:00

The visitor pattern allows us to extend the behaviour of various objects through a common interface.

It is particularly powerful when working with the composite pattern or any kind of graph execution where we want to keep our graph traversal logic separate from our handling logic.

The Pattern

To show how the visitor pattern works, we will extend arbitrary shapes using visitors.

For example, we may have a Shape class which:

accepts a Visitor
is implemented by Circle and Rectangle

---
title: Class diagram
---
classDiagram
  Shape <|-- Circle
  Shape <|-- Rectangle
  class Shape{
    <<interface>>
    +Accept(Visitor)
  }
  class Circle{
    +float64 Radius
  }
  class Rectangle{
    +float64 Length
    +float64 Width
  }
  class Visitor{
    <<interface>>
    +DoCircle(Circle)
    +DoRectangle(Rectangle)
  }

Say we define an Area visitor:

---
title: Area Visitor
---
classDiagram
  Visitor <|-- Area
  class Visitor{
    <<interface>>
    +DoCircle(Circle)
    +DoRectangle(Rectangle)
  }
  class Area{
    <>
    +DoCircle(Circle)
    +DoRectangle(Rectangle)
    +Calculate(Shape) float64
  }

If we call Area::Calculate we get:

sequenceDiagram
    participant Caller
    participant Area
    participant Circle
    Caller->>Area: Calculate(Circle)
    Area->>Circle: Accept(self)
    Circle->>Area: DoCircle(self)
    Area-->>Caller: result

The pair of calls to Accept and DoCircle is known as double dispatch. It means the Caller does not need to know that the Shape it passed to Area is in fact a Circle.

Go Code

In Go this looks like:

type Visitor interface {
  DoCircle(*Circle)
  DoRectangle(*Rectangle)
}

type Shape interface {
  Accept(Visitor)
}

type Circle struct {
  Radius float64
}

// Accept dispatches to the Circle-specific visitor method.
func (c *Circle) Accept(v Visitor) {
  v.DoCircle(c)
}

type Rectangle struct {
  Length float64
  Width  float64
}

// Accept dispatches to the Rectangle-specific visitor method.
func (r *Rectangle) Accept(v Visitor) {
  v.DoRectangle(r)
}

We are now set up to extend Circle and Rectangle shapes as much as we like.

An Area visitor will look like:

// Requires the standard library "math" package.
type Area struct {
  result float64
}

// Calculate visits the shape and returns the area stored by the visitor.
func (v *Area) Calculate(s Shape) float64 {
  s.Accept(v)
  return v.result
}

func (v *Area) DoCircle(c *Circle) {
  v.result = math.Pi * math.Pow(c.Radius, 2)
}

func (v *Area) DoRectangle(r *Rectangle) {
  v.result = r.Length * r.Width
}

Another example is a Circumference visitor:

// Requires the standard library "math" package.
type Circumference struct {
  result float64
}

// Calculate visits the shape and returns the perimeter stored by the visitor.
func (v *Circumference) Calculate(s Shape) float64 {
  s.Accept(v)
  return v.result
}

func (v *Circumference) DoCircle(c *Circle) {
  v.result = 2.0 * math.Pi * c.Radius
}

func (v *Circumference) DoRectangle(r *Rectangle) {
  v.result = 2.0 * (r.Length + r.Width)
}

Clover - Arrow Trajectory Algorithm

2024-04-03T00:00:00+00:00

I came up with the design of Clover after watching a bunch of game design videos by Masahiro Sakurai on YouTube.

He mentions game essence which has to do with the push and pull of a game - whether a game rewards you for taking more risk.

When designing Clover, I decided to give the player a huge bonus for making combo kills. This requires the player to let more enemies get close to them, which increases the risk to the player.

After doing a bunch of hand-drawn sketches about possible layouts, I came to the conclusion that having a hero being chased by bandits on a cart would provide a good balance of realism weighed up against the gameplay I wanted.

Shooting a bow

Early versions of the game had the hero aiming directly at the mouse cursor. The further the bow was pulled, the straighter the shot, but the arrow would always pass under the mouse.

In a way this made shooting the bow feel pretty natural; you would learn to aim a little higher. The gameplay looked a little like this:

Showing the trail

Because the game was a fast-paced balance of risk and reward, I wanted to help the player see where they were aiming. So I added some circles to show the trajectory.

Calculating the trajectory is simple: an arrow moves with an initial speed and direction. There is no drag, and the arrow is pulled by gravity:

\[\begin{align} x(t) &= x(t_0) + \dot{x}(t_0)t \\ y(t) &= y(t_0) + \dot{y}(t_0)t + \frac{1}{2}gt^2 \\ \end{align}\]

This looks like this:

Unfortunately, it made the experience of shooting the bow worse. The trajectory was always under the mouse, and it felt strange that your eyes were being drawn away from where you were pointing.
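The trail of circles can be sketched by sampling the two equations of motion at fixed time steps. This is a minimal Python sketch rather than the game's actual code; the function name `trajectory_points` and the `dt` and `count` parameters are assumptions made for illustration, and g is taken as positive along the +y axis to match the equations above.

```python
def trajectory_points(x0, y0, vx, vy, g, dt=0.1, count=10):
    """Sample the drag-free arrow path at fixed time steps.

    (x0, y0) is the launch position, (vx, vy) the launch velocity and
    g the gravitational acceleration along the +y axis.
    """
    points = []
    for i in range(1, count + 1):
        t = i * dt
        x = x0 + vx * t
        y = y0 + vy * t + 0.5 * g * t * t
        points.append((x, y))
    return points
```

Each returned point is where one trail circle would be drawn.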

Solving for trajectory

I decided that I would prefer that the trajectory always tries to intercept the mouse.

This resulted in a new problem: how do I calculate the bow angle so that the arrow path intercepts the mouse location?

Our equations of motion are:

\[\begin{align} x(t) &= x(t_0) + \dot{x}(t_0)t \\ y(t) &= y(t_0) + \dot{y}(t_0)t + \frac{1}{2}gt^2 \\ \end{align}\]

We also know our initial speed based on how far back the hero has pulled the bow:

\[\begin{align} v(t_0) &= \sqrt{\dot{x}^2(t_0) + \dot{y}^2(t_0)} \\ \end{align}\]

Defining the mouse intercept time as \(t_m\) and defining the distances to cover as \(\Delta{x}\) and \(\Delta{y}\):

\[\begin{align} \Delta{x} &= x(t_m) - x(t_0) \\ \Delta{y} &= y(t_m) - y(t_0) \\ \end{align}\]

Solving the equations of motion for the velocity components at \(t_m\) and substituting them into the speed equation gives a quadratic in \(t_m^2\):

\[\begin{align} 0 &= {\Delta{x}}^2 + {\Delta{y}}^2 - \left( {v}^2(t_0) + g\Delta{y} \right)t_m^2 + \frac{1}{4}g^2t_m^4 \\ \end{align}\]

This can be solved using the quadratic formula:

\[\begin{align} z &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \\ \end{align}\]

where:

\[\begin{align} z &= t_m^2 \\ a &= \frac{1}{4}g^2 \\ b &= - \left( {v}^2(t_0) + g\Delta{y} \right) \\ c &= {\Delta{x}}^2 + {\Delta{y}}^2\\ \end{align}\]

We can now solve for the angle:

\[\begin{align} \tan(\theta) &= \frac{\dot{y}(t_0)}{\dot{x}(t_0)}\\ \dot{x}(t_0) &= \frac{\Delta{x}}{t_m} \\ \dot{y}(t_0) &= \frac{\Delta{y} - \frac{1}{2}g{t_m}^2}{t_m} \\ \end{align}\]

Giving:

\[\begin{align} \theta &= \arctan\left(\frac{\Delta{y} - \frac{1}{2}g{t_m}^2}{\Delta{x}}\right) \end{align}\]

The resulting algorithm has the trajectory go through the mouse position:

There may be a brief period where the arrow cannot intercept the mouse position; when this happens, we just point the bow towards the mouse.
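The derivation can be sketched in Python as follows. This is an illustration under stated assumptions, not the game's actual code: the name `solve_launch_angle` is hypothetical, the smaller root of the quadratic is chosen (the flatter, faster shot), and `atan2` is used in place of a plain arctangent so the angle keeps its quadrant. The coefficient b comes from expanding \((\Delta{y} - \frac{1}{2}gt_m^2)^2\), which contributes a \(g\Delta{y}\) cross term.

```python
import math


def solve_launch_angle(dx, dy, speed, g):
    """Return a launch angle (radians) whose path passes through (dx, dy).

    dx, dy are the offsets from the bow to the mouse, speed is the launch
    speed and g is gravity along the +y axis. When no intercept exists,
    fall back to pointing straight at the target.
    """
    a = 0.25 * g * g
    b = -(speed * speed + g * dy)
    c = dx * dx + dy * dy
    discriminant = b * b - 4.0 * a * c
    if a == 0.0 or discriminant < 0.0:
        return math.atan2(dy, dx)  # Unreachable: aim directly at the mouse.

    # Assumption: take the smaller root of z = t_m^2 (shortest flight time).
    z = (-b - math.sqrt(discriminant)) / (2.0 * a)
    if z <= 0.0:
        return math.atan2(dy, dx)

    t_m = math.sqrt(z)
    vx = dx / t_m
    vy = (dy - 0.5 * g * t_m * t_m) / t_m
    return math.atan2(vy, vx)
```

A negative discriminant corresponds to the case where the arrow cannot reach the mouse at the current draw strength, which is where the fallback applies.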

If you want to play Clover, you can check it out here:

Hexagonal Game of Life

2023-11-02T00:00:00+00:00

Last year, I took Mark Rober’s 30-day creative engineering course which emphasises the creative process from idea to build completion.

One of the builds was themed around art.

I had earlier been obsessed with some hexagonal shelves that I had found at the local hardware store, but I had no reason to buy them.

So to justify my impulsive consumerism I built a novel game-of-life on a hexagonal grid.

After completing the chatbox and lazerpaw projects, I revisited this project to tidy it up and add an end-of-life detection algorithm.

Before	After

Rules of the Game

Hexagonal game of life is similar to a square grid game of life.

The major difference is that instead of a cell having 8 neighbours, it will typically have 6.

The exceptions are the corner cells with 3 neighbours and the edge cells with 4.

We always start the game with each cell assigned either alive or dead randomly.

Then for each step of the game we apply the following rules, where n is the number of living neighbours a cell has:

For dead cells:
- If n == 2 or n == 3, life begins in your cell
For living cells:
- If n < 2, life ends due to loneliness
- If n > 3, life ends due to overpopulation
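These rules can be sketched in Python as a single step function. This is only an illustration: the original build is written in C++, and the axial (q, r) coordinate layout, the hexagon-shaped board of radius 4 (which gives 61 cells) and the names `make_board` and `step` are all assumptions made for the sketch.

```python
# Axial-coordinate offsets of the six neighbours of a hexagonal cell.
NEIGHBOUR_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def make_board(radius=4):
    """Return every axial (q, r) cell within `radius` of the centre."""
    return {
        (q, r)
        for q in range(-radius, radius + 1)
        for r in range(-radius, radius + 1)
        if abs(q + r) <= radius
    }


def step(alive, board):
    """Apply one game step; `alive` is the set of living cells on `board`."""
    next_alive = set()
    for (q, r) in board:
        # Count living neighbours; cells off the board are never alive.
        n = sum((q + dq, r + dr) in alive for dq, dr in NEIGHBOUR_OFFSETS)
        if (q, r) in alive:
            if 2 <= n <= 3:  # Survives: neither lonely nor overpopulated.
                next_alive.add((q, r))
        elif n == 2 or n == 3:  # Life begins in a dead cell.
            next_alive.add((q, r))
    return next_alive
```

With this layout a corner cell such as (4, 0) has 3 neighbours on the board and an edge cell such as (4, -1) has 4.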

This image illustrates these rules, where we have two living cells about to die due to loneliness or overpopulation:

The game of life can get stuck in a looping pattern.

Because each step is completely deterministic, the game will never exit the loop.

I added a final rule to keep the build always doing interesting things:

Once the game gets stuck in a loop, it ends and restarts.

Detecting End of Life

To enforce the last rule, we need an algorithm to detect when the game is stuck in a loop. I called this the End of Life (EoL) algorithm.

Because this is running on embedded hardware, I wanted the detection to be lightweight and not require too much processing power.

The End of Life algorithm is:

On each step, calculate a uint64_t entry value which represents the current board.
- There are 61 cells, so the entry value uses 61 of its bits to represent the state of the board.
We store 16 entry values in an eolEntries array:
- Value 0 is stored on each game step
- Value 1 is stored on every 2nd game step
- Value 2 is stored on every 4th game step
- Values 3 to 15 are stored on every 2^n-th game step, where n is the value's index
On each step, we check if the current entry matches any of the previous values in eolEntries
If it does, the End of Life sequence begins so that the game will restart.
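The steps above can be sketched in Python. The real implementation is in C++ and uses uint64_t values; this sketch uses Python integers instead, and the names `EndOfLifeDetector`, `update` and `pack_board` are hypothetical. It assumes the check against stored values happens before the current entry is stored.

```python
class EndOfLifeDetector:
    """Tracks past board values at exponentially spaced intervals."""

    def __init__(self, slots=16):
        # None marks a slot that has not been filled yet.
        self.entries = [None] * slots
        self.step = 0

    def update(self, entry):
        """Return True if `entry` matches a stored value, then record it."""
        looped = entry in self.entries
        for n in range(len(self.entries)):
            # Slot n is refreshed on every 2^n-th game step.
            if self.step % (1 << n) == 0:
                self.entries[n] = entry
        self.step += 1
        return looped


def pack_board(cells):
    """Pack an iterable of booleans (one per cell) into a single integer."""
    value = 0
    for index, alive in enumerate(cells):
        if alive:
            value |= 1 << index
    return value
```

Each game step would call `update(pack_board(cells))` and start the End of Life sequence when it returns True.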

The benefits of this algorithm are:

can detect loops up to 65536 steps long
uses only 128 bytes of RAM
computationally deterministic and light
- only requires around 500 operations per step
a loop can repeat at most twice before it is detected

I wanted the End of Life sequence to be a special part of the game. So once End of Life is detected, I triggered a speaker to play Final Voyage from Outer Wilds. As this happens, the game continues with the cells slowly turning white. At the end of the song, cells stop dying until the entire board fills up, and then all die at once.

Build Details

Hardware

The hardware in this project includes:

WS2812 LED strips
Teensy 4.0
PJRC’s Audio Adaptor shield
5V (3A) power supply
A 5V Speaker with Audio jack input
Enclosure
- Hexagonal shelf
- 3D printed parts
- White plastic for diffusing the LEDs

Software

The software was written in C++ using Arduino.

Check it out at github.com/hoani/hex-game-of-life

Debugging

To help debug the project, I added a serial_view which allows us to see the game of life on a serial terminal window.

LazerPaw - Cat lazer turret.

2023-08-19T00:00:00+00:00

In January, we adopted three kittens who had been abandoned at a raceway.

We love them to bits but sometimes it’s hard to keep up with their kitten energy.

I had just finished learning a bit about OpenCV (see my guides here) and I thought maybe I could build a lazer turret to play with my cats.

Lazer Turret

This is the finished product.

It is made up of:

Red Lazer (<1mW)
Raspberry Pi Zero 2 W
Raspberry Pi IR Camera
Servo pan-tilt module
Raspberry Pi Pico
NeoPixel strip
A pushbutton to start/stop the chase game
A bunch of 3D printed parts
Two smartphone holders to mount to the ceiling beam

Design Decisions

Use a camera without the IR filter

As you’ll see later, I use a red backdrop. This makes the cats stand out against the background.

Align the camera and lazer

Both camera and lazer are on the pan-tilt system. This simplifies the control algorithm because the center of the image is always where the lazer is pointing.

Have a simple hardware UI

There is a single button to start the chase game. Holding it will shutdown the Raspberry Pi.

There is a single strip of neopixel LEDs which show status and mode.

Safety

Most of the requirements for this build were driven by safety.

Firstly, it’s important to point out that a <1mW lazer is very unlikely to cause damage to a cat’s eyes.

Having said that, I wanted to make sure that our cats were as safe as possible. To achieve this, I built in two requirements:

The system must turn the lazer off as soon as it thinks it is pointed at a cat
The system should be mounted high

Mounting the turret high means that when chasing the lazer, the cat is always looking away from the light source.

Algorithm

Given the limited processing power of the Raspberry Pi Zero 2 W, cat detection is achieved with a thresholding algorithm.

Thresholding converts an image into black and white based on whether the brightness of each pixel is above or below the threshold.

The following image shows the threshold result next to the original image:

To simplify knowing where the lazer is pointed, the lazer and camera are moved together. Therefore, we assume that the center of the image is also where the lazer is pointing.

To make the image move away from the black pixels (which I assume are my cats), each black pixel repels the center of the image.

Repulsion is distance based: the closer a black pixel is to the center, the more it repels, as shown by the arrows below:

The repulsion force of an individual black pixel is:

\[\begin{align} F_x = \frac{xK}{d^3} \quad F_y = \frac{yK}{d^3} \end{align}\]

Where:

\(K\) is a repulsion constant
\(x\) and \(y\) are pixel distances from the center of the image
\(d\) is the pixel distance given by \(\sqrt{x^2 + y^2}\)

Three additional rules are included:

Center pixels are excluded from the repulsion algorithm
The lazer is only on when center pixels are white
The image is downsampled before thresholding for performance
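The force equation and the three rules can be sketched together in plain Python. This is an illustration rather than the lazerpaw code: the function names, the block-averaging downsample, and the `centre_radius` parameter that defines "center pixels" are all assumptions. The sign of `k` is left to the caller; with x and y measured from the center to the pixel, a negative `k` makes the summed vector point away from the black pixels.

```python
import math


def downsample_and_threshold(gray, factor, level):
    """Average factor x factor blocks, then mark blocks darker than `level`.

    gray is a list of rows of 0-255 brightness values. The result is a
    grid of booleans where True means "black" (possibly a cat).
    """
    mask = []
    for top in range(0, len(gray) - factor + 1, factor):
        row = []
        for left in range(0, len(gray[0]) - factor + 1, factor):
            block = [gray[top + dy][left + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block) < level)
        mask.append(row)
    return mask


def centre_offsets(mask):
    """Yield (x, y, d, is_black) per pixel, measured from the image centre."""
    cx = (len(mask[0]) - 1) / 2.0
    cy = (len(mask) - 1) / 2.0
    for row_index, row in enumerate(mask):
        for col_index, is_black in enumerate(row):
            x, y = col_index - cx, row_index - cy
            yield x, y, math.hypot(x, y), is_black


def repulsion_force(mask, k, centre_radius):
    """Sum (x, y) * k / d^3 over black pixels outside the centre region."""
    fx = fy = 0.0
    for x, y, d, is_black in centre_offsets(mask):
        if is_black and d > centre_radius:
            fx += x * k / d ** 3
            fy += y * k / d ** 3
    return fx, fy


def lazer_allowed(mask, centre_radius):
    """The lazer may only be on when every centre pixel is white."""
    return not any(is_black and d <= centre_radius
                   for _, _, d, is_black in centre_offsets(mask))
```

Because the force falls off with the square of the distance, a black pixel twice as far from the center contributes a quarter of the push.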

The result of downsampled thresholding can be seen here, where the pixellated black pixels show what areas the controller will avoid.

Software

The lazerpaw repo is publicly available on github.com/hoani/lazerpaw.

Simulation

I wrote a simulator to test the controller and give me something to use when building the client UI.

The simulator draws a top down “truth” view of a floor. This is the view on the right.

It then takes a projection on the drawn floor and uses it as input to the camera; which is being streamed to the client UI on the left.

The camera projection isn’t perfect, but it’s good enough.

User interface

I wanted to be able to use the lazer turret in two ways:

In the living room, by pushing a button to start
In my office to play with the cats from across the house

I prefer the second option because I can see what the turret is sensing and predict where it will go next.

The Raspberry Pi runs a Flask server, which serves a 30fps camera feed and a 30fps pixel grid indicating which pixels the control algorithm is trying to avoid.

The pixel grid is overlaid on top of the live feed, and then some controls are placed around it.

Elements like “test lazer”, the “manual” switch, and “Set Threshold” are used to calibrate the turret.

Usually, only the “start” and “shutdown” buttons will be used since the lazer game will stop on its own after two minutes.

LED Driving

The LEDs are a strip of WS2812, which are commonly branded by Adafruit as NeoPixels.

I reused the github.com/hoani/serial-pixel library that I built during my chatbox project.

Demonstration

This is the lazer turret playing with two of our cats on the first night it got installed.

Two things I learned when trying this out with my cats were:

the thresholding algorithm is surprisingly good at detecting which areas have a cat in them
my cats don’t really care if the algorithm is perfect or a bit dumb, they just like chasing the lazer

If you got this far, thanks for reading and I hope this project gave you some cool ideas for your next build.

Configuring Cameras with v4l2-ctl

2023-08-15T00:00:00+00:00

You can get an overview of cameras available using:

v4l2-ctl --all

If you want to know which /dev/video maps to your device:

v4l2-ctl --list-devices

From here on, we assume the camera in question is on /dev/video0.

To learn what formats, resolution and frame-rate options your camera provides:

v4l2-ctl --device=/dev/video0 --list-formats-ext

From here, the camera’s settings can be updated. For example, the following sets the resolution:

v4l2-ctl --device=/dev/video0 --set-fmt-video=width=1280,height=960 --verbose

Or you can change the encoding to one of the numeric indexes shown in --list-formats-ext:

v4l2-ctl --device=/dev/video0 --set-fmt-video=pixelformat=2 --verbose

Additional available controls are shown with:

v4l2-ctl --device=/dev/video0 --list-ctrls

OpenCV Face Detection

2023-06-05T00:00:00+00:00

These are my notes based on Jason Dsouza’s excellent OpenCV video course.

Installation

python3 -m pip install opencv-contrib-python

Finding a Classifier

OpenCV’s face detection loads a cascade classifier which is then used to detect faces.

Go to github.com/opencv/opencv/data/haarcascades. There are a number of different classifiers; download haarcascade_frontalface_default.xml.

Running the classifier

Using an image with a face (or many faces), we can detect them with:

import cv2 as cv

img = cv.imread('Photos/people.jpg')

cascade_file = 'haarcascade_frontalface_default.xml'
haar_cascade = cv.CascadeClassifier(cascade_file)
faces_rect = haar_cascade.detectMultiScale(
  cv.cvtColor(img, cv.COLOR_BGR2GRAY), 
  scaleFactor=1.1, minNeighbors=5)

for (x,y,w,h) in faces_rect:
    cv.rectangle(img, (x,y), (x+w, y+h), (0,255,255), 2)
cv.imshow('Detected faces', img)

cv.waitKey(0)

For the Super Mario Movie cast, I got these results.

There were a few false positives and negatives.

This could be further tuned by adjusting the scaleFactor and minNeighbors parameters; however, this isn’t the best idea since we would only be tuning the face detection to work with a specific type of image.

OpenCV Training

2023-06-05T00:00:00+00:00

These are my notes based on Jason Dsouza’s excellent OpenCV video course.

Installation

python3 -m pip install opencv-contrib-python

Setup

We assume you have the following folders:

train
|- Name A
|- Name B
|- ...etc
validate
|- Name A
|- Name B
|- ...etc

Each sub-folder should contain multiple images for training and validating.

Make sure you have downloaded the face detection cascade file described in OpenCV Face Detection.

Training

import cv2 as cv
import os
import numpy as np

DIR = "train"
people = []
for i in os.listdir(DIR):
    people.append(i)

cascade_file = 'haarcascade_frontalface_default.xml'
haar_cascade = cv.CascadeClassifier(cascade_file)
def detect_face(img):
  return haar_cascade.detectMultiScale(img, 
    scaleFactor=1.1, minNeighbors=4)

def create_train():
  features = []
  labels = []
  for person in people:
    path = os.path.join(DIR, person)
    label = people.index(person)

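    for img in os.listdir(path):
      img_path = os.path.join(path, img)
      img_array = cv.imread(img_path)
      if img_array is None:
        continue  # Skip files OpenCV cannot read as images.
      gray = cv.cvtColor(img_array, cv.COLOR_BGR2GRAY)

      # Detect faces so that only the face region is used for training.
      faces_rect = detect_face(gray)
      for (x,y,w,h) in faces_rect:
        faces_roi = gray[y:y+h, x:x+w]
        features.append(faces_roi)
        labels.append(label)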

  return np.array(features, dtype='object'), np.array(labels)

features, labels = create_train()

# Train the recognizer on the features list and labels list
face_recogniser = cv.face.LBPHFaceRecognizer_create()
face_recogniser.train(features, labels)
face_recogniser.save('face-trained.yml')

Recognition

In this code, we use our trained face recognizer to determine if a detected face is who we expect it to be.

import numpy as np
import cv2 as cv
import os

DIR = "train"
people = []
for i in os.listdir(DIR):
    people.append(i)

# Load model
face_recognizer = cv.face.LBPHFaceRecognizer_create()
face_recognizer.read('face-trained.yml')

# Load a face
img = cv.imread('validate/elton_john/1.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# Detect face
cascade_file = 'haarcascade_frontalface_default.xml'
haar_cascade = cv.CascadeClassifier(cascade_file)
faces_rect = haar_cascade.detectMultiScale(gray, 1.1, 4)
for (x,y,w,h) in faces_rect:
  faces_roi = gray[y:y+h, x:x+w]

  label, confidence = face_recognizer.predict(faces_roi)

  cv.putText(img, str(people[label]), (20,40), 
    cv.FONT_HERSHEY_COMPLEX, 1.0, (0, 255, 0), 2)
  cv.rectangle(img, (x,y), (x+w, y+h), (0,255,0), thickness=2)

cv.imshow('Detected face', img)

cv.waitKey(0)

This works some of the time with my poorly trained model.

Funnily enough, it thought Jerry Seinfeld was Elton John when wearing glasses.

OpenCV Processing

2023-06-04T00:00:00+00:00

These are my notes based on Jason Dsouza’s excellent OpenCV video course.

Installation

python3 -m pip install opencv-contrib-python

Colorspaces

OpenCV represents pixels in a BGR (blue, green, red) colorspace; use cv.cvtColor when you need to represent pixels in other colorspaces.

import cv2 as cv
import matplotlib.pyplot as plt

img = cv.imread('Photos/profile.jpg')
cv.imshow('Profile', img)

# Matplotlib uses RGB
plt.imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB))
plt.show()

Split/Merge

Colors can be split into each individual channel.

import cv2 as cv
import numpy as np

img = cv.imread('Photos/profile.jpg')
cv.imshow('Park', img)

b, g, r = cv.split(img)

# Show colors for split images
blank = np.zeros(img.shape[:2], dtype='uint8')
cv.imshow('Blue', cv.merge([b, blank, blank]))
cv.imshow('Green', cv.merge([blank, g, blank]))
cv.imshow('Red', cv.merge([blank, blank, r]))

merged = cv.merge([b, g, r])
cv.imshow('Merged', merged)

cv.waitKey(0)

Smoothing

Typically, we use smoothing to remove noise.

Smoothing functions use a kernel size which defines the neighbourhood of pixels used for the smoothing operation.

For each example, assume we start with:

import cv2 as cv

img = cv.imread('Photos/profile.jpg')

Averaging

Each pixel is the average of the neighbouring pixels.

average = cv.blur(img, [11,11])
cv.imshow('AverageBlur', average)

Gaussian

Each pixel is the gaussian weighted average of the neighbouring pixels.

gaussian = cv.GaussianBlur(img, [11,11], sigmaX=0)
cv.imshow('GaussianBlur', gaussian)

Median Blur

Each pixel is the median of the neighbouring pixels.

median = cv.medianBlur(img, 11)
cv.imshow('MedianBlur', median)

Bilateral Blur

Applies a bilateral blur which provides smooth transitions.

bilateral = cv.bilateralFilter(img, 11, 35, 25)
cv.imshow('Bilateral', bilateral)

Bitwise Operations

Can apply AND, OR, NOT and XOR.

import cv2 as cv
import numpy as np

blank = np.zeros((400, 400), dtype='uint8')
rectangle = cv.rectangle(blank.copy(), (30, 30), (370, 370), 255, -1)
circle = cv.circle(blank.copy(), (200, 200), 200, 255, -1)

bitwiseAnd = cv.bitwise_and(rectangle, circle)
cv.imshow('Bitwise AND', bitwiseAnd)

bitwiseOr = cv.bitwise_or(rectangle, circle)
cv.imshow('Bitwise OR', bitwiseOr)

bitwiseXor = cv.bitwise_xor(rectangle, circle)
cv.imshow('Bitwise XOR', bitwiseXor)

bitwiseNot = cv.bitwise_not(rectangle)
cv.imshow('Bitwise NOT', bitwiseNot)

cv.waitKey(0)

Masking

Can use bitwise functions to mask an image.

import cv2 as cv
import numpy as np

img = cv.imread('Photos/profile.jpg')

mask = cv.circle(np.zeros(img.shape[:2], dtype='uint8'), 
  (img.shape[1]//2, img.shape[0]//2), img.shape[0]//3, 
  255, -1)

masked = cv.bitwise_or(img, img, mask=mask)
cv.imshow('Masked image', masked)

cv.waitKey(0)

Histogram

import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np

img = cv.imread('Photos/profile.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

gray_hist = cv.calcHist([gray], [0], None, [256], [0,256])
fig = plt.figure()
plt.title('Grayscale histogram')
plt.xlabel('Bins')
plt.ylabel('Num pixels')
plt.plot(gray_hist)
plt.xlim([0,256])
plt.show()

Thresholding

Thresholding allows us to process an image based on pixel intensity.

For each example, assume we start with:

import cv2 as cv

img = cv.imread('Photos/profile.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

Simple

th, thresh = cv.threshold(gray, 150, 255, cv.THRESH_BINARY)
cv.imshow('Simple Threshold', thresh)

Inverted

th, thresh = cv.threshold(gray, 150, 255, cv.THRESH_BINARY_INV)
cv.imshow('Inverted Threshold', thresh)

Adaptive

thresh = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY, 11, 3)
cv.imshow('Adaptive threshold', thresh)

Gradient Detection

Gradient detection is similar to edge detection; and is useful for simple object detection.

For each example, assume we start with:

import cv2 as cv
import numpy as np

img = cv.imread('Photos/park.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

Laplacian

lap = cv.Laplacian(gray, cv.CV_64F)
lap = np.uint8(np.absolute(lap))
cv.imshow('Laplacian', lap)

Sobel

sobelx = cv.Sobel(gray, cv.CV_64F, 1, 0)
sobely = cv.Sobel(gray, cv.CV_64F, 0, 1)
combined_sobel = cv.bitwise_or(sobelx, sobely)
cv.imshow('Sobel Combined', combined_sobel)

Canny

canny = cv.Canny(gray, 150, 175)
cv.imshow('Canny', canny)

OpenCV Basics

2023-06-03T00:00:00+00:00

These are my notes based on Jason Dsouza’s excellent OpenCV video course.

Installation

python3 -m pip install opencv-contrib-python

Showing an Image

import cv2 as cv

img = cv.imread('Photos/profile.jpg')
cv.imshow('Profile', img)
cv.waitKey(0) # Waits forever until keypress.

Showing a Video

Showing a video is similar to an image, but we display each frame individually.

import cv2 as cv

# capture = cv.VideoCapture(0) for webcam
capture = cv.VideoCapture('Videos/drone.mp4')

while True:
  hasFrame, frame = capture.read()
  if not hasFrame:
    break
  cv.imshow('Video', frame)
  if cv.waitKey(20) & 0xFF==ord('d'):
    break

capture.release()
cv.destroyAllWindows()

Common Operations

For each example, assume we start with:

import cv2 as cv

img = cv.imread('Photos/profile.jpg')

Convert to Grayscale

gray = cv.cvtColor(img, code=cv.COLOR_BGR2GRAY)
cv.imshow('Gray', gray)

Blurring

# sigmaX=0 lets OpenCV derive the standard deviation from the kernel size.
blur = cv.GaussianBlur(img, (7,7), sigmaX=0, borderType=cv.BORDER_DEFAULT)
cv.imshow('Blur', blur)

Edge Cascade

canny = cv.Canny(blur, 125, 175)
cv.imshow('Canny Edges', canny)

Note: Blurring prior to edge detection can provide better results.

Dilating/Eroding

dilated = cv.dilate(canny, (7,7), iterations=3)
eroded = cv.erode(dilated, (7,7), iterations=3)
cv.imshow('Dilated', dilated)
cv.imshow('Eroded', eroded)

Drawing on an image

For each example, assume we start with:

import cv2 as cv
import numpy as np

img = np.zeros((500, 500, 3), dtype='uint8')
green = 0,255,0
red = 0,0,255 # Note BGR.

Painting pixels

img[:] = green
img[200:300, 300:400] = red
cv.imshow('Green with red square', img)

Rectangle

cv.rectangle(img, (0, 0), (250, 400), green, thickness=cv.FILLED)
cv.imshow('Rectangle', img)

Set thickness to an integer for outlines.

Circle

cv.circle(img, (250, 250), 100, red, thickness=cv.FILLED)
cv.imshow('Circle', img)

Line

cv.line(img, (50, 200), (450, 300), green, thickness=4)
cv.imshow('Line', img)

Text

cv.putText(img, "Hello", (225, 225), cv.FONT_HERSHEY_TRIPLEX, 1.0, green, thickness=2)
cv.putText(img, "Bye bye", (225, 280), cv.FONT_HERSHEY_DUPLEX, 1.0, green, thickness=2)
cv.imshow('Text', img)

cv.waitKey(0)

Transformations

For each example, assume we start with:

import cv2 as cv
import numpy as np

img = cv.imread('Photos/profile.jpg')

Translation

def translate(img, x, y):
    transMat = np.float32([[1,0,x], [0,1,y]])
    dimensions = (img.shape[1], img.shape[0])
    return cv.warpAffine(img, transMat, dimensions)

translated = translate(img, -100, 100)
cv.imshow('Translated', translated)

Rotation

def rotate(img, angle, rotPoint=None):
    (h, w) = img.shape[:2]
    if rotPoint is None:
        rotPoint = (w//2, h//2)
    
    rotMat = cv.getRotationMatrix2D(rotPoint, angle, 1.0)
    dimensions = (w, h)

    return cv.warpAffine(img, rotMat, dimensions)

rotated = rotate(img, -45)
cv.imshow('Rotated', rotated)

Resizing

resized = cv.resize(img, (500,500), interpolation=cv.INTER_NEAREST)
cv.imshow('Resized', resized)

Note: Pay attention to interpolation; linear or cubic looks the best when increasing image sizes.

Cropping

cropped = img[50:200, 200:300]
cv.imshow('Cropped', cropped)

Flipping

cv.imshow('Flip Vertical', cv.flip(img, 0))
cv.imshow('Flip Horizontal', cv.flip(img, 1))
cv.imshow('Flip Both', cv.flip(img, -1))

Contours

import cv2 as cv
import numpy as np

img = cv.imread('Photos/profile.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

ret, thres = cv.threshold(gray, 125, 255, cv.THRESH_BINARY)
cv.imshow('Threshold', thres)

# RETR_LIST returns every contour without building a hierarchy;
#   RETR_EXTERNAL would return only the external contours
# CHAIN_APPROX_NONE keeps every contour point;
#   CHAIN_APPROX_SIMPLE would combine points when it makes sense
#   i.e. (0,0), (1,1), (2,2) => (0,0), (2,2)
contours, hierarchies = cv.findContours(thres, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)
print(f'{len(contours)} contour(s) found')

blank = np.zeros(img.shape, dtype='uint8')
cv.drawContours(blank, contours, -1, (0,0,255), 1)
cv.imshow('Contours drawn', blank)

cv.waitKey(0)

The hierarchies allow us to detect contours within contours when a retrieval mode such as cv.RETR_TREE is used, which may be useful for detecting features within objects.

Go Functional Options Pattern

2023-04-23T00:00:00+00:00

The functional options pattern is a popular way to instantiate structs in Go.

It has two parts:

New(options ...func(*MyStruct)) *MyStruct
Options WithSomething(s Something) func(*MyStruct)

Option functions return a function which configures your structure; they typically look like:

func WithSomething(s Something) func(*MyStruct) {
  return func(m *MyStruct) {
    m.something = s
  }
}

Then the inside of your New function looks like:

func New(options ...func(*MyStruct)) *MyStruct {
  // Default config here.
  m := &MyStruct{}
  // ...

  // Apply options.
  for _, option := range options {
    option(m)
  }

  return m
}

Example: HTTP Client

package main

import (
  "fmt"
  "net/http"
  "time"
)

type Option func(*http.Client)

func New(opts ...Option) *http.Client {
  c := &http.Client{
    Timeout: 5 * time.Second,
  }
  for _, opt := range opts {
    opt(c)
  }
  return c
}

func WithTimeout(t time.Duration) Option {
  return func(c *http.Client) {
    c.Timeout = t
  }
}

func main() {
  c := New(WithTimeout(time.Second))
  res, err := c.Head("http://hoani.net")
  if err != nil {
    panic(err)
  }
  // Close the body so the underlying connection can be reused.
  defer res.Body.Close()
  fmt.Printf("status %v\n", res.Status)
}