Jekyll2023-12-20T07:31:43+00:00https://hoani.net/feed.xmlHoani.netThings I've gone and made.HoaniHexagonal Game of Life2023-11-02T00:00:00+00:002023-11-02T00:00:00+00:00https://hoani.net/posts/blog/hex-game-of-life<p>Last year, I took Mark Rober’s 30-day <a href="https://studio.com/mark-rober-engineering">creative engineering</a> course which emphasises the creative process from idea to build completion.</p>
<p>One of the builds was themed around art.</p>
<p>I had earlier been obsessed with some hexagonal shelves that I had found at the local hardware store, but I had no reason to buy them.</p>
<p>So to justify my impulsive consumerism I built a novel game-of-life on a hexagonal grid.</p>
<figure>
<a href="https://www.youtube.com/watch?v=jcLBQ5ObCy0"><img src="/assets/images/posts/blog/hex-gol/first-build.png" /></a>
</figure>
<p>After completing the <a href="https://hoani.net/posts/blog/2023-04-16-chatbox/">chatbox</a> and <a href="https://hoani.net/posts/blog/2023-08-19-lazerpaw/">lazerpaw</a> projects, I revisited this project to tidy it up and add an end-of-life detection algorithm.</p>
<table>
<thead>
<tr>
<th>Before</th>
<th>After</th>
</tr>
</thead>
<tbody>
<tr>
<td><img src="/assets/images/posts/blog/hex-gol/first-build.gif" /></td>
<td><img src="/assets/images/posts/blog/hex-gol/final-build.gif" /></td>
</tr>
</tbody>
</table>
<h1 id="rules-of-the-game">Rules of the Game</h1>
<p>Hexagonal game of life is similar to a square grid game of life.</p>
<p>The major difference is that instead of a cell having 8 neighbours, it will typically have 6.</p>
<p>The exceptions are the corner cells with 3 neighbours and the edge cells with 4.</p>
<p>We always start the game with each cell assigned either alive or dead randomly.</p>
<p>Then for each step of the game we apply the following rules, where <code class="language-plaintext highlighter-rouge">n</code> is the number of living neighbours a cell has:</p>
<ul>
<li>For dead cells:
<ul>
<li>If <code class="language-plaintext highlighter-rouge">n == 2</code> or <code class="language-plaintext highlighter-rouge">n == 3</code>, life begins in your cell</li>
</ul>
</li>
<li>For living cells:
<ul>
<li>If <code class="language-plaintext highlighter-rouge">n < 2</code>, life ends due to loneliness</li>
<li>If <code class="language-plaintext highlighter-rouge">n > 3</code>, life ends due to overpopulation</li>
</ul>
</li>
</ul>
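<p>The rules above can be sketched in Python (not the C++ used in the build). This is a minimal sketch that assumes a hexagon-shaped board with 5 cells per side, addressed with axial coordinates; that layout gives 3 neighbours on corners and 4 on edges. The names <code class="language-plaintext highlighter-rouge">board_cells</code>, <code class="language-plaintext highlighter-rouge">neighbours</code>, <code class="language-plaintext highlighter-rouge">next_state</code> and <code class="language-plaintext highlighter-rouge">step</code> are illustrative and are not taken from the project's source.</p>

```python
# Assumption: axial (q, r) coordinates on a hexagon-shaped board.
RADIUS = 4  # 5 cells per side
NEIGHBOUR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def board_cells(radius=RADIUS):
    """Return every (q, r) cell inside a hexagon of the given radius."""
    return [(q, r)
            for q in range(-radius, radius + 1)
            for r in range(-radius, radius + 1)
            if abs(q + r) <= radius]


def neighbours(cell, cells):
    """Return the neighbours of a cell that are actually on the board."""
    q, r = cell
    candidates = [(q + dq, r + dr) for dq, dr in NEIGHBOUR_OFFSETS]
    return [c for c in candidates if c in cells]


def next_state(alive, n):
    """Apply the rules to one cell with n living neighbours."""
    if alive:
        return 2 <= n <= 3      # dies of loneliness (n < 2) or overpopulation (n > 3)
    return n == 2 or n == 3     # life begins


def step(living, cells):
    """Return the set of living cells after one game step."""
    new_living = set()
    for cell in cells:
        n = sum(1 for nb in neighbours(cell, cells) if nb in living)
        if next_state(cell in living, n):
            new_living.add(cell)
    return new_living
```

<p>Because dead and living cells both end up alive exactly when <code class="language-plaintext highlighter-rouge">n</code> is 2 or 3, the two branches in <code class="language-plaintext highlighter-rouge">next_state</code> could be merged; they are kept separate here to mirror the list above.</p>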
<p>This image illustrates these rules, where we have two living cells about to die due to loneliness or overpopulation:</p>
<figure class="two-thirds">
<img src="/assets/images/posts/blog/hex-gol/rules-example.png" />
</figure>
<p>The game of life can get stuck in a looping pattern.</p>
<p>Because each step is completely deterministic, the game will never exit the loop.</p>
<p>I added a final rule to keep the build always doing interesting things:</p>
<ul>
<li>Once the game gets stuck in a loop, it ends and restarts.</li>
</ul>
<h3 id="detecting-end-of-life">Detecting End of Life</h3>
<p>To enforce the last rule, we need an algorithm to detect when the game is stuck in a loop. I called this the End of Life (EoL) algorithm.</p>
<p>Because this is running on embedded hardware, I wanted the detection to be lightweight and not require too much processing power.</p>
<p>The End of Life algorithm is:</p>
<ul>
<li>On each step, calculate a <code class="language-plaintext highlighter-rouge">uint64_t</code> <code class="language-plaintext highlighter-rouge">entry</code> value which represents the current board.
<ul>
<li>There are 61 cells, so the <code class="language-plaintext highlighter-rouge">entry</code> value uses 61 of its bits to represent the state of the board.</li>
</ul>
</li>
<li>We store 16 <code class="language-plaintext highlighter-rouge">entry</code> values in an <code class="language-plaintext highlighter-rouge">eolEntries</code> array:
<ul>
<li>Value <code class="language-plaintext highlighter-rouge">0</code> is stored on each game step</li>
<li>Value <code class="language-plaintext highlighter-rouge">1</code> is stored on every 2nd game step</li>
<li>Value <code class="language-plaintext highlighter-rouge">2</code> is stored on every 4th game step</li>
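<li>Values <code class="language-plaintext highlighter-rouge">3</code> to 15 are stored on every <code class="language-plaintext highlighter-rouge">2^n</code>th game step, where <code class="language-plaintext highlighter-rouge">n</code> is the value's index in the array</li>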
</ul>
</li>
<li>On each step, we check if the current <code class="language-plaintext highlighter-rouge">entry</code> matches any of the previous values in <code class="language-plaintext highlighter-rouge">eolEntries</code></li>
<li>If it does, the End of Life sequence begins so that the game will restart.</li>
</ul>
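<p>The steps above can be sketched in Python (the build itself is written in C++). This is a minimal sketch, not the project's implementation: the names <code class="language-plaintext highlighter-rouge">board_to_entry</code> and <code class="language-plaintext highlighter-rouge">EolDetector</code> are hypothetical, and using <code class="language-plaintext highlighter-rouge">None</code> for unused slots is an assumption made so that an all-dead board (which encodes to 0) is not mistaken for a match.</p>

```python
NUM_ENTRIES = 16


def board_to_entry(cells_alive):
    """Pack a sequence of booleans (one per cell) into a single integer."""
    entry = 0
    for i, alive in enumerate(cells_alive):
        if alive:
            entry |= 1 << i
    return entry


class EolDetector:
    def __init__(self):
        # Assumption: None marks a slot that has not been written yet.
        self.entries = [None] * NUM_ENTRIES
        self.steps = 0

    def update(self, entry):
        """Return True if entry matches a stored value, then store it."""
        looped = entry in self.entries
        # Slot i is overwritten on every (2**i)-th game step.
        for i in range(NUM_ENTRIES):
            if self.steps % (1 << i) == 0:
                self.entries[i] = entry
        self.steps += 1
        return looped
```

<p>The check runs before the store, so the current board is only ever compared against earlier steps. Slots with a long period hold on to an old board long enough for a long loop to come back around to it.</p>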
<p>The benefits of this algorithm are:</p>
<ul>
<li>can detect loops up to 65536 steps long</li>
<li>uses only 128 bytes of RAM</li>
<li>computationally deterministic and light
<ul>
<li>only requires around 500 operations per step</li>
</ul>
</li>
<li>a loop repeats at most twice before it is detected</li>
</ul>
<p>I wanted the End of Life sequence to be a special part of the game. Once End of Life is detected, I trigger a speaker to play <a href="https://www.youtube.com/watch?v=6zlSUvWU6z8">Final Voyage</a> from Outer Wilds. As this happens, the game continues with the cells slowly turning white. At the end of the song, cells stop dying until the entire board fills up, and then all die at once.</p>
<figure class="two-thirds"><img src="/assets/images/posts/blog/hex-gol/end-of-life.gif" /></figure>
<h1 id="build-details">Build Details</h1>
<h3 id="hardware">Hardware</h3>
<table>
<tbody>
<tr>
<td><img src="/assets/images/posts/blog/hex-gol/hardware.gif" /></td>
<td><img src="/assets/images/posts/blog/hex-gol/hardware-grid.gif" /></td>
</tr>
</tbody>
</table>
<p>The hardware in this project includes:</p>
<ul>
<li>WS2812 LED strips</li>
<li>Teensy 4.0</li>
<li>PJRC’s Audio Adaptor shield</li>
<li>5V (3A) power supply</li>
<li>A 5V Speaker with Audio jack input</li>
<li>Enclosure
<ul>
<li>Hexagonal shelf</li>
<li>3D printed parts</li>
<li>White plastic for diffusing the LEDs</li>
</ul>
</li>
</ul>
<h3 id="software">Software</h3>
<p>The software was written in C++ using Arduino.</p>
<p>Check it out at <a href="https://github.com/hoani/hex-game-of-life">github.com/hoani/hex-game-of-life</a></p>
<h3 id="debugging">Debugging</h3>
<p>To help debug the project, I added a <code class="language-plaintext highlighter-rouge">serial_view</code> which allows us to see the game of life on a serial terminal window.</p>
<figure class="two-thirds"><img src="/assets/images/posts/blog/hex-gol/serial.gif" /></figure>HoaniA digital art project on life (I guess)LazerPaw - Cat lazer turret.2023-08-19T00:00:00+00:002023-08-19T00:00:00+00:00https://hoani.net/posts/blog/lazerpaw<p>In January, we adopted three kittens who had been abandoned at a raceway.</p>
<figure>
<img src="/assets/images/posts/blog/lazerpaw/cats.jpg" />
</figure>
<p>We love them to bits but sometimes it’s hard to keep up with their kitten energy.</p>
<p>I had just finished learning a bit about OpenCV (see my <a href="/guides/software/python/opencv">guides here</a>) and I thought maybe I could build a lazer turret to play with my cats.</p>
<h2 id="lazer-turret">Lazer Turret</h2>
<p>This is the finished product.</p>
<figure>
<img src="/assets/images/posts/blog/lazerpaw/lazerpaw.jpg" />
</figure>
<p>It is made up of:</p>
<ul>
<li>Red Lazer (<1mW)</li>
<li>Raspberry Pi Zero 2 W</li>
<li>Raspberry Pi IR Camera</li>
<li>Servo pan-tilt module</li>
<li>Raspberry Pi Pico</li>
<li>NeoPixel strip</li>
<li>A pushbutton to start/stop the chase game</li>
<li>A bunch of 3D printed parts</li>
<li>Two smartphone holders to mount to the ceiling beam</li>
</ul>
<h3 id="design-decisions">Design Decisions</h3>
<p><strong>Use a camera without the IR filter</strong></p>
<p>As you’ll see later, I use a red backdrop. This makes the cats stand out against the background.</p>
<p><strong>Align the camera and lazer</strong></p>
<p>Both camera and lazer are on the pan-tilt system. This simplifies the control algorithm because the center of the image is always where the lazer is pointing.</p>
<p><strong>Have a simple hardware UI</strong></p>
<p>There is a single button to start the chase game. Holding it will shut down the Raspberry Pi.</p>
<p>There is a single strip of NeoPixel LEDs which shows status and mode.</p>
<h2 id="safety">Safety</h2>
<p>Most of the requirements for this build were written with safety in mind.</p>
<p>Firstly, it’s important to point out that a <1mW lazer is very unlikely to cause damage to a cat’s eyes.</p>
<p>Having said that, I wanted to make sure that our cats were as safe as possible. To achieve this, I built in two requirements:</p>
<ul>
<li>The system must turn the lazer off as soon as it thinks it is pointed at a cat</li>
<li>The system should be mounted high</li>
</ul>
<p>Mounting the turret high means that when chasing the lazer, the cat is always looking away from the light source.</p>
<figure class="two-thirds">
<img src="/assets/images/posts/blog/lazerpaw/turret-position.png" />
</figure>
<h2 id="algorithm">Algorithm</h2>
<p>Given the limited processing power of the Raspberry Pi Zero 2 W, cat detection is achieved with a thresholding algorithm.</p>
<p>Thresholding converts an image into black and white based on whether the brightness of each pixel is above or below the threshold.</p>
<p>The following image shows the threshold result next to the original image:</p>
<figure>
<img src="/assets/images/posts/blog/lazerpaw/threshold.png" />
</figure>
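<p>As a rough illustration of thresholding, here is a minimal pure-Python sketch that works on a grayscale image stored as a list of rows. It also includes a block-averaging downsample step. The function names, the block-averaging approach and the choice to mark below-threshold pixels as black are assumptions for this example; the real project may well use OpenCV calls instead.</p>

```python
def downsample(gray, factor):
    """Average each factor x factor block of pixels into one pixel."""
    rows = len(gray) // factor
    cols = len(gray[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [gray[r * factor + i][c * factor + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def threshold(gray, level):
    """Return a mask where True marks a black (below threshold) pixel."""
    return [[value < level for value in row] for row in gray]
```

<p>For example, <code class="language-plaintext highlighter-rouge">threshold(downsample(gray, 8), 100)</code> would shrink the image by a factor of 8 in each direction before marking the dark regions.</p>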
<p>To simplify knowing where the lazer is pointed, the lazer and camera are moved together. Therefore, we assume that the center of the image is also where the lazer is pointing.</p>
<p>To make the image move away from the black pixels (which I assume are my cats), each black pixel repels the center of the image.</p>
<p>Repulsion is distance based: the closer a black pixel is to the center, the more it will repel, as shown by the arrows below:</p>
<figure class="two-thirds">
<img src="/assets/images/posts/blog/lazerpaw/algorithm.png" />
</figure>
<p>The repulsion force of an individual black pixel is:</p>
\[\begin{align}
F_x = \frac{xK}{d^3} \quad
F_y = \frac{yK}{d^3}
\end{align}\]
<p>Where:</p>
<ul>
<li>\(K\) is a repulsion constant</li>
<li>\(x\) and \(y\) are pixel distances from the center of the image</li>
<li>\(d\) is the pixel distance given by \(\sqrt{x^2 + y^2}\)</li>
</ul>
<p>Three additional rules are included:</p>
<ul>
<li>Center pixels are excluded from the repulsion algorithm</li>
<li>The lazer is only on when center pixels are white</li>
<li>The image is downsampled before thresholding for performance</li>
</ul>
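<p>The force calculation can be sketched as follows. This is a minimal sketch under stated assumptions: the mask is a list of rows where <code class="language-plaintext highlighter-rouge">True</code> marks a black pixel, the "center pixels" are taken to be those within a hypothetical <code class="language-plaintext highlighter-rouge">exclude_radius</code> of the image center, and the function returns the sum exactly as the formula is written. Whether the controller steers along or against that vector is not stated here, so the sign convention is left to the caller.</p>

```python
import math


def repulsion_force(black, k=1.0, exclude_radius=1.0):
    """Sum F_x = x*K/d^3 and F_y = y*K/d^3 over every black pixel."""
    height = len(black)
    width = len(black[0])
    cx = (width - 1) / 2    # image center, in pixel coordinates
    cy = (height - 1) / 2
    fx = 0.0
    fy = 0.0
    for row in range(height):
        for col in range(width):
            if not black[row][col]:
                continue
            x = col - cx
            y = row - cy
            d = math.hypot(x, y)
            if d <= exclude_radius:
                continue    # center pixels are excluded
            fx += x * k / d ** 3
            fy += y * k / d ** 3
    return fx, fy
```

<p>Dividing by <code class="language-plaintext highlighter-rouge">d**3</code> means each pixel contributes a magnitude of <code class="language-plaintext highlighter-rouge">K / d**2</code>, so nearby black pixels dominate the result.</p>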
<p>The result of downsampled thresholding can be seen here, where the pixellated black pixels show what areas the controller will avoid.</p>
<figure class="two-thirds">
<img src="/assets/images/posts/blog/lazerpaw/downsampled-threshold.gif" />
</figure>
<h2 id="software">Software</h2>
<p>The lazerpaw repo is publicly available on <a href="https://github.com/hoani/lazerpaw">github.com/hoani/lazerpaw</a>.</p>
<h3 id="simulation">Simulation</h3>
<p>I wrote a simulator to test the controller and give me something to use when building the client UI.</p>
<figure>
<img src="/assets/images/posts/blog/lazerpaw/simulator.gif" />
</figure>
<p>The simulator draws a top down “truth” view of a floor. This is the view on the right.</p>
<p>It then takes a projection on the drawn floor and uses it as input to the camera; which is being streamed to the client UI on the left.</p>
<p>The camera projection isn’t perfect, but it’s good enough.</p>
<h3 id="user-interface">User interface</h3>
<p>I wanted to be able to use the lazer turret in two ways:</p>
<ul>
<li>In the living room, by pushing a button to start</li>
<li>In my office to play with the cats from across the house</li>
</ul>
<p>I prefer the second option, because I can see what the turret is sensing and predict where it will go next.</p>
<figure>
<img src="/assets/images/posts/blog/lazerpaw/user-interface.png" />
</figure>
<p>The Raspberry Pi is running a Flask server, which serves up a 30fps camera feed and a 30fps pixel grid which indicates which pixels the control algorithm is trying to avoid.</p>
<p>The pixel grid is overlaid on top of the live feed, and then some controls are placed around it.</p>
<p>Elements like “test lazer”, the “manual” switch, and “Set Threshold” are used to calibrate the turret.</p>
<p>Usually, only the “start” and “shutdown” buttons will be used since the lazer game will stop on its own after two minutes.</p>
<h3 id="led-driving">LED Driving</h3>
<p>The LEDs are a strip of WS2812, which are commonly branded by Adafruit as NeoPixels.</p>
<p><img src="/assets/images/posts/blog/lazerpaw/leds.gif" alt="serial-pixel LEDs" /></p>
<p>I reused the <a href="https://github.com/hoani/serial-pixel">github.com/hoani/serial-pixel</a> library that I built during my <a href="/posts/blog/2023-04-16-chatbox/">chatbox project</a>.</p>
<h2 id="demonstration">Demonstration</h2>
<p>This is the lazer turret playing with two of our cats on the first night it was installed.</p>
<figure>
<img src="/assets/images/posts/blog/lazerpaw/demonstration.gif" />
</figure>
<p>Two things I learned when trying this out with my cats were:</p>
<ul>
<li>the thresholding algorithm is surprisingly good at detecting which areas have a cat in them</li>
<li>my cats don’t really care if the algorithm is perfect or a bit dumb, they just like chasing the lazer</li>
</ul>
<p>If you got this far, thanks for reading and I hope this project gave you some cool ideas for your next build.</p>HoaniCats, Computer vision, and lazersConfiguring Cameras with v4l2-ctl2023-08-15T00:00:00+00:002023-08-15T00:00:00+00:00https://hoani.net/guides/software/raspberrypi/webcamConfig<p>You can get an overview of cameras available using:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2-ctl --all
</code></pre></div></div>
<p>If you want to know which <code class="language-plaintext highlighter-rouge">/dev/video<n></code> maps to your device:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2-ctl --list-devices
</code></pre></div></div>
<p>From here on, we assume the camera in question is on <code class="language-plaintext highlighter-rouge">/dev/video0</code>.</p>
<p>To learn what formats, resolution and frame-rate options your camera provides:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2-ctl --device=/dev/video0 --list-formats-ext
</code></pre></div></div>
<p>From here, the camera’s settings can be updated, for example, the following sets resolution:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2-ctl --device=/dev/video0 --set-fmt-video=width=1280,height=960 --verbose
</code></pre></div></div>
<p>Or you can change the encoding to one of the numeric indexes shown in <code class="language-plaintext highlighter-rouge">--list-formats-ext</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2-ctl --device=/dev/video0 --set-fmt-video=pixelformat=2 --verbose
</code></pre></div></div>
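<p>If you want to apply these settings from a script, one option is to build the same command line in Python and run it with <code class="language-plaintext highlighter-rouge">subprocess</code>. This is a minimal sketch: the helper names <code class="language-plaintext highlighter-rouge">set_format_command</code> and <code class="language-plaintext highlighter-rouge">apply_format</code> are hypothetical, and it only uses the <code class="language-plaintext highlighter-rouge">v4l2-ctl</code> options already shown above.</p>

```python
import subprocess


def set_format_command(device, width=None, height=None, pixelformat=None):
    """Build the v4l2-ctl argument list for --set-fmt-video."""
    fields = []
    if width is not None:
        fields.append(f"width={width}")
    if height is not None:
        fields.append(f"height={height}")
    if pixelformat is not None:
        fields.append(f"pixelformat={pixelformat}")
    if not fields:
        raise ValueError("at least one format field is required")
    return ["v4l2-ctl", f"--device={device}",
            "--set-fmt-video=" + ",".join(fields), "--verbose"]


def apply_format(device, **fields):
    """Run the command; this needs v4l2-ctl installed and a real camera."""
    return subprocess.run(set_format_command(device, **fields), check=True)
```

<p>Calling <code class="language-plaintext highlighter-rouge">apply_format("/dev/video0", width=1280, height=960)</code> runs the same resolution command as above and raises an exception if <code class="language-plaintext highlighter-rouge">v4l2-ctl</code> reports a failure.</p>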
<p>Additional available controls are shown with:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2-ctl --device=/dev/video0 --list-ctrls
</code></pre></div></div>HoaniCamera config for Raspberry Pi/Ubuntu distrosOpenCV Face Detection2023-06-05T00:00:00+00:002023-06-05T00:00:00+00:00https://hoani.net/guides/software/python/opencv-detection<p>These are my notes based on Jason Dsouza’s excellent <a href="https://youtu.be/oXlwWbU8l2o">OpenCV video course</a>.</p>
<h2 id="installation">Installation</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python3 -m pip install opencv-contrib-python
</code></pre></div></div>
<h2 id="finding-a-classifier">Finding a Classifier</h2>
<p>OpenCV’s face detection loads a <a href="https://docs.opencv.org/3.4/db/d28/tutorial_cascade_classifier.html">cascade classifier</a> which is then used to detect faces.</p>
<p>Go to <a href="https://github.com/opencv/opencv/tree/master/data/haarcascades">github.com/opencv/opencv/data/haarcascades</a>. There are a number of different classifiers; download <a href="https://github.com/opencv/opencv/blob/master/data/haarcascades/haarcascade_frontalface_default.xml">haarcascade_frontalface_default.xml</a>.</p>
<h2 id="running-the-classifier">Running the classifier</h2>
<p>Using an image with a face (or many faces); we can detect them with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/people.jpg'</span><span class="p">)</span>
<span class="n">cascade_file</span> <span class="o">=</span> <span class="s">'haarcascade_frontalface_default.xml'</span>
<span class="n">haar_cascade</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">CascadeClassifier</span><span class="p">(</span><span class="n">cascade_file</span><span class="p">)</span>
<span class="n">faces_rect</span> <span class="o">=</span> <span class="n">haar_cascade</span><span class="p">.</span><span class="n">detectMultiScale</span><span class="p">(</span>
    <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">),</span>
    <span class="n">scaleFactor</span><span class="o">=</span><span class="mf">1.1</span><span class="p">,</span> <span class="n">minNeighbors</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
<span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">w</span><span class="p">,</span><span class="n">h</span><span class="p">)</span> <span class="ow">in</span> <span class="n">faces_rect</span><span class="p">:</span>
    <span class="n">cv</span><span class="p">.</span><span class="n">rectangle</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">),</span> <span class="p">(</span><span class="n">x</span><span class="o">+</span><span class="n">w</span><span class="p">,</span> <span class="n">y</span><span class="o">+</span><span class="n">h</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">255</span><span class="p">,</span><span class="mi">255</span><span class="p">),</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Detected faces'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<p>For the Super Mario Movie cast, I got these results.</p>
<figure>
<img src="/assets/images/posts/guides/opencv/00_detection.png" />
</figure>
<p>There were a few false positives and negatives.</p>
<p>This could be further tuned by adjusting the <code class="language-plaintext highlighter-rouge">scaleFactor</code> and <code class="language-plaintext highlighter-rouge">minNeighbors</code> parameters; however, this isn’t the best idea since we would only be tuning the face detection to work with a specific type of image.</p>HoaniFace detection with OpenCVOpenCV Training2023-06-05T00:00:00+00:002023-06-05T00:00:00+00:00https://hoani.net/guides/software/python/opencv-training<p>These are my notes based on Jason Dsouza’s excellent <a href="https://youtu.be/oXlwWbU8l2o">OpenCV video course</a>.</p>
<h2 id="installation">Installation</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python3 -m pip install opencv-contrib-python
</code></pre></div></div>
<h2 id="setup">Setup</h2>
<p>We assume you have the following folders:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>train
|- Name A
|- Name B
|- ...etc
validate
|- Name A
|- Name B
|- ...etc
</code></pre></div></div>
<p>Each sub-folder should have multiple images in them for training and verifying.</p>
<p>Make sure you have downloaded the face detection cascade data from <a href="/guides/software/python/opencv-detection">opencv detection</a>.</p>
<h2 id="training">Training</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">DIR</span> <span class="o">=</span> <span class="s">"train"</span>
<span class="n">people</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">os</span><span class="p">.</span><span class="n">listdir</span><span class="p">(</span><span class="n">DIR</span><span class="p">):</span>
    <span class="n">people</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="n">cascade_file</span> <span class="o">=</span> <span class="s">'haarcascade_frontalface_default.xml'</span>
<span class="n">haar_cascade</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">CascadeClassifier</span><span class="p">(</span><span class="n">cascade_file</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">detect_face</span><span class="p">(</span><span class="n">img</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">haar_cascade</span><span class="p">.</span><span class="n">detectMultiScale</span><span class="p">(</span><span class="n">img</span><span class="p">,</span>
        <span class="n">scaleFactor</span><span class="o">=</span><span class="mf">1.1</span><span class="p">,</span> <span class="n">minNeighbors</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">create_train</span><span class="p">():</span>
    <span class="n">features</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">labels</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">person</span> <span class="ow">in</span> <span class="n">people</span><span class="p">:</span>
        <span class="n">path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">DIR</span><span class="p">,</span> <span class="n">person</span><span class="p">)</span>
        <span class="n">label</span> <span class="o">=</span> <span class="n">people</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="n">person</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">img</span> <span class="ow">in</span> <span class="n">os</span><span class="p">.</span><span class="n">listdir</span><span class="p">(</span><span class="n">path</span><span class="p">):</span>
            <span class="n">img_path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
            <span class="n">img_array</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="n">img_path</span><span class="p">)</span>
            <span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img_array</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
            <span class="n">faces_rect</span> <span class="o">=</span> <span class="n">detect_face</span><span class="p">(</span><span class="n">gray</span><span class="p">)</span>
            <span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">w</span><span class="p">,</span><span class="n">h</span><span class="p">)</span> <span class="ow">in</span> <span class="n">faces_rect</span><span class="p">:</span>
                <span class="n">faces_roi</span> <span class="o">=</span> <span class="n">gray</span><span class="p">[</span><span class="n">y</span><span class="p">:</span><span class="n">y</span><span class="o">+</span><span class="n">h</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span><span class="n">x</span><span class="o">+</span><span class="n">w</span><span class="p">]</span>
                <span class="n">features</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">faces_roi</span><span class="p">)</span>
                <span class="n">labels</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">label</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'object'</span><span class="p">),</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">labels</span><span class="p">)</span>
<span class="n">features</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">create_train</span><span class="p">()</span>
<span class="c1"># Train the recognizer on the features list and labels list
</span><span class="n">face_recogniser</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">face</span><span class="p">.</span><span class="n">LBPHFaceRecognizer_create</span><span class="p">()</span>
<span class="n">face_recogniser</span><span class="p">.</span><span class="n">train</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">)</span>
<span class="n">face_recogniser</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="s">'face-trained.yml'</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="recognition">Recognition</h2>
<p>In this code, we use our trained face recognizer to determine if a detected face is who we expect it to be.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="n">DIR</span> <span class="o">=</span> <span class="s">"train"</span>
<span class="n">people</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">os</span><span class="p">.</span><span class="n">listdir</span><span class="p">(</span><span class="n">DIR</span><span class="p">):</span>
    <span class="n">people</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="c1"># Load model
</span><span class="n">face_recognizer</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">face</span><span class="p">.</span><span class="n">LBPHFaceRecognizer_create</span><span class="p">()</span>
<span class="n">face_recognizer</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="s">'3-faces/face-trained.yml'</span><span class="p">)</span>
<span class="c1"># Load a face
</span><span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'validate/elton_john/1.jpg'</span><span class="p">)</span>
<span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
<span class="c1"># Detect face
</span><span class="n">cascade_file</span> <span class="o">=</span> <span class="s">'haarcascade_frontalface_default.xml'</span>
<span class="n">haar_cascade</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">CascadeClassifier</span><span class="p">(</span><span class="n">cascade_file</span><span class="p">)</span>
<span class="n">faces_rect</span> <span class="o">=</span> <span class="n">haar_cascade</span><span class="p">.</span><span class="n">detectMultiScale</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="mf">1.1</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span>
<span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">w</span><span class="p">,</span><span class="n">h</span><span class="p">)</span> <span class="ow">in</span> <span class="n">faces_rect</span><span class="p">:</span>
    <span class="n">faces_roi</span> <span class="o">=</span> <span class="n">gray</span><span class="p">[</span><span class="n">y</span><span class="p">:</span><span class="n">y</span><span class="o">+</span><span class="n">h</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span><span class="n">x</span><span class="o">+</span><span class="n">w</span><span class="p">]</span>
    <span class="n">label</span><span class="p">,</span> <span class="n">confidence</span> <span class="o">=</span> <span class="n">face_recognizer</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">faces_roi</span><span class="p">)</span>
    <span class="n">cv</span><span class="p">.</span><span class="n">putText</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">people</span><span class="p">[</span><span class="n">label</span><span class="p">]),</span> <span class="p">(</span><span class="mi">20</span><span class="p">,</span><span class="mi">40</span><span class="p">),</span>
        <span class="n">cv</span><span class="p">.</span><span class="n">FONT_HERSHEY_COMPLEX</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span> <span class="mi">2</span><span class="p">)</span>
    <span class="n">cv</span><span class="p">.</span><span class="n">rectangle</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">),</span> <span class="p">(</span><span class="n">x</span><span class="o">+</span><span class="n">w</span><span class="p">,</span> <span class="n">y</span><span class="o">+</span><span class="n">h</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">255</span><span class="p">,</span><span class="mi">0</span><span class="p">),</span> <span class="n">thickness</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Detected face'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<p>With my poorly trained model, this only works some of the time.</p>
<figure>
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_0.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_1.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_2.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_3.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_4.png" alt="" />
</figure>
<p>Funnily enough, it thought Jerry Seinfeld was Elton John when he was wearing glasses.</p>
<figure>
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_5.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_6.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_7.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_8.png" alt="" />
<img style="width:20%" src="/assets/images/posts/guides/opencv/01_recognize_roi_9.png" alt="" />
</figure>
<p>Hoani. Training an OpenCV Face Recognition Model.</p>
<h1>OpenCV Processing</h1>
<p>2023-06-04. <a href="https://hoani.net/guides/software/python/opencv-processing">https://hoani.net/guides/software/python/opencv-processing</a></p>
<p>These are my notes based on Jason Dsouza’s excellent <a href="https://youtu.be/oXlwWbU8l2o">OpenCV video course</a>.</p>
<h2 id="installation">Installation</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python3 -m pip install opencv-contrib-python
</code></pre></div></div>
<h2 id="colorspaces">Colorspaces</h2>
<p>OpenCV represents pixels in a BGR (blue, green, red) colorspace. Use <code class="language-plaintext highlighter-rouge">cv.cvtColor</code> when you need to represent pixels in another colorspace.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Profile'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
<span class="c1"># Matplotlib uses RGB
</span><span class="n">plt</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2RGB</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
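<p>For the BGR to RGB case specifically, the conversion amounts to reversing the channel order. Here is a minimal sketch in plain NumPy (the tiny <code class="language-plaintext highlighter-rouge">bgr</code> array is made up for illustration, and this is not how OpenCV implements it internally):</p>

```python
import numpy as np

# A 1x2 image in OpenCV's BGR order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the blue and red channels (BGR -> RGB).
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # the blue pixel, now in RGB order
print(rgb[0, 1])  # the red pixel, now in RGB order
```

<p>Other conversions (grayscale, HSV, LAB) involve real arithmetic, so <code class="language-plaintext highlighter-rouge">cv.cvtColor</code> is still the tool to reach for in general.</p>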
<h3 id="splitmerge">Split/Merge</h3>
<p>Colors can be split into each individual channel.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Profile'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
<span class="n">b</span><span class="p">,</span> <span class="n">g</span><span class="p">,</span> <span class="n">r</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="n">img</span><span class="p">)</span>
<span class="c1"># Show colors for split images
</span><span class="n">blank</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[:</span><span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'uint8'</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Blue'</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">merge</span><span class="p">([</span><span class="n">b</span><span class="p">,</span> <span class="n">blank</span><span class="p">,</span> <span class="n">blank</span><span class="p">]))</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Green'</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">merge</span><span class="p">([</span><span class="n">blank</span><span class="p">,</span> <span class="n">g</span><span class="p">,</span> <span class="n">blank</span><span class="p">]))</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Red'</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">merge</span><span class="p">([</span><span class="n">blank</span><span class="p">,</span> <span class="n">blank</span><span class="p">,</span> <span class="n">r</span><span class="p">]))</span>
<span class="n">merged</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">merge</span><span class="p">([</span><span class="n">b</span><span class="p">,</span> <span class="n">g</span><span class="p">,</span> <span class="n">r</span><span class="p">])</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Merged'</span><span class="p">,</span> <span class="n">merged</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="smoothing">Smoothing</h2>
<p>Typically, we use smoothing to remove noise.</p>
<p>Smoothing functions take a kernel size, which defines the window of neighbouring pixels used for the smoothing operation.</p>
<p>For each example, assume we start with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="averaging">Averaging</h3>
<p>Each pixel is the average of the neighbouring pixels.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">average</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">blur</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">[</span><span class="mi">11</span><span class="p">,</span><span class="mi">11</span><span class="p">])</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'AverageBlur'</span><span class="p">,</span> <span class="n">average</span><span class="p">)</span>
</code></pre></div></div>
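<p>To see what "average of the neighbouring pixels" means numerically, here is a rough NumPy sketch. The helper <code class="language-plaintext highlighter-rouge">box_blur_valid</code> is hypothetical: it only computes pixels whose whole window fits inside the image, whereas <code class="language-plaintext highlighter-rouge">cv.blur</code> pads the borders and returns an image of the same size.</p>

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_blur_valid(gray, k):
    """Mean of every k x k window; output shrinks by k - 1 in each dimension."""
    windows = sliding_window_view(gray.astype(np.float64), (k, k))
    return windows.mean(axis=(-2, -1))

# A black 5x5 image with a single bright pixel in the centre.
gray = np.zeros((5, 5), dtype=np.uint8)
gray[2, 2] = 90

# Every 3x3 window contains the bright pixel, so it is spread evenly.
blurred = box_blur_valid(gray, 3)
print(blurred)
```

<p>A larger kernel spreads each pixel over more neighbours, which is why the 11x11 kernel above gives a fairly strong blur.</p>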
<h3 id="gaussian">Gaussian</h3>
<p>Each pixel is the Gaussian-weighted average of the neighbouring pixels, so nearby pixels count for more than distant ones.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gaussian</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">GaussianBlur</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">[</span><span class="mi">11</span><span class="p">,</span><span class="mi">11</span><span class="p">],</span> <span class="n">sigmaX</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'GaussianBlur'</span><span class="p">,</span> <span class="n">gaussian</span><span class="p">)</span>
</code></pre></div></div>
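<p>The weights can be sketched in NumPy as below. The helper name is mine, and the sigma formula is the one the OpenCV documentation gives for <code class="language-plaintext highlighter-rouge">getGaussianKernel</code> when sigma is not positive; I am assuming <code class="language-plaintext highlighter-rouge">sigmaX=0</code> in <code class="language-plaintext highlighter-rouge">GaussianBlur</code> follows the same rule for an 11x11 kernel.</p>

```python
import numpy as np

def gaussian_kernel_1d(ksize, sigma):
    """Normalised 1-D Gaussian weights for an odd kernel size."""
    x = np.arange(ksize) - (ksize - 1) / 2
    weights = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return weights / weights.sum()

ksize = 11
# Documented default when sigma <= 0 (assumed to apply here).
sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

kernel = gaussian_kernel_1d(ksize, sigma)
# The 2-D kernel is separable: the outer product of the 1-D kernel with itself.
kernel_2d = np.outer(kernel, kernel)

print(sigma)
print(kernel.round(4))
```

<p>The centre weight is the largest and the weights fall off symmetrically, which is what keeps the result closer to the original pixel than a plain average.</p>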
<h3 id="median-blur">Median Blur</h3>
<p>Each pixel is the median of the neighbouring pixels.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">median</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">medianBlur</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="mi">11</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'MedianBlur'</span><span class="p">,</span> <span class="n">median</span><span class="p">)</span>
</code></pre></div></div>
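<p>The median is handy against "salt and pepper" noise because a single outlier barely affects it. A small made-up example comparing it with the mean:</p>

```python
import numpy as np

# A flat 3x3 patch of value 10 with one "salt" pixel in the middle.
patch = np.full((3, 3), 10, dtype=np.uint8)
patch[1, 1] = 255

mean_value = patch.mean()        # pulled upwards by the outlier
median_value = np.median(patch)  # unaffected by the single outlier

print(mean_value, median_value)
```

<p>An averaging blur would smear the bright pixel into its neighbours, while the median simply drops it.</p>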
<h3 id="bilateral-blur">Bilateral Blur</h3>
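<p>Applies a bilateral filter, which smooths noise while keeping edges sharp. The arguments after the image are the neighbourhood diameter, the color sigma and the spatial sigma.</p>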
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bilateral</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bilateralFilter</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">35</span><span class="p">,</span> <span class="mi">25</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Bilateral'</span><span class="p">,</span> <span class="n">bilateral</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="bitwise-operations">Bitwise Operations</h2>
<p>OpenCV can apply bitwise AND, OR, XOR and NOT operations to images.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">blank</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">400</span><span class="p">,</span> <span class="mi">400</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'uint8'</span><span class="p">)</span>
<span class="n">rectangle</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">rectangle</span><span class="p">(</span><span class="n">blank</span><span class="p">.</span><span class="n">copy</span><span class="p">(),</span> <span class="p">(</span><span class="mi">30</span><span class="p">,</span> <span class="mi">30</span><span class="p">),</span> <span class="p">(</span><span class="mi">370</span><span class="p">,</span> <span class="mi">370</span><span class="p">),</span> <span class="mi">255</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">circle</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">circle</span><span class="p">(</span><span class="n">blank</span><span class="p">.</span><span class="n">copy</span><span class="p">(),</span> <span class="p">(</span><span class="mi">200</span><span class="p">,</span> <span class="mi">200</span><span class="p">),</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">bitwiseAnd</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bitwise_and</span><span class="p">(</span><span class="n">rectangle</span><span class="p">,</span> <span class="n">circle</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Bitwise AND'</span><span class="p">,</span> <span class="n">bitwiseAnd</span><span class="p">)</span>
<span class="n">bitwiseOr</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bitwise_or</span><span class="p">(</span><span class="n">rectangle</span><span class="p">,</span> <span class="n">circle</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Bitwise OR'</span><span class="p">,</span> <span class="n">bitwiseOr</span><span class="p">)</span>
<span class="n">bitwiseXor</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bitwise_xor</span><span class="p">(</span><span class="n">rectangle</span><span class="p">,</span> <span class="n">circle</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Bitwise XOR'</span><span class="p">,</span> <span class="n">bitwiseXor</span><span class="p">)</span>
<span class="n">bitwiseNot</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bitwise_not</span><span class="p">(</span><span class="n">rectangle</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Bitwise NOT'</span><span class="p">,</span> <span class="n">bitwiseNot</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
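<p>On binary images (pixels that are either 0 or 255) these behave like a truth table. A quick NumPy sketch with four made-up pixels covering every combination; NumPy's bitwise functions are used here as a stand-in for the <code class="language-plaintext highlighter-rouge">cv.bitwise_*</code> calls:</p>

```python
import numpy as np

# Four pixels covering every combination of "off" (0) and "on" (255).
a = np.array([0, 0, 255, 255], dtype=np.uint8)
b = np.array([0, 255, 0, 255], dtype=np.uint8)

and_ = np.bitwise_and(a, b)   # on only where both are on
or_ = np.bitwise_or(a, b)     # on where either is on
xor_ = np.bitwise_xor(a, b)   # on where exactly one is on
not_a = np.bitwise_not(a)     # inverts a

print(and_, or_, xor_, not_a)
```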
<h3 id="masking">Masking</h3>
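<p>We can use the bitwise functions with a mask to keep only part of an image. The mask must be single-channel and the same height and width as the image.</p>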
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">mask</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">circle</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[:</span><span class="mi">2</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'uint8'</span><span class="p">),</span>
<span class="p">(</span><span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">//</span><span class="mi">2</span><span class="p">,</span> <span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">//</span><span class="mi">2</span><span class="p">),</span> <span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">//</span><span class="mi">3</span><span class="p">,</span>
<span class="mi">255</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">masked</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bitwise_or</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">img</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Masked image'</span><span class="p">,</span> <span class="n">masked</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
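<p>Conceptually, the <code class="language-plaintext highlighter-rouge">mask</code> argument keeps pixels where the mask is non-zero and blanks the rest. A minimal NumPy sketch of that idea, using a made-up 2x2 image and assuming a freshly created output (this is an illustration, not OpenCV's implementation):</p>

```python
import numpy as np

# A 2x2 three-channel image filled with a mid-grey value.
img = np.full((2, 2, 3), 200, dtype=np.uint8)

# Single-channel mask: keep the diagonal pixels, drop the others.
mask = np.array([[255, 0],
                 [0, 255]], dtype=np.uint8)

# Broadcast the mask over the channel axis and zero out unselected pixels.
masked = np.where(mask[..., None] > 0, img, 0).astype(np.uint8)

print(masked[0, 0], masked[0, 1])
```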
<h2 id="histogram">Histogram</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
<span class="n">gray_hist</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">calcHist</span><span class="p">([</span><span class="n">gray</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="bp">None</span><span class="p">,</span> <span class="p">[</span><span class="mi">256</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">256</span><span class="p">])</span>
<span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="s">'Grayscale histogram'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s">'Bins'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s">'Num pixels'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">gray_hist</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">xlim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span><span class="mi">256</span><span class="p">])</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
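<p>With 256 bins over the range 0 to 256, the grayscale histogram is just a count of how many pixels have each intensity. The same counts can be sketched with NumPy on a tiny made-up image (note that <code class="language-plaintext highlighter-rouge">cv.calcHist</code> returns a float array of shape (256, 1) rather than a flat integer array):</p>

```python
import numpy as np

# Tiny grayscale "image": two black pixels, three at 10, one white.
gray = np.array([[0, 0, 10],
                 [10, 10, 255]], dtype=np.uint8)

# One bin per possible intensity value.
hist = np.bincount(gray.ravel(), minlength=256)

print(hist[0], hist[10], hist[255])
```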
<h2 id="thresholding">Thresholding</h2>
<p>Thresholding allows us to process an image based on pixel intensity.</p>
<p>For each example, assume we start with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="simple">Simple</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">th</span><span class="p">,</span> <span class="n">thresh</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">threshold</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="mi">150</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">THRESH_BINARY</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Simple Threshold'</span><span class="p">,</span> <span class="n">thresh</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="inverted">Inverted</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">th</span><span class="p">,</span> <span class="n">thresh</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">threshold</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="mi">150</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">THRESH_BINARY_INV</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Inverted Threshold'</span><span class="p">,</span> <span class="n">thresh</span><span class="p">)</span>
</code></pre></div></div>
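<p>Both modes compare each pixel against the threshold: <code class="language-plaintext highlighter-rouge">THRESH_BINARY</code> sets pixels strictly above it to the maximum value and the rest to zero, and <code class="language-plaintext highlighter-rouge">THRESH_BINARY_INV</code> does the opposite. The first return value (<code class="language-plaintext highlighter-rouge">th</code>) is the threshold that was used. A NumPy sketch of the same rule on made-up values:</p>

```python
import numpy as np

gray = np.array([100, 150, 151, 200], dtype=np.uint8)
threshold, max_value = 150, 255

# THRESH_BINARY: strictly greater than the threshold -> max_value, else 0.
binary = np.where(gray > threshold, max_value, 0).astype(np.uint8)

# THRESH_BINARY_INV: the same comparison with the outputs swapped.
inverted = np.where(gray > threshold, 0, max_value).astype(np.uint8)

print(binary, inverted)
```

<p>Note that a pixel exactly equal to the threshold (150 here) counts as "below".</p>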
<h3 id="adaptive">Adaptive</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">thresh</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">adaptiveThreshold</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">ADAPTIVE_THRESH_MEAN_C</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">THRESH_BINARY</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Adaptive threshold'</span><span class="p">,</span> <span class="n">thresh</span><span class="p">)</span>
</code></pre></div></div>
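<p>Instead of one global threshold, the adaptive version computes a threshold per pixel: with <code class="language-plaintext highlighter-rouge">ADAPTIVE_THRESH_MEAN_C</code> it is the mean of the surrounding block (11x11 above) minus a constant (3 above). Here is a rough NumPy sketch of that rule. The helper is hypothetical, only handles pixels whose block fits inside the image, and may differ from OpenCV in rounding and border handling.</p>

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_threshold_valid(gray, block_size, c, max_value=255):
    """Threshold each interior pixel against (local mean - c)."""
    windows = sliding_window_view(gray.astype(np.float64), (block_size, block_size))
    local_threshold = windows.mean(axis=(-2, -1)) - c
    r = block_size // 2
    # Pixels at the centre of each window.
    centre = gray[r:gray.shape[0] - r, r:gray.shape[1] - r]
    return np.where(centre > local_threshold, max_value, 0).astype(np.uint8)

# Flat image with one darker pixel in the middle.
gray = np.full((5, 5), 100, dtype=np.uint8)
gray[2, 2] = 50

result = adaptive_mean_threshold_valid(gray, block_size=3, c=3)
print(result)
```

<p>Only the dark pixel falls below its local threshold, so it is the only one set to zero. This is what makes adaptive thresholding cope with uneven lighting better than a single global value.</p>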
<h2 id="gradient-detection">Gradient Detection</h2>
<p>Gradient detection is similar to edge detection and is useful for simple object detection.</p>
<p>For each example, assume we start with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/park.jpg'</span><span class="p">)</span>
<span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="lapacian">Laplacian</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lap</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">Laplacian</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">CV_64F</span><span class="p">)</span>
<span class="n">lap</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">uint8</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">absolute</span><span class="p">(</span><span class="n">lap</span><span class="p">))</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Laplacian'</span><span class="p">,</span> <span class="n">lap</span><span class="p">)</span>
</code></pre></div></div>
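<p>The Laplacian produces both positive and negative values, which is why the result is computed as <code class="language-plaintext highlighter-rouge">CV_64F</code> and passed through <code class="language-plaintext highlighter-rouge">np.absolute</code> before converting to <code class="language-plaintext highlighter-rouge">uint8</code>. A small NumPy sketch, assuming the standard 3x3 Laplacian kernel that OpenCV documents for the default aperture size (the helper only computes interior pixels):</p>

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

LAPLACIAN_KERNEL = np.array([[0, 1, 0],
                             [1, -4, 1],
                             [0, 1, 0]], dtype=np.float64)

def laplacian_valid(gray):
    """Apply the 3x3 Laplacian kernel to every interior pixel."""
    windows = sliding_window_view(gray.astype(np.float64), (3, 3))
    return (windows * LAPLACIAN_KERNEL).sum(axis=(-2, -1))

# A vertical step edge: dark on the left, bright on the right.
gray = np.zeros((3, 4), dtype=np.uint8)
gray[:, 2:] = 100

lap = laplacian_valid(gray)
print(lap)                 # positive on one side of the edge, negative on the other
print(np.absolute(lap))    # what gets displayed after taking the absolute value
```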
<h3 id="sobel">Sobel</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">sobelx</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">Sobel</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">CV_64F</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">sobely</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">Sobel</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">CV_64F</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">combined_sobel</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">bitwise_or</span><span class="p">(</span><span class="n">sobelx</span><span class="p">,</span> <span class="n">sobely</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Sobel Combined'</span><span class="p">,</span> <span class="n">combined_sobel</span><span class="p">)</span>
</code></pre></div></div>
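<p>Sobel estimates the gradient separately along x and y. As a sketch of what the two outputs contain, here are the standard 3x3 Sobel kernels applied with NumPy to a made-up step edge. Combining them as a gradient magnitude is a common alternative to the bitwise OR used above; it is my assumption for illustration, not something from the course.</p>

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter_valid(gray, kernel):
    """Correlate a small kernel with every interior pixel."""
    windows = sliding_window_view(gray.astype(np.float64), kernel.shape)
    return (windows * kernel).sum(axis=(-2, -1))

# A vertical step edge: intensity changes along x only.
gray = np.zeros((3, 4), dtype=np.uint8)
gray[:, 2:] = 100

gx = filter_valid(gray, SOBEL_X)   # responds to the vertical edge
gy = filter_valid(gray, SOBEL_Y)   # no change along y, so zero
magnitude = np.sqrt(gx ** 2 + gy ** 2)

print(gx, gy, magnitude)
```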
<h3 id="canny">Canny</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">canny</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">Canny</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="mi">150</span><span class="p">,</span> <span class="mi">175</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Canny'</span><span class="p">,</span> <span class="n">canny</span><span class="p">)</span>
</code></pre></div></div>
<p>Hoani. Image processing with OpenCV in Python.</p>
<h1>OpenCV Basics</h1>
<p>2023-06-03. <a href="https://hoani.net/guides/software/python/opencv-basics">https://hoani.net/guides/software/python/opencv-basics</a></p>
<p>These are my notes based on Jason Dsouza’s excellent <a href="https://youtu.be/oXlwWbU8l2o">OpenCV video course</a>.</p>
<h2 id="installation">Installation</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python3 -m pip install opencv-contrib-python
</code></pre></div></div>
<h2 id="showing-an-image">Showing an Image</h2>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Profile'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c1"># Waits forever until keypress.
</span></code></pre></div></div>
<h2 id="showing-a-video">Showing a Video</h2>
<p>Showing a video is similar to an image, but we display each frame individually.</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="c1"># capture = cv.VideoCapture(0) for webcam
</span><span class="n">capture</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">VideoCapture</span><span class="p">(</span><span class="s">'Videos/drone.mp4'</span><span class="p">)</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
    <span class="n">hasFrame</span><span class="p">,</span> <span class="n">frame</span> <span class="o">=</span> <span class="n">capture</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">hasFrame</span><span class="p">:</span>
        <span class="k">break</span>
    <span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Video'</span><span class="p">,</span> <span class="n">frame</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">20</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xFF</span><span class="o">==</span><span class="nb">ord</span><span class="p">(</span><span class="s">'d'</span><span class="p">):</span>
        <span class="k">break</span>
<span class="n">capture</span><span class="p">.</span><span class="n">release</span><span class="p">()</span>
<span class="n">cv</span><span class="p">.</span><span class="n">destroyAllWindows</span><span class="p">()</span>
</code></pre></div></div>
<h2 id="common-operations">Common Operations</h2>
<p>For each example, assume we start with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="convert-to-grayscale">Convert to Grayscale</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">code</span><span class="o">=</span><span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Gray'</span><span class="p">,</span> <span class="n">gray</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="blurring">Blurring</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">blur</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">GaussianBlur</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="mi">7</span><span class="p">,</span><span class="mi">7</span><span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="n">borderType</span><span class="o">=</span><span class="n">cv</span><span class="p">.</span><span class="n">BORDER_DEFAULT</span><span class="p">)</span> <span class="c1"># Third argument is sigmaX; 0 derives it from the kernel size.</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Blur'</span><span class="p">,</span> <span class="n">blur</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="edge-cascade">Edge Cascade</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">canny</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">Canny</span><span class="p">(</span><span class="n">blur</span><span class="p">,</span> <span class="mi">125</span><span class="p">,</span> <span class="mi">175</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Canny Edges'</span><span class="p">,</span> <span class="n">canny</span><span class="p">)</span>
</code></pre></div></div>
<p><em>Note: Blurring prior to edge detection can provide better results.</em></p>
<h3 id="dilatingeroding">Dilating/Eroding</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dilated</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">dilate</span><span class="p">(</span><span class="n">canny</span><span class="p">,</span> <span class="p">(</span><span class="mi">7</span><span class="p">,</span><span class="mi">7</span><span class="p">),</span> <span class="n">iterations</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
<span class="n">eroded</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">erode</span><span class="p">(</span><span class="n">dilated</span><span class="p">,</span> <span class="p">(</span><span class="mi">7</span><span class="p">,</span><span class="mi">7</span><span class="p">),</span> <span class="n">iterations</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Dilated'</span><span class="p">,</span> <span class="n">dilated</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Eroded'</span><span class="p">,</span> <span class="n">eroded</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="drawing-on-an-image">Drawing on an image</h2>
<p>For each example, assume we start with:</p>
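<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>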
<span class="n">img</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">500</span><span class="p">,</span> <span class="mi">500</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'uint8'</span><span class="p">)</span>
<span class="n">green</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span><span class="mi">255</span><span class="p">,</span><span class="mi">0</span>
<span class="n">red</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">255</span> <span class="c1"># Note BGR.
</span></code></pre></div></div>
<h3 id="painting-pixels">Painting pixels</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">img</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">green</span>
<span class="n">img</span><span class="p">[</span><span class="mi">200</span><span class="p">:</span><span class="mi">300</span><span class="p">,</span> <span class="mi">300</span><span class="p">:</span><span class="mi">400</span><span class="p">]</span> <span class="o">=</span> <span class="n">red</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Green with red square'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="rectangle">Rectangle</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cv</span><span class="p">.</span><span class="n">rectangle</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span> <span class="p">(</span><span class="mi">250</span><span class="p">,</span> <span class="mi">400</span><span class="p">),</span> <span class="n">green</span><span class="p">,</span> <span class="n">thickness</span><span class="o">=</span><span class="n">cv</span><span class="p">.</span><span class="n">FILLED</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Rectangle'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
</code></pre></div></div>
<p>Set thickness to a positive integer to draw an outline of that width instead of a filled shape.</p>
<h3 id="circle">Circle</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cv</span><span class="p">.</span><span class="n">circle</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="mi">250</span><span class="p">,</span> <span class="mi">250</span><span class="p">),</span> <span class="mi">100</span><span class="p">,</span> <span class="n">red</span><span class="p">,</span> <span class="n">thickness</span><span class="o">=</span><span class="n">cv</span><span class="p">.</span><span class="n">FILLED</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Circle'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="line">Line</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cv</span><span class="p">.</span><span class="n">line</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="mi">50</span><span class="p">,</span> <span class="mi">200</span><span class="p">),</span> <span class="p">(</span><span class="mi">450</span><span class="p">,</span> <span class="mi">300</span><span class="p">),</span> <span class="n">green</span><span class="p">,</span> <span class="n">thickness</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Line'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="text">Text</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cv</span><span class="p">.</span><span class="n">putText</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="s">"Hello"</span><span class="p">,</span> <span class="p">(</span><span class="mi">225</span><span class="p">,</span> <span class="mi">225</span><span class="p">),</span> <span class="n">cv</span><span class="p">.</span><span class="n">FONT_HERSHEY_TRIPLEX</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="n">green</span><span class="p">,</span> <span class="n">thickness</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">putText</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="s">"Bye bye"</span><span class="p">,</span> <span class="p">(</span><span class="mi">225</span><span class="p">,</span> <span class="mi">280</span><span class="p">),</span> <span class="n">cv</span><span class="p">.</span><span class="n">FONT_HERSHEY_DUPLEX</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="n">green</span><span class="p">,</span> <span class="n">thickness</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Text'</span><span class="p">,</span> <span class="n">img</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="transformations">Transformations</h2>
<p>For each example, assume we start with:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="translation">Translation</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">translate</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="n">transMat</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="n">x</span><span class="p">],</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="n">y</span><span class="p">]])</span>
<span class="n">dimensions</span> <span class="o">=</span> <span class="p">(</span><span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
<span class="k">return</span> <span class="n">cv</span><span class="p">.</span><span class="n">warpAffine</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">transMat</span><span class="p">,</span> <span class="n">dimensions</span><span class="p">)</span>
<span class="n">translated</span> <span class="o">=</span> <span class="n">translate</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="o">-</span><span class="mi">100</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Translated'</span><span class="p">,</span> <span class="n">translated</span><span class="p">)</span>
</code></pre></div></div>
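<p>To see what the 2x3 matrix in <code class="language-plaintext highlighter-rouge">translate</code> does, here is a minimal pure-Python sketch that pushes a single point through it. The helper <code class="language-plaintext highlighter-rouge">apply_affine</code> is a hypothetical name used only for illustration; it is not part of OpenCV.</p>

```python
def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


# Same matrix layout that translate(img, -100, 100) builds.
trans_mat = [[1, 0, -100], [0, 1, 100]]

# The pixel at (250, 250) moves 100 px left and 100 px down.
print(apply_affine(trans_mat, (250, 250)))  # (150, 350)
```

<p>Negative <code class="language-plaintext highlighter-rouge">x</code> shifts left and positive <code class="language-plaintext highlighter-rouge">y</code> shifts down, because image coordinates start at the top-left corner.</p>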
<h3 id="rotation">Rotation</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">rotate</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">angle</span><span class="p">,</span> <span class="n">rotPoint</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
<span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span> <span class="o">=</span> <span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">[:</span><span class="mi">2</span><span class="p">]</span>
<span class="k">if</span> <span class="n">rotPoint</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
<span class="n">rotPoint</span> <span class="o">=</span> <span class="p">(</span><span class="n">w</span><span class="o">//</span><span class="mi">2</span><span class="p">,</span> <span class="n">h</span><span class="o">//</span><span class="mi">2</span><span class="p">)</span>
<span class="n">rotMat</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">getRotationMatrix2D</span><span class="p">(</span><span class="n">rotPoint</span><span class="p">,</span> <span class="n">angle</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">)</span>
<span class="n">dimensions</span> <span class="o">=</span> <span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">h</span><span class="p">)</span>
<span class="k">return</span> <span class="n">cv</span><span class="p">.</span><span class="n">warpAffine</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">rotMat</span><span class="p">,</span> <span class="n">dimensions</span><span class="p">)</span>
<span class="n">rotated</span> <span class="o">=</span> <span class="n">rotate</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="o">-</span><span class="mi">45</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Rotated'</span><span class="p">,</span> <span class="n">rotated</span><span class="p">)</span>
</code></pre></div></div>
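<p>For intuition, the matrix returned by <code class="language-plaintext highlighter-rouge">cv.getRotationMatrix2D</code> can be rebuilt by hand. The sketch below follows the formula given in the OpenCV documentation; <code class="language-plaintext highlighter-rouge">rotation_matrix</code> and <code class="language-plaintext highlighter-rouge">rotate_point</code> are hypothetical helper names, written in plain Python so it runs without OpenCV.</p>

```python
import math


def rotation_matrix(center, angle_deg, scale=1.0):
    """Build the 2x3 matrix described in the getRotationMatrix2D documentation."""
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def rotate_point(matrix, point):
    """Apply the 2x3 affine matrix to an (x, y) point."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


mat = rotation_matrix((250, 250), 90)
x, y = rotate_point(mat, (260, 250))
# A point 10 px to the right of the centre ends up roughly 10 px above it.
print(round(x, 6), round(y, 6))
```

<p>Positive angles rotate counter-clockwise on screen, so the <code class="language-plaintext highlighter-rouge">-45</code> used above rotates the image clockwise. The rotation point itself always maps back onto itself.</p>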
<h3 id="resizing">Resizing</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">resized</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">resize</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="p">(</span><span class="mi">500</span><span class="p">,</span><span class="mi">500</span><span class="p">),</span> <span class="n">interpolation</span><span class="o">=</span><span class="n">cv</span><span class="p">.</span><span class="n">INTER_NEAREST</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Resized'</span><span class="p">,</span> <span class="n">resized</span><span class="p">)</span>
</code></pre></div></div>
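<p><em>Note: Pay attention to interpolation; <code class="language-plaintext highlighter-rouge">cv.INTER_LINEAR</code> or <code class="language-plaintext highlighter-rouge">cv.INTER_CUBIC</code> generally look best when enlarging an image, while <code class="language-plaintext highlighter-rouge">cv.INTER_AREA</code> is usually preferred when shrinking one.</em></p>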
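<p>As a rough illustration of why the interpolation choice matters, this sketch upscales a single row of pixel values in plain Python. It is a simplified model with hypothetical helper names, and it does not reproduce OpenCV's exact pixel-centre mapping.</p>

```python
def upscale_nearest(row, factor):
    """Repeat each source value: cheap, but produces blocky steps."""
    return [row[i // factor] for i in range(len(row) * factor)]


def upscale_linear(row, factor):
    """Blend the two nearest source values: smoother gradients."""
    out = []
    last = len(row) - 1
    for i in range(len(row) * factor):
        pos = min(i / factor, float(last))  # position in source coordinates
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        out.append(row[left] * (1 - frac) + row[right] * frac)
    return out


row = [0, 100, 200]
print(upscale_nearest(row, 2))  # [0, 0, 100, 100, 200, 200]
print(upscale_linear(row, 2))   # [0.0, 50.0, 100.0, 150.0, 200.0, 200.0]
```

<p>Nearest-neighbour keeps hard steps between values, whereas linear interpolation fills in the in-between values, which is why it tends to look smoother on enlarged images.</p>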
<h3 id="cropping">Cropping</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cropped</span> <span class="o">=</span> <span class="n">img</span><span class="p">[</span><span class="mi">50</span><span class="p">:</span><span class="mi">200</span><span class="p">,</span> <span class="mi">200</span><span class="p">:</span><span class="mi">300</span><span class="p">]</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Cropped'</span><span class="p">,</span> <span class="n">cropped</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="flipping">Flipping</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Flip Vertical'</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">flip</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Flip Horizontal'</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">flip</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Flip Both'</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">flip</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">))</span>
</code></pre></div></div>
<h2 id="contours">Contours</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">cv2</span> <span class="k">as</span> <span class="n">cv</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="n">img</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">imread</span><span class="p">(</span><span class="s">'Photos/profile.jpg'</span><span class="p">)</span>
<span class="n">gray</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">COLOR_BGR2GRAY</span><span class="p">)</span>
<span class="n">ret</span><span class="p">,</span> <span class="n">thres</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">threshold</span><span class="p">(</span><span class="n">gray</span><span class="p">,</span> <span class="mi">125</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">THRESH_BINARY</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Threshold'</span><span class="p">,</span> <span class="n">thres</span><span class="p">)</span>
<span class="c1"># RETR_LIST returns every contour; RETR_EXTERNAL would return only external contours.
# CHAIN_APPROX_NONE keeps every contour point; CHAIN_APPROX_SIMPLE would
# combine points when it makes sense,
# i.e. (0,0), (1,1), (2,2) => (0,0), (2,2)
</span><span class="n">contours</span><span class="p">,</span> <span class="n">hierarchies</span> <span class="o">=</span> <span class="n">cv</span><span class="p">.</span><span class="n">findContours</span><span class="p">(</span><span class="n">thres</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">RETR_LIST</span><span class="p">,</span> <span class="n">cv</span><span class="p">.</span><span class="n">CHAIN_APPROX_NONE</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">'</span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">contours</span><span class="p">)</span><span class="si">}</span><span class="s"> contour(s) found'</span><span class="p">)</span>
<span class="n">blank</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">img</span><span class="p">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'uint8'</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">drawContours</span><span class="p">(</span><span class="n">blank</span><span class="p">,</span> <span class="n">contours</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">255</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="s">'Contours drawn'</span><span class="p">,</span> <span class="n">blank</span><span class="p">)</span>
<span class="n">cv</span><span class="p">.</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
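<p>The point-combining idea behind <code class="language-plaintext highlighter-rouge">CHAIN_APPROX_SIMPLE</code> can be sketched in a few lines of plain Python. This is only an illustration of the concept, not OpenCV's implementation: <code class="language-plaintext highlighter-rouge">compress_straight_runs</code> is a hypothetical helper, and it treats the points as an open path rather than a closed contour.</p>

```python
def compress_straight_runs(points):
    """Drop interior points of runs that keep the same step direction."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        step_in = (cur[0] - prev[0], cur[1] - prev[1])
        step_out = (nxt[0] - cur[0], nxt[1] - cur[1])
        if step_in != step_out:
            kept.append(cur)  # direction changes here, so keep the corner
    kept.append(points[-1])
    return kept


print(compress_straight_runs([(0, 0), (1, 1), (2, 2)]))
# [(0, 0), (2, 2)]
print(compress_straight_runs([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]))
# [(0, 0), (2, 0), (2, 2)]
```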
<p>The hierarchies allow us to detect contours within contours (when using a retrieval mode that records nesting, such as <code class="language-plaintext highlighter-rouge">cv.RETR_TREE</code>), which may be useful for detecting features within objects.</p>HoaniBasic operations in OpenCV with PythonGo Functional Options Pattern2023-04-23T00:00:00+00:002023-04-23T00:00:00+00:00https://hoani.net/guides/software/golang/goOptions<p>The functional options pattern is a popular way to instantiate structs in Go.</p>
<p>It has two parts:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">New(options ...func(*MyStruct)) *MyStruct</code></li>
<li>Options <code class="language-plaintext highlighter-rouge">WithSomething(s Something) func(*MyStruct)</code></li>
</ul>
<p>Option functions return a function which configures your structure. They typically look like:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">WithSomething</span><span class="p">(</span><span class="n">s</span> <span class="n">Something</span><span class="p">)</span> <span class="k">func</span><span class="p">(</span><span class="o">*</span><span class="n">MyStruct</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">func</span><span class="p">(</span><span class="n">m</span> <span class="o">*</span><span class="n">MyStruct</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m</span><span class="o">.</span><span class="n">something</span> <span class="o">=</span> <span class="n">s</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Then the inside of your <code class="language-plaintext highlighter-rouge">New</code> function looks like:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">New</span><span class="p">(</span><span class="n">options</span> <span class="o">...</span><span class="k">func</span><span class="p">(</span><span class="o">*</span><span class="n">MyStruct</span><span class="p">))</span> <span class="o">*</span><span class="n">MyStruct</span> <span class="p">{</span>
<span class="c">// Default config here.</span>
<span class="n">m</span> <span class="o">:=</span> <span class="o">&</span><span class="n">MyStruct</span><span class="p">{}</span>
<span class="c">// ...</span>
<span class="c">// Apply options.</span>
<span class="k">for</span> <span class="n">_</span><span class="p">,</span> <span class="n">option</span> <span class="o">:=</span> <span class="k">range</span> <span class="n">options</span> <span class="p">{</span>
<span class="n">option</span><span class="p">(</span><span class="n">m</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">m</span>
<span class="p">}</span>
</code></pre></div></div>
<h2 id="example-http-client">Example: HTTP Client</h2>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>
<span class="k">import</span> <span class="p">(</span>
<span class="s">"fmt"</span>
<span class="s">"net/http"</span>
<span class="s">"time"</span>
<span class="p">)</span>
<span class="k">type</span> <span class="n">Option</span> <span class="k">func</span><span class="p">(</span><span class="o">*</span><span class="n">http</span><span class="o">.</span><span class="n">Client</span><span class="p">)</span>
<span class="k">func</span> <span class="n">New</span><span class="p">(</span><span class="n">opts</span> <span class="o">...</span><span class="n">Option</span><span class="p">)</span> <span class="o">*</span><span class="n">http</span><span class="o">.</span><span class="n">Client</span> <span class="p">{</span>
<span class="n">c</span> <span class="o">:=</span> <span class="o">&</span><span class="n">http</span><span class="o">.</span><span class="n">Client</span><span class="p">{</span>
<span class="n">Timeout</span><span class="o">:</span> <span class="m">5</span> <span class="o">*</span> <span class="n">time</span><span class="o">.</span><span class="n">Second</span><span class="p">,</span>
<span class="p">}</span>
<span class="k">for</span> <span class="n">_</span><span class="p">,</span> <span class="n">opt</span> <span class="o">:=</span> <span class="k">range</span> <span class="n">opts</span> <span class="p">{</span>
<span class="n">opt</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">c</span>
<span class="p">}</span>
<span class="k">func</span> <span class="n">WithTimeout</span><span class="p">(</span><span class="n">t</span> <span class="n">time</span><span class="o">.</span><span class="n">Duration</span><span class="p">)</span> <span class="n">Option</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">func</span><span class="p">(</span><span class="n">c</span> <span class="o">*</span><span class="n">http</span><span class="o">.</span><span class="n">Client</span><span class="p">)</span> <span class="p">{</span>
<span class="n">c</span><span class="o">.</span><span class="n">Timeout</span> <span class="o">=</span> <span class="n">t</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="n">c</span> <span class="o">:=</span> <span class="n">New</span><span class="p">(</span><span class="n">WithTimeout</span><span class="p">(</span><span class="n">time</span><span class="o">.</span><span class="n">Second</span><span class="p">))</span>
<span class="n">res</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">c</span><span class="o">.</span><span class="n">Head</span><span class="p">(</span><span class="s">"http://hoani.net"</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
<span class="nb">panic</span><span class="p">(</span><span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
<span class="n">fmt</span><span class="o">.</span><span class="n">Printf</span><span class="p">(</span><span class="s">"status %v</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">res</span><span class="o">.</span><span class="n">Status</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>HoaniInstantiating structs with the options pattern.ChatBox - OpenAI driven speaker2023-04-16T00:00:00+00:002023-04-16T00:00:00+00:00https://hoani.net/posts/blog/chatbox<p>After making some <a href="hoani.net/guides/engineering/openai">openai guides</a> and <a href="https://www.youtube.com/playlist?list=PLvej1vrdatmmjorhRG4EyVrLF4Ktjc1Pb">openai video tutorials</a>, I thought it would be really cool to build an AI companion.</p>
<p>The general idea was to experience ChatGPT without jumping on a browser.</p>
<p>This is the finished product.</p>
<figure class="two-thirds">
<img src="/assets/images/posts/blog/chatbox/chatbox.jpg" />
<img />
</figure>
<p>It is made up of:</p>
<ul>
<li>One side of a USB speaker</li>
<li>Raspberry Pi as the brains of the operation</li>
<li>Teensy 4.0 to run the LEDs</li>
<li>LCD1602 RGB LCD screen</li>
<li>A pushbutton to let the chatbox know when you want to talk</li>
<li>A USB microphone (originally I used a webcam)</li>
<li>A bunch of 3D printed parts</li>
</ul>
<h2 id="software">Software</h2>
<p>I wrote the Raspberry Pi code in Go, with a little bit of Python for controlling the LCD screen. You can check out the project on <a href="https://github.com/hoani/chatbox">github.com/hoani/chatbox</a>.</p>
<p>To speed up debugging, I wrote a lot of the code on my Macbook, and used <a href="hoani.net/guides/software/golang/conditional-compilation">conditional compilation</a> to compile a terminal UI to emulate the LEDs and LCD hardware.</p>
<p><img src="/assets/images/posts/blog/chatbox/chatbox-tui.gif" alt="chatbox tui" class="width:" /></p>
<h3 id="audio-visualization">Audio Visualization</h3>
<p>One of the more rewarding parts of writing the software was processing audio streams to visualize them. I ended up writing my own Go audio library, <a href="https://github.com/hoani/toot">github.com/hoani/toot</a>, which provides microphone access and some audio processing. I really like the visualizer example; it responded really well to pure tones and music.</p>
<p><img src="/assets/images/posts/blog/chatbox/toot-visualizer.gif" alt="toot visualizer" /></p>
<p>One challenge I had was binning the frequency powers. A linear range isn’t very useful, because humans have a harder time distinguishing frequencies between 900 and 1000 Hz than between 100 and 200 Hz. Ideally, I would bin them in some logarithmic way, but I had a lot to do in this project and ended up just using a <a href="https://en.wikipedia.org/wiki/Triangular_number">triangular number</a> distribution, which actually worked pretty well.</p>
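<p>One way to read the triangular number approach is that each bin is one FFT bucket wider than the previous one, so the bin edges fall on the triangular numbers 0, 1, 3, 6, 10 and so on. The sketch below assumes that interpretation; <code class="language-plaintext highlighter-rouge">triangular</code> and <code class="language-plaintext highlighter-rouge">binPowers</code> are hypothetical helpers for illustration and are not taken from the toot library.</p>

```go
package main

import "fmt"

// triangular returns the n-th triangular number: 0, 1, 3, 6, 10, ...
func triangular(n int) int { return n * (n + 1) / 2 }

// binPowers sums per-bucket powers into bins whose widths grow by one
// bucket each time: bin k covers buckets [T(k), T(k+1)), where T is the
// triangular number. Buckets beyond the last bin edge are ignored.
func binPowers(powers []float64, bins int) []float64 {
	out := make([]float64, bins)
	for k := 0; k < bins; k++ {
		lo, hi := triangular(k), triangular(k+1)
		for i := lo; i < hi && i < len(powers); i++ {
			out[k] += powers[i]
		}
	}
	return out
}

func main() {
	// Ten buckets of equal power make the bin widths visible in the sums.
	powers := make([]float64, 10)
	for i := range powers {
		powers[i] = 1
	}
	fmt.Println(binPowers(powers, 4)) // prints [1 2 3 4]
}
```

<p>Low frequencies get narrow bins and high frequencies get progressively wider ones, which is a cheap approximation of a logarithmic scale.</p>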
<h3 id="led-driving">LED Driving</h3>
<p>The LEDs are a ring of WS2812, which are commonly branded by Adafruit as NeoPixels. Adafruit provide a library for running these, but I decided that it would be better to have the option of running and powering these off an Arduino instead; that way the Arduino can act as a serial slave to whatever machine wants to control the pixels.</p>
<p>This resulted in the <a href="https://github.com/hoani/serial-pixel">github.com/hoani/serial-pixel</a> library, which is probably the tidiest and most useful Arduino project I have ever built.</p>
<p><img src="/assets/images/posts/blog/chatbox/leds.gif" alt="serial-pixel LEDs" /></p>
<h3 id="other-considerations">Other Considerations</h3>
<p>There was a lot involved in this project. Some highlights were:</p>
<ul>
<li>Choosing to just call Python libraries from Go using <code class="language-plaintext highlighter-rouge">exec.Command("python3", &lt;library&gt;)</code> - this saved a lot of extra development.</li>
<li>Learning to use <code class="language-plaintext highlighter-rouge">systemctl --user</code> - it turns out that with <code class="language-plaintext highlighter-rouge">portaudio</code> the root user doesn’t have access to the same audio hardware as the regular user, so this was necessary to have the program run under <code class="language-plaintext highlighter-rouge">systemd</code> when the Raspberry Pi boots up.</li>
<li>It turns out backing up SD cards on the Raspberry Pi isn’t too hard - <a href="https://all3dp.com/2/back-up-raspberry-pi-sd-card/">Step 1 of this guide</a> is really easy.</li>
</ul>
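<p>For the first highlight, calling a Python script from Go can look roughly like the sketch below. It uses the standard <code class="language-plaintext highlighter-rouge">os/exec</code> package; the script name <code class="language-plaintext highlighter-rouge">lcd.py</code>, its <code class="language-plaintext highlighter-rouge">--text</code> flag and the <code class="language-plaintext highlighter-rouge">pythonCmd</code> helper are made up for the example and are not the chatbox project's real files.</p>

```go
package main

import (
	"fmt"
	"os/exec"
)

// pythonCmd builds (but does not run) a command that executes a Python
// script with python3. The script path and arguments are hypothetical.
func pythonCmd(script string, args ...string) *exec.Cmd {
	return exec.Command("python3", append([]string{script}, args...)...)
}

func main() {
	cmd := pythonCmd("lcd.py", "--text", "hello")

	// CombinedOutput runs the command and captures stdout and stderr,
	// which is handy for surfacing Python tracebacks in the Go logs.
	out, err := cmd.CombinedOutput()
	if err != nil {
		fmt.Println("python call failed:", err)
		return
	}
	fmt.Printf("%s", out)
}
```

<p>The trade-off is a new process per call, which is fine for occasional work such as updating a screen but would be a poor fit for anything called many times per second.</p>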
<h2 id="openai-chat">OpenAI Chat</h2>
<p>The design of this project was:</p>
<ul>
<li>Use a microphone + OpenAI <code class="language-plaintext highlighter-rouge">voice</code> transcription to convert voice to text for user input</li>
<li>Use an OpenAI <code class="language-plaintext highlighter-rouge">chat</code> completions session to generate responses</li>
<li>Use <code class="language-plaintext highlighter-rouge">espeak</code> to convert the <code class="language-plaintext highlighter-rouge">chat</code> responses to audio</li>
</ul>
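<p>The three steps above can be sketched as one conversation turn. Everything here is illustrative: the <code class="language-plaintext highlighter-rouge">Transcriber</code>, <code class="language-plaintext highlighter-rouge">Chatter</code> and <code class="language-plaintext highlighter-rouge">Speaker</code> types and the <code class="language-plaintext highlighter-rouge">converse</code> function are hypothetical names, and the OpenAI calls are replaced with stubs rather than guessed at. The only external tool used is <code class="language-plaintext highlighter-rouge">espeak</code>, which speaks the text passed as its argument.</p>

```go
package main

import (
	"fmt"
	"os/exec"
)

// Hypothetical stage types: each real implementation would wrap a service.
type Transcriber func(audio []byte) (string, error)
type Chatter func(history []string, prompt string) (string, error)
type Speaker func(text string) error

// espeakSpeaker shells out to the espeak CLI to speak the reply aloud.
// It is not called in main so the sketch runs without espeak installed.
func espeakSpeaker(text string) error {
	return exec.Command("espeak", text).Run()
}

// converse runs one turn: audio -> text -> chat reply -> speech.
// It returns the history extended with the prompt and the reply.
func converse(audio []byte, history []string, t Transcriber, c Chatter, s Speaker) ([]string, error) {
	prompt, err := t(audio)
	if err != nil {
		return history, fmt.Errorf("transcribe: %w", err)
	}
	reply, err := c(history, prompt)
	if err != nil {
		return history, fmt.Errorf("chat: %w", err)
	}
	if err := s(reply); err != nil {
		return history, fmt.Errorf("speak: %w", err)
	}
	return append(history, prompt, reply), nil
}

func main() {
	// Stubs stand in for the transcription and chat services.
	transcribe := func([]byte) (string, error) { return "hello", nil }
	chat := func(_ []string, p string) (string, error) { return "you said: " + p, nil }
	speak := func(text string) error { fmt.Println("speaking:", text); return nil }

	history, err := converse(nil, nil, transcribe, chat, speak)
	if err != nil {
		fmt.Println("turn failed:", err)
		return
	}
	fmt.Println(history)
}
```

<p>Keeping each stage behind a small function type makes it easy to swap the stubbed speaker for <code class="language-plaintext highlighter-rouge">espeakSpeaker</code> on the device while testing the flow on a laptop.</p>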
<p>I had a lot of ideas coming into this project.</p>
<p>One of them was to let the AI <code class="language-plaintext highlighter-rouge">chat</code> model decide what voice it was going to respond with. It turns out that this is a terrible idea. Because the GPT model generates text without knowing what’s coming next, it can’t predict the tone or voice it should be using, so it actually made the experience of conversing with the model worse.</p>
<p>I also considered letting the AI choose LED colors, but this was equally disappointing for the same reasons.</p>
<p>If I were to enhance this project further, I would use a separate AI model to determine the tone, pitch or voice for chat responses. The only problem is that this adds extra latency to the chat.</p>HoaniChatGPT in a speakerConvert Video To Gif2023-04-16T00:00:00+00:002023-04-16T00:00:00+00:00https://hoani.net/guides/software/videoToGif<p>To convert video to GIF, install <code class="language-plaintext highlighter-rouge">ffmpeg</code> and <code class="language-plaintext highlighter-rouge">gifsicle</code>, then run:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ffmpeg -i input.mp4 -s 640x360 -pix_fmt rgb24 -r 20 -f gif - | gifsicle --optimize=3 --delay=3 > output.gif
</code></pre></div></div>
<p>Make sure the size parameter has the same aspect ratio as your original file, or the output will look squashed.</p>
<p>Example output:
<img src="/assets/images/posts/blog/chatbox/leds.gif" alt="serial-pixel LEDs" /></p>HoaniEasy trick to convert videos to gif