What is a Kernel in Image Processing? (ft. matrix arithmetic)

In this post I’ll be covering what a kernel is with regard to image processing, along with some basic matrix arithmetic to lay a foundation for future image processing posts.

I want to start this off by saying I am by no means an image processing pro. To be totally honest, I just thought the topic was interesting, and I’ve done a small bit of work with image processing in an app we’ve worked on. So I decided to do a topical study on it to benefit both me and the project, and since I need more topics to post about, why not do some blog posts on image processing? That said, if I get anything wrong here, please feel free to reach out and let me know and I will correct it.

Matrices (not as scary as it sounds)

Do you understand what an array is? Okay cool, you know what a matrix is…alright, I probably shouldn’t be that lazy about explaining it. A matrix is a mathematical construct that holds a collection of numbers organized into rows and columns…so…a two-dimensional array. People tend to complicate this topic way more than they should. In fact, I’ll show you some matrix math to prove a matrix isn’t that big of a deal (okay fine, they can get pretty complex…but not at the base level):

Matrix arithmetic

Assume the matrix below:
[1, 2, 3]
[4, 5, 6]

Now assume this matrix:
[7, 8, 9]
[10, 11, 12]

If I want to add these matrices together, literally all I do is take each individual cell in the matrix, and add it to the corresponding cell in the opposing matrix, so that would look like this:
[1+7, 2+8, 3+9]
[4+10, 5+11, 6+12]

Which adds up to a resulting matrix of:
[8, 10, 12]
[14, 16, 18]
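
If you’d rather see that as code, here’s a minimal Python sketch of the same addition. The matrices are stored as plain lists of rows, and add_matrices is just a name I made up for illustration, not something from a library:

```python
def add_matrices(a, b):
    """Add two matrices cell by cell. Both must have the same shape."""
    same_rows = len(a) == len(b)
    same_cols = all(len(row_a) == len(row_b) for row_a, row_b in zip(a, b))
    if not (same_rows and same_cols):
        raise ValueError("matrices must have the same shape")
    # Pair up the rows, then pair up the cells inside each row.
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


first = [[1, 2, 3],
         [4, 5, 6]]
second = [[7, 8, 9],
          [10, 11, 12]]

total = add_matrices(first, second)
print(total)  # [[8, 10, 12], [14, 16, 18]]
```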

So if that’s how you add them, then you can probably figure out how to subtract them, right? If you said all you have to do is subtract each cell of the second matrix from its corresponding cell in the first, you’d be spot on. I’m not even going to waste the time to…oh fine, I’ll show an example…but I’m going to make it simple.
[3, 3, 3] – [2, 2, 2] = [3-2, 3-2, 3-2] = [1, 1, 1]
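
Since subtraction follows the same cell-by-cell pattern, one way to sketch it in Python is a small helper that takes the operation as an argument. The elementwise helper is a hypothetical name of my own; operator.sub comes from Python’s standard library:

```python
import operator


def elementwise(op, a, b):
    """Apply op to each pair of corresponding cells in two same-shaped matrices."""
    return [[op(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


difference = elementwise(operator.sub, [[3, 3, 3]], [[2, 2, 2]])
print(difference)  # [[1, 1, 1]]
```

Swapping operator.sub for operator.add gives you addition from the same helper.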

Multiplication can be a little trickier, but not in all cases, let me show you what I mean:
If I want to multiply an entire matrix by a single number, that’s dead simple:

2 * [2, 4, 12] = [2*2, 2*4, 2*12] = [4, 8, 24]

In this case, 2 is what’s known as a scalar: a single number that scales every cell of the matrix by the same amount. Here, multiplying by 2 doubles every value. Pretty simple.
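
Here’s a rough Python version of scalar multiplication, again treating a matrix as a list of rows. The function name scale is my own choice for this sketch:

```python
def scale(scalar, matrix):
    """Multiply every cell of the matrix by the same number."""
    return [[scalar * cell for cell in row] for row in matrix]


scaled = scale(2, [[2, 4, 12]])
print(scaled)  # [[4, 8, 24]]
```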

Multiplying a matrix by another matrix gets slightly trickier. Let’s get a few rules out of the way before I explain. First off, unlike traditional arithmetic, where multiplication is commutative, matrix multiplication is not, so in general a*b != b*a. Some pairs of matrices do give the same result in either order, but that’s the exception rather than the rule. Also, when you multiply two matrices together, the number of columns in the first matrix must equal the number of rows in the second, so for example:

[1, 2, 3] * [1, 2, 3, 4]
            [3, 4, 5, 6]
            [2, 4, 6, 1]

Is perfectly valid, however:

[1, 2, 3] * [1, 2, 3, 4]

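Is not valid, because the first matrix has three columns but the second has only one row. Lastly, each cell in the result of multiplying two matrices is the dot product of a row from the first matrix and a column from the second.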
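
The columns-must-match-rows rule is easy to check in code. This is a small sketch with helper names (shape, can_multiply) that I invented for the example:

```python
def shape(matrix):
    """Return (rows, columns) for a matrix stored as a list of rows."""
    return (len(matrix), len(matrix[0]))


def can_multiply(a, b):
    """a * b is only defined when a's column count equals b's row count."""
    return shape(a)[1] == shape(b)[0]


row = [[1, 2, 3]]                    # 1 row, 3 columns
three_by_four = [[1, 2, 3, 4],
                 [3, 4, 5, 6],
                 [2, 4, 6, 1]]       # 3 rows, 4 columns
one_by_four = [[1, 2, 3, 4]]         # 1 row, 4 columns

print(can_multiply(row, three_by_four))  # True
print(can_multiply(row, one_by_four))    # False
```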

Getting the dot product

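To multiply two valid matrices together, we pair each row of the first matrix with each column of the second, in the form of (col1 * r1) + (col2 * r2) + (col3 * r3), where col1 is the entry in the first column of that row and r1 is the entry in the first row of that column. Let me illustrate this so it makes more sense.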

[1, 2, 3] * [3, 4, 5, 6] = [(1 * 3) + (2 * 4) + (3 * 3), ...]
            [4, 5, 6, 7]   
            [3, 2, 5, 7]   
Results in:
[(3 + 8 + 9), (4 + 10 + 6), (5 + 12 + 15), (6 + 14 + 21)]
Which leaves you with a result of:
[20, 20, 32, 41]

For each row of the first matrix, we multiply its first entry by the first entry of a column in the second matrix, its second entry by the second entry of that column, and so on, then add the products together. That sum is called the dot product of the row and the column, and the resulting matrix is made up of those dot products, one for each row/column pair. The first matrix can have as many rows as you like, as long as its number of columns matches the number of rows in the second matrix; once you have the dot products for the first row, you just shift down to the next row of the first matrix and repeat…whew, talk about a mouthful.
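
To make that a bit more concrete, here’s a minimal Python sketch of matrix multiplication built out of dot products. The names dot and matmul are my own for this example (real projects would normally reach for a library instead), and the last few lines show that swapping the order changes the answer:

```python
def dot(row, column):
    """Multiply matching entries together and add up the products."""
    return sum(x * y for x, y in zip(row, column))


def matmul(a, b):
    """Multiply matrix a by matrix b, both stored as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first must equal rows of the second")
    columns_of_b = list(zip(*b))  # turn b's columns into tuples
    # One dot product per (row of a, column of b) pair.
    return [[dot(row, column) for column in columns_of_b] for row in a]


a = [[1, 2, 3]]
b = [[3, 4, 5, 6],
     [4, 5, 6, 7],
     [3, 2, 5, 7]]
product = matmul(a, b)
print(product)  # [[20, 20, 32, 41]]

# Order matters: these two square matrices give different results.
p = [[1, 2], [3, 4]]
q = [[0, 1], [1, 0]]
print(matmul(p, q))  # [[2, 1], [4, 3]]
print(matmul(q, p))  # [[3, 4], [1, 2]]
```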

So if that didn’t all make total sense, no worries, either I suck at explaining it, in which case go find some other explanations, or you just need to read it again. For now though, it’s not so important that we have all of that stuff down, as we’re just discussing concepts here.

What’s a kernel?

Oh, right…forgot the whole point of the post. A kernel, as it pertains to image processing, is simply a small matrix. There are different kinds of kernels which you can apply across an image to achieve certain effects. For example, the identity kernel, which looks like this:

[0, 0, 0]
[0, 1, 0]
[0, 0, 0]

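Is essentially a do-nothing kernel: the only non-zero weight is the 1 in the center, which lines up with the pixel being processed, so every pixel keeps its original value and you get the original image back.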
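
I haven’t covered how a kernel actually gets applied yet, so treat this as a rough sketch under a few assumptions of mine: the image is a grayscale grid of numbers, the kernel is centered on each pixel in turn, each kernel weight is multiplied by the pixel underneath it and the products are summed, and anything hanging off the edge of the image counts as zero. Strictly speaking, a true convolution also flips the kernel first, which makes no difference for a symmetric kernel like this one. The name apply_kernel is made up for the example:

```python
def apply_kernel(image, kernel):
    """Slide the kernel over every pixel and sum the weighted neighbors."""
    height, width = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    row_offset, col_offset = k_rows // 2, k_cols // 2

    output = []
    for r in range(height):
        out_row = []
        for c in range(width):
            total = 0
            for kr in range(k_rows):
                for kc in range(k_cols):
                    ir = r + kr - row_offset
                    ic = c + kc - col_offset
                    # Assumption: pixels outside the image contribute zero.
                    if 0 <= ir < height and 0 <= ic < width:
                        total += image[ir][ic] * kernel[kr][kc]
            out_row.append(total)
        output.append(out_row)
    return output


identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]

result = apply_kernel(image, identity)
print(result == image)  # True
```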

Conclusion

That’s all I’ve got for you. Kernels are really simple, but totally essential to understanding image processing. In future posts, I’ll be covering more image processing topics as I learn them…I’d rather not blog about things I know nothing about, and I assume you all appreciate that as well. Thanks for reading, and until next time.
