1 - Images as Functions Intro¶
All right, welcome to computer vision. Hopefully, from the introduction, you feel inspired to charge through here. We’re going to start off easy and, frankly, we’ll get kind of hard towards the end, which is okay, because by the time you get there you’ll be ready for it. So today, we’re going to talk about images. Most of you think about images as something you see on a screen or, if you’re old school like me, something that’s actually printed in a magazine. You know, it’s something you can look at.
2 - Images as Functions Part 1¶
So here’s an image of an old and, I think, now expired comedian who therefore cannot sue me. That’s Phyllis Diller, by the way, in case you remember her. And by the way, we’re going to start with black and white, because black and white just makes everything easier: it’s just a single channel. We’ll do color on and off, but pretty much everything we do for black and white carries over to color, and black and white is just easier. So when I show this to you, you actually think of it as a picture or something to look at. But what it actually is, is a function. In fact, we can just call it a function I of x, y, where the I has something to do with the image intensity. So, if I think of this as a function, then I can just plot it as a surface, and MATLAB makes this incredibly easy. And if I did, it would look something like this. Okay? Now this is the exact same function, but instead of showing it to you as a picture, sort of straight on, it’s shown as a surface. And by the way, the way MATLAB does it is really cool: the higher the surface is, the brighter it makes it, so you can see. So if you take a look at the checkered pattern on that awful shirt she was wearing, the bright spots are up here, and the dark spots are down there. Okay, that function is the same function as the image that I was showing you before. Computer vision, and especially image processing (we’ll be talking mostly about the image processing side of computer vision today and in the next few lessons), is about taking these functions and computing something from them. Often, we’re just going to compute another image-like function, so images in, images out. And sometimes, we’ll be getting some sort of information out. So here’s a very simple example. Suppose I took that previous function, and I just smoothed it.
All right, so now you see I have the same surface I had before, but now it blends more smoothly, and the peaks and the valleys of that shirt are much smoother. They’re not as steep as they were before. Okay. So that’s the function. Now, of course, I can show that to you as an image again. What’s that going to look like? Well, you’ve probably figured this out because you’re all so smart. It’s just going to be a blurry version of that image. And I’m showing it here side by side with the blurred function, oh, sorry, the smoothed function. There is a direct analogy between what we call blurring in the image and smoothing of that function. It’s exactly the same thing.
3 - Image Quiz¶
So here’s a quiz, all right? So an image can be thought of as A, a two-dimensional array of numbers ranging from some minimum to some maximum. B, a function I of x and y. C, something generated by a camera. And D, all of the above.
4 - Image Quiz Solution¶
Okay. So probably pretty obvious, it’s all of the above. Now, I’m going to tell you, most Georgia Tech tests are a little bit harder than that, okay? But it’s just to get you to pay attention.
5 - Images as Functions Part 2¶
So, let’s talk a little bit more about images as functions. We can think of an image as a function, f (sometimes we’ll say f, sometimes we’ll say I), that maps from R squared to R. That is, it goes from a position x, y to some intensity or value at that position. But we’re not going to have arbitrary functions; we’re going to limit them in certain ways. For us, an image is going to be defined over some bounded region. So x ranges from a to b, and y ranges from c to d. And the intensity ranges from some min to some max. Now, raise your hands out there if you think that image values go from zero to 255. Come on, be honest, you’re raising your hands, okay. That is a pure accident. Zero to one probably would have made a lot more sense: zero would be black and one would be white. Where did the 255 come from? Well, you computer nerds all know. There are 8 bits in a byte, right? So if they’re all on, they’re all ones, we’ll call that 255. There is nothing special about 255. And in fact, later we’re even going to have images that can have negative values in them. So the thing that you just have to remember is that we’re going to allow our images to go from some min to some max. The min will be sort of the blackest black, and the max will be the whitest white, if we’re actually thinking of them as intensities. But later, if we think of them as just pure functions, like image derivatives, they’re just going to take on some real value. By the way, we can do color images the same way. Now, instead of having one function that maps x, y to an intensity, we just have three functions, often called r, g, and b, sometimes called l, u, and v. We’re going to talk about that way later. But basically, you can think of this as what we call a vector-valued function, so at every pixel, the value of the function is a vector of three numbers. Like I said, most of the time we’ll be sticking to gray level images.
6 - Define an Image as a Function Quiz¶
Now how is this a function? Well, you can think of an image as a collection of light intensities at different locations. For instance, this area on the shirt is pretty bright, this shadow here is dark, and this grass is somewhere in between. Now how would you identify these different locations? Note that these locations are laid out in a two-dimensional space. We can characterize it using, say, an x-axis, which is the horizontal dimension, and a y-axis, which is the vertical dimension. Any location on the image can thus be specified using an x-axis value and a y-axis value. Notice that this x-axis value can be any real number. Similarly for y. Now the image intensity at the position x, y can be written as f of x, y. And that’s how you can think of the image as a function. Now what about the intensity values in the image? If we assume that they are real numbers, then we can say that f is a mapping from R cross R, or R squared, to R. Now this definition seems to indicate that images are infinitely large, but practically speaking, images have a finite size: they have a certain width and a certain height. Assuming that our coordinate origin is here, we can define numerical bounds for the image. For instance, 10 to 210 on the x-axis and, say, 15 to 165 along the y-axis. We also know that image intensity values have a finite range. Say in this image, they range from zero through ten. Given these finite ranges, how would you define this image as a function? Complete the function definition above by filling in the boxes. Note that the first pair of boxes is the range of columns, or x values. The second pair of boxes represents the range of rows, or y values. And finally, the third pair of boxes represents the image intensity range.
7 - Define an Image as a Function Quiz Solution¶
The answer is fairly straightforward. X values can range from 10 through 210. Y values can range from 15 through 165. And we define the image intensity values to be in the range 0 through 10.
8 - Define a Color Image as a Function Quiz¶
So a color image is also a function. It’s just that the value at each location is no longer a single number that represents the light intensity. Instead, it can be a vector that holds three different intensities for three color components. For instance, each location in an RGB image has values for red, green, and blue. These values can be separated into three different channels or planes. If you take all of the red intensity values in an image, you have your red plane. Similarly, if you take all the green intensity values, you get your green plane, and likewise for blue. Given this scheme, where each location is associated with a tuple of three values, and assuming that each of these intensity values is a real number, as are the values for x and y, how would you represent this image as a functional mapping? Mark the expressions that correctly represent the desired functional mapping.
9 - Define a Color Image as a Function Quiz Solution¶
Let’s first look at the range of the function. Since each location has a tuple of three values, the right-hand side of the mapping should have R cross R cross R, or R cubed. The first expression is therefore clearly wrong. Note that a color image is still a mapping over a domain of two-dimensional x and y values. Hence the third expression is also incorrect. Therefore, we can represent an RGB image as a mapping from R cross R to R cross R cross R. This is the same as writing R cross R maps to R cubed. In fact, you could also shorten it to R squared to R cubed.
10 - The Real Phyllis¶
So far we’ve talked about functions from a mathematical perspective, but this is a computer, and in a computer everything has to be digital, and that gives us some more restrictions. So let’s take a look at this grid, which is a little chunk taken out of the Phyllis Diller picture. Here you see, in fact, I had a little MATLAB code. It says pd, pd for Phyllis Diller. That was my array. And this says rows 40 to 60, columns 30 to 40. And that’s the middle of Phyllis’s face. You might not have known it, but that’s the middle of Phyllis’s face. By the way, one of the things you should realize is that this is exactly the same representation as the picture on the screen. But you happen to have a vision system that will look at bright dots and dark dots and see things. And when you look at these numbers, you don’t immediately see those things. Mathematically, these are identical. Oh, and by the way, something that I just mentioned is going to bite us at some point. These are rows and these are columns. Rows go down and columns go over, as opposed to x and y. Okay, remember that, because we’re going to get to it now.
11 - Digital Images¶
In digital images in computer vision, we typically operate on discrete images, and that means we have to do two types of discretization. First of all, we have to sample the 2D space on a regular grid. That is, we have discrete pixels at specific locations. You know what pixels stand for: picture elements. In the old television world those were called pels, also for picture elements, but for computers we have to be special, so we call them pixels. The other thing is we have to quantize each value. We don’t get to have a continuous real value; we have some finite number of bits to represent it. So, like we said, maybe we have 8 bits, so it will go from zero to 255. These days, you tend to have 16-bit images, or 12-bit, depending upon the device, but the idea is that it’s quantized to some level. Even though it’s quantized, later we’re going to tend to think of these things as floating point, and I’ll tell you now that if you compute with integer images, like unsigned 8-bit integers, your code will just break. So, use floating point images. In general, especially in MATLAB, which we’ll be doing a lot of, images are represented as a matrix of values, typically integer values to start with [INAUDIBLE]. So here’s Phyllis, looking as delightful as ever. We index our matrices by, again, i and j, row and column, and sometimes by x going over this way and y going that way. If I say some pixel i, j, it means row i, column j. If I say some pixel at x, y, x is horizontal, so I have to go get the column for that, so you’ll have to swap them. Part of the problem is that our math is always determined by x and y, and our computing is always determined by i and j, row and column, and that’s a tension that we’ll have. Sometimes we use 1D signals; 1D signals will just be an array of numbers as well. All right.
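The row/column versus x/y swap, and the advice to compute in floating point, can be sketched in Octave. This is only an illustration: the tiny matrix and the variable names (`img`, `row`, `col`, `img_d`) are made up for the example and are not from the lecture.

```octave
% A tiny 3x4 "image": 3 rows (height) by 4 columns (width). Hypothetical values.
img = uint8([10  20  30  40;
             50  60  70  80;
             90 100 110 120]);

% Matrix convention: img(row, col), rows go down, columns go over.
row = 2; col = 3;
v_rc = img(row, col);      % row 2, column 3

% Math convention: (x, y) with x horizontal, so x selects the column
% and y selects the row. Note the swap when indexing.
x = 3; y = 2;
v_xy = img(y, x);          % same pixel as above

% Convert to floating point before doing arithmetic on pixel values.
img_d = double(img);
```

Both lookups land on the same pixel; only the order in which the two coordinates are written differs.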
12 - Compute Image Size Quiz¶
Let’s take a moment to talk about image size. Here is an image defined over a two-dimensional space. Since it is a finite image, it must have certain bounds. Let’s say the limits along the x-axis are ten and 330. And similarly, along the y-axis, 20 to 278. Can you tell me what is the width and height of the image? How about the area that it occupies? Type in your answers in the boxes provided.
13 - Compute Image Size Quiz Solution¶
That’s right. The width is the difference between the two x-axis limits, 330 minus 10 equals 320. Similarly, the height is 258. And the area is simply width times height. And that comes out to 82,560. For digital images, the height is the same as the number of rows. And width is the number of columns. Thus, the area is the total number of picture elements, or pixels. Now, if each pixel has three color values for red, green and blue, a color image, specifically an RGB image, has three different values at each pixel. This means a color image of this size has 82,560 times 3 total color values. If each color value is represented by one byte, then you need as many bytes to represent the entire color image. This should give you a sense of how much memory you need to store an image on a computer and how it depends on the width, height and number of color channels.
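The arithmetic in this solution can be written out as a few lines of Octave. The variable names are chosen for the example only.

```octave
% Bounds from the quiz.
x_min = 10;  x_max = 330;
y_min = 20;  y_max = 278;

width  = x_max - x_min;        % 320 columns
height = y_max - y_min;        % 258 rows
area   = width * height;       % 82560 pixels

% An RGB image stores three values per pixel; at one byte per value,
% the number of values is also the number of bytes.
channels  = 3;
bytes_rgb = area * channels;   % 247680 bytes
```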
14 - Matlab Images are Matrices¶
Most of you, I hope, will use either MATLAB or Octave for the work in this class. We talked about how you can also use Python and OpenCV, etcetera. MATLAB, or the open source version of it, Octave, makes it easy. If you’re actually a student somewhere, you know, really registered somewhere, there is a student edition of MATLAB. It costs less than most textbooks and is a great thing for you to purchase. So, in MATLAB, images and matrices just work really well. Here’s all it takes to read an image. We’ve got this function, imread, and we’re going to read in a file, peppers.png. And by the way, if you don’t put these semicolons in there, MATLAB spits out all the numbers, which is incredibly painful. Then I’m going to take just the green channel. The green channel can be indexed because, in MATLAB, images have a certain number of rows, a certain number of columns, and then what you can think of as the color planes or the layers. So when I say im of colon, that means all the rows; comma, colon, all the columns; and then 2. That’s red, green, blue, so that’s green. Now some of you are screaming, no, Professor Bobick, you messed up. I will certainly mess up in this class, although I will mess up less frequently in this class than in my in-class class. Why? Because Megan hates to see me mess up. That’s actually not really true, she likes to put it in there, but anyway. No, unfortunately, I did not mess up. Or fortunately? I don’t know. MATLAB indexing starts at one. In sort of normal computing, zero would be red, one would be green, two would be blue. In MATLAB, indexing of arrays starts with one. So one is red, two is green, three is blue. That means that we have to subtract one off of indices all the time in order to be able to multiply them to get into other locations. The computer scientists out there will know exactly what I’m talking about. Just remember that MATLAB uses one-based indexing.
So we’ve got our green channel, which is just a single channel. Well, I can just show that by calling imshow, which we’ll talk about in a minute. Green has just got a single layer, so MATLAB thinks of it as a grayscale image, and imshow in this case would just display it. And then you’ll notice, MATLAB makes it really easy to plot things also. So it says draw a line, and, can you see that red line in there? It just drew a line right across there. Ain’t MATLAB great? In fact, I can also just call plot. This says, give me the 256th row, give me all the columns, and then plot it, and you’ll just get a plot. By the way, I highly recommend getting used to exploring your image: plotting the values and being able to check whether what’s going on is what really should be going on.
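Here is a minimal sketch of the one-based channel indexing described above. To keep it runnable without an image file, it builds a small synthetic RGB array instead of calling `imread('peppers.png')`; the array contents and the names `im_green` and `row_vals` are assumptions for illustration.

```octave
% Synthetic 4x5 RGB image: rows x columns x color planes.
im = zeros(4, 5, 3, 'uint8');
im(:, :, 1) = 200;              % plane 1 is red
im(:, :, 2) = 100;              % plane 2 is green
im(:, :, 3) = 50;               % plane 3 is blue

% All rows, all columns, second plane: the green channel.
im_green = im(:, :, 2);

% One row of the green channel, all columns.
row_vals = im_green(3, :);

% With a display available, imshow(im_green) would show the channel as a
% grayscale image and plot(row_vals) would draw the intensity profile.
```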
15 - Quantize Quiz¶
Since a digital image is sampled at discrete locations in space, it can be written down as a two-dimensional array or matrix of values. Here is an example. Note that this matrix has fractional values, both positive as well as negative. But what if we could only represent a small set of integer values, say between 0 and 5? How would you quantize this matrix so that the result consists of only the integers 0 through 5? Enter the converted values in the corresponding boxes. Assume that we always round down. For example, 1.8 becomes 1. Also, be careful about the limits. Anything less than the lower limit 0, should be converted to 0, and anything greater than the upper limit 5, should be converted to 5.
16 - Quantize Quiz Solution¶
To get the quantized matrix, convert each number, always rounding down. Integer values within the given range remain the same. Anything above the upper limit becomes 5. Similarly, anything that is less than 0, in this case -1.3 becomes 0, and so on. Note how quantization results in a loss of detail. Extreme values beyond the range are lost as well.
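The rule in this quiz (round down, then clamp to the limits 0 and 5) can be sketched in Octave as follows. The matrix `m` below is a made-up example, not the matrix shown in the quiz.

```octave
% Hypothetical matrix with fractional, negative, and out-of-range values.
m = [ 1.8  -1.3   6.2;
      3.0   4.9   0.5];

lo = 0;                          % lower limit
hi = 5;                          % upper limit

% floor rounds down; max/min then clamp to the range [lo, hi].
q = min(max(floor(m), lo), hi);
% q is [1 0 5; 3 4 0]
```

Note that -1.3 floors to -2 before being clamped to 0, and 6.2 floors to 6 before being clamped to 5, so the out-of-range detail is lost either way.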
17 - Load and Display an Image¶
All right, let’s try to load and display an image. To load an image, we want to use the imread command. The image will be stored in this variable img. To display it, we want to use the imshow command. Hit Test Run and scroll down to view the output. A bottlenose dolphin surfing the waves. All right, what more can we find out about this image? What if we wanted to find out the size of the image? You guessed it, we’ll use the size function. That returns the size of the image. To display it, we’ll use the disp function. We can also find out the class or data type for the image. Go ahead and type out these commands, run the program, and note down the results.
18 - Image Size and Data Type Quiz¶
So what was the size of the image? Note that Octave prints out the height first and then the width. What was the class or data type of the image? Type in your answers.
19 - Image Size and Data Type Quiz Solution¶
All right, let’s find out the size and class. Octave prints out the height of the image first, and then the width. The height is 320 and the width is 500. On the next line we see that the class of the image is uint8. If you typed these values in correctly, good job. So the height and width turned out to be 320 and 500 respectively, and the class was uint8. Now what does uint8 mean? Some of you may know this already. The u stands for unsigned, which means this data type cannot represent negative numbers. Int stands for integer, and eight refers to eight bits, or one byte. This is sometimes known as the bit depth. It indicates the number of bits allocated to store each intensity value.
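A short sketch of the `size` and `class` calls discussed here. Since the dolphin file is not available in this text, a blank `uint8` array of the same dimensions stands in for it; that stand-in is an assumption of the example.

```octave
% Stand-in for img = imread(...): a 320x500 uint8 image of zeros.
img = zeros(320, 500, 'uint8');

sz = size(img);        % [height width], rows first
disp(sz);

cls = class(img);      % data type of the pixel values
disp(cls);

% The limits of the unsigned 8-bit type.
lowest  = intmin('uint8');   % 0
highest = intmax('uint8');   % 255
```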
20 - Inspect Image Values¶
Let’s look at some values from our dolphin image. As before, we load the image with imread, and let’s also display the image with imshow. How about we print out the size as well? And let’s run this to make sure everything’s okay. All right. The dolphin’s still there, and the size is still 320 by 500. Let’s say we want to find out the image value at a particular location. We specify this location with a row and column number. We learned earlier that an image is a function over two dimensions. Octave uses a notation similar to a function call to access values at a particular location. Let’s say we want to find out the value at row 50, column 100. We write img, that’s our image variable, followed by the row and column coordinates in parentheses. And let’s display this value. All right, let’s see what we have. So the value at 50, 100 is 208. Similarly, we can find out the values for an entire row. We’ll use a similar notation as before, but we’ll do something different for the column. Putting a colon tells Octave to return values for all columns. Let’s see what the output is. As you can see, Octave dumps values from the entire row, all 500 columns. Obviously, this isn’t a useful way to look at values from an entire row. What else can we do? We can plot these values. This makes more sense, doesn’t it? You can clearly see the relatively higher values where the white wave is, and then the other values are comparatively lower. Can you find out the values from this three by three slice of the image? Rows 101 through 103, columns 201 through 203.
21 - Inspect Image Values Quiz¶
Fill in the values in the corresponding boxes.
22 - Inspect Image Values Quiz Solution¶
All right, to select the desired slice, we specify a range of rows and a range of columns. In this case, the range of rows is 101 through 103, and the range of columns is 201 through 203. And let’s display these values. So the numbers are 81, 77, 77, 81, 78, and so on.
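The three kinds of indexing used in this demo (a single pixel, a whole row, and a rectangular slice) can be sketched as below. The image here is synthetic, so the values differ from the dolphin image's 208 and 81, 77, ...; the pattern `mod(row + col, 256)` is just an assumption to give every pixel a predictable value.

```octave
% Synthetic 320x500 image whose value at (row, col) is mod(row + col, 256).
[cols, rows] = meshgrid(1:500, 1:320);
img = uint8(mod(rows + cols, 256));

% A single value: row 50, column 100.
v = img(50, 100);                 % mod(150, 256) = 150

% An entire row: the colon means "all columns".
row50 = img(50, :);               % 1x500 vector

% A 3x3 slice: a range of rows and a range of columns.
slice = img(101:103, 201:203);
disp(slice);
```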
23 - Crop an Image¶
Let’s use this method of selecting a range of rows and columns to extract a larger portion from an image. By the way, this is also known as cropping an image. Let’s use a different picture this time. Let’s see what it looks like. All right, a classic two-wheeler there. Let’s check the size of the image first. All right, 320 by 500. When cropping this image, we’d want our limits to be within this range. Let’s say we want to select rows 110 through 310, and columns 10 through 160. I wonder what we’ll find there. That’s the front wheel. No points for that. So what is the size of the cropped image?
24 - Crop an Image Quiz¶
So what is the size of the cropped image?
25 - Crop an Image Quiz Solution¶
As before, let’s use the size function. So it turns out to be 201 cross 151. Is that surprising? Why isn’t it 200 cross 150? Let’s look back at the ranges we selected, 110 through 310 and 10 through 160. Note that in both these ranges, the limits are inclusive, which is why the first range, 110 through 310, includes 201 different rows. Similarly, 10 through 160 includes 151 different columns. Try to extract different parts of the image. What happens when your selected range goes out of bounds?
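A small sketch of the inclusive-range arithmetic, again using a blank 320 by 500 stand-in for the bicycle image (an assumption of the example). It also probes an out-of-bounds range inside `try`/`catch`, since reading past the last row raises an index error in Octave.

```octave
% Stand-in for the 320x500 bicycle image.
img = zeros(320, 500, 'uint8');

% Both limits are inclusive: 310 - 110 + 1 = 201 rows, 160 - 10 + 1 = 151 columns.
cropped = img(110:310, 10:160);
crop_size = size(cropped);        % [201 151]
disp(crop_size);

% Rows 300 through 330 run past the last row (320), so this read fails.
try
  bad = img(300:330, 10:160);
  out_of_bounds_error = false;
catch
  out_of_bounds_error = true;
end
```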
26 - Color Planes¶
How about we look at a color image? What do you think is the size of this image? Let’s find out. 258, 320, and 3. Why are there three numbers? The first two are the height and width. The third one is the number of color planes or channels in the image, which is three. So how would you select a single color plane? We’ll use the same indexing notation we used to crop an image. Say we want to select the red channel, which is at index one. We want all the rows, all the columns, but only the first plane. Now what does this look like? All right. So that’s what the red channel looks like. Brighter areas indicate higher red values, and darker areas indicate lower ones. The apples are not as bright as you’d expect. Why do you think that is? Let’s also figure out the size of this color channel. As expected, the width and height of the color channel are the same as those of the original image, but it doesn’t have a third dimension. Well, that makes sense, doesn’t it? This color plane is one of three, which are stacked together to create the color image. Each one of them is a two-dimensional array. Note that the extracted color channel is an image by itself, so you can apply the same operations as you’ve seen before. Let’s try plotting the values from a row of the image. And that’s what the plot looks like. Play with this image in the code editor, try out different operations, and try selecting other color channels.
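The "three planes stacked together" picture can be checked with a short Octave sketch. It uses a synthetic 258 by 320 by 3 array in place of the fruit image, and the use of `cat` to restack the planes is an illustration added here, not something shown in the demo.

```octave
% Synthetic color image with the same dimensions as in the demo.
img = zeros(258, 320, 3, 'uint8');
img(:, :, 1) = 180;               % give the red plane a recognizable value

full_size = size(img);            % [258 320 3]

% All rows, all columns, first plane only.
img_red   = img(:, :, 1);
img_green = img(:, :, 2);
img_blue  = img(:, :, 3);
plane_size = size(img_red);       % [258 320], no third dimension

% Stacking the three 2D planes along dimension 3 rebuilds the color image.
restacked = cat(3, img_red, img_green, img_blue);
```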
27 - Add 2 Images Demo¶
So what do arithmetic operations on images look like? Let’s start by adding two images. Like before, we load up the images. Let’s display them and make sure their sizes are equal. Here are the images, and as we can see in the program output, their sizes are equal. This is important because addition is an element by element operation. This means pixels in one image get added with corresponding pixels in the other image, and so on. Since the two images are of the same size, we can add them. Octave is intelligent enough to figure out that these two are matrices of the same size and hence performs an element-wise addition. The result is, as you would expect, an image that has elements of both source images. You can clearly see the bicycle and the white surf. You can also see the dolphin faintly visible. Notice how this image is exceptionally bright. Many areas are washed out. Why do you think the image is so bright? To find out, let’s look back at our code. Note that we’re directly adding values from both images. Areas where both images are bright turn out to be doubly bright. This indicates that we should perhaps scale down the image intensity values. By how much? We’ll think of a pixel, where both these images have the maximum intensity value. That pixel in the summed image will have twice the maximum value. So if we want the maximum possible value in the summed image to be the same as the maximum possible value in each of these source images, then we want to divide the intensities by 2. Does this look familiar? Yes, this is the average of the two images. Let’s see what this looks like. All right, much better. Compare this with the direct sum. You can clearly see the difference in brightness. Also notice that there are no longer any washed out areas. Let’s rewrite the expression for average and see what happens. Since both bicycle and dolphin are being divided by 2, we should be able to add them first and divide the result by 2, right? And let’s call it average_alt. 
Let’s see what this looks like. Wait a second, this is not right. Shouldn’t the two results be the same?
28 - Add 2 Images Quiz¶
To understand the result we are seeing, let’s take a closer look at the computation that is happening behind the scenes. Here are the two images. The first image is the result of dividing each image by 2 and then adding them. And the second is the result of adding the two images first and then dividing by 2. The first image has better detail and is also slightly brighter than the second one. Now, we know that these images are just collections of intensity values. And values at each corresponding location are being added together. Let’s take two sample values, say, 183 from one image and 152 from the other. Now, in the first case, we divide these numbers by 2 and then add the results. In the second case, we add the two numbers first. What do you think is the result in these two cases?
29 - Add 2 Images Quiz Solution¶
The key to solving this problem is to know that both these images are of type uint8. This means that all pixel values are integers in the range 0 to 255. So here’s what happens. Since the images are unsigned integers, Octave tries to retain the same data type throughout the arithmetic operation. So 183 by 2 comes out to be 92. Note that Octave rounds to the nearest integer. In this case, it is rounding up. Similarly, 152 by 2 is 76, and their sum comes out to be 168. In the second case, the addition is performed first, so 183 plus 152 is 335. But note that this number cannot fit in the unsigned int 8-bit range. The maximum value possible is 255. So this number gets truncated to the upper limit. The division proceeds as expected, and the result is 128. You can imagine that in a number of places, the pixel values add up to more than 255. In all these locations, you will only get 128 as the result. This is often less than the actual average of the two numbers. Hence we see that the first method better preserves pixel values. You will certainly come across these odd arithmetic errors. Just be mindful of the image data type you are using, and the order in which you perform arithmetic operations.
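The two sample values from this solution can be run directly in Octave to see the effect of rounding and of the 255 ceiling. The variable names are chosen for the example; the floating point line at the end is added for comparison.

```octave
a = uint8(183);
b = uint8(152);

% Divide first, then add: each division rounds to the nearest integer.
avg_divide_first = a / 2 + b / 2;      % 92 + 76 = 168

% Add first, then divide: 183 + 152 = 335 is capped at 255 before the division.
avg_add_first = (a + b) / 2;           % 255 / 2 = 127.5, rounded to 128

% For reference, the exact average computed in floating point.
avg_exact = (double(a) + double(b)) / 2;   % 167.5

disp([avg_divide_first, avg_add_first]);
disp(avg_exact);
```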
30 - Multiply by a Scalar Demo¶
In the previous example, we saw how we can divide an image by a number. Dividing by 2 is the same as multiplying by 0.5. And the order of writing these two doesn’t matter either. The constant 0.5 is known as a scalar. This potentially comes from the fact that it scales the image values. Let’s see what the result looks like compared to the original image. Halve the intensity values, clearly darker. Note that we can potentially multiply by any number, even greater than 1. Multiplying the intensity values by 1.5 makes the image brighter. And we see the same washed out effect in certain areas. This is due to the image values above 255 getting truncated at that limit. In Octave, we can write a function to perform a common operation. Let’s turn the scaling into a function. We write a function by typing in the word function, followed by a variable name for the return value. Then an equal sign, the name of the function, and parameters in parentheses. This is followed by the body of the function. In this case, we want the result to be the product of value and image. To ensure that we are performing element-wise multiplication, let’s change the star to a dot star. This doesn’t make any difference when one of the values is scalar, but when the two quantities being multiplied are vectors or matrices, then star and dot star produce different results. We end the function by typing endfunction. Let us load an image and try out this function. And there is the scaled image.
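A sketch of the scaling function described above, written as an Octave script. The function name `scale`, its parameter names, and the small test matrix are assumptions for illustration; the transcript only describes the function's shape.

```octave
% Small stand-in image (hypothetical values).
img = uint8([100 200;
              50 250]);

function result = scale(img, value)
  % Multiply every pixel by the scalar; .* makes the element-wise intent explicit.
  result = value .* img;
endfunction

darker   = scale(img, 0.5);   % [50 100; 25 125]
brighter = scale(img, 1.5);   % [150 255; 75 255], values above 255 are capped
```

The two capped entries in `brighter` are the washed-out areas mentioned in the demo.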
31 - Blend 2 Images¶
Now that we know how to add two images together and multiply image intensity values by a scalar, let’s revisit our example of averaging two images. We saw that division by two can be rewritten as multiplication by 0.5. Now, this results in an image which is equal parts dolphin and bicycle. What if we wanted to change these ratios? Say we want more of the dolphin. Note that, in order to keep the maximum intensity value the same as that of the original images, we should ensure that these weights sum to one. In general, this is known as blending two images. Let’s see what it looks like. Yes, we do see a little bit more surf from the dolphin image, but it’s a little hard to tell. How about we change the weights a little further? More dolphin. I wish we had a function to do this, which we could call like this. Can you write this function for me? Let me get you started. Put your code inside the function body. Remember, to return something, assign it to the output variable. Also note that a and b are the two images to be blended, and alpha is the weight to be applied to a. Once you have implemented the function, test it out with different values of alpha.
32 - Blend 2 Images Solution¶
We know that we want to multiply a by alpha, and since the sum of the weights needs to be one, we multiply b by one minus alpha. And finally, we assign this to the output variable. That’s it. Let’s see what we get. All right, same image as before. Now see how easy it is to change the blending weights. For example, I want a little less dolphin and more of the bicycle. And there we go. It almost looks like there is water on the street. This method of obtaining a weighted sum of two images is the basis of alpha blending.
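The solution described above can be sketched as an Octave script. The function name `blend` and the two tiny stand-in images are assumptions; the parameter names `a`, `b`, `alpha` and the `output` variable follow the description in the quiz.

```octave
% Two tiny stand-in images of the same size (hypothetical values).
a = uint8([200 100]);
b = uint8([100  60]);

function output = blend(a, b, alpha)
  % Weighted sum of two images; the weights alpha and (1 - alpha) sum to one.
  output = alpha * a + (1 - alpha) * b;
endfunction

equal_parts = blend(a, b, 0.5);    % [150 80]
more_a      = blend(a, b, 0.75);   % [175 90]
```

Because the inputs are `uint8`, each weighted term is rounded before the sum, so results can differ slightly from a floating point blend.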
33 - Common Types of Noise¶
So if images are just functions, then we can do things to images that we can do to functions. Like we can just add them, right? You can add two functions, so we can add two images. And to introduce this a little bit, we’re going to introduce the concept of noise, okay? Noise in an image is just another function that, combined with the original image, gives us a new function. So we’ll write our new image this way. We’ll call it I prime. It’s just I of x, y plus this noise function. Well, what does that mean? We have to take a look at what this noise function would be. Okay, so there are lots of different kinds of noise functions. Here’s one, and this stuff’s courtesy of Steve Seitz: there’s a type of noise called salt and pepper noise. It doesn’t take a rocket scientist to figure out that what it does is take your original picture and sprinkle occasional white spots and occasional dark spots on it. And that’s called salt and pepper noise for the obvious reason. A relative of that is something called impulse noise, where you just get little white specks now and then. Different kinds of imaging systems might give you that kind of noise. But by far, the noise that you’re most familiar with is typically Gaussian noise, or normally distributed noise, where we basically assume that at every pixel we take the original image and we add some value that is independent and identically distributed, drawn from some normal, or Gaussian, distribution. All right, and that’s Gaussian noise. And most of the time when we talk about noise we’ll talk about that function. Okay? We can actually have MATLAB make us a noise function. It’s real easy. So here we say, look, we’re going to make a noise array: I take the size of my image and pass it to randn. randn generates a noise signal that has a mean of zero and a standard deviation of one, and then we scale that up by some sigma. Okay?
That will spread it out and make it bigger, so that’s essentially noise with a mean of zero and a standard deviation of sigma. And because functions are just functions and images are functions, I can just add them. I can say let my output just be the image plus the noise. And if I were to plot that, you would see what’s here, right? On the right you can see that there’s all this noise in our peppers. And if we plot this as a surface, you can see here we get this nice clean plot, and here we have all this extra noise that’s been added. And so that’s our noise function.
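A minimal Octave/MATLAB sketch of the steps just described. The synthetic gradient image `im` and the value of `sigma` are assumptions standing in for the peppers image on the slide, which is not available here.

```octave
% Stand-in image: a smooth horizontal gradient in [0, 1].
% (Assumption: the lecture uses a peppers photo instead.)
im = repmat(linspace(0, 1, 64), 64, 1);

sigma = 0.1;                        % assumed noise level, relative to the 0..1 range
noise = randn(size(im)) .* sigma;   % zero-mean Gaussian noise with standard deviation sigma
output = im + noise;                % images are functions, so we can just add them

% To look at the result (needs a display):
% imshow(output, []);
```

Because `randn` has standard deviation one, multiplying by `sigma` is all that is needed to set the noise level.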
34 - Image Difference Demo¶
If you can add two images, you can subtract them as well. The difference between two images is simply one image minus the other. It might be hard to understand at first what’s going on. Greater values in the difference image signify greater difference between the two images, so brighter areas in this result indicate where the two images differ more. Note that this is dolphin minus bicycle. Here the order matters: bicycle minus dolphin gives us a different result. This makes sense, as what the difference operation is doing is simply subtracting pixels in corresponding locations. If two such pixel values are a and b, then a minus b is different from b minus a. Note that b minus a is simply a minus b negated. When thinking about the difference between two images, we often don’t care about the sign of this difference, only the magnitude. That is, we’re interested in the absolute difference between two images. For that you use the abs function in Octave. Let’s see how different the two results are. Wait a second. These two don’t look different. In fact, they’re exactly the same. What’s going on? Let’s take a closer look at our code, especially this line. Let’s say the two values being subtracted are 20 from bicycle and 56 from dolphin. Theoretically the result should be minus 36. But remember uint8? These images can only represent numbers between zero and 255. So what happens here? The result gets clamped to zero. Notice that even in the absolute difference case, the subtraction is performed first. This intermediate result is the same as the original difference, so the numbers are already between zero and 255 and the absolute value operator doesn’t make any difference. So what can we do about this?
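The clamping can be reproduced on the two pixel values from the demo. This is a small sketch; the variable names are illustrative and not taken from the lecture code.

```octave
% The two example pixel values, stored as uint8 like the demo images.
a = uint8(20);    % pixel from bicycle
b = uint8(56);    % pixel from dolphin

d1 = a - b;       % would be -36, but uint8 saturates at 0
d2 = b - a;       % 36 is representable, so it survives
d3 = abs(a - b);  % abs sees the already-clamped 0, so it is still 0

disp([d1, d2, d3]);
```

The subtraction happens in uint8 before `abs` ever runs, which is why the absolute difference image looked identical to the plain difference.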
35 - Image Difference Quiz¶
So we’re losing out all the negative values in the result. How do we ensure that we can preserve image difference? Let’s say a and b are two images of type uint8. Check the following options, which you think will give the desired result. For instance, should we compute the absolute values of the two images first and then compute their difference? What about this expression? Or converting to a different type? Would that help?
36 - Image Difference Quiz Solution¶
The first expression doesn’t make any difference. A and b contain only non-negative integers, so this will give us the same incorrect result as before. The second expression is interesting. A minus b gives correct difference values where a is greater than b, and zero where b is greater than a. Similarly, b minus a gives correct difference values where b is greater than a, and zero where a is greater than b. Hence, their sum is in fact the absolute difference that we want. Converting the images to uint16 does increase the range of values that they can store. But remember that u signifies unsigned, which means uint16 cannot represent negative numbers, and we’ll end up getting the same result. Floating point images can inherently store negative values. Hence, converting to floating point would help. Fortunately, there is a built-in function to compute image difference that preserves values. We don’t have to explicitly convert the data type or use any funky expressions. This function is contained in the image package in Octave or the Image Processing Toolbox in MATLAB. In Octave, you can load a package by typing pkg load followed by the package name. The function we want is called imabsdiff. It takes two parameters, the images to be subtracted, and the order doesn’t matter. Let’s see how this compares with our previous attempt. As you can see, this preserves the magnitude of image difference throughout the image. The image package provides many more functions to carry out common operations. Feel free to explore them.
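The quiz options can be checked on a tiny example. The 2-by-2 arrays below are made-up values, and `imabsdiff` is mentioned only in a comment so the sketch runs without the image package.

```octave
% Two tiny made-up uint8 "images".
a = uint8([20 200; 90 10]);
b = uint8([56 150; 90 250]);

% Sum of the two one-sided differences: each term keeps one sign
% of the difference, and the other term clamps to 0.
d_sum = (a - b) + (b - a);

% Convert to floating point first so negative values survive, then take abs.
d_float = abs(double(a) - double(b));

% uint16 does not help: it is still unsigned, so negatives clamp to 0.
d_u16 = abs(uint16(a) - uint16(b));

% With the image package loaded (pkg load image in Octave),
% imabsdiff(a, b) is expected to agree with d_sum.
disp(d_sum);
disp(d_u16);
```

Here `d_sum` and `d_float` agree on every pixel, while `d_u16` loses every pixel where `b` is larger than `a`.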
37 - Generate Gaussian Noise¶
So we know that randn generates Gaussian noise. Let’s see how it actually works. If you call randn without any parameters, then it returns a random number. Here we get 0.76388. Run it again. A different number, 1.3958. You can pass in dimensions to randn to generate a vector or matrix filled with random numbers. Let’s say we want a row vector of five columns. So one row, five columns. Each time we run this, we get a different set of numbers. As you might have guessed, we can generate a two-dimensional matrix of random numbers as well. Say we want two rows and three columns. Since these are a bunch of random numbers, we call this noise. What is interesting is that randn draws these numbers from a Gaussian, or normal, distribution. Hence, the n in randn. A Gaussian distribution has a probability density function that looks like this. The center, or mean, for randn is zero, and the standard deviation is one. The standard deviation is a measure of how spread out the distribution is. I mentioned this is a probability distribution, which means getting back numbers that are close to zero is highly likely, whereas numbers far away from zero are less likely. How do we know for sure that randn is actually sampling from a Gaussian distribution? Well, if we had enough samples and distributed them among bins and we counted how many numbers landed in each bin, then we would see a pattern similar to the probability density function. Let’s try that. How about we start with a vector of a hundred numbers? Instead of displaying the numbers directly, let’s compute a histogram. hist accepts a vector or matrix of numbers as a first argument, and as an optional second argument you can pass in bin centers. Let’s say we want the centers to be integers, from minus three to plus three. hist returns two values. One is the count of elements, which we want, and the second is the bin centers. Let us display the bin centers and the counts in a tabular form.
We will create a small, temporary matrix, with the first row being the bin centers and the second row being the counts. As expected, the center has a high count, and the ends have low, in fact zero, counts. You see the same behavior no matter how many times you run it. For a visual representation of what’s going on, how about we plot these numbers? The x-axis will contain our bin centers, and the counts will be on the y-axis. We see something that vaguely resembles the Gaussian probability distribution. To get a better picture, we need more bins. You can generate a sequence of uniformly spaced numbers using the linspace function. Here we can replace this vector by writing minus three to plus three, seven different numbers. That is including zero. Let’s make sure this is the same as before. Note here that the bin centers are the same, as expected. Now we can easily increase the number of bins. Say we want 21 of them. I’m going for odd numbers because I want to include the zero in the middle. Displaying so many numbers wouldn’t be useful, so let’s comment that out and see what the plot looks like. Clearly, we have better resolution along the x-axis, but what’s going on with these spikes? I think we need more data, so let’s bump up the vector to 1,000 numbers. Now you see the familiar bell curve slowly emerging. Let’s increase the number of samples further. There you go. In addition to randn, you can find other random number generation functions in Octave or MATLAB, such as just rand. This samples numbers from a uniform distribution. randi generates random integers. Feel free to play with these functions.
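The histogram experiment might look roughly like the sketch below. The sample count of 10000 and the variable names are assumptions; the demo itself grows the sample count step by step.

```octave
n = 10000;                        % number of samples (assumed)
samples = randn(1, n);            % row vector of draws from a standard normal

centers = linspace(-3, 3, 21);    % 21 evenly spaced bin centers, with 0 in the middle
[counts, centers_out] = hist(samples, centers);

% Tabular view: first row bin centers, second row counts.
tbl = [centers_out(:)'; counts(:)'];
disp(tbl);

% To see the bell curve (needs a display):
% plot(centers_out, counts);
```

Values beyond the outermost centers are lumped into the first and last bins, so the counts always add up to `n`.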
38 - Effect of Sigma on Gaussian Noise¶
So I just snuck one past you there. Okay. We said the magnitude of the noise is determined by sigma. Fine, that’s great. In fact, we could just look at the noise function itself, right? So don’t add in the original image. Just look at the noise function. But we can’t do that just yet, until we make a decision, and the reason is this. What’s the mean of the noise, Megan? >> Zero. >> Very good. So that means some of the values are going to be what? Positive, and some of the values are going to be negative. How do we look at a picture that has positives and negatives in it? Right? If we said zero was black and one was white, or zero was black and 255 was white, then how do we do this? Well, the mistake is saying that zero is black. Okay? We’re going to say, look, we’ll map some minimum value to black, some maximum value to white, and we’ll distribute the rest in between. In particular, zero should be what color? What color do you think zero should be, between black and white? What comes between black and white in the universe? Grey. All right? So let’s suppose we have values that, you know, go from minus 20 to plus 20 in our image. Well, we can make minus 20 black and plus 20 white, and zero would be grey. And if we did that, it would look like this. So here we’re showing you images of Gaussian noise. Just the noise. So if there’s a very small sigma, and remember sigma’s up here, you can barely see that this is anything but a constant grey. As we let sigma get bigger and bigger and bigger, you start to see more and more speckle. And that’s the effect of sigma, so it’s just a noise function being added to an image.
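The mapping from signed noise values to grey levels can be sketched as follows. The range of minus 20 to plus 20 comes from the example above; the image size and the choice of sigma are assumptions.

```octave
sigma = 8;                            % assumed noise level for this sketch
noise = randn(128, 128) .* sigma;     % zero-mean noise: positive and negative values

lo = -20;  hi = 20;                   % values to show as black and as white
gray = (noise - lo) ./ (hi - lo);     % -20 -> 0 (black), 0 -> 0.5 (grey), +20 -> 1 (white)
gray = min(max(gray, 0), 1);          % anything beyond the range clips to black or white

% imshow(gray) would then display the noise (needs a display).
```

With a small sigma almost every value lands near 0.5, which is why the picture looks like a constant grey.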
39 - Effect of Sigma on Gaussian Noise Quiz¶
So likes these quizzes, so here we have a quiz, and if you can’t do this with three eyes closed, I don’t know. It says, look, there are four different sigmas here, you know, what are those sigmas? Well, they’re 2, 8, 32, and 64. Can you label ‘em for me?
40 - Effect of Sigma on Gaussian Noise Quiz Solution¶
Two, small, okay, right there I see something I can’t. Oh, by the way, the reason we’re using all these slides, my handwriting is awful, so you’re going to be very happy that we have all these slides. So that’s two. Let’s see, I guess the eight is over here. The 32, yeah, I guess the one on the top left is less, and 64 is there. Okay, I passed my quiz. I hope you did too.
41 - Apply Gaussian Noise Quiz¶
Given an image img, you know that its size is size of img, and you can generate a noise image of the same size by passing this to randn. The values in this noise image will be normally distributed around zero, with a standard deviation of one. What happens when you multiply each of these generated values by two? How does this affect the resulting distribution? Does it increase the counts? Does it increase the spread? Or both? Mark the right answer.
42 - Apply Gaussian Noise Quiz Solution¶
The correct answer is that it only increases the spread. Note that we are only multiplying the values. The number of values, or their counts, are unchanged. Multiplying a set of normally distributed numbers by a value effectively changes the standard deviation of the distribution they were drawn from. Now why is this important to know? Remember that randn generates values with standard deviation one, whereas the images we’ve been using are of type uint8 and range from zero to 255. What do you think happens when you add the results of randn directly to an image? Let’s find out. Time to use a new image. If you look carefully you’ll be able to see three moons and a shadow. Now we generate our noise image and add it to the original. Not really very different, is it? This is because the values that randn generated are really small compared to the image. Let’s scale up the values. Now you can see the noise affecting the image. How about we increase this further? And more. Now it’s really hard to see the moons, isn’t it?
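A short sketch of both points, under the assumption that a flat uint8 image can stand in for the moons picture. The scale factors 2 and 25 are illustrative.

```octave
img = uint8(100 * ones(64, 64));   % flat stand-in for the moons image (assumption)

noise = randn(size(img));          % standard deviation 1
scaled = noise .* 2;               % same number of values, twice the spread

% Adding double noise to a uint8 image gives a uint8 result: values are
% rounded and clamped to 0..255, so sigma = 1 noise barely changes anything.
noisy_small = img + noise;
noisy_big = img + noise .* 25;     % sigma = 25 is clearly visible on a 0..255 scale

fprintf('std before: %.2f, after: %.2f\n', std(noise(:)), std(scaled(:)));
```

The number of values never changes; only the standard deviation does, and it has to be large relative to the 0 to 255 range before the noise shows up.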
43 - Displaying Images in Matlab¶
I didn’t say in the previous image what the range of our pictures would tend to be, okay? Remember I told you an image might go from, you know, 0 to 1, and that would be from the darkest black to the brightest white. Well, if I did that and I had a sigma of two, you would get black and white all over the place. And yet, when I go back here, sigma of two is just a small variation, whereas sigma of 64 was a big variation. Why? In this image, we have this notion of maybe minus 127 was black and plus 128 was white. When we talk about the amount of noise in an image in terms of the intensity, it has to be with respect to what the overall range is. So that’s another reason to use doubles in your images and to think of them as going from 0 to 1: then we can talk about a sigma of, you know, 0.1. Well, that’s a tenth of going from black to white. If you want to use something arbitrary like 0 to 255, you can do that. First of all, use 0.0 to 255.0, so use a floating-point number. And then you’re going to have to say, okay, I guess a sigma of 0.1 in one case would be a sigma of, like, 25 in another case. Right, because I’ve stretched the whole thing out. So you have to worry about the magnitude of sigma with respect to the overall range of your image. This will catch you numerous times when you go to display an image. All right, because now you have to tell the machine, okay, I’ve got this image. How do you want it displayed? MATLAB has a large number of ways of displaying images. You have the imshow function, which I think actually comes from the Image Processing Toolbox. You can show it this way, where you tell it low and high, and it will display anything with the value low or lower as black and anything with the value high or higher as white, okay? You can also do imshow and just give it this empty array, and it will scale the image for you automatically. That is, it’ll find the minimum value in the image and say, okay, that’s going to be black.
It’ll find the maximum. It’ll say, that’ll be white, and it’ll scale everything in between. There’s another function called imagesc, for image scale. It’s a much older function. It’s not in the Image Processing Toolbox, and it will also display the image scaled. Don’t get caught between this question of how I display an image versus how I use an image. I just finished teaching part of this course here at Georgia Tech. I had some people computing some gradients, which involves subtractions and derivatives and all that stuff. And then one guy, he normalized his picture to go from 0 to 255 before he computed with it. You would only normalize it in order to display it, not in order to compute with it.
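A sketch of the two ideas in this segment: sigma only means something relative to the image range, and normalization belongs to display, not computation. The gradient image and variable names are assumptions; the display calls are left as comments because they need a display (and, for `imshow`, possibly the image package or toolbox).

```octave
% Same image on two scales, 0..1 and 0..255, both stored as doubles.
im01 = repmat(linspace(0, 1, 64), 64, 1);
im255 = im01 .* 255.0;

sigma01 = 0.1;                  % a tenth of the black-to-white range
sigma255 = sigma01 * 255;       % the equivalent sigma on the stretched scale (about 25)

noisy = im255 + randn(size(im255)) .* sigma255;

% Normalize a copy for display only; keep computing with noisy itself.
shown = (noisy - min(noisy(:))) ./ (max(noisy(:)) - min(noisy(:)));

% Display options mentioned in the lecture:
% imshow(noisy, [0 255]);   % explicit low and high
% imshow(noisy, []);        % scale min..max automatically
% imagesc(noisy); colormap(gray);
```

Keeping `shown` separate from `noisy` avoids the mistake described above, where a rescaled picture was fed back into the gradient computation.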
44 - Adding Noise Quiz¶
So here’s a question where we start to talk about noise. When I add noise to images, as an arithmetic operation, which of these things do I have to worry about? A, the speed of the addition operation. B, the magnitude of the noise compared to the range of the image. C, whether we add the noise to the image or we add the image to the noise. That is, the order of the operation. Or D, none of the above.
45 - Adding Noise Quiz Solution¶
Well, I’ll tell you most of the time when I give a quiz, the answer is all of the above or none of the above, but not really in this case. The speed of the addition operation, you know, back in the Dark Ages, you had to worry about the speed of addition, now addition is instantaneous. Okay? And by the way, last time I checked, 4 plus 2 equals 2 plus 4. Addition is commutative, so we don’t have to worry about whether we add the noise to the image or the image to the noise. It’s kind of weird to think if I start with some noise and add an image to it, same value. But what you do have to worry about, as I said before, is the magnitude of the noise compared to the range in value of the image.
46 - Images as Functions End¶
So that ends our first lesson on images as functions and image processing, our first technical lesson. I hope you didn’t find it too pedantic or too boring, and that you’ll come back for the next ones, because as we go forward, there’ll be more and more cool stuff to do. I can hardly wait, and I hope you can too.
47 - What Did You Learn Today¶
So what did you learn today? List any new concepts, terms, or commands that you picked up in the lesson. Comma separated, or one on each line. You can use this, and similar notes throughout the course as a reminder for yourself. This also gives us a sense of how you learn, and will help us improve this course.