Hello! If you have reached this article, you are probably looking for a simple and sound explanation of how the GLCM texture extraction algorithm works. It took me a while to understand the mathematical intuition behind it, since not much is stated on the MathWorks website or in the documentation of related software. This tutorial is a little long but not difficult, so stick with it till the end and try to grasp each concept fully.
First, some background to make things easier to follow.
Consider a rough surface. By "rough" we mean that the surface is not uniform: it has alternating high and low peaks, which we feel when we move something over the material, say a finger.
Image with a rough texture
Now, by analogy with a regular image, these HIGH and LOW peaks can be compared to HIGH and LOW intensities of brightness, which are termed "GREY LEVELS". The finger that moves over the surface can be termed a "WINDOW", which can glide left, right, forward and backward with a shift of a specified size termed the "OFFSET". The specific point under our finger, the one under observation, can be termed the "REFERENCE POINT", and the point one offset away from it is the "NEIGHBOUR POINT". Neighbour points can be taken in any direction.
With that said, let's see what GLCM is and what the algorithm does.
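To make the reference/neighbour idea concrete, here is a minimal sketch in plain Python. The 3x3 image and the helper name `pixel_pairs` are made up for illustration; the offset is written as (row shift, column shift), which is my own convention here and not something fixed by the algorithm.

```python
# Hypothetical 3x3 image of grey levels (values invented for illustration).
image = [
    [0, 0, 1],
    [1, 2, 2],
    [2, 3, 3],
]


def pixel_pairs(image, offset):
    """Return (reference, neighbour) grey-level pairs for a (row, col) offset."""
    d_row, d_col = offset
    n_rows, n_cols = len(image), len(image[0])
    pairs = []
    for r in range(n_rows):
        for c in range(n_cols):
            nr, nc = r + d_row, c + d_col
            # Skip reference pixels whose neighbour falls outside the image.
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                pairs.append((image[r][c], image[nr][nc]))
    return pairs


right_pairs = pixel_pairs(image, (0, 1))  # neighbour one step to the right
down_pairs = pixel_pairs(image, (1, 0))   # neighbour one step below

print(right_pairs)  # [(0, 0), (0, 1), (1, 2), (2, 2), (2, 3), (3, 3)]
print(down_pairs)   # [(0, 1), (0, 2), (1, 2), (1, 2), (2, 3), (2, 3)]
```

Changing the offset changes which pixel counts as the neighbour, and therefore which pairs get collected.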
The Grey Level Co-occurrence Matrix (GLCM), sometimes also referred to as the Grey Tone Spatial Dependency Matrix, is a tabulation of how often different combinations of pixel brightness values (grey levels) occur in an image. The various statistics calculated from this matrix describe the spatial relationship between the pixels of the image, and these are the final values that serve as features, often called handcrafted features, for our ML models.
A quick emphasis on the word "spatial"! It refers to the relationship between two pixels at a time, unlike regular descriptive statistics, which consider each pixel individually.
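A small sketch can show why this matters. The two 2x4 images below are invented for illustration and contain exactly the same grey levels, so a per-pixel statistic such as the mean cannot tell them apart, while a count over neighbouring pairs can. The helper names are hypothetical.

```python
# Two hypothetical 2x4 images with the same grey levels
# (four 0s and four 1s each), arranged differently.
stripes = [
    [0, 1, 0, 1],
    [0, 1, 0, 1],
]
blocks = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]


def mean_level(image):
    """Per-pixel statistic: ignores where each pixel sits."""
    values = [v for row in image for v in row]
    return sum(values) / len(values)


def count_unequal_right_pairs(image):
    """Spatial statistic: horizontally adjacent pairs whose grey levels differ."""
    return sum(
        1
        for row in image
        for left, right in zip(row, row[1:])
        if left != right
    )


print(mean_level(stripes), mean_level(blocks))  # 0.5 0.5
print(count_unequal_right_pairs(stripes))       # 6
print(count_unequal_right_pairs(blocks))        # 2
```

Both images have the same mean, but the striped one changes grey level between neighbours three times as often, which is the kind of difference a co-occurrence count picks up.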
Consider the following image, represented in its pixelated form as intensities.
If I treat each intensity as a numerical value, with the darkest pixels (no light) being 0 and the values increasing with brightness, then I get the following representation of the image.
Now, moving from left to right, we take each pixel as the reference and note how many times its grey level occurs together with the grey level of the neighbour to its immediate right. Note that while moving from left to right, the pixels in the rightmost column are never used as reference pixels, because they have no neighbour to their immediate right. To solve this, we can make a small change to our counting method: we first count from left to right, then from right to left, which gives two separate matrices, and then add the two together to get the result. This also makes our matrix symmetric.
Addition of both gives the matrix below.
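In the above matrix, entry (0,1) is the frequency of the pair (0,1), that is, the number of times a reference pixel with grey level 0 has a neighbour with grey level 1. Because the two directional counts were added together, it is equal to entry (1,0).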
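The counting procedure described above can be sketched in a few lines of plain Python. Since the article's figure is not reproduced here, the 4x4 image below is a made-up example with grey levels 0 to 3, so its numbers will not necessarily match the matrices in the figures; `count_pairs` is a hypothetical helper name, and this is only one simple way to write the count.

```python
# Hypothetical 4x4 image with grey levels 0..3 (invented for illustration).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
levels = 4  # number of distinct grey levels


def count_pairs(image, levels, d_col):
    """Count (reference, neighbour) pairs for a horizontal offset d_col."""
    counts = [[0] * levels for _ in range(levels)]
    n_cols = len(image[0])
    for row in image:
        for c in range(n_cols):
            nc = c + d_col
            # A reference pixel with no neighbour inside the image is skipped.
            if 0 <= nc < n_cols:
                counts[row[c]][row[nc]] += 1
    return counts


left_to_right = count_pairs(image, levels, 1)    # neighbour on the right
right_to_left = count_pairs(image, levels, -1)   # neighbour on the left

# Add the two directional matrices element by element.
glcm = [
    [left_to_right[i][j] + right_to_left[i][j] for j in range(levels)]
    for i in range(levels)
]

for row in glcm:
    print(row)
# [4, 2, 1, 0]
# [2, 4, 0, 0]
# [1, 0, 6, 1]
# [0, 0, 1, 2]

print(glcm[0][1], glcm[1][0])  # 2 2
```

In this example the right-to-left matrix is just the transpose of the left-to-right one, which is why their sum comes out symmetric.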