3x3 Sobel operator and gradient features
- by pithyless
Reading a paper, I'm having difficulty understanding the algorithm described:
Given a black and white digital image of a handwriting sample, cut out a single character to analyze. Since this can be any size, the algorithm needs to take this into account (if it will be easier, we can assume the size is 2^n x 2^m).
Now, the description states given this image we will convert it to a 512-bit feature (a 512-bit hash) as follows:
(192 bits) computes the gradient of the image by convolving it with a 3x3 Sobel operator. The direction of the gradient at every edge is quantized to 12 directions.
(192 bits) The structural feature generator takes the gradient map and looks in a neighborhood for certain combinations of gradient values. (used to compute 8 distinct features that represent lines and corners in the image)
(128 bits) Concavity generator uses an 8-point star operator to find coarse concavities in 4 directions, holes, and lagrge-scale strokes.
The image feature maps are normalized with a 4x4 grid.
I'm for now struggling with how to take an arbitrary image, split into 16 sections, and using a 3x3 Sobel operator to come up with 12 bits for each section. (But if you have some insight into the other parts, feel free to comment :)