## Thursday, October 25, 2007

### Sprocket Perforations

Before we can find the sprocket holes, we must know what we are looking for. In my Background section I list the published dimensions for Regular 8 and Super 8 film, including the sprocket dimensions. To convert one of these dimensions to pixels, multiply its value in inches by the dots per inch (dpi) of your scan.
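As a minimal sketch of that conversion (the function name `InchesToPixels` is made up for illustration, and the 0.05 in value is a placeholder rather than a published film dimension):

```vbnet
Imports System

Module ScanUnits
    ' Convert a dimension in inches to whole pixels at a given scan resolution.
    Function InchesToPixels(inches As Double, dpi As Integer) As Integer
        Return CInt(Math.Round(inches * dpi))
    End Function

    Sub Main()
        ' 0.05 in is a placeholder value, not a published dimension.
        Console.WriteLine(InchesToPixels(0.05, 4800)) ' 0.05 * 4800 = 240
    End Sub
End Module
```

Rounding to whole pixels is a choice made here for convenience; the real dimensions should come from the Background section.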

The sprockets are positively identified one at a time, starting at the right side of the scanned image, since that is where the next frame sits in my setup. In finding the sprockets, I use a bit of logic to narrow my search. I have broken this down into 3 simplifications:

First, I am not actually looking for the whole sprocket all at once. Instead, I am only looking for the edge pixel that corresponds to the exact center of the left edge of the sprocket. When I find an edge pixel, I assume it is the center of the left edge of the sprocket. I then make a list of where I expect the other edge pixels to be in relation to this one if it really is a sprocket, and check to see if they are there.

Second, since I am only looking for the exact center of the left edge of each sprocket, my search window can be very narrow. There is no point in searching the entire image: even if I did find a sprocket near the bottom of the image, I would not be able to pick off the frame that corresponds to it. Instead, I assume that the top edge of the sprocket must be at least 5 pixels from the top of the picture. Then, since I am only looking for the edge pixel at the center of the sprocket's edge, I don't look for any edge pixels higher than 1/2 the sprocket width + 5 pixels from the top of the image. I call this my "tlim", or top limit. Likewise, I do not search for any pixel lower than 1/2 the sprocket width + the frame width from the bottom of the image. Since we have already eliminated the bottom section of the original image, this leaves a narrow strip through the center of the image in which I can reasonably expect to find my sprocket center.
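The two limits can be sketched as follows. This is only an illustration under stated assumptions: `tlim` is the name used above, while `blim`, `GetSearchBand` and the parameter names are hypothetical, and all values are taken to be in pixels with row 0 at the top of the image.

```vbnet
Imports System

Module SearchBand
    ' Returns (tlim, blim): the first and last pixel rows worth searching
    ' for the center of a sprocket's left edge.
    Function GetSearchBand(imageHeight As Integer,
                           sprocketWidth As Integer,
                           frameWidth As Integer) As Tuple(Of Integer, Integer)
        ' The top edge of the sprocket is assumed to be at least 5 pixels down.
        Const TopMargin As Integer = 5
        Dim tlim As Integer = sprocketWidth \ 2 + TopMargin
        ' Hypothetical name for the matching bottom limit.
        Dim blim As Integer = imageHeight - (sprocketWidth \ 2 + frameWidth)
        Return Tuple.Create(tlim, blim)
    End Function
End Module
```

For example, a 1000 pixel tall image with a 200 pixel sprocket and a 300 pixel frame would give a band from row 105 to row 600.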

Third, if I already know where a sprocket is, then the next sprocket down the line should be 1 frame width away, so I search for it at that exact point (plus a little margin of error) first. If I don't find the next sprocket where I think it should be, I increase my search window to include more possibilities. If I still don't find it, I assume my last "sprocket" was a mistake and search the whole width of the frame.
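One way to picture this widening search is the sketch below. The names `FindNextSprocket`, `margin`, `widerMargin` and `findInRange` are hypothetical, the margin sizes are left to the caller because the text does not give them, and centering the final frame-wide window on the expected position is an assumption. The sketch also leaves out discarding the previous sprocket as a mistake.

```vbnet
Imports System

Module StagedSearch
    ' expectedX:   column where the next sprocket should be (one frame width
    '              from the last one found).
    ' findInRange: caller-supplied search that returns the sprocket column
    '              found between two columns, or Nothing if there is none.
    Function FindNextSprocket(expectedX As Integer,
                              frameWidth As Integer,
                              margin As Integer,
                              widerMargin As Integer,
                              findInRange As Func(Of Integer, Integer, Integer?)) As Integer?
        ' Try the narrow window first, then a wider one, then a full frame width.
        Dim halfWidths() As Integer = {margin, widerMargin, frameWidth \ 2}
        For Each halfWidth As Integer In halfWidths
            Dim found As Integer? = findInRange(expectedX - halfWidth, expectedX + halfWidth)
            If found.HasValue Then Return found
        Next
        Return Nothing
    End Function
End Module
```

Trying the narrow window first keeps the common case cheap; the wider windows only cost time when the film has shifted or the previous detection was wrong.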

The tricky part of this business is how to tell if a pixel really is the edge I am looking for, or if it is a random speck of dust. This is accomplished with statistics and a couple of very long arrays of numbers. The steps below give the logic process for determining the actual location of the sprockets.

1. Find the next edge pixel within the search window
   - Assume it is the center of the left edge of the sprocket
2. Calculate the location of the exact center of the sprocket if step 1 were true
3. Based on the dimensions of the sprocket, check to see if there are other pixels in the expected locations around the potential center pixel
4. Give the center pixel a score for each pixel that you expected to find on the sprocket (a perfect Super 8 sprocket at 4800 dpi will have a score greater than 3000):
   - +5: there is an edge pixel exactly where you expected it to be
   - +3: there is an edge pixel +/- 1 pixel from the expected position
   - +1: there is an edge pixel +/- 2 pixels from the expected position
   - +0: there was no edge pixel
5. Record the location of the center pixel and its score in a 2D matrix
6. If you haven't finished searching the entire window, return to step 1
7. The pixel with the highest score is the real center of the sprocket
8. Save the actual center of the sprocket in a separate array
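The scoring in steps 3 to 7 might look roughly like the sketch below. It is not the actual implementation: the edge map is assumed to be a 2D Boolean array indexed as `edges(x, y)`, the expected outline is assumed to be a list of (dx, dy) offsets from the sprocket center, "+/- 1 pixel" is read as anywhere in the surrounding 3 x 3 block, and the corner regions are ignored. All names are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module SprocketScoring
    ' True when (x, y) is inside the edge map and marked as an edge pixel.
    Function IsEdge(edges As Boolean(,), x As Integer, y As Integer) As Boolean
        Return x >= 0 AndAlso y >= 0 AndAlso
               x < edges.GetLength(0) AndAlso y < edges.GetLength(1) AndAlso
               edges(x, y)
    End Function

    ' Score for one expected outline pixel: 5 exact, 3 within 1, 1 within 2, else 0.
    Function ScorePoint(edges As Boolean(,), x As Integer, y As Integer) As Integer
        If IsEdge(edges, x, y) Then Return 5
        For radius As Integer = 1 To 2
            For dx As Integer = -radius To radius
                For dy As Integer = -radius To radius
                    If IsEdge(edges, x + dx, y + dy) Then Return If(radius = 1, 3, 1)
                Next
            Next
        Next
        Return 0
    End Function

    ' Steps 3 and 4: total score over every expected outline pixel.
    Function ScoreCandidate(edges As Boolean(,), centerX As Integer, centerY As Integer,
                            outline As List(Of Tuple(Of Integer, Integer))) As Integer
        Dim total As Integer = 0
        For Each offset As Tuple(Of Integer, Integer) In outline
            total += ScorePoint(edges, centerX + offset.Item1, centerY + offset.Item2)
        Next
        Return total
    End Function

    ' Steps 5 to 7: score every candidate center and keep the best one.
    Function BestCenter(edges As Boolean(,),
                        candidates As List(Of Tuple(Of Integer, Integer)),
                        outline As List(Of Tuple(Of Integer, Integer))) As Tuple(Of Integer, Integer)
        Dim best As Tuple(Of Integer, Integer) = Nothing
        Dim bestScore As Integer = -1
        For Each c As Tuple(Of Integer, Integer) In candidates
            Dim score As Integer = ScoreCandidate(edges, c.Item1, c.Item2, outline)
            If score > bestScore Then
                bestScore = score
                best = c
            End If
        Next
        Return best
    End Function
End Module
```

Because a stray speck of dust matches only a few of the expected positions, its total stays far below that of a real sprocket, which is what lets the highest score win.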

For debugging purposes, I keep track of the actual sprocket and color code the pixels of the sprocket hole by their score (5 = red, 3 = blue, and 1 or 0 = green). I can save an image with the sprockets colored over the original scan to show how well the sprockets are being found, as seen in the image below. This is a Super 8 sprocket with edges identified and scored. Note that corners are not specifically scored because of the difficult geometry, but are treated as a region (shown in green) within which any edge pixel receives full marks as a kind of bonus.

I found that some variation has to be allowed for. Film from different makers, or film made at different times, will have dimensions that differ slightly from those published, and the difference is enough to cause problems with picking off the frames.

I solved this problem by including user inputs on my GUI that modify the sprocket dimensions and writing a routine that automatically searches for the optimal sprocket dimensions. This requires me to keep track of the average sprocket score of all sprockets found in the project, and the standard deviation of the sprocket scores. The basic idea of the automatic calibration is:

1. Check to see if the average score of the current scan is within statistical tolerance of the total average score for the entire project
   - If so, don't bother adjusting; otherwise go on
2. Adjust the vertical sprocket dimension by n
   - n = -10 for the first pass, -8 for the second, ..., up to +10
3. Record the sum score for all of the sprockets in the image
4. Go back to step 2 until n = 10
5. Compare sum scores and choose the value of n that gives the greatest sum score
6. Go back to step 2, but for the horizontal dimension this time
7. With the 2 dimension modifiers, find the sprockets
8. Adjust the statistics for the entire project with the new sum scores
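Steps 2 to 6 amount to two one-dimensional sweeps, sketched below. The `sumScore` delegate is a stand-in for "find all sprockets in this scan with the dimensions adjusted by these amounts and return the total score"; it and the other names are hypothetical. Keeping the chosen vertical modifier while sweeping the horizontal one is an assumption, since the steps do not say either way.

```vbnet
Imports System

Module SprocketCalibration
    ' Tries n = -10, -8, ..., +10 and returns the n with the greatest sum score.
    Function BestModifier(sumScoreFor As Func(Of Integer, Integer)) As Integer
        Dim bestN As Integer = 0
        Dim bestScore As Integer = Integer.MinValue
        For n As Integer = -10 To 10 Step 2
            Dim score As Integer = sumScoreFor(n)
            If score > bestScore Then
                bestScore = score
                bestN = n
            End If
        Next
        Return bestN
    End Function

    ' sumScore(dv, dh): total sprocket score for the current scan when the
    ' vertical and horizontal dimensions are adjusted by dv and dh pixels.
    Function Calibrate(sumScore As Func(Of Integer, Integer, Integer)) As Tuple(Of Integer, Integer)
        ' Vertical sweep first, with the horizontal dimension unchanged.
        Dim dv As Integer = BestModifier(Function(n As Integer) sumScore(n, 0))
        ' Horizontal sweep, keeping the vertical modifier just chosen (assumption).
        Dim dh As Integer = BestModifier(Function(n As Integer) sumScore(dv, n))
        Return Tuple.Create(dv, dh)
    End Function
End Module
```

Sweeping each dimension separately costs 22 scoring passes per scan, against 121 for trying every combination of the two modifiers.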

I found this routine to be quite efficient and hassle free if the tolerances are set properly. The total average sprocket score is easy enough to calculate. There is a field in the GUI for the user to input the number of standard deviations to use for the tolerance. I use 3.0. The following code calculates the statistics.

Average sprocket value of current strip

    mavg = 0 'start from zero so an earlier strip's average is not carried over
    For i = 0 To cmags.Count - 1
        mavg = mavg + cmags(i) / cmags.Count
    Next i

Average sprocket value for project

    cavg = 0
    For i = 0 To cs.Count - 1
        cavg = cavg + cs(i) / cs.Count
    Next i

Sum square error & standard deviation

    ssum = 0
    For i = 0 To cs.Count - 1
        s = cs(i) - cavg 'deviation of this score from the project average
        ss = s * s
        ssum = ssum + ss
    Next i
    stdev = Math.Sqrt(ssum / cs.Count)

So then the tolerance condition is:

    mavg < (cavg - stdev * n)

where n is the user input number of standard deviations in the tolerance interval. If the condition is true, then recalculate the sprocket dimensions, otherwise you have a good sprocket.
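Pulled together into one self-contained piece, the check could be sketched like this. The variable names `cmags` and `cs` follow the snippets above, while the module and function names are hypothetical and the scores are assumed to be stored as `Double` values in non-empty lists.

```vbnet
Imports System
Imports System.Collections.Generic

Module SprocketStatistics
    ' Arithmetic mean of a non-empty list of scores.
    Function Mean(values As IList(Of Double)) As Double
        Dim total As Double = 0
        For Each v As Double In values
            total += v
        Next
        Return total / values.Count
    End Function

    ' Population standard deviation (divides by Count, as in the code above).
    Function StdDev(values As IList(Of Double)) As Double
        Dim avg As Double = Mean(values)
        Dim ssum As Double = 0
        For Each v As Double In values
            ssum += (v - avg) * (v - avg)
        Next
        Return Math.Sqrt(ssum / values.Count)
    End Function

    ' True when the current strip's average falls more than n standard
    ' deviations below the project average.
    Function NeedsRecalibration(cmags As IList(Of Double),
                                cs As IList(Of Double),
                                n As Double) As Boolean
        Return Mean(cmags) < Mean(cs) - StdDev(cs) * n
    End Function
End Module
```

With project scores of 8 and 12 (average 10, standard deviation 2), a strip averaging 5 triggers recalibration at n = 2 (threshold 6) but not at n = 3 (threshold 4).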