Experimental study of apparent behavior. Fritz Heider & Marianne Simmel. 1944
With Fritz Heider, Simmel co-authored 'An Experimental Study of Apparent Behavior,' which explored the experience of animacy. The study showed that subjects presented with a certain display of inanimate two-dimensional figures are inclined to ascribe intentions to those figures. This result has been taken to establish "the human instinct for storytelling" and to serve as important data in the study of theory of mind.
Small Motion: (\(u\) and \(v\) are less than 1 pixel, or smooth)
Taylor series expansion of \(I\):
\[I(x+u,y+v) = I(x,y) + \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + [\text{higher order terms}]\]
\[\approx I(x,y) + \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v\]
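As a rough numerical check of the linearization above, the sketch below compares the exact value \(I(x+u, y+v)\) with the first-order approximation for a sub-pixel shift and for a five-pixel shift. The analytic image `intensity` and the sample point are hypothetical, chosen only so the derivatives are known in closed form.

```python
import numpy as np

def intensity(x, y):
    # Hypothetical smooth image (an assumption, not from the slides).
    return np.sin(0.3 * x) + np.cos(0.2 * y)

def linearized(x, y, u, v):
    # First-order Taylor approximation of intensity(x + u, y + v).
    Ix = 0.3 * np.cos(0.3 * x)     # dI/dx
    Iy = -0.2 * np.sin(0.2 * y)    # dI/dy
    return intensity(x, y) + Ix * u + Iy * v

x0, y0 = 10.0, 10.0
small_error = abs(intensity(x0 + 0.5, y0 + 0.5) - linearized(x0, y0, 0.5, 0.5))
large_error = abs(intensity(x0 + 5.0, y0 + 5.0) - linearized(x0, y0, 5.0, 5.0))
print(small_error, large_error)
```

For this image the sub-pixel error is far smaller than the five-pixel error, which is why the approximation is only trusted for small motion.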
B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.
How to get more equations for a pixel?
Spatial coherence constraint
Assume the pixel's neighbors have the same \((u,v)\)
If we use a 5x5 window, that gives us 25 equations per pixel
\[0 = I_t(p_i) + \nabla I(p_i) \cdot [u\,v]\]
solving the ambiguity
Least squares problem! (\(A x = b\), where \(A\): 25x2, \(x\): 2x1, \(b\): 25x1)
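A minimal sketch of this least squares step, assuming the spatial gradients `Ix`, `Iy` and the temporal derivative `It` have already been sampled over a 5x5 window. The function name `lucas_kanade_window` and the synthetic data are illustrative, not from the original paper.

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It):
    """Solve A x = b for one flow vector (u, v) shared by the whole window."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)   # 25x2 for a 5x5 window
    b = -It.ravel()                                  # from 0 = I_t + grad(I) . [u v]
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow

# Synthetic, noise-free window: gradients are random, and It is built so that
# the brightness constancy equation holds exactly for a known flow.
rng = np.random.default_rng(0)
Ix = rng.normal(size=(5, 5))
Iy = rng.normal(size=(5, 5))
true_flow = np.array([0.3, -0.2])
It = -(Ix * true_flow[0] + Iy * true_flow[1])

estimated_flow = lucas_kanade_window(Ix, Iy, It)
print(estimated_flow)
```

With real images the 25 equations are inconsistent because of noise, and `lstsq` returns the flow that minimizes the sum of squared residuals.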
When is this solvable? i.e., what are good points to track?
\(A^TA\) should be invertible
\(A^TA\) should not be too small due to noise
eigenvalues \(\lambda_1\) and \(\lambda_2\) of \(A^TA\) should not be too small
\(A^TA\) should be well-conditioned
\(\lambda_1 / \lambda_2\) should not be too large (\(\lambda_1\) is larger eigenvalue)
Does this remind you of anything? (criteria of Harris corner detector)
low texture region
\(\sum \nabla I(\nabla I)^T\)
gradients have small magnitude
small \(\lambda_1\), small \(\lambda_2\)
edge
\(\sum \nabla I(\nabla I)^T\)
large gradients, all the same
large \(\lambda_1\), small \(\lambda_2\)
high textured region
\(\sum \nabla I(\nabla I)^T\)
gradients are different, large magnitudes
large \(\lambda_1\), large \(\lambda_2\)
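The three cases above can be reproduced on small synthetic patches. The sketch below builds \(A^TA = \sum \nabla I(\nabla I)^T\) from finite-difference gradients and inspects its eigenvalues. A corner patch stands in for the textured case, and the thresholds `min_eig` and `max_ratio` are hypothetical values picked for this example only.

```python
import numpy as np

def structure_tensor_eigenvalues(patch):
    """Return (lambda_1, lambda_2) of sum(grad I grad I^T), lambda_1 >= lambda_2."""
    Iy, Ix = np.gradient(patch.astype(float))        # derivatives along rows, columns
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    lam = np.linalg.eigvalsh(A.T @ A)                # ascending order
    return lam[1], lam[0]

def is_good_to_track(lam1, lam2, min_eig=1000.0, max_ratio=10.0):
    # Hypothetical thresholds: both eigenvalues large, ratio not too large.
    return bool(lam2 > min_eig and lam1 / lam2 < max_ratio)

ys, xs = np.mgrid[0:5, 0:5]
flat = np.full((5, 5), 10.0)                         # low texture
edge = (xs >= 2) * 100.0                             # vertical step edge
corner = ((xs >= 2) & (ys >= 2)) * 100.0             # gradients in two directions

for name, patch in [("flat", flat), ("edge", edge), ("corner", corner)]:
    lam1, lam2 = structure_tensor_eigenvalues(patch)
    print(name, lam1, lam2, is_good_to_track(lam1, lam2))
```

Only the corner patch passes both tests: the flat patch has two small eigenvalues and the edge patch has one large and one small eigenvalue.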
the aperture problems solved
errors in Lucas-Kanade
A point does not move like its neighbors
Motion segmentation
Brightness constancy does not hold
Do exhaustive neighborhood search with normalized correlation, tracking features, maybe SIFT, more later...
The motion is large (larger than a pixel)
Nonlinear: iterative refinement
Local minima: coarse-to-fine estimation
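Iterative refinement can be sketched in one dimension: estimate a shift, warp the second signal back by the current estimate, and solve the linearized equation again on the residual. This is a simplified illustration under assumed conditions (a 1-D Gaussian signal, linear interpolation for warping), not the full 2-D algorithm, and it does not cover the coarse-to-fine pyramid.

```python
import numpy as np

def iterative_lk_1d(f, g, num_iters=10):
    """Estimate the shift d such that g(x) is approximately f(x - d)."""
    x = np.arange(len(f), dtype=float)
    fx = np.gradient(f)                      # spatial derivative of the first signal
    d = 0.0
    for _ in range(num_iters):
        g_warped = np.interp(x + d, x, g)    # undo the current shift estimate
        It = g_warped - f                    # remaining temporal difference
        d += -np.sum(fx * It) / np.sum(fx * fx)   # 1-D least squares update
    return d

# Hypothetical test signal: a Gaussian bump moved 6 samples to the right.
x = np.arange(200, dtype=float)
true_shift = 6.0
f = np.exp(-0.5 * ((x - 100.0) / 5.0) ** 2)
g = np.exp(-0.5 * ((x - 100.0 - true_shift) / 5.0) ** 2)

one_step = iterative_lk_1d(f, g, num_iters=1)
refined = iterative_lk_1d(f, g, num_iters=10)
print(one_step, refined)
```

A single linearized step underestimates this shift because the motion is not small relative to the bump width. Repeating the step on the warped signal shrinks the residual motion until the small-motion assumption holds.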
revisiting the small motion assumption
Is this motion small enough?
Probably not: it's much larger than one pixel
How might we solve this problem?
optical flow: aliasing
Temporal aliasing causes ambiguities in optical flow because images can have many pixels with the same intensity. That is, how do we know which "correspondence" is correct?
nearest match is correct (no aliasing)
nearest match is incorrect (aliasing)
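The two cases can be illustrated with a periodic 1-D signal and a brute-force nearest-match search. The signal, its period of 8 samples, and the helper `nearest_match_shift` are assumptions made for this sketch.

```python
import numpy as np

def nearest_match_shift(f, g, max_shift):
    """Among integer shifts with the lowest SSD, return the smallest in magnitude."""
    shifts = sorted(range(-max_shift, max_shift + 1), key=abs)
    errors = [np.sum((np.roll(f, s) - g) ** 2) for s in shifts]
    return shifts[int(np.argmin(errors))]    # argmin keeps the first (nearest) minimum

# Hypothetical periodic signal with a period of 8 samples.
f = np.tile(np.array([0, 1, 3, 1, 0, -1, -3, -1]), 8)

small_motion = nearest_match_shift(f, np.roll(f, 2), max_shift=7)   # true shift 2
large_motion = nearest_match_shift(f, np.roll(f, 6), max_shift=7)   # true shift 6
print(small_motion, large_motion)
```

When the true shift is under half a period, the nearest match recovers it. When the true shift is 6, the pattern matches equally well at a shift of -2, so the nearest match reports motion in the wrong direction.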