Hierarchical Model-Based Motion Estimation

James R. Bergen, P. Anandan, Keith J. Hanna, and Rajesh Hingorani
David Sarnoff Research Center, Princeton NJ 08544, USA

Abstract. This paper describes a hierarchical estimation framework for the computation of diverse representations of motion information. The key features of the resulting framework (or family of algorithms) are a global model that constrains the overall structure of the motion estimated, a local model that is used in the estimation process, and a coarse-fine refinement strategy. Four specific motion models (affine flow, planar surface flow, rigid body motion, and general optical flow) are described along with their application to specific examples.

1 Introduction

A large body of work in computer vision over the last 10 or 15 years has been concerned with the extraction of motion information from image sequences. The motivation of this work is actually quite diverse, with intended applications ranging from data compression to pattern recognition (alignment strategies) to robotics and vehicle navigation. In tandem with this diversity of motivation is a diversity of representation of motion information: from optical flow, to affine or other parametric transformations, to 3-d egomotion plus range or other structure. The purpose of this paper is to describe a common framework within which all of these computations can be represented.

This unification is possible because all of these problems can be viewed from the perspective of image registration. That is, given an image sequence, compute a representation of motion that best aligns pixels in one frame of the sequence with those in the next. The differences among the various approaches mentioned above can then be expressed as different parametric representations of the alignment process. In all cases the function minimized is the same; the difference lies in the fact that it is minimized with respect to different parameters.

The key features of the resulting framework (or family of algorithms) are a global model that constrains the overall structure of the motion estimated, a local model that is used in the estimation process¹, and a coarse-fine refinement strategy. An example of a global model is the rigidity constraint; an example of a local model is that displacement is constant over a patch. Coarse-fine refinement or hierarchical estimation is included in this framework for reasons that go well beyond the conventional ones of computational efficiency. Its utility derives from the nature of the objective function common to the various motion models.

¹ Because this model will be used in a multiresolution data structure, it is "local" in a slightly unconventional sense that will be discussed below.

1.1 Hierarchical estimation

Hierarchical approaches have been used by various researchers (e.g., see [2, 10, 11, 22, 19]). More recently, a theoretical analysis of hierarchical motion estimation was described in [8] and the advantages of using parametric models within such a framework have also been discussed in [5].

Arguments for use of hierarchical (i.e. pyramid based) estimation techniques for motion estimation have usually focused on issues of computational efficiency. A matching process that must accommodate large displacements can be very expensive to compute. Simple intuition suggests that if large displacements can be computed using low resolution image information, great savings in computation will be achieved. Higher resolution information can then be used to improve the accuracy of displacement estimation by incrementally estimating small displacements (see, for example, [2]). However, it can also be argued that it is not only efficient to ignore high resolution image information when computing large displacements; in a sense it is necessary to do so. This is because of aliasing of high spatial frequency components undergoing large motion. Aliasing is the source of false matches in correspondence solutions or (equivalently) local minima in the objective function used for minimization. Minimization or matching in a multiresolution framework helps to eliminate problems of this type. Another way of expressing this is to say that many sources of non-convexity that complicate the matching process are not stable with respect to scale.

With only a few exceptions ([5, 9]), much of this work has concentrated on using a small family of "generic" motion models within the hierarchical estimation framework. Such models involve the use of some type of a smoothness constraint (sometimes allowing for discontinuities) to constrain the estimation process at image locations containing little or no image structure. However, as noted above, the arguments for use of a multiresolution, hierarchical approach apply equally to more structured models of image motion. In this paper, we describe a variety of motion models used within the same hierarchical framework.
These models provide powerful constraints on the estimation process and their use within the hierarchical estimation framework leads to increased accuracy, robustness and efficiency. We outline the implementation of four new models and present results using real images.

1.2 Motion Models

Because optical flow computation is an underconstrained problem, all motion estimation algorithms involve additional assumptions about the structure of the motion computed. In many cases, however, this assumption is not expressed explicitly as such; rather it is presented as a regularization term in an objective function [14, 16] or described primarily as a computational issue [18, 4, 2, 20].

Previous work involving explicitly model-based motion estimation includes direct methods [17, 21], [13] as well as methods for estimation under restricted conditions [7, 9]. The first class of methods uses a global egomotion constraint while those in the second class of methods rely on parametric motion models within local regions. The description "direct methods" actually applies equally to both types.

With respect to motion models, these algorithms can be divided into three categories: (i) fully parametric, (ii) quasi-parametric, and (iii) non-parametric. Fully parametric models describe the motion of individual pixels within a region in terms of a parametric form. These include affine and quadratic flow fields. Quasi-parametric models involve representing the motion of a pixel as a combination of a parametric component that is valid for the entire region and a local component which varies from pixel to pixel. For instance, the rigid motion model belongs to this class: the egomotion parameters constrain the local flow vector to lie along a specific line, while the local depth value determines the exact value of the flow vector at each pixel. By non-parametric models, we mean those such as are commonly used in optical flow computation, i.e. those involving the use of some type of a smoothness or uniformity constraint.

A parallel taxonomy of motion models can be constructed by considering local models that constrain the motion in the neighborhood of a pixel and global models that describe the motion over the entire visual field. This distinction becomes especially useful in analyzing hierarchical approaches where the meaning of "local" changes as the computation moves through the multiresolution hierarchy. In this scheme fully parametric models are global models, non-parametric models such as smoothness or uniformity of displacement are local models, and quasi-parametric models involve both a global and a local model. The reason for describing motion models in this way is that it clarifies the relationship between different approaches and allows consideration of the range of possibilities in choosing a model appropriate to a given situation. Purely global (or fully parametric) models in essence trivially imply a local model so no choice is possible. However, in the case of quasi- or non-parametric models, the local model can be more or less complex. Also, it makes clear that by varying the size of local neighborhoods, it is possible to move continuously from a partially or purely local model to a purely global one.

The reasons for choosing one model or another are generally quite intuitive, though the exact choice of model is not always easy to make in a rigorous way. In general, parametric models constrain the local motion more strongly than the less parametric ones. A small number of parameters (e.g., six in the case of affine flow) are sufficient to completely specify the flow vector at every point within their region of applicability.
However, they tend to be applicable only within local regions, and in many cases are approximations to the actual flow field within those regions (although they may be very good approximations). From the point of view of motion estimation, such models allow the precise estimation of motion at locations containing no image structure, provided the region contains at least a few locations with significant image structure.

Quasi-parametric models constrain the flow field less, but nevertheless constrain it to some degree. For instance, for rigidly moving objects under perspective projection, the rigid motion parameters (same as the egomotion parameters in the case of observer motion) constrain the flow vector at each point to lie along a line in the velocity space. One-dimensional image structure (e.g., an edge) is generally sufficient to precisely estimate the motion of that point. These models tend to be applicable over a wide region in the image, perhaps even the entire image. If the local structure of the scene can be further parametrized (e.g., planar surfaces under rigid motion), the model becomes fully parametric within the region.

Non-parametric models require local image structure that is two-dimensional (e.g., corner points, textured areas). However, with the use of a smoothness constraint it is usually possible to "fill in" where there is inadequate local information. The estimation process is typically more computationally expensive than the other two cases. These models are more generally applicable (not requiring parametrizable scene structure or motion) than the other two classes.

1.3 Paper Organization

The remainder of the paper consists of an overview of the hierarchical motion estimation framework, a description of each of the four models and their application to specific examples, and a discussion of the overall approach and its applications.


2 Hierarchical Motion Estimation

Figure 1 describes the hierarchical motion estimation framework. The basic components of this framework are: (i) pyramid construction, (ii) motion estimation, (iii) image warping, and (iv) coarse-to-fine refinement.

There are a number of ways to construct the image pyramids. Our implementation uses the Laplacian pyramid described in [6], which involves simple local computations and provides the necessary spatial-frequency decomposition.

The motion estimator varies according to the model. In all cases, however, the estimation process involves SSD minimization, but instead of performing a discrete search (such as in [3]), Gauss-Newton minimization is employed in a refinement process. The basic assumption behind SSD minimization is intensity constancy, as applied to the Laplacian pyramid images. Thus,

$$I(\mathbf{x}, t) = I(\mathbf{x} - \mathbf{u}(\mathbf{x}), t - 1)$$

where $\mathbf{x} = (x, y)$ denotes the spatial image position of a point, $I$ the (Laplacian pyramid) image intensity and $\mathbf{u}(\mathbf{x}) = (u(x, y), v(x, y))$ denotes the image velocity at that point. The SSD error measure for estimating the flow field within a region is:

$$E(\{\mathbf{u}\}) = \sum_{\mathbf{x}} \left( I(\mathbf{x}, t) - I(\mathbf{x} - \mathbf{u}(\mathbf{x}), t - 1) \right)^2 \tag{1}$$

where the sum is computed over all the points within the region and $\{\mathbf{u}\}$ is used to denote the entire flow field within that region. In general this error (which is actually the sum of individual errors) is not quadratic in terms of the unknown quantities $\{\mathbf{u}\}$, because of the complex pattern of intensity variations. Hence, we typically have a non-linear minimization problem at hand.

Note that the basic structure of the problem is independent of the choice of a motion model. The model is in essence a statement about the function $\mathbf{u}(\mathbf{x})$. To make this explicit, we can write

$$\mathbf{u}(\mathbf{x}) = \mathbf{u}(\mathbf{x}; \mathbf{p}_m), \tag{2}$$

where $\mathbf{p}_m$ is a vector representing the model parameters.

A standard numerical approach for solving such a problem is to apply Newton's method. However, for errors which are sums of squares a good approximation to Newton's method is the Gauss-Newton method, which uses a first order expansion of the individual error quantities before squaring. If $\{\mathbf{u}\}_i$ is the current estimate of the flow field during the $i$th iteration, the incremental estimate $\{\delta\mathbf{u}\}$ can be obtained by minimizing the quadratic error measure

$$E(\{\delta\mathbf{u}\}) = \sum_{\mathbf{x}} \left( \Delta I + \nabla I \cdot \delta\mathbf{u}(\mathbf{x}) \right)^2, \tag{3}$$

where

$$\Delta I(\mathbf{x}) = I(\mathbf{x}, t) - I(\mathbf{x} - \mathbf{u}_i(\mathbf{x}), t - 1),$$

that is, the difference between the two images at corresponding pixels, after taking the current estimate into account. As such, the minimization problem described in Equation 3 is underconstrained. The different motion models constrain the flow field in different ways. When these are used to describe the flow field, the estimation problem can be reformulated in terms of the unknown (incremental) model parameters. The details of these reformulations are described in the various sections corresponding to the individual motion models.
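As an illustration of the Gauss-Newton refinement in Equation 3, the following is a minimal sketch for the simplest possible model: a single constant displacement of a one-dimensional signal. It is not the paper's implementation. The function names, the periodic boundary handling, the linear interpolation, and the central-difference gradient are all assumptions made for this example.

```python
import math

def warp_1d(signal, u):
    """Sample signal at x - u with linear interpolation (periodic borders assumed)."""
    n = len(signal)
    out = []
    for x in range(n):
        p = x - u
        i0 = math.floor(p)
        a = p - i0
        out.append((1 - a) * signal[i0 % n] + a * signal[(i0 + 1) % n])
    return out

def gradient_1d(signal):
    """Central-difference gradient (periodic borders assumed)."""
    n = len(signal)
    return [(signal[(x + 1) % n] - signal[(x - 1) % n]) / 2.0 for x in range(n)]

def estimate_shift(ref, prev, iterations=10, u0=0.0):
    """Estimate a constant u such that ref(x) = prev(x - u).

    Each iteration warps prev with the current estimate, forms the
    residual Delta I, and minimizes sum (Delta I + grad * du)^2 over du.
    """
    u = u0
    grad = gradient_1d(ref)            # gradient of the reference image
    denom = sum(g * g for g in grad)
    for _ in range(iterations):
        warped = warp_1d(prev, u)
        delta = [r - w for r, w in zip(ref, warped)]
        u += -sum(g * d for g, d in zip(grad, delta)) / denom
    return u
```

For a smooth signal such as a low-frequency sinusoid shifted by 1.5 samples, a few iterations of `estimate_shift` should bring the estimate close to 1.5. For displacements that are large relative to the signal's wavelength the iteration can settle in a wrong local minimum, which is the situation the coarse-to-fine strategy is meant to avoid.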


The third component, image warping, is achieved by using the current values of the model parameters to compute a flow field, and then using this flow field to warp $I(t - 1)$ towards $I(t)$, which is used as the reference image. Our current warping algorithm uses bilinear interpolation. The warped image (as against the original second image) is then used for the computation of the error $\Delta I$ for further estimation². The spatial gradient $\nabla I$ computations are based on the reference image.

The final component, coarse-to-fine refinement, propagates the current motion estimates from one level to the next level where they are then used as initial estimates. For the parametric component of the model, this is easy; the values of the parameters are simply transmitted to the next level. However, when a local model is also used, that information is typically in the form of a dense image (or images), e.g., a flow field or a depth map. This image (or images) must be propagated via a pyramid expansion operation as described in [6]. The global parameters in combination with the local information can then be used to generate the flow field necessary to perform the initial warping at this next level.
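The warping step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: images are lists of rows, the flow is given as two dense arrays, and sample positions that fall outside the image are clamped to the border (the paper does not say how borders are treated).

```python
import math

def sample_bilinear(image, x, y):
    """Bilinearly interpolate image at real-valued (x, y); borders are clamped."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom

def warp_image(prev, flow_u, flow_v):
    """Warp I(t-1) towards I(t): output(x, y) = prev(x - u(x, y), y - v(x, y))."""
    h, w = len(prev), len(prev[0])
    return [[sample_bilinear(prev, x - flow_u[y][x], y - flow_v[y][x])
             for x in range(w)]
            for y in range(h)]
```

The residual $\Delta I$ is then the pixel-wise difference between the reference image and the output of `warp_image`.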

3 Motion Models

3.1 Affine Flow

The Model: When the distance between the background surfaces and the camera is large, it is usually possible to approximate the motion of the surface as an affine transformation:

$$\begin{aligned} u(x, y) &= a_1 + a_2 x + a_3 y \\ v(x, y) &= a_4 + a_5 x + a_6 y \end{aligned} \tag{4}$$

Using vector notation this can be rewritten as follows:

$$\mathbf{u}(\mathbf{x}) = \mathbf{X}(\mathbf{x})\,\mathbf{a} \tag{5}$$

where $\mathbf{a}$ denotes the vector $(a_1, a_2, a_3, a_4, a_5, a_6)^T$, and

$$\mathbf{X}(\mathbf{x}) = \begin{bmatrix} 1 & x & y & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & x & y \end{bmatrix}.$$

Thus, the motion of the entire region is completely specified by the parameter vector $\mathbf{a}$, which is the unknown quantity that needs to be estimated.

The Estimation Algorithm: Let $\mathbf{a}_i$ denote the current estimate of the affine parameters. After using the flow field represented by these parameters in the warping step, an incremental estimate $\delta\mathbf{a}$ can be determined. To achieve this, we insert the parametric form of $\delta\mathbf{u}$ into Equation 3, and obtain an error measure that is a function of $\delta\mathbf{a}$:

$$E(\delta\mathbf{a}) = \sum_{\mathbf{x}} \left( \Delta I + (\nabla I)^T \mathbf{X}\, \delta\mathbf{a} \right)^2 \tag{6}$$

Minimizing this error with respect to $\delta\mathbf{a}$ leads to the equation:

$$\left[ \sum_{\mathbf{x}} \mathbf{X}^T (\nabla I)(\nabla I)^T \mathbf{X} \right] \delta\mathbf{a} = -\sum_{\mathbf{x}} \mathbf{X}^T (\nabla I)(\Delta I). \tag{7}$$
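One way to picture the affine update in Equation 7 is as a 6 x 6 linear system accumulated over the pixels of the region. The sketch below is an illustration only: the function names and input layout are hypothetical, and the small Gaussian-elimination solver stands in for whatever linear solver an implementation would actually use.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def affine_increment(points, grads, deltas):
    """Return the affine increment (6 values) solving the normal equations.

    points: (x, y) pixel coordinates
    grads:  (Ix, Iy) spatial gradients of the reference image
    deltas: residuals Delta I after warping with the current estimate
    """
    N = [[0.0] * 6 for _ in range(6)]
    rhs = [0.0] * 6
    for (x, y), (ix, iy), d in zip(points, grads, deltas):
        # j = X^T grad(I) for X = [[1, x, y, 0, 0, 0], [0, 0, 0, 1, x, y]]
        j = [ix, ix * x, ix * y, iy, iy * x, iy * y]
        for r in range(6):
            rhs[r] -= j[r] * d
            for c in range(6):
                N[r][c] += j[r] * j[c]
    return solve_linear(N, rhs)
```

The returned increment would be added to the current affine estimate before the next warping step. The system is only solvable when the gradients in the region are sufficiently varied; a production implementation would need to check for a near-singular matrix.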

² We have avoided using the standard notation $I_t$ in order to avoid any confusion about this point.

Experiments with the affine motion model: To demonstrate use of the affine flow model, we show its performance on an aerial image sequence. A frame of the original sequence is shown in Figure 2a and the unprocessed difference between two frames of this sequence is shown in Figure 2b. Figure 2c shows the result of estimating an affine transformation using the hierarchical warp motion approach, and then using this to compensate for camera motion induced flow. Although the terrain is not perfectly flat, we still obtain encouraging compensation results. In this example the simple difference between the compensated and original image is sufficient to detect and locate a helicopter in the image. We use extensions of the approach, like integration of compensated difference images over time, to detect smaller objects moving more slowly with respect to the background.

3.2 Planar Surface Flow

The Model: It is generally known that the instantaneous motion of a planar surface undergoing rigid motion can be described as a second order function of image coordinates involving eight independent parameters (e.g., see [15]). In this section we provide a brief derivation of this description and make some observations concerning its estimation.

We begin by observing that the image motion induced by a rigidly moving object (in this case a plane) can be written as:

$$\mathbf{u}(\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \mathbf{A}(\mathbf{x})\,\mathbf{t} + \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega} \tag{8}$$

where $Z(\mathbf{x})$ is the distance from the camera of the point (i.e., depth) whose image position is $\mathbf{x}$, and

$$\mathbf{A}(\mathbf{x}) = \begin{bmatrix} -f & 0 & x \\ 0 & -f & y \end{bmatrix}, \qquad \mathbf{B}(\mathbf{x}) = \begin{bmatrix} (xy)/f & -(f^2 + x^2)/f & y \\ (f^2 + y^2)/f & -(xy)/f & -x \end{bmatrix}.$$

The $\mathbf{A}$ and the $\mathbf{B}$ matrices depend only on the image positions and the focal length $f$ and not on the unknowns: $\mathbf{t}$, the translation vector, $\boldsymbol{\omega}$, the angular velocity vector, and $Z$.

A planar surface can be described by the equation

$$k_1 X + k_2 Y + k_3 Z = 1 \tag{9}$$

where $(k_1, k_2, k_3)$ relate to the surface slant, tilt, and the distance of the plane from the origin of the chosen coordinate system (in this case, the camera origin). Dividing throughout by $Z$, we get

$$\frac{1}{Z} = k_1 \frac{X}{Z} + k_2 \frac{Y}{Z} + k_3.$$

Using $\mathbf{k}$ to denote the vector $(k_1, k_2, k_3)^T$ and $\mathbf{r}$ to denote the vector $(x/f, y/f, 1)^T$, we obtain

$$\frac{1}{Z(\mathbf{x})} = \mathbf{r}(\mathbf{x})^T \mathbf{k}.$$

Substituting this into Equation 8 gives

$$\mathbf{u}(\mathbf{x}) = \left( \mathbf{A}(\mathbf{x})\,\mathbf{t} \right) \left( \mathbf{r}(\mathbf{x})^T \mathbf{k} \right) + \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega} \tag{10}$$

This flow field is quadratic in $\mathbf{x}$ and can be written also as

$$\begin{aligned} u(\mathbf{x}) &= a_1 + a_2 x + a_3 y + a_7 x^2 + a_8 xy \\ v(\mathbf{x}) &= a_4 + a_5 x + a_6 y + a_7 xy + a_8 y^2 \end{aligned} \tag{11}$$

where the 8 coefficients $(a_1, \ldots, a_8)$ are functions of the motion parameters $\mathbf{t}$, $\boldsymbol{\omega}$ and the surface parameters $\mathbf{k}$. Since this 8-parameter form is rather well-known (e.g., see [15]) we omit its details.

If the egomotion parameters are known, then the three parameter vector $\mathbf{k}$ can be used to represent the motion of the planar surface. Otherwise the 8-parameter representation can be used. In either case, the flow field is linear in the unknown parameters.

The problem of estimating planar surface motion has been extensively studied before [21, 1, 23]. In particular, Negahdaripour and Horn [21] suggest iterative methods for estimating the motion and the surface parameters, as well as a method of estimating the 8 parameters and then decomposing them into the five rigid motion parameters and the three surface parameters in closed form.

Besides the embedding of these computations within the hierarchical estimation framework, we also take a slightly different approach to the problem. We assume that the rigid motion parameters are already known or can be estimated (e.g., see Section 3.3 below). Then, the problem reduces to that of estimating the three surface parameters $\mathbf{k}$. There are several practical reasons to prefer this approach. First, in many situations the rigid motion model may be more globally applicable than the planar surface model, and can be estimated using information from all the surfaces undergoing the same rigid motion. Second, unless the region of interest subtends a significant field of view, the second order components of the flow field will be small, and hence the estimation of the eight parameters will be inaccurate and the process may be unstable. On the other hand, the information concerning the three parameters $\mathbf{k}$ is contained in the first order components of the flow field, and (if the rigid motion parameters are known) their estimation will be more accurate and stable.
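The paper omits the expressions for the eight coefficients. The sketch below expands Equation 10 term by term, using the A and B matrices as written above, to obtain one consistent set of coefficients and compares it numerically with the direct evaluation. The coefficient formulas in `eight_parameters` are this example's own expansion, not taken from the paper or from [15], and the function names are hypothetical.

```python
def planar_flow(x, y, f, t, w, k):
    """Equation 10 at image point (x, y): u = (A t)(r^T k) + B w."""
    inv_z = k[0] * x / f + k[1] * y / f + k[2]        # r(x)^T k = 1 / Z(x)
    at_u = -f * t[0] + x * t[2]                       # first row of A t
    at_v = -f * t[1] + y * t[2]                       # second row of A t
    bw_u = (x * y / f) * w[0] - ((f * f + x * x) / f) * w[1] + y * w[2]
    bw_v = ((f * f + y * y) / f) * w[0] - (x * y / f) * w[1] - x * w[2]
    return at_u * inv_z + bw_u, at_v * inv_z + bw_v

def eight_parameters(f, t, w, k):
    """Coefficients a1..a8 of Equation 11, from expanding Equation 10."""
    a1 = -f * (t[0] * k[2] + w[1])
    a2 = t[2] * k[2] - t[0] * k[0]
    a3 = w[2] - t[0] * k[1]
    a4 = f * (w[0] - t[1] * k[2])
    a5 = -w[2] - t[1] * k[0]
    a6 = t[2] * k[2] - t[1] * k[1]
    a7 = (t[2] * k[0] - w[1]) / f
    a8 = (t[2] * k[1] + w[0]) / f
    return a1, a2, a3, a4, a5, a6, a7, a8

def quadratic_flow(x, y, a):
    """Equation 11 evaluated with coefficients a = (a1, ..., a8)."""
    a1, a2, a3, a4, a5, a6, a7, a8 = a
    u = a1 + a2 * x + a3 * y + a7 * x * x + a8 * x * y
    v = a4 + a5 * x + a6 * y + a7 * x * y + a8 * y * y
    return u, v
```

Note that the second order terms carry a factor of 1/f, which is consistent with the remark above that they are small unless the region subtends a significant field of view.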
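The Estimation Algorithm: Let $\mathbf{k}_i$ denote the current estimate of the surface parameters, and let $\mathbf{t}$ and $\boldsymbol{\omega}$ denote the motion parameters. These parameters are used to construct an initial flow field $\mathbf{u}_i$ that is used in the warping step. The residual information is then used to determine an incremental estimate $\delta\mathbf{k}$. By substituting the parametric form of $\delta\mathbf{u}$

$$\begin{aligned} \delta\mathbf{u} &= \mathbf{u} - \mathbf{u}_i \\ &= \left( \mathbf{A}(\mathbf{x})\,\mathbf{t} \right) \left( \mathbf{r}(\mathbf{x})^T (\mathbf{k}_i + \delta\mathbf{k}) \right) + \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega} - \left[ \left( \mathbf{A}(\mathbf{x})\,\mathbf{t} \right) \left( \mathbf{r}(\mathbf{x})^T \mathbf{k}_i \right) + \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega} \right] \\ &= \left( \mathbf{A}(\mathbf{x})\,\mathbf{t} \right) \mathbf{r}(\mathbf{x})^T \delta\mathbf{k} \end{aligned} \tag{12}$$

in Equation 3, we can obtain the incremental estimate $\delta\mathbf{k}$ as the vector that minimizes:

$$E(\delta\mathbf{k}) = \sum_{\mathbf{x}} \left( \Delta I + (\nabla I)^T (\mathbf{A}\mathbf{t})\, \mathbf{r}^T \delta\mathbf{k} \right)^2 \tag{13}$$

Minimizing this error leads to the equation:

$$\left[ \sum_{\mathbf{x}} \mathbf{r}\, (\mathbf{t}^T \mathbf{A}^T)(\nabla I)(\nabla I)^T (\mathbf{A}\mathbf{t})\, \mathbf{r}^T \right] \delta\mathbf{k} = -\sum_{\mathbf{x}} \mathbf{r}\, (\mathbf{t}^T \mathbf{A}^T)(\nabla I)\, \Delta I \tag{14}$$

This equation can be solved to obtain the incremental estimate $\delta\mathbf{k}$.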
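The step from Equation 13 to Equation 14 can be made explicit as follows. The scalar shorthand $g(\mathbf{x})$ is introduced here only for this derivation and does not appear in the paper.

$$g(\mathbf{x}) = (\nabla I)^T \mathbf{A}(\mathbf{x})\,\mathbf{t}, \qquad E(\delta\mathbf{k}) = \sum_{\mathbf{x}} \left( \Delta I + g\, \mathbf{r}^T \delta\mathbf{k} \right)^2$$

$$\frac{\partial E}{\partial\, \delta\mathbf{k}} = 2 \sum_{\mathbf{x}} g\, \mathbf{r} \left( \Delta I + g\, \mathbf{r}^T \delta\mathbf{k} \right) = 0 \quad \Longrightarrow \quad \left[ \sum_{\mathbf{x}} g^2\, \mathbf{r}\, \mathbf{r}^T \right] \delta\mathbf{k} = -\sum_{\mathbf{x}} g\, \mathbf{r}\, \Delta I$$

Since $g^2 = (\mathbf{t}^T \mathbf{A}^T)(\nabla I)(\nabla I)^T (\mathbf{A}\mathbf{t})$, this is the same 3 x 3 system as Equation 14. Pixels at which the gradient is orthogonal to $\mathbf{A}\mathbf{t}$ have $g = 0$ and contribute nothing to the estimate.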


Experiments with the planar surface motion model: We demonstrate the application of the planar surface model using images from an outdoor sequence. One of the input images is shown in Figure 3a, and the difference between both input images is shown in Figure 3b. After estimating the camera motion between the images using the algorithm described in Section 3.3, we applied the planar surface estimation algorithm to a manually selected image window placed roughly over a region on the ground plane. These parameters were then used to warp the second frame towards the first (this process should align the ground plane alone). The difference between this warped image and the original image is shown in Figure 3c. The figure shows compensation of the ground plane motion, leaving residual parallax motion of the trees and other objects in the background. Finally, in order to demonstrate the plane-fit, we graphically projected a rectangular grid onto that plane. This is shown superimposed on the input image in Figure 3d.

3.3 Rigid Body Model

The Model: The motion of arbitrary surfaces undergoing rigid motion cannot usually be described by a single global model. We can however make use of the global rigid body model if we combine it with a local model of the surface. In this section, we provide a brief derivation of the global and the local models. Hanna [12] provides further details and results, and also describes how the local and global models interact at corner-like and edge-like image structures.

As described in Section 3.2, the image motion induced by a rigidly moving object can be written as:

$$\mathbf{u}(\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \mathbf{A}(\mathbf{x})\,\mathbf{t} + \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega} \tag{15}$$

where $Z(\mathbf{x})$ is the distance from the camera of the point (i.e., its depth) whose image position is $\mathbf{x}$, and

$$\mathbf{A}(\mathbf{x}) = \begin{bmatrix} -f & 0 & x \\ 0 & -f & y \end{bmatrix}, \qquad \mathbf{B}(\mathbf{x}) = \begin{bmatrix} (xy)/f & -(f^2 + x^2)/f & y \\ (f^2 + y^2)/f & -(xy)/f & -x \end{bmatrix}.$$

The $\mathbf{A}$ and the $\mathbf{B}$ matrices depend only on the image positions and the focal length $f$ and not on the unknowns: $\mathbf{t}$, the translation vector, $\boldsymbol{\omega}$, the angular velocity vector, and $Z$. Equation 15 relates the parameters of the global model, $\boldsymbol{\omega}$ and $\mathbf{t}$, with parameters of the local scene structure, $Z(\mathbf{x})$.

A local model we use is the frontal-planar model, which means that over a local image patch, we assume that $Z(\mathbf{x})$ is constant. An alternative model uses the assumption that $\delta Z(\mathbf{x})$, the difference between a previous estimate and a refined estimate, is constant over each local image patch.

We refine the local and global models in turn using initial estimates of the local structure parameters, $Z(\mathbf{x})$, and the global rigid body parameters $\boldsymbol{\omega}$ and $\mathbf{t}$. This local/global refinement is iterated several times.

The Estimation Algorithm: Let the current estimates be denoted as $Z_i(\mathbf{x})$, $\mathbf{t}_i$ and $\boldsymbol{\omega}_i$. As in the other models, we can use the model parameters to construct an initial flow field, $\mathbf{u}_i(\mathbf{x})$, which is used to warp one of the image frames towards the next. The residual error between the warped image and the original image to which it is warped is used to refine the parameters of the local and global models. We now show how these models are refined.

We begin by writing Equation 15 in an incremental form so that

$$\delta\mathbf{u}(\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \mathbf{A}(\mathbf{x})\,\mathbf{t} + \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega} - \frac{1}{Z_i(\mathbf{x})} \mathbf{A}(\mathbf{x})\,\mathbf{t}_i - \mathbf{B}(\mathbf{x})\,\boldsymbol{\omega}_i \tag{16}$$

Inserting the parametric form of $\delta\mathbf{u}$ into Equation 3 we obtain the pixel-wise error as

$$E(\mathbf{t}, \boldsymbol{\omega}, 1/Z(\mathbf{x})) = \left( \Delta I + (\nabla I)^T \mathbf{A}\mathbf{t}/Z(\mathbf{x}) + (\nabla I)^T \mathbf{B}\boldsymbol{\omega} - (\nabla I)^T \mathbf{A}\mathbf{t}_i/Z_i(\mathbf{x}) - (\nabla I)^T \mathbf{B}\boldsymbol{\omega}_i \right)^2 \tag{17}$$

To refine the local models, we assume that $1/Z(\mathbf{x})$ is constant over 5 x 5 image patches centered on each image pixel. We then algebraically solve for this $Z$ both in order to estimate its current value, and to eliminate it from the global error measure. Consider the local component of the error measure,

$$E_{local} = \sum_{5 \times 5} E(\mathbf{t}, \boldsymbol{\omega}, 1/Z(\mathbf{x})). \tag{18}$$

Differentiating Equation 17 with respect to $1/Z(\mathbf{x})$ and setting the result to zero, we get

$$\frac{1}{Z(\mathbf{x})} = \frac{ -\sum_{5 \times 5} (\nabla I)^T \mathbf{A}\mathbf{t} \left( \Delta I - (\nabla I)^T \mathbf{A}\mathbf{t}_i/Z_i(\mathbf{x}) + (\nabla I)^T \mathbf{B}\boldsymbol{\omega} - (\nabla I)^T \mathbf{B}\boldsymbol{\omega}_i \right) }{ \sum_{5 \times 5} \left( (\nabla I)^T \mathbf{A}\mathbf{t} \right)^2 } \tag{19}$$
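The closed-form local update of Equation 19 can be sketched for a single patch as follows. This is an illustration under assumptions: the patch is passed as a list of per-pixel tuples, the function names are hypothetical, and returning `None` for a patch with no usable gradient is a choice made here, not something the paper specifies.

```python
def A_matrix(x, y, f):
    """The 2x3 matrix A(x) of Equation 15."""
    return [[-f, 0.0, x], [0.0, -f, y]]

def B_matrix(x, y, f):
    """The 2x3 matrix B(x) of Equation 15."""
    return [[x * y / f, -(f * f + x * x) / f, y],
            [(f * f + y * y) / f, -x * y / f, -x]]

def grad_dot(grad, M, vec):
    """Scalar grad^T M vec for a 2-vector grad, 2x3 matrix M, 3-vector vec."""
    return sum(grad[r] * sum(M[r][c] * vec[c] for c in range(3))
               for r in range(2))

def refine_inverse_depth(patch, f, t, w, t_i, w_i):
    """Solve Equation 19 for 1/Z, assumed constant over the patch.

    patch: iterable of (x, y, grad, delta_i, inv_z_i) per pixel, where
           grad = (Ix, Iy), delta_i is the residual Delta I, and inv_z_i
           is the previous inverse-depth estimate at that pixel.
    """
    num = 0.0
    den = 0.0
    for x, y, grad, delta_i, inv_z_i in patch:
        A = A_matrix(x, y, f)
        B = B_matrix(x, y, f)
        g = grad_dot(grad, A, t)
        c = (delta_i - grad_dot(grad, A, t_i) * inv_z_i
             + grad_dot(grad, B, w) - grad_dot(grad, B, w_i))
        num += g * c
        den += g * g
    if den == 0.0:
        return None  # no gradient component along A t anywhere in the patch
    return -num / den
```

The denominator vanishes where the image gradient has no component along the translational flow direction, for example in textureless patches.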

To refine the global model, we minimize the error in Equation 17 summed over the entire image:

$$E_{global} = \sum_{Image} E(\mathbf{t}, \boldsymbol{\omega}, 1/Z(\mathbf{x})). \tag{20}$$

We insert the expression for $1/Z(\mathbf{x})$ given in Equation 19 (not the current numerical value of the local parameter) into Equation 20. The result is an expression for $E_{global}$ that is non-quadratic in $\mathbf{t}$ but quadratic in $\boldsymbol{\omega}$. We recover refined estimates of $\mathbf{t}$ and $\boldsymbol{\omega}$ by performing one Gauss-Newton minimization step using the previous estimates of the global parameters, $\mathbf{t}_i$ and $\boldsymbol{\omega}_i$, as starting values. Expressions are evaluated numerically at $\mathbf{t} = \mathbf{t}_i$ and $\boldsymbol{\omega} = \boldsymbol{\omega}_i$.

We then repeat the estimation algorithm several times at each image resolution.

Experiments with the rigid body motion model: We have chosen an outdoor scene to demonstrate the rigid body motion model. Figure 4a shows one of the input images, and Figure 4b shows the difference between the two input images. The algorithm was performed beginning at level 3 (subsampled by a factor of 8) of a Laplacian pyramid. The local surface parameters $1/Z(\mathbf{x})$ were all initialized to zero, and the rigid-body motion parameters were initialized to $\mathbf{t}_0 = (0, 0, 1)^T$ and $\boldsymbol{\omega}_0 = (0, 0, 0)^T$. The model parameters were refined 10 times at each image resolution. Figure 4c shows the difference image between the second image and the first image after being warped using the final estimates of the rigid-body motion parameters and the local surface parameters. Figure 4d shows an image of the recovered local surface parameters $1/Z(\mathbf{x})$ such that bright points are nearer the camera than dark points. The recovered inverse ranges are plausible almost everywhere, except at the image border and near the recovered focus of expansion. The bright dot at the bottom right hand side of the inverse range map corresponds to a leaf in the original image that is blowing across the ground towards the camera. Figure 4e shows a table of rigid-body motion parameters that were recovered at the end of each resolution of analysis. More experimental results and a detailed discussion of the algorithm's performance on various types of scenes can be found in [12].

3.4 General Flow Fields

The Model: Unconstrained general flow fields are typically not described by any global parametric model. Different local models have been used to facilitate the estimation process, including constant flow within a local window and locally smooth or continuous flow. The former facilitates direct local estimation [18, 20], whereas the latter model requires iterative relaxation techniques [16]. It is also not uncommon to use the combination of these two types of local models (e.g., [3, 10]).

The local model chosen here is constant flow within 5 x 5 pixel windows at each level of the pyramid. This is the same model as used by Lucas and Kanade [18] but here it is embedded as a local model within the hierarchical estimation framework.

The Estimation Algorithm: Assume that we have an approximate flow field from previous levels (or previous iterations at the same level). Assuming that the incremental flow vector $\delta\mathbf{u}$ is constant within the 5 x 5 window, Equation 3 can be written as

$$E(\delta\mathbf{u}) = \sum_{\mathbf{x}} \left( \Delta I + (\nabla I)^T \delta\mathbf{u} \right)^2 \tag{21}$$

where the sum is taken within the 5 x 5 window. Minimizing this error with respect to $\delta\mathbf{u}$ leads to the equation

$$\left[ \sum_{\mathbf{x}} (\nabla I)(\nabla I)^T \right] \delta\mathbf{u} = -\sum_{\mathbf{x}} \nabla I\, \Delta I. \tag{22}$$

We make some observations concerning the singularities of this relationship. If the summing window consists of a single element, the 2 x 2 matrix on the left-hand side is an outer product of a 2 x 1 vector and hence has a rank of at most unity. In our case, when the summing window consists of 25 points, the rank of the matrix on the left-hand side will be two unless the directions of the gradient vectors $\nabla I$ everywhere within the window coincide. This situation is the general case of the aperture effect.

In our implementation of this technique, the flow estimate at each point is obtained by using a 5 x 5 window centered around that point. This amounts to assuming implicitly that the flow field varies smoothly over the image.

Experiments with the general flow model: We demonstrate the general flow algorithm on an image sequence containing several independently moving objects, a case for which the other motion models described here are not applicable. Figure 5a shows one image of the original sequence. Figure 5b shows the difference between the two frames that were used to compute image flow. Figure 5c shows little difference between the compensated image and the other original image. Figure 5d shows the horizontal component of the computed flow field, and Figure 5e shows the vertical component. In local image regions where image structure is well-defined, and where the local image motion is simple, the recovered motion estimates appear plausible. Errors predictably occur however at motion boundaries. Errors also occur in image regions where the local image structure is not well-defined (like some parts of the road), but for the same reason, such errors do not appear as intensity errors in the compensated difference image.
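The 2 x 2 system of Equation 22 and its rank deficiency can be sketched as follows. This is an illustration only: the function name is hypothetical, and the relative determinant threshold `eps` used to flag the aperture case is an assumption of this example, since the paper does not describe how near-singular windows are handled.

```python
def flow_increment(grads, deltas, eps=1e-9):
    """Solve [sum g g^T] du = -sum g * Delta I over one window.

    grads:  (Ix, Iy) gradients at the window's pixels
    deltas: residuals Delta I at the same pixels
    Returns (du, dv), or None when the matrix is (nearly) rank deficient.
    """
    gxx = gxy = gyy = 0.0
    bx = by = 0.0
    for (ix, iy), d in zip(grads, deltas):
        gxx += ix * ix
        gxy += ix * iy
        gyy += iy * iy
        bx -= ix * d
        by -= iy * d
    det = gxx * gyy - gxy * gxy
    # det is zero when all gradients in the window are parallel (or zero)
    if det <= eps * (gxx + gyy) ** 2:
        return None
    du = (gyy * bx - gxy * by) / det
    dv = (gxx * by - gxy * bx) / det
    return du, dv
```

When every gradient in the window points along the same direction the determinant is zero and only the flow component along that direction is constrained, which is the aperture effect described above.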


4 Discussron Thus far, we have described a hierarchicalframework for the estimation of image motion between two images using va^riousmodels. Our motivation was to generalizetle notion of direct estimation to model-basedestimation and unify a diverse set of model-based estimation algorithms into a singleframework.The framework also supports the combined use of parametric global models and local models which typically represent some type of a smoothnessor local uniformity assumption. One of the unifying aspects of the framework is that the same objective function (SSD) is used for all models, but the minimization is performed with respect to different parameters.As noted in the introduction, this is enabledby viewing all these problems from the perspective of image registration. It is interesting to contrast this perspective (of model-basedimage registration) with some of the more traditional approachesto motion analysis. One such approach is to compute image flow fields, which involvescombining the local brightness constraint with somesort of a global smoothnessa^ssumption, and then interpret them using appropriate motion models. In contrast, the approach taken here is to use the motion models to constrain the flow field computation. The obvious benefit of this is that the resulting flow fields may generally be expected to be more consistent with models than general smooth flow fields. Note, however,that the framework also includes general ,*ooih flow field techniques,which can be used if the motion model is unkno*n. In the caseof models that are not fully parametric, local image information is used to determine local image/sceneproperties (e.g.,the local range value). However,the accuracy of these can only be as good as the availablelocal image information. For example, in homogeneousareasof the scene,it may be possibleto achieveperfect registration even if the surface range estimates (and the correspondinglocal flow vectorsf are incorrect. 
However, in the presence of significant image structures, these local estimates may be expected to be accurate. On the other hand, the accuracy of the global parameters (e.g., the rigid motion parameters) depends only on having sufficient and sufficiently diverse local information across the entire region. Hence, it may be possible to obtain reliable estimates of these global parameters even though the estimated local information may not be reliable everywhere within the region. For fully parametric models, this problem does not exist.

The image registration problem addressed in this paper occurs in a wide range of image processing applications, far beyond the usual ones considered in computer vision (e.g., navigation and image understanding). These include image compression via motion compensated encoding, spatiotemporal analysis of remote sensing type of images, image database indexing and retrieval, and possibly object recognition. One way to state this general problem is as that of recovering the coordinate system that relates two images of a scene taken from two different viewpoints. In this sense, the framework proposed here unifies motion analysis across these different applications as well.

Acknowledgements: Many individuals have contributed to the ideas and results presented here. These include Peter Burt and Leonid Oliker from the David Sarnoff Research Center, and Shmuel Peleg from Hebrew University.
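The point made in the discussion, that a single SSD objective serves every motion model and only the parameter set changes, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the implementation used in this paper: the function names (`sample`, `ssd`, `translation_flow`, `affine_flow`, `best_translation`), the sign convention of the warp, the six-parameter affine layout, and the synthetic test images. The brute-force search over integer translations only stands in for a real minimizer; it is not the hierarchical minimization described in the paper.

```python
import math

def sample(img, x, y):
    """Bilinear sample of a list-of-lists image; coordinates are clamped to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bot = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bot

def ssd(img0, img1, flow, margin=3):
    """Sum of squared differences between img0(x, y) and img1(x + u, y + v).

    `flow` maps (x, y) to (u, v). Pixels within `margin` of the border are skipped
    so that border clamping does not contribute to the error.
    """
    total = 0.0
    for y in range(margin, len(img0) - margin):
        for x in range(margin, len(img0[0]) - margin):
            u, v = flow(x, y)
            d = sample(img1, x + u, y + v) - img0[y][x]
            total += d * d
    return total

def translation_flow(p):
    """Two-parameter model: constant displacement (tx, ty)."""
    tx, ty = p
    return lambda x, y: (tx, ty)

def affine_flow(p):
    """Six-parameter model (assumed layout): u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y."""
    a1, a2, a3, a4, a5, a6 = p
    return lambda x, y: (a1 + a2 * x + a3 * y, a4 + a5 * x + a6 * y)

def best_translation(img0, img1, radius=2, margin=3):
    """Exhaustive search over integer translations; a stand-in for a real minimizer."""
    candidates = [(float(tx), float(ty))
                  for tx in range(-radius, radius + 1)
                  for ty in range(-radius, radius + 1)]
    return min(candidates, key=lambda p: ssd(img0, img1, translation_flow(p), margin))

# Synthetic pair: img1 is img0 shifted one pixel to the right.
def pattern(x, y):
    return math.sin(0.7 * x) + math.cos(0.5 * y)

size = 16
img0 = [[pattern(x, y) for x in range(size)] for y in range(size)]
img1 = [[pattern(x - 1, y) for x in range(size)] for y in range(size)]

print(best_translation(img0, img1))
print(ssd(img0, img1, affine_flow((1.0, 0.0, 0.0, 0.0, 0.0, 0.0))))
```

The same `ssd` function is evaluated for both models; only the mapping from parameters to a flow field differs, which is the sense in which the framework treats the models uniformly.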


References

1. G. Adiv. Determining three-dimensional motion and structure from optical flow generated by several moving objects. IEEE Trans. on Pattern Analysis and Machine Intelligence, ?( ):384-401, July 1985.
2. P. Anandan. A unified perspective on computational techniques for the measurement of visual motion. In International Conference on Computer Vision, pages 219-230, London, May 1987.
3. P. Anandan. A computational framework and an algorithm for the measurement of visual motion. International Journal of Computer Vision, 2:283-310, 1989.
4. J. R. Bergen and E. H. Adelson. Hierarchical, computationally efficient motion estimation algorithm. J. Opt. Soc. Am. A, 4:35, 1987.


5. [...] Hingorani, and S. Peleg. Computing two motions from three frames. In International Conference on Computer Vision, Osaka, Japan, December 1990.
6. P. J. Burt and E. H. Adelson. The Laplacian pyramid as a compact image code. IEEE Transactions on Communication, 31:532-540, 1983.

7. [...] a moving camera, an application of dynamic motion [...] Visual Motion, pages 2-12, Irvine, CA, March 1989.

8. P. J. Burt, R. Hingorani, and R. J. Kolczynski. Mechanisms for isolating component patterns in the sequential analysis of multiple motion. In IEEE Workshop on Visual Motion, pages 187-193, Princeton, NJ, October 1991.
9. Stefan Carlsson. Object detection using model based prediction and motion parallax. In Stockholm Workshop on Computational Vision, Stockholm, Sweden, August 1989.
10. J. Dengler. Local motion estimation with the dynamic pyramid. In Pyramidal Systems for Computer Vision, pages 289-298, Maratea, Italy, May 1986.
11. W. Enkelmann. Investigations of multigrid algorithms for estimation of optical flow fields in image sequences. Computer Vision, Graphics, and Image Processing, 43:150-177, 1988.
12. K. J. Hanna. Direct multi-resolution estimation of ego-motion and structure from motion. In Workshop on Visual Motion, pages 156-162, Princeton, NJ, October 1991.

13. [...] and motion from multiple frames. Technical Report 1190, MIT AI LAB, Cambridge, MA, 1990.
14. E. C. Hildreth. The Measurement of Visual Motion. The MIT Press, 1983.
15. B. K. P. Horn. Robot Vision. MIT Press, Cambridge, MA, 1986.
16. B. K. P. Horn and B. G. Schunck. Determining optical flow. Artificial Intelligence, 17:185-203, 1981.
17. B. K. P. Horn and E. J. Weldon. Direct methods for recovering motion. International Journal of Computer Vision, 2(1):51-76, June 1988.
18. B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Image Understanding Workshop, pages 121-130, 1981.

19. L. Matthies, R. Szeliski, and T. Kanade. Kalman filter-based algorithms for estimating depth from image-sequences. In International Conference on Computer Vision, pages 199-213, Tampa, FL, 1988.
20. H. H. Nagel. Displacement vectors derived from second order intensity variations in intensity sequences. Computer Vision, Pattern Recognition and Image Processing, 21:85-117, 1983.
21. S. Negahdaripour and B. K. P. Horn. Direct passive navigation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 9(1):168-176, January 1987.
22. A. Singh. An estimation theoretic framework for image-flow computation. In International Conference on Computer Vision, Osaka, Japan, November 1990.

23. [...] evolution, neighborhood deformation and global image flow: planar surfaces in motion. International Journal of Robotics Research, 4(3):95-108, Fall 1985.


Fig. 1. Diagram of the hierarchical motion estimation framework.


Fig. 2. Affine motion estimation: a) Original. b) Raw difference. c) Compensated difference.

Fig. 3. Planar surface motion estimation. a) Original image. b) Raw difference. c) Difference after planar compensation. d) Planar grid superimposed on the original image.


(.0000,.0000,1.0000) (.oooo,.ooo0,.oooo) ( - . 3 3 7 9 , -3. 15 2 , . 9 3 1 4 ) 3 2 x 3 0 .0027,.0039,-.0001 ( -9. 3 31 9 , - . 0 5 6 1 , .19! )4 ( . 0 0 3 8 , . 0 0 4 1 , . 0 0 1 64x60 8) 1 2 8 x 1 2 0 ( . 0 0 3 ? , . 0 0 1 2 , . 0 0 0-.oooo,-.0383,.9971) -.0255,-.0899,.9956) 256 x 240 (.oozg,.oo06,.oo13)

Fig. 4. Egomotion based flow model. a) Original image from an outdoor sequence. b) Raw difference. c) Difference after ego-motion compensation. d) Inverse range map. e) Rigid body parameters recovered at each resolution.


Fig. 5. Optical flow estimation. a) Original image. b) Raw difference. c) Difference after motion compensation. d) Horizontal component of the recovered flow field. e) Vertical component of the recovered flow field.
