# multiview_point_registration_via_alternating_optimization__08476e80.pdf

Multi-View Point Registration via Alternating Optimization

Junchi Yan12, Jun Wang5, Hongyuan Zha34, Xiaokang Yang1, Stephen M. Chu2

yanjunchi@sjtu.edu.cn, j.wang@alibaba-inc.com, zha@cc.gatech.edu, xkyang@sjtu.edu.cn, schu@cn.ibm.com 1Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China 2IBM Research China, Shanghai, 201203, China 3Software Engineering Institute, East China Normal University, Shanghai, 200062, China 4College of Computing, Georgia Institute of Technology, Atlanta, Georgia, 30332, USA 5Institue of Data Science and Technology, Alibaba Group, Seattle, WA, 98101, USA

Multi-view point registration is a relatively less studied problem compared with two-view point registration. Directly applying pairwise registration often leads to matching discrepancy as the mapping between two point sets can be determined either by direct correspondences or by any intermediate point set. Also, the local two-view registration tends to be sensitive to noises. We propose a novel multi-view registration method, where the optimal registration is achieved via an efﬁcient and effective alternating concave minimization process. We further extend our solution to a general case in practice of registration among point sets with different cardinalities. Extensive empirical evaluations of peer methods on both synthetic data and real images suggest our method is robust to large disturbance. In particular, it is shown that our method outperforms peer point matching methods and performs competitively against graph matching approaches. The latter approaches utilize the additional second-order information at the cost of exponentially increased run-time, thus usually being less efﬁcient.

Introduction

Point matching or registration is a fundamental research topic in computer vision and pattern recognition since many applications rely on accurate geometric model acquisition, including object recognition, localization, tracking, appearance and texture analysis, virtual reality, among others (Lian and Zhang 2012; Tsin and Kanade 2004; Chui and Rangarajan 2003; Zhang 1994; Myronenko and Song 2010). As a basic form, two-view point registration generally consists of two subproblems: i) ﬁnding correspondence from two views; ii) estimating the pose transformation. However, the key challenge is a typical chicken-and-egg problem: two subproblems are interlocked, and the overall optimization is highly non-convex. In practice, data acquisition often involves obtaining either intensity or depth data of an object from more than two view points (Dorai et al. 1998). Hence, multi-view point matching or registration has attracted more attention recently because it involves more matching param-

The work is partially supported by NSF DMS-1317424, NSFC 61129001/F010403 and 61025005/F010403. Copyright c 2015, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

eters and correspondences among a set of views with potential performance improvements. Fig.1 gives some exemplar results of the proposed multi-view registration method which is applied on real images, where the conventional two-view registration methods often produce unsatisfactory results due to various reasons, such as local occlusions and large transformations, among others. Despite of the empirical success of two-view point registration methods, the extension to multi-view registration remains as an open problem (Sharp, Lee, and Wehe 2004; Pooja and Govindu 2010). As a straightway matching strategy based on two-view solution, performing pairwise registration sequentially over all views hardly works well (Pulli 1999). Another potential scheme to reuse the two-view registration techniques is repeating the two-view registration on each pair of views and then aligning the pairwise alignment results to derive the ﬁnal solution. However, this strategy suffers two major limitations: i) different orders of pairwise registration may result in non-unique correspondences; ii) the registration error by the local noise of a pairwise alignment, in an extreme case, may propagate along the pairwise registration sequence and would be further accumulated if no extra information is available to dismiss the noise. It is generally recognized that additional information from multi-view can help improve the alignment than the case when only a pair of views is given. To address the above issues and leverage multi-view data, we design a novel multiview registration algorithm via an alternative optimization procedure. We transform the objective function to a concave one over iteration efﬁciently, which can be minimized by a global optimizer, preventing a local optimum in the sequential cycle of registration process. Extensive empirical studies corroborate the effectiveness of our method that generates more accurate and consistent aligning results.

Related Work In this section, we provide a structured view on previous work concerning point registration and a more broad topic i.e. ﬁnding point correspondence. Application scenario: multi-view vs. two-view There is extensive work for two-view registration. For instance, given a rough initial registration, the iterative closest point (ICP) method iterates between ﬁnding the point correspondence via nearest neighbor method and updating the transforma-

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence

Figure 1: Examples of registration results. From left to right: house and car from the Pose dataset, hotel from the CMU sequence.

tion with the least square error (Chen and Medioni 1992; Zhang 1994). Many other methods, such as the Robust Point Matching (RPM) (Chui and Rangarajan 2003) and the Coherent Point Drift (CPD) method (Myronenko and Song 2010), are designed to perform the pairwise registration. On the contrary, the multi-view alignment problem is less studied and existing work mainly focuses on developing the registration strategies for aligning multiple pairwise results (Bergevin et al. 1996; Sharp, Lee, and Wehe 2004; Pooja and Govindu 2010). Alignment strategy: simultaneous vs. sequential One intuitive strategy is to perform pair-view registration at a time, and then repeat the pairwise registration in a sequential manner (Chen and Medioni 1992; Turk and Levoy 1994). Such a strategy enjoys the simplicity of both computational overhead and implementation efforts. However, this simple strategy is known to be sensitive to noises. Another alternative registration method is ﬁrst presented by Bergevin et. al. (Bergevin et al. 1996), and then further reﬁned in (Benjemaa and Schmitt 1997). More recently, (Pooja and Govindu 2010) proposes an extension to the standard ICP algorithm to simultaneously average the redundant information in multi-view data. (A.Torsello, Rodola, and Albarelli 2011) proposes to use dual quaternion and adopts a new distortion measure derived from the screw motion. Note that most of these methods are speciﬁcally designed for the setting of the rigid transformation, neglecting non-rigid transformations. Optimization technique: global vs. heuristic Most of the well-known point registration methods are heuristic. Brieﬂy speaking, one starts from an initial estimation and solves the overall problem by alternately solving the two sub-problems, i.e., pose estimation and point correspondence. The above mentioned ICP based approaches fall into this category. In addition, Robust Point Matching (RPM) (Chui and Rangarajan 2003) relaxes point correspondence to be continuously valued and achieves an optimized solution in the continuous space. The coherent point drift (CPD) method (Myronenko and Song 2010) models point matching under a probabilistic framework and uses expectation maximization (EM) for obtaining optimal solutions. (Lian and Zhang 2012) forms a concave objective and obtains a globally optimal solution. Similarly, (Li and Hartley 2007) achieves a global optimality grounded by the Lipschitz global optimization process. However, the global optimized results are achieved only for the two-view case. Transformation: similarity vs. afﬁne transformation Given a point x R2, its similarity transformation is calculated as s Rx + t, where s is the scaling factor, and R, t denotes rotation matrix and translation vector respectively.

The afﬁne transformation is computed as Vx + t where V R2 2 denotes the afﬁne matrix and t for translation. In particular, for the similarity transformation, if the scalar factor equals 1, then it becomes rigid transformation. In general, both two types are non-rigid transformations. There are also other types of parameterized transformations, such as RBF non-rigid transformation (Chui and Rangarajan 2003; Myronenko and Song 2010), which is parameterized by a set of basis points. etc. In our experiment, we would focus on the two most popular types as described above. Point registration vs. graph matching Graph matching (Tian et al. 2012; Yan et al. 2013; 2014; Zhou and la Torre 2012; Cho, Lee, and Lee 2010) and point registration are closely related formulations, both of which are used for solving the point set correspondence problem. However, there are two important differences between these two approaches. First, graph matching typically explicitly infuses pairwise or higher-order point tuple afﬁnity, which is beyond unary similarity, to derive the objective function. While point registration does not explicitly use such point-tuple afﬁnities between two point sets. Its optimization can be casted as a linear programming problem given known transformation parameters. The other major difference is graph matching does not impose parametric prior or constraints to the transformation between two point sets, yet point registration involves various parameterized transformation models such as afﬁnity or similarity transform, as we discussed above. Therefore, both formulations are mathematically challenging: graph matching is casted to a constrained quadratic (due to including higher-order afﬁnity terms) assignment programming problem which is well known to be NP-hard; In contrast, point registration involves additional unknown transformation parameters which need be estimated jointly with the point correspondence. Note simultaneously optimizing with respect to these two variables are non-convex without a closed form solution.

From Two-view to Multi-view Registration Before diving into main body, ﬁrst several notations are deﬁned. Without loss of generality, this paper mainly presents the proposed multi-view registration formulation and algorithm in the context of three views. Let X = {xi}, Y = {yi}, Z = {zi} denote three point sets, whose cardinality equals m, n, l respectively. For point set X, denote each point by xi = [x1 i , ..., xd i ] Rd, so for Y, Z. P {0, 1}m n, denotes the point matching matrix with Pij = 1 indicating that xi corresponds to yj and Pij = 0 otherwise. The pairwise transform parameters over the three point sets are θxy Rk, θxz Rk, θyz Rk, where k is the de-

gree of freedom of the transformation. Let Jxi, Jyi, Jzi denote the Jacobian matrices of the i-th point in point sets X, Y, Z, respectively. To make the problem mathematically tractable, we assume the transformation T(xi|θ) is linear with respect to its parameters θ. Their matrix forms are Jx R2m d, Jy R2n d, Jz R2l d. In addition, we deﬁne Y [ y1 2 2, . . . , yn 2 2]T , Z [ z1 2 2, . . . , zl 2 2]T. Im Rm m denotes the identity matrix, and 1m Rm 1 denotes the column vector whose elements being all ones. A B is the Kronecker product of matrices A and B. We introduce one important formulation for two-view point registration. Being a baseline approach, this pairwise method is also the starting point of the proposed method.

Baseline: Pairwise Robust Point Matching For pairwise form of robust point matching (RPM) model (Chui and Rangarajan 2003), the problem is expressed via minimizing an objective w.r.t. Pxy (xy is omitted in Eq.1), θxy from X to Y, which consists of a pointto-point distance term and a regularizer g(θxy) related to a prior transform θ0 and a weight matrix H Rk k:

E(P, θxy) = X

i,j Pij yj T(xi|θ) + g(θxy) (1)

P1n = 1m, 1T m P 1n, g(θxy) = (θxy θ0)T H(θxy θ0)

The optimal ˆθxy is found by zeroing its derivative1.

ˆθxy = (JT x Jx + H) 1[JT x (Pxy Id)Y + Hθ0] (2)

Proposed Multiple-View RPM Formulation Without loss of generality, assume |X| |Y| |Z| and the mapping for X Y is injection, so for Y Z. Based on these assumptions, in line with footnote (1), we propose the following three-view registration formulation:

Exyz =Exy + Exz + Eyz (3)

i,j yj Jxiθxy 2 + X

i,k zk Jxiθxz 2

j,k zk Jyjθyz 2 + g(θxy) + g(θxz) + g(θyz)

In the explicit form w.r.t. correspondence matching matrix Pxy, Exy (similar for Exz, Eyz) can be written as:

θT xy JT x Jxθxy 2θT xy JT x (Pxy Id)Y + 1T m Pxy Y + g(θxy)

where g(θ) = (θ θ0)T H(θ θ0). To make the problem mathematically tractable, we assume θxy, θxz, θyz are independent to each other2, thus can be decided by the matching matrix Pxy, Pxz, Pyz independently. By letting

1We assume each point in X has its counterpart in Y: Pxy1n=1m. Therefore, we can simplify the mathematics and reach Eq.2 which is the foundation of later derivation of this paper. This constraint is widely used in related literature such as (Chui and Rangarajan 2003; Lian and Zhang 2012) and reference therein. 2The reader may have the concern that θxy, θxz, θyz should be compatible with each other while computing them independently as simpliﬁed in the paper in general cannot ensure this compatibil-

E θxy = E θxz = E θyz = 0 the optimal θxy, θxz, θyz can be calculated by Eq.2. As a result, they are eliminated in the objective function and we arrive at a new nonlinear objective function w.r.t. Pxy, Pxz, Pyz where Exy can be written as follows (similar for Exz, Eyz)

Exy =1T m Pxy Y YT (Pxy Id)T Jx + θT 0 H (JT x Jx + H) 1

JT x (Pxy Id)T Y + Hθ0 + θ0Hθ0

By eliminating the constant terms, the expansion of Exy (similar for Exz, Eyz) w.r.t. matching matrix Pxy becomes:

Exy =1T m Pxy Y 2θT 0 H(JT x Jx + H) 1JT x (Pxy Id)Y (4)

+ YT (Pxy Id)T Jx(JT x Jx + H) 1JT x (Pxy Id)Y To obtain a more clear expression, we vectorize the matrix version of P to p=vec(P) by stacking its rows. By separating the linear Elin xyz and the quadratic parts Eqd xyz we reach:

Exyz =Elin xyz(pxy, pxz, pyz) Eqd xyz(pxy, pxz, pyz) (5)

Elin xyz =(1T m Y T + (Kx Y)Wxy)pxy

+ (1T m Z T + (Kx Z)Wxz)pxz

+ (1T n Z T + (Ky Z)Wyz)pyz

Eqd xyz = (Ux JT x ) YT Wxypxy 2 + (Ux JT x ) ZT Wxzpxz 2

+ (Uy JT y ) ZT Wyzpyz 2

where Kx = 2(θT 0 H(JT x Jx + H) 1JT x ) and similar for Ky, Kz. In the rest of the paper, we further assume |Y|=|Z| and the correspondence is a bijection, then it is mathematically sound (refer to footnote 1) to rewrite the objective as Exyz=Exy+Exz+Ezy by replacing Eyz with Ezy. The reason for introducing this alternative writing of the objective function would be clear in the next section.

Exyz =Elin xyz(pxy, pxz, pzy) Eqd xyz(pxy, pxz, pzy) (6)

Elin xyz =(1T m Y T + (Kx Y)Wxy)pxy

+ (1T m Z T + (Kx Z)Wxz)pxz

+ (1T l Y T + (Kz Y)Wzy)pzy

Eqd xyz = (Ux JT x ) YT Wxypxy 2 + (Ux JT x ) ZT Wxzpxz 2

+ (Uz JT z ) YT Wzypzy 2

where UT x Ux = (JT x Jx + H) 1, UT y Uy = (JT y Jy + H) 1, UT z Uz = (JT z Jz + H) 1, Wxy Im [In e1 d, . . . , In ed d]T , Wxz Im [Il e1 d, . . . , Il ed d]T , Wyz In [Il e1 d, . . . , Il ed d]T , Wzy Il [In e1 d, . . . , In ed d]T , such that W satisﬁes vec(P Id)=W vec(P) and ei d is a d-dimensional column vector with the i-th element being 1 and the rest 0s.

Alternating Optimization The rough idea for alternating optimization is to ﬁx one pairwise alignment variable and then update the other in a rotating manner. For three-view scenario as we discussed so far,

ity. In fact, from our experiments, we empirically found the estimated θxy, θxz, θyz are roughly consistent. Note one way of mitigating the consistency issue is that to some extent we bring this compatibility back since the point correspondence matching consistency is preserved in our formulation i.e. Pxy = Pxz Pzy.

Figure 2: Similarity transform test on Chui-Rangarajan data: Point matching distance error by 5 methods (RPM, M-RPM, FGM, UG, M-GM) against deformation, noise and outlier. The bars indicate the standard deviation of the error over 20 random trials.

Figure 3: Afﬁne transform test on Chui-Rangarajan data: Point matching distance errors by 5 methods (RPM, M-RPM, FGM, M-GM, CPD) against deformation, noise and outlier. The bars indicate the standard deviation of the error over 20 random trials.

there are three variables pxy, pxz, pyz. Remember it is still under the assumption as stated in Sec.3 that |X| |Y| = |Z| and Pxz, Pxy are injection and Pzy bijection. Therefore, the two equations Pxz = Pxy Pyz, Pxy = Pxz Pzy both hold. However, it would cause node-mapping information loss if one moves Pyz (Pzy) to the left side of the equation: Pyz = Pyx Pxz (Pzy = Pzx Pxy). In fact these two equations do not hold because X only corresponds to a subset of Y and Z. In fact, the derived Pd yz = Pyx Pxz cannot satisfy the injection requirement for Eyz thus Eq.2 is broken. Based on the above analysis, we will chose to ﬁx one of pxy and pxz alternatively and optimize with respect to pyz (pzy) and the other. Speciﬁcally, given known pxy, we have pxz = (In Pxy)pyz; when pxz is ﬁxed, it becomes pxy = (In Pxz)pzy. Attention shall be taken here to the new form pzy instead of pyz. One may argue to avoid pzy (accordingly Ezy) by using pyx = (In Pxz)pyz. However pyx reverses the term Eyx which breaks the assumption in footnote (1) for A B is an injection in the objective function EAB. In contrast, as Y Z is a bijection, reverse Eyz to Ezy is not harmful. Therefore, it becomes clear why we replace the term Eyz with Ezy as shown in Eq.6, in addition with Eq.5. Note we impose no assumption about the property of the point distribution or the topology of the point sets. Hence, our approach and analysis are generic. In what follows, we will give a detailed description of the proposed method under the context of three-view registration, which will be extended to a more general setting with multiple views.

Three-view Alternating Optimization Mechanism By ﬁxing pxy we can discard Exy in Eq.5, and the objective function that is only with respect to pxz, pyz becomes

Exzyz =(1T m ZT + (Kx Z)Wxz)pxz [(Ux JT x ) ZT ]Wxzpxz 2

+(1T n YT + (Ky Z)Wyz)pyz [(Uy JT y ) ZT ]Wyzpyz 2

The two quadratic terms in Exzyz can be written as:

Eqd xzyz =p T xz[(Ux JT x ) ZT Wxz]T [(Ux JT x ) ZT Wxz]pxz +p T yz[(Uy JT y ) ZT Wyz]T [(Uy JT y ) ZT Wyz]pyz

Alternatively, one can ﬁx pxz and remove the constant term Exz from Eq.6 to obtain the new objective function:

Exyzy =(1T m YT + (Kx Y)Wxy)pxy [(Ux JT x ) YT ]Wxypxy 2

+(1T l YT + (Kz Y]Wzy)pzy [(Uz JT z ) YT ]Wzypzy 2

Similarly, its quadratic term can be written as:

Eqd xyzy =p T xy[(Ux JT x ) YT Wxy]T [(Ux JT x ) YT Wxy]pxy

+p T zy[(Uz JT z ) YT Wzy]T [(Uz JT z ) YT Wzy]pzy To simplify the exposition of the objective function over iterations that would be shown later, we introduce Bxz = (Ux JT x ) ZT Wxz, Byz = (Uxy JT y ) ZT Wyz, Bxy = (Ux JT x ) YT Wxy, and Bzy = (Uz JT z ) YT Wzy, Fxy = In Pxy, Fxz = In Pxz. Based on these notations, we

further deﬁne: Cyzxz = Byz Bxz Fxy

, Czyxy = Bzy Bxy Fxz

Using the two objective function formula Exzyz, Exyzy and the above notations, the iteration can be compactly described as follows in a rotating manner.

Step 1) Fix pxy update pyz for minimizing partial objective function Exz+Eyz. Replace pxz by pxz=Fxypyz, the quadratic term in Exzyz can be rewritten compactly as:

Eqd xzyz = p T yz BT yz Byzpyz + p T yz(Bxz Fxy)T (Bxz Fxy)pyz

= p T yz CT yzxz Cyzxzpyz Now we reach the objective function with respect to pyz:

E(pyz) = b T yzpyz Cyzxzpyz 2 (7)

Step 2) Fix pxz update pzy for minimizing partial objective function Exy+Ezy. Replace pxy by pxy=Fxzpzy, which leads to Exyzy=p T zy CT zyxy Czyxypzy. Then we have:

E(pzy) = b T zypzy Czyxypzy 2 (8)

where byz and bzy have the following forms:

byz =1T n ZT + (Ky ZT )Wyz + 1T m ZT + (Kx ZT )Wxz Fxy

bzy =1T l YT + (Kz YT )Wzy + 1T l YT + (Kx YT )Wxy Fxz

Algorithm 1 Alternating concave optimization for threeview point registration

1: Input: X, Y, Z (Y, Z are complete point sets), θ0, Tmax; 2: Output: Consistent point matching solution Pxy, Pxz, Pyz; 3: Procedure: 4: Obtain Pxy via minimizing Eq.1 for initialization; 5: for t = 1 : Tmax do 6: Fix Pxy, update Pyz (Pxz by Pxz=Pxy Pyz) via minimizing Eq.7 by concave optimization (Lian and Zhang 2012); 7: Fix Pxz, update Pzy (Pxy by Pxy=Pxz Pzy)via minimizing Eq.8 by concave optimization (Lian and Zhang 2012); 8: end for

Figure 4: Illustration for the alternating optimization order, given complete point sets B1, B2, B3 and incomplete sets A1, A2, A3. Two rows denote two possible looping alternating optimization iteration paths, respectively. Top row: (Pa1b1, Pa2b1, Pa3b1) (Pa1b2, Pa2b2, Pa3b2) (Pa1b3, Pa2b3, Pa3b3) (Pa1b1, Pa2b1, Pa3b1) ; Bottom row: (Pa1b1, Pa2b1, Pa3b1) (Pa1b3, Pa2b3, Pa3b3) (Pa1b2, Pa2b2, Pa3b2) (Pa1b1, Pa2b1, Pa3b1) .

After a series of derivation and transformation, the above two formulations enable us to employ the method used in (Lian and Zhang 2012) for solving the minimization problems of (7) and (8). Speciﬁcally, the objective function (7) or (8), in its concave quadratic program form, is ﬁrst transformed into a separable form via eigen decomposition, and then derives the convex envelope of the resulting function over a rectangular region. Then, the normal rectangular algorithm (Horst and Tuy 1996) is used, as a tailored Branchand-bound approach. Note that after the Eigen decomposition based linear transformation, the number of quadratic terms shrinks to be the number of transformation parameters, thus the method becomes more efﬁcient. Readers are referred to (Lian and Zhang 2012) for more details. The overall algorithmic chart of our method speciﬁcally for threeview point registration is described in Alg.1.

Multiple-View Extension

Now we discuss how to generalize our method to the N-view registration problem with unequal cardinality of point sets. Denote the complete point sets as {B}N i=1 = {B0, B1, . . . , BN}, and those incomplete point

Figure 5: Mean run-time (in seconds) for similarity transformation test on Chui-Rangarajan dataset by 5 methods (RPM, M-RPM, FGM, UG, M-GM). Three types of disturbance, i.e. deformation, noise, outlier are measured, respectively.

sets {A}M i=1 = {A0, A1, . . . , AM} such that the node mapping from Ai(i = 1, . . . , M) to Bj(j = 1, . . . , N) is an injection which satisﬁes the assumption in footnote (1), and the correspondences in any pair of Bi, Bj is bijection. Now we use ai to denote the index of Ai and bi for Bi. In general, the objective function suited to our speciﬁc alternating optimization method is tailored to the following form3:

ai=1 Eaibj +

bi,bj=1,bi =bj Ebibj (9)

One can choose an anchor point set from {B}N i=1 in rotation. The anchor view serves as the bridge to diffuse the information from {A}M i=1 to {B}N i=1 and within {B}N i=1. Without loss of generality, in iteration k, view Bk is chosen as the anchor view. By using p(k 1) aubk which is updated in the

previous iteration k-1, for the bijection p(k) bkbv between Bk and Bv (v =k), one can optimize its associated partial objective function E(p(k) bkbv)=Ebkbv+PM u=1 Eaubv by rewriting

p(k) aubv=(Iau P(k 1) aubk )p(k) bkbv to replace p(k) aubv with p(k) bkbv. In

consequence, one can update p(k) bkbv by applying the concave optimization algorithm similar to solving the problem of (7) and (8). Accordingly, p(k) aubk+1 is also updated which would be used in the next iteration. To further concretize our idea, Fig.4 illustrates two possible iteration rotation orders given three complete views B1, B2, B3 and three incomplete A1, A2, A3. In practice, in order to obtain such complete point sets given a large number of views, one can divide the whole batch into subsets, and performs our method in each of the subset where the complete views exist.

Experiments We implement all the competing methods in Matlab R2009 on a desktop with dual 2.53GHz CPU and 3G memory. Since all methods output point correspondences, we use the correspondences computed by a method to ﬁnd the best transformation between the two point sets, and deﬁne the error as the mean of the Euclidean distances between the transformed model points and their ground truth.

3For Eaiaj where the mapping from Ai to Aj is an injection, they cannot be incorporated under the current alternating optimization framework. This is because it is possible that the derived Paiaj=Paibk Pbkaj used in Eaiaj would break the assumption Paiaj1=1 if Ai, Aj corresponds to different subsets.

The synthetic data used in this paper is generated by the template from Chui-Rangarajan data sets (Chui and Rangarajan 2003). The tested real image data is from CMU hotel and house sequences. The other two sequences (volvo and house) from the pose database (Vikstn et al. 2009). In line with (Lian and Zhang 2012) s experimental protocol, rotation-invariant and rotation non-invariant experiments are performed by setting a regularized and nonregularized transformation parameter θ, respectively. For rotation-invariant tests, we use 2-D similarity transformation which has been shown a good tradeoff between transformation ﬂexibility and matching robustness. For rotation sensitive tests, we use afﬁne transformation in our method as it has small number of parameters while has a certain ﬂexibility to handle non-rigid transformation. For comparison, we use the two-view registration RPM formulation (Lian and Zhang 2012) as the baseline, and the recent factorized graph matching method (FGM) (Zhou and la Torre 2012). Note these two-view methods both cannot ensure the matching consistency across different viewpairs. The recent graph matching method (M-GM) (Yan et al. 2013) is also tested by following the authors settings. We further evaluate UG (Caetano and Caelli 2006) and CPD (Myronenko and Song 2010) for similarity and afﬁne transformation tests, respectively. We term the proposed multiview robust point matching as M-RPM (red curves in plots). Note only the M-GM method and our method can ensure the matching consistency among all the compared methods.

Experiments on Synthetic Data Similarity transform test For similarity transform test which is rotation invariant (non-regularized θ in our method), the RPM (Lian and Zhang 2012), FGM (Zhou and la Torre 2012), M-GM (Yan et al. 2013) and UG (Caetano and Caelli 2006) are evaluated by non-rigid deformation, noise and outliers respectively. For each test, in addition with these disturbances, random rotations are imposed on the other two of the three point sets respectively. The matching errors, which follow the protocol of (Lian and Zhang 2012), are shown in Fig.2. It can be seen that for all deformation, noise and outlier tests, our method often outperforms other point-matching methods. The average run time shown in Fig.5. While Fig.7 shows a visual example of how accuracy is boosted during two iterations by our method, meanwhile, still being more efﬁcient than the higher-order graph matching methods: FGM and M-GM, although outperform in the outlier test, yet being the slowest across tests. Afﬁne transform test For rotation sensitive afﬁne transformation (regularized θ in our method), the matching errors are displayed in Fig.3, where the deformation setting is similar to the similarity transformation test with no random rotation imposed. It can be seen that our method outperforms for deformation and noise tests signiﬁcantly while being less effective on the outlier tests. Compared with the RPM method, the performance gain is consistent and notable. Unequal size test We perform clutter test where one of the three point sets has less points than the other two. Fig.8 shows the improved alignment between the smaller-size object against the whole template, for the similarity transfor-

Figure 6: Examples of synthetic point sets of ﬁsh & Chinese character from (Chui and Rangarajan 2003). From left to right: reference template, deformation, outlier and noise.

mation deform test on the used synthetic data. Compared with the baseline RPM, our method s improvement is gained by incorporating an additional, but also deformed whole template. We further perform this test over different completion level as shown in the third column of Fig.8.

Figure 7: Iteration illustration: two iterations on similarity testing. Left two: registration results; right two: resulting estimated binary correspondence matrix, whose ground truth is the identity matrix.

Real Image Sequences

The matching accuracies (fraction of correct correspondences) against different frame intervals and view rotation angles by using different methods are shown in Fig.9. It can be seen that our method also achieves the best accuracy for most of the test cases. Note other methods like UG and the baseline RPM suffer from the performance ﬂuctuation due to the periodic appearance variation of the testing image sequence, while our method tends to be more robust.

Discussion and Summary

The above experimental results suggest that the proposed method mostly outperform peer point matching methods, and performs competitively against graph matching methods, which utilize the second-order information with an exponential exploration time costs (Zhou and la Torre 2012; Yan et al. 2013). As a result, our method is more efﬁcient than the comparing graph matching approaches.

We have proposed a novel formulation and an efﬁcient optimization solution for multi-view registration. It ensures consistent matching solutions, and is able to handle point sets of unequal sizes. Extensive tests on both synthetic and real

Figure 8: Matching errors of synthetic clutter test by similarity transformation (three point sets are used). From left to right: i) clutter before matching; ii) matching using the twoview RPM baseline (Lian and Zhang 2012); iii) our method; iv) average matching errors by varying the ratio of sub-set to whole-set over 10 tests for 3 methods: RPM, M-RPM, UG.

Figure 9: Matching accuracy on CMU and Pose datasets by 5 methods (RPM, M-RPM, FGM, UG, M-GM). From left to right: CMU hotel, Pose house, Pose volvo car.

datasets are performed with more promising results compared with other point registration methods, while being more efﬁcient than higher-order graph matching methods.

Appendix Why Eq.2 depends on Pxy1n = 1m Here we give the derivation details of Eq.2 to show why it depends on Pxy1n = 1m. The matrix form of Eq.1 is: X

i,j Pij(y T j yj + θT JT xi Jxiθ 2θT JT xiyj) + (θ θ0)T H(θ θ0)

=θT JT x (diag(P1n) Id)Jxθ 2θT JT x (P Id)Y + 1T m P Y + g(θ)

=θT JT x Jxθ 2θT JT x (P Id)Y + 1T m P Y + (θ θ0)T H(θ θ0)

Zeroing the following partial derivative, we obtain Eq.2.

2(JT x Jx + H)θ 2JT x (P Id)Y 2Hθ0

References A.Torsello; Rodola, E.; and Albarelli, A. 2011. Multiview registration via graph diffusion of dual quaternions. In CVPR. Benjemaa, R., and Schmitt, F. 1997. Fast global registration of 3d sampled surfaces using a multi-z-buffer technique. In Int. conf. on 3D Digital Imaging and Modeling. Bergevin, R.; Soucy, M.; Gagnon, H.; and Laurendeau, D. 1996. Towards a general multi-view registration technique. IEEE Trans. PAMI.

Caetano, T., and Caelli, T. 2006. A uniﬁed formulation of invariant point pattern matching. In ICPR. Chen, Y., and Medioni, G. 1992. Object modeling by registration of multiple range images. IVC. Cho, M.; Lee, J.; and Lee, K. M. 2010. Reweighted random walks for graph matching. In ECCV. Chui, H., and Rangarajan, A. 2003. A new point matching algorithm for non-rigid registration. CVIU. Dorai, C.; Wang, G.; Jain, A. K.; and Mercer, C. 1998. Registration and integration of multiple object views for 3d model construction. IEEE Trans. PAMI. Horst, R., and Tuy, H. 1996. Global Optimization, Deterministic Approaches. Springer Verlag. Li, H., and Hartley, R. 2007. The 3d-3d registration problem revisited. In ICCV. Lian, W., and Zhang, L. 2012. Robust point matching revisited: A concave optimization approach. In ECCV. Myronenko, A., and Song, X. 2010. Point set registration: Coherent point drift. IEEE Trans. PAMI. Pooja, A., and Govindu, V. M. 2010. A multi-view extension of the icp algorithm. In Indian Conference on Computer Vision, Graphics and Image Processing. Pulli, K. 1999. Multiview registration for large data sets. In Int. conf. on 3D Digital Imaging and Modeling. Sharp, G.; Lee, S.; and Wehe, D. 2004. Multiview registration of 3d scenes minimizing error between coordinate frames. IEEE Trans. PAMI. Tian, Y.; Yan, J.; Zhang, H.; Zhang, Y.; Yang, X.; and Zha, H. 2012. On the convergence of graph matching: Graduated assignment revisited. In ECCV. Tsin, Y., and Kanade, T. 2004. A correlation-based approach to robust point set registration. In ECCV. Turk, G., and Levoy, M. 1994. Zippered polygon meshes from range images. In SIGGRAPH. Vikstn, F.; Forssn, P.; Johansson, B.; and Moe, A. 2009. Comparison of local image descriptors for full 6 degree-offreedom pose estimation. In ICRA. Yan, J.; Tian, Y.; Zha, H.; Yang, X.; and Zhang, Y. 2013. Joint optimization for consistent multiple graph matching. In ICCV. Yan, J. C.; Li, Y.; Liu, W.; Zha, H. Y.; Yang, X. K.; and Chu, S. M. 2014. Graduated consistency-regularized optimization for multi-graph matching. In ECCV. Zhang, Z. 1994. Iterative point matching for registration of free-form curves and surfaces. IJCV. Zhou, F., and la Torre, F. D. 2012. Factorized graph matching. In CVPR.