# efficient_querying_from_weighted_binary_codes__e3cf1bd2.pdf

The Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI-20)

Efﬁcient Querying from Weighted Binary Codes

Zhenyu Weng, Yuesheng Zhu Communication and Information Security Laboratory, Shenzhen Graduate School, Peking University {wzytumbler, zhuys}@pku.edu.cn

Binary codes are widely used to represent the data due to their small storage and efﬁcient computation. However, there exists an ambiguity problem that lots of binary codes share the same Hamming distance to a query. To alleviate the ambiguity problem, weighted binary codes assign different weights to each bit of binary codes and compare the binary codes by the weighted Hamming distance. Till now, performing the querying from the weighted binary codes efﬁciently is still an open issue. In this paper, we propose a new method to rank the weighted binary codes and return the nearest weighted binary codes of the query efﬁciently. In our method, based on the multi-index hash tables, two algorithms, the table bucket ﬁnding algorithm and the table merging algorithm, are proposed to select the nearest weighted binary codes of the query in a non-exhaustive and accurate way. The proposed algorithms are justiﬁed by proving their theoretic properties. The experiments on three large-scale datasets validate both the search efﬁciency and the search accuracy of our method. Especially for the number of weighted binary codes up to one billion, our method shows a great improvement of more than 1000 times faster than the linear scan.

Introduction With the explosive growth of data, binary codes are widely used to represent the data due to their small storage and efﬁcient computation. Given a query, the nearest binary codes can be ranked and returned efﬁciently by computing the Hamming distance between the query and the binary codes. BRISK (Leutenegger, Chli, and Siegwart 2011), ORB (Rublee et al. 2011), and other binary image descriptors (Balntas, Tang, and Mikolajczyk 2018) are designed to represent the image data, and successfully used in various applications, including image matching, 3D reconstruction and object recognition. End-to-end feature learning methods (Li et al. 2019; Song et al. 2018) based on the neural networks extract the binary codes from the images and are widely used in image retrieval and cross retrieval. In addition to these speciﬁc image binary codes, hashing methods (Liu et al. 2014; 2017; Lin et al. 2019; Liu et al. 2019; Wang et al. 2018) are used to map different

Copyright c 2020, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

high-dimensional feature vectors into compact binary codes. Since these feature vectors may represent image data, video data, or other multimedia data, the hashing methods can be used in various multimedia retrieval applications. However, the number of possible Hamming distance is limited and different binary codes may share the same Hamming distance to the given binary query. To alleviate this ambiguity problem and further improve the performance of binary codes, weighted binary codes are used (Fan et al. 2013; Gordo et al. 2014; Zhang et al. 2013). By assigning bitwise weights to each bit of binary codes, the distance between a pair of binary codes is calculated by weighted Hamming distance instead of Hamming distance. For example, (Huang, Wei, and Zhang 2017; Fan et al. 2013) are designed to learn the weights for the binary image descriptors to improve their discriminative power for image matching. And (Duan et al. 2015; Weng et al. 2016) are designed to learn the weights for the binary codes generated by different hashing methods to improve their search accuracy for multimedia retrieval. Although weighted binary codes can alleviate the ambiguity problem, querying from the binary codes by weighted Hamming distance is slower than that by Hamming distance. To accelerate the querying process from the weighted binary codes, some methods (Gordo et al. 2014) use lookup tables to compute the query-independent values in advance. However, it is still an exhaustive linear scan. Some methods (Duan et al. 2015; Norouzi, Punjani, and Fleet 2014) use Hamming distance to ﬁnd the neighbors that have the smallest Hamming distance to the query and rank them according to the weighted Hamming distance. This non-exhaustive way is fast but cannot return the nearest weighted binary codes of the query accurately, resulting in a degraded performance of weighted binary codes in the application. In this paper, we propose a new method to rank the weighted binary codes and return the nearest weighted binary codes of the query in a non-exhaustive but accurate way. The diagram of our method is shown in Fig. 1. Based on the multi-index hash tables (Norouzi, Punjani, and Fleet 2014) on the binary code substrings, our method can efﬁciently choose the candidates in each table and merge the candidates to select the nearest weighted binary codes of the query. Theoretical analysis is provided to prove the our

h1,1 Ă h1,b/m ĂĂ h1,(m-1)*b/m+1 Ă h1,b Ű Ű Ű Ű Ű Ű hn,1 Ă hn,b/m ĂĂ hn,(m-1)*b/m+1 Ă hn,b substring 1 ĂĂ substring m

database codes

substring as index.

IDs stored in the table buckets.

multi-index hash tables

weighted hashing method

table bucket finding algorithm for each table

indices of the buckets to be probed

table merging algorithm

return K identifiers of nearest weighted binary codes

candidates in the buckets

no binary query and weights

substring as index.

IDs stored in the table buckets.

if K-NN are found

candidates in the max-heap yes

Figure 1: The diagram of our method to ﬁnd the K nearest weighted binary codes of the query.

method can return the same ranking result as the linear scan does on the weighted binary codes. And the experiments show that our method is much faster than the linear scan.

Related Work Multi-Index Hash Tables on Binary Codes To avoid the exhaustive linear search on the binary codes, multi-index hash tables (Norouzi, Punjani, and Fleet 2014) are built to accelerate the search on the binary codes and to return the K nearest binary codes of the query in a nonexhaustive way. In the multi-index tables (Norouzi, Punjani, and Fleet 2014), to index the binary codes from the database, m different hash tables are built based on m disjoint substrings of the binary codes as the index. If a binary code differs from the query by r bits, it is an r-neighbor of the query. And the multi-index tables can ﬁnd the r-neighbors of the query efﬁciently by probing each table. To return the K nearest binary codes of the query, the Hamming search radius r is progressively increased to ﬁnd the r-neighbors of the query, until K nearest binary codes are found. In (Norouzi, Punjani, and Fleet 2014), the author mentioned that the multi-index tables can be used to return the top K weighted binary codes by using Hamming distance to ﬁnd the candidates that have the smallest Hamming distance to the query and culling them according to the weighted Hamming distance. However, this method cannot return the K nearest weighted binary codes accurately. When increasing the search radius progressively until K neighbors are found, it guarantees that the binary codes that are found have smaller Hamming distance to the query than the ones

that are not found. In contrast, it cannot guarantee that these binary codes have the smaller weighted Hamming distance than the ones that are not found. The binary codes which have the larger Hamming distance from the query may have the smaller weighted Hamming distance.

Querying from Weighted Binary Codes As shown in Fig. 1, based on the multi-index hash tables, our method is composed of the table bucket ﬁnding algorithm and the table merging algorithm. Since we focus on ﬁnding the nearest neighbors of the query in the weighted Hamming space, in the following, we use the K-Nearest Neighbors (K-NN) of the query to denote the K nearest weighted binary codes of the query.

Table Bucket Finding Algorithm We start with a single-index hash table and propose a table bucket ﬁnding algorithm to ﬁnd the table buckets in the single-index table. To further solve the long-code problem mentioned in (Norouzi, Punjani, and Fleet 2014), we extend to the multi-index hash tables and use the table bucket ﬁnding algorithm in each table. A table merging algorithm is proposed to merge the candidates from each table. Assume a binary query q {0, 1}b, a binary code g {0, 1}b and the weight functions wi( ) for each bit are given, where b is the length of the binary code and wi : {0, 1} R. The weighted Hamming distance between the query q and the binary code g is deﬁned as:

i=1 wi(qi gi), (1)

where is an xor operation, wi( ) is a weight function for the ith bit, qi is the ith bit of q, and gi is the ith bit of g. Instead of ﬁnding the K-NN of the query q exhaustively, a single-index hash table is built by using the binary codes as the index of the hash table buckets. We probe the buckets in order from smallest to largest according to their weighted Hamming distance to the query, and take the identiﬁers as candidates in each probed table bucket until K candidates are found. These K candidates are the K-NN of the query. The whole process of ﬁnding the buckets in order from smallest to largest can be regarded as multiple sequences combination problem (one bit represents one sequence). A algorithm (Matsui, Yamasaki, and Aizawa 2018; Babenko and Lempitsky 2015) is used to solve the multiple sequences combination problem. However, this algorithm is not suitable in this situation. The algorithm can only traverse a few sequences (e.g. 2 or 4) simultaneously to ﬁnd the combination composing the bucket index that have the smallest weighted Hamming distance to the query. But in this situation, we have b sequences where b is much larger than 4 such that the traversal space is very large. Based on the characteristic of the weighted binary codes, we propose a table bucket ﬁnding algorithm to ﬁnd the bit combination which can compose the bucket index with the smallest weighted Hamming distance to the query. In the process of searching for the nearest neighbors of the query, since the query is ﬁxed in each comparison between the query and the binary codes, the weight values for the xor result between the binary codes and the query can be precomputed and stored. Hence, Eqn. (1) is rewritten as

i=1 ˆwi(gi), (2)

where gi is the ith bit of g, ˆwi : {0, 1} R is a function to store the pre-computed weight value for the ith bit and is deﬁned as: ˆwi(0) = wi(0 qi) ˆwi(1) = wi(1 qi). (3)

As the input values of the function ˆwi( ) are 0 or 1, correspondingly, there are two output values of ˆwi( ). To construct a b-bit binary code h= [h1 . . . hb] that has the smallest weighted Hamming distance (smallest sum of weights) to the query, each bit hi of h is obtained as

hi = 0 ˆwi(0) ˆwi(1) 1 otherwise. (4)

When the ith bit of h is changed (i.e. from 0 to 1 or from 1 to 0), we use hi to denote the changed bit. When the bit is changed, the weight for this bit will increase. The increased weight Δ ˆwi of the ith bit is deﬁned as

Δ ˆwi = ˆwi( hi) ˆwi(hi). (5)

The bits are ranked from smallest to largest according to Δ ˆwi in advance. The leftmost bit has the smallest increased weight. After ranking the bits and constructing the smallest binary code h, to give the buckets to be probed in order from smallest to largest, we maintain a priority queue. The top of the

priority queue is the binary code that has the smallest sum of weights in the queue. h is the ﬁrst one that is pushed into the priority queue. When taking out the top binary code h from the priority queue and probing the corresponding hash bucket, two new binary codes are constructed from h by two different operations and pushed into the priority queue, respectively. Operation 1 is to construct a binary code by changing the unchanged bit right next to the rightmost changed bit of h if the rightmost changed bit is not at the end of the current binary code. For example, assume h = [h1 . . . hr . . . hb], where hr is rightmost changed bit. Then, the new binary code is constructed as ˆh = [h1 . . . hr hr+1 . . . hb]. Operation 2 is to construct a binary code by moving the rightmost changed bit of h to the next bit if the position of the rightmost changed bit is not at the end. For example, assume h = [h1 . . . hr . . . hb], where hr is rightmost changed bit. Then, the new binary code is constructed as h = [h1 . . . hr hr+1 . . . hb]. For both operations, the new binary code has a larger sum of weights than the current one since the bits are ranked from smallest to largest according to Eqn.(5) in advance. It should be noted that for the initial binary code h which has no changed bit, only the ﬁrst operation is permitted, which means to change the ﬁrst bit of the binary code. The pseudocode for querying with the single-index hash table is shown in Alg. 1. Init() is a function that constructs the binary code h which has the smallest sum of weights according to the query q and the weights w, and generates an order that denotes the positions of bits from smallest to largest according to Eqn.(5). Operation1() and Operation2() are two functions corresponding to above two operations to generate the new binary codes, respectively.

Algorithm 1 Querying with single-index hash table Input: q, table, K, weights w Output: u a set of ranked identiﬁers 1: u 2: pri que priority queue 3: [pri que, order] Init(q, w) 4: while |u| < K do 5: code pri que.top() 6: pri que.pop() remove top item from queue 7: pri que.push(Operation1(code, order)) 8: pri que.push(Operation2(code, order)) 9: ˆu table.bucket(code) identiﬁers in the bucket 10: u.extend(ˆu) 11: end while

To prove that our algorithm can always ﬁnd the binary code that has the smallest sum of weights among the unprobed binary codes, we begin with the following corollary. Corollary 1 : Every binary code can be generated by above two operations. Proof. A detailed proof is provided in Appendix A in the supplemental material. We prove the correctness of our algorithm as follows:

Theorem 1 : The binary code that has the smallest sum of weights among the un-probed binary codes is always in the priority queue. Proof. A detailed proof is provided in Appendix B in the supplemental material.

Table Merging Algorithm As described in (Norouzi, Punjani, and Fleet 2014), when the length of the binary code increases, the range of the table index expands and there are more table buckets in the table where a lot of buckets are empty. Traversing these empty table buckets is inefﬁcient. To solve this problem, following (Norouzi, Punjani, and Fleet 2014), m different hash tables are built based on the m disjoint substrings of the binary codes as the index in our method. The length of each substring is b/m or b/m . For convenience, we assume that b can be divided by m, and that the substrings comprise continuous bits. Before describing the following table merging algorithm, we deﬁne f : {0, 1}b/m R as a function to calculate the sum of weights of the substring s in the table, which is:

j=1 ˆwj(sj), (6)

where sj is the jth bit of s and ˆwj is the weight function for the jth bit in the corresponding table. When given a query, each table maintains a priority queue according to the sum of weights of the corresponding substring. The priority queues operate the same as in Alg. 1 to ﬁnd the un-probed bucket which is indexed by the corresponding substring and has the smallest sum of weights. Then, in each round we take out the top substring of each priority queue. By treating the substring as the index of the hash table bucket in the corresponding table, we can probe the table buckets and take the identiﬁers in each bucket as the candidates. To merge the candidates from each bucket and determine if the K-NN of the query are found, a K-size max-heap is built to ﬁlter the candidates. The root node of the maxheap has the largest sum of weights in the heap. Assume the node r {0, 1}b in the max-heap is in the form of r = [r1, . . . , rm] where ri is the substring of r in the ith table. And a function g : {0, 1}b R to calculate the sum of weights of the node is deﬁned as

i=1 f(ri). (7)

For each round, when the identiﬁers are taken from each table, they are compared to the root node in the max-heap. If an identiﬁer ˆr has a smaller sum of weights than the root node r (i.e. g(ˆr) < g(r)), the root node is thrown away and the identiﬁer is inserted into the max-heap. The process continues for multiple rounds until the root node of the maxheap is smaller or equal to a threshold. In detail, assume there are m tables and a b-bit binary code h is partitioned into m disjoint substrings s. When the top

substring si of the ith priority queue is taken out, the queue will have the new top substring ˆsi. The associated identiﬁers from the table bucket si of the ith table are taken out and compared with the root node of the max-heap. The sum of weights from the top substring of each current priority queue is calculated as

i=1 f(ˆsi). (8)

If the max-heap has K nodes and the root node r of the max-heap is smaller or equal to the sum of weights from the top substring of each current priority queue (i.e. g(r) S), the top K nearest neighbors are found and the process stops. We prove that the table merging algorithm can ﬁnd the K-NN of the query with Theorem 2. Theorem 2 : The binary codes that are found and stored in the max-heap have the smallest sum of weights among all binary codes. Proof. A detailed proof is provided in Appendix C in the supplemental material. We can further accelerate the searching process by reducing the number of the hash table buckets to be probed. In every round, assume the priority queues are ranked in some order sorder. If the top substrings from the ﬁrst j queues are taken out, deﬁne the current sum of weights as below

i=1 f(ˆsorder[i])+

i=j+1 f(sorder[i]). (9)

Obviously, ˆS S according to Eqn. (7). The searching process terminates when g(r) ˆS. It has been proved that the binary codes that are found are the smallest among all the binary codes. From the equation, we can see that when the substring in the ith queue is taken out, ˆS will increase Δfi, which is deﬁned as

Δfi = f(ˆsi) f(si). (10)

We want to make ˆS smallest among all the orders such that the root node of the max-heap can be smaller or equal to ˆS faster. Hence, the order can be obtained by ranking the priority queues from smallest to largest according to Eqn.(10). The pseudocode for querying with multi-index hash tables is shown in Alg. 2. Init(), Operation1() and Operation2() are the same functions as in the Alg. 1. m denotes the number of substrings for the binary code. table[], pri que[], order[] denotes a set of tables, a set of priority queues, and a set of bit rankings for each substring, respectively. Split() is a function that splits the binary code and the weights into m parts. max heap.satisﬁed() denotes whether max-heap has K nodes and the root node of the max-heap satisﬁes the stopping criterion. que order denotes the order of the priority queues to be checked. Sort() is a function that determines the ranking of the priority queues from smallest to largest according to Eqn.(10). As our method performs the querying process from the weighted binary codes based on the multi-index hash tables, we call it Multi-Index Weighted Querying (MIWQ).

Algorithm 2 Querying with multi-index hash tables Input: q, table[], K, m, weights w Output: max heap 1: max heap 2: [pri que[], order[]] Split(Init(q,w),m) 3: while !max heap.satisﬁed() do 4: for i 1 to m do 5: code[i] pri que[i].top() 6: pri que[i].pop() 7: pri que[i].push(Operation1(code[i], order[i])) 8: pri que[i].push(Operation2(code[i], order[i])) 9: end for 10: que order Sort(pri que[].top(), code[]) 11: for i 1 to m do 12: cur = que order[i] 13: max heap.insert(table[cur].hash(code[cur])) 14: if max heap.satisﬁed() then 15: break 16: end if 17: end for 18: end while

Table 1: The precision results on Places205.

precision (%) 1-NN 10-NN 100-NN

Baseline 21.56 22.23 19.83 MIH 22.81 23.20 20.86 Linear Scan 25.29 24.97 22.64 MIWQ 25.29 24.97 22.64

Baseline 31.17 29.71 26.67 MIH 31.41 30.27 27.46 Linear Scan 34.46 31.96 28.72 MIWQ 34.46 31.96 28.72

EXPERIMENTS Datasets and Environment The experiments are performed on the three datasets: Places205, GIST1M and SIFT1B. The Places205 dataset (Zhou et al. 2014) is a scenecentric dataset with 205 scene categories. For each category, we randomly choose 5,000 images for search and 50 images as queries. Hence, we have 1,025,000 images for search and 10,250 queries. Each image is represented by a 128-D feature (Cakir et al. 2017). The features are extracted from the fc7 layer of Alex Net (Krizhevsky, Sutskever, and Hinton 2012) pre-trained on Image Net and reduced to 128 dimensions by PCA. GIST1M dataset (Jegou, Douze, and Schmid 2011) contains 1 million 960-D GIST descriptors (Oliva and Torralba 2001) which are global descriptors, and extracted from Tiny image set (Torralba, Fergus, and Freeman 2008). The dataset contains 1000 queries. SIFT1B dataset (Jegou, Douze, and Schmid 2011) contains 1 billion 128-D SIFT descriptors (Lowe 2004) and 10000 queries. In the experiments, to evaluate the efﬁciency and the ac-

Table 2: The precision results on GIST1M.

precision (%) 1-NN 10-NN 100-NN

Baseline 11.90 7.57 5.11 MIH 12.10 8.46 5.92 Linear Scan 13.20 9.94 7.31 MIWQ 13.20 9.94 7.31

Baseline 21.10 14.47 10.16 MIH 21.50 16.11 11.55 Linear Scan 24.90 19.14 13.65 MIWQ 24.90 19.14 13.65

32-1 32-1032-100 64-1 64-1064-100 bit-nearest neighbors

candidates (log2)

32-1 32-1032-100 64-1 64-1064-100 bit-nearest neighbors

table buckets (log2)

(a) candidates (b) buckets

Figure 2: Comparison between MIWQ and MIH on SIFT1B.

curacy of different querying methods on the weighted binary codes, the classical data-independent hashing algorithm Locality-Sensitive Hashing (LSH) (Andoni and Indyk 2006) is used to map high-dimensional vectors into binary codes, and a weighted hashing method, Asymmetric Distance (Asym) (Gordo et al. 2014) is used to generate the weights for each bit of binary codes. All the experiments are run on a single core Intel Core-i7 CPU with 32GB of memory. The comparison of the querying methods on other binary codes and other weights is provided in Appendix D in the supplementary material to show the generality of our method.

Comparison to Different Querying Methods Precision@K is usually used to measure the accuracy of the approximate K-NN search. (Wang et al. 2018; Matsui, Yamasaki, and Aizawa 2018). Here, we use precision@K to evaluate whether our method can return the same results as the linear scan returns, and compare the performance of weighted binary codes in the approximate K-NN search with that of binary codes. The precision@K is deﬁned as the fraction of the true retrieved neighbors to the retrieved neighbors. It is formulated as follows

precision@K = the true retrieved neighbors

For Places205, the ground truth refers to as the true neighbors the identiﬁers that have the same label as the query. For GIST1M and SIFT1B, the ground truth refers to as the true neighbors the top 1000 identiﬁers selected by linear scan with the Euclidean distance from the query in the original space, i.e. Euclidean space. The precision@K results on Places205 and GIST1M are shown in Table 1 and Table 2, respectively. In the ta-

Table 3: The average time for the query on Places205.

speed-up factors for K-NN 1-NN 10-NN 100-NN time(ms) speed-up factor time(ms) speed-up factor time(ms) speed-up factor

Linear Scan 23.06 1.0 23.06 1.0 23.06 1.0 MIH 0.11 209.6 0.21 109.8 0.52 44.3 MIWQ 0.11 209.6 0.21 109.8 0.55 41.9

Linear Scan 33.83 1.0 33.83 1.0 33.83 1.0 MIH 0.51 66.3 1.03 32.84 2.4 14.0 MIWQ 0.63 53.6 1.42 23.82 3.65 9.2

Table 4: The average time for the query on GIST1M.

speed-up factors for K-NN 1-NN 10-NN 100-NN time(ms) speed-up factor time(ms) speed-up factor time(ms) speed-up factor

Linear Scan 22.17 1.0 22.17 1.0 22.17 1.0 MIH 0.1 221.7 0.16 138.5 0.37 59.9 MIWQ 0.12 184.7 0.25 88.6 0.77 28.7

Linear Scan 39.36 1.0 39.36 1.0 39.36 1.0 MIH 0.87 45.2 1.65 23.8 3.15 12.4 MIWQ 1.84 21.3 3.9 10.0 8.17 4.8

Table 5: The average time for the query on Places205 with longer binary codes.

speed-up factors for K-NN 1-NN 10-NN 100-NN time(ms) speed-up factor time(ms) speed-up factor time(ms) speed-up factor

Linear Scan 65.93 1.0 65.93 1.0 65.93 1.0 MIH 2.08 31.6 4.19 15.7 8.50 7.7 MIWQ 3.13 21.0 6.71 9.8 15.25 4.3

Linear Scan 103.92 1.0 103.92 1.0 103.92 1.0 MIH 5.68 18.2 11.80 8.8 21.22 4.8 MIWQ 9.30 11.1 21.97 4.7 39.28 2.6

Table 6: The precision results for the query on SIFT1B.

bit method precision (%) 1-NN 10-NN 100-NN

Baseline 1.61 1.66 1.77 MIH 3.04 2.04 1.48 Linear Scan 3.25 2.77 2.66 MIWQ 3.25 2.77 2.66

Baseline 7.62 8.26 9.07 MIH 25.18 20.22 10.52 Linear Scan 26.77 21.18 14.13 MIWQ 26.77 21.18 14.13

bles, Baseline denotes querying from the binary codes according to Hamming distance, Linear Scan denotes querying from the binary code by the linear scan according to weighted Hamming distance, and Multi-Index Hashing (MIH) (Norouzi, Punjani, and Fleet 2014) is a nonexhaustive but inexact querying method for the binary codes according to weighted Hamming distance. These querying methods are all implemented in C++. For MIH and our

method, MIWQ, we use the same heuristic (Norouzi, Punjani, and Fleet 2014) to determine the number of the substrings m, which is b/log2n where b is the length of the binary code and n is the data size. According to the results, MIH can achieve higher search accuracy than Baseline, but is inferior to MIWQ. As MIH cannot return the K nearest weighted binary codes accurately, MIH is inferior to MIWQ. Since MIWQ achieves the same search accuracy as Linear Scan, it shows that MIWQ can return the K nearest weighted binary codes of the query accurately. The speed-up factor is used to measure how fast our method and MIH are compared to the linear scan on the weighted binary codes. The speed-up factor is deﬁned as dividing the run-time cost of the linear scan by the run-time cost of the test method, which is formulated as follows

speed up factor = time cost of linear scan

time cost of test method (12)

Table 3 and Table 4 shows the average time for each query of returning the different amounts of Nearest Neighbors (NN) on Places205 and GIST1M, respectively. Linear

Table 7: The average time for the query on SIFT1B.

speed-up factors for K-NN 1-NN 10-NN 100-NN time(ms) speed-up factor time(ms) speed-up factor time(ms) speed-up factor

Linear Scan 22380.15 1.0 22380.15 1.0 22380.15 1.0 MIH 6.13 3650.9 6.14 3644.9 6.38 3507.8 MIWQ 6.15 3639.0 6.31 3546.7 6.53 3427.3

Linear Scan 43623.33 1.0 43623.33 1.0 43623.33 1.0 MIH 7.77 5614.3 11.42 3819.9 22.96 1899.9 MIWQ 7.78 5607.1 12.65 3448.4 32.17 1356.0

Table 8: Time cost (ms) on SIFT1B. bit-K NN MIWQ PQTable PQTablemax heap 32-1 6.15 51.31 50.44 32-10 6.31 51.22 50.48 32-100 6.53 51.43 50.55 64-1 7.78 57.26 54.53 64-10 12.65 64.55 61.21 64-100 32.17 92.11 85.33

Scan is accelerated by adopting the look-up tables (Gordo et al. 2014). From the results, we can see that MIH and MIWQ are both faster than Linear Scan in all the cases. MIWQ is comparable or a litter inferior to MIH in the average time. From the tables, we can see that the query time of both MIH and MIWQ for 100-NN is larger than that for 1-NN. To return more neighbors about the query, more buckets need to be probed, resulting in a larger time cost.

Case of Longer Binary Codes

In the above experiments, we analyze the performance of our method for 32 bits and 64 bits, which are the commonly used length of binary codes for the hashing methods. In some situations, longer binary codes (such as 128 bits and 256 bits) are used to achieve higher search accuracy but with additional storage cost. Here, we analyze the performance of our method in the case of long binary codes. Table 5 shows the average time for the query on Places205. For 128 bits and 256 bits, MIWQ can still accelerate the search on the binary codes. By comparing Table 5 to Table 3, the speed-up factors for 128 bit and 256 bits are smaller than the ones for 32 bits and 64 bits.

Case of Larger Dataset

Table 6 shows the precision@K results on SIFT1B. From the results, we can see that MIWQ still achieves better search accuracy than MIH and Baseline. Table 7 shows the average time for the query on SIFT1B. From the results, we can see that MIH and MIWQ both have a large improvement on the speed compared to linear scan. MIWQ achieves almost the same time cost as MIH does for 32 bits and 64 bits. In the aspect of comparing candidates, MIWQ compares candidates by using weighted Hamming distance, while MIH compares candidates by using

Hamming distance at ﬁrst and then culls candidates by using weighted Hamming distance. Hence, MIH has a smaller time cost than our method. However, the factors to affect the search efﬁciency is not only the distance computation, but also the number of the candidates to be compared and the number of the table buckets to be probed. The number of candidates and the number of table buckets are shown in Fig. 2. Since the average number of candidates and table buckets in our method are both smaller than those in MIH, our method can be almost as fast as MIH.

Recently, PQTable (Matsui, Yamasaki, and Aizawa 2018) is proposed to perform an efﬁcient search for Product Quantization (PQ) (Jegou, Douze, and Schmid 2011) which is another encoding method. With some modiﬁcations, PQTable can be used for querying from the weighted binary codes. The binary codes are split into disjoint parts each of which consists of continuous 8-bit binary codes. Then, each part can be regarded as a codebook, and PQTable is applied. Since PQTable and our method both can return the nearest weighted binary codes of the query accurately, we compare them with respect to the running time. As PQTable is also based on the multi-index tables, to further explore the difference between our method and PQTable, our table merge algorithm is applied to PQTable and replaces the table merge algorithm of PQTable, which is dubbed as PQTablemax heap. Table 8 shows the time comparison between MIWQ, PQTable and PQTablemax heap. According to the results, our table merging algorithm is faster than that of PQTable, which shows that our table merging algorithm can terminate the process by determining whether the K-NN of the query have been found faster than that of PQTable. Comparing MIWQ with PQTablemax heap, with same table merging algorithm, MIWQ is faster than PQTablemax heap, especially for the 32-bit case. As MIWQ and PQTablemax heap both perform the exact K-NN search, the order of table buckets to be probed is the same. The difference between them is the process to ﬁnd the next hash table bucket. Since our method exploits the characteristics of the weighted binary codes, the bucket candidate space to traverse from our method is smaller than that of PQTable. Hence, our method can ﬁnd the next smallest un-probed binary bucket faster than PQTable.

Conclusion In this paper, a new querying method is proposed to return the nearest weighted binary codes of the query in a nonexhaustive way. The method consists of two algorithms, the table bucket ﬁnding algorithm and the table merging algorithm. The former one is designed to consecutively ﬁnd the un-probed table buckets, and the latter one is developed to merge the candidates from each table. The experiments show that our method can produce the same querying results as linear scan does with a large time speed-up on the largescale dataset which includes up to 1 billion data points.

Acknowledgements This work is supported by Shenzhen Municipal Development and Reform Commission (Disciplinary Development Program for Data Science and Intelligent Computing), Key Area Research and Development Program of Guangdong Province (#2019B010137001), NSFC-Shenzhen Robot Jointed Founding (U1613215).

References Andoni, A., and Indyk, P. 2006. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In 47th Annual IEEE Symposium on Foundations of Computer Science, 459 468. Babenko, A., and Lempitsky, V. 2015. The inverted multiindex. IEEE TPAMI 37(6):1247 1260. Balntas, V.; Tang, L.; and Mikolajczyk, K. 2018. Binary online learned descriptors. IEEE TPAMI 40(3):555 567. Cakir, F.; He, K.; Bargal, S. A.; and Sclaroff, S. 2017. Mihash: Online hashing with mutual information. In Proceedings of the ICCV, 437 445. Duan, L. Y.; Lin, J.; Wang, Z.; Huang, T.; and Gao, W. 2015. Weighted component hashing of binary aggregated descriptors for fast visual search. IEEE TMM 17(6):828 842. Fan, B.; Kong, Q.; Yuan, X.; Wang, Z.; and Pan, C. 2013. Learning weighted hamming distance for binary descriptors. In Proceedings of the ICASSP, 2395 2399. Gordo, A.; Perronnin, F.; Gong, Y.; and Lazebnik, S. 2014. Asymmetric distances for binary embeddings. IEEE TPAMI 36(1):33 47. Huang, Z.; Wei, Z.; and Zhang, G. 2017. Rwbd: Learning robust weighted binary descriptor for image matching. IEEE TCSVT PP(99):1 1. Jegou, H.; Douze, M.; and Schmid, C. 2011. Product quantization for nearest neighbor search. IEEE TPAMI 33(1):117 128. Krizhevsky, A.; Sutskever, I.; and Hinton, G. E. 2012. Imagenet classiﬁcation with deep convolutional neural networks. In Proceedings of the Neur IPS, 1097 1105. Leutenegger, S.; Chli, M.; and Siegwart, R. Y. 2011. Brisk: Binary robust invariant scalable keypoints. In Proceedings of the ICCV, 2548 2555. Li, C.; Deng, C.; Wang, L.; Xie, D.; and Liu, X. 2019. Coupled cyclegan: Unsupervised hashing network for crossmodal retrieval. In Proceedings of the AAAI, 176 183.

Lin, M.; Ji, R.; Liu, H.; Sun, X.; Wu, Y.; and Wu, Y. 2019. Towards optimal discrete online hashing with balanced similarity. In Proceedings of the AAAI, 8722 8729. Liu, W.; Mu, C.; Kumar, S.; and Chang, S.-F. 2014. Discrete graph hashing. In Proceedings of the Neur IPS, 3419 3427. Liu, X.; Deng, C.; Mu, Y.; and Li, Z. 2017. Boosting complementary hash tables for fast nearest neighbor search. In Proceedings of the AAAI, 4183 4189. Liu, H.; Ji, R.; Wang, J.; and Shen, C. 2019. Ordinal constraint binary coding for approximate nearest neighbor search. IEEE TPAMI 41(4):941 955. Lowe, D. G. 2004. Distinctive image features from scaleinvariant keypoints. IJCV 60(2):91 110. Matsui, Y.; Yamasaki, T.; and Aizawa, K. 2018. Pqtable: Nonexhaustive fast search for product-quantized codes using hash tables. IEEE TMM 20(7):1809 1822. Norouzi, M.; Punjani, A.; and Fleet, D. J. 2014. Fast exact search in hamming space with multi-index hashing. IEEE TPAMI 36(6):1107 1119. Oliva, A., and Torralba, A. 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV 42(3):145 175. Rublee, E.; Rabaud, V.; Konolige, K.; and Bradski, G. 2011. Orb: An efﬁcient alternative to sift or surf. In Proceedings of the ICCV, 2564 2571. Song, J.; He, T.; Gao, L.; Xu, X.; Hanjalic, A.; and Shen, H. T. 2018. Binary generative adversarial networks for image retrieval. In Proceedings of the AAAI, 394 401. Torralba, A.; Fergus, R.; and Freeman, W. T. 2008. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE TPAMI 30(11):1958 1970. Wang, J.; Zhang, T.; Song, J.; Sebe, N.; and Shen, H. T. 2018. A survey on learning to hash. IEEE TPAMI 40(4):769 790. Weng, Z.; Yao, W.; Sun, Z.; and Zhu, Y. 2016. Asymmetric distance for spherical hashing. In Proceedings of the ICIP, 206 210. Zhang, L.; Zhang, Y.; Tang, J.; Lu, K.; and Tian, Q. 2013. Binary code ranking with weighted hamming distance. In Proceedings of the CVPR, 1586 1593. Zhou, B.; Lapedriza, A.; Xiao, J.; Torralba, A.; and Oliva, A. 2014. Learning deep features for scene recognition using places database. In Proceedings of the Neur IPS, 487 495.