Illumination-Robust Optical Flow Using Local Directional Pattern

The material provided here has been published in [8].

Abstract

Most variational optical flow methods rely on the well-known brightness constancy assumption, or on higher-order constancy assumptions, to implement the data term of the optimization energy function. Unfortunately, any variation in the lighting within the scene violates the brightness constancy constraint, and the gradient constancy assumption in turn does not work properly under large illumination changes. This paper proposes an illumination-robust constancy assumption based on a robust texture descriptor rather than on brightness. The similarity function used as the data term is obtained by extracting texture features with the local directional pattern descriptor from two consecutive frames within the duality-based total variational optical flow algorithm. In addition, a weighted non-local term that depends on both the color similarity and the occlusion state of pixels is integrated into the optimization process in order to increase the accuracy of the resulting flow field. Experimental results include a qualitative comparison with the proposed approach, which yields state-of-the-art results on the KITTI datasets.

Experimental Results

The proposed variational optical flow model was tested with different feature descriptors on the GROVE2 sequence from the Middlebury datasets, for which ground truth is available. The illumination of the second frame was changed according to:

$I_{o} = uint8\left(255\left(\frac{mI_{i}+a}{255}\right)^{\gamma}\right)$

where $I_i$ and $I_o$ are the input and output frames, respectively, m > 0 is a multiplicative factor, a > 0 is an additive factor and γ > 0 is the gamma correction. The experiments were conducted in Matlab, and the function uint8 quantizes the values to 8-bit unsigned integer format.
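As an illustration, the illumination change above could be reproduced along the following lines. This is a minimal NumPy sketch rather than the Matlab code used in the experiments; the function name `change_illumination`, its default parameters and the way Matlab's `uint8` rounding and saturation are emulated are assumptions made for this example.

```python
import numpy as np


def change_illumination(frame, m=1.0, a=0.0, gamma=1.0):
    """Apply I_o = uint8(255 * ((m * I_i + a) / 255) ** gamma) to an 8-bit frame.

    `frame` is assumed to hold intensities in [0, 255].
    """
    x = (m * frame.astype(np.float64) + a) / 255.0
    # The base is non-negative for m > 0 and a > 0; clip defensively anyway.
    y = 255.0 * np.clip(x, 0.0, None) ** gamma
    # Assumption: emulate Matlab's uint8, which rounds to the nearest
    # integer and saturates to the range [0, 255].
    return np.clip(np.floor(y + 0.5), 0, 255).astype(np.uint8)
```

For instance, `change_illumination(frame, gamma=2.0)` darkens the mid-tones while leaving pure black and pure white unchanged.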

Figure 1 shows a comparison of the average end-point error (AEE) and the average angular error (AAE) of the flow fields obtained with LDP, LDNP and MLDP in a 3 × 3 neighborhood. The effects of different values of m, a and γ have been assessed individually. As shown in figure 1, LDNP is robust against small changes of m, a and γ, whereas LDP and MLDP remain robust against both small and large changes of m and γ. Under changes of a, MLDP yields the smallest AEE and AAE, while LDNP yields the worst values of both.
Fig. 1: (column 1) AEE and (column 2) AAE for LDP, LDNP and MLDP under changes of γ, m and a, respectively. Panel labels: (a), (b), (c) AE = 0.1595, (d) AE = 0.0184, (e) AE = 0.0146, (f) AE = 0.0452.
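For reference, the two error measures can be computed as sketched below. The sketch assumes the definitions commonly used in the Middlebury evaluation: the end-point error is the Euclidean distance between the estimated and ground-truth flow vectors, and the angular error is the angle between the space-time vectors (u, v, 1). The function names and the choice to report the AAE in degrees are assumptions of this example, not taken from the paper.

```python
import numpy as np


def average_endpoint_error(u, v, u_gt, v_gt):
    """Mean Euclidean distance between estimated and ground-truth flow."""
    return float(np.mean(np.sqrt((u - u_gt) ** 2 + (v - v_gt) ** 2)))


def average_angular_error(u, v, u_gt, v_gt):
    """Mean angle, in degrees, between the vectors (u, v, 1) and (u_gt, v_gt, 1)."""
    num = u * u_gt + v * v_gt + 1.0
    den = np.sqrt(u ** 2 + v ** 2 + 1.0) * np.sqrt(u_gt ** 2 + v_gt ** 2 + 1.0)
    # Clip to guard against rounding slightly outside the domain of arccos.
    cos_angle = np.clip(num / den, -1.0, 1.0)
    return float(np.degrees(np.mean(np.arccos(cos_angle))))
```

Both functions expect arrays of equal shape holding the horizontal (u) and vertical (v) flow components.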

Weighted Non-Local Term

The effect of the weighted non-local term on the final proposed algorithm has been evaluated. The AEE and the percentage of bad pixels (BP) of the flow fields obtained for 8 KITTI training sequences were calculated for the proposed TV-L1 optical flow technique based on the three descriptors (LDP, LDNP and MLDP), with and without the weighted non-local term, and are shown in the table below. As shown, the weighted non-local term reduces both the AEE and the BP of the proposed algorithm, owing to the more accurately detected borders, and thus yields more accurate flow fields. Figure 2 visualizes the color flow field, the error image and the error histogram with and without the non-local term. Among LDP, LDNP and MLDP, the proposed algorithm with the weighted non-local term and MLDP as the data term yields the most accurate flow fields.
Fig. 2: Optical flow model. Row 1: original image for sequence 44 of the KITTI datasets. Row 2: resulting flow field without the NL term. Row 3: resulting flow field with the NL term. Row 4: error image and error histogram without the NL term. Row 5: error image and error histogram with the NL term.

Percentage of bad pixels, with AEE in parentheses, for TV-L1 without the non-local term (columns MLDP, LDNP, LDP) and with the non-local term (columns MLDP+NL, LDNP+NL, LDP+NL).
Sequence MLDP LDNP LDP MLDP+NL LDNP+NL LDP+NL
11 35.49% (15.77) 40.21% (11.41) 48.26% (26.26) 29.67% (7.37) 38.01% (11.14) 46.19% (16.78)
15 26.55% (13.21) 28.69% (7.74) 38.27% (18.65) 23.85% (8.31) 26.47% (10.64) 25.58% (7.24)
44 35.46% (14.70) 36.31% (11.27) 41.11% (16.76) 20.42% (4.54) 33.15% (12.00) 38.27% (11.72)
74 61.41% (24.41) 63.74% (23.68) 66.73% (24.15) 56.01% (20.48) 60.50% (21.49) 64.36% (26.59)
117 31.58% (15.22) 31.73% (14.88) 31.91% (15.53) 18.67% (9.16) 26.80% (14.89) 25.27% (7.10)
144 47.96% (20.03) 51.22% (19.27) 54.81% (18.48) 41.05% (17.81) 49.24% (17.09) 49.55% (18.49)
147 18.39% (11.22) 21.96% (21.65) 29.22% (19.18) 11.79% (2.98) 12.74% (3.90) 18.76% (13.35)
181 59.40% (48.78) 73.86% (59.62) 75.83% (58.97) 58.25% (48.68) 64.01% (48.94) 66.07% (53.56)

KITTI dataset

The proposed variational optical flow method was tested on the widely used KITTI optical flow dataset. According to the KITTI benchmark (July 2013), the proposed model with MLDP (MLDP_OF) ranked in 8th position out of 32 current state-of-the-art optical flow algorithms. The KITTI benchmark counts as bad flow vectors all pixels whose estimated flow lies more than 3 pixels away from the ground truth. MLDP_OF has an average of 8.90% bad pixels, whereas the baseline methods [7] and [3] have 30.75% and 24.64%, respectively.
Rank Method Out-Noc Out-All Avg-Noc Avg-All
1 PR-Sf+E 4.08 % 7.79 % 0.9 px 1.7 px
2 PCBP-Flow 4.08 % 8.70 % 0.9 px 2.2 px
3 MotionSLIC 4.36 % 10.91 % 1.0 px 2.7 px
4 PR-Sceneflow 4.48 % 8.98 % 1.3 px 3.3 px
5 TGV2ADCSIFT 6.55 % 15.35 % 1.6 px 4.5 px
6 Data-Flow 8.22 % 15.78 % 2.3 px 5.7 px
7 TVL1-HOG 8.31 % 19.21 % 2.0 px 6.1 px
8 MLDP-OF 8.91 % 18.95 % 2.5 px 6.7 px
9 CRTflow 9.71 % 18.88 % 2.7 px 6.5 px
10 C++ 10.16 % 20.29 % 2.6 px 7.1 px
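The bad-pixel criterion described above can be sketched as follows. This is a minimal example and not the official KITTI evaluation code; the function name `bad_pixel_percentage`, the optional `valid` mask (used here to restrict the evaluation to pixels that have ground truth, for example non-occluded ones) and the `threshold` parameter are assumptions made for illustration.

```python
import numpy as np


def bad_pixel_percentage(u, v, u_gt, v_gt, valid=None, threshold=3.0):
    """Percentage of evaluated pixels whose end-point error exceeds `threshold`."""
    err = np.sqrt((u - u_gt) ** 2 + (v - v_gt) ** 2)
    if valid is None:
        # Assumption: without a mask, every pixel has ground truth.
        valid = np.ones(err.shape, dtype=bool)
    valid = np.asarray(valid, dtype=bool)
    n_valid = np.count_nonzero(valid)
    if n_valid == 0:
        raise ValueError("no valid ground-truth pixels to evaluate")
    # A pixel is bad when its error is strictly above the threshold.
    n_bad = np.count_nonzero(err[valid] > threshold)
    return 100.0 * n_bad / n_valid
```

Passing a mask of non-occluded pixels or a mask of all ground-truth pixels would correspond to the two evaluation settings, here assumed to match the Out-Noc and Out-All columns.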

Real illumination and large displacement test

Furthermore, the proposed variational optical flow method based on the MLDP descriptor was evaluated on eight real image sequences that include illumination changes and large displacements, as well as low-textured areas, reflections and specularities. Table 1 shows the AEE and the percentage of bad pixels for four sequences with illumination changes and large displacements, calculated for the methods proposed in [1], [2], [3], [4] and [5], in addition to the proposed method based on MLDP, the census transform and the gradient constancy. In another experiment, the flow fields estimated with MLDP (3 × 3 and 5 × 5) were visually compared with those of the proposed optical flow method using a data term based on the brightness constancy assumption, as well as one based on the census transform. Figure 3 shows the estimated flow field for sequence 15, which includes illumination changes, together with the error images and the error histograms. In addition, figure 4 shows the same information for sequence 181, which includes large displacements. Among the evaluated approaches, the optical flow model based on MLDP yields the most accurate flow fields compared with the state-of-the-art methods for real images of the KITTI datasets that include both illumination changes and large displacements.

Table 1: Percentage of bad pixels and AEE for the state-of-the-art methods and the proposed method with four sequences from KITTI datasets: sequences 11, 15, 44 and 74, which include illumination changes with the occluded points ground truth.
Method Seq 44 Seq 11 Seq 15 Seq 74 Average
MLDP 20.42% (4.54) 29.67% (7.37) 23.85% (8.31) 56.01% (20.48) 32.49%
Gradient Constancy 29.25% (9.54) 35.72% (10.91) 26.41% (8.47) 59.20% (23.07) 37.64%
OFH 2011 [5] 23.22% (5.11) 37.26% (12.47) 32.20% (9.06) 62.90% (24.00) 38.89%
Census(5 x 5) 35.23% (12.74) 33.93% (9.75) 29.04% (8.70) 57.57% (20.80) 38.94%
Census(3 x 3) 29.55% (10.22) 37.54% (11.14) 33.74% (9.11) 57.43% (20.53) 39.56%
SRB [3] 26.58% (4.67) 40.61% (13.76) 32.85% (9.72) 62.94% (24.27) 40.74%
SRBF [3] 31.83% (5.62) 40.34% (13.96) 35.13% (12.17) 64.89% (24.64) 43.05%
BW [1] 32.44% (5.19) 33.95% (8.50) 47.70% (12.40) 71.44% (25.15) 46.38%
HS [2] 42.96% (6.77) 38.84% (10.72) 58.08% (12.89) 82.14% (28.75) 55.50%
WPB [4] 49.09% (9.20) 49.99% (28.35) 67.28% (28.36) 88.67% (30.68) 63.76%

Table 2: Percentage of bad pixels and AEE for the state-of-the-art methods and the proposed method with four sequences from KITTI datasets: sequences 11, 15, 44 and 74, which include illumination changes with the non-occluded points ground truth.
Method Seq 44 Seq 11 Seq 15 Seq 74 Average
MLDP 8.88% (1.85) 15.09% (2.87) 10.22% (2.72) 49.87% (14.85) 21.02%
Gradient Constancy 16.78% (4.95) 19.43% (4.01) 11.97% (3.52) 53.13% (16.38) 25.33%
Census(5 x 5) 24.30% (7.96) 19.83% (5.06) 15.03% (3.41) 51.10% (15.14) 27.57%
OFH [5] 11.17% (2.44) 24.32% (6.48) 18.34% (3.63) 57.40% (17.25) 27.81%
SRB [3] 14.66% (2.44) 27.83% (6.43) 18.93% (4.05) 57.36% (17.36) 29.69%
Census(3 x 3) 18.26% (6.30) 24.05% (7.28) 20.30% (3.95) 57.43% (17.53) 30.01%
SRBF [3] 20.98% (3.29) 27.78% (6.73) 21.66% (4.53) 59.56% (17.52) 32.49%
BW [1] 22.38% (3.16) 20.54% (3.62) 36.85% (6.67) 67.22% (18.49) 36.75%
HS [2] 34.18% (4.61) 25.98% (6.79) 49.57% (7.95) 79.57% (21.55) 47.32%
WPB [4] 40.85% (5.88) 39.25% (18.75) 60.50% (17.63) 87.02% (24.09) 56.90%

Sequence 11

Fig. 1: (row 1) Two original images for sequence 11 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Sequence 15

Fig. 2: (row 1) Two original images for sequence 15 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Sequence 44

Fig. 3: (row 1) Two original images for sequence 44 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Sequence 74

Fig. 4: (row 1) Two original images for sequence 74 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Table 3: Percentage of bad pixels and AEE for the state-of-the-art methods and the proposed method with four sequences from KITTI datasets: sequences 117, 144, 147 and 181, which include large displacement with the occluded points ground truth.
Method Seq 147 Seq 117 Seq 144 Seq 181 Average
MLDP 11.79% (2.98) 21.67% (9.16) 41.05% (17.81) 59.40% (48.68) 33.48%
OFH [5] 15.04% (4.96) 16.26% (4.33) 42.04% (15.01) 63.86% (50.52) 34.30%
Gradient Constancy 12.28% (3.93) 17.70% (10.81) 44.51% (18.67) 67.63% (58.40) 35.53%
SRB [3] 14.59% (4.85) 24.71% (9.74) 50.67% (19.03) 67.11% (47.70) 39.27%
SRBF [3] 14.79% (5.17) 24.41% (9.92) 50.66% (19.34) 68.41% (48.81) 39.57%
BW [1] 16.98% (5.17) 28.80% (7.86) 46.98% (16.85) 69.04% (45.27) 40.45%
Census(5 x 5) 13.98% (3.41) 27.33% (15.23) 47.68% (16.75) 73.85% (58.59) 40.71%
Census(3 x 3) 14.76% (3.54) 28.80% (15.20) 48.97% (16.83) 73.63% (58.58) 41.54%
HS [2] 24.84% (6.61) 43.24% (15.32) 51.89% (14.81) 74.11% (49.28) 48.52%
WPB [4] 32.72% (8.10) 46.80% (13.67) 52.25% (17.94) 76.00% (50.18) 51.94%

Table 4: Percentage of bad pixels and AEE for the state-of-the-art methods and the proposed method with four sequences from KITTI datasets: sequences 117, 144, 147 and 181, which include large displacement with the non-occluded points ground truth.
Method Seq 147 Seq 117 Seq 144 Seq 181 Average
MLDP 4.66% (1.05) 13.80% (4.19) 28.02% (6.93) 46.81% (21.03) 23.32%
OFH [5] 8.03% (1.98) 9.09% (2.17) 29.62% (6.77) 52.32% (23.46) 24.76%
Gradient Constancy 7.13% (1.25) 9.70% (4.42) 32.25% (8.26) 57.21% (29.92) 26.57%
SRB [3] 7.55% (1.74) 18.11% (5.28) 39.55% (9.33) 56.51% (22.88) 30.43%
SRBF [3] 7.69% (1.97) 17.95% (5.29) 39.64% (9.59) 58.25% (23.78) 30.88%
BW [1] 10.07% (2.20) 22.25% (4.23) 35.01% (8.17) 59.05% (22.58) 31.60%
Census(5 x 5) 6.78% (0.95) 20.52% (9.82) 36.29% (7.71) 65.55% (31.26) 32.29%
Census(3 x 3) 6.63% (1.00) 21.85% (9.59) 37.49% (8.43) 65.29% (31.92) 32.82%
HS [2] 18.52% (3.38) 37.82% (9.77) 41.30% (7.32) 65.77% (23.40) 40.85%
WPB [4] 25.92% (4.43) 41.23% (9.18) 41.53% (8.94) 68.27% (25.96) 44.24%

Sequence 144

Fig. 5: (row 1) Two original images for sequence 144 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Sequence 147

Fig. 6: (row 1) Two original images for sequence 147 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Sequence 117

Fig. 7: (row 1) Two original images for sequence 117 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP.

Sequence 181

Fig. 8: (row 1) Two original images for sequence 181 of the KITTI datasets. Resulting flow field, error image and error histogram for the proposed optical flow model with: (row 2) brightness constancy, (row 3) 3 × 3 census transform, (row 4) 5 × 5 census transform, and (row 5) MLDP. Panel titles: Brightness Constancy, Census 3x3, Census 5x5, HOG 3x3, MLDP.

References

[1] A. Bruhn and J. Weickert. Towards ultimate motion estimation: Combining highest accuracy with real-time performance. In ICCV, pages 749-755, 2005.
[2] B. Horn and B. Schunck. Determining optical flow. Artificial Intelligence, vol. 17, pages 185-203, 1981.
[3] D. Sun, S. Roth, and M. J. Black. Secrets of optical flow estimation and their principles. In CVPR, pages 2432-2439. IEEE, 2010.
[4] M. Werlberger, T. Pock, and H. Bischof. Motion estimation with non-local total variation regularization. In CVPR, pages 2464-2471. IEEE, 2010.
[5] H. Zimmer, A. Bruhn, and J. Weickert. Optic flow in harmony. IJCV, vol. 93(3), pages 368-388, 2011.
[6] L. I. Rudin, S. J. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, vol. 60, pages 259-268, 1992.
[7] C. Zach, T. Pock, and H. Bischof. A duality based approach for realtime TV-L1 optical flow. In DAGM, pages 214-223, 2007.
[8] M. Mohamed, H. Rashwan, B. Mertsching, M. Garcia, and D. Puig. Illumination-Robust Optical Flow Using Local Directional Pattern. IEEE Transactions on Circuits and Systems for Video Technology, 2014.