TY - GEN
T1 - Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention
AU - Kim, Jinkyu
AU - Canny, John
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/22
Y1 - 2017/12/22
N2 - Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Here we explore the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). Our approach is two-stage. In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. We demonstrate the effectiveness of our model on three datasets totaling 16 hours of driving. We first show that training with attention does not degrade the performance of the end-to-end network. Then we show that the network causally cues on a variety of features that are used by humans while driving.
AB - Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Here we explore the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). Our approach is two-stage. In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. We demonstrate the effectiveness of our model on three datasets totaling 16 hours of driving. We first show that training with attention does not degrade the performance of the end-to-end network. Then we show that the network causally cues on a variety of features that are used by humans while driving.
UR - http://www.scopus.com/inward/record.url?scp=85041925881&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2017.320
DO - 10.1109/ICCV.2017.320
M3 - Conference contribution
AN - SCOPUS:85041925881
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 2961
EP - 2969
BT - Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th IEEE International Conference on Computer Vision, ICCV 2017
Y2 - 22 October 2017 through 29 October 2017
ER -