TY - JOUR
T1 - Towards an Interpretable Deep Driving Network by Attentional Bottleneck
AU - Kim, Jinkyu
AU - Bansal, Mayank
N1 - Funding Information:
Manuscript received March 1, 2021; accepted June 13, 2021. Date of publication July 13, 2021; date of current version August 2, 2021. J. Kim is partially supported by the National Research Foundation of Korea grant under NRF-2021R1C1C1009608, the Basic Science Research Program under NRF-2021R1A6A1A13044830, and the ICT Creative Consilience program under IITP-2021-2020-0-01819. This letter was recommended for publication by Associate Editor M. Ghaffari and Editor C. Cadena Lerma upon evaluation of the reviewers’ comments. (Corresponding author: Jinkyu Kim.) Jinkyu Kim is with Waymo LLC, Mountain View, CA 94043 USA, and also with the Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea (e-mail: jinkyukim@korea.ac.kr).
Publisher Copyright:
© 2021 IEEE.
PY - 2021/10
Y1 - 2021/10
AB - Deep neural networks are a key component of behavior prediction and motion generation for self-driving cars. One of their main drawbacks is a lack of transparency: they should provide easy-to-interpret rationales for what triggers certain behaviors. We propose an architecture called Attentional Bottleneck with the goal of improving transparency. Our key idea is to combine visual attention, which identifies what aspects of the input the model is using, with an information bottleneck that enables the model to use only those aspects of the input that are important. This not only provides sparse and interpretable attention maps (e.g., focusing only on specific vehicles in the scene), but also adds this transparency at no cost to model accuracy. In fact, we find improvements in accuracy when applying Attentional Bottleneck to the ChauffeurNet model, whereas accuracy deteriorates with a traditional visual attention model.
KW - Explainable AI (XAI)
KW - deep driving network
UR - http://www.scopus.com/inward/record.url?scp=85110882468&partnerID=8YFLogxK
U2 - 10.1109/LRA.2021.3096495
DO - 10.1109/LRA.2021.3096495
M3 - Article
AN - SCOPUS:85110882468
SN - 2377-3766
VL - 6
SP - 7349
EP - 7356
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 4
M1 - 9483668
ER -