Introduction

We envision a future with safe, interactive robots that can co-exist with people, which is why we chose the topic "Social Navigation". Social navigation is the type of navigation in which the agent moves toward its goal while avoiding conflicts with pedestrians in the environment. SARL, the state-of-the-art method proposed by Chen et al. [1], explores this problem in a simple environment without any obstacles. In our work, we investigate the problem under more challenging conditions.

Graph Attention Network for Social Navigation (GAT4SN)

a. Graph structure

b. Attention mechanism

Moreover, instead of using only a single attention mechanism, we use a multi-head graph attentional layer (see figure 3), as suggested in [2]. In essence, we compute the attention mechanism K times and concatenate all the outputs column-wise, as shown in the equation below.

$\begin{aligned} \boldsymbol{H}^{K} & = \big\Vert^{K}_{k=1}\,\boldsymbol{H}^{k} \\ & = \begin{bmatrix} \boldsymbol{H}^{1} & \boldsymbol{H}^{2} & \cdots & \boldsymbol{H}^{K-1} & \boldsymbol{H}^{K} \end{bmatrix} \end{aligned}$
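The column-wise concatenation of the K heads can be sketched as follows (a minimal NumPy illustration; the head count, agent count, and feature dimension are placeholder values, not the ones from our experiments):

```python
import numpy as np

def multi_head_concat(heads):
    """Concatenate the outputs of K attention heads column-wise.

    heads: list of K matrices, each of shape (N+1, F') --
    one row for the robot plus N rows for the humans/obstacles.
    Returns H^K of shape (N+1, K * F').
    """
    return np.concatenate(heads, axis=1)

# Toy example: K = 4 heads, N + 1 = 3 agents, F' = 8 features per head.
K, num_agents, F = 4, 3, 8
rng = np.random.default_rng(0)
heads = [rng.standard_normal((num_agents, F)) for _ in range(K)]

H_K = multi_head_concat(heads)
print(H_K.shape)  # (3, 32)
```

Each head keeps its own rows, so row i of `H_K` still describes agent i, just with K feature blocks side by side.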

Furthermore, we feed the concatenated matrix into an output attention mechanism, which serves as an averaging step over the heads. Finally, we extract only the meaningful features from the output of the multi-head graph attentional layer to obtain the vector $\vec{\boldsymbol{h}}^{G}$: the robot feature concatenated with the sum of the human features.

$\begin{aligned} \boldsymbol{H}^{out} & = att_{out}(\boldsymbol{H}^{K}) = \begin{bmatrix} \vec{\boldsymbol{h}}_{r}^{out}\\ \vec{\boldsymbol{h}}_{1}^{out}\\ \vdots \\ \vec{\boldsymbol{h}}_{N}^{out} \end{bmatrix}\\ \vec{\boldsymbol{h}}^{G} & = \big[\,\vec{\boldsymbol{h}}_{r}^{out} \,\big\Vert\, \sum^{N}_{i=1}\vec{\boldsymbol{h}}_{i}^{out}\,\big] \end{aligned}$
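As an illustration, the output attention step and the extraction of $\vec{\boldsymbol{h}}^{G}$ can be sketched in NumPy as follows (the weights `W` and `a`, the dimensions, and the fully connected attention pattern are assumptions made for this sketch, not the exact layer from our implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def output_attention(H_K, W, a):
    """Single attention head acting as att_out (hypothetical weights W, a).

    H_K: (N+1, D) concatenated multi-head features.
    W:   (D, F_out) linear projection.
    a:   (2 * F_out,) attention parameter vector.
    Returns H_out of shape (N+1, F_out).
    """
    Z = H_K @ W                                   # project each node
    n = Z.shape[0]
    # Pairwise logits a^T [z_i || z_j] over a fully connected graph.
    logits = np.array([[a @ np.concatenate([Z[i], Z[j]]) for j in range(n)]
                       for i in range(n)])
    alpha = softmax(logits, axis=1)               # attention coefficients
    return alpha @ Z                              # weighted aggregation

def extract_hG(H_out):
    """h^G = [h_r^out || sum_i h_i^out]; row 0 is the robot."""
    return np.concatenate([H_out[0], H_out[1:].sum(axis=0)])

rng = np.random.default_rng(1)
H_K = rng.standard_normal((4, 32))                # robot + 3 humans
W = rng.standard_normal((32, 16))
a = rng.standard_normal(32)

H_out = output_attention(H_K, W, a)
h_G = extract_hG(H_out)
print(H_out.shape, h_G.shape)  # (4, 16) (32,)
```

Summing the human rows keeps $\vec{\boldsymbol{h}}^{G}$ a fixed size regardless of how many humans are in the scene.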

c. Deep Value Network

We combine our graph attention network with the deep value network proposed by Chen et al. [1], as shown in figure 2. The computed value estimates how good an input state is ($s_r$ and all the other $s^{h/o}_i$). This estimate is then used to choose the robot's next action so as to maximize the cumulative reward.
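To make this concrete, here is a minimal NumPy sketch of a value head and the greedy one-step action selection (the layer sizes, `gamma`, the two-action table, and the `lookahead` helper are all illustrative placeholders, not the architecture or action space from the paper):

```python
import numpy as np

def value_network(h_G, weights):
    """MLP value head: maps the crowd feature h^G to a scalar state value.

    `weights` is a list of (W, b) pairs with ReLU between layers --
    a sketch, not the exact architecture from [1].
    """
    x = h_G
    for W, b in weights[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = weights[-1]
    return (x @ W + b).item()

def select_action(actions, lookahead, gamma=0.9):
    """One-step lookahead: pick the action maximizing r + gamma * V(s').

    actions:   list of candidate actions.
    lookahead: maps an action to (reward, value_of_next_state); here a
               stand-in for forward-simulating the environment.
    """
    return max(actions, key=lambda u: lookahead(u)[0] + gamma * lookahead(u)[1])

# Toy demo with random weights and a hypothetical 2-action space.
rng = np.random.default_rng(2)
weights = [(rng.standard_normal((32, 16)), np.zeros(16)),
           (rng.standard_normal((16, 1)), np.zeros(1))]
h_G = rng.standard_normal(32)
V = value_network(h_G, weights)

table = {"stop": (0.0, 0.1), "go": (0.05, 0.4)}
best = select_action(list(table), lookahead=lambda u: table[u])
print(best)  # go
```

In the actual pipeline, the lookahead comes from propagating the state one step forward and re-evaluating the value network, rather than from a fixed table.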

Figure 2. Deep Value Network

Quantitative Evaluation

Table 1: Simple: Same environment setting as training. Hard: Static obstacle with variable radius ∈ [0.5, 1.5].


Table 2: Results in the 20-human environment.


Visualization

Test case 178: GAT4SN (test_178_gat4sn.gif) vs. SARL (test_178_sarl.gif).

Test case 183: GAT4SN (test_183_gat4sn.gif) vs. SARL (test_183_sarl.gif).

With Even Crazier Scenarios…

20 humans: gat4sn_test_case_40_H20.gif

10 humans + 10 obstacles: H10O10_0.gif, H10O10_1.gif

30 humans: H30O0_0.gif

References

[1] Chen, Changan, et al. "Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforcement learning." 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019.

[2] Veličković, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).

[3] Our code