DRL - deep reinforcement learning
POI - points of interest
CNN - convolutional neural network
SAC - soft actor-critic
DD-PPO - decentralized distributed proximal policy optimization
SPL - success weighted by path length
SPL measures how efficient the agent’s path was compared to an optimal path. (Note: the optimal path is the shortest path from the agent’s starting position to the closest instance of the target object category.)
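As an illustration of the SPL entry, the commonly used definition averages, over N episodes, the success indicator S_i multiplied by l_i / max(p_i, l_i), where l_i is the optimal (shortest) path length and p_i is the length of the path the agent actually took. The sketch below is a minimal version under that assumption; the function name `spl` and its parameter names are hypothetical and do not come from the source.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    agent_lengths: Sequence[float],
) -> float:
    """Success weighted by path length, averaged over episodes.

    Assumes every shortest path length is strictly positive, i.e. the
    agent never starts on top of the target (an assumption of this sketch).
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if len(shortest_lengths) != n or len(agent_lengths) != n:
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, agent_lengths):
        if shortest <= 0:
            raise ValueError("shortest path lengths must be positive")
        if success:
            # Ratio is 1.0 for an optimal path and shrinks as the agent detours.
            total += shortest / max(taken, shortest)
        # Failed episodes contribute 0 regardless of path length.
    return total / n


if __name__ == "__main__":
    # Three episodes: optimal success, success with a 2x detour, failure.
    print(spl([True, True, False], [10.0, 5.0, 8.0], [10.0, 10.0, 8.0]))
```

Taking the maximum in the denominator keeps each episode's score at or below 1 even if the measured agent path is slightly shorter than the reference shortest path.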