Date: 14-07-2022, 19:01
The loss has two terms, each enforcing one constraint on the slot representation: slot saliency and slot diversity. Thus, we formulate a loss that encourages the slot representations to capture "time-dependent" features (i.e., things that move). Traditionally, slot representations have been evaluated by inspecting qualitative reconstructions (Greff et al.). The first type is spatial attention models, which attend to different locations in the scene to extract objects (Kosiorek et al., 2018; Eslami et al., 2016; Crawford & Pineau, 2019a; Lin et al., 2020; Jiang et al., 2019), and the second is scene-mixture models, where the scene is modelled as a Gaussian mixture model of scene components (Nash et al., 2017; Greff et al., 2016; 2017; 2019; Burgess et al., 2019). The third major type of object-centric model is keypoint models (Zhang et al., 2018; Jakab et al., 2018), which extract keypoints (the spatial coordinates of entities) by fitting 2D Gaussians to the feature maps of an encoder-decoder model.
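To make the keypoint idea concrete, here is a minimal NumPy sketch of one way to turn feature maps into keypoints and back into 2D Gaussian maps. It is an illustration under stated assumptions, not the implementation of the cited papers: the function names `spatial_softmax_keypoints` and `render_gaussian_maps`, the normalised [-1, 1] coordinate range, and the fixed `sigma` are all hypothetical choices, and the Gaussian "fit" is approximated by taking the softmax-weighted mean location of each feature map.

```python
import numpy as np


def spatial_softmax_keypoints(feature_maps):
    """Estimate one (x, y) keypoint per feature map.

    feature_maps: array of shape (K, H, W), one map per keypoint.
    Returns an array of shape (K, 2) with coordinates in [-1, 1]
    (an assumed convention for this sketch).
    """
    k, h, w = feature_maps.shape
    flat = feature_maps.reshape(k, -1)
    # Subtract the per-map maximum for a numerically stable softmax.
    flat = flat - flat.max(axis=1, keepdims=True)
    probs = np.exp(flat)
    probs /= probs.sum(axis=1, keepdims=True)
    probs = probs.reshape(k, h, w)

    xs = np.linspace(-1.0, 1.0, w)
    ys = np.linspace(-1.0, 1.0, h)
    # Marginalise over rows to get p(x), over columns to get p(y),
    # then take the expected coordinate under each marginal.
    mean_x = (probs.sum(axis=1) * xs).sum(axis=1)
    mean_y = (probs.sum(axis=2) * ys).sum(axis=1)
    return np.stack([mean_x, mean_y], axis=1)


def render_gaussian_maps(keypoints, height, width, sigma=0.1):
    """Render an isotropic 2D Gaussian centred on each keypoint.

    keypoints: array of shape (K, 2) holding (x, y) in [-1, 1].
    Returns an array of shape (K, height, width) with peak value 1.
    """
    xs = np.linspace(-1.0, 1.0, width)[None, None, :]
    ys = np.linspace(-1.0, 1.0, height)[None, :, None]
    kx = keypoints[:, 0][:, None, None]
    ky = keypoints[:, 1][:, None, None]
    sq_dist = (xs - kx) ** 2 + (ys - ky) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


# Usage: two 9x9 feature maps, each with a single strong activation.
maps = np.zeros((2, 9, 9))
maps[0, 2, 6] = 50.0  # row 2, column 6
maps[1, 7, 1] = 50.0  # row 7, column 1
keypoints = spatial_softmax_keypoints(maps)
heatmaps = render_gaussian_maps(keypoints, 9, 9)
```

In this sketch the keypoint coordinates act as a bottleneck: a decoder would see only the rendered Gaussian maps, so the encoder is pushed to place each keypoint on something informative in the scene.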