Local Positional Encoding for Multi-Layer Perceptrons
About
A multi-layer perceptron (MLP) is a type of neural network with a long history of research that has recently been studied actively in computer vision and graphics. A well-known limitation of MLPs is their difficulty in expressing high-frequency signals from low-dimensional inputs. Several studies have proposed input encodings that improve the reconstruction quality of an MLP by pre-processing the input data. This paper proposes a novel input encoding method, local positional encoding, which extends positional and grid encodings. The proposed method combines these two techniques so that a small MLP can learn high-frequency signals: positional encoding with fewer frequencies is used together with a lower-resolution grid, so that the local position and scale in each grid cell are taken into account. We demonstrate the effectiveness of the proposed method on common 2D and 3D regression tasks, where it yields higher-quality results than positional and grid encodings, and results comparable to hierarchical variants of grid encoding, such as multi-resolution grid encoding, with an equivalent memory footprint.
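The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; several details are assumptions made only for illustration: inputs lie in [0, 1]^2, the grid stores one learnable amplitude coefficient per sinusoidal feature in each cell, the nearest cell is looked up without interpolation, and the function names (`positional_encoding`, `local_positional_encoding`, `make_grid`) are hypothetical.

```python
import math
import random


def positional_encoding(t, num_freqs):
    """Sinusoidal encoding of a scalar t: sin and cos at frequencies 2^k * pi."""
    feats = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * t
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats


def local_positional_encoding(x, y, grid, num_freqs):
    """Encode (x, y) in [0, 1]^2 relative to the grid cell that contains it.

    Assumption: grid[cy][cx] holds 4 * num_freqs amplitude coefficients
    (the per-cell "scale"), one for each sinusoidal feature.
    """
    res = len(grid)
    # Cell index, clamped so that a coordinate of exactly 1.0 maps to the last cell.
    cx = min(int(x * res), res - 1)
    cy = min(int(y * res), res - 1)
    # Local position inside the cell, in [0, 1].
    lx = x * res - cx
    ly = y * res - cy
    # Few frequencies suffice because each cell covers only a small region.
    feats = positional_encoding(lx, num_freqs) + positional_encoding(ly, num_freqs)
    coeffs = grid[cy][cx]
    return [c * f for c, f in zip(coeffs, feats)]


def make_grid(res, num_freqs, seed=0):
    """Hypothetical initializer: random coefficients standing in for learned ones."""
    rng = random.Random(seed)
    n = 4 * num_freqs
    return [[[rng.uniform(-1.0, 1.0) for _ in range(n)]
             for _ in range(res)] for _ in range(res)]


if __name__ == "__main__":
    grid = make_grid(res=4, num_freqs=2)
    features = local_positional_encoding(0.3, 0.7, grid, num_freqs=2)
    print(len(features), features)
```

In a full pipeline the resulting feature vector would be fed to a small MLP, and the grid coefficients would be trained jointly with the MLP weights. Interpolating coefficients between neighboring cells, which this sketch omits, would be a natural way to avoid discontinuities at cell boundaries.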
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| SDF Reconstruction | Armadillo | IoU | 95.6 | 16 |
| SDF Reconstruction | Lucy | IoU | 90.2 | 16 |
| SDF Reconstruction | Thai Statue | IoU | 92.5 | 16 |
| Texture Set Compression | Poly Haven based texture dataset 18 texture sets (test) | PSNR | 40.21 | 11 |
| Signed Distance Function Representation | Global Average | IoU | 79.3 | 10 |
| Signed Distance Function Representation | Pitted Stonefish | IoU | 38.9 | 10 |
| Implicit Image Representation | Kodak low resolution (768x512) (test) | PSNR | 45.06 | 9 |