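PixelCNN++ argues that a softmax over pixel intensities does not capture the fact that neighboring intensity values are usually correlated: the softmax treats 127 and 128 as unrelated classes. Instead, the authors propose a discretized mixture of logistic distributions (the logistic is like the normal distribution but with heavier tails).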
https://arxiv.org/pdf/1701.05517.pdf
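
To make the idea concrete, here is a minimal NumPy sketch of a discretized logistic mixture over a single 8-bit intensity. It is an illustration under simplifying assumptions, not the paper's implementation: it works directly on integer levels 0..255 rather than rescaled values, it ignores the conditioning between color channels, and the function name and parameters (`weights`, `means`, `scales`) are hypothetical. Each intensity receives the mixture's CDF mass over the bin around it, and the two edge bins absorb the tails.

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic CDF: sigmoid(z) = 0.5 * (1 + tanh(z / 2)).
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def discretized_logistic_mixture_pmf(weights, means, scales, num_levels=256):
    """Probability of each integer intensity 0..num_levels-1.

    weights, means, scales: one entry per mixture component
    (weights are assumed to be non-negative and to sum to 1).
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    scales = np.asarray(scales, dtype=float)

    # Column of intensity levels, broadcast against the components.
    x = np.arange(num_levels, dtype=float)[:, None]

    # CDF at the upper and lower edge of each bin [x - 0.5, x + 0.5].
    upper = sigmoid((x + 0.5 - means) / scales)
    lower = sigmoid((x - 0.5 - means) / scales)

    # Edge bins take all remaining tail mass, so the result sums to 1.
    upper[-1, :] = 1.0
    lower[0, :] = 0.0

    # Weighted sum of per-component bin masses.
    return ((upper - lower) * weights).sum(axis=1)


# Example: two components, one centered at 100 and one at 180.
pmf = discretized_logistic_mixture_pmf([0.7, 0.3], [100.0, 180.0], [5.0, 10.0])
print(pmf.sum())            # total probability mass
print(pmf[99], pmf[100], pmf[101])  # nearby intensities get similar mass
```

Because the probabilities come from a smooth CDF, adjacent intensities automatically receive similar probability, and the model needs only a few parameters per component instead of 256 softmax logits.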