Submitted by kingdroopa t3_10f7dyr in MachineLearning
BlazeObsidian t1_j4v495i wrote
Autoencoders like VAEs should work better than any other models for image-to-image translation. Maybe you can try different VAE models and compare their performance.

Edit: I was wrong.
kingdroopa OP t1_j4v5t9a wrote
Hmm, interesting! Do you have any papers/articles/sources supporting this claim?
BlazeObsidian t1_j4var74 wrote
Sorry, I was wrong. Modern deep VAEs can match SOTA GAN performance for image super-resolution (https://arxiv.org/abs/2203.09445), but I don't have evidence for recoloring.
But diffusion models have been shown to outperform GANs on multiple image-to-image translation tasks, e.g. https://deepai.org/publication/palette-image-to-image-diffusion-models
You could probably reframe your problem as an image colorization task (https://paperswithcode.com/task/colorization), and the SOTA there is still Palette, linked above.
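To make the colorization framing concrete, here's a minimal sketch (my own illustration, not from the Palette paper) of how a recoloring problem can be turned into (grayscale input, color target) training pairs with numpy. A real colorization pipeline would typically work in Lab space and predict the ab channels, but the idea is the same:

```python
import numpy as np

def make_colorization_pair(rgb):
    """Turn an RGB image (H, W, 3, uint8) into a (grayscale input, color target) pair."""
    rgb = rgb.astype(np.float32) / 255.0
    # Standard luminance weights; a Lab-based pipeline would predict ab channels instead.
    gray = rgb @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return gray[..., None], rgb  # model input, training target

# Tiny synthetic example
img = np.random.randint(0, 256, (8, 8, 3), dtype=np.uint8)
x, y = make_colorization_pair(img)
print(x.shape, y.shape)  # (8, 8, 1) (8, 8, 3)
```

The point is just that any dataset of color images gives you supervised pairs for free, since the grayscale input is derived from the target itself.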
kingdroopa OP t1_j4vbaxk wrote
Thanks :) I noticed Palette uses paired images, whilst mine are a bit unaligned. Would you consider it a paired image set, or unpaired? They look closely similar, but don't share pixel information at the top/bottom of the images.
BlazeObsidian t1_j4vc61q wrote
That depends on the extent of the pixel misalignment, I think. If cropping your images is not a solution and a large portion of your images have this issue, the model wouldn't be able to generate the right pixel information for the misaligned sections. But it's worth giving Palette a try if the misalignment is not significant.
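One quick way to quantify how misaligned the pairs actually are, before committing to crops, is phase correlation. A sketch with numpy (assuming pure translation between the two images, which may not hold for your data):

```python
import numpy as np

def estimate_shift(a, b):
    """Estimate the (dy, dx) translation between two grayscale images via phase correlation."""
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-8          # normalize to keep only phase information
    corr = np.fft.ifft2(R).real    # peak location encodes the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map wrap-around indices to signed shifts
    h, w = a.shape
    if dy > h // 2: dy -= h
    if dx > w // 2: dx -= w
    return int(dy), int(dx)

# Synthetic check: shift an image by (3, 5) and recover the offset
rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, (3, 5), axis=(0, 1))
print(estimate_shift(shifted, img))  # (3, 5)
```

If the estimated shifts across your dataset are small and consistent, cropping to the shared region should give you a usable paired set; large or inconsistent shifts point toward treating it as unpaired.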