Any advice for generating reproducible images across devices?

comfy@lemmy.ml · 5 months ago

Any advice for generating reproducible images across devices?

tal@lemmy.today · 5 months ago

Have you determined that whatever you’re presently doing isn’t reproducible? I mean, if the concern is that tiling is a factor relative to non-tiling, okay, but if someone else is tiling, I’d think that they’d get the same output.

comfy@lemmy.ml · 5 months ago

I haven’t determined that. I only have one device set up to run SD and haven’t organized any test with someone else.

I mean, if the concern is that tiling is a factor relative to non-tiling, okay, but if someone else is tiling, I’d think that they’d get the same output.

That’s true, I’ll check to see if the metadata mentions the tiling was used.

tal@lemmy.today · edit-2 5 months ago

If you’re talking about that VAE tiling feature or Tiled Diffusion or whatever it’s called, I think that it shows up in the text below the image in A1111, and I think that anything that shows up there is also stored in the generated image’s comment metadata.
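If you want to look at that metadata without opening the UI, here is a minimal sketch that lists the text chunks in a PNG using only the Python standard library. It assumes the settings are stored in an uncompressed `tEXt` chunk (I believe A1111 uses the keyword `parameters`, but treat that as an assumption); `zTXt` and `iTXt` chunks are ignored, and the function name is my own.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string.

    Only uncompressed tEXt chunks are handled; CRCs are not verified.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt body is "keyword\0text", both Latin-1 encoded.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length
    return chunks
```

To use it on a file, something like `read_png_text_chunks(open("image.png", "rb").read())` should do, where `image.png` is whatever you generated.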

I don’t normally use Tiled Diffusion, if that’s what you’re referring to, but let me see if I can go generate something with it and check.

checks

Yeah, text: Tiled Diffusion: {"Method": "MultiDiffusion", "Tile tile width": 96, "Tile tile height": 96, "Tile Overlap": 48, "Tile batch size": 4}, gets added to the text below the image and to the image metadata.
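Since that line mixes plain `key: value` pairs with an embedded JSON object, splitting it on commas naively would break the Tiled Diffusion entry apart. A rough sketch of a parser that respects braces and quotes, assuming the settings line is a comma-separated list like the one above (the function name and the sample values other than the Tiled Diffusion block are made up for illustration):

```python
import json


def split_settings(line: str) -> dict:
    """Split an 'A: 1, B: {...}, C: x' settings line into a dict.

    Commas inside {...}, [...] or double quotes do not split fields.
    Escaped quotes inside strings are not handled.
    """
    parts = []
    depth = 0
    in_quotes = False
    start = 0
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch in "{[":
                depth += 1
            elif ch in "}]":
                depth -= 1
            elif ch == "," and depth == 0:
                parts.append(line[start:i])
                start = i + 1
    parts.append(line[start:])

    fields = {}
    for part in parts:
        key, sep, value = part.strip().partition(": ")
        if not sep:
            continue  # skip empty or malformed fragments
        value = value.strip()
        if value.startswith("{"):
            try:
                value = json.loads(value)  # decode embedded JSON objects
            except json.JSONDecodeError:
                pass  # keep the raw string if it is not valid JSON
        fields[key] = value
    return fields


# Illustrative input: Steps and Seed are invented values.
example = ('Steps: 20, Seed: 1234, Tiled Diffusion: {"Method": "MultiDiffusion", '
           '"Tile tile width": 96, "Tile tile height": 96, "Tile Overlap": 48, '
           '"Tile batch size": 4},')
settings = split_settings(example)
print(settings["Tiled Diffusion"]["Tile Overlap"])
```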

That being said, I don’t know how far I’d trust the image metadata for reproducibility if this is a hard requirement you’re looking for. I have definitely seen various settings that mention that they induce non-deterministic behavior, and I’m not sure that all of those are encoded in the metadata. Also, while the version (and what looks like the git hash of the build) is encoded, I’m sure that not everyone is using the same version, and I don’t know what compatibility is like across versions.

EDIT: For example, see the “Optimizations” here:

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations

You have a bunch of A1111 command-line optimization options that have descriptions like:

--opt-sdp-attention May results in faster speeds than using xFormers on some systems but requires more VRAM. (non-deterministic)

And those are not encoded in the image metadata, and that’ll make a given output non-reproducible.
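Because the launch flags never reach the image, one option is to check them yourself before sharing settings. A small sketch, assuming you have the launch arguments as a single string (for example, whatever you pass as command-line arguments to the web UI); the flag set below only contains the one flag quoted above and would need to be extended by hand from the wiki page, and the function name is hypothetical:

```python
import shlex

# Assumption: only the flag quoted from the wiki is listed here.
# Add any others that the Optimizations page marks "(non-deterministic)".
NON_DETERMINISTIC_FLAGS = {"--opt-sdp-attention"}


def risky_flags(commandline_args: str) -> list:
    """Return the flags in the argument string that are listed as non-deterministic."""
    used = set(shlex.split(commandline_args))
    return sorted(used & NON_DETERMINISTIC_FLAGS)


print(risky_flags("--medvram --opt-sdp-attention"))
```

An empty list only means none of the flags you listed were found, not that the run is guaranteed to be deterministic.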

Stampela@startrek.website · 5 months ago

“Better quality” is an interesting concept. Increasing steps, depending on the sampler, changes the image. The seed mode usually changes the image when the size changes.

So, what exactly do you mean with “better quality”?

comfy@lemmy.ml · edit-2 5 months ago

Good call-out. My (naïve) understanding is that tools like tiling VAE to handle low VRAM, and lowering steps in the more stable of the samplers, are going to have a generally negative impact on the result, and a very similar image with better detail could be remade using similar variables on better hardware. Maybe that’s a bit idealistic. Like you said, the seed mode usually changes images with size. (You said ‘usually’, is there a way to minimize this?)

edit: I’m aware ‘better’ and ‘higher quality’ are vague and even subjective terms. But I’m trying to convey something beyond merely higher resolution.

Stampela@startrek.website · edit-2 5 months ago

I have a few examples that I hope retain their metadata.

Seed mode is… basically, I stopped using Automatic1111 a long time ago and kinda lost track of what goes on there, but in the app I use (Draw Things) there’s a seed mode called Scale Alike. Could be exclusive, could be the standard everywhere for all I know. It does what it says: changing resolution will keep things looking close enough.

Edit: obviously at some point they had to lose the bloody metadata….

Even_Adder@lemmy.dbzer0.com · 5 months ago

If you’re using the same UI and metadata, you should be able to reproduce images with only slight differences and then upscale them with hires fix or something else.
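One way to confirm the metadata really does match before comparing outputs is to diff the two sets of settings. A minimal sketch, assuming each image's settings have already been loaded into a plain dict (the function name and the sample keys and values are made up for illustration):

```python
def diff_settings(mine: dict, theirs: dict) -> dict:
    """Return {key: (my_value, their_value)} for every setting that differs.

    A setting missing on one side is reported with None for that side.
    """
    diffs = {}
    for key in sorted(set(mine) | set(theirs)):
        a = mine.get(key)
        b = theirs.get(key)
        if a != b:
            diffs[key] = (a, b)
    return diffs


# Illustrative values only.
mine = {"Seed": "1234", "Steps": "20", "Version": "v1.6.0"}
theirs = {"Seed": "1234", "Steps": "30"}
print(diff_settings(mine, theirs))
```

An empty result means the recorded settings agree; it says nothing about settings that were never written to the metadata in the first place.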