data_loader_config
AspectRatioBucketConfig
target_resolution
The target resolution for all aspect ratios. When generating aspect ratio buckets, the resolution of each bucket is selected to have roughly target_resolution * target_resolution pixels (i.e. a square image with dimensions equal to target_resolution).
start_dim
Aspect ratio bucket resolutions are generated as follows:
- Iterate over 'first' dimension values from start_dim to end_dim in steps of size divisible_by.
- Calculate the 'second' dimension so that the bucket's total pixel count is as close as possible to target_resolution * target_resolution, while still being divisible by divisible_by.
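The two steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name generate_buckets is hypothetical, and treating end_dim as inclusive and rounding to the nearest multiple are assumptions.

```python
def generate_buckets(target_resolution, start_dim, end_dim, divisible_by):
    """Return a list of (first_dim, second_dim) bucket resolutions.

    Hypothetical helper: assumes end_dim is inclusive and that the second
    dimension is rounded to the nearest multiple of divisible_by.
    """
    target_pixels = target_resolution * target_resolution
    buckets = []
    for first in range(start_dim, end_dim + 1, divisible_by):
        # Second dimension that would hit the target pixel count exactly.
        ideal_second = target_pixels / first
        # Snap to the nearest multiple of divisible_by (never below one multiple).
        multiples = max(1, round(ideal_second / divisible_by))
        buckets.append((first, multiples * divisible_by))
    return buckets


buckets = generate_buckets(
    target_resolution=512, start_dim=256, end_dim=768, divisible_by=64
)
for first, second in buckets:
    print(first, second, first * second)
```

Printing the pixel count next to each bucket makes it easy to see how far each resolution drifts from target_resolution * target_resolution.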
Choosing aspect ratio buckets
The aspect ratio bucket resolutions are logged at the start of training with the number of images in each bucket. Review these logs to make sure that images are being split into buckets as expected.
Highly fragmented splits (i.e. many buckets with few examples in each) can 1) limit the extent to which examples can be shuffled, and 2) slow down training if there are many partial batches.
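As a rough way to judge fragmentation from the logged bucket counts, the sketch below counts full and partial batches. It assumes that every batch is drawn from a single bucket; the function name and the example counts are hypothetical.

```python
def count_batches(bucket_counts, batch_size):
    """Return (full_batches, partial_batches) for per-bucket image counts.

    Assumes each batch only contains images from one bucket, so every bucket
    whose count is not a multiple of batch_size yields one partial batch.
    """
    full_batches = 0
    partial_batches = 0
    for count in bucket_counts.values():
        full_batches += count // batch_size
        if count % batch_size != 0:
            partial_batches += 1
    return full_batches, partial_batches


# Hypothetical counts, in the spirit of what the training logs report.
example_counts = {(512, 512): 10, (576, 448): 3, (448, 576): 1}
full, partial = count_batches(example_counts, batch_size=4)
print(f"full batches: {full}, partial batches: {partial}")
```

If partial batches outnumber full ones, widening divisible_by or narrowing the start_dim to end_dim range would reduce the number of buckets.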
ImageCaptionSDDataLoaderConfig
resolution
The resolution for input images. Either a scalar integer representing the square resolution height and width, or a (height, width) tuple. All of the images in the dataset will be resized to this resolution unless the aspect_ratio_buckets config is set.
center_crop
If True, input images will be center-cropped to the target resolution. If False, input images will be randomly cropped to the target resolution.
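The difference between the two crop modes comes down to how the crop window's top-left corner is chosen. The following sketch is illustrative only: the crop_box helper is hypothetical, and it assumes the image is already at least as large as the target resolution.

```python
import random


def crop_box(image_height, image_width, target_height, target_width,
             center_crop, rng=random):
    """Return (top, left, bottom, right) of the crop window.

    Hypothetical helper; assumes the image is not smaller than the target.
    """
    max_top = image_height - target_height
    max_left = image_width - target_width
    if max_top < 0 or max_left < 0:
        raise ValueError("image is smaller than the target resolution")
    if center_crop:
        # Center the window in the image.
        top, left = max_top // 2, max_left // 2
    else:
        # Pick a uniformly random valid offset.
        top, left = rng.randint(0, max_top), rng.randint(0, max_left)
    return top, left, top + target_height, left + target_width


print(crop_box(600, 800, 512, 512, center_crop=True))
print(crop_box(600, 800, 512, 512, center_crop=False))
```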
random_flip
Whether random flip augmentations should be applied to input images.
caption_prefix
A prefix that will be prepended to all captions. If None, no prefix will be added.
DreamboothSDDataLoaderConfig
class_data_loss_weight
The loss weight applied to class dataset examples. Instance dataset examples have an implicit loss weight of 1.0.
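One way to picture the weighting is as a per-example multiplier applied before the losses are averaged. The sketch below is an assumption about how such a weight could be applied, not a description of the trainer's internals; the function name is hypothetical.

```python
def weighted_mean_loss(per_example_losses, is_class_example, class_data_loss_weight):
    """Average per-example losses after weighting class dataset examples.

    Instance examples use a weight of 1.0; class examples use
    class_data_loss_weight. Averaging over the batch size is an assumption.
    """
    weighted = [
        loss * (class_data_loss_weight if is_class else 1.0)
        for loss, is_class in zip(per_example_losses, is_class_example)
    ]
    return sum(weighted) / len(weighted)


# One instance example (loss 1.0) and one class example (loss 2.0).
batch_loss = weighted_mean_loss([1.0, 2.0], [False, True], class_data_loss_weight=0.5)
print(batch_loss)
```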
aspect_ratio_buckets
The aspect ratio bucketing configuration. If None, aspect ratio bucketing is disabled, and all images will be resized to the same resolution.
resolution
The resolution for input images. Either a scalar integer representing the square resolution height and width, or a (height, width) tuple. All of the images in the dataset will be resized to this resolution unless the aspect_ratio_buckets config is set.
center_crop
If True, input images will be center-cropped to the target resolution. If False, input images will be randomly cropped to the target resolution.
random_flip
Whether random flip augmentations should be applied to input images.
TextualInversionSDDataLoaderConfig
caption_templates
A list of caption templates with a single template argument 'slot' in each. E.g.:
- "a photo of a {}"
- "a rendering of a {}"
- "a cropped photo of the {}"
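The templates above use Python-style {} slots, so filling them can be sketched with str.format. The placeholder text my_token and the helper name are hypothetical.

```python
def fill_caption_templates(caption_templates, placeholder_text):
    """Substitute placeholder_text into the single {} slot of each template."""
    return [template.format(placeholder_text) for template in caption_templates]


templates = [
    "a photo of a {}",
    "a rendering of a {}",
    "a cropped photo of the {}",
]
filled = fill_caption_templates(templates, "my_token")  # hypothetical placeholder
for caption in filled:
    print(caption)
```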
keep_original_captions
If True, then the captions generated as a result of the caption_preset or caption_templates will be used as prefixes for the original captions. If False, then the generated captions will replace the original captions.
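The two behaviours can be sketched as follows. The helper name is hypothetical, and joining the generated and original captions with a single space is an assumption; the separator actually used is not stated here.

```python
def build_caption(generated_caption, original_caption, keep_original_captions):
    """Combine a generated caption with the dataset's original caption."""
    if keep_original_captions:
        # Assumption: a single space separates the prefix from the original.
        return f"{generated_caption} {original_caption}"
    # The generated caption replaces the original entirely.
    return generated_caption


print(build_caption("a photo of a my_token", "sitting on a bench", True))
print(build_caption("a photo of a my_token", "sitting on a bench", False))
```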
aspect_ratio_buckets
The aspect ratio bucketing configuration. If None, aspect ratio bucketing is disabled, and all images will be resized to the same resolution.
resolution
The resolution for input images. Either a scalar integer representing the square resolution height and width, or a (height, width) tuple. All of the images in the dataset will be resized to this resolution unless the aspect_ratio_buckets config is set.
center_crop
If True, input images will be center-cropped to the target resolution. If False, input images will be randomly cropped to the target resolution.
random_flip
Whether random flip augmentations should be applied to input images.
shuffle_caption_delimiter
If None, then no caption shuffling is applied. If set, then captions are split on this delimiter and shuffled.
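A minimal sketch of this behaviour is shown below. The helper name is hypothetical, and rejoining the shuffled pieces with the same delimiter (without stripping whitespace) is an assumption.

```python
import random


def shuffle_caption(caption, shuffle_caption_delimiter, rng=random):
    """Split the caption on the delimiter, shuffle the pieces, and rejoin."""
    if shuffle_caption_delimiter is None:
        # Shuffling is disabled; return the caption unchanged.
        return caption
    parts = caption.split(shuffle_caption_delimiter)
    rng.shuffle(parts)
    # Assumption: pieces are rejoined with the same delimiter.
    return shuffle_caption_delimiter.join(parts)


print(shuffle_caption("red car, sunny day, city street", ", "))
print(shuffle_caption("red car, sunny day, city street", None))
```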