Sieve is a data curation method for vision-language data that enables training high-performing CLIP models. Sieve works by assessing the alignment of web-scale image-text pairs using ...