Open web screenshot data
Project Common Screens
A corpus of web screenshots and associated metadata covering over 70 million websites, published as open data on AWS and updated monthly.
Access JPEG screenshots
Convert a host name into the flat S3 object key and public URL.
AWS file layout
Understand the S3 prefixes currently published for images and derived data.
Other projects
Browse related Dosvak projects for compute, AI assets, genealogy, workflow Q&A, and social communities.
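As a rough illustration of the host-name-to-key conversion mentioned under "Access JPEG screenshots", the sketch below normalizes a host name and builds a flat object key and a public URL. The bucket name, region, prefix, and the `<host>.jpg` key scheme are all hypothetical placeholders, since the actual layout is not stated on this page. Only the virtual-hosted-style S3 URL form is standard.

```python
from urllib.parse import quote

# Hypothetical placeholders: the project's real bucket, region, and prefix
# are not given on this page. Replace them with the documented values.
BUCKET = "example-bucket"
REGION = "us-east-1"
PREFIX = "screenshots/"


def normalize_host(host: str) -> str:
    """Trim whitespace and a trailing dot, lowercase, and IDNA-encode the host."""
    host = host.strip().rstrip(".").lower()
    return host.encode("idna").decode("ascii")


def object_key(host: str) -> str:
    """Build a flat key of the assumed form '<prefix><host>.jpg'."""
    return f"{PREFIX}{normalize_host(host)}.jpg"


def public_url(host: str) -> str:
    """Build a virtual-hosted-style S3 URL for the assumed key."""
    return f"https://{BUCKET}.s3.{REGION}.amazonaws.com/{quote(object_key(host))}"
```

Normalizing before building the key keeps lookups stable for inputs such as `Example.COM.`, which would otherwise map to a different key than `example.com`.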