Did I do something dumb with #statistics here? (please be gentle)
I have a list of 4,496,871 Census FTP URLs (thx @andrewjbtw ). I randomly sampled 385 of them and looked each one up in the Wayback Machine. 178 of them have at least one archived snapshot that returned 200 OK.
So, based on this sample, I can say with 95% confidence, and a 5% margin of error, that only about 46% of these Census FTP URLs (178/385, so somewhere between roughly 41% and 51%) have a snapshot in the Wayback Machine.
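Here is a rough Python sketch for sanity-checking that arithmetic. It assumes the 385 URLs are a simple random sample and uses the usual normal-approximation formula for a proportion, with z = 1.96 for 95% confidence. The function names are made up for illustration, and the sample-size check assumes the worst case of p = 0.5.

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Sample proportion and its normal-approximation margin of error.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence.
    """
    p = successes / n
    moe = z * math.sqrt(p * (1 - p) / n)
    return p, moe


def required_sample_size(margin, z=1.96, p=0.5):
    """Smallest n that gives the requested margin of error.

    Uses the worst case p=0.5 by default and ignores the finite
    population correction, which barely matters for ~4.5M URLs.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    p, moe = proportion_interval(178, 385)
    print(f"archived: {p:.1%} +/- {moe:.1%}")
    print(f"interval: {p - moe:.1%} to {p + moe:.1%}")
    print(f"sample size needed for a 5% margin: {required_sample_size(0.05)}")
```

With 178 out of 385, this lands at about 46%, give or take about 5 points, and the sample-size check comes out to 385, so the numbers in the post hang together under those assumptions.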
I hope I did something wrong because this sounds not great?