D'oh.
But to be serious. The Wayback Machine contains a very large proportion of all sites. It does sometimes happen that a Wayback Machine archive is missing or broken and cqcounter has the screenshot. But the Wayback Machine is still the most complete database we have found so far. A few archives are badly broken, but those are rare.
The only problem with the Wayback Machine is that there is no known efficient way to query its archives across domains. You have to have a domain in hand for CDX queries: Wayback Machine CDX scanning.
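To make the "domain in hand" limitation concrete, here is a minimal Python sketch of a per-domain CDX query. It is not the script used for this project. It assumes the public CDX server endpoint at web.archive.org/cdx/search/cdx and its `url`, `matchType`, `output`, `fl` and `limit` parameters. The helper names `build_cdx_url`, `parse_cdx_json` and `fetch_captures` are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public CDX server endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, limit=10):
    """Build a CDX query URL for one known domain (hypothetical helper)."""
    params = {
        "url": domain,          # the query has to start from a domain or URL
        "matchType": "domain",  # also match subdomains of that domain
        "output": "json",       # JSON array of rows, first row is the header
        "fl": "timestamp,original,statuscode",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(text):
    """Turn the CDX JSON output into a list of dicts keyed by field name."""
    rows = json.loads(text) if text.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def fetch_captures(domain, limit=10):
    """Query the CDX server over the network for captures of one domain."""
    with urlopen(build_cdx_url(domain, limit), timeout=30) as response:
        return parse_cdx_json(response.read().decode("utf-8"))
```

Calling `fetch_captures("example.com")` should return one dict per capture, and each capture can then be opened at `https://web.archive.org/web/<timestamp>/<original>`. Note that every call needs a domain as input: nothing here lets you search by page content across domains.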
The Common Crawl project attempts in part to address this lack of queryability, but we haven't managed to extract any hits from it.
CDX + 2013 DNS Census + heuristics has however been fruitful.
We have dumped all Wayback Machine archives of known websites to: github.com/cirosantilli/cia-2010-websites-dump using cia-2010-covert-communication-websites/download-websites.sh. This allows for better grepping and serves as a backup in case they ever go down.
