This list is far from complete, but these are some of the prominent sites that archive data:
- Internet Archive featuring the Wayback Machine
"Our mission is to provide Universal Access to All Knowledge. We collect published works and make them available in digital formats. Today our archive contains:
279 billion web pages
11 million books and texts
4 million audio recordings (including 160,000 live concerts)
3 million videos (including 1 million Television News programs)
1 million images
100,000 software programs"
- End of Term Web Archive
"The End of Term Web Archive captures and saves U.S. Government websites at the end of presidential administrations. Beginning in 2008, the EOT has thus far preserved websites from administration changes in 2008 and 2012, and is currently preparing for the 2016 electoral season." The End of Term Web Archive contains federal government websites (.gov, .mil, etc.) in the Legislative, Executive, or Judicial branches of the government. Websites that were at risk of changing (e.g., whitehouse.gov) or disappearing altogether during government transitions were captured. State and local government websites, and any other sites not part of the federal government domain, were out of scope.
- Data Refuge
"DataRefuge helps to build refuge for federal data and supports climate and environmental research and advocacy. We are committed to fact-based arguments. DataRefuge preserves the facts we need at a time of ongoing climate change."
"This site is one part of the project. The vast majority of the government information gathered through this project is available from the Internet Archive through the End of Term project. This data catalog is a place to store data that is difficult or impossible to harvest through web crawlers."
- Climate Mirror: an open project to mirror public climate datasets
"Climate Mirror is a distributed effort conducted by volunteers, in conjunction with efforts from institutions such as University of Pennsylvania, University of Toronto, and the Internet Archive, to mirror and back up U.S. Federal Climate Data."
- DataLumos (ICPSR)
"DataLumos is an ICPSR archive for valuable government data resources. ICPSR has a long commitment to safekeeping and disseminating US government and other social science data. DataLumos accepts deposits of public data resources from the community and recommendations of public data resources that ICPSR itself might add to DataLumos."
- MemoryHole2 Resources
"The Memory Hole 2 - run by writer and anthologist Russ Kick - saves important documents from oblivion. Its predecessor, The Memory Hole (2002-2009), posted hundreds of documents, many of which will be reposted on the new site."
- Environmental Data and Governance Initiative (EDGI)
"The Environmental Data and Governance Initiative (EDGI) is an international network of academics and non-profits addressing potential threats to federal environmental and energy policy, and to the scientific research infrastructure built to investigate, inform, and enforce. Dismantling this infrastructure — which ranges from databases to satellites to models for climate, air, and water — could imperil the public’s right to know, the United States’ standing as a scientific leader, corporate accountability, and environmental protection."
- Libraries Network
"Coordinating the work of a lot of dedicated information professionals." Learn about this group of librarians who have set up the Data Refuge and are working to preserve data.