Talk:List of Web archiving initiatives

WikiProject Lists (High-importance)

This article is within the scope of WikiProject Lists, an attempt to structure and organize all list pages on Wikipedia. If you wish to help, please visit the project page, where you can join the project and/or contribute to the discussion.
This article has been rated as High-importance on the project's importance scale.

WikiProject Digital Preservation

This article is within the scope of WikiProject Digital Preservation, a collaborative effort to improve the coverage of digital preservation on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

This is the talk page for discussing improvements to the List of Web archiving initiatives article.

Comment - need to add more initiatives & repository URLs

I am not satisfied with this list: it is essential to have repository URLs for the initiatives that offer public access to archived content. For more initiatives, see e.g. http://archiv.twoday.net/topics/Webarchivierung --92.72.202.15 (talk) 15:42, 31 March 2011 (UTC)

Need to define terms like "Archived Contents (millions)"

I actually like the list, since there isn't much else like it anywhere. Does anyone know what the "Archived Contents (millions)" column means? Is that the number of URLs present in the repository, or the number of snapshots of all URLs over time, or is it something else? Edsu (talk) 18:35, 28 October 2013 (UTC)

Need to normalize to FTE

The "Number of employees" column makes comparison difficult. It would be better if the information were normalized to full-time equivalents (FTE). --PeterKz (talk) 19:18, 6 February 2014 (UTC)
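
As a rough illustration of the normalization suggested above, here is a minimal sketch. The 40-hour full-time week, the function name `to_fte`, and the sample staffing figures are all assumptions made for the example; none of them come from the list or its sources.

```python
from typing import Sequence


def to_fte(full_time: int, part_time_hours: Sequence[float],
           full_time_hours: float = 40.0) -> float:
    """Convert a staff headcount to full-time equivalents (FTE).

    Each full-time employee counts as 1.0; each part-time employee counts
    as their weekly hours divided by the assumed full-time week.
    """
    return full_time + sum(h / full_time_hours for h in part_time_hours)


# Hypothetical initiative: 3 full-time staff plus two part-timers
# working 20 and 10 hours per week.
print(to_fte(3, [20.0, 10.0]))  # 3.75
```

With figures expressed this way, an initiative reporting five part-time staff could be compared directly with one reporting two full-time staff.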

Need to define most terms in this article

Most of the terms in this article come from the research paper:

"A survey on web archiving initiatives"[1], published by the Portuguese Web Archive[2] team.
— Wikipedia editors, [1]

The terms are not clear to the typical Wikipedia reader, being too obscure and academic in tone.

For example:

In the lead: "web archiving initiatives, archived data, and access methods", which could each be explained at the start of each table.
Most of the column headers in each table need to be defined and clarified. See remarks in prior sections.

- Lentower (talk) 13:09, 29 April 2014 (UTC)

References

[1] Daniel Gomes; João Miranda; Miguel Costa (25–29 September 2011). "A survey on web archiving initiatives". International Conference on Theory and Practice of Digital Libraries 2011. Springer. Retrieved 23 October 2012.
[2] Foundation for National Scientific Computing (FCCN) (23 October 2012). "Portuguese Web Archive: search the past". Foundation for National Scientific Computing (FCCN). Retrieved 23 October 2012.

Archive.is is not notable

That site's notability is not established and cannot be, so it should not be included in Wikipedia. The entry is referenced only to a primary source, i.e., the site itself, and that site does not even exist. Do not revert my edit without giving a reason, thanks. ~ R.T.G 13:37, 2 September 2014 (UTC)

I improved the entry using the new site name and a notable source (CNET Japan). Also note that this site is the second most popular archive site after the Wayback Machine, according to Alexa. 88.246.46.189 (talk) 14:12, 2 September 2014 (UTC)

That whole article is nothing more than an advertisement. It gives no reason why they are writing about it, and it doesn't establish notability. ~ R.T.G 14:28, 2 September 2014 (UTC)

MOS:SAL does not say that notability is a metric by which one includes something on a list. --Ryūlóng (琉竜) 14:50, 2 September 2014 (UTC)

Alexa ranks it as having overtaken the Library of Congress in the last two months. There's not much point trying to argue about Alexa, but how does a site with no content and scant advertising outdo all of the sites which have been running for twenty years with massive libraries of content unrelated to archiving? There are apparently sites which provide dummy traffic for about $5 per 10,000. That's not very expensive for some people. They have 2,000,000 website snapshots. That is all they have. My opinion is that they've got their instant save button incorporated into some sort of malware this year and compounded it with the traffic they are getting from the controversy on this site, but hey, that's just an opinion. Some would say I shouldn't even offer that. (troll on my new disciple) ~ R.T.G 14:59, 2 September 2014 (UTC)

(MOS:SAL, paragraph 2: "Being articles, stand-alone lists are subject to Wikipedia's content policies, such as verifiability, no original research, neutral point of view, and what Wikipedia is not, as well as the notability guidelines.") ~ R.T.G 15:03, 2 September 2014 (UTC)

Your "opinion" is no more than WP:IDL. 88.246.46.189 (talk) 15:21, 2 September 2014 (UTC)

The entry on Archive.is seems to be verified, not original research, and neutrally written. Stop tilting at windmills across the project, RTG. --Ryūlóng (琉竜) 16:57, 2 September 2014 (UTC)

Aleph Archives

Do we need to have producers of archive software in the list? For example, "Aleph Archives" is not an archive at all; it is a commercial company developing and selling software for archives. 88.246.46.189 (talk) 15:25, 2 September 2014 (UTC)

I would suppose that it is an initiative: even if that initiative is not the archiving itself, it is a highly notable archiving-related initiative. One improvement might be to split the software off into a small section of its own, but with such a long list on a subject I'm not familiar with, that would be much easier for someone who knows archive software. Even if that section has only one entry at the moment, it still marks that kind of division in the information. ~ R.T.G 15:22, 14 November 2014 (UTC)

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on List of Web archiving initiatives. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

Added archive https://web.archive.org/20120925004220/http://www.nyu.edu/library/bobst/research/tam/webarchive.html to http://www.nyu.edu/library/bobst/research/tam/webarchive.html/

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. --cyberbot II (Talk to my owner: Online) 21:23, 1 February 2016 (UTC)

Inclusion criteria

In order to make this list easily maintainable and follow Wikipedia's policies and guidelines, I suggest only including entries with their own Wikipedia articles. --Ronz (talk) 17:22, 9 February 2017 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 2 external links on List of Web archiving initiatives. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Added {{dead link}} tag to http://210.82.118.162:9090/webarchive
Added archive https://web.archive.org/web/20130927195054/http://ediasporas.ticmigrations.fr/ to http://ediasporas.ticmigrations.fr/
Added {{dead link}} tag to http://digital.cacak-dis.rs/english/web-archive-of-cacak/

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers. --InternetArchiveBot (Report bug) 04:09, 28 December 2017 (UTC)

Megalodon.jp

Megalodon.jp needs to be mentioned. It is minor in comparison to perma.cc, webrecorder.io, or even archive.is (which, ironically, a user above said is not relevant), but I think it should be here. --Jakeukalane (talk) 13:16, 26 March 2019 (UTC)

The first column of the table is a mess

The first column of the table in List of Web archiving initiatives, the "Name" column, is a mashup of plain text names with endnotes, wikilinks, red links, and external links. There is no apparent regularity in formatting here. I propose that the column should include either (preferably) wikilinks or plain text names with endnotes, no red links (per WP:WTAF) and no external links (per WP:EL). Above, in § Inclusion criteria, Ronz suggested including only "entries with their own Wikipedia articles" (i.e. wikilinks) but that suggestion apparently did not go anywhere. What I am proposing is a step toward some kind of regularity. Biogeographist (talk) 12:05, 10 June 2019 (UTC)

Go ahead. This list is a mess. --Ronz (talk) 15:24, 10 June 2019 (UTC)

Pay/Freemium/Free

The list needs a column that specifies pay structure(s): Pay is entirely non-free (Arkiwera). Freemium is a mix of free and pay services (Conifer/Rhizome). Free is entirely free (Wayback Machine). -- GreenC 22:03, 12 April 2024 (UTC)

Mix of Public and Open Initiatives, Specialized Scoped "Initiatives", and Commercial Service Providers

The list doesn't seem particularly useful with its mix of scope and listed content.

I would find it much more appropriate and useful if it were split into three lists:

Public and Open Initiatives, where anyone can archive (broadly) any web pages, like the Wayback Machine
Specialized Initiatives, like the numerous self-funded and self-scoped government webpage archiving initiatives listed here, which archive only their own domains
(Commercial) service providers, like Aleph Archives (whose inclusion and applicability were discussed in another thread)

Kissaki (talk) 08:44, 9 June 2024 (UTC)

I would expect "Web archive" to cover anything from the web, not only specific subdomains or archiving service providers. See Web and World Wide Web. Thus, I would even find it appropriate to remove from this list, titled "List of Web archiving initiatives", anything that does not intend to archive a broader section of the web. Kissaki (talk) 08:49, 9 June 2024 (UTC)
