How to save web pages for posterity
Quote: Web pages often disappear, move, or change their content. How can you keep them the way you want, or easily locate an archived copy?

Contrary to the popular belief that anything online stays online, the internet doesn’t remember everything. In a previous post in this series, we examined no fewer than nine scenarios in which you could lose access to online content. We also provided a detailed guide to what information you absolutely must (and preferably quickly) back up to your computer and how to do it. Today, we’ll discuss how to easily save web pages to your computer, how to organize these archives, and what to do if your favorite site has gone AWOL.
 
Let’s say you want to save a blog post with a recipe, compile a bibliography for your research paper, or even preserve a specific online publication for legal purposes. All of the above are published as web pages — which have a tendency to disappear at the wrong moment. Want to reminisce about music news and gossip from 2005? Good luck with that — the MTV News site shut down and all its articles and interviews are no longer available. Check references in Wikipedia articles? 11% of them lead nowhere, even though they were working when the article was published. This phenomenon of “link rot” — the gradual deletion or relocation of online content — is rapidly becoming a major problem. 38% of pages that existed ten years ago are no longer accessible today. So, if there’s a web page out there that you like or need, the wise move would be to create a backup.

How to save a web page to your computer

Since a web page consists of dozens or even hundreds of files, backing it up will require a bit of effort. Here are the main ways to do it:
 
Save only the text as an HTML file. Select the “Save page as…” menu command or button in your browser and then select “Webpage, HTML Only”. This will only save the text of the web page, without any graphics or other eye candy.
 
Save text and images. The “Webpage, Complete” option will create, besides an HTML file, a folder with the same name containing all graphic elements, styles, and scripts from the page. A downside of this option is that saving a lot of auxiliary files clutters your drive. The “Webpage, Single File” option is more convenient, bundling the web page and all its resources into a single .mhtml file. This will open freely in Chrome or Edge, but other browsers may have issues. This option is not available in all browsers, but if you install the SingleFile extension (available for most browsers), you can save the entire web page and its media content as a single HTML file that opens perfectly fine in all modern browsers.
 
Print to PDF. To preserve the main content of the page, but scrap menus and banners, your best option is Print to PDF. The resulting file will open on any computer.
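For anyone who would rather script the first option than click through a browser menu, here is a minimal Python sketch of a rough equivalent of "Webpage, HTML Only". The function name `save_html_only` and its parameters are made up for illustration. It uses only the standard library and stores the HTML exactly as the server sends it, so text that a page loads later through scripts will not be included, and the result may differ from what the browser's own save command produces.

```python
import urllib.request
from pathlib import Path


def save_html_only(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Fetch the raw HTML at `url` and write the bytes unchanged to `dest`.

    Hypothetical helper: no images, styles, or scripts are downloaded.
    """
    # Some servers reject requests that lack a browser-like User-Agent header.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()

    target = Path(dest)
    # Writing bytes avoids guessing the page's character encoding.
    target.write_bytes(data)
    return target
```

A call such as `save_html_only("https://example.com/recipe", "recipe.html")` (the address is a placeholder) would leave a single `recipe.html` in the current folder.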
 
With any of these options, make sure that the main text that you actually want to keep is still readable when you open the document.
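If you save pages in bulk, that readability check can be partly automated. The sketch below is only an illustration, and `saved_page_contains` is a hypothetical helper name: it assumes the archive is a plain .html file (as produced by "Webpage, HTML Only", "Webpage, Complete", or SingleFile) and looks for a phrase you remember from the article in the page's text, ignoring scripts and styles. It does not handle .mhtml or PDF files, and it is no substitute for opening the file and looking at it.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def saved_page_contains(path: str, phrase: str) -> bool:
    """Return True if `phrase` appears in the text of the saved HTML file."""
    raw = Path(path).read_text(encoding="utf-8", errors="replace")
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    # Collapse whitespace so line breaks in the markup do not hide a match.
    text = " ".join(" ".join(parser.chunks).split())
    needle = " ".join(phrase.split())
    return needle.casefold() in text.casefold()
```

For example, `saved_page_contains("recipe.html", "preheat the oven")` (both values are placeholders) returns `True` only if that phrase survived in the saved copy.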
