ArchiveBox or similar for shared archiving of research project

Stopwatch1986@lemmy.ml · 11 hours ago

Having webrecorder host our archived pages is both an advantage and a disadvantage: the archive may outlive our project, or it may not last as long.

I have been using SingleFile for years. It’s great, but not for seamlessly serving cached web pages to members of the public who read our reports and find that cited links are now dead. It also doesn’t support URLs that point to PDF or CSV files. A public-facing repository of SingleFile files with an index as a table of contents might do it though. Simplicity is good for future-proofing an archive.
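To make the "repository plus index" idea concrete, here is a minimal sketch in Python. It assumes the saved pages sit in one flat directory; the function names `build_index` and `write_index`, the `index.html` file name and the page title are illustrative choices, not part of SingleFile.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote


def build_index(archive_dir: Path, title: str = "Archived sources") -> str:
    """Return an HTML page that links to every saved file in archive_dir."""
    items = []
    for path in sorted(archive_dir.iterdir()):
        # Skip sub-directories and any previously generated index page.
        if not path.is_file() or path.name == "index.html":
            continue
        href = quote(path.name)    # percent-encode spaces etc. for the link
        label = escape(path.name)  # escape <, > and & for display
        items.append(f'<li><a href="{href}">{label}</a></li>')
    listing = "\n".join(items)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><h1>{escape(title)}</h1>\n<ul>\n{listing}\n</ul>\n</body></html>\n"
    )


def write_index(archive_dir: Path) -> Path:
    """Write index.html next to the saved files and return its path."""
    index_path = archive_dir / "index.html"
    index_path.write_text(build_index(archive_dir), encoding="utf-8")
    return index_path
```

The output is plain static HTML, so the whole directory could be served by any web host or copied to a public repository without extra software.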

Something like archive.org or archive.is would be ideal, but we have no control over their future and practices.

Stopwatch1986@lemmy.ml · 12 hours ago

I wonder if an authorised remote user (ie an affiliated researcher) can easily instruct ArchiveBox to store a URL and later retrieve it. Also, ideally a random user should be able to retrieve the archived web page or file (eg a PDF, CSV etc). The idea is that authorised researchers can get URLs archived, and then any user reading our reports can click on a citation and get our archived source if the original is not available any more. I’ll need to run it and see, but it looks promising.

Keeping the archive alive for years afterwards, possibly after funding dries up, is another challenge, but there are public repositories that may be suitable for that.

Stopwatch1986@lemmy.ml · 1 day ago

ArchiveBox or similar for shared archiving of research project

Stopwatch1986@lemmy.ml · 2 days ago

Doesn’t clicking on the headphones switch to an audio test, as with a regular captcha? That’s what I do, and it works first time instead of serving an endless number of images when I use a VPN. The words you enter don’t even have to be 100% correct.

Stopwatch1986@lemmy.ml · 9 days ago

It is, but creeping privatisation may change that, as may legislation, which has become more hostile to unionisation since the 1980s.

The broader point is that individuals can try all they want to preserve their privacy, but then friends, family and organisations spy on them, often unwittingly, eg when we share calendar events or email messages with them. The only way forward is collective resistance, building alliances and influencing public policy. But it’s always been like that with systemic issues.

Stopwatch1986@lemmy.ml · edited 11 days ago

And resistance can only be collective. Another reason unionisation is as important as it’s ever been.

Stopwatch1986@lemmy.ml · 15 days ago

The implication is that sending links to encrypted files with the decryption key added to the URL (eg Thunderbird Send, Mega etc) is not zero-trust. Decryption may take place locally and the key part of the URL may not be sent to the file hosting service, but when the recipient clicks on the link and is served one-off code by the web site, that code may be compromised.
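A small Python sketch of why the key can stay out of the server's sight: the part of a URL after `#` (the fragment) is not included in the HTTP request a browser sends. The link below is a made-up example on a placeholder domain and `split_share_link` is an illustrative name. Note that this only shows what goes over the wire; the script served by the site can still read the fragment in the browser, which is exactly the trust problem.

```python
from urllib.parse import urlsplit


def split_share_link(link: str) -> tuple[str, str, str]:
    """Split a share link into (host, request target, fragment).

    The host and request target are what the server sees; the fragment
    stays in the browser unless page code chooses to send it.
    """
    parts = urlsplit(link)
    request_target = parts.path or "/"
    if parts.query:
        request_target += "?" + parts.query
    return parts.netloc, request_target, parts.fragment


# Hypothetical share link: file id in the path, decryption key in the fragment.
host, target, key = split_share_link(
    "https://files.example/download/abc123#c2VjcmV0LWtleQ"
)
print("sent to server:", host, target)
print("kept in browser:", key)
```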

As we know, the best way to be sure is to do your own separate encryption, but without secure-by-design tools most people will think you are very odd for demanding that decryption is done separately and keys are shared through a different channel. Speaking from experience, no matter how much training they are given at work, most people, including HR, would rather you sent them sensitive documents (like passport scans) in the clear as email attachments, or at least in a way that involves a single click (WeTransfer etc).

Stopwatch1986@lemmy.ml · 16 days ago

Zero-trust services and web access

Stopwatch1986@lemmy.ml · 29 days ago

I thought it was Autonomy. You installed a program, instructed its puppy agents, logged out, and while you were offline the puppies searched through several engines. The next time you logged in, the findings were waiting for you. That was the time of 56k modems and metered connections.