bleacherreport articles are showing up blank (screenshot works but the actual webpage archive is blank). Could you take a look at this? Thanks!
e.g. bleacherreport*com/articles/2583445-nba-opening-night-gets-the-hotline-bling-drake-music-video-treatment
I made a fix for future bleacherreport saves, but this one seems unrecoverable (React often clears the whole page on a JavaScript error, and this is what was happening on bleacherreport).
Hi, thanks for providing this fabulous archive service! In https://blog.archive.today/post/688077534761566208, you said that the user’s IP address hasn’t been sent to websites since 2019. Could you provide an option to send the IP address to capture localized content? And some websites may only be reachable in certain regions… :-(
Websites no longer look at `X-Forwarded-For` for user region, so I have to use per-website proxies to get localized content and avoid geo-block.
It is not 100% correct though, so feel free to report a bug if you spot one.
Did something change over the past few days such that the site no longer functions through TOR?
It works.
Let me guess: if you copy-pasted archiveiya74codqgiixo33q62qlrqtkgmcitqx5u2oeqnmn5bpcbiyd.onion from Wikipedia, it won’t work because the copied text contains a zero-width space character.
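If you want to check a pasted address yourself, here is a minimal Python sketch that finds and strips invisible characters. The function names are made up for illustration, and where exactly the zero-width space sits in the Wikipedia copy is an assumption; the sketch simply drops every Unicode "format" (Cf) character, which includes U+200B.

```python
import unicodedata


def invisible_chars(text: str) -> list[str]:
    """Return the code points of invisible 'format' characters found in text."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def clean_address(text: str) -> str:
    """Remove format characters (zero-width space, joiners, BOM) and outer whitespace."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf").strip()


# Assumed example: a zero-width space (U+200B) hidden just before ".onion".
pasted = "archiveiya74codqgiixo33q62qlrqtkgmcitqx5u2oeqnmn5bpcbiyd\u200b.onion"
cleaned = clean_address(pasted)
found = invisible_chars(pasted)
```

The pasted and cleaned strings look identical on screen, so comparing their lengths or listing the hidden code points is the easiest way to see the difference.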
Pretty please remove the recent restrictions for archiving Twitter. It's practically useless for archiving Twitter now. I just archived someone's media page and it only captured their first 4 tweets.
I know, it needs to be remade almost from scratch because of the changes on Twitter's side. The old code resulted in a loading spinner and no content at all.
"It needs accounts, otherwise instagram redirects to the login page or shows a fake 404 page. Accounts do not live long." I've entered the full instagram url, all i get is "Not Found (yet?)". what does this mean? the provided ig acount i entered is public not private
This means it has tried 10 times and given up. You didn't specify an account for the archiver to log in to Instagram; that is just not implemented.
I notice on the /wip pages when I archive something from a news site, most of the time is spent loading various trackers, assets that are never displayed like videos, etc. Why not only load what's needed to render content? Not commenting on the state of the web here, just the performance of the archiver.
Known trackers are skipped (their lines are gray instead of green)
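For illustration only, skipping known trackers can be sketched as a hostname check against a blocklist. This is not the archiver's actual code: the blocklist entries, the function names, and the "skip"/"fetch" labels (standing in for the gray and green lines) are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; the real list used by the archiver is not given here.
TRACKER_HOSTS = {"tracker.example", "ads.example"}


def is_known_tracker(url: str, blocklist: set[str] = TRACKER_HOSTS) -> bool:
    """True if the URL's host is a blocklisted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == bad or host.endswith("." + bad) for bad in blocklist)


def plan_requests(urls: list[str]) -> list[tuple[str, str]]:
    """Label each URL 'skip' (known tracker) or 'fetch' (everything else)."""
    return [(u, "skip" if is_known_tracker(u) else "fetch") for u in urls]


plan = plan_requests([
    "https://news.example/article.html",
    "https://cdn.tracker.example/pixel.js",
    "https://nottracker.example/style.css",
])
```

A check like this only catches hosts already on the list, so unknown trackers and unused assets such as videos would still be loaded.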
