Sunday, July 18, 2021

A manual textise to render recalcitrant -- and obstinate -- web pages as text

Sometimes you'll want to render a page as text only.  Some pages think they have a right to refuse.  Some people, like the author, find it really difficult to take no for an answer -- from any software, but especially from a large corporation.

Think mainstream media, larger search engines, &c...

---------------------------------------------------------------------------------------------------

For the Recalcitrant Page:

Enter the URL below, load the frame, find an innocuous spot to insert the cursor, click, select all, copy, and paste into the textarea. Tap the button, & lo and behold! Text.  Text sometimes without all the unnecessary things that load after the page and often prevent optimal viewing.
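If you'd rather do the textarea-and-button step by hand, here is a minimal sketch in Python of the same idea: take the page's HTML and keep only the visible text. The names `TextOnly` and `textise` are made up for this example, and this is not necessarily how the button on this page works; it is just one plain way to get there with the standard library.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping script/style/noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []        # text fragments, in document order
        self.skip_depth = 0    # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep non-blank text that is not inside a skipped element.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def textise(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Feed it whatever you copied or saved, for example `print(textise(open("page.html").read()))`, where `page.html` is a stand-in for your own file.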

-----------------------------------------------------------------------------

For the Obstinate Page:

(such as a top-tier search, video, image, or music engine):

Enter the URL below, and note how, when you load the frame, some text appears below that starts with "curl." Select and copy that text, and paste it where it says "curl" in the iframe at the bottom of the page. Tap the button to run it, wait a sec, & lo and behold! The frame is there for you to... find an innocuous spot to insert the cursor, click, select all, copy, and paste into the textarea. Tap the button, & lo and behold! Text.  Text sometimes without all the unnecessary things that load after the page and often prevent optimal viewing.
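For the curious, the "curl" text can be put together by hand too. This is a minimal sketch, assuming the command is simply curl pointed at your URL; the exact flags this page generates are not shown here, so `-s` (quiet), `-L` (follow redirects), and `-A` (set a user agent) are my own guesses at sensible choices, and `curl_command` is a made-up helper name.

```python
import shlex


def curl_command(url, user_agent="Mozilla/5.0"):
    """Build a shell-safe curl command line for fetching a page.

    The flag selection is an assumption, not this page's actual output.
    """
    args = ["curl", "-sL", "-A", user_agent, url]
    # shlex.join quotes any argument containing shell-special characters.
    return shlex.join(args)
```

Calling `curl_command("https://example.com/")` gives a string starting with "curl" that you can paste into a terminal. URLs containing `?` or `&` come out quoted, so the shell leaves them alone.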

Apologies if you thought it was going to be formatted. At least all the thingies are resizeable (or ought to be, dammit).

But wait!  There's more!  Still unhappy because untextified? 

Scroll to the UPDATE and let us experiment with the PDF creator!

There ought to be a viewer just below it, so you can see what you just created. Scroll, copy the text, paste it in that now-familiar orange textarea, tap Ye Olde Button, & lo!  Behold!

----------------------------------------------------------------------

⌨️⌨️⌨️⌨️⌨️⌨️⌨️⌨️⌨️

ENJOY YOUR NEWFOUND READING MATERIAL.

or, go to Faust's hella WYSIWYG to download any way you please.

⌨️⌨️⌨️⌨️⌨️⌨️⌨️⌨️⌨️

EVERY PAGE STARTS HERE:

URL:


OBSTINATE? COPY THESE WORDS, PASTE IN WINDOW BELOW, THEN PASTE INTO TEXTAREA (SEE INSTRUCTIONS ABOVE):



THE TEXTISING TEXTAREA:


-------------------------------------------------

MAYBE GOOGLE CACHE TEXT ONLY WILL WORK:

Paste your URL and a new link should appear below to open the cached page in a new tab. 

DO NOT INCLUDE THE HTTP OR HTTPS (OR THE COLON AND SLASHES THAT FOLLOW).

DO NOT INCLUDE ANY UNNECESSARY PARAMETERS.  If it does not work, remove parameters.
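The two rules above (no scheme, no parameters) can be sketched in Python. The helper name `cache_link` is made up, and the link pattern, including the `strip=1` text-only switch, is my assumption about how Google's cache was addressed when this was written; it may no longer be served.

```python
from urllib.parse import quote, urlsplit

# Assumed cache endpoint; not guaranteed to still exist.
CACHE_PREFIX = "https://webcache.googleusercontent.com/search?q=cache:"


def cache_link(url, text_only=True):
    """Build a cache link from a URL, dropping scheme and parameters."""
    # Accept input with or without http:// or https://.
    if "://" not in url:
        url = "//" + url
    parts = urlsplit(url)
    bare = parts.netloc + parts.path   # no scheme, query, or fragment
    link = CACHE_PREFIX + quote(bare, safe="/:")
    if text_only:
        link += "&strip=1"             # assumed text-only switch
    return link
```

For example, `cache_link("https://www.example.com/news/story.html?utm_source=x")` keeps only `www.example.com/news/story.html` before building the link.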

Then, if the regular cache was all that loaded, just click through to your text-only page. Tap and hold, or right-click, to download.

URL:

Remember, this only works because it is Google's domain -- don't try to click through, it will un-cache you.  Maybe try using a web archive like Archive.org, with a very recent version of the page you want.  You could even MAKE a version.  That might could work.  Or maybe if you sandboxed that... But I digress.
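Since the archive idea came up, here is a small sketch of how those links can be put together. It assumes the Wayback Machine's usual address patterns, `web.archive.org/web/` plus the page URL for the most recent capture and `web.archive.org/save/` plus the page URL to ask for a new one; `wayback_links` is a made-up helper name.

```python
def wayback_links(url):
    """Return Wayback Machine links for viewing or making a capture.

    Both URL patterns are assumptions about archive.org's public site.
    """
    return {
        # Should redirect to the most recent capture, if there is one.
        "latest": "https://web.archive.org/web/" + url,
        # Asks the archive to MAKE a fresh capture of the page.
        "save": "https://web.archive.org/save/" + url,
    }
```

Open the `"latest"` link in a new tab, then textise the archived copy the same way as any other page.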

-------------------------------------------------------------

UPDATE

A PDF creator where you can paste the URL of that really obstinate page! Then just scroll down to view and copy the text (hopefully I added that utility), or just view it as you would any other PDF.



BCNU.
