Retrieving data from a table on a website is quite easy. I wanted to fetch a table from a web page and output it as plain text, and this turned out to be easier than I thought.
$ curl --silent -L https://aec.gov.au/media/2023/09-21a.htm | htmlq td | w3m -dump -T text/html | awk NF
18* 68,263 52,638 45,206 22,100 14,138 4,721 3,966 1,414 212,446 1.2%
19 82,251 64,273 55,307 27,035 16,991 5,385 4,776 2,254 258,272 1.5%
20-24 409,622 325,346 280,506 134,919 88,992 26,956 25,152 12,870 1,304,363 7.4%
25-29 421,117 348,130 290,857 140,745 94,188 28,342 28,003 15,569 1,366,951 7.7%
30-34 446,040 378,590 296,065 150,544 96,875 29,502 29,677 16,690 1,443,983 8.2%
35-39 477,934 407,336 304,999 167,130 102,795 30,068 31,938 16,524 1,538,724 8.7%
First, curl fetches the raw HTML. The --silent option suppresses the progress meter and -L follows any redirects.
curl --silent -L https://aec.gov.au/media/2023/09-21a.htm |
Then I pipe it into htmlq, which takes a CSS selector and here picks out every td element on the page.
htmlq td |
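Then w3m renders the HTML as plain text (-dump writes the rendered page to standard output, and -T text/html tells it that the input is HTML), and awk NF filters out all blank lines.
w3m -dump -T text/html | awk NF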
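The awk NF step is terse, so here is a small sketch of what it does, using made-up input rather than a real page. NF is the number of fields on the current line; awk treats it as a condition, and it is 0 for empty or whitespace-only lines, so only non-blank lines get printed.

```shell
# Sample input (invented for illustration): two text lines separated by
# an empty line and a whitespace-only line.
out=$(printf 'first\n\n   \nsecond\n' | awk NF)

# Only the two non-blank lines remain.
printf '%s\n' "$out"
```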
Here is another example, which prints the headings along with the content.
$ curl --silent -L https://aec.gov.au/media/2023/09-21a.htm | htmlq h2,td | w3m -dump -T text/html
The selector h2,td matches both h2 and td elements, so the headings are printed together with the table cells.
htmlq h2,td |
This does look very good indeed.
17,676,347 Australians are enrolled to vote in the upcoming 2023 referendum
Enrolment by state, territory and age
18* 68,263 52,638 45,206 22,100 14,138 4,721 3,966 1,414 212,446 1.2%
19 82,251 64,273 55,307 27,035 16,991 5,385 4,776 2,254 258,272 1.5%
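Once the table is plain text, it can be processed further with ordinary tools. The sketch below is only an illustration: it assumes the cells of one table row sit on a single line separated by spaces, and that the columns are an age group, eight state and territory counts, a total and a percentage (the column meanings are my reading of the heading, not something the output states). It strips the thousands separators and checks that the counts add up to the reported total.

```shell
# One row copied from the output; columns are assumed to be:
# age group, eight state/territory counts, total, share of all enrolments.
row='18* 68,263 52,638 45,206 22,100 14,138 4,721 3,966 1,414 212,446 1.2%'

result=$(printf '%s\n' "$row" | awk '{
    gsub(/,/, "")                  # strip thousands separators; awk re-splits the fields
    sum = 0
    for (i = 2; i <= NF - 2; i++)  # fields 2..NF-2 are the per-state counts
        sum += $i
    print $1, sum, $(NF - 1)       # age group, computed sum, reported total
}')

printf '%s\n' "$result"
```

For this row the computed sum and the reported total are both 212446.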
Yet another example. The selector ul._2eAhj finds every ul element with the CSS class ‘_2eAhj’. This is very effective.
$ curl --silent -L https://theage.com.au/siteguide | htmlq ul._2eAhj | w3m -dump -T text/html
• Federal
• Victoria
• NSW
• Queensland
• Western Australia
• Companies
• Markets
• The economy
• Banking & finance
• Entrepreneurship
• Media
• Workplace
• North America
• Europe
• Asia
• Middle East
• Oceania
• South America
• Africa
• NSW
• Queensland
• Western Australia
• News
• Living
• Auctions
• Financing
• AFL
• Cricket
• Soccer
• Racing
• Tennis
• NRL
• Rugby union
• Netball
• Basketball
• Motorsport
• Cycling
• Golf
• NFL
• Athletics
• Swimming
• Boxing
This is very nicely formatted.
And finally, a nice way to get a weather forecast. This prints the contents of an HTML table.
$ curl --silent -L https://www.dailymail.co.uk/weather/australia/index.html | htmlq table | w3m -dump -T text/html
Location Condition Now Min Max
Capital Cities
Sydney 16°C 15°C 24°C
Melbourne 8°C 7°C 26°C
Brisbane 18°C 16°C 28°C
Perth 11°C 9°C 23°C
Adelaide 21°C 15°C 31°C
Canberra 9°C 6°C 24°C
Hobart 8°C 7°C 21°C
Darwin 25°C 25°C 35°C
More Top Towns
Sydney 16°C 15°C 24°C
Brisbane 18°C 16°C 28°C
Perth 11°C 9°C 23°C
Melbourne 8°C 7°C 26°C
Adelaide 21°C 15°C 31°C
Hobart 8°C 7°C 21°C
Newcastle 19°C 14°C 26°C
Canberra 9°C 6°C 24°C
Wollongong 17°C 15°C 23°C
Gold Coast 19°C 17°C 26°C
Carrapateena Mine 20°C 18°C 34°C
Gruyere Mine 19°C 18°C 31°C
Bridport 9°C 7°C 17°C
Boorowa 5°C 6°C 24°C
Iron Bridge Mine 19°C 19°C 41°C
Darwin 25°C 25°C 35°C
Boco Rock 6°C 7°C 22°C
Eliwana 22°C 18°C 37°C
Annuello 14°C 6°C 31°C
Goondiwindi 19°C 14°C 32°C
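All of these examples share the same pipeline and differ only in the URL and the CSS selector, so they could be wrapped in a small shell function. This is a minimal sketch: the name scrape_text is hypothetical, it assumes curl, htmlq and w3m are installed, and it always applies the awk NF filter even though only the first example used it.

```shell
# scrape_text URL SELECTOR
# Hypothetical helper: fetch URL, keep the elements matching the CSS
# SELECTOR, render them as plain text and drop blank lines.
scrape_text() {
    if [ "$#" -ne 2 ]; then
        echo "usage: scrape_text URL SELECTOR" >&2
        return 1
    fi
    curl --silent -L "$1" |
        htmlq "$2" |
        w3m -dump -T text/html |
        awk NF
}
```

With that defined, the first example would become scrape_text https://aec.gov.au/media/2023/09-21a.htm td, and the site guide example scrape_text https://theage.com.au/siteguide ul._2eAhj.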