onlinejournalismblog.com
How-to: Scraping ugly HTML using ‘regular expressions’ in an OutWit Hub scraper
The following is the first part of an extract from Chapter 10 of Scraping for Journalists. It introduces a particularly useful tool in scraping – regex – which is designed to look for …