Computers read the fossil record: Palaeontologists hope that software can construct fossil databases directly from research papers
- by Ewen Callaway
“For a field whose raison d'être is to chronicle the deep past, palaeontology is remarkably forward-looking when it comes to organizing its data. Victorian natural history museums meticulously organized their collections with handwritten cards that survive to this day. And over the past 15 years, researchers have collectively entered records of more than a million fossils into an online database, allowing them to track broad trends in the history of life.
Now, palaeontologists are exploring the use of machine algorithms to pull fossil data from their research papers automatically. “I’m fairly convinced that this is the future, for sure,” says Shanan Peters, a palaeontologist at the University of Wisconsin–Madison (UW Madison) who is co-leading an effort to use software to extract information from tens of thousands of palaeontology papers. “Building a database, per se, will be a thing of the past. Those databases will be dynamically generated based on the questions you’re interested in, and the machine will do the heavy lifting.”
Peters should know. He is the principal investigator of the Paleobiology Database (PBDB; paleobiodb.org), which details the age, location and identity of some 1.2 million fossils. Since it was started in 1998, researchers have spent about 80,000 hours — the equivalent of 9 continuous years — entering and opining over data from original field research and around 40,000 articles. The PBDB has produced hundreds of papers and has allowed palaeontologists to address questions that would have been otherwise unanswerable, on topics ranging from epoch-wide extinction rates to the disappearance of certain dinosaurs.”