
Every Tool and Project I Learned About at #mozfest

Morning everyone,

I spent the weekend at #mozfest, a conference in London run by Mozilla that featured people talking about open source code, journalism, art, science, community, and community learning. I saw lots of great products and ideas, which are summarized below. I’m highlighting things that I would need dev help on but that we could make, if you’re interested in a side project. Tomorrow, I’ll report on key takeaways.



1. HearUsHere uses GPS coordinates to place sounds at specific locations. This allows users to compose audio experiences as they travel throughout a space. (For example, recording sounds as a walking tour around a city.) Imagine what we could do with this for news or our programming.

2. is natural language for the Internet of Things.

3. Free web app to take the pain out of transcribing interviews. 

4. I worked on a team to think about how we could recreate the tape recorder to make a better experience for reporters out in the field. Want to work on this? Let me know.

5. BBC R&D has designed something called Perceptive Radio, which changes words in stories based on variables like the weather. Here’s a picture.
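For anyone curious how the GPS-triggered audio in item 1 might work under the hood, here’s a rough sketch. It is not HearUsHere’s actual code: the function names, the sample coordinates, and the idea of giving each sound a trigger radius are all my own assumptions for illustration. It measures the distance between a listener and each placed sound with the haversine formula and returns the sounds close enough to play.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def sounds_in_range(listener, sounds):
    """Return names of sounds whose trigger radius contains the listener.

    `listener` is a (lat, lon) tuple. Each sound is a dict with hypothetical
    keys "name", "lat", "lon" and "radius_m".
    """
    lat, lon = listener
    return [
        s["name"]
        for s in sounds
        if haversine_m(lat, lon, s["lat"], s["lon"]) <= s["radius_m"]
    ]


# Made-up sample data: two sounds placed about a kilometer apart.
placed_sounds = [
    {"name": "market", "lat": 51.5000, "lon": -0.1000, "radius_m": 50},
    {"name": "river", "lat": 51.5100, "lon": -0.1000, "radius_m": 50},
]

# A listener standing roughly 11 meters from the "market" sound.
print(sounds_in_range((51.5001, -0.1000), placed_sounds))
```

A real app would poll the phone’s location and start or fade audio as this list changes; the sketch only covers the "which sounds am I near?" step.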


2. The BBC News Lab team is working on several projects of note. 

COMMA creates metadata from large collections of audio files. It produces crude transcripts using speech recognition, along with automated tags and speaker segmentation. (From Letter from America Rediscovered.)

Letter from America categorized by theme

To do this, they used WikipediaMiner, which is a toolkit built for tapping into the rich semantics encoded within Wikipedia. 

2a. Related: Local Angle, developed by researchers at the Knight Lab, finds locally relevant stories in national news. They do this by using Wikipedia’s API and an API that finds keywords in stories. Imagine how we could apply this to audio, particularly if we can pluck out keywords.
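To make the Local Angle idea a bit more concrete, here’s a minimal sketch of the matching step. It is not the Knight Lab’s implementation. I’m assuming the keywords have already been pulled out of each story (or transcript), and I’m using a small hand-written lookup table as a stand-in for whatever a Wikipedia lookup would return about the places tied to each keyword. All names and data are made up.

```python
def local_stories(stories, keyword_places, locale):
    """Return (title, matching_keywords) for stories tied to `locale`.

    `stories` maps a story title to the keywords extracted from it.
    `keyword_places` maps a keyword to the set of places associated with it;
    here it is a stand-in for a real Wikipedia lookup.
    """
    matches = []
    for title, keywords in stories.items():
        # Keep only the keywords that have a known link to this locale.
        hits = sorted(
            k for k in keywords if locale in keyword_places.get(k, set())
        )
        if hits:
            matches.append((title, hits))
    return matches


# Made-up national stories with already-extracted keywords.
national_stories = {
    "Senate passes transport bill": ["Senator Example", "Acme Rail"],
    "Tech firm announces layoffs": ["Widget Corp"],
}

# Made-up keyword-to-place associations.
places_by_keyword = {
    "Senator Example": {"Ohio"},
    "Acme Rail": {"Ohio", "Indiana"},
    "Widget Corp": {"California"},
}

print(local_stories(national_stories, places_by_keyword, "Ohio"))
```

For audio, the keywords would come from a transcript rather than article text, but the matching step would look about the same.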

Structured Wikipedia Data Resources: How Wikipedia structures data and how you can use it to do cool stuff


  • How the New York Times, the New York Public Library and ProPublica are crowdsourcing data — and relying on their audience to provide layers of metadata that could be useful for future projects 


Internal Tools

  • PopcornMaker is a way to remix web video, audio, and images into mashups that can be embedded onto other websites. 

  • A group of us came up with several bot ideas that would be useful for a newsroom. (Help me make these.)

1. Could a bot provide reporters with predictions about board appointments, political appointments or hiring decisions based on unusual behavior on Twitter? In other words, if everyone from NPR starts following X, then it’s pretty likely X is about to be hired by NPR, even if that information hasn’t been publicly announced yet. This could help business reporters, political reporters and entertainment reporters. (Here’s how BuzzFeed predicted Ezra Klein was going to Vox, using humans rather than a bot to work it out.)

2. Could a bot suggest questions for reporters to ask about a particular topic, based on questions that have been posted to Ask Metafilter, Reddit’s Explain Like I’m Five, Quora, and the Stack Exchange network? In other words, how can we filter the best questions posted on these networks and give them to reporters so they know what the audience is curious about?

3. Could you have a bot that would monitor the ethics of other bots?
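If anyone wants to try the first bot idea, here’s a rough sketch of the core logic. It assumes we already have, for each staffer at an organization, the set of accounts they started following during some window; how that data gets collected is out of scope here, and nothing below calls a real Twitter API. The function name, the threshold, and the sample handles are all hypothetical.

```python
from collections import Counter


def flag_follow_spikes(new_follows, min_staff=3):
    """Return (account, count) pairs newly followed by >= min_staff staffers.

    `new_follows` maps a staff handle to the set of handles that staffer
    started following during the window being examined.
    """
    counts = Counter()
    for staffer, followed in new_follows.items():
        # Count each staffer at most once per account; ignore self-follows.
        for account in set(followed) - {staffer}:
            counts[account] += 1
    flagged = [(acct, n) for acct, n in counts.items() if n >= min_staff]
    # Most-followed first, then alphabetical, so the output is stable.
    return sorted(flagged, key=lambda pair: (-pair[1], pair[0]))


# Made-up sample: three of four staffers start following the same account.
sample_window = {
    "staff_a": {"candidate_x", "news_org"},
    "staff_b": {"candidate_x"},
    "staff_c": {"candidate_x", "someone_else"},
    "staff_d": {"someone_else"},
}

print(flag_follow_spikes(sample_window))
```

The threshold would need tuning against real data, since a popular account can pick up a cluster of new followers for reasons that have nothing to do with hiring.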

For Fun

See something cool? Let us know! The archives for this listserv live here: