Sounds of the City

From MLML

Revision as of 15:37, 29 May 2017

Sounds of the City is a corpus of 142 speakers of vernacular Glaswegian English, covering a span of around 100 years. It is intended for examining changes in Glaswegian over time. This page describes the steps for preparing the corpus so that it can be used with the Montreal Forced Aligner and imported for the SPADE project.

Get Dataset

This data can be found on Havarti at corpora/SOTC.

Treating Audio

None needed.

Treating Transcripts

None needed.

Dictionary

The dictionary being used for this task is CELEX, on Havarti at corpora/CELEX.

Alignment

Already aligned.

Import

The following files have no corresponding transcript and should therefore be left out of the directory to be imported:

 70-Y-m07-labo.wav
 00-O-m01-mw74.wav
 00-Y-m03-medi.wav
 00-Y-m04-medi.wav
 70-mc-M-m02-mlay.wav
 70-mc-M-m03-mlay.wav
 80-O-f01-clbk.wav
 80-O-f02-clbk.wav
 80-O-f03-clbk.wav
 80-O-m01-clbk.wav
 80-O-m02-clbk.wav
 80-O-m03-clbk.wav
 80-O-m04-clbk.wav
 80-O-m05-clbk.wav
 80-O-m06-clbk.wav
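
One way to apply the exclusion list above is to filter it out when collecting the audio files for the import directory. The following is only a minimal sketch: the function name importable_wavs is hypothetical, and it assumes the .wav files sit directly in a single flat directory, which this page does not confirm.

```python
from pathlib import Path

# Files listed above as having no corresponding transcript.
MISSING_TRANSCRIPT = {
    "70-Y-m07-labo.wav",
    "00-O-m01-mw74.wav",
    "00-Y-m03-medi.wav",
    "00-Y-m04-medi.wav",
    "70-mc-M-m02-mlay.wav",
    "70-mc-M-m03-mlay.wav",
    "80-O-f01-clbk.wav",
    "80-O-f02-clbk.wav",
    "80-O-f03-clbk.wav",
    "80-O-m01-clbk.wav",
    "80-O-m02-clbk.wav",
    "80-O-m03-clbk.wav",
    "80-O-m04-clbk.wav",
    "80-O-m05-clbk.wav",
    "80-O-m06-clbk.wav",
}


def importable_wavs(source_dir):
    """Return the sorted .wav paths in source_dir that are not on the exclusion list.

    Assumption: source_dir is a flat directory; subdirectories are not searched.
    """
    return sorted(
        path
        for path in Path(source_dir).glob("*.wav")
        if path.name not in MISSING_TRANSCRIPT
    )
```

The returned paths can then be copied or linked into whichever directory is handed to the importer. The excluded files stay untouched in the source directory.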