The British National Corpus (BNC) is a collection of written and spoken language data dating from the late 20th century and totalling around 100 million words. Approximately 90% of the corpus is written data, and the remaining 10% is spoken data. Both components cover a variety of styles and settings.

