Pérez-Paredes

Corpus linguistics, applied linguistics & technology in language education

Tag: Metadata

Corpora and metadata

Lou Burnard:

[…] it is no exaggeration to say that without metadata, corpus linguistics would be virtually impossible. Why? Because corpus linguistics is an empirical science, in which the investigator seeks to identify patterns of linguistic behaviour by inspection and analysis of naturally occurring samples of language. A typical corpus analysis will therefore gather together many examples of linguistic usage, each taken out of the context in which it originally occurred, like a laboratory specimen. Metadata can restore that context by supplying information about it, thus enabling us to relate the specimen to its original habitat. Furthermore, since language corpora are constructed from pre-existing pieces of language, questions of accuracy and authenticity are all but inevitable when using them: without metadata, the investigator has no way of answering such questions. Without metadata, the investigator has nothing but disconnected words of unknowable provenance or authenticity[1].


[1] URL: http://users.ox.ac.uk/~lou/wip/metadata.html

References: Burnard, Lou; Aston, Guy (1998). The BNC handbook: exploring the British National Corpus. Edinburgh: Edinburgh University Press.

Posted on 29th December 2019. Categories: corpus linguistics. Tags: Metadata.