Transcription

Here you can find some useful resources to carry out your transcription project.

MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk. 3rd Edition. Mahwah, NJ: Lawrence Erlbaum Associates.

MacWhinney, B. (2019). Tools for Analyzing Talk. Part 1: The CHAT Transcription Format. URL: https://childes.talkbank.org/

Leech (2004): types of annotation

phonetic annotation: e.g. adding information about how a word in a spoken corpus was pronounced.

prosodic annotation: again in a spoken corpus, adding information about prosodic features such as stress, intonation and pauses.

syntactic annotation: e.g. adding information about how a given sentence is parsed, in terms of syntactic analysis into units such as phrases and clauses.

semantic annotation: e.g. adding information about the semantic category of words. The noun cricket as a term for a sport and as a term for an insect belong to different semantic categories, although there is no difference in spelling or pronunciation.

pragmatic annotation: e.g. adding information about the kinds of speech act (or dialogue act) that occur in a spoken dialogue. Thus the utterance okay on different occasions may be an acknowledgement, a request for feedback, an acceptance, or a pragmatic marker initiating a new phase of discussion.

discourse annotation: e.g. adding information about anaphoric links in a text, for example connecting the pronoun them and its antecedent the horses in: I’ll saddle the horses and bring them round. [an example from the Brown corpus]

stylistic annotation: e.g. adding information about speech and thought presentation (direct speech, indirect speech, free indirect thought, etc.)

lexical annotation: adding the identity of the lemma of each word form in a text, i.e. the base form of the word, such as would occur as its headword in a dictionary (e.g. lying has the lemma LIE).
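As a rough illustration of how annotation layers of this kind can sit alongside the text, the sketch below attaches a lexical layer (lemma) to each word form and then adds one discourse link between the pronoun them and its antecedent the horses. The Token class, the LEMMAS table and the layer names are hypothetical and invented for this example; they do not come from Leech or from any annotation tool.

```python
from dataclasses import dataclass, field

# Hypothetical, hand-made lemma table for illustration only;
# a real project would use a lemmatizer or a tagged lexicon.
LEMMAS = {"lying": "LIE", "horses": "HORSE"}


@dataclass
class Token:
    form: str                                        # word form as it appears in the text
    annotations: dict = field(default_factory=dict)  # layer name -> value


def annotate_lemmas(words):
    """Attach a 'lemma' layer to each word form.

    Words missing from the toy table fall back to their upper-cased form.
    """
    tokens = []
    for word in words:
        lemma = LEMMAS.get(word.lower(), word.upper())
        tokens.append(Token(word, {"lemma": lemma}))
    return tokens


tokens = annotate_lemmas(["saddle", "the", "horses", "and", "bring", "them", "round"])

# Discourse layer: link the pronoun "them" (index 5) to its antecedent "horses" (index 2).
tokens[5].annotations["antecedent"] = 2

for i, tok in enumerate(tokens):
    print(i, tok.form, tok.annotations)
```

Keeping each layer under its own key means that further layers (prosodic, pragmatic, and so on) could be added later without changing the text itself.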

Online services:

https://transcribe.wreally.com/

https://otranscribe.com/

BRAT: http://brat.nlplab.org/introduction.html

Backbone Transcriptor. URL

Gate: https://gate.ac.uk/teaching.html

Folia: https://proycon.github.io/folia/

Metadata for corpus work: http://users.ox.ac.uk/~lou/wip/metadata.html

Annotation on Sketch Engine: https://www.sketchengine.eu/guide/annotating-corpus-text/

TEI by example website: https://teibyexample.org/modules/TBED02v00.htm

Wordcounter

Online editor that can help you improve word choice and writing style and, optionally, detect grammar mistakes and plagiarism. To check word count, place your cursor in the text box on the site and start typing. The number of characters and words increases or decreases as you type, delete, and edit. You can also copy and paste text from another program into the online editor. The Auto-Save feature makes sure you do not lose any changes while editing, even if you leave the site and come back later.

URL: https://wordcounter.net
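If you prefer to count offline, a character and word count can be approximated in a few lines of Python. This is only a sketch: the function name count_text is made up here, and it assumes that a word is any run of non-whitespace characters, which may not match how Wordcounter itself counts.

```python
import re


def count_text(text):
    """Return (characters, words) for a piece of text."""
    # Assumption: a word is a maximal run of non-whitespace characters.
    words = re.findall(r"\S+", text)
    return len(text), len(words)


chars, words = count_text("Corpus tools help researchers.")
print(chars, "characters,", words, "words")
```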

Graphic Online Language Diagnostic

 

The Graphic Online Language Diagnostic (“GOLD”) is a corpus tool that allows language educators to submit and analyze language data. GOLD was developed by the Center for Advanced Language Proficiency Education and Research (“CALPER”) at The Pennsylvania State University (“PSU”), University Park, PA, USA under a grant from the U.S. Department of Education (Title VI, P229A060003 and P229A020010).

Link here: http://gold.gwserver1.net

Writing tools for researchers

This is a selection of resources for those wishing to improve their scientific and academic writing in English. It showcases online resources including courses, academic word lists, online databases, concordancers and corpora, as well as some DIY tools.

Online courses

British Council Writing for a purpose

Face to face & online courses

VI Escribir ciencia en inglés / Writing science in English (Universidad de Murcia)

Word lists

AWL and definitions: Academic Word List (Coxhead, 2000), around 570 headwords

AWL 10 sublists and sublist families

Exploring contexts of AWL (dictionary-based) and academic areas (needs a code)

Test your vocabulary range using Lex Tutor

The Manchester Phrase Bank

Exploring collocations

Oxford online collocations dictionary

Collocation forbetterenglish (Sketch Engine SkELL): examples, word sketches and similar words

Word neighbors (different corpora available)

StringNet (explore patterns)

ColloCaid: collocation errors and editor

Using Google Ngram to discover word combinations (e.g. intake of *)

Online corpora

Academic words in American English (Mark Davies COCA)

CRA (Corpus of Research Articles): great for testing a hypothesis (e.g. is it "perform an analysis"?)

MICUSP

MICASE

British Academic Written English Corpus (BAWE): Sketch Engine gateway

BAWE corpus (Coventry site)

ScienQuest

CQPweb portal

Deconstructing discourse

Clean your text 

Generate word lists (Input url)

Ngram Analyzer

Ngram Extractor

Web as a corpus (n-gram browser)

Online text comparator

Google Books Ngram Viewer: use it to test phraseological uses. All the options here
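For readers who want to see what the n-gram tools above do under the hood, here is a minimal sketch of word n-gram extraction using only the Python standard library. The function name ngrams and the tokenization rule (lower-cased runs of letters and apostrophes) are assumptions made for this example, not the method of any tool listed here.

```python
import re
from collections import Counter


def ngrams(text, n=2):
    """Return a Counter of word n-grams (as tuples) found in text."""
    # Assumption: lower-cased runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


sample = "the results show that the results are robust"

# Print the three most frequent bigrams with their frequencies.
for gram, freq in ngrams(sample, 2).most_common(3):
    print(" ".join(gram), freq)
```

Changing n to 3 or 4 gives longer sequences, which is often more useful for spotting recurrent academic phrases.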

Online DBs

Exploration tools:

Ngramfinder

Babla (just for fun)

Netspeak

Video talks

Webcorp (The web is your corpus)

Springer exemplar

Taporware tools (Alberta)

Concordancers

AntConc (Windows, macOS, Linux)

TextSTAT (Windows & macOS)

Do-it-yourself tools & Advanced users

Just-text

Beautiful Soup parser (Python)

Remove duplicated text (deduplication): Onion

—————————–

Using COCA (Corpus of Contemporary American English)

For more information on our research group and its interests, visit our website: Languages for specific purposes, language corpora, and English linguistics applied to knowledge engineering.