Corpus Linguistics #cl2015: notes and pics
Corpus Linguistics Conference 2015, University of Lancaster, UK
Thanks to @TonyMcEnery, @HardieResearch and everybody at @UCREL_Lancaster for organizing a wonderful conference.
Abstract book download:
A selection of talks and personal notes:
Learner corpus research plenary #cl2015
Multi-dimensional analysis of oral proficiency interviews #cl2015
Non-obvious meaning in CL and CADS #cl2015
Representation of benefit claimants in UK media #cl2015
Tono Linguistic feature extraction #cefr #cl2015
Language learning theories underpinning corpus-based pedagogy #cl2015
And some pics:
Robert Poole (left)
Ricardo Jiménez
Carlos Ordoñana (left)
Lynne Flowerdew
Carlos Ordoñana (left) and Yukio Tono (right)
Discussing the representation of immigrants in the context of the LADEX project.
Yolanda Noguera and John Flowerdew
Yukio Tono (middle)
Yolanda Noguera and Michael Barlow
Multi-dimensional analysis of oral proficiency interviews #cl2015
Shelley Staples; Jesse Egbert; Geoff LaFlair
A multi-dimensional comparison of oral proficiency interviews to conversation, academic and professional spoken registers
MELAB: Michigan English Language Assessment Battery; 989 OPIs in 2013
OPI used for academic and professional purposes
Only transcribed the first 5 minutes
55 linguistic features
TagCount
Factor analysis (FA)
6-factor solution
Dimensions interpreted functionally
Dimension scores
Differences across registers (ANOVAs and post hocs)
6 dimensions
1. Explicit stance: private verbs, that deletion, lower rates of implicit stance than the Longman corpus
3. Speaker-centered informational vs listener-centered involvement: pro1, subject-conj.causative, nn, amplifiers
4. Extended informational discourse: word length, prep, jj atr, that rel; negative features: all pronouns
6. Implicit stance: higher rates of implicit stance than the Longman corpus
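A minimal sketch of how dimension scores are commonly computed in multi-dimensional analysis, assuming the usual procedure: standardise each feature rate across texts, then add the z-scores of positively loading features and subtract those of negatively loading ones. The feature names follow dimension 4 in the notes above, but the rates, the register labels and the choice of population standard deviation are illustrative assumptions, not data or settings from the talk.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise a list of rates (population SD is an assumption here)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd if sd else 0.0 for v in values]


def dimension_scores(texts, positive, negative):
    """Sum z-scores of positive features and subtract those of negative ones."""
    z = {f: z_scores([t[f] for t in texts]) for f in positive + negative}
    return [
        sum(z[f][i] for f in positive) - sum(z[f][i] for f in negative)
        for i in range(len(texts))
    ]


# Invented toy data: one row per text.
texts = [
    {"register": "OPI", "word_length": 4.0, "pronouns": 120.0},
    {"register": "OPI", "word_length": 4.0, "pronouns": 120.0},
    {"register": "academic", "word_length": 5.0, "pronouns": 60.0},
    {"register": "academic", "word_length": 5.0, "pronouns": 60.0},
]

scores = dimension_scores(texts, positive=["word_length"], negative=["pronouns"])

# Mean dimension score per register, the quantity compared across registers.
register_means = {}
for reg in sorted({t["register"] for t in texts}):
    reg_scores = [s for t, s in zip(texts, scores) if t["register"] == reg]
    register_means[reg] = mean(reg_scores)

print(register_means)
```

The per-register means are what the ANOVAs and post hoc tests would then compare; that step is left out of this sketch.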
Non-obvious meaning in CL and CADS #cl2015
Plenary session: Alan Partington
Non-obvious meaning in CL and CADS: from ‘hindsight post-dictability’ to sweet serendipity
Chair: Amanda Potts
http://www3.lingue.unibo.it/blog/clb/
Introspection & intuition
Processes of inference from the linguistic trace left by speakers/writers
Shared meaning
Idiom principle
Complexity of common grammatical items
Colligation: every word primed to occur in or avoid certain grammatical positions and functions (Hoey, 2005: 13)
SiBol (Siena-Bologna) corpus of newspapers, judicial inquiries, press briefings
Rapid language change
Corpus methodology is useful in detecting absence, not only presence
Language looks rather different when you look at a lot of it at once (Sinclair 1991)
Qualitative: anaphoric, historic, past behaviour
Quantitative: anaphoric and cataphoric; enough data with which to infer
If primed >> psychologically fixed >> reproduced
Evaluation as prototypicality: inner circle obvious, outer circle non-obvious
Prosody can depend on grammar (Louw 1993), point of view, literal vs figurative use and on field of register
Embedding is an important factor to interpret prosody
The added value of CL in discourse studies
Looking at language at different levels of abstraction: overview & close reading
Data are not sacred
Much of textual meaning is accretional
Positive cherry-picking: find counter examples
Almost all explanation in DA is informed speculation: in the human sciences this is the closest you get to explanation
Moral panics have evolved over the years (globesity in 2015)
Representation of benefit claimants in UK media #cl2015
Ben Clarke
The ideological representation of benefit claimants in UK print media
2010 – 2014
2.3 M corpus
benefits claimant(s) search criteria
Adjectival constructions
Adjective lemmas are ranked
hard number 40
tough number 53
enTenTen13 score
Tough on is significant in the corpus
Tough patterns
Benefit claimants: scroungers
tougher conditions, curbs on
Prepositions and ideology: on here as a Goal PR in a Material PT (impacted/affected entity)
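The notes above mention ranked adjective lemmas and a score against enTenTen13 but do not say how the score was computed. As a hedged illustration only, here is one common way to rank lemmas against a reference corpus: the "simple maths" keyness ratio of smoothed per-million frequencies. All counts and corpus sizes below are invented, and the function names are hypothetical.

```python
def per_million(count, corpus_size):
    """Normalised frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def simple_maths(focus_count, focus_size, ref_count, ref_size, n=1.0):
    """Keyness as a ratio of smoothed per-million frequencies."""
    focus_fpm = per_million(focus_count, focus_size)
    ref_fpm = per_million(ref_count, ref_size)
    return (focus_fpm + n) / (ref_fpm + n)


def rank_lemmas(focus_counts, focus_size, ref_counts, ref_size, n=1.0):
    """Return lemmas sorted from most to least key in the focus corpus."""
    scored = {
        lemma: simple_maths(count, focus_size, ref_counts.get(lemma, 0), ref_size, n)
        for lemma, count in focus_counts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Invented toy counts, not figures from the talk.
focus_counts = {"tough": 99, "hard": 49, "new": 199}
ref_counts = {"tough": 90, "hard": 240, "new": 1990}

ranking = rank_lemmas(focus_counts, 1_000_000, ref_counts, 10_000_000)
print(ranking)
```

The smoothing constant n controls how much weight rare items get; a larger n favours more frequent lemmas.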
Tono Linguistic feature extraction #cefr #cl2015
Yukio Tono
Linguistic feature extraction and evaluation using machine learning to identify “criterial” grammar constructions for the CEFR levels
L2 learner profile
English Profile – CEFR for English
Criterial features: Hawkins & Filipovic 2012
CEFR-J RLD Project: the aim is to prepare a list of vocabulary and grammar items to be taught and assessed at each CEFR level
CEFR Coursebook Corpus
Weka 3.6.12 format
158 features
Attribute selection
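For readers unfamiliar with the Weka format mentioned above, here is a rough sketch of writing per-text feature counts to an ARFF file with the CEFR level as the class attribute. The relation name, the two feature names, the values and the helper `to_arff` are hypothetical; the talk used 158 features that are not listed in these notes.

```python
def to_arff(relation, feature_names, levels, rows):
    """Build ARFF text: numeric features plus a nominal class attribute."""
    lines = [f"@relation {relation}", ""]
    for name in feature_names:
        lines.append(f"@attribute {name} numeric")
    lines.append("@attribute cefr_level {" + ",".join(levels) + "}")
    lines += ["", "@data"]
    for features, level in rows:
        values = [str(features[name]) for name in feature_names]
        lines.append(",".join(values + [level]))
    return "\n".join(lines) + "\n"


# Invented example rows: (feature rates, CEFR level of the text).
feature_names = ["relative_clauses", "passives"]
rows = [
    ({"relative_clauses": 1.2, "passives": 0.4}, "A1"),
    ({"relative_clauses": 3.5, "passives": 1.8}, "A2"),
    ({"relative_clauses": 9.1, "passives": 4.7}, "B1"),
]

arff_text = to_arff("cefr_features", feature_names, ["A1", "A2", "B1"], rows)
print(arff_text)
```

A file in this shape can be loaded into Weka, where attribute selection would then rank or filter the features; which evaluator was used is not recorded in these notes.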