EGP: investigating patterns of learner grammar development AAAL 2018 Chicago

 

The English Grammar Profile: investigating patterns of learner grammar development

Anne O’Keeffe, Mary Immaculate College, University of Limerick

Geraldine Mark, Mary Immaculate College, University of Limerick

Pascual Pérez-Paredes, University of Cambridge

Check out our handout here.

Weblinks

The CEFR: http://www.cambridgeenglish.org/exams-and-tests/cefr/

The English Grammar Profile: http://www.englishprofile.org/english-grammar-profile/egp-online

Cambridge Learner Corpus: https://www.sketchengine.co.uk/cambridge-learner-corpus/

Sketch Engine universal POS tags https://www.sketchengine.co.uk/universal-pos-tags/

 

References

Ellis, N. C. (2003). Constructions, chunking, and connectionism: The emergence of second language structure. In C. Doughty & M. H. Long (Eds.), Handbook of Second Language Acquisition (pp. 33–68). Oxford, UK: Blackwell.

Ellis, N. C. (2012). Formulaic language and second language acquisition: Zipf and the phrasal teddy bear. Annual Review of Applied Linguistics, 32, 17–44.

Simpson-Vlach, R., & Ellis, N. C. (2010). An Academic Formulas List (AFL). Applied Linguistics, 31, 487–512.

Ellis, N. C., Römer, U., & O’Donnell, M. B. (2016). Usage-based Approaches to Language Acquisition and Processing: Cognitive and Corpus Investigations of Construction Grammar. Language Learning Monograph Series. Wiley-Blackwell.

Larsen-Freeman, D. (2006). The emergence of complexity, fluency, and accuracy in the oral and written production of five Chinese learners of English. Applied Linguistics, 27(4), 590–619.

Milton, J., & Meara, P. (1995). How periods abroad affect vocabulary growth in a foreign language. ITL Review of Applied Linguistics, (107–08), 17–34.

O’Keeffe, A., & Mark, G. (2017). The English Grammar Profile of learner competence: Methodology and key findings. International Journal of Corpus Linguistics, 22(4), 457–489. https://benjamins.com/#catalog/journals/ijcl.14086.oke/fulltext

Römer, U., O’Donnell, M. B., & Ellis, N. C. (2014). Second language learner knowledge of verb–argument constructions: Effects of language transfer and typology. The Modern Language Journal, 98(4), 952–975.

Thewissen, J. (2013). Capturing L2 accuracy developmental patterns: Insights from an error-tagged learner corpus. The Modern Language Journal, 97(S1), 77–101.

Deadline of the CfP for LCR 2017 extended to 31 Jan 2017 #corpuslinguistics

The deadline of the CfP for LCR 2017 has been extended to Tuesday, 31 January 2017

4th Learner Corpus Research Conference, Bolzano/Bozen, 5-7 October 2017

Call for Papers

Following the successful conferences in Louvain-la-Neuve (Belgium) in 2011, Bergen (Norway) in 2013 and Nijmegen (the Netherlands) in 2015, the 4th Learner Corpus Research Conference will be hosted by the Institute for Specialised Communication and Multilingualism at EURAC Research, Bolzano/Bozen, Italy. The conference, organized under the aegis of the Learner Corpus Association, aims to be a showcase for the latest developments in the field and will feature full paper presentations, work in progress reports, poster presentations, software demos and a book exhibition.

The theme of LCR 2017 is “Widening the Scope of Learner Corpus Research”.

Conference Venue: European Academy Bozen/Bolzano – EURAC Research

Confirmed keynote speakers:

  • Philip Durrant (University of Exeter, United Kingdom)
  • Stefan Th. Gries (University of California, Santa Barbara, U.S.A.)
  • Stefania Spina (Università per Stranieri Perugia, Italy)

The keynote speakers will address the theme of LCR 2017 in their respective lectures on L1 writing development and Learner Corpus Research, quantitative methods in Learner Corpus Research, and Learner Corpus Research and Italian as L2. We welcome papers that address all aspects of Learner Corpus Research, in particular the following ones:

  • Corpora as pedagogical resources
  • Corpus-based transfer studies
  • Data mining and other explorative approaches to learner corpora
  • English as a Lingua Franca
  • Error detection and correction of learner language
  • Extracting language features from learner corpora
  • Innovative annotations in learner corpora
  • Language for academic/specific purposes
  • Learner varieties
  • Learner corpora for less commonly taught languages
  • Learner Corpus Research and the Common European Framework of Reference for Languages (CEFR)
  • Learner Corpus Research and Natural Language Processing
  • Links between Learner Corpus Research and other research methodologies (e.g. experimental methods)
  • Search engines for learner corpora
  • Statistical methods in learner corpus studies
  • Task and learner variables

There will be four different categories of presentation:

  • Full paper (20 minutes + 10 minutes for discussion)
  • Work in Progress (WiP) report (10 minutes + 5 minutes for discussion)
  • Corpus/software demonstration
  • Poster

The Work in Progress reports and posters are intended to present research still at a preliminary stage and on which researchers would like to get feedback.

The language of the conference is English.

Abstracts

Your abstract should be between 600 and 700 words (excluding a list of references). Abstracts should provide the following:

  • clearly articulated research question(s) and its/their relevance;
  • the most important details about research approach, data and methods;
  • the main results and their interpretation.

Abstracts should be submitted through EasyChair (https://easychair.org/conferences/?conf=lcr2017) by Tuesday 31 January 2017 (new deadline; the original deadline was Sunday 15 January 2017). Please follow the instructions provided on the conference website (http://lcr2017.eurac.edu).

Please note: The Learner Corpus Association will award the best paper and the best poster presentation given by a PhD student. Only LCA members can participate in the competition. Members interested in entering the competition must indicate so when submitting their abstracts.

Abstracts will be reviewed anonymously by the scientific committee. Notification of the outcome of the review process will be sent by 31 March 2017.

 

LCR2017 – Preconference workshop in honour of Professor Sylviane Granger

“LCR at the interfaces”, 4 October 2017, 15.00 to 18.00

This workshop, organized in honour of Sylviane Granger, will feature a series of invited speakers whose work has greatly contributed to the development of LCR. 

Four key interfaces will be discussed during the workshop:

“The interfaces between LCR and contrastive analysis” (Hilde Hasselgård and Signe Oksefjell Ebeling)

“The interfaces between LCR and SLA” (Nina Vyatkina)

“The interfaces between LCR and lexicography” (tbc)

“The interfaces between LCR and NLP” (tbc)

Join us for this event, which promises to be a landmark in the history of LCR!

 

The LCR 2017 organising committee

Andrea Abel (EURAC Research)
María Belén Díez-Bedmar (Universidad de Jaén)
Daniela Gasser (EURAC Research)
Aivars Glaznieks (EURAC Research)
Verena Lyding (EURAC Research)
Lionel Nicolas (EURAC Research)

The LCR 2017 scientific committee

Andrea Abel (EURAC Research)
Katherine Ackerley (Università degli Studi di Padova)
Annelie Ädel (Dalarna University)
Nicolas Ballier (Université Paris Diderot – Paris 7)
María Belén Díez-Bedmar (Universidad de Jaén)
Marcus Callies (Universität Bremen)
Erik Castello (Università degli Studi di Padova)
Francesca Coccetta (Università Ca’Foscari Venezia)
Pieter de Haan (Radboud Universiteit Nijmegen)
Hilde Hasselgård (Universitetet i Oslo)
Sandra Deshors (New Mexico State University)
Ana Diaz-Negrillo (Universidad de Granada)
Michael Flor (ETS)
John Flowerdew (City University of Hong Kong)
Lynne Flowerdew (independent researcher)
Fanny Forsberg Lundell (Stockholm University)
Gaëtanelle Gilquin (University of Louvain)
Sandra Götz (Justus Liebig Universität Gießen)
Solveig Granath (Karlstad University)
Sylviane Granger (Université catholique de Louvain)
Nicholas Groom (University of Birmingham)
Jirka Hana (Charles University Prague)
Shin’ichiro Ishikawa (Kobe University)
Jarmo Harri Jantunen (University of Jyväskylä)
Scott Jarvis (Ohio University)
Marie Källkvist (Lund University Sweden)
Agnieszka Lenko-Szymanska (University of Warsaw)
Cristóbal Jesús Lozano Pozo (Universidad de Granada)
Anke Lüdeling (Humboldt-Universität Berlin)
Carla Marello (Università degli Studi di Torino)
Fanny Meunier (Université catholique de Louvain)
Detmar Meurers (Universität Tübingen)
Florence Myles (University of Essex)
Susan Nacey (Hedmark University College)
Lionel Nicolas (EURAC Research)
Michael O’Donnell (Universidad Autónoma de Madrid)
Signe Oksefjell Ebeling (Universitetet i Oslo)
Magali Paquot (Université catholique de Louvain/FNRS)
Pascual Pérez-Paredes (University of Cambridge)
Tom Rankin (Vienna University of Economics and Business)
Paul Rayson (UCREL, Lancaster University)
Ute Römer (University of Michigan)
Anna Siyanova-Chanturia (Victoria University of Wellington)
Jennifer Thewissen (Universiteit Antwerpen)
Yukio Tono (Tokyo University of Foreign Studies)
Nina Vyatkina (University of Kansas)
Heike Zinsmeister (Universität Hamburg)

Graphic Online Language Diagnostic

 

The Graphic Online Language Diagnostic (“GOLD”) is a corpus tool that allows language educators to submit and analyze language data. GOLD was developed by the Center for Advanced Language Proficiency Education and Research (“CALPER”) at The Pennsylvania State University (“PSU”), University Park, PA, USA under a grant from the U.S. Department of Education (Title VI, P229A060003 and P229A020010).

Link here: http://gold.gwserver1.net

4th Learner Corpus Research Conference, Bolzano, Italy, 5‐7 October 2017

Abstracts should be submitted through EasyChair by Sunday 15 January 2017.

Notification of the outcome of the review process will be sent by 31 March 2017.

For inquiries, contact Andrea Abel: Andrea . Abel @ eurac . edu

Tono Linguistic feature extraction #cefr #cl2015

Yukio Tono

Linguistic feature extraction and evaluation using machine learning to identify “criterial” grammar constructions for the CEFR levels

 

L2 learner profile

English Profile – CEFR for English

Criterial features: Hawkins & Filipovic 2012

CEFR-J RLD Project: the aim is to prepare a list of vocabulary and grammar items to be taught and assessed at each CEFR level

CEFR Coursebook Corpus

Weka format 3.6.12

158 features

Attribute selection
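
The notes do not record which attribute selection method was applied in Weka, so the following is only a minimal sketch of one common approach: ranking features by information gain against the CEFR level. The feature names, the toy data and the helper functions are all invented for illustration and are not taken from the talk.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature name, information gain) pairs, best first."""
    names = rows[0].keys()
    scores = {n: information_gain([r[n] for r in rows], labels) for n in names}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: 1 = construction present in the text, 0 = absent.
rows = [
    {"relative_clause": 0, "passive": 0},
    {"relative_clause": 0, "passive": 1},
    {"relative_clause": 1, "passive": 0},
    {"relative_clause": 1, "passive": 1},
]
labels = ["A2", "A2", "B1", "B1"]  # hypothetical CEFR level of each text

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy set, "relative_clause" separates the two levels perfectly and "passive" carries no information about them, so the former ranks first. With 158 real features, the same kind of ranking would help narrow the set down to candidate criterial constructions.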


Learner corpus research plenary #cl2015

Learner corpus research: a fast-growing interdisciplinary field

Sylviane Granger

 

LCR is an interdisciplinary research field

Design: learner and task variables to control

Not only English language

Method: CIA (Granger, 1996) and computer-aided error analysis

Wider spectrum of linguistic analysis

Interpretation: focus on transfer but this is changing; growing integration of SLA theory

Applications: few up-and-running resources but great potential

Version 3 (2016 or 2017) around 30 L1s as opposed to 11 L1s in Version 1

Learner corpora are a powerful heuristic resource

Corpus techniques make it possible to uncover new dimensions of learner language and lead to the formulation of new research questions: the L2 phrasicon (word combinations).

Prof. Granger brings up Leech’s preface to Learner English on Computer (1998)

Gradual change from mute corpora to sound aligned corpora

POS tagging has improved so much

Error-tagging: wide range of error tagging systems: multi-layer annotation systems

Parsing of learner data (90% accuracy; Geertzen et al. 2014)

Static learner corpora vs monitor corpora

CMC learner corpus (Marchand 2015)

Granger (2009) paper on the learner research field:

Granger, Sylviane. “The contribution of learner corpora to second language acquisition and foreign language teaching.” Corpora and language teaching 33 (2009): 13.

 

CIA V2 Granger (2015): a new model

SLA researchers are more interested in corpus data and corpus linguists are more familiar with SLA grounding

Implications are much more numerous than applications

Links with NLP: spelling and grammar checking, learner feedback, native language identification, etc.

Multiple perspectives on the same resource: richer insights and more powerful tools

Phraseology

Louvain English for Academic Purposes Dictionary (LEAD)

web-based

corpus based

descriptions of cross-disciplinary academic vocabulary

1200 lexical items around 18 functions (contrast, illustrate, quote, refer, etc.)

A really exciting application