Tono Linguistic feature extraction #cefr #cl2015

Yukio Tono

Linguistic feature extraction and evaluation using machine learning to identify “criterial” grammar constructions for the CEFR levels


L2 learner profile

English Profile – CEFR for English

Criterial features: Hawkins & Filipovic 2012

CEFR-J RLD Project: the aim is to prepare a list of vocabulary and grammar items to be taught and assessed at each CEFR level

CEFR Coursebook Corpus

Data in Weka format (Weka 3.6.12)

158 features

Attribute selection
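
The notes do not say which attribute evaluator was used in Weka, so the following is only a minimal sketch of one common approach: ranking features by information gain against the CEFR level. The feature names, the presence/absence coding and the four toy texts are hypothetical, not data from the talk.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature, gain) pairs, most informative first."""
    names = list(rows[0].keys())
    gains = {n: information_gain([r[n] for r in rows], labels) for n in names}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: one row per text, 1 = construction present.
rows = [
    {"present_perfect": 0, "relative_clause": 0, "definite_article": 1},
    {"present_perfect": 0, "relative_clause": 0, "definite_article": 1},
    {"present_perfect": 1, "relative_clause": 0, "definite_article": 1},
    {"present_perfect": 1, "relative_clause": 1, "definite_article": 1},
]
labels = ["A1", "A1", "B1", "B1"]  # hypothetical CEFR levels

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy set a feature that appears in every text carries no information about level, while one that cleanly separates the levels ranks first; that is the intuition behind treating highly ranked features as candidates for "criterial" status.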


Language learning theories underpinning corpus-based pedagogy #cl2015

 

Lynne Flowerdew
Language learning theories underpinning corpus-based pedagogy

The noticing hypothesis (Schmidt)

Attention consciously drawn

Noticing linked to frequency counts

Implicit vs explicit learning

Constructivist learning

Learners engage in discovery learning

Inductive learning

Cognitive skills, problem solving to understand new data

Widmann et al. (2011), on the SACODEYL project: the more possible starting points a corpus offers for exploitation, the more likely it is to suit different learners.

Sociocultural theory

What about language learning outside the classroom and incidental learning?

 

Learner corpus research plenary #cl2015

Learner corpus research: a fast-growing interdisciplinary field

Sylviane Granger


LCR is an interdisciplinary research field

Design: learner and task variables to control

Not only English language

Method: CIA (Granger, 1996) and computer-aided error analysis

Wider spectrum of linguistic analysis

Interpretation: focus on transfer but this is changing; growing integration of SLA theory

Applications: few up-and-running resources but great potential

Version 3 (expected 2016 or 2017): around 30 L1s, as opposed to 11 L1s in Version 1

Learner corpora are a powerful heuristic resource

Corpus techniques make it possible to uncover new dimensions of learner language and lead to the formulation of new research questions: the L2 phrasicon (word combinations).

Prof. Granger brings up Leech’s preface to Learner English on Computer (1998)

Gradual change from mute corpora to sound-aligned corpora

POS tagging has improved so much

Error tagging: a wide range of error-tagging systems, including multi-layer annotation systems

Parsing of learner data (90% accuracy; Geertzen et al. 2014)

Static learner corpora vs monitor corpora

CMC learner corpus (Marchand 2015)

Granger (2009) paper on the learner research field:

Granger, Sylviane. “The contribution of learner corpora to second language acquisition and foreign language teaching.” Corpora and language teaching 33 (2009): 13.

 

CIA V2 Granger (2015): a new model

SLA researchers are more interested in corpus data and corpus linguists are more familiar with SLA grounding

Implications are much more numerous than applications

Links with NLP: spell and grammar checking, learner feedback, native language identification, etc.

Multiple perspectives on the same resource: richer insights and more powerful tools

Phraseology

Louvain English for Academic Purposes Dictionary (LEAD)

web-based

corpus based

descriptions of cross-disciplinary academic vocabulary

1200 lexical items around 18 functions (contrast, illustrate, quote, refer, etc.)

A really exciting application


MA of L2 learner English

Corpus Linguistics 2015, University of Lancaster, 21-24 July

Yu Yuan:
“Exploring the variation in world Learner Englishes: A multidimensional analysis of L2 written corpora”

109 features included in the analysis

RQ:

Can Biber’s model be extended?

How do features co-occur in learner English?

 

Data

ICLE 1.0 (Granger, 2002)

SWEECL 2.0 (Wen & Wang, 2008)

 

Tools

MAT, the Multidimensional Analysis Tagger (Nini, 2014), which comes with a manual and Windows software

Stanford CoreNLP

R

Python scripts

 

Method

Kaiser’s criterion + scree test for factor analysis
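
As a rough illustration of how these two retention rules work, here is a minimal sketch. It assumes the eigenvalues of the feature correlation matrix have already been computed (for example in R or with `numpy.linalg.eigvalsh`); the ten values below are hypothetical and are not results from the talk. The scree test is normally read by eye from a plot, so the "largest second difference" rule used here is only one simple stand-in for that judgement.

```python
def kaiser_count(eigenvalues):
    """Kaiser's criterion: keep factors whose eigenvalue exceeds 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def scree_elbow(eigenvalues):
    """Number of factors before the sharpest bend in the scree curve.

    The bend at position i is measured by the second difference
    ev[i-1] - 2*ev[i] + ev[i+1]; the factors to keep are those
    preceding the position with the largest bend.
    """
    ev = sorted(eigenvalues, reverse=True)
    best_i, best_bend = 1, float("-inf")
    for i in range(1, len(ev) - 1):
        bend = ev[i - 1] - 2 * ev[i] + ev[i + 1]
        if bend > best_bend:
            best_i, best_bend = i, bend
    return best_i


# Hypothetical eigenvalues for ten variables (they sum to 10, as the
# eigenvalues of a 10 x 10 correlation matrix must).
eigenvalues = [3.0, 2.6, 2.2, 0.6, 0.5, 0.4, 0.3, 0.2, 0.12, 0.08]

print("Kaiser:", kaiser_count(eigenvalues))
print("Scree elbow:", scree_elbow(eigenvalues))
```

The two rules need not agree on real data, which is presumably why they are combined: Kaiser's criterion gives an upper bound and the scree curve helps decide where the remaining factors stop adding much.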

 

Results

10 dimensions stand out

Dimensions are largely epistemological, rhetorical and syntactical.

 

European Journal of Applied Linguistics invites submissions

The European Journal of Applied Linguistics (EuJAL) focuses on the particular concerns of applied linguistics in European contexts, both by addressing problems that are typically relevant for the linguistic situation in Europe, from those on the level of the EU as a pan-national body down to the level of the individual, and by examining topics broached by or discussed in European applied linguistics in particular. In addition to resulting from an epistemological stance, EuJAL is a logical outcome of the regionalization policy of the Association Internationale de Linguistique Appliquée (AILA), supporting the societies’ commitment to regionalization by focusing on the European language space and by giving applied linguists from this regional context an adequate forum. EuJAL is part of the joint activities of the European AILA affiliates.

Researching Language Learner Interactions Online: From Social Media to MOOCs

The 2015 CALICO Monograph: Researching Language Learner Interactions Online: From Social Media to MOOCs edited by Ed Dixon and Michael Thomas is now available.

 

Ch. 1 (Edward Dixon, Michael Thomas): Introduction

Ch. 2 (Dana Milstein): Pancake People, Throwaway Culture, and En Media Res Practices: A New Era of Distance Foreign Language Learning

Ch. 3 (Alice Chik): English Language Teaching Apps: Reconceptualizing Learners, Parents, and Teachers

Ch. 4 (Timothy Lewis, Anna Comas-Quinn, Mirjam Hauck): Clustering, Collaboration, and Community: Sociality at Work in a cMOOC

Ch. 5 (Fernando Rubio): The Role of Interaction in MOOCs and Traditional Technology-Enhanced Language Courses

Ch. 6 (Edward Dixon, Carolin Fuchs): Face to Face, Online, or MOOC–How the Format Impacts Content, Objectives, Assignments, and Assessments

Ch. 7 (Vickie Karasic, Anu Vedantham): Video Creation Tools for Language Learning: Lessons Learned

Ch. 8 (Michael Thomas): Researching Machinima in Project-Based Language Learning: Learner-Generated Content in the CAMELOT Project

Ch. 9 (Yuki Akiyama): Task-Based Investigations of Learner Perceptions: Affordances of Video-Based eTandem Learning

Ch. 10 (Ilona Vandergriff): Exercising Learner Agency in Forum Interactions in a Professionally Moderated Language Learning Networking Site

Ch. 11 (Motoko I. Christensen, Mark Christensen): Language Learner Interaction in Social Network Site Virtual Worlds

Ch. 12 (Geraldine Blattner, Amanda Dalola, Lara Lomicka): Tweetsmarts: A Pragmatic Analysis of Well Known Native French Speaker Tweeters

Ch. 13 (Theresa Schenker): Telecollaboration for Novice Language Learners–Negotiation of Meaning in Text Chats between Nonnative and Native Speakers

Ch. 14 (Giulia Messina Dahlberg, Sangeeta Bagga-Gupta): Learning On-The-Go in Institutional Telecollaboration: Anthropological Perspectives on the Boundaries of Digital Spaces

Ch. 15 (Marie-Thérèse Batardière): Examining Cognitive Presence in Students’ Asynchronous Online Discussions

Ch. 16 (Kelsey D. White): Orientations and Access to German-Speaking Communities in Virtual Environments

Ch. 17 (Megan Case): Language Students’ Personal Learning Environments Through an Activity Theory Lens

Ch. 18 (Bonnie Youngs, Sarah Moss-Horwitz, Elizabeth Snyder): Educational Data Mining for Elementary French On-line: A Descriptive Study

Ch. 19 (Stephanie Link, Zhi Li): Understanding Online Interaction Through Learning Analytics: Defining a Theory-Based Research Agenda