Representation of benefit claimants in UK media #cl2015

 

Ben Clarke
The ideological representation of benefit claimants in UK print media

2010 – 2014

2.3 M corpus

benefits claimant(s) search criteria

Adjectival constructions

Adjective lemmas are ranked

hard number 40

tough number 53

enTenTen13 score

"Tough on" is significant in the corpus

Tough patterns

Benefit claimants: scroungers

tougher conditions, curbs on

Prepositions and ideology: "on" here as a Goal participant role (PR) in a Material process type (PT) (impacted/affected entity)
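
A minimal sketch of how "tough (...) on" patterns such as those noted above could be pulled out of concordance lines. The sentences are hypothetical examples, not lines from the corpus used in the talk, and the regular expression is an assumption about what counts as a pattern, not the method the speaker used.

```python
import re
from collections import Counter

# Hypothetical example sentences; NOT taken from the corpus in the talk.
SENTENCES = [
    "Ministers promise to get tough on benefit claimants.",
    "The party wants tougher curbs on benefit claimants.",
    "Labour vows to be tough on welfare fraud.",
]

# "tough"/"tougher"/"toughest", optionally one intervening word
# (e.g. "curbs"), then "on", then capture the next one or two words.
PATTERN = re.compile(
    r"\btough(?:er|est)?\s+(?:\w+\s+)?on\s+(\w+(?:\s+\w+)?)",
    re.IGNORECASE,
)

def tough_on_targets(sentences):
    """Count what follows 'tough (...) on' in each sentence."""
    counts = Counter()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts

if __name__ == "__main__":
    for target, n in tough_on_targets(SENTENCES).most_common():
        print(n, target)
```

The captured words are the entity on the receiving end of "on", which is the slot the note describes as the impacted/affected entity.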

 

Summer Schools in Corpus Linguistics

Through the Corpora List

::::::::::::::::::::::::::::::::::::


Summer Schools in Corpus Linguistics / Statistics for Corpus Linguistics

http://ucrel.lancs.ac.uk/summerschool

Lancaster University, UK – 14th to 17th July 2015

 

Since 2010, Lancaster University has run a highly successful series of free-to-attend summer training events. In 2015, we will for the first time be running two corpus linguistics events in parallel:

 

  • The UCREL Summer School in Corpus Linguistics
  • The UCREL/CASS Summer School in Statistics for Corpus Linguistics

 

Sponsored by UCREL at Lancaster University – one of the world’s leading and longest-established centres for corpus-based research – and by the ESRC-funded CASS project, these events aim to support students of language and linguistics in developing advanced skills in corpus methods.

Both are intended primarily for postgraduate research students (and secondarily for Masters-level students, postdoctoral researchers, and others); both assume at least a basic knowledge of corpus linguistics (but in the case of the Statistics Summer School, no knowledge of statistics is assumed).

The four-day programme consists of a series of intensive two-hour sessions, some involving practical work, others more discussion-oriented. Some sessions are shared across the two events. The instructors include, as well as speakers from Lancaster University, external guest speakers who are prominent specialists in their respective areas.

For a list of topics and speakers in the UCREL Summer School in Corpus Linguistics, see http://ucrel.lancs.ac.uk/summerschool/corpusling.php

 

For a list of topics and speakers in the UCREL/CASS Summer School in Statistics for Corpus Linguistics, see http://ucrel.lancs.ac.uk/summerschool/stats.php

These events are part of a larger set of five co-located Lancaster Summer Schools in Interdisciplinary Digital Methods; the other events include training in corpus methods directed at non-linguists; see the website for further information:

http://ucrel.lancs.ac.uk/summerschool

Note that the summer schools run the week immediately before the Corpus Linguistics 2015 conference, for the benefit of anyone who might wish to attend both.

 

How to register

Our Summer Schools are free to attend, but registration in advance is compulsory, as places are limited.

The deadline for registrations is Sunday 7th June 2015, but we cannot guarantee that places will still be available at that point!

The application forms are available on the event website, as is further information on the programme.

Lexicoder automated content analysis of text

Lexicoder is Java-based, multi-platform software for automated content analysis of text. It was developed by Lori Young and Stuart Soroka, and programmed by Mark Daku (initially at McGill University, and now at Penn, Michigan, and McGill respectively).

The current version of the software (2.0) is freely available – for academic use only. Additions and revisions will also be released here as they become available. In addition, the Lexicoder Sentiment Dictionary, a dictionary designed to capture the sentiment of political texts, is available formatted for Lexicoder, or WordStat, and also adaptable to other content-analytic software. Work on Topic Dictionaries, based on the Policy Agendas coding scheme, is also underway.
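
As a rough illustration of what dictionary-based sentiment coding involves, here is a minimal sketch in Python. It does not use Lexicoder or its file formats; the word lists are hypothetical toy entries, not items from the Lexicoder Sentiment Dictionary, and the `net_tone` measure is an assumption chosen for simplicity.

```python
import re

# Hypothetical toy word lists for illustration only; these are NOT
# entries from the Lexicoder Sentiment Dictionary.
POSITIVE = {"good", "success", "improve", "support"}
NEGATIVE = {"bad", "crisis", "fail", "cut"}

def sentiment_counts(text):
    """Count positive and negative dictionary hits in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    # Net tone: (positive - negative) hits per token; 0.0 for empty text.
    net = (pos - neg) / len(tokens) if tokens else 0.0
    return {"tokens": len(tokens), "positive": pos,
            "negative": neg, "net_tone": net}

if __name__ == "__main__":
    print(sentiment_counts("The reforms support families but the crisis is bad."))
```

A real dictionary would also need to handle word stems, multi-word entries and negation, which this sketch ignores.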

Through LinkedIn: The WebGenre R&D Group.

1st Intl. NLP for Informal Text - Deadline 17/4


The 1st International Workshop on Natural Language Processing for Informal Text (NLPIT 2015)
In conjunction with The International Conference on Web Engineering (ICWE 2015)
June 23, 2015, Rotterdam, The Netherlands
http://wwwhome.cs.utwente.nl/~badiehm/nlpit2015/

Overview
The rapid growth of Internet usage in the last two decades has added new challenges to understanding the informal user-generated content (UGC) on the Internet. Textual UGC refers to textual posts on social media, blogs, emails, chat conversations, instant messages, forums, reviews, or advertisements that are created by end-users of an online system. A large portion of the language used in textual UGC is informal. Informal text is a style of writing that disregards grammatical conventions and uses a mixture of abbreviations and context-dependent terms. The straightforward application of state-of-the-art Natural Language Processing approaches to informal text typically results in significantly degraded performance, for the following reasons: the lack of sentence structure; the lack of sufficient context; the rarely seen entities involved; the noisy, sparse content of users’ contributions; and the untrusted facts contained. The aim of this workshop is to draw researchers’ attention to the opportunities and challenges involved in informal text processing. In particular, we are interested in discussing informal text modeling, normalization, mining, and understanding, in addition to various application areas in which UGC is involved.

Topics

We invite submissions on topics that include, but are not limited to, the following core NLP approaches for informal UGC: language identification, classification, clustering, filtering, summarization, tokenization, segmentation, morphological analysis, POS tagging, parsing, named entity extraction, named entity disambiguation, relation/fact extraction, semantic annotation, sentiment analysis, language normalization, informality modeling and measuring, language generation, handling uncertainties, machine translation, ontology construction, dictionary construction, etc.

Submission

Authors are invited to submit original work not submitted to another conference or workshop. Workshop submissions can be full papers or short papers. Paper length should not exceed 12 pages for full papers and 6 pages for short papers. All papers should follow Springer’s LNCS format. Papers in PDF can be submitted via the EasyChair Conference System https://easychair.org/conferences/?conf=nlpit2015. Each submission will receive, in addition to a meta-review, at least two double-blind peer reviews. Each full paper will get 25 minutes of presentation time. Short papers will get 5 minutes of presentation time in addition to a poster. Besides papers, we also plan to have an invited talk by a renowned scientist on a topic relevant to the workshop. Workshop proceedings will be published as part of the ICWE2015 workshop proceedings. To contact the NLPIT 2015 organization team, please send an e-mail to: nlpit2015@easychair.org.

Deadlines

– Submission deadline: April 17, 2015
– Notification deadline: May 17, 2015
– Camera-ready version: May 24, 2015
– Workshop date: June 23, 2015

Message distributed through the Corpora List

Ideology in corporate language

Ruth Breeze

Ideology in corporate language: discourse analysis using Wmatrix3

2013 annual reports from 16 leading companies in financial services, mining, food and pharmaceuticals

Parts: first part, non technical, discursive, visually interesting

Reference corpus: first BNC Sampler Business and BNC Informative texts, but then only BNC Business

Use of semantic categories
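
A minimal sketch of how a word or semantic category can be compared between a study corpus and a reference corpus. It assumes a log-likelihood statistic of the kind widely used for keyness comparison; these notes do not record which statistic or thresholds were used in the talk, and the frequencies below are hypothetical.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness for one item across two corpora.

    freq_* are the item's frequencies, size_* the corpus sizes in tokens.
    """
    total = size_study + size_ref
    combined = freq_study + freq_ref
    # Expected frequencies if the item were spread evenly by corpus size.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    ll = 0.0
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

if __name__ == "__main__":
    # Hypothetical counts for one semantic category, NOT figures from the talk.
    print(log_likelihood(freq_study=50, size_study=1000,
                         freq_ref=10, size_ref=1000))
```

An item with the same relative frequency in both corpora scores 0; larger values indicate a bigger difference, but the score alone does not say in which corpus the item is overused, so relative frequencies should be checked alongside it.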

Three case studies: size (big), time (begin) and cause and effect

Size: Focus on growth, large, expanding, substantial. Not only adjectives are interesting here.

Conclusions:

Ideology of cause and effect

Dynamic approach to time

Emphasis on size and importance

Salient semantic areas: investigation, tough, strong, attentive, help & give, in power, belonging to a group

Differences: only in domain/topic-focus, probably different stresses on newness and green economy