Free ngram databases from COW14 web corpora

From the corpora list

::::::::::::::::::::::::::::::

We are pleased to announce the release of the first very large ngram databases derived from the giga-token COW14 web corpora. They are completely free (CC-BY) and can be downloaded without registration. We have applied no frequency thresholds whatsoever. In addition to the counted ngram lists, we offer raw versions so that everybody can create their own counted versions. The raw ngrams also contain additional information (crawl year, top-level domain, country geolocation).

There are also English dependency bigrams (based on Malt parses) containing words, their heads, and the dependency relation between them.

For end-users, there are also word and lemma frequency lists with some convenient frequency measures, optionally with a frequency threshold of 10 (smaller files, easier handling).

——————————————————————–

LICENSE AND REFERENCES

License:    Creative Commons Attribution 4.0 International
References: http://corporafromtheweb.org/category/cow-citation/

Please tell us whenever you publish work based on COW:
https://webcorpora.org/publication/

DOWNLOAD

http://hpsg.fu-berlin.de/cow/ngrams/
http://hpsg.fu-berlin.de/cow/frequencies/

ORIGIN AND ORIGINAL CORPUS SIZES

The ngrams are derived from the COW14AX sentence-shuffled corpora.

Information: http://corporafromtheweb.org/category/corpora/
Interface:   https://webcorpora.org/

English   9,578,828,861 tokens (International)
German   11,660,894,000 tokens (AT, CH, DE)
Spanish   3,680,794,644 tokens (International)
Swedish   4,842,753,707 tokens (FI, SV)

FREQUENCY LISTS

Languages:  English, German, Spanish, Swedish
Versions:   Lemma, Lemma + POS, Word, Word + POS
Thresholds: no threshold; raw frequency > 9
Measures:   raw frequency, absolute rank, frequency per million,
            log-frequency per million, frequency band
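
As a rough illustration of how three of the listed measures relate to each other, here is a minimal Python sketch. It is not the COW code: the function name `frequency_measures`, the base-10 logarithm, and the ordinal ranking of ties are assumptions made for the example. The frequency band is left out because its definition is not given in this announcement.

```python
import math

def frequency_measures(counts, corpus_size, threshold=0):
    """Derive rank, frequency per million and log-frequency per million.

    counts: mapping from item (word or lemma) to raw frequency.
    corpus_size: total number of tokens in the corpus.
    threshold: keep only items with raw frequency > threshold
               (threshold=9 mirrors the "raw frequency > 9" lists).
    """
    # Rank over the full list, most frequent first; ties are broken
    # alphabetically (an assumption, not the documented COW behaviour).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = []
    for rank, (item, raw) in enumerate(ranked, start=1):
        if raw <= threshold:
            continue
        fpm = raw * 1_000_000 / corpus_size
        rows.append({
            "item": item,
            "raw": raw,
            "rank": rank,
            "fpm": fpm,
            "log_fpm": math.log10(fpm),  # base 10 is an assumption
        })
    return rows

# Toy counts, not taken from COW14.
toy = {"the": 50, "corpus": 10, "rare": 1}
for row in frequency_measures(toy, corpus_size=1_000_000, threshold=9):
    print(row)
```

Ranks are assigned before the threshold is applied, so an item keeps the same absolute rank in the thresholded and unthresholded lists.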

NGRAMS

N:         1 .. 5
Languages: English, German, Spanish, Swedish
Versions:  Raw, Word, Word + POS, Lemma (except Swedish)
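
To show how a counted list could be built from raw ngrams, here is a minimal Python sketch. The record layout (`RawNgram` with `tokens`, `year`, `tld`, `country`) and the function `count_ngrams` are hypothetical: they mirror the kinds of metadata mentioned in this announcement, not the actual file format of the downloads, which should be checked before adapting this.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Optional, Tuple

class RawNgram(NamedTuple):
    """Hypothetical in-memory form of one raw ngram occurrence."""
    tokens: Tuple[str, ...]
    year: int      # crawl year
    tld: str       # top-level domain
    country: str   # country geolocation

def count_ngrams(
    records: Iterable[RawNgram],
    country: Optional[str] = None,
    min_freq: int = 1,
) -> List[Tuple[Tuple[str, ...], int]]:
    """Aggregate raw occurrences into (ngram, frequency) pairs.

    country: if given, count only occurrences geolocated there.
    min_freq: optional user-chosen threshold; 1 keeps everything.
    """
    counts = Counter(
        r.tokens for r in records
        if country is None or r.country == country
    )
    return [(ng, c) for ng, c in counts.most_common() if c >= min_freq]

# Toy records, not real COW14 data.
records = [
    RawNgram(("web", "corpus"), 2014, "de", "DE"),
    RawNgram(("web", "corpus"), 2013, "at", "AT"),
    RawNgram(("free", "ngrams"), 2014, "de", "DE"),
]
print(count_ngrams(records))
print(count_ngrams(records, country="AT"))
```

Because the raw data carry no threshold, any cut-off or metadata filter is a choice made at counting time, as with the `country` and `min_freq` parameters here.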

DEPENDENCY BIGRAMS

Languages: English (German soon, maybe Swedish)
Versions:  Raw, Word, Word + POS, Lemma, Lemma + POS
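
A dependency bigram, as described above, pairs a word with its head and the relation between them. The following Python sketch illustrates the idea on a toy parse. The input layout (token id, form, head id, relation), the `ROOT` placeholder, and the relation labels are assumptions for illustration only and do not reflect the actual Malt output or the format of the released files.

```python
from collections import Counter

def dependency_bigrams(sentence):
    """Yield (word, head, relation) triples for one parsed sentence.

    sentence: list of (token_id, form, head_id, relation) tuples,
              with head_id 0 marking the root (assumed layout).
    """
    forms = {tok_id: form for tok_id, form, _, _ in sentence}
    for _, form, head_id, relation in sentence:
        # Tokens attached to the artificial root get a placeholder head.
        head = forms.get(head_id, "ROOT")
        yield (form, head, relation)

# Toy parse with illustrative labels, not real parser output.
parsed = [
    (1, "dogs", 2, "nsubj"),
    (2, "bark", 0, "root"),
]
counts = Counter(dependency_bigrams(parsed))
for triple, freq in counts.items():
    print(triple, freq)
```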

CFP International Journal of Bilingual Education and Bilingualism: Special Issue 2017

Via the AESLA mailing list

::::::::::::::::::::::::::::::::::::::::::

CALL FOR PAPERS

International Journal of Bilingual Education and Bilingualism: Special Issue 2017

As guest editors (Yolanda Ruiz de Zarobe and Roy Lyster) of a Special Issue of the International Journal of Bilingual Education and Bilingualism, we invite you to submit proposals on the following topic:

Instructional practices and teacher development in Content and Language Integrated Learning (CLIL)

The aim of this Journal is to be thoroughly international in nature. It disseminates high-quality research, theoretical advances, and international developments related to initiatives in bilingualism and bilingual education. Each year the International Journal of Bilingual Education and Bilingualism devotes two of its issues to Special Issues.

Previous Special Issues have tended to receive remarkable praise, particularly as they focus on one issue and often provide a major step forward in the study of a particular

This Special Issue on CLIL seeks:

• To promote theoretical and applied research conducted in the context of CLIL and other content-based programs such as immersion.

• To disseminate information about best practices in content-based instruction.

• To provide a truly international exchange on how CLIL pedagogy is applied in a wide

Authors are invited to submit proposals focusing on instructional practices and teacher development in CLIL at any educational level and in any educational setting. Both state-of-the-art articles and empirical studies are welcomed. Manuscripts submitted should be original, not under review by any other publication and not published

– Deadline for 200-250 word abstracts: 15th September 2015. Proposals should be submitted by email attachment to the co-editors at yolanda.ruizdezarobe@ehu.es and

They should contain the author’s name, affiliation and e-mail address.

– Notification of acceptance/rejection: 1st November 2015. Please note that selection of the proposal does not always guarantee publication.

– Deadline for full papers (no longer than 7,000 words including notes and references): 15th February 2016. Each article will receive two independent and anonymous

For further information on the journal’s submission guidelines, please visit:

http://www.tandf.co.uk/journals/rbeb