Other corpora using the same or similar annotation schemes as the Penn-Helsinki Corpora
Parsed corpora of historical English
The following corpora are all part of an overarching project at the University of Pennsylvania, the University of York, and elsewhere to produce syntactically annotated corpora for all stages of the history of English:
- Old English (before 1100)
- Middle English (1100-1500)
- Early Modern English (1500-1700)
- Modern English (1700-1914)
Parsed corpora of other languages
This list is updated from time to time, but does not aim to be exhaustive. We hope it is useful nonetheless.
- Germanic
  - Audio-Aligned and Parsed Corpus of Appalachian English (AAPCAppE) - Christina Tortora (City University of New York) and collaborators
  - Corpus of Historical Low German - Anne Breitbarth (University of Gent) and collaborators
  - HeliPaD (Old Saxon Heliand) - George Walkden (University of Konstanz)
  - Icelandic Parsed Historical Corpus (IcePaHC) - Eiríkur Rögnvaldsson (University of Iceland) and collaborators
  - Indiana Parsed Corpus of Historical High German - Chris Sapp (Indiana University) and collaborators
  - Penn Parsed Corpus of Historical Yiddish - Beatrice Santorini (University of Pennsylvania)
  - The Parsed Corpus of Scottish Correspondence - Lisa Gotthard (University of Edinburgh)
- Romance
  - CORDIAL-SIN Corpus, a syntax-oriented corpus of European Portuguese dialects - Ana Maria Martins (Centro de Linguística da Universidade de Lisboa) and collaborators
  - Modéliser le changement: les voies du français (Modelling change: the paths of French), a parsed corpus of historical French - France Martineau (University of Ottawa) and collaborators
  - Penn-BFM Parsed Corpus of Historical French - Tony Kroch (University of Pennsylvania) and collaborators
  - P.S. Post Scriptum - A Digital Archive of Ordinary Writing (Early Modern Portugal and Spain) - Rita Marquilhas (Centro de Linguística da Universidade de Lisboa) and collaborators
  - Tycho Brahe Corpus, a parsed corpus of historical European Portuguese - Charlotte Galves (University of Campinas, Brazil) and collaborators
  - Word order and word order change in Western European languages (WOChWEL) Corpus, a growing parsed corpus of Old Portuguese - Ana Maria Martins and Sandra Pereira (Centro de Linguística da Universidade de Lisboa) and collaborators
- Japanese
  - NINJAL Parsed Corpus of Modern Japanese (NPCMJ) - Prashant Pardeshi (National Institute of Japanese Language and Linguistics) and collaborators
  - Oxford-NINJAL Corpus of Old Japanese (ONCOJ) - Bjarke Frellesvig (Oxford University) and an international committee