Indiana University Bloomington

  • Corpus description and texts
    • Structure of corpus
    • Texts in Corpus
    • Version numbers
    • Source corpora and licenses
  • Tag set
    • Splitting & joining words; lemmatization
    • Part-of-speech labels
    • Phrasal labels
    • Treatment of individual words, phrases, foreign words, and proper nouns
    • Empty categories
  • Syntactic annotation
    • Sentence tokenization
    • Noun Phrases
    • Other phrases
    • IP and other clause types
    • CP types
    • Coordination
    • Comparison
    • Passives and other constructions
  • Query the corpus
    • Corpus Search
  • Publications

Indiana Parsed Corpus of Historical (High) German

The IPCHG is a syntactically parsed corpus of High German texts from the 11th through the 20th centuries. The corpus is nearly complete and will contain 165 texts in total. The texts annotated to date can be downloaded or queried on this website.

About the corpus

  • Corpus overview and texts
  • Tag set
  • Syntactic annotation
  • Publications

Research Team

Christopher D. Sapp, Ph.D., primary investigator
Rex A. Sprouse, Ph.D., primary investigator
Elliott Evans, Ph.D., postdoctoral fellow
Danny Dakota, Ph.D., computational consultant
Graduate assistants: David Bolter, Elaine Dalida, Janine Emerson, Mary Gilbert, Sal Goldfinch, Jane Harris, Tyler Kniess, Daniel Mitropolous, Elijah Peters, and Bradley Weiss

Links

Related parsed historical corpora
  • The Penn Parsed Historical Corpora of English (PPHCE)
  • The Icelandic Parsed Historical Corpus (IcePaHC)
  • The Heliand Parsed Database (HeliPaD)
  • The Corpus of Historical Low German (CHLG)
  • Caitlin Light's parsed corpus of Martin Luther's ENHG Bible translation
Parsing / Annotating tools
  • CorpusSearch 2, the query language for Penn-style parsed corpora
  • Annotald, a tool for annotating parsed texts
  • Our scripts for extracting parsable sentences from source corpora
  • Our scripts for running CorpusSearch 2 in a web browser and for concatenating the linguistic string to the results of a coding query
German language resources
  • Wörterbuchnetz searchable historical dictionaries of German
  • Deutsch Diachron Digital, a family of historical corpora of German

Acknowledgments

This project is made possible by the following grants:

  • Faculty Research Support Funding Seed grant from the IU OVPR, supported by the Department of Germanic Studies and the Department of Second Language Studies.
  • External Resubmission grant from the IU OVPR, also supported by the Departments of Germanic Studies and Second Language Studies.
  • National Science Foundation 3-year grant "Building a parsed historical corpus to investigate word-order variation and change."

Copyright © 2026 The Trustees of Indiana University