Indiana University Bloomington

  • Corpus description and texts
    • Structure of corpus
    • Texts in Corpus
    • Version numbers
    • Source corpora and licenses
  • Tag set
    • Splitting & joining words; lemmatization
    • Part-of-speech labels
    • Phrasal labels
    • Treatment of individual words and phrases
    • Empty categories
  • Syntactic annotation
    • Sentence tokenization
    • Noun Phrases
    • Other phrases
    • IP and other clause types
    • CP types
    • Coordination
    • Comparison
    • Passives and other constructions
  • Query the corpus
    • Corpus Search
  • Publications

Indiana Parsed Corpus of Historical (High) German

The IPCHG will be a syntactically parsed corpus of approximately 165 High German texts from the 11th through 20th centuries. The corpus is currently about one-third complete; the finished portion, comprising most of the Early New High German texts and some of the Middle High German ones, is publicly available. As we complete the annotation of further texts, they will become available to download and query on this website.

About the corpus

  • Corpus overview and texts
  • Tag set
  • Syntactic annotation
  • Publications

Research Team

Christopher D. Sapp, Ph.D., primary investigator
Rex A. Sprouse, Ph.D., primary investigator
Elliott Evans, Ph.D., postdoctoral fellow
Danny Dakota, Ph.D., computational consultant
Elaine Dalida, M.A., graduate assistant
Jane Harris, M.A., graduate assistant
Daniel Mitropolous, M.A., graduate assistant

Links

Related parsed historical corpora
  • The Penn Parsed Historical Corpora of English (PPHCE)
  • The Icelandic Parsed Historical Corpus (IcePaHC)
  • The Heliand Parsed Database (HeliPaD)
  • The Corpus of Historical Low German (CHLG)
  • Caitlin Light's parsed corpus of Martin Luther's ENHG Bible translation
Parsing / Annotating tools
  • CorpusSearch 2, the query language for Penn-style parsed corpora
  • Annotald, a tool for annotating parsed texts
  • Our scripts for extracting parsable sentences from source corpora
German language resources
  • Wörterbuchnetz searchable historical dictionaries of German
  • Deutsch Diachron Digital, a family of historical corpora of German

Acknowledgments

This project is made possible by the following grants:

  • Faculty Research Support Funding Seed grant from the IU OVPR, supported by the Department of Germanic Studies and the Department of Second Language Studies.
  • External Resubmission grant from the IU OVPR, also supported by the Departments of Germanic Studies and Second Language Studies.
  • National Science Foundation 3-year grant "Building a parsed historical corpus to investigate word-order variation and change."

Special thanks to our former graduate assistants: Janine Emerson, Mary Gilbert, Sal Goldfinch, Tyler Kniess, and Elijah Peters.

Copyright © 2025 The Trustees of Indiana University