Our annotation of temporal phrases differs from the Penn corpora of English in two ways. First, all bare, clausal temporal NPs are tagged NP-TMP. Second, we use a flat structure for longer time expressions only if there are no German function words that indicate a more refined structure.
When to use flat structure:
- When there are no (German) function words, or when the function words don't distinguish the parts of the time expression:
(NP-TMP (D^A^SG den) (NUM^A^SG 01.12.1900))
- When the whole time expression is in a foreign language:
(NP-TMP (FW Anno) (FW domini) (NUM^A^SG 1447) (ADJ^G^SG S.) (NPR^G^SG Ceciliae))
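The flat/articulated distinction can be checked mechanically. The following is a minimal sketch, not part of the annotation tooling: the names parse_tree and is_flat are hypothetical, and "flat" is assumed to mean that no daughter of the phrase contains a further bracketed node.

```python
import re


def parse_tree(text):
    """Parse one bracketed tree into nested lists of the form [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("(", ")"):
            raise ValueError("expected a label after '('")
        label = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())  # nested node
            else:
                children.append(tokens[pos])  # word
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets: missing ')'")
        pos += 1  # consume the closing ')'
        return [label] + children

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def is_flat(tree):
    """True if no daughter of the root contains a further bracketed node."""
    return all(
        not isinstance(child, list)
        or all(isinstance(item, str) for item in child[1:])
        for child in tree[1:]
    )


flat = parse_tree("(NP-TMP (D^A^SG den) (NUM^A^SG 01.12.1900))")
articulated = parse_tree("(NP-TMP (D den) (ADJ ersten) (NP-POS (NPR^G^SG Dezember)) )")
print(is_flat(flat), is_flat(articulated))
```

Because parse_tree raises ValueError on unbalanced brackets, the same sketch can double as a quick well-formedness check on hand-written examples.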
Use articulated, syntactic structure when there are German articles or prepositions:
Note that if the word 'year' and a numerical year are present, the latter is in apposition to the former:
(PP (P i@) (NP (D @m) (N jar) (NP-PRN (NUM 1549))))
When a month follows a day given as an ordinal number, assume that the month is genitive:
(NP-TMP (D den) (ADJ ersten) (NP-POS (NPR^G^SG Dezember)) )
('the first (day) of Dec.', not 'the first Dec.')
Syntactically structured time phrases are generally sisters:
(NP-TMP (D den) (ADJ ersten) (NP-POS (NPR^G^SG Dez.)) )
(NP-TMP (NUM^A^SG 1900))
The exception is when it is very clear that one phrase modifies another, e.g. when vor or nach further specifies the day:
(PP (P an) (NP (NPR Mitchwuchen) (PP (P vor) (NP (NPR Cecilee)))))
Function words may cause a shift from flat to articulated structure:
Here, 'on S. Martin's night' is sister to 'AD 1448' because there is clearly a PP, but neither phrase is part of the other:
(NP-TMP (FW Anno) (FW domini) (FW m.cccc.x.lviii))
(PP (P uff) (NP (NP-POS sant martinis) (N nacht)))
Choice of case:
- NUM^A^SG for any parts of dates or times that are otherwise unmarked for case (see Anno domini 1447 S. Ceciliae above).
- NPR^G^SG for saints' days, even if not in an NP-POS (as with S. Ceciliae above).
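The tags above pack category, case and number into one string separated by carets. As a minimal sketch of how such tags might be taken apart for consistency checks (split_tag is a hypothetical helper, and the three-field layout category^case^number is an assumption read off the examples in this section):

```python
def split_tag(tag):
    """Split a tag such as 'NUM^A^SG' into (category, case, number).

    Assumption: fields are separated by '^' in the order category, case,
    number. Missing fields are returned as None.
    """
    parts = tag.split("^")
    category = parts[0]
    case = parts[1] if len(parts) > 1 else None
    number = parts[2] if len(parts) > 2 else None
    return category, case, number


print(split_tag("NUM^A^SG"))  # a date part otherwise unmarked for case
print(split_tag("NPR^G^SG"))  # a saint's day
print(split_tag("FW"))        # a foreign word, no case or number
```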
Determining head vs. modifier:
If the day/year is a specific point in time, it is the 'head' of the time phrase:
(PP (P a@) (NP (D @m) (ADJ 14.) (N Tag) (PP nach Ostern) ))
'on the 14th day after Easter'
(NP-TMP (PP nach Christi geburt) (NUM^A^SG 1078))
'A.D. 1078' (not '1078 [years] after Christ's birth')
However, if the days/years (in the plural) are a measurement relative to a specific point in time, that point in time is the head:
(PP (NP-MSR 14 tage) (P nach) (NP (NPR ostern)))
'[the day] 14 days after Easter' (annotating as NP-TMP would imply a duration of 14 days beginning at Easter)
(PP (P nach) (NP (NP-POS Christi) (N geburt)) (NP-MSR (NUM^A^PL 1078) (N^A^PL Jahre)))
'[the year] 1078 years after Christ's birth', not a duration of 1078 years.