Abstract
Introduction
Chomsky (2013) argues that when two constituents are merged, owing to the requirement of the interpretive system, the derived syntactic object must have a label, which is assigned by the labeling algorithm (LA). Roughly speaking, LA works in two ways: One is that when a head and a phrase are merged, the head provides the label for the derived structure; the other is related to the cases where two phrases are merged, including successive-cyclic movement, criterial positions, and other symmetric structures.
As for the first way of labeling, Chomsky (2015) further claims that some heads are too weak to label, proposing there are two kinds of weak heads: lexical heads, which have no category feature although they have lexical contents; and T in languages like English, whose inflectional features are poor. These weak heads cannot provide labels, but can have a structure labeled by agreeing with an overt constituent in the Spec position (we use terms of the Government and Binding theory only for ease of exposition). These assumptions about labeling can help explain the distribution of many empty categories. Let us use (1) for illustration.
(1) *Whoi do you think that ti will like Mary?
Suppose the derivation of (1) reaches the stage shown in (2), in which
(2) [C that [?? who [? T [like Mary]]]]
(3) [who [C that [?? who [? T [like Mary]]]]]
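The head-based half of the labeling algorithm summarized above can be sketched as a small toy model. This is only an illustration under stated assumptions: the class names `Head` and `Spec`, the function `label`, and the string notation `<phi,phi>` for a label obtained through agreement are hypothetical conveniences, not part of Chomsky's formal system. The sketch encodes just two statements from the text: a strong head labels by itself, and a weak head can have its structure labeled only by agreeing with an overt constituent in its Spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Head:
    category: str                    # e.g. "C", "T", "R"
    weak: bool = False               # weak heads cannot label on their own
    feature: Optional[str] = None    # feature shared with the Spec, e.g. "phi"


@dataclass
class Spec:
    overt: bool                      # False for a copy left behind by movement
    feature: Optional[str] = None


def label(head: Head, spec: Optional[Spec] = None) -> Optional[str]:
    """Return a label for the structure headed by `head`, or None on failure."""
    if not head.weak:
        return head.category         # a strong head provides the label itself
    if (spec is not None and spec.overt
            and head.feature is not None and spec.feature == head.feature):
        # a weak head needs agreement with an overt phrase in its Spec
        return f"<{head.feature},{head.feature}>"
    return None                      # labeling failure


english_t = Head("T", weak=True, feature="phi")  # weak T, as assumed for English
print(label(Head("C")))                                      # C
print(label(english_t, Spec(overt=True, feature="phi")))     # <phi,phi>
print(label(english_t, Spec(overt=False, feature="phi")))    # None
```

The last call models a configuration like (3), where only a non-overt copy of the subject remains in Spec-T and the weak T therefore has nothing overt to agree with.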
Based on this theory, many insightful hypotheses have been posited, and interesting data have been uncovered; see, for example, Bošković (2016b, 2018). However, Hayashi (2020) argues that the concept of “weak head” should be eliminated because it cannot account for the labeling of English infinitival clauses and the sentences in such languages as Japanese. For example,
(4) John expects Bill to win.
(5) [ζ v* [ε Bill [δ R [γ Bill [β to [α Bill, win]]]]]] (Hayashi, 2020, p. 279)
(6) (Boku-wa) ringo-o tabe-masu.
I-TOP apple-ACC eat-PRS
‘I ate an apple.’ (Hayashi, 2020, p. 280)
The phenomena pointed out by Hayashi (2020) are considerably challenging; however, they do not indicate that the notion of weak heads must be eliminated. Indeed, we can develop Chomsky’s (2015) notion of weak heads so that it covers phenomena such as (4) and (6). To some extent, Chomsky’s (2015) assumption of weak head T in English is a reinterpretation of a long-standing observation, dating back to Taraldsen (1978), that languages such as Italian, Spanish, and Hungarian allow pro-drop because they are rich in inflectional features. Conversely, pro-drop is not allowed in English owing to its poor agreement inflection. In his labeling theory, Chomsky (2015) cites only this observation, leaving untouched other interesting remarks on pro-drop and infinitival T. While correlating rich agreement inflection with pro-drop, previous research also makes some insightful remarks. For instance, Rizzi (1982, p. 143) holds that there is a “pronominal Agr” in Infl/T. Borer (1986, 1989) argues that infinitival Agr and gerundive Agr in nonfinite T are anaphoric. Huang (1982) proposes that Chinese allows pro-drop because it has no agreement/Agr. Similarly, Saito (2007) suggests that in Japanese there is a covert operation allowing for pro-drop, which involves covert copying of elements from discourse-given entities to null argument positions. The precondition for this covert operation is the lack of surface agreement (Roberts, 2010).
Based on the previous research, particularly Huang (1982) and Saito (2007), we can develop Chomsky’s (2015) assumption by defining the weak T as (7).
(7) T is weak if its inflectional features are not rich.
The definition presupposes that if the head T is weak, it must have inflectional features. Assuming a head has agreement features, it is reasonable to consider it weak if it turns out to be poor in such features. With such a definition, nonfinite T should be strong because it has no inflectional features. T in such languages as Italian, Spanish, and Hungarian is strong because the inflectional features on T are rich. Furthermore, T in languages like Japanese and Chinese should be strong too because such languages do not have inflectional agreement.
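The classification that follows from (7) can be stated as a one-line predicate. The sketch below is merely illustrative: the function name `t_is_weak` and the three-way coding of inflection (`"rich"`, `"poor"`, or `None` for a T with no inflectional features at all) are assumptions introduced here, and the language entries simply restate the cases mentioned in the text.

```python
from typing import Optional


def t_is_weak(inflection: Optional[str]) -> bool:
    """Definition (7): T is weak only if it has inflectional features
    and those features are not rich. `None` means T has no such features."""
    if inflection not in ("rich", "poor", None):
        raise ValueError(f"unknown inflection value: {inflection!r}")
    return inflection == "poor"


# Hypothetical coding of the cases discussed in the text.
examples = {
    "English finite T": "poor",
    "English nonfinite T": None,
    "Italian T": "rich",
    "Spanish T": "rich",
    "Hungarian T": "rich",
    "Japanese T": None,
    "Chinese T": None,
}

for name, inflection in examples.items():
    print(f"{name}: {'weak' if t_is_weak(inflection) else 'strong'}")
```

Under this coding only English finite T comes out weak; a T without any inflectional features patterns with richly inflected T.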
Our assumption can explain why Chomsky (2013, 2015) believes labels to be crucial for interpretation at interfaces. In the pro-drop languages, the “pronominal Agr” (the set of phi-features) in T is interpretable (Holmberg, 2005; Rizzi, 1982; see also Alexiadou & Anagnostopoulou, 1998; Sheehan, 2006 for discussion of the D-feature in T). When T merges with vP, T together with the Agr/D feature will provide a label for the mother node. Furthermore, the Agr/D feature can identify the semantic content of pro and help it be interpreted (Rizzi, 1986). In English the Agr in T is not enough to interpret the pro in Spec-T (alternatively, the Agr in T is uninterpretable). Therefore, if T tries to provide a label, the pro in subject position cannot be interpreted in the end, an undesirable consequence. If the label is nonfinite T (e.g., if “to” provides a label for the mother node), the PRO/trace/copy in Spec-T will be taken to be anaphoric (see Borer, 1986, 1989). In languages without agreement, such as Chinese, Korean, and Japanese, if the label is T, an entity will be copied from the discourse to the null subject position at the interface (Saito, 2007; see also Miyagawa, 2017).
Although Chomsky’s (2015) assumption is insightful, it leaves at least one problem that needs a prompt solution. To solve the problems of projection and to account for linguistic facts, he put forward two kinds of weak heads. If his assumption is correct, one may wonder whether there are other kinds of weak heads. In this paper, we aim to address this problem. We develop Chomsky’s labeling theory by proposing that there is another kind of weak head. Specifically, based on Richards’ (2016, 2020) argument that certain aspects of phonological structures are built in the narrow syntax, we propose that phonological features play a crucial role in the LA. In particular, a head that loses phonological features in the syntax is weak when it comes to labeling. This approach to weak heads, together with Chomsky’s (2013, 2015) constraint that a structure must be labeled for interpretation, can explain the distribution of empty categories in topicalization, relativization, and ellipsis.
The remainder of this paper is organized as follows. Section 2 introduces additional weak heads, that is, a definition of weak heads from the perspective of phonological features. Section 3 presents the facts that it can account for. Section 4 concludes the paper.
Our Assumptions About Weak Heads and Labeling
Most generative linguists agree that when taken from the lexicon, lexical items typically have semantic, formal, and phonological features (Chomsky, 1995). After a phase is complete, phonological features are transferred to the sensorimotor system by the spell-out operation (Chomsky, 2000, 2001, 2008). Put differently, although phonological features are present in the narrow syntax, they are invisible to syntactic operations, and phonological structures are not built until the phonological features reach the sensorimotor system. On this view, the relationship between syntax and phonology is unidirectional, that is, syntax determines phonology. Intending to change this system, Richards (2016, 2020) proposes that part of the phonological structure is constructed within the syntax. That is, there is an interaction between syntax and phonology, and their relationship is bidirectional rather than unidirectional (Kandybowicz, 2020). Of course, there is a restriction on the phonological information available to the narrow syntax. Specifically, only the phonological information predictable from the syntax is available, while lexically specific information, such as specific segmental content, is not. With these assumptions, Richards offers a principled account of A’-movement, A-movement, and head movement. Similar to Richards (2016, 2020), Holmberg (2000) suggests that although syntactic operations cannot detect the exact phonological feature matrix of a constituent, they can determine whether a constituent has phonological features.
As for the phonological features of words, Richards (2016) makes some enlightening remarks. He argues that a conversion operation comes into play when phonological structures are built in the syntax. Consequently, a nominal with phonological features can be converted into PRO. In other words, the phonological features of a nominal can be erased by the conversion operation. It is well known in the Government and Binding theory that PRO is different from nouns/pronouns with phonological features in that it has the feature specification [+a, +p], which means that it is ungoverned (Chomsky, 1981). This indicates that the loss of phonological features has a significant impact on the syntactic properties of constituents.
As can be observed, the phonological features as well as the operation in the prosodic structure in the syntax exert a strong impact on syntax. Following this line of reasoning, we propose that phonological features also play a crucial role in the LA, and inspired by Chomsky’s (2015) assumption of weak heads, we put forward a new version of weak heads, as shown in (8).
(8) A head that loses its phonological features in the syntax is weak when it comes to labeling.
The weak heads discussed by Chomsky (2015) have phonological features. For ease of exposition, we call these weak heads overt weak heads, and we call heads that have lost their phonological features null weak heads. A head can lose phonological features in two ways. One is that the phonological features on a head are erased by the conversion operation in the narrow syntax. The other is that the phonological features of a head move away while the prosodic structure is built in the syntax, so that the original site of the head has no phonological features. See also Tian (2022).
Before proceeding to view the empirical evidence in support of (8), we need to make two points clear. The first point is related to the identification of null weak heads. We assume that a head that loses its phonological features in the syntax is weak. Nevertheless, the phonological features in the syntax are invisible, and all we see are the phonetic forms at the surface level. Moreover, Richards (2016) makes it clear that the phonological features in the syntax may be different from the phonetic forms we see at the surface level because the early syntactic derivational process may be obscured by later derivations. For example, he argues that PRO is initially a noun with phonological features. To establish the contiguity relation, the phonological features of the noun are erased and the noun is converted to a PRO. Given this, we are faced with the following question: how can we determine whether a head loses its phonological features in the syntax?
One effective method is to see whether a head can have a phonetic form. If it can, but it ends up empty, or its phonetic form is moved away, then we can claim that the head loses its phonological features in the syntax. For example, the phonological features of C can be realized as “that,” as (9a) shows, but it turns out to be empty in (9b). Then it becomes a null weak head in (9b).
(9) a. I think that you are right. b. I think you are right.
If a head is never phonetically realized, we cannot claim that it loses its phonological features in the narrow syntax because it is possible that the phonological features of the head are not phonetically realized. For example, (10a) shows that the phonological features on T may be realized as [s], but they have no phonetic realization at all in (10b).
(10) a. He likes syntax. b. We all like syntax.
However, we cannot claim that when the subject is plural, T has no phonological features. This is confirmed by (11), in which the verb takes a different form when the subject is changed from singular to plural. Another example in our favor is the phonological features of the plural morpheme. They may be realized as [s], as in
(11) a. He is a writer. b. We are writers.
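The diagnostic described in this passage can be restated as a three-valued test. The following is a minimal sketch under assumptions of our own making: the names `Site` and `lost_phonological_features` are hypothetical, and the return value `None` is used to encode the case where, as the text puts it, no claim can be made because the head is never phonetically realized.

```python
from enum import Enum
from typing import Optional


class Site(Enum):
    OVERT = "overt"          # the head is pronounced in its position
    EMPTY = "empty"          # the head could be pronounced but is not
    MOVED = "moved away"     # the phonetic form has moved elsewhere


def lost_phonological_features(can_be_overt: bool, site: Site) -> Optional[bool]:
    """True: the head lost its phonological features in the syntax
    (a null weak head). False: it did not. None: the test is inconclusive
    because the head is never phonetically realized."""
    if not can_be_overt:
        return None
    return site in (Site.EMPTY, Site.MOVED)


# C can surface as "that" (9a) but is empty in (9b).
print(lost_phonological_features(True, Site.OVERT))   # False, as in (9a)
print(lost_phonological_features(True, Site.EMPTY))   # True, as in (9b)
# A head that never has a phonetic form: no conclusion can be drawn.
print(lost_phonological_features(False, Site.EMPTY))  # None
```

The three-valued result keeps "not a null weak head" apart from "cannot be determined", which is the distinction the discussion of (10) relies on.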
The next point is related to the labeling of weak heads. Chomsky (2015) proposes that the weak head can have its mother node labeled by carrying out feature matching with a phrase. Specifically, although the overt weak head cannot provide a label for its mother node, it can help its mother node get labeled by undergoing feature matching with another phrase in its Spec. How then can the null weak head get its mother node labeled? As a kind of weak head, the null head should try to have its mother node labeled with the help of feature matching, as overt weak heads do. Given that LA is a minimal search carried out in a local domain (Chomsky, 2013, 2015), the null weak head should try to find a constituent in its local domain to carry out feature matching so its mother node can be labeled. As stated above, the overt weak head relies on feature matching with the phrase in its Spec to have its mother node labeled. Since the overt weak head can become null, as in (12), the null weak head must resort to another kind of feature matching to have its mother node labeled. Otherwise, there will be no difference between the null and the overt weak head, which runs against the assumption that phonological features play a crucial role in the LA.
(12) I suggest that he (should) go to the library right now.
Then what kind of feature matching can a null weak head rely on to have its mother node labeled? In the local domain, a head is likely to have two kinds of feature matching: One is to perform feature matching with the phrase in its Spec, and the other is to perform c-selection feature matching with its complement (Chomsky, 1995). Since the feature matching between the null weak head and its Spec is unable to have its mother node labeled, the null weak head has to rely on c-selection feature matching to have its mother node labeled.
We think that there are some factors in favor of this assumption. Firstly, almost all heads can be empty, that is, every head is likely to turn into a null weak head, and the feature shared by every head and the phrase in its local domain should be c-selection feature matching. For example, T in (12) is null, namely a null weak head. The complement of null T is a VP/v*P, which can undergo c-selection feature matching with T to have their mother node labeled. Accordingly, (12) is expected to be grammatical. By contrast, the complement of T in (13) is a ParticipleP, which cannot succeed in undergoing c-selection feature matching with the null head T. As a result, a labeling failure ensues.
(13) *I suggest that she going to the school.
Secondly, many scholars like Seely (2006), Bošković (2016b), and Narita and Fukui (2022) argue convincingly that c-selection is indispensable in the minimalist syntax. Besides, Chomsky (1995, p. 247) also argues that when a head and a phrase are merged, category feature checking is necessary.
Based on the discussion above, we can claim that the null weak head cannot label, but it can have its mother node labeled by carrying out c-selection feature matching with its complement. If we carefully scrutinize null weak heads, we can see that they should be divided into two types. (a) The overt weak heads become null owing to loss of phonological features in the syntax. The overt weak head like English T must undergo feature matching with the phrase in its Spec to have its mother node labeled. Once it becomes null, its identity should not be altered. For example, after the head T becomes null, although its syntactic properties may be affected, it is still T (i.e., we cannot argue that once T is null, it becomes a V or other heads for that matter). Therefore, feature matching between the null T and the phrase in Spec-TP is still necessary. To be exact, the overt weak T relies on phi-feature matching to have its mother node labeled. Then the null weak T should rely on both phi-feature matching and c-selection feature matching to have its mother node labeled. This assumption of multiple matching might be reasonable because many scholars also propose that multiple agreement is possible in syntax; see Hiraiwa (2004), Anagnostopoulou (2005), and Nevins (2011), and particularly Béjar and Rezac (2009). (b) The overt strong heads become null owing to loss of phonological features in the syntax. The overt strong head can label on its own. Then the null version of the strong head can carry out c-selection feature matching to have its mother node labeled. In the following, for ease of illustration, we will not differentiate these two types of weak heads, and assume that the null weak heads can have their mother nodes labeled by carrying out c-selection feature matching with their complements.
To sum up, in this section we develop Chomsky’s (2015) labeling theory by proposing that there is another version of weak head, namely the head that loses phonological features in the syntax. This kind of weak head cannot provide a label for its mother node, but it can have its mother node labeled by undergoing c-selection feature matching with its complement. We also provide a diagnostic method to determine whether a head has lost phonological features in the syntax or not. In the following section we will test whether our proposal can lead to novel predictions about linguistic facts, provide a new perspective that uncovers the rule underlying different phenomena, or offer a better account of linguistic facts than previous research. If our hypothesis can achieve any or all of these goals, it can be said to be both theoretically and empirically superior.
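The labeling options summarized in this section can be collected into one decision procedure. The sketch below is a toy rendering under explicit assumptions: `Head`, `mother_can_be_labeled`, and the category strings are hypothetical names introduced for illustration, c-selection is modeled as simple set membership, and the function reports only whether labeling succeeds, since the text does not specify what the resulting label is for a null weak head.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Head:
    category: str
    selects: frozenset              # categories the head c-selects (assumed)
    inherently_weak: bool = False   # weak in Chomsky's (2015) sense
    has_phon: bool = True           # False once phonological features are lost


def mother_can_be_labeled(head: Head, complement: Optional[str],
                          spec_matches: bool = False) -> bool:
    """complement is None when the complement is elided or has moved away."""
    if head.has_phon:
        # overt strong head labels alone; overt weak head needs its Spec
        return spec_matches if head.inherently_weak else True
    # null weak head, definition (8): c-selection matching is required
    if complement not in head.selects:
        return False
    # type (a): a null version of a weak head still needs Spec matching too
    return spec_matches if head.inherently_weak else True


null_t = Head("T", frozenset({"VP", "v*P"}), inherently_weak=True, has_phon=False)

print(mother_can_be_labeled(null_t, "v*P", spec_matches=True))          # True, cf. (12)
print(mother_can_be_labeled(null_t, "ParticipleP", spec_matches=True))  # False, cf. (13)
print(mother_can_be_labeled(null_t, None, spec_matches=True))           # False
```

The last call anticipates the configurations examined in the next section, where the complement of a null head is itself an empty category.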
The Distribution of Empty Categories in Null Head Constructions
In this section, we study the distribution of empty categories in many constructions with our definition of weak heads and Chomsky’s (2013, 2015) constraint that a structure must be labeled for interpretation.
Null T Constructions
It has been noted that an elided VP must be preceded by an auxiliary verb, as can be exemplified in (14) and (15) (See Aelbrecht & Haegeman, 2012; Aelbrecht & Harwood, 2015; Bresnan, 1976; Johnson, 2001; Lobeck, 1995; Zagona, 1988 for more examples).
(14) a. Jane hasn’t eaten any rutabagas and Holly hasn’t either.
b. Mag Wildwood wants to read Fred’s story, and I also want to.
c. John wants to go on vacation, but he doesn’t know when to.
(15) a. I thought the auxiliary hadn’t disappeared, but it *(had).
b. *I can’t believe Holly Golightly won’t eat rutabagas. I can’t believe Fred, either.
c. John didn’t go because he did want *(to).
If there is no auxiliary verb in the clause, that is, if the elided verb is finite, the dummy auxiliary
(16) The chicken didn’t put the tuna on the table, but the penguin did.
Johnson (2001) observes that the trace of topicalized VP must be governed by an overt auxiliary, too, which is shown clearly in the contrast between (17) and (18). In (17) the trace of topicalized VP is governed by an overt auxiliary, and the relevant examples are grammatical. By contrast, the auxiliary together with VP is topicalized in (18). In other words, the trace of VP is governed by the trace of an auxiliary rather than the overt auxiliary itself. On this occasion, the relevant examples are ungrammatical.
(17) Madame Spanella claimed that . . .
a. eat rutabagas, Holly wouldn’t t.
b. eaten rutabagas, Holly hasn’t t.
c. eating rutabagas, Holly should be t.
(18) Madame Spanella claimed that . . .
a. *would eat rutabagas, Holly t.
b. *hasn’t eaten rutabagas, Holly t.
The following sentences lend stronger support to the idea that both the elided VP and the trace of topicalized VP must be preceded by an overt auxiliary. The auxiliary in (19a) is optional. It can be seen in (19b) and (19c) that once the auxiliary is empty, VP cannot be elided or fronted.
(19) a. They requested that he (should) sing a song. b. *They requested that he sing a song and she also. c. *Sing a song, they requested that he.
(20) indicates that the same is true of Chinese. (20a) shows that if the auxiliary becomes covert, the VP cannot be elided, and (20b) suggests that if the auxiliary is also topicalized together with VP, the sentence will be ungrammatical.
(20) a. Zhangsan say he can speak English Lisi also can speak English.
‘Zhangsan said he could speak English, and Lisi could speak English, too.’
b. * Zhangsan say can speak English Lisi
‘Zhangsan said that Lisi was able to speak English.’
Under our approach, the generalization that the ellipsis of VP or the trace of the topicalized VP must be preceded by an overt auxiliary can be accounted for. First consider (14) and (16). Since T is realized either as an auxiliary or the dummy auxiliary
(21) 
By contrast, owing to lack of phonological features, T in (15) is a null weak head, incapable of providing a label for the structure marked as “?” in (22). It must undergo c-selection feature matching with the phrase in its complement position to get its mother node labeled. However, VP has been elided in the narrow syntax (Baltin, 2012; Park, 2017). After VP is elided, the ellipsis site will be like a pro-form (Baltin, 2012). This pro-form cannot participate in the LA because it has no specific category feature. Whether the phrase is originally a VP, a DP, or an NP, after it is elided, it will be a pro-form, whose contents are recovered at LF/interface. In addition,
(22) 
Examples (17) to (19) can be explained in the same way. If the auxiliary also moves to Spec-CP together with VP, the head T will be a null weak head when the LA starts to work as part of the spell-out operation. As a null weak head, it must undergo c-selection feature matching with the phrase in the complement of T. However, at this moment, the null weak head T cannot undergo feature matching with its complement because the latter has moved to Spec-CP, a position too far away from the null T (see Chomsky, 2015, for a similar account of the ungrammaticality of [1a]). Consequently, the structure formed by the trace of the moved T and the trace of the VP cannot be properly labeled (see [22]), and ungrammaticality ensues. In (20), if the T in Chinese loses its phonological features in the syntax, it will be a null weak head. Topicalization or ellipsis of VP will be ungrammatical for the same reason as in the case of (17) to (19).
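The predictions for the English examples in this subsection can be checked against the reported judgments in a small table. This is only a sketch with simplifying assumptions: the names `Case` and `predicted_ok` are hypothetical, an overt T is assumed to be labeled through matching with the subject in its Spec, and a null T is assumed to succeed only when its complement is still in place with its category intact. The judgments are those given in (14) to (19).

```python
from typing import NamedTuple


class Case(NamedTuple):
    example: str
    t_overt: bool       # does the T position keep its phonological features?
    complement: str     # "in situ", "elided", or "moved"
    grammatical: bool   # judgment reported in the text


def predicted_ok(t_overt: bool, complement: str) -> bool:
    # Overt T: labeled via its Spec (assumed). Null T: needs c-selection
    # matching, which requires a complement that is still in place.
    return t_overt or complement == "in situ"


CASES = [
    Case("(14a)", True, "elided", True),
    Case("(15a) without 'had'", False, "elided", False),
    Case("(16)", True, "elided", True),
    Case("(17a)", True, "moved", True),
    Case("(18a)", False, "moved", False),
    Case("(19a) without 'should'", False, "in situ", True),
    Case("(19b)", False, "elided", False),
    Case("(19c)", False, "moved", False),
]

for case in CASES:
    verdict = "ok" if predicted_ok(case.t_overt, case.complement) else "*"
    agrees = predicted_ok(case.t_overt, case.complement) == case.grammatical
    print(f"{case.example}: predicted {verdict}, matches judgment: {agrees}")
```

In (18a) the auxiliary is fronted together with VP, so the T position is coded as lacking phonological features even though the auxiliary is pronounced elsewhere.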
Sentences like (23) are also in favor of our assumption. Even if no overt auxiliary is available after the complementizer
(23) Ted hoped to vacation in Liberia but his agent recommended that he not.
Under our approach the following phenomenon can also be explained.
(24) a. Is she a teacher? *Yes, she’s.
b. Who’s the tallest girl in our university? *Mary’s.
(25) a. Is she a teacher? Yes, she is.
b. Who’s the tallest girl in our university? *Mary’s.
Before we move to another kind of null head, it needs to be clarified that stripping is not a challenge to our assumption. We agree with Lobeck (1995, p. 27) in assuming that ellipsis is different from stripping, as (26) shows.
(26) Jane loves to study rocks and John [e] too.
Following Haegeman and Lohndal (2015), who recast Johnson (2014), we can assume that both
The fragment answer shown in (27) is not a challenge to our analysis either. Following Stainton (2006), we can assume that there is no ellipsis involved in this construction. Alternatively, we can follow Merchant (2005) in assuming that the fragment moves to the clause periphery first before the TP structure is elided. See also section 3.5 for discussion about the head C.
(27) Who finished this task first? John.
Null Verb Constructions
Verbs can be empty under certain circumstances. In such a case, the object of the verb cannot be relativized, topicalized, or elided, which is exemplified by (28) and (29).
(28) a. I think the students in MIT have all arrived, but I do not know whether our guests have both *(arrived).
b. These books, we must have all read, but I do not know whether our guests have both *(read).
c. The books that we must have all read are written by Chomsky, and the one that our guests have both *(read) is written by Chomsky, too.
(29) a. these city we some go-ASP they all all go-ASP
‘As for these cities, we have been to some of them, but they have been to all of them.’
b. we some person go-ASP these city they all all go-ASP
‘Only some of us have been to these cities, but all of them have been to there.’
The above facts can be captured neatly under our assumptions. These sentences are similar in that there is a quantifier in front of the empty categories left by VP. As argued convincingly by Sportiche (1988), the quantifier is adjacent to the NP with which it is merged before the latter undergoes movement. Considering the VP-internal subject hypothesis (Diesing, 1990; Kitagawa, 1986) and the requirement of (30), we anticipate that the position occupied by the stranded quantifier is Spec-v*P, the base position of the subject (see also Bonet, 1990). In Government and Binding terms, the functional category v* serves as the head of v*P.
(30) Functional heads may be empty. But the abstract functional features must be licensed by the lexical material in its Specifier, and vice versa (Gasde & Paul, 1996, p. 265).
Nevertheless, v* does not have phonological features in the LA in (28) and (29) because the phonological features of v* and V are elided before LA starts to work (see also our diagnostic test for null weak heads). Put differently, it is a null weak head. Then, it must undergo c-selection feature matching with the phrase in its complement position to get its mother node labeled. However, its complement (namely VP) has been elided before LA starts (see Johnson, 2001; Merchant, 2001; Schuyler, 2001, among others, for the assumption that structures like (28) and (29) are VP ellipsis constructions). Consequently, feature matching becomes impossible and the empty category will be in an unlabeled structure at the interface. The unlabeled structure is marked as “?,” which is illustrated in (31).
(31) 
Now let us consider the following sentence, which can be used in specific contexts, for example, when I am distributing fruit to students. It is a null verb construction rather than a typical gapping construction because the verbs in both conjuncts are empty. Therefore, it is a perfect example to test the effect of a null verb.
(32) you three-CL apple he four-CL banana ‘You have three apples and he has four bananas.’
(33) shows that the object cannot be topicalized or elided in such a case. Put differently, if the verb is null, the object position cannot turn out to be empty, which falls right into place under our approach.
(33) a. you three-CL apple he also
‘You have three apples and he also has three apples.’
b. you three-CL apple four-CL banana he
‘You have three apples and as for the four bananas, he will have them.’
c. * three-CL apple you four-CL banana he
‘You have three apples and he has four bananas.’
The following is the corresponding sentence of (32) in which the verb is overt/not omitted.
(34) you eat three-CL apple he eat four-CL banana ‘You have three apples and he has four bananas.’
As (35) shows, on this occasion, the complement of the verb can be empty, which lends stronger support to our hypothesis, that is, it is the phonological features of the verb that make all the differences.
(35) a. you eat three-CL apple he also eat
‘You have three apples and he also has three apples.’
b. you eat three-CL apple four-CL banana he eat
‘You have three apples and as for the four bananas, he will have them.’
c. three-CL apple you have four-CL banana he eat
‘As for the three apples, you eat them; and as for the four bananas, he will eat them.’
Above, it is shown that when the phonological features on v*/V are erased, its complement position cannot be empty. Otherwise, there will be a labeling failure, and the sentence will be ungrammatical, accordingly. In the above examples, most of the verbs are transitive. Actually, even if the verb is intransitive, it cannot be elided either, which is demonstrated below.
(36) a. they all smile-ASP we also all all smile-ASP
‘They all smiled. So did we.’
b. they all leave-ASP you also can
‘They all left. So can you.’
This is what we expect, and the structure shown in (31) is applicable to (36a). With the phrase
Null Cl Constructions
In Mandarin Chinese, the classifier can be omitted in a specific circumstance; that is, the number is one, and the classifier is
(37) I not very love this (CL) child. ‘I do not like this child very much.’
Classifier omission is much more common in Beijing Dialect (Jin, 1995; Zhu, 1982, p. 220), which can be exemplified by (38).
(38) here have one (CL) desk ‘There is a desk here.’
One piece of evidence in support of the null classifier hypothesis in (38) comes from tone sandhi.
Interestingly, if the classifier
(39) a. I not like this child also not like that
‘I like neither this child nor that one.’
b. child I like this
‘As for the children, I like this one.’
c. I like this DE child
‘the child that I like’
(40) a. I not like this-CL child also not like that-CL
‘I like neither this child nor that one.’
b. child I like this-CL
‘As for the children, I like this one.’
It cannot be said that the sentences in (39) are unacceptable because the demonstrative pronoun
(41) you not like this also not like that you really like what ‘You do not like this, and you do not like that. What on earth is your favorite?’
The above fact is expected. If the classifier has phonological features, it will be a strong head, thereby providing a label for the structure formed by the merger of CL and NP. By contrast, when the classifier is empty, it will be weak. Once its complement becomes an empty category, the whole structure will lack a label, resulting in a crash at the interface.
The following phrase can also lend support to our assumption.
(42) a. one boy
‘one boy’
b. a sound thrashing
‘a sound thrashing’
(42a) is grammatical because the null classifier only c-selects an individual noun. Being an individual noun,
Null de Constructions
In Chinese,
Null de used to connect a possessee with a possessor
(43) I like you DE school
‘I like your school.’
(44) a. I like you DE school too like them DE school
‘I like your school and I like theirs, too.’
b. school I like them DE
‘As for the school, I like theirs.’
(45) a. I like you school too like them school
‘I like your school and I like theirs, too.’
b. * school I like them
‘As for the school, I like theirs.’
With our assumptions, the above data can fall into place. First let us see (44). With an overt form,
(46)
Perhaps, one may ask whether the possessive phrases without
(47) Zhangsan DE that-CL book ruin-ASP him/himself/himself
‘That book of Zhangsan ruins him.’
(48) * he Zhangsan DE father not like
‘Zhangsani’s father does not like himi.’
Furthermore, the possessor is obligatory even if there is no overt
(49) * he father like linguistics ‘His father likes linguistics.’
De used to connect two phrases without possessive relation
Similar to the cases in which
(50) table I like wood/red/red DE ‘Speaking of the tables, I like the wooden/red one.’
book I like about history DE ‘Speaking of the books, I like the one about history.’
computer I like you buy DE ‘Speaking of the computers, I like the one you bought.’
In Mandarin Chinese, the complement of
(51) I am drive car DE ‘I am a driver.’
These facts can be accounted for easily along the above lines. Without phonological features, the head
Other Heads
In this section, we will have a brief discussion of other heads, some of which might appear to challenge our assumptions. As was previously argued, a head that loses phonological features in the syntax is weak. As a null weak head, it is incapable of providing a label for its mother node. If its complement is also empty, the empty category will be in an unlabeled structure. The empty category cannot be identified at the interface, and a crash will show up. However, the sluicing phenomenon shown in (52) seems to be contrary to our prediction.
(52) He is writing something, but you can’t imagine what. (Ross, 1967, p. 252)
It can be seen that in the embedded clause of the second conjunct in (52), the head C is empty and the complement of C is elided, but this sentence turns out to be acceptable. Why is it that the head C can still provide a label for the structure?
Instead of being a challenge to our hypothesis, the sluicing phenomena fall right into place under our assumption. In what follows, we adopt Merchant’s (2001) analysis of sluicing for illustration because it is the most influential (see Abe, 2015; Lasnik, 2001, among many others). Based on Merchant (2001), (52) should be derived by moving
(53) You can’t imagine [CP what CQ [TP he is writing what]].
Now it will be clear why sluicing is not a challenge to our hypothesis. C has phonological features. However, such features never get phonetic forms in sluicing constructions. This produces an illusion that the phonological features on C get erased and become empty in the narrow syntax. In fact, the phonological features of C remain intact throughout the derivation. Since C is a strong head in sluicing constructions, it will be able to serve as a label. As a result, the empty category of TP will be in a labeled structure. Nothing is wrong with sentences like (52).
The following sentences are in support of our assumption.
(54) A: Max has invited someone. B: Really? Who (*has)? (Abe, 2015, p. 17)
(55) En eller anden snakker med Marit, men vi ved ikke hvem (*der). (Danish)
someone talks with Marit but we know not who C0
‘Someone talks with Marit, but we do not know who.’ (Merchant, 2001, p. 68)
As can be seen, a wh-word and an overt C never co-occur in sluicing in English and other Germanic languages. This leads Merchant (2001) to the following generalization.
(56) Sluicing-COMP generalization: In sluicing, no non-operator material may appear in COMP.
This generalization shows that the phonological features of C in the sluicing constructions never get phonetic forms at PF. Given that phonological features on C are not elided in the syntax, C will be a strong head, capable of providing a label for its mother node.
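The reasoning about strong and weak heads can be restated as a small decision procedure. The following Python sketch is only an illustration under simplifying assumptions: the names `Head`, `Complement`, and `label` are hypothetical, and the string returned for labeling by feature matching is a placeholder notation rather than a claim about the actual label. It encodes three points made above: a head that keeps its phonological features in the syntax is strong, whether or not those features are pronounced at PF; a null weak head can have its structure labeled only if its complement is not empty; and a null weak head with an empty complement leaves the structure unlabeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Head:
    category: str             # e.g. "C" or "D"
    has_phon_features: bool   # phonological features still present in the narrow syntax
    pronounced_at_pf: bool    # whether those features receive a phonetic form at PF

    @property
    def is_strong(self) -> bool:
        # Strength depends on features in the syntax, not on pronunciation at PF.
        return self.has_phon_features


@dataclass
class Complement:
    category: str
    is_empty: bool            # True for an elided or otherwise empty category


def label(head: Head, complement: Complement) -> Optional[str]:
    """Return a label for {head, complement}, or None if the structure is unlabeled."""
    if head.is_strong:
        return head.category
    if not complement.is_empty:
        # Placeholder notation (an assumption) for labeling via feature matching.
        return f"<{head.category},{complement.category}>"
    # Null weak head with an empty complement: no label, so the derivation crashes.
    return None


# Sluicing as in (52): C keeps its features but is silent at PF, and TP is elided.
sluicing_c = Head("C", has_phon_features=True, pronounced_at_pf=False)
elided_tp = Complement("T", is_empty=True)
print(label(sluicing_c, elided_tp))
```

On this sketch, the sluicing configuration is labeled by C even though both C and TP are silent, because `pronounced_at_pf` plays no role in deciding strength.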
(9), repeated as (57), can also be accounted for easily. In (57a) the head C can be realized as
(57) a. I think that you are right. b. I think you are right.
After this brief discussion of the head C, let us turn to the head D. The phonological features of D can have overt phonetic form in English, as (58) shows. In this sentence, D should be a strong head, which can provide a label for its mother node. Therefore, nothing is wrong with it.
(58) I like the books.
What is interesting is (59). D is able to have a phonetic form, but it does not, which suggests that D in (59) should be a null weak head. Based on feature matching with its complement, D can have its mother node labeled. Thus, no structure in (59) is improperly labeled, and (59) is expected to be grammatical.
(59) I like books.
The following sentence supports our assumption.
(60) I like new DE book ‘I like new books.’
This is really the case. The phrase
With the above discussion in place, let us consider null preposition constructions, which, in our view, lend support to our assumptions. Consider (61).
(61) a. Where has John been? b. Which place has John been *(to)?
In (61a), the wh-word
(62) a. Where did John fly to? He flew to Germany. (Miyagawa, 2017, p. 132) b. *Where did John fly? He flew to Germany.
With the overt preposition
Before concluding this section, we need to point out that there are alternative accounts for certain phenomena listed in this paper, such as (14) and (15); see, for example, Lobeck (1995), Merchant (2001, 2005), and Aelbrecht (2010). However, compared with them, our approach has obvious advantages. First, citing Miyagawa et al. (2019), we can say that “the advantage, we believe, of our approach is that we suggest a unified way to view the multitude of phenomena that require separate explanations.” Second, under our approach, many new data can be uncovered (see sections 3.2 and 3.3, among others). Third, many facts in this paper cannot be explained with the existing hypotheses. Let us take Aelbrecht (2010) as an example. According to her, VP ellipsis is licensed by an ellipsis feature on Voice (namely, the [E]-feature), which must be checked against the head T via Agree. Although this can explain the contrast shown in (14) and (15), it is hard to account for the fact shown in (19b), where the auxiliary should be optional. Under this approach, it remains a mystery why the phonological features of the auxiliary can affect the ellipsis of VP. Moreover, it is hard to explain Chinese ellipsis, as in (20), where no overt agreement features or inflectional variations are observable. Furthermore, it cannot be extended to the movement and ellipsis constraints shown in (19c), (24), and (25), or to the facts in sections 3.2 to 3.4. Interestingly, once our assumption is adopted, all these facts fall right into place.
Conclusion
Chomsky (2015) argues that there are two kinds of weak heads: lexical heads and T in languages with poor inflectional morphology. In this paper, we follow Richards (2016, 2020) in proposing that there is another kind of weak head, namely the head that loses its phonological features in the syntax. This approach to weak heads, together with the constraint that a structure must be labeled for interpretation, can explain the distribution of empty categories in different kinds of constructions. Thus, our syntax-phonology perspective presents a new approach to labeling and opens up new possibilities to account for the distribution of empty categories in a principled way.
