# Week 5. Grammar II: Syntax

`````{admonition} TL;DR [[slides](https://docs.google.com/presentation/d/1wsjv8PVLyPvXI27o69clMr7U-yTkJINJaKPn76EvczQ/edit?usp=sharing)]
:class: note
- Syntax: putting words together
	- Grammatical vs. ungrammatical sentences
	- Words in linear order
	- Syntax beyond linearization
- Syntax and language technology
	- Syntactic parsing: a classic task
    - But do we need syntax at all? Maybe not as much as before, but probably still yes
`````

## Putting words together

Last week, we talked about words and their parts, morphemes: how morphemes are combined together, what kinds of morphemes exist and so on. Today, we move one level up and discuss how words can be put together to form phrases and sentences. 

`````{admonition} Important notion
:class: warning
**Syntax** studies how words combine into larger units like phrases and sentences.  
`````

This seems very similar to morphology -- there are objects (morphemes) that combine together to make words; there are objects (words) that combine together to make sentences, kind of like this:


```{image} ./images/syntmorph.jpeg
:alt: double structure
:class: bg-primary mb-1
:width: 350px
:align: center
```

Aren't syntax and morphology the same thing then, apart from working with different units? Why study them separately at all? That's a fair question. How pressing it is differs from language to language, and it boils down to how many distinct processes work at the level of words but not at the level of sentences, and the other way around.

`````{admonition} A tiny question along the way
:class: attention 
We discussed an example of such a process last week for Turkish: it works at the level of individual words but not above. Can you remember what it was?
`````

### Infinite sequences, finite means

There is one interesting way in which morphology and syntax seem to diverge from each other. In introductions to syntax, one often finds syntax described as a **tool to build infinite sequences from finite means**. This means two things: 

- The number of sentences one can build in a language is potentially infinite;
- The length of each individual sentence is also in principle unconstrained: if I had infinite time at my disposal, I could give you an infinite sentence.

The first point can be illustrated by the fact that we constantly come up with sentences that have never been said before. Here is one: _Lisa Bylinina will teach in Groningen this week_ -- I am pretty sure it's a brand new sentence, but it's part of the English language, in the sense that you recognize it as well-formed and you know what it means. So making and processing new sentences is clearly a core part of knowing and using language. With words, it does not really work this way -- I can come up with a new word that nobody has seen before (here's one: _braplomindew_), but it's not very likely that people will understand me, and unless more and more people start using it the same way I did, it will not really become part of the language. 

What about word vs. sentence length? We saw one pretty long word last week:

<center><big><span style="color:DeepPink"><b>anti</b></span><span style="color:Indigo"><b>dis</b></span><span style="color:Crimson"><b>establish</b></span><span style="color:ForestGreen"><b>ment</b></span><span style="color:DarkSlateGray"><b>ari</b></span><span style="color:Purple"><b>an</b></span><span style="color:SteelBlue"><b>ism</b></span></big></center>
<br>

How much longer can it get? With this particular one, it's easy: we can repeat the prefix _anti-_ potentially indefinitely with a meaningful result, even though after a while we might struggle to see what exactly it's supposed to mean:

<center><big>...<span style="color:Purple"><b>anti</b></span><span style="color:SteelBlue"><b>anti</b></span><span style="color:DeepPink"><b>anti</b></span><span style="color:Indigo"><b>dis</b></span><span style="color:Crimson"><b>establish</b></span><span style="color:ForestGreen"><b>ment</b></span><span style="color:DarkSlateGray"><b>ari</b></span><span style="color:Purple"><b>an</b></span><span style="color:SteelBlue"><b>ism</b></span></big></center>
<br>

Similarly with sentences, in some cases we can infinitely repeat some parts of sentences and get potentially unbounded sentences:

<center><big>
This sentence will go on and on and on and on and on and on...
</big></center>
<br>

But with sentences, the sources of this potential infinity are easier to come by and more diverse than that. For example, a sentence can contain another sentence as its part. Once we see that grammar allows this, the possibility of potentially infinite embedding becomes obvious: a sentence contains a sentence, which contains another sentence, and so on and so forth (a property of a system known as **recursion**):

<center><big>
[<sub>S</sub> I called my friend ]<br>
[<sub>S</sub> I called my friend [<sub>S'</sub> who knows my colleague ] ]<br>
[<sub>S</sub> I called my friend [<sub>S'</sub> who knows my colleague [<sub>S''</sub> who is away this semester ] ] ]<br>
...
</big></center>
<br>
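
The embedding pattern above can be mimicked with a tiny recursive function. This is only a toy sketch: the function name `embed` and the list of relative clauses (including the third, made-up one) are assumptions for illustration, and a real grammar is of course far richer.

```python
# Toy illustration of recursion: a sentence that contains a clause,
# which in turn contains another clause, and so on.
# The clause list and the function name are illustrative assumptions.
RELATIVE_CLAUSES = [
    "who knows my colleague",
    "who is away this semester",
    "who teaches syntax",
]


def embed(depth: int) -> str:
    """Return a bracketed sentence with `depth` levels of embedded clauses."""

    def clause(level: int) -> str:
        # Base case: nothing left to embed.
        if level == depth:
            return ""
        text = RELATIVE_CLAUSES[level % len(RELATIVE_CLAUSES)]
        # Recursive step: this clause may itself contain a further clause.
        return f" [{text}{clause(level + 1)}]"

    return f"[I called my friend{clause(0)}]"


for d in range(3):
    print(embed(d))
```

Nothing in the function limits `depth`: the same finite list of clauses yields ever longer sentences, which is the point of "infinite sequences from finite means".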

The thought here is this: it looks like there are some deep differences between combining morphemes into words and combining words into sentences. These differences are not always clear-cut, but there is a real contrast: word combinations are free and creative in a way that combinations of morphemes are not. Syntax gives you combinatorial freedom that goes beyond the combinatorial freedom of morphology.

If you are not totally convinced by this, think about dictionaries! Dictionaries are attempts to list all words, or maybe as many words as possible. We don't really have dictionaries of sentences -- and intuitively, it's clear why. There are just too many, and it's easier to make and analyze them on the fly, with no need to list them. Similarly, think about things like longest word challenges, where people compete to come up with words that are longer than what other people came up with. If the combinatorial power of word formation were as great as that of sentence formation, such challenges wouldn't exist. But, to be fair, I looked it up now and there are longest sentence challenges, go figure.

Long story short, sentences are built from words, there is a potentially infinite set of sentences, and words can be treated as a finite set, as a bit of a simplification. So, syntax builds infinite sequences from finite means.


### Words in the right order

Let's start by stating the maybe obvious fact that language has rules of combining words with each other, and not all word combinations in all orders result in phrases or sentences that the language allows. For instance, this sentence below is completely fine in English:

<center><big>The girl likes the book.</big></center>
<br>

```{margin} 
The asterisk in front of the sentence is used to indicate ungrammaticality.
```
But arranging the very same words in different order might result in a sentence that sounds bad, or ill-formed, or what's called **ungrammatical** -- a speaker of English wouldn't say this:

<center><big><sup>*</sup>Likes the girl the book.</big></center>
<br>

In the first sentence, the word order was Subject -- Verb -- Object (**SVO**), while in the second, ungrammatical, sentence the order is **VSO**, which is not allowed in English. But not all languages work like English in this respect. In Welsh, for example, VSO order is exactly how you put together a sentence:

```{margin} 
Welsh. Example from Borsley, R.D., Tallerman, M. and Willis, D. 2007. _The syntax of Welsh_. Cambridge University Press.
```
<div id='outerTable'><table>
  <tr><td>(1)&nbsp;&nbsp;&nbsp;</td><td>Prynodd</td><td>Elin</td><td>dorth</td><td>o</td><td>fara.</td></tr>
  <tr><td></td><td>buy.PAST.3S&nbsp;&nbsp;&nbsp;</td><td>Elin&nbsp;&nbsp;&nbsp;</td><td>loaf&nbsp;&nbsp;&nbsp;</td><td>of</td><td>bread&nbsp;&nbsp;&nbsp;</td></tr>
  <tr><td></td><td colspan=5>'Elin bought a loaf of bread.'</td></tr>
</table></div>
<br>

In fact, the English word order is not even the most widespread word order among languages of the world, as far as we know. The most widespread group is **SOV** languages:


```{margin} 
Data from [WALS](https://wals.info/feature/81A#2/18.0/152.9).
```
```{image} ./images/svo.png
:alt: word orders
:class: bg-primary mb-1
:width: 500px
:align: center
```
Some examples of SOV languages are Turkish, Japanese and Korean. Here's an illustrative Japanese example:

```{margin} 
Japanese
```
<div id='outerTable'><table>
  <tr><td>(2)&nbsp;&nbsp;&nbsp;</td><td>Watashi&nbsp;&nbsp;&nbsp;</td><td>wa&nbsp;&nbsp;&nbsp;</td><td>hon&nbsp;&nbsp;&nbsp;</td><td>o</td><td>yomimasu.&nbsp;&nbsp;&nbsp;</td></tr>
  <tr><td></td><td>I</td><td>TOP&nbsp;&nbsp;&nbsp;</td><td>book&nbsp;&nbsp;&nbsp;</td><td>ACC&nbsp;&nbsp;&nbsp;</td><td>read&nbsp;&nbsp;&nbsp;</td></tr>
  <tr><td></td><td colspan=5>'I read the book'</td></tr>
</table></div>
<br>
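
To make the three basic orders concrete, here is a minimal sketch that linearizes the same subject, verb and object under different order settings. The function name `linearize` and the use of English words as stand-ins for every order are assumptions made purely for illustration.

```python
# Toy sketch: the same three constituents linearized under different
# basic word orders. English words are used as stand-ins throughout;
# the function name and data layout are illustrative assumptions.
def linearize(subject: str, verb: str, obj: str, order: str) -> str:
    """Arrange subject, verb and object following an order such as 'SVO'."""
    slots = {"S": subject, "V": verb, "O": obj}
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"not a basic word order: {order!r}")
    return " ".join(slots[symbol] for symbol in order)


for order in ("SVO", "VSO", "SOV"):
    print(order, "->", linearize("the girl", "likes", "the book", order))
```

Only the SVO output is a grammatical English sentence; the VSO and SOV outputs show the linear patterns that Welsh and Japanese, respectively, use for their own words.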


`````{admonition} Oversimplification alert!
:class: warning, dropdown
According to this classification, languages like Dutch seem to fall in the SVO category. In simple sentences, it's correct:

<div id='outerTable'><table>
  <tr><td>Riny</td><td>vindt</td><td>linguistiek</td><td>leuk</td></tr>
  <tr><td>subject&nbsp;&nbsp;&nbsp;</td><td>verb&nbsp;&nbsp;&nbsp;</td><td>object&nbsp;&nbsp;&nbsp;</td><td></td></tr>
</table></div>
<br>

But when we are dealing with a complex sentence, the embedded one has SOV order:

<div id='outerTable'><table>
  <tr><td>Thijs</td><td>vertelde</td><td>(aan) jou</td><td>dat&nbsp;&nbsp;&nbsp;</td><td>Riny</td><td>linguistiek</td><td>leuk</td><td>vindt</td></tr>
  <tr><td>subject&nbsp;&nbsp;&nbsp;</td><td>verb&nbsp;&nbsp;&nbsp;</td><td>object&nbsp;&nbsp;&nbsp;</td><td></td><td>subject&nbsp;&nbsp;&nbsp;</td><td>object&nbsp;&nbsp;&nbsp;</td><td></td><td>verb</td></tr>
</table></div>
<br>

Also, it's not like the order is free or unfixed -- it's fixed, but in different ways in different constructions! This is the case for many languages in this classification. We will ignore this fact.
`````

Languages constrain word order not only between the verb and its subject and/or object. Languages differ, for example, also in the following (this is not an exhaustive list!):


1. Whether language has prepositions or postpositions (that is, the linear placement of the **adpositions**) ([see distribution across languages](https://wals.info/feature/85A#2/16.0/170.9))

|         | preposition | postposition |
|---------|:-----------:|:------------:|
| English | with Anna   | <sup>*</sup>Anna with   |
| Turkish | <sup>*</sup>ile Anna   | Anna ile     |

2. The position of the possessor with respect to the possessee ([see distribution](https://wals.info/feature/86A#2/21.0/152.9)):

|         | Poss-N | N-Poss |
|---------|:-----------:|:------------:|
| English | Anna's book   | <sup>*</sup>book Anna's   |
| Irish | <sup>*</sup>Anna leabhar  | leabhar Anna    |

It's interesting that these different word order parameters are not independent from each other. Linguists have been noticing interactions between them for quite a while now. Here are four relevant [Greenberg Universals](https://en.wikipedia.org/wiki/Greenberg%27s_linguistic_universals) (we've seen one or two linguistic universals before!)

> **Universal 2**: "In languages with prepositions, the genitive almost always follows the governing noun, while in languages with postpositions it almost always precedes."<br><br>
**Universal 3**: "Languages with dominant VSO order are always prepositional."<br><br>
**Universal 4**: "With overwhelmingly greater than chance frequency, languages with normal SOV order are postpositional."<br><br>
**Universal 5**: "If a language has dominant SOV order and the genitive follows the governing noun, then the adjective likewise follows the noun."

Looking at these generalizations -- let's say, at Universal 4 in particular -- one might want to generalize these constraints in a way that makes them fall out of just one rule. For instance, we can say that a language is either **head-final** or **head-initial**: 
- if a verb combines with its object so that the object precedes the verb (as in the SOV order), the adposition combines with its noun so that the noun precedes the adposition (that's like in Turkish);
- and the other way around: if the verb precedes its object, then the adposition precedes its noun (that's English).

That would give us two types of languages where two linear parameters are covered by just one rule:
- **Head-final** languages (e.g. Turkish): O > V; N > ADP;
- **Head-initial** languages (e.g. English): V > O; ADP > N.

But this attempt is complicated by the fact that not all languages fall nicely into one of these patterns, and also by the fact that we have to define what the head is in each of these cases.
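
The idea of one rule covering two linear parameters can be sketched in a few lines. This is an idealization, as just noted: the function names `combine` and `phrases` are hypothetical, English words are used as stand-ins, and which element counts as the head is simply stipulated here.

```python
# Toy sketch of a single head-direction parameter. The two settings
# correspond to the head-initial (English-like) and head-final
# (Turkish-like) types from the text; names and words are assumptions.
def combine(head: str, dependent: str, head_initial: bool) -> str:
    """Place the head before or after its dependent."""
    return f"{head} {dependent}" if head_initial else f"{dependent} {head}"


def phrases(head_initial: bool) -> dict:
    """Build a verb phrase and an adpositional phrase with one setting."""
    return {
        "verb + object": combine("read", "books", head_initial),
        "adposition + noun": combine("with", "Anna", head_initial),
    }


print("head-initial:", phrases(head_initial=True))
print("head-final:  ", phrases(head_initial=False))
```

A single boolean fixes both V > O / ADP > N and O > V / N > ADP, which is exactly why languages that mix the two patterns are a problem for such a simple rule.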

`````{admonition} A tiny question along the way
:class: attention 
English does not conform to one of the generalizations above. Which one and how?
`````

Some word order constraints have received interesting potential explanations grounded in the distribution of objects and their properties in the world. This might be interesting as extra reading if you're curious!

> Culbertson, J., Schouwstra, M. and Kirby, S., 2020. [From the world to word order: deriving biases in noun phrase order from statistical properties of the world](https://www.linguisticsociety.org/sites/default/files/08_96.3Culbertson.pdf). Language, 96(3), pp.696-717.

### More complete recipes for simple sentences

Constraints on word order work both ways: 
- They tell speakers where to put subject and object of the sentence with respect to the verb;
- They tell the person hearing or reading the sentence which parts of the sentence are subjects and objects, based on their linear position.

What if a language does not have fixed word order? As a speaker, this means you can do whatever you want placing different parts of sentences with respect to each other (well, there are often still constraints, but let's ignore that too). But as a listener, how do you know who did what to whom if the participants can appear anywhere in the sentence? Well, grammar governs things beyond just linear order.

Let's look at ways you build a simple sentence in different languages in a way that takes into account things other than word order. We will need to start with a notion of **semantic role**.

`````{admonition} Important notion
:class: warning
**Semantic role**, a.k.a. thematic relation, describes the type of involvement of a participant in an event described by the verb in the sentence.
`````

We will look at just two:
- **Agent** is a participant that initiates or causes the event, typically intentionally, and normally has control over the event.
- **Patient** undergoes the action and changes its state, normally has no control over the course of the event.

In a sentence _The dog attacked the cat_, for example, the dog is the agent and the cat is the patient. There can be events with just one participant, and it can be an agent (_Mary is walking_) or a patient (_John fell_). Note that semantic roles (agent, patient) are not the same as grammatical roles (subject, object). Often, those coincide:


```{image} ./images/active.jpeg
:alt: active construction
:class: bg-primary mb-1
:width: 350px
:align: center
```

But sometimes, for example in the passive construction, they do not: the patient becomes the subject, while the agent is no longer the subject and, in English, shows up (if at all) in a _by_-phrase.

```{image} ./images/passive.jpeg
:alt: passive construction
:class: bg-primary mb-1
:width: 400px
:align: center
```
So, formulating a sentence requires mapping from its meaning-related components (such as participants and their type of participation) to syntax -- and this mapping can sometimes be tricky. Let's zoom in on this mapping for a bit and look at three types of events against the set of their main participants:

- Events with two participants: an agent and a patient (typical ones are: _kill_, _push_, _attack_ etc.);
- Events with just one participant: an agent (_run_, _exercise_);
- Events with just one participant: a patient (_fall_, _die_).

How does language encode them? Let's look at English:

> **She** attacked **her**. <br> **She** exercised. <br> **She** fell. 

We see that the agent of the 2-participant event and the agent as well as the patient of the one-participant events have the same form (_she_), while the patient of the 2-participant event is grammatically different (_her_). Let's unfold the system behind this fact step by step.

```{margin} 
Accusative alignment
```
```{image} ./images/acc_detail.jpeg
:alt: accusative alignment
:class: bg-primary mb-1
:width: 650px
:align: center
```

1. First, let's map these pronouns to the roles the corresponding participants play in the event described by the sentence.
      - _She_ in _She exercised_ is an agent in a 1-participant event.
      - _She_ in _She fell_ is a patient in a 1-participant event.
      - _She_ and _her_ are the agent and the patient in the 2-participant event, respectively.
2. If we look at the forms of these pronouns again, we see that three of them coincide and one of them looks different -- that's the patient in the 2-participant event.
3. The two forms we are dealing with here -- _she_ and _her_ -- come from two different series of pronouns: 1) _I_, _we_, _she_, _he_, _they_ vs. 2) _me_, _us_, _her_, _him_, _them_. These series differ in **case**: the first set is in nominative case, while the second set is in accusative. That's what the observation is on a more abstract level: English, when it comes to pronouns, uses nominative case to encode the agent and the patient in 1-participant events and the agent of 2-participant events, and accusative for the patient of 2-participant events.
4. This way of mapping participants to their grammatical encoding (**morphosyntactic alignment**) is called **accusative alignment** (sometimes, nominative-accusative alignment).

NB: English distinguishes these cases only for pronouns, the full nouns would not show this contrast at least when we look at case marking, because English nouns don't really have case!
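
The four steps above amount to a small decision rule, which can be sketched as follows. The person labels (`"3sg.f"` and so on), the role strings and the function name `pronoun_accusative` are assumptions introduced only for this illustration.

```python
# Toy sketch of accusative alignment for English pronouns.
# Labels and the function name are illustrative assumptions.
NOMINATIVE = {"1sg": "I", "1pl": "we", "3sg.f": "she", "3sg.m": "he", "3pl": "they"}
ACCUSATIVE = {"1sg": "me", "1pl": "us", "3sg.f": "her", "3sg.m": "him", "3pl": "them"}


def pronoun_accusative(person: str, role: str, participants: int) -> str:
    """Pick the pronoun form under accusative alignment.

    Only the patient of a 2-participant event gets accusative case;
    every other participant gets nominative case.
    """
    if role == "patient" and participants == 2:
        return ACCUSATIVE[person]
    return NOMINATIVE[person]


print(pronoun_accusative("3sg.f", "agent", 2), "attacked",
      pronoun_accusative("3sg.f", "patient", 2))
print(pronoun_accusative("3sg.f", "agent", 1), "exercised")
print(pronoun_accusative("3sg.f", "patient", 1), "fell")
```

Note that the rule singles out exactly one cell of the table (patient, two participants); the other three share a form.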


Accusative alignment is very common but it's not the only possible way!

Here is another very popular type of morphosyntactic alignment called **ergative alignment** (a.k.a. ergative-absolutive alignment): the agent of a 2-participant event is treated grammatically differently from other kinds of participants, as shown below for Warlpiri:

```{margin} 
[Warlpiri](https://en.wikipedia.org/wiki/Warlpiri_language). Examples from Hale, K. 1983. _Warlpiri and the grammar of non-configurational languages_. NLLT 1.
```
<div id='outerTable'><table>
  <tr><td>(3)&nbsp;&nbsp;&nbsp;</td><td>ngarrka-ngku&nbsp;&nbsp;&nbsp;</td><td>ka&nbsp;&nbsp;&nbsp;</td><td>wawirri&nbsp;&nbsp;&nbsp;</td><td>panti-rni&nbsp;&nbsp;&nbsp;</td></tr>
  <tr><td></td><td>man-ERG</td><td>AUX</td><td>kangaroo</td><td>spear-NPST</td></tr>
  <tr><td></td><td colspan=4>'The man is spearing the kangaroo.'</td></tr>
</table></div>
<br>

<div id='outerTable'><table>
  <tr><td>(4)&nbsp;&nbsp;&nbsp;</td><td>kurdu&nbsp;&nbsp;&nbsp;</td><td>ka&nbsp;&nbsp;&nbsp;</td><td>wanka-mi&nbsp;&nbsp;&nbsp;</td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>(5)&nbsp;&nbsp;&nbsp;</td><td>kurdu&nbsp;&nbsp;&nbsp;</td><td>kapi&nbsp;&nbsp;&nbsp;</td><td>wanti-mi&nbsp;&nbsp;&nbsp;</td></tr>
  <tr><td></td><td>child</td><td>AUX</td><td>speak-NPST</td><td>&nbsp;&nbsp;&nbsp;</td><td></td><td>child</td><td>AUX</td><td>fall-NPST</td></tr>
  <tr><td></td><td colspan=3>'The child is speaking.'</td><td>&nbsp;&nbsp;&nbsp;</td><td></td><td colspan=3>'The child will fall.'</td></tr>
</table></div>
<br>

```{margin} 
Ergative alignment
```
```{image} ./images/erg_detail.jpeg
:alt: ergative alignment
:class: bg-primary mb-1
:width: 700px
:align: center
```
1. First, let's map the nouns in these examples to the types of participants they encode: in (3), _ngarrka_ 'man' is the agent and _wawirri_ 'kangaroo' is the patient of a 2-participant event; _kurdu_ 'child' is the agent of a 1-participant event in (4) and the patient of a 1-participant event in (5).
2. We see that there is no case marking on the participants of the 1-participant events or on the patient of the 2-participant event, but there is something -- _-ngku_ -- that marks the agent of the 2-participant event.
3. This unmarked case is called absolutive case, and the _-ngku_ case here is an example of ergative case.
4. We see that Warlpiri groups the nouns here differently from how English groups its pronouns. This is ergative alignment.

Think about how English sentences with pronouns would look if English had ergative alignment. The 1-participant sentences would stay the same, but the 2-participant sentences could look something like *_Them hit he_, meaning _They hit him_.
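
This thought experiment can be written down as a small rule. It is purely hypothetical: following the example in the text, the _me/him/them_ series is reused as a stand-in for an ergative form, and the person labels and the function name `pronoun_ergative` are assumptions for illustration.

```python
# Toy sketch of the thought experiment above: English pronouns under a
# hypothetical ergative alignment. The reuse of "him/her/them" as the
# ergative series follows the text's example and is not a claim about
# any real language.
ABSOLUTIVE = {"3sg.m": "he", "3sg.f": "she", "3pl": "they"}
ERGATIVE = {"3sg.m": "him", "3sg.f": "her", "3pl": "them"}


def pronoun_ergative(person: str, role: str, participants: int) -> str:
    """Pick the pronoun form under (hypothetical) ergative alignment.

    Only the agent of a 2-participant event gets the ergative form;
    every other participant gets the absolutive form.
    """
    if role == "agent" and participants == 2:
        return ERGATIVE[person]
    return ABSOLUTIVE[person]


print(pronoun_ergative("3pl", "agent", 2), "hit",
      pronoun_ergative("3sg.m", "patient", 2))
print(pronoun_ergative("3pl", "agent", 1), "exercised")
print(pronoun_ergative("3pl", "patient", 1), "fell")
```

Compared with accusative alignment, the only change is which cell is singled out: the agent, rather than the patient, of the 2-participant event.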

Accusative and ergative alignment do not exhaust all attested possibilities. Here are just two more (remember I said above that English nouns do not use case to distinguish different types of participants -- so, in this fragment of the English language, the alignment is **neutral**):

```{margin} 
Some more existing alignments
```
```{image} ./images/good_alignments.jpeg
:alt: alignments
:class: bg-primary mb-1
:width: 250px
:align: center
```

For the distribution of existing alignments, see the corresponding [chapter of WALS](https://wals.info/chapter/98). We don't need to talk in any detail about all available alignments, but it's important, I think, to wrap our heads around the fact that not all languages of the world organize their syntactic structures in the same ways as languages we are most familiar with. At the same time, there are limits to this variability, for instance, here is a sample of alignments that are not attested in natural language:

```{margin} 
Non-existent alignments
```
```{image} ./images/bad_alignments.jpeg
:alt: alignments
:class: bg-primary mb-1
:width: 350px
:align: center
```

`````{admonition} A tiny question along the way
:class: attention 
What should a language look like to exemplify one of these alignments?
`````

**NB:** Morphosyntactic alignment is not a classification of case marking -- or, not exclusively. It's a more general classification of the ways grammar encodes the basic clause structure. It can show itself as case, but it can also show up in verbal agreement patterns, as in the examples from Halkomelem below, where the third-person suffix on the verb appears only when it agrees with the agent of a 2-participant event, and not in other situations:

```{margin} 
[Halkomelem](https://en.wikipedia.org/wiki/Halkomelem). Example from Gerdts, D. 1988. _Object and absolutive in Halkomelem Salish_. Garland.
```
<div id='outerTable'><table>
  <tr><td>(6)&nbsp;&nbsp;&nbsp;</td><td>a.&nbsp;&nbsp;</td><td>ni</td><td>Ɂímǝš</td><td></td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>b.&nbsp;&nbsp;</td><td>ni</td><td>q’<sup>w</sup>ǝ´l-ǝt-ǝs</td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>c.&nbsp;&nbsp;</td><td>ni</td><td>cǝn</td><td>q’<sup>w</sup>ǝ´l-ǝt</td></tr>
  <tr><td></td><td></td><td>AUX</td><td>walk</td><td></td><td></td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>AUX&nbsp;&nbsp;&nbsp;</td><td>bake-TR-<b>3ERG</b></td><td></td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>AUX</td><td>I</td><td>bake-TR</td></tr>
  <tr><td></td><td></td><td colspan=2>'He/she/it walks.'</td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td></td><td colspan=2>'He/she/it baked it.'</td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td colspan=3>'I baked it.'</td></tr>
</table></div>

`````{admonition} A tiny question along the way
:class: attention 
What alignment does English show in its verbal agreement?
`````

The take-away message from the discussion so far is this: combining words into sentences and phrases involves establishing grammatical connections between words and expressing these grammatical relations in one way or another. It can be word order and/or morphological marking of some type, such as case or agreement. Some ways of organizing these connections and expressing them are very common among languages, some less so, some are unattested for reasons we do or don't have guesses about. Diving deeper into this can help you form expectations about language data and, consequently, help you choose tools for dealing with languages of different types.

### Representing syntax

I ended the previous subsection on a rather abstract note: words in a sentence are connected to each other in some way, and this connection can be expressed by grammar in different ways. But in order for us to be able to talk about these relations and connections, we should try to be more specific in how we represent them. There will be a much much more detailed discussion of this in the 2nd year of the program, during the 'Computational Grammar' course, but it might be helpful to say just one thing about it now.

There are two main ways to talk about syntactic relations, each with their own long-standing linguistic tradition. One of them is based on the idea of **constituency**, and the other one is based on **dependency**. They rely on two equally important and basic intuitions about what happens when two words combine together. On the one hand, these two words now form a bigger unit -- a phrase, a constituent. On the other hand, there is now a directed relation between these two words, one of them depends on the other in some way. The very core of these two approaches can be shown on a simple sentence:

```{image} ./images/const_dep.jpeg
:alt: constituency vs. dependency
:class: bg-primary mb-1
:width: 550px
:align: center
```

Constituency analysis emphasises the fact that these two words now function as a unit that can be used, for instance, as part of bigger phrases and sentences: _Mary thinks that [John runs]_. The dependency analysis shows the grammatical relation between the two words.

These two systems convey a lot of the same information. Recall a sentence from the very first lecture that I used to convince you that there is more to grammar than linear order of words:

<center><big>They see a cat with the telescope.</big></center>
<br>

This sentence is ambiguous -- that is, has two readings. Both constituency analysis and dependency analysis can express the two structures that correspond to two readings of this sentence. In terms of dependencies, the _with the telescope_ part bears a syntactic relation either to _cat_ or to _see_, as I show in the partial dependency analysis below:

```{image} ./images/cat_dep.jpeg
:alt: ambiguity dependency
:class: bg-primary mb-1
:width: 500px
:align: center
```

Constituents allow us to express the same intuition about how these two readings of the same sentence are different structurally, see the partial constituency analysis below:

```{image} ./images/cat_const.jpeg
:alt: ambiguity constituency
:class: bg-primary mb-1
:width: 650px
:align: center
```

In the first structure, _with the telescope_ forms a constituent together with _cat_ and to the exclusion of the verb. In the second structure, this is not the case: first, the constituent _see a cat_ is formed, then it combines with the constituent _with the telescope_, so that the seeing-a-cat as a whole interacts with _with the telescope_.
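
The difference between the two structures can also be checked mechanically. The sketch below encodes each reading as nested tuples and lists the constituents it contains; the bracketings are simplified by hand (they are not parser output), and the names `NOUN_ATTACHMENT`, `VERB_ATTACHMENT`, `words` and `constituents` are assumptions for illustration.

```python
# Toy sketch: the two readings as nested tuples (bracketings).
# The bracketings are hand-made simplifications, not parser output.
NOUN_ATTACHMENT = ("They", ("see", (("a", "cat"), ("with", ("the", "telescope")))))
VERB_ATTACHMENT = ("They", (("see", ("a", "cat")), ("with", ("the", "telescope"))))


def words(tree) -> list:
    """Flatten a bracketing back into its list of words."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree for word in words(child)]


def constituents(tree) -> set:
    """Collect the word string of every multi-word constituent."""
    if isinstance(tree, str):
        return set()
    found = {" ".join(words(tree))}
    for child in tree:
        found |= constituents(child)
    return found


for name, tree in [("noun attachment", NOUN_ATTACHMENT),
                   ("verb attachment", VERB_ATTACHMENT)]:
    units = constituents(tree)
    print(name)
    print("  word string:", " ".join(words(tree)))
    print("  'a cat with the telescope' is a unit:", "a cat with the telescope" in units)
    print("  'see a cat' is a unit:", "see a cat" in units)
```

Both bracketings flatten to the same string of words, so the ambiguity is invisible in the linear order and only shows up in which groups of words count as units.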

`````{admonition} Extra info
:class: tip, dropdown
The sentence above is an example of **attachment ambiguity** -- that is, ambiguity in where a phrase attaches to the rest of the sentence. There are other types of syntactic ambiguity in language; here's a short list with an example for each:
- **Modifier scope** (which is actually also sort of attachment ambiguity): _southern food store_
- **Complement structure** (which is actually also sort of attachment ambiguity): _The tourists objected to the guide that they couldn’t hear_
- **Coordination scope** (which is, finally, something else, but also not a very nice thing to make a joke about; I wonder where I took this example from..): _“I see,” said the blind man, as he picked up the hammer and saw._
`````

Constituent structures and dependency structures are similar in many ways in the information they convey, but not equivalent. A transitive sentence can show these differences nicely:

```{image} ./images/johnmary.jpeg
:alt: constituents vs. dependencies
:class: bg-primary mb-1
:width: 550px
:align: center
```

Here, both representations show that there is something going on between the word _admires_ and the word _Mary_: they form a constituent, and there is a dependency relation that represents that connection too. The dependency relation is not symmetric though: the arrow starts at the verb and ends with the noun, not the other way around. If we were to recover this fragment of the dependency structure from the constituent structure, we wouldn't know where to direct the arrow (unless we knew that one of the elements in the constituent is marked as **head**). 

On the other hand, if we were to reconstruct the constituents from the dependency tree, we wouldn't know what to group with what first: is it _[[John admires] Mary]_ or _[John [admires Mary]]_? The arrows show that there are two connections, each between the verb and a noun -- but they don't encode the priority, that is, which connection is tighter or comes first. Maybe it doesn't matter? It kind of does: there are reasons to think that the connection between the verb and the object is tighter and that they form a constituent together to the exclusion of the subject. How do we know? We know because they behave as one unit:
- They can be a conjunct in coordination: _John hates Bill and **admires Mary**_ vs. *_**John admires** and Ann dislikes Mary_
- They can undergo ellipsis together: _John **admires Mary** and Bill does too_ (-> _Bill admires Mary_)
- They participate in cleft constructions together: _**Admiring Mary** is what John is good at_ vs. *_**John admiring** is what Mary likes_.
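
The relation between the two representations discussed above can be sketched in code: once every constituent marks one of its children as the head, dependency arcs can be read off, but two different groupings can yield the same arcs. The tuple encoding `(head_index, children)` and the function names are assumptions made for this illustration, and which child is the head is simply stipulated.

```python
# Toy sketch: recovering dependency arcs from a constituent structure
# in which every constituent marks one child as its head.
# The encoding (head_index, children) is an illustrative assumption.
OBJECT_FIRST = (1, ["John", (0, ["admires", "Mary"])])   # [John [admires Mary]]
SUBJECT_FIRST = (0, [(1, ["John", "admires"]), "Mary"])  # [[John admires] Mary]


def head_word(node) -> str:
    """Follow head children down to a single word."""
    if isinstance(node, str):
        return node
    head_index, children = node
    return head_word(children[head_index])


def dependencies(node) -> list:
    """Return (head, dependent) arcs for a headed constituent tree."""
    if isinstance(node, str):
        return []
    head_index, children = node
    arcs = []
    for i, child in enumerate(children):
        if i != head_index:
            # Every non-head child depends on the head word of the phrase.
            arcs.append((head_word(node), head_word(child)))
        arcs.extend(dependencies(child))
    return arcs


print(dependencies(OBJECT_FIRST))
print(dependencies(SUBJECT_FIRST))
```

Both groupings produce the same two arcs from _admires_, which is why the dependency structure alone cannot tell us whether the verb groups with the object or with the subject first.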

A lot more can be said about constituents and dependencies, but I leave this here now -- just remember that these things exist! That should be enough as a starting point for syntax and its role in language technology.


`````{admonition} Extra info
:class: tip, dropdown
There are several very important topics in syntax that we were not able to cover. I don't want to over-pack this class, but I invite you to look for information on these topics yourself. Some of them will be partially covered in the homework reading, some of them won't:
- Parts of speech!
- Syntactic movement
- Ellipsis
- Pro-drop
- Head-marking vs. dependent-marking
`````


## Syntax and language technology

Automatic syntactic analysis (known as **syntactic parsing**) is one of the classic NLP tasks: the text of the sentence serves as input, and the output is the syntactic structure, either as a constituency structure or a dependency structure. Let me show you both, using our previous example sentence:

```python
sentence = "John admires Mary"
```

Let's do the constituents first. I'll be using the Berkeley Neural Parser (benepar); you can try out the [no-coding demo](https://parser.kitaev.io/) yourself.

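```python
import benepar, spacy
from nltk import ParentedTree
from IPython.utils import io

# Load the spaCy pipeline and add the benepar constituency parser,
# suppressing the loading messages.
with io.capture_output() as captured:
    nlp = spacy.load('en_core_web_sm')
    nlp.add_pipe("benepar", config={"model": "benepar_en3"})
    # Parse the text and keep its first (and only) sentence.
    sentence = list(nlp(sentence).sents)[0]
```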

```python
parse_tree = ParentedTree.fromstring('(' + sentence._.parse_string + ')')
parse_tree.pretty_print()
```

```
        |
        S
  ______|_____
 |            VP
 |       _____|___
 NP     |         NP
 |      |         |
NNP    VBZ       NNP
 |      |         |
John admires     Mary
```



The output is structurally the same as my hand-drawn tree in the previous section -- but note the additional information that this tree contains, compared to my version: nodes in this tree have labels that I did not write. These labels mark categories of constituents in the nodes, where _S_, for example, is a simple declarative sentence and _VP_ is a verb phrase (here, verb plus its object).

Let's compare this to the dependency analysis from spaCy, a popular NLP library (see their [no-coding demo](https://demos.explosion.ai/displacy) if you want to try more examples):

```python
spacy.displacy.render(sentence, style='dep')
```

Again, this is very similar to what I drew above! But this structure has more info -- the dependency arrows now have labels that specify what type of relation we are looking at: a subject relation (nsubj) connecting the verb to its subject and a direct object relation (dobj) connecting the verb to the object.

Models that perform syntactic parsing in these two ways rely heavily on data resources that encode syntactic structures in the corresponding format -- syntactic treebanks.

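An example of a syntactic treebank annotated with constituency structure is the [Penn Treebank](https://catalog.ldc.upenn.edu/LDC99T42). Treebanks in this tradition use a bracketed annotation format that looks like this (the exact labels differ from one treebank to another):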

```
(IP-MAT (NP-SBJ (PRO I))  
        (VBD saw)  
        (NP-OB1 (D the)  
                (N man)))  
```
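
If you want to read such brackets programmatically, a few lines of plain Python are enough. The sketch below is only an illustration: the function names are my own, and it assumes that every constituent has a label and that words contain no parentheses (real treebank files have more quirks than that).

```python
def parse_brackets(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    position = 0

    def read_node():
        nonlocal position
        assert tokens[position] == "(", "a constituent must start with '('"
        position += 1
        label = tokens[position]
        position += 1
        children = []
        while tokens[position] != ")":
            if tokens[position] == "(":
                children.append(read_node())       # a nested constituent
            else:
                children.append(tokens[position])  # a word (leaf)
                position += 1
        position += 1  # skip the closing ')'
        return (label, children)

    return read_node()


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [word for child in children for word in leaves(child)]


def constituents(node):
    """List (label, words) for every constituent, top-down."""
    if isinstance(node, str):
        return []
    label, children = node
    result = [(label, " ".join(leaves(node)))]
    for child in children:
        result.extend(constituents(child))
    return result


example = """
(IP-MAT (NP-SBJ (PRO I))
        (VBD saw)
        (NP-OB1 (D the)
                (N man)))
"""
tree = parse_brackets(example)
for label, words in constituents(tree):
    print(f"{label:8} {words}")
```

Each printed line is one constituent with its category label, for example `NP-OB1` spanning _the man_.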

A very important resource for dependency structures is [Universal Dependencies](https://universaldependencies.org/), which I discussed last week as a source of morphological information. Its syntactic annotation closely resembles what we saw in the analysis by spaCy (the label set differs a little, e.g. _obj_ below vs. _dobj_ above), and that's not a coincidence: 
```{margin} 
Croatian
```
```
# text = Kazna medijskom mogulu obnovila raspravu u Makedoniji
1	Kazna	kazna	NOUN	Ncfsn	Case=Nom|Gender=Fem|Number=Sing	4	nsubj	_	_
2	medijskom	medijski	ADJ	Agpmsdy	Case=Dat|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing	3	amod	_	_
3	mogulu	mogul	NOUN	Ncmsd	Case=Dat|Gender=Masc|Number=Sing	1	nmod	_	_
4	obnovila	obnoviti	VERB	Vmp-sf	Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part|Voice=Act	0	root	_	_
5	raspravu	rasprava	NOUN	Ncfsa	Case=Acc|Gender=Fem|Number=Sing	4	obj	_	_
6	u	u	ADP	Sl	Case=Loc	7	case	_	_
7	Makedoniji	Makedonija	PROPN	Npfsl	Case=Loc|Gender=Fem|Number=Sing	4	obl	_	_
```
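
The format above (CoNLL-U) has ten tab-separated columns per word; the seventh is the position of the head and the eighth is the relation label. Here is a minimal sketch of how one could read the arrows off such a file with no libraries at all. The names `FIELDS` and `read_tokens` are my own, and the sketch assumes simple integer word IDs, which holds for this sentence but not for every UD file.

```python
# The ten CoNLL-U columns, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos", "feats",
          "head", "deprel", "deps", "misc"]

# The Croatian sentence from above. It is typed with spaces here and converted
# to tabs below, only so that it is easy to copy (no field contains a space).
RAW = """
# text = Kazna medijskom mogulu obnovila raspravu u Makedoniji
1 Kazna kazna NOUN Ncfsn Case=Nom|Gender=Fem|Number=Sing 4 nsubj _ _
2 medijskom medijski ADJ Agpmsdy Case=Dat|Definite=Def|Degree=Pos|Gender=Masc|Number=Sing 3 amod _ _
3 mogulu mogul NOUN Ncmsd Case=Dat|Gender=Masc|Number=Sing 1 nmod _ _
4 obnovila obnoviti VERB Vmp-sf Gender=Fem|Number=Sing|Tense=Past|VerbForm=Part|Voice=Act 0 root _ _
5 raspravu rasprava NOUN Ncfsa Case=Acc|Gender=Fem|Number=Sing 4 obj _ _
6 u u ADP Sl Case=Loc 7 case _ _
7 Makedoniji Makedonija PROPN Npfsl Case=Loc|Gender=Fem|Number=Sing 4 obl _ _
"""
conllu = "\n".join(
    line if line.startswith("#") else "\t".join(line.split())
    for line in RAW.strip().splitlines()
)


def read_tokens(conllu_text):
    """Turn the word lines of one CoNLL-U sentence into a list of dicts."""
    tokens = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments such as "# text = ..."
        token = dict(zip(FIELDS, line.split("\t")))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])
        tokens.append(token)
    return tokens


tokens = read_tokens(conllu)
form_by_id = {t["id"]: t["form"] for t in tokens}
form_by_id[0] = "ROOT"  # head 0 means "this word is the root of the sentence"

for t in tokens:
    print(f'{form_by_id[t["head"]]} --{t["deprel"]}--> {t["form"]}')
```

The first printed line is `obnovila --nsubj--> Kazna`: the verb points to its subject, just like _admires_ pointed to _John_.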

What do we need syntactic parsing for? I said it's a classic NLP task, but I didn't say anything about **why** it exists apart from the sole purpose of us admiring or hating the resulting parses.

The **first -- somewhat superficial -- answer** is that maybe we don't need it much. During the last several years, there's been a tendency to approach a lot of NLP tasks in an end-to-end fashion (if you remember this term from one of our previous lectures) rather than building pipelines of systems that produce different levels of linguistic analysis for the downstream task. 

Moreover, some relatively recent models that were not trained for syntactic parsing at all end up learning something that resembles syntactic relations. The seminal paper that introduced the currently most successful deep learning architecture in NLP ([Attention is all you need](https://pdf-reader-dkraft.s3.us-east-2.amazonaws.com/1706.03762.pdf)) has the following figures in the appendix. They show the information flow between different words in a sentence (the technical details of what it means and how it works don't matter now), and the authors note that these patterns look related to the structure of the sentence, which the model has never been shown explicitly during training. And it actually does look roughly like it -- see prominent groups of words like _[the law]_, _[its applications]_ etc.

```{image} ./images/attention.png
:alt: attention patterns
:class: bg-primary mb-1
:width: 500px
:align: center
```
Further research analysing the inner workings of this type of model showed that, indeed, there is some syntax emerging in them without specialized training. Check this work out if you are curious:

> Voita, L. et al. 2019. [Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned](https://aclanthology.org/P19-1580/). ACL.

```{margin} 
I want to thank David Dalé for the discussion of this part -- I closely follow his very good points.
```
Does this all mean that language technology practitioners do not need syntax or knowledge of syntax at all anymore? I think they probably still **need it anyway**. 
- First and foremost, you would be surprised how non-trivial the very idea of **(un)grammaticality** can be in the NLP world, and how this simple idea can change the course of an engineering approach to an NLP problem. 
- Second, as we saw with the structural ambiguity example above, syntactic structure is related to meaning, and therefore syntax is linked to **natural language understanding** -- a huge research area and a very important super-task -- in ways we will discuss more next week.
- Identifying phrases and other units larger than words is still useful in a variety of downstream tasks, for instance in search -- in cases when convenient search units are not individual words but something bigger, e.g. phrases.
- Sometimes, depending on the task, a small handwritten grammar is a better way to go than a huge model. Such small handwritten grammars are often used to help virtual assistants recognize and act on commands that have a simple, predictable structure. Those grammars don't always have to be true to the letter of linguistic syntactic theory, but each of them is a grammar that breaks down a sequence of words into smaller units, so it is syntax in that sense. An example below shows a tiny grammar for [the virtual assistant Alice](https://en.wikipedia.org/wiki/Alice_(virtual_assistant)) that parses phrases like _turn on the light in the bathroom_, _turn on air conditioning in the kitchen_ and the like, detecting what should be done and where, so that the structured request can be sent further down the pipeline:

````
root:
    turn on $What $Where

slots:
    what:
        source: $What                   
    where:
        source: $Where
$What:
    the light | air conditioning
$Where:
    in the bathroom | in the kitchen | in the bedroom
````
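
To give a feeling for what such a grammar does at runtime, here is a minimal Python sketch that re-encodes the grammar above as data and compiles it into a regular expression. This is my own toy illustration, not how Alice actually processes grammars; the names `NONTERMINALS`, `compile_root` and `parse_command` are hypothetical.

```python
import re

# Hypothetical re-encoding of the grammar above as Python data.
NONTERMINALS = {
    "What": ["the light", "air conditioning"],
    "Where": ["in the bathroom", "in the kitchen", "in the bedroom"],
}
ROOT = "turn on $What $Where"


def compile_root(root, nonterminals):
    """Turn the root pattern into a regex with one named group per $Nonterminal."""
    parts = []
    for piece in root.split():
        if piece.startswith("$"):
            name = piece[1:]
            options = "|".join(re.escape(option) for option in nonterminals[name])
            parts.append(f"(?P<{name.lower()}>{options})")
        else:
            parts.append(re.escape(piece))  # a literal word such as "turn"
    return re.compile(r"\s+".join(parts), flags=re.IGNORECASE)


def parse_command(text, pattern):
    """Return the filled slots, or None if the command is not in the grammar."""
    match = pattern.fullmatch(text.strip())
    return match.groupdict() if match else None


pattern = compile_root(ROOT, NONTERMINALS)
print(parse_command("turn on the light in the bathroom", pattern))
# {'what': 'the light', 'where': 'in the bathroom'}
print(parse_command("turn on the radio in the kitchen", pattern))
# None
```

The second command is rejected because _the radio_ is not listed under `$What`: the grammar draws a line between commands it accepts and commands it doesn't, much like the (un)grammaticality distinction from the beginning of this lecture.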

```{margin} 
Example from Ban, P., Jiang, Y., Liu, T. and Steinert-Threlkeld, S. 2022. [Testing Pre-trained Language Models’ Understanding of Distributivity via Causal Mediation Analysis](https://arxiv.org/pdf/2209.04761.pdf). 5th BlackboxNLP Workshop.
```
- Syntactic templates are very handy when it comes to generating a lot of synthetic data for whichever purpose you might need it. For example, to generate test data for the task of Natural Language Inference (given two sentences, does the second one follow from the first? we will discuss this task next week), you can put together a list of nouns (_Mia_, _Lin_ etc.) and predicates (_wore a mask_ etc.), as well as simple templates, and generate a bunch of sentences:

````
Premise:
    [N1] and [N2] [Pred]
Hypothesis:
    [N1] [Pred]
````

> **Premise**: _Mia and Lin wore a mask_. <br> **Hypothesis**: _Mia wore a mask_.
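
Filling in such templates takes only a few lines of Python. The sketch below is an illustration under my own assumptions: _Sam_ and _left early_ are made-up additions to the word lists, and the function name `generate_pairs` is hypothetical.

```python
from itertools import permutations

nouns = ["Mia", "Lin", "Sam"]                # "Sam" is a made-up addition
predicates = ["wore a mask", "left early"]   # "left early" is a made-up addition

PREMISE = "{n1} and {n2} {pred}"
HYPOTHESIS = "{n1} {pred}"


def generate_pairs(nouns, predicates):
    """Fill both templates with every ordered pair of distinct nouns and every predicate."""
    pairs = []
    for n1, n2 in permutations(nouns, 2):
        for pred in predicates:
            premise = PREMISE.format(n1=n1, n2=n2, pred=pred)
            hypothesis = HYPOTHESIS.format(n1=n1, pred=pred)
            pairs.append((premise, hypothesis))
    return pairs


pairs = generate_pairs(nouns, predicates)
print(len(pairs))  # 12: six ordered noun pairs times two predicates
for premise, hypothesis in pairs[:3]:
    print(f"Premise: {premise}. Hypothesis: {hypothesis}.")
```

Three nouns and two predicates already give twelve sentence pairs, so longer word lists quickly produce thousands of test items.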

- Finally, I want to show one recent and, I think, very interesting use of syntactic parsing to improve performance of a very sophisticated model. Above I said that recent language models learn a little bit of syntax even when not trained for it. On the other hand, in the first lecture, I talked about cases where models actually **do not** learn syntax -- or, at least, not enough of it. In particular, vision-and-language models have been shown to be pretty bad when it comes to semantically distinguishing sentences that encode different meanings by syntactic means:

```{image} ./images/grass_mug.png
:alt: the grass and the mug again
:class: bg-primary mb-1
:width: 500px
:align: center
```
For text-to-image models, this can result in inaccuracies in the generated image when it comes to assigning the right properties to the right object, as discussed in this recent paper:

> Rassin et al. 2023. [Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment](https://arxiv.org/pdf/2306.08877.pdf). To be presented at NeurIPS.

To help the model relate properties to objects more consistently, the authors first apply syntactic parsing to the text prompt, and then force the model to act on the detected connections during the image generation process (the details of how this is done exactly are way beyond what we need to discuss today):

```{image} ./images/goldberg_method.png
:alt: method
:class: bg-primary mb-1
:width: 650px
:align: center
```
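
The first of these two steps, finding out which property belongs to which object, can be sketched in a few lines. To be clear, this is not the authors' code: the prompt, the hand-written parse with UD-style labels, and the function name `modifier_noun_pairs` are all my own assumptions, made only to show what kind of information a dependency parse provides here.

```python
# A hand-written dependency parse (not real parser output) of the hypothetical
# prompt "a pink sunflower and a yellow flamingo".
# Each entry is (id, form, head id, relation label); head 0 marks the root.
PARSE = [
    (1, "a", 3, "det"),
    (2, "pink", 3, "amod"),
    (3, "sunflower", 0, "root"),
    (4, "and", 7, "cc"),
    (5, "a", 7, "det"),
    (6, "yellow", 7, "amod"),
    (7, "flamingo", 3, "conj"),
]


def modifier_noun_pairs(parse, relation="amod"):
    """Return (modifier, noun) pairs for every arrow with the given label."""
    form_by_id = {token_id: form for token_id, form, _, _ in parse}
    return [
        (form, form_by_id[head])
        for _, form, head, label in parse
        if label == relation
    ]


print(modifier_noun_pairs(PARSE))
# [('pink', 'sunflower'), ('yellow', 'flamingo')]
```

With these pairs in hand, a system knows that _pink_ should go with the sunflower and _yellow_ with the flamingo, and not the other way around.
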
The resulting image is more faithful to the meaning conveyed by the text prompt, as reflected in its syntactic structure. Compare the outputs of the original model below and the syntactically guided one above:

```{image} ./images/goldberg_result.png
:alt: result
:class: bg-primary mb-1
:width: 300px
:align: center
```
I think that's pretty cool! So, you never know.


`````{admonition} Homework 5
:class: note
**Task 1**

Read the following three bits from the textbook Bender, E.M. 2013. _Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax_:

- Chapter 5. Syntax: Introduction, pp. 53-55
- Chapter 6. Parts of speech, pp. 57-60
- Chapter 8. Argument types and grammatical functions, pp. 79-99

As usual, name (and say something about!) three things that the textbook discusses differently than I did or those missing from our class completely. Not from Chapter 6 though! We ignored parts of speech completely, unfortunately. But do read that part anyway.

**Task 2**

Imagine a language that's almost like English, but a little bit different. Maybe a group of British travellers got stuck on a faraway island long, long ago, their language gradually changed, and their descendants now speak this language. Here is a short text in that language, glossed:

<div id='outerTable'><table>
  <tr><td>Chase-did</td><td>eagle-se</td><td>monkey</td><td>forest</td><td>in.</td><td>Escape-did</td><td>monkey</td><td>fast.</td><td>Fall-did</td><td>eagle.</td></tr>
  <tr><td>chase-PST</td><td>eagle-CASE</td><td>monkey</td><td>forest</td><td>in</td><td>escape-PST</td><td>monkey</td><td>fast</td><td>fall-PST</td><td>eagle</td></tr>
  <tr><td colspan=10>'An eagle chased a monkey in the forest. The monkey escaped fast. The eagle fell.'</td></tr>
</table></div>
<br>

Answer the following questions about this text:
- Which order of subject, verb, object does the language have?
- Does it obey all of Greenberg's word order universals discussed above?
- What type of morphosyntactic alignment does this language have when it comes to case marking?
- What can we say about alignment when it comes to verb agreement?

**Task 3**

Run the first of the fake-English sentences in the text above (_Chase-did eagle-se monkey forest in_) through the syntactic parsers for English: [the constituency parser](https://parser.kitaev.io/) and [the dependency parser](https://demos.explosion.ai/displacy). Describe what you get: Did the parsers analyse the sentence in a similar way? If not, how do their analyses differ? 

`````