- what is probabilistic pragmatics?
- example: vanilla rational speech act model
- extensions & applications:
- individual differences
- embedded scalars
\[ \definecolor{firebrick}{RGB}{178,34,34} \newcommand{\red}[1]{{\color{firebrick}{#1}}} \] \[ \definecolor{green}{RGB}{107,142,35} \newcommand{\green}[1]{{\color{green}{#1}}} \] \[ \definecolor{blue}{RGB}{0,0,205} \newcommand{\blue}[1]{{\color{blue}{#1}}} \] \[ \newcommand{\den}[1]{[\![#1]\!]} \] \[ \newcommand{\set}[1]{\{#1\}} \]
Franke & Jäger (2016), Probabilistic pragmatics, ZfS 35(1):3-44
\(\Rightarrow\) Bayesian models arise as a consequence of particular implementations of steps 1-3
A rational analysis is an explanation of an aspect of human behavior based on the assumption that it is optimized somehow to the structure of the environment. ... [T]he term does not imply any actual logical deduction in choosing optimal behavior, only that the behavior will be optimized. (Anderson 1991, p. 471)
speaker knows which referent \(t \in T\) she wants to talk about
speaker can choose a message \(m\) from set \(M = \{ \text{blue}, \text{green}, \text{square}, \text{circle} \}\)
listener tries to recover intended referent based on message
communication is successful if guess matches intended referent
[signaling game!]
literal listener picks literal interpretation (uniformly at random):
\[ P_{LL}(t \mid m) \propto P(t \mid [\![m]\!]) \]
Gricean speaker approximates informativity-maximization (with parameter \(\lambda\)):
\[ P_{S}(m \mid t \, ; \, \lambda) \propto \exp(\lambda \cdot \log P_{LL}(t \mid m)) \]
pragmatic listener uses Bayes' rule to infer likely world states:
\[ P_L(t \mid m \, ; \, \lambda) \propto P(t) \cdot P_S(m \mid t \, ; \, \lambda) \]
(cf. Benz 2006, Frank & Goodman 2012)
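To make the recursion concrete, here is a minimal R sketch of the three layers for the reference game above; the three-referent configuration (a blue square, a blue circle, a green square), the uniform priors, and \(\lambda = 5\) are illustrative assumptions, not values from the cited papers.

```r
## minimal RSA sketch (illustrative referent configuration)
## rows: referents t; columns: messages m; entries: literal truth [[m]](t)
sem <- matrix(c(1, 0, 1, 0,   # t1: blue square
                1, 0, 0, 1,   # t2: blue circle
                0, 1, 1, 0),  # t3: green square
              nrow = 3, byrow = TRUE,
              dimnames = list(c("t1", "t2", "t3"),
                              c("blue", "green", "square", "circle")))

## literal listener: P_LL(t | m) prop. to P(t | [[m]])
LL <- prop.table(sem, margin = 2)

## Gricean speaker: P_S(m | t; lambda) prop. to exp(lambda * log P_LL(t | m))
lambda <- 5
S <- prop.table(exp(lambda * log(LL)), margin = 1)

## pragmatic listener: P_L(t | m; lambda) prop. to P(t) * P_S(m | t; lambda)
prior <- rep(1/3, 3)
PL <- prop.table(prior * S, margin = 2)
round(PL, 2)
```

With these values, \(P_L\) resolves "square" mostly to the blue square: the green square would more informatively have been called "green".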
Franke & Degen (2016), Reasoning in reference games, PLoS ONE 11(5)
(cf. Camerer 2006, Franke 2011, Jäger 2014)
utterance: "I own some of Johnny Cash's albums." \(\Rightarrow\) implicature: "I don't own them all."
Traditionalism
the reason S didn't say "all" is that it's not true
Grammaticalism
parse the sentence as:
I own O(some) of JC's albums.
where:
\[ O(x) = x \wedge \bigwedge_{y \in ALT^+(x)} \neg y \]
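Instantiating the definition for \(x = \text{some}\), with \(ALT^+(\text{some}) = \set{\text{all}}\):

\[ O(\text{some}) = \text{some} \wedge \neg \text{all} \]

i.e., "some but not all", so the implicature above becomes part of the compositional meaning.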
rational probabilistic reasoning about putative meaning enrichments
(joint work in progress with Leon Bergen)
9 possible sentences
[none | some | all] of the monsters drank [none | some | all] of their water
7 possible worlds: 100, 010, 110, 011, 101, 001, 111
literal truth values of the 9 sentences in the 7 worlds:

| world | NN | NS | NA | SN | SS | SA | AN | AS | AA |
|-------|----|----|----|----|----|----|----|----|----|
| 100   | 0  | 1  | 1  | 1  | 0  | 0  | 1  | 0  | 0  |
| 110   | 0  | 0  | 1  | 1  | 1  | 0  | 0  | 0  | 0  |
| 101   | 0  | 0  | 0  | 1  | 1  | 1  | 0  | 0  | 0  |
| 111   | 0  | 0  | 0  | 1  | 1  | 1  | 0  | 0  | 0  |
| 010   | 1  | 0  | 1  | 0  | 1  | 0  | 0  | 1  | 0  |
| 011   | 1  | 0  | 0  | 0  | 1  | 1  | 0  | 1  | 0  |
| 001   | 1  | 0  | 0  | 0  | 1  | 1  | 0  | 1  | 1  |
- 000: [ N | S | A] of the ... drank [ N | S | A] of the ...
- 001: [ N | S | A] of the ... drank O([ N | S | A]) of the ...
- 010: O([ N | S | A]) of the ... drank [ N | S | A] of the ...
- 011: O([ N | S | A]) of the ... drank O([ N | S | A]) of the ...
- 100: O( [ N | S | A] of the ... drank [ N | S | A] of the ...)
- 101: O( [ N | S | A] of the ... drank O([ N | S | A]) of the ...)
- 110: O( O([ N | S | A]) of the ... drank [ N | S | A] of the ...)
- 111: O( O([ N | S | A]) of the ... drank O([ N | S | A]) of the ...)
lexical exhaustification
- O(N) = N
- O(A) = A
- O(S) = "some but not all"
sentential exhaustification
\[ O(x) = x \wedge \bigwedge_{y \in ALT^+(x)} \neg y \ \ \ , \text{if consistent} \]
where \(ALT^+(x)\) contains all sentences stronger than \(x\) that are obtained by substituting N, S, and A for one another
| sentence | parse | s100 | s110 | s101 | s111 | s010 | s011 | s001 |
|----------|-------|------|------|------|------|------|------|------|
| NN | p000 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| NN | p100 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| NS | p000 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| NS | p001 | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
| NS | p101 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| NA | p000 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| NA | p100 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| SN | p000 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| SN | p010 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| SS | p000 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| SS | p001 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
| SS | p010 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| SS | p011 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| SS | p100 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| SA | p000 | 0 | 0 | 1 | 1 | 0 | 1 | 1 |
| SA | p010 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
| AS | p000 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| AS | p100 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| AS | p001 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
Traditionalism: parses 000 and 100
Grammaticalism
literal listener picks literal interpretation (uniformly at random):
\[ P_{LL}(t \mid m) \propto P(t \mid [\![m]\!]) \]
Gricean speaker approximates informativity-maximization (with parameter \(\lambda\)):
\[ P_{S}(m \mid t \, ; \, \lambda) \propto \exp(\lambda \cdot \log P_{LL}(t \mid m)) \]
pragmatic listener uses Bayes' rule to infer likely world states:
\[ P_L(t \mid m \, ; \, \lambda) \propto P(t) \cdot P_S(m \mid t \, ; \, \lambda) \]
(cf. Benz 2006, Frank & Goodman 2012)
idea: listener reasons about the lexicon \(\red{l}\) of the speaker
\[ P_{LL}(t \mid m \, , \red{l}) \propto P(t \mid [\![m]\!]^{\red{l}}) \]
\[ P_{S}(m \mid t \, , \red{l} \, ; \, \lambda) \propto \exp(\lambda \cdot \log P_{LL}(t \mid m \, , \red{l})) \]
\[ P_L(t, \red{l} \mid m \, ; \, \lambda) \propto P(t) \cdot P(\red{l}) \cdot P_{S}(m \mid t \, , \red{l} \, ; \, \lambda) \]
\(\red{l} \in \set{\text{p000}, \text{p011}}\)
(cf. Bergen et al. to appear, Potts et al. 2015)
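As a hedged sketch of how this changes the vanilla recursion, the R snippet below assumes a list `sem_l` of lexicon-specific truth-value matrices (referents \(\times\) messages) and a prior `P_lex` over lexica; all function and variable names are hypothetical.

```r
## lexicon-dependent speaker: P_S(m | t, l; lambda)
## (assumes every message is true of some referent under each lexicon)
rsa_speaker <- function(sem, lambda = 5) {
  LL <- prop.table(sem, margin = 2)              # P_LL(t | m, l)
  prop.table(exp(lambda * log(LL)), margin = 1)  # normalize over messages
}

## pragmatic listener, marginal over lexica:
## P_L(t | m) prop. to P(t) * sum_l P(l) * P_S(m | t, l)
lex_unc_listener <- function(sem_l, P_lex, prior_t, lambda = 5) {
  S_marg <- Reduce(`+`, Map(function(sem, pl) pl * rsa_speaker(sem, lambda),
                            sem_l, P_lex))
  prop.table(prior_t * S_marg, margin = 2)
}
```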
idea: speaker can choose to intend a lexical narrowing \(\red{l}\)
\[ P_{LL}(t \mid m \, , \red{l}) \propto P(t \mid [\![m]\!]^{\red{l}}) \]
\[ P_{S}(m, \red{l} \mid t \, ; \, \lambda) \propto \exp(\lambda \cdot \log P_{LL}(t \mid m \, , \red{l})) \]
\[ P_L(t, \red{l} \mid m \, ; \, \lambda) \propto P(t) \cdot P(\red{l}) \cdot P_{S}(m, \red{l} \mid t \, ; \, \lambda) \]
\(\red{l} \in \set{\text{p000}, \text{p001}, \text{p010}, \text{p011}}\)
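The formal difference from lexical uncertainty is that the speaker now normalizes over message-lexicon pairs rather than over messages alone. A sketch under the same assumed inputs (names again hypothetical):

```r
## joint speaker: P_S(m, l | t; lambda) prop. to exp(lambda * log P_LL(t | m, l)),
## normalized over ALL (m, l) pairs for each referent t
joint_speaker <- function(sem_l, lambda = 5) {
  raw <- lapply(sem_l, function(sem) {
    LL <- prop.table(sem, margin = 2)  # P_LL(t | m, l)
    exp(lambda * log(LL))
  })
  Z <- Reduce(`+`, lapply(raw, rowSums))  # per-t normalizer over (m, l)
  lapply(raw, function(x) x / Z)          # list over l of P_S(m, l | t)
}

## listener marginal over l: P_L(t | m) prop. to P(t) * sum_l P(l) * P_S(m, l | t)
joint_listener <- function(sem_l, P_lex, prior_t, lambda = 5) {
  S_marg <- Reduce(`+`, Map(`*`, joint_speaker(sem_l, lambda), P_lex))
  prop.table(prior_t * S_marg, margin = 2)
}
```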
idea: speaker can choose to intend a syntactic parse \(\red{p}\)
\[ P_{LL}(t \mid m \, , \red{p}) \propto P(t \mid [\![m]\!]^{\red{p}}) \]
\[ P_{S}(m, \red{p} \mid t \, ; \, \lambda) \propto \exp(\lambda \cdot \log P_{LL}(t \mid m \, , \red{p})) \]
\[ P_L(t, \red{p} \mid m \, ; \, \lambda) \propto P(t) \cdot P(\red{p}) \cdot P_{S}(m, \red{p} \mid t \, ; \, \lambda) \]
\(\red{p} \in \set{\text{p000}, \text{p001}, \text{p010}, \text{p011}, \text{p100}, \text{p101}, \text{p110}, \text{p111}}\)
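Since this model is the lexical-intentions recursion with parses \(\red{p}\) in place of lexica \(\red{l}\), the sketch above carries over directly; e.g., with an assumed list `sem_p` of eight parse-specific truth-value matrices and a parse prior `P_parse`:

```r
## hypothetical reuse of joint_listener from the lexical-intentions sketch
PL_parse <- joint_listener(sem_p, P_parse, prior_t, lambda = 5)
```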
idea: weighted parses introduce graded semantics
\[ P_S(m \mid t \, ; \, \lambda) \propto \sum_{p \in \mathcal{P}} \exp(\lambda \cdot w(m,p)) \cdot \delta_{t \in \den{m}^p} \]
\(w(m,p)\) is the rank of \(\den{m}^p\) in the ordering of \(\set{\den{m}^p \mid p \in \mathcal{P}}\) by logical strength
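A sketch of this weighted-parses speaker, assuming `den_mp[[m]][[p]]` gives the 0/1 truth vector of message \(m\) under parse \(p\) across worlds, and `w` is a message \(\times\) parse matrix of strength ranks; all names are illustrative.

```r
## graded-semantics speaker:
## P_S(m | t; lambda) prop. to sum_p exp(lambda * w(m, p)) * delta[t in [[m]]^p]
## (assumes every world is truthfully describable by some message-parse pair)
graded_speaker <- function(den_mp, w, lambda = 5) {
  msgs   <- rownames(w)
  parses <- colnames(w)
  n_t    <- length(den_mp[[1]][[1]])
  S <- matrix(0, nrow = n_t, ncol = length(msgs),
              dimnames = list(NULL, msgs))
  for (m in msgs)
    for (p in parses)
      S[, m] <- S[, m] + exp(lambda * w[m, p]) * den_mp[[m]][[p]]
  prop.table(S, margin = 1)  # normalize over messages for each world t
}
```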
| sentence | 100 | 110 | 101 | 111 | 010 | 011 | 001 |
|----------|-----|-----|-----|-----|-----|-----|-----|
| NN | 0.26 | 0.09 | 0.06 | 0.05 | 0.20 | 0.15 | 0.19 |
| NS | 0.53 | 0.07 | 0.12 | 0.05 | 0.05 | 0.04 | 0.14 |
| NA | 0.29 | 0.25 | 0.03 | 0.04 | 0.30 | 0.04 | 0.05 |
| SN | 0.15 | 0.25 | 0.26 | 0.24 | 0.03 | 0.04 | 0.03 |
| SS | 0.02 | 0.23 | 0.11 | 0.21 | 0.14 | 0.22 | 0.06 |
| SA | 0.03 | 0.04 | 0.26 | 0.25 | 0.03 | 0.25 | 0.14 |
| AN | 0.69 | 0.07 | 0.05 | 0.04 | 0.05 | 0.03 | 0.06 |
| AS | 0.03 | 0.04 | 0.03 | 0.05 | 0.41 | 0.27 | 0.17 |
| AA | 0.07 | 0.03 | 0.05 | 0.06 | 0.04 | 0.05 | 0.71 |
derived approximate Bayes factors
| model | lex_int | exh | SM | lex_unc | rsa |
|---------|----------|----------|----------|----------|---|
| lex_int | 1 | - | - | - | - |
| exh | 8.89 | 1 | - | - | - |
| SM | 71 | 7.98 | 1 | - | - |
| lex_unc | 239 | 269 | 3.37 | 1 | - |
| rsa | 6.63e+09 | 7.45e+08 | 9.33e+07 | 2.76e+07 | 1 |
parse-choice model: \(P_L(p \mid m ; \hat{\lambda})\)
| sentence | p000 | p001 | p010 | p011 | p100 | p101 | p110 | p111 |
|----------|-------|-------|-------|-------|-------|-------|-------|-------|
| NN | 0.131 | 0.131 | 0.131 | 0.131 | 0.119 | 0.119 | 0.119 | 0.119 |
| NS | 0.105 | 0.158 | 0.105 | 0.158 | 0.105 | 0.132 | 0.105 | 0.132 |
| NA | 0.133 | 0.133 | 0.133 | 0.133 | 0.117 | 0.117 | 0.117 | 0.117 |
| SN | 0.136 | 0.136 | 0.121 | 0.121 | 0.121 | 0.121 | 0.121 | 0.121 |
| SS | 0.159 | 0.139 | 0.118 | 0.121 | 0.085 | 0.139 | 0.118 | 0.121 |
| SA | 0.134 | 0.134 | 0.122 | 0.122 | 0.122 | 0.122 | 0.122 | 0.122 |
| AN | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 |
| AS | 0.151 | 0.106 | 0.151 | 0.106 | 0.136 | 0.106 | 0.136 | 0.106 |
| AA | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 | 0.125 |
weights in strongest meaning model:

| strongest | 2nd | 3rd | 4th |
|-----------|-------|-------|-------|
| 0.362 | 0.274 | 0.207 | 0.156 |