
Bayesian Reader Parameters



This file describes the parameters available in the version of the Bayesian Reader program that uses letter representations. In practice, this means any version of the program other than the original Norris (2006) Psychological Review paper. The old program was called BayesVisual; the new one is called BayesianReader.

The main things the new programs make it possible to do are to simulate masked priming and to add various sources of noise so as to model RT distributions. The model also has the option of adding noise to letter position as well as letter identity, which can be used to model letter-position coding in masked priming.

Apart from increasing the scope of the simulations the model can perform, the main thing that has varied between versions is the way nonword likelihoods are computed in lexical decision. The full description of the current lexical decision procedure is given in Norris (2009) Putting it all together: A unified account of word recognition and reaction-time distributions, Psychological Review.

See the Appendix of Norris, D., & Kinoshita, S. (2008), Perception as evidence accumulation and Bayesian inference: Insights from masked priming, Journal of Experimental Psychology: General, for a description of how masked priming is performed in this version.


Most of the parameters here control what the model does, rather than how it does it. Our experience is that if you set the response thresholds so as to produce appropriate levels of accuracy, the model produces very much the same pattern of behaviour regardless of parameter settings. For example, changing the way the nonword-likelihoods are calculated really doesn't have much effect. The main parameters that exert a large influence on the model's behaviour are PrimeSteps (the duration of the prime in a masked priming experiment) and PositionSD (letter position noise). PositionSD will determine how sensitive the model is to things like transposed letter priming (WROD-WORD).


The Bayesian Reader is still under development. For example, we are developing a new version of the model that can deal with words of any length.

Dennis Norris & Maarten van Casteren

Example scripts

Parameters


The table is organised by category of parameter.
Note that where the default is specified as 'none', this means that the keyword has no effect if it is not used.
Many of these parameters have never been used in anger and haven't been used in any published simulations.
In most simulations (other than those dealing with RT distributions) most of the parameters are there to
control the input/output/averaging etc., and not to alter how the model actually operates.

Some of these parameters should be rationalised. For example, sometimes two parameters have to be set to generate
the 'sensible' behaviour in a particular mode, when setting one ought to set the correct value of the other automatically.


Click on the parameter name in the table to jump to the corresponding description below.

Name Default
StimulusFileName none
CharacterFileName none
OutputFileName none
LexiconFileName none
Steps none
MinSteps 2
MaxSteps 0
Average 1
InitialSD 0
InitialPaWordSD 0
UpperInitialSD none
LowerInitialSD none
UniformPaWord on
InitialSDSD 0
UniformInitialSD off
PositionSD 0
TargetFieldPosition 1
PrimeFieldPosition 0
ProbeFieldPosition 0
P_a_WordThreshold none
ID_LD_Threshold none
WordProbabilityThreshold none
GarbageThreshold none
VirtualNonWordFrequency 0
UseBackgroundNonWords off
LetterRanking 5
WordRanking 5
SetWordPriors off
SetLetterPriors off
SetPseudoPriors off
SetProbePrior off
ProbeFrequency 0
ProbeRatioThreshold none
PrimeSteps 0
UseLetterFrequency off
RunInBackground off



StimulusFileName

A list of the stimuli. This could be just a single column of stimuli, but it can also have multiple columns, and the PrimeFieldPosition, TargetFieldPosition and ProbeFieldPosition keywords can be used to select the columns. In fact, you have to have at least two columns to simulate masked priming (prime and target) and three to simulate masked priming in the same-different task (reference, prime, target). But, you can have all of the different priming conditions in different columns and then just select the columns from this single input file.
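For example, a hypothetical three-column stimulus file for a same-different simulation might look like this (the words are made up; the actual roles of the columns are whatever the *FieldPosition keywords say):

```text
house house house
house huose house
house plant house
```

Here ProbeFieldPosition 1, PrimeFieldPosition 2 and TargetFieldPosition 3 would presumably select the reference, prime and target respectively.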

CharacterFileName


Contains the specification of the character vectors. For each character there should be a line beginning with the character itself, followed by a list of real numbers specifying the values of the features. All characters need to be given explicit feature values and must have the same number of features.

The simplest character file would be something like the following, where each letter is represented by setting a single element in a 26-element vector:

a 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

b 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

c 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

d 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

e 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

f 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

g 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

h 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

i 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

j 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

k 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

l 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

m 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0

n 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

o 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

p 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0

r 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0

s 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

t 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0

u 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0

v 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0

w 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0

x 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

z 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1


and here's a set using the Rumelhart and Siple (Psychological Review, 1974, 81:99-118) coding:

a 1 1 1 1 1 0 0 1 0 0 0 1 0 0
b 0 0 1 1 1 1 0 0 0 1 0 1 0 1
c 1 1 1 0 0 1 0 0 0 0 0 0 0 0
d 0 0 1 1 1 1 0 0 0 1 0 0 0 1
e 1 1 1 0 0 1 0 1 0 0 0 0 0 0
f 1 1 1 0 0 0 0 1 0 0 0 0 0 0
g 1 1 1 0 1 1 0 0 0 0 0 1 0 0
h 1 1 0 1 1 0 0 1 0 0 0 1 0 0
i 0 0 1 0 0 1 0 0 0 1 0 0 0 1
j 1 0 0 1 1 1 0 0 0 0 0 0 0 0
k 1 1 0 0 0 0 0 1 0 0 1 0 1 0
l 1 1 0 0 0 1 0 0 0 0 0 0 0 0
m 1 1 0 1 1 0 0 0 1 0 1 0 0 0
n 1 1 0 1 1 0 0 0 1 0 0 0 1 0
o 1 1 1 1 1 1 0 0 0 0 0 0 0 0
p 1 1 1 1 0 0 0 1 0 0 0 1 0 0
q 1 1 1 1 1 1 0 0 0 0 0 0 1 0
r 1 1 1 1 0 0 0 1 0 0 0 1 1 0
s 0 1 1 0 1 1 0 1 0 0 0 1 0 0
t 0 0 1 0 0 0 0 0 0 1 0 0 0 1
u 1 1 0 1 1 1 0 0 0 0 0 0 0 0
v 1 1 0 0 0 0 1 0 0 0 1 0 0 0
w 1 1 0 1 1 0 1 0 0 0 0 0 1 0
x 0 0 0 0 0 0 1 0 1 0 1 0 1 0
y 0 0 0 0 0 0 0 0 1 0 1 0 0 1
z 0 0 1 0 0 1 1 0 0 0 1 0 0 0
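Either of the character files above can be loaded with a small parser along these lines (an illustrative sketch, not part of the program; the function name is ours):

```python
def read_character_file(path):
    """Read a character file: each non-blank line is a character
    followed by real-valued feature values. Every character must
    have the same number of features."""
    vectors = {}
    n_features = None
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            char, features = fields[0], [float(x) for x in fields[1:]]
            if n_features is None:
                n_features = len(features)
            elif len(features) != n_features:
                raise ValueError(f"{char!r}: expected {n_features} features")
            vectors[char] = features
    return vectors
```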

OutputFileName


As it says - the name of the output file

LexiconFileName

The name of the lexicon

A lexicon is a file with two columns: each line consists of a word and a frequency. Frequencies are normalised to probabilities.
For same-different simulations the lexicon should be empty.
Unless you are operating in MultiLengthMode, all words in the lexicon must be the same length.
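As an illustration of the frequency-to-probability normalisation (this is not the program's code, and the function name is ours):

```python
def read_lexicon(path):
    """Read a two-column lexicon (word, frequency) and normalise
    the frequencies so that the word priors sum to 1."""
    entries = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields:  # skip blank lines
                entries.append((fields[0], float(fields[1])))
    total = sum(freq for _, freq in entries)
    return {word: freq / total for word, freq in entries}
```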

Steps

Steps must be followed by a list of the steps at which probability information etc. is to be printed out. Negative numbers indicate times before the presentation of the target (i.e. during the presentation of the prime). Step number 0 doesn't exist: after step -1 comes step 1. This should default to something like 100 so that the program doesn't break when there is no Steps keyword.
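For example, a hypothetical script line requesting output during the prime and at several points after target onset (the step numbers are purely illustrative):

```text
Steps -10 -1 1 10 50 100
```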

MinSteps

Don't make responses before this number of steps. This parameter is included because the estimated variance after only a few samples can vary widely and result in ridiculously fast responses.

Setting MinSteps to 20 is usually sufficient.

MaxSteps

Maximum number of steps to run the simulation for. If not specified, the largest number given to the Steps keyword will be used. Any thresholds that haven't been reached by this number of steps are considered to have timed out.

Average

Run each trial for this many iterations and print averages. 50 is usually sufficient to get within a few steps of the long-term average.

InitialSD

Specifies the standard deviation of the sampling noise added to each element of the input vector. This is effectively a scaling parameter. A value of about 10.0 is a good starting point with the latest version. Use a smaller value to make the model run faster, but if you're doing masked priming you'll have to reduce the prime duration too.

InitialPaWordSD

Used in RT distribution simulations.
Standard deviation of the noise added to the initial P(a word) (which should be 0.5).
The argument can instead be interpreted as the range of uniform noise by using the UniformPaWord keyword.

UpperInitialSD

Used in RT distribution simulations.
Sets the upper bound on the InitialSD when using InitialSDSD with Gaussian noise. LowerInitialSD sets the lower bound

LowerInitialSD

Used in RT distribution simulations.
Sets the lower bound on the InitialSD when using InitialSDSD with Gaussian noise

UpperInitialSD sets the upper bound

UniformPaWord

Used in RT distribution simulations. Including this keyword makes the parameter given to InitialPaWordSD the range of uniform noise instead of the standard deviation of the noise added to the initial P(a word), i.e. InitialPaWordSD 0.4 would make the starting P(a word) range between 0.3 and 0.7.

InitialSDSD

Used in RT distribution simulations.
Standard deviation of the noise added to InitialSD (i.e. to the value of InitialSD used on each trial).

NB because this is the SD of Gaussian noise, the per-trial value can reach extremes, such as 0. If the SD is 0, then letters get to P = 1 almost straight away. There are therefore two additional keywords to keep the effective SD within sensible limits. So,

UpperInitialSD 5.0

LowerInitialSD 0.2

would keep the initial SD between 0.2 and 5.0. The actual values aren't at all critical.

UniformInitialSD makes the distribution uniform.
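The per-trial sampling and clamping can be sketched like this (an illustration only; the function and argument names are ours, and the default values are just the ones from the example above):

```python
import random

def sample_initial_sd(initial_sd=2.0, initial_sd_sd=1.0,
                      lower=0.2, upper=5.0, uniform=False):
    """Draw the InitialSD used on one trial: Gaussian noise around
    InitialSD (or, with UniformInitialSD, uniform noise with range
    initial_sd_sd), clamped to [LowerInitialSD, UpperInitialSD]."""
    if uniform:
        # interpret initial_sd_sd as the range of uniform noise
        value = initial_sd + random.uniform(-initial_sd_sd / 2,
                                            initial_sd_sd / 2)
    else:
        value = random.gauss(initial_sd, initial_sd_sd)
    return min(max(value, lower), upper)
```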

UniformInitialSD

Used in RT distribution simulations.
Makes the distribution of the InitialSDSD noise uniform instead of Gaussian.

PositionSD

Standard deviation of the sampling noise added to the position code associated with each letter slot.
So far this has only been used in masked priming simulations. It doesn't have any obvious qualitative effect on
ordinary lexical decision.

TargetFieldPosition

Column/field of a multi-column input file to use as the target word for identification/lexical decision and same-different. See StimulusFileName

PrimeFieldPosition

Column/field of a multi-column input file to use as the prime in masked priming. See StimulusFileName.

To run masked priming simulations you'll need to set some, or all, of the priming keywords. In practice only PrimeSteps and SetWordPriors need to be set for masked lexical decision, and PrimeSteps, SetProbePrior, ProbeFieldPosition and ProbeRatioThreshold need to be set for masked same-different.

ProbeFieldPosition

Column/field of a multi-column input file to use as the probe/reference (first string presented) in a same-different simulation.

P_a_WordThreshold

The format is P_a_WordThreshold YesThresh NoThresh MinSteps, where YesThresh is the value of P(a word) above which a Yes response is made, NoThresh is the value of P(a word) below which a No response is made, and MinSteps is the minimum number of steps before a response can be made (because the SD can vary wildly over the first few steps).

e.g.

P_a_WordThreshold 0.90 0.10

You can also use this format to add noise to the threshold:

P_a_WordThreshold 0.90:0.05 0.10:0.05

Use 1.5:0.5 for Gaussian noise and 1.5=0.5 for uniform noise.

There can be multiple P_a_WordThreshold parameters
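The noisy-threshold notation can be parsed along these lines (a sketch of the format only, not the program's parser; we assume the uniform value gives a range centred on the mean, and the function name is ours):

```python
import random

def sample_threshold(spec):
    """Parse 'mean', 'mean:sd' (Gaussian noise) or 'mean=range'
    (uniform noise, assumed centred on the mean) and return a
    sampled threshold value."""
    if ":" in spec:
        mean, sd = map(float, spec.split(":"))
        return random.gauss(mean, sd)
    if "=" in spec:
        mean, rng = map(float, spec.split("="))
        return mean + random.uniform(-rng / 2, rng / 2)
    return float(spec)
```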

ID_LD_Threshold

ID_LD_Threshold Yes No

This does lexical decision using P(a word) for 'No' responses, and P(the single best word) for 'Yes' decisions.

e.g.

ID_LD_Threshold 0.9 0.1

WordProbabilityThreshold

This is for performing straight identification (not lexical decision), using a threshold on the probability of individual words, P(word|input).

e.g. WordProbabilityThreshold 0.9

GarbageThreshold

GarbageThreshold Yes_threshold No_threshold Garbage_likelihood

e.g.: GarbageThreshold 0.9 0.1 0.000009

When using the garbage threshold the nonword response is assigned a fixed likelihood regardless of the evidence.

VirtualNonWordFrequency

It's best to only ever set this to 1.0.
What it does is add a virtual nonword (as in the original model) to the letter-based model. The virtual nonword is set to be the letter string with the highest probability that is not a word. When the argument is 1.0 this virtual nonword has the same frequency as the mean word frequency, and the frequencies of the background words are adjusted so that they still sum to 1.0.

By itself the Pstring procedure implies that all letter-strings are equally likely as nonwords. This clearly isn't the case. Consequently, the letter-strings neighbouring a word will have very low effective frequencies. This produces a big 'Yes' bias. By adding the virtual nonword we reflect the fact that nonwords are generally going to be located near words. This seems especially important in masked priming as the prime can easily boost the frequency of words in the region of the prime to a level where the nonwords can't compete.
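One plausible reading of the bookkeeping can be sketched like this (an illustration of the description above, not the program's code; it assumes the word priors are already normalised probabilities):

```python
def add_virtual_nonword(word_priors, virtual_nonword_frequency=1.0):
    """Give the virtual nonword virtual_nonword_frequency times the
    mean word prior, then rescale the word priors so that words
    plus the virtual nonword still sum to 1."""
    mean_prior = sum(word_priors.values()) / len(word_priors)
    virtual_prior = virtual_nonword_frequency * mean_prior
    scale = (1.0 - virtual_prior) / sum(word_priors.values())
    scaled = {w: p * scale for w, p in word_priors.items()}
    return scaled, virtual_prior
```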

For simplicity I have not used the virtual nonword in simulations of RT distributions - it just didn't make much difference.

UseBackgroundNonWords

This only works with version >= 2.18 from July 31 2007

Default value is OFF.

Background nonwords are the nonwords used in addition to the virtual nonword to reflect the fact that nonwords can actually be located pretty well anywhere.

The likelihood of background nonwords (other than the virtual nonword itself) is calculated by taking the expected number of neighbours at N = 1, N = 2, ..., N = wordlength.

The expected number of neighbours is calculated when the lexicon is read, by looping over all lexical entries and calculating the actual mean number of neighbours for all values of N. These values are then stored.

The program now has to calculate the mean Pstr of all letter strings that are at all possible distances from the best letter string, which is defined as the string formed by taking the letter with the highest probability at each position in the word. Wordlength is, again, set to be equal to that of the current target or prime.

Looping through all possible permutations given by the input letter probability matrix would be extremely time consuming. Therefore, each letter probability table for a position is reduced to two values: the best probability and the mean for the rest. It has been checked that this produces the same outcome as looping through all the actual permutations.

With this reduced set of probability tables we can now calculate the mean Pstr for a neighbour at any distance from the best string. This leaves us with a list of two numbers for each letter position - the highest letter probability and the mean of the rest. The final sum is made using the mean number of neighbours in the actual lexicon, the Pstr just calculated, and the mean word frequency as a prior.
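The reduced tables make the neighbour computation cheap. As an illustration (not the program's code), the mean Pstr of strings differing from the best string at exactly n positions is the product of the 'best' values at the matching positions and the 'mean of the rest' values at the n differing positions, averaged over which positions differ:

```python
from itertools import combinations
from math import prod

def mean_pstr_at_distance(best, mean_rest, n):
    """best[i]: probability of the most likely letter at position i.
    mean_rest[i]: mean probability of the remaining letters there.
    Returns the mean Pstr of strings differing from the best string
    at exactly n positions."""
    length = len(best)
    totals = [prod(mean_rest[i] if i in differ else best[i]
                   for i in range(length))
              for differ in combinations(range(length), n)]
    return sum(totals) / len(totals)
```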

LetterRanking

Specifies how many of the best letters to print out.

WordRanking

Specifies how many of the best words to print out.

SetWordPriors

Following the prime, the computed P(Word|Input) values are transferred to the P(Word)s, i.e. the effective frequencies of the words are changed.

NB This just redistributes the probabilities WITHIN the lexicon; it doesn't change the overall balance between words and nonwords, i.e. P(a word).
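The update can be sketched as follows (illustrative only; function and argument names are ours): the posteriors at the end of the prime replace the word priors, rescaled so that they carry the same total probability mass as before, leaving P(a word) untouched.

```python
def set_word_priors(priors, posteriors):
    """Redistribute probability WITHIN the lexicon: replace the word
    priors with the end-of-prime posteriors, rescaled so that their
    sum - and hence P(a word) - is unchanged."""
    scale = sum(priors.values()) / sum(posteriors.values())
    return {w: posteriors[w] * scale for w in priors}
```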

SetLetterPriors

Following the prime, the computed P(Letter|Input) values are transferred to the P(Letter)s. i.e. the effective frequency of the letters is changed.

SetPseudoPriors

Can be used when simulating masked priming. When set, it updates P(a word) at the end of the mask. P(a word) should start at 0.5; if it drifts during the prime, the new value will be used as the starting point for P(a word) during target processing. We don't use this option.

SetProbePrior

Stores the ratio by which the likelihood of the probe has changed after processing the prime. Needed to do same-different simulations.

ProbeFrequency

Should always be 1. See ProbeRatioThreshold

ProbeRatioThreshold

Used for same-different simulations.

Needs to be combined with ProbeFrequency 1

Arguments are SAME threshold, DIFFERENT threshold, minimum number of steps. As with all of the other thresholds you can have as many as you want in the script.

ProbeRatioThreshold 0.95 0.05 10

The basic procedure is simply that the probe/reference is added to the lexicon, and we then compare that to the best letter string that is not the probe.

But, as the reference/probe is supposed to be the only word in the lexicon, the specified lexicon should be empty. This should be fixed
so that the lexicon is always ignored when simulating same-different.

Note that the assumption about different targets being more than a letter different from the reference isn't actually correct for the case where, for example, the different targets differ from the reference at all letter positions. This means that accuracy won't correspond closely to the thresholds. In fact, the thresholds often need to be near 0.6, 0.4 to produce any errors at all. We could change the program to change the assumptions, but as there's a straightforward trade-off between the thresholds and the way the difference between the reference and different targets is set, this doesn't seem worthwhile.

PrimeSteps

Number of steps to present the prime for in masked priming simulations. Needs to be used with one of the following:

SetLetterPriors, SetWordPriors, SetPseudoPriors

SetWordPriors is the only one you should ever need to use.

UseLetterFrequency

Keep track of letter frequencies while reading the lexicon. Use these as priors when calculating the letter probabilities.
Off by default, indicating that all letters have the same prior. We don't use this. Letter priors and word priors aren't independent.

RunInBackground

Changes the priority of the program.

For Windows, the argument "idle" runs the program at the lowest priority. Under Unix this 'nices' the program to a priority of 10. An integer argument sets the priority to the specified value.
