---
pagetitle: Data Types and Declarations
---
# Data Types and Declarations {#data-types.chapter}
This chapter covers the data types for expressions in Stan. Every
variable used in a Stan program must have a declared data type. Only
values of that type will be assignable to the variable (except for
temporary states of transformed data and transformed parameter
values). This follows the convention of programming languages like
C++, not the conventions of scripting languages like Python or
statistical languages such as R or BUGS.
The motivation for strong, static typing is threefold.
1. Strong typing forces the programmer's intent to be declared with
the variable, making programs easier to comprehend and hence easier
to debug and maintain.
2. Strong typing allows programming errors relative to the declared
intent to be caught sooner (at compile time) rather than later (at
run time). The Stan compiler (called through an interface such as
CmdStan, RStan, or PyStan) will flag any type errors and indicate
the offending expressions quickly when the program is compiled.
3. Constrained types will catch runtime data, initialization, and
intermediate value errors as soon as they occur rather than allowing
them to propagate and potentially pollute final results.
Strong typing disallows assigning objects of different types to the
same variable at different points in the program or in different
invocations of the program.
## Overview of data types
Arguments for built-in and user-defined functions and local variables
are required to be basic data types, meaning an unconstrained scalar,
vector, or matrix type, or an array of such.
Passing arguments to functions in Stan works just like assignment to
basic types. Stan functions are only specified for the basic data
types of their arguments, including array dimensionality, but not for
sizes or constraints. Of course, functions often check constraints as
part of their behavior.
### Primitive types {-}
Stan provides two primitive data types, `real` for continuous
values and `int` for integer values. These are both considered scalar
types.
### Complex types {-}
Stan provides a complex number data type `complex`, where a complex
number contains both a real and an imaginary component, both of which
are of type `real`. Complex types are considered scalar types.
### Vector and matrix types {-}
Stan provides three real-valued matrix data types, `vector` for column
vectors, `row_vector` for row vectors, and `matrix` for
matrices.
Stan also provides three complex-valued matrix data types,
`complex_vector` for column vectors, `complex_row_vector` for row
vectors, and `complex_matrix` for matrices.
### Array types {-}
Any type (including the constrained types discussed in the next
section) can be made into an array type by declaring array arguments.
For example,
```stan
array[10] real x;
array[6, 7] matrix[3, 3] m;
array[12, 8, 15] complex z;
```
declares `x` to be a one-dimensional array of size 10 containing
real values, declares `m` to be a two-dimensional array of
size $6 \times 7$ containing values that are $3 \times 3$ matrices,
and declares `z` to be a $12 \times 8 \times 15$ array of complex numbers.
Prior to version 2.26, Stan models used a different array syntax, which has since been removed.
See the [Removed Features](removals.qmd) chapter for more details.
### Tuple types {-}
For any sequence of types, Stan provides a tuple data type. For example,
```stan
tuple(real, array[5] int) xi;
```
declares `xi` to be a tuple holding two values, the first of which is
of type `real` and the second of which is a one-dimensional array of size 5 of
type `int`.
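
As a minimal sketch, a tuple value can be written as a parenthesized,
comma-separated list of expressions, and its components can be read
back by position with a dot followed by a one-based index. The values
and the names `first` and `second` below are illustrative only.

```stan
transformed data {
  // build a tuple from a real and an integer array of size 5
  tuple(real, array[5] int) xi = (1.5, {1, 2, 3, 4, 5});
  real first = xi.1;           // the real component, 1.5
  array[5] int second = xi.2;  // the array component
}
```
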
### Constrained data types
Declarations of variables other than local variables may be provided
with constraints. These constraints are not part of the underlying
data type for a variable, but determine error checking in the
transformed data, transformed parameter, and generated quantities
block, and the transform from unconstrained to constrained space in
the parameters block.
All of the basic data types other than `complex` may be given lower
and upper bounds using syntax such as
```stan
int<lower=1> N;
real<upper=0> log_p;
vector<lower=-1, upper=1>[3] rho;
```
There are also special data types for structured vectors and
matrices. There are five constrained vector data types, `simplex`
for unit simplexes, `unit_vector` for unit-length vectors,
`sum_to_zero_vector` for vectors that sum to zero,
`ordered` for ordered vectors of scalars, and
`positive_ordered` for vectors of positive ordered
scalars. There are specialized matrix data types `corr_matrix`
and `cov_matrix` for correlation matrices (symmetric, positive
definite, unit diagonal) and covariance matrices (symmetric, positive
definite). The type `cholesky_factor_cov` is for Cholesky
factors of covariance matrices (lower triangular, positive diagonal,
product with own transpose is a covariance matrix). The type
`cholesky_factor_corr` is for Cholesky factors of correlation
matrices (lower triangular, positive diagonal, unit-length rows).
Constraints provide error checking for variables defined in the
`data`, `transformed data`, `transformed parameters`,
and `generated quantities` blocks. Constraints are critical for
variables declared in the `parameters` block, where they
determine the transformation from constrained variables (those
satisfying the declared constraint) to unconstrained variables (those
ranging over all of $\mathbb{R}^n$).
It is worth calling out the most important aspect of constrained data
types:
<div style="margin:0 2em 0 2em">
*The model must have support (non-zero density, equivalently finite
log density) at parameter values that satisfy the declared
constraints.*
</div>
If this condition is violated with parameter values that satisfy
declared constraints but do not have finite log density, then the
samplers and optimizers may have any of a number of pathologies
including just getting stuck, failure to initialize, excessive
Metropolis rejection, or biased draws due to inability to explore
the tails of the distribution.
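
The following sketch illustrates the condition using a hypothetical
scale parameter `sigma` and data `y`, neither of which comes from the
text above. Declaring `sigma` with `lower=0` restricts it to values
where the normal density is defined. Declaring it as a plain `real`
would admit negative values, for which the model has no support.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real<lower=0> sigma;  // constraint matches the support of the scale
}
model {
  y ~ normal(0, sigma);
}
```
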
## Primitive numerical data types {#numerical-data-types}
Unfortunately, the lovely mathematical abstraction of integers and
real numbers is only partially supported by finite-precision computer
arithmetic.
### Integers {-}
Stan uses 32-bit (4-byte) integers for all of its integer
representations. The maximum value that can be represented
as an integer is $2^{31}-1$; the minimum value is $-(2^{31})$.
When integers overflow, their value is determined by the underlying architecture.
On most, their values wrap, but this cannot be guaranteed. Thus it is up to the Stan
programmer to make sure the integer values in their programs stay in
range. In particular, every intermediate expression must have an
integer value that is in range.
Integer arithmetic works in the expected way for addition,
subtraction, and multiplication, but truncates the result of division
(see [the Stan Functions Reference integer-valued arithmetic operators
section](https://mc-stan.org/docs/functions-reference/integer-valued_basic_functions.html#int-arithmetic)
for more information).
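
As a small illustration, assuming a recent Stan version that provides
the explicit integer division operator `%/%`, the following block
contrasts integer division, the remainder operator, and real
division. The variable names are arbitrary.

```stan
transformed data {
  int q = 7 %/% 2;   // integer division truncates: q is 3
  int r = 7 % 2;     // remainder: r is 1
  real x = 7.0 / 2;  // real division: x is 3.5
  print("q = ", q, ", r = ", r, ", x = ", x);
}
```
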
### Reals {-}
Stan uses 64-bit (8-byte) floating point representations of real
numbers. Stan roughly^[Stan compiles integers to `int` and reals to `double` types in C++. Precise details of rounding will depend on the compiler and hardware architecture on which the code is run.]
follows the IEEE 754 standard for floating-point computation.
The range of a 64-bit number is roughly $\pm 2^{1022}$, which is
slightly larger than $\pm 10^{307}$. It is a good idea to stay well
away from such extreme values in Stan models as they are prone to
cause overflow.
64-bit floating point representations have roughly 15 decimal digits
of accuracy. But when they are combined, the result often has less
accuracy. In some cases, the difference in accuracy between two
operands and their result is large.
There are three special real values used to represent (1) not-a-number
value for error conditions, (2) positive infinity for overflow, and
(3) negative infinity for overflow. The behavior of these special
numbers follows standard IEEE 754 behavior.
#### Not-a-number {-}
The not-a-number value propagates. If an argument to a real-valued
function is not-a-number, it either rejects (an exception in the
underlying C++) or returns not-a-number itself. For boolean-valued
comparison operators, if one of the arguments is not-a-number, the
return value is always zero (i.e., false).
#### Infinite values {-}
Positive infinity is greater than all numbers other than itself and
not-a-number; negative infinity is similarly smaller. Adding an
infinite value to a finite value returns the infinite value. Dividing
a finite number by an infinite value returns zero; dividing an
infinite number by a finite number returns the infinite number of
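appropriate sign. Dividing a positive finite number by zero returns
positive infinity, and dividing a negative finite number by zero
returns negative infinity. Dividing two infinite numbers produces a
not-a-number value, as does subtracting two infinite numbers of the
same sign. Some functions are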
sensitive to infinite values; for example, the exponential function
returns zero if given negative infinity and positive infinity if given
positive infinity. Often the gradients will break down when values
are infinite, making these boundary conditions less useful than they
may appear at first.
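
The special values can be explored with a short sketch like the one
below, which uses the built-in functions `positive_infinity()`,
`not_a_number()`, `is_nan()`, and `is_inf()`. The variable names are
illustrative, and the comments describe the IEEE 754 result rather
than the exact printed text.

```stan
transformed data {
  real pos_inf = positive_infinity();
  real nan_val = not_a_number();

  print(1 / pos_inf);        // finite divided by infinite: zero
  print(pos_inf - pos_inf);  // same-sign infinities subtracted: not-a-number
  print(exp(-pos_inf));      // exponential of negative infinity: zero
  print(nan_val < 1);        // comparison with not-a-number: 0 (false)
  print(is_nan(nan_val));    // 1 (true)
  print(is_inf(pos_inf));    // 1 (true)
}
```
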
### Promoting integers to reals {-}
Stan automatically promotes integer values to real values if
necessary, but does not automatically demote real values to integers.
For very large integers, this will cause a rounding error to fewer
significant digits in the floating point representation than in the
integer representation.
Unlike in C++, real values are never demoted to integers. Therefore,
real values may only be assigned to real variables. Integer values
may be assigned to either integer variables or real variables.
Internally, the integer representation is cast to a floating-point
representation. This operation is not without overhead and should
thus be avoided where possible.
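
A minimal sketch of the assignment rules follows. The commented-out
line shows the direction that the type checker rejects.

```stan
transformed data {
  int n = 3;
  real x = n;    // allowed: the integer value is promoted to 3.0
  // int m = x;  // not allowed: a real cannot be assigned to an int
}
```
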
## Complex numerical data type
The `complex` data type is a scalar, but unlike `real` and `int`
types, it contains two components, a real and imaginary component,
both of which are of type `real`. That is, the real and imaginary
components of a complex number are 64-bit, IEEE 754-compliant floating
point numbers.
### Constructing and accessing complex numbers {-}
Imaginary literals are written in mathematical notation using a
numeral followed by the suffix `i`. For example, the following
code constructs the complex number $2 - 1.3i$ and assigns it to
the variable `z`.
```stan
complex z = 2 - 1.3i;
real re = get_real(z); // re has value 2.0
real im = get_imag(z); // im has value -1.3
```
The getter functions then extract the real and imaginary components of
`z` and assign them to `re` and `im` respectively.
The function `to_complex` constructs a complex number from its real
and imaginary components. The functional form needs to be used
whenever the components are not literal numerals, as in the following
example.
```stan
vector[K] re;
vector[K] im;
// ...
for (k in 1:K) {
  complex z = to_complex(re[k], im[k]);
  // ...
}
```
### Promoting real to complex {-}
Expressions of type `real` may be assigned to variables of type
`complex`. For example, the following is a valid sequence of Stan
statements.
```stan
real x = 5.0;
complex z = x; // get_real(z) == 5.0, get_imag(z) == 0
```
The real number assigned to a complex number determines the complex
number's real component, with the imaginary component set to zero.
Assignability is transitive, so that expressions of type `int` may
also be assigned to variables of type `complex`, as in the following
example.
```stan
int n = 2;
complex z = n;
```
Function arguments also support promotion of integer or real typed
expressions to type `complex`.
## Scalar data types and variable declarations
All variables used in a Stan program must have an explicitly declared
data type. The form of a declaration includes the type and the name
of a variable. This section covers scalar types, namely integer,
real, and complex. The next section covers vector and matrix types,
and the following section array types.
### Unconstrained integer {-}
Unconstrained integers are declared using the `int` keyword.
For example, the variable `N` is declared to be an integer as follows.
```stan
int N;
```
### Constrained integer {-}
Integer data types may be constrained to allow values only in a
specified interval by providing a lower bound, an upper bound, or
both. For instance, to declare `N` to be a positive integer, use
the following.
```stan
int<lower=1> N;
```
This illustrates that the bounds are inclusive for integers.
To declare an integer variable `cond` to take only binary values,
that is zero or one, a lower and upper bound must be provided, as in
the following example.
```stan
int<lower=0, upper=1> cond;
```
### Unconstrained real {-}
Unconstrained real variables are declared using the keyword
`real`. The following example declares `theta` to be an
unconstrained continuous value.
```stan
real theta;
```
### Unconstrained complex {-}
Unconstrained complex numbers are declared using the keyword
`complex`. The following example declares `z` to be an unconstrained
complex variable.
```stan
complex z;
```
### Constrained real {-}
Real variables may be bounded using the same syntax as integers. In
theory (that is, with arbitrary-precision arithmetic), the bounds on
real values would be exclusive. Unfortunately, finite-precision
arithmetic rounding errors will often lead to values on the
boundaries, so they are allowed in Stan.
The variable `sigma` may be declared to be non-negative as follows.
```stan
real<lower=0> sigma;
```
The following declares the variable `x` to be less than or equal
to $-1$.
```stan
real<upper=-1> x;
```
To ensure `rho` takes on values between $-1$ and $1$, use the
following declaration.
```stan
real<lower=-1, upper=1> rho;
```
#### Infinite constraints {-}
Lower bounds that are negative infinity or upper bounds that are
positive infinity are ignored. Stan provides constants
`positive_infinity()` and `negative_infinity()` which may
be used for this purpose, or they may be supplied as data.
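
For instance, under the rule stated above, the following hypothetical
declaration should behave the same as `real<lower=0> sigma;` because
the infinite upper bound is ignored.

```stan
parameters {
  // the upper bound is positive infinity, so only lower=0 has an effect
  real<lower=0, upper=positive_infinity()> sigma;
}
```
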
### Affinely transformed real {#affine-transform.section}
Real variables may be declared on a space that has been transformed using an
affine transformation $x\mapsto \mu + \sigma * x$ with offset $\mu$ and
(positive) multiplier $\sigma$, using a syntax similar to
that for bounds.
While these transforms do not change the asymptotic sampling behaviour of
the resulting Stan program (in a sense, the model the program implements),
they can be useful for making the sampling process
more efficient by transforming the geometry of the problem to a more natural
multiplier and to a more natural offset for the sampling process,
for instance by facilitating a non-centered parameterisation.
While these affine transformation declarations do not impose a hard constraint
on variables, they behave like the bounds constraints in many ways and could
perhaps be viewed as acting as a sort of soft constraint.
The variable `x` may be declared to have offset $1$ as follows.
```stan
real<offset=1> x;
```
Similarly, it can be declared to have multiplier $2$ as follows.
```stan
real<multiplier=2> x;
```
Finally, we can combine both declarations to declare a variable with offset
$1$ and multiplier $2$.
```stan
real<offset=1, multiplier=2> x;
```
As an example, we can give `x` a normal distribution with a non-centered parameterization as follows.
```stan
parameters {
  real<offset=mu, multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}
```
Recall the Stan code for the centered parameterization of this model.
```stan
parameters {
  real x;
}
model {
  x ~ normal(mu, sigma);
}
```
Adding the offset, multiplier transform results in the equivalent non-centered parameterization.
```stan
parameters {
  real<offset=mu, multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}
```
Sampling is done on the unconstrained parameters.
After applying the affine transform, the unconstrained parameters are standard normal,
thus the above model is equivalent to the hand-coded non-centered parameterization.
```stan
parameters {
  real x_raw;
}
transformed parameters {
  real x = mu + x_raw * sigma;
}
model {
  x_raw ~ std_normal();
}
```
Use of the affine transform removes the overhead of declaring additional transformed parameters
and directly expresses the hierarchical relationship between parameters.
For a container variable, the affine transform is applied to each element of that variable.
As an example, the non-centered parameterization of Neal's Funnel in the
[Stan User's Guide reparameterization section](https://mc-stan.org/docs/stan-users-guide/reparameterization.html),
$$
p(y,x) = \textsf{normal}(y \mid 0,3) \times \prod_{n=1}^9
\textsf{normal}(x_n \mid 0,\exp(y/2)).
$$
can be written as:
```stan
parameters {
  real<multiplier=3> y;
  vector<multiplier=exp(0.5 * y)>[9] x;
}
model {
  y ~ normal(0, 3);
  // std_normal() takes no arguments; the scale exp(y / 2) matches the density above
  x ~ normal(0, exp(0.5 * y));
}
```
where the affine transform is applied to every element of vector `x`.
### Expressions as bounds and offset/multiplier {-}
Bounds (and offset and multiplier)
for integer or real variables may be arbitrary expressions.
The only requirement is that they only include variables that have
been declared (though not necessarily defined) before the declaration.
If the bounds themselves are parameters, the behind-the-scenes
variable transform accounts for them in the log Jacobian.
For example, it is acceptable to have the
following declarations.
```stan
data {
  real lb;
}
parameters {
  real<lower=lb> phi;
}
```
This declares a real-valued parameter `phi` to take values
greater than the value of the real-valued data variable `lb`.
Constraints may be arbitrary expressions, but must be of type `int`
for integer variables and of type `real` for real variables
(including constraints on vectors, row vectors, and matrices).
Variables used in constraints can be any variable that has been
defined at the point the constraint is used. For instance,
```stan
data {
  int<lower=1> N;
  array[N] real y;
}
parameters {
  real<lower=min(y), upper=max(y)> phi;
}
```
This declares a positive integer data variable `N`, an array
`y` of real-valued data of length `N`, and then a parameter
ranging between the minimum and maximum value of `y`. As shown
in the example code, the functions `min()` and `max()` may
be applied to containers such as arrays.
A more subtle case involves declarations of parameters or transformed
parameters based on parameters declared previously. For example, the
following program will work as intended.
```stan
parameters {
  real a;
  real<lower=a> b;  // enforces a < b
}
transformed parameters {
  real c;
  real<lower=c> d;
  c = a;
  d = b;
}
```
The parameters instance works because all parameters are defined
externally before the block is executed. The transformed parameters
case works even though `c` isn't defined at the point it is used,
because constraints on transformed parameters are only validated at
the end of the block. Data variables work like parameter variables,
whereas transformed data and generated quantity variables work like
transformed parameter variables.
### Declaring optional variables {-}
A variable may be declared with a size that depends on a boolean
constant. For example, consider the definition of `alpha` in the
following program fragment.
```stan
data {
  int<lower=0, upper=1> include_alpha;
  // ...
}
parameters {
  vector[include_alpha ? N : 0] alpha;
  // ...
}
```
If `include_alpha` is true, the model will include the vector
`alpha`; if the flag is false, the model will not include
`alpha` (technically, it will include `alpha` of size 0,
which means it won't contain any values and won't be included in any
output).
This technique is not just useful for containers. If the value of
`N` is set to 1, then the vector `alpha` will contain a
single element and thus `alpha[1]` behaves like an optional
scalar, the existence of which is controlled by `include_alpha`.
This coding pattern allows a single Stan program to define different
models based on the data provided as input. This strategy is used
extensively in the implementation of the [RStanArm package](https://mc-stan.org/rstanarm).
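
A fuller sketch of the optional-scalar pattern is shown below. The
data `y`, the scale `sigma`, the local variable `mu`, and the priors
are all assumptions made for illustration and are not taken from
RStanArm. The optional intercept is read only when the flag is set, so
the size-zero vector is never indexed.

```stan
data {
  int<lower=0> N;
  vector[N] y;
  int<lower=0, upper=1> include_alpha;
}
parameters {
  vector[include_alpha ? 1 : 0] alpha;  // size 1 or size 0
  real<lower=0> sigma;
}
model {
  real mu = 0;
  if (include_alpha) {
    mu = alpha[1];        // only read when alpha has an element
  }
  alpha ~ normal(0, 10);  // contributes nothing when alpha is empty
  sigma ~ exponential(1);
  y ~ normal(mu, sigma);
}
```
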
## Vector and matrix data types
Stan provides three types of container objects: arrays, vectors, and
matrices. Vectors and matrices are more limited kinds of data
structures than arrays. Vectors are intrinsically one-dimensional
collections of real or complex values, whereas matrices are
intrinsically two-dimensional. Vectors, matrices, and arrays are not
assignable to one another, even if their dimensions are identical. A
$3 \times 4$ matrix is a different kind of object in Stan than a $3
\times 4$ array.
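
Because direct assignment between these container types is not
allowed, values must be converted explicitly. The sketch below uses
the built-in conversion functions `to_vector` and `to_array_1d`. The
variable names are illustrative.

```stan
transformed data {
  array[3] real a = {1.0, 2.0, 3.0};
  vector[3] v = to_vector(a);        // array to vector
  array[3] real b = to_array_1d(v);  // vector back to array
  // vector[3] w = a;                // type error: an array is not a vector
}
```
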
The intention of using matrix types is to call out their usage in the
code. There are three situations in Stan where *only* vectors and
matrices may be used,
* matrix arithmetic operations (e.g., matrix multiplication)
* linear algebra functions (e.g., eigenvalues and determinants),
and
* multivariate function parameters and outcomes (e.g.,
multivariate normal distribution arguments).
Vectors and matrices cannot be typed to return integer values. They
are restricted to `real` and `complex` values.
For constructing vectors and matrices in Stan, see [Vector, Matrix,
and Array Expressions](expressions.qmd#vector-matrix-array-expressions.section).
### Indexing from 1 {-}
Vectors and matrices, as well as arrays, are indexed starting from one
(1) in Stan. This follows the convention in statistics and linear
algebra as well as their implementations in the statistical software
packages R, MATLAB, BUGS, and JAGS. General computer programming
languages, on the other hand, such as C++ and Python, index arrays
starting from zero.
### Vectors {-}
Vectors in Stan are column vectors; see below for information on row
vectors. Vectors are declared with a size (i.e., a dimensionality).
For example, a 3-dimensional real vector is declared with the keyword
`vector`, as follows.
```stan
vector[3] u;
```
Vectors may also be declared with constraints, as in the following
declaration of a 3-vector of non-negative values.
```stan
vector<lower=0>[3] u;
```
Similarly, they may be declared with an offset and/or multiplier, as in the
following example.
```stan
vector<offset=42, multiplier=3>[3] u;
```
### Complex vectors {-}
Like real vectors, complex vectors are column vectors and are declared
with a size. For example, a 3-dimensional complex vector is declared
with the keyword `complex_vector`, as follows.
```stan
complex_vector[3] v;
```
Complex vector declarations do not support any constraints.
### Unit simplexes {-}
A unit simplex is a vector with non-negative values whose entries sum
to 1. For instance, $[0.2,0.3,0.4,0.1]^{\top}$ is a unit 4-simplex.
Unit simplexes are most often used as parameters in categorical
or multinomial distributions, and they are also the sampled variate in
a Dirichlet distribution. Simplexes are declared with their full
dimensionality. For instance, `theta` is declared to
be a unit $5$-simplex by
```stan
simplex[5] theta;
```
Unit simplexes are implemented as vectors and may be assigned to other
vectors and vice-versa. Simplex variables, like other constrained
variables, are validated to ensure they contain simplex values; for
simplexes, this is only done up to a statically specified accuracy
threshold $\epsilon$ to account for errors arising from floating-point
imprecision.
In high dimensional problems, simplexes may require smaller step sizes
in the inference algorithms in order to remain stable; this can be
achieved through higher target acceptance rates for samplers and
longer warmup periods, tighter tolerances for optimization with more
iterations, and in either case, with less dispersed parameter
initialization or custom initialization if there are informative
priors for some parameters.
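
As a sketch of the typical use described above, the following
hypothetical program declares a simplex of category probabilities,
gives it a uniform Dirichlet prior, and uses it in a categorical
likelihood. The data names `K`, `N`, and `y` and the choice of prior
are assumptions for illustration.

```stan
data {
  int<lower=1> K;                    // number of categories
  int<lower=0> N;                    // number of observations
  array[N] int<lower=1, upper=K> y;  // observed categories
}
parameters {
  simplex[K] theta;  // category probabilities, non-negative and summing to 1
}
model {
  theta ~ dirichlet(rep_vector(1.0, K));
  y ~ categorical(theta);
}
```
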
### Stochastic Matrices {-}
A stochastic matrix is a matrix where each column or row is a
unit simplex, meaning that each column (row) vector has non-negative
values that sum to 1. The following example is a $3 \times 4$
column-stochastic matrix.
$$
\begin{bmatrix}
0.2 & 0.5 & 0.1 & 0.3 \\
0.3 & 0.3 & 0.6 & 0.4 \\
0.5 & 0.2 & 0.3 & 0.3
\end{bmatrix}
$$
An example of a $3 \times 4$ row-stochastic matrix is the following.
$$
\begin{bmatrix}
0.2 & 0.5 & 0.1 & 0.2 \\
0.2 & 0.1 & 0.6 & 0.1 \\
0.5 & 0.2 & 0.2 & 0.1
\end{bmatrix}
$$
In the examples above, each column (or row) sums to 1, making the matrices
valid `column_stochastic_matrix` and `row_stochastic_matrix` types.
Column-stochastic matrices are often used in models where
each column represents a probability distribution across a
set of categories such as in multiple multinomial distributions,
factor models, transition matrices in Markov models,
or compositional data analysis.
They can also be used in situations where you need multiple simplexes
of the same dimensionality.
The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared
with row and column sizes. For instance, a matrix `theta` with
3 rows and 4 columns, where each
column is a 3-simplex, is declared like a matrix with 3 rows and 4 columns.
```stan
column_stochastic_matrix[3, 4] theta;
```
A matrix `theta` with 3 rows and 4 columns, where each row is a 4-simplex,
is similarly declared as a matrix with 3 rows and 4 columns.
```stan
row_stochastic_matrix[3, 4] theta;
```
As with simplexes, `column_stochastic_matrix` and `row_stochastic_matrix`
variables are subject to validation, ensuring that each column (row)
satisfies the simplex constraints. This validation accounts for
floating-point imprecision, with checks performed up to a statically
specified accuracy threshold $\epsilon$.
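
One way such a type might be used is as the transition matrix of a
Markov chain over observed states, as sketched below. The names `K`,
`T`, `s`, and `P` are hypothetical. Each row of `P` is a simplex
giving the distribution of the next state, and the row is transposed
because `categorical` expects a column vector.

```stan
data {
  int<lower=1> K;                    // number of states
  int<lower=1> T;                    // length of the observed sequence
  array[T] int<lower=1, upper=K> s;  // observed states
}
parameters {
  row_stochastic_matrix[K, K] P;  // P[i, j] = Pr(next = j | current = i)
}
model {
  for (t in 2:T) {
    s[t] ~ categorical(P[s[t - 1]]');
  }
}
```
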
#### Stability Considerations {-}
In high-dimensional settings, `column_stochastic_matrix` and `row_stochastic_matrix`
types may require careful tuning of the inference
algorithms. To ensure stability:
- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC),
smaller step sizes can help maintain stability, especially in high dimensions.
- **Higher Target Acceptance Rates:** Setting higher target acceptance
rates can improve the robustness of the sampling process.
- **Longer Warmup Periods:** Increasing the warmup period allows the sampler
to better explore the parameter space before the actual sampling begins.
- **Tighter Optimization Tolerances:** For optimization-based inference,
tighter tolerances with more iterations can yield more accurate results.
- **Custom Initialization:** If prior information about the parameters is
available, custom initialization or less dispersed initialization can lead
to more efficient inference.
### Unit vectors {-}
A unit vector is a vector with a norm of one. For instance, $[0.5,
0.5, 0.5, 0.5]^{\top}$ is a unit 4-vector. Unit vectors are sometimes
used in directional statistics. Unit vectors are declared with their
full dimensionality. For instance, `theta` is declared to be a unit
$5$-vector by
```stan
unit_vector[5] theta;
```
Unit vectors are implemented as vectors and may be assigned to other
vectors and vice-versa. Unit vector variables, like other constrained
variables, are validated to ensure that they are indeed unit length; for
unit vectors, this is only done up to a statically specified accuracy
threshold $\epsilon$ to account for errors arising from floating-point
imprecision.
### Vectors that sum to zero {-}
A zero-sum vector is constrained such that the
sum of its elements is always $0$. These are sometimes useful
for resolving identifiability issues in regression models.
While the underlying vector has only $N - 1$ degrees of freedom,
zero sum vectors are declared with their full dimensionality.
For instance, `beta` is declared to be a zero-sum $5$-vector (4 DoF) by
```stan
sum_to_zero_vector[5] beta;
```
Zero-sum vectors are implemented as vectors and may be assigned to other
vectors and vice-versa. Zero-sum vector variables, like other constrained
variables, are validated to ensure that their elements do indeed sum to zero; for
zero-sum vectors, this is only done up to a statically specified accuracy
threshold $\epsilon$ to account for errors arising from floating-point
imprecision.
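The identifiability use case can be sketched as a regression with an intercept and group-level effects, where constraining the effects to sum to zero separates them from the intercept. The data layout, variable names, and priors below are assumptions made for illustration only.

```stan
data {
  int<lower=1> N;                          // number of observations
  int<lower=2> K;                          // number of groups
  array[N] int<lower=1, upper=K> group;    // group membership (hypothetical)
  vector[N] y;                             // outcomes
}
parameters {
  real alpha;                    // overall intercept
  sum_to_zero_vector[K] beta;    // group effects, constrained to sum to 0
  real<lower=0> sigma;
}
model {
  // illustrative priors, not recommendations
  alpha ~ normal(0, 5);
  beta ~ normal(0, 1);
  sigma ~ exponential(1);
  y ~ normal(alpha + beta[group], sigma);
}
```

Without the constraint, adding a constant to `alpha` and subtracting it from every element of `beta` would leave the likelihood unchanged.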
### Ordered vectors {-}
An ordered vector type in Stan represents a vector whose entries are
sorted in ascending order. For instance, $(-1.3,2.7,2.71)^{\top}$ is
an ordered 3-vector. Ordered vectors are most often employed as cut
points in ordered logistic regression models (see the
[ordered logistic regression section](https://mc-stan.org/docs/stan-users-guide/regression.html#ordered-logistic.section) of the Stan User's Guide).
The variable `c` is declared as an ordered 5-vector by
```stan
ordered[5] c;
```
After their declaration, ordered vectors, like unit simplexes, may be
assigned to other vectors and other vectors may be assigned to them.
Constraints will be checked after executing the block in which the
variables were declared.
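A minimal sketch of the cut-point use case, assuming `K` outcome categories and a `D`-dimensional predictor; the variable names are hypothetical and no priors are specified.

```stan
data {
  int<lower=2> K;                        // number of outcome categories
  int<lower=0> N;                        // number of observations
  int<lower=1> D;                        // number of predictors
  array[N] int<lower=1, upper=K> y;      // ordinal outcomes
  array[N] row_vector[D] x;              // predictors
}
parameters {
  vector[D] beta;
  ordered[K - 1] c;                      // K - 1 ascending cut points
}
model {
  for (n in 1:N) {
    y[n] ~ ordered_logistic(x[n] * beta, c);
  }
}
```

The `ordered` constraint guarantees that the cut points stay in ascending order, which the `ordered_logistic` distribution requires.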
### Positive, ordered vectors {-}
There is also a positive, ordered vector type which operates similarly
to ordered vectors, but all entries are constrained to be positive.
For instance, $(2,3.7,4,12.9)^{\top}$ is a positive, ordered 4-vector.
The variable `d` is declared as a positive, ordered 5-vector by
```stan
positive_ordered[5] d;
```
Like ordered vectors, after their declaration, positive ordered vectors
may be assigned to other vectors and other vectors may be assigned to them.
Constraints will be checked after executing the block in which the
variables were declared.
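One place a positive, ordered vector can be handy is in breaking label switching among positive quantities. The following is a sketch under assumed names and priors: a two-component scale mixture of zero-mean normals whose scales are forced into ascending order.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  positive_ordered[2] sigma;          // sigma[1] < sigma[2], both positive
  real<lower=0, upper=1> lambda;      // mixing proportion (hypothetical)
}
model {
  sigma ~ lognormal(0, 1);            // illustrative prior
  for (n in 1:N) {
    target += log_mix(lambda,
                      normal_lpdf(y[n] | 0, sigma[1]),
                      normal_lpdf(y[n] | 0, sigma[2]));
  }
}
```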
### Row vectors {-}
Row vectors are declared with the keyword `row_vector`.
Like (column) vectors, they are declared with a size. For example,
a 1093-dimensional row vector `u` would be declared as
```stan
row_vector[1093] u;
```
Constraints are declared as for vectors, as in the following example
of a row 10-vector with values between -1 and 1.
```stan
row_vector<lower=-1, upper=1>[10] u;
```
Offsets and multipliers are declared in the same way as for vectors, as in the
following row 3-vector with offset -42 and multiplier 3.
```stan
row_vector<offset=-42, multiplier=3>[3] u;
```
Row vectors may not be assigned to column vectors, nor may column
vectors be assigned to row vectors. If assignments are required, they
may be accommodated through the transposition operator.
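The transposition route can be sketched as follows, using the postfix `'` operator; the variable names are hypothetical.

```stan
transformed data {
  row_vector[3] u = [1, 2, 3];   // row vector expression
  vector[3] v = u';              // transpose to assign to a column vector
  row_vector[3] w = v';          // transpose back to a row vector
}
```

Writing `vector[3] v = u;` without the transpose would be rejected as a type error.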
### Complex row vectors {-}
Complex row vectors are declared with the keyword
`complex_row_vector` and given a size in basic declarations. For
example, a 12-dimensional complex row vector `v` would be declared as
```stan
complex_row_vector[12] v;
```
Complex row vectors do not allow constraints.
### Matrices {-}
Matrices are declared with the keyword `matrix` along with a
number of rows and number of columns. For example,
```stan
matrix[3, 3] A;
matrix[M, N] B;
```
declares `A` to be a $3 \times 3$ matrix and `B` to be an $M
\times N$ matrix. For the second declaration to be well formed, the
variables `M` and `N` must be declared as integers in either
the data or transformed data block and before the matrix declaration.
Matrices may also be declared with constraints, as in this ($3 \times 4$)
matrix of non-positive values.
```stan
matrix<upper=0>[3, 4] B;
```
Similarly, matrices can be declared to have a set offset and/or multiplier, as in
this matrix with multiplier 5.
```stan
matrix<multiplier=5>[3, 4] B;
```
#### Assigning to rows of a matrix {-}
Rows of a matrix can be assigned by indexing the left-hand side of an
assignment statement. For example, this is possible.
```stan
matrix[M, N] a;
row_vector[N] b;
// ...
a[1] = b;
```
This copies the values from row vector `b` to `a[1]`, which
is the first row of the matrix `a`. If the number of columns in
`a` is not the same as the size of `b`, a run-time error is
raised. Here the sizes match, because `a` has `N` columns and
`b` has `N` elements.
Assignment works by copying values in Stan. That means any subsequent
assignment to `a[1]` does not affect `b`, nor does an
assignment to `b` affect `a`.
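The copy semantics can be illustrated with a small sketch; the sizes and values are arbitrary, and the comments state what the copying rule implies.

```stan
transformed data {
  matrix[2, 3] a = rep_matrix(0, 2, 3);
  row_vector[3] b = [1, 2, 3];
  a[1] = b;        // first row of a now holds a copy of b
  b[1] = 100;      // changes b only; a[1, 1] is still 1
  a[1, 2] = -5;    // changes a only; b[2] is still 2
}
```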
### Complex matrices {-}
Complex matrices are declared with the keyword `complex_matrix` and a
number of rows and columns. For example,
```stan
complex_matrix[3, 3] C;
```
Complex matrices do not allow constraints.
### Covariance matrices {-}
Matrix variables may be constrained to represent covariance matrices.
A matrix is a covariance matrix if it is symmetric and positive
definite. Like correlation matrices, covariance matrices only need a
single dimension in their declaration. For instance,
```stan
cov_matrix[K] Sigma;
```
declares `Sigma` to be a $K \times K$ covariance matrix, where
$K$ is the value of the data variable `K`.
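As a sketch of how such a declaration might be used, the following program treats `Sigma` as the covariance of a multivariate normal. The inverse Wishart prior and the variable names other than `K` and `Sigma` are assumptions for illustration, not a modeling recommendation.

```stan
data {
  int<lower=1> K;              // dimension
  int<lower=0> N;              // number of observations
  array[N] vector[K] y;
}
parameters {
  vector[K] mu;
  cov_matrix[K] Sigma;         // symmetric, positive definite
}
model {
  // illustrative prior only
  Sigma ~ inv_wishart(K + 1, identity_matrix(K));
  y ~ multi_normal(mu, Sigma);
}
```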
### Correlation matrices {-}
Matrix variables may be constrained to represent correlation matrices.
A matrix is a correlation matrix if it is symmetric and positive
definite, has entries between $-1$ and $1$, and has a unit diagonal.
Because correlation matrices are square, only one dimension needs
to be declared. For example,
```stan
corr_matrix[3] Omega;
```
declares `Omega` to be a $3 \times 3$ correlation matrix.
Correlation matrices may be assigned to other matrices, including
unconstrained matrices, if their dimensions match, and vice-versa.
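A sketch of assigning an expression built from a correlation matrix to an unconstrained matrix: here `Omega` is scaled by a hypothetical vector of standard deviations `tau` to form a covariance matrix `Sigma`. The priors are assumptions chosen only to make the program complete.

```stan
parameters {
  corr_matrix[3] Omega;
  vector<lower=0>[3] tau;      // hypothetical scales
}
transformed parameters {
  // diag(tau) * Omega * diag(tau), stored in an unconstrained matrix
  matrix[3, 3] Sigma = quad_form_diag(Omega, tau);
}
model {
  Omega ~ lkj_corr(2);         // illustrative prior
  tau ~ cauchy(0, 2.5);        // illustrative prior
}
```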