Quantitative genetics state-space modeling of phenotypic plasticity and evolution

Living organisms adapt to changes in environment by phenotypic plasticity and evolution by natural selection (or they migrate). At detailed genetic levels these phenomena are complicated, and quantitative genetics attempts to capture essential processes at a higher abstraction level. Phenotypic plasticity is then commonly modeled by reaction norms, which describe how individual traits in a population are expressed in response to changes in environmental variables. The mean reaction norms are evolvable, and here I present a general quantitative genetics state-space model for evolutionary reaction norm dynamics. Reaction norms make use of a reference environment, which is traditionally set to zero. This is problematic when the reference environment is the environment a population is adapted to, for the reason that this environment is a population property, which in itself may be evolvable. With reference to Ergon (2018), I describe models that take such evolvability into account. The resulting models are fundamentally different from most engineering system models, where given reference values are constant, and therefore without consequences can be set to zero. For simplicity I assume only temporal variations in environment, although there obviously are a lot of spatial variations in nature, and I assume that no mutations are involved. Fundamentals from quantitative evolutionary theory are given in appendices.


Introduction
Evolution of biological systems is a fascinating subject, with reproduction, natural selection and mutations as basic concepts (Page and Nowak, 2002). The subject is challenging from a mathematical modeling point of view, and it is an area where also an extensive amount of field studies and field and laboratory experiments are conducted. Natural selection will cause evolution only if there is enough genetic variation in a population, and this may be studied at different theoretical levels. Here, I will use quantitative genetics methods at the phenotypic level, i.e. assuming individual characteristics (height, bird nesting time, etc.) that vary continuously depending on genes, and environmental and developmental factors. When such characteris-tics result from infinitesimal effects of a large number of genes, it follows from the central limit theorem that their distributions tend to be multivariate normal (Fisher, 1918;Turelli, 2017), and such normality is a fundamental assumption in quantitative genetics. This article will focus on some main aspects of quantitative genetics, without mutations involved, and with only temporal variations in environment. I will discuss some recent advances in evolutionary state-space modeling that may be of interest also for the engineering control community (Ergon and Ergon, 2017;Ergon, 2018), and for that purpose some basic concepts are first introduced.
The phenotype of a living organism is the set of observable characteristics, such as its physical and physiological properties, behavior, etc. The phenotype is de-termined by the genotype, i.e. by the inherited genetic instructions, but also by environmental conditions and developmental stage. The mean phenotype in a wild population is subject to evolution based on the principle of natural selection (Darwin, 1859), which is a slow adaptive process that requires sufficient phenotypic variation among individuals in the population, such that the fittest individuals contribute the most to the gene pool of new generations. Although there are several alternative definitions related to reproductive success (Caswell, 2001), we may here, for simplicity, define individual fitness as the number of offspring of the individual. Assume, for example, that some migrating birds are genetically programmed to arrive earlier to their nesting sites in Norway than most other birds of the same species. Also assume that owing to climatic change, these early birds find more and better food, and thus have more offspring than the rest of the birds in the species. In such a case, the mean arrival time in the population may evolve towards earlier dates, such that the population adapts to the warmer climate. As discussed below, however, this is not the obvious result when phenotypic plasticity is involved.
Natural selection is the differential survival and reproduction of individuals due to differences in phenotype. The mathematical theory of natural selection has some of its background from breeding theory, and for simplicity non-overlapping generations are assumed. For illustration, consider an individual phenotypic trait y i,t determining the individual fitness W i,t , where t is time measured in unit of generation. Assume a given environment, and fitness function and distribution of y i,t in a population as shown in Figure 1. As is often done in theoretical analysis (e.g., Lande (2009)), the fitness function is here given by the Gaussian function where θ t is the phenotypic value that maximizes individual fitness in the given environment u t , and where u t and θ t are correlated stochastic processes. As explained below, the phenotypic trait y i,t is here a function of the environment u t . The correlation may, for example, be caused by environmental fluctuations within each generation, and a time delay between a critical development stage and the reproduction stage (Lande, 2009). Equation (1) describes how the individual fitness varies as function of the phenotypic trait. Since per definition individuals with phenotype closer to the fitness peak will have more offspring than others, the population mean valueȳ t will gradually, generation by generation, be shifted towards the fitness peak. Assume now that y has one part x determined by the genetics and one part e determined by the environment, i.e. y i,t = x i,t + e i,t , where x i,t and e i,t are independent and normally distributed, and wherē e t = 0. Also assume (i) that x i,t and e i,t have constant population variances G xx and σ 2 e , respectively, is the mean value of x of the offspring of individual i, and (iii) that fitness is determined only by y (and not directly by x or e). In that case it can be shown that the selection process in breeding is governed bȳ y t+1 =ȳ t + G xx G xx + σ 2 e −1 S, where S is the selection differential defined as the difference in mean phenotype between a group selected for breeding and the entire population. This is the univariate breeder's equation, first proposed by Lush (1937), with the notation G xx G xx + σ 2 e −1 = h 2 (heritability). It was later shown that for natural selection in the wild, S can be replaced by cov (W i,t , y i,t ) (Price, 1970;Lande, 1979), such that the univariate breeder's equation can be expressed as Whenȳ t is located as in Figure 1, cov (W t , y t ) is positive, i.e. fitness increases for increasing values of phenotype. Note that selection does not require Gaussian functions, as used in Figure 1, although the breeder's equation (2) is valid only when at least the phenotypic values in the population are normally distributed. Also note that the assumption of a constant phenotypic variance, σ 2 y = G xx + σ 2 e , may be problematic, because the selection process in itself will gradually reduce the variance G xx . There is, however, ample empirical evidence that this often is compensated by various processes, like mutations and migration of individuals between subpopulations, and the simplifying assumption of constant phenotypic variance is therefore often used in theoretical analysis (Rice, 2004).
Natural selection in populations with genetic variability is thus a necessary prerequisite for evolution. Phenotypic plasticity, on the other hand, is the ability of a living organism to express different phenotypes in different environments, also when no evolution is involved. Such plasticity will normally be adaptive in the sense that it increases fitness, but in extreme environments it may be maladaptive. An example is plants that reduce their photosynthesis and growth, in order to use less water when they become wateror salt-stressed (Tallman et al., 1997). The fact that some migrating birds return to Norway earlier owing to climatic change, may be partly caused by phenotypic plasticity, although it may also be caused by evolution. It is in fact an important consequence of the models in Ergon and Ergon (2017) and Ergon (2018), Figure 1: Fitness function W i,t (blue) and phenotype distribution p (y i,t ) (magenta) in a given environment u and at a given time t, for a case that gives cov (W i,t , y i,t ) > 0, and thus y t+1 >ȳ t . The mean phenotypic value is hereȳ t = 10, while the phenotypic value that maximizes individual fitness is θ t = 20.
that responses to climate change that at first may be explained by phenotypic plasticity, over time may be totally explained by evolution. As explained in more detail below, such a development is called complete genetic assimilation. Since phenotypic plasticity and evolution may be at work at the same time, evolutionary models must take both plasticity and natural selection into account. They should also include models of mutations, although that is not further discussed here.
Phenotypic plasticity can be described and modeled by means of reaction norms, which show how phenotypes vary as function of environmental variables. As an illustration, Figure 2 shows linear individual reaction norms for a population with a univariate phenotypic variable in a univariate environment, which is also used in the simulation examples in Subsections 4.1 and 4.2. In the figure, the environmental cue is defined as the environment during a critical period of development. The figure also introduces reaction norm parameters that may be treated as traits in their own right, often referred to as latent traits. Just as ordinary phenotypic traits, such latent traits may have evolvable mean values in a population, as described in Figure 1. In Figure 2, solid lines indicate the range of environment where the reaction norms may be recorded in an experimental study, while the dashed lines show extrapolations.
Assuming reaction norms as in Figure 2, withc = 0 and G cc = 0, and using a multivariate version of eq. (2), Figure 2: Reaction norms for 100 individuals with reaction norms according to y = a + b (u − c) + e, withā =c =ē = 0 andb = 0.5. The traits a, b and c, and the residual e, are independent and normally distributed, with variances G aa = 0.5, G bb = 0.045, G cc = 0.5, and σ 2 e = 0.5, respectively. Theoretically, the minimum phenotypic variance is found for u = 0 (Lande, 2009). we obtain the model wherez t = ā tbt T , while G = G aa 0 0 G bb and P = G aa + σ 2 e 0 0 G bb . As in Lande (2009), it is assumed that the two latent traits are independent, and that b i,t has no residual component. As discussed in detail below, eq. (4) is an example of the multivariate breeder's equation, and here it is necessary with a note on notation. The individual phenotype is denoted y i,t , while z i,t is the vector of individual reaction norm traits. The evolution of the mean valuesȳ t andz t are governed by the fundamental Price equation, as given in Appendix A, and in that respect there is no difference between the phenotypic traits y i,t and the latent traits z i,t . For simplicity, however, the notation z i,t is used in Appendix A. The outline of the subsequent parts of the article is as follows. Section 2 gives a more general introduction to state-space modeling of phenotypic plasticity and evolution using traditional quantitative evolutionary theory (Lande, 1979;Lande and Arnold, 1983;Gavrilets and Scheiner, 1993b,a;Lande, 2009). For simplicity it is assumed that there are only temporal variations in environment, although there obviously are a lot of spatial variations in nature. Section 2 also introduces the fundamental problem of evolvable phenotypic plasticity reference environments, recently discussed in Ergon (2018). The solution to this problem requires an augmented multivariate breeder's equation, as developed in Section 3.
Section 4 presents simulations in order to show the consequences of the state augmentation. Subsection 4.1 shows the step response of a univariate and linear reaction norm system, while Subsection 4.2 shows the ramp response of the same system. In Subsection 4.3 I simulate the step response of a multivariate and nonlinear reaction norm system. Finally, conclusions and discussions are given in Section 5.
Appendix A shows how the multivariate breeder's equation (4) by means of several assumptions follows from the fundamental selection equation of Price (1970). Appendix B shows how an assumption of frequency-independent selection leads to an alternative form of the multivariate breeder's equation,z t+1 = . This shows that at equilibrium, where E ∂ ln(Wt) ∂zt = 0, the geometrical mean fitness is maximized (Appendix C).
A major difficulty of the approach with evolvable reference traits is to find empirical measures of the reaction norm parameters. In the univariate and linear reaction norm example in Figure 2, for example, individual reference traits c (horizontal reaction norm variation) cannot be distinguished from traits a (vertical reaction norm variation) by means of static experimental data. Different values of the variance G cc of trait c will, however, give different dynamical responses to environmental variations, and assuming that the variance G aa of trait a is known this can be used to find G cc by use of system identification methods (Appendix D). However, the practicability of this for slow evolutionary processes must be expected to be limited.
Appendix E discusses an alternative model formulation using so-called function-valued traits, in order to show that the plasticity reference environment needs to be modeled also in such models. Two additional problems in such cases are also described.
As an example, Matlab code for the step response simulation in Subsection 4.1 is provided in Appendix F.

The problem with traditional reaction norm models
As exemplified in Section 1, a traditional quantitative genetics state-space model of an evolving population system with plastic organisms has three equations. First, the phenotypic plasticity is modeled by individual reaction norms, which describe how a multivariate individual phenotype y i,t is expressed as a linear or nonlinear function of latent quantitative traits z 0,i,t and a stochastic multivariate developmental environment (environmental cue) u t , I will here assume that z 0,i,t is an individual parameter vector as function of time t (generations) in a parametrized model of the reaction norm. The reference environment is defined as the environment the population is adapted to (Lande, 2009). It is often set to u ref = 0 (Gavrilets and Scheiner, 1993b,a;Lande, 2009), but that disguises the plasticity reference problem, as discussed in Ergon (2018), and in Section 3 below. Examples of individual reaction norms are shown in Figure 2.
Second, the individual and scalar fitness function is where θ t is the stochastic vector of phenotypic expression that maximizes fitness in a given stochastic environment u t at a given time (generation). Figure 1 shows an example with an instantaneous value θ = 20. In eqs. (5) and (6), u t and θ t are correlated stochastic processes. Note that in the univariate and linear reaction norm case, the covariance between u t and θ t determines the mean reaction norm slope in a stationary stochastic environment (Lande, 2009;McNamara et al., 2011;Ergon and Ergon, 2017). Third, the state equation that propagates the mean trait values forward in time may under given assumptions be the multivariate breeder's equation (Lande, 1979) where G and P are covariance matrices as described just below. By means of several assumptions, eq. (7) follows from the fundamental selection equation of Price (1970) (Appendix A). The reason for these assumptions is that the Price equation is not dynamically sufficient, i.e. it cannot be used for propagation of the mean state from one generation to next. An important assumption is that the phenotypic traits can be split into two mutually independent and multinormally distributed parts, z 0,i,t = x 0,i,t + e 0,i,t , with mean valueē 0,t = 0, and covariance matri- T and E = E e 0,i,t e T 0,i,t , respectively. As a consequence also z 0,i,t is multinormally distributed, with the covariance ma- T .
I will here assume P and G to be constant, as in, e.g., Lande (2009) and Ergon and Ergon (2017). I will assume populations with non-overlapping generations, where all individuals live in the same time-varying environment, and make standard assumptions necessary for the multivariate breeder's equations (4) and (7) to be valid (Appendix A). For analytical purposes, expressions for mean valuesȳ t andW t can in theory be found from equations (5) and (6), but they are not needed for simulations.
Equation (7) may be expressed asz 0,t+1 =z 0,t + Gβ 0,t , where β 0,t is the selection gradient, and we thus have . Under the assumption of so-called frequency-independent selection, the selection gradient may alternatively be expressed as . This is a nice form for interpretation of fitness landscapes (Arnold et al., 2008), although eq. (7) is better suited for simulations. The population system (5,6,7) reaches an equilibrium when E [β 0,t ] = 0, i.e. when E [z 0,t ] reaches a peak in the multivariate mean fitness landscape (assuming frequency-independent selection).
As pointed out in Ergon (2018), the fundamental problem with traditional reaction norm models is that the reference environment is an inherent part of the population state, independent of the actual environment where the individuals develop. The state of the population thus determines which environment it is adapted to, i.e. where the expected geometric mean fitness is maximized (Appendix C). In the simple univariate and linear case, with the individual reaction norm y = a + b (u − c) + e and independent traits a, b and c (Figure 2), this is simply the environment where the phenotypic variance has a minimum (Lande, 2009;Ergon and Ergon, 2017). The environment the population is adapted to is, in other words, an internal population property, independent of the external environment. It is, however, only when the external environment coincides with the internal reference environment, or vice versa, that the population is adapted to the current environment. As a consequence, the reference environment should be modeled as part of the evolutionary model, which implies a modification of the model (5,6,7). How this should be done is an open question, where the best answer may depend on the problem under study. One alternative is to let u ref be a function of an evolvable G matrix (e.g., Arnold et al. (2008)). That would give a complex solution, especially in the multivariate and nonlinear case, and this alternative is not studied here. As a straightforward solution, Ergon (2018) proposed that the reference environment vector may be modeled as a vector z c,t of mean traits in their own right, just as other reaction norm traits. Equation (7) must accordingly be augmented with thez c,t state variables. The details of this for parametrized models are developed in Section 3. Whether the mean reference traits inz c,t are evolvable is also an open question, but considering the complexity of evolutionary processes, such evolvability cannot be excluded without good arguments (Pigliucci, 2008). Note that only the elements in the environmental reference trait vector that are genetically variable, should be included in the augmented state equation. If all elements in the reference environment have zero genetic variance in the population, they can without consequences be set to zero, and this is thus an implicit assumption in traditional reaction norm models. Also note that evolvable reference traits may be combined with an evolvable G matrix.

Background state-space theory
As a background and reference for the theoretical development, I include a summary of the underlying state-space theory for discrete-time systems. The starting point is then the idea of an abstract discrete-time system that interacts with its environment through a vector φ t of input variables and a vector y t of response variables. A vector x t of variables that takes its values in some set X (a state space) is a state vector if it satisfies the following two requirements: 1. There exists a function g(·) that uniquely determines the response at any discrete time t as a function of the input and the state at t, 2. There exists a function f (·) that uniquely determines the state at any discrete time t as a function of the state at any earlier discrete time t 0 and the input sequence from t 0 to t − 1, for any t 0 and sequence . From this follows that x 1 = f (x 0 , φ 0 ), and generally that x t at any discrete time t can be propagated one step forward in time according to (Åstrom and Murray, 2008) x The function g(·) is known as the output or observation function, and the function f (·) as the state or state transition function, while x t is the state. At t = t 0 the state variables will have or be given some initial values, and from then on all information from the past is carried by the state variables. It should be noted that any specific current state may be the result of a large number of different initial states and input sequences, especially if t 0 is far back in time, and the initial state cannot therefore be reconstructed from the current state without detailed knowledge of the entire input sequence.

State-space augmentation
Assuming sufficient genetic variation, the mean phenotypic values in a population will evolve when the environment varies from generation to generation. As summarized in Section 2, mathematical modeling of this evolution for plastic organisms involves a statespace model, which assuming non-overlapping generations requires three equations. First, eq. (5) describes how a multivariate individual phenotype y i,t is expressed as a linear or nonlinear function of quantitative traits z 0,i,t and a continuously varying developmental environment (cue vector) u t . Second, eq. (6) describes how the individual fitness depends on the difference between the phenotype y i,t and the vector θ t of phenotypic expressions that maximizes fitness in the given environment at a given time (generation). Third, the state equation is traditionally and under the assumptions in Appendix A the multivariate breeder's equation (7) (Lande, 1979).
When eq. (5) is compared with the general state space output function (8), it is apparent that the environmental reference vector u ref must be part of either the current state or the current input. Since eq. (8) describes how the abstract discrete-time system interacts with the current environment through the vector φ t of input variables, and since a reference environment possibly far away from the current environment cannot be part of the current input, it must necessarily be an inherent part of the current state of the population. This is illustrated in Figure 2 in Section 1, where u ref = 0 is the environment where the phenotypic variance has a minimum, also when the environment varies in a range far from u = 0. The current individual state is thus , which leaves u t as the current input in eq. (8). Note, however, that also θ t in the fitness function (6) is an input variable, such that the total current input is φ t = u T Setting u ref =z c,t raises the question of possible biological mechanisms for individual traits z c,i,t . Ergon and Ergon (2017) proposed that individual reaction norms may be shifted along the cue-axis according to how individuals perceive the environment, which results in individual perception traits. In the general multivariate and nonlinear case such perception effects will lead to individual trait vectors z c,i,t , that thus should replace u ref in eq. (5). Assuming that z c,i,t , just as z 0,i,t , can be split into two independent and multinormally distributed parts, z c,i,t = x c,i,t + e c,i,t , with e c,t = 0, and that the additive genetic covariance ma- T is positive definite, the mean traits inz c,t will be evolvable. This results in a dynamical reference environment, which in a stationary stochastic environment will evolve into an equilibrium.
With u ref = z c,i,t , the model (5,6,7) will according to the multivariate breeder's equation result in the augmented state-space model while β t is the selection gradient. Here, G cc = 0 results in x c,i,t =x c,t , and thus a constant mean state variablē z c,t+1 =z c,t . In that special case we may without further consequences setz c,t = z c,i,t = u ref = 0. In case only some of the traits in z c,i,t have genetic variability, only such traits should be included in eq. (11), while the others may be set to zero. In eq. (11), W i,t andW t are still computed from eq. (6). Evolution in a stationary stochastic environment will lead to The expected geometric mean fitness will then be maximized (Appendix C). The reference environment vectorz c,t is closely related to the environment the population is adapted to, which may be denoted u 0 . As discussed in detail for the special case in Ergon and Ergon (2017), an unsymmetrical distribution of the phenotype y results in a difference betweenz c,t and u 0 , but at equilibrium in a stationary stochastic environment the expected deviation is independent of the mean values µ U and µ Θ of u t and θ t , respectively.
The idea of an evolvable reference trait was introduced in Ergon and Ergon (2017), but then based on biological arguments, and as a result of the novel idea of a perception trait as a means of relaxing constraints on the evolution of reaction norms. A main purpose of Ergon (2018) was to show that the plasticity reference environment not only may be modeled, but that it in principle must be modeled, in one way or another, as part of the quantitative genetics state-space model (although this is not necessary if the reference environment is not evolvable).
As discussed in Ergon and Ergon (2017), an important result of a fully evolvable plasticity reference environment is the property of complete genetic assimilation, by which "selection can act in such a manner as to turn an environmentally stimulated phenotype (i.e., plasticity) into a fixed response to prevalent environmental conditions (assimilation)" (Pigliucci and Murren, 2003). I here use the term 'complete genetic assimilation' as in Ergon and Ergon (2017), to describe the evolutionary scenarios where, after an abrupt environmental change, there is an initial increase in phenotypic plasticity, after which the mean plasticity is reduced and the environment range, or value, to which the population is adapted moves towards the current mean environment. This entails that all elements in the reference environment vector have genetic variability, such that they are evolvable.

Parametric reaction norm modeling
With z 0,i,t split into elevation traits z a,i,t and slope and shape traits z b,i,t , the reaction norm function in eq. (10) becomes Following Gavrilets and Scheiner (1993b), this function can be approximated by a power series in terms of the components of q environmental cues, with p different products of This yields the individual reaction norm equation whereũ i,t is a p × 1 vector of all the different cue products involved, as for example u 1,t − z c,1,i,t , (u 1,t − z c,1,i,t ) 2 , (u 1,t − z c,1,i,t ) (u 2,t − z c,2,i,t ) etc.. With m phenotypic variables, y i,t and z a,i,t are m × 1 vectors, and Z b,i,t an m × p matrix of individual quantitative traits (see multivariate and nonlinear simulation example in Subsection 4.3). The elements in Z b,i,t can be ordered in an individual vector z b,i,t in any chosen way. We may for example have z b,i,t = vec (Z b,i,t ), where vec (Z b,i,t ) is a vector form of Z b,i,t such that the columns are linked into a single column vector of length m × p. Note that all of z a,i,t , z b,i,t and z c,i,t may have independent additive genetic and non-additive parts. When eq. (10) is replaced by eq. (13), eq. (11) must be replaced by  z a,t+1 The total number of state variables is thus m+m×p+q, where q is the number of environmental cues. As exemplified in Ergon and Ergon (2017), the system (13,14) has the external references µ Θ , µ U and cov (U, Θ). It follows fromErgon and Ergon (2017) that a symmetric phenotypic distribution p(y) at equilibrium in a stationary stochastic environment results in E [z a,t ] = µ Θ and E [z c,t ] = µ U , while an unsymmetrical p(y) leads to deviations from µ Θ and µ U . These deviations will, however, be independent of the actual values of µ U and µ Θ , such that a positive definite matrix G cc gives complete genetic assimilation in any stationary stochastic environment. It also follows from Ergon and Ergon (2017) and McNamara et al. (2011), that the mean slope values around the origin in a stationary stochastic environment is a function of cov (U, Θ).

A step response simulation
Step responses for a system with the reaction norm used in Figure 2, are shown in Figure 3. The individual fitness was given by the Gaussian function and the state equation in eq. (14) was used, with diagonal matrices G aug and P aug . The population size was N = 1000, and at each generation individual trait values around the updated mean trait values were drawn from distributions according to G aug and P aug (see Appendix F for Matlab code).
Step response phase plotsc = f (ā) corresponding to Figure 3 are shown in Figure 4. This figure also shows phase plots with a smaller change in µ Θ , as well as with no change.
Mean fitness plots are shown in Figure 5. Note that the mean fitness for G cc = 0.5 is recovered after application of the environmental step function, while G cc = 0 results in a permanent loss of mean fitness. The reason for this permanent loss of mean fitness is that the point [6,12] in Figure 4 is reached by phenotypic plasticity, such that the phenotypic variance has an extra term u 2 G bb . Also note that the mean fitness in the original environment is somewhat higher for G cc = 0 (see Ergon and Ergon (2017) for details).

A ramp response simulation
The system in Subsection 4.1 above was also simulated with µ U and µ Θ = 2µ U as ramp functions over 5000 generations, starting at 1000 generations. The ramp responses are shown in Figure 6, and the corresponding Figure 3: Step responses for the system (14,15,16), with changes in the mean environmental cue µ U from 0 to 6, and in the mean µ Θ of the phenotypic value θ that maximizes fitness from 0 to 12, at t = 1000 (blue). The excitation signals u − µ U and θ − µ Θ on top of the mean values µ U and µ Θ were white, with variances σ 2 U = 0.4 and σ 2 Θ = 1.6, while cov (U, Θ) = 0.2. The population size was N = 1000, and the width of the fitness function was ω 2 = 10. The variance of the white and zero mean non-additive component was σ 2 e = 0.5, and the trait covariances were G aa = 0.5, G bb = 0.045 and G cc = 0.5. Results with G cc = 0, as in Lande (2009), are included (magenta). Note that the expected equilibrium mean slope value with G cc = 0 is E b = 0.5, while it with G cc = 0.5 is E b ≈ 0.22 (Ergon and Ergon, 2017).
mean fitness results are shown in Figure 7. Note that for G cc = 0.5 the mean traitsā andc follow µ Θ and µ U , respectively, with constant time lags, with a very minor decrease in mean fitness as result. A plot of y =ā +b (u −c) would show thatȳ follows µ Θ even better thanā, but thatȳ is more noisy thanā. After a transient period, the tracking errors are ∆ā = µ Θ −ā ≈ 1 and ∆c = µ U −c ≈ 2, whileb ≈ 0.5, such that y ≈ā +b (µ U −c) ≈ µ Θ . The tracking properties with G cc = 0 are poor, with a permanent loss in mean fitness as result.
Here it should be noted that the multivariate breeder's equation in general requires that the nonadditive part e of z = x + e is a zero mean and white stochastic process, which when plasticity is involved implies white environmental variations (Appendix A). When the temporal environmental variations are modeled as in eq. (15), however, it is not required that the Step response phase plots corresponding to the step response plots in Figure 3, with G cc = 0.5 (blue) and G cc = 0 (magenta). Note that with G cc = 0.5, the point [E [c] , E [ā]] evolves from [0, 0] to [6,12], such that the system at the end is completely assimilated in the new environment without use of plasticity (blue). With G cc = 0 the system evolves to [0,9], such that the point [6, 12] is reached by means of a plasticity component (dashed magenta). For G cc = 0.5, the phase plane plot for the case that µ Θ changes from 0 to 3 is also shown (violet). For G cc = 0, the point [6, 3] is in this case reached by means of pure plasticity (dashed black), without evolution ofā. With no change in µ Θ , G cc = 0.5 results is no final change in E [ā] (green), while G cc = 0 results in a negative value of E [ā] (cyan). Note that all the dashed lines have the equilibrium expected mean slope value E b = 0.5, but that for G cc = 0.5 the expected mean slope is finally reduced to E b ≈ 0.22 (Figure 3). Also note that the individual slope values vary aroundb with variance G bb , as indicated in Figure 3. For u = 6 and G cc = 0 this gives an extra phenotypic variance in y of 36G bb . These extra variations around the fitness peak give a reduction in mean fitness ( Figure 5).  16) is set to W max = 1.2, such that the mean fitness with G cc = 0.5 over time is approximately 1, which means that the population size is sustained.
environmental cue u is white. The requirement is instead that the residual e in eq. (15) is white, i.e. that the rest of the non-additive effects are white (spatial environmental effects, dominance, epistasis).

A multivariate and nonlinear reaction norm case
As discussed in Ergon and Ergon (2017), as well as in Section 3, an important consequence of an evolvable reference environment is complete genetic assimilation in any stationary environment. This is illustrated in the phase plot in Figure 4. Here, I in addition simulate a multivariate and nonlinear system, where complete genetic assimilation as defined in Section 1 takes place. Figure 8 shows step response phase portraits, i.e.ā 1 = f (c 1 ) andā 2 = f (c 2 ), for a system with the individual reaction norm model with correlated cues u 1 and u 2 , and with independent and zero mean white noise components e 1 and e 2 . The  fitness function was with correlated values of θ 1 and θ 2 . The state equation (14) was used, with z a,i,t = a 1,i,t + e 1,i,t a 2,i,t + e 2,i,t Figure 9 shows the corresponding mean plasticity slope plots. Note that onlyb 11 is different from zero in a stationary stochastic environment, which may have implications for the possibilities to find parameter values from collected data (see discussion in Section 5 and Appendix D).

Conclusions and discussion
This article is limited to a specific problem concerning phenotypic plasticity and evolution, but it still illustrates some of the complexities of evolutionary theory It is also assumed that no mutations are involved, although errors during DNA replication or other types of damage to DNA play vital roles (Rice, 2004). The model (13,14) also assumes that there is only a single population involved, and that this population has no influence on the environment, while in the real world many populations and subpopulations interact with each other and with the physical environment. Also the physical environment is in fact, often very much shaped by living organisms. The relatively new epigenetic theories of heritable changes in gene function that do not involve alterations in the DNA sequence is closely related to plasticity (Valena and Moczek, 2012). This is, however, not discussed in this article. The main point in the article is that the plasticity reference environment u ref is a population characteristic, that ought to be modeled as such, and this is the case also if it is set to zero. Under the assumption of constant additive genetic and phenotypic covariance matrices, the remaining choice is to model u ref as a vectorz c of mean traits. The corresponding additive genetic covariance matrix G cc may be zero, and we may then set u ref = 0, as is often done in traditional evolutionary models. However, if G cc = 0, at least some of the 'reference traits' will evolve in a changing environment, and they must then be included in the augmented state equation (14).
The question whether the reference environment traits are evolvable is a difficult one, but as indicated by especially the mean fitness plots in Figure 7, a population with an evolvable reference trait may over time have better mean fitness than without such evolvability. It therefore seems reasonable that such evolvability, at least in some populations, has evolved over the long time of life on earth. As Pigliucci (2008) states, it is clear "that evolvability -no matter how it is defined -does evolve". This appears to be closely related to the fact that the emergence of order in living systems requires far from equilibrium thermodynamics (Prigogine, 1977;Pross and Pascal, 2013). Note, however, that the mean reference trait vectorz c , as well as the elevation trait vectorz a , in any case evolves slowly, such that mutations may play an important role. As can be seen in Figure 3, the shape and slope traits in z b may evolve much more rapidly. As pointed out in Lande (2009), this may be necessary to prevent extinction as a consequence of rapid environmental changes. However, in a discussion of 'good' or 'bad' properties of wild populations, with regard to different traits and their influence on fitness and survival, it is important to remember that such population systems are not designed. They may certainly be affected by human activities, but they have in any case evolved based on the basically simple principles of natural selection and mutations, and if a certain property is 'good' it is so as a result of evolution. This is of course different for domesticated animals and plants, where breeding has played a major role.
One may ask why not the covariance matrix G also should be modeled, and included in the augmented state equation (14), and the answer is yes, in principle it should. In such cases, evolvability of G cannot be based on individual selection, but on for example mutations. Here, however, I assume that G is constant, such that augmentation with G is not necessary. See Arnold et al. (2008) for a review of empirical, analytical, and simulation studies of the G matrix, with a focus on its stability and evolution.
The biological mechanism behind evolvable 'reference traits' may be that individuals perceive the environment differently, as discussed in Ergon and Ergon (2017), and we could accordingly introduce individual 'perception traits' z c . As shown, such perception traits may be used also in multivariate and nonlinear cases, leading to parametrized models according eqs. (6), (13) and (14). As shown in Appendix E, perception traits may be used also in models based on index environment phenotypes, which through interpolation leads to function-valued models. In such models, however, G cc > 0 leads to non-normal distributions, which is in conflict with the assumptions behind the augmented multivariate breeder's equation (14). Another added difficulty is that the individual state variable z c,i,t does not fit into a function found through interpolation between phenotypic index traits z 1,i,t to z r,i,t (Irwin and Carter, 2013).
The state-space model (13,14) could have been formulated just as a generalization of the model in Ergon and Ergon (2017), based on biological arguments for perception traits. In addition to that, however, the intention in Ergon (2018) was to show that modeling of the reference environment is in principle necessary, from a basic state-space modeling point of view.
The most important result from a practical point of view, is that population systems with a positive definite covariance matrix G cc obtain complete genetic assimilation in any stationary stochastic environment, as discussed in Section 3. This means that the reaction norms at equilibrium after a change from one stationary environment to another, will be shifted to the new environment without any change in slope and shape. The adaptive peak, as determined by the state of the population, thus moves such that the population becomes adapted to the new environment. This movement is illustrated in a phase plane plot in Ergon and Ergon (2017), as well as in Figures 4 and 8. Long after the change in mean environment, complete genetic assimilation will return the mean fitness to its original value, which is an essential difference from the partial genetic assimilation obtained in Lande (2009). More generally, the mean phase space position values inz a andz c in eq. (14) will evolve to new equilibrium values, while the mean slope and shape values inz b after a transient period will return to the original values. As a result, the dynamical responses to variations around the mean of a stationary stochastic environment, will be independent of the specific mean environmental value. This is demonstrated in Figures 4  and 8. In practice, however, complete genetic assimilation in any environment must necessarily be limited by biological constraints, plasticity costs etc.
As mentioned in the Section 1, a main difficulty appears to be to find estimates of G cc from data. With linear reaction norms, it is theoretically impossible to find G cc from data collected at stationarity, but as discussed in Ergon and Ergon (2017), signs of G cc = 0 will show up in transient situations. For the simple example in Ergon and Ergon (2017), it may in fact be possible to find G cc from dynamical experiments, as used in engineering control system identification (Ap-pendix D). A more general application of such methods on evolutionary problems is an interesting area for future research.
It is interesting to note an important difference from modeling of most dynamical engineering systems, where without loss of generality equilibrium values of inputs u t and state variables x t can be set equal to zero (Ch. 5 inÅstrom and Murray (2008)). An equilibrium point of such a system represents a stationary condition for the dynamics, and such a point is often found from the phase portrait for the system (Ch. 4 inÅstrom and Murray (2008)). For discrete time systems an equilibrium point is characterized by x t+1 = x t for all t. For the parametrized evolutionary system (13,14), the situation is more complex, as illustrated by the simple step response simulations in Subsection 4.1. As shown in the basic simulation results in Figure 4 (blue), the equilibrium point after a step in the environmental mean value is with G cc = 0 given by 12, 0.5, 6]. Despite the different state equilibrium points, the final expected mean output value E [ȳ] = 12 is the same in the two cases. 6,12] is reached by use of a final and permanent plasticity component E b µ U = 0.5 × 6. With this follows an increased phenotypic variance, and a corresponding reduction in mean fitness. With G cc = 0.5, on the other hand, the same point is reached without a final plasticity component, which means complete genetic assimilation and thus a complete mean fitness recovery (Figure 5). Finally, it is interesting to note similarities with control system design. In eq. (14), the reference environmentz c is modeled as a multivariate discrete-time integrator, driven by the covariance between individual fitness and reference environment. This results in complete genetic assimilation in any stationary stochastic environment, and good tracking properties as shown in Figure 6. This is an interesting parallel to what is obtained with integral action in state feedback design (Åstrom and Murray, 2008), where an integrator driven by the deviation between set point and output value gives zero control error for constant set points, as well as good tracking properties. The difference is, of course, that biological systems are not designed, but have evolved (although proponents of intelligent design will presumably argue differently, e.g., Dembski and Ruse (2004).

A. The Price and multivariate breeder's equations
A.1. The Price equation The univariate Price equation (Price, 1970) for selection in a population can in its multivariate form be formulated as (Rice, 2004) where we for simplicity may assume non-overlapping generations and use one generation as time interval.
Here, Note that population size N may be finite, in which case and It is important to realize that the Price equation (19) is just an identity, where the right hand side is not a causal explanation of the left hand side. The notational use of covariance function and expectation operator for a final population size N may be unusual, but these notations are very much used in the literature, and the important thing is that we define and understand what they stand for here.
Proof of eq. (19): Since cov (W i,t , z i,t ) = E [W i,t z i,t ] −W tzt , all that must be shown is thatW . For this purpose, note that with population size N t at generation t given, and fitness W i,t defined as above, we find Defining z of f i,j,t as the traits of offspring j (belonging to generation t + 1) of individual i in generation t, and z k,t+1 as the traits of individual k in generation t + 1, eq. (22) gives Adding cov (W i,t , z i,t ) and subtracting E [W i,t z i,t ] − W tzt on the right hand side of eq. (23) completes the proof of the multivariate Price equation.
The multivariate Price eq. (19) is the starting point for the multivariate breeder's equation, based on a series of assumptions. The main reason for these assumptions is that the Price equation can be used only retrospectively, i.e. after selection and reproduction, when the individual fitness W i,t is known, and it cannot therefore be used to propagate the statez t forward in time. Also note that before reproduction fitness must be treated as a stochastic variable (Rice, 2008;Engen and Saether, 2013), but that type of modeling is not discussed in this article.

A.2. The multivariate breeder's equation
We are now ready to introduce the assumptions that are necessary in order to develop the multivariate breeder's equation (4).
1. The vector z i,t of individual traits is the sum of independent additive genetic effects x i,t and environmental effects e i,t (including non-additive genetic variation), i.e. z i,t = x i,t + e i,t .
2. The non-additive effects e i,t are zero mean and white.
3. There are no expected fitness weighted changes in the mean additive effectx t from one generation to the next besides selection, i.e.
4. The additive genetic effects x i,t and non-additive effects e i,t are jointly multivariate normal.
5. The additive genetic effects x i,t and fitness W i,t , as well as the non-additive effects e i,t and fitness W i,t ), are conditionally independent given the phenotypic trait z i,t , such that x i,t and e i,t influence fitness only through the values of z i,t .
From Assumption 1 follows that cov t ] cancels out, and also making use of Assumption 3, the Price equation (19) can be expressed asz From Assumption 1 follows that P t = G t + E t , where P t , G t and E t are the covariance matrices of z i,t , x i,t and e i,t , respectively. Since cov (x i,t , z i,t ) = G t + cov (x i,t , e i,t ) = G t , it further follows from Assumption 4 that the conditional distribution of x i,t given z i,t is multivariate normal with (Johnson and Wichern (2008), Result 7.14) From Assumption 5 and eq. (25) follows that (24) and (27) finally follows the multivariate breeder's equation (4), Here,

zi,t) Wt
= β t is the selection gradient.

B. Selection gradient for frequency-independent selection
When the individual fitness depends only on individual trait values z i,t and not on mean values, the selection is frequency-independent. When W i,t depends also on mean trait valuesz t the selection is frequencydependent, which implies that the fitness of a phenotype depends on its frequency relative to other phenotypes in a given population. Interactions among individuals in the population may then result in ∂W (zi,t) ∂zt = 0 also at equilibrium (Lande, 1979).
In order to find an alternative expression for the selection gradient than β t = P −1 t cov(Wi,t,yi,t) Wt , as follows from eq. (28), we may under the assumption of multinormal distribution of z i,t use that where c is a proportionality constant.
Using that per definition E [g (X)] = ∞ −∞ g (x) p (x) dx, frequency-independent selection gives, after differentiation ofW t and p (z i,t ), Since eq. (28) gives β t = P −1 t cov(Wi,t,zi,t) Wt , this gives the alternative selection gradient expression, first derived in a somewhat different way by Lande (1979) Equation (32) shows that β t is a gradient vector that points in the direction of the steepest mean fitness ascent, and that the length ofW t β t is given by the slope of the mean fitness function in that direction. Inserted into eq. (28) this leads to (Lande, 1979) Note that the pointz t will climb uphill in the multivariate mean fitness landscape, but along the steepest ascent only when G t is diagonal. Also note that E [z t ] at equilibrium will reach a peak in the mean fitness landscape only when the selection is frequencyindependent, such that the geometric mean fitness is maximized (Day and Taylor, 1996).

C. Mean fitness maximization
Evolution towards a fitness peak, as illustrated in Figure 1, will according to eq. (32) reach an equilibrium when E [β t ] = E ∂ ∂zt ln W t = 0. Since the sum rule of differentiation applies, we have which shows that the equilibrium is reached when the expected geometric mean fitness

D. Preliminary example of evolutionary system identification
System identification is a mature discipline in the engineering control community, with prediction error methods developed during the 1980s (Ljung, 1999), and subspace methods mainly from the 1990s and later (Qin, 2006). For evolutionary system identification, predictor error methods of the output error (OE) type is a straightforward choice.
Here is an example of the OE prediction error method applied on an evolutionary system identification problem. Assume a system essentially as in Ergon and Ergon (2017), with the individual reaction norm the individual fitness function where z a = a + e, z b = b and z c = c. Here, u is the environmental cue, while θ is the phenotypic value that maximizes fitness. Assume ω = 10, and G = Also assume θ t = µ Θ + v θ,t as shown in Figure 10, upper panel, where µ Θ is piecewise constant, while v θ,t is white noise with variance σ 2 v θ,t = 1.6. Assume u t = µ U + v u,t , where also v u,t is white noise with variance σ 2 vu,t = 0.4, and where µ U = 0.5µ Θ , and let θ t and u t be correlated, with cov (θ t , u t ) = 0.2. Inputs like µ Θ and µ U can formally be generated as pseudo-random binary signals (PRBS), which are often used for identification of engineering control systems (Ljung, 1999).
Apply the input sequences θ t and u t on the evolutionary system (35) to (37), and collect the mean phenotypeȳ t for t = 1 to T . Now assume that G cc is the only unknown parameter in the system (35) to (37). In order to find G cc , apply the input sequences θ t and u t on a system model with different values of G cc , and collect the resulting outputŝ y t . Also compute the prediction error ε t =ȳ t −ŷ t for each value of G cc . Results for three values of G cc are shown in Figure 10, lower panel.
We may search for the value of G cc that minimizes the quadratic criterion function J = 1 T T t=1 ε 2 t . Results for 100 values of G cc from 0.402 to 0.600 with population size N = 10000 are shown in Figure 11. Smaller population sizes increase the noise in this plot significantly.
For identification of several unknown parameters, a better search method is needed. This requires experimental data that are informative enough, but it also requires a theoretical identifiability analysis (it may not be theoretically possible to identify all parameters). Also note that we must assume a model, i.e. a linear or nonlinear reaction norm, a fitness function, and a covariance structure.
The applicability of dynamical system identification methods in an evolutionary setting remains to be investigated. Figure 10: The input function θ t = µ Θ + v θ,t (upper panel), and the model outputŷ t for G cc = 0 (magenta), G cc = 0.5 (green) and G cc = 1 (blue). The green curve is also the outputȳ t from the assumed evolutionary system (35) to (37) itself.

E. Modeling based on index environment phenotypes
In order to show that the plasticity reference environment needs to be modeled also in function-valued models, I here consider models based on index environment phenotypes. Such models lead to function-valued models through interpolation between the index environments (Kirkpatrick and Heckman, 1989;Kingsolver et al., 2001). Two additional problems in such cases are also described. For clarity, a univariate individual phenotype y i,t and a univariate environmental cue u t are assumed.
In an index environment model, the phenotypic values y i,t in eq. (10) are defined as the individual phenotypic values at r discrete index environments, where γ is the in general nonlinear reaction norm function, and where z c,i,t is the individual reference trait (which is set to zero in traditional models). The individual phenotypic values are also used as individual traits, i.e. z 1,i,t = y 1,i,t etc., and these traits have a phenotypic covariance matrix P r = cov (y i,t , y i,t ), and a corresponding additive genetic covariance matrix G r . Setting z c,i,t = 0, the multivariate breeder's eq. (4) would thus lead to where G rc and P rc are the covariance matrices of the vector y i,t augmented with z c,i,t . This raises two problems. First, with z c,i,t = 0 the traits z 1,i,t = y 1,i,t etc. will not be normally distributed, even if the reaction norm has underlying normally distributed parameters, which is in conflict with the assumptions behind the multivariate breeder's equation (Lande, 1979). Eq. (40) will therefore be more of an approximation than it otherwise would be. Second, the state variable z c,i,t does not fit into a function found through interpolation between z 1,i,t to z r,i,t . A similar problem in a life-history trait setting is discussed in Irwin and Carter (2013). A possible solution is to assume that z c,i,t is independent of z 1,i,t to z r,i,t , and model the evolution ofz c,t independently, i.e. to use eq. (39) combined with z c,t+1 =z c,t + 1 W t G cc P −1 cc cov (W i,t , z c,i,t ) .
F. Matlab code for step response example in Subsection 4.1 %% S t e p r e s p o n s e example c l e a r T=10000 N=1000 w2=10; varu = 0 . 4 ; v a r t h e t a = 1 . 6 ; c o v u t h e t a = 0 . 2 ; x=c o v u t h e t a / varu ; y=sqrt ( v a r t h e t a −xˆ2 * varu ) ; Gaa = 0 . 5 ; Gbb= 0 . 0 4 5 ; Gcc = 0 . 5 ; Paa=Gaa + 0 . 5 ; Pbb=Gbb ; Pcc=Gcc ; v a r e = 0 . 5 ; %% Generate u ( t ) and t h e t a ( t ) du=sqrt ( varu ) * randn (T , 1 ) ; d t h e t a=x * du+y * randn (T , 1 ) ; fo r t =1:1000 u ( t )=du ( t ) ; t h e t a=d t h e t a ; end fo r t =1001:T u ( t)=6+du ( t ) ; t h e t a ( t )=12+ d t h e t a ( t ) ; end %% I n d i v i d u a l p o p u l a t i o n t r a i t s %% around abar , b b a r and c b a r fo r t =1: