Dynamic model adaptation to an anaerobic digestion reactor of a water resource recovery facility

This study deals with model adaptation of the AM2 model to an anaerobic digestion reactor of a water resource recovery facility, namely a 6000 m reactor at VEAS WWRF, the largest of Norway. The model is based on the mass balance with six states including acidogens, methanoges, alkalinity, organic substrate, volatile fatty acid and inorganic carbon. The model adaptation is applied firstly to simulated reactor data for testing the algorithms, and then to experimental data. The experimental data are collected from laboratory analysis and online measurements from January to October 2017. The data of the first 100 days are used for model identification, and the remaining data for model validation. Identification analysis is based on the Fisher Information Matrix and the Hessian matrix. Also, a sensitivity analysis of the parameter estimates is accomplished.


Introduction
Anaerobic digestion (AD) of wastewater is a biological process to produce biogas consisting of methane and carbon dioxide in the absence of oxygen. Mathematical models of the AD process can be useful for control system design and testing, controller tuning, model-based control, state and parameter estimation, optimization of reactor design and operation, analysis, training, etc.
Several mathematical models of the AD process exist. The Anaerobic Digestion Model Number 1 (ADM1) is a well-known model with five stages including disintegration, hydrolysis, acidogenesis, acetogenesis and methanogenesis Batstone et al. (2000). The ADM1 model has 35 state variables with around 100 model parameters, and it represents the gases methane (CH 4 ), carbon dioxide (CO 2 ) and hydrogen (H 2 ). The Hill model is a relatively simple model for representing the behaviour of a biogas reactor with four state variables from which CH 4 gas is calculated Hill and Barth (1977). In the present study, the Anaerobic Digestion Model number 2 (AM2) with six states and 13 model parameters is applied Bernard et al. (2001). The model is implemented in MATLAB. Before using a model effectively, the values of unknown model parameters have to be estimated from experimental data. The main goal of this study is dynamic model adaptation to a simulated biogas reactor and also to experimental data from the VEAS. The model adaptation comprises estimation of the model parameters and model validation. Root Mean Square Error (RMSE) is used to evaluate both the model identification and the model validation. Identifiability analysis and sensitivity analysis are investigated. The identifiability analysis is based on calculating and analyzing the Fisher Information Matrix (FIM). The convexity of the optimization problem of the model adaptation is analyzed with the Hessian matrix at the estimated model parameters Ljung and Chen (2013). The model identifiability is addressed also with a sensitivity analysis. The need for identifiability analysis can be stated as follows. When the measurement variables are not influenced by one or several model parameters, it is possible to have several different identified model parameter sets.
There exist several studies about the model adaptation to experimental data from different biogas reactors based on different models. A pilot anaerobic reactor fed with diary manure at Foss farm in Skien, Norway has been adapted to the Hill model Haugen et al. (2013). The AM2 model is used for production of biogas in the raw industrial wine distillery vinasses in a local winery at Narbonne, France Bernard et al. (2001). There are several studies using the ADM1 model for biogas reactors, such as olive mill solid waste Boubaker and Ridha (2008), lignocellulosic biomass Koch et al. (2010), and sewage sludge digestion for various water resource recovery facilities Shang et al. (2004Shang et al. ( , 2005. The outline of the article is as follows. Section 2 in-cludes a process description and the AM2 model equations. Section 3 explains about dynamic model adaptation of the AM2 model to a simulated biogas reactor. The parameters of the model are identified and the model will be evaluated based on the estimated model parameter. Also a practical identifiability analysis is described based on the Fisher Information Matrix (FIM) and the Hessian matrix. Then the model is validated on a wide range of the operating conditions covering 150 days. A sensitivity analysis for all model parameter estimation is presented. Dynamic model adaptation of the AM2 model to experimental data from the VEAS reactors will be discussed in Section 4. Discussion and Conclusions are given respectively in Section 5 and Section 6.

Process and model descriptions 2.1. Process description
The reactor at the VEAS plant has the form of a cylindrical tank. The diameter of the tank is 19 m, and the maximum volume of the tank is 6000 m 3 . The reactor is equipped with a level sensor. The effective volume can be calculated based on the height of the material inside the reactor which is measured by the level sensor. The reactor temperature is measured. The density of the sludge is assumed same as water density, 1000 kg/m 3 .
The AR expresses the reactor stability Lee et al. (2015). AR less than 0.3 indicates a stable AD process Drosg (2013). An important parameter in AD processes is pH which is included in the AM2 model. The reactor temperature and pH have effects on the performance of AD processes Hajji et al. (2016)Jayaraj et al. (2014. Appropriate pH values for methane production in the mesophilic temperature range is investigated in Kheiredine et al. (2014).

Selection of a dynamical model
In this study, the AM2 model is selected because it is considerably simpler than the ADM1 model, while comprising the relevant parameters and variables. In Figure 1, the AM2 model is compared with the ADM1 model. In the AM2 model, organic material is converted by microorganisms in two phases which are called acidogenesis and methanization. In the first phase, the acidogenic bacteria, X 1 , consume the organic substrate, S 1 , and produce CO 2 and volatile fatty acids, S 2 . In the second phase, the population of methanogenic bacteria, X 2 , uses the volatile fatty acids to produce CO 2 and methane. Z and C are the total alkalinity and total inorganic carbon, respectively. Mass balances give the following differential equations, which constitute a state space model of the AD reactor Bernard et al. (2001).

The original AM2 model
where Z in , S 1in , S 2in , C in in (4)-(7) are, respectively, concentration of the inflow of alkalinity, substrate, VFA and dissolved inorganic carbon. D is the dilution rate, or normalized flow, defined in (8).
F is flow rate and V is the effective volume of the reactor.
The flow of inorganic carbon from the liquid phase to the gas phase, q C , is calculated accordance to Henry's law, cf. (9).
where k L a is liquid-gas transfer coefficient, K H is Henry's constant, and P C is CO 2 partial pressure. Methane flow is directly related to the methanogenic rate, µ 2 , cf. (10).
The growth of acidogenic bacteria, µ 1 , is based on Monod type kinetics and the growth of methanogenic bacteria, µ 2 , is assumed based on Haldane type kinetics. The growth of bacteria depends on temperature and temperature change. Generally, an increase in temperature before reaching the optimum growth temperature will increase bacteria activity and the growth of bacteria. If temperatures get more than the optimum growth temperature, enzyme activity will decrease Eddy et al. (2003). The consensus is that temperature changes greater than 1 • C/d affect process performance, thus temperature variations of less than 1 • C/d are recommended. In general, the variables in the AM2 model are divided in three groups called state variables, environmental variables, and model parameters, respectively. The state variables and the environmental variables are shown in Figure 3. In this study, the model parameters are estimated based on observing input-output and some of state variables are measured.

Modified AM2 model
It can be assumed that a part of the microorganisms continuously die during the process. In this study, we propose to enhance the AM2 model with death rates of acidoges and methanoges microbes, k d1 , k d2 , both with values of 0.2 d −1 , as in the AMD1 model Batstone et al. (2000). Thus, (2) and (3) are modified to become (4) and (5), respectively.
In the VEAS reactor, the inflow and outflow rates may be different, implying a variation of the level and hence the volume of the materials in the reactors varies. Consequently, (4)-(7) are modified to become (13)-(16).
Temperature has effect on reaction rates but is not considered in the original AM2 model. The temperature dependency of the reaction rates may be based on the van't Hoff-Arrhenius model Eddy et al. (2003) or on the Hashimoto model Hashimoto et al. (1995). We assume here the Hashimoto model, as it is based on a study of the behaviour of a large number of real reactors. The Hashimoto model was proven to represent very well the temperature-dependency of AD reactor dynamics in Haugen et al. (2013).
The Hashimoto model implies that the temperature dependency of the maximum reaction rates is modelled as in (17).
for temperatures between 20 0 C and 60 0 C. c H is the Hashimoto's factor. µ 0 is a constant value. We assume that µ imax in 35 0 C will be identified as a model parameter. Thus, the temperature dependency is applied based on (18) for enhancement of the original AM2 model.
The modified AM2 model is the original AM2 model with aforementioned modifications.

Experimental data
Experimental data are collected daily from laboratory analysis and from sensors installed on the pertinent VEAS reactor for the period of 1st of January to 10th of October 2017. The list of the collected data is mentioned in Table 1.

Model adaptation to a simulated biogas reactor
Model adaptation is applied to a simulated model, where of course the true parameter values are known, as a necessary check of the selected parameter estimation algorithm. The model adaptation consists of two parts: (1) Parameter estimation, and (2) model validation where the reliability of the model is validated with data not used in the estimation Stare et al. (2006). Simulated data over the first 100 days are used for estimation, and data over 150 days for model validation.
15 model parameters are estimated, of which 13 model parameters are in the original AM2 model while the remaining two are one parameter of Hashimoto's formula, (18), and Z dis , which is a parameter to represent uncertainty in the influent alkalinity, cf. (19).

Selection of a criterion
The parameter estimates are those minimizing the quadratic criterion in (20).
where Y r,i is the simulated "real" data vector, and Y p,i (θ) is the prediction vector with respect to the model parameters vector, θ, at the same instants t i . In this study, Y is the so-called measurement vector defined by (21).
Q is a weight matrix defined by (22).
where W is the inverse of the measurement error covariance matrix, (23). where y is the mean of measurement values for n samples which is a proper subset of the original data.

Computational solution of the optimization problem
We use the fmincon NLP solver in MATLAB to minimize the criterion (20). In general, fmincon solves the optimization problem of (25).
In the present parameter estimation problem, neither the inequality constraints nor the equality constraints are included. Thus, A, B, A eq and B eq are empty matrices. LB and UB are chosen respectively 10 time smaller and larger than the model parameters in Bernard's work Bernard et al. (2001).

Assumptions
We do not have measurement data of the inflow inorganic carbon. However, it is assumed that C equals to 0.0058 mol/L in a reactor fed with waste waster Bergland and Bakke (2016). The initial relation between concentration acidogens and methanoges bacteria is assumed as in Haugen et al. (2013): Note that during the simulation, X 1 and X 2 evolve individually and according to their material balances. The assumed initial states are shown in Table 2. Figure 4 shows the real inputs used as inputs to the simulation.
Analysis of the feed sludge taken at representative conditions shows S 2in = 82 mmol/L, Z in = 72 mmol/L, and pH in = 6.37. It is supposed that concentrations of VFA and alkalinity and pH in the feed sludge are constant during the simulation.
One of the parameters to be estimated is the alkalinity offset, Z dis , cf.19. Since Z in is not measured during the preparing the data and also to satisfy the alkalinity ratio threshold, Z dis must be estimated to have a reasonable and stable process.
We assume that the simulated AM2 model has model parameters as shown in Table 3. The initial model parameter values are according to Bernard et al. (2001).

Results of the parameter estimation
We decided to use simulated data over 100 days for the parameter estimation. The time-step of 0.1 d. The results of the parameter estimation are shown in Table  3. The table shows the results for noise-free measurements and for noisy measurements where a uniformly For the noise-free case, the parameter estimates are virtually equal to the parameter values used in the simulator, indicating that the parameter estimation procedure outlined in Section 3.1 works satisfactorily. There are however parameter offsets in the case of noisy simulated measurements, which is to be expected.

Evaluation of impact of measurement noise
Root-Mean-Square Error, RMSE, is a useful criterion to evaluate the relative quality of a parameter estimation procedure Muroi and Adachi (2015). RMSE is defined in (27).
where y p,i θ is model-predicted measurement and y r,i is assumed real (but here: simulated) measurement sample number i. Here, we use RMSE to evaluate the impact of assumed measurement noise on the parameter estimation. Table 4 shows RMSE for five simulated measurements. It indicates small differences between the simulated methane and carbon dioxide gas flows. Therefore, both biogas productions resemble quite well with the measured methane and carbon dioxide gas flows. RMSE for pH, S 2 and Z are also small with acceptable differences.   The mean and maximum of the measurement values are calculated based on the identification data set (150 days). The maximum difference between the estimated data and the simulated one in the identification period are mentioned in Table 5. Comparison between the maximum differences and the mean of the measurements indicates that the model error is acceptable.

Structural and practical identifiability
Identifiability analysis is used to assess whether the selected parameter estimation method will work. Two methods of identifiability analysis are presented in the following.

Structural identifiability
Structural identifiability, so-called theoretical identifiability, depends on the model and it is a property of model structure Zhang et al. (2010). The structural identifiability is based on a mathematical approach to show the correlations among the parameters for noiseless data under ideal conditions. For correlated parameters, a change in one parameter can be compensated by a change in another parameter, and the model fitness remain still satisfied. Therefore, structural identifiability is not sufficient to determine if the model parameters can be estimated uniquely for poor data Saccomani (2013).
Since there are 15 model parameters in the modified AM2 model to identify and 5 measurement values, algebraically evaluating the structural identifiability of the parameters is complicated. Also, since the goal of this study is model adaptation to practical data it is possible that parameters which are structurally identifiable are practically unidentifiable. Consequently, we do not focus on the structural identifiability in this study.

Practical identifiability
Practical identifiability depends on experimental condition and quality and the quantity of the measurement variables Petersen et al. (2001). The practical identifiability is a numerical method to determine whether it is possible to achieve a unique estimated parameter set from the available data Dochain (2013). The effect of a small deviation in each model parameter will be monitored to check the output sensitivity. In fact, the practical identifiability of model parameters is based on the output sensitivity function. The Fisher Information Matrix (FIM), and the Hessian matrix are used for this purpose. They are presented in the following subsections.

Identifiability analysis with Fisher Information Matrix
FIM is based on the sensitivity function, ∂y ∂θ , and the measurement accuracy, cf. (28).

FIM =
where Q is the weight matrix. The size of the FIM is p × p, where p is the number of the estimated parameters. FIM provides information about estimation accuracy. Practical identifiability analysis is a numerical approach which is performed by calculating the FIM locally at the estimated parameters. If the determinant of the FIM becomes zero, some sensitivity functions are linear combination of each other. Thus, the unique parameter values cannot be obtained from the measurement data. The parameters are practically identifiable when the determinant of FIM becomes nonzero Li et al. (2018) or, in other terms, the FIM rank becomes full and equals the model parameter number Komorowski et al. (2011)Petersen et al. (2001.

Quality of identifiability analysis
It is necessary to check the quality of identifiability to be certain about reliability of the estimated parameters.
One way to examine the quality of identifiability is investigation on convexity of the model at the estimated parameters based on condition number of the FIM. The condition number of the FIM shows whether the parameter estimation is in well-condition or illcondition.
Another way is investigation on sloppiness of model based on distribution of the eigenvalues of the FIM. Investigation on the eigenvalues of the FIM is useful to determine if the model is a so-called sloppy model. The concept of sloppiness was introduced in Brown and Sethna (2003) when the model output is not sensitive to changes in some of the model parameters which are called sloppy parameters. Meanwhile, stiff parameters are the model parameters that the model output is sensitive on their changes. If there is a clear gap between the eigenvalues of the FIM of model, it means there are some sloppy model parameters. The large eigenvalues are corresponding to stiff parameter and the small eigenvalues are corresponding to sloppy parameters. The existence of sloppy parameters causes the case that it is impossible to identify all model parameters uniquely Villaverde (2019). In fact, the small eigenvalues indicate that there are combinations among some model parameters.

Identifiability analysis for the case study
We compute numerically the FIM at the estimated model parameters with one percent changes. The determinant of the FIM becomes non-zero, so the FIM is full-rank matrix. The quality of identifiability analysis based in the convexity of the model at the estimated parameter set can be checked by computing the condition number of the FIM at this point. The identifiability of model parameters depends on the condition number of the FIM Cintrón-Arias et al. (2009).
The amount of the condition number of the FIM becomes 2.75 × 10 12 which presents an ill-condition. The eigenvalues of the FIM are as follows: The FIM has another outcome which is computing the standard deviation of the estimated parameters. The square root of the diagonal elements on the inverse matrix of the FIM is the standard deviation of the parameter estimation, SD in Table 6.

Identifiability analysis with Hessian matrix
The Hessian (matrix), H, of a function describes the local curvature or convexity or positive definiteness of a function. If all eigenvalues of the Hessian are positive, the function is convex. The larger eigenvalues of the Hessian, the stronger convexity. When the fmincon function is used for parameter estimation, the Hessian returned by the function expresses the identifiability of the parameters Rothenberg et al. (1971). A weak local identifiability occurs when eigenvalues of the Hessian matrix are positive but close to zero Little et al. (2010). For our parameter estimation problem using simulated measurements, all of the eigenvalues of the Hessian are positive indicating identifiability of the parameters. However, the Hessian condition number, which is the ratio of the highest eigenvalue to the lowest eigenvalue Strang et al. (1993), indicates the degree of identifiability. The condition number is calculated. Its high value of 1.8425×10 12 expresses that the parameter estimation problem is ill-conditioned, and hence, the degree of identifiability is low.

Model validation method
Since the quality of the parameter estimation depends on the amount and quality of the measurement data during the identification procedure, the performance of the model with the estimated model parameter should be evaluated in a wide range Dochain and Vanrolleghem (2001). The model validation is to determine if the model is good enough for the presenting the be- haviour of a biogas process Guidi (2008). Based on the estimated model parameters, we assess the behaviour of the process on the validation data set. The validation is based on data from 150 days from mid of April till October.

Sensitivity analysis
The sensitivity of the parameters expresses the reliability of model parameter estimation. We calculate the sensitivity of variable y to the parameters with (30) Bernard et al. (2001).
where y(θ, u, x 0 ) is the simulated measurement variable vector with the model parameter vector θ , the initial state values, x 0 and the input, u. Figure 5 shows the sensitivity of the methane and carbon dioxide production and pH and also on the amount of VFA and alkalinity to the yield coefficients which are part of the estimated model parameters.
According to the result of sensitivity plots in Figure  5, we may conclude: • S 2 is influenced by k 1 , k 2 and k 3 . The effect of k 1 is ignorable comparing to the effects of k 2 and k 3 .
• Z is not influenced by yield coefficients.  • CH 4 is influenced by k 1 , k 2 , k 3 , k 6 . The effects of k 1 and k 3 are ignorable comparing to the effects of k 2 and k 6 .
• CO 2 is influenced by all yield coefficients, but the most influences are regarding k 5 , k 4 and k 2 respectively.
• pH is influenced by all yield coefficients. k 4 , k 5 and k 6 have the most influences on pH. These effects are less than 2%.
Among the yield coefficients as model parameters, k 6 has a strong and direct effect on the methane production. pH and carbon dioxide production are influenced by k 6 changes but these influences are just around 1 % with regard to a 50% change in k 6 . Sensitivity analysis regarding the rest of the model parameters is discussed in Appendix A.

Model adaptation using experimental VEAS data
In this section, we estimate the AM2 model parameters using experimental data from the VEAS reactor. The parameter estimates minimize the optimization criterion (20). Two methods are considered for minimization, the gridding method, also known as the brute force method, implemented from scratch in MATLAB, and nonlinear programming (NLP) using the fmincon function in MATLAB. As the number of parameters to be estimated is 15, the gridding method is impractical because of an execution time of days based on an acceptable accuracy of the parameter estimation. Hence, fmincon function is chosen. Its execution time was a few minutes. The initial values of the model parameters are based on Bernard's work Bernard et al. (2001) and are shown in Table 8. The dilution rate, the reactor temperature and the feed sludge profile are depicted in Figure 4.

Parameter estimation and model validation
The parameter estimation is implemented on 1200 samples over 100 days. The estimated model parameters from the simulation are stated in Table 8. Both experimetal and simulated variables with the estimated parameter values are shown in Figure 6. Comparison between simulation results and measurements indicate that the AM2 model can reproduce the behaviour of the AD reactor in the VEAS for the pertinent period. To evaluate the model adaptation, RMSE is calculated for both the model identification and validation, see Tables 9 and 10.

Identifiability analysis
The RMSE values shown in Tables 9 and 10 indicate that the model adaptation is acceptable. As an alternative approach to identifiability analysis, Table 11 shows the FIM and the Hessian. The condition numbers of both the FIM and the Hessian are very large, indicating that the model parameters are hardly identifiable using the given data set. Which of the parameters which are not identifiable, however, can not be identified.

Discussion
The modified AM2 model which has been adapted to the VEAS reactor is a relatively simple model compared with ADM1 model. Comparing with Hill's model, this model contains the pH, CO 2 , alkalinity and inorganic carbon. Considering alkalinity is important to calculate alkalinity ratio and monitor impact on the stability and performance quality. The modified AM2 model is assumed to be a sufficient accurate model to represent the behaviour of the biogas process.
In the original AM2 model, the impact of temperature is not considered. In addition, it was assumed that the effective reactor volume is constant. Temperature and volume variations are taken into account in the modified AM2 model.
In our study, the model adaptation is an optimization problem to minimize the prediction error. Estimation of model parameters is implemented for both a simulator, where of course the true parameter values are known, and for a real VEAS plant from which we have experimental data. In the simulation-based model adaptation, the effect on measurement noise and uncertainties are investigated. Regard to identifibility analysis based on FIM and Hessian matrix, the model identification is in ill-condition and it shows the selective data are hardly sufficiently informative to estimate reliably the model parameters.
The sensitivity analysis states which model parameters have the most effect on measurements such as the biogas production.
In this study, collected data is not enough informative to estimate the model parameters and there has been lack of data in the inflow sludge. In the next project, first of all, the characteristics of the sludge in the VEAS including the concentration of the influent VFA, Alkalinity and inorganic carbon should be measured during preparing the data. Uncertain measurement of the influent alkalinity, Z in implies that we had to add a new parameter, Z dis , to compensate alkalinity disturbance and also to satisfy the AR condition to have a stable performance.
Each model parameter has specific physical concept. It is possible to measure and analyze some of them indirectly. For example, k La as a model parameter, can be identified by measuring pH, C, flow rate and partial pressure of CO 2 at steady state Bernard et al. (2001). Specially, k La is an unidentifiable parameter because of a low sensitivity on the measurement variable. In this study, we could not compute this parameter because of the lack of experimental measurements.
Volatile Suspended Solids, VSS is an informative parameter consisting of information about concentration of acidogenic and methanogenic bacteria. VSS helps us to be able to track the behaviour of X 1 and X 2 more accurately.
Study on temperature dependence based on the Hoff-Arrhenius model is suggested. It should be compared with the Hashimoto temperature model.

Conclusions
In this study, we have adapted on the AM2 model to both a simulated biogas reactor and to a real reactor at the VEAS plant. The biogas process has some experimental limitation such as alkalinity ratio. We state the model adaptation as an optimization problem. To have reliable identified model parameter set, we need to analyze the identifiability to be sure that there is just one unique set. We focused on the practical identifiability analysis. Investigation on convexity and sensitivity and sloppiness are carried out around the estimated points. The Hessian matrix and the Fisher Information Matrix provide a good information which is useful for a practical identifiability analysis. In spite of the acceptable model error in the considered period, the collected data is hardly not enough informative to obtain reliable model parameters for the modified AM2 model.

A. Sensitivity analysis
Here, there is a discussion about the sensitivity of the AM2 model based on the kinetics model parameters, α, Hashimoto's factor and Z dis . Figure 7 shows that the sensitivity of the methane and carbon dioxide production and pH and also on the amount of the VFA and alkalinity concentrations to the kinetic model parameters.
Based on Figure 7, we conclude: • S 2 is influenced by µ 1max , µ 2max , K S1 , K S2 and K I2 . The effects of µ 2max and K S2 are remarkable comparing of the effects of the rest of the kinetic parameters.
• Z is not influenced by the kinetic parameters.
• k La , K S1 and K I2 have a little influence on the simulated model measurement variables.   Figure 8 shows that the sensitivity of the methane and carbon dioxide production and pH and also on the amount of VFA and alkalinity concentrations to α, c H and Z dis . The results of sensitivity analysis shows: • S 2 is influenced by α, c H . α has a direct effect on VFA while as c H has reverse effect on VFA. Their effects are 20% and 2%, respectively.
• Z is influenced by Z dis . There is no effect on alkalinity by changing in α or c H .
• CH 4 is negligibly influenced by α, c H .
• CO 2 is influenced by α, c H and Z dis .
• pH is influenced by α, c H and Z dis . The effects of α and c H are ignorable comparing the effect of Z dis on the pH. Table 12 shows the percentage of the each individual model parameter effect on the each measurement variable with regard to 50% changes in each model parameter.
This table shows that K s1 , K I2 , k La have ignorable effects on the all five measurements so we can assume these model parameters are unidentifiable based on the collected data.

Nomenclature
The nomenclature is in alphabetical order.

Abbreviations
The abbreviations are in alphabetical order.