Parameter Estimation for a Gas Lifting Oil Well Model Using Bayes' Rule and the Metropolis-Hastings Algorithm

Oil well models are frequently used in the oil production process. Estimation of the unknown parameters of these models has long been a question of great interest in the oil industry. Data collected from an oil well system can be useful for identifying and characterizing the parameters of the corresponding model. In this article, we present a solution for estimating the parameters and uncertainty of a gas lifting oil well model by designing a Bayesian inference scheme and applying the Metropolis-Hastings algorithm. To present and evaluate the estimation, the performance of the chains and the distributions of the parameters are shown, followed by posterior predictive distributions and a sensitivity analysis. Compared with conventional maximum likelihood estimation methods, which try to identify a single optimum value for each parameter, more information about the parameters is obtained with the proposed approach. The insights gained from this study can benefit optimization and advanced control of oil well operation.


Introduction
Gas lifting is one of the dominant methods for oil extraction, especially for light oil. A simplified diagram for a gas lifting well is shown in Fig. 1. A subsurface reservoir is filled with hydrocarbons, which are extracted through a well drilled into the reservoir. In a gas lifting process, a compressor injects gas into a distribution manifold connected to the annulus, the space between the tubing and the outer pipe. The gas from the annulus passes into the tubing, mixing with the oil and water coming from the reservoir to produce a low-density fluid that flows out of the well.
The pressure difference between the reservoir and the bottom hole is critical for oil production. The density reduction of the multi-phase fluid in the tubing helps to increase this pressure difference, which is the main reason that gas is injected into the tubing. The injected gas also reduces the flow resistance Brown (1977), which further helps push the oil from the reservoir towards the gathering manifold. The density of the fluid in the tubing therefore plays an important role in the gas lifting model. After the fluid is produced from the well, it flows to the separator where the gas, oil and water are separated. The final stage of the flow process is recirculating the separated gas to the compressor for the next production cycle.
Modelling is essential for describing the oil production process and benefits oil production operations. Through analysis of the physical properties, a mathematical representation is developed as the model of the system. An accurate model plays a pivotal role in optimization and advanced control, especially in model-based control strategies such as Model Predictive Control. Parameter estimation has long been a question of great interest in the modelling field. Apart from model parameters, initial states and inputs are also common targets to be identified in the modelling step, and they demand similar methods. Therefore, the meaning of parameter estimation has been broadened to refer to the evaluation of unknown constants, which includes a multitude of inputs and initial states in systems. It is well established in a variety of studies that these parameters, initial states and inputs are constant, either at all times or within a certain period of time.
Statistical methods are commonly used for estimating model parameters. Among these methods, Bayesian estimation Lindley (1961); Jeffreys (1935) has recently been applied to estimate parameters of models in different engineering applications, such as internal combustion engines Khaliullin et al. (2019), flow meters Gyamfi et al. (2018), power grids Hu et al. (2021) and so on. One of the main advantages of Bayesian estimation is that it offers an explicit way to include engineering knowledge of the parameters in the estimation process, and it provides effective ways of addressing uncertainty. However, for complex models such as many oil well models, Bayesian estimation requires intensive computations Cumming and Goldstein (2010).
Early research publications paid particular attention to simplifying the original Bayes analysis to decrease the computational cost. The Bayes linear approach is one such method, where the estimate (mean) and its beliefs (variance) are linearly updated using the outputs from the simulator without producing the posterior distribution Craig et al. (1996). This update process is also called linear fitting. The acceptance of outputs is vital for the estimation; however, the criterion for accepting outputs is difficult to design and requires empirical knowledge.
Building on this work, Craig et al. identified a region of input values Craig et al. (1997). This input collection provided adequate beliefs and information about the uncertainty, compared with the traditional maximum likelihood estimation method, which only tries to find a single estimated value. Owing to the limited computational power available at the time, this method had to compromise on the number of iterations. A joint prior was chosen, and only the prior means, variances and covariances between all quantities of interest were used for the belief update. Although this method offers a way to balance computational cost against the benefit of accurate calculation, the design of the joint prior and the selection of inputs are complex. A lack of prior exploration might also lead to an insufficient input collection.
Another research study Busby et al. (2007) designed emulators to evaluate the uncertainty of a reservoir. Combining fractional factorial designs and sequential designs, random processes with posterior mean and variance were used to describe the emulators, so that the required number of simulations was kept as small as possible. Numerical results for two-dimensional and high-dimensional inputs were presented. Because of the small number of simulations, this study might not have been able to explore the whole input region, and the results might correspond to local optima, owing to limited exploration and comparison.
Based on multiscale computer experiments, an oil reservoir model evaluation was simplified using Bayes linear uncertainty analysis Cumming and Goldstein (2010). Similar to the Bayes linear approach, the probability in the conventional Bayes analysis is substituted by expectation in the Bayes linear uncertainty analysis. To begin the process, evaluations of the coarse simulator were performed on different wells. Once the emulation of the coarse simulator was completed, coarse and accurate emulators were combined to update the prior. Meanwhile, the values of the best model input were addressed by the history matching process. Following the history matching, the study reduced the set of possible values of the model input and added time points for forecasting. The final stage of approximation provided a means of applying Bayesian analysis to a complex model, but it cannot identify and characterize accurate uncertainty for individual parameters.
Nowadays the increase in computational speed and the development of more efficient algorithms make Bayesian techniques accessible for solving large-scale, complex problems, but they are still not well known among engineering practitioners. There is a relatively small number of publications applying Bayesian statistical methods to the oil industry for improving the uncertainty analysis of well systems. One study by Maraggi et al. Maraggi et al. (2020) chose a Bayesian approach to estimate two parameters of tight oil reservoirs, which was simplified as a single dimensionless equation. Following an adaptive Markov Chain Monte Carlo (MCMC) algorithm, posterior predictive checks were adopted to examine the inferences. Uncertainty of the estimated ultimate recovery was addressed in the study. The convergence of the chains, acceptance ratio, posterior distributions, correlation between posterior parameters, the reliability plot and posterior predictive checks were presented to evaluate the approach.
Another recent study Costa et al. (2021) applied a similar method to an electric submersible pump system. With the inputs as prior, Bayesian inference and MCMC methods were adopted for parameter estimation. After estimating five parameters, validation and cross-validation were deployed using two sets of data to examine the model. Dynamic and steady-state uncertainty of the model were obtained as probability density functions using uncertainty assessment JCGM (2008). Autocorrelation was used to evaluate the samples, and sensitivity analysis was employed to capture the region of convergence of the likelihood function. The work would have been stronger had it provided plots of multiple chains to demonstrate the reliability of the parameter estimation.
Parameters in a real-life system lie within certain ranges Narasimhan and Jordache (1999). Classic methods such as maximum likelihood and least squares aim to find the best or "optimal" estimate of the parameters that describe the data, but they make it difficult to assess and quantify parameter uncertainty. An approximating distribution of the parameters can provide more insight into a system, such as a quantification of the uncertainty of the estimated parameters.
In this work, we estimated the parameters of a gas lifting oil well dynamic model, considering that the main source of error is Gaussian white noise on the measurements. Process noise is not included in this work. The purpose of this study is to combine information from a first-principles model and data to provide estimated parameters and the related uncertainty. Bayes' Rule Lambert (2018) was used for inference, and MCMC Kruschke (2014) was employed to approximate the posterior parameter distributions. This work can serve as a foundation for the case where oil fields contain multiple wells, and it provides an important step towards understanding the uncertainty in a complex system. The paper is organised in the following way. The first section gives a brief overview of the recent history of Bayesian parameter estimation, especially its application in the oil industry. It then goes on to present more details of the gas lifting oil well model used in this work. The third section is concerned with the methodology adopted for this study. Section four briefly explains the experimental procedure. The remaining part of the paper analyses the results of the experiments and discusses the possible reasons for the issues observed, as well as the limitations of the current study.

Gas Lifting Oil Well Model
The gas lifting oil well model used in this work is based on a previous work Sharma et al. (2011). Including more details of an oil well model and the state variations helps describe the dynamics of the gas lifting oil well; however, unnecessary details increase computational costs. Here we set out to improve the model considering the trade-off between simplicity and accuracy. Our experiments focus on the short-term dynamics of a gas lifting oil well. We assume the reservoir pressure and the compressibility factor are constant in our model, owing to the small variation in the data range and for simplicity of the model. Compared with gravity, the influence of friction is small, so we ignore the pressure drop due to friction in this model. Some states are calculated and some of them are measured.
Figure 1: The components of a gas lifting oil well: the red arrows show the gas flow and the black arrows present the liquid phase flow, which can contain oil and water. The mass and flow rate are depicted beside the corresponding components where these states occur in the gas lifted oil well.

A schematic diagram of a gas lifting oil well is shown in Fig. 1. In this process, valves are utilized to control the flow rates. The gas lift choke valve, located between the gas distribution manifold and the annulus, is used to control the amount of lift gas injected into the well. The production choke valve, which can be used to control the production, is the only valve through which multi-phase fluid flows, and it connects the gathering manifold and the tubing. At the injection point, a gas injection valve is placed between the annulus and the tubing to connect the two parts. Since the gas lifting oil well in this study operates continuously, the gas injection valve is fully opened and the lift gas is continuously injected from the annulus into the tubing. Downstream of the production choke valve, a multiphase flow meter is installed to measure the flow rates of oil, gas and water. After the separator splits the components, part of the gas is reused as lift gas. Here we focus on the mass balance of the oil well and the gas distribution manifold. As explained above, the reduction of bottom hole pressure due to gas injection typically increases the oil production rate because the injected gas lightens the fluid in the tubing. However, injecting excessive gas increases the bottom hole pressure, which leads to a decrease in the oil production rate. This happens because an excessive gas injection rate results in slippage, where the gas phase moves faster than the liquid, leaving the liquid phase behind; in this case, less liquid flows through the tubing. As the gas injection rate increases, the oil production rate therefore rises to a maximum and decreases afterwards. Model-based control contributes to the optimization of the gas injection rate and thus helps yield the maximum oil production rate, but the control algorithm relies on an accurate model description. A first-principles model alone is not enough to precisely describe a gas lifting oil well system.
Data collected from the system is particularly useful in estimating parameters in the model.
The input of the model is the valve opening of the gas lift choke valve, u. The production choke valve is assumed fully opened during the whole experiment. One of the parameters to be estimated, the water cut, WC, is the volume of water produced relative to the volume of total liquids produced from an oil well. The other parameters to be estimated are the gas to oil ratio, GOR, and the productivity index, PI [kg/(hr·bar)]. GOR is the mass ratio of produced gas to produced liquid (oil and water). PI is a mathematical means of expressing the ability of a reservoir to deliver fluids to the wellbore. GOR and WC are dimensionless.
There are seven measurements in this model, outputs 1 to 7: w_ga, w_gp, w_op, w_wp, P_wf, P_wh and P_a. Flow meters measure the gas flow rates at different places (w_ga and w_gp) as well as the liquid flow rates (w_op and w_wp). Flow measurements are impacted by bubbles and are not as reliable as data such as temperature and pressure. Pressure transmitters are used to measure the bottom hole or well flow pressure (P_wf), the pressure in the tubing upstream of the production choke valve (P_wh), and the pressure in the annulus downstream of the lift gas choke valve (P_a). We assume the pressure of the gas in the gas distribution pipeline (P_c) is constant at 200 [bar]. In the production process, temperature sensors measure the temperature in the gas distribution pipeline, the annulus and the tubing.
Because the differences between these temperatures are small, we assume the temperature is constant everywhere and equal to T [K]. The corresponding notation table is shown in Appendix 1.
Considering the principle of the gas lifted oil well, the model is based on the mass balance of three states: the mass of gas in the annulus m_ga, the mass of gas in the tubing above the injection point m_gt, and the mass of liquid in the tubing above the injection point m_lt, as presented in Eq. (1), where w_ga is the flow rate of the gas through the gas lift choke valve injected into the annulus. The flow rates of the lift gas from the annulus and from the reservoir into the tubing are w_ginj and w_gr, respectively. The flow rate of produced gas through the production choke valve is denoted w_gp. w_lr and w_lp are the liquid phase flows from the reservoir into the well and through the production choke valve, respectively. These flows are marked at the corresponding positions in the flow chart, Fig. 2. Auxiliary equations are given in Appendix 2.
Fig. 3 describes the process by which the parameters are estimated using Bayes' rule and the MCMC algorithm. To begin this process, Bayesian inference is carried out as the first step. Once the proposed parameters are generated according to the prior distribution, the posterior can be calculated based on the inference. After comparing the current posterior with the previous one, MCMC decides to accept or reject the parameters. In the long run, the distribution of the accepted samples converges to the true posterior distribution. The final stage of the study is checking the posterior predictive distribution against the real output.
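The three-state mass balance of Eq. (1) can be sketched as follows. This is an illustrative Python sketch (the paper's experiments use MATLAB); the auxiliary valve and flow equations of Appendix 2 are not reproduced, so the flow rates are supplied by a user-provided function `flows`, a hypothetical name.

```python
def mass_balance(t, m, flows):
    """Right-hand side of the mass balance (Eq. 1).
    m = [m_ga, m_gt, m_lt] in [kg]; `flows(t, m)` returns the flow
    rates [kg/hr] as a dict (computed from the auxiliary equations)."""
    w = flows(t, m)
    dm_ga = w['w_ga'] - w['w_ginj']                 # gas in the annulus
    dm_gt = w['w_ginj'] + w['w_gr'] - w['w_gp']     # gas in the tubing
    dm_lt = w['w_lr'] - w['w_lp']                   # liquid in the tubing
    return [dm_ga, dm_gt, dm_lt]

def euler_step(m, t, dt, flows):
    """One explicit Euler step; a real simulator would use a stiff ODE solver."""
    dm = mass_balance(t, m, flows)
    return [mi + dt * dmi for mi, dmi in zip(m, dm)]
```

At steady state all inflows balance the outflows, so the derivatives vanish.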

Bayes' Rule
In this gas lifting oil well model, there is one input and seven measurements. The following notation is used in Bayes' Rule. θ: the model parameters, presented as a 1×3 array: [PI, GOR, WC].
u_i: there are n inputs in the system, indexed by i. The input vector can be presented as a 1×n array. y_m(i,j): measurements of the outputs. In this system, there are 7 outputs, w_ga, w_gp, w_op, w_wp, P_wf, P_wh, P_a, denoted by j = 1, 2, ..., 7. Each output contains n samples, so the measurements can be presented as a 7×n matrix.
y_e(i,j): the estimated outputs, namely the outputs of the model given the input and the parameters. Similarly to the measured outputs, each estimated output contains n samples, so the estimated output is a 7×n matrix. The model used for the estimation process does not contain noise.

Prior
The GOR and WC parameters represent ratios, so they are between 0 and 1. The order of magnitude of the parameter PI is also known. Therefore, the following prior distributions for these parameters were chosen:
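As an illustration of the prior, here is a minimal Python sketch (the paper's experiments use MATLAB). It assumes uniform priors: GOR and WC on [0, 1] as stated, and PI on [10^4, 10^5], the order-of-magnitude range quoted later in the Experiments section; the exact distributions of Eq. (2) may differ.

```python
import random

# Assumed uniform prior bounds: GOR and WC are ratios in [0, 1];
# PI is known only to the order of magnitude, here taken as [1e4, 1e5].
PRIOR_BOUNDS = {'PI': (1e4, 1e5), 'GOR': (0.0, 1.0), 'WC': (0.0, 1.0)}

def sample_prior(rng=random):
    """Draw one parameter vector theta from the (assumed) uniform prior."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in PRIOR_BOUNDS.items()}

def log_prior(theta):
    """Log of the unnormalized uniform prior: 0 inside the box, -inf outside."""
    inside = all(lo <= theta[k] <= hi for k, (lo, hi) in PRIOR_BOUNDS.items())
    return 0.0 if inside else float('-inf')
```

A flat log-prior of this form simply restricts the MCMC chain to the admissible box.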

Likelihood
We assume the measurements contain white noise which is normally distributed with mean 0, as in Eq. (3):

y_{m(i,j)} = y_{e(i,j)} + \epsilon_{i,j}, \qquad \epsilon_{i,j} \sim \mathcal{N}(0, \sigma_j^2) \qquad (3)

The errors for each sample y_{m(i,·)} are independent, and the standard deviation of the errors for the samples of a given output is assumed to be the same, namely σ_j. In the case where the standard deviation is known (for example from the specifications of the sensors), the likelihood is the product of the probabilities of the samples of each output:

P(y_{mj} \mid \theta, u, \sigma_j) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\!\left(-\frac{(y_{m(i,j)} - y_{e(i,j)})^2}{2\sigma_j^2}\right) \qquad (4)

When the standard deviation σ_j is unknown, the likelihood should be averaged over all its possible values. This process is known as marginalization, and the marginal likelihood is given by:

P(y_{mj} \mid \theta, u) = \int_0^{\infty} P(y_{mj} \mid \theta, u, \sigma_j)\, P(\sigma_j)\, d\sigma_j \qquad (5)

The noise is not impacted by the model, so P(σ_j | y_{ej}(θ, u_i)) = P(σ_j) ∝ 1/σ_j, which is the Jeffreys prior for the standard deviation when σ_j > 0. Carrying out the integration in Eq. (5), the likelihood for the j-th output is calculated, up to a constant factor, as:

P(y_{mj} \mid \theta, u) \propto \left[\sum_{i=1}^{n} \left(y_{m(i,j)} - y_{e(i,j)}\right)^2\right]^{-n/2} \qquad (7)

Posterior
According to Bayes' Rule, the posterior is calculated from the prior and the likelihood:

P(\theta \mid y_m, u_i, y_e) \propto P(y_m \mid \theta, u_i, y_e)\, P(\theta \mid u_i, y_e) \qquad (9)

The logarithm of the likelihood is used here to avoid underflow during the computation. The likelihood itself is difficult to graph because it is often a tiny number; with the logarithm, the large products become sums and plotting is easier.
When the standard deviation is known, the log of the posterior is

\ln P(\theta \mid y_m, u_i, y_e) = \text{const} + \ln P(\theta) - \sum_{j=1}^{7} \sum_{i=1}^{n} \frac{\left(y_{m(i,j)} - y_{e(i,j)}\right)^2}{2\sigma_j^2} \qquad (10)

In the cases where the standard deviation is unknown, since the likelihood is Gaussian and the prior on the standard deviation is the Jeffreys prior, the marginal posterior follows a Student-t distribution Sivia and Skilling (2006). Considering the various inputs and multiple outputs, the distribution of the posterior is

P(\theta \mid y_m, u_i, y_e) \propto P(\theta) \prod_{j=1}^{7} \left[\sum_{i=1}^{n} \left(y_{m(i,j)} - y_{e(i,j)}\right)^2\right]^{-n/2} \qquad (11)

When the standard deviation is instead treated as known, it can be estimated by S_j, the standard deviation of the corresponding measured output; with measurements taken during steady state it can be calculated as Eq. (12):

S_j = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(y_{m(i,j)} - \bar{y}_{m(j)}\right)^2} \qquad (12)

The log of the posterior can then be calculated as:

\ln P(\theta \mid y_m, u_i, y_e) = \text{const} + \ln P(\theta) - \frac{n}{2} \sum_{j=1}^{7} \ln \left[\sum_{i=1}^{n} \left(y_{m(i,j)} - y_{e(i,j)}\right)^2\right] \qquad (13)

Markov Chain Monte Carlo
The distribution of the posterior provides vital information for identifying the optimal parameters. In this study, the posterior is defined over a three-dimensional space, corresponding to the joint distribution of PI, GOR and WC. Considering the complexity of the posterior, the distribution cannot be adequately represented by a simple distribution such as the binomial, normal or Cauchy distribution.
Algorithm 1: Metropolis-Hastings algorithm

  Initialise θ^0, N_i, q
  Calculate ln P(y_m | θ^0, u_i, y_ej)
  for iteration t = 1, 2, ..., N_i do
      Generate samples θ*_n ~ Beta(q·θ^t_n, q·(1 − θ^t_n))
      Generate a random constant a ~ U(0, 1)
      Calculate ln P(y_m | θ*_n, u_i, y_ej)
      if ln a < ln [P(y_m | θ*_n, u_i, y_ej) / P(y_m | θ^t_n, u_i, y_ej)]
               + Σ_{j=1}^{3} ln [P(θ^t_{n,j} | θ*_{n,j}) / P(θ*_{n,j} | θ^t_{n,j})] then
          if θ*_n ≠ 0 then
              θ^t_n = θ*_n
              update ln P(y_m | θ^t_n, u_i, y_ej)
          end
      end
      Save θ^t_n
  end

The random walk Metropolis-Hastings algorithm Metropolis et al. (1953); Hastings (1970) is a well-known MCMC method for indirect sampling from a distribution. The MCMC algorithm used in this work is shown in Algorithm 1. Since the prior restricts the parameters to certain ranges, the initial samples were drawn randomly from these ranges. During the iterations, we utilized proposal beta distributions to generate a sequence of random samples. Since the three parameters are independent, three beta distributions, Beta(q·θ^t_n, q·(1 − θ^t_n)), were used and tuned individually, where q adjusts the shape of the distribution. The benefit of this choice is that all samples satisfy the prior constraints after rescaling from the beta distribution range θ^t_n ∈ [0, 1] to the prior range of each parameter. Compared with the normal and gamma distributions, the beta distribution generates more workable samples.
During the random walk, the beta distribution is adopted as the proposal distribution, recentered at θ^{t−1}_n at each step. Some of the beta distributions used here are skewed, and the mean and the mode of a beta distribution do not coincide. Therefore, the random walk does not satisfy the symmetry requirement. Owing to the asymmetry of the proposal distribution, the correction term Σ_{j=1}^{3} ln [P(θ^t_{n,j} | θ*_{n,j}) / P(θ*_{n,j} | θ^t_{n,j})] needs to be added Holder et al. (2005).
In the algorithm, a proposed parameter is always accepted when its posterior is larger than that of the current parameter. To avoid being stuck in a local optimum, we also draw a random number between 0 and 1 and compare it with the ratio of the proposed posterior to the current posterior, so that worse proposals are occasionally accepted.
In the case where the standard deviation is known, the calculation can be simplified: only the part of the log posterior that depends on θ needs to be recomputed at each step. After N_i iterations, we obtain a list of accepted samples which forms a chain. We can check how the chain explores the parameter space by plotting these samples. The distribution of these samples can be used to approximate the marginal posterior distribution of the parameters.
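A minimal Python sketch of Algorithm 1 follows (the paper's experiments use MATLAB). It implements a random-walk Metropolis-Hastings sampler with per-coordinate Beta(q·x, q·(1 − x)) proposals and the asymmetry correction term; the log-posterior is passed in as a function, and the parameters are assumed to have already been rescaled to (0, 1), as is done for PI.

```python
import math
import random

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, used for the Hastings correction."""
    if not (0.0 < x < 1.0):
        return float('-inf')
    return ((a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def metropolis_hastings(log_post, theta0, n_iter, q, rng=random):
    """Random-walk MH with Beta(q*x, q*(1-x)) proposals per coordinate.
    Returns the chain and the reject rate."""
    theta = list(theta0)
    lp = log_post(theta)
    chain, n_reject = [], 0
    for _ in range(n_iter):
        prop, log_corr = [], 0.0
        for x in theta:
            a, b = q * x, q * (1 - x)       # shape params must stay positive
            y = rng.betavariate(a, b)
            # Hastings correction: ln q(x|y) - ln q(y|x) (proposal is asymmetric)
            log_corr += beta_logpdf(x, q * y, q * (1 - y)) - beta_logpdf(y, a, b)
            prop.append(y)
        lp_prop = log_post(prop)
        if math.log(rng.random()) < lp_prop - lp + log_corr:
            theta, lp = prop, lp_prop       # accept the proposal
        else:
            n_reject += 1                   # reject: keep the current sample
        chain.append(list(theta))
    return chain, n_reject / n_iter
```

With a toy Gaussian log-posterior centred at 0.5, the chain quickly concentrates around the mode, mirroring the behaviour of the chains in Fig. 5.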

Posterior Predictive Distribution
Once the parameter estimation is completed, we can use the oil well model with the estimated parameters to estimate other outputs of the system. After running the MCMC algorithm, we obtain a group of samples from the accepted list, which approximately follows the distribution of the parameters, θ ~ P(θ | y_m, y_e, u_i). After cutting out the burn-in period of the chain, outputs of the model y_p are simulated with the estimated parameters and inputs, namely we obtain P(y_p | θ, u_i, y_e, y_m). Though the number of MCMC iterations is limited, the distribution of these outputs at each sampling time can generally indicate where the output lies. We can then calculate the posterior predictive distribution as:

P(y_p \mid u_i, y_e, y_m) = \int P(y_p, \theta \mid u_i, y_e, y_m)\, d\theta = \int P(y_p \mid \theta, u_i, y_e, y_m)\, P(\theta \mid y_m, y_e, u_i)\, d\theta \qquad (14)

We assumed the noise in the measurements follows a normal distribution; therefore, the posterior predictive distribution follows a Student-t distribution with (n − 1) degrees of freedom.
By calculating a large number of samples y_pj, we obtain the posterior predictive distribution. The accuracy and efficiency of the parameter estimation process can be evaluated by comparing the posterior predictive distribution with the measurements.
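In practice the integral in Eq. (14) is approximated empirically: each post-burn-in sample θ is pushed through the model and quantiles are taken at every sampling time, producing the dashed bands of Fig. 7. A Python sketch (the paper uses MATLAB; `simulate` is a hypothetical stand-in for the oil well model):

```python
def predictive_quantiles(samples, simulate, burn_in, qs=(0.05, 0.5, 0.95)):
    """Empirical posterior-predictive quantiles at each sampling time.
    `samples` is the MCMC chain; `simulate(theta)` returns one output
    trajectory (a list of length n) for the given parameter vector."""
    trajs = [simulate(th) for th in samples[burn_in:]]
    n = len(trajs[0])
    out = {}
    for q in qs:
        out[q] = []
        for t in range(n):
            vals = sorted(tr[t] for tr in trajs)
            k = min(int(q * len(vals)), len(vals) - 1)  # empirical quantile index
            out[q].append(vals[k])
    return out
```

Plotting the median and outer quantiles against the measured trajectory gives the visual check used in the Results section.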

Experiments
In this project, we used MATLAB as our experimental platform. The work began by generating data from a simulator with preset parameters, after which Gaussian noise was added to the outputs. The preset parameters are WC = 0.18, PI = 2.4 × 10^4, GOR = 0.15. The preset parameters were only revealed after the experiments, for evaluating the estimation, and the noise characteristics were not accessible during the experiments. We then estimated the parameters and the uncertainty of the model. The first step in the MCMC algorithm was to generate samples. Here we use the beta distribution to propose samples from the parameter space, as described in Section 3.2. The values of GOR and WC can be used for the shape parameters directly, since they are between 0 and 1. PI is between 10^4 and 10^5, so it needs to be rescaled to [0, 1] before being used as a shape parameter. To find a proper q for the beta distribution, a series of experiments was run with q = 1, 10, 100, 1000 and 10000; q = 300 was selected after comparing the results, so that proposed samples can reach relatively far areas while still converging to an optimised value.
Because the proposed parameters depend on the current samples and the shape parameters of the beta distribution must be positive, a zero value would lead to an invalid beta distribution in the next iteration. Therefore, a conditional statement was added which keeps the shape parameter at its current value in that case.
With proposed parameters and inputs, the estimated outputs are simulated by the model. Then, the corresponding posterior can be calculated. Once the calculation is completed, the MCMC algorithm moves to the decision stage by comparing the posterior corresponding to proposed parameters and current parameters.
For each MCMC iteration, the differential equations of the oil well model need to be solved. Therefore, it takes a long time to run several chains with a large number of iterations; for example, a chain with 1000 iterations on outputs containing 3600 samples takes more than 20 minutes on an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz. To increase the computational efficiency, the number of measurement data points and the number of chain iterations need to be chosen carefully.
To balance the quality and quantity of the samples, we choose more samples when the inputs change and fewer samples during the steady state, to gain more information from the data. In this work, we ran 10 chains of 2000 iterations each to identify the parameters using data with 20 samples. Since the number of data samples is not large, the normal and t distributions give similar products; therefore, the results for the cases with known and unknown variance are approximately the same. The MCMC algorithm can estimate the parameters in both cases, and the chains converge at similar speeds. We will present one set of results in the next section. All code and resources for this study have been uploaded to GitHub.

Result
In this section, we will present and analyse the results of experiments in various ways. Reject rate and autocorrelation offer an effective way of checking the efficiency of the chains. To gain insights into every parameter in the estimation process, we drew figures of individual parameter distribution and Markov Chains. The interaction between every two parameters is shown in scatter plots and contour plots. Posterior predictive distribution plots provide the results of the accuracy of the estimation. The impacts of parameters are compared by sensitivity analysis.

Autocorrelation
Autocorrelation offers further in-depth information on the efficiency of the chains, which helps us select the step length of MCMC. Furthermore, the parameter estimates are impacted by the correlation between the samples. We would like independent sampling in our experiments, or at least sampling with low autocorrelation. However, according to the Metropolis-Hastings algorithm, every proposed value is based on the previous value, so it is impossible to make every step independent of the others, although a proper step length contributes to a larger effective sample size. The autocorrelation plot displays the dependence structure of the chain. Fig. 4 illustrates the chain autocorrelation after cutting off the first 1000 samples, where the chain has not yet converged. The correlogram shows that the correlations decrease as the lag increases.
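The correlogram of Fig. 4 is based on the sample autocorrelation function, which can be computed directly from one parameter's chain. A small Python sketch (illustrative; the paper's analysis was done in MATLAB):

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelation of a scalar chain `x` for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    acf = []
    for lag in range(max_lag + 1):
        # Biased estimator: divide by n, so acf[0] == 1 exactly
        cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
        acf.append(cov / var)
    return acf
```

A slowly decaying correlogram indicates a small effective sample size, which is why the step length (here, the proposal parameter q) matters.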

Parameter Distribution and Chains
In order to test the influence of the random initial value of the chain and check the explored range of the estimated parameter distribution, we present the plots of the chains and the distribution for each parameter, shown in Fig.5.
The first row provides information on the convergence speed of the chains. The burn-in period of these chains is less than 1000 iterations for both GOR and PI, while the chain of WC takes longer to converge to a certain range. The ends of the chains overlap each other and meander fairly smoothly around the optimal estimated value.
The second column illustrates histograms of the parameters from the accepted lists of all chains. According to these plots, all chains converge to around the preset values after exploring the whole prior range. All the chains for the same parameter converge to a similar region and mix well, which indicates that convergence is achieved.

Reject Rate
When running the MCMC experiments, we recorded the number of reject decisions during the iterations; dividing by the number of iterations gives the reject rate, which shows how many proposed samples were rejected as the chain moved. The reject rate in our experiments is around 94.8%.
Figure 6: Estimation of parameters with 2D scatter plots and contour plots. The contour plots were drawn over the range between the minimum and maximum values. The converged parts are zoomed in to present the contours clearly.

2D Parameters Distributions
Three scatter plots are shown in Fig.6 to present the distribution of each two parameters. These plots confirm the conclusion in Fig.5. The contour plots in the second column demonstrate more details in the converging area.

Posterior Predictive Distributions
According to Section 3.2, we calculated y_p. There are 1000 samples of y_p for each sampling time. Quantiles of these samples are presented together with the measurements and the output of the model with the preset parameters in Fig. 7. Most posterior predictive distributions are around the real outputs, especially in the first dynamic stage. For outputs 1 and 7, the posterior predictive distributions captured the first dynamic change but departed from the real output after the second input change. Compared with the other posterior predictive distributions, the one for output 4 is wider and the measurement stays within the quantile range. For output 3, the posterior predictive distributions are slightly above the measurement.
Figure 7: Posterior predictive distributions. In each plot, the dashed lines are quantiles of the posterior predictive distribution and the purple dashed line is the median. The solid orange lines are the data, which contain noise. For comparison with the real outputs, we also drew the outputs of the system with the preset parameters; the real outputs do not contain noise and are presented as solid blue lines.

Sensitivity Analysis
To identify the impact of each parameter on the outputs, we increased and decreased the parameters within their ranges and checked the outputs. As shown in Fig. 8, WC has little impact on outputs 1, 2, 5, 6 and 7. If PI changes substantially, it affects all outputs apart from output 4, compared with the two other parameters. GOR also contributes to changes in the outputs, especially outputs 2 and 6, compared with the other parameters.
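The one-at-a-time perturbation procedure behind Fig. 8 can be sketched as follows (illustrative Python; `simulate` is a hypothetical stand-in for the oil well model, and the ±10% step is an assumed choice, not the paper's exact setting):

```python
def oat_sensitivity(simulate, theta, rel_step=0.1):
    """One-at-a-time sensitivity: mean absolute output change for a
    +/- rel_step relative perturbation of each parameter.
    `simulate(theta)` returns one output trajectory as a list."""
    base = simulate(theta)
    sens = {}
    for name, value in theta.items():
        for sign in (1, -1):
            pert = dict(theta)
            pert[name] = value * (1 + sign * rel_step)
            y = simulate(pert)
            # Average absolute deviation from the unperturbed trajectory
            delta = sum(abs(a - b) for a, b in zip(y, base)) / len(base)
            sens[(name, sign)] = delta
    return sens
```

Parameters whose perturbations barely move the outputs (as WC for outputs 1, 2, 5, 6 and 7) yield near-zero entries, while influential parameters such as PI dominate the table.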

Discussion
• A proper step length plays a vital role in the Metropolis-Hastings algorithm Kruschke (2014). We used a beta distribution in MCMC to propose new parameters, and the shape parameters of the beta distribution determine the step length. For example, the first shape parameter is related to the previously accepted value, although the mode of the beta distribution is not exactly equal to the first shape parameter. When the shape parameter q is large, the beta distribution is sharp and concentrated near the previously accepted value. However, a large shape parameter leads to small walking steps, so the chain needs more time to explore a wide area. Tuning the first shape parameter is therefore a trade-off between the relevance of the previous step and the walking speed.
• According to the autocorrelation results, the effective sample size can still be improved by tuning the step length of MCMC. Furthermore, the model here is a gas lifting oil well model with only boundaries on the parameters. For a specific system, empirical engineering knowledge could make a significant contribution to the prior, which would further improve the efficiency of the chains.
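One common way to tie a beta proposal to the previously accepted value is to set the shapes as a = q·θ_prev and b = q·(1 − θ_prev), so the proposal mean stays at θ_prev while q controls the step length. This parameterisation is an illustrative assumption, not necessarily the exact one used here, but it exhibits the trade-off discussed above:

```python
def beta_shapes(theta_prev, q):
    """Shape parameters of a beta proposal whose mean is theta_prev.

    Assumed parameterisation: a = q * theta_prev, b = q * (1 - theta_prev).
    """
    return q * theta_prev, q * (1.0 - theta_prev)

def proposal_stats(theta_prev, q):
    """Mean and variance of Beta(a, b) for the shapes above."""
    a, b = beta_shapes(theta_prev, q)
    mean = a / (a + b)                           # equals theta_prev
    var = a * b / ((a + b) ** 2 * (a + b + 1))   # theta*(1-theta)/(q+1)
    return mean, var

theta = 0.4
for q in (10, 100, 1000):
    mean, var = proposal_stats(theta, q)
    print(q, mean, var)  # variance shrinks as q grows: smaller walking steps
```

A larger q keeps proposals tightly around the previously accepted value (high relevance, slow exploration); a smaller q widens the steps at the cost of more rejections.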

Conclusion
This study set out to estimate three parameters of a gas lifting oil well model together with their uncertainty. With Bayesian inference and the Metropolis-Hastings algorithm, the proposed solution was able to identify the true values of the parameters and present their distributions. Based on the estimation, we demonstrated posterior predictive distributions and a sensitivity analysis, which go some way towards enhancing our understanding of the true values of the system outputs and the impact of the parameters. The model with estimated parameters can benefit optimization and advanced control in oil production. This work can be considered a first step towards improving a real-life gas lifting oil well model; a further study could apply the solution to an oil well system with real data.
Appendix: Model Equations

• Eq. (1): valve characteristic of the lift gas choke valve as a function of its opening. The valve opening is the input of the model; both the opening and the valve characteristic are dimensionless.
• Density of gas in the gas distribution pipeline.
• Density of gas in the annulus.
• Density of the mixture of liquid and gas in the tubing above the injection point, computed from the masses of gas and liquid above the injection point. We assume the mixture in the tubing is homogeneous, so its density is constant everywhere.
• Eq. (6): pressure of gas in the annulus downstream the lift gas choke valve, where the compressibility factor is taken as the constant 0.6816 at 170 bar.
• Pressure of gas in the annulus upstream the gas injection valve.
• Gas expansion factor in the gas lift choke valve.
• Density of the liquid based on the water cut.
• Volume of gas present in the tubing above the gas injection point.
• Eq. (10): pressure in the tubing downstream the gas injection valve, where the compressibility factor is taken as the constant 0.6741 at 150 bar.
• Pressure in the tubing upstream the production choke valve.
• Bottom hole pressure (well flow pressure).
• Gas expansion factor in the gas injection valve.
• Gas expansion factor in the production choke valve.
• Eq. (15): mass flow rate of the gas through the gas lift choke valve, where a valve constant relates the flow through the valve to the pressure gradient.
• Mass flow rate of the gas injected into the tubing from the annulus.
• Mass flow rate of the liquid from the reservoir.
• Mass flow rate of gas from the reservoir.
• Mass flow rate of the mixture of gas and liquid through the production choke valve, given by a valve equation proportional to the square root of the non-negative pressure difference across the valve.
• Mass flow rate of gas through the production choke valve.
• Mass flow rate of liquid through the production choke valve.
• Oil compartment mass flow rate from the produced liquid, accounting for the water cut.
• Water compartment mass flow rate from the produced liquid, accounting for the water cut.