are linearly dependent for a certain combination of the parameter values. (Hint: write the sensitivities in terms of z and R.) What can you conclude from this linear dependence? What new parameters could be selected to eliminate the linear dependence mentioned above?

7.19 Repeat Problem 6.26 for the model

7.20 Using the data in Table 7.14 between 3.6 and 18.0 sec for temperature histories from thermocouple 1 (which is at $x = L$) and thermocouple 5 (which is at $x = 0$), estimate $k$ and $\alpha$ in the model given by (8.5.25) with $T_0 = 81.66$°F, $L = 1/12$ ft, and $q = 2.67$ Btu/ft²-sec. Let $t$ in (8.5.25) be the times in Table 7.14 minus 3.3. In other words, 3.6 sec in Table 7.14 corresponds to time 0.3 sec in (8.5.25). Use as initial estimates $k = 40$ Btu/hr-ft-°F and $\alpha = 1$ ft²/hr. (Be careful with units.) Use OLS with the Gauss or other method.

7.21 Repeat Problem 7.20 but use the average of temperatures 1, 2, 3, and 4 instead of 1 and the average of 5, 6, 7, and 8 instead of 5.

7.22 Derive (7.5.16) and (7.5.17).

7.23 Derive (7.8.18).

7.24 Verify the sensitivity coefficient values given in Fig. 7.8 using the approximate equation, (7.10.1). A programmable calculator or computer should be used. Investigate using values of $\delta b_j = \epsilon b_j$ where $\epsilon$ is equal to (a) 0.01, (b) 0.001, and (c) 0.0001.

CHAPTER 8 DESIGN OF OPTIMAL EXPERIMENTS

8.1 INTRODUCTION

Carefully designed experiments can result in greatly increased accuracy of the estimates. This has been demonstrated by various authors, but special mention should be made of the work of G. E. P. Box and collaborators. See, for example, Box and Lucas [1] and Box and Hunter [2]. An important work on optimal experiments is the book by Fedorov [3]. In many areas of research, great flexibility is permitted in the proposed experiments.
This is particularly true with the present ready accessibility of large-scale digital computers for analysis of the data and automatic digital data acquisition equipment for obtaining the data. This means that transient, complex experiments can be performed that involve numerous measurements for many sensors. With this great flexibility comes the opportunity of designing experiments to obtain the greatest precision of the required parameters.

A common measure of the precision of an estimator is its variance; the smaller the variance, the greater the precision. Information regarding the variances and covariances is included in determination of confidence regions. We shall utilize the minimum confidence region to provide a basis for the design of experiments having minimum variance estimators.

The design of optimal experiments is complicated by the necessity of adding practical considerations and constraints. The best design for a particular case, for example, might have certain unique restrictions on the dependent variable vector $\boldsymbol{\eta}$ or on the independent variables such as time or position.

The optimal design problem involves two parts: (1) the determination of an objective function together with its constraints and (2) the extremization of the objective function. When we say that we desire to find the optimal experiment, we wish to determine the conditions under which each observation should be taken in order to extremize a certain optimal criterion. For example, the best duration of the experiment may be needed or the optimal placement of sensors may be required. In cases involving partial differential equations the optimal boundary and initial conditions may also be needed. Many of these cases are illustrated in subsequent sections.

In most of this chapter it is assumed that the form of the model is known although it contains unknown parameters.
If a form in terms of a finite number of parameters is not known, the search for an optimal strategy may be quite different. This involves discrimination, which is discussed in Section 8.9.

8.2 ONE PARAMETER EXAMPLES

In order to illustrate optimal design, some one-parameter linear and nonlinear examples are given in this section. The standard conditions designated 11111-11 (see Section 6.1.5.2) are considered to be valid.

8.2.1 Linear Examples for One Parameter

Model $\eta_i = \beta X_i$ with No Constraints

Consider first the case of the linear model $\eta_i = \beta X_i$. Owing to the standard assumptions, ordinary least squares (OLS) and maximum likelihood (ML) yield the same estimator and variance of

$$b = \frac{\sum_{j=1}^{n} Y_j X_j}{\Delta}, \qquad V(b) = \frac{\sigma^2}{\Delta}$$

where

$$\Delta \equiv \sum_{i=1}^{n} X_i^2 \qquad (8.2.1)$$

Observe that minimization of the variance of $b$ implies the maximization of $\Delta$. Note that $\Delta$ is maximized by (a) making the maximum value of $|X|$ as large as possible, (b) concentrating all the $n$ measurements at the maximum permissible value of $|X|$, and (c) making $n$ as large as possible.

This case illustrates the necessity of certain practical constraints. The maximum value of $|X|$ must be finite if $|\eta|$ is to be finite. Next it must be decided what constraints, if any, are to be placed on the measurements in a fixed $X$ range. If there are none, the optimal solution is to concentrate all the measurements at the maximum $|X|$. In other cases where $X$ is a function of time, measurements at equal time intervals might be dictated by the capabilities of the measuring equipment. The latter case is emphasized in this text because of its common occurrence. Furthermore, equal spacing of measurements usually provides more information for checking the validity of the model and the statistical assumptions than does concentrating the measurements at the maximum $|X|$.

Model $\eta = \beta X(t)$ for Fixed Large n and Equally Spaced Measurements

The model $\eta = \beta X$ can represent cases where $X$ is any known function of time. (The word "time" is used but the results could also apply for other variables such as position, temperature, etc.) For a large number of observations, $\Delta$ can be approximated by

$$\Delta \approx \frac{n}{t_n} \int_0^{t_n} X^2(t)\, dt \qquad (8.2.2)$$

where $t_n = n\,\Delta t$ and the measurements are assumed to be uniformly spaced in $t$ over $0 \le t \le t_n$.

Example 8.2.1

Compare the value of $\Delta$ associated with $n$ measurements of $\eta = \beta C t_n^m$ and for $\eta_i = \beta C (i t_n / n)^m$ with $i = 1, 2, \ldots, n$, $t_i > t_j$ for $i > j$, and where $m$ is a nonnegative exponent. Let $n$ be large. $C$ is an arbitrary constant which plays no role in this problem but is included for later use for scaling $\eta$. Notice that the first case has all the measurements concentrated at the location of the maximum $\eta$ of the second case.

Solution

For the first case the sensitivity $X$ is $C t_n^m$ and then $\Delta$ obtained from (8.2.1) is $n C^2 t_n^{2m}$. For the second case, (8.2.2) yields

$$\Delta = C^2 \sum_{i=1}^{n} \left( \frac{i t_n}{n} \right)^{2m} \approx n t_n^{-1} \int_0^{t_n} C^2 t^{2m}\, dt = \frac{n C^2 t_n^{2m}}{2m + 1}$$

In both cases $\Delta$ is proportional to $n C^2 t_n^{2m}$ and thus is made larger by increasing $n$, $C^2$, or $t_n$. The ratio of the first $\Delta$ to the second is $2m + 1$. Hence for all models with $m > 0$, $\beta$ can be estimated more accurately by concentrating all the measurements at the maximum $\eta$; this becomes more apparent as $m$ is increased in value. For dynamic experiments, however, measurements uniformly spaced in time are usually more appropriate than concentrated ones.
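The ratio $2m + 1$ found in Example 8.2.1 can be checked by evaluating the sum in (8.2.1) directly for both measurement schemes. The following Python sketch does this; the function names and the values $n = 10\,000$, $t_n = 3$, and $C = 1$ are illustrative assumptions and do not come from the text.

```python
def delta_concentrated(n, t_n, m, C=1.0):
    """Delta of (8.2.1) when all n measurements are taken at t = t_n."""
    X = C * t_n ** m
    return n * X ** 2


def delta_uniform(n, t_n, m, C=1.0):
    """Delta of (8.2.1) for measurements at t_i = i * t_n / n, i = 1..n."""
    return sum((C * (i * t_n / n) ** m) ** 2 for i in range(1, n + 1))


def ratio(n, t_n, m):
    """Ratio of the concentrated Delta to the uniformly spaced Delta."""
    return delta_concentrated(n, t_n, m) / delta_uniform(n, t_n, m)


# Assumed illustrative values: large n, arbitrary duration t_n = 3.
for m in (0, 0.5, 1, 2):
    print(f"m = {m}: ratio = {ratio(10_000, 3.0, m):.4f}, 2m + 1 = {2 * m + 1}")
```

For large $n$ the computed ratio approaches $2m + 1$ and does not depend on $t_n$ or $C$, in agreement with the solution of the example.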
In this section the constraint of a fixed but large number of observations, $n$, is investigated. In any practical experiment, $n$ must be finite. No constraint is placed on the magnitude of $X(t)$ or, equivalently, on $\eta$. For the case of $n$ equally spaced measurements starting at $t = 0$ and ending at $t_n$, the criterion for optimum measurements for the linear model $\eta = \beta X(t)$ is to maximize

$$\Delta^n \equiv \frac{\Delta}{n} = \frac{1}{t_n} \int_0^{t_n} X^2(t)\, dt \qquad (8.2.3)$$

with respect to $t_n$. It is assumed that $n$ is large and the standard assumptions 11111-11 are valid. Notice that $\Delta^n$ is a function of $t_n$ but not $n$; $t_n$ is now simply the maximum $t$. A necessary condition for $\Delta^n$ to be a maximum with respect to the duration of the experiment $t_n$ is that

$$\left. \frac{\partial \Delta^n}{\partial t_n} \right|_{t_n = \tau} = 0 \qquad (8.2.4)$$

where $\tau$ is the time $t_n$ that maximizes $\Delta^n$; (8.2.4) can also be written as

$$X^2(\tau) = \frac{1}{\tau} \int_0^{\tau} X^2(t)\, dt \qquad (8.2.5)$$

This expression is interesting because it provides insights into conditions that permit an optimum time to exist. In words, (8.2.5) states that at the optimum time, the square of the sensitivity must equal the average value of the squared sensitivity.

Example 8.2.2

The velocity distribution for laminar flow between parallel plates separated by the distance $H$ is

$$u = 4 u_0 \frac{y}{H} \left( 1 - \frac{y}{H} \right)$$

where $u_0$ is the maximum velocity and $y$ is the distance measured from one wall. Suppose that the value $u_0$ is the parameter of interest and that $u$ is to be measured at equal intervals from $y = 0$ to where $\Delta^n$ is maximized. Find the $y$ value to maximize $\Delta^n$. This would provide an optimal experiment for this case provided the standard assumptions are valid relative to errors in $u$ and the $y$ measurements are errorless.

Solution

The optimal distance $y$ can be found by using (8.2.5) with $t$ being $y/H$ and $X(t)$ proportional to $t(1 - t)$ (the constant factor 4 cancels in (8.2.5)); the optimal value of $\tau = y/H$ is then found from

$$\tau^2 (1 - \tau)^2 = \tau^{-1} \int_0^{\tau} t^2 (1 - t)^2\, dt = \frac{\tau^2}{3} - 0.5\,\tau^3 + 0.2\,\tau^4$$

which simplifies to the algebraic equation

$$24 \tau^2 - 45 \tau + 20 = 0$$

This is a simple quadratic equation which can be solved for $\tau = 0.724$ and $1.15$, but only the first is physically possible. Hence the optimal maximum $y$ is $0.724H$.

Now (8.2.5) is a necessary, but not sufficient, condition. It is also true at relative maxima and minima as well as the true maximum. For the maxima there are a number of possibilities with respect to (8.2.5). First, it might not be satisfied at any finite time, thus indicating that there is no maximum at a finite time. Next, it might be satisfied at all times $\tau$, indicating that the maximum is attained at all times $t \ge 0$. Also (8.2.5) could be satisfied at one as well as many values of $\tau$. Each of these cases is illustrated below.

Some general observations can be drawn from (8.2.5). Visualize a function $|X(t)|$ that is zero at $t = 0$, increases monotonically with $t$ until some time $t_{\max}$, and decreases monotonically to zero. Such an $X(t)$ is shown in Fig. 8.1. Here

$$X(t) = t \exp(1 - t) \qquad (8.2.6)$$

Figure 8.1 Sensitivity coefficient, $X^2$, and $\Delta^n$ for $\eta = \beta t \exp(1 - t)$, plotted against $t$ or $t_n$. The only constraint is that of fixed large $n$.

As long as $|X|$ is increasing, the instantaneous $X^2$ must be larger than the average $X^2$ and consequently a maximum in $\Delta^n$ cannot occur when $|X|$ is monotonically increasing. After the maximum $X^2$, the instantaneous value of $X^2$ decreases, but the average continues to rise for a while; for any $|X|$ function reaching a maximum monotonically and then decreasing, the maximum $\Delta^n$ must be at some time greater than the time at which $|X|$ has a maximum.

Consider now the $X(t)$ function given by (8.2.6) which is shown in Fig. 8.1 along with $\Delta^n$ and $X^2$. The maximum of $\Delta^n$ is at $\tau = 1.691817$ (see Problem 8.1). Notice that the $X^2$ curve crosses $\Delta^n$ at its maximum as indicated by (8.2.5). Condition (8.2.5) can be readily used for other cases.
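Condition (8.2.5) can be solved numerically when a closed-form root is not convenient. The following Python sketch is one possible approach and is not the method of the text: it evaluates $\Delta^n$ of (8.2.3) with composite Simpson integration and locates the root of $X^2(\tau) - \Delta^n(\tau)$ by bisection. The function names, the number of Simpson intervals, and the bracketing intervals are assumptions chosen for illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def delta_n(X, t_n):
    """Delta^n of (8.2.3): average of X^2 over 0 <= t <= t_n."""
    return simpson(lambda t: X(t) ** 2, 0.0, t_n) / t_n


def optimal_duration(X, lo, hi, tol=1e-10):
    """Bisection on g(tau) = X(tau)^2 - Delta^n(tau), i.e. condition (8.2.5).

    The bracket [lo, hi] is an assumption: g must change sign across it.
    """
    def g(tau):
        return X(tau) ** 2 - delta_n(X, tau)

    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("g does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# Example 8.2.2: X proportional to t(1 - t), with t = y/H.
tau_plates = optimal_duration(lambda t: t * (1.0 - t), 0.5, 1.0)

# Equation (8.2.6): X = t exp(1 - t).
tau_exp = optimal_duration(lambda t: t * math.exp(1.0 - t), 1.0, 3.0)

print(tau_plates, tau_exp)
```

Because (8.2.5) is only a necessary condition, a root found this way should still be checked against a plot or a table of $\Delta^n$ to confirm that it is the global maximum rather than a relative maximum or a minimum.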
Several special cases are now considered; see Figs. 8.2a and 8.2b. Case 1 is for a constant $X$ and thus the average of $X^2$ is equal to $X^2$ at all times; hence all times can represent optimum conditions. Case 2 is for $X$ an exponential which increases asymptotically to unity. The maximum $\Delta^n$ occurs only at infinite $t$, but a maximum is closely approximated in a finite time. Case 3 is for a decaying exponential which has a maximum at $t = 0$. Case 4 has a monotonically increasing $X$ and as a consequence a monotonically increasing $\Delta^n$.

Figure 8.2a Sensitivity coefficients for $X_1 = 1$, $X_2 = 1 - e^{-t}$, $X_3 = e^{-t}$, and $X_4 = t^{1/2}$.

Figure 8.2b $\Delta^n$ criterion for the sensitivities given in Fig. 8.2a. The only constraint is that of a fixed large $n$.

Further cases are shown in Figs. 8.3a and 8.3b. Case 5 is a cosine which has a maximum $\Delta^n$ at $t = 0$. Case 6 is the sine function which has a maximum $\Delta^n$ at $t = 2.2467$. Both these sinusoidal cases have numerous maxima or minima, but only one global maximum. The final case, case 7, has its function $X$ depicted in Fig. 8.3a and its $\Delta^n$ shown in Fig. 8.3b; it is first positive, becomes negative, and asymptotically approaches $-0.5$. This case has a local maximum of $\Delta^n$ near $t = 0.4$, but the true maximum occurs at infinity.

Example 8.2.3

A point on a rotating wheel is observed normal to its axis and is seen to move a distance $s$ with respect to the axis. The known model is $s = \beta \sin \omega t$ where $\omega$ is the known angular velocity. The measurements of $t$ can be assumed to be errorless, but those of $s$ satisfy the standard conditions. A large number of uniformly spaced measurements are to be taken starting at $t = 0$. For an optimal test to estimate $\beta$, what should be the duration of the test?

Solution

The conditions for Fig. 8.3b are satisfied. The $\Delta^n$ for this case is $\Delta_6^n$ which has a global maximum at $\omega t = 2.2467$ radians. Hence the duration of the test should be $t_n = 2.2467/\omega$.
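The value $\omega t = 2.2467$ can be checked numerically. For $X = \sin t$, (8.2.3) gives $\Delta^n = 1/2 - \sin(2 t_n)/(4 t_n)$, and condition (8.2.5) reduces to $\tan 2\tau = 2\tau$. The Python sketch below applies a Newton correction to $g(\tau) = X^2(\tau) - \Delta^n(\tau)$, using $dg/d\tau = 2XX' - (X^2 - \Delta^n)/\tau$. The starting value of 2.0 and the angular velocity of 10 rad/sec are assumptions chosen for illustration; from other starting values the iteration can converge to a different stationary point of $\Delta^n$.

```python
import math


def delta_n_sine(t_n):
    """Closed form of (8.2.3) for X = sin t."""
    return 0.5 - math.sin(2.0 * t_n) / (4.0 * t_n)


def newton_duration_sine(tau, max_iter=50, rel_tol=1e-12):
    """Newton iteration on g(tau) = sin^2(tau) - Delta^n(tau), from (8.2.5)."""
    for _ in range(max_iter):
        g = math.sin(tau) ** 2 - delta_n_sine(tau)
        # d(Delta^n)/d(tau) = (X^2 - Delta^n)/tau = g/tau and 2 X X' = sin(2 tau)
        dg = math.sin(2.0 * tau) - g / tau
        step = -g / dg
        tau += step
        if abs(step) < rel_tol * abs(tau):
            break
    return tau


tau = newton_duration_sine(2.0)   # assumed starting guess near the maximum
omega = 10.0                      # hypothetical angular velocity, rad/sec
print("omega * t_n =", tau)
print("test duration t_n =", tau / omega, "sec")
```

As with any root of (8.2.5), the result should be compared with values of $\Delta^n$ at neighboring durations to confirm that it is a maximum.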
Figure 8.3a Sensitivity curves for $\cos t$, $\sin t$, and another function defined in Fig. 8.3b.

Figure 8.3b $\Delta^n$ criterion for the sensitivities given in Fig. 8.3a ($X_5 = \cos t$, $X_6 = \sin t$, and $X_7$). The only constraint is that of a fixed large $n$.

Model $\eta = \beta X(t)$ for Fixed Large n and Fixed Maximum Value of $|\eta|$

For some models of $\eta$, there are upper bounds of $|\eta|$ implicit in the model. Examples are the $\eta$ models of $\beta C \sin t$, $\beta C \cos t$, $\beta C e^{-2t}$, and $\beta C \tanh 3t$. In each of these cases the largest possible $|\eta|$ is $|\beta C|$. In other cases, such as for $\eta = \beta C t$, there are no implicit limits on $|\eta|$ if the $t$ range is not restricted. For both types of $\eta$'s the maximum $|\eta|$ might be specified to be $\eta_{\max}$ by appropriately adjusting $C$. For the model $\eta = \beta C \sin t$ and $t > 0$, $C$ would be equal to $\eta_{\max}/(\beta \sin t_n)$ for $0 < t_n < \pi/2$ and $\eta_{\max}/\beta$ for $t_n > \pi/2$. For another example, let $\eta$ be temperature, $t$ time, and $Q$ the rate of energy input; for some physical conditions $\eta = \beta Q t$ and the maximum temperature ($\eta$) is known and is to be attained by adjusting the energy input $Q$. In the following analysis the maximum $|\eta|$ is to be $\eta_{\max}$ for both types of models; this is equivalent to prescribing the maximum $|X|$ since $\eta_{\max} = |\beta|\,|X|_{\max}$.

The derivation of a criterion starts with $\Delta^n$ which includes the constraint of fixed large $n$ for measurements uniformly spaced in $t$. The problem is to maximize $\Delta^n$ subject to the constraint of the maximum $X^2$ being equal to exactly $X_{\max}^2$. Let $X(t) = C f(t)$ where $C$ is to be adjusted to make $\max X^2 = X_{\max}^2$. Let $X_{\max}$ be the positive square root of $X_{\max}^2$. Then $X_{\max}^2 = C^2 f_{\max}^2$ where $f_{\max}^2$ designates the maximum $f^2$ value. We can write $\Delta^n$ as

$$\Delta^n = \frac{1}{t_n} \int_0^{t_n} X^2\, dt = \frac{1}{t_n} \int_0^{t_n} C^2 f^2\, dt = \frac{1}{t_n} \int_0^{t_n} X_{\max}^2 \left( \frac{f}{f_{\max}} \right)^2 dt$$

Observe that $(X/X_{\max})^2 = (f/f_{\max})^2$ is independent of $C$. Then for arbitrary values of $C$ (or $X_{\max}$) the criterion to maximize is

$$\Delta^+ \equiv \frac{1}{t_n} \int_0^{t_n} \left[ X^+(t) \right]^2 dt, \qquad X^+ \equiv \frac{X}{X_{\max}} \qquad (8.2.7)$$

Although $X^+$ is indicated to have a dependence only on $t$, it may also depend on $t_n$. For example, for $0 < t < t_n < \pi/2$, $X^+ = \sin t / \sin t_n$ for $\eta = \beta C \sin t$.

As the first example of the use of the criterion given by (8.2.7), let $X(t) = C t^m$, $t > 0$, $C > 0$, and then $X_{\max} = C t_n^m$. Hence $X^+ = (t/t_n)^m$ where $m$ is an exponent equal to or greater than zero. Using (8.2.7), $\Delta^+$ for this case is

$$\Delta^+ = \frac{1}{t_n} \int_0^{t_n} \left( \frac{t}{t_n} \right)^{2m} dt = \frac{1}{2m + 1} \qquad (8.2.8)$$

Note that this result is independent of $C$ and $t_n$, unlike the similar case treated in Example 8.2.1 where no constraints are used. It is also unlike the result of the single constraint of fixed large $n$; see $\Delta_4^n$ in Fig. 8.2b. The result given by (8.2.8) means that there is no unique optimum time $t_n$ for $X = C t^m$ and $X_{\max}$ being the same value in each case.

For the case of $X = C \sin t$ the use of (8.2.7) yields

$$\Delta^+ = \begin{cases} \dfrac{1}{t_n} \displaystyle\int_0^{t_n} \left( \dfrac{\sin t}{\sin t_n} \right)^2 dt = \dfrac{2 t_n - \sin 2 t_n}{4 t_n \sin^2 t_n}, & 0 < t_n \le \pi/2 \\[2ex] \dfrac{2 t_n - \sin 2 t_n}{4 t_n}, & t_n \ge \pi/2 \end{cases} \qquad (8.2.9)$$

The latter portion of the $\Delta^+$ curve is the same as given by $\Delta_6^n$ in Fig. 8.3b and the 0 to $\pi/2$ portion is shown as the dashed curve in the same figure. Note that the maximum is unaffected by the constraint of a fixed range of $\eta$. The same is true for the maximum $\Delta^+$ for the $X = C \cos t$ curve (see $\Delta_5^n$ in Fig. 8.3b). Based on the above examples the shape of the $\Delta^+$ curve may or may not be affected by the $\eta$ range constraint. Also the location of the maximum might or might not be changed.

8.2.2 One-Parameter Nonlinear Cases, $\eta = \eta(\beta, t)$

For one-parameter nonlinear cases with the standard assumptions of 11111-11 valid, the variance of the estimator $b$ of $\beta$ in the model $\eta = \eta(\beta, t)$ is approximately

$$V(b) \approx \sigma^2 \left[ \sum_{i=1}^{n} X_i^2 \right]^{-1} \quad \text{where} \quad X_i = \frac{\partial \eta(\beta, t_i)}{\partial \beta} \qquad (8.2.10)$$

Again the optimal experiment involves maximizing the sum of the squares of the sensitivity coefficients. As in the linear case the optimal unconstrained experiment would involve locating all the observations at the maximum possible $|X|$. An analysis is given below for cases for which it is more practical to use uniformly spaced measurements in $t$. The constraints of (1) a fixed number $n$ of equally spaced measurements between $t = 0$ and $t_n$ and (2) a maximum value of $|\eta|$, designated $\eta_{\max}$, are to be included in the analysis. For large $n$ (8.2.2) permits writing

$$\Delta \approx \frac{n}{t_n} \int_0^{t_n} X^2\, dt = \frac{n\, \eta_{\max}^2}{\beta^2} \left[ \left( \eta_m^+ \right)^{-2} \frac{1}{t_n} \int_0^{t_n} \left( X^+ \right)^2 dt \right] \qquad (8.2.11)$$

where

$$X^+ \equiv \frac{\beta}{\eta_{\text{nom}}} \frac{\partial \eta}{\partial \beta}, \qquad \eta_m^+ \equiv \frac{\eta_{\max}}{\eta_{\text{nom}}} \qquad (8.2.12)$$

where $\eta_{\text{nom}}$ is some nominal value of $\eta$ which is chosen to make $\eta_m^+$ a function of $t_n$ only and not of $t$ or $\eta_{\max}$; $\eta_{\text{nom}}$ itself is not a function of $t_n$. Note that maximizing $\Delta$ subject to the constraints of fixed $n$ and $\eta_{\max}$ is equivalent to maximizing the term inside the brackets of (8.2.11), which is defined to be

$$\Delta^+ \equiv \left( \eta_m^+ \right)^{-2} \frac{1}{t_n} \int_0^{t_n} \left( X^+ \right)^2 dt \qquad (8.2.13)$$

For linear models this criterion is equivalent to that given by (8.2.7). A necessary condition to maximize $\Delta^+$ with respect to $t_n$ is found from $\partial \Delta^+ / \partial t_n = 0$ which yields

$$\Delta^-(\tau) = \left[ X^+(\tau) \right]^2 \left\{ 1 + \left[ \frac{2\tau}{\eta_m^+(\tau)} \right] \left[ \frac{d \eta_m^+(\tau)}{d t_n} \right] \right\}^{-1} \qquad (8.2.14)$$

where $\tau$ is the value of $t_n$ maximizing $\Delta^+$ and $\Delta^-(\tau)$ is defined by

$$\Delta^-(\tau) \equiv \frac{1}{\tau} \int_0^{\tau} \left[ X^+(t) \right]^2 dt \qquad (8.2.15)$$

As an example of a nonlinear model let $\eta$ be given by

$$\eta = C \exp(-\beta t) \qquad (8.2.16)$$

for which it is convenient to make $\eta_{\text{nom}}$ equal to $C$. Then $X^+$ becomes

$$X^+ = -\beta t \exp(-\beta t) \qquad (8.2.17)$$

which has a maximum amplitude at $t^+ = \beta t = 1$ as shown in Fig. 8.4. For this case the maximum $\eta$ occurs at $t = 0$ so that $\eta_m^+$ is equal to unity. The location of the maximum $\Delta^+$ defined by (8.2.13) is at $\beta t = 1.691817$.

Figure 8.4 Sensitivity coefficient, $\eta/C$, and $\Delta^+$ for $\eta = C \exp(-\beta t)$, plotted against $t^+ = \beta t$ or $t_n^+ = \beta t_n$. $\Delta^+$ has a maximum at $t^+ = 1.691817$.

Unlike linear parameter estimation cases, the possible dependence of $t_n$ on $\beta$ complicates optimal design in nonlinear cases. For the present case this difficulty is not as severe as it first may seem, however. Notice that $\Delta^+$ shown in Fig. 8.4, though having a unique maximum, has values within 80% of its maximum for the large range of $t^+ = \beta t$ between one and three. Hence in this example an accurate initial estimate of $\beta$ is not necessary to obtain a good experiment design. Another approach is to note that since the optimal $\Delta^+$ occurs when $\eta/C \approx 0.184$, the duration could be selected to make $\eta/C$ approximately this value.

Another example which has a model related to (8.2.16) is

$$\eta = C \left[ 1 - \exp(-\beta t) \right], \qquad \eta_{\text{nom}} = C \qquad (8.2.18)$$

which has the sensitivity $X^+$ of $\beta t \exp(-\beta t)$. In this example, $\eta$ initially increases with time so that $\eta_m^+ = 1 - \exp(-\beta t_n)$. Using (8.2.14) for determining the optimal duration $\tau$ gives

$$\Delta^-(\tau) = \frac{\left[ \beta \tau \exp(-\beta \tau) \right]^2}{1 + \left[ 2 \beta \tau \exp(-\beta \tau) \right] \left[ 1 - \exp(-\beta \tau) \right]^{-1}} \qquad (8.2.19)$$

where $\Delta^-$ is the same as $\Delta^+$ of Fig. 8.4. This condition is satisfied only near $\tau = 0$ where $\Delta^+ = 1/3$. For small $t$ values $\eta$ is approximately $C \beta t$ which has this $\Delta^+$ value [see (8.2.8)], whereas for larger $t$ values $\Delta^+$ decreases. Again it is observed that the constraint on the maximum value of $|\eta|$ changes the optimal conditions.

Example 8.2.4

Consider again the example of the cooling billet investigated in Example 6.2.3 and Sections 7.5.2 and 7.9.2. The billet was heated to the temperature $T_0$ and then allowed to cool in open air at a temperature $T_\infty = 81.5$°F (301 K). Though several models were considered for this billet, let us consider here only Model 1, which is [see (7.5.10)]

$$\frac{T - T_\infty}{T_0 - T_\infty} = \exp(-\beta t)$$

The parameter to be estimated is $\beta$. The optimal duration of the experiment is to be found for a large number of measurements uniformly spaced in $t$ starting at $t = 0$. The initial temperature difference $T_0 - T_\infty$ can be set by simply heating the billet before placing it in the air. The air temperature $T_\infty$ is a fixed, known value. The temperature $T$ must be between $T_\infty$ and $T_0$.

Solution

The model can be considered to be

$$\eta \equiv T - T_\infty = C \exp(-\beta t) \quad \text{where} \quad C = T_0 - T_\infty$$

This model is identical to (8.2.16) and $\eta_{\text{nom}}$ is also $C$. Furthermore, $\eta_m^+ = \eta_{\max}/\eta_{\text{nom}} = 1$. With the condition of a large number of equally spaced measurements, this example is the same as considered for (8.2.16) for which we found the optimal experiment to have the duration $\tau = 1.691/\beta$.

This time $\tau$ can be compared with the value actually used. From Fig. 7.13 the $b_1$ value is about 2.7 hr⁻¹ and thus $\tau = 1.691/2.7 = 0.63$ hr. From Table 6.2 the maximum time is 1536 sec or 0.427 hr; this time corresponds to $\beta\tau = 1.15$ at which time $\Delta^+$ in Fig. 8.4 is about 80% of its maximum value. Hence for estimating $\beta$ in Model 1, the experiment was well designed. For the more complicated models discussed in Section 7.5.2, the optimal duration may be different.

8.2.3 Iterative Search Method

One obvious way to maximize $\Delta$, $\Delta^n$, or $\Delta^+$ is to plot it versus $t_n$ and then observe the value of $t_n$ which maximizes the selected $\Delta$ function. A more direct procedure is to linearize in a similar fashion as in the Gauss method. Let us illustrate the method by considering $\Delta^n$. Assume that a maximum exists and that an estimate of the optimal $\tau$ is $\tau^{(k)}$. A necessary condition at the maximum $\Delta^n$ is given by (8.2.5). Expanding both sides using a truncated Taylor series gives

$$X^2(\tau^{(k)}) + 2 X(\tau^{(k)})\, X'(\tau^{(k)})\, \Delta\tau^{(k)} = \Delta^n(\tau^{(k)}) + \left. \frac{\partial \Delta^n}{\partial t_n} \right|_{\tau^{(k)}} \Delta\tau^{(k)} \qquad (8.2.20)$$

which can be solved for the correction $\Delta\tau^{(k)}$ to get

$$\Delta\tau^{(k)} = \frac{\Delta^n(\tau^{(k)}) - X^2(\tau^{(k)})}{2 X(\tau^{(k)})\, X'(\tau^{(k)}) - \left. \partial \Delta^n / \partial t_n \right|_{\tau^{(k)}}} \qquad (8.2.21\text{a})$$

$$\left. \frac{\partial \Delta^n}{\partial t_n} \right|_{\tau^{(k)}} = \frac{X^2(\tau^{(k)}) - \Delta^n(\tau^{(k)})}{\tau^{(k)}} \qquad (8.2.21\text{b})$$

A few points can be made in connection with the iterative procedure for finding the optimal $\tau$ given by (8.2.21). First, an initial estimate of $\tau^{(0)}$ is needed. A reasonable value to use is twice the $t$ value at which $|X|$ is a maximum. This is the value that is found for $\tau^{(1)}$ if one starts at $\tau^{(0)}$ corresponding to the maximum value of $|X|$. Second, improved values of $\tau$ are given by

$$\tau^{(k+1)} = \tau^{(k)} + \Delta\tau^{(k)} \qquad (8.2.22)$$

Third, in order to be sure that each iteration helps to increase $\Delta^n$, the values of $\Delta^n$ should also be calculated and compared as one proceeds. If $\Delta^n$ should decrease, a smaller $\Delta\tau$ should be selected. (It is also possible that the method is leading to a minimum rather than a maximum.) Fourth, $\tau$ cannot be negative even though this procedure seeks a maximum in the region $-\infty < \tau < \infty$. The procedure based on (8.2.21) is not appropriate if the maximum $\Delta^n$ occurs at the boundary point $\tau = 0$ and $\partial \Delta^n / \partial t_n$ is not zero there. See $\Delta_3^n$ in Fig. 8.2b. Finally, the procedure terminates when $|\Delta\tau^{(k)}|$ is much smaller than $\tau^{(k)}$.

8.3 CRITERIA FOR OPTIMAL EXPERIMENTS FOR MULTIPLE PARAMETERS

8.3.1 General Criteria

When there are two or more parameters to estimate, the choice of a criterion to indicate the optimal design of the experiments is less straightforward than for the case of one parameter. Many criteria have been proposed. They are usually given in terms of $\mathbf{X}^T\mathbf{X}$. For both linear and nonlinear estimation, $\mathbf{X}$ represents the sensitivity matrix. Recall that the covariance matrix of the estimator vector $\mathbf{b}$ is $(\mathbf{X}^T\mathbf{X})^{-1}\sigma^2$ for the standard assumptions of additive, zero mean, constant variance, independent, normal measurement errors in $\mathbf{Y}$; additional assumptions are that there are no errors in the independent variables and that $\boldsymbol{\beta}$ is a constant parameter vector with no prior information.
The value of $\sigma^2$ need not be known. These assumptions are designated 11111-11. For these assumptions OLS, Gauss-Markov, ML, and MAP all give the same estimator.

Some of the criteria which have been suggested in terms of $\mathbf{X}^T\mathbf{X}$ are as follows: (a) maximization of the determinant of $\mathbf{X}^T\mathbf{X}$ (or equivalently, the maximization of the product of the eigenvalues of $\mathbf{X}^T\mathbf{X}$), (b) maximization of the minimum eigenvalue of $\mathbf{X}^T\mathbf{X}$, and (c) maximization of the trace of $\mathbf{X}^T\mathbf{X}$. These criteria are listed by Badavas and Saridis [4], who used the second criterion. Additional criteria are listed on p. 52 of Fedorov [3]. McCormack and Perlis [5] used a criterion similar in principle to (c). We recommend the first one because it is equivalent to minimizing the hypervolume of the confidence region (provided the assumptions 11111-11 are valid).

A criterion similar to $\max |\mathbf{X}^T\mathbf{X}|$ was used by Smith [7] as early as 1918. The best-known early work involving $\max |\mathbf{X}^T\mathbf{X}|$ was reported by Box and Lucas [1] in 1959, however. Another derivation for the closely related criterion of maximization of $|\mathbf{X}^T \boldsymbol{\psi}^{-1} \mathbf{X}|$ is given in Chapter 11 in Nahi's book [6]. [See (8.3.2) below.] The derivation is based on the Cramér-Rao lower bound which is appealing, he states, since the lower bound does not depend on the knowledge of the specific estimator (LS, ML, etc.) to be used.

As mentioned above, we derive our criterion based on the assumption that we wish to minimize the hypervolume of the confidence region. In so doing it is implied that each parameter is considered in the same manner and that the cost of each measurement is the same. In Section 8.8 the case of only selected estimated parameters being of interest is discussed.
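The three criteria (a), (b), and (c) can be compared on a small example. The Python sketch below evaluates the determinant, the minimum eigenvalue, and the trace of $\mathbf{X}^T\mathbf{X}$ for a two-parameter sensitivity matrix. The model $\eta = \beta_1 + \beta_2 t$ on $0 \le t \le 1$ (sensitivities 1 and $t$), the two four-point designs, and all function names are assumptions chosen for illustration; they are not taken from the text.

```python
import math


def xtx_two_param(X):
    """Return (a, b, c) such that X^T X = [[a, b], [b, c]] for an n-by-2 X."""
    a = sum(row[0] * row[0] for row in X)
    b = sum(row[0] * row[1] for row in X)
    c = sum(row[1] * row[1] for row in X)
    return a, b, c


def criteria(X):
    """Criteria (a), (b), (c): determinant, minimum eigenvalue, trace of X^T X."""
    a, b, c = xtx_two_param(X)
    det = a * c - b * b
    trace = a + c
    # Eigenvalues of a symmetric 2-by-2 matrix in closed form.
    disc = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    return {"det": det, "min_eig": 0.5 * (trace - disc), "trace": trace}


def sensitivity_matrix(times):
    """Sensitivities for the assumed model eta = beta1 + beta2 * t."""
    return [(1.0, t) for t in times]


# Design A: four measurements uniformly spaced over the allowed range.
crit_a = criteria(sensitivity_matrix([0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]))
# Design B: two measurements at each end of the allowed range.
crit_b = criteria(sensitivity_matrix([0.0, 0.0, 1.0, 1.0]))

print("uniform design: ", crit_a)
print("end-point design:", crit_b)
```

In this assumed example the end-point design is preferred by all three criteria. That agreement is particular to the example; in general the criteria need not select the same design.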
A criterion is derived in A ppendix 8 A t hat is valid for the assumptions o f a dditive, zero m ean n ormal e rrors in Yj , a nd e rrorless d ependent v ariables. Specifically the a ssumptions a re d enoted] 1--] I II. F or t he O lS, G auss-Markov, a nd M l e stimators, t he criteriQn is the maximization o f t he d eterminant o f t he covariance matrix o f t he e stimator v ector b. F or the s tandard a ssumptions d enoted 11111-11, t he a bove e stimators all have the s ame c ovariance matrix o f (XTX)-1(J2; t hen t he related criterion for optimal experiments is t o m aximize (8.3.1 ) F or m aximum l ikelihood estimation a nd a ssumptions d enoted 11--1011, t he criterion is t o m aximize (8.3.2) F or o rdinary l east squares estimation with the s ame a ssumptions, the criterion is t o m aximize (8.3.3) T he l ast two expressions a re v alid for correlated, n onconstant v ariance m easurement e rrors. T he m ax IXTXI c riterion reduces t o t he .:1 c riterion utilized in S ection 8.2 f or o ne p arameter. T he c onstraint o f 'a fixed n umber o f o bservations n is also o f i nterest for the m u/tiparameter case. W e wish t o d efine a max 6 " c riterion t hat i ncludes this c onstraint i n s uch a w ay t hat 6 " is consistently defined with the o ne-parameter c ase a nd a lso s o t hat a r eplication o f d iscrete measurements will n ot c hange its value. S uch a c riterion is m ax.:1"=max ~ n (8.3.4) where .:1 c ould be r eplaced by the expressions i n ( 8.3.1,2,3) d epending o n t he a ssumptions a nd e stimation m ethod. C HAPTER. DESIGN OF O mMAL EXPERIMENTS When the measurements are uniformly spaced in I between 0 a nd tIl a nd '. n is large. 6 ST for p = 2 is I .J o mMAL EXPERlMEM'S FOR MULTIPLE PARAMETERS case X is a s quare matrix. This results in the following simplications for 6 ST• 6 ML• a nd 6 0LS• IXI' 6 ML .... ~LS . . - (S.3.5) where X j(t) is the sensitivity coefficient for p arameter; a nd time I . T he extension to p > 2 is direct. 
I f, in addition. there is a constraint of the maximum 1) being specified. one can modify (S.3.5) by replacing the integrals by a typical expression of C/ = (1),:)-2 1 ( "X/' ( t)X/ ( /)dt I" ) 0 (S.3.6a) 435 I~I (8.3.9) Note that the s ame criterion is given for bOlh M l a nd O lS estimation for the assumptions of 11--1011. Also observe that the optimal choice of X elements are affected by the accuracy o f a nd correlation between the measurements. Consider now the criterion of maximizing 6 ST• which is equivalent to maximizing the absolute value o f where III il1)(/) + _ 1 )mu X /(/)=-aa-' 1) nom 1)", =-- (S.3.6b) which are similar to the expressions given in Section S.2.2. Then for two parameters with the constraints of large n with uniform spacing in I a nd the maximum 1) being 1 )mu' we have the criterion of maximizing (S.3.7) I f there are multiresponses in the experiment and measurements are taken with uniform time spacing starting a t I = 0, a nother 6 + criterion must be given. As above, the symbol 6 + means that the constraint of the same 1) range is included in addition to the constraint of uniform M. Examples of multi response cases involving transient temperature measurements a t more than one position are studied in Section S.5. l et m denote the number of independent responses. This case can be treated by extending the definition of C / given by (S.3.6a) to + Ci j+ - ( 1)", )-2 - I ~ ~ mt" Ie-, i" Xi IXI"" 1) nom PI + ( I,x,,)") + ( /.x,,)dl (S.3.S) 0 where x" is used to designate the kth response. By defining C / in this manner. 6 + is unchanged in value if m sensors are located a t the same position (or measure the same quantity). (8.3.10) since when X is a square matrix, 4=IX TXI-IXI 2• In the remainder of Section 8.3.2 let IXI d enote tbe absolute value of tbe determinant o f X. As mentioned above there usually must be constraints o n tbe range o f operability (a term used by Atkinson a nd H unter (8]). 
Let R ( I) define tbe region of operability; the f vector bas elements that can be illustrated by writing the linear model as ." - P.I, + . .. + IlpJ,. However. not all the values of Xi) may b e attainable. as, for example, when I, - I. Let those values which are lvailable for experimentation define tbe attainable region R (x). a subspace of tbe p-dimensional X space. The design problem then becomes that o f selecting n points in R (x) which maximizes IXI. Atkinson a nd H unter (S] bave sbown. tbat tbe value of the determinant given by (S.3.10) is proportional to the volume o f the simplex formed by the origin and the p experimental points. Thus an optimal design is o ne for wbicb the simplex volume is maximized. I t follows, then, tbat for an n = p design to b e optimal, tbe experimental points must lie on the boundary o f R (x). & L21 UMME~$IMp=2 Constraints 0 10 < I, < I a nd 0 < 12 < I Consider tbe simple linear model 8.3.2 Case of Same Number of Measurements as Parameters ( n = p ) O ne possible multiparameter case is when the number of measurements and parameters are equal. Without prior information the minimum number of measurements n needed to estimate p parameters is n =p . In this (S.3.10) witb tbe constraints 0 <1, <I a nd CHAPTER 8 DESIGN O F O PTIMAL EXPERIMENTS 06 R(f) f or ~=Blfl+BZfZ 8.3 O PTIMAL EXPERIMENTS F OR M ULnPLE PARAMETERS ~IIII. 437 R(.!) ( operability r egion) R (f) /2 (oPer. region) "max R(~) f or n=B1+BZfZ o L -_ _ _ _..I.-_....:.(.at~tafnabl1fty o Xl w r egion) ~ W F 1pre '.6 Several regions of operability a nd attainability for constraint of 0 <. ' I <. ' I. ... FIIIUfe 8 5 Regions of operability and attainability (R (I) a nd R (xl) for constraints of 0<. I. <. I a nd 0 <. 12 <. I. T hus the region of operability R ( f) is the unit square shown in Fig. 8.5. 
For this case the sensitivity coefficients are $X_1 = f_1$ and $X_2 = f_2$ and the absolute value of the determinant of $\mathbf{X}$ is

$$|\mathbf{X}| = |f_{11}f_{22} - f_{12}f_{21}| \tag{8.3.11}$$

where the vertical bars on the right side mean absolute value. For the model $\eta = \beta_1 + \beta_2 f_2$, the attainable region $R(x)$ is the unit vertical line at $X_1 = 1$ shown in Fig. 8.5. From the geometrical interpretation of $\max|\mathbf{X}|$, the optimal design for two experiments consists of two points in $R(x)$ which, together with the origin, form a triangle of greatest area. For this case the optimal two points are the extremes of the line, $(X_1, X_2) = (1,0)$ and $(1,1)$. For this design $|\mathbf{X}| = 1$.

If the attainable region $R(x)$ happens to be the operability region $R(f)$, which is the unit square, an infinity of designs give the same maximum value of the determinant, namely, one experiment at $(X_1, X_2) = (0,1)$ and the other anywhere between and including $(1,0)$ and $(1,1)$, or one at $(1,0)$ and the other anywhere between and including $(0,1)$ and $(1,1)$. All these designs also give a $|\mathbf{X}|$ value of unity.

Constraint $0 \le \eta \le \eta_{\max}$

Frequently a more realistic constraint than on the $f_i$ values is on the range of $\eta$. In this section the case of $0 \le \eta \le \eta_{\max}$ is investigated for $n = p = 2$ and the linear model. Three different variations of this case are considered.

Case 1. In this case there are no constraints on the $f_i$'s so that $R(f)$ is equal to $R(x)$. For the model $\eta = \beta_1 f_1 + \beta_2 f_2$ the optimal design points are found from

$$\max|\mathbf{X}| = \max|f_{11}f_{22} - f_{12}f_{21}| \tag{8.3.12}$$

while satisfying the condition

$$\max \eta = \eta_{\max} = \max|\beta_1 f_1 + \beta_2 f_2| \tag{8.3.13}$$

The region $R(x)$ is the triangular region bounded on one side by the line determined by varying $f_1$ and $f_2$ in (8.3.13); see Fig. 8.6a. The largest $|\mathbf{X}|$ value is found by the two points and the origin comprising the largest triangle in $R(x)$. In this example the optimal conditions are $(\beta_1 X_1, \beta_2 X_2) = (\eta_{\max}, 0)$ and $(0, \eta_{\max})$. This results in $\max|\mathbf{X}|$ being equal to $|\eta_{\max}^2/\beta_1\beta_2|$.

Case 2. In this case the operability region $R(f)$ is greater than the attainable region $R(x)$. As an example consider the model $\eta = \beta_1 + \beta_2 f_2$ for which $R(x)$ is the vertical line at $\beta_1 X_1 = \beta_1$ shown in Fig. 8.6b. Hence the two extreme points along $R(x)$ together with the origin form the maximum triangle. The maximum $|\mathbf{X}|$ is $|(\eta_{\max} - \beta_1)/\beta_2| = \max|f_2|$, which is made larger by increasing $\max|f_2|$.

Case 3. In the last case $\eta$ is given by

$$\eta = c(\beta_1 f_1 + \beta_2 f_2) \tag{8.3.14}$$

where $c$ is adjusted to make $\max|\eta| = \eta_{\max}$. In symbols, $c$ is

$$c = \frac{\eta_{\max}}{\max|\beta_1 f_1 + \beta_2 f_2|} \tag{8.3.15}$$

Then the maximum $|\mathbf{X}|$ value is

$$\max|\mathbf{X}| = \max\left(c^2|f_{11}f_{22} - f_{21}f_{12}|\right) = \frac{\eta_{\max}^2\,\max|f_{11}f_{22} - f_{21}f_{12}|}{\max(\beta_1 f_1 + \beta_2 f_2)^2} \tag{8.3.16}$$

In order to illustrate this expression, consider the case of $\eta = c(\beta_1 + \beta_2 f_2)$ for which $f_{11} = f_{21} = 1$. If we further choose $f_{22}$ to be equal to or greater than $f_{12} \ge 0$, the $\max|f_{11}f_{22} - f_{21}f_{12}|$ value is $f_{22}$ (by setting $f_{12} = 0$) and $\max|\mathbf{X}|$ is

$$\max|\mathbf{X}| = \frac{\eta_{\max}^2}{\beta_2}\,(\beta_2 f_{22})(\beta_1 + \beta_2 f_{22})^{-2} = c^2 f_{22} \tag{8.3.17}$$

which is similar to that given for case 2. Note that now the maximum of $\max|\mathbf{X}|$ is not simply given by the maximum value of $f_{22}$. Rather, differentiating (8.3.17) with respect to $\beta_2 f_{22}$ and setting the equation equal to zero gives

$$\beta_2 f_{22}\big|_{\mathrm{opt}} = \beta_1 \tag{8.3.18}$$

which are then both equal to $\eta_{\max}/2$. Then we find $\max(\max|\mathbf{X}|)$ to be $\eta_{\max}/2\beta_2$ or, equivalently, $\eta_{\max}^2/4\beta_1\beta_2$. (Much smaller $\max|\mathbf{X}|$ values are found for certain other $\beta_2 f_{22}$ values; for example, it goes to zero for $\beta_2 f_{22}$ approaching both zero and infinity.) The optimum two measurement points are shown in Fig. 8.6c and are $(\beta_1 X_1, \beta_2 X_2) = (\eta_{\max}/2, 0)$ and $(\eta_{\max}/2, \eta_{\max}/2)$.

Nonlinear Example for p = 2

A model studied first by Box and Lucas [1] and later by Atkinson and Hunter [8] is next considered. Preliminary estimates of $\beta_1$
= 0.7 and $\beta_2 = 0.2$ yield the model and sensitivities of (see Problem 8.10)

$$\eta = 1.4[\exp(-0.2t) - \exp(-0.7t)] \tag{8.3.19a}$$

$$\beta_1 X_1 = 0.7[(0.8 + 1.4t)\exp(-0.7t) - 0.8\exp(-0.2t)] \tag{8.3.19b}$$

$$\beta_2 X_2 = 0.2[(2.8 - 1.4t)\exp(-0.2t) - 2.8\exp(-0.7t)] \tag{8.3.19c}$$

which are plotted in Fig. 8.7. The operability range of $\eta$ is between 0 and 0.6; $X_1(t)$ and $X_2(t)$ are also finite but may be negative as well as positive.

Figure 8.7 $\eta$ and sensitivities for Box and Lucas [1] example.

Figure 8.8 $R(x)$ for Box and Lucas [1] example. (Printed by permission of the Biometrika Trustees.)

Note that $X_1$ and $X_2$ are uncorrelated and have maximum absolute values at different $t$ values. Plotting $X_2$ versus $X_1$ as in Fig. 8.8 provides the attainable region $R(x)$, which is a curved line in this case. The points of the optimal design, shown in Fig. 8.8 by heavy dots labeled A and B, together with the origin, form the triangle of maximum area within $R(x)$. Associated values of $t$ are 1.23 and 6.86, which values are affected by the choice of the parameters. Since the $\beta$'s are not precisely known when the experiment is designed, one might wish to relate these values to associated measured $\eta$ values. For example, at $t_2 = 6.86$, $\eta$ has reduced to about 57% of its maximum value.

Atkinson and Hunter also studied optimal designs for up to 20 measurements. For these cases $\Delta^n$ is given by

$$\Delta^n = \frac{\left(\sum_{i=1}^{n} X_{i1}^2\right)\left(\sum_{i=1}^{n} X_{i2}^2\right) - \left(\sum_{i=1}^{n} X_{i1}X_{i2}\right)^2}{n^2} \tag{8.3.20}$$

Their results for maximum values of $\Delta^n$ are given in Table 8.1. In each case the optimal design is found to consist of measurements solely at the two times indicated above. When $n$ is even, equal numbers of measurements at each point maximize $\Delta^n$. For odd $n$ an extra measurement at either of the two conditions gives the same maximal $\Delta^n$.
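The two-point design for the Box and Lucas example can be checked numerically. The sketch below is only an illustration, not the procedure used by Box and Lucas or by Atkinson and Hunter. It assumes the underlying model is $\eta = \beta_1/(\beta_1-\beta_2)\,[\exp(-\beta_2 t) - \exp(-\beta_1 t)]$, which reproduces (8.3.19a) to (8.3.19c) at $\beta_1 = 0.7$, $\beta_2 = 0.2$, and it uses a plain grid search over pairs of times. The function names, the grid spacing, and the search range are illustrative choices.

```python
import math


def eta(t, b1=0.7, b2=0.2):
    """Assumed model: eta = b1/(b1-b2) * (exp(-b2*t) - exp(-b1*t))."""
    return b1 / (b1 - b2) * (math.exp(-b2 * t) - math.exp(-b1 * t))


def sensitivities(t, b1=0.7, b2=0.2):
    """Return (d eta/d b1, d eta/d b2) for the assumed model."""
    d = b1 - b2
    e1, e2 = math.exp(-b1 * t), math.exp(-b2 * t)
    x1 = -b2 / d**2 * (e2 - e1) + b1 / d * t * e1
    x2 = b1 / d**2 * (e2 - e1) - b1 / d * t * e2
    return x1, x2


def best_two_times(t_max=20.0, step=0.02):
    """Grid search for the pair of times that maximizes |det X| (n = p = 2)."""
    times = [i * step for i in range(1, int(round(t_max / step)) + 1)]
    sens = [sensitivities(t) for t in times]
    best_det, best_a, best_b = 0.0, None, None
    for i, (a1, a2) in enumerate(sens):
        for j in range(i + 1, len(sens)):
            c1, c2 = sens[j]
            det = abs(a1 * c2 - a2 * c1)
            if det > best_det:
                best_det, best_a, best_b = det, times[i], times[j]
    return best_det, best_a, best_b


if __name__ == "__main__":
    det, t_a, t_b = best_two_times()
    # Delta^n of (8.3.20) for n = 2 is |X|^2 / n^2.
    print(t_a, t_b, det**2 / 4)
```

Under these assumptions the search should land close to the times 1.23 and 6.86 quoted in the text, and $|\mathbf{X}|^2/4$ should agree with the $n = 2$ entry of Table 8.1 to about three figures.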
Table 8.1 Optimal Designs for up to 20 Measurements for the Box and Lucas Model Given by (8.3.19)*

           Number of measurements at
    n      t = 1.23      t = 6.86      Maximum $\Delta^n$
    2          1             1              0.1642
    3          1             2              0.1459
    3          2             1              0.1459
    4          2             2              0.1642
    5          2             3              0.1576
    5          3             2              0.1576
    6          3             3              0.1642
   10          5             5              0.1642
   20         10            10              0.1642

*Reprinted by permission from Technometrics [8].

Many have noted the conclusion that the $mp$ optimal conditions for determining $p$ parameters consist of $m$ repetitions of the optimal experiment. However, this conclusion is not always valid, as pointed out in [8]. Note that to obtain the 20 measurements in Table 8.1, 10 different experiments must be run. Because it is wasteful to disregard data at other times when the transient experiment has been performed, the emphasis in this chapter is upon many equally spaced measurements.

8.4 ALGEBRAIC EXAMPLES FOR TWO PARAMETERS AND LARGE n

8.4.1 Linear Model $\eta = \beta_1 + \beta_2 \sin t$

To illustrate the case of a large number of uniformly spaced measurements, consider the model

$$\eta = \beta_1 + \beta_2 \sin t \tag{8.4.1}$$

Assume that the assumptions of additive, zero mean, independent errors apply with the others denoted in 11111-11. No constraints are to be included for $\eta$ or $t_n$. For this case the optimal value of $t_n$ is found by maximizing (8.3.5). The sensitivity coefficients are $X_1 = 1$ and $X_2 = \sin t$ and the $C_{ij}$ values are

$$C_{11} = 1, \qquad C_{22} = \frac{1}{2} - \frac{1}{4t_n}\sin 2t_n, \qquad C_{12} = \frac{1}{t_n}(1 - \cos t_n) \tag{8.4.2}$$

These expressions are plotted in Fig. 8.9 along with $\Delta^n$. The optimal $t_n$ is 5.5, which is considerably larger than 2.25, the optimal $t_n$ for estimating only $\beta_2$.

Figure 8.9 Sensitivity curves for the model $\eta = \beta_1 + \beta_2 \sin t$. [Plotted against $t_n$; $\Delta^n$ is a maximum at $t_n = 5.5$.]

8.4.2 Exponential Models with One Linear and One Nonlinear Parameter

Exponentially decaying solutions commonly occur in science and engineering. One is

$$\eta = \beta_1 \exp(-\beta_2 t) \tag{8.4.3}$$

This could describe the temperature in a fin (Section 7.5.1) or that of a cooling billet (Section 7.5.2). [$T_\infty$ would be assumed known in (7.5.4) and (7.5.10).] For the assumptions denoted 11111-11 and no constraints on $\eta$ or $t$, the criterion to maximize again is $\Delta^n$, given by (8.3.5). The sensitivities are

$$X_1 = \frac{\partial \eta}{\partial \beta_1} = \exp(-t^+), \qquad X_2 = \frac{\partial \eta}{\partial \beta_2} = -\left(\frac{\beta_1}{\beta_2}\right) t^+ \exp(-t^+) \tag{8.4.4}$$

where $t^+ = \beta_2 t$. Functions similar to $X_1$ and $X_2$ are shown in Fig. 8.4.

Figure 8.10 Sensitivity curves for the model $\eta = \beta_1 \exp(-\beta_2 t)$. [Plotted against $t_n^+ = \beta_2 t_n$; maximum $\Delta^n$ at $t_n^+ = 1.191$.]

Figure 8.11 Sensitivity curves for the model $\eta = \beta_1(1 - e^{-\beta_2 t})$ with no constraint on maximum $\eta$. [Maximum $\Delta^n$ at $t_n^+ = 7.185$.]

If measurements were desired at only two locations, the optimal locations are at $t^+ = 0$ and 1, the former being where $X_1$ is a maximum and the latter where $|X_2|$ is. One can demonstrate this by plotting $X_2$ versus $X_1$ as in Fig. 8.8 and then finding the maximum triangle including the origin. If only two measurements are to be taken from each experiment in a series of experiments, the measurements should be made at just these two times in all the experiments.

The integrals $C_{ij}$ associated with $X_1$ and $X_2$ are plotted in Fig. 8.10 along with $\Delta^n$. A large number $n$ of equally spaced observations in $0 \le t \le t_n$ is used. The optimal duration of an experiment for determining both $\beta_1$ and $\beta_2$ is the time at which $\Delta^n$ is a maximum, $t_n^+ = 1.191$. This maximum occurs between the times of the maxima of $t_n^+ = 0$ for $C_{11}$ and $t_n^+ = 1.69$ for $C_{22}$. These latter times are the optimal values if only $\beta_1$ and only $\beta_2$ were to be estimated.
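The optimal durations quoted for the sine model and for the decaying exponential can be reproduced with a short numerical search. The sketch below is a minimal illustration. It assumes that (8.3.5) has the two-parameter form $\Delta^n = C_{11}C_{22} - C_{12}^2$ with $C_{ij} = (1/t_n)\int_0^{t_n} X_i X_j\,dt$, evaluates the integrals by Simpson's rule, and scans a grid of durations. For the exponential model the factor $\beta_1/\beta_2$ is set to 1, since a constant factor on $X_2$ rescales $\Delta^n$ without moving its maximum. All function names, grid ranges, and step sizes are illustrative.

```python
import math


def delta_n(x1, x2, t_n, panels=200):
    """Delta^n = C11*C22 - C12^2 with C_ij = (1/t_n) * integral of X_i*X_j over [0, t_n].

    The integrals use composite Simpson's rule; `panels` must be even.
    """
    h = t_n / panels
    s11 = s22 = s12 = 0.0
    for k in range(panels + 1):
        w = 1 if k in (0, panels) else (4 if k % 2 else 2)
        a, b = x1(k * h), x2(k * h)
        s11 += w * a * a
        s22 += w * b * b
        s12 += w * a * b
    c11, c22, c12 = (s * h / 3.0 / t_n for s in (s11, s22, s12))
    return c11 * c22 - c12 * c12


def best_duration(x1, x2, t_lo, t_hi, step=0.01):
    """Return the grid value of t_n in [t_lo, t_hi] that maximizes Delta^n."""
    count = int(round((t_hi - t_lo) / step))
    grid = [t_lo + i * step for i in range(count + 1)]
    return max(grid, key=lambda t: delta_n(x1, x2, t))


def sine_duration():
    # Model (8.4.1): X1 = 1, X2 = sin t.
    return best_duration(lambda t: 1.0, math.sin, 0.5, 12.0)


def exponential_duration():
    # Model (8.4.3) in terms of t+ = beta2*t, with beta1/beta2 taken as 1.
    return best_duration(lambda t: math.exp(-t), lambda t: -t * math.exp(-t), 0.1, 5.0)


if __name__ == "__main__":
    print(sine_duration(), exponential_duration())
```

With these assumptions the two searches should return durations close to the values $t_n = 5.5$ and $t_n^+ = 1.191$ given above. The same `best_duration` helper can be pointed at other sensitivity pairs.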
A model similar to (8.4.3) is

$$\eta = \beta_1[1 - \exp(-\beta_2 t)] \tag{8.4.5}$$

This could represent the same physical cases mentioned above except now $T_0$ is assumed known. Both models are illustrated in Figs. 8.10 and 8.11. Sensitivities for (8.4.5) are

$$X_1 = 1 - \exp(-t^+), \qquad X_2 = \left(\frac{\beta_1}{\beta_2}\right) t^+ \exp(-t^+) \tag{8.4.6}$$

where $t^+ = \beta_2 t$. The integrals $C_{ij}$ and $\Delta^n$ are depicted in Fig. 8.11. The maximum of $\Delta^n$ is at $t_n^+ = 7.185$, which is between the values of $t_n^+ = \infty$ and 1.69, the maxima for $C_{11}$ and $C_{22}$. It is significant to note that at time $t_n^+ = 1.191$, the optimal value for model (8.4.3), the $\Delta^n$ value shown in Fig. 8.11 is still very small. Hence an experiment design that is optimal for (8.4.3) is very poor for the similar exponential model, (8.4.5).

Example 8.4.1

Consider again the cooling billet example studied in Example 8.2.4 and other sections. The model can be in the following forms:

$$T - T_\infty = (T_0 - T_\infty)\exp(-\beta t) \tag{a}$$

$$T - T_0 = (T_\infty - T_0)[1 - \exp(-\beta t)] \tag{b}$$

Consider two cases. For the first case, (a), assume temperature $T_\infty$ is accurately known and $(T_0 - T_\infty)$ and $\beta$ are the parameters. This describes the billet problem because $T_\infty$ is accurately known. The second case corresponds to (b), for which the initial temperature $T_0$ is considered to be known and $(T_\infty - T_0)$ and $\beta$ are now $\beta_1$ and $\beta_2$, respectively. The optimum durations for both cases for a large number of equally spaced measurements from 0 to $t_n$ are to be found. No constraints on $T$ or $t$ are to be used. Assume that the measurements satisfy the standard conditions denoted 11111-11. An estimate of $\beta$ in (a) and (b) is 2.7/hr.

Solution

For (a) the dependent variable can be considered to be $T - T_\infty$; this model is similar to (8.4.3) and the optimal duration is $t_n = 1.191/\beta \approx 1.191/2.7 = 0.44$ hr. See Fig. 8.10. For (b) the dependent variable is $T - T_0$, which is analogous to $\eta$ of (8.4.5); from Fig.
8.11 the optimal duration is $t_n = 7.185/\beta \approx 2.66$ hr. The duration of the optimal experiment is relatively long when $T_\infty$ is unknown; in fact, at $t_n^+ = 7.185$, $T = 199.92$ compared to the value of 200 which is approached as $t \to \infty$.

8.5 OPTIMAL PARAMETER ESTIMATION INVOLVING THE PARTIAL DIFFERENTIAL EQUATION OF HEAT CONDUCTION

To illustrate design of optimal experiments in more complex cases, studied next are cases involving the partial differential equation of heat conduction. Considerations not encountered in the algebraic models given above enter when the model involves this equation. For example, space as well as time dependence is met. Thus in addition to finding optimal duration of experiments, optimal locations of sensors are needed. Furthermore, the response at any location is affected by the prescribed time variation of boundary conditions. Another significant aspect of estimation involving partial differential equations is that the parameters can be present in the equation and/or in the boundary conditions.

The criteria derived in Appendix 8A apply to estimation involving ordinary and partial differential equations. For simplicity, the cases considered in this section were selected because they have solutions in terms of known functions; similar methods of analysis can be used, however, even if the equations must be solved numerically, as commonly occurs for nonlinear differential equations.

The criterion utilized is that of maximizing $\Delta = |\mathbf{X}^T\mathbf{X}|$ subject to appropriate constraints. This is the condition to employ when the standard conditions denoted 11111-11 apply. When many transient measurements are obtained using a single sensor, the standard assumption of independent measurement errors may not be valid. If the correlation parameters are not known, however, it is still reasonable to choose the maximization of $|\mathbf{X}^T\mathbf{X}|$ as the criterion.
The transient heat conduction equation for heat flow in a plane wall with constant thermal conductivity $k$ and density-specific heat product $c$ can be written as

$$k\,\frac{\partial^2 T}{\partial x^2} = c\,\frac{\partial T}{\partial t} \qquad \text{or} \qquad \alpha\,\frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t} \tag{8.5.1}$$

where $\alpha = k/c$ is called the thermal diffusivity. The differential equation can be written in terms of the single parameter $\alpha$, but sometimes there are boundary conditions which involve $k$. In the following analyses when only $\alpha$ appears, it is the parameter, but when $k$ appears in boundary conditions, $k$ and $c$ are the parameters. The parameters $k$ and $c$ are chosen because of their physical significance, although others can be used as indicated in Section 7.10.

For the standard assumptions 11111-11, a fixed large number of equally spaced observations and a constraint on the maximum range of $\eta$ [which is the increase in $T$ of (8.5.1)], the criterion for one parameter is to maximize $\Delta^+$ given by (8.2.13). If the same conditions are valid for two parameters, $\Delta^+$ is given by (8.3.7,8) where the $i$ and $j$ subscripts could be 1 and 2, with the subscript 1 referring to $k$ and the subscript 2 to $c$.

Several examples are given in this section. First considered are semi-infinite bodies, for which the body starts at $x = 0$ and continues indefinitely in the plus $x$ direction. Although such bodies do not exist in nature, many heat-conducting bodies can be so modeled, at least for some period of time. Also considered are finite bodies. Temperature measurements in a finite plate heated on one side and insulated on the other are tabulated in Table 7.14 and illustrated in Fig. 7.17. These measurements also illustrate a semi-infinite body; until time 6 sec, the temperatures in the plate are the same as those that would be measured if the plate were thicker.

8.5.1 Semi-Infinite Body Examples

Temperature Boundary Condition (Single Parameter)

Suppose that the temperature in a semi-infinite body is initially uniform at the temperature $T_0$.
Let the temperature at $x = 0$ have a step increase to $T_\infty$. The temperature in dimensionless form can be given as [9]

$$T^+ \equiv \frac{T - T_0}{T_\infty - T_0} = \operatorname{erfc}\left[(4t^+)^{-1/2}\right]; \qquad t^+ \equiv \frac{\alpha t}{x^2} \tag{8.5.2}$$

where $\operatorname{erfc}(z)$ is called the complementary error function and is the integral

$$\operatorname{erfc}(z) = \frac{2}{\pi^{1/2}}\int_z^\infty \exp(-u^2)\,du \tag{8.5.3}$$

Note that although $T$ is a function of $x$ and $t$, the dimensionless temperature can be plotted in terms of the single dimensionless variable $t^+$.

For temperature boundary conditions involving the heat conduction equation (8.5.1), the only parameter that enters is $\alpha$ (if temperatures $T_0$ and $T_\infty$ are not parameters). Note that $T$ is a nonlinear function of $\alpha$. Thermal diffusivity ($\alpha$) is also called a "property" and has been estimated for many materials by many different experimentalists, some of whom have used (8.5.2) as their model.

The solution given by (8.5.2) has a natural constraint on the range of temperature $T$ because $T$ must be between $T_0$ and $T_\infty$. Even though at some interior location $x$ and at some time $t$ the temperature may be much less than $T_\infty$, the temperature near $x = 0$ approaches $T_\infty$. Instead of requiring the temperature at $x$ to reach the same maximum value at the end of the experiment, we apply the constraint at the heated surface ($x = 0$) where the temperature rise is the greatest. Hence the "nominal" rise in $T$ is taken to be $T_\infty - T_0$. The dimensionless $\alpha$ sensitivity is

$$X_\alpha^+ = \frac{\alpha}{T_\infty - T_0}\,\frac{\partial T}{\partial \alpha} = (4\pi t^+)^{-1/2}\exp\left(\frac{-1}{4t^+}\right) \tag{8.5.4}$$

The $\Delta^+$ function for a large number of uniformly spaced measurements starting at $t = 0$ and for the maximum $T$ in the body being $T_\infty$ is

$$\Delta^+ = (t_n^+)^{-1}\int_0^{t_n^+} (X_\alpha^+)^2\,dt^+ \tag{8.5.5}$$

Note that $\Delta^+$ is a function of $t_n^+$, the maximum time in $\Delta^+$. Plotted in Fig. 8.12 are $T^+$ and $X_\alpha^+$ versus $t^+$ and $\Delta^+$ versus $t_n^+$.

For a given location $x$ for measurement of temperatures, the sensitivity $X_\alpha^+$ has a maximum at $t^+ = \alpha t/x^2 = 0.5$, at which time $T^+ = 0.3173$. Hence if only one measurement is to be taken from those produced by one sensor, it should be selected at a time corresponding to $t^+ = 0.5$; if instead the time of measurement is fixed but any one location is to be selected, then the optimum $x$ is $(2\alpha t)^{1/2}$. If many equally spaced-in-time measurements are used, the optimal duration for using data is when $\Delta^+$ is maximized; it is time $t_n^+ = 1.2$ (when $T^+ = 0.5$). If a good estimate of $\alpha$ is not initially available, the optimal times can be estimated using the corresponding $T^+$ values indicated.

Constant Heat Flux Boundary Condition (Two Parameters)

If a flat electric heater is affixed to the surface of a large body and a constant current is passed through the heater, the surface heat flux into the body is constant. The surface temperature will respond in a similar manner from 3.3 to 12 sec as that shown in Fig. 7.17. If the body is semi-infinite with an initial temperature $T_0$ and is subjected to the constant heat flux $q$, the temperature response can be written as

$$T - T_0 = 2\left(\frac{qx}{k}\right)(t^+)^{1/2}\operatorname{ierfc}\left[(4t^+)^{-1/2}\right] \tag{8.5.6}$$

$$\operatorname{ierfc}(z) = \pi^{-1/2}\exp(-z^2) - z\operatorname{erfc}(z) \tag{8.5.7}$$

where $t^+$ is again $\alpha t/x^2$. In this case $T$ is a nonlinear function of the two parameters, $k$ and $c$ (since $\alpha = k/c$). Another combination of parameters is $\alpha$ and $k$; in this case, $T$ is nonlinear in $\alpha$ but linear in $k^{-1}$. [See (7.10.11).]
Dimensionless sensitivities for the parameters $k$ and $c$ are

$$X_1^+ = \frac{k}{qx/k}\,\frac{\partial T}{\partial k} = \operatorname{erfc}\left[(4t^+)^{-1/2}\right] - \left(\frac{t^+}{\pi}\right)^{1/2}\exp\left(\frac{-1}{4t^+}\right) \tag{8.5.8}$$

$$X_2^+ = \frac{c}{qx/k}\,\frac{\partial T}{\partial c} = -\left(\frac{t^+}{\pi}\right)^{1/2}\exp\left(\frac{-1}{4t^+}\right) \tag{8.5.9}$$

[Verify that the relation given by (7.10.9) is satisfied by (8.5.6), (8.5.8), and (8.5.9).] These two sensitivities are depicted in Fig. 8.13; $X_1^+$ starts positive and goes negative whereas $X_2^+$ is always negative and larger in magnitude. At the time that $X_1^+$ goes to zero, the temperature $T$ is insensitive to (i.e., unchanged by) small changes in $k$. One significance of $X_2^+$ being larger in magnitude than $X_1^+$ is that, if only $k$ or $c$ were to be estimated, there would be on the average less relative uncertainty in $c$ than $k$.

Figure 8.13 Sensitivities $X_1^+$ and $X_2^+$ for $k$ and $c$ for semi-infinite body with $q$ = constant.

It is also instructive to evaluate $T$ and the sensitivities at the surface ($x = 0$); we get

$$T(0,t) - T_0 = 2q\left(\frac{t}{kc\pi}\right)^{1/2} \tag{8.5.10}$$

$$k\,\frac{\partial T(0,t)}{\partial k} = -q\left(\frac{t}{kc\pi}\right)^{1/2} = c\,\frac{\partial T(0,t)}{\partial c} \tag{8.5.11}$$

Since the two sensitivities at $x = 0$ are proportional, $\Delta$ is equal to zero and measurements at $x = 0$ alone, no matter how accurate, cannot permit the independent estimation of both parameters.

Because the sensitivities for $x > 0$ are not proportional, as shown in Fig. 8.13, any interior location can be used to provide data for estimating $k$ and $c$. Not all locations or durations of the experiments are equally effective, however. In order to find a meaningful optimal experiment, a constraint for the temperature rise is needed because, as shown by (8.5.10), $T$ goes to infinity as $t$ increases without limit. From physical considerations only a finite maximum temperature is possible (materials melt or vaporize). The constraint of the same maximum temperature rise can be introduced using (8.3.6,7). Let $\eta_{\mathrm{nom}}$ be equal to $qx/k$; $q$ is analogous to the adjustable constant $C$ in Section 8.2. The quantity $\eta_{\max}$
in (8.3.6b) is the maximum rise of $T_{\max} - T_0$; thus $\eta_m^+$, also in (8.3.6b), is $(T_{\max} - T_0)/(qx/k)$, which from (8.5.10) is

$$\eta_m^+ = \frac{T_{\max} - T_0}{qx/k} = \frac{T(0,t_n) - T_0}{qx/k} = \frac{2}{\pi^{1/2}}\left(\frac{k t_n}{c x^2}\right)^{1/2} = 2\left(\frac{t_n^+}{\pi}\right)^{1/2} \tag{8.5.12}$$

The maximum temperature, which occurs at $x = 0$ and at time $t_n$, is made to be the same in each case by appropriately adjusting $q$. (The $x$ given explicitly in (8.5.12) refers to the location $x > 0$ of a sensor.)

A plot of $\Delta^+$ defined by (8.3.7) versus $t_n^+$ for one interior measurement yields a maximum $\Delta^+$ value of 0.000167 at $t_n^+ = \alpha t_n/x^2 = 8.5$. Again this result can be interpreted in two ways. First, for a given location of the temperature sensor, say, at $x = 0.02$ m in an iron block ($\alpha = 2 \times 10^{-5}$ m²/sec), the optimal duration is $t_n = 8.5x^2/\alpha = 170$ sec. Second, for the same example, if the optimal duration were desired to be 170 sec, then the sensor should be located 0.02 m from the heated surface.

It is instructive to study the case of two sensors, each producing equally spaced, independent measurements starting at $t = 0$. If two thermocouples are located at the same $x$, the use of $C_{ij}^+$ [defined by (8.3.8)] in (8.3.7) would give the same optimal value of $\Delta^+$. If a search is made for the optimal two locations, they are found to be at $x = 0$ and at any $x > 0$ so that $t_n^+ = \alpha t_n/x^2 = 1.5$; the associated $\Delta^+$ value is 0.00263, which is almost 16 times the maximal value mentioned above for one sensor. Hence a design involving two sensors positioned as indicated would result in much greater accuracy in the estimates of $k$ and $c$ than if only a single sensor were used or if two were used at the same $x$.

Heat Flux Boundary Condition to Cause a Step Change in Surface Temperature

Temperatures inside the semi-infinite body change most for a given temperature range when the heated surface takes a maximal step increase.
Both $k$ and $c$ can be estimated if this change in temperature is caused by a prescribed heat flux. (If the surface temperature is the specified boundary condition, only $\alpha$ can be estimated. See Section 8.5.1.1.) When the temperatures change most, the sensitivity coefficients would also be expected to be greatest in magnitude. [See (7.10.9) for a relation between $T$, $\partial T/\partial k$, and $\partial T/\partial c$.] We would anticipate for this reason that this case may have the optimal heat flux boundary condition.

A surface heat flux having the time dependence

$$q = a(t\pi)^{-1/2} \tag{8.5.13}$$

produces a step rise in surface temperature of $T_\infty - T_0$. The constant $a$ is related to $T_\infty - T_0$ by $a = (kc)^{1/2}(T_\infty - T_0)$. The temperature distribution [9] and the $k$ and $c$ sensitivities are

$$T^+(t^+) \equiv \frac{T(x,t) - T_0}{T_\infty - T_0} = \operatorname{erfc}\left[(4t^+)^{-1/2}\right] \equiv A, \qquad t^+ = \frac{\alpha t}{x^2} \tag{8.5.14}$$

$$X_1^+ = \frac{k}{T_\infty - T_0}\,\frac{\partial T}{\partial k} = -\frac{1}{2}(A - B) \tag{8.5.15}$$

$$X_2^+ = \frac{c}{T_\infty - T_0}\,\frac{\partial T}{\partial c} = -\frac{1}{2}(A + B) \tag{8.5.16}$$

where, consistent with (8.5.14) at fixed $a$, $B \equiv (\pi t^+)^{-1/2}\exp[-1/(4t^+)]$. In Fig. 8.3a, $X_1^+$ is the $X_1$ curve and $C_{11}^+$ is the $\Delta_1^+$ curve in Fig. 8.3b.

Because the above case has a limited range of $T$, a constraint on $T$ is incorporated in the solution. An optimal location for one sensor again cannot be at $x = 0$, as the sensitivities are proportional there. The optimal $t_n^+$ for one sensor occurs at $t_n^+ = 10$, at which time $\Delta^+$ is the maximal value of 0.00232. If two sensors are optimally placed, they are at $x = 0$ and at the $x$ corresponding to $t_n^+ = \alpha t_n/x^2 = 1.25$, where $\Delta^+$ is the much larger value of 0.0113. Again two sensors located as indicated are much more effective than one.

8.5.1.4 Summary of Optimal Designs for Semi-Infinite Bodies Subjected to Heat Flux Boundary Conditions

A summary of results for the heat flux boundary condition is given in Table 8.2. Cases 1 and 4 are for a single sensor at $x = 0$; precise measurements at only that location cannot be used to estimate $k$ and $c$ independently. However, if only $k$ or $c$ is estimated, $x = 0$ is the optimal location.
Also given are cases 2 and 5, which are for a single sensor at $x > 0$. The optimal results are given by cases 3 and 6 for two sensors.

Table 8.2 Summary of Maximum Values of $\Delta^+$ for Semi-Infinite Bodies with Heat Flux Boundary Condition. $\Delta^+$ and the $C_{ij}^+$ Are Normalized to Contain the Same Number of Measurements in Each Case

                                                      Time of max.         Components of maximum $\Delta^+$
        Boundary            Location      Maximum     $\Delta^+$,
  Case  condition           of sensors    $\Delta^+$  $t_n^+ = \alpha t_n/x^2$   $C_{11}^+$   $C_{22}^+$   $C_{12}^+$
   1    q = const.          x = 0         0             -                  0.125     0.125     0.125
   2    q = const.          x > 0         0.000167      8.5                0.0181    0.1119    0.0431
   3    q = const.          x = 0, x      0.00263       1.5                0.0631    0.0981    0.0597
   4    q for T = T_inf     x = 0         0             -                  0.25      0.25      0.25
   5    q for T = T_inf     x > 0         0.002317     10.0                0.0585    0.2325    0.1062
   6    q for T = T_inf     x = 0, x      0.0113        1.25               0.1275    0.2003    0.1192

The covariance matrix of the estimated parameter vector $\mathbf{b}$ having elements $k$ and $c$ is given by $(\mathbf{X}^T\mathbf{X})^{-1}\sigma^2$ provided the standard assumptions of additive, zero mean, constant variance, independent normal errors apply (more specifically, assumptions denoted 11111-11). Then for $n$ being the total number of measurements, the covariance of $\mathbf{b}$ is given by (8.5.17) and (8.5.18). Values of the $C_{ij}^+$'s are given in the last three columns of Table 8.2. We can use them, for example, to give the approximate standard deviation of $k$ as

$$\frac{\text{est. s.e.}(k)}{k} \approx \left[\frac{C_{22}^+}{n\Delta^+}\right]^{1/2}\left[\frac{\sigma}{T_{\max} - T_0}\right] \tag{8.5.19}$$

The second factor in (8.5.19) can be considered to be the relative measurement error in the temperature, and the factor with the square root is an amplification factor for the conductivity. The smaller the amplification, the more precise are the $k$ estimates. For $n = 25$ the amplification factor is 5.2 for case 2 and 0.84 for case 6. This corroborates that larger values of $\Delta^+$ result in experiments that permit estimating parameters with greater accuracy. Another use for expressions such as (8.5.19) is in determining the number $n$ of measurements needed for specified accuracy.

Conclusions that can be drawn from Table 8.2 for estimating $k$ and $c$ are as follows:

1. A single sensor at $x = 0$ is not permitted.
2. When one sensor at $x \ne 0$ is used, the optimal time $t_n^+$ is about 10 for both heat flux boundary conditions.
3. When two sensors are used, one should be at $x = 0$ and the other at $x > 0$. Note that the optimal conditions for one sensor are not repeated.
4. The heat flux condition causing a step change in surface temperature (cases 5, 6) is much superior to the constant flux condition, cases 2 and 3.
5. The optimum of the optimal designs given in Table 8.2 is case 6. Hence, when $k$ and $c$ are estimated in a semi-infinite body, this would be the recommended design. It can be shown [13] that if more than two sensors are to be used, about half should be placed at $x = 0$ and the remainder at $x = (\alpha t_n/1.25)^{1/2}$.
6. The number of optimal conditions can be less than, equal to, or more than the number of parameters. For a given heat flux boundary condition and one sensor, $\Delta^+$ is maximized only with respect to $t_n^+$. Also, for given $q(t)$ but with two sensors, $\Delta^+$ is maximized with respect to two parameters relating to the location of the sensors. Finally, for arbitrary $q(t)$, $\Delta^+$ can be maximized by varying the function $q(t)$, which involves an infinite set of functions, two of which are illustrated in Table 8.2. Of all these possible functions none can yield larger $\Delta^+$ values for semi-infinite bodies than the heat flux function of cases 5 and 6.
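The entries of Table 8.2 and the amplification factors quoted for $n = 25$ can be cross-checked with a few lines of arithmetic. The sketch below is a consistency check under stated assumptions, not the authors' computation. It assumes $\Delta^+ = C_{11}^+C_{22}^+ - (C_{12}^+)^2$ as in (8.3.7) and assumes the amplification factor for $k$ is $[C_{22}^+/(n\Delta^+)]^{1/2}$, which is what results from inverting $\mathbf{X}^T\mathbf{X}$ written in terms of the $C_{ij}^+$. The dictionary and function names are illustrative, and the $C_{ij}^+$ values are the rounded table entries.

```python
import math

# (C11+, C22+, C12+) for the cases of Table 8.2 that have a nonzero Delta+.
TABLE_8_2 = {
    2: (0.0181, 0.1119, 0.0431),
    3: (0.0631, 0.0981, 0.0597),
    5: (0.0585, 0.2325, 0.1062),
    6: (0.1275, 0.2003, 0.1192),
}


def delta_plus(c11, c22, c12):
    """Two-parameter criterion, assumed form of (8.3.7)."""
    return c11 * c22 - c12 ** 2


def amplification_k(case, n):
    """Assumed amplification factor for k: sqrt(C22+ / (n * Delta+))."""
    c11, c22, c12 = TABLE_8_2[case]
    return math.sqrt(c22 / (n * delta_plus(c11, c22, c12)))


def amplification_c(case, n):
    """Companion factor for c under the same assumption: sqrt(C11+ / (n * Delta+))."""
    c11, c22, c12 = TABLE_8_2[case]
    return math.sqrt(c11 / (n * delta_plus(c11, c22, c12)))


if __name__ == "__main__":
    for case in sorted(TABLE_8_2):
        print(case, delta_plus(*TABLE_8_2[case]), amplification_k(case, 25))
```

Because the table entries are rounded, the recomputed $\Delta^+$ values agree with the tabulated maxima only to within about one percent. On the same assumption, `amplification_c` shows a smaller factor for $c$ than for $k$ in each case, in line with the earlier remark that $c$ carries less relative uncertainty.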
8.5.2 Finite Body Examples

Sinusoidal Initial Temperature in a Plate

Consider for the first finite body example the case of a plate which has a sinusoidal initial temperature and zero temperature boundary conditions,

$$T(x,0) = T_m \sin\left(\frac{\pi x}{L}\right), \qquad T(0,t) = 0, \qquad T(L,t) = 0 \tag{8.5.20}$$

The solution of (8.5.1) with these conditions is

$$T(x,t) = T_m \exp(-\pi^2 t^+)\sin\left(\frac{\pi x}{L}\right), \qquad t^+ \equiv \frac{\alpha t}{L^2} \tag{8.5.21}$$

Again for temperature boundary conditions, only the thermal diffusivity $\alpha$ appears, not $k$ and $c$ independently. The dimensionless $\alpha$ sensitivity is

$$X_\alpha^+\left(\frac{x}{L}, t^+\right) = \frac{\alpha}{T_m}\,\frac{\partial T}{\partial \alpha} = -\pi^2 t^+ \exp(-\pi^2 t^+)\sin\left(\frac{\pi x}{L}\right) \tag{8.5.22}$$

This expression has maximal magnitude at $x/L = 0.5$ and $\pi^2 t^+ = 1$ (replace $t^+$ in Fig. 8.4 by $\pi^2 t^+$ to see the $t^+$ dependence). Consequently if only one sensor location is chosen, it should be at $x/L = 0.5$. Further, if only one time is selected, it should be at $t = L^2/\pi^2\alpha$.

Since the range of $T$ is constrained to between 0 and $T_m$, the max $\Delta^+$ criterion is appropriate for $n$ equally spaced measurements starting at $t = 0$ ($n$ is "large"). Using (8.2.13) with $\eta_m^+ = T_m/T_m = 1$, an expression for $\Delta^+$ is given. Necessary conditions for a maximum are

$$\frac{\partial \Delta^+}{\partial x} = 0, \qquad \frac{\partial \Delta^+}{\partial t_n^+} = 0 \tag{8.5.23}$$

Using Fig. 8.4 the optimal duration is $t_n^+ = \alpha t_n/L^2 = 1.6918/\pi^2$; the optimal $x/L$ is 0.5 as for one measurement. Note that though there is only one parameter (namely, $\alpha$), $\Delta^+$ is maximized with respect to two variables.

We can also locate optimal positions for two sensors. In this case of one parameter, $\Delta^+$ is given by $C_{11}^+$ as defined by (8.3.8) with $m = 2$, $\eta_m^+ = 1$, and $i = j = 1$; $\Delta^+$ is maximized by putting both sensors at $x/L = 0.5$.

Constant Heat Flux at x = 0, Insulated at x = L

A case permitting the two parameters $k$ and $c$ to be estimated is a plate exposed to a constant heat flux $q$ on one side and insulated on the other.
Mathematically this problem is described by (8.5.1) and

$$T(x,0) = T_0, \qquad -k\,\frac{\partial T(0,t)}{\partial x} = q, \qquad \frac{\partial T(L,t)}{\partial x} = 0 \tag{8.5.24}$$

The dimensionless temperature [9] is

$$T^+ = t^+ + \frac{1}{3} - x^+ + \frac{1}{2}(x^+)^2 - \frac{2}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2}\,e^{-n^2\pi^2 t^+}\cos(n\pi x^+) \tag{8.5.25}$$

where $T^+ = (T - T_0)/(qL/k)$, $x^+ = x/L$, and $t^+ = \alpha t/L^2$. In Figures 8.14, 8.15, and 8.16 the dimensionless temperature and the $k$ and $c$ sensitivities are plotted versus $t^+$ for various positions in the plate. ($X_1^+$ and $X_2^+$ are determined in Problem 8.11.) After an initial period, $T^+$ and $-X_2^+$ increase linearly with time whereas $X_1^+$ approaches various constant values including zero. Since $X_1^+$ goes to zero near $x/L = 0.5$, this is a poor location for a temperature sensor in this case.

Figure 8.14 Dimensionless temperatures in a finite body with $q = C$ at $x = 0$ and $q = 0$ at $x = L$.

Figure 8.15 Dimensionless sensitivity $X_1^+$ for $q = C$ at $x = 0$ and $q = 0$ at $x = L$.

Figure 8.16 Dimensionless sensitivity $X_2^+$ for $q = C$ at $x = 0$ and $q = 0$ at $x = L$.

Suppose that both $k$ and $c$ are to be estimated using many equally spaced measurements. Assume that the standard assumptions denoted 11111-11 are valid. Since $T$ increases without limit as $t \to \infty$, a constraint is needed. The $\Delta^+$ criterion given by (8.3.7) can be used with $C_{ij}^+$ defined by (8.3.8) to include this constraint. The term $\eta_m^+$ is $(T_m - T_0)/(qL/k)$, which is given by (8.5.25) evaluated at $x^+ = 0$ and $t_n^+$; notice that $\eta_m^+$ is a function of only $t_n^+$. By using $\eta_m^+$ in this manner, the maximum temperature, $T_m$, is made to be the same value for each duration $t_n$.

Consider first the case of a single sensor. The optimal location is at $x = 0$ and the optimal duration for taking uniformly spaced measurements is $t_n^+ = 1.2$. See case 1 of Table 8.3.
This location is suggested from an inspection of Figs. 8.15 and 8.16 because the magnitudes of the $k$ and $c$ sensitivities are largest at $x = 0$. Their magnitudes were also largest for the semi-infinite body, but we found that a single sensor at $x = 0$ for the semi-infinite body would not permit both $k$ and $c$ to be estimated. The difference between the two cases is that though the $k$ and $c$ sensitivities are proportional [see (8.5.11)] at the heated surface of the semi-infinite body, they are not proportional at $x = 0$ for the finite body (since $X_k^+$ approaches a constant and $-X_c^+$ increases with time). It does happen that the $k$ and $c$ sensitivities at $x = 0$ for the finite body are nearly proportional until time $t^+ = 0.3$; clearly $\Delta^+$ must have a maximum at a larger time than that.

Table 8.3 Summary of Maximum Values of $\Delta^+$ for Finite Bodies Insulated on One Side

Case   Boundary condition at x = 0   Location of temperature sensors   Maximum $\Delta^+$   Time of maximum $\Delta^+$, $t_n^+ = \alpha t_n/L^2$
1      q = constant                  x = 0                             0.00098              1.2
2      q = constant                  x = L                             0.00019              1.3
3      q = constant                  x = 0 and L                       0.00588              0.65
4      q for T = T_m                 x = 0                             0.0291               1.8
5      q for T = T_m                 x = 0 and L                       0.0358               0.76

Two additional optimal cases for $T$ given by (8.5.25) are listed in Table 8.3. Case 2 is for a single sensor at $x = L$. Case 3 is for two sensors optimally located; of all possible pairs of locations the best are $x = 0$ and $x = L$. If more than two sensors are used, the optimal design is approximated by having $m/2$ sensors at $x = 0$ and $m/2$ at $x = L$. See Problem 8.13. Recall from the way $\Delta^+$ is defined that having a multiple number of sensors at the same location does not change the $\Delta^+$ values. Notice that $\Delta^+$ of case 1 is about one-sixth of $\Delta^+$ for case 3. Hence the use of one sensor at $x = 0$ and another at $x = L$ is much more effective for accurately estimating $k$ and $c$ than placing both at $x = 0$.
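The behavior of the temperature and sensitivities described above can be checked numerically. The following is a minimal sketch, not the book's own program: it evaluates the series (8.5.25) directly and forms the sensitivities under the assumption that $T = T_0 + (qL/k)\,T^+(x^+, t^+)$ with $\alpha = k/c$, which gives $X_c^+ = -t^+\,\partial T^+/\partial t^+$ and $X_k^+ = -T^+ - X_c^+$. The function names and the truncation at `n_terms` are illustrative choices.

```python
import math


def t_plus(x, t, n_terms=2000):
    """Dimensionless temperature T+ of (8.5.25) at x+ = x and t+ = t."""
    series = sum(
        math.exp(-((n * math.pi) ** 2) * t) * math.cos(n * math.pi * x) / n ** 2
        for n in range(1, n_terms + 1)
    )
    return t + 1.0 / 3.0 - x + 0.5 * x * x - 2.0 / math.pi ** 2 * series


def dt_plus_dt(x, t, n_terms=2000):
    """Time derivative of T+ (term-by-term derivative of the series), t > 0."""
    series = sum(
        math.exp(-((n * math.pi) ** 2) * t) * math.cos(n * math.pi * x)
        for n in range(1, n_terms + 1)
    )
    return 1.0 + 2.0 * series


def x_c_plus(x, t):
    """Assumed dimensionless c sensitivity: -t+ * dT+/dt+."""
    return -t * dt_plus_dt(x, t)


def x_k_plus(x, t):
    """Assumed dimensionless k sensitivity: -T+ + t+ * dT+/dt+."""
    return -t_plus(x, t) - x_c_plus(x, t)


if __name__ == "__main__":
    for t in (0.1, 0.3, 1.0, 2.0):
        print(t, t_plus(0.0, t), x_k_plus(0.0, t), x_c_plus(0.0, t))
```

Under these assumptions the late-time limit of $X_k^+$ is $-(1/3 - x^+ + (x^+)^2/2)$, which vanishes at $x^+ = 1 - 1/\sqrt{3} \approx 0.42$, in line with the remark that $X_k^+$ goes to zero near $x/L = 0.5$.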
In addition to optimal experiment durations and optimal sensor locations, optimal boundary conditions could be sought. The optimal heat flux boundary condition at $x = 0$ is a heat flux history which causes the surface temperature to take a step increase to the maximum temperature. Cases 4 and 5 in Table 8.3 are for this boundary condition. Notice that $\Delta^+$ of case 5, which is for measurements at $x = 0$ and $L$, is the largest of all those listed in Table 8.3. A still larger value is found if an optimal boundary condition at $x = L$ is used (10).

In Tables 8.2 and 8.3 a number of optimal experiments are given. If we have the freedom to choose (1) the location and number of the temperature sensors, (2) the time variation of the heat flux, and (3) the geometry, an optimal experiment of those listed can be selected. In each case the decision is simply based on the size of $\Delta^+$, with the largest values being best. Notice for comparable heating conditions and locations of sensors that the plate insulated at $x = L$ is always better. One could continue this search by modifying the insulation boundary condition and by investigating other geometries such as cylinders and spheres.

8.5.3 Additional Cases

Applications of the optimal criteria for various ordinary and partial differential equations are unlimited. The purpose of this subsection is to provide more references.

Some analyses of optimal experiments involving ordinary differential equations are given by Heineken et al. (11) and Seinfeld and Lapidus (12, p. 432). These references relate to optimal design for chemical rate constants. An ordinary differential equation in connection with the optimal design for heat transfer coefficients is studied by Van Fossen (13).

Further cases involving optimal estimation of parameters in the heat conduction equation or associated boundary conditions are given in references 14-20.
Most of these cases involve consideration of linear partial differential equations. The dependent variable is usually a nonlinear function of the parameters even though the differential equation model is linear; nonlinear differential equations introduce further complications in the design of experiments. Two papers studying nonlinear differential equation models are (21), which considers the case of temperature-variable $k$, and (22), which contains a study of optimal experiments for freezing-melting problems. One difficulty is that the sensitivities must be obtained numerically (see Section 7.10); the integrals in the $C_{ij}$'s must then be evaluated using the trapezoidal or Simpson's rule. This is not a bothersome difficulty. One more complexity is including the constraint of maximal range of $\eta$ when $\eta$ is obtained from a nonlinear equation. In that case $T_m^+$ in the criterion is not a simple function of $t_n$.

8.5.4 Optimal Heat Conduction Experiment

As noted above there are many possible optimal experiments differing in geometry, number of sensors, boundary conditions, and so on. We naturally wish to design "best" experiments, but practical aspects frequently mean that the optimum of all the optimal experiments cannot be chosen. Section 7.9.4 describes an experiment that is optimal in many respects for estimating $k$ and $c$; this section is devoted to a description of the design of that experiment.

From a comparison of optimal results in Tables 8.2 and 8.3 the finite plate heated on one side and insulated on the other is found to be better than the semi-infinite geometry. It is also experimentally practical. The locations for two or more thermocouples are at the heated ($x = 0$) and insulated ($x = L$) surfaces. An equal number should be placed at each surface. Because eight were available, four were at $x = 0$ and four at $x = L$.
In order to ensure no direct heat losses from the heater, the heater was placed between two identical specimens, both of which had two sensors at $x = 0$ and two at $x = L$.

This placement of multiple sensors at the same location is contrary to intuition; one feels that a better design would be to place each sensor at a different position relative to the heated surface. If the heat conduction model used is correct, then the optimal locations are at $x = 0$ and $x = L$. Placing them in this manner maximizes $\Delta^+$, which minimizes the variances of $k$ and $c$. Furthermore the assumptions of constant variance and independent errors can be checked more readily than if measurements are not replicated.

The insulation boundary condition at $x = L$ can only be approximated since there are no perfect thermal insulators. The validity of this assumption can be investigated by noting if there is a characteristic "signature" in the residuals.

With an electric heater a step increase in heat flux (i.e., constant flux) of finite duration is easily introduced. The heat flux to cause a step change in temperature at $x = 0$ (which is the optimal experiment in Table 8.3) is not as readily applied. For that reason a constant heat flux for a finite duration was used.

Figure 8.17 shows the $\Delta^+$ criterion for this geometry for an equal number of sensors at $x = 0$ and $L$. The heat flux is constant between times 0 and $t_q$. The constraints of a fixed large number of measurements and same maximum temperature rise are used. It is found that a shorter duration of heating than the interval over which data are used results in increased values of $\Delta^+$. This means that there are two optimal times in this experiment: the duration of heating ($t_q^+ = 0.5$) and the maximum time at which data are used ($t_n^+ \approx 0.75$). The experiment was designed to be near these conditions.
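For a heat flux that is constant for $0 < t < t_q$ and zero afterward, the temperature history can be sketched by superposition, since the conduction problem is linear. The snippet below is an illustrative sketch under that assumption, not the procedure used to produce Fig. 8.17: it subtracts a delayed copy of the constant-flux solution (8.5.25) once the heater is switched off. The name `t_plus_pulse` and the series truncation are hypothetical choices.

```python
import math


def t_plus(x, t, n_terms=2000):
    """Constant-flux dimensionless temperature T+ of (8.5.25)."""
    series = sum(
        math.exp(-((n * math.pi) ** 2) * t) * math.cos(n * math.pi * x) / n ** 2
        for n in range(1, n_terms + 1)
    )
    return t + 1.0 / 3.0 - x + 0.5 * x * x - 2.0 / math.pi ** 2 * series


def t_plus_pulse(x, t, t_q):
    """T+ for a flux applied during 0 < t+ < t_q only (superposition)."""
    if t <= t_q:
        return t_plus(x, t)
    # After shut-off, remove the effect of a flux step that starts at t_q.
    return t_plus(x, t) - t_plus(x, t - t_q)


if __name__ == "__main__":
    t_q = 0.5  # dimensionless heating duration quoted in the text
    for t in (0.25, 0.5, 0.75, 1.5, 3.0):
        print(t, t_plus_pulse(0.0, t, t_q), t_plus_pulse(1.0, t, t_q))
```

Long after the heater is turned off, both surfaces settle to $T^+ = t_q^+$, the mean temperature rise for the energy added, so the heated-surface temperature peaks at $t^+ = t_q^+$ and the maximum-temperature constraint is set by that instant.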
Figure 8.17 The $\Delta^+$ criterion (plotted versus $t_n^+$) for a finite plate insulated at x = L and heated at x = 0 with a constant heat flux during times $0 < t < t_q$, after which the flux is zero. There are an equal number of temperature sensors at x = 0 as at x = L.

After the experiment is performed and parameters estimated, one should check the validity of the assumptions. Residuals for an actual experiment are shown in Figs. 7.20 and 7.21. Most of the residuals tend to decrease with time for the last third of the experiment. This suggests heat losses at $x = L$ and thus an imperfect model. Moreover, the residuals are highly correlated rather than being uncorrelated. In careful work both conditions would be further considered. It is anticipated, however, that the experiment design would not be greatly altered as the result of such investigation. See the next section for a brief discussion of the treatment of correlated errors.

8.6 NONSTANDARD ASSUMPTIONS

In this section the basic criterion is modified for cases when two standard assumptions are no longer valid. The cases of nonconstant variance measurement errors and correlated errors are considered.

8.6.1 Nonconstant Variance

For all the standard assumptions being valid except that the error variance is not constant (i.e., $E(\varepsilon_i^2) = \sigma_i^2$), the error covariance matrix is given by $\psi = \mathrm{diag}(\sigma_1^2 \cdots \sigma_n^2)$. For maximum likelihood estimation the criterion to maximize is (8.3.2). All the equations given above which include various constraints still may be used for nonconstant variance by simply replacing $X_{ij}$ by $X_{ij}\sigma_i^{-1}$.

8.6.2 Correlated Errors

A particular type of correlated errors is the first-order autoregressive error which is described by

$$\varepsilon_i = \rho_i \varepsilon_{i-1} + u_i, \qquad i = 1, 2, \ldots, n \tag{8.6.1}$$

where the $u_i$ are normal and independent with zero mean and variance $\sigma^2$. When maximum likelihood estimation is used, this case can also build on the previous results by replacing $X_{ij}$ by

$$Z_{ij} = X_{ij} - \rho_i X_{i-1,j}, \qquad i = 1, \ldots, n; \quad j = 1, \ldots, p \tag{8.6.2}$$

where $X_{0j}$ is defined to be zero for all permissible $j$ values. For many equally spaced measurements in time, $Z_{ij}$ can be approximated by

$$Z_{ij} \approx X_{ij}(1 - \rho_i) + \rho_i\,\Delta t\,\frac{\partial X_{ij}}{\partial t} \tag{8.6.3}$$

which indicates that as $\rho_i$ approaches unity (perfect correlation) the time derivatives of the sensitivity coefficients become important.

8.7 SEQUENTIAL OPTIMIZATION

Suppose that a set of experiments has been performed and the associated parameters and parameter covariance matrix have been estimated. These experiments need not have been optimally designed, but the next experiment (or set of measurements) is to be optimally designed. Suppose also that (subjective) MAP estimation is being used and that the standard assumptions denoted 11--1113 are valid. The criterion to maximize in this case is (see Appendix 8A)

$$\Delta = \left|X^T\psi^{-1}X + V_\beta^{-1}\right| \tag{8.7.1}$$

where $X^T\psi^{-1}X$ is for the proposed experiment and $V_\beta$ is the covariance matrix of the estimated parameter values based on data of the previous experiments and prior information. The dimensions of $X^T\psi^{-1}X$ and $V_\beta$ must be the same, that is, involve the same number of parameters.

To illustrate the criterion given by (8.7.1) assume that one previous experiment has been performed and that negligible prior information is available so that $V_\beta^{-1}$ is

$$V_\beta^{-1} = \left[X^T\psi^{-1}X\right]_1 \tag{8.7.2}$$

Then the second experiment would be designed so that

$$\Delta = \left|\left[X^T\psi^{-1}X\right]_1 + \left[X^T\psi^{-1}X\right]_2\right| \tag{8.7.3}$$

is maximized by varying the experimental conditions, such as the experiment duration. Only the terms $[X^T\psi^{-1}X]_2$ would be changed. In some cases the second experiment might be similar to the first one while in other cases it would be quite different.

The criterion given by (8.7.1) can also be expressed in a different form. By multiplying (8.7.1) by $|V_\beta|$ we find

$$\Delta\,|V_\beta| = \left|I + X^T\psi^{-1}XV_\beta\right| = \left|I + \psi^{-1}XV_\beta X^T\right| \tag{8.7.4}$$

Now using the identity $|I + AB| = |I + BA|$, (8.7.4) can be written as

$$\Delta = \frac{\left|\psi + XV_\beta X^T\right|}{|V_\beta|\,|\psi|} \tag{8.7.5}$$

Since $|V_\beta|\,|\psi|$ is a positive constant, maximizing $\Delta$ is equivalent to maximizing

$$\Gamma = \left|\psi + XV_\beta X^T\right| \tag{8.7.6}$$

Hence we have a choice between maximizing $\Delta$ or $\Gamma$. Our choice should depend on the relative dimensions of the two matrices, which are $p \times p$ and $n \times n$, respectively. The determinant of lower dimension would be chosen. A case favorable to using $\Gamma$ is for $p \ge 2$ and for $n = 1$, that is, a single measurement of the dependent variable is made.

8.8 NOT ALL PARAMETERS OF INTEREST

There are parameter estimation problems that require the estimation of parameters in addition to those of primary interest. The extra parameters are sometimes termed nuisance parameters. In Example 8.2.4 the parameter $\beta$ (reciprocal time constant of the billet) might be the one of interest; however, it might also be necessary to estimate simultaneously the fluid temperature $T_\infty$. Another type of problem is when statistical parameters such as the correlation $\rho$ in the autoregressive error model (8.6.1) are found. Though the $\rho$ value may be needed to estimate the confidence region, generally its value is not needed as accurately as those of the "physical" parameters. Further examples are given by Hunter and Hill (23, 24).

Appendix 8B gives a derivation of a criterion when out of a total of $p$ estimated parameters only the first $q$ ($p > q$) are of interest. For the standard assumptions designated 11111-11 the criterion is to maximize

$$\Delta_{12} = \left|X_1^TX_1 - X_1^TX_2\left(X_2^TX_2\right)^{-1}X_2^TX_1\right| = \frac{\Delta_p}{\left|X_2^TX_2\right|} \tag{8.8.1}$$

where $X_1$ is an $n \times q$ matrix and is for the first $q$ parameters and where $X_2$ is an $n \times r$ matrix which is for the remaining $r = p - q$ parameters that are not of primary interest. The symbol $\Delta_p$ means the usual determinant for all the parameters, i.e., $\Delta_p = |X^TX|$ with $X = [X_1 \; X_2]$.

The minimum $q$ and $r$ values are $q = 1$ and $r = 1$. In summation form this simple case results in

$$\Delta_{12} = \sum_{i=1}^{n} X_{i1}^2 - \frac{\left(\sum_{i=1}^{n} X_{i1}X_{i2}\right)^2}{\sum_{i=1}^{n} X_{i2}^2} \tag{8.8.2}$$

Let the condition of a fixed large number of measurements equally spaced in time be valid; by using the notation given by (8.3.5), $\Delta_{12}$ can be approximated by

$$\Delta_{12} = n\Delta_{12}^{n} \approx n\left[C_{11} - C_{12}^2C_{22}^{-1}\right] = n\Delta^{n}C_{22}^{-1} \tag{8.8.3}$$

A comparison of this expression with (8.5.11) shows that $\Delta_{12}$ is proportional to the reciprocal of the variance of $b_1$. Hence maximizing $\Delta_{12}$ has the beneficial effect of minimizing the variance of $b_1$.

As an example of the use of the max $\Delta_{12}$ criterion, consider the exponential model

$$\eta = \beta_1\exp(-\beta_2 t) \tag{8.8.4}$$

which has one linear and one nonlinear parameter. For $\beta_1$ being the parameter of interest, the criterion is to maximize $\Delta_{12}^{n}$, as implied by (8.8.3); here

$$C_{11} = \frac{1}{t_n}\int_0^{t_n} e^{-2\beta_2 t}\,dt = \frac{1}{2t_n^+}\left[1 - \exp(-2t_n^+)\right] \tag{8.8.5}$$

$$C_{12} = -\frac{\beta_1}{t_n}\int_0^{t_n} t\,e^{-2\beta_2 t}\,dt = -\frac{\beta_1}{\beta_2}\,\frac{1}{4t_n^+}\left[1 - (1 + 2t_n^+)\exp(-2t_n^+)\right] \tag{8.8.6}$$

For $C_{22}$, see Problem 8.1. From these expressions we see that $\Delta_{12}^{n}$ can be plotted as a function of $t_n^+ = \beta_2 t_n$ as depicted in Fig. 8.18. For $\beta_1$ being the primary parameter of interest, the optimal value of $t_n^+$ is as small as possible. If instead $\beta_2$ is the parameter of primary interest, the subscripts 1 and 2 of the $C$'s in (8.8.3) are interchanged. Since the resulting $\Delta_{21}^{n}$ is proportional to $(\beta_1/\beta_2)^2$, plotted also in Fig. 8.18 is $\beta_2^2\Delta_{21}^{n}/\beta_1^2$ versus $t_n^+$. The optimal time for this case is $t_n^+ \approx 2.0$. This dimensionless time can be compared with the optimal times for estimating $\beta_1$ alone of zero, $\beta_2$ alone of 1.692, and both $\beta_1$ and $\beta_2$ of 1.191. Hence the optimal durations of the experiment can be quite different for the various objectives of $\beta_1$ only being of interest, and so on.
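The optimal durations quoted for the exponential model can be reproduced with a short numerical search. The sketch below is illustrative only: it assumes the unnormalized sensitivities $X_1 = e^{-\beta_2 t}$ and $X_2 = -\beta_1 t e^{-\beta_2 t}$, forms the time averages $C_{ij}$ in closed form in terms of $\tau = \beta_2 t_n$ (with the factors of $\beta_1/\beta_2$ scaled out, which does not move the maxima), and scans a grid. The function names, grid limits, and step are hypothetical choices.

```python
import math


def c_terms(tau):
    """Return (C11, C12, C22) for eta = b1*exp(-b2*t), scaled by b1/b2 factors.

    tau is the dimensionless duration b2*t_n.
    """
    e = math.exp(-2.0 * tau)
    i0 = 0.5 * (1.0 - e)                                   # integral of exp(-2u)
    i1 = 0.25 - e * (0.5 * tau + 0.25)                     # integral of u*exp(-2u)
    i2 = 0.25 - e * (0.5 * tau ** 2 + 0.5 * tau + 0.25)    # integral of u^2*exp(-2u)
    return i0 / tau, -i1 / tau, i2 / tau


CRITERIA = {
    "beta2 alone": lambda c11, c12, c22: c22,
    "beta1 and beta2": lambda c11, c12, c22: c11 * c22 - c12 ** 2,
    "beta1 primary": lambda c11, c12, c22: c11 - c12 ** 2 / c22,
    "beta2 primary": lambda c11, c12, c22: c22 - c12 ** 2 / c11,
}


def best_duration(criterion, lo=0.05, hi=6.0, step=0.001):
    """Grid search for the dimensionless duration maximizing a criterion."""
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + step * i for i in range(n)]
    return max(grid, key=lambda tau: criterion(*c_terms(tau)))


results = {name: best_duration(f) for name, f in CRITERIA.items()}

if __name__ == "__main__":
    for name, tau in results.items():
        print(f"{name:>16s}: optimal beta2*t_n is about {tau:.3f}")
```

With $\beta_1$ as the primary parameter the search simply returns the lower grid limit, which is the numerical counterpart of "as small as possible". The "beta2 alone" case is the same calculation that gives $\pi^2 t_n^+ = 1.6918$ for the sinusoidal plate of Section 8.5.2.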
Figure 8.18 Criteria for optimal estimation of $\beta_1$ and $\beta_2$ in the model $\eta = \beta_1\exp(-\beta_2 t)$, where each may be of primary interest.

8.9 DESIGN CRITERIA FOR MODEL DISCRIMINATION

Sometimes the physical model is not known but several alternate models can be proposed. In such cases the problem is to select the "best" model, that is, the one that best fits the data. A method of model selection, termed model discrimination, involves experimental designs that maximize differences between predicted responses of two or more models.

A chemical engineering example of a case where discrimination is needed occurs when substance A reacts in the presence of a catalyst to form substance B, which in turn forms C. Two possible models are A → B → C and A → B ⇌ C. The predicted concentrations of substance B versus time for the two models are shown in Fig. 8.19. If the reaction is observed only until time $t_1$, no discrimination can be accomplished because the predicted responses are nearly identical until $t_1$. Measured values of the B concentration are required after time $t_1$ (and preferably near $t_2$) to determine the best model.

Many methods of model discrimination have been proposed. Given first is a method that results in a criterion similar to $\Delta$. Next discussed is a method utilizing information theory. The former method is simpler in application but the latter has a more satisfying basis. In each case the analyses start with consideration of two competing mathematical models.

8.9.1 Linearization Method

In this method the objective is to seek experiments that cause the minimum values of the sum of squares functions to be quite different for two competing models. Suppose two models are available and the best one is to be determined. Let the standard assumptions 1111--11 be valid and OLS estimation be used. (The analysis can be modified for other cases.)
The sum of squares function for model $i$ can be written as

$$S^{(i)} = \left(Y - \eta^{(i)}\right)^T\left(Y - \eta^{(i)}\right) \tag{8.9.1}$$

Let the model equation be $\eta^{(i)} = f^{(i)}(x, \beta_c, \beta^{(i)})$ where $x$ is the independent variable vector, $\beta_c$ is the vector of parameters common to both models (if there are common ones), and $\beta^{(i)}$ is the $q$ vector of parameters distinctive to model $i$. Suppose that a nominal set of parameters is chosen and that $\eta^{(i)}$ is expressed in terms of a Taylor series near this nominal set so that

$$\eta^{(i)} = \eta^{(0)} + X^{(i)}\Delta\beta^{(i)} \tag{8.9.2}$$

where $X^{(i)}$ is the sensitivity matrix for $\beta^{(i)}$. Introducing (8.9.2) into (8.9.1), where the $\Delta\beta^{(i)}$ values are chosen to minimize $S^{(i)}$, yields

$$\min S^{(i)} = \left(Y - \eta^{(0)}\right)^T\left(Y - \eta^{(0)}\right) + 2\left(X^{(i)}\Delta\beta^{(i)}\right)^T\left(\eta^{(0)} - Y\right) + \left(X^{(i)}\Delta\beta^{(i)}\right)^TX^{(i)}\Delta\beta^{(i)} \tag{8.9.3}$$

which implies the $\Delta\beta^{(i)}$ vector of

$$\Delta\beta^{(i)} = \left(X^{T(i)}X^{(i)}\right)^{-1}X^{T(i)}\left(Y - \eta^{(0)}\right) \tag{8.9.4}$$

Let us now subtract $\min S^{(2)}$ from $\min S^{(1)}$ and attempt to find the maximum of the absolute value of the difference, or

$$C = \max\left|\min S^{(1)} - \min S^{(2)}\right| = \max\left|\left(Y - \eta^{(0)}\right)^T\left[X^{(2)}\left(X^{T(2)}X^{(2)}\right)^{-1}X^{T(2)} - X^{(1)}\left(X^{T(1)}X^{(1)}\right)^{-1}X^{T(1)}\right]\left(Y - \eta^{(0)}\right)\right| \tag{8.9.5}$$

Figure 8.19 Discrimination example involving concentration of substance B (plotted versus time) for models A → B → C and A → B ⇌ C.

Although we do not know $Y - \eta^{(0)}$, let us assume temporarily that Model 1 is correct and that the measurement errors are sufficiently small so that

$$Y - \eta^{(0)} \approx X^{(1)}\Delta\beta^{(1)} \tag{8.9.6}$$

Then $C$ given by (8.9.5) becomes

$$C = \max\left|\Delta\beta^{T(1)}M^{(1)}\Delta\beta^{(1)}\right| \tag{8.9.7a}$$

$$M^{(1)} = X^{T(1)}X^{(1)} - X^{T(1)}X^{(2)}\left(X^{T(2)}X^{(2)}\right)^{-1}X^{T(2)}X^{(1)} \tag{8.9.7b}$$

The $q \times q$ matrix $M^{(1)}$ is exactly the same matrix whose determinant is maximized when $X^{(1)}$ is for $q$ parameters of primary interest and $X^{(2)}$ is for $r$ parameters of less interest; see (8.8.1).
If instead of Model 1 being correct, Model 2 which involves $r$ parameters is correct, (8.9.7) becomes

$$C = \max\left|\Delta\beta^{T(2)}M^{(2)}\Delta\beta^{(2)}\right| \tag{8.9.8a}$$

$$M^{(2)} = X^{T(2)}X^{(2)} - X^{T(2)}X^{(1)}\left(X^{T(1)}X^{(1)}\right)^{-1}X^{T(1)}X^{(2)} \tag{8.9.8b}$$

The problem now is to select some criterion that has the effect of maximizing $C$. If $C$ is fixed at some value, (8.9.7a) and (8.9.8a) both describe the surfaces of hyperellipsoids since both are very similar to the confidence region expression given by (6.8.39). The coordinates are the $\Delta\beta$'s. In the case of (8.9.7a), for a given hypervolume $C$ is maximized by maximizing the determinant of $M^{(1)}$. For Model 2 being correct, the analogous criterion is the maximization of the determinant of $M^{(2)}$. But since we do not know which model is correct, we choose a criterion that does not prefer one model over the other. Such a criterion is simply formed from the augmented $X^TX$ matrix. That is, we propose that discrimination can be improved by designing experiments so that

$$\Delta = \begin{vmatrix} X^{T(1)}X^{(1)} & X^{T(1)}X^{(2)} \\ X^{T(2)}X^{(1)} & X^{T(2)}X^{(2)} \end{vmatrix} \tag{8.9.9}$$

is maximized. Note now that the $X$ matrix is composed of sensitivity matrices from two different models and that $X^{(1)}$ has dimensions $n \times q$ and $X^{(2)}$ has dimensions $n \times r$.

An advantage of the $\Delta$ criterion given by (8.9.9) is that it is simple; its use is similar to the $\Delta$ criterion discussed in Sections 8.1-8.7. A further advantage is that no data are needed for the design of experiments using this criterion; one needs only the models and some approximate values of the parameters.

The effect of maximizing $\Delta$ given by (8.9.9) is to emphasize the differences between the models. All models fail at some point and it may be that at these points the greatest discrimination power is present. For example, there are certain heat conduction problems in which changes occurring during heating of a material may be due to a change of phase or a chemical reaction. One of these is reversible and the other is not. This suggests that the critical temperature range be covered using a cooling after a heating process. The behavior of the change of phase and reaction models is quite different during the cooling period. Another example where discrimination might be used is in determining if $h = \beta_1 + \beta_2 t$ or $h = \beta_1 + \beta_2(T - T_\infty)$ is the better model of Section 7.5.2.

Example 8.9.1

Consider the two competing models

$$\eta^{(1)} = \beta_1 + \beta_2\sin t, \qquad \eta^{(2)} = \beta_1 + \beta_2\left(1 - e^{-t}\right)$$

The standard error assumptions are valid. The optimal duration of experiments for a large fixed number of uniformly spaced measurements starting at $t = 0$ is to be found. No constraints on the ranges of the $\eta$'s are to be used.

Solution

Since the constant parameter $\beta_1$ appears in both models, both models are alike to that extent. Hence the emphasis should be upon the $\beta_2$ terms. Using the above notation we have

$$X^{(1)} = \sin t, \qquad X^{(2)} = 1 - e^{-t}$$

The quantity to maximize is $\Delta$ given by (8.9.9). To include the assumption of a fixed large number of uniformly spaced measurements, $\Delta$ should be modified to $\Delta^n$ as indicated by (8.3.5). In this case $C_{11}$ would be the $C_{11}$ of an earlier figure and $C_{22}$ would be the $C_{22}$ of Fig. 8.9. The resulting $\Delta^n$ is nearly zero until time $t_n = 2.5$, at which time $\Delta^n$ rises quickly to the first local maximum of about 0.4 at time 5.5. After this time the $\Delta^n$ criterion gradually oscillates to larger values with the global maximum being 0.5 at $t_n \to \infty$. These results are reasonable because $\sin t$ and $1 - e^{-t}$ are similar until $t = 1.5$ but are quite dissimilar for $t > 3$. According to the max $\Delta^n$ criterion, which assumes many equally spaced measurements starting at $t = 0$, the experiment should be of infinite duration, but for practical purposes it could be any time greater than $t_n \approx 5$ to discriminate between the two models.

8.9.2 Information Theory Method

Suppose that two rival models are available, $\eta^{(i)} = \eta^{(i)}(x, \beta^{(i)})$ where $i = 1$ and 2.
Assume that estimates $b^{(i)}$ for the parameters appearing in the $i$th model are available and that the associated estimated covariance matrix $\mathbf{V}^{(i)}$ is known. Typically these are obtained by fitting each model in turn to data from previously performed experiments. Using the parameter values $b^{(i)}$, the values of the dependent variable $\eta^{(i)}$ can be predicted for any proposed experiment, assuming the $i$th model is correct. This prediction is designated

$$\hat{y}^{(i)} = \eta^{(i)}\left(x, b^{(i)}\right) \tag{8.9.10}$$

The covariance matrix of the prediction error in (8.9.10), assuming that model $i$ is correct, can be shown to be approximately

$$\mathcal{V}^{(i)} = \psi + X^{(i)}\mathbf{V}^{(i)}X^{T(i)} \tag{8.9.11}$$

where $\psi$ is the covariance of the measurement errors of $Y$ and $X^{(i)}$ is the sensitivity matrix for the $i$th model and the experiment being considered. The second term on the right side of (8.9.11) is similar to that given by (6.2.12a) or (6.5.6).

The hypothesis that the $i$th model is correct leads to regarding the outcome of a proposed experiment $x$ as a random variable $\eta$ with probability density function $p^{(i)}(\eta|x)$ having mean and covariance given by (8.9.10) and (8.9.11), respectively. If Model 1 is correct, then $\eta$ is distributed as $p^{(1)}(\eta|x)$; if Model 2 is correct, $\eta$ is distributed as $p^{(2)}(\eta|x)$.

Kullback (25) has suggested that the quantity $\ln[p^{(1)}(\eta|x)/p^{(2)}(\eta|x)]$ is a measure of the favorability of hypothesis 1 over hypothesis 2. The expected information in favor of Model (or hypothesis) 1 is

$$\int_{-\infty}^{\infty} p^{(1)}(\eta|x)\,\ln\frac{p^{(1)}(\eta|x)}{p^{(2)}(\eta|x)}\,d\eta$$

But since it is not known whether Model 1 or 2 is correct, Kullback suggested that the measure of total information $J_{1,2}$ be maximized where

$$J_{1,2}(x) = \int_{-\infty}^{\infty}\left[p^{(1)}(\eta|x) - p^{(2)}(\eta|x)\right]\ln\frac{p^{(1)}(\eta|x)}{p^{(2)}(\eta|x)}\,d\eta$$

The objective is to select an experiment $x$ that maximizes $J_{1,2}(x)$. A large value of $J$ can be obtained only if $p^{(1)}$ is much larger than $p^{(2)}$, or vice versa. In either case the result is a strong preference for one model over the other. The quantity $J_{1,2}$ is called by Kullback the information for discrimination and is similar to (8A.4).

Let the measurement errors be normal (more specifically, 11--1011) and let the prediction errors have covariance matrices $\mathcal{V}^{(1)}$ and $\mathcal{V}^{(2)}$. Then it can be shown that

$$J_{1,2}(x) = -m + \frac{1}{2}\,\mathrm{tr}\left[\mathcal{V}^{(1)}\left(\mathcal{V}^{(2)}\right)^{-1} + \mathcal{V}^{(2)}\left(\mathcal{V}^{(1)}\right)^{-1}\right] + \frac{1}{2}\left(\hat{y}^{(2)} - \hat{y}^{(1)}\right)^T\left[\left(\mathcal{V}^{(1)}\right)^{-1} + \left(\mathcal{V}^{(2)}\right)^{-1}\right]\left(\hat{y}^{(2)} - \hat{y}^{(1)}\right) \tag{8.9.13}$$

An important special case occurs when one dependent variable is present in the model and only one measurement of it is made. Then for $i = 1, 2$, $\mathcal{V}^{(i)} = \sigma_i^2$ and $(\mathcal{V}^{(i)})^{-1} = \sigma_i^{-2}$ where

$$\sigma_i^2 = s^2 + \sum_{k=1}^{p}\sum_{l=1}^{p} v_{kl}^{(i)}X_k^{(i)}X_l^{(i)} \tag{8.9.14}$$

and $s^2$ is an estimate of $V(Y)$. Then (8.9.13) becomes

$$J_{1,2}(x) = -1 + \frac{1}{2}\left(\frac{\sigma_1^2}{\sigma_2^2} + \frac{\sigma_2^2}{\sigma_1^2}\right) + \frac{1}{2}\left(\hat{y}^{(2)} - \hat{y}^{(1)}\right)^2\left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right) \tag{8.9.15}$$

The information regarding the experiment $x$ is contained in $\sigma_1^2$, $\sigma_2^2$, $\hat{y}^{(1)}$, and $\hat{y}^{(2)}$. The objective is now to choose the measurement so that $J_{1,2}(x)$ is maximized. Box and Hill (29) were the first to derive (8.9.15); they have been pioneers in the application of sequential design of experiments for model discrimination.

Let us briefly consider some implications of (8.9.13)-(8.9.15). Hypothetical plots of the predicted values $\hat{y}^{(1)}$ and $\hat{y}^{(2)}$ are shown in Fig. 8.20 as a function of time $t$ (which is $x$ in this case). If a single time is to be chosen to decide between models 1 and 2, time $t_1$, where the responses coincide, would not be helpful; time $t_2$, where $(\hat{y}^{(2)} - \hat{y}^{(1)})^2$ is a maximum, would be better. The single best measurement time according to (8.9.15) is when $(\hat{y}^{(2)} - \hat{y}^{(1)})^2$ is a maximum, provided $\sigma_1^2$ and $\sigma_2^2$ vary only slightly with $t$. The decision of which model to choose depends upon how $\hat{y}^{(1)}$ and $\hat{y}^{(2)}$ compare with the measured value $Y$ at the same time. If $\hat{y}^{(i)}$ is nearer to $Y$ than $\hat{y}^{(j)}$, then model $i$ would be selected (for $i, j = 1, 2$ and $i \ne j$).
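The single-measurement criterion can be evaluated directly once the two predictions and their variances are available. The following is a minimal sketch of (8.9.15) as reconstructed above; the function names are illustrative and the candidate times, predictions, and variances in the usage lines are hypothetical numbers, not data from the text.

```python
def j_single(y1, y2, var1, var2):
    """Information for discrimination for one measurement, as in (8.9.15).

    y1, y2     : predicted responses of models 1 and 2
    var1, var2 : prediction variances sigma_1^2 and sigma_2^2 from (8.9.14)
    """
    variance_term = 0.5 * (var1 / var2 + var2 / var1) - 1.0
    mean_term = 0.5 * (y2 - y1) ** 2 * (1.0 / var1 + 1.0 / var2)
    return variance_term + mean_term


def best_measurement(times, pred1, pred2, var1, var2):
    """Return the candidate time with the largest J, plus all J values."""
    scores = [
        j_single(a, b, v1, v2)
        for a, b, v1, v2 in zip(pred1, pred2, var1, var2)
    ]
    best_index = max(range(len(times)), key=scores.__getitem__)
    return times[best_index], scores


if __name__ == "__main__":
    # Hypothetical predictions at four candidate measurement times.
    times = [0.5, 1.0, 2.0, 4.0]
    pred1 = [0.2, 0.5, 0.8, 0.9]
    pred2 = [0.2, 0.6, 1.2, 1.0]
    var = [0.01] * 4
    t_best, scores = best_measurement(times, pred1, pred2, var, var)
    print(t_best, scores)
```

When the two variances are equal the first term vanishes and $J_{1,2}$ grows with the squared difference of the predictions, so the search picks the time of greatest separation, as the discussion of Fig. 8.20 indicates.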
Should $Y$ be midway between $\hat{y}^{(1)}$ and $\hat{y}^{(2)}$, there is no basis for model discrimination. It is interesting to compare the criterion for this case with the one previously given ($C$). For this latter criterion, $S^{(1)} - S^{(2)}$ would be zero and thus the observation would be sought at some other $t$ where $|S^{(1)} - S^{(2)}|$ is a maximum. Hence the two criteria may not yield the same optimal experiments.

Figure 8.20 Discrimination between two predicted Y's using the information theory method.

There are several ways to treat more than two models. One is the following. After each experiment is performed, the likelihood $L^{(i)}$ associated with each model and its current parameters is computed. We then design the next experiment in such a manner as to discriminate between the two models having the largest likelihood values. Another method for discrimination between more than two models is given by Box and Hill (29).

Termination Criteria

A general sequential procedure of mechanistic model building can be visualized as including the steps in Fig. 8.21. (By "mechanistic" we mean a model that can be derived from basic principles.) Note that on the left are tasks performed by the analyst, in the center by the computer, and on the right by the laboratory. After starting, one can propose some competing models. G. E. P. Box has made the point that one should not be timid in proposing models. The process itself should lead to discarding unsuitable models. Next comes performing experiments, followed by estimating all the parameters for all the models (block 3). In block 4, optimal experiments are sought to discriminate between the competing models. The method of Box and Hill could be used for this purpose. If desired, the experiment in block 2 could have been designed using the method in Section 8.9.1,
which does not directly utilize experimental data. After the optimal experimental conditions (designated $x_j$) in block 4 are found, the new experiment is performed (block 5), after which the estimates for all the parameters are found, $b^{(1)}, b^{(2)}, \ldots$. Then in block 7 a test is made to ascertain if any of the proposed models is satisfactory. At the same time certain of them may be discarded. The rest of this section is a discussion of a termination criterion and suggestions for determining if another model is needed.

The wide applicability of the maximum likelihood method of estimation of parameters and of generalized likelihood ratio tests suggests the consideration of likelihood ratios in selecting the better of two models. Suppose that the objective is to choose one of two hypotheses, $H_1$ (Model 1 is correct) or $H_2$ (Model 2 is correct). Let $L^{(i)}(Y, b^{(i)})$ be the maximum joint probability density function associated with the data obtained thus far for the $i$th model and the associated parameters $b^{(i)}$. A likelihood ratio test can be constructed as follows:

1. If $L^{(1)}/L^{(2)} \le A$, accept hypothesis 2.
2. If $L^{(1)}/L^{(2)} \ge B$, accept hypothesis 1.
3. If $A < L^{(1)}/L^{(2)} < B$, investigate alternate models and perform more experiments.

Methods for choosing $A$ and $B$ are discussed from differing points of view by Ghosh (30), by Fedorov (3), and by Bard (31). Bard suggests that the relations between $A$ and $B$ and the probabilities of error which Wald (32) gave for testing simple versus simple hypotheses where sample sizes are large will work approximately in this situation. That is, if we let $\alpha_1$ be the probability that $H_1$ is accepted when $H_2$ is true and $\alpha_2$ the probability that $H_2$ is accepted when $H_1$ is true, then for independent observations

$$A \approx \frac{\alpha_2}{1 - \alpha_1}, \qquad B \approx \frac{1 - \alpha_2}{\alpha_1} \tag{8.9.16a}$$

$$\alpha_1 \approx \frac{1 - A}{B - A}, \qquad \alpha_2 \approx \frac{A(B - 1)}{B - A} \tag{8.9.16b}$$

These relations mean, for example, that if we wish to be 90% certain that we accept $H_1$ only if $H_1$ is true and 80% certain that we accept $H_2$ only if $H_2$ is true, then $\alpha_1 = 0.1$ and $\alpha_2 = 0.2$. Then using (8.9.16a), $A \approx 0.2/0.9 = 0.222$ and $B \approx 0.8/0.1 = 8$. If we had started with $A$ and $B$, then the corresponding probabilities would be found using (8.9.16b).

In addition to continuing experimentation when the likelihood ratio is between $A$ and $B$, we should also inspect the residuals to see if any insight can be gained for improving any of the models or for proposing another model. This would then lead to blocks 8 and 9 in Fig. 8.21. Regions of large departure in the residuals from random conditions can sometimes imply improvements in the models. Also insight into statistical assumptions can be gained through inspection of the residuals. If, for example, the residuals are highly correlated for a proposed model, then either the model should be improved or the errors must be considered as being correlated. If repeated experiments continue to show high correlation in the residuals, one should examine them to see if there is some characteristic "signature" in the residuals. If there is, one should attempt to improve the model to remove these signatures; if there is no signature one would model the errors as being autoregressive, moving average, etc., processes.

Example 8.9.2

Two models have been proposed for a process in which $m$ different thermocouples have been used to make $n$ measurements each. The assumptions of additive, zero mean, constant variance, independent, normal errors are made. The variance is unknown, there are no errors in the independent variables, and there is no prior information. (These assumptions are designated 11111011.) Find the likelihood ratio.

Solution

The parameters for the $i$th model are found by maximizing the natural logarithm of the joint probability density function (pdf) of independent normal errors with respect to $\beta^{(i)}$. The maximum value of the pdf is

$$L^{(i)} = (2\pi)^{-mn/2}\,\sigma^{-mn}\exp\left(-\frac{R^{(i)}}{2\sigma^2}\right), \qquad R^{(i)} = \sum_{j=1}^{m}\sum_{k=1}^{n}\left[Y_{jk} - \eta_{jk}^{(i)}\left(b^{(i)}\right)\right]^2$$

provided $L^{(i)}$ is also maximized with respect to $\sigma^2$, which leads to

$$\hat{\sigma}^{(i)} = \left(\frac{R^{(i)}}{mn}\right)^{1/2}$$

Then $L^{(i)}$ becomes

$$L^{(i)} = (2\pi)^{-mn/2}\left(\hat{\sigma}^{(i)}\right)^{-mn}\exp\left(-\frac{mn}{2}\right)$$

and thus the likelihood ratio is

$$\frac{L^{(1)}}{L^{(2)}} = \left(\frac{\hat{\sigma}^{(1)}}{\hat{\sigma}^{(2)}}\right)^{-mn} = \left(\frac{R^{(1)}}{R^{(2)}}\right)^{-mn/2}$$

After obtaining this ratio, we can determine to a given confidence whether Model 1 or Model 2 is to be accepted using the procedure described above. Before accepting a given model, one should investigate if the postulated assumptions are actually reasonable.

REFERENCES

1. Box, G. E. P. and Lucas, H. L., "Design of Experiments in Nonlinear Situations," Biometrika 46 (1959), 77-90.
2. Box, G. E. P. and Hunter, W. G., "Non-sequential Designs for the Estimation of Parameters in Nonlinear Models," Tech. Rep. No. 28, University of Wisconsin, Dept. of Statistics, Madison, Wis., 1964.
3. Fedorov, V. V., Teoriya Optimal'nogo Eksperimenta, Izdatel'stvo Moskovskogo Universiteta, 1969, translated by W. J. Studden and E. M. Klimko, Theory of Optimal Experiments, Academic Press, Inc., New York, 1972.
4. Badavas, P. C. and Saridis, G. N., "Response Identification of Distributed Systems with Noisy Measurements at Finite Points," Inf. Sci. 1 (1970), 19-34.
5. McCormack, D. J. and Perlis, H. J., "The Determination of Optimum Measurement Locations in Distributed Parameter Processes," Proceedings of the 3rd Annual Princeton Conference on Information Sciences and Systems, 1969, pp. 510-518.
6. Nahi, N. E., Estimation Theory and Applications, John Wiley and Sons, Inc., New York, 1964.
7. Smith, K., "On the Standard Deviations of Adjusted and Interpolated Values of an Observed Polynomial Function and its Constants and the Guidance they give Towards a Proper Choice of the Distribution of Observations," Biometrika 12 (1918), 1-85.
8. Atkinson, A. C. and Hunter, W. G., "The Design of Experiments for Parameter Estimation," Technometrics 10 (1968), 271-289.
9. Carslaw, H. S. and Jaeger, J. C., Conduction of Heat in Solids, 2nd ed., Oxford University Press, London, 1959.
10. Beck, J. V., "The Optimum Analytical Design of Transient Experiments for Simultaneous Determinations of Thermal Conductivity and Specific Heat," Ph.D. Thesis, Dept. of Mechanical Engineering, Michigan State University, 1964.
11. Heineken, F. G., Tsuchiya, H. M. and Aris, R., "On the Accuracy of Determining Rate Constants in Enzymatic Reactions," Math. Biosci. 1 (1967), 115-141.
12. Seinfeld, J. H. and Lapidus, L., Mathematical Methods in Chemical Engineering, Vol. 3, Process Modeling, Estimation, and Identification, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1974.
13. Van Fossen, G. J., Jr., "Design of Experiments for Measuring Heat-Transfer Coefficients with a Lumped-Parameter Calorimeter," NASA TN D-7857, 1975.
14. Beck, J. V., "Analytical Determination of Optimum Transient Experiments for Measurement of Thermal Properties," Proc. 3rd Int. Heat Transfer Conf. 44 (1966), 74-80.
15. Beck, J. V., "Transient Sensitivity Coefficients for the Thermal Contact Conductance," Int. J. Heat Mass Transfer (1967), 1615-1617.
16. Beck, J. V., "Determination of Optimum Treatment Experiments for Thermal Contact Conductance," Int. J. Heat Mass Transfer 11 (1969), 621-633.
17. Bonacina, C. and Comini, G., "Calculation of Convective Heat Transfer Coefficients for Time-Temperature Curves," Int. Inst. Refrig. Freudenstadt (1972), 157-167.
18. Comini, G., "Design of Transient Experiments for Measurements of Convective Heat Transfer Coefficients," Int. Inst. Refrig. Freudenstadt (1972), 169-178.
19. Cannon, J. R. and Klein, R. E., "Optimal Selection of Measurement Locations in a Conductor for Approximate Determination of Temperature Distributions," J. Dyn. Syst. Meas. Control 93 (1971), 193-199.
20. Seinfeld, J. H., "Optimal Location of Pollutant Monitoring Stations in an Airshed," Atmos. Environ. 6 (1972), 847-858.
21. Beck, J. V., "Analytical Determination of High Temperature Thermal Properties of Solids Using Plasma Arcs," Thermal Conductivity, Proceedings of the Eighth Conference, 1969.
22. Van Fossen, G. J., Jr., "Model Building Incorporating Discrimination Between Rival Mathematical Models in Heat Transfer," Ph.D. Thesis, Dept. of Mechanical Engineering, Michigan State University, 1973.
23. Hunter, W. G. and Hill, W. J., "Design of Experiments for Subsets of Parameters," Tech. Rep. No. 330, University of Wisconsin, Dept. of Statistics, Madison, Wis., March 1973.
24. Hunter, W. G., Hill, W. J., and Henson, T. L., "Designing Experiments for Precise Estimation of All or Some of the Constants in a Mechanistic Model," Can. J. Chem. Eng. 47 (1969), 76-80.
25. Graybill, F. A., Introduction to Matrices with Applications in Statistics, Wadsworth Publishing Company, Inc., Belmont, Calif., 1969.
26. Myers, G. E., Analytical Methods in Conduction Heat Transfer, McGraw-Hill Book Company, New York, 1971.
27. Parker, W. J., Jenkins, R. J., Butler, C. P., and Abbott, G. L., "Flash Method of Determining Thermal Diffusivity, Heat Capacity, and Thermal Conductivity," J. Appl. Phys. 31 (1961), p. 1679.
28. Kullback, S., Information Theory and Statistics, John Wiley and Sons, Inc., New York, 1959.
29. Box, G. E. P. and Hill, W. J., "Discrimination among Mechanistic Models," Technometrics 9 (1967), 57-71.
30. Ghosh, B. K., Sequential Tests of Statistical Hypotheses, Addison-Wesley, Reading, Mass., 1970.
31. Bard, Y., Nonlinear Parameter Estimation, Academic Press, Inc., New York, 1974.
32. Wald, A., Sequential Analysis, John Wiley and Sons, Inc., New York, 1947.

APPENDIX 8A OPTIMAL EXPERIMENT CRITERIA FOR ALL PARAMETERS OF INTEREST

For the standard assumptions of additive, zero mean, normal measurement errors in the dependent variable, the joint probability density of the estimated parameter vector $\mathbf{b}$ is

$$p(\mathbf{b}) = (2\pi)^{-p/2}\,|\mathbf{P}|^{-1/2}\exp\!\left[-\tfrac{1}{2}(\mathbf{b}-\boldsymbol{\beta})^{T}\mathbf{P}^{-1}(\mathbf{b}-\boldsymbol{\beta})\right] \tag{8A.1}$$

where $\mathbf{P}$ is the covariance matrix of $\mathbf{b}$. This expression also assumes errorless independent variables. We also assume that the error covariance matrix $\boldsymbol{\psi}$ is known to within a multiplicative constant $\sigma^2$. These assumptions are designated 11--101-. Equation (8A.1) is exact if the dependent variable is linear in the parameters; if $\eta$ is nonlinear in the parameters, then the expression is approximate.

For the assumptions given above the confidence region can be found from an expression similar to [see (6.8.38)]

$$(\mathbf{b}-\boldsymbol{\beta})^{T}\mathbf{P}^{-1}(\mathbf{b}-\boldsymbol{\beta}) = \text{constant} = C^{2} \tag{8A.2}$$

For a given value of $C^2$ this equation describes a hyperellipsoid which has a hypervolume given by

$$\text{volume} = \pi^{p/2}\,C^{p}\,(\lambda_1\lambda_2\cdots\lambda_p)^{1/2}\left[\Gamma\!\left(\tfrac{p}{2}+1\right)\right]^{-1} \tag{8A.3}$$

where $p$ is the number of parameters, $\Gamma(\cdot)$ is the gamma function, and $\lambda_i$ is the $i$th eigenvalue of $\mathbf{P}$. Now the determinant of $\mathbf{P}$ is equal to the product of its eigenvalues. Thus to minimize the hypervolume of a confidence region, the determinant of $\mathbf{P}$ should be minimized. This is equivalent to maximizing the determinant of the inverse of $\mathbf{P}$. For the standard assumptions of 11111-11 this leads to the criterion of maximizing $\Delta = |\mathbf{X}^{T}\mathbf{X}|$, which has been given by Box and Lucas [1]. The criterion of max $|\mathbf{P}^{-1}|$ is more general, however.

Exactly the same criterion can be derived using the Shannon [28] concept of a measure of uncertainty which is related to information theory. He showed that the unique (except for a positive multiplicative factor) suitable measure of uncertainty associated with the probability density function of the random parameter vector $\mathbf{b}$, which is denoted $p(\mathbf{b})$, is given by

$$H(p) = -E(\ln p) = -\int p(\mathbf{b})\ln p(\mathbf{b})\,d\mathbf{b} \tag{8A.4}$$

Information is gained when uncertainty is reduced. Suppose $p_0(\mathbf{b})$ is the prior density of $\mathbf{b}$, that is, resulting from previous experiments. Let $p_1(\mathbf{b})$ be the posterior density after another experiment has been performed. The amount of information $I$ gained by the experiment is [28]

$$I = H(p_0) - H(p_1) \tag{8A.5}$$

Our goal is to select an experiment that maximizes $I$. Since $H(p_0)$ is unaffected by the new experiment, we simply minimize $H(p_1)$. Let us evaluate $H(p)$ for the standard assumptions 11--101-. Then $p(\mathbf{b})$ is given by (8A.1) and thus

$$\begin{aligned}
H(p(\mathbf{b})) &= -E[\ln p(\mathbf{b})] = -E\left\{\tfrac{1}{2}\left[-p\ln 2\pi - \ln|\mathbf{P}| - (\mathbf{b}-\boldsymbol{\beta})^{T}\mathbf{P}^{-1}(\mathbf{b}-\boldsymbol{\beta})\right]\right\} \\
&= \tfrac{1}{2}\left\{p\ln 2\pi + \ln|\mathbf{P}| + \operatorname{tr}\!\left[\mathbf{P}^{-1}E\!\left((\mathbf{b}-\boldsymbol{\beta})(\mathbf{b}-\boldsymbol{\beta})^{T}\right)\right]\right\} \\
&= \tfrac{1}{2}\left\{p\ln 2\pi + \ln|\mathbf{P}| + \operatorname{tr}\!\left[\mathbf{P}^{-1}\mathbf{P}\right]\right\} = \tfrac{1}{2}\left\{p(1+\ln 2\pi) + \ln|\mathbf{P}|\right\}
\end{aligned} \tag{8A.6}$$

where $p$ on the right side designates the number of parameters. Discarding irrelevant constants, a measure of uncertainty is

$$\ln|\mathbf{P}| \tag{8A.7}$$

But minimizing this function is equivalent to maximizing $|\mathbf{P}^{-1}|$, which was given above using the minimum confidence volume approach.

APPENDIX 8B OPTIMAL EXPERIMENT CRITERIA FOR NOT ALL PARAMETERS OF INTEREST

Suppose that of the total number $p$ of the estimated parameters only a subset of them need be estimated accurately. Let the estimated parameter vector $\mathbf{b}$ be partitioned into two vectors $\mathbf{b}_1$ and $\mathbf{b}_2$ so that

$$\mathbf{b} = \begin{bmatrix}\mathbf{b}_1 \\ \mathbf{b}_2\end{bmatrix} \tag{8B.1}$$

where $\mathbf{b}$ is $p \times 1$, $\mathbf{b}_1$ is $q \times 1$, and $\mathbf{b}_2$ is an $r$ vector where $r = p - q$. The vector $\mathbf{b}_1$ consists of those $b$'s of primary interest and $\mathbf{b}_2$ contains the others. Let the same statistical assumptions denoted by 11--101- and discussed in the beginning of Appendix 8A be valid. Let the covariance matrix of all the estimated parameters be designated $\mathbf{P}$ and be partitioned as

$$\mathbf{P} = \begin{bmatrix}\mathbf{P}_{11} & \mathbf{P}_{12} \\ \mathbf{P}_{12}^{T} & \mathbf{P}_{22}\end{bmatrix} \tag{8B.2}$$

where $\mathbf{P}_{11}$ is $q \times q$ and is for the $\mathbf{b}_1$ vector, etc. For this case the joint probability density of $\mathbf{b}$ is given by (8A.1). If the experimenter desires precise estimates of only $\mathbf{b}_1$, Hunter and Hill [23,24] state that the marginal distribution of $\mathbf{b}_1$ is then needed. It is obtained by integrating (8A.1) with respect to $\mathbf{b}_2$. From Theorem 10.6.1 of Graybill [25] the marginal probability density of $\mathbf{b}_1$ is

$$p(\mathbf{b}_1) = (2\pi)^{-q/2}\,|\mathbf{P}_{11}|^{-1/2}\exp\!\left[-\tfrac{1}{2}(\mathbf{b}_1-\boldsymbol{\beta}_1)^{T}\mathbf{P}_{11}^{-1}(\mathbf{b}_1-\boldsymbol{\beta}_1)\right] \tag{8B.3}$$

Following the same reasoning as in Appendix 8A, the criterion is to maximize

$$\Delta_q = |\mathbf{P}_{11}^{-1}| \tag{8B.4}$$

The terms in $\Delta_q$ should be related to the sensitivity matrix. Let $\mathbf{X}$ be partitioned as

$$\mathbf{X} = \begin{bmatrix}\mathbf{X}_1 & \mathbf{X}_2\end{bmatrix} \tag{8B.5}$$

where $\mathbf{X}_1$ is $n \times q$ and $\mathbf{X}_2$ is $n \times r$. Then for maximum likelihood estimation, $\mathbf{P} = (\mathbf{X}^{T}\boldsymbol{\psi}^{-1}\mathbf{X})^{-1}$, where $\mathbf{X}^{T}\boldsymbol{\psi}^{-1}\mathbf{X}$ can be written as [see (6.1.17a)]

$$\mathbf{X}^{T}\boldsymbol{\psi}^{-1}\mathbf{X} = \begin{bmatrix}\mathbf{X}_1^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_1 & \mathbf{X}_1^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_2 \\ \mathbf{X}_2^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_1 & \mathbf{X}_2^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_2\end{bmatrix} \tag{8B.6}$$

Taking the inverse of (8B.6) and identifying the upper left matrix as $\mathbf{P}_{11}$ results in the criterion being to maximize

$$\Delta_q = \left|\mathbf{X}_1^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_1 - \mathbf{X}_1^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_2\left(\mathbf{X}_2^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_2\right)^{-1}\mathbf{X}_2^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_1\right| \tag{8B.7}$$

If the errors are independent and have a constant variance (i.e., 11111-11), this expression reduces, apart from a constant factor involving the variance, to the one given by Hunter and Hill [23,24], which is

$$\Delta_q = \left|\mathbf{X}_1^{T}\mathbf{X}_1 - \mathbf{X}_1^{T}\mathbf{X}_2\left(\mathbf{X}_2^{T}\mathbf{X}_2\right)^{-1}\mathbf{X}_2^{T}\mathbf{X}_1\right| \tag{8B.8}$$

Using (6.1.17), $\Delta_q$ given by (8B.7) can be related to the usual $\Delta_p$ by

$$\Delta_q = \frac{\Delta_p}{\left|\mathbf{X}_2^{T}\boldsymbol{\psi}^{-1}\mathbf{X}_2\right|} \tag{8B.9}$$

where $\Delta_p$ is the determinant of the expression given by (8B.6).

PROBLEMS

Unless otherwise stated, assume that the standard conditions designated 11111-11 are valid for the following problems.

8.1 For $X = t e^{-t+1}$ show that $\Delta^{n}$ given by (8.2.3) becomes

$$\Delta^{n} = \frac{e^{2}}{4}\left[t_n^{-1} - e^{-2t_n}\left(2 + t_n^{-1} + 2t_n\right)\right]$$

8.2 Verify that at $t_n = 1.691817$, $d\Delta^{n}/dt_n = 0$. At the same value of $t_n$ show that the sufficient condition for a maximum, $d^{2}\Delta^{n}/dt_n^{2} < 0$, is also satisfied.

8.3 For $\eta = \beta \sin t$ show that $\Delta^{+}$ given by (8.2.13) becomes (for $t_n > \pi/2$)

$$\Delta^{+} = \frac{1}{2}\left[1 - \frac{1}{2t_n}\sin 2t_n\right]$$

Also show that $\Delta^{+}$ has extrema when $\tan\tau = \tau$ is satisfied. Use Myers [26, p. 442] to find the first three nonzero positive roots of $\tan\tau = \tau$. Derive (8.2.14).

8.4 Consider the model $\eta_i = \beta f(t_i)$ where $f(t_i)$ can assume only the values indicated below. The optimal conditions for estimating $\beta$ are needed.

$i$, $f(t_i)$: 5 6 2 1 7 0 8 -1 1 1 3 0 2 2 2.5 4 9 -2 10 -3 11 12 -4 -3

(a) What single $i$ should be chosen if only one measurement could be taken?
(b) What $i$ value(s) should be selected if two observations are to be taken? Repeated observations at any $t_i$ are permitted.
(c) Same as (b) except repeated observations are not permitted.
(d) What three $i$ values would be selected if repeated $i$ values are not allowed?

8.5 For the model $\eta_i = \beta_1 f_1(t_i) + \beta_2 f_2(t_i)$ the below discrete values are permitted.

$f_1(t_i)$, $f_2(t_i)$: 8 3 4 5 10 2 9 1 3 5 0 2 -4 6 2 4

(a) What are the two optimum locations to take measurements?
(b) What are the best three locations to take observations? Repeated values are not permitted.
(c) Same as (b) except repeated values are permitted.

8.6 For the model $\eta = \beta_1\exp(-\beta_2 t)$ verify that the optimal locations for $n = 4$ are at $\beta_2 t = 0$ and 1. There are no constraints on $\eta$ or $t$. Study the region $0 < t < 1.2$ using the spacing of $\Delta t = 0.1$. Use a programmable calculator or a computer.

8.7 Find the optimal two values of $t^{+} = \beta_2 t$ for estimating $\beta_1$ and $\beta_2$ in the model $\eta = \beta_1\sin\beta_2 t$. There are no constraints on $\eta$ or $t$.

8.8 Find the optimal value of $t_n^{+} = \beta_2 t_n$ for a large number of uniformly spaced measurements in $0 < t < t_n$ for the model $\eta = \beta_1\sin t^{+}$. Use a computer if necessary. No constraints are to be used on $\eta$ or $t$.

8.9 For the model of the cooling billet, $T = T_\infty + (T_0 - T_\infty)\exp(-\beta t)$, find the optimal duration of the experiment for a large number of equally spaced measurements. The parameters are $T_0$, $T_\infty$, and $\beta$.

8.10 For the model $\eta = [\beta_1/(\beta_1 - \beta_2)][\exp(-\beta_2 t) - \exp(-\beta_1 t)]$ find expressions for the $\beta_1$ and $\beta_2$ sensitivity coefficients. See (8.3.19).

8.11 Find general expressions for the sensitivity coefficients plotted in Figs. 8.15 and 8.16.

8.12 A plate which is subjected to a large instantaneous pulse of energy $Q$ at $x = 0$ and is insulated at $x = L$ has the solution for the temperature of

$$T = T_0 + \frac{Q}{cL}\left[1 + 2\sum_{n=1}^{\infty}\cos\!\left(\frac{n\pi x}{L}\right)\exp\!\left(-n^{2}t^{+}\right)\right]$$

where $t^{+} = \pi^{2}\alpha t/L^{2}$, $c$ is the density-specific heat product, and $Q$ has units of energy (Btu or J) per unit area. For $x = 0$ the temperature is infinity at time zero and decays to $T_0 + Q/cL$ for large time. At $x = L$ the temperature starts at $T_0$ and increases to $T_0 + Q/cL$.
(a) Find an expression for the $\alpha$ sensitivity at $x/L = 1$.
(b) Evaluate using a computer the expression found in (a) for $0 < t^{+} < 3$. For a fixed value of $Q$ (and no restriction on the range of $T$) show that the optimum time to take a single measurement is $t^{+} = 1.38$. Also show that this time corresponds to the time that the temperature at $x = L$ has reached one half of the maximum temperature rise. This "one-half" time is the basis of finding $\alpha$ in pulse or flash experiments. See the paper by Parker, Jenkins, Butler, and Abbott [27].
(c) Also using a computer find the optimum experiment duration for many equally spaced measurements at $x/L = 1$.

8.13 (a) A large number of measurements uniformly spaced in time have been made at $x = 0$ and $x = L$ in the heat conducting body discussed in Section . For $m_0$ and $m_1$ sensors at $x = 0$ and $L$, respectively, show that $\Delta^{+}$ given by (8.3.7) can be written as

$$\Delta^{+} = \left[zC_{11,0} + (1-z)C_{11,1}\right]\left[zC_{22,0} + (1-z)C_{22,1}\right] - \left[zC_{12,0} + (1-z)C_{12,1}\right]^{2}$$

where $z = m_0/m$ and $1 - z = m_1/m$. The third subscript in $C_{ij,0}$ or $C_{ij,1}$ refers to $x = 0$ or $L$, respectively. The standard statistical assumptions are valid.
(b) Derive an expression for $z$ at which $\Delta^{+}$ is a maximum, assuming that $z$ can assume any value in the interval 0 to 1.
(c) The following values are for the heat conducting body discussed in Section .

$C_{11,0} = 0.07609$, $C_{12,0} = 0.1062$, $C_{22,0} = 0.1552$
$C_{11,1} = 0.0148$, $C_{12,1} = -0.0422$, $C_{22,1} = 0.126$

The values correspond to the dimensionless time of 0.65. The first two subscripts correspond to $k$ (a 1 subscript) or $c$ (a 2 subscript). Using the expression derived in part (b), find a value for $z$.
(d) What conclusions can you draw from the results of this problem?

APPENDIX A IDENTIFIABILITY CONDITION

A.1 INTRODUCTION

The problem of investigating the conditions under which parameters can be uniquely estimated is called the identifiability problem. A convenient means of anticipating slow convergence or even nonconvergence in estimating parameters can save unnecessary time and expense. Also, if easy-to-apply identifiability conditions are known, many times insight can be provided to avoid the problem of nonidentifiability, through either the use of a different experiment or a smaller set of parameters that are identifiable.

The purpose of this appendix is to derive the identifiability criterion that the sensitivity coefficients in the neighborhood of the minimum sum of squares function must be linearly independent over the range of the measurements. This criterion applies for linear and nonlinear estimation. This criterion is derived only for a weighted sum of squares function which includes least squares, weighted least squares, and ML estimation with normal errors, in each case with no constraints on the parameters. For MAP estimation with prior parameter information it might be possible to estimate the parameters even if the sensitivity coefficients are linearly dependent.

This condition of independence of the sensitivity coefficients is particularly convenient if the number of the parameters is not large, say, less than six. Even if the number is larger, linear dependence between two or three of the parameters can sometimes be readily detected from graphs of the sensitivity coefficients. The plotting of the coefficients is extremely important and should be done for each new problem before attempting to estimate the parameters.
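The linear-independence condition can also be checked numerically before any estimation is attempted. The following is a minimal Python sketch under stated assumptions, not the derivation given in this appendix: it approximates the sensitivity coefficients of a two-parameter model by forward differences and evaluates the normalized determinant $|\mathbf{X}^{T}\mathbf{X}|/(C_{11}C_{22})$, where $C_{ij}$ is the sum over the measurements of the product of the $i$th and $j$th sensitivity coefficients. This ratio is zero when the two sensitivity columns are proportional. The function names, the relative step size, the measurement times, and the two test models are all illustrative choices.

```python
import math


def sensitivities(model, beta, times, rel_step=1e-6):
    """Forward-difference sensitivity matrix X[i][j] = d eta(t_i) / d beta_j.

    `model(t, beta)` is any user-supplied function (hypothetical interface).
    """
    base = [model(t, beta) for t in times]
    X = [[0.0] * len(beta) for _ in times]
    for j, b_j in enumerate(beta):
        # Relative step; fall back to an absolute step if the parameter is zero.
        step = rel_step * (abs(b_j) if b_j != 0.0 else 1.0)
        bumped = list(beta)
        bumped[j] = b_j + step
        for i, t in enumerate(times):
            X[i][j] = (model(t, bumped) - base[i]) / step
    return X


def independence_measure(X):
    """Return |X^T X| / (C11 * C22) for a two-parameter problem.

    The value lies between 0 (columns proportional) and 1 (columns orthogonal).
    """
    c11 = sum(row[0] * row[0] for row in X)
    c22 = sum(row[1] * row[1] for row in X)
    c12 = sum(row[0] * row[1] for row in X)
    return (c11 * c22 - c12 * c12) / (c11 * c22)


def product_model(t, beta):
    # eta = beta1 * beta2 * t: only the product beta1 * beta2 affects eta.
    return beta[0] * beta[1] * t


def exponential_model(t, beta):
    # eta = beta1 * exp(-beta2 * t)
    return beta[0] * math.exp(-beta[1] * t)


# Assumed measurement times: 20 equally spaced points from 0.1 to 2.0.
times = [0.1 * i for i in range(1, 21)]

dependent = independence_measure(
    sensitivities(product_model, [2.0, 3.0], times))
independent = independence_measure(
    sensitivities(exponential_model, [1.0, 1.0], times))

print("product model:    ", dependent)
print("exponential model:", independent)
```

For the product model the two sensitivity coefficients are $\beta_2 t$ and $\beta_1 t$, which are proportional for every $t$, so the measure is zero to within rounding error and only the product $\beta_1\beta_2$ can be estimated. For the exponential model the measure is well away from zero over this time range. How small a value should be treated as "nearly dependent" is a judgment that depends on the problem, and plots of the sensitivity coefficients remain the more informative check.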