From: steves@truevision.com (Steve Spicklemire)
Newsgroups: comp.sources.hp48
Subject: v01i018:  chisq10 - Non-linear Chi^2 fit v1.0, Part01/01
Date: 16 Aug 91 00:22:26 GMT
Followup-To: comp.sys.hp48

Checksum: 1237628350 (verify with brik -cv)
Submitted-by: Steve Spicklemire <steves@truevision.com>
Posting-number: Volume 1, Issue 18
Archive-name: chisq10/part01

BEGIN_DOC chisq.doc

I've been writing software to connect the HP48SX to the
Universal Laboratory Interface from Vernier Software,
(2920 S.W. 89th Street, Portland, OR 97225, (503) 297-5317).
In the process I've cooked up a non-linear chi-square
program for the ULI. It's not highly optimized, nor
is it the most modern algorithm, but it works! (and it's free!
so stop complainin'.) Anyway here it is.

The following is a 'simple' non-linear chi-square fitting
routine. It uses the 'almost' linear approach. (In fact it
is simply a slight alteration of the linear chi-square
program I wrote, which explains some of the variable
copying stuff that may seem silly).
I hope to visit it again sometime in the next month or
so, but in the mean time I thought someone might be able
to use it as-is.

The basic plan is to set up the function in terms of the
parameters in the object nflst. Then to get some pseudo
data, set the paramters to something close to what you
think the correct paramters will be for your actual data.
Now run `min max step DODAT.' This creates a fake \GSDAT that
is a crude representation of noisy data (I picked a really crude
poisson like error approximation. Take it out if you don't
like it!) Run DOFIT to fit the pseudo data.

You can test your installation by runing NEWP then DOFIT once the
variables have been downloaded. In a bit, you should see the
data (sinusoidal) plotted, and then the fit. (Make sure
you're set for radian mode!)

You can easily see the effects of changing your paramters
by changing them and re-plotting using the 'EQ' defined by
BLDDIV. When you think the paramters are close to correct,
put the REAL data in \GSDAT and run DOFIT.

Iterate DOFIT untill the parameter settle down. CMAT is
the covariance matrix of the last iteration. The diagonal
elements are the variances in the paramters.

\GSDAT has some sample data in it that I acquired from
an RS232 based data acquisition board (ULI from Vernier Software)
and a ball-bearing shaft encoder. Yes, it was a pendulum.
the 'x' column is the time in microseconds, and the 'y's
are proportional to the angle of deflection. Any guesses
how long the pendulum was? ( a short Physics quiz)

BTW if you have uncertainty values for the y values, put them
in the third column of \GSDAT. If you don't, the variances are
scaled to assume unit uncertainty in all the 'y's.

There are some references to NRC. That's `Numerical Recipies
in C', Press et. al., Cambridge Univ. Press, 1988. This
code is really not what's in there at all (they do Levenberg-
Marquardt for Non-Linear Chi^2), but the idea of a
'design' matrix is presented in a reasonably coherent
manner there, hence the reference.

enjoy!

Steve Spicklemire
University of Indianapolis
Indianapolis, IN 46227

and...

Truevision Inc.
Indianapolis, IN

END_DOC

BYTES: #9BFFh	3178

BEGIN_RPL chisq
%%HP: T(3)A(R)F(.);
DIR
\GSDAT
[[ 330165 9 ]
 [ 390001 19 ]
 [ 439537 29 ]
 [ 485486 39 ]
 [ 531116 49 ]
 [ 579301 59 ]
 [ 634344 69 ]
 [ 713356 79 ]
 [ 900890 75 ]
 [ 967547 65 ]
 [ 1019917 55 ]
 [ 1067056 45 ]
 [ 1112531 35 ]
 [ 1159533 25 ]
 [ 1211986 15 ]
 [ 1279142 5 ]
 [ 1472620 3 ]
 [ 1549472 13 ]
 [ 1604145 23 ]
 [ 1652155 33 ]
 [ 1698056 43 ]
 [ 1744266 53 ]
 [ 1794497 63 ]
 [ 1855927 73 ]
 [ 2017782 81 ]
 [ 2124127 71 ]
 [ 2183653 61 ]
 [ 2233212 51 ]
 [ 2279482 41 ]
 [ 2325474 31 ]
 [ 2374437 21 ]
 [ 2430805 11 ]
 [ 2516450 1 ]
 [ 2701813 7 ]
 [ 2766094 17 ]
 [ 2817686 27 ]
 [ 2864449 37 ]
 [ 2910201 47 ]
 [ 2957724 57 ]
 [ 3011418 67 ]
 [ 3082616 77 ]
 [ 3270900 77 ]
 [ 3344829 67 ]
 [ 3399324 57 ]
 [ 3447282 47 ]
 [ 3493429 37 ]
 [ 3540216 27 ]
 [ 3591697 17 ]
 [ 3655204 7 ]]
BLDDIV
@
@ BLDDIV gets the gradient of the fitting function
@ WRT the parameters and puts them into a 'linear'
@ fit list. (This is so the non-linear fit can
@ be treated as a linear fit via the first order
@ Taylor series (WRT the parameters) of the non-linear
@ function f(x).)
@
\<<
  NFLST 1 GET                           @ get the dependent variable
  NFLST 2 GET PURGE                     @ clear out the paramemters
  NFLST 1 GET PURGE                     @ clear out the dependent
  1 PARSIZ FOR I                        @ for each parameter get
    GFUNC                               @ the function f(x)
    I VARNAM                            @ the paramter p
    \.d                                 @ and the gradient of f(x) WRT p
  NEXT PARSIZ 1 +                       @ make a PARSIZ + 1 list
  \->LIST 'FLST' STO                    @ which is the linear fList
\>>
CHISQ
@
@ Calculate the 'reduced chi-square' for the current model
@ and parameters.
@
\<<
   YSTAR                                @ model Y
   REALY                                @ actual data
   - DUP TRN SWAP *                     @ This is chi-squared
   \GSDAT                               @ divide by number of data points
   SIZE 1 GET /
\>>
DODAT
@
@ DODAT generated pseudo data based on the current contents
@ of NFLST and the variables present in NFLST.
@
\<<
  UPDAT                                 @ reset variables
  BLDDIV                                @ get gradient
  NEWP                                  @ re-generate variables
  CL\GS                                 @ clear \GSDAT
  FLST 1 GET
  \-> strt stop incamt depVar           @ set up steps of dependent
  \<<
    strt stop FOR I
       I depVar STO                     @ reset dependent variable
       GFUNC EVAL \->NUM                @ evaluate function
       DUP ABS \v/ RAND
       .5 - * \-> rVal errVal           @ crude Poison
       \<<                              @ get 'error amount'
          I rVal errVal +
          errVal ABS 3 \->ARRY \GS+     @ convert to vector & save
       \>>
    incamt STEP
  \>>
\>>
DOFIT
@
@ perform a linear fit using the current FLST.
@
\<<
   UPDAT                                @ copy parameter values into FITP
   TMSIG DUP                            @ get TMAT/SIG^2
   TMAT TRN                             @ get the 'T' matrix
   SWAP * INV                           @ get TRN(TMAT)*TMATSIG
   DUP 'CMAT' STO                       @ save the correlation matrix
   SWAP TRN YVEC *                      @ get TRN(TMAT)*Y
   *                                    @ get (TRN(TMAT)*Y)/(TRN(TMAT)*TMAT)
   'FITP' STO                           @ store result in FITP
   NEWP                                 @ Copy FITP to variables
   SCATRPLOT                            @ SCATRPLOT to get scale right
   DOPLOT                               @ Plot result
\>>
DOPLOT
@
@ Plot the data and the fitted function together.
@
\<<
  ERASE                                 @ clear the screen
  SCATTER DRAW                          @ draw scatter plot
  NFLST 1 GET INDEP                     @ set independent variable
  GFUNC 'EQ' STO                        @ save the equation
  FUNCTION DRAW                         @ draw the function
  GRAPH                                 @ draw it
\>>
EVLST
@
@ evalute the ith basis funcion from FLST at the current setting
@ of the dependent variable.
@
\<<
  FLST SWAP 1 + GET EVAL                @ do it.
\>>
FITP
[[ 41 ]
 [ .0000052 ]
 [ 1600000 ]
 [ 41 ]
 [ .000000001 ]]
FLST
@
@ Flst is generated by BLDDIV. It is essentially just the
@ Gradient of the model function in parameter space.
@
{
  X
  'SIN(B*(X-C))*EXP(-(E*X))'
  'A*(COS(B*(X-C))*(X-C))*EXP(-(E*X))'
  'A*(COS(B*(X-C))*-B)*EXP(-(E*X))'
  1
  'A*SIN(B*(X-C))*-(X*EXP(-(E*X)))'
  }
  GFUNC
@
@ Get the model function.
@
\<<
  NFLST 3 GET                           @ Do it.
\>>
NEWP
@
@ Update the variable form of the parameters from the data stored
@ in FITP (This is set up by DOFIT and DODATA).
@
\<<
  1 PARSIZ FOR I                        @ For each paramter...
    FITP { I 1 } GET                    @ Get the Ith value
    I VARNAM STO                        @ Store in the i'th name
  NEXT
\>>
NFLST
@
@ NFLST is the non-linear function list.
@ It has three parts.
@
  {
   X                                    @ the dependent variable 1.variable
   { A B C D E }                        @ the parameter list
   'A*SIN(B*(X-C))*EXP(-E*X)+D'         @ the fitting function
}
PARSIZ
@
@ Get the number of parameters.
@
\<<
  NFLST 2 GET SIZE                      @ Do it
\>>
REALY
@
@ Get the experimental Y vector
@
\<<
  VECSIZ \-> N                          @ For the number of data points.
  \<< 1 N FOR I
    \GSDAT { I 2 } GET                  @ Get the ith data point
    NEXT
    { N 1 } \->ARRY                     @ Convert to an array.
  \>>
\>>
  TMAL
@
@ Get TMAT*FITP
@
\<<
  UPDAT TMAT FITP *                     @ How simple!
\>>
TMAT
@
@ Get the 'design matrix' for the fit. (NRC p530)
@
\<<
  VECSIZ PARSIZ
  \-> N M
  \<<
        1 N FOR J
            \GSDAT { J 1 } GET          @ get the j'th X
            FLST 1 GET STO              @ store it
            1 M FOR K
               K EVLST                  @ evaluate the k'th function at x
            NEXT
        NEXT
    { N M } \->ARRY
  \>>
\>>
TMSIG
@
@ Get the 'design matrix' for the fit. (NRC p530)
@ Divide by sigma^2 if there is sigma data present.
@
\<<
  VECSIZ PARSIZ
  \-> N M
  \<<
    IF \GSDAT SIZE 2 GET 3 < THEN
        1 N FOR J                       @ Just do TMAT
            \GSDAT { J 1 } GET          @ get the j'th X
            FLST 1 GET STO              @ store it
            1 M FOR K
               K EVLST                  @ evaluate the k'th function at x
            NEXT
        NEXT
    ELSE
        1 N FOR J                       @ Do the sigma division.
            \GSDAT { J 1 } GET          @ get the j'th X
            FLST 1 GET STO              @ store it
            \GSDAT { J 3 } GET 2 ^
            \-> sigmaVal
            \<<
              1 M FOR K
                K EVLST sigmaVal /      @ evaluate the k'th function at x
              NEXT
            \>>
        NEXT
    END
    { N M } \->ARRY
  \>>
\>>
UPDAT
@
@ Update the paramter list FITP with the current values in the
@ named variables.
@
\<<
  1 PARSIZ FOR I
    I VARNAM RCL                        @ Get the ith variable
  NEXT
  { PARSIZ 1 } \->ARRY                  @ Convert to an array
  'FITP' STO                            @ Store the result
\>>
VARNAM
@
@ Get the i'th parameter's variable name.
@
\<<
  \-> I                                 @ Get I
  \<<
    NFLST 2 GET                         @ Get variable list
    I GET                               @ Get the I'th variable name
  \>>
\>>
VECSIZ
@
@ Get the size of a data vector.
@
\<<
  \GSDAT SIZE 1 GET                     @ Get it
\>>
XVEC
@
@ Get the vector of x values for this data set.
@
\<<
  VECSIZ \-> N                          @ Get the number of data points
  \<<
    1 N FOR I                           @ For each data point
      \GSDAT { I 1 } GET                @ Get x
    NEXT
    { N 1 } \->ARRY                     @ vectorize it.
  \>>
\>>
YSTAR
@
@ YStar generates the model's idea of the best fit value
@ for all the y's.
@
\<<
  VECSIZ \-> N                          @ Get the number of data points
  \<<
    1 N FOR I                           @ for each data point
      \GSDAT { I 1 } GET                @ get the i'th X
      NFLST 1 GET STO                   @ save it
      GFUNC EVAL \->NUM                 @ get the function result
    NEXT
    { N 1 } \->ARRY                     @ Convert to a vector
  \>>
\>>
YVEC
@
@ Get the vector used in the Least Squares Algorithm.
@ This is really the difference between the real data (REALY)
@ the current model prediction (YSTAR) with the design matrix*fitp
@ added in. In a linear model YStar = TMAT*FITP so that YVEC = REALY.
@ In a non-linear situation, you need to be more careful.
@
\<<
  REALY YSTAR - TMAL +                  @ Do it.
\>>
END
END_RPL