R glm standard error estimate differences to SAS PROC GENMOD

Posted by Michelle on Stack Overflow See other posts from Stack Overflow or by Michelle
Published on 2011-11-27T22:25:55Z Indexed on 2011/11/28 1:50 UTC
Read the original article Hit count: 364

Filed under:

glm

I am converting a SAS PROC GENMOD example into R, using glm in R. The SAS code was:

proc genmod data=data0 namelen=30;
model boxcoxy=boxcoxxy ~ AGEGRP4 + AGEGRP5 + AGEGRP6 + AGEGRP7 + AGEGRP8 + RACE1 + RACE3 + WEEKEND + 
SEQ/dist=normal;
FREQ REPLICATE_VAR;  
run;

My R code is:

parmsg2 <- glm(boxcoxxy ~ AGEGRP4 + AGEGRP5 + AGEGRP6 + AGEGRP7 + AGEGRP8 + RACE1 + RACE3 + WEEKEND + 
SEQ , data=data0, family=gaussian, weights = REPLICATE_VAR)

When I use summary(parmsg2) I get the same coefficient estimates as in SAS, but my standard errors are wildly different.

The summary output from SAS is:

Name         df   Estimate      StdErr    LowerWaldCL  UpperWaldCL      ChiSq   ProbChiSq
Intercept    1   6.5007436    .00078884      6.4991975    6.5022897    67911982 0
agegrp4      1   .64607262    .00105425      .64400633    .64813891   375556.79 0
agegrp5      1    .4191395    .00089722      .41738099    .42089802   218233.76 0
agegrp6      1  -.22518765    .00083118     -.22681672   -.22355857   73401.113 0
agegrp7      1  -1.7445189    .00087569     -1.7462352   -1.7428026   3968762.2 0
agegrp8      1  -2.2908855    .00109766     -2.2930369   -2.2887342   4355849.4 0
race1        1  -.13454883    .00080672     -.13612997   -.13296769    27817.29 0
race3        1  -.20607036    .00070966     -.20746127   -.20467944   84319.131 0
weekend      1    .0327884    .00044731       .0319117    .03366511   5373.1931 0
seq2          1 -.47509583    .00047337     -.47602363   -.47416804   1007291.3 0
Scale         1 2.9328613     .00015586      2.9325559    2.9331668     -127

The summary output from R is:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  6.50074    0.10354  62.785  < 2e-16 
AGEGRP4      0.64607    0.13838   4.669 3.07e-06 
AGEGRP5      0.41914    0.11776   3.559 0.000374 
AGEGRP6     -0.22519    0.10910  -2.064 0.039031  
AGEGRP7     -1.74452    0.11494 -15.178  < 2e-16
AGEGRP8     -2.29089    0.14407 -15.901  < 2e-16
RACE1       -0.13455    0.10589  -1.271 0.203865    
RACE3       -0.20607    0.09315  -2.212 0.026967 
WEEKEND      0.03279    0.05871   0.558 0.576535 
SEQ         -0.47510    0.06213  -7.646 2.25e-14

The importance of the difference in the standard errors is that the SAS coefficients are all statistically significant, but the RACE1 and WEEKEND coefficients in the R output are not. I have found a formula to calculate the Wald confidence intervals in R, but this is pointless given the difference in the standard errors, as I will not get the same results.

Apparently SAS uses a ridge-stabilized Newton-Raphson algorithm for its estimates, which are ML. The information I read about the glm function in R is that the results should be equivalent to ML. What can I do to change my estimation procedure in R so that I get the equivalent coefficents and standard error estimates that were produced in SAS?

To update, thanks to Spacedman's answer, I used weights because the data are from individuals in a dietary survey, and REPLICATE_VAR is a balanced repeated replication weight, that is an integer (and quite large, in the order of 1000s or 10000s). The website that describes the weight is here. I don't know why the FREQ rather than the WEIGHT command was used in SAS. I will now test by expanding the number of observations using REPLICATE_VAR and rerunning the analysis.

Developer IT

R glm standard error estimate differences to SAS PROC GENMOD - Developer IT

R glm standard error estimate differences to SAS PROC GENMOD

r

sas

glm

Related posts about r

Related posts about sas

Scheduling of SAS generated codes in SAS DI studio to IBM TWS

Tutoriel SAS : Lire des classeurs Excel 2007 avec SAS 9.1, par Stéphane Colas

Livre SAS : Économétrie théorie et application sous SAS, par V. Delsart, A. Rys et N. Vaneecloo

SAS unifie la vision des risques pour les banques avec SAS Risk Management for Banking

SAS Expanders vs Direct Attached (SAS)?

Categories cloud