quietly: /// infile ptid mridate age male race weight height packyrs yrsquit alcoh physact /// chf chd stroke diabetes genhlth ldl alb crt plt sbp aai fev dsst atrophy /// whgrd numinf volinf obstime death /// using http://www.emersonstatistics.com/datasets/mri.txt list in 1 drop in 1 replace obstime= obstime / 365.25 stset obstime death tabulate race diabetes, row col chi logit diabetes i.race logistic diabetes i.race logit diabetes ib2.race logistic diabetes ib2.race egen ldlCTG = cut(ldl), at(0 70 100 130 160 190 250) /// Descriptive statistics for censoring distribution bysort death: tabstat obstime, stat(n max) col(stat) by(ldlCTG) g censor = 1 - death stset obstime censor sts graph, t1("Censoring Distribution") stci, p(10) stci, p(25) stci, p(50) stci, p(75) stci, p(90) stci, rmean /// Descriptive statistics for LDL tabstat ldl, stat(n mean sd min q max) col(stat) list obstime death if ldl==. /// Descriptive statistics for survival by LDL stset obstime death sts graph, by(ldlCTG) plotopts(lwid(1.25)) ylabel(0.5(0.1)1.0) /// legend(label(1 "LDL 11 - 69 mg/dL") label(2 "LDL 70 - 99 mg/dL") /// label(3 "LDL 100 - 129 mg/dL") label(4 "LDL 130 - 159 mg/dL") /// label(5 "LDL 160 - 189 mg/dL") label(6 "LDL 190 - 247 mg/dL")) sts list, by(ldlCTG) at(2 5) sts list if ldl!=., at(2 5) stci, p(10) by(ldlCTG) stci, p(20) by(ldlCTG) /// To compute restricted mean I restrict observations to 5.75 years based on censoring /// distribution within the LDL strata g obs575= obstime replace obs575= 5.75 if obstime > 5.75 g death575= death replace death575= 0 if obstime > 5.75 stset obs575 death575 stci, rmean by(ldlCTG) stcox i.ldlCTG, robust predict dummyfit egen grbg=mean(dummyfit) if ldl==160 egen grbg2= mean(grbg) replace dummyfit= dummyfit / grbg2 drop grbg grbg2 stcox ldl i.ldlCTG, robust testparm i.ldl* mkspline ldl0 70 ldl70 100 ldl100 130 ldl130 160 ldl160 190 ldl190= ldl stcox ldl0 ldl70 ldl100 ldl130 ldl160 ldl190, robust predict splinefit egen grbg=mean(splinefit) if ldl==160 egen grbg2= mean(grbg) replace splinefit= splinefit / grbg2 drop grbg grbg2 test ldl0=ldl70=ldl100=ldl130=ldl160=ldl190 stcox ldl ldl0 ldl70 ldl100 ldl130 ldl160 ldl190, robust test ldl0 ldl70 ldl100 ldl130 ldl160 ldl190 stcox ldl0 ldl70 ldl100 ldl130 ldl160, robust stcox ldl predict linearfit egen grbg=mean(linearfit) if ldl==160 egen grbg2= mean(grbg) replace linearfit= linearfit / grbg2 drop grbg grbg2 g logldl= log(ldl) stcox logldl predict logfit egen grbg=mean(logfit) if ldl==160 egen grbg2= mean(grbg) replace logfit= logfit / grbg2 drop grbg grbg2 g ldlsqr= ldl^2 stcox ldl ldlsqr predict quadfit egen grbg=mean(quadfit) if ldl==160 egen grbg2= mean(grbg) replace quadfit= quadfit / grbg2 drop grbg grbg2 twoway (scatter linearfit dummyfit quadfit logfit splinefit ldl, /// color(blue pink green orange black yellow) /// legend(label(1 "Linear fit") label(2 "Dummy fit") label(3 "Quad fit") /// label(4 "Log fit") label(5 "Spline fit")))