Application of regression analysis and classification trees in calculating additional population risk of ischemic heart disease

Free access

Our goal was to compare logistic regression analysis and classification trees as tools for calculating additional population risk, using ischemic heart disease (IHD) as an example. The study object was a random population sample of men and women aged 25-64 from the Kemerovo region (1,628 people), drawn within the ESSE-RF multicenter epidemiological study. We considered the following IHD risk factors: lipid metabolism parameters, arterial hypertension, lifestyle factors, psycho-emotional characteristics, and social parameters. IHD occurrence was assessed by the sum of three epidemiological criteria: ECG changes coded according to the Minnesota code, the Rose questionnaire, and a history of myocardial infarction. We calculated the additional population IHD risk determined by risk factors using unified original algorithms but two different statistical analysis techniques: logistic regression analysis and classification trees. We built mathematical models of IHD probability as a function of risk factors, with predictive significance of 83.8% for logistic regression analysis and 71.9% for classification trees. The two techniques show different contributions of individual risk factors to IHD prevalence, which results from the absence of correlation between them. With both techniques, the IHD risk additional to the population level and determined by risk factors changed across sex and age groups from negative values in groups younger than 45 to positive values in older people. With both techniques, the additional IHD risk increased almost linearly across age groups, with slight deviations. The difference between the additional population risk values calculated with the two techniques was insignificant and, as a rule, did not exceed 1.5%. Consequently, both techniques give similar results and can be used equally well in calculating IHD population risk.
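The comparison described above can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes that "additional risk" for a group means the group's mean model-predicted IHD probability minus the population-wide mean, expressed in percentage points. The function name `additional_risk`, the two age groups, and all probabilities are hypothetical values chosen for illustration, not study data.

```python
from statistics import mean


def additional_risk(predicted, groups):
    """Group mean predicted probability minus the population mean, in percentage points.

    Assumption: the population baseline is the mean predicted probability
    over the whole sample; the original study may define it differently.
    """
    baseline = mean(predicted)
    by_group = {}
    for p, g in zip(predicted, groups):
        by_group.setdefault(g, []).append(p)
    return {g: 100 * (mean(ps) - baseline) for g, ps in by_group.items()}


# Hypothetical per-person IHD probabilities from two already fitted models.
groups = ["25-44", "25-44", "25-44", "45-64", "45-64", "45-64"]
p_logit = [0.05, 0.08, 0.11, 0.20, 0.24, 0.28]  # stand-in for logistic regression output
p_tree = [0.06, 0.06, 0.12, 0.18, 0.26, 0.26]   # stand-in for classification tree output

risk_logit = additional_risk(p_logit, groups)
risk_tree = additional_risk(p_tree, groups)

# Largest disagreement between the two techniques across groups.
max_gap = max(abs(risk_logit[g] - risk_tree[g]) for g in risk_logit)

for g in risk_logit:
    print(f"{g}: logit {risk_logit[g]:+.1f} pp, tree {risk_tree[g]:+.1f} pp")
print(f"max difference: {max_gap:.2f} pp")
```

With these made-up inputs the younger group comes out below the population level and the older group above it under both stand-in models, which is the qualitative pattern the abstract reports. The numbers themselves carry no evidential weight.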


Regression analysis, risk factor, ischemic heart disease, population risk, predictive models, statistical analysis techniques


IDR: 14238019   |   DOI: 10.21668/health.risk/2017.3.04

Research article