A major life depression is associated with a higher daily consumption of alcohol
All data from the NESARC Study
Categorical Explanatory Variable
The explanatory variable is categorical and coded No=0, Yes=1, so subjects with no lifetime major depression form the baseline (0).
3549-3549
MAJORDEPLIFE
MAJOR DEPRESSION - LIFETIME (NON-HIERARCHICAL)
----------------------------------------------
35254 0. No
7839 1. Yes
----------------------------------------------
Quantitative Response Variable
The response variable is the average daily volume of ethanol (alcohol) consumed in the past year.
3675-3682
ETOTLCA2
AVERAGE DAILY VOLUME OF ETHANOL CONSUMED IN PAST YEAR, FROM ALL TYPES OF ALCOHOLIC BEVERAGES COMBINED
(NOTE: Users may wish to exclude outliers)
--------------------------------------------------
0.0003 - 219.9555 Ounces of ethanol
Blank Unknown
--------------------------------------------------
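A large share of respondents have no recorded ethanol intake (blank in the codebook), many of them presumably abstinent: the regression below uses only 26,655 of the 43,093 respondents. We don't yet know whether these respondents are more or less likely to have had a major depression. It would be interesting to check which of the two groups they mostly belong to.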
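One way to look into this is to compare the share of missing ETOTLCA2 values between the two depression groups. The following is only a minimal sketch: it assumes that a blank ETOTLCA2 becomes NaN after the numeric conversion, it runs on a small made-up table instead of the NESARC file, and the helper name share_without_intake is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for nesarc_pds.csv; the values are made up.
toy = pd.DataFrame({
    "MAJORDEPLIFE": [0, 0, 0, 0, 1, 1, 1, 1],
    "ETOTLCA2": [0.5, np.nan, np.nan, 1.2, 0.7, np.nan, 2.0, 0.1],
})

def share_without_intake(df):
    """Fraction of respondents per depression group with no recorded ETOTLCA2."""
    missing = df["ETOTLCA2"].isna()
    return missing.groupby(df["MAJORDEPLIFE"]).mean()

shares = share_without_intake(toy)
print(shares)
```

On the real data the same function could be called on the data frame after the to_numeric conversion. Since the codebook labels blanks as "Unknown", the result would describe respondents without a recorded intake, not strictly abstainers.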
Linear Regression Test
                            OLS Regression Results
==============================================================================
Dep. Variable:               ETOTLCA2   R-squared:                       0.001
Model:                            OLS   Adj. R-squared:                  0.001
Method:                 Least Squares   F-statistic:                     18.86
Date:                Fri, 29 Jan 2016   Prob (F-statistic):           1.42e-05
Time:                        15:57:58   Log-Likelihood:                -59022.
No. Observations:               26655   AIC:                         1.180e+05
Df Residuals:                   26653   BIC:                         1.181e+05
Df Model:                           1
Covariance Type:            nonrobust
================================================================================
                   coef    std err          t      P>|t|      [95.0% Conf. Int.]
--------------------------------------------------------------------------------
Intercept        0.5366      0.015     35.377      0.000         0.507     0.566
MAJORDEPLIFE     0.1474      0.034      4.342      0.000         0.081     0.214
==============================================================================
Omnibus:                    84158.156   Durbin-Watson:                   2.010
Prob(Omnibus):                  0.000   Jarque-Bera (JB):      20279387185.774
Skew:                          50.216   Prob(JB):                         0.00
Kurtosis:                    4274.926   Cond. No.                         2.62
==============================================================================
The R-squared is very low, suggesting that only about 0.1% of the variance in ethanol intake can be explained by a lifetime major depression. Still, the p-value of the F-statistic (1.42e-05) is so low that we can consider this small effect statistically significant.
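Because the summary rounds R-squared to three decimals, it can help to recover a more precise value. In a regression with a single predictor, R-squared follows from the F-statistic and the residual degrees of freedom as R² = F / (F + df_resid). The sketch below only redoes this arithmetic with the numbers printed in the summary; the variable names are my own.

```python
# Values copied from the OLS summary above.
f_statistic = 18.86
df_residuals = 26653

# For one predictor: F = R^2 / ((1 - R^2) / df_resid), solved for R^2.
r_squared = f_statistic / (f_statistic + df_residuals)

print("R-squared: %.6f (%.3f%% of the variance)" % (r_squared, 100 * r_squared))
```

The value comes out at roughly 0.0007, which the summary table rounds to 0.001.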
With such a low R-squared I would not read much more into the model. Taking the coefficient at face value, though, a lifetime major depression predicts an increase of Beta=0.1474 ounces of ethanol per day (95% confidence interval 0.081 to 0.214), with high statistical significance (p<0.001).
Graph: a major life depression could increase average daily ethanol intake from about 0.54 to 0.68 ounces
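The two group means can be reproduced from the regression coefficients, since with a 0/1 predictor the model predicts the intercept for the baseline group and intercept plus coefficient for the other group. This is a small worked example using the coefficients from the summary; the function name predicted_intake is hypothetical.

```python
# Coefficients copied from the OLS summary (ounces of ethanol per day).
intercept = 0.5366
coef_majordeplife = 0.1474

def predicted_intake(majordeplife):
    """Predicted average daily ethanol intake for MAJORDEPLIFE = 0 or 1."""
    return intercept + coef_majordeplife * majordeplife

no_depression = predicted_intake(0)
depression = predicted_intake(1)
relative_increase = coef_majordeplife / intercept

print("No major depression: %.4f oz/day" % no_depression)
print("Major depression:    %.4f oz/day" % depression)
print("Relative increase:   %.1f%%" % (100 * relative_increase))
```

The coefficient is an absolute difference in ounces per day; relative to the baseline it corresponds to an increase of roughly 27%.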
Program Code and Output
In [1]:
%matplotlib inline
import numpy
import pandas
import statsmodels.api as sm
import seaborn
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
In [2]:
# bug fix for display formats to avoid run time errors
pandas.set_option('display.float_format', lambda x:'%.2f'%x)
In [3]:
data = pandas.read_csv('nesarc_pds.csv', low_memory=False)
In [4]:
#setting variables you will be working with to numeric
data['MAJORDEPLIFE'] = pandas.to_numeric(data['MAJORDEPLIFE'], errors='coerce')
In [5]:
data['ETOTLCA2'] = pandas.to_numeric(data['ETOTLCA2'], errors='coerce')
In [6]:
# distribution of the heaviest drinkers only (more than 10 ounces of ethanol per day)
subGroupEthanol = data.loc[data['ETOTLCA2'] > 10.0, 'ETOTLCA2']
seaborn.distplot(subGroupEthanol)
Out[6]:
In [7]:
reg1 = smf.ols('ETOTLCA2 ~ MAJORDEPLIFE', data=data).fit()
print (reg1.summary())
In [8]:
# listwise deletion for calculating means for regression model observations
sub1 = data[['ETOTLCA2', 'MAJORDEPLIFE']].dropna()
# group means & sd
print ("Mean")
ds1 = sub1.groupby('MAJORDEPLIFE').mean()
print (ds1)
print ("Standard deviation")
ds2 = sub1.groupby('MAJORDEPLIFE').std()
print (ds2)
# bivariate bar graph
# factorplot was renamed to catplot in seaborn 0.9
seaborn.catplot(x="MAJORDEPLIFE", y="ETOTLCA2", data=sub1, kind="bar", ci=None)
plt.xlabel('Major Life Depression')
plt.ylabel('Mean average daily volume of ethanol consumed in past year (ounces)')
Out[8]: