Jun-23-2018, 01:52 PM
Dear community,
we are working on an assignment with different statistical exercises that need to be programmed in Python.
It's going pretty well; we are only stuck on one technical question.
We imported a dataset from sklearn and defined data frames. Now we want to test the means of two different columns against each other (knowing which statistical test to use is definitely not our problem). We simply do not know how to compare two columns from different data frames. Can you please help us? Our attempt was:

    Chas_0 = regressors[regressors['CHAS'] == 0.0]['DIS']
    Chas_1 = regressors[regressors['CHAS'] == 1.0][outcome[outcome'MEDV']]
    print(Chas_0.mean(), Chas_1.mean())

This follows what we once learned in an exercise, but there both columns were in one data frame.
For more background, our whole answer to the exercise:

    from sklearn import datasets
    import pandas as pd

    boston = datasets.load_boston()
    regressors = pd.DataFrame(boston.data, columns=boston.feature_names)
    outcome = pd.DataFrame(boston.target, columns=["MEDV"]).values[:]
    NOX = pd.DataFrame(boston.target, columns=["NOX"]).values[:]

    ## 2 Two sided T-Test
    import numpy as np
    import scipy.stats as stats

    Chas_0 = regressors[regressors['CHAS'] == 0.0]['DIS']
    Chas_1 = regressors[regressors['CHAS'] == 1.0][outcome[outcome'MEDV']]
    print(Chas_0.mean(), Chas_1.mean())

    ##3 Wilcoxon Test
    stats.wilcoxon(regressors[regressors['DIS']], outcome[outcome['MEDV']])

Thank you in advance,
Holly
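
One possible way to set this up is sketched below, under a few assumptions. The data is a random synthetic stand-in rather than the Boston dataset, because newer scikit-learn releases no longer ship `load_boston`. `outcome` is kept as a DataFrame (without `.values[:]`) so that it shares its row index with `regressors`. The goal is assumed to be comparing `MEDV` between the two `CHAS` groups. The names `mask`, `medv_chas_0`, `medv_chas_1`, `t_result` and `w_result` are illustrative only.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the Boston data: the column names follow the post,
# the values are random and purely illustrative (assumption).
rng = np.random.default_rng(0)
n = 200
regressors = pd.DataFrame({
    "CHAS": rng.integers(0, 2, size=n).astype(float),
    "DIS": rng.uniform(1.0, 12.0, size=n),
})
# Keep the outcome as a DataFrame so it shares the row index with regressors.
outcome = pd.DataFrame({"MEDV": rng.normal(22.0, 9.0, size=n)})

# A boolean mask built from one frame can select rows of another frame,
# provided both frames have the same index.
mask = regressors["CHAS"] == 1.0
medv_chas_0 = outcome.loc[~mask, "MEDV"]
medv_chas_1 = outcome.loc[mask, "MEDV"]
print(medv_chas_0.mean(), medv_chas_1.mean())

# Two-sided t-test on the two independent groups.
t_result = stats.ttest_ind(medv_chas_0, medv_chas_1)
print(t_result.statistic, t_result.pvalue)

# The Wilcoxon signed-rank test takes two paired 1-D samples of equal length,
# so the columns are passed directly instead of frame[frame[column]].
w_result = stats.wilcoxon(regressors["DIS"], outcome["MEDV"])
print(w_result.statistic, w_result.pvalue)
```

The same mask works for any column of either frame, for example `regressors.loc[~mask, "DIS"]`. If the two frames should live together anyway, `pd.concat([regressors, outcome], axis=1)` would give a single frame in which the one-frame approach from the earlier exercise applies again.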