Right, it is not just an overlap of the CIs; it is what you sketched in your answer. The grader generalized his justification to any overlap indicating p > .05, but in case we had a language issue and he meant fully overlapping, it is good that you modeled that exact case. I have emailed the grader with your simulation code and results (not taking credit for them myself), so let's see what he says. My suspicion is that this is a biostatistics-in-epidemiology course, so the grader may be more of an epidemiologist than a statistician. But maybe there is something we are overlooking.
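(For anyone reading along, here is a minimal sketch of the kind of simulation being discussed. It is not Dave's actual code; the sample sizes, SDs, and the use of a pooled-variance t-test are my own assumptions. It just checks how often the pooled test gives p < .05 even when one 95% CI fully contains the other.)

```python
# Minimal sketch, not Dave's actual code. Assumes BMI-like normal data with
# the SAME true mean in both groups; all parameters are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_big, n_small = 1000, 8        # very unequal sample sizes
sd_big, sd_small = 4.0, 8.0     # small group more variable (assumption)
mu = 25.0                       # same true mean in both groups

def ci95(x):
    """Two-sided 95% t-based CI for the mean of sample x."""
    m, se = x.mean(), stats.sem(x)
    h = se * stats.t.ppf(0.975, len(x) - 1)
    return m - h, m + h

contained = rejected_given_contained = 0
for _ in range(20000):
    a = rng.normal(mu, sd_big, n_big)
    b = rng.normal(mu, sd_small, n_small)
    lo_a, hi_a = ci95(a)
    lo_b, hi_b = ci95(b)
    if lo_b <= lo_a and hi_a <= hi_b:     # big group's CI fully inside small group's
        contained += 1
        p = stats.ttest_ind(a, b).pvalue  # pooled-variance t-test (scipy default)
        rejected_given_contained += (p < 0.05)

print(f"P(p < .05 | one CI fully contains the other) ≈ "
      f"{rejected_given_contained / contained:.2f}")
```

Under these made-up settings the conditional rate comes out around 0.3, in the same ballpark as the 1-in-3-or-4 figure mentioned below.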
WHuber, I think you are confused about how the word "assume" is used here versus in math proofs. In proofs, you say "let us assume that x = ..." and carry on from there, with no claim that every x has to follow that rule. Here, "assume" appears in a statement of causality: if A is true, then we therefore assume that B is also true. This latter use does in fact involve being very confident; you can look it up in a dictionary if you doubt me. The thread you linked to at first is not helpful to me because it does not speak to the sort of samples I am dealing with. But I appreciate your help!
FWIW, the grader did not say that the CIs had to be contained one within the other; he just said they had to overlap (at all) and p would be assumed to be > .05. To be fair, English is not his native language and it is mine, so it is possible that he meant total overlap, as in the example. Still, in English, as far as I know, saying that we "assume" something is the case means we can be very confident that it is the case, and an error rate of about 30% doesn't fit in that semantic field.
I don't quite understand how the answer can be correct as given if Dave's simulation is correct and the odds of p being < .05 are about 1 in 3 or 4. That is a pretty hefty chunk of cases in which the assumption would be wrong, so we should not be making that assumption. Isn't it a fundamental statistical rule of thumb that p has to be .05 or less? In line with that sort of tolerance, we should not assume that overlapping CIs show that p > .05 unless the simulation returned that result about 95% of the time or more.
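(A back-of-envelope calculation shows where a rate of roughly 1 in 3 could come from, assuming, and this is my assumption, not something stated here, that the simulation used a pooled-variance t-test with very unequal sample sizes and the smaller group having the larger variance.)

```python
# Back-of-envelope check of the "about 1 in 3" rate. Settings hypothetical:
# n = 1000 vs n = 8, sd = 4 vs sd = 8, same true mean. The pooled SE
# understates the true SE of the mean difference, inflating rejections.
import numpy as np
from scipy import stats

n1, sd1 = 1000, 4.0
n2, sd2 = 8, 8.0

# pooled estimate of the SE of the mean difference (what the pooled test uses)
sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
se_pooled = np.sqrt(sp2 * (1 / n1 + 1 / n2))

# true SE of the mean difference
se_true = np.sqrt(sd1**2 / n1 + sd2**2 / n2)

# test rejects when |diff| > 1.96 * se_pooled, but diff ~ N(0, se_true^2)
z = 1.96 * se_pooled / se_true
print(f"Type I error rate ≈ {2 * stats.norm.sf(z):.2f}")  # ≈ 0.32, i.e. ~1 in 3
```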
Thank you very much! Out of curiosity, and to help me understand why it is so, is the hand-waving explanation for this that the enveloping CI is only so big because N is so small? Or what is the underlying structural reason for this happening?
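(For what it's worth, the usual textbook argument, not specific to this thread, is that the overlap check and the test use different yardsticks. With 95% CIs $\bar{x}_i \pm t_i\,\mathrm{se}_i$:

$$\text{CIs overlap} \iff |\bar{x}_1 - \bar{x}_2| < t_1\,\mathrm{se}_1 + t_2\,\mathrm{se}_2, \qquad \text{test rejects} \iff |\bar{x}_1 - \bar{x}_2| > t^{*}\sqrt{\mathrm{se}_1^2 + \mathrm{se}_2^2}.$$

Since $\sqrt{\mathrm{se}_1^2 + \mathrm{se}_2^2} \le \mathrm{se}_1 + \mathrm{se}_2$, there is a band of mean differences where the CIs overlap and the test still rejects. In the fully-contained case the extra ingredient is the pooled variance: with a huge low-variance group and a tiny high-variance group, the pooled SE can sit well below the small group's SE, so the test can reject even when the small group's CI envelops the other. So it is not only that N is small; it is also which standard error the test actually uses.)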
WHuber, I looked at that question you linked to, but the discussion seemed to be about cases where the sample sizes and variances were about equal, which is not the situation here. So I am still wondering about this sort of case.
WHuber, I think we need more data in the "lot of worries" group to say anything meaningful about it. I don't trust data that is tiny in quantity and also seems to be 12.5% outliers. That goes especially for something like BMI, which is very variable and whose association with other variables is usually pretty weak. I think you need quite a lot more data in such a case to be able to do this kind of comparison in a meaningful way. But maybe I'm wrong.
My concern is more about whether it makes sense to apply the math of overlapping confidence intervals and the p-value of a t-test given the nature of the samples. I could be wrong, but intuition says that the CI of the small sample is wide because of the small N and the possible 1-in-8 rate of outliers in the data, and that comparing its CI with the large sample's is kind of comparing apples to oranges. To me, step one in analyzing data is always to check whether the test/methodology makes sense, and only then, if it does, to apply it.
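(To make that intuition concrete, here is a quick check of how much a single outlier in an n = 8 sample can widen a 95% CI; the BMI values are made up.)

```python
# Quick illustration: with n = 8, a single outlier (1/8 of the data) can
# inflate the 95% CI dramatically. BMI values below are hypothetical.
import numpy as np
from scipy import stats

def ci95(x):
    """Two-sided 95% t-based CI for the mean of sample x."""
    x = np.asarray(x, dtype=float)
    h = stats.sem(x) * stats.t.ppf(0.975, len(x) - 1)
    return x.mean() - h, x.mean() + h

clean = [24, 25, 23, 26, 27, 24, 25, 26]   # a tight n = 8 sample
with_outlier = clean[:-1] + [45]           # same n, one extreme BMI value

print("clean:        CI = (%.1f, %.1f)" % ci95(clean))
print("with outlier: CI = (%.1f, %.1f)" % ci95(with_outlier))
```

In this toy example the single extreme value widens the CI from roughly 2 BMI units to roughly 12.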