ChiefsPlanet - Science A question for you math/stats whizzes...

ChiefsPlanet (https://chiefsplanet.com/BB/index.php)

A question for you math/stats whizzes... (https://chiefsplanet.com/BB/showthread.php?t=247963)

Fat Elvis

08-02-2011 09:18 PM

A question for you math/stats whizzes...

OK, so it has been many, many (many) years since I've had a statistics course so I've forgotten how to figure some pretty simple problems out. Because of my nearing senility, I was wondering if some of you folks could help clear the cobwebs.

Suppose I have a four-sided die that is equally weighted; each side has a .25 probability of showing up on any roll. (I know that I'm wording this wrong.) How many rolls would it take to say that a number did not appear for 2.57 standard deviations?

(I have a couple more questions as well, but I will wait for an explanation for this question first)

Thanks

Fat E

Bugeater

08-02-2011 09:46 PM

48÷2(9+3)

Fat Elvis

08-02-2011 09:52 PM

Quote:

Originally Posted by Bugeater (Post 7797115)

48÷2(9+3)

Could you explain how you arrived at that? Thanks.

loochy

08-02-2011 10:07 PM

Ask Frankie. He has a super IQ and he knows everything.

Fat Elvis

08-02-2011 10:28 PM

I came up with 16 rolls. At 2.57 standard deviations, on a standard bell curve, there is a 99% probability a given outcome would be expected. If there is a 25% chance that a number would appear (four-sided die), wouldn't that mean that there is a 75% chance that one of the other numbers appears, and that there is a 75% chance on each successive roll that one of the other numbers appears?

In other words, .75 x .75 x .75...until there is only a .01 probability that the next roll will be one of the three numbers without the original number being rolled?

Does that make sense? Am I completely off base?
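
A minimal sketch of the arithmetic described in this post, assuming the goal is the smallest number of consecutive rolls for which the chance of never seeing the chosen number drops to 1% or less. The function name and the strict "at or below the threshold" cutoff are assumptions made for illustration, not part of the original question.

```python
def rolls_until_miss_prob_below(p_hit: float, threshold: float) -> int:
    """Smallest n such that (1 - p_hit) ** n <= threshold."""
    p_miss = 1.0 - p_hit
    n = 0
    prob = 1.0  # probability of missing the number on all n rolls so far
    while prob > threshold:
        prob *= p_miss
        n += 1
    return n


if __name__ == "__main__":
    # Chance of 16 straight misses on a fair four-sided die
    print(0.75 ** 16)
    # Smallest n with a miss streak probability of at most 1%
    print(rolls_until_miss_prob_below(0.25, 0.01))
```

Under these assumptions, 16 rolls leaves a miss probability a hair above 1%, so a strict cutoff gives 17; 16 is the answer if "about 1%" is close enough.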

DanT

08-02-2011 11:06 PM

Quote:

Originally Posted by Fat Elvis (Post 7797177)

Hi Fat Elvis,

The problem involves a discrete distribution that puts 25% probability on each of the integers from 1 to 4. The course materials should have provided you a formula for computing the standard deviation for such a distribution, right? If you were to plot the density of that distribution, you would see that it looks nothing like a normal curve. Instead, the distribution puts the same amount of weight on each of only 4 points. You can compute the standard deviation for the problem's discrete uniform distribution using the formula you have for the standard deviation of a discrete probability distribution. (Alternatively, you can simply use one of Excel's two functions for calculating POPULATION standard deviations and apply it to the four integers, like this:

=stdevp(1,2,3,4)
)

If you do that, you will learn what the population standard deviation is for that distribution. Once you know that, then you just need to compute what 2.57 of those standard deviations would equal. Then you just need to find the next greatest integer, aka the ceiling, aka the minimum integer whose value is at least as big as 2.57 * SD, where SD is the standard deviation whose value you computed at the beginning of the problem.

If I'm interpreting that problem correctly, I suspect that the intent of the problem is to help the student get familiar with translating a word problem into a math problem and then apply concepts you learned early in the class about standard deviations and other parameters that describe key features of distributions.

Hope this helps get you started on the other problems!
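
The steps in this post can be sketched in a few lines of Python, assuming the discrete uniform reading of the problem. `statistics.pstdev` is the standard library's population standard deviation, which plays the role of Excel's population function here; the variable names are illustrative.

```python
import math
import statistics

faces = [1, 2, 3, 4]

# Population (not sample) standard deviation of the four equally likely faces
sd = statistics.pstdev(faces)

# 2.57 of those standard deviations, then the ceiling as described above
target = 2.57 * sd
answer = math.ceil(target)

print(sd, target, answer)
```

The population variance of the four faces is 1.25, so the standard deviation is its square root, and the ceiling of 2.57 times that value is 3 under this interpretation.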

CrazyPhuD

08-02-2011 11:09 PM

42 rolls.

DanT

08-02-2011 11:15 PM

Quote:

Originally Posted by DanT (Post 7797216)

I should also note that the wording for that problem could be crucial. I've never seen a problem worded quite that way, so because of that, I'd caution you to make sure that you understand what the problem is asking for. The approach you described in your most recent post is off-track for the problem you described in the topic thread header, but it would not be too far afield for attacking another sort of problem in elementary probability courses, one that concerns what are called negative binomial distributions, so make sure you know what the problem really is! ;)

Bugeater

08-02-2011 11:17 PM

Quote:

Originally Posted by DanT (Post 7797225)

That's exactly what I was thinking.

Shaid

08-03-2011 12:03 AM

Quote:

Originally Posted by Bugeater (Post 7797115)

48÷2(9+3)

LMAO

Ming the Merciless

08-03-2011 01:52 AM

18.41 rolls to be near 2.51 standard deviations--assuming a normal distribution

2.51 std deviations = ~.005 (half of a percent)

I am reading the question as this: how many times would I have to roll the dice to be at 2.51 std devs for not seeing that number come up..

so

.75 ^ x = .005

18.41 rolls in a row without N number showing up

disclaimer: I'm drunk, tired and going to bed LOL
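
A rough check of the numbers in this post, assuming a standard normal curve and the 2.57 figure from the original question (an assumption; the post itself writes 2.51). It computes the one-sided tail area beyond z and then solves .75 ^ x = .005 with logarithms.

```python
import math
from statistics import NormalDist

z = 2.57  # assumption: the z-value from the original question

# Area under the standard normal curve to the right of z
one_sided_tail = 1.0 - NormalDist().cdf(z)

# Solve 0.75 ** x = 0.005 for x: x = ln(0.005) / ln(0.75)
p_miss = 0.75
rolls = math.log(0.005) / math.log(p_miss)

print(one_sided_tail, rolls)
```

The tail area comes out near half a percent and x lands a little above 18.4, so in whole rolls that would be 19 consecutive misses under this reading.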

Ming the Merciless

08-03-2011 01:58 AM

Quote:

Originally Posted by Fat Elvis (Post 7797177)

No, just slightly, since it's more like 99.5 and change %

oh crap, I didn't read DanT's.......I guess it isn't a 'normal' distribution?

Third Eye

08-03-2011 02:16 AM

Quote:

Originally Posted by Pawnmower (Post 7797352)

As mentioned earlier, this is a uniform discrete distribution, not a normal distribution.

Ming the Merciless

08-03-2011 02:24 AM

Quote:

Originally Posted by Third Eye (Post 7797365)

As mentioned earlier, this is a uniform discrete distribution, not a normal distribution.

caught that 60 seconds b4 you posted, thx

Third Eye

08-03-2011 03:15 AM

My best guess, given the wording, is that we are working with a geometric distribution with p=.25 and trying to determine the minimum number of trials it will take for a first success that occurs more than 2.57 standard deviations away from the mean. The mean for a geometric distribution is 1/p, so 1/.25=4. The variance is given by (1-p)/p^2, or .75/.0625=12. Thus the sd is 12^.5, or approximately 3.464, and 2.57 sds is approximately 8.9. So, to be at least 2.57 sds away from the mean you need at least 4 + 9 rolls, or 13.
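
The calculation above can be reproduced directly, assuming the geometric distribution that counts total rolls up to and including the first success. The variable names are illustrative.

```python
import math

p = 0.25  # chance of the chosen number on any single roll

mean = 1 / p                  # expected rolls until the first success
variance = (1 - p) / p ** 2   # variance of the geometric distribution
sd = math.sqrt(variance)

offset = 2.57 * sd                    # 2.57 standard deviations, in rolls
rolls = int(mean) + math.ceil(offset) # mean plus the offset rounded up

print(mean, variance, sd, offset, rolls)
```

This mirrors the post's steps: mean 4, variance 12, standard deviation about 3.464, and 4 + 9 = 13 rolls.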

TimeForWasp

08-03-2011 04:28 AM

Wow , what a question. I was lost at OK,

DanT

08-03-2011 04:29 AM

Quote:

Originally Posted by Third Eye (Post 7797384)

Very nice. This could well be an answer to the original problem that Fat Elvis is trying to solve and it's the sort of alternative answer that I had in mind when I mentioned negative binomial distributions (of which the geometric distribution is a special case). It's why it's important to know what that problem's wording really is, because the problem as translated to the topic thread header is a little bit weird and seems to be lacking some crucial details.

The last sentence of the problem as posed by Fat Elvis is

"How [many] rolls would it take to say that a number did not appear for 2.57 standard deviations?"

Let me just make a couple of comments about where you have interpreted that sentence of the problem differently than I have.

You are interpreting the problem to concern the situation where the die roller has a particular value of the die in mind (e.g. a value of 3 dots) and is interested in the number of throws that could occur before that value first comes up. This is a reasonable interpretation. It leads to the crucial decision that the relevant standard deviation is that of a geometric distribution with parameter p=0.25, not that of a discrete uniform distribution on the integers from 1 to 4. This matters a lot, of course, because those two distributions have different standard deviations. Your calculation of the standard deviation for the distribution you picked is indeed correct. Now keep in mind that there are two different versions of geometrically distributed variables: one counts just the number of failures until the first success; the other counts the total number of throws, including the throw with the success, so it is always bigger by 1. You are using the latter version, the one that Wikipedia denotes with the letter X to distinguish it from the other version, which it labels with a Y:

http://en.wikipedia.org/wiki/Geometric_distribution

X & Y have different means, of course, because X is Y+1, but they have the same standard deviation. This is pertinent to Fat Elvis's problem because you are also assuming that the target value is 2.57 sd away from the mean, not just simply 2.57 sd. Note that the problem as translated by Fat Elvis doesn't have the phrase "away from the mean" in it. Still, I think the original problem, the one that Fat Elvis translated to the topic thread header, could have included that phrase. If it didn't, then under your interpretation of the problem one would simply report 9 as the value, not 9 + 4, an easy fix.

Man, no wonder statistics is so confusing to students. The wording of the problem really matters a lot! ;)
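
A small numerical check of the X versus Y point, assuming p = 0.25 and a truncated sum over the probability mass function (the cutoff of 500 terms is an arbitrary illustrative choice; the tail beyond it is negligible). The helper name is hypothetical.

```python
import math


def mean_and_sd(pmf_pairs):
    """Mean and standard deviation from a list of (value, probability) pairs."""
    mean = sum(k * w for k, w in pmf_pairs)
    var = sum((k - mean) ** 2 * w for k, w in pmf_pairs)
    return mean, math.sqrt(var)


p = 0.25
cutoff = 500  # truncation point for the infinite sum (assumption)

# Y = number of failures before the first success: P(Y = k) = (1 - p)^k * p
y_pmf = [(k, (1 - p) ** k * p) for k in range(cutoff)]

# X = total throws including the successful one, so X = Y + 1
x_pmf = [(k + 1, w) for k, w in y_pmf]

mean_y, sd_y = mean_and_sd(y_pmf)
mean_x, sd_x = mean_and_sd(x_pmf)

print(mean_y, sd_y)
print(mean_x, sd_x)
```

The two means differ by exactly 1 while the standard deviations agree, which is why the "away from the mean" wording changes the answer but the choice between X and Y does not change the size of 2.57 sd.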

BigRichard

08-03-2011 06:15 AM

Quote:

Originally Posted by Fat Elvis (Post 7797121)

Could you explain how you arrived at that? Thanks.

I think I can help you with this one, he took the standard deviation of this thread... http://chiefsplanet.com/BB/showthrea...ght=solve+math.

He then cross multiplied it with a filter evasion.

At that point you add 1.21 gigawatts to the flux capacitor and voila, you have your answer.

Dayze

08-03-2011 09:26 AM

Quote:

Originally Posted by Bugeater (Post 7797227)

That's exactly what I was thinking.

me too....

:evil:

Fat Elvis

08-03-2011 10:22 AM

Quote:

Originally Posted by DanT (Post 7797216)

DanT-

I really appreciate your help. It was late last night (late for me), so I really wasn't expressing myself very clearly. I'm not in a class now, and it has been over 25 years since I've taken a stats/probability class, so I've forgotten most everything. I think that is why the wording is pretty wonky. I was just looking at some old D&D dice that I had stumbled across while working in the basement, and for some reason it had me thinking about stats and probability. I think I worded things the way that I had because, in my mind, I was trying to generalize the concepts beyond the die/dice.

Perhaps you can clear some things up for me since my addled brain is a bit foggy. Let's assume that an event has a .25 probability of occurring; would the population size affect the mean and standard deviation? Perhaps it only affects variance? The smaller the population size, the greater the variance? This would be due to the Central Limit Theorem, correct?

I think Third Eye was getting what I was asking; I just did a poor job of asking the question. I have more questions though; I just need to relearn how to crawl before I start walking again with this....

Dartgod

08-03-2011 10:40 AM

A 4 sided die? What kind of two-dimensional world do you live in? I'll bet that thing is a bitch to play craps with.

DanT

08-03-2011 12:34 PM

Quote:

Originally Posted by Fat Elvis (Post 7797797)

In answer to your questions, you have to bear in mind that there is an important distinction between the population mean and population standard deviation on the one hand and the sample mean and sample standard deviation on the other. The parameters that describe the population distribution are fixed constants that may or may not be known to the investigator. You asked me to suppose that an event has a 0.25 probability of occurring. OK, well, you are now telling me the relevant POPULATION distribution: a Bernoulli with a probability of success of 0.25. Once I know that, I can compute the population mean, which will be 0.25, and the population standard deviation, which will be the square root of the variance, which equals the square root of 0.25 * ( 1 - 0.25 ).

Now, in real life, we often don't know what the parameters are for the relevant population distribution. We simply have access to a sample of observations from that distribution. We can use the sample to produce estimates of the unknown population parameters. Two such estimates are the sample mean and the sample standard deviation. These estimators have sampling distributions associated with them, and those distributions do indeed depend on the sample size. Sample means based on a sample size of, say, 10 will vary more from sample to sample than would sample means based on a sample size of, say, 1000. The Central Limit Theorem pertains to the sampling distribution of the sample mean. It says that if the sample is from a population distribution that has a fixed finite mean and a finite population standard deviation, then the sampling distribution for the sample means can be approximated by a normal distribution, with the approximation improving as the sample size gets larger and larger. So, for example, the Bernoulli distribution has a finite population mean and standard deviation, so the Central Limit Theorem would apply to the sampling distribution of sample means for samples taken from that distribution. The sampling distribution for sample means based on a sample size of 10 will look sorta like a bell curve if the population mean for the Bernoulli distribution is somewhere between, say, 0.30 and 0.70, but if you use sample sizes of 1,000 or more, then the sampling distribution for the means will really look very much like a bell curve, except for population means close to the edges, i.e. very low probability or very high probability events.
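
A minimal simulation sketch of the point about sample size, assuming a Bernoulli population with p = 0.25 and successes coded as 1 and failures as 0. The seed, the 500 repetitions, and the function name are arbitrary illustrative choices. It compares how much sample means spread out for n = 10 versus n = 1000 against the theoretical value sd / sqrt(n).

```python
import math
import random
import statistics


def sample_mean_spread(p, n, reps, rng):
    """SD of `reps` sample means, each from n Bernoulli(p) draws coded 1/0."""
    means = []
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        means.append(successes / n)
    return statistics.pstdev(means)


p = 0.25
pop_mean = p                       # population mean of a Bernoulli(p)
pop_sd = math.sqrt(p * (1 - p))    # population standard deviation

rng = random.Random(2011)          # fixed seed so the run is repeatable
spread_10 = sample_mean_spread(p, 10, 500, rng)
spread_1000 = sample_mean_spread(p, 1000, 500, rng)

# Theoretical standard deviation of the sample mean for each sample size
theory_10 = pop_sd / math.sqrt(10)
theory_1000 = pop_sd / math.sqrt(1000)

print(pop_mean, pop_sd)
print(spread_10, theory_10)
print(spread_1000, theory_1000)
```

The population mean and standard deviation stay fixed regardless of sample size; only the spread of the sample means shrinks, roughly by a factor of 10 when the sample size goes from 10 to 1000.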

DanT

08-03-2011 12:36 PM

:LOL:

Quote:

Originally Posted by Dartgod (Post 7797851)

A 4 sided die? What kind of two-dimensional world do you live in? I'll bet that thing is a bitch to play craps with.

DanT

08-03-2011 12:51 PM

Quote:

Originally Posted by DanT (Post 7798298)

I should have said that when we compute statistics on observations from a Bernoulli, we code Successes as a 1 and Failure as a 0. So the sample mean is really just the proportion of successes among all of the observations. (The population mean, on the other hand, is the probability that an observation--i.e. the next observation from the distribution, for example--will be a success.)

Fat Elvis

08-03-2011 05:12 PM

Quote:

Originally Posted by DanT (Post 7798358)

I find this all very fascinating. I really appreciate your help and time explaining this to me.

When using a Bernoulli, would you use the same methods to calculate the observation of two (or three or more) successes (not necessarily consecutive)? Or is using a Bernoulli limited to coding a single success or failure?

DanT

08-03-2011 07:30 PM

Quote:

Originally Posted by Fat Elvis (Post 7799143)

The Wikipedia entries on mathematical topics tend to be quite reliable. The page on the Bernoulli distribution mentions the appropriate related distribution that applies to your question, which is the Binomial. The Bernoulli is for a single trial. The Binomial pertains to the number of successes observed in n independent trials from the same Bernoulli distribution.

http://en.wikipedia.org/wiki/Bernoulli_distribution
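
To make the Bernoulli versus Binomial distinction concrete, here is a short sketch assuming p = 0.25 and an illustrative n = 10 trials (the value of n is an assumption; the thread does not specify one). It computes the chance of seeing two or more successes, not necessarily consecutive.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent Bernoulli(p) trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_least(k_min, n, p):
    """P(k_min or more successes in n trials)."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


n, p = 10, 0.25  # n = 10 is an illustrative choice
at_least_two = prob_at_least(2, n, p)

print(at_least_two)
# With n = 1 the Binomial reduces to the Bernoulli
print(binomial_pmf(1, 1, p))
```

Setting n = 1 recovers the single-trial Bernoulli case, which is the sense in which the Binomial generalizes it.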

CaliforniaChief

08-03-2011 07:35 PM

Quote:

Originally Posted by Bugeater (Post 7797227)

That's exactly what I was thinking.

LMAO

Extra Point

08-03-2011 07:46 PM

I'm imagining how DanT's posts would sound with the Hawking amplifier. Smart guy! Third Eye and Pawnmower, too. (No homo!)

DanT

08-03-2011 09:00 PM

Quote:

Originally Posted by Extra Point (Post 7799443)

I'm imagining how DanT's posts would sound with the Hawking amplifier. Smart guy! Third Eye and Pawnmower, too. (No homo!)

The Hawking amplifier remark reminds me of the funny Epic Rap Battle of History between him and Albert Einstein:

http://www.youtube.com/watch?v=zn7-fVtT16k

All times are GMT -6.