Statistics
- A random variable is a variable whose value is a numerical outcome of a random phenomenon.
- A parameter is a number that describes the population. In practice we do not know its value.
- The most common method of fitting a line to a scatter plot is least squares: the least-squares line is the one that minimizes the sum of the squares of the vertical distances of the observed y-values from the line.
- The residual is the difference between the observed value and the value predicted by the regression line (observed minus predicted).
- A statistic is a number that describes a sample. When the data come from a random process, so that we cannot know the outcome in advance, the statistic is itself a random variable that obeys the laws of probability.
- The link between probability and data is formed by the sampling distributions of the statistics.
- The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
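The least-squares line and the residuals from the notes above can be sketched in a few lines of Python. This is only an illustration: the function names `fit_least_squares` and `residuals` and the small data set are made up for the example, and it uses the standard closed-form formulas for a line with one predictor.

```python
def fit_least_squares(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of
    squared vertical distances between the observed ys and the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for simple linear regression.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Residual = observed y minus the y predicted by the line."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_least_squares(xs, ys)
res = residuals(xs, ys, slope, intercept)
print("slope:", slope, "intercept:", intercept)
print("residuals:", res)
```

For a least-squares line with an intercept, the residuals sum to zero (up to floating-point rounding), which is a handy sanity check.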
I would imagine something like the double-slit experiment. Each pulse, with a note of which slit it appears through, is a sample. That individual sample could be said to have a distribution of where it might land, based on physics. Multiple samples would home in on the overall distribution of where the light will hit for the population (i.e. all light rays).
A binomial distribution can be thought of as a sampling distribution: it is the distribution of the number of successes in a sample of n independent trials, where the population is, in effect, an unlimited run of trials. Only as the number of trials n grows large does the observed proportion of successes approach the true probability of success.
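One way to get a feel for a sampling distribution is to simulate it. The sketch below draws many samples of n success/failure trials and records the proportion of successes in each sample. It approximates "all possible samples" with a large number of random ones, and the true probability `p = 0.3`, the sample sizes, and the function names are assumptions made for the example.

```python
import random
import statistics


def sample_proportion(p, n, rng):
    """Proportion of successes in one sample of n independent trials,
    each succeeding with probability p."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n


def simulate_sampling_distribution(p, n, num_samples, seed=0):
    """Approximate the sampling distribution of the sample proportion
    by drawing num_samples independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [sample_proportion(p, n, rng) for _ in range(num_samples)]


p = 0.3  # assumed "true" probability of success
for n in (10, 100, 1000):
    props = simulate_sampling_distribution(p, n, num_samples=1000)
    print(
        "n =", n,
        "mean of proportions =", round(statistics.mean(props), 3),
        "spread (stdev) =", round(statistics.stdev(props), 3),
    )
```

The centre of the simulated proportions should stay near p for every n, while the spread shrinks as n grows, which is the sense in which larger samples approach the true probability of success.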
Notes
Khan Academy
12/02/23 16:35:08
- Descriptive statistics: can we describe a large amount of data with a smaller amount of information or numbers?
- Inferences: conclusions or judgements about the data.
- When faced with just a collection of numbers, how do we find patterns in it?
Averages: an attempt to get the central tendency, a typical number in your collection of numbers.
- Arithmetic mean: the sum of the numbers divided by how many numbers there are.
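As a quick worked example of the arithmetic mean, here is a minimal sketch. The function name `arithmetic_mean` and the data are made up for illustration.

```python
def arithmetic_mean(numbers):
    """Sum of the numbers divided by how many numbers there are."""
    if not numbers:
        raise ValueError("mean of an empty collection is undefined")
    return sum(numbers) / len(numbers)


data = [2, 4, 4, 5, 10]        # hypothetical collection of numbers
print(arithmetic_mean(data))   # (2 + 4 + 4 + 5 + 10) / 5 = 25 / 5 = 5.0
```

Python's standard library offers the same calculation as `statistics.mean`.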