Problem Set: Probability and Statistics

Exisfar
The two schools of statistical inference: frequentist and Bayesian
Statistics has two main branches: descriptive statistics and statistical inference.
The frequentist views probability as the long-run frequency of events, making inferences based on sample data. In contrast, the Bayesian treats probability as a measure of subjective belief, incorporating prior knowledge into its inferences. The key difference lies in their definitions and interpretations of probability, yet the two approaches can also complement each other.
- Frequentist: Parameters are fixed unknown constants estimated from data (e.g., sample mean estimates population mean).
- Bayesian: Parameters are random variables with prior distributions (e.g., assuming a coin bias toward 0.6), updated to posterior distributions using data.
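To make the contrast concrete, here is a minimal sketch of both views on a coin-flip problem. The flip data, the Beta(6, 4) prior (chosen only because its mean is 0.6), and the helper name `beta_bernoulli_update` are illustrative assumptions, not part of the original notes. It relies on the standard conjugacy result that a Beta prior combined with Bernoulli data gives a Beta posterior.

```python
def beta_bernoulli_update(alpha, beta, flips):
    """Return the Beta posterior parameters after observing 0/1 flips.

    Conjugate update: heads are added to alpha, tails to beta.
    """
    heads = sum(flips)
    tails = len(flips) - heads
    return alpha + heads, beta + tails


# Hypothetical data: 6 heads (1) and 2 tails (0).
flips = [1, 1, 0, 1, 0, 1, 1, 1]

# Frequentist view: the bias is a fixed constant, estimated by the sample mean.
freq_estimate = sum(flips) / len(flips)

# Bayesian view: the bias is a random variable with a prior, here Beta(6, 4),
# whose mean 6 / (6 + 4) = 0.6 encodes a belief that the coin leans toward 0.6.
prior = (6.0, 4.0)
posterior = beta_bernoulli_update(*prior, flips)
posterior_mean = posterior[0] / (posterior[0] + posterior[1])

print(f"frequentist estimate: {freq_estimate:.3f}")
print(f"posterior: Beta{posterior}, mean {posterior_mean:.3f}")
```

The posterior mean lands between the prior mean (0.6) and the sample frequency (0.75), which is one way to see the data updating the prior.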
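Why is the posterior distribution useful? (Compared with the frequentist MLE)
- A fuller picture of the parameter: you learn not only its most probable value but also its uncertainty and the shape of its distribution.
- More robust predictions: predictions average over all plausible parameter values instead of relying on a single point estimate.
- More direct decision support: you can compute probabilities, compare hypotheses, and optimize actions.
- More stable with small samples: prior information helps guard against overfitting.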
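The points above can be illustrated with a small-sample sketch. It assumes a hypothetical experiment of 3 heads in 3 flips and a Beta(2, 2) prior, and it approximates the posterior on a grid rather than using a statistics library. The names `grid_posterior` and `quantile` are made up for this example.

```python
def grid_posterior(heads, tails, a, b, n=10_000):
    """Approximate the posterior over the coin bias on n grid midpoints.

    Prior Beta(a, b) times the Bernoulli likelihood is proportional to
    theta^(a + heads - 1) * (1 - theta)^(b + tails - 1).
    """
    thetas = [(i + 0.5) / n for i in range(n)]
    unnorm = [t ** (a + heads - 1) * (1 - t) ** (b + tails - 1) for t in thetas]
    total = sum(unnorm)
    return thetas, [u / total for u in unnorm]


def quantile(thetas, weights, q):
    """Smallest grid point whose cumulative posterior mass reaches q."""
    acc = 0.0
    for t, w in zip(thetas, weights):
        acc += w
        if acc >= q:
            return t
    return thetas[-1]


heads, tails = 3, 0  # hypothetical small sample

# MLE: a single point estimate that claims the coin always lands heads.
mle = heads / (heads + tails)

thetas, weights = grid_posterior(heads, tails, a=2, b=2)

# Posterior mean; for a Bernoulli model this is also the predictive
# probability that the next flip is heads.
posterior_mean = sum(t * w for t, w in zip(thetas, weights))

# Uncertainty: a central 95% credible interval.
interval = (quantile(thetas, weights, 0.025), quantile(thetas, weights, 0.975))

# Decision support: posterior probability that the coin favors heads.
p_favors_heads = sum(w for t, w in zip(thetas, weights) if t > 0.5)

print(f"MLE: {mle:.3f}")
print(f"posterior mean: {posterior_mean:.3f}, 95% interval: {interval}")
print(f"P(theta > 0.5 | data): {p_favors_heads:.3f}")
```

With only three flips the MLE commits to a bias of 1.0, while the posterior mean stays near 5/7 and the interval makes the remaining uncertainty explicit.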
Maximum Likelihood Estimation (MLE)
Why
To estimate the parameters $\theta$ of a statistical model $p(x \mid \theta)$ so that it best matches the true data distribution $p_{\text{data}}(x)$.
What
- MLE selects the parameter $\hat{\theta}$ that maximizes the likelihood function $L(\theta)$, which measures how probable the observed data are under the model.
- For independent samples $x_1, \dots, x_n$, the likelihood is:
$$L(\theta) = \prod_{i=1}^{n} p(x_i \mid \theta)$$
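As a rough illustration of the likelihood of independent samples, the sketch below evaluates it for a Bernoulli model, both as a product and as a sum of logs. The function names and the data are assumptions made for this example. It also shows a practical reason to work with the log-likelihood: the raw product underflows for large samples.

```python
import math


def likelihood(xs, theta):
    """Product of per-sample Bernoulli probabilities for 0/1 data."""
    return math.prod(theta if x == 1 else 1.0 - theta for x in xs)


def log_likelihood(xs, theta):
    """Sum of per-sample log-probabilities; same maximizer as likelihood."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in xs)


small = [1, 0, 1]                        # hypothetical observations
L_small = likelihood(small, 0.5)         # 0.5 * 0.5 * 0.5 = 0.125
logL_small = log_likelihood(small, 0.5)  # log(0.125)

large = [1] * 2000                       # 2000 heads in a row
L_large = likelihood(large, 0.5)         # 0.5**2000 underflows to 0.0
logL_large = log_likelihood(large, 0.5)  # 2000 * log(0.5), still finite

print(L_small, logL_small)
print(L_large, logL_large)
```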
How
Step-by-Step Process:
- Assume a model: Choose a parametric distribution (e.g., Gaussian, Bernoulli).
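- Collect data: Observe samples $x_1, \dots, x_n$ from the data distribution $p_{\text{data}}(x)$.
- Write the likelihood: Compute $L(\theta) = \prod_{i=1}^{n} p(x_i \mid \theta)$ or $\log L(\theta) = \sum_{i=1}^{n} \log p(x_i \mid \theta)$.
- Optimize: Find $\hat{\theta}$ that maximizes the (log-)likelihood:
  $$\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{n} \log p(x_i \mid \theta)$$
  - For simple models (e.g., Gaussian mean), this can be solved analytically.
  - For complex models (e.g., neural networks), gradient-based optimization (e.g., SGD) is used.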
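The process above can be sketched for a Gaussian model. The data values and function names are hypothetical. The closed-form estimates are the standard Gaussian MLE results (sample mean, and variance with divisor n). The iterative version is plain full-batch gradient ascent on the mean with a fixed variance, used here as a simple stand-in for SGD, which would instead use random mini-batches.

```python
import math


def gaussian_mle(xs):
    """Closed-form MLE for a Gaussian: sample mean and variance with divisor n."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # the MLE divides by n, not n - 1
    return mu, var


def gaussian_log_likelihood(xs, mu, var):
    """Log-likelihood of independent samples under N(mu, var)."""
    n = len(xs)
    sq = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - sq / (2 * var)


def fit_mean_by_gradient_ascent(xs, var=1.0, lr=0.1, steps=200):
    """Maximize the average log-likelihood in mu by full-batch gradient ascent."""
    n = len(xs)
    mu = 0.0  # arbitrary starting point
    for _ in range(steps):
        # Gradient of the average log-likelihood with respect to mu.
        grad = sum(x - mu for x in xs) / (n * var)
        mu += lr * grad
    return mu


data = [2.1, 1.9, 3.2, 2.8, 2.5]  # hypothetical observations

mu_hat, var_hat = gaussian_mle(data)
mu_iter = fit_mean_by_gradient_ascent(data)

print(f"analytic: mu = {mu_hat:.4f}, var = {var_hat:.4f}")
print(f"gradient ascent: mu = {mu_iter:.4f}")
```

Both routes arrive at the same mean. The analytic route is available only because setting the derivative of the Gaussian log-likelihood to zero has a closed-form solution.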
Parametric and nonparametric approaches
- Parametric approach:
  - Definition: use probability distributions with specific functional forms, governed by a small number of parameters whose values are determined from a data set.
  - Limitation: the chosen density might be a poor model of the distribution that generates the data, which can result in poor predictive performance.
- Nonparametric approach:
  - Definition: make few assumptions about the form of the distribution.
  - Examples: histogram methods, kernel density estimators.
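The two nonparametric examples can be sketched in a few lines of standard-library Python. The data, bin settings, bandwidth `h`, and the function names `histogram_density` and `kde_gaussian` are illustrative assumptions. In practice, bin width and bandwidth act as smoothing choices and need tuning.

```python
import math


def histogram_density(data, lo, hi, bins):
    """Histogram density estimate on [lo, hi] with equal-width bins.

    Each bin's density is count / (n * width), so the bars integrate to 1
    when all points fall inside [lo, hi].
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x <= hi:
            # Clamp so that x == hi falls in the last bin.
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
    n = len(data)
    return [c / (n * width) for c in counts]


def kde_gaussian(data, h):
    """Return a kernel density estimate with Gaussian kernels of bandwidth h."""
    n = len(data)
    norm = n * h * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

    return density


data = [1.0, 1.2, 1.9, 2.1, 2.4, 3.8]  # hypothetical observations

hist = histogram_density(data, lo=0.0, hi=4.0, bins=4)
kde = kde_gaussian(data, h=0.5)

print("histogram densities:", hist)
print("KDE at x = 2.0:", kde(2.0))
```

Neither estimator assumes a functional form such as a Gaussian for the data. The shape comes from the samples themselves.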