{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Estimating Proportions\n", "\n", "This chapter is mainly about how to choose prior distributions." ] }, { "cell_type": "code", "execution_count": 203, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from empiricaldist import Pmf\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "from scipy.stats import binom" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A coin is tossed 250 times and lands heads 140 times and tails 110 times. If we toss this coin once more, what is the most likely value of its probability of landing heads?\n", "\n", "\n", "\n", "**Image source: [Sohu (搜狐)](https://www.sohu.com/a/225881799_100071627)**\n", "\n", "We can approach this with the example from the Distributions section, by recasting the problem as follows:\n", "\n", "We have $n + 1$ coins, numbered 0 through $n$, with the following properties:\n", "\n", "- Coin 0 lands heads with probability 0/n\n", "- Coin 1 lands heads with probability 1/n\n", "- Coin 2 lands heads with probability 2/n\\\n", "...\n", "- Coin n lands heads with probability n/n\n", "\n", "We pick one of these coins at random and toss it 250 times, observing 140 heads and 110 tails. Which coin did we most likely pick?\n", "\n", "We start with a uniform prior: every coin is equally likely to be the one we picked." ] }, { "cell_type": "code", "execution_count": 204, "metadata": {}, "outputs": [], "source": [ "# A slightly modified version of the earlier update_bowls_pmf\n", "def update_coins_pmf(n, h, t):\n", "    \"\"\"\n", "    n: index of the last coin (there are n + 1 coins, numbered 0..n)\n", "    h: number of heads observed\n", "    t: number of tails observed\n", "    \"\"\"\n", "    hypos = np.linspace(0, 1, n+1)\n", "    prior = Pmf(1, hypos)\n", "    # Coin i lands heads with probability i/n\n", "    likelihood_head = [i/n for i in range(n+1)]\n", "    likelihood_tail = [1 - p for p in likelihood_head]\n", "    likelihood = {\n", "        \"head\": likelihood_head,\n", "        \"tail\": likelihood_tail\n", "    }\n", "    dataset = [\"head\"]*h + [\"tail\"]*t\n", "    posterior = prior.copy()\n", "    for data in dataset:\n", "        posterior *= likelihood[data]\n", "    posterior.normalize()\n", "    return posterior" ] }, { "cell_type": "code", "execution_count": 205, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0. , 0.11111111, 0.22222222, 0.33333333, 0.44444444,\n", " 0.55555556, 0.66666667, 0.77777778, 0.88888889, 1. 
])" ] }, "execution_count": 205, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hypos = np.linspace(0, 1, 10)\n", "hypos" ] }, { "cell_type": "code", "execution_count": 206, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | probs | \n", "
---|---|
0.000000 | \n", "1 | \n", "
0.111111 | \n", "1 | \n", "
0.222222 | \n", "1 | \n", "
0.333333 | \n", "1 | \n", "
0.444444 | \n", "1 | \n", "
0.555556 | \n", "1 | \n", "
0.666667 | \n", "1 | \n", "
0.777778 | \n", "1 | \n", "
0.888889 | \n", "1 | \n", "
1.000000 | \n", "1 | \n", "