
C3-Probability and Information Theory

  • probability==> degree of belief
  • frequentist probability==> directly related to the rates at which events occur.
  • Bayesian probability==> related to qualitative levels of certainty
  • random variable==> a variable that can take on different values randomly.
    • discrete:has a finite or countably infinite number of states
    • continuous:is associated with a real value.
  • probability distribution==> a description of how likely a random variable or set of random variables is to take on each of its possible states.
    • probability mass function(PMF)==> a probability distribution over discrete variable
      • PMF maps from a state of random variable to the probability of that random variable taking on that state.
      • $P(\text{x}=x)$, or $\text{x}\sim P(\text{x})$ to specify which distribution x follows
      • the domain of $P$ must be the set of all possible states of $\text{x}$
      • $\forall x\in\text{x},\ 0\leq P(x)\leq 1$
      • $\sum_{x\in\text{x}}P(x)=1$
    • joint probability distribution==> a probability distribution over many variables
      • $P(\text{x}=x,\text{y}=y)$, or $P(x,y)$ for brevity
    • probability density function(PDF)==> a probability distribution over continuous random variable
      • the domain of $p$ must be the set of all possible states of $\text{x}$
      • $\forall x\in\text{x},\ p(x)\geq 0$
      • $\int p(x)\,dx=1$
      • example: the uniform density $u(x;a,b)$, where $b>a$; for all $x\notin[a,b]$, $u(x;a,b)=0$, and within $[a,b]$, $u(x;a,b)=\frac{1}{b-a}$. Written $\text{x}\sim U(a,b)$.
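As a quick numerical check of these properties, here is a minimal numpy sketch (the function name `u`, the bounds, and the grid size are illustrative, not from the text) that evaluates the uniform density and verifies it integrates to 1:

```python
import numpy as np

# Uniform density u(x; a, b): 1/(b-a) inside [a, b], 0 elsewhere.
def u(x, a, b):
    x = np.asarray(x, dtype=float)
    return np.where((x >= a) & (x <= b), 1.0 / (b - a), 0.0)

# Check normalization with a midpoint-rule Riemann sum over [a, b].
a, b = 2.0, 5.0
N = 100_000
dx = (b - a) / N
xs = a + (np.arange(N) + 0.5) * dx
integral = np.sum(u(xs, a, b)) * dx
print(round(integral, 6))  # → 1.0
```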
  • Marginal Probability
    • The probability distribution over the subset.
    • For a discrete random variable: given $P(\text{x},\text{y})$, find $P(\text{x})$ with the sum rule: $\forall x\in\text{x},\ P(\text{x}=x)=\sum_y P(\text{x}=x,\text{y}=y)$.
    • For a continuous variable: $p(x)=\int p(x,y)\,dy$
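The discrete sum rule is just a sum over one axis of a joint table. A small sketch with a made-up joint distribution (the numbers are arbitrary, for illustration only):

```python
import numpy as np

# Toy joint distribution P(x, y): rows index states of x, columns states of y.
P_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])
assert np.isclose(P_xy.sum(), 1.0)  # a valid joint sums to 1

# Sum rule: marginalize y out by summing over its axis.
P_x = P_xy.sum(axis=1)
assert np.allclose(P_x, [0.4, 0.6])
```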
  • Conditional Probability
    • $P(\text{y}=y\mid\text{x}=x)=\frac{P(\text{y}=y,\text{x}=x)}{P(\text{x}=x)}$
    • intervention query==> compute the consequences of an action (the domain of causal modeling)
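Conditioning on x amounts to dividing each row of the joint table by the corresponding marginal. A sketch using the same kind of toy table (numbers are arbitrary):

```python
import numpy as np

# Toy joint P(x, y): rows index x, columns index y.
P_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])

# P(y | x) = P(x, y) / P(x); keepdims lets broadcasting divide row-wise.
P_x = P_xy.sum(axis=1, keepdims=True)
P_y_given_x = P_xy / P_x

# Each row of P(y | x) is now a valid distribution over y.
assert np.allclose(P_y_given_x.sum(axis=1), 1.0)
```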
  • The Chain Rule of Conditional Probabilities
    • $P(\text{x}^{(1)},\ldots,\text{x}^{(n)})=P(\text{x}^{(1)})\prod_{i=2}^{n}P(\text{x}^{(i)}\mid\text{x}^{(1)},\ldots,\text{x}^{(i-1)})$
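The chain rule can be verified numerically on an arbitrary joint distribution over three binary variables (the distribution here is random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary joint distribution over three binary variables x1, x2, x3,
# normalized so all entries sum to 1.
P = rng.random((2, 2, 2))
P /= P.sum()

# Chain rule: P(x1, x2, x3) = P(x1) * P(x2 | x1) * P(x3 | x1, x2).
P1 = P.sum(axis=(1, 2))                          # P(x1)
P2_given_1 = P.sum(axis=2) / P1[:, None]         # P(x2 | x1)
P3_given_12 = P / P.sum(axis=2, keepdims=True)   # P(x3 | x1, x2)

reconstructed = P1[:, None, None] * P2_given_1[:, :, None] * P3_given_12
assert np.allclose(reconstructed, P)
```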
  • Independence:
    • $\forall x\in\text{x},y\in\text{y},\ p(\text{x}=x,\text{y}=y)=p(\text{x}=x)p(\text{y}=y)$
    • shorthand: $\text{x}\perp\text{y}$
  • Conditional Independence:
    • $\forall x\in\text{x},y\in\text{y},z\in\text{z},\ p(\text{x}=x,\text{y}=y\mid\text{z}=z)=p(\text{x}=x\mid\text{z}=z)\,p(\text{y}=y\mid\text{z}=z)$
    • shorthand: $\text{x}\perp\text{y}\mid\text{z}$
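Independence as factorization is easy to see in a table: building the joint as an outer product of two marginals guarantees $p(x,y)=p(x)p(y)$. A minimal sketch (the marginals are made up):

```python
import numpy as np

# Construct an independent joint: P(x, y) = P(x) P(y) via an outer product.
P_x = np.array([0.3, 0.7])
P_y = np.array([0.2, 0.5, 0.3])
P_xy = np.outer(P_x, P_y)

# The joint factorizes into the product of its own marginals.
marg_x = P_xy.sum(axis=1, keepdims=True)
marg_y = P_xy.sum(axis=0, keepdims=True)
assert np.allclose(P_xy, marg_x * marg_y)
```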
  • Expectation
    • For discrete variables: $\mathbb{E}_{\text{x}\sim P}[f(x)]=\sum_x P(x)f(x)$
    • For continuous variables: $\mathbb{E}_{\text{x}\sim p}[f(x)]=\int p(x)f(x)\,dx$
    • Expectation is linear: $\mathbb{E}_{\text{x}}[\alpha f(x)+\beta g(x)]=\alpha\mathbb{E}_{\text{x}}[f(x)]+\beta\mathbb{E}_{\text{x}}[g(x)]$
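Both the discrete definition and the linearity property can be checked on a small toy distribution (the states, probabilities, and functions are arbitrary):

```python
import numpy as np

# Discrete expectation E[f(x)] = sum_x P(x) f(x).
x = np.array([0.0, 1.0, 2.0])
P = np.array([0.2, 0.5, 0.3])

f = x ** 2
g = x + 1.0

E_f = np.sum(P * f)
E_g = np.sum(P * g)

# Linearity: E[a*f + b*g] = a*E[f] + b*E[g].
a, b = 2.0, -3.0
assert np.isclose(np.sum(P * (a * f + b * g)), a * E_f + b * E_g)
```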
  • Variance
    • $\text{Var}(f(x))=\mathbb{E}\big[(f(x)-\mathbb{E}[f(x)])^2\big]$
    • the square root of the variance is known as the standard deviation.
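A small sketch computing the variance of a toy discrete distribution, also confirming the equivalent form $\text{Var}(x)=\mathbb{E}[x^2]-\mathbb{E}[x]^2$ (the numbers are arbitrary):

```python
import numpy as np

# Var(x) = E[(x - E[x])^2] for a discrete toy distribution.
x = np.array([0.0, 1.0, 2.0])
P = np.array([0.2, 0.5, 0.3])

mean = np.sum(P * x)
var = np.sum(P * (x - mean) ** 2)
std = np.sqrt(var)  # the standard deviation

# Equivalent form: Var(x) = E[x^2] - E[x]^2.
assert np.isclose(var, np.sum(P * x ** 2) - mean ** 2)
```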
  • Covariance
    • $\text{Cov}(f(x),g(y))=\mathbb{E}\big[(f(x)-\mathbb{E}[f(x)])(g(y)-\mathbb{E}[g(y)])\big]$
    • measures how much two values are linearly related to each other, as well as the scale of these variables.
    • high absolute value: the values change a lot and are far from their respective means
    • positive: both variables tend to take relatively high values simultaneously
    • negative: one variable tends to be high when the other is low
    • covariance vs. independence: independence implies zero covariance, but zero covariance does not imply independence (zero covariance rules out only linear dependence)
    • covariance matrix:
      • for a random vector $\mathbf{x}\in\mathbb{R}^n$: $\text{Cov}(\mathbf{x})_{i,j}=\text{Cov}(\text{x}_i,\text{x}_j)$
      • the diagonal elements give the variance: $\text{Cov}(\text{x}_i,\text{x}_i)=\text{Var}(\text{x}_i)$
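Two of the points above can be checked numerically: zero covariance does not imply independence (take $y=x^2$ with x symmetric around 0), and the diagonal of a covariance matrix holds the variances. A sketch (the example distributions are illustrative):

```python
import numpy as np

# x uniform on {-1, 0, 1} and y = x^2: fully dependent, yet Cov(x, y) = 0.
x = np.array([-1.0, 0.0, 1.0])
P = np.array([1 / 3, 1 / 3, 1 / 3])
y = x ** 2

E_x = np.sum(P * x)
E_y = np.sum(P * y)
cov = np.sum(P * (x - E_x) * (y - E_y))
assert np.isclose(cov, 0.0)  # zero covariance despite full dependence

# Covariance matrix of a random vector: diagonal elements are the variances.
samples = np.random.default_rng(0).normal(size=(3, 1000))
C = np.cov(samples)  # np.cov uses ddof=1 by default
assert np.allclose(np.diag(C), samples.var(axis=1, ddof=1))
```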