Conditional probability
Conditional probabilities are useful when the occurrence of one event affects the probability of another. If we have two events, A and B, where B has occurred and we want to find the probability of A occurring, we write this as follows:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Here, $P(B) > 0$.
However, if the two events, A and B, are independent, then we have the following:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A)$$
Additionally, if $P(A \mid B) > P(A)$, then it is said that B attracts A. However, if A attracts $B^C$ (the complement of B), then A repels B.
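To make these definitions concrete, here is a minimal Python sketch; the fair die and the events A, B, and C below are illustrative assumptions, not from the text:

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # event A: the roll is even
B = {4, 5, 6}      # event B: the roll is greater than 3
C = {1, 2, 3, 4}   # event C: the roll is at most 4

def prob(event, space):
    """Probability of an event under a uniform distribution on the space."""
    return Fraction(len(event), len(space))

# P(A | B) = P(A ∩ B) / P(B)
p_A_given_B = prob(A & B, omega) / prob(B, omega)
print(p_A_given_B)                                            # 2/3

# B attracts A, since P(A | B) = 2/3 > 1/2 = P(A)
print(p_A_given_B > prob(A, omega))                           # True

# A and C are independent: P(A ∩ C) = P(A) P(C)
print(prob(A & C, omega) == prob(A, omega) * prob(C, omega))  # True
```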
The following are some of the axioms of conditional probability:
- $P(B \mid B) = 1$.
- $0 \leq P(A \mid B) \leq 1$.
- If $A_1, A_2, \ldots$ are disjoint events, then $P\left(\bigcup_i A_i \mid B\right) = \sum_i P(A_i \mid B)$.
- $P(\cdot \mid B)$ is a probability function that works only for subsets of B.
- $P(A \cap B) = P(A \mid B)P(B)$ (the multiplication rule).
- If $A \subseteq B$, then $P(A \mid B) = \frac{P(A)}{P(B)}$.
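As a quick sanity check, the following sketch verifies the last two properties numerically, again using hypothetical die events:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # fair die
A = {2, 4, 6}                # an arbitrary event
B = {4, 5, 6}                # the conditioning event
D = {4, 6}                   # D is a subset of B

def prob(event, space):
    return Fraction(len(event), len(space))

def cond(event, given, space):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return prob(event & given, space) / prob(given, space)

# Multiplication rule: P(A ∩ B) = P(A | B) P(B)
assert prob(A & B, omega) == cond(A, B, omega) * prob(B, omega)

# Since D ⊆ B, we get P(D | B) = P(D) / P(B)
assert cond(D, B, omega) == prob(D, omega) / prob(B, omega)
print("both properties hold")
```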
The following equation is known as Bayes' rule:
$$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)}$$
This can also be written as follows:
$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$
Here, we have the following:
- $P(A)$ is called the prior.
- $P(A \mid B)$ is the posterior.
- $P(B \mid A)$ is the likelihood.
- $P(B)$ acts as a normalizing constant.
$$P(A \mid B) \propto P(B \mid A)P(A)$$
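As a worked example of how these pieces fit together, consider a hypothetical diagnostic test; every number below is made up for illustration:

```python
# Hypothetical numbers, for illustration only
p_disease = 0.01              # prior P(A): patient has the disease
p_pos_given_disease = 0.95    # likelihood P(B | A): positive test given disease
p_pos_given_healthy = 0.05    # P(B | A^C): false positive rate

# Normalizing constant P(B): total probability of a positive test,
# splitting B across the two cases A and A^C
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule: posterior P(A | B) = P(B | A) P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.161
```

Even with a fairly accurate test, the posterior stays small because the prior is small. Note that computing $P(B)$ by splitting it across A and $A^C$ is exactly the partition idea introduced next.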
Often, we end up having to deal with complex events, and to effectively navigate them, we need to decompose them into simpler events.
This leads us to the concept of partitions. A partition is defined as a collection of events $B_1, B_2, \ldots, B_n$ that together make up the sample space, such that $B_i \cap B_j = \emptyset$ for all $i \neq j$ and $\bigcup_i B_i = \Omega$.
In the coin-flipping example, the sample space is partitioned into two possible events: heads and tails.
If A is an event and $\{B_i\}$ is a partition of Ω, then we have the following:
$$P(A) = \sum_{i} P(A \mid B_i)P(B_i)$$
We can also rewrite Bayes' formula with partitions so that we have the following:
$$P(B_j \mid A) = \frac{P(A \mid B_j)P(B_j)}{P(A)}$$
Here, $P(A) = \sum_{i} P(A \mid B_i)P(B_i)$, as given by the partition formula above.
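The following sketch ties both formulas together with a hypothetical three-way partition; the factories and defect rates are invented for illustration:

```python
# Partition B_i: which factory produced an item; A: the item is defective
p_B = {"factory_1": 0.5, "factory_2": 0.3, "factory_3": 0.2}             # P(B_i)
p_A_given_B = {"factory_1": 0.01, "factory_2": 0.02, "factory_3": 0.05}  # P(A | B_i)

# Law of total probability: P(A) = sum_i P(A | B_i) P(B_i)
p_A = sum(p_A_given_B[b] * p_B[b] for b in p_B)
print(round(p_A, 3))  # 0.021

# Bayes' rule over the partition: P(B_j | A) for each factory
posterior = {b: p_A_given_B[b] * p_B[b] / p_A for b in p_B}
print(posterior)      # factory_3 is the most likely source of a defect (~0.476)
```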