In short

A random experiment is any action with an uncertain outcome — tossing a coin, rolling a die, drawing a card. The sample space S is the set of every possible outcome. An event is any subset of the sample space. You can combine events using union, intersection, and complement — exactly the operations of set theory, applied to uncertainty.

Toss a coin. Before it lands, you do not know whether you will see heads or tails. While it is in the air, there is nothing you can do to change the answer — physics already determined it when your thumb left the coin — but you still cannot predict it. That gap between what is determined and what you can predict is where probability lives.

Now toss a coin a million times. You cannot predict any single toss, but you can predict, with enormous confidence, that close to half of them will come up heads. One flip: total mystery. A million flips: total regularity. The same gap, seen at two scales.

Probability is the branch of mathematics that takes that regularity seriously. It gives you a language to describe events whose individual outcomes are unpredictable but whose long-run behaviour is not. It was invented in the 17th century to solve gambling problems — "how should we split the pot if a game of chance is interrupted halfway through?" — and it has since turned into the engine of statistics, information theory, quantum mechanics, cryptography, insurance, weather forecasting, and every machine learning model you have ever heard of. All of it rests on three little ideas: a random experiment, its sample space, and events inside that sample space.

A random experiment

Probability starts with an action. Not a formula, not a number — an action with an outcome you cannot predict in advance. A random experiment is any such action: tossing a coin, rolling a die, drawing a card from a shuffled deck, rolling two dice, measuring tomorrow's rainfall.

What makes these random experiments, as opposed to just experiments, is that all of the following hold:

  1. The experiment can, in principle, be repeated any number of times under the same conditions.
  2. The set of possible outcomes is known in advance.
  3. You cannot predict, before the experiment is performed, which particular outcome will occur.

A cooking experiment where you sauté onions until they are brown is not a random experiment in this sense — you know exactly what will happen if you leave them on the heat. Dropping a stone from a cliff is not a random experiment — the stone reliably falls. But tossing a flat stone and seeing whether it lands face-up or face-down is. Random experiments are the raw material of probability, and the first thing you do with one is list all the outcomes it can produce.

The sample space

The list of every possible outcome of a random experiment — every possible answer to the question "what happened?" — is called the sample space, written S (some books use \Omega). It is a set, and every individual outcome is an element of it.

For the coin toss, S = \{H, T\} — two elements.

For the single die, S = \{1, 2, 3, 4, 5, 6\} — six elements.

For a single draw from a deck, S is the set of all 52 cards — a set with 52 elements.

For the roll of two dice where order matters, S has 36 elements: (1,1), (1,2), \ldots, (6,6). Notice that (3, 5) and (5, 3) count as different outcomes if you distinguish the two dice, because the first die showed something different in each case.
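The two-dice sample space is small enough to enumerate directly. Here is a short Python sketch (the language is my choice; the article itself shows no code) that builds it as a set of ordered pairs:

```python
from itertools import product

# All ordered pairs (die1, die2): the sample space of two distinguishable dice.
S = set(product(range(1, 7), repeat=2))

print(len(S))                     # 36 outcomes
print((3, 5) in S, (5, 3) in S)   # both present: distinct ordered outcomes
```

Because the pairs are ordered, (3, 5) and (5, 3) are separate elements, which is exactly why the count is 6 × 6 = 36 rather than 21.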

For tomorrow's rainfall, S is the set [0, \infty) — every non-negative real number. In this case S is infinite, and in fact uncountable. Probability can handle this, but the tools are heavier, and for most of this introduction you should think of S as a finite set.

[Figure: a six-by-six grid of dots representing all 36 outcomes of rolling two dice; rows labelled 1–6 for die 1, columns labelled 1–6 for die 2.]
Every dot is one outcome. The sample space $S$ of rolling two distinguishable dice has 36 elements — every ordered pair $(i, j)$ with $i, j \in \{1, 2, 3, 4, 5, 6\}$.

A sample space should be exhaustive (every possible outcome must be in it) and mutually exclusive (no two outcomes can occur on the same trial). Check those two conditions carefully. "Getting more than 3 on a die" is not an outcome — it is a collection of outcomes. "Getting 4" is an outcome. The distinction between outcomes and collections of outcomes is exactly the distinction between elements of S and subsets of S — and subsets of S are what the next section is about.

Events

An event is any subset of the sample space. That is the whole definition, but it packs a lot in.

Rolling a die, let A be the event "the result is even." Then A = \{2, 4, 6\}, a subset of S = \{1, 2, 3, 4, 5, 6\}. If you roll a 4, the actual outcome is the element 4; and because 4 \in A, you say the event A has occurred. If you roll a 3, then 3 \notin A, and the event has not occurred.

Another event: B = "the result is at least 5" = \{5, 6\}. And another: C = "the result is 7" = \{\}. The last one is empty, because it is impossible. The empty set is a perfectly legitimate event — the impossible event — and so is the full set S, the certain event, which occurs no matter what.
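Since events are just subsets, they map directly onto sets in code. A minimal Python sketch (my illustration, not from the article) of the die events above:

```python
# Sample space and events for one roll of a die, as Python sets.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "the result is even"
B = {5, 6}      # "the result is at least 5"
C = set()       # "the result is 7" -- the impossible event

outcome = 4
print(outcome in A)    # True: event A has occurred on this trial
print(A <= S, C <= S)  # True True: every event, even the empty one, is a subset of S
```

"The event occurred" is literally the membership test `outcome in A`.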

[Figure: the sample space S = {1, 2, 3, 4, 5, 6} drawn as a rectangle, with two overlapping ovals marking event A (the result is even) and event B (the result is at least 5).]
The sample space $S$ contains six outcomes. The red oval is the event $A = \{2, 4, 6\}$ (the result is even). The lighter oval is $B = \{5, 6\}$ (the result is at least five). The overlap $A \cap B = \{6\}$ is the outcome where both events happen.

Now the vocabulary. An event consisting of exactly one outcome is called a simple event or elementary event. An event consisting of more than one outcome is a compound event. On the die, \{4\} is a simple event — "the result is 4." And \{2, 4, 6\} is a compound event — "the result is even" — built from three simple events.

Types of events

A handful of vocabulary that you will see everywhere in the rest of probability:

  • Sure (certain) event: the whole sample space S, which occurs on every trial.
  • Impossible event: the empty set \emptyset, which occurs on no trial.
  • Mutually exclusive (disjoint) events: events A and B with A \cap B = \emptyset, so they cannot both occur on the same trial.
  • Exhaustive events: a collection of events whose union is all of S, so at least one of them must occur.
  • Complementary events: A and A^c, which are both mutually exclusive and exhaustive.

The word equally likely also appears everywhere, though it is not a property of a single event but of a collection: the outcomes of S are equally likely if there is no reason to expect any one of them to occur more often than any other. A fair die produces six equally likely outcomes. A biased die does not. This matters because the simplest formula in probability — P(A) = n(A)/n(S), coming up in the next article — assumes the outcomes of S are equally likely.
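For a first taste of that formula on equally likely outcomes, here is a tiny Python sketch (my illustration; the formula itself is developed in the next article):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # a fair die: six equally likely outcomes
A = {2, 4, 6}            # "the result is even"

# Classical formula P(A) = n(A)/n(S), valid only because
# the six outcomes are equally likely.
P_A = Fraction(len(A), len(S))
print(P_A)   # 1/2
```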

Algebra of events

Because events are subsets of S, you can combine them using set operations. Every set operation has a meaning in probability — and "set operation" translates exactly into "and/or/not" in English.

Event operations

Let A, B \subseteq S be events.

  • Union: A \cup B is the event "A happens or B happens (or both)." The set of outcomes in A, in B, or in both.
  • Intersection: A \cap B is the event "A happens and B happens." The set of outcomes in both.
  • Complement: A^c (also written A' or \overline{A}) is the event "A does not happen." The outcomes in S that are not in A.
  • Difference: A \setminus B (also A - B) is the event "A happens but B does not." Equivalent to A \cap B^c.

Every single one of these is a subset of S, so every single one of these is itself an event. The operations take events in and give events back — that is what makes them an algebra.

The rules of set theory apply without modification. The most useful ones are:

  • Commutativity: A \cup B = B \cup A and A \cap B = B \cap A.
  • Associativity: (A \cup B) \cup C = A \cup (B \cup C), and likewise for \cap.
  • Distributivity: A \cap (B \cup C) = (A \cap B) \cup (A \cap C), and the same with \cup and \cap swapped.
  • De Morgan's laws: (A \cup B)^c = A^c \cap B^c and (A \cap B)^c = A^c \cup B^c.

De Morgan's laws deserve a closer look. In English: "the event 'neither A nor B' is the same as 'not A and not B.'" And "the event 'not (both A and B)' is the same as 'not A or not B.'" You will use them constantly when translating English-language probability questions into set operations.
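The whole algebra can be exercised on the die example in a few lines of Python (set operators standing in for the event operations; this is my illustration, not the article's):

```python
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even
B = {5, 6}      # at least 5

union = A | B     # A or B (or both)
inter = A & B     # A and B
comp_A = S - A    # not A
diff = A - B      # A but not B

# De Morgan's laws, checked directly on these sets:
assert S - (A | B) == (S - A) & (S - B)   # "neither A nor B" == "not A and not B"
assert S - (A & B) == (S - A) | (S - B)   # "not both" == "not A or not B"
print(union, inter, comp_A, diff)
```

Every result is again a subset of S — the closure property that makes this an algebra.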

Two concrete worked examples

Example 1: sample space of tossing a coin three times

An experiment consists of tossing a fair coin three times and recording the sequence of outcomes.

Step 1. List the sample space. Each toss is H or T, so a complete outcome is a string of three letters.

S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}

Why: there are 2 \times 2 \times 2 = 8 outcomes because each of three independent tosses has two results. Write them all out so you can pick events off visually.

Step 2. Define the event A = "exactly two heads" as a subset of S.

A = \{HHT, HTH, THH\}

Why: you scan through S and keep the strings with exactly two Hs. There are three ways to place the one T among three positions.

Step 3. Define the event B = "first toss is heads" as a subset of S.

B = \{HHH, HHT, HTH, HTT\}

Why: you keep every outcome that starts with H. The remaining two tosses are unconstrained, giving four such strings.

Step 4. Compute A \cap B, A \cup B, and A^c.

A \cap B = \{HHT, HTH\}
A \cup B = \{HHH, HHT, HTH, HTT, THH\}
A^c = \{HHH, HTT, THT, TTH, TTT\}

Why: intersection keeps the strings in both. Union keeps the strings in either. Complement keeps the strings in S that are not in A.

Result: The event "exactly two heads and first toss is heads" contains the outcomes HHT and HTH — two out of the eight total outcomes.
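The four steps above can be replayed in a short Python sketch (my illustration), which builds S by brute force and picks the events off with comprehensions:

```python
from itertools import product

# Step 1: all strings of three tosses.
S = {''.join(t) for t in product('HT', repeat=3)}
assert len(S) == 8

# Steps 2-3: events as subsets.
A = {s for s in S if s.count('H') == 2}   # exactly two heads
B = {s for s in S if s[0] == 'H'}         # first toss is heads

# Step 4: the combined events.
print(sorted(A & B))   # ['HHT', 'HTH']
print(sorted(A | B))   # five outcomes
print(sorted(S - A))   # the complement of A
```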

[Figure: the eight outcomes of three coin tosses in a rectangle, with two overlapping ovals marking event A (exactly two heads) and event B (first toss heads); the overlap contains HHT and HTH.]
The sample space of three coin tosses contains 8 outcomes. Event $A$ (exactly two heads) contains $\{HHT, HTH, THH\}$; event $B$ (first toss is heads) contains the four outcomes starting with $H$. Their intersection — the overlap region — contains exactly $HHT$ and $HTH$.

Example 2: drawing a card

A single card is drawn from a well-shuffled standard deck of 52. Let A = "the card is a heart" and B = "the card is a face card (J, Q, or K)."

Step 1. Identify the sample space and the two events as subsets.

S has 52 elements. The event A contains all 13 hearts: A = \{A\heartsuit, 2\heartsuit, \ldots, K\heartsuit\}. The event B contains the 12 face cards: \{J, Q, K\} in each of the four suits.

Why: sample space first, events as subsets second. Writing the events explicitly keeps you from counting wrong.

Step 2. Compute A \cap B.

A \cap B = "heart and face card" = \{J\heartsuit, Q\heartsuit, K\heartsuit\}.

Why: the intersection keeps only the cards that are in both sets — a card has to be a heart and a face card.

Step 3. Compute A \cup B using inclusion-exclusion on the sizes.

|A| = 13, |B| = 12, |A \cap B| = 3. So

|A \cup B| = |A| + |B| - |A \cap B| = 13 + 12 - 3 = 22.

Why: if you naively added 13 + 12 = 25, you would count the three cards in the overlap twice. Subtract them once to get the right total.

Step 4. Compute A^c (the event "not a heart").

A^c contains the 39 non-heart cards. In particular |A^c| = 52 - 13 = 39.

Result: The four quantities the problem asks about are |A| = 13, |B| = 12, |A \cap B| = 3, |A \cup B| = 22. Events and their sizes are what you will plug into the probability formulas in the next article.
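The counting can be verified in Python (my sketch; card labels are my own encoding) by building the deck explicitly and letting set operations do the inclusion-exclusion:

```python
from itertools import product

suits = '♥♦♣♠'
ranks = ['A'] + [str(n) for n in range(2, 11)] + ['J', 'Q', 'K']
deck = {r + s for r, s in product(ranks, suits)}   # 52 cards

A = {c for c in deck if c.endswith('♥')}           # 13 hearts
B = {c for c in deck if c[0] in 'JQK'}             # 12 face cards

print(len(A), len(B), len(A & B))     # 13 12 3
print(len(A) + len(B) - len(A & B))   # inclusion-exclusion: 22
print(len(A | B))                     # 22, matching
```

The last two lines agreeing is exactly the point of Step 3: union size equals the inclusion-exclusion count.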

[Figure: the 52-card deck split into four zones: 10 non-face hearts; the overlap J♥ Q♥ K♥, so |A ∩ B| = 3; 9 face cards in other suits; and 30 cards outside both events.]
The deck splits into four zones: $10$ non-face hearts, the $3$ face hearts in the overlap, $9$ face cards in other suits, and $30$ cards that are neither hearts nor face cards. The four zones sum to $52$, and $|A \cup B| = 10 + 3 + 9 = 22$ — matching the inclusion-exclusion calculation.

Common confusions

A few things students reliably get wrong about sample spaces and events.

  • Confusing outcomes with events. "Getting 4" is an outcome, an element of S; "getting more than 3" is an event, a subset of S. Only events have unions, intersections, and complements.
  • Forgetting that order matters for distinguishable dice. (3, 5) and (5, 3) are different outcomes, which is why the two-dice sample space has 36 elements, not 21.
  • Assuming any listing of outcomes is equally likely. "Zero, one, two, or three heads" is a valid sample space for three coin tosses — exhaustive and mutually exclusive — but its four outcomes are not equally likely, so the formula P(A) = n(A)/n(S) does not apply to it.

Going deeper

If you only need probability at the level of basic coin-and-dice problems, you have the full setup now and can move on to Classical Probability. The rest of this section is about how probability handles infinite sample spaces, and why the set-of-all-subsets approach needs refinement in those cases.

Countable and uncountable sample spaces

The sample spaces in this article are all finite — at most a few dozen outcomes. Probability can handle more:

  • Countably infinite sample spaces. Toss a coin until the first head appears and record the number of tosses: S = \{1, 2, 3, \ldots\}, infinite but countable.
  • Uncountable sample spaces. Measure tomorrow's rainfall: S = [0, \infty), a continuum of outcomes. Here not every subset of S can be treated as an event — only the "measurable" ones — which is the refinement this section alluded to.

For finite and countable sample spaces, you can ignore this distinction entirely: every subset of S is an event, full stop. The distinction only bites in the continuous case, and even there, you can treat every reasonable subset (any interval, any circle, any region with a sensible area) as an event.

Why events form an algebra

The union, intersection, and complement operations on events are closed: taking them gives you back an event. That closure is what makes "events under \cup, \cap, {}^c" an algebra in the formal sense. Combined with the fact that \emptyset and S are always events, this gives you the structure called a Boolean algebra — the same structure logic runs on, which is why every English phrase with and, or, not can be translated directly into operations on events.

The name "probability theory" is a bit misleading, because the theory itself is built on top of set theory: pure set operations, with a probability function assigning each set a number in [0, 1]. That function is what the next articles introduce — classical probability first, then its axiomatic formulation.

Where this leads next

You now have the vocabulary of probability: random experiments, sample spaces, events, and how to combine events using set operations. The next articles put numbers on events.