Tabularly

Joining tables: a visual guide to inner, left, right, and outer joins

May 30, 2026

Most people who work with tabular data learn joins by getting bitten by them. You merge two files, the result has 10x the rows you expected, and you spend an hour figuring out why. This post is meant to short-circuit that.

Joins are how you combine two tables on a shared piece of data — for example, customers (with id and name) and orders (with customer_id and amount). You match customers.id to orders.customer_id and end up with a single combined table.

The question is: what do you do with rows from one table that have no match on the other side?

There are four standard answers. They differ only in how they handle the unmatched rows.

Setup

Let’s use these two tables throughout:

Customers:

idname
1Ada
2Bob
3Carol

Orders:

cidamount
1100
2200
4999

We’re joining on customers.id = orders.cid. Notice: Carol (id 3) has no orders. The order with cid 4 has no matching customer.

Inner join

Only keep rows where the match succeeds on both sides.

idnamecidamount
1Ada1100
2Bob2200

Carol disappears (no orders). The orphan order (cid 4) disappears (no customer).

This is what most people call “merge” when they don’t specify. It’s the safest default because it never invents data.

Left join

Keep all rows from the left side. Fill in nulls where the right side has no match.

idnamecidamount
1Ada1100
2Bob2200
3Carolnullnull

Carol is preserved with empty cells for cid/amount. The orphan order (cid 4) is still dropped because it’s on the right side.

This is the version you want when your left table is your “source of truth” — for example, you have a customer list and you want to bolt on optional information (orders, last login, lifetime value) without losing any customers.

Right join

Keep all rows from the right side. Mirror of left.

idnamecidamount
1Ada1100
2Bob2200
nullnull4999

Carol disappears. The orphan order (cid 4) is preserved.

Right join is rare in practice — most analysts use a left join with the tables swapped, because that’s easier to read. But it exists for completeness.

Outer join (full outer)

Keep everything from both sides.

idnamecidamount
1Ada1100
2Bob2200
3Carolnullnull
nullnull4999

Carol and the orphan order both survive.

Use outer when you’re auditing for orphans on either side — for example, “find all the customer records without orders AND all the orders without a matching customer.” The null cells are exactly where you look to spot data quality issues.

When the cardinality multiplies

A subtle point: if Ada had three orders instead of one, every type of join would produce three rows for Ada. Inner: three. Left: three. Right: three. Outer: three. Joins multiply when the right side has multiple matches.

If you didn’t expect this — for example, you wanted “one row per customer with their total spend” — what you actually want is a join followed by an aggregate, not a join alone. Aggregation isn’t a Tabularly operation today, but it’s worth knowing the conceptual distinction.

Multi-column joins

Sometimes a single column isn’t a unique enough match. For example, if customer_id is reused across region, you might want to match only when both line up. Multi-column joins handle this: specify customer_id ↔ customer_id AND region ↔ region — both conditions must hold for two rows to join.

This is one of the cases where most online “merge CSV” tools fall over — they only support single-column joins. Tabularly’s join tool supports multi-column out of the box.

Picking the right join

A flowchart that fits in three sentences:

  1. Do you want to keep ALL the rows from one side regardless of matches? Use left (and put that side on the left).
  2. Do you want to keep ONLY rows that match? Use inner.
  3. Do you want to keep EVERYTHING from both sides for auditing? Use outer.

If you’re not sure, start with inner. It’s the safest because it never invents data — if the result is missing rows you expected, you can switch to left or outer and see them.

A note on join chains

Joining three or more tables is the same operation as joining two, applied in sequence. A+B+C means: join A and B, then join the result with C. Each step in the chain is its own inner/left/right/outer decision. Tabularly’s Pro tier supports chained N-table joins in one config (with inner/left only past two tables, because right/outer get semantically ambiguous in chains — what does “right” mean when you’re chaining three tables?).

Why this matters

Joins are the foundation of relational thinking. Getting the wrong one isn’t just a syntax error — it changes the meaning of your result. The difference between “all my customers” and “all my customers who have placed an order” is one keyword in a SQL query, and the wrong choice will silently report wrong numbers to your stakeholders.

If you’re handling tabular data and you’ve never sat down with the four-quadrant Venn diagrams, take 15 minutes today to do so. The investment pays back forever.