
Pagination: cursor vs offset, and when to pick which

Two ways to paginate lists in an API: offset-based (LIMIT/OFFSET) and cursor-based. How each affects performance and user experience.

Pagination looks like a simple topic: you have a long list and you send it a page at a time. But pick the wrong mechanism and you get a performance disaster at scale.

Two main approaches: offset-based and cursor-based. Each has its strengths and weaknesses. Here’s how I decide which to use.

Offset-based pagination (the traditional one)

The most common approach. “Page 3, 20 items per page” means you want items 41 to 60.

API:

GET /api/posts?page=3&per_page=20
or
GET /api/posts?offset=40&limit=20

SQL:

SELECT * FROM posts
ORDER BY created_at DESC
LIMIT 20 OFFSET 40;

Strengths:

  • The user can jump straight to page 5
  • Easy to surface metadata like “41 to 60 of 1,000”
  • Simple to render “Page 1, 2, 3 … 50” navigation in the UI
  • Bookmarkable URLs (“?page=3” as a direct link)

Weaknesses:

  • Slow at large offsets. OFFSET 10000 makes the database read and discard 10,000 rows before it returns anything, and it only gets worse as the dataset grows.
  • Inconsistent results. A new item lands at the top while the user is reading page 1; when they click page 2, the item that closed page 1 has shifted down and shows up again (duplicate).
  • Skipped items on deletion. Someone deletes an item on page 1; everything shifts up, so the item that would have opened page 2 moves onto page 1 and the user never sees it.

Offsets up to a few thousand are usually fine. Past 10,000 it gets rough.

Cursor-based pagination (the modern default)

A cursor is a “where did I leave off” marker: the client sends back the last item’s ID or timestamp, and the server starts from the item after it.

API:

GET /api/posts?after=cur_abc123&limit=20

SQL:

SELECT * FROM posts
WHERE created_at < (timestamp decoded from cursor)
ORDER BY created_at DESC
LIMIT 20;

The cursor is usually base64-encoded: {"created_at":"2024-11-15","id":12345}.

Response:

{
  "items": [...],
  "next_cursor": "cur_xyz789",
  "has_more": true
}

The client requests the next page using next_cursor.
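
A minimal client-side sketch of that loop, assuming a fetch-capable runtime and the response shape above; the endpoint path and field names simply mirror the examples in this post:

type Post = { id: number; created_at: string };
type Page = { items: Post[]; next_cursor: string | null; has_more: boolean };

// Walk every page of the endpoint until has_more is false.
async function fetchAllPosts(baseUrl: string): Promise<Post[]> {
  const all: Post[] = [];
  let cursor: string | null = null;

  do {
    const url = new URL("/api/posts", baseUrl);
    url.searchParams.set("limit", "20");
    if (cursor) url.searchParams.set("after", cursor);

    const page: Page = await (await fetch(url)).json();
    all.push(...page.items);
    cursor = page.has_more ? page.next_cursor : null;
  } while (cursor !== null);

  return all;
}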

Strengths:

  • Fast even on a large dataset. Each query is an O(log n) index seek followed by LIMIT 20, with none of offset’s scan-and-discard cost.
  • Consistent results. The cursor anchors to a fixed position in the ordering, so newly inserted items don’t shift the page boundaries. No duplicates, no skips.
  • Ideal for real-time feeds. Twitter’s and Facebook’s infinite scroll works this way.

Weaknesses:

  • The user can’t jump straight to page 5. Only sequential (next, previous).
  • Hard to show “41 to 60 of 1,000” (needs a separate count query).
  • Impossible to render “Page 1, 2, 3” navigation. Only “Load More” or “Next”.

Decision criteria

Pick offset-based:

  1. Small to medium dataset (less than 10,000 total)
  2. Users want to jump between pages
  3. “Total count” needs to be visible
  4. Traditional web UI (search results and similar)
  5. Admin panel, CRM, reporting

Pick cursor-based:

  1. Large dataset (10K+)
  2. New items are added constantly (feed, notifications, messages)
  3. Infinite scroll UX
  4. Mobile app (cursors feel more natural)
  5. Real-time or near-real-time data

Hybrid approach

On some APIs it’s worth offering both:

# Cursor-based (preferred, for real-time feeds)
GET /api/posts?after=cur_abc

# Offset-based (backward compat, for search)
GET /api/posts?page=3&per_page=20

Stripe does this. Cursor-based is the primary mode, offset exists on a few endpoints.
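
A rough sketch of what the server-side dispatch can look like: cursor mode when ?after= is present, offset mode otherwise. listByCursor and listByOffset are hypothetical helpers, not a real library:

declare function listByCursor(after: string, limit: number): Promise<unknown>;
declare function listByOffset(page: number, perPage: number): Promise<unknown>;

function handlePostsQuery(q: { after?: string; page?: string; per_page?: string }) {
  const perPage = Number(q.per_page ?? 20) || 20;
  return q.after
    ? listByCursor(q.after, perPage)                    // preferred path
    : listByOffset(Number(q.page ?? 1) || 1, perPage);  // backward compatibility
}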

Performance comparison

A simple benchmark on a 100K-row posts table in PostgreSQL:

| Query type | Table rows | Time |
|---|---|---|
| Offset, page 1 (LIMIT 20 OFFSET 0) | 100K | 2 ms |
| Offset, page 100 (LIMIT 20 OFFSET 2000) | 100K | 5 ms |
| Offset, page 1000 (LIMIT 20 OFFSET 20000) | 100K | 50 ms |
| Offset, page 5000 (LIMIT 20 OFFSET 100000) | 100K | 450 ms |
| Cursor-based (any page) | 100K | 2 ms |

Cursor-based stays effectively constant per query. Offset degrades roughly linearly with the page number.

What goes into a cursor?

When you design a cursor:

1. Include the ordering field. Whatever you sort by goes into the cursor.

ORDER BY created_at DESC
cursor: {created_at: "2024-11-15 10:30"}

2. You need a tiebreaker. Multiple items with the same created_at require an ID too:

cursor: {created_at: "2024-11-15 10:30", id: 12345}
SQL: WHERE (created_at, id) < ('2024-11-15 10:30', 12345)

3. Treat it as opaque. Base64-encode it so clients handle the cursor as a token rather than parsing its internals, and you keep the freedom to change the format later.

cursor = base64(JSON.stringify({ca: ts, id: 123}))
// "eyJjYSI6IjIwMjQtMTEtMTUiLCJpZCI6MTIzfQ=="

4. Security: sign the cursor. Clients shouldn’t be able to tamper with it, so add an HMAC-SHA256 over the payload.

cursor = base64(payload + ":" + hmac(payload, secret))
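
Putting steps 3 and 4 together, a sketch in Node-flavoured TypeScript; the field names (ca, id) and the secret handling are illustrative choices, not a fixed format:

import { createHmac, timingSafeEqual } from "node:crypto";

// Opaque, signed cursor: base64url(JSON payload) + ":" + HMAC-SHA256 tag.
const SECRET = process.env.CURSOR_SECRET ?? "dev-only-secret";

function encodeCursor(payload: { ca: string; id: number }): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const tag = createHmac("sha256", SECRET).update(body).digest("base64url");
  return `${body}:${tag}`;
}

function decodeCursor(cursor: string): { ca: string; id: number } | null {
  const [body, tag] = cursor.split(":");
  if (!body || !tag) return null;
  const expected = createHmac("sha256", SECRET).update(body).digest("base64url");
  // Constant-time comparison so the tag can't be guessed byte by byte.
  const a = Buffer.from(tag);
  const b = Buffer.from(expected);
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  return JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
}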

SQL indexing for cursor-based

Cursor-based only pays off with fast queries. You need the right index:

CREATE INDEX idx_posts_created_at_id ON posts (created_at DESC, id DESC);

A composite index runs your cursor queries in O(log n). Without it, cursor-based pagination is slow too.
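
For reference, the forward-paging query that index is built for, written here as a parameterized SQL string; the tuple comparison is the same one sketched in the cursor section above:

// The row comparison (created_at, id) < ($1, $2) lines up with the index
// columns, so the database can seek to the cursor position instead of
// scanning from the top.
const KEYSET_PAGE_SQL = `
  SELECT *
  FROM posts
  WHERE (created_at, id) < ($1, $2)  -- values decoded from the cursor
  ORDER BY created_at DESC, id DESC
  LIMIT $3
`;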

Bi-directional cursors

Sometimes the user wants to go back: a “Previous page” button.

API:

GET /api/posts?after=cur_xyz   # forwards
GET /api/posts?before=cur_abc  # backwards

Backend:

# For 'before', flip the order, then reverse
SELECT * FROM posts
WHERE created_at > :cursor_ts
ORDER BY created_at ASC  # reversed
LIMIT 20;
# Reverse the list before returning

Return both cursors in the response:

{
  "items": [...],
  "next_cursor": "cur_next",
  "prev_cursor": "cur_prev"
}
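
A sketch of the “before” branch; db.query, decodeCursor, and encodeCursor are stand-ins declared inline, and the interesting parts are the flipped comparison, the flipped sort, and the reverse() before responding:

type Post = { id: number; created_at: string };
// Stand-ins so the sketch type-checks; not a real client or cursor codec.
declare const db: { query(sql: string, params: unknown[]): Promise<Post[]> };
declare function decodeCursor(c: string): { created_at: string; id: number };
declare function encodeCursor(row: Post): string;

async function getPreviousPage(before: string, limit = 20) {
  const c = decodeCursor(before);
  // Going backwards: flip both the comparison and the sort order...
  const rows = await db.query(
    `SELECT * FROM posts
     WHERE (created_at, id) > ($1, $2)
     ORDER BY created_at ASC, id ASC
     LIMIT $3`,
    [c.created_at, c.id, limit],
  );
  rows.reverse(); // ...then restore newest-first order before responding

  return {
    items: rows,
    next_cursor: rows.length ? encodeCursor(rows[rows.length - 1]) : null,
    prev_cursor: rows.length ? encodeCursor(rows[0]) : null,
  };
}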

Metadata challenges

Showing “total results” is harder with cursors. You need a separate count query:

SELECT COUNT(*) FROM posts WHERE ...;

That’s slow on a big dataset. A few optimisations:

1. Approximate count. Show “100K+”. In PostgreSQL, pg_class.reltuples gives an approximate value.

2. Cache the count. Recompute every 5 minutes and cache it.

3. Don’t show the total. In many UIs the total doesn’t matter. You don’t see “123,456 posts” in the Twitter feed.

Common mistakes

1. Generating the cursor on the client. If the client simply sends back the last item ID it saw, it can fabricate arbitrary cursors and probe data it shouldn’t see. The server should always generate (and sign) the cursor.

2. A non-unique ordering field. ORDER BY created_at with duplicate values means the cursor can’t tell identical rows apart, so items get skipped or repeated at page boundaries. Always add a tiebreaker.

3. No limit validation. The client sends limit=10000 and the backend returns it. Enforce a max limit server-side (100 is typical).

4. Computing total count on every offset request. Cache it.
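
Mistake 3 above is the cheapest to fix. A small clamp on the server side does it; the default and the cap here are illustrative:

// Clamp a client-supplied limit to a sane range.
function clampLimit(raw: unknown, fallback = 20, max = 100): number {
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 1) return fallback;
  return Math.min(n, max);
}

clampLimit("10000");   // → 100
clampLimit(undefined); // → 20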

Takeaway

On a small dataset, offset is fine. As the dataset grows, cursor-based becomes mandatory. A hybrid approach (offer both) gives you flexibility.

Build cursors from the ordering field plus a tiebreaker, encode them opaquely, and sign them. Back the queries with a composite index on the ordering columns. Those rules give you pagination that stays fast.

Most of the big APIs (Stripe, GitHub, Twitter) paginate with cursors. Model yours on theirs.
