Token counter for URLs

Paste any URL to count the tokens in the page. See the raw HTML cost, the Markdown version, and how much an AI agent saves when a site is Markdown-ready.

How to count the tokens in a URL

  1. Paste the URL into the box above and press Count tokens.
  2. Keep fetches the page twice: once with Accept: text/html to see what an LLM crawling the raw site pays for, and once with Accept: text/markdown to see whether the site serves Markdown natively.
  3. Compare the two columns. The headline number is how many tokens disappear when the same content is served as Markdown instead of HTML.
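
The comparison in those steps can be sketched in a few lines. This is an illustrative Python sketch, not Keep's actual code: the header dicts and the savings formula are assumptions based on the description above.

```python
# Illustrative sketch of the HTML-vs-Markdown comparison.
# The two fetches differ only in the Accept header:
HTML_FETCH_HEADERS = {"Accept": "text/html"}
MARKDOWN_FETCH_HEADERS = {"Accept": "text/markdown"}

def savings_percent(html_tokens: int, md_tokens: int) -> float:
    """Headline number: share of tokens that disappear when the
    same content is served as Markdown instead of HTML."""
    return 100.0 * (html_tokens - md_tokens) / html_tokens
```

A page that costs 10,000 tokens as raw HTML and 1,000 as Markdown would show a 90 percent saving.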

Why count tokens at the URL level

An agent fetching a webpage pays for every byte the server returns: nav menus, inlined CSS, tracking scripts, SVG icons, social widgets. None of it is useful for answering a question, yet all of it counts towards the context window. Measuring tokens on the raw HTML is the closest thing there is to a real bill. Measuring tokens on the Markdown version is the closest thing there is to what the same page should cost.

Markdown for Agents, in one paragraph

Markdown for Agents is a set of four conventions that let a website serve clean Markdown to AI agents. A site can honour an Accept: text/markdown header, publish a .md variant at the same path, add a link rel="alternate" type="text/markdown" tag to the HTML head, or send a Link response header pointing to the Markdown version. A site that implements any of the four shows up in this tool with a green Served natively badge. A site that does not gets an amber Simulated badge and the token saving you would see if it did.
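
Two of the four conventions are discoverable by parsing what the site already sends back. Here is a naive stdlib sketch of detecting them; the function names are made up for illustration, and a production checker would handle multi-valued rel attributes and quoting edge cases.

```python
from html.parser import HTMLParser

class MarkdownAlternateFinder(HTMLParser):
    """Finds <link rel="alternate" type="text/markdown" href="..."> in HTML."""
    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "alternate"
                and a.get("type") == "text/markdown"):
            self.href = a.get("href")

def markdown_link_from_html(html: str):
    finder = MarkdownAlternateFinder()
    finder.feed(html)
    return finder.href

def markdown_link_from_header(link_header: str):
    """Naive parse of a Link response header for a text/markdown target."""
    for part in link_header.split(","):
        if 'type="text/markdown"' in part:
            return part.split(";")[0].strip().strip("<>")
    return None
```

The other two conventions (honouring Accept: text/markdown and publishing a .md variant) are checked by making the request and looking at what comes back.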

Which tokenizer the tool uses

The tool uses OpenAI's o200k_base encoding, which is shared across every current OpenAI model, including GPT-4o, the o-series reasoning models, and the GPT-5 family. The tokenizer runs locally on the server and matches the OpenAI Tokenizer playground to within a token or two on any page.

Claude and Gemini token counts

Anthropic and Google do not ship a JavaScript tokenizer that runs on Cloudflare Workers, so the tool does not give you a precise Claude or Gemini number yet. The OpenAI count is a solid proxy because the major tokenizers fall within about 10 to 15 percent of each other on English prose. When a portable tokenizer is available for Claude or Gemini the tool will show those columns too.

Frequently asked questions

How do I count the tokens in a webpage?

Paste the URL into the box above and press Count tokens. Keep fetches the page server-side, counts tokens on the raw HTML, and counts tokens again on the Markdown version so you can see the difference. Nothing is uploaded from your machine.

Which tokenizer is this using?

The o200k_base encoding that OpenAI ships with every current model, from GPT-4o through the GPT-5 family and the o-series reasoning models. It runs locally on the server and is the same tokenizer OpenAI publishes. The numbers should match the OpenAI Tokenizer playground to within a token or two on any page.

Does the number apply to Claude and Gemini too?

Close, but not exact. Anthropic and Google do not ship pure-JavaScript tokenizers that run on Cloudflare Workers, so the tool cannot give you a precise Claude or Gemini count. The number it does give you is a solid proxy because the major tokenizers fall within roughly 10 to 15 percent of each other on English prose. Model-specific counts will land once portable tokenizers are available.

What is the Markdown column showing me?

The tool sends a second request to the same URL with Accept: text/markdown. If the site honours that header and returns real Markdown, the column is green and you see the actual tokens agents get today. If the site returns HTML anyway, the column is amber and the number comes from the Markdown version Keep would extract from the HTML. That is the hypothetical saving if the site added Markdown support.
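
The green-versus-amber decision comes down to the Content-Type of the response to that second request. A minimal sketch, assuming the badge labels used by this tool:

```python
def classify_markdown_response(content_type: str) -> str:
    """Return "native" (green, Served natively badge) if the site answered
    the Accept: text/markdown request with real Markdown, otherwise
    "simulated" (amber badge, extracted Markdown)."""
    # Strip any "; charset=..." parameter before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in ("text/markdown", "text/x-markdown"):
        return "native"
    return "simulated"
```

Accepting text/x-markdown as well is an assumption here; some older servers use that legacy media type.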

Why is Markdown so much smaller than HTML?

HTML carries the page chrome along with the article. Scripts, style tags, nav menus, cookie banners, tracking pixels, SVG icons, inlined social share widgets. None of that is useful to an AI agent. Markdown drops all of it and keeps just the headings, paragraphs, links, and lists. On most content pages that is an 80 to 95 percent reduction.
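
You can see the effect with a crude stdlib strip of just the script and style payloads; real HTML-to-Markdown converters also drop nav, footers, and the rest of the chrome, so this understates the saving.

```python
from html.parser import HTMLParser

class ChromeStripper(HTMLParser):
    """Crude illustration: drop <script> and <style> content, keep text."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0        # >0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.parts.append(data)

def visible_text(html: str) -> str:
    stripper = ChromeStripper()
    stripper.feed(html)
    return "".join(stripper.parts).strip()
```

On a page where most bytes are inlined JavaScript and CSS, the visible text is a small fraction of the raw HTML, which is where the 80 to 95 percent figure comes from.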

Does the site I check know I ran this?

It sees a normal server-side fetch with a regular browser user agent. Two requests per check, one for HTML, one for Markdown. Nothing identifies the request as coming from Keep. The URL you paste is not stored or tied to an account.

Why does the HTML column look bigger than the page feels?

A page that looks short in your browser often has 50 to 100 kilobytes of inlined React, Tailwind, analytics, and CSS keyframes. The tool counts every byte the server returned. If the site is entirely client-rendered, the initial HTML may even be an empty shell and the real content arrives later via JavaScript, in which case the HTML tokens are mostly scripts.

What counts as a token?

A token is roughly three to four characters of English text. Short common words are often one token each. Rare words and code fragments can take three or more tokens. Whitespace, punctuation, and capitalisation all shift the count. The tool uses the official BPE tokenizer so the number matches what the model actually charges.
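
The rules of thumb above can be written down as back-of-the-envelope estimators. These are approximations only; the real number comes from the BPE tokenizer.

```python
def estimate_tokens_by_chars(text: str) -> int:
    # Rule of thumb: roughly 4 characters of English per token.
    return round(len(text) / 4)

def estimate_tokens_by_words(text: str) -> int:
    # Rule of thumb: roughly 3 English words per 4 tokens.
    return round(len(text.split()) * 4 / 3)
```

Expect both estimates to drift on code, rare words, and punctuation-heavy text, where real tokenizers spend extra tokens.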

Can I count tokens in raw text instead of a URL?

The OpenAI Tokenizer playground is the right tool for pasted text. This page specialises in the URL case because that is where the HTML-vs-Markdown comparison matters. If you already have the Markdown and just want a count, paste it into any tokenizer and the number will match.

What is Markdown for Agents?

A set of conventions that let a site serve clean Markdown to AI agents instead of HTML. The four mechanisms are: honouring Accept: text/markdown on the request, publishing a .md variant at the same path, declaring <link rel="alternate" type="text/markdown"> in the HTML head, and adding a Link header pointing to the Markdown version. A site that implements any of the four shows up with a green Served natively badge in this tool.

Is there a Keep API for token counting?

Keep is the upstream product. Every page you bookmark gets stored as clean Markdown in a searchable library, which means you get the token-efficient version without having to check each site yourself. This tool is the public lookup for one URL at a time.

How long a page can it handle?

The tool truncates the Markdown at roughly 10,000 characters for the preview and the savings math. Very long articles are handled fine; they just show a truncation marker in the preview. The token numbers are computed on the truncated Markdown, so the savings percentage stays honest even when the underlying page is huge.
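
The truncation step is simple to sketch. The limit and the marker text here are assumptions based on the description above, not Keep's exact values.

```python
PREVIEW_LIMIT = 10_000            # characters, roughly as described above
TRUNCATION_MARKER = "\n\n[truncated]"   # hypothetical marker text

def truncate_markdown(markdown: str, limit: int = PREVIEW_LIMIT) -> str:
    """Cap the Markdown used for the preview and the token math."""
    if len(markdown) <= limit:
        return markdown
    return markdown[:limit] + TRUNCATION_MARKER
```

Short pages pass through untouched; only pages past the limit gain the marker.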

Save any page as clean Markdown with Keep

Keep turns every URL you bookmark into clean, searchable Markdown in a persistent library you can query or hand to an AI agent. If this tool is useful for checking one page, Keep is the version that does it for every page you save, automatically, without you having to think about it.

Learn more about Keep
