Essay
I Watched AI Chatbots Watch Me
I built a local monitor for my own browser traffic and used ChatGPT, Claude, Gemini, and Grok normally for 10 days. What fell out was a more concrete picture of how much of the experience is conversation, and how much is telemetry.
2026-04-12
I wanted to answer a simple question:
When you use AI chatbots in the browser, how much of the network traffic is actually the conversation?
So I built a local monitor for my own machine. It hooks into Chrome DevTools, captures AI-related traffic into a local SQLite database, and lets me inspect the requests later instead of scrolling through the Network tab by hand.
Then I used the products normally for 10 days.
What I got was not a scandal. It was something more useful: a concrete picture of how much browser-based AI is wrapped in telemetry, analytics, suggestions, ads, and background infrastructure.
The setup
This was intentionally simple.
- one local capture daemon
- my own browser traffic only
- normal day-to-day usage
- no decompilation, no privileged access, no remote interception
At the end of the run I had almost 20,000 captured requests across ChatGPT, Claude, Gemini, and Grok.
That is the first thing that changes your intuition. These products are not a tidy "send prompt, get answer" loop. They are full browser apps with lots of supporting traffic around the main interaction.
The top-line numbers
Over 10 days of my own usage:
| Platform | Total requests | Sessions | Notes |
|---|---|---|---|
| ChatGPT | 13,304 | 48 | biggest footprint in this capture |
| Grok | 4,741 | 21 | heavy telemetry and suggestion traffic |
| Claude | 1,264 | 10 | smallest request volume of the group |
| Gemini | 242 | few | least-used in this dataset |
That works out to roughly:
- 277 requests per ChatGPT session
- 226 requests per Grok session
- 126 requests per Claude session
Those numbers are usage-dependent, so they are not universal. But they are enough to make the point: the visible conversation is only one layer of what is happening.
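The per-session figures are nothing more than the totals from the table divided by session counts and rounded:

```python
# (total requests, sessions) from the 10-day capture.
captures = {
    "ChatGPT": (13_304, 48),
    "Grok": (4_741, 21),
    "Claude": (1_264, 10),
}

per_session = {
    name: round(total / sessions)
    for name, (total, sessions) in captures.items()
}
print(per_session)  # {'ChatGPT': 277, 'Grok': 226, 'Claude': 126}
```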
ChatGPT: the busiest surface
ChatGPT generated by far the most traffic in this capture, which partly reflects that I used it the most. But the shape of the traffic was interesting too.
In the sampled period:
- about 29% of ChatGPT traffic was telemetry or tracking rather than core conversation traffic
- I captured 1,634 analytics events
- I saw repeated A/B-test and experiment traffic
- I saw autocomplete-related requests before message submission
The event stream was especially revealing because it makes attention patterns visible. Things like link clicks, focus changes, pastes, and UI interactions show up as data instead of just interface behavior.
That is not shocking in the abstract. Most large web products do analytics. What changed for me was seeing the volume and granularity in one place.
Grok: suggestion traffic with more context than I expected
Grok was the most surprising system in the capture.
Two things stood out:
- It sent a lot of suggestion-related traffic while typing.
- Some of those requests carried more conversation context than I expected that early in the interaction.
In this run I captured:
- 271 keystroke-related transmissions
- 40 unique leaked conversation titles in the local database
That second point is the one that sticks. Before the user experience feels like "I sent a message," the product may already be sending surrounding context to support suggestions and continuity.
Again, this is still your own browser making requests you can inspect in DevTools. The difference is that almost nobody actually watches it long enough to build a mental model.
Claude: smaller footprint, still real telemetry
Claude had the lightest request volume in my capture, but that does not mean it was telemetry-free.
What I saw:
- 1,264 requests over 10 sessions
- roughly 36% telemetry in this capture
- analytics events tied to product behavior and feature usage
- no keystroke capture in the way I saw on other platforms
The absolute volume was much lower than ChatGPT, which matters in practice. But it was still clearly a modern product telemetry surface, not a bare prompt-response channel.
Gemini: the browser product sits on top of Google's larger stack
Gemini had the fewest sessions in my dataset because I used it the least, so I want to be careful not to overstate the comparison.
Still, even limited captures made one thing obvious: a large share of the traffic sits on top of Google's broader web infrastructure rather than a clean, isolated chat surface.
That makes sense if you know the company. It just looks different when you see the requests directly instead of thinking about "Gemini" as a self-contained app.
What actually surprised me
The biggest surprise was not that these products do telemetry. Of course they do.
The surprises were:
- how many requests it takes to support what feels like one simple interaction
- how early suggestion and assistance features begin sending data
- how much product experimentation and feature-flagging traffic surrounds the main flow
- how different the implementation styles are across providers even when the UX looks similar
This is why I think local capture is useful. Privacy policies talk in categories. Network traces show behavior.
The important caveat
This is not a universal ranking of "which chatbot is best for privacy."
It is a measured snapshot of:
- my accounts
- my usage
- my browser
- my capture window
- the specific product behavior during that period
Some platforms may look heavier simply because I used them more. Some telemetry can be product-quality instrumentation rather than something more sinister. Some features only activate for certain users or experiments.
So the right way to read this is not as a final scoreboard. It is as an existence proof that these products are much easier to inspect than people assume, and that the surrounding telemetry layer is substantial.
Why I built the tool
The browser already exposes all of this. The hard part is not access. The hard part is making the data survivable.
After a few sessions, the Network tab is noise. A local monitor that stores the traffic, classifies it, and lets you ask questions later is much more useful.
Questions like:
- how much of this traffic is actually telemetry
- are keystrokes sent before submit
- what identifiers does this platform assign
- what protocol does it use to stream responses
- what changed between two capture sessions
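With the captures in SQLite, questions like these collapse into short queries. A sketch, assuming a `requests` table with illustrative columns and a few made-up rows so the queries return something; the column names and content types are assumptions, not the real tool's schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE requests (
    session      TEXT,
    url          TEXT,
    category     TEXT,   -- e.g. 'conversation', 'telemetry', 'other'
    content_type TEXT
)
""")

# Made-up rows purely so the queries below have data to chew on.
db.executemany(
    "INSERT INTO requests VALUES (?, ?, ?, ?)",
    [
        ("s1", "https://chat.example.com/api/conversation", "conversation", "text/event-stream"),
        ("s1", "https://chat.example.com/v1/events", "telemetry", "application/json"),
        ("s1", "https://chat.example.com/v1/events", "telemetry", "application/json"),
    ],
)

# How much of this traffic is actually telemetry?
telemetry_share = db.execute(
    "SELECT 100.0 * SUM(category = 'telemetry') / COUNT(*) FROM requests"
).fetchone()[0]

# What protocol streams responses? Server-sent events show up
# as a distinctive content type on the conversation endpoint.
streaming_urls = [
    row[0]
    for row in db.execute(
        "SELECT DISTINCT url FROM requests WHERE content_type = 'text/event-stream'"
    )
]
```

Diffing two capture sessions, or listing assigned identifiers, is the same pattern: one more column, one more query.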
Those are practical questions, not abstract privacy debates.
What I took away
Browser-based AI is not just "AI." It is AI plus product analytics, growth infrastructure, suggestions, caching, experiments, and whatever else the company has layered around the chat box.
That does not automatically make it bad.
But it does mean the default mental model most people have is too simple.
If you care about privacy, product design, protocol behavior, or just how these systems are really built, watching the traffic is worth it. Not because it reveals hidden magic, but because it removes the abstraction.
Once you do that, the products feel less like mysterious chat interfaces and more like what they really are: large web applications with an LLM in the middle.