№ 09 · Learn With Darin · Field Guide

Grok from xAI.

A practitioner's guide to the assistant that lives inside X, the standalone app, and grok.com. Where the real-time data angle genuinely wins, where the model lineup pays off, and where the product's posture creates tradeoffs you have to plan for.

Updated May 2026 · ~22 min read · Covers Free, SuperGrok, SuperGrok Heavy, X Premium+, API
Part 01

What Grok actually is in 2026

Grok is xAI's assistant, and like most of the big models it ships through several front doors. The product was first introduced inside X (formerly Twitter) in late 2023, then split out into a standalone product through 2024 and 2025, then rolled into a proper consumer surface and an API. As of May 2026, those surfaces are:

  • Inside X, at grok.x.com and via the in-app Grok panel on web and mobile. This is where Grok was born and where its X-data integration is most direct: @grok mentions in a thread, the "Explain this post" button, and the embedded chat panel that knows about the current timeline view.
  • The standalone web app at grok.com. This is the canonical Grok-as-assistant surface: a clean chat UI, model picker, mode switches (Think, DeepSearch), file uploads, image generation, voice. No timeline noise. If you're using Grok for work rather than for X, this is the door you want.
  • Standalone mobile and desktop apps. iOS and Android shipped first in 2025. macOS and Windows builds followed; both are thin wrappers over the web app with OS niceties (system tray, share intents, voice mode in the background).
  • The API at console.x.ai, with OpenAI-compatible endpoints. Pricing is per-token and broadly competitive with the other frontier APIs.
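Because the endpoints are OpenAI-compatible, calling them is plain HTTP. The sketch below uses only the standard library; the base URL, the model identifier "grok-4", and the XAI_API_KEY environment variable name are assumptions from the text, not verified against current xAI docs.

```python
import json
import os
import urllib.request

# Assumed base URL; check console.x.ai for the current value.
API_BASE = "https://api.x.ai/v1"


def build_chat_request(prompt: str, model: str = "grok-4") -> dict:
    """Build the JSON body for an OpenAI-compatible chat-completions call."""
    return {
        "model": model,  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
    }


def ask_grok(prompt: str) -> str:
    """Send one chat turn and return the model's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['XAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


# Usage (requires XAI_API_KEY in the environment):
#   print(ask_grok("Summarize today's top AI news on X."))
```

Because the shape matches the OpenAI spec, the official OpenAI SDK pointed at a different `base_url` should also work, which is what makes provider-swapping cheap later.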

Inside all of these, you're talking to the same family of models: Grok 3 and Grok 4, with the Heavy variant on the top tier and Think and DeepSearch as modes you flip on per turn. Which models you can pick is a function of your plan, not the surface. Free has the smallest model picker. X Premium+ rolls some of Grok in with the X subscription. SuperGrok and SuperGrok Heavy are the standalone tiers.

Note The two most common confusions in 2026 are (1) thinking "Grok inside X" and "the Grok app" are different products (they share an account, sync conversations, and run the same models), and (2) assuming X Premium+ unlocks the same caps as SuperGrok. It mostly does for chat, but Heavy and the higher DeepSearch budgets sit behind the standalone tier.

The thing that makes Grok worth a guide of its own, separate from the other six already on this hub, is one specific capability: it can read X in real time, at scale, with native API access to the firehose. None of the other major assistants can do that with the same fidelity. Whether that capability matters to you is the central question this guide is going to keep returning to.

Part 02

The X integration: where the real-time data angle wins

"Real-time web access" is something most assistants now claim. Grok's version is different in degree and in kind. The other tools query a search engine that crawled the web on some lag and then snippets the results back through a context window. Grok queries X directly, with structured access to posts, threads, replies, quotes, engagement metrics, and the social graph. For anything that lives on X first, that's a meaningful gap.

Here's the practical contrast:

Grok with X access

  • Sees posts seconds after they're posted, with thread context, quote-tweet chains, and reply structure intact.
  • Can answer "what's the consensus on X right now" with an actual sample of posts, not a summary of articles that summarized posts.
  • Knows who is posting (verified status, follower count, account age) and can weight accordingly when you ask.
  • Can be invoked directly inside a thread with @grok for in-context analysis the rest of the timeline can read.
  • Inherits X's content surface: trends, lists, communities, spaces transcripts when available.

Other assistants without it

  • Reach X content through the public web only: scraped pages, news articles that quote posts, third-party aggregators.
  • Lag is typically minutes to hours, sometimes days, and quote/reply structure is often flattened or lost.
  • Can't easily distinguish a post with 10 likes from one with 100,000 unless that signal happens to surface in the article they read.
  • Have no in-context invocation; you copy a link in, they fetch the page, and that's the extent of the interaction.
  • Stronger on long-form web (blogs, papers, docs) and on sources X never sees.

Where this matters, in order of how often I actually reach for it:

  • Live event tracking. A keynote, an outage, a sports moment, a market open. Asking Grok "what's the read on X about <event> in the last 30 minutes" returns a synthesized answer with citations to specific posts. Other tools can't do this without a multi-step web search dance, and the freshness gap shows.
  • Sentiment on a specific post or topic. Drop a link, ask "how are people reacting." You get a distribution of responses, not just the post's text re-explained.
  • Public-figure quote-checking. "Has <person> said anything about <topic> in the last week" is fast and reasonably accurate, with links to the actual posts. The other tools either can't or hallucinate.
  • Company / product launch chatter. When a launch happens, X is usually where the unfiltered first reactions surface. Grok reads them in place.

Where it doesn't help, and you should reach elsewhere: anything that lives mostly off-X. Long-form analysis. Documentation. Code questions. Most academic and scientific topics. The X corpus is one slice of the world; treating Grok as a general-purpose assistant means leaning on the same web search the others use, where its lead disappears.

Part 03

Model lineup and modes

The lineup as of May 2026 is the cleanest it's been since Grok launched. Two main models, one heavy variant, two modes you flip on per turn.

  • Grok 3. The previous flagship, still in the picker for cost and latency reasons. Solid for everyday chat, fast, cheap on the API. The default on Free and on most X Premium+ contexts.
  • Grok 3 Mini. The tiny one, mostly an API-tier model. Inside the consumer apps it shows up implicitly as the fallback you get when you hit a cap.
  • Grok 4. The current flagship. Genuinely competitive with the other frontier models on reasoning benchmarks; first-party benchmarks should be taken with the usual salt, but third-party evals broadly back the claim.
  • Grok 4 Heavy. The top-shelf variant. Multi-agent under the hood: it spawns parallel reasoning paths and reconciles them before answering. Slower, more expensive in token economics, noticeably better on hard reasoning, math, and long-horizon tasks. SuperGrok Heavy only on the consumer side.

Then the two modes, both of which can be toggled with the buttons next to the input box:

  • Think. Extended reasoning before the response. The same idea as OpenAI's o-series and Anthropic's extended thinking: visible chain-of-thought in a panel, slower output, better on hard problems. Available on Grok 4 and Grok 4 Heavy.
  • DeepSearch. An agentic web research mode. Grok plans a search, fans out into the web (and X), reads sources, follows citations, comes back with a structured answer. Closer to Claude's agentic web search or OpenAI's deep research than to a single web query. Slower (often minutes), better on questions that need synthesis across many sources.
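On the API side, flipping these modes per request looks roughly like the sketch below. The parameter names ("reasoning_effort", the "web_search" tool type) and the model identifier are guesses modeled on other OpenAI-compatible APIs, not taken from xAI's documentation; verify against the current API reference before relying on them.

```python
def build_mode_request(prompt: str, think: bool = False,
                       deep_search: bool = False) -> dict:
    """Build an OpenAI-compatible chat body with optional per-turn modes.

    All parameter names beyond "model" and "messages" are hypothetical.
    """
    body = {
        "model": "grok-4",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
    }
    if think:
        # Extended reasoning before the response; name is a guess
        # modeled on other OpenAI-compatible reasoning APIs.
        body["reasoning_effort"] = "high"
    if deep_search:
        # Agentic web research; tool name is likewise a guess.
        body["tools"] = [{"type": "web_search"}]
    return body
```

The point of the sketch is the shape, not the names: modes are per-request knobs on an otherwise ordinary chat call, which is why the same conversation can mix fast turns and slow research turns.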

Here's the rough capability picture by tier. Caps move; treat the numbers as orders of magnitude, not exact ceilings.

  • Free ($0). Models: Grok 3, with fallback to Grok 3 Mini under load. Modes: limited DeepSearch, no Think. Headroom: tight. A handful of Grok 4 turns per day on a rolling promo basis, then back to Grok 3.
  • X Premium+ ($40/mo, the X subscription). Models: Grok 3, Grok 4 within caps. Modes: Think and DeepSearch within caps. Headroom: comfortable for chat-shaped use; not enough for heavy DeepSearch days.
  • SuperGrok (~$30/mo). Models: Grok 3, Grok 4. Modes: Think and DeepSearch at full caps. Headroom: the standalone tier, with a higher DeepSearch budget than X Premium+.
  • SuperGrok Heavy (~$300/mo). Models: all of the above plus Grok 4 Heavy. Modes: Think, DeepSearch, Heavy mode. Headroom: effectively uncapped for normal use. Aimed at researchers, traders, journalists.
  • API (per-token). Models: all, including Heavy. Modes: Think and DeepSearch via parameters. Headroom: pay for what you use, through an OpenAI-compatible SDK.
Tip If you're already paying for X Premium+, do not also pay for SuperGrok unless you actually use DeepSearch heavily: the chat caps overlap, and the bigger DeepSearch budget is the main thing SuperGrok buys you that Premium+ doesn't. The Heavy variant sits behind SuperGrok Heavy only.

Beyond the chat models, two other capabilities matter for picking a tier:

  • Aurora image generation. xAI's in-house image model, integrated into the chat surface. Quality is broadly competitive with the mid-tier image models; it's permissive about styles other generators refuse, which is a feature for some workflows and a problem for others.
  • Companions. Voice personalities you can chat with in voice mode. Largely a consumer product, not a workflow tool. Worth knowing it exists; not worth picking a tier over.

Part 04

Practical workflows and recipes

The recipes that actually justify a Grok subscription, in order of how often I lean on them. Each one leverages the X-data angle or DeepSearch; if your work doesn't, a different assistant will probably serve you better.

i.

Real-time event synthesis.

Something is happening right now: a launch, an outage, an earnings call, a political moment. Open Grok, ask "summarize what's being said on X about <event> in the last hour, weighted toward verified accounts and high-engagement posts, with three representative posts per cluster of opinion." You get a usable readout with links. Refresh every fifteen minutes during the event window.

ii.

Sentiment-on-a-topic before you write about it.

Before publishing a take on a contested topic, ask Grok for the existing distribution of views on X, with examples. The point is not to copy the takes; it's to know which arguments you'll need to address and which ones are strawmen versus actually held. Surprisingly useful for not embarrassing yourself in public.

iii.

Quote-checking and provenance.

"Did <person> actually say <quote>? If so, when, in what context, and what was the response?" Grok handles this well because it can reach into post-level history with citations. Verify the citations yourself before quoting; the model can still confabulate, but the cost of checking is one click.

iv.

DeepSearch for a brief on a niche topic.

Pick a question that needs synthesis across many sources, not just one good article. "Compare the major positions on <policy> as of this month, with the strongest argument from each side and the empirical evidence each cites." DeepSearch will spend minutes, return a structured brief, and link out. Treat the output as a starting outline, not a finished piece.

v.

"Explain this thread" inside X.

When you land in a thread that assumes context you don't have, hit the Grok button on the post. You get the missing background and a plain-language read of what's actually being argued. Faster than scrolling up; better than guessing.

vi.

Track a developing story across days.

For ongoing stories, run a daily "what's new on <topic> in the last 24 hours that I might have missed" query. Save the chat and reuse it. The combination of fresh X data and Grok's ability to summarize across many posts beats reading a feed manually for keeping up with a single topic.

vii.

Use the API where the X data is the differentiator.

If you're building anything that consumes X content programmatically (a dashboard, an alerting tool, a research pipeline), the Grok API is the cheapest way to combine "search X" with "have a model summarize the result." The OpenAI-compatible SDK means you can swap providers later if needed.
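A dashboard or alerting pipeline of that kind reduces to a small polling loop. The sketch below is illustrative: the prompt wording is mine, and `send` stands in for any function that wraps the OpenAI-compatible chat endpoint with your API key.

```python
import time
from typing import Callable


def build_watch_prompt(topic: str, window_hours: int = 24) -> str:
    """Compose the recurring query for a developing story."""
    return (
        f"What's new on X about {topic} in the last {window_hours} hours "
        "that I might have missed? Cite specific posts."
    )


def watch(topic: str, send: Callable[[str], str],
          interval_s: int = 900, cycles: int = 1) -> list:
    """Poll on a schedule and collect the readouts.

    `send` takes a prompt string and returns the model's reply; in
    practice it would call the chat-completions endpoint.
    """
    readouts = []
    for i in range(cycles):
        readouts.append(send(build_watch_prompt(topic)))
        if i < cycles - 1:
            time.sleep(interval_s)
    return readouts
```

Injecting `send` rather than hardcoding the HTTP call keeps the loop testable offline and makes swapping providers later a one-line change, which is the practical payoff of the OpenAI-compatible SDK mentioned above.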

Part 05

The content-posture angle

Grok will answer questions other models won't. That's a deliberate product choice by xAI, and it deserves to be discussed plainly because it's the part of the product most likely to determine whether you'll use it for work.

What "more permissive" looks like in practice:

  • Will discuss topics flagged as "sensitive" by some other assistants (drug policy, weapons, contested historical claims, adult themes in fiction) with fewer pre-emptive refusals.
  • Has a "fun mode" that loosens the conversational register; "regular mode" is closer to the other assistants in tone.
  • Generally answers questions about itself, its training, and its company more directly than peers, including occasionally critical things.
  • Aurora, the image model, has fewer subject-matter restrictions than most peers, particularly around public figures.

Why it matters for professional use:

  • Useful when other assistants over-refuse legitimate work: a journalist researching extremism, a clinician asking about overdose thresholds, a fiction writer working with adult material, a policy analyst writing about contested topics. Grok's lower refusal rate cuts down on the "I can't help with that" friction that makes the other tools feel overcautious.
  • Risky when the same permissiveness produces output you can't ship. Marketing teams in regulated industries, public-sector workflows, and anything customer-facing in a brand-safety-sensitive context have to reckon with the chance that Grok will produce something the other assistants would have declined to produce. The fact that you could get the answer doesn't mean you should publish it.
  • Mixed for image generation. Aurora's permissiveness is what makes it useful for some creative work and problematic for compliance-sensitive workflows. Pick deliberately.

The honest framing: Grok's posture is a tool feature, not a moral verdict. Some teams will treat the permissiveness as table stakes, others will treat it as a non-starter. Both are reasonable. The mistake is using the tool without understanding which camp your team is in.

One operational note: the posture is not constant. xAI has tightened and loosened defaults multiple times since launch, and the system prompt that shapes Grok's behavior has been updated publicly more than once. If you depend on a specific behavior, test it again every few weeks; it can move.

Part 06

Limits and pitfalls

The places Grok will let you down. Most are direct consequences of what makes it distinctive in the first place; the X-data angle and the permissive posture both have downsides built in.

X data quality varies wildly

X is a public posting surface, which means a meaningful share of any topic-cluster will be bots, low-effort engagement-farming, repost chains, and accounts misrepresenting themselves. Grok weights for engagement and verified status by default, but those are imperfect signals. Always ask for citations and read at least two of them. If the citations are all from one cluster of accounts, treat the synthesis as one perspective, not the perspective.

DeepSearch hallucinates under time pressure

DeepSearch is the mode most likely to invent a citation that doesn't exist or to misattribute a quote. The agentic loop is fast and confident, and confident-and-fast is exactly the failure mode for source-grounded work. Open every linked source it cites for any claim you'd actually rely on. About 1 in 10 DeepSearch citations in heavy use is wrong in some material way; the other 9 are fine, which makes the bad one easy to miss.

Third-party integration ecosystem is thin

Compared to the Claude / ChatGPT / Copilot ecosystems, Grok has fewer first-party connectors, no equivalent of Custom GPTs or Claude Projects with rich tool-calling, and limited workspace integration. The API is solid; the consumer-side "build on top of Grok" surface is much more limited. If your workflow assumes a marketplace of integrations, look elsewhere.

Regional rollout is uneven

Some Grok features ship to the US first and lag elsewhere by weeks or months. The EU has had specific availability constraints around Grok's training-data posture and X-data access. If you're outside the US, check the actual feature list against your region before subscribing; the marketing page is sometimes ahead of the rollout.

The politics-of-the-product tradeoff

Grok is associated with X and with xAI's CEO, both of which carry political associations across multiple axes. For some teams that's a non-issue; for others, using Grok in a customer-facing or branded workflow comes with reputational considerations the other assistants don't. This is not a comment on whether those associations are fair; it's a comment on whether your stakeholders will care. If they will, you'll want to know that before the integration is live.

The X-data angle disappears off X

The single biggest mistake new Grok users make is treating it as a general-purpose assistant and being underwhelmed. For long-form code, document drafting, math help, or anything where the source material isn't on X, Grok is fine but rarely best-in-class. Use it where its differentiator applies; reach for another tool where it doesn't.

Warn Treat anything Grok tells you about a person's current views with extra skepticism. A high-engagement post from three years ago can outweigh a dozen recent posts that revise the position; Grok's weighting won't always notice. For anything reputational, verify against the person's own current account before quoting.

Part 07

When to reach for Grok vs another tool

The simplest decision rule, after a year of using Grok alongside the other big assistants:

Reach for Grok when the answer lives on X, in real time, or behind a posture other tools won't take. Reach for a different tool for almost everything else. — TWD

Concretely:

If your task is... reach for:

  • Live event tracking, sentiment, X-native research → Grok (its differentiator)
  • Quote-checking a public figure's posts → Grok
  • Synthesizing a fast brief across many web sources → Grok DeepSearch or Claude / ChatGPT deep research; pick by which corpus you trust more
  • Long-form writing, editing, document work → Claude or ChatGPT
  • Code, repo-aware engineering work → Claude Code or Codex
  • Workspace-integrated office tasks → Microsoft Copilot or Gemini
  • Anything requiring brand-safe / compliance-sensitive output → Claude or Gemini; not Grok
  • Topics other assistants over-refuse legitimately → Grok (with sourcing discipline) or a self-hosted open-weight model
  • API-side X data plus a model in one bill → Grok API

If you only subscribe to one assistant, Grok is unlikely to be it; the X angle is too narrow to be a primary daily driver for most people. If you already pay for one and you spend real time on X, adding SuperGrok or X Premium+ is the cheapest way to pick up a capability the others genuinely don't have. If your work is X-adjacent (journalism, trading, communications, comms-aware policy work), Grok stops being a curiosity and starts being a daily tool.

And if any of this is out of date by the time you read it: x.ai/news tracks model and product releases; the in-app changelog at grok.com is a faithful echo. Both consumer apps lag the web by a release or two.