№ 09 · Learn With Darin · Field Guide
Grok from xAI.
A practitioner's guide to the assistant that lives inside X, the standalone app, and grok.com. Where the real-time data angle genuinely wins, where the model lineup pays off, and where the product's posture creates tradeoffs you have to plan for.
What Grok actually is in 2026
Grok is xAI's assistant, and like most of the big models it ships through several front doors. The product was first introduced inside X (formerly Twitter) in late 2023, then grew through 2024 and 2025 into a standalone app, a proper consumer web surface, and an API. As of May 2026, those surfaces are:
- Inside X, at grok.x.com and via the in-app Grok panel on web and mobile. This is where Grok was born and where its X-data integration is most direct: @grok mentions in a thread, the "Explain this post" button, and the embedded chat panel that knows about the current timeline view.
- The standalone web app at grok.com. This is the canonical Grok-as-assistant surface: a clean chat UI, model picker, mode switches (Think, DeepSearch), file uploads, image generation, voice. No timeline noise. If you're using Grok for work rather than for X, this is the door you want.
- Standalone mobile and desktop apps. iOS and Android shipped first in 2025. macOS and Windows builds followed; both are thin wrappers over the web app with OS niceties (system tray, share intents, voice mode in the background).
- The API at console.x.ai, with OpenAI-compatible endpoints. Pricing is per-token and broadly competitive with the other frontier APIs.
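Because the endpoints follow the OpenAI schema, a plain HTTP call is enough to get started. Here's a minimal sketch; the base URL, the grok-4 model id, and the XAI_API_KEY environment variable are assumptions you should verify against the console.x.ai docs before using:

```python
import json
import os
import urllib.request

# Assumed base URL; confirm the current value in the console.x.ai docs.
XAI_BASE_URL = "https://api.x.ai/v1"


def build_chat_request(prompt: str, model: str = "grok-4") -> dict:
    """Build an OpenAI-schema chat-completions body for one user turn."""
    return {
        "model": model,  # assumed model id; check the API console
        "messages": [{"role": "user", "content": prompt}],
    }


def ask_grok(prompt: str) -> str:
    """Send one prompt and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{XAI_BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['XAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # OpenAI-compatible responses put the reply under choices[0].message.
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the schema is OpenAI-compatible, the official OpenAI SDK pointed at the xAI base URL should work the same way, which is what makes swapping providers later cheap.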
Inside all of these, you're talking to the same family of models: Grok 3 and Grok 4, with the Heavy variant on the top tier and Think and DeepSearch as modes you flip on per turn. Which models you can pick is a function of your plan, not the surface. Free has the smallest picker. X Premium+ rolls some of Grok in with the X subscription. SuperGrok and SuperGrok Heavy are the standalone tiers.
The thing that makes Grok worth a guide of its own, separate from the other six already on this hub, is one specific capability: it can read X in real time, at scale, with native API access to the firehose. None of the other major assistants can do that with the same fidelity. Whether that capability matters to you is the central question this guide is going to keep returning to.
The X integration: where the real-time data angle wins
"Real-time web access" is something most assistants now claim. Grok's version is different in degree and in kind. The other tools query a search engine that crawled the web on some lag and then snippets the results back through a context window. Grok queries X directly, with structured access to posts, threads, replies, quotes, engagement metrics, and the social graph. For anything that lives on X first, that's a meaningful gap.
Here's the practical contrast:
Grok with X access
- Sees posts seconds after they're posted, with thread context, quote-tweet chains, and reply structure intact.
- Can answer "what's the consensus on X right now" with an actual sample of posts, not a summary of articles that summarized posts.
- Knows who is posting (verified status, follower count, account age) and can weight accordingly when you ask.
- Can be invoked directly inside a thread with @grok for in-context analysis the rest of the timeline can read.
- Inherits X's content surface: trends, lists, communities, Spaces transcripts when available.
Other assistants without it
- Reach X content through the public web only: scraped pages, news articles that quote posts, third-party aggregators.
- Lag is typically minutes to hours, sometimes days, and quote/reply structure is often flattened or lost.
- Can't easily distinguish a post with 10 likes from one with 100,000 unless that signal happens to surface in the article they read.
- Have no in-context invocation; you copy a link in, they fetch the page, and that's the extent of the interaction.
- Stronger on long-form web (blogs, papers, docs) and on sources X never sees.
Where this matters, in order of how often I actually reach for it:
- Live event tracking. A keynote, an outage, a sports moment, a market open. Asking Grok "what's the read on X about <event> in the last 30 minutes" returns a synthesized answer with citations to specific posts. Other tools can't do this without a multi-step web search dance, and the freshness gap shows.
- Sentiment on a specific post or topic. Drop a link, ask "how are people reacting." You get a distribution of responses, not just the post's text re-explained.
- Public-figure quote-checking. "Has <person> said anything about <topic> in the last week" is fast and reasonably accurate, with links to the actual posts. The other tools either can't or hallucinate.
- Company / product launch chatter. When a launch happens, X is usually where the unfiltered first reactions surface. Grok reads them in place.
Where it doesn't help, and you should reach elsewhere: anything that lives mostly off-X. Long-form analysis. Documentation. Code questions. Most academic and scientific topics. The X corpus is one slice of the world; treating Grok as a general-purpose assistant means leaning on the same web search the others use, where its lead disappears.
Model lineup and modes
The lineup as of May 2026 is the cleanest it's been since Grok launched. Two main models, one heavy variant, two modes you flip on per turn.
- Grok 3. The previous flagship, still in the picker for cost and latency reasons. Solid for everyday chat, fast, cheap on the API. The default on Free and on most X Premium+ contexts.
- Grok 3 Mini. The tiny one, mostly an API-tier model. Used implicitly inside the consumer apps when you hit a cap and get a fallback.
- Grok 4. The current flagship. Genuinely competitive with the other frontier models on reasoning benchmarks; first-party benchmarks should be taken with the usual salt, but third-party evals broadly back the claim.
- Grok 4 Heavy. The top-shelf variant. Multi-agent under the hood: it spawns parallel reasoning paths and reconciles them before answering. Slower, more expensive in token economics, noticeably better on hard reasoning, math, and long-horizon tasks. SuperGrok Heavy only on the consumer side.
Then the two modes, both of which can be toggled with the buttons next to the input box:
- Think. Extended reasoning before the response. The same idea as OpenAI's o-series and Anthropic's extended thinking: visible chain-of-thought in a panel, slower output, better on hard problems. Available on Grok 4 and Grok 4 Heavy.
- DeepSearch. An agentic web research mode. Grok plans a search, fans out into the web (and X), reads sources, follows citations, comes back with a structured answer. Closer to Claude's agentic web search or OpenAI's deep research than to a single web query. Slower (often minutes), better on questions that need synthesis across many sources.
Here's the rough capability picture by tier. Caps move; treat the numbers as orders of magnitude, not exact ceilings.
| Tier | Cost | Models available | Modes | Headroom |
|---|---|---|---|---|
| Free | $0 | Grok 3 (with fallback to Mini under load) | Limited DeepSearch, no Think | Tight. A handful of Grok 4 turns per day on a rolling promo basis, then back to Grok 3. |
| X Premium+ | $40/mo (X subscription) | Grok 3, Grok 4 within caps | Think, DeepSearch within caps | Comfortable for chat-shaped use; not enough for heavy DeepSearch days. |
| SuperGrok | ~$30/mo | Grok 3, Grok 4 | Think, DeepSearch, full caps | The standalone tier. Higher DeepSearch budget than X Premium+. |
| SuperGrok Heavy | ~$300/mo | All of the above plus Grok 4 Heavy | Think, DeepSearch, Heavy mode | Effectively uncapped for normal use. Aimed at researchers, traders, journalists. |
| API | Per-token | All models, including Heavy | Think and DeepSearch via parameters | Pay for what you use. OpenAI-compatible SDK. |
Beyond the chat models, two other capabilities matter for picking a tier:
- Aurora image generation. xAI's in-house image model, integrated into the chat surface. Quality is broadly competitive with the mid-tier image models; it's permissive about styles other generators refuse, which is a feature for some workflows and a problem for others.
- Companions. Voice personalities you can chat with in voice mode. Largely a consumer product, not a workflow tool. Worth knowing it exists; not worth picking a tier over.
Practical workflows and recipes
The recipes that actually justify a Grok subscription, in order of how often I lean on them. Each one leverages the X-data angle or DeepSearch; if your work doesn't, a different assistant will probably serve you better.
Real-time event synthesis.
Something is happening right now: a launch, an outage, an earnings call, a political moment. Open Grok, ask "summarize what's being said on X about <event> in the last hour, weighted toward verified accounts and high-engagement posts, with three representative posts per cluster of opinion." You get a usable readout with links. Refresh every fifteen minutes during the event window.
Sentiment-on-a-topic before you write about it.
Before publishing a take on a contested topic, ask Grok for the existing distribution of views on X, with examples. The point is not to copy the takes; it's to know which arguments you'll need to address and which ones are strawmen versus actually held. Surprisingly useful for not embarrassing yourself in public.
Quote-checking and provenance.
"Did <person> actually say <quote>? If so, when, in what context, and what was the response?" Grok handles this well because it can reach into post-level history with citations. Verify the citations yourself before quoting; the model can still confabulate, but the cost of checking is one click.
DeepSearch for a brief on a niche topic.
Pick a question that needs synthesis across many sources, not just one good article. "Compare the major positions on <policy> as of this month, with the strongest argument from each side and the empirical evidence each cites." DeepSearch will spend minutes, return a structured brief, and link out. Treat the output as a starting outline, not a finished piece.
"Explain this thread" inside X.
When you land in a thread that assumes context you don't have, hit the Grok button on the post. You get the missing background and a plain-language read of what's actually being argued. Faster than scrolling up; better than guessing.
Track a developing story across days.
For ongoing stories, run a daily "what's new on <topic> in the last 24 hours that I might have missed" query. Save the chat and reuse it. The combination of fresh X data and Grok's ability to summarize across many posts beats reading a feed manually for keeping up with a single topic.
Use the API where the X data is the differentiator.
If you're building anything that consumes X content programmatically (a dashboard, an alerting tool, a research pipeline), the Grok API is the cheapest way to combine "search X" with "have a model summarize the result." The OpenAI-compatible SDK means you can swap providers later if needed.
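A hypothetical sketch of what one request in such a pipeline might look like. The search_parameters field is an assumption about how the API exposes live X search, and grok-4 is an assumed model id; neither is a confirmed signature, so check the console.x.ai reference before building on this:

```python
def build_digest_request(topic: str, hours: int = 24) -> dict:
    """Build a request body asking Grok for a recent-activity digest on a topic.

    The "search_parameters" key is hypothetical: it stands in for
    whatever knob the real API uses to enable live X search.
    """
    prompt = (
        f"What's new on X about {topic} in the last {hours} hours? "
        "Cite specific posts and note engagement levels."
    )
    return {
        "model": "grok-4",  # assumed model id
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical live-search toggle scoped to X as a source.
        "search_parameters": {"mode": "on", "sources": [{"type": "x"}]},
    }
```

Run on a schedule (cron, a GitHub Action, whatever you already have), this is the skeleton of the daily-digest recipe above: one request per topic per day, with the model doing both the X retrieval and the summarization in a single bill.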
The content-posture angle
Grok will answer questions other models won't. That's a deliberate product choice by xAI, and it deserves to be discussed plainly because it's the part of the product most likely to determine whether you'll use it for work.
What "more permissive" looks like in practice:
- Will discuss topics flagged as "sensitive" by some other assistants (drug policy, weapons, contested historical claims, adult themes in fiction) with fewer pre-emptive refusals.
- Has a "fun mode" that loosens the conversational register; "regular mode" is closer to the other assistants in tone.
- Generally answers questions about itself, its training, and its company more directly than peers, including occasionally critical things.
- Aurora, the image model, has fewer subject-matter restrictions than most peers, particularly around public figures.
Why it matters for professional use:
- Useful when other assistants over-refuse legitimate work: a journalist researching extremism, a clinician asking about overdose thresholds, a fiction writer working with adult material, a policy analyst writing about contested topics. Grok's lower refusal rate cuts down on the "I can't help with that" friction that makes the other tools feel overcautious.
- Risky when the same permissiveness produces output you can't ship. Marketing teams in regulated industries, public-sector workflows, and anything customer-facing in a brand-safety-sensitive context have to reckon with the chance that Grok will produce something the other assistants would have declined to produce. The fact that you could get the answer doesn't mean you should publish it.
- Mixed for image generation. Aurora's permissiveness is what makes it useful for some creative work and problematic for compliance-sensitive workflows. Pick deliberately.
The honest framing: Grok's posture is a tool feature, not a moral verdict. Some teams will treat the permissiveness as table stakes, others will treat it as a non-starter. Both are reasonable. The mistake is using the tool without understanding which camp your team is in.
One operational note: the posture is not constant. xAI has tightened and loosened defaults multiple times since launch, and the system prompt that shapes Grok's behavior has been updated publicly more than once. If you depend on a specific behavior, test it again every few weeks; it can move.
Limits and pitfalls
The places Grok will let you down. Most are direct consequences of what makes it distinctive in the first place; the X-data angle and the permissive posture both have downsides built in.
X is a public posting surface, which means a meaningful share of any topic-cluster will be bots, low-effort engagement-farming, repost chains, and accounts misrepresenting themselves. Grok weights for engagement and verified status by default, but those are imperfect signals. Always ask for citations and read at least two of them. If the citations are all from one cluster of accounts, treat the synthesis as one perspective, not the perspective.
DeepSearch is the mode most likely to invent a citation that doesn't exist or to misattribute a quote. The agentic loop is fast and confident, and confident-and-fast is exactly the failure mode for source-grounded work. Open every linked source it cites for any claim you'd actually rely on. About 1 in 10 DeepSearch citations in heavy use is wrong in some material way; the other 9 are fine, which makes the bad one easy to miss.
Compared to the Claude / ChatGPT / Copilot ecosystems, Grok has fewer first-party connectors, no equivalent of Custom GPTs or Claude Projects with rich tool-calling, and limited workspace integration. The API is solid; the consumer-side "build on top of Grok" surface is much more limited. If your workflow assumes a marketplace of integrations, look elsewhere.
Some Grok features ship to the US first and lag elsewhere by weeks or months. The EU has had specific availability constraints around Grok's training-data posture and X-data access. If you're outside the US, check the actual feature list against your region before subscribing; the marketing page is sometimes ahead of the rollout.
Grok is associated with X and with xAI's CEO, both of which carry political associations across multiple axes. For some teams that's a non-issue; for others, using Grok in a customer-facing or branded workflow comes with reputational considerations the other assistants don't. This is not a comment on whether those associations are fair; it's a comment on whether your stakeholders will care. If they will, you'll want to know that before the integration is live.
The single biggest mistake new Grok users make is treating it as a general-purpose assistant and being underwhelmed. For long-form code, document drafting, math help, or anything where the source material isn't on X, Grok is fine but rarely best-in-class. Use it where its differentiator applies; reach for another tool where it doesn't.
When to reach for Grok vs another tool
The simplest decision rule, after a year of using Grok alongside the other big assistants:
Reach for Grok when the answer lives on X, in real time, or behind a posture other tools won't take. Reach for a different tool for almost everything else. — TWD
Concretely:
| If your task is... | Reach for |
|---|---|
| Live event tracking, sentiment, X-native research | Grok (its differentiator) |
| Quote-checking a public figure's posts | Grok |
| Synthesizing a fast brief across many web sources | Grok DeepSearch or Claude / ChatGPT deep research; pick by which corpus you trust more |
| Long-form writing, editing, document work | Claude or ChatGPT |
| Code, repo-aware engineering work | Claude Code or Codex |
| Workspace-integrated office tasks | Microsoft Copilot or Gemini |
| Anything requiring brand-safe / compliance-sensitive output | Claude or Gemini; not Grok |
| Topics other assistants over-refuse legitimately | Grok (with sourcing discipline) or a self-hosted open-weight model |
| API-side X data plus model in one bill | Grok API |
If you only subscribe to one assistant, Grok is unlikely to be it; the X angle is too narrow to be a primary daily driver for most people. If you already pay for one and you spend real time on X, adding SuperGrok or X Premium+ is the cheapest way to pick up a capability the others genuinely don't have. If your work is X-adjacent (journalism, trading, communications, comms-aware policy work), Grok stops being a curiosity and starts being a daily tool.
And if any of this is out of date by the time you read it: x.ai/news tracks model and product releases; the in-app changelog at grok.com is a faithful echo. Both consumer apps lag the web by a release or two.