ChatGPT | Updated June 10, 2026 | 22 min read

The 7 ChatGPT Prompt Frameworks Every Power User Knows (2026)

7 prompt frameworks ranked by output quality: CRAFT, RTF, CARE, TAG, RACE, BAB, and Chain-of-Thought. When to use each, side-by-side comparison.

Nafiul Hasan
Founder, Prompt Architects

TL;DR: Seven prompt frameworks, ranked by how reliably they improve output. You only need to master three. A ChatGPT prompt framework is a repeatable structure that forces every prompt to include the parts a model needs to do good work. CRAFT, Chain-of-Thought, and CARE cover roughly 90% of real use cases. The rest are situational. This guide shows you exactly when to reach for each, with copy-pasteable templates.

What is a ChatGPT prompt framework?

A ChatGPT prompt framework is a fixed, reusable structure — usually an acronym — that ensures every prompt you write includes the components a large language model needs to produce a useful answer: context, the role it should adopt, the action you want, the format of the output, and the tone. Frameworks are not magic words. They are checklists that force completeness, and completeness is what separates a sharp answer from a generic one.

That direct answer is worth unpacking, because the why is more useful than the acronym. Humans write prompts the way they speak — fast, partial, assuming shared context. We drop the audience, skip the format, forget to say who the model is supposed to be. The model then fills those gaps with its statistical default, which is the blandest, most average version of every choice. A framework removes the guesswork by making you state each variable explicitly.

This is the same principle the major labs recommend. OpenAI's own best-practices guide tells you to give the model a role, be specific about the desired output, and put the most important information first. A framework is just a way to do all of that without having to remember it each time.

The frameworks below are ordered by typical output quality lift: how much improvement they tend to produce over an unstructured prompt for the kind of task they're built for. We'll cover what each one is, when it shines, when it fails, and a real template you can copy today.

The 7 prompt frameworks at a glance

Here is the full set, ranked by typical quality lift and tagged with the task they're built for. Use this as a map; the deep dives follow.

Rank | Framework | Best for | Components | Beginner-friendly
1 | CRAFT | General tasks (the default) | 5 | Yes
2 | Chain-of-Thought | Reasoning, math, code | Variable | Moderate
3 | CARE | Style and voice matching | 4 | Yes
4 | RACE | Output expectations | 4 | Yes
5 | BAB | Marketing and conversion copy | 3 | Yes
6 | RTF | Fast, simple one-offs | 3 | Very
7 | TAG | Quick brainstorms and Q&A | 3 | Very

A quick honesty note on numbers: the percentage lifts you'll see floating around the internet (and in earlier versions of this post) come from internal or informal tests and vary wildly by task, model, and prompt quality. The one figure on this page that comes from peer-reviewed research is the Chain-of-Thought result, which we cite directly below. Treat the ranking as a practitioner's heuristic, not a lab measurement — and always test on your own task.

If you only have ten minutes, learn CRAFT, Chain-of-Thought, and CARE. They handle the overwhelming majority of what most people use ChatGPT for. The other four are worth knowing so you can move faster in specific situations, but they are optional.

What is the CRAFT framework and why is it the default?

CRAFT stands for Context, Role, Action, Format, and Tone. It is the strongest general-purpose prompt framework because it forces you to specify the five things models most often get wrong when left to guess. When you genuinely don't know which framework fits, use CRAFT.

Each letter targets a specific failure mode:

  • Context — the background the model needs to make relevant choices. Without it, the model assumes the most generic scenario.
  • Role — who the model should be. "Act as a senior copywriter with ten years of SaaS experience" produces sharper output than no role at all, a technique OpenAI explicitly recommends for specialized tasks.
  • Action — the precise task. Not "help with my pricing page" but "write 3 headline variants."
  • Format — the shape of the answer. Table, numbered list, JSON, 200-word paragraph.
  • Tone — the voice. Confident, academic, playful, plain.

Here is CRAFT in practice:

[CONTEXT] We're a B2B SaaS at $10K MRR, targeting solo developers.
[ROLE] Act as a senior copywriter with 10 years of SaaS experience.
[ACTION] Write 3 headline variants for our pricing page.
[FORMAT] Numbered list. Each headline ≤ 8 words. Add a 1-line rationale below each.
[TONE] Confident, specific, no buzzwords.

Why it works: it closes all five ambiguity gaps at once. The model isn't guessing who you are, who it is, what you want, how to shape it, or how it should sound. Every default it would otherwise pick is now your choice.

When it fails: pure reasoning tasks (use Chain-of-Thought instead) and situations where the desired style is easier to show than to describe (use CARE). CRAFT also adds overhead you don't need for a quick factual lookup — don't wrap "what's the capital of Peru" in five labeled fields.

If you build a lot of prompts, CRAFT is the framework worth turning into a saved template with swappable variables. That's exactly the pattern we cover in our guide to reusable prompt templates.
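As a minimal sketch of what such a template could look like in code, the helper below assembles the five labeled fields and refuses to build a prompt if any of them is empty. The function name `build_craft_prompt` and the validation rule are illustrative assumptions, not part of any library or of CRAFT itself.

```python
# Hypothetical helper: the field labels follow the CRAFT template above,
# but the function name and the "no empty fields" rule are assumptions.
CRAFT_FIELDS = ("context", "role", "action", "format", "tone")


def build_craft_prompt(**fields: str) -> str:
    """Return a CRAFT prompt, refusing to build one with a missing field."""
    missing = [name for name in CRAFT_FIELDS if not fields.get(name, "").strip()]
    if missing:
        raise ValueError(f"CRAFT fields left empty: {', '.join(missing)}")
    # Emit the fields in the fixed C-R-A-F-T order, one labeled line each.
    return "\n".join(f"[{name.upper()}] {fields[name].strip()}" for name in CRAFT_FIELDS)


prompt = build_craft_prompt(
    context="We're a B2B SaaS at $10K MRR, targeting solo developers.",
    role="Act as a senior copywriter with 10 years of SaaS experience.",
    action="Write 3 headline variants for our pricing page.",
    format="Numbered list. Each headline ≤ 8 words. Add a 1-line rationale below each.",
    tone="Confident, specific, no buzzwords.",
)
print(prompt)
```

The point of the check is the same as the point of the framework: a forgotten field becomes an error you see, rather than a default the model picks for you.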

How does Chain-of-Thought prompting improve reasoning?

Chain-of-Thought (CoT) prompting improves reasoning by instructing the model to show its intermediate steps before giving a final answer, rather than jumping straight to a conclusion. This single change produces the largest, best-documented quality lift of any technique on this list — and unlike most "framework lift" claims, it is backed by peer-reviewed research.

In the foundational 2022 paper, Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, Wei et al. showed that prompting Google's PaLM 540B model with just eight worked examples raised its accuracy on the GSM8K grade-school math benchmark from 18% to 57% — surpassing a fine-tuned GPT-3 with a verifier and setting a new state of the art at the time. Google's own research summary describes the same effect across arithmetic, commonsense, and symbolic reasoning tasks.

There are two practical variants.

Zero-shot CoT — the simplest possible version. You append a trigger phrase to any prompt:

A store had 23 apples. They sold 15 in the morning, then received a
shipment of 38, then sold 27 in the afternoon. How many apples do they
have at end of day? Let's think step by step.

That last sentence is doing real work. It nudges the model into a sequential reasoning pattern instead of a single guess.
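If you send prompts from a script, zero-shot CoT is a one-line wrapper. The sketch below (the name `with_zero_shot_cot` is hypothetical) appends the trigger phrase and also works the apple problem out locally, so you have a reference answer to check the model's chain against.

```python
COT_TRIGGER = "Let's think step by step."


def with_zero_shot_cot(question: str, trigger: str = COT_TRIGGER) -> str:
    """Append the reasoning trigger unless the prompt already ends with it."""
    question = question.strip()
    if question.endswith(trigger):
        return question
    return f"{question} {trigger}"


question = (
    "A store had 23 apples. They sold 15 in the morning, then received a "
    "shipment of 38, then sold 27 in the afternoon. How many apples do they "
    "have at end of day?"
)
prompt = with_zero_shot_cot(question)

# Reference answer, one step per line, mirroring the chain we expect back.
apples = 23
apples -= 15  # morning sales    -> 8
apples += 38  # shipment arrives -> 46
apples -= 27  # afternoon sales  -> 19

print(prompt)
print("Expected final answer:", apples)
```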

Few-shot CoT — you include one to three examples that demonstrate the reasoning chain before asking your real question:

Q: Roger has 5 tennis balls. He buys 2 cans, each with 3 balls. How many now?
A: Roger started with 5. Two cans of 3 balls is 6 more. 5 + 6 = 11. The answer is 11.

Q: A cafeteria had 23 apples. They used 20 for lunch and bought 6 more. How many now?
A: Let's reason step by step.

The model sees the pattern in your first example and matches it on the second.
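The few-shot layout above is regular enough to generate. Here is a small sketch, assuming a hypothetical `build_few_shot_cot` helper, that joins worked examples in the same Q/A shape and leaves the final answer open for the model to complete.

```python
from typing import List, Tuple


def build_few_shot_cot(
    examples: List[Tuple[str, str]],
    question: str,
    answer_prefix: str = "Let's reason step by step.",
) -> str:
    """Join worked Q/A examples, then pose the real question with an open answer."""
    if not examples:
        raise ValueError("few-shot CoT needs at least one worked example")
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # The last block has no worked answer: the model continues from the prefix.
    blocks.append(f"Q: {question}\nA: {answer_prefix}".rstrip())
    return "\n\n".join(blocks)


examples = [
    (
        "Roger has 5 tennis balls. He buys 2 cans, each with 3 balls. How many now?",
        "Roger started with 5. Two cans of 3 balls is 6 more. 5 + 6 = 11. The answer is 11.",
    ),
]
prompt = build_few_shot_cot(
    examples,
    "A cafeteria had 23 apples. They used 20 for lunch and bought 6 more. How many now?",
)
print(prompt)
```

For checking the reply: 23 - 20 + 6 = 9, so a correct chain should end at 9.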

Why it works: language models generate one token at a time, and each token is conditioned on everything written so far. When you show the pattern of step-by-step reasoning, or simply ask for it, the model writes out intermediate results and then builds on them instead of having to produce the final answer in a single leap. That sharply reduces single-shot errors on multi-step problems.

When it fails: simple factual lookups don't benefit and only get slower. Pure creative writing rarely needs it. And on the newest reasoning-tuned models (the o-series, GPT-5's thinking modes, Claude's extended thinking), much of the chain-of-thought benefit is now built in — explicitly asking for steps can still help on hard problems, but the marginal lift is smaller than it was in 2022. We go deeper on this in our Chain-of-Thought prompting deep dive.

When should you use the CARE framework?

CARE stands for Context, Action, Result, Example. Use it when the output style is hard to put into words but easy to demonstrate — brand voice, a specific formatting convention, a tone you "know when you see it." The Example is the whole point: one good sample does more than three paragraphs of adjectives.

Context: We're writing for our product blog, which targets indie founders.
Action: Write a 200-word intro paragraph for a post on prompt engineering.
Result: The voice should match our existing posts — punchy, specific, opinion-driven.
Example: Here's a previous intro that nails our voice:
"[paste a real 150–200 word intro from your blog]"

Why it works: describing style is lossy. "Punchy and opinion-driven" means something different to every writer and every model. A concrete example collapses all that ambiguity into a single anchor the model can imitate. This is the same principle behind few-shot prompting, and OpenAI's guidance puts it bluntly: examples are worth a thousand words of instruction.

When it fails: you don't have a representative example, or your samples vary so widely in style that the model can't extract a consistent pattern. If your "voice" is actually five different voices, CARE will average them into mush.

CARE is the framework to reach for when you're scaling content and need it to sound like you — newsletters, blog posts, support replies, social captions. Pair it with a saved library of your best examples and you've effectively encoded your brand voice.

What is the RACE framework and how is it different from CRAFT?

RACE stands for Role, Action, Context, Expectation. It is a leaner cousin of CRAFT: it keeps Role, Action, and Context, but merges Format and Tone into a single "Expectation" field that describes everything you want the finished output to satisfy.

Role: B2B copywriter, 10 years of SaaS experience.
Action: Draft 5 cold-email subject lines.
Context: Targeting CTOs at 50–200-employee SaaS companies; our product is observability tooling.
Expectation: ≤ 50 characters each, no spam-trigger words, a mix of curiosity and benefit framings.

Here is the difference in one table:

Aspect | CRAFT | RACE
Components | 5 (Context, Role, Action, Format, Tone) | 4 (Role, Action, Context, Expectation)
Format and Tone | Separate fields | Merged into "Expectation"
Speed to write | Slower, more granular | Faster, more compact
Best when | Format and voice are distinct concerns | You can bundle all output requirements together

Why use RACE over CRAFT: it's faster to compose and reads more naturally for output-driven tasks where format and tone blur together (subject lines, ad copy, micro-content).

Why use CRAFT instead: Format and Tone really are different concerns. A table can be written in three tones; a confident tone can be delivered in five formats. Merging them sometimes causes you to specify one and silently drop the other. When precision matters, keep them apart.

In practice, fluent users drift toward RACE for quick output and CRAFT for high-stakes work. Both land in roughly the same quality neighborhood; the choice is about composition speed.
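The relationship between the two is mechanical enough to sketch in code. The classes and the `craft_to_race` function below are hypothetical; they show one way to collapse Format and Tone into a single Expectation while keeping both halves visible, so that merging doesn't silently drop one of them.

```python
from dataclasses import dataclass


@dataclass
class CraftPrompt:
    context: str
    role: str
    action: str
    format: str
    tone: str


@dataclass
class RacePrompt:
    role: str
    action: str
    context: str
    expectation: str

    def render(self) -> str:
        return (
            f"Role: {self.role}\n"
            f"Action: {self.action}\n"
            f"Context: {self.context}\n"
            f"Expectation: {self.expectation}"
        )


def craft_to_race(craft: CraftPrompt) -> RacePrompt:
    """Merge Format and Tone into one Expectation field, keeping both parts."""
    expectation = f"{craft.format.rstrip('. ')}. Tone: {craft.tone.rstrip('. ')}."
    return RacePrompt(craft.role, craft.action, craft.context, expectation)


race = craft_to_race(
    CraftPrompt(
        context="Targeting CTOs at 50-200-employee SaaS companies; our product is observability tooling.",
        role="B2B copywriter, 10 years of SaaS experience.",
        action="Draft 5 cold-email subject lines.",
        format="≤ 50 characters each, no spam-trigger words.",
        tone="a mix of curiosity and benefit framings",
    )
)
print(race.render())
```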

What is the BAB framework for marketing copy?

BAB stands for Before, After, Bridge. Unlike the general frameworks above, BAB is a copywriting structure — it maps directly to a persuasion shape, not a generic instruction set. You describe the reader's painful current state (Before), the desirable future state (After), and the thing that gets them across (Bridge).

Before: Indie founders waste 2 hours a week formatting prompts by hand.
After: One click turns a rough idea into a structured, model-ready prompt.
Bridge: Prompt Architects ships every framework on this page as a Chrome
extension preset — pick a task, get the scaffold instantly.

Why it works: it forces the copy into a tension-then-resolution arc, which is how persuasive writing actually moves people. The model can't drift into feature-listing because the structure demands a problem and a payoff.

When it fails: it's the wrong tool for anything that isn't conversion copy. Don't use BAB to summarize a document or write code. The best move is often to nest BAB inside CRAFT's Action field — CRAFT handles role, context, format, and tone; BAB shapes the actual argument.

Use BAB for | Don't use BAB for
Landing page hero sections | Technical documentation
Ad scripts and social hooks | Data analysis or summaries
Email sequences | Code generation
Sales-page body copy | Neutral, factual Q&A

What is the RTF framework for quick tasks?

RTF stands for Role, Task, Format. It deliberately drops Context and Tone to optimize for speed. You learn it in two minutes and it handles a surprising share of daily, low-stakes work — debugging help, quick explanations, log summaries, documentation snippets.

Role: Senior backend engineer.
Task: Explain JWT vs. session authentication.
Format: Markdown table comparing 6 dimensions.

Why it works: for self-contained tasks, Context and Tone often don't move the needle; the role and format carry most of the weight. On those tasks RTF gives you most of CRAFT's benefit at a fraction of the typing.

When it fails: anything where context actually matters (audience-specific writing, decisions that depend on your situation) or anything that ships to customers. RTF is a personal-productivity tool, not a production framework. The moment output quality matters, upgrade to CRAFT.

Think of RTF as the framework you use a hundred times a day without thinking, and CRAFT as the one you reach for when it counts.

What is the TAG framework?

TAG stands for Task, Action, Goal. It's the most minimal structure on this list — three short sentences for fast brainstorming and single-shot questions. Where RTF assigns a role, TAG focuses on intent and the desired outcome.

Task: I'm building a Chrome extension.
Action: Give me a 1-line value proposition.
Goal: A CTO should instantly understand who it's for.

Why it works: stating the goal explicitly ("a CTO should instantly understand") gives the model a success criterion, which sharpens otherwise-vague brainstorm prompts.

When it fails: TAG has too little structure for production output. It's a thinking tool — great for generating options, exploring angles, and warming up a problem. Once you know what you want, switch to a fuller framework to produce the polished version.

A bonus worth knowing: the CO-STAR framework

CRAFT's most notable cousin is CO-STAR: Context, Objective, Style, Tone, Audience, Response. It deserves a mention because of where it came from: CO-STAR was developed by GovTech Singapore's Data Science and AI team and was the framework behind the winning entry in Singapore's national GPT-4 prompt engineering competition.

CO-STAR's twist is that it splits two things CRAFT bundles or omits:

  • Style and Tone become separate fields (style = how it's written, tone = the emotional register).
  • Audience gets its own explicit slot, which forces you to name who the output is for.

# Context: We're launching a budgeting app for first-time savers.
# Objective: Write a 3-sentence app store description.
# Style: Clear, concrete, benefit-led.
# Tone: Warm and encouraging, never preachy.
# Audience: People in their 20s who feel anxious about money.
# Response: Plain text, under 60 words, no exclamation marks.

Use CO-STAR when audience and tone are the variables that make or break the output — public communications, marketing to a specific demographic, anything where "who am I talking to" is the central question. For most other work, CRAFT's five fields are enough.

Side-by-side: which framework for which task?

Here's the decision logic in one place. Find your task on the left; the framework on the right is your starting point.

Your task | Reach for
Not sure which to use | CRAFT
Math, code, multi-step logic | Chain-of-Thought
You have a sample of the desired output | CARE
Fast, self-contained question | RTF or TAG
Output-driven micro-content (subject lines, captions) | RACE
Landing page or ad copy | BAB inside CRAFT
Audience- and tone-critical writing | CO-STAR
Complex reasoning with brand voice | CRAFT + CoT + few-shot

And the mirror-image table — what each framework is genuinely best and worst at:

Framework | Sweet spot | Avoid for
CRAFT | General-purpose default | Pure reasoning; trivial lookups
Chain-of-Thought | Math, logic, multi-step code | Simple facts; pure creative writing
CARE | Voice and style matching | When you have no good example
RACE | Compact output-driven tasks | When format and tone must be separate
BAB | Conversion copy | Anything non-marketing
RTF | Fast personal-productivity tasks | Customer-facing output
TAG | Brainstorming and exploration | Polished, final deliverables
CO-STAR | Audience/tone-critical writing | Code and data work
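If you route prompts through a tool, the task-to-framework table in this section reduces to a lookup with CRAFT as the fallback. The task keys below are made-up labels for the rows of that table, and `pick_framework` is a hypothetical name; treat it as a sketch of the decision logic, not a classifier.

```python
# Keys are illustrative labels for the rows of the task table; values are
# copied from its "Reach for" column.
FRAMEWORK_FOR_TASK = {
    "unsure": "CRAFT",
    "reasoning": "Chain-of-Thought",
    "have_sample": "CARE",
    "quick_question": "RTF or TAG",
    "micro_content": "RACE",
    "conversion_copy": "BAB inside CRAFT",
    "audience_critical": "CO-STAR",
    "reasoning_with_voice": "CRAFT + CoT + few-shot",
}


def pick_framework(task_type: str) -> str:
    """Look up the starting framework; unknown task types fall back to CRAFT."""
    return FRAMEWORK_FOR_TASK.get(task_type, FRAMEWORK_FOR_TASK["unsure"])


print(pick_framework("reasoning"))
print(pick_framework("something_else"))
```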

Can you combine prompt frameworks?

Yes — and in real production workflows, you usually do. Frameworks are scaffolding, not mutually exclusive categories. The trick is knowing which pairs reinforce each other and when stacking starts to hurt.

The combinations that reliably work:

Combination | Use case
CRAFT + Chain-of-Thought | Complex reasoning that still needs a specific voice and format
CARE + few-shot examples | Brand-voice-consistent content produced at scale
BAB inside CRAFT's Action | Marketing copy with explicit role, context, and tone constraints
RTF + CoT | Fast technical reasoning where you don't need full context
CO-STAR + CARE | Audience-targeted content anchored to a real sample

The hard rule: don't stack more than two frameworks on one prompt. CRAFT + CoT + RACE + BAB on a single request gives the model so many overlapping instructions that it starts dropping some of them. More structure is not linearly more quality — past two frameworks, the curve bends down. This is one of the patterns we unpack in the advanced prompting guide.

Here's a clean two-framework stack — CRAFT scaffolding with a Chain-of-Thought action:

[CONTEXT] We run a logistics startup; this feeds a customer-facing dashboard.
[ROLE] Act as a senior operations analyst.
[ACTION] Given the shipment data below, calculate the on-time delivery rate
         by region. Think step by step and show your working before the answer.
[FORMAT] First the reasoning, then a summary table by region.
[TONE] Precise and neutral.

Data: [paste data]

CRAFT sets the stage; the "think step by step and show your working" inside the Action pulls in Chain-of-Thought. Two frameworks, no conflict.
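The same stack can be sketched programmatically. In the snippet below, `check_stack` enforces the two-framework rule of thumb from this section (it is this article's heuristic, not a model limit), and `craft_with_cot` folds the Chain-of-Thought instruction into CRAFT's Action field. Both names are hypothetical.

```python
from typing import Dict, Sequence

MAX_FRAMEWORKS = 2  # the article's rule of thumb, not a hard model limit
COT_INSTRUCTION = "Think step by step and show your working before the answer."


def check_stack(layers: Sequence[str]) -> None:
    """Raise if more than MAX_FRAMEWORKS distinct frameworks are stacked."""
    distinct = list(dict.fromkeys(layers))  # drop duplicates, keep order
    if len(distinct) > MAX_FRAMEWORKS:
        raise ValueError(
            f"{len(distinct)} frameworks stacked ({' + '.join(distinct)}); "
            f"keep it to {MAX_FRAMEWORKS}"
        )


def craft_with_cot(fields: Dict[str, str]) -> str:
    """Render a CRAFT prompt whose Action carries the CoT instruction."""
    check_stack(["CRAFT", "Chain-of-Thought"])
    stacked = dict(fields)  # copy so the caller's dict is left untouched
    stacked["action"] = f"{fields['action'].rstrip()} {COT_INSTRUCTION}"
    order = ("context", "role", "action", "format", "tone")
    return "\n".join(f"[{name.upper()}] {stacked[name]}" for name in order)


fields = {
    "context": "We run a logistics startup; this feeds a customer-facing dashboard.",
    "role": "Act as a senior operations analyst.",
    "action": "Given the shipment data below, calculate the on-time delivery rate by region.",
    "format": "First the reasoning, then a summary table by region.",
    "tone": "Precise and neutral.",
}
print(craft_with_cot(fields))
```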

The 5 most common framework mistakes

Frameworks fail in predictable ways. Avoid these five and your output quality jumps without learning anything new.

  1. Treating the framework as magic words. The acronym is a container, not a spell. "Act as an expert" with no domain or experience attached produces generic output. Fill every field with specifics: "a senior tax attorney specializing in cross-border SaaS revenue" beats "a tax expert" every time.

  2. Stacking too many frameworks. As covered above, more than two structures on one prompt confuses the model and causes dropped instructions. If your prompt has four labeled framework layers, simplify.

  3. Using the wrong framework for the task. CRAFT on a pure math problem wastes effort and misses the Chain-of-Thought lift. CoT on a creative writing task can over-constrain the model and flatten its range. Match the framework to the type of work, not your habit.

  4. Never iterating. Even a well-built framework prompt usually isn't the best version on the first try. Change one variable, re-run, compare. The fastest path to a great prompt is a good framework plus two or three quick iterations — not a perfect first draft.

  5. Memorizing frameworks but not learning when to break them. The most skilled users know when no framework is needed (a one-line factual lookup) and when stacking is required (multi-step reasoning with brand constraints). Frameworks are training wheels for completeness; the goal is to internalize the components, not to label every prompt forever.
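Mistake 4 is the easiest one to systematize. The sketch below generates prompt variants that each differ from a base prompt in exactly one field, so that when two outputs differ you know which change caused it. The function name and the sample alternatives are illustrative assumptions.

```python
from typing import Dict, Iterator, List, Tuple


def one_variable_variants(
    base: Dict[str, str],
    alternatives: Dict[str, List[str]],
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (label, fields) pairs that each change a single field of base."""
    for field, options in alternatives.items():
        if field not in base:
            raise KeyError(f"unknown field: {field}")
        for option in options:
            variant = dict(base)  # fresh copy so variants never accumulate changes
            variant[field] = option
            yield f"{field}={option!r}", variant


base = {
    "role": "Act as a senior copywriter with 10 years of SaaS experience.",
    "action": "Write 3 headline variants for our pricing page.",
    "tone": "Confident, specific, no buzzwords.",
}
alternatives = {
    "tone": ["Playful and informal.", "Plain and factual."],
    "action": ["Write 5 headline variants for our pricing page."],
}

for label, fields in one_variable_variants(base, alternatives):
    print(label)
```

Run each variant, compare the results against the base output, and keep whichever single change helped before trying the next one.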

A one-week plan to internalize all of this

You don't learn frameworks by reading about them — you learn them by running real tasks through them. Here's a seven-day path that turns this article into a reflex.

  • Day 1–2 — CRAFT. Apply it to ten real tasks you'd normally do casually. Compare each result to how you used to prompt. You'll feel the difference fastest here.
  • Day 3 — Chain-of-Thought. Run one reasoning task and one math task with and without "think step by step." Notice where the lift is real and where it isn't.
  • Day 4 — CARE. Take one brand-voice content task and feed the model a real example of your writing. Watch the voice match tighten.
  • Day 5 — RTF. Use it for five quick prompts. Note the moments it falls short of CRAFT — those are your signal for when to upgrade.
  • Day 6 — Combine. Build one CRAFT + CoT prompt for a genuinely complex task. Keep it to two frameworks.
  • Day 7 — Templatize. Pick your five most common prompts. Convert each into a saved template with the framework baked in and the specifics as swappable variables.

After day seven, you'll stop thinking in acronyms. You'll just instinctively include context, a role, the action, the format, and the tone — which was the whole point. Most fluent users can't tell you which framework they "used"; they simply write complete prompts.

If you want to skip the manual setup, saving your best framework prompts as reusable presets is exactly what a prompt library is for. Build the template once, swap the variables forever.
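If you'd rather roll your own, Python's standard-library `string.Template` is enough for a first version. This sketch bakes the CRAFT structure into a saved template and leaves three slots open; the template name and the choice of `$context`, `$count`, and `$page` as variables are assumptions made for the example.

```python
from string import Template

# The framework is fixed in the template; only the specifics are variables.
HEADLINE_TEMPLATE = Template(
    "[CONTEXT] $context\n"
    "[ROLE] Act as a senior copywriter with 10 years of SaaS experience.\n"
    "[ACTION] Write $count headline variants for our $page.\n"
    "[FORMAT] Numbered list. Each headline ≤ 8 words. Add a 1-line rationale below each.\n"
    "[TONE] Confident, specific, no buzzwords."
)

# substitute() raises KeyError if a variable is left out, which is the
# behavior you want: an unfilled slot should fail loudly.
prompt = HEADLINE_TEMPLATE.substitute(
    context="We're a B2B SaaS at $10K MRR, targeting solo developers.",
    count=3,
    page="pricing page",
)
print(prompt)
```

Using `substitute` rather than `safe_substitute` is deliberate here: a prompt that ships with a literal `$page` in it is worse than a crash.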

What changed for prompt frameworks in 2025–2026?

The honest update for 2026: the newest models have narrowed the gap between framework prompts and casual ones — for general chat. GPT-5 and Claude Opus 4 are far better than their predecessors at inferring missing context and reasoning without being told to. Industry write-ups consistently note that all the current frontier models still produce clearly better output from structured prompts than improvised ones, but the penalty for sloppiness in everyday use is smaller than it was two years ago.

Two things have not changed:

  1. Production work still demands structure. The moment you move from the chat window into RAG pipelines, agents, structured outputs, or anything that has to run reliably a thousand times, frameworks win decisively. Consistency, not peak quality, is the game in production — and structure is how you get consistency.

  2. Reasoning still rewards explicit steps on hard problems. Even with built-in reasoning modes, asking for worked steps on genuinely difficult, multi-stage problems still helps. The Chain-of-Thought principle outlived the model generation it was discovered on.

The broader shift is where the skill lives. The 2026 trend is away from hand-tuning prompts in a playground and toward systems that generate, structure, and optimize prompts programmatically. Tools like Prompt Architects ship CRAFT, CARE, and Chain-of-Thought as one-click presets across ChatGPT, Claude, and Gemini — so the structure is automatic and you spend your attention on the specifics that actually move quality. The framework is the floor; your domain knowledge is the ceiling.

Frequently asked questions

Which ChatGPT prompt framework is best? CRAFT (Context, Role, Action, Format, Tone) is the strongest general-purpose framework — it covers roughly 80% of everyday tasks. Chain-of-Thought wins for reasoning, math, and code. CARE wins when output style is hard to describe but you can show one example. Pick by task type, not by which framework sounds most impressive.

Do I need to learn all 7 frameworks? No. Master CRAFT for general work, Chain-of-Thought for reasoning, and CARE for style matching. That covers about 90% of the value. The other four (RTF, TAG, RACE, BAB) are useful in specific situations but optional for most users.

What's the difference between CRAFT and RACE? CRAFT has 5 components (Context, Role, Action, Format, Tone). RACE has 4 (Role, Action, Context, Expectation). RACE merges Format and Tone into "Expectation," making it leaner. CRAFT is more granular; RACE is faster to compose. Both produce similar quality on most tasks.

Does Chain-of-Thought prompting actually work? Yes, and it is the most empirically validated technique on this list. In the original 2022 study, chain-of-thought prompting raised PaLM 540B's accuracy on the GSM8K math benchmark from 18% to 57% with just eight worked examples. The lift is largest on multi-step reasoning, math, and logic tasks.

Can I combine prompt frameworks? Yes — this is common in production workflows. CRAFT + Chain-of-Thought is standard for complex reasoning. CARE + few-shot examples works for brand-voice content. The frameworks are scaffolding, not exclusive categories. Avoid stacking more than two at once, though, or the model gets confused.

What is the CO-STAR framework and how is it different? CO-STAR (Context, Objective, Style, Tone, Audience, Response) was developed by GovTech Singapore's Data Science and AI team and was the framework behind the winning entry in Singapore's national GPT-4 prompt engineering competition. It is essentially a 6-part cousin of CRAFT that splits Tone from Style and adds an explicit Audience component. Use it when audience and tone are the make-or-break variables.

Why don't my framework-based prompts always work? Three common reasons. (1) Frameworks structure intent but can't replace specificity — "senior copywriter with 10 years of SaaS experience" beats "copywriter." (2) Framework adherence without iteration leaves quality on the table — re-run with one variable changed each time. (3) Wrong framework for the task — using CRAFT for pure reasoning misses the Chain-of-Thought lift.

Are prompt frameworks still relevant with GPT-5 and Claude Opus 4? Yes. Newer models handle vague prompts better, so the gap has narrowed for casual chat. But for production work — RAG, agents, structured outputs, multi-step reasoning — structured prompts still produce measurably better, more consistent results than improvised ones.


By Nafiul Hasan — Founder of Prompt Architects, where he builds prompt-engineering tooling used across ChatGPT, Claude, and Gemini. Last updated: June 10, 2026.
