# The Holbrook Report — full corpus
> Curiosity, taken seriously. Education, not advice. Claims are labeled [FACT] / [CHARACTERIZATION] / [PROJECTION] and every report ends with its sources. Each report is written with an AI model and directed, verified, and edited by Aaron Holbrook.
---
# How a Local Business Gets Found Now: Search, AI Answers, and the Shift From Links to Citations
*A walk through the machinery of local discovery in 2026 - the Map Pack and classic ranking, the structured data that makes a site machine-readable, Generative Engine Optimization and the llms.txt that the engines ignore - and how one Huntley, Illinois detailing shop was rebuilt to be found by both Google and the models. Where the link economy is giving way to a citation economy, and what stays the same underneath.*
Source: https://theholbrookreport.com/reports/local-discovery-in-the-age-of-ai/ · Published: 2026-06-05
A person in Huntley, Illinois wants their car detailed. Ten years ago they would have typed "auto detailing near me" into Google, skimmed a page of blue links, and clicked one. Today the same person has at least three different ways to ask the same question, and only one of them reliably ends in a click on a business's website. They might still scroll the links. They might read the boxed summary Google now writes at the top of the page and never scroll at all. Or they might skip Google and ask an AI assistant to "find me a good detailer near Huntley that does ceramic coating," and act on whatever names it returns.
These are three different machines with three different rules. A local business that is built for the first one is not automatically visible in the other two. This report is about how all three actually work in 2026, what is changing between them, and what a small business can do about it. It uses a single named example throughout: OG Detailing, a family-owned shop whose website was rebuilt from the ground up for exactly this transition. The conflict is disclosed above; the numbers from that engagement are labeled as first-party measurements wherever they appear.
The short version: the fundamentals of being a findable local business have not changed, but the surface they are read through has. The web is shifting from an economy of *links you click* to an economy of *facts that get cited*, and the businesses that win the second one are the ones whose facts are consistent, structured, verifiable, and genuinely better than the alternatives.
---
# Movement 1 - How local discovery works in 2026
## The three surfaces
When someone looks for a local service today, the answer can come from three places, and they behave very differently.
- **Classic results.** The ranked web links plus the local "Map Pack" of three businesses with a map. Driven by Google's local ranking system and your Google Business Profile.
- **The AI answer.** A generated summary at the top of the results (Google's AI Overviews), or an answer from an assistant like ChatGPT, Perplexity, or Claude. It paraphrases and cites sources rather than listing them.
- **The model's memory.** What an AI model "already knows" about your business from its training data, with no live lookup at all. This is the surface businesses understand least and control least directly.
Classic results are the surface every local-marketing playbook was written for. The other two are newer, and the rest of this report is largely about them. But they are not replacements stacked in sequence so much as layers competing for the same moment of intent, and the same underlying facts feed all three.
## Classic local ranking: relevance, distance, prominence
Google states plainly what determines local ranking. There are three named factors: relevance ("how well a Business Profile matches what someone is searching for"), distance ("how far each business is from the customer who's searching"), and prominence ("how well-known a business is"). [FACT: google-local-ranking] Google adds that prominence includes "how many websites link to your business and how many reviews you have," and that "more reviews and positive ratings can help your business's local ranking." [FACT: google-local-ranking]
Two of those three are not things you can write on a webpage. Distance is set by where the searcher is standing. Prominence is built off-site, out of links, citations, and reviews accumulated over time. Only relevance is fully inside your control, which is why so much local SEO effort goes into the one asset that controls it: the Google Business Profile. The profile, not the website, is what populates the Map Pack, and the Map Pack is what sits above the organic links for almost every "near me" query.
The supporting cast is the set of off-site signals that feed prominence:
- **Reviews.** Quantity, recency, and average rating all feed local ranking, and Google explicitly says responding to reviews helps too. [FACT: google-local-ranking] Reviews are also the single biggest consumer-facing signal, which the next movement returns to.
- **NAP consistency.** The business name, address, and phone number need to read identically across the website, the Google profile, Yelp, Apple Maps, directories, and any other listing. Inconsistent contact data is one of the most common, and most fixable, local-ranking drags.
- **Citations and links.** Mentions of the business on other reputable local sites, directories, and community pages.
This is the part that has not changed and is not going to. A business with a complete profile, real reviews, and consistent contact data across the web has always done better in local search, and still does.
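The NAP check described above can be partly automated. The following is a minimal sketch, not a full audit tool: the `Listing` type, the normalization rules, and both directory records are hypothetical and exist only for illustration. It assumes that pure formatting variants ("Ln" versus "Lane", phone punctuation) should be treated as equivalent so that only substantive differences are flagged; a stricter audit could compare the raw strings instead.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    source: str   # where the record was found (labels here are illustrative)
    name: str
    address: str
    phone: str


def normalize_phone(phone: str) -> str:
    """Keep digits only and drop a leading US country code."""
    digits = re.sub(r"\D", "", phone)
    return digits[1:] if len(digits) == 11 and digits.startswith("1") else digits


def normalize_text(value: str) -> str:
    """Lowercase, strip punctuation, expand 'ln' to 'lane', collapse whitespace."""
    value = re.sub(r"[^\w\s]", "", value.lower())
    value = re.sub(r"\bln\b", "lane", value)
    return re.sub(r"\s+", " ", value).strip()


def nap_key(listing: Listing) -> tuple[str, str, str]:
    """Normalized (name, address, phone) used for comparison."""
    return (
        normalize_text(listing.name),
        normalize_text(listing.address),
        normalize_phone(listing.phone),
    )


def find_mismatches(reference: Listing, others: list[Listing]) -> list[str]:
    """Report every field that differs from the reference after normalization."""
    fields = ("name", "address", "phone")
    ref = nap_key(reference)
    problems = []
    for other in others:
        for field, want, got in zip(fields, ref, nap_key(other)):
            if want != got:
                problems.append(f"{other.source}: {field} differs")
    return problems


# Reference record taken from the site's JSON-LD; the two directory
# records below are made up to show a match and a mismatch.
website = Listing("website", "OG Detailing",
                  "11212 Sunset Lane, Huntley, IL 60142", "+1-224-650-0067")
listings = [
    Listing("directory_a", "OG Detailing",
            "11212 Sunset Ln, Huntley IL 60142", "(224) 650-0067"),
    Listing("directory_b", "O.G. Detailing LLC",
            "11212 Sunset Lane, Huntley, IL 60142", "224-650-0076"),
]

for problem in find_mismatches(website, listings):
    print(problem)
```

In this made-up data, `directory_a` passes because its differences are formatting only, while `directory_b` is flagged for both its name and its phone number.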
## Making the site machine-readable: structured data
A web page is written for humans. Structured data is the same page's facts restated in a format a machine can read without interpreting prose. For a local business this usually means a `LocalBusiness` block in JSON-LD: name, address, geo-coordinates, phone, hours, services, area served, and links to the business's other profiles.
OG Detailing's site carries a full stack of this. The home page declares an `AutoDetailing` business (a `LocalBusiness` subtype) with address, coordinates, opening hours for every day of the week, a list of more than twenty served towns, the owner as a named `Person`, an `EducationalOccupationalCredential` for the owner's IDA certification, and an `OfferCatalog` of every service. Each service page adds a `Service` block tied to a `GeoCircle` around the shop. A `sameAs` array links the entity to its Google, Yelp, and Facebook profiles. [FACT: ogdetailing] Here is the spine of it, trimmed:
```json
{
"@context": "https://schema.org",
"@type": "AutoDetailing",
"name": "OG Detailing",
"telephone": "+1-224-650-0067",
"address": {
"@type": "PostalAddress",
"streetAddress": "11212 Sunset Lane",
"addressLocality": "Huntley",
"addressRegion": "IL",
"postalCode": "60142"
},
"geo": { "@type": "GeoCoordinates", "latitude": 42.1659156, "longitude": -88.4325465 },
"hasCredential": {
"@type": "EducationalOccupationalCredential",
"name": "IDA-Certified Detailer",
"recognizedBy": { "@type": "Organization", "name": "International Detailing Association" }
},
"sameAs": [
"https://www.google.com/maps?cid=6057556312879019609",
"https://www.yelp.com/biz/og-detailing-huntley",
"https://www.facebook.com/profile.php?id=61564935925235"
]
}
```
Structured data does two jobs. It makes the page eligible for richer search results (the star ratings, hours, and FAQ accordions that show up under some listings), and it gives any machine reading the page an unambiguous statement of the facts instead of forcing it to parse them out of sentences. The `sameAs` array is quietly the most important line: it tells a machine that this website, that Google listing, that Yelp page, and that Facebook profile are all the *same entity*. That is the thread the AI layer pulls on when it tries to decide who you are.
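To see what a machine actually receives from such a block, here is a small sketch that pulls JSON-LD out of a page using only the Python standard library and reports which fields are absent. The `PAGE` string is a shortened, hypothetical page (it deliberately omits `geo`), and the `EXPECTED_FIELDS` list is an assumption chosen for illustration, not a list of fields Google requires.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every <script type="application/ld+json"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            parsed = json.loads("".join(self._buffer))
            # A script tag may hold one object or a list of them.
            self.blocks.extend(parsed if isinstance(parsed, list) else [parsed])


# Illustrative choice of fields to look for (an assumption, not a standard).
EXPECTED_FIELDS = ("name", "telephone", "address", "geo", "sameAs")


def missing_fields(block: dict, expected=EXPECTED_FIELDS) -> list[str]:
    """Return the expected fields that are absent or empty in a JSON-LD block."""
    return [field for field in expected if not block.get(field)]


# Hypothetical, shortened page; "geo" is left out on purpose.
PAGE = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "AutoDetailing",
  "name": "OG Detailing",
  "telephone": "+1-224-650-0067",
  "address": {"@type": "PostalAddress", "addressLocality": "Huntley", "addressRegion": "IL"},
  "sameAs": ["https://www.yelp.com/biz/og-detailing-huntley"]
}
</script>
</head><body></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(PAGE)
for block in extractor.blocks:
    print(block.get("@type"), "missing:", missing_fields(block))
```

Run against the sample page, the sketch finds one `AutoDetailing` block and reports `geo` as the only missing field.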
There is one hard limit worth stating now, because it shapes the whole strategy. You cannot mark up your own star rating and expect it to show. Google's policy is explicit: "if the entity that's being reviewed controls the reviews about itself," its pages "are ineligible for star review feature," and "ratings must be sourced directly from users." [FACT: google-review-snippet] Self-supplied ratings are not just ignored; they make the page ineligible. Genuine, third-party-sourced reviews are the only ones that count. This is a recurring theme: the system is designed to reward real signals and discard manufactured ones.
## Speed is part of the product
Google measures three Core Web Vitals: Largest Contentful Paint (loading, target 2.5 seconds or less), Interaction to Next Paint (responsiveness, target 200 milliseconds or less; it replaced First Input Delay in 2024), and Cumulative Layout Shift (visual stability, target 0.1 or less). [FACT: webdev-vitals] They are a real input to ranking, and a much more direct input to whether a visitor stays.
This is where the OG Detailing rebuild started, and the gap was dramatic. The previous site, built on WordPress, took 23.2 seconds to render its main content on a phone and scored 57 of 100 on mobile performance and 58 on SEO in Google's Lighthouse audit. The rebuilt site, a static Astro site served from Cloudflare's edge, renders its main content in 3.6 seconds and scores 82 on performance, 100 on accessibility, and 100 on SEO. [FACT: ogdetailing] That is the same business, the same photos, the same services, with the main content rendering roughly six times faster.
What this really means in practice: Performance is not a vanity metric for a local business. It is the difference between a phone user who waits and one who taps back to the results and calls the next shop on the list. A slow site loses customers before ranking is even relevant.
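The thresholds and the before/after figures above can be checked with a few lines of arithmetic. This is a sketch under one stated assumption: that the 23.2 and 3.6 second "main content" figures are Largest Contentful Paint readings, which the report does not say explicitly. The `TARGETS` table and both function names are illustrative.

```python
# "Good" targets as quoted in this section.
TARGETS = {"lcp_s": 2.5, "inp_ms": 200.0, "cls": 0.1}


def check_vitals(**measured: float) -> dict[str, bool]:
    """For each supplied metric, report whether it is at or under its target."""
    return {name: value <= TARGETS[name] for name, value in measured.items()}


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the 'after' measurement is."""
    return before_s / after_s


# Figures from the OG Detailing rebuild, assumed here to be LCP in seconds.
old_lcp, new_lcp = 23.2, 3.6

print(round(speedup(old_lcp, new_lcp), 1))  # 6.4
print(check_vitals(lcp_s=old_lcp))
print(check_vitals(lcp_s=new_lcp))
```

Under that assumption, the rebuild is about 6.4 times faster, and the 3.6 second result is still above the 2.5 second target, so the comparison shows how much was gained rather than that the work is finished.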
## GEO: being inside the answer, not under it
Classic SEO tries to rank your link. Generative Engine Optimization, or GEO, tries to get your business named and cited *inside* an AI-generated answer. The term comes from a 2023 academic paper (Aggarwal and co-authors, later published at the KDD 2024 conference) that coined "GEO" and ran controlled tests on what makes a source more visible in generated answers. Their headline finding: GEO methods can boost a source's visibility in generative-engine responses by up to roughly 40%, and the most effective tactics were adding citations, adding direct quotations from relevant sources, and adding statistics. [FACT: geo-paper]
A later 2025 audit of 1,702 real citations across Brave, Google AI Overviews, and Perplexity reached a compatible conclusion from the other direction: pages that got cited tended to score high on overall quality (being cited correlated with page quality at an odds ratio of 4.2), and the strongest structural correlates were fresh metadata, clean semantic HTML, and structured data. [FACT: geo16] Both studies have real limits, noted in the sources, but they point the same way. AI engines preferentially cite content that is well-structured, specific, current, and quotable.
[CHARACTERIZATION: this is a synthesis of the two GEO studies, not a measured value from either] The practical translation is that GEO is mostly classic content quality plus machine-readability, not a separate dark art. Concrete numbers, named credentials, dated facts, and clean markup are what get quoted, which is exactly what a good local site should have anyway.
## The llms.txt that nobody reads
This is also where a popular idea needs a cold splash of water. In 2024 a proposed standard called `llms.txt` emerged: a plain-text file at the root of a site giving AI models a clean, summarized version of the business's key facts, by analogy to `robots.txt`. It is a genuinely sensible idea, and OG Detailing publishes both an `llms.txt` and a longer `llms-full.txt`. [FACT: ogdetailing]
The problem is that, so far, the major AI engines do not appear to use it. A 90-day server-log experiment found that of more than 62,100 AI-bot visits, only 84 (about 0.1%) ever requested the `/llms.txt` file, fewer than hit an average content page. [FACT: otterly-llmstxt] Google has said publicly that it does not use `llms.txt`, with one Google representative comparing it to the long-dead keywords meta tag. [FACT: sej-llmstxt] A separate analysis across hundreds of thousands of domains found no clear effect on AI citations. [FACT: sej-llmstxt-300k]
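A site owner can run a rough version of the same measurement against their own logs. The sketch below is not the method used in the cited experiment; it assumes the access log has already been reduced to `(path, user_agent)` pairs, matches only the three OpenAI crawler names documented by OpenAI, and uses made-up sample requests with shortened user-agent strings.

```python
# Substrings identifying AI crawlers. Only OpenAI's documented names are
# listed; extend the tuple for other vendors as needed.
AI_BOT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")


def is_ai_bot(user_agent: str) -> bool:
    """True when the user-agent string contains a known AI crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_BOT_TOKENS)


def llms_txt_share(requests: list[tuple[str, str]]) -> tuple[int, int, float]:
    """Return (llms.txt hits, total AI-bot requests, share) for parsed log rows."""
    bot_paths = [path for path, user_agent in requests if is_ai_bot(user_agent)]
    total = len(bot_paths)
    hits = sum(1 for path in bot_paths if path == "/llms.txt")
    share = hits / total if total else 0.0
    return hits, total, share


# Hypothetical parsed log rows: (requested path, simplified user agent).
SAMPLE_REQUESTS = [
    ("/", "Mozilla/5.0 (compatible; GPTBot/1.0)"),
    ("/services/ceramic-coating/", "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"),
    ("/llms.txt", "Mozilla/5.0 (compatible; GPTBot/1.0)"),
    ("/about/", "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"),
    ("/llms.txt", "Mozilla/5.0 (Windows NT 10.0) Chrome/125.0"),  # human browser, ignored
]

hits, total, share = llms_txt_share(SAMPLE_REQUESTS)
print(f"{hits} of {total} AI-bot requests hit /llms.txt ({share:.1%})")

# The experiment's own figures: 84 requests out of "more than 62,100" visits.
print(f"{84 / 62100:.2%}")
```

With the experiment's reported counts, the share works out to roughly 0.14% at most, which the report rounds to about 0.1%.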
So why keep the file? [CHARACTERIZATION: a judgment about cost versus optionality, not a measured benefit] Because it costs almost nothing to maintain, it cannot hurt, and it is a cheap option on a standard that *might* get adopted. But it should be understood as a bet on the future, not a working channel today. The thing that actually makes a business legible to AI is not a special file; it is consistent facts everywhere a machine can read them.
## How an AI decides what to say about a business
To see why consistency matters so much, it helps to know where an AI answer comes from. There are three sources, and they blend:
- **Training data.** What the model absorbed when it was built. Frozen, often months old, and impossible to edit directly. This is the "model's memory" surface.
- **Retrieval / grounding.** A live web search the model runs at question time, reading current pages and quoting them. This is how ChatGPT search, Perplexity, and Google AI Overviews stay current.
- **Third-party sources.** What other sites - directories, review platforms, local pages - say about the business. The model cross-checks these against your own claims.
A business cannot edit a model's training data. What it can do is make sure that when the model *does* look, every source agrees. If the website says the shop has more than 200 hours of training and an old directory says 300, the model has to choose, and an uncertain model tends to hedge or omit. [CHARACTERIZATION: a mechanism described in the GEO literature and consistent with how grounded models resolve conflicts, not a quantified rate] Entity consistency, the same name, address, phone, hours, and credentials everywhere, is the closest thing there is to writing directly into the machine's understanding of you.
The crawlers themselves are now specialized, and the controls are per-purpose. OpenAI alone runs at least three: GPTBot gathers training data (blocking it is a training opt-out), OAI-SearchBot surfaces sites inside ChatGPT's search answers (blocking it removes you from those answers), and ChatGPT-User fetches a page when a user asks ChatGPT to look at it. [FACT: openai-bots] Google splits the function too: Google-Extended controls whether your content trains Gemini, and Google states it "does not impact a site's inclusion in Google Search nor is it used as a ranking signal." [FACT: google-extended] Anthropic, Perplexity, Apple, Amazon, Meta, and others each run their own. The strategic point is that blocking a training crawler and blocking a search crawler are completely different decisions: one affects whether a model learns about you in general, the other affects whether you can appear in that product's live answers. OG Detailing's `robots.txt` makes the choice explicit, welcoming the search and assistant crawlers by name. [FACT: ogdetailing]
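The difference between the two decisions is easy to see by testing a `robots.txt` against each crawler name. The sketch below uses Python's standard `urllib.robotparser`. The `ROBOTS_TXT` content is hypothetical and is not OG Detailing's actual file: it blocks the training crawler while admitting the search and user-triggered ones, purely to show that the choices are independent. The `example.com` URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training, stay visible in live answers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def access_report(url: str, agents: list[str]) -> dict[str, bool]:
    """Map each crawler name to whether this robots.txt lets it fetch the URL."""
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = access_report(
    "https://example.com/services/",
    ["GPTBot", "OAI-SearchBot", "ChatGPT-User"],
)
for agent, allowed in report.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under this sample policy GPTBot is blocked while OAI-SearchBot and ChatGPT-User are allowed, which is the "no training, but still in search answers" configuration. Reversing the two rules would produce the opposite trade.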
---
# Movement 2 - The shift from links to answers
## The click is leaking out of search
The mechanics above describe a stable world. The reason this report exists is that the world is moving, and the movement is measurable.
Google's AI Overviews went from a rollout to a fixture in 2025. One large keyword study tracked them appearing on 6.49% of queries in January 2025, peaking at 24.61% in July, then settling back to 15.69% by November. [FACT: semrush-aio] An independent Pew Research analysis of real browsing data found that in March 2025, about 18% of all Google searches by US adults produced an AI summary. [FACT: pew] The exact number depends on the method, but the order of magnitude is clear: a large and volatile share of searches now show a generated answer before any link.
And when that answer appears, people click less. Pew found that when an AI summary was present, only 8% of searches led to a click on a traditional link, versus 15% when there was no summary, and just 1% of users clicked a link *inside* the summary. [FACT: pew] Ahrefs, measuring position-one click-through over time, found the presence of an AI Overview correlated with a 58% reduction in clicks to the top organic result by December 2025, up from a 34.5% reduction in April. [FACT: ahrefs-aio] Google has disputed the framing of some of these studies, but has not offered contradicting numbers. [CHARACTERIZATION: a note on the state of the dispute, characterizing the absence of a counter-figure, not endorsing either side] The direction is not seriously in question.
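Click figures like these are easy to misread because a change can be quoted in percentage points or as a relative change. The short sketch below applies both to Pew's two click rates; the derived values are arithmetic on the reported numbers, not figures published by Pew, and the function names are illustrative.

```python
def point_change(before_pct: float, after_pct: float) -> float:
    """Difference in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct: float, after_pct: float) -> float:
    """Change as a fraction of the starting value."""
    return (after_pct - before_pct) / before_pct


# Pew: share of searches ending in a click on a traditional link.
no_summary, with_summary = 15.0, 8.0

print(point_change(no_summary, with_summary))                    # -7.0
print(round(relative_change(no_summary, with_summary) * 100, 1))  # -46.7
```

A drop from 15% to 8% is 7 percentage points, or roughly 47% fewer clicks in relative terms. Keeping the two readings apart matters when comparing one study's figure with another's.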
A caveat keeps this honest. One study found that on the *same* set of keywords, the appearance of AI Overviews did not automatically raise the zero-click rate (it measured 33.75% before versus 31.53% after). [FACT: semrush-aio] That conflicts with broader market measures showing zero-click behavior rising overall. The two are measuring different populations, and the safe reading is that the link economy is shrinking unevenly, not uniformly. [CHARACTERIZATION: a reconciliation of two studies with different scopes, offered as interpretation] The trend is real; any single percentage is a snapshot of a moving target.
## The discovery channel itself is changing hands
The deeper shift is not just inside Google. It is that people are increasingly not starting at Google at all.
The clearest signal comes from BrightLocal's 2026 consumer survey: the share of consumers using ChatGPT and other generative AI tools to find local business recommendations jumped from 6% a year earlier to 45%, while Google's share of that same job fell from 83% to 71%. [FACT: brightlocal-lcrs] Those are enormous moves for a single year, from one vendor's survey, so they deserve the usual caution. [CHARACTERIZATION: a caveat about a single vendor survey, not a dismissal of the figure] But even discounted heavily, they describe a market where "ask an AI" has gone from a novelty to a mainstream way of finding a plumber, a dentist, or a detailer.
What this really means in practice: A local business now has to be findable in a conversation it never sees. When a customer asks an assistant "who's a good detailer near Huntley," there is no results page to rank on and no ad to buy. There is only whether the model knows you, trusts the facts about you, and names you.
## Why being the cited entity beats ranking #4
Put the two shifts together and the strategic conclusion follows. In a page of ten blue links, the difference between ranking fourth and seventh still bought you some clicks. In an AI answer, there is no fourth and seventh. There is the handful of businesses the model names, and everyone else. [CHARACTERIZATION: a structural contrast between ranked lists and generated answers, framing rather than a measured cliff] The distribution gets more winner-take-most, and the prize changes from "rank higher" to "be one of the cited entities."
What earns a citation, per the research in Movement 1, is being the clearest, most consistent, best-structured, most genuinely useful source about your specific niche. [FACT: geo-paper] That is good news for a small business that does its homework, because the bar is specificity and trustworthiness, not ad budget.
## The case study: rebuilding OG Detailing for both worlds
OG Detailing is a family-owned shop in Huntley, run day to day by an IDA-certified detailer with more than 200 hours of training. Roughly 70% of its customers come from one place: the Sun City / Del Webb Huntley retirement community next door. [FACT: ogdetailing] That single fact ended up driving the whole strategy.
**The starting position.** A baseline captured the day the new site launched in June 2026 measured the problem precisely. Of nine target local queries, the business's own domain ranked on page one for zero of them. Only three URLs were indexed by Google, and all three were stale addresses from the old WordPress site. The business surfaced in search mainly through its Yelp listing, not its own site. [FACT: ogdetailing] On a simulated AI-discoverability test (could an assistant correctly answer what they do, where they are, their hours, their credentials, and whether to recommend them), the new site scored 9 out of 10. [FACT: ogdetailing] The single missing point was the one thing the business could not mark up itself: on-site review and rating data, which by Google's own policy must come from real users. [FACT: google-review-snippet]
**The rebuild.** The technical work is the catalog from Movement 1, applied: the WordPress-to-Astro rebuild that cut load time from 23.2 to 3.6 seconds and lifted the SEO score from 58 to 100; the full `LocalBusiness` structured-data stack; per-page titles, meta descriptions, canonical tags, and Open Graph; an XML sitemap; 301 redirects from the old WordPress URLs to preserve whatever equity the three indexed pages still carried; `llms.txt` and `llms-full.txt`; and a `robots.txt` that names and welcomes the AI crawlers. [FACT: ogdetailing]
**The niche nobody owned.** The highest-leverage discovery was not technical. Searching "auto detailing Sun City Huntley" returns results for Sun City, *Arizona*. Nobody had claimed the term for Huntley, despite a retirement community of thousands of homes that already sent the shop most of its business. [FACT: ogdetailing] The rebuild added a dedicated Sun City / Del Webb page, internally linked and structured, targeting a high-intent local term with effectively zero competition. [CHARACTERIZATION: "zero competition" describes the observed search result, where no local rival ranked for the term] This is the local-SEO equivalent of finding an unlocked door: a specific, real, defensible niche that maps exactly to who the business already serves.
**What is still open.** The honest part of the story is that the most valuable remaining work is not in the code. It is owner-led: getting verified in Google Search Console so the new pages get crawled and indexed; collecting the real Google and Yelp star ratings and review counts so a genuine `aggregateRating` can be published (the one gap in the discoverability score); soliciting reviews that mention "Huntley" and the specific service; and cleaning up the contact data and the "200 versus 300 hours" discrepancy across third-party listings. [FACT: ogdetailing] The site is built; the off-site signals are earned over months.
## Where SEO and GEO turn out to be the same thing
The review gap is the tell. The thing that would most improve OG Detailing's classic local ranking (more real reviews and a higher rating) is also what would close its last AI-discoverability point (structured, genuine `aggregateRating` data), what Google's policy says must be real to count, and what a human customer looks at first. [FACT: google-local-ranking] One asset, four payoffs. [CHARACTERIZATION: an observation that the same signal serves classic ranking, AI citation, policy eligibility, and human trust at once]
This is the quiet thesis of the whole transition. The channels look like they are diverging, but the underlying signals are converging. Real reviews, consistent facts, genuine specificity, and a fast, machine-readable site pay off in classic search, in AI answers, and with the actual human being deciding whether to call. The gimmicks (self-supplied ratings, an `llms.txt` the engines ignore, thin pages spun up for every nearby town) pay off in none of them.
---
# Movement 3 - The new future of local marketing
## The website becomes a fact-source, not a brochure
For two decades a local business website was a brochure: a pretty front for humans, with the real conversions happening on the phone. The shift described above slowly inverts that. Increasingly the most important reader of the site is a machine, deciding what to tell a human who will never visit the page. [CHARACTERIZATION: a reframing of the site's primary audience, extrapolated from the retrieval mechanics, not a measured change in traffic mix]
That does not make design irrelevant; a human still lands on the site and decides whether to trust it. But it adds a second audience with different needs. The machine does not care about the hero image. It cares whether the address in the page text matches the address in the JSON-LD matches the address on Google matches the address on Yelp. The future-proof site serves both: persuasive for the person, unambiguous for the parser.
## Generic content depreciates; specific facts appreciate
If AI answers can summarize the generic, then generic content stops being an asset. A page that says "we provide quality auto detailing with great customer service" is exactly what a model can generate for free, about anyone. [CHARACTERIZATION: a value judgment about commodity content, grounded in what generative models can already produce] What a model cannot generate is the specific, verifiable, local truth: that this shop is half a block from the Huntley Park District, that the owner holds an IDA certification with a stated number of training hours, that it serves the Del Webb community, that ceramic coating starts at a particular price for a particular vehicle size.
[PROJECTION: conditional on AI answers continuing to absorb generic informational content, which current AI Overview trends suggest but do not guarantee] If that trend holds, the content that retains value is the content a generative model cannot fabricate without you: concrete prices, real credentials, dated specifics, named service areas, and genuine reviews. The thin "SEO content" of the last decade depreciates; the specific facts appreciate.
## The durable moats
Three things survive every version of this shift, and they are where a local business should spend its scarce effort.
- **Genuine reviews and ratings (central mechanism).** Reviews feed classic local ranking, are required (and must be real) for rating rich results, are a top correlate of AI citation, and are the first thing a human checks. As of 2026, 97% of consumers read reviews for local businesses, 68% will only use a business rated 4 stars or higher (up from 55% the year before), and 31% now require 4.5 or higher (up from 17%). The threshold is rising; the asset is irreplaceable. [FACT: brightlocal-lcrs] [FACT: google-local-ranking]
- **Entity consistency across the web (central mechanism).** The same name, address, phone, hours, and credentials, identical everywhere a machine can read them: the site's visible text, its structured data, its `sameAs` links, and every third-party listing. This is how a model disambiguates you and decides to trust the facts enough to repeat them. Inconsistency is the most common reason an AI hedges or omits a business. [CHARACTERIZATION: identifies consistency as the primary controllable input to AI trust, drawn from the grounding mechanics in Movement 1]
- **A real, defensible niche (important context).** Owning a specific, true position ("the detailer the Del Webb community uses") beats competing for a generic head term. It is cheaper to win, harder for a national chain to contest, and exactly the kind of specific fact an AI answer can attach to a recommendation. [CHARACTERIZATION: a strategic claim that specificity outcompetes generic terms for small local businesses, illustrated by the Sun City case]
## What stays the same
Underneath all of it, the job has not changed. A local business has always won by being genuinely good, being known in its community, and being easy to find and contact. [CHARACTERIZATION: a restatement of local-business fundamentals, offered as the stable core beneath the changing surface] The machinery on top has changed three times now (directories, then search, now AI answers), and each time the businesses that adapted fastest were the ones whose underlying reputation was real enough to survive the new measurement.
What this really means in practice: The work that pays off in 2026 is the same work that paid off in 2016, made machine-readable. Be excellent, collect real reviews, keep your facts identical everywhere, claim a specific niche, and make the site fast and structured. Do that and you are findable in the link economy and the citation economy at once.
## A practical sequence
For an operator who wants the order of operations, the OG Detailing engagement suggests a priority list that generalizes:
1. **Fix the foundation.** A fast, structured, mobile-first site with complete `LocalBusiness` JSON-LD, canonical tags, a sitemap, and 301s from any old URLs. This is table stakes for all three surfaces. [FACT: webdev-vitals]
2. **Get crawled and indexed.** Verify the site in Google Search Console and Bing Webmaster Tools, submit the sitemap, and request indexing. Bing's index also feeds some AI search, so this is a GEO step, not just an SEO one.
3. **Win the off-site signals.** A complete Google Business Profile, real reviews solicited from happy customers (mentioning the town and the service), and consistent contact data across every listing. This is the slow, compounding work, and it is owner-led. [FACT: google-local-ranking]
4. **Claim the niche.** Find the specific, true, low-competition position the business already occupies, and build a genuine page for it. [CHARACTERIZATION: generalizes the Sun City tactic into a repeatable step]
5. **Keep the facts identical, and measure.** Audit name, address, phone, hours, and credentials across every surface, and re-check rankings and AI answers on a fixed schedule so drift is caught early.
The order matters because the steps build on each other: a fast structured site that no one has indexed is invisible, and a perfectly indexed site with inconsistent facts gets hedged by the very AI answers it is trying to win. [CHARACTERIZATION: explains the dependency ordering, reasoning rather than a measured result]
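The on-page items in step 1 can be spot-checked mechanically before moving on to the slower steps. What follows is a minimal sketch using only the Python standard library; the `HeadAudit` class, the list of checks, and the sample page are all hypothetical, and the audit only tests for the presence of each element, not its quality.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Record which basic on-page elements appear in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.found = {
            "title": False,
            "meta_description": False,
            "canonical": False,
            "open_graph": False,
            "json_ld": False,
        }
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and a.get("name") == "description" and a.get("content"):
            self.found["meta_description"] = True
        elif tag == "meta" and (a.get("property") or "").startswith("og:"):
            self.found["open_graph"] = True
        elif tag == "link" and a.get("rel") == "canonical" and a.get("href"):
            self.found["canonical"] = True
        elif tag == "script" and a.get("type") == "application/ld+json":
            self.found["json_ld"] = True

    def handle_data(self, data):
        # Only a non-empty <title> counts as present.
        if self._in_title and data.strip():
            self.found["title"] = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def missing_elements(html: str) -> list[str]:
    """Return the names of the checked elements that the page lacks."""
    audit = HeadAudit()
    audit.feed(html)
    return [name for name, present in audit.found.items() if not present]


# Hypothetical page that forgot its canonical tag.
SAMPLE_PAGE = """
<html><head>
<title>Ceramic Coating in Huntley, IL</title>
<meta name="description" content="Ceramic coating from an IDA-certified detailer.">
<meta property="og:title" content="Ceramic Coating in Huntley, IL">
<script type="application/ld+json">{"@type": "Service"}</script>
</head><body></body></html>
"""

print(missing_elements(SAMPLE_PAGE))
```

For the sample page the audit reports `canonical` as the only missing element. Running a check like this across every URL in the sitemap is one way to confirm the foundation before requesting indexing in step 2.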
## The honest uncertainty
It would be a mistake to end with false precision. AI Overview prevalence rose to about 25% of queries and then fell back toward 16% within the same year. [FACT: semrush-aio] Crawler policies are changing month to month. The headline consumer figures come from single surveys. [CHARACTERIZATION: a caveat on the volatility of every figure in this report, restating the limits noted in the sources] Anyone who tells a local business they know exactly where AI search lands in two years is guessing.
[PROJECTION: conditional on AI-mediated discovery continuing to grow even if its exact form keeps shifting] What is safe to act on is the direction, not the date. If discovery keeps moving toward generated answers, the businesses positioned to win are the ones whose facts are real, consistent, structured, and specific, because those are the only inputs every version of the future rewards. The surface will keep changing. The reason a customer trusts a local business, and the signals that prove that trust to a machine, are changing far more slowly.
---
## Sources
**Classic local ranking and reviews**
- {google-local-ranking} [Google Business Profile Help - Improve your local ranking on Google](https://support.google.com/business/answer/7091) - the three official factors (relevance, distance, prominence) and that reviews and responses help local ranking
- {brightlocal-lcrs} [BrightLocal - Local Consumer Review Survey 2026](https://www.brightlocal.com/research/local-consumer-review-survey/) - 97% read reviews; 68% require 4+ stars (up from 55%); 31% require 4.5+ (up from 17%); AI use for local recommendations 6% to 45%; Google's share 83% to 71%
- {brightlocal-lcrs-2025} [BrightLocal - Local Consumer Review Survey 2025](https://www.brightlocal.com/research/local-consumer-review-survey-2025/) - prior-year baseline for the review-threshold and AI-use comparisons
**Structured data and rich results**
- {google-review-snippet} [Google Search Central - Review snippet (structured data) documentation](https://developers.google.com/search/docs/appearance/structured-data/review-snippet) - self-controlled reviews make a page ineligible for the star feature; ratings must be sourced directly from users, not curated by editors
**AI Overviews, zero-click, and click-through**
- {semrush-aio} [Semrush - AI Overviews study (2025)](https://www.semrush.com/blog/semrush-ai-overviews-study/) - AIO trigger rate 6.49% (Jan) to 24.61% (Jul) to 15.69% (Nov) 2025; same-keyword zero-click 33.75% to 31.53%
- {pew} [Pew Research Center - Google users are less likely to click links when an AI summary appears (Jul 2025)](https://www.pewresearch.org/short-reads/2025/07/22/google-users-are-less-likely-to-click-on-links-when-an-ai-summary-appears-in-the-results/) - 18% of US-adult searches produced an AI summary (Mar 2025); 8% click a link with a summary vs 15% without; 1% click within the summary
- {ahrefs-aio} [Ahrefs - AI Overviews reduce clicks (update)](https://ahrefs.com/blog/ai-overviews-reduce-clicks-update/) - position-1 CTR reduction of 58% for AIO queries by Dec 2025, up from 34.5% in April 2025
**GEO and llms.txt**
- {geo-paper} [Aggarwal et al. - GEO: Generative Engine Optimization (arXiv 2311.09735, KDD 2024)](https://arxiv.org/abs/2311.09735) - GEO can boost visibility up to ~40%; citations, quotations, and statistics are the most effective methods
- {geo16} [Kumar & Palkhouski - GEO-16 citation audit (arXiv 2509.10762, 2025)](https://arxiv.org/pdf/2509.10762) - 1,702 citations across Brave, Google AIO, Perplexity; page quality odds ratio 4.2; metadata/freshness, semantic HTML, and structured data the strongest correlates (observational preprint, B2B SaaS scope)
- {otterly-llmstxt} [OtterlyAI - The llms.txt experiment](https://otterly.ai/blog/the-llms-txt-experiment/) - only 84 of 62,100+ AI-bot visits (0.1%) requested /llms.txt over 90 days
- {sej-llmstxt} [Search Engine Journal - Google says llms.txt comparable to the keywords meta tag](https://www.searchenginejournal.com/google-says-llms-txt-comparable-to-keywords-meta-tag/544804/) - Google representatives state Google does not use llms.txt
- {sej-llmstxt-300k} [Search Engine Journal - llms.txt shows no clear effect on AI citations across 300k domains](https://www.searchenginejournal.com/llms-txt-shows-no-clear-effect-on-ai-citations-based-on-300k-domains/561542/) - large-scale corroboration of no measurable citation effect
**Crawlers**
- {openai-bots} [OpenAI - Bots and crawlers documentation](https://developers.openai.com/api/docs/bots) - GPTBot (training), OAI-SearchBot (ChatGPT search surfacing), ChatGPT-User (user-triggered fetch) are distinct with different robots.txt behavior
- {google-extended} [Google Search Central - Google crawlers and Google-Extended](https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers) - Google-Extended controls Gemini/Vertex training and does not affect Search inclusion or ranking
- {almcorp-claude} [ALM Corp - Anthropic Claude bots and robots.txt strategy](https://almcorp.com/blog/anthropic-claude-bots-robots-txt-strategy/) - roles of ClaudeBot, anthropic-ai, and Claude-Web
**Performance**
- {webdev-vitals} [web.dev - Web Vitals](https://web.dev/articles/vitals) - LCP 2.5s, INP 200ms (replaced FID in 2024), CLS 0.1 as the "good" thresholds at the 75th percentile
**Case study**
- {ogdetailing} [OG Detailing (ogdetailing.org)](https://ogdetailing.org) - live site and structured data; first-party figures (Lighthouse scores, LCP, indexing and ranking baseline, discoverability score, Sun City strategy) are from the Holbrook Solutions engagement's internal SEO baseline dated 2026-06-03
---
# Understanding IPOs: How They Work and What It Means to Invest in One
*A definitional walk through initial public offerings — what they are, why a company does one, how the process runs from S-1 to opening trade, the two prices and the first-day pop, how to read the headline numbers, what breaks a deal, lockups, and the access rules that decide whether an ordinary investor can participate at all.*
Source: https://theholbrookreport.com/reports/understanding-ipos/ · Published: 2026-06-01
An initial public offering (IPO) is the first sale of a private company's stock to the public, after which the shares trade on an exchange. The mechanics of that event are consistent across deals, and they determine both what a buyer is actually purchasing and which price that buyer can reach. The sections below build from the ground up: what an IPO is and why a company does one, how the process runs, the two prices that exist on the first day, how to read the headline numbers, what causes a deal to break, the lockup calendar, the access rules that govern whether an ordinary investor can participate, and the alternative routes a company can take to the public market.
Every claim is labeled as [FACT] (a sourced data point), [CHARACTERIZATION] (a descriptive label placed on those facts), or [PROJECTION] (a conditional "if X then Y" statement, never a prediction). The tags are interactive: a fact links to its source, and a characterization or projection gives its specific reasoning on hover or keyboard focus. The severity pills mark how central a mechanism is to understanding the offering — Central mechanism, Important context, Counterexample — not how dangerous it is.
## What an IPO is and why a company does one
A company is private when its shares are held by a closed set of owners — founders, employees, and the funds that backed it through successive private rounds — and there is no continuous public market in which to buy or sell them. An IPO converts the company to public status: a tranche of shares is sold to outside investors and the stock begins trading on an exchange, where its price is set continuously by supply and demand.
Companies pursue this for a small set of reasons. The offering can raise fresh capital when newly issued shares are sold, the cash going to the company itself. It provides liquidity for early investors and employees, whose previously unsellable shares become tradable. It establishes a public, market-tested valuation rather than a privately negotiated one. And it gives the company publicly traded stock to use as an acquisition currency and a recruiting tool. The flip side is the burden of being public: audited quarterly reporting, regulatory scrutiny, and a share price that reacts to every disclosure.
For an outside investor, the relevant consequence is that the IPO is the first moment shares can be bought at all — and, as the rest of this piece develops, the terms of that first moment are set by the company and its bankers, not by the buyer.
## How the process works
From registration to opening trade
Important context
The sequence from private company to traded stock is standardized. A company files a registration statement — for a U.S. IPO this is Form S-1 — with the SEC, which reviews it. A public S-1 with audited financials follows, then an amended S-1 carrying a proposed price range. Management then conducts a roadshow, during which the underwriters run a bookbuilding process — collecting orders from institutions to gauge demand. The offering is priced the night before trading, the stock opens the next morning, a lockup expires roughly 90 to 180 days later, and the first public earnings reports arrive after that. [FACT: valuethemarkets]
The practical division for a long-horizon buyer is that the first day is marketing and the first year of filings is information. The offer price is set by the underwriters and the company; an individual buyer almost never transacts at that price and instead buys in the open market once trading begins, frequently at a higher number. [CHARACTERIZATION: a statement about the typical case, not a measured rate — it follows from how allocation routes offer-price shares to institutions, but deals vary (some grant small retail allocations, and not every IPO opens higher), so it describes the central tendency rather than a certainty] There is no obligation to participate on the first day.
Two earlier stops on this path deserve their own definition because press coverage frequently conflates them. The S-1 is the most information-rich document a company going public produces: audited financial statements, a business description, the share structure, and a legally required Risk Factors section. A confidential draft S-1 is an earlier version submitted privately to the SEC, permitted under the 2012 JOBS Act for emerging growth companies (firms under $1B in annual revenue at filing). The public cannot read it, no offering can be marketed or priced from it, and it is not a commitment to sell shares; it starts the SEC's review clock and preserves the option to go public. Anthropic's 2026-06-01 step was this confidential draft. [FACT: anthropic]
What this really means in practice: Until a public S-1 is filed, there is nothing to fundamentally analyze — no audited revenue, no share count, no price. The figures circulating in the press before that point are private marks and run-rates, not the audited numbers the public filing will contain. When it arrives, that filing — not a headline — is the document in which the recurring failure causes (a rich multiple, insider supply, control structure, related-party revenue) actually become visible.
## The two prices and the first-day "pop"
The first-day open is, on average, the most expensive entry
Central mechanism
Two prices coexist at launch. The offer price is fixed the night before trading and sold to a chosen list of buyers. The opening price is the first price at which the stock trades the next morning, set by live supply and demand, and is frequently well above the offer price.
IPOs are, on average, deliberately priced below where they open. Underpricing is the practice of setting the offer price below the level demand will support; the "pop" is the resulting jump from the offer price to the opening price. Long-running research by Professor Jay Ritter (University of Florida) puts the average U.S. first-day return at roughly 17 to 18 percent from 2001 through 2023, having spiked to about 65 percent during the 1999–2000 dot-com period. [FACT: ritter]
The mechanism determines who captures that return. Institutions that receive an allocation at the offer price hold the gain and often sell into the retail demand that drives the open higher. By the time an individual account can transact, it is frequently buying from those sellers at a price that already embeds the excitement. [CHARACTERIZATION: the research measures the average first-day jump; naming the retail buyer as the one who pays it is the author's reading of who sits on each side of that trade — a directional conclusion the literature supports, not a measurement of any specific buyer's loss]
What this really means in practice: The pop is not money handed to the person buying on the first day; it is the cost that person pays. The discount embedded in the offer price was already delivered — to allocation holders — before the first public trade. Buying the open means buying after the discount has been distributed.
The cautionary cases are the offerings that "worked" spectacularly on the first day and still charged anyone who bought the open. The figures below are first-day moves measured from the offer price — the discount the open-market buyer did not receive.
| IPO | Offer price | Open / first-day close | First-day move |
|---|---|---|---|
| Snowflake (Sep 2020) | $120 | opened $245, closed ~$254 | +112% [FACT: snowflake] |
| Airbnb (Dec 2020) | $68 | opened $146, closed ~$144.71 | +112% [FACT: airbnb] |
| Visa (Mar 2008) | $44 | closed ~$56.50 | +~28% [FACT: visa-ipo] |
In each case the institutions allocated at the offer price captured the move. Snowflake and Visa were both strong businesses with very different long-run outcomes for the first-day buyer: a buyer at Snowflake's open is reported to be still down roughly 41 percent more than five years later [FACT: snowflake-drawdown], while Visa proved a strong long-run holding [FACT: visa-longrun]. Yet the first-day open buyer paid the full markup over the offer price in all three. The point is not that these were bad companies; it is that the first-day open repeatedly charges the open-market buyer the underpricing that allocation holders already captured. [CHARACTERIZATION: a label drawn from the offer-to-open spreads in the table, not a claim about company quality; two of the three were strong businesses, so "charges the buyer" refers only to the first-day entry price, not to long-run returns]
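The first-day moves in the table can be reproduced from the offer price and the first-day close. The following is a minimal Python sketch, not part of any cited source; the function name `first_day_move` is illustrative, and the closes are the approximate figures quoted in the table.

```python
def first_day_move(offer_price: float, reference_price: float) -> float:
    """Percentage change from the offer price to a later first-day price."""
    if offer_price <= 0:
        raise ValueError("offer_price must be positive")
    return (reference_price - offer_price) / offer_price * 100.0


# (offer price, approximate first-day close), as quoted in the table above.
debuts = {
    "Snowflake": (120.00, 254.00),
    "Airbnb": (68.00, 144.71),
    "Visa": (44.00, 56.50),
}

for name, (offer, close) in debuts.items():
    move = first_day_move(offer, close)
    print(f"{name}: {move:+.1f}% from offer to first-day close")
```

The computed moves land within about a point of the table's rounded figures; the small differences come from rounding in the reported closes. The same function applied to an opening price instead of a close gives the markup an open-market buyer paid over the offer price.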
Counter-receipt: A 2025-vintage study argues the classic underpricing figure may be overstated by as much as 40 percent because a small set of enthusiast buyers skews the first-day data. [FACT: globeandmail] Even under that revision the directional point holds: the open is set by the most eager marginal buyer, not the most disciplined one. [CHARACTERIZATION: the conclusion holds even if the study's ~40% downward revision is right — a smaller average pop would still be set at the margin by the most willing buyer, so the point about who sets the open price doesn't depend on the disputed magnitude]
Allocation is the sorting step that explains the asymmetry. Underwriters distribute offer-price shares overwhelmingly to large institutions and favored clients; ordinary retail accounts rarely receive a meaningful allocation on a hot deal. Some brokerages run IPO-allocation programs that grant retail clients a few offer-price shares, but on in-demand deals the amounts tend toward tiny or zero. The structural consequence is that most individuals transact at the opening price, which is precisely the number the pop has already inflated.
## Reading the numbers
A handful of figures dominate IPO coverage, and each is easy to misread. Three pairs of terms separate the honest reading from the headline.
### Run-rate versus audited revenue
Run-rate is a pace, not a year of collected revenue
Central mechanism
A run-rate takes a short recent stretch of revenue, say one month, and multiplies it out to a full year. If a company books about $3.9B in a single month, twelve times that is roughly a $47B run-rate, even though it has not collected $47B over a real year. A run-rate differs from audited annual revenue in three ways: it assumes the recent pace holds for twelve straight months (which for a fast-growing company overstates the real trailing-twelve-month total); it is typically reported by the press or the company, not checked by independent auditors; and it is an extrapolation, whereas audited revenue is the actual money recognized over a real 12-month period under accounting rules.
What this really means in practice: When a figure such as Anthropic's press-reported "~$47B run-rate" (late May 2026) appears, read it as "running at a pace that would total $47B a year if this exact month repeated twelve times — per the press, unaudited." [FACT: yahoo-anthropic] The gap between that sentence and "$47B in revenue" is the gap worth tracking. Any valuation ratio built on a run-rate where audited revenue belongs is distorted by the same gap. [CHARACTERIZATION: reasoning from the definitions, not a sourced figure — because a revenue multiple divides value by revenue, feeding it an annualized run-rate instead of audited revenue understates the multiple by the same proportion the run-rate overstates revenue]
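The run-rate arithmetic, and the way it distorts a revenue multiple, can be made concrete. The sketch below is illustrative only: the $3.9B monthly figure comes from the example above, while `company_value` and `trailing_revenue` are hypothetical round numbers chosen to show the proportion, not figures for any real company.

```python
def annualized_run_rate(period_revenue: float, periods_per_year: int = 12) -> float:
    """Extrapolate one period of revenue to a full year at a constant pace."""
    return period_revenue * periods_per_year


def revenue_multiple(company_value: float, annual_revenue: float) -> float:
    """Company value divided by annual revenue (price-to-sales)."""
    if annual_revenue <= 0:
        raise ValueError("annual_revenue must be positive")
    return company_value / annual_revenue


monthly_revenue = 3.9    # $B booked in one month, from the example above
run_rate = annualized_run_rate(monthly_revenue)   # about 46.8, i.e. roughly $47B

# Hypothetical inputs, for illustration only.
company_value = 600.0    # $B
trailing_revenue = 30.0  # $B actually recognized over the past twelve months

multiple_on_run_rate = revenue_multiple(company_value, run_rate)
multiple_on_trailing = revenue_multiple(company_value, trailing_revenue)

# The run-rate overstates revenue by the same factor that it
# understates the multiple.
revenue_overstatement = run_rate / trailing_revenue
multiple_understatement = multiple_on_trailing / multiple_on_run_rate

print(f"run-rate: ${run_rate:.1f}B")
print(f"multiple on run-rate: {multiple_on_run_rate:.1f}x")
print(f"multiple on trailing revenue: {multiple_on_trailing:.1f}x")
```

Under these assumed inputs the two ratios at the end are equal, which is the point of the characterization above: whatever factor separates the run-rate from trailing revenue is the factor by which a run-rate-based multiple looks cheaper.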
### Post-money valuation versus market capitalization
Post-money valuation is what a company is judged to be worth immediately after a fresh private investment round closes; pre-money is the value just before that cash arrives. Worked example using Anthropic's press-reported figures: a "~$965B post-money on a $65B Series H" (a Series H is simply the eighth lettered funding round) implies a pre-money of about $900B, with the new investors owning roughly $65B ÷ $965B, about 6.7 percent. [FACT: fortune-anthropic] A post-money figure is a number a handful of investors negotiated in one private deal — a private mark, not a market-clearing public price. [CHARACTERIZATION: a distinction about how the number was set, not whether it is too high or low — a post-money figure is agreed by a few investors in one negotiation, while a market price is discovered continuously by many buyers and sellers]
The public-market cousin is market capitalization — share price times shares outstanding, set continuously by the open market rather than by one round. Ten billion shares at $100 each is a $1 trillion market cap. A private mark is not a promise the public market will agree; the IPO is where a continuously tested price is discovered, and it can land above or below the private mark.
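The two worked examples above reduce to three one-line formulas. This is a minimal Python sketch using the press-reported round figures quoted in this section; the function names are illustrative.

```python
def pre_money(post_money: float, new_investment: float) -> float:
    """Value just before the new cash arrives."""
    return post_money - new_investment


def new_investor_stake(post_money: float, new_investment: float) -> float:
    """Fraction of the company the new round's investors own after closing."""
    return new_investment / post_money


def market_cap(share_price: float, shares_outstanding: float) -> float:
    """Share price times shares outstanding."""
    return share_price * shares_outstanding


# Private-round example from the text, in $B.
post, raised = 965.0, 65.0
print(f"pre-money: ${pre_money(post, raised):.0f}B")
print(f"new investors own: {new_investor_stake(post, raised):.1%}")

# Public-market example from the text: ten billion shares at $100 each.
print(f"market cap: ${market_cap(100.0, 10e9) / 1e12:.1f} trillion")
```

The formulas are identical in shape, which is why the two numbers are so easily conflated; the difference lies in who set the price being multiplied, not in the arithmetic.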
### P/E ratio versus revenue multiple
Two ways to express how expensive a stock is. The price-to-earnings (P/E) ratio is the share price divided by per-share annual profit. An "85× P/E" means paying $85 for $1 of annual profit, which only pays off if profits grow fast — pairing "85×" with "slowing growth" was the warning in Facebook's case. The revenue multiple (price-to-sales) divides company value by annual revenue and is used when a company has little or no profit, so a P/E would be negative or meaningless. A company valued at $100B with $10B of revenue trades at a 10× revenue multiple. A revenue multiple is only as honest as the revenue figure fed into it — which is why the run-rate distinction above is load-bearing.
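Both ratios are simple divisions, and the reason the revenue multiple exists at all is visible in the edge case. The sketch below is illustrative; returning `None` for a loss-making company is a design assumption of this example, chosen to mirror the "negative or meaningless" case described above.

```python
from typing import Optional


def price_to_earnings(share_price: float, earnings_per_share: float) -> Optional[float]:
    """Share price divided by per-share annual profit.

    Returns None when earnings are zero or negative, since the ratio
    is not meaningful for a company without profit.
    """
    if earnings_per_share <= 0:
        return None
    return share_price / earnings_per_share


def revenue_multiple(company_value: float, annual_revenue: float) -> float:
    """Company value divided by annual revenue (price-to-sales)."""
    if annual_revenue <= 0:
        raise ValueError("annual_revenue must be positive")
    return company_value / annual_revenue


# Paying $85 for $1 of annual profit is an 85x P/E.
print(price_to_earnings(85.0, 1.0))

# A loss-making company has no usable P/E, so coverage falls back
# on the revenue multiple: $100B of value on $10B of revenue is 10x.
print(price_to_earnings(85.0, -2.0))
print(revenue_multiple(100.0, 10.0))
```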
## What breaks an IPO
Priced too high, or structurally flawed
Important context
The reverse of underpricing is an offering that comes public too expensive, or with structural weaknesses, and trades below its first reference price within weeks. Roughly 21 to 25 percent of recent U.S. IPOs posted negative first-day returns — they closed the first day below the offer price. [FACT: duke]
| IPO | What happened | Stated cause(s) |
|---|---|---|
| Facebook (May 2012) | $38 offer; below offer within days ($34.03, then $31); under $18 by Sept 2012 | 85× P/E into slowing growth; mobile-revenue softness disclosed selectively to large investors; ~57% of IPO shares sold by insiders; a Nasdaq trading glitch on debut [FACT: facebook] |
| Lyft (Mar 2019) | opened ~$87 vs $72 offer; fell ~25% within a month | priced into decelerating growth [FACT: lyft] |
| Uber (May 2019) | $45 offer; opened $42; closed first day −7.6% | opened below offer [FACT: uber] |
| Rivian (Nov 2021) | $78 offer at ~$66.5B valuation; below $78 within weeks; reported down ~90% since | rich valuation, no path to profit [FACT: rivian] |
| Robinhood (Jul 2021) | $38 offer; dipped below offer in the first week | weak subsequent quarter [FACT: robinhood] |
| Peloton (Sep 2019) | ended first day ~11% below the IPO price | broken debut [FACT: peloton] |
| WeWork (2019) | IPO pulled entirely | heavy losses, tangled structure, weak governance, founder-control concerns [FACT: wework] |
The recurring causes are a small set: a price embedding too much future growth, heavy insider supply, governance or control problems, and no clear path to profit. [CHARACTERIZATION: a common thread found by inspecting the cases in the table, not a measured frequency — these causes recur across the documented breakages, but the piece doesn't claim a statistical rate at which each appears] None is visible from a headline valuation; all of them live in the S-1.
Two of those causes are specific structural features worth defining, because they recur and because both are disclosed in the filing rather than the press.
**Dual-class and super-voting shares.** Many founder-led technology companies go public with two share classes: the public buys Class A shares with one vote each, while founders hold Class B / super-voting shares with many votes each (often ten). The effect is that a public holder can own a real economic slice of the company while holding little say in how it is run. If founders hold 15 percent of the shares but those shares carry ten votes each, they can control a majority of the voting power while owning a minority of the economics. [CHARACTERIZATION: an illustrative example using round numbers (15% of shares at ten votes each) to show how super-voting mechanically separates control from economics — not any specific company's actual share split] It is not automatically adverse — it can let founders pursue a long-term plan — but it is a structural feature to price in rather than discover later. Anthropic is reported to combine founder control with a Long-Term Benefit Trust, a governance body intended to steer the company toward its mission; the share-class details are unknown until the public S-1. [FACT: yahoo-anthropic]
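The round-number example above can be checked directly. This is a minimal Python sketch of the illustrative 15 percent case, assuming exactly two share classes; it does not describe any specific company's share structure.

```python
def founder_voting_share(
    founder_share_fraction: float,
    votes_per_founder_share: int = 10,
    votes_per_public_share: int = 1,
) -> float:
    """Fraction of total votes held by founders in a two-class structure."""
    if not 0.0 <= founder_share_fraction <= 1.0:
        raise ValueError("founder_share_fraction must be between 0 and 1")
    founder_votes = founder_share_fraction * votes_per_founder_share
    public_votes = (1.0 - founder_share_fraction) * votes_per_public_share
    return founder_votes / (founder_votes + public_votes)


economic_stake = 0.15  # founders hold 15% of the shares, as in the example
voting_stake = founder_voting_share(economic_stake)
print(f"economic stake: {economic_stake:.0%}, voting stake: {voting_stake:.0%}")
```

With ten votes per founder share, 15 percent of the economics carries roughly 64 percent of the votes, which is the separation of control from ownership the paragraph describes.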
**Related-party revenue.** This is revenue a company books from a counterparty that is also one of its investors, owners, or affiliates. The concern is that such demand can be partly circular: an investor supplies cash, some of which returns as the investor's own spending and is recognized as revenue. If a backer invests $10B and in the same period spends $3B buying the company's product, part of the reported revenue is funded by the company's own investor and may not reflect independent customer demand. Amazon and Google are reported as both large strategic investors in Anthropic and as cloud/compute partners — exactly the relationship the S-1's related-party and customer-concentration disclosures exist to reveal. [FACT: fortune-stake] It does not make the revenue fake; it means independent demand should be discounted until the filing quantifies those flows. [CHARACTERIZATION: guidance on how to weigh the revenue, not a finding that any of it is circular — related-party flows may be entirely legitimate, but until the S-1 sizes them, how much of the revenue proves independent demand is unknown, so the prudent reading discounts it]
## Lockups and lockup expiry
A scheduled supply event roughly 90 to 180 days after listing
Important context
A lockup is a contractual promise by insiders — founders, employees, early-investor funds — not to sell for a set window after the IPO, most commonly 180 days, sometimes staggered (for example 25 percent released at 90 days and the balance at 180). Its purpose is to keep that large pile of insider shares from flooding the float on the first day. When the lockup expires, that supply becomes eligible to sell, and academic evidence consistently shows small negative abnormal returns around unlock dates, commonly in the low-single-digit-percent range. [FACT: valuethemarkets]
The lockup expiry is therefore a known, scheduled supply event. It is not guaranteed to produce a lower price, since a strongly performing stock can absorb the supply, but it is a feature of the calendar that first-day coverage rarely mentions. [CHARACTERIZATION: the unlock date is a fact; treating it as downward pressure on the price is conditional, because the added supply weighs on price only if demand doesn't absorb it, which a strong stock can, so this flags a tendency around unlocks, not a prediction that a given stock falls]
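Because the unlock dates are fixed at listing, they can be written down on day one. The sketch below computes a staggered schedule like the 25 percent at 90 days, balance at 180 days example above. The listing date is hypothetical, and counting in calendar days is an assumption of this example; the actual terms of any lockup are set out in the company's filing.

```python
from datetime import date, timedelta


def unlock_schedule(
    listing_date: date,
    tranches: list[tuple[int, float]],
) -> list[tuple[date, float]]:
    """Map (days_after_listing, fraction_released) pairs to calendar dates.

    Fractions must sum to 1.0 so the whole locked position is accounted for.
    """
    total = sum(fraction for _, fraction in tranches)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("tranche fractions must sum to 1.0")
    return [
        (listing_date + timedelta(days=days), fraction)
        for days, fraction in sorted(tranches)
    ]


# Hypothetical listing date; staggered release as in the example above.
listing = date(2026, 1, 15)
schedule = unlock_schedule(listing, [(90, 0.25), (180, 0.75)])

for unlock_date, fraction in schedule:
    print(f"{unlock_date.isoformat()}: {fraction:.0%} of locked shares eligible to sell")
```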
## Access: who can reach which price
The realistic options for an ordinary investor
Central mechanism
This is the part an investor most needs and the part the coverage least explains. The blunt summary is that an ordinary individual generally cannot buy at the offer price, generally cannot reach the private markets that precede the IPO, and is left with the open market — the one place where the pop has already been paid. The concrete routes:
**Offer-price allocation — usually closed.** Allocation, defined above, is why an individual usually cannot buy at the offer price; underwriters route those shares to institutions. Some brokerages run IPO-allocation programs that may grant retail clients a few offer-price shares, but on in-demand deals the amounts tend toward tiny or zero.
**Pre-IPO secondaries — gated by the accredited-investor rule.** Before an IPO, shares of a still-private company can sometimes be bought on pre-IPO secondary markets such as Forge, EquityZen, and Hiive. These are generally open only to accredited investors. The accredited investor rule determines whether those markets are open at all. As of 2026 an individual qualifies mainly by income — over $200,000 a year alone, or $300,000 jointly with a spouse, in each of the last two years with the same expected this year — or by net worth over $1 million excluding the value of a primary residence; certain license holders (Series 7, 65, or 82) also qualify. [FACT: sec-accredited] The net-worth test excludes the primary residence. [FACT: sec-networth] These thresholds have been unchanged since they were set; legislation that would add a qualification-by-exam path independent of wealth has passed the House and is not yet law, so it cannot be relied upon today. [FACT: sec-accredited] [PROJECTION: conditional on the House-passed exam-qualification bill becoming law — if it does, wealth would no longer be the only route to accredited status; since it hasn't, this is a possible future path, not something an investor can use today]
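The thresholds above translate into a short checklist. The following is a simplified Python sketch of the three routes named in this section (income, net worth, license), not a legal determination; the class and field names are illustrative, and other qualification routes that the rule may contain are not modeled.

```python
from dataclasses import dataclass

QUALIFYING_LICENSES = frozenset({"Series 7", "Series 65", "Series 82"})


@dataclass
class Individual:
    income_last_two_years: tuple[float, float]   # annual income, two prior years
    expected_income_this_year: float
    joint_with_spouse: bool                      # True if income figures are joint
    net_worth_excluding_primary_residence: float
    licenses: frozenset[str] = frozenset()


def is_accredited(person: Individual) -> bool:
    """Apply the income, net-worth, and license tests described above."""
    income_threshold = 300_000 if person.joint_with_spouse else 200_000
    income_ok = (
        all(year > income_threshold for year in person.income_last_two_years)
        and person.expected_income_this_year > income_threshold
    )
    net_worth_ok = person.net_worth_excluding_primary_residence > 1_000_000
    license_ok = bool(QUALIFYING_LICENSES & person.licenses)
    return income_ok or net_worth_ok or license_ok


# Hypothetical individual: high income, modest net worth, no license.
example = Individual(
    income_last_two_years=(250_000, 260_000),
    expected_income_this_year=255_000,
    joint_with_spouse=False,
    net_worth_excluding_primary_residence=100_000,
)
print(is_accredited(example))
```

Any one route is sufficient, which is why the function joins the three tests with `or`. Note that the income test uses "over" the threshold in every year, so a person at exactly $200,000 does not pass on income alone under this reading.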
**Diluted public proxies — available to anyone, but thin.** An ordinary investor can buy a public company that itself owns a stake in the target. Owning Amazon or Google is an indirect, heavily diluted way to hold Anthropic exposure: if such a holder owns a low-double-digit-percent stake, a move in the private company's value reaches the proxy's share price only after being diluted by everything else that company does. [CHARACTERIZATION: reasoned from the reported stake size, not a computed sensitivity — a parent holding a low-double-digit-percent stake passes only that fraction of the target's value change into its own price, further muted by the rest of its business; the direction is clear, the exact pass-through is not quantified here]
**The open market after listing — the default.** For most individuals the realistic path is to buy on the exchange once trading begins, on the buyer's own timeline. Because there is no obligation to buy on the first day, a buyer can also wait — past the opening pop, and past the lockup-expiry supply event — rather than transact at the open. [CHARACTERIZATION: a description of the timing the open market leaves available, not advice to use it — since nothing compels a first-day purchase, waiting past the pop or the unlock is possible; the piece notes these options exist without recommending the reader choose them]
## Routes to going public
The route changes what "the first day" means
Important context
A traditional IPO sells newly issued shares at a fixed offer price set by underwriters, raising fresh cash; it is the default this piece mostly describes. A direct listing skips the underwritten sale: the company lists existing shares and lets them trade, with no fixed offer price and (classically) no new money raised. Coinbase used a direct listing in April 2021 — which is why its first-day drop is measured from the open ($381 to ~$328, −14%) rather than from an offer price. [FACT: coinbase] A SPAC is a shell that goes public first with no operating business, then merges with a private company to bring it public through the back door; WeWork reached the public market via a SPAC in 2021 after pulling its traditional IPO. [FACT: wework]
Because a direct listing has no offer price and no allocation step, the offer-price-to-open gap that defines the pop does not exist in the same form. Knowing which route a deal uses tells a reader how to interpret its first-day numbers. [CHARACTERIZATION: a consequence that follows from the route definitions, not a measured effect — a direct listing has no offer price, so its first-day move is measured from the open, which means the same "−14%" means something different than it would after a traditional IPO]
### The counterexample: a modified Dutch auction
The first-day spread is a design choice, not a law
Counterexample
Google's 2004 IPO used a modified Dutch auction. In a Dutch auction, investors submit bids stating how many shares they want and the price they will pay; the company finds the single clearing price, and everyone who bid at or above it pays that one price. Because the broad market — not the underwriters' bookbuilding — sets the price, the format tends to compress the first-day pop. Google cut its range from $108–$135 to $85–$95, priced at $85, and rose a comparatively modest ~18 percent on the first day, the muted pop the format is designed to produce. ("Modified" means it retained some underwriter involvement.) [FACT: google-dutch]
This is the structural counterexample: when the crowd sets the price rather than a banker's order book, more of the value stays with the company and small buyers. It is rare, since most deals still use bookbuilding, but it demonstrates that the first-day spread is a feature of how most deals are structured, not an iron law. [CHARACTERIZATION: drawn from one well-documented case (Google 2004) as an illustration, not a statistic; a single auction compressing the pop shows the spread isn't inevitable, but one case can't establish how reliably auctions do so, hence "not an iron law" rather than a claim that auctions always compress the pop]
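The single-clearing-price rule described above can be sketched in a few lines. This is a simplified Python illustration of a uniform-price auction, not a reconstruction of Google's actual process: the bids are hypothetical, and pro-rata allocation among bidders at the clearing price is ignored.

```python
from typing import Optional


def clearing_price(
    bids: list[tuple[float, int]],
    shares_offered: int,
) -> Optional[float]:
    """Highest price at which cumulative demand covers the shares offered.

    Each bid is (price_per_share, shares_wanted). Bids are filled from the
    highest price down; every winning bidder pays the clearing price.
    Returns None if total demand is less than the shares offered.
    """
    demand = 0
    for price, quantity in sorted(bids, key=lambda bid: bid[0], reverse=True):
        demand += quantity
        if demand >= shares_offered:
            return price
    return None


# Hypothetical order book: (bid price, shares wanted).
bids = [(95.0, 100), (90.0, 200), (85.0, 300), (80.0, 400)]
price = clearing_price(bids, shares_offered=500)
print(f"clearing price: ${price:.2f}")

# Everyone who bid at or above the clearing price pays that one price.
winners = [bid for bid in bids if bid[0] >= price]
print(f"winning bids: {winners}")
```

In this hypothetical book, demand reaches the 500 shares on offer once the $85 bids are included, so $85 clears the auction and the $95 and $90 bidders pay $85 as well. The bidders themselves, rather than an underwriter's order book, determine where the price lands.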
## What an informed investor understands
Put end to end, the mechanics resolve into a coherent picture rather than a list of warnings. An IPO is the engineered transition of a company from a closed set of private owners to a continuously priced public market, undertaken for capital, liquidity, and a market-tested valuation — and the terms of that transition are set by the company and its underwriters before any outside individual can act. The process is legible in advance: the S-1 is the document where the business, the share structure, and the risks become visible, and until it is public the circulating figures are private marks and run-rates, not audited results.
The first day carries two prices for a reason. The offer price embeds a deliberate discount that is delivered to allocation holders before the first trade; the opening price is where that discount has already been paid. The headline numbers each have an honest and a flattering reading — run-rate versus audited revenue, post-money mark versus market-clearing price, revenue multiple versus the quality of the revenue under it — and the difference is knowable, not mysterious. The deals that break do so for a short, repeating list of reasons, every one of which is disclosed in the filing rather than the press. The lockup expiry is a dated supply event on the calendar from the day the company lists.
Most consequential for an ordinary investor is the access structure: the offer price is largely closed to individuals, the private markets that precede the IPO are gated by the accredited-investor thresholds, public proxies deliver only diluted exposure, and the open market — the one route open to everyone — is precisely where the pop has already been priced in. The route the company chooses changes what the first-day figures even mean, and the Dutch-auction case shows the first-day spread is a design choice rather than a fixed property of going public.
What an informed investor is left with, then, is not a verdict on any single offering but a way to read one: which price is being quoted and who already captured the discount, whether a number is audited or annualized, which structural features sit in the filing waiting to be read, and which of the few real access routes actually applies. The mechanics do not tell anyone whether to participate. They make clear exactly what participating would mean.
## Sources
The durable, load-bearing figures here — Ritter's first-day-return averages, the named case-study IPO numbers, and the accredited-investor thresholds — were spot-verified against primary and contemporaneous sources in June 2026. The Anthropic-specific figures are press-reported and provisional until a public S-1 exists.
**IPO mechanics, underpricing, and academic data**
- {ritter} [Jay R. Ritter — *Initial Public Offerings: Updated Statistics* (University of Florida)](https://site.warrington.ufl.edu/ritter/files/IPO-Statistics.pdf) — average first-day returns by era (~17–18% for 2001–2023; ~65% for 1999–2000)
- {ipohub} [IPOHub — IPO underpricing causes](https://www.ipohub.org/article/ipo-underpricing-causes) — who captures the first-day gain
- {globeandmail} [The Globe and Mail — study says underpricing data skewed by enthusiasts](https://www.theglobeandmail.com/investing/markets/stocks/FIG/pressreleases/34235615/ipo-pops-overstated-new-study-says-enthusiasts-skew-underpricing-data/) — the ~40%-overstatement claim
- {duke} [Duke FinReg Blog — negative first-day returns / IPO overpricing](https://sites.duke.edu/thefinregblog/2022/10/26/the-ipo-overpricing-phenomenon-debunking-the-determinants-of-negative-first-day-ipo-returns-in-the-us/) — the ~21–25% negative-first-day rate
- {valuethemarkets} [Value The Markets — IPO lockup period and expiry](https://www.valuethemarkets.com/analysis/investing-ideas/understanding-the-ipo-lockup-period-and-expiry) — 90–180-day lockup mechanics and lifecycle timing
- {cornell} [Cornell INFO 2040 — Dutch auction vs bookbuilding](https://blogs.cornell.edu/info2040/2012/09/26/going-public-is-a-dutch-auction-ipo-more-efficient-than-traditional-book-building/) — clearing-price mechanics
**Case studies**
- {snowflake} [CNBC — Snowflake debut](https://www.cnbc.com/2020/09/16/snowflake-snow-opening-trading-on-the-nyse.html) and [CNBC — Snowflake left $3.8B on the table](https://www.cnbc.com/2020/09/17/snowflakes-first-day-pop-means-ipo-left-3point8-billion-on-the-table-the-most-in-12-years.html) — offer $120, opened $245, +112% first day
- {snowflake-drawdown} [24/7 Wall St. — Snowflake IPO buyers still down ~41% years later](https://247wallst.com/investing/2026/04/22/if-you-bought-snowflake-ipo-youre-still-down-41-more-than-5-years-later/) — long-run drawdown
- {airbnb} [CNBC — Airbnb soars 112% in debut](https://www.cnbc.com/2020/12/10/airbnb-ipo-abnb-starts-trading-on-the-nasdaq.html) — offer $68, opened $146
- {visa-ipo} [CNN Money — Visa IPO priced at $44](https://money.cnn.com/2008/03/18/news/companies/visa_ipo.fortune/index.htm) — offer price and size
- {visa-longrun} [The Motley Fool — Visa long-run return](https://www.fool.com/investing/2020/02/13/if-you-invested-100-in-visa-ipo-this-is-how-much-m.aspx) — long-run holder outcome
- {google-dutch} [McLane — Google's 2004 Dutch auction IPO](https://www.mclane.com/insights/googles-dutch-auction-ipo-is-there-a-take-away-lesson-for-the-rest-of-us/) — range cut from $108–135 to $85–95, priced $85, ~18% first day
- {facebook} [Wikipedia — Initial public offering of Facebook](https://en.wikipedia.org/wiki/Initial_public_offering_of_Facebook) and [CNBC — Facebook below offer price](https://www.cnbc.com/2012/05/21/facebook-shares-fall-below-ipo-offering-price.html) — $38 offer, the decline below it, and the causes
- {uber} [CNBC — Uber ends first day down ~7%](https://www.cnbc.com/2019/05/10/uber-ipo-stock-starts-trading-on-the-new-york-stock-exchange.html) — $45 offer, opened $42
- {lyft} [Yahoo Finance — Lyft ~25% first-month drop](https://finance.yahoo.com/news/uber-ipo-to-begin-trading-on-nyse-135445721.html) — decelerating-growth context
- {rivian} [CNBC — Rivian prices at $78](https://www.cnbc.com/2021/11/09/rivian-prices-ipo-at-78-a-share-valuing-company-at-66point5-billion.html) and [Nasdaq — Rivian below $78](https://www.nasdaq.com/articles/rivian-stock-just-fell-below-its-ipo-price-of-$78-per-share:-time-to-buy) — offer $78, below offer within weeks
- {robinhood} [CNBC — Robinhood below IPO price](https://www.cnbc.com/2021/10/27/robinhood-drops-10percent-to-below-ipo-price-as-investors-worry-about-bleak-outlook.html) — below $38 offer
- {peloton} [Yahoo Finance — Peloton ~11% below offer](https://finance.yahoo.com/news/pelotons-stock-takes-spill-even-173555397.html) — broken debut
- {wework} [HBS Working Knowledge — WeWork, the IPO that shouldn't](https://www.library.hbs.edu/working-knowledge/wework-the-ipo-that-shouldn-t) — governance, pulled deal, later SPAC route
- {coinbase} [Yahoo Finance — Coinbase closes 14% below opening price](https://finance.yahoo.com/news/coinbase-direct-listing-stock-cryptocurrency-bitcoin-172655492.html) — direct listing opened $381, closed ~$328
**Regulatory definitions**
- {sec-accredited} [SEC — Accredited Investors](https://www.sec.gov/resources-small-businesses/capital-raising-building-blocks/accredited-investors) — income/net-worth thresholds, license-holder paths, and the pending qualification-by-exam proposal
- {sec-networth} [SEC — "Accredited Investor" Net Worth Standard](https://www.sec.gov/resources-small-businesses/small-business-compliance-guides/accredited-investor-net-worth-standard) — the primary-residence exclusion
- {carltonfields} [Carlton Fields — JOBS Act: Emerging Growth Companies](https://www.carltonfields.com/insights/publications/2012/jobs-act-emerging-growth-companies) — EGC definition (under $1B revenue) and confidential-draft eligibility
- {jdsupra} [JD Supra — JOBS Act confidential review option for EGC IPOs](https://www.jdsupra.com/post/contentViewerEmbed.aspx?fid=28294cae-1127-4886-8afe-fed9a97b6edd) — confidential draft S-1 mechanics
**The Anthropic filing (press-reported; to be re-confirmed against the public filing)**
- {anthropic} [Anthropic — confidential draft S-1 announcement](https://www.anthropic.com/news/confidential-draft-s1-sec) — filing status; no price or share count set
- {fortune-anthropic} [Fortune — Anthropic confidentially files IPO at $965B valuation](https://fortune.com/2026/06/01/anthropic-confidentially-files-ipo-965-billion-valuation/) — valuation, Series H
- {yahoo-anthropic} [Yahoo Finance — Anthropic files confidential S-1](https://finance.yahoo.com/markets/stocks/articles/anthropic-files-confidential-1-joins-161008569.html) — ~$47B run-rate, Google ~14% stake, Long-Term Benefit Trust
- {fortune-stake} [Fortune — Google/Amazon AI profits and the Anthropic stake](https://fortune.com/2026/04/30/google-amazon-ai-profits-anthropic-stake-bubble-earnings-2026/) — related-party / holder context