How Often Does Perplexity Recrawl Your Domain? A Complete Guide for Publishers

When Perplexity.ai serves up your content to millions of users asking questions, the traffic implications can be enormous. Unlike traditional search engines, Perplexity surfaces answers rather than blue links, and the referral traffic from Perplexity citations converts at 14.2% versus Google's 2.8%¹—a 5x quality multiplier that makes understanding their crawl behavior essential for any serious publisher.

But here's the challenge: Perplexity doesn't publish an official recrawl schedule. There is no "Search Console" equivalent where you can request re-indexing or monitor crawl frequency. What we do have are patterns, documentation, and third-party research that paint a picture of how often Perplexity recrawls your domain.

The Two Crawlers You Need to Know

Perplexity operates two distinct crawlers that behave differently and treat robots.txt rules differently. Understanding the difference is crucial for controlling how your content is accessed.

PerplexityBot is designed to surface and link websites in search results on Perplexity. It is not used to crawl content for AI foundation models.² This is Perplexity's primary crawler for discovery, and the good news is that PerplexityBot respects robots.txt. Unlike some AI crawlers that have been caught ignoring access controls, PerplexityBot checks and obeys robots.txt directives.³ It follows links and fetches content, abiding by robots.txt.⁴

Perplexity-User supports user actions within Perplexity. When users ask Perplexity a question, it might visit a web page to help provide an accurate answer and include a link to the page in its response. The critical difference: because a user requested the fetch, this fetcher generally ignores robots.txt rules.⁵ It is not used for web crawling or to collect content for training AI foundation models.⁶

This distinction matters enormously. If you block PerplexityBot in robots.txt, Perplexity may still access your content through user-initiated requests via Perplexity-User—but you won't be cited as a source in search results.
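To see how a given robots.txt treats PerplexityBot before deploying it, a minimal sketch using Python's standard `urllib.robotparser` is shown here. The robots.txt content, the `/drafts/` path, and the `example.com` URLs are hypothetical. The check only reflects what the file says; as noted above, Perplexity-User generally ignores these rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep PerplexityBot out of /drafts/ only,
# and leave everything open to all other crawlers.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /drafts/

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/guide/", "/drafts/post"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "PerplexityBot", url))
```

Running the same check with other user-agent tokens is a quick way to confirm that a rule aimed at PerplexityBot does not accidentally affect other crawlers.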

How Often Does Perplexity Crawl Your Site?

Perplexity performs live web retrieval by default for every query it processes.⁷ Live retrieval is not an optional feature or a special mode, but the default operational behavior across Perplexity's consumer interface, Pro Search experience, and developer APIs.⁸

That pipeline now processes approximately 780 million monthly queries, up 239% from 230 million in August 2024⁹. The platform serves an estimated 22 million active users with 85% retention and $100 million in annualized revenue.¹⁰ This scale means Perplexity's crawlers are constantly working, but the frequency varies significantly based on several factors.
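The growth figure can be sanity-checked with simple arithmetic. The sketch below uses only the two query counts quoted above; the function name is illustrative.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage growth from old to new."""
    return (new - old) / old * 100


if __name__ == "__main__":
    # 230 million monthly queries (August 2024) to about 780 million.
    growth = percent_increase(230_000_000, 780_000_000)
    print(f"{growth:.1f}% increase")
```

The result rounds to 239%, which matches the quoted figure.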

High-traffic and authoritative sites such as major news outlets, government pages, and popular blogs tend to be refreshed frequently, often within hours of publication.¹¹ Content older than 60 days is at a measurable disadvantage, regardless of backlinks or domain authority.¹² Perplexity is reported to be 3.3x fresher than Google,¹³ meaning it prioritizes recency more aggressively than traditional search engines.
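One way to act on the 60-day figure is to flag pages whose last update falls outside that window. This is a minimal sketch: the 60-day threshold comes from the claim above, while the function name and the sample dates are hypothetical.

```python
from datetime import date


def is_stale(last_updated: date, today: date, max_age_days: int = 60) -> bool:
    """Return True if the page was last updated more than max_age_days ago."""
    return (today - last_updated).days > max_age_days


if __name__ == "__main__":
    today = date(2025, 3, 15)  # hypothetical audit date
    pages = {
        "/guide/": date(2025, 1, 1),
        "/news/": date(2025, 3, 1),
    }
    for path, updated in pages.items():
        print(path, "stale" if is_stale(updated, today) else "fresh")
```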

However, because Perplexity does not publish a guaranteed indexing window, freshness should be understood as probabilistic rather than absolute.¹⁴ Some content may not be indexed at all, including pages blocked by robots.txt, subscription-only journalism, gated research, or private community platforms.¹⁵

Perplexity's search index is smaller than Google's, which means it crawls fewer pages overall but with greater emphasis on quality and recency.

Detecting Perplexity Crawls on Your Domain

To verify Perplexity is crawling your site, check your server logs for two specific user-agent strings. The full PerplexityBot user-agent is Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)¹⁶. The Perplexity-User agent identifies itself as Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user)¹⁷.
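A minimal sketch of matching those two product tokens in a logged user-agent value follows. The function name and the regular expression are illustrative, not something Perplexity publishes. Because a user-agent string is easy to spoof, a match here is only a first filter.

```python
import re
from typing import Optional

# Product tokens taken from the two user-agent strings quoted above.
_AGENT_PATTERN = re.compile(r"\b(PerplexityBot|Perplexity-User)/\d", re.IGNORECASE)
_CANONICAL = {
    "perplexitybot": "PerplexityBot",
    "perplexity-user": "Perplexity-User",
}


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the Perplexity agent named in a user-agent string, or None."""
    match = _AGENT_PATTERN.search(user_agent)
    if match is None:
        return None
    return _CANONICAL[match.group(1).lower()]


if __name__ == "__main__":
    ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
          "PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)")
    print(classify_user_agent(ua))
```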

You can verify legitimate Perplexity traffic by checking IP ranges against their official lists. PerplexityBot IP addresses are published at https://www.perplexity.com/perplexitybot.json¹⁸, and Perplexity-User IP addresses are listed at https://www.perplexity.com/perplexity-user.json¹⁹.
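The IP check can be sketched with Python's standard `ipaddress` module. Note the assumptions: the payload shape used here (a `prefixes` list whose entries carry an `ipv4Prefix` or `ipv6Prefix` key) is a guess that should be confirmed against the live files, and the sample ranges are reserved documentation addresses, not real Perplexity ranges.

```python
import ipaddress
from typing import Any, Dict, List

# ASSUMPTION: the published JSON resembles this shape. Confirm against the
# real files before relying on it. The prefixes below are documentation
# ranges (RFC 5737 / RFC 3849), not Perplexity's actual ranges.
SAMPLE_PAYLOAD: Dict[str, Any] = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}


def load_networks(payload: Dict[str, Any]) -> List[Any]:
    """Turn the (assumed) prefix list into ip_network objects."""
    networks = []
    for entry in payload.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr, strict=False))
    return networks


def ip_in_networks(ip: str, networks: List[Any]) -> bool:
    """Return True if ip falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    nets = load_networks(SAMPLE_PAYLOAD)
    for ip in ("192.0.2.44", "198.51.100.7"):
        print(ip, ip_in_networks(ip, nets))
```

A request that claims a Perplexity user-agent but arrives from an address outside the published ranges is a candidate for closer inspection.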

Perplexity's official crawler makes up to 25 million daily requests.²⁰ In Cloudflare's report on undeclared crawling, the activity was observed across tens of thousands of domains and millions of requests per day.²¹

The Stealth Crawler Controversy

Publisher trust in Perplexity's crawling practices took a hit when Cloudflare published findings about undisclosed crawling behavior. Cloudflare reported that Perplexity uses not only its declared user-agent, but also a generic browser user-agent intended to impersonate Google Chrome on macOS when its declared crawler was blocked.²²

According to Cloudflare, both the declared and undeclared crawlers were attempting to access content for scraping, contrary to the web crawling norms outlined in RFC 9309.²³ The undeclared crawler used multiple IPs not listed in Perplexity's official IP range,²⁴ and would rotate through these IPs in response to restrictive robots.txt policies and blocks from Cloudflare.

Cloudflare was able to fingerprint this crawler using a combination of machine learning and network signals.²⁵ As a result, it de-listed Perplexity as a verified bot and added heuristics to its managed rules that block this stealth crawling.²⁶

Cloudflare's managed rules now target this stealth crawler, but the episode highlights why publishers should monitor logs for unusual browser signatures and not rely solely on user-agent strings for identification.

Factors That Influence Crawl Frequency

Several factors determine how often Perplexity recrawls your domain:

Content freshness: Perplexity actively prioritizes recent content. Pages updated regularly signal active maintenance and receive more frequent crawls.

Traffic volume: Pages that already drive significant Perplexity referral traffic get crawled more often because they demonstrate user relevance.

Content type: Where Google returns 10 blue links per page, Perplexity typically cites 3 to 5 sources per answer²⁷. This selectivity means well-researched, authoritative content that directly answers questions is more likely to be recrawled and cited.

Internal linking: Internal link count is the strongest positive predictor of AI citation, though the effect is modest at r = 0.127, per Lee, 2026a.²⁸ Strong internal linking helps Perplexity's crawlers discover and prioritize your content.

Authority signals: Established sites with domain authority tend to get crawled more frequently than new or low-authority domains.
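To put the internal-linking figure in perspective, a correlation can be squared to estimate the share of variance it accounts for. The sketch below assumes the quoted r = 0.127 is a Pearson correlation, which the source does not state explicitly.

```python
def variance_explained(r: float) -> float:
    """Coefficient of determination for a simple linear relationship."""
    return r ** 2


if __name__ == "__main__":
    share = variance_explained(0.127)
    # About 0.016, i.e. roughly 1.6% of the variation in citation outcomes.
    print(f"r^2 = {share:.4f}")
```

Under that assumption, internal link count is the strongest single predictor in the cited analysis, yet it still leaves most of the variation unexplained, which is consistent with describing the effect as modest.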

Optimizing for Perplexity Crawl Success

Given the probabilistic nature of Perplexity indexing, focus on factors within your control:

1. Publish fresh content regularly: Content older than 60 days faces a measurable disadvantage.

2. Structure content for answer generation: Perplexity cites sources that directly answer user questions. Clear headers, concise summaries, and well-structured content increase citation probability.

3. Build internal links: In the research cited above, internal link count was the strongest positive predictor of AI citation, although the effect was modest.

4. Consider robots.txt carefully: Blocking PerplexityBot prevents discovery through search results, but Perplexity-User may still access content through user queries. Understand which crawler handles which traffic type.

5. Monitor your logs: Watch for both PerplexityBot and Perplexity-User, plus any unusual Chrome-on-macOS signatures that might indicate undeclared crawling.
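Since Perplexity publishes no recrawl schedule, your own access logs are the most direct way to estimate how often a page is revisited. The sketch below groups PerplexityBot hits by path and reports the median gap between consecutive hits. It assumes logs in the common "combined" format; the regular expression, function names, and sample lines are illustrative and will need adjusting for other log layouts.

```python
import re
from collections import defaultdict
from datetime import datetime
from statistics import median
from typing import Dict, Iterable

# ASSUMPTION: "combined" access-log format, as commonly used by nginx and Apache.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

BOT_UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
          "PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)")


def sample_line(timestamp: str, path: str, user_agent: str = BOT_UA) -> str:
    """Build a hypothetical log line for demonstration purposes."""
    return (f'192.0.2.10 - - [{timestamp}] "GET {path} HTTP/1.1" '
            f'200 5120 "-" "{user_agent}"')


def recrawl_intervals_hours(lines: Iterable[str],
                            bot_token: str = "PerplexityBot") -> Dict[str, float]:
    """Median hours between consecutive bot hits, keyed by request path."""
    hits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or bot_token not in match.group("ua"):
            continue
        stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits[match.group("path")].append(stamp)

    intervals = {}
    for path, stamps in hits.items():
        stamps.sort()
        gaps = [(later - earlier).total_seconds() / 3600
                for earlier, later in zip(stamps, stamps[1:])]
        if gaps:  # a path needs at least two hits to have an interval
            intervals[path] = median(gaps)
    return intervals


if __name__ == "__main__":
    demo = [
        sample_line("01/Mar/2025:00:00:00 +0000", "/guide/"),
        sample_line("01/Mar/2025:06:00:00 +0000", "/guide/"),
        sample_line("01/Mar/2025:18:00:00 +0000", "/guide/"),
    ]
    print(recrawl_intervals_hours(demo))
```

Passing `bot_token="Perplexity-User"` gives the equivalent view for user-initiated fetches, which reflect query demand rather than a crawl schedule.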

The Bottom Line

Perplexity doesn't offer publishers the same control and transparency as Google Search Console. There is no recrawl button, no indexing API, and no guaranteed timeline. What there is: a massive, growing platform that processes 780 million monthly queries and drives high-converting referral traffic.

The most effective strategy is to optimize for the factors that matter: fresh, authoritative content with strong internal linking, structured for the conversational queries users ask Perplexity. Think of Perplexity indexing as probabilistic rather than guaranteed—and stack the odds in your favor through consistent quality and technical best practices.

Sources

1. “Referral traffic from Perplexity citations converts at 14.2% versus Google's 2.8% a 5x quality multiplier” (https://ziptie.dev/blog/how-perplexity-ai-answers-work/)
2. “PerplexityBot is designed to surface and link websites in search results on Perplexity. It is not used to crawl content for AI foundation models.” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
3. “PerplexityBot respects robots.txt. Unlike some AI crawlers that have been caught ignoring access controls, PerplexityBot checks and obeys robots.txt directives.” (https://aiplusautomation.com/blog/perplexity-optimization-complete-guide)
4. “PerplexityBot follows links and fetches content, abiding by robots.txt.” (https://ethanlazuk.com/blog/how-does-perplexity-work/)
5. “Since a user requested the fetch, this fetcher generally ignores robots.txt rules.” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
6. “Perplexity-User supports user actions within Perplexity. When users ask Perplexity a question, it might visit a web page to help provide an accurate answer and include a link to the page in its response. Perplexity-User controls which sites these user requests can access. It is not used for web crawling or to collect content for training AI foundation models.” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
7. “Perplexity performs live web retrieval by default for every query it processes.” (https://www.datastudios.org/post/can-perplexity-search-the-web-in-real-time-live-results-update-frequency-and-system-limitations)
8. “Live retrieval is not an optional feature or a special mode, but the default operational behavior across Perplexity's consumer interface, Pro Search experience, and developer APIs.” (https://www.datastudios.org/post/can-perplexity-search-the-web-in-real-time-live-results-update-frequency-and-system-limitations)
9. “That pipeline now processes approximately 780 million monthly queries up 239% from 230 million in August 2024.” (https://ziptie.dev/blog/how-perplexity-ai-answers-work/)
10. “The platform serves an estimated 22 million active users with 85% retention and $100 million in annualized revenue.” (https://ziptie.dev/blog/how-perplexity-ai-answers-work/)
11. “High-traffic and authoritative sites such as major news outlets, government pages, and popular blogs tend to be refreshed frequently, often within hours of publication.” (https://www.datastudios.org/post/can-perplexity-search-the-web-in-real-time-live-results-update-frequency-and-system-limitations)
12. “Content older than 60 days is at a measurable disadvantage, regardless of backlinks or domain authority” (https://aiplusautomation.com/blog/perplexity-optimization-complete-guide)
13. “Perplexity is 3.3x fresher than Google” (https://aiplusautomation.com/blog/perplexity-optimization-complete-guide)
14. “Because Perplexity does not publish a guaranteed indexing window, freshness should be understood as probabilistic rather than absolute.” (https://www.datastudios.org/post/can-perplexity-search-the-web-in-real-time-live-results-update-frequency-and-system-limitations)
15. “Some content is not indexed at all, including pages blocked by robots.txt, subscription-only journalism, gated research, or private community platforms.” (https://www.datastudios.org/post/can-perplexity-search-the-web-in-real-time-live-results-update-frequency-and-system-limitations)
16. “Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
17. “Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user)” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
18. “PerplexityBot IP addresses: https://www.perplexity.com/perplexitybot.json” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
19. “Perplexity-User IP addresses: https://www.perplexity.com/perplexity-user.json” (https://docs.perplexity.ai/docs/resources/perplexity-crawlers)
20. “its official crawler makes up to 25 million daily requests” (https://www.vouched.id/learn/blog/perplexity-agent-detection-guide)
21. “This activity was observed across tens of thousands of domains and millions of requests per day.” (https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/)
22. “We observed that Perplexity uses not only their declared user-agent, but also a generic browser intended to impersonate Google Chrome on macOS when their declared crawler was blocked.” (https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/)
23. “Both their declared and undeclared crawlers were attempting to access the content for scraping contrary to the web crawling norms as outlined in RFC 9309.” (https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/)
24. “This undeclared crawler utilized multiple IPs not listed in Perplexity's official IP range, and would rotate through these IPs in response to the restrictive robots.txt policy and block from Cloudflare.” (https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/)
25. “We were able to fingerprint this crawler using a combination of machine learning and network signals.” (https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/)
26. “We have de-listed them as a verified bot and added heuristics to our managed rules that block this stealth crawling.” (https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/)
27. “Where Google returns 10 blue links per page, Perplexity typically cites 3 to 5 sources per answer” (https://aiplusautomation.com/blog/perplexity-optimization-complete-guide)
28. “Internal link count is the strongest positive predictor of AI citation (r = 0.127, per Lee, 2026a)” (https://aiplusautomation.com/blog/perplexity-optimization-complete-guide)