We Analyzed 22,000+ AI Citations Across Latin America. Here's What Gets Cited.
Original research: which Latin American websites get cited by AI? We ran Perplexity and Gemini against 22,000+ URLs across Brazil, Mexico, Colombia, Argentina, Chile, Peru, and Quebec. The results are surprising.
The Experiment
Over two nights, we ran Tocho's citation pipeline against 22,000+ URLs across Latin America and Quebec, plus an English-language baseline. For each URL, we asked Gemini: "If a user asked about this topic, would you cite this page?"
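The per-URL check described above can be sketched roughly as follows. This is a minimal illustration, not Tocho's actual pipeline: `build_prompt`, `citation_rate`, and the `ask_model` callable are hypothetical names, the yes/no answer format is an assumption, and the model call is replaced by a stub so the snippet runs without an API key.

```python
from typing import Callable, Iterable

# The question quoted in the text above.
QUESTION = "If a user asked about this topic, would you cite this page?"


def build_prompt(url: str) -> str:
    """Combine the page URL with the question (the prompt layout is an assumption)."""
    return f"Page: {url}\n{QUESTION}\nAnswer yes or no."


def citation_rate(urls: Iterable[str], ask_model: Callable[[str], str]) -> float:
    """Share of URLs for which the model's answer starts with 'yes'."""
    answers = [ask_model(build_prompt(u)).strip().lower() for u in urls]
    if not answers:
        return 0.0
    return sum(a.startswith("yes") for a in answers) / len(answers)


def fake_model(prompt: str) -> str:
    """Stub standing in for a real model client (hypothetical behavior)."""
    return "Yes" if ".edu" in prompt else "No"


sample = [
    "https://example.edu/guide",
    "https://example.com/news",
    "https://example.org/faq",
]
rate = citation_rate(sample, fake_model)
print(f"Citation rate: {rate:.0%}")
```

To use it against a real model, pass a function that sends the prompt to your own client and returns the text of the reply in place of `fake_model`.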
We weighted our sample toward the Americas:
- 40% Spanish — Mexico, Argentina, Colombia, Chile, Peru, Ecuador
- 30% Portuguese — Brazil
- 20% French — Quebec
- 10% English — Baseline
Here's what we found.
Overall: 37-40% of Pages Get Cited
Across all URLs in our Americas corpus, Gemini said it would cite 37-40%. This is higher than the global average reported by other studies (~25-30%), likely because we prioritized authoritative regional content.
By Country: Colombia Leads, Chile Is a Ghost
| Country | Sites Cited | Observation |
|---------|-------------|-------------|
| Colombia | 127 | Highest citation rate. Colombian content is well-structured and covers niche topics. |
| Brazil | 45 | Strong ecosystem. Lots of Portuguese content for AI to choose from. |
| Mexico | 28 | Moderate. Mexico's top sites get cited, but mid-tier content is invisible. |
| Quebec | 16 | Punches way above its weight. Low competition = high citation rate. |
| Argentina | 14 | Underserved. Argentine content is strong but underrepresented in AI training. |
| Chile | 1 | Almost zero. This is the biggest gap and the biggest opportunity. |
Why Colombia Wins
Colombian content tends to be well-structured, bilingual-aware, and covers niche topics that other Spanish-speaking countries don't address. Sites like colombiafintech.co, colombiacheck.com, and icesi.edu.co are citation authorities in their fields.
Why Chile Is Missing
Only 1 Chilean site was cited across 1,300+ queries. This probably isn't because Chilean content is bad; the more likely explanation is that there's less of it in AI training data. For Chilean businesses, this is actually an opportunity: less competition means easier citation wins.
By Content Type: Reference Beats News
| Content Type | % of Citations |
|--------------|----------------|
| Niche authority (blogs, tools, SaaS) | 68% |
| Education (.edu) | 9% |
| Government (.gov) | 7% |
| Tech/developer | 6% |
| Finance/fintech | 4% |
| E-commerce | 4% |
| News | 2% |
The biggest surprise: news sites account for only 2% of citations. AI prefers reference-quality content over breaking news.
What Does "Citable" Content Look Like?
From the sites that consistently got cited, we identified these patterns:
- Clear, quotable definitions — "AI citation probability is defined as..."
- Comparison tables — structured data AI can extract and present
- FAQ sections — Q&A format maps directly to how AI generates responses
- Specific numbers — "Revenue grew 23% in Q3 2025" beats vague claims
- Named sources — "According to IBGE" gives AI a verifiable anchor
- JSON-LD schema — pages with structured data get cited 28-36% more
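To make the FAQ and JSON-LD patterns concrete, here is a minimal sketch that builds schema.org `FAQPage` markup from question/answer pairs. The helper name `faq_jsonld` and the sample question and answer are hypothetical placeholders; this shows one common way to emit structured data, not necessarily the markup used by the sites in our sample.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build a schema.org FAQPage JSON-LD string from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # ensure_ascii=False keeps accented Spanish, Portuguese, and French text readable.
    return json.dumps(data, ensure_ascii=False, indent=2)


# Placeholder content: replace with your page's real questions and answers.
markup = faq_jsonld(
    [("What is AI citation probability?", "Replace with your own definition.")]
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The printed `<script>` element goes in the page's HTML, and the questions and answers in it should match the FAQ text visible on the page.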
What This Means for Your Content Strategy
- If you're in Chile: You have almost zero competition for AI citations in Spanish. Start now and you own the space.
- If you're in Colombia: Your content is already getting cited. Optimize to get cited MORE.
- If you're in Quebec: French content has low competition. Bilingual optimization (FR + EN) is a superpower.
- If you're writing news: Pivot to reference content. Write the guide, not the article.
- If you're an agency: Your clients need to know that only 12% of AI citations match Google rankings. This is a new service you can sell.
Methodology
- Pipeline: Tocho citation runner with Gemini 2.5 Flash
- Sample: 22,000+ URLs from curated regional seed lists
- Weighting: ES 40%, PT 30%, FR 20%, EN 10%
- Score stratification: 40% high-scoring (70-100), 35% mid (40-69), 25% low (0-39)
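As a worked example of the weighting and stratification above, the sketch below turns the percentages into target URL counts per cell. It rests on two assumptions that the list does not state: a round total of 22,000 URLs, and language weights and score bands applied independently of each other. The names `quotas`, `LANGUAGE_WEIGHTS`, and `SCORE_BANDS` are illustrative.

```python
TOTAL = 22_000  # assumption: the sample size is only given as "22,000+"

# Percentages taken from the methodology list.
LANGUAGE_WEIGHTS = {"es": 40, "pt": 30, "fr": 20, "en": 10}
SCORE_BANDS = {"high (70-100)": 40, "mid (40-69)": 35, "low (0-39)": 25}


def quotas(total: int) -> dict[tuple[str, str], int]:
    """Target URL count per (language, score band) cell.

    Assumes the two weightings are crossed independently, which the
    methodology does not confirm.
    """
    return {
        (lang, band): round(total * lang_pct * band_pct / 10_000)
        for lang, lang_pct in LANGUAGE_WEIGHTS.items()
        for band, band_pct in SCORE_BANDS.items()
    }


for (lang, band), count in quotas(TOTAL).items():
    print(f"{lang} {band}: {count}")
```

Under these assumptions, the Spanish high-scoring cell would hold 3,520 URLs (22,000 × 0.40 × 0.40), and the twelve cells together add back up to 22,000.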
Check Your Citation Probability
Want to know where your content stands? Check your AI citation probability for free →
Tocho predicts whether AI platforms will cite your content — per model, per language — trained on the same observation data used in this research.