
LLMS.txt Overview

Last Modified:
29 Apr 2026
User Level:
Power User

You've put enormous effort into your websites. You've created carefully written course descriptions, faculty profiles, event listings, and research pages. But there's a growing challenge: the way people find information online is changing rapidly, and your website may not be keeping up.

Search engines have always used automated programs called "crawlers" to read and index websites. In recent years, AI-powered tools (including chatbots like ChatGPT and Gemini, and AI-enhanced search engines) have started using their own crawlers to gather information. These tools don't just look at the words on your page; they try to understand what your content means and how it relates to real-world entities such as courses, people, and events.

This documentation will help you create your initial LLMS.txt file with the core sections that give your website greater visibility to AI agents.

What LLMS.txt actually is

LLMS.txt is an emerging convention, not a standard formally defined or ratified by a body such as the W3C (which maintains web standards like HTML and CSS) or the IETF (which maintains internet protocols like HTTP). That means there is currently no official specification, no RFC, and no universal agreement on basics such as:

  • File format
  • Supported fields
  • Parsing rules
  • Enforcement behavior

IMPORTANT NOTE: This documentation reflects what we know as of today. The details may change at any point and should be reviewed regularly as the technology develops and expands.
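With that caveat in mind, the closest thing to a reference today is the community proposal at llmstxt.org, which most early adopters loosely follow: a plain Markdown file served from the root of your site (for example, https://www.example.edu/llms.txt). The sketch below shows what such a file often looks like; the site name, sections, and URLs are placeholders for illustration, not required fields.

  # Example University

  > Example University's public website, covering course descriptions, faculty profiles, event listings, and research pages.

  ## Courses

  - [Undergraduate courses](https://www.example.edu/courses/undergraduate): Full catalogue of undergraduate programmes
  - [Postgraduate courses](https://www.example.edu/courses/postgraduate): Taught and research degrees

  ## People

  - [Faculty profiles](https://www.example.edu/people): Academic staff, research interests, and contact details

  ## Events

  - [Event listings](https://www.example.edu/events): Upcoming open days, lectures, and campus events

  ## Optional

  - [News archive](https://www.example.edu/news/archive): Older announcements that AI agents can skip when context is limited

Under the llmstxt.org proposal, the H1 title and the blockquote summary are the main expected elements; the H2 sections are flat lists of links with short descriptions, and an "Optional" section marks lower-priority content. Because nothing is enforced, treat this structure as a helpful convention rather than a requirement.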

Who’s using it?

Right now, usage is early-stage and inconsistent, and it's mostly seen in:

  • AI-forward companies
  • Developer experiments
  • RAG (retrieval-augmented generation) / LLM tooling pipelines

Some organizations are starting to explore similar ideas, but there’s no universal adoption yet.

Does any AI actually follow it?

Some custom pipelines and crawlers may read it, but most major LLM providers don't publicly guarantee support. For now, it's best thought of as advisory metadata, not enforceable rules.
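For teams building their own crawlers or RAG pipelines, a common pattern is to check a site for the file and, if it exists, feed its contents to the model as extra context alongside the crawled pages. The Python sketch below illustrates that idea only; the URL, user agent, timeout, and how the text is used downstream are all assumptions, not part of any specification.

  # Minimal sketch: look for a site's llms.txt and return its contents so a
  # pipeline can add it to an LLM prompt as advisory context. Assumes the file,
  # if published, lives at the site root; nothing in the convention enforces this.
  from urllib.error import HTTPError, URLError
  from urllib.parse import urljoin
  from urllib.request import Request, urlopen


  def fetch_llms_txt(site_root: str, timeout: float = 10.0) -> str | None:
      """Return the llms.txt contents for a site, or None if it isn't published."""
      url = urljoin(site_root, "/llms.txt")
      request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
      try:
          with urlopen(request, timeout=timeout) as response:
              return response.read().decode("utf-8", errors="replace")
      except (HTTPError, URLError, TimeoutError):
          # Treat any failure as "no advisory metadata available".
          return None


  if __name__ == "__main__":
      context = fetch_llms_txt("https://www.example.edu")
      if context:
          # A pipeline might prepend this to a prompt or index it alongside
          # crawled pages; it supplements the crawl rather than replacing it.
          print(context[:500])
      else:
          print("No llms.txt found; fall back to crawling the site directly.")

The important design point is that the file is treated as optional input: if it's missing, the pipeline simply falls back to its normal crawl.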

Why are people still using it?

Even without standardization, it’s useful because it:

  • Provides explicit context AI systems otherwise lack
  • Helps prevent misinterpretation (e.g., demo vs real data; see the example after this list)
  • Acts as a low-cost signal for future AI tooling
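
As an illustration of that second point (this is an example note, not a defined field), a site with a sandbox or demo area might add a line like the following to its LLMS.txt so that AI tools don't present sample content as live information:

  > Note: pages under https://www.example.edu/demo/ contain sample data used for training and testing and should not be treated as current university information.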

Where it might go

There's a good chance we'll see a more formal specification in the near future, perhaps inspired by robots.txt, along with integration into existing standards (schema markup, metadata, HTTP headers) as AI crawlers and agents adopt the convention more widely.
