LLMS.txt
A guide to what an LLMS.txt file is and how to implement one
You've put enormous effort into your websites. You've created carefully written course descriptions, faculty profiles, event listings, and research pages. But there's a growing challenge: the way people find information online is changing rapidly, and your website may not be keeping up.
Search engines have always used automated programmes called "crawlers" to read and index websites. In recent years, AI-powered tools (including chatbots like ChatGPT and Gemini, and AI-enhanced search engines) have started using their own crawlers to gather information. These tools don't just look at the words on your page; they try to understand what your content means and how it relates to real-world areas like courses, people, and events.
This documentation will help you create your first LLMS.txt file with the core sections that will make your website more visible to AI agents.
What LLMS.txt actually is
LLMS.txt is currently an emerging convention, not something formally defined or ratified by a standards body like the W3C (web standards such as HTML and CSS) or the IETF (internet protocols such as HTTP). This means there is currently no official specification, no RFC, and no universal agreement on file format, supported fields, parsing rules, enforcement behaviour, etc.
IMPORTANT NOTE: This documentation is based on what we know as of today, so this might change at any point and will require constant reviews as the technology develops and expands.
Who’s using it?
Right now, usage is early-stage and inconsistent, and it's mostly seen in:
- AI-forward companies
- Developer experiments
- RAG / LLM tooling pipelines
Some organizations are starting to explore similar ideas, but there’s no universal adoption yet.
Does any AI actually follow it?
Some custom pipelines and crawlers may read it, but most major LLM providers don't publicly guarantee support. For now, it's best thought of as advisory metadata, not enforceable rules.
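Although support is not guaranteed, a pipeline that chooses to honour the convention would typically look for the file at the site root, mirroring where robots.txt lives. A minimal Python sketch (the helper name is our own, not part of any spec):

```python
from urllib.parse import urlsplit, urlunsplit

def llms_txt_url(site_url: str) -> str:
    """Build the conventional root-level LLMS.txt URL for a site.

    Assumes the emerging convention of serving the file from the
    site root, mirroring where robots.txt lives.
    """
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))

url = llms_txt_url("https://www.example.com/courses/index.html")
# url == "https://www.example.com/llms.txt"
```

A consumer would then fetch this URL and treat the contents as advisory metadata, falling back gracefully if the request returns 404.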
Why are people still using it?
Even without standardization, it’s useful because it:
- Provides explicit context AI systems otherwise lack
- Helps prevent misinterpretation (e.g., demo vs real data)
- Acts as a low-cost signal for future AI tooling
Where it might go
There's a good chance we'll eventually see a more formal specification (perhaps inspired by robots.txt), along with integration into existing standards (schema markup, metadata, HTTP headers) as AI crawlers and agents adopt this convention more widely.
IMPORTANT NOTE:
You must be an Administrator to complete the actions from this point onwards.
Create an LLMS Content Type
Create a new Content Type
You'll need to create a new Content Type by going to Assets > Content Types and clicking on the "Create content type" button. Fill in the settings shown below.
| Name | Description | Minimum User Level |
|---|---|---|
| LLMS.txt file | Used to add an LLMS.txt file to the current section; intended for the root/home section. | Administrator |
Add Elements
Once on the Elements tab, click the "Add Element" button at the bottom. Fill in the settings shown below. Then click "Save Changes".
| Element | Description | Type | Characters | Required | Show |
|---|---|---|---|---|---|
| Name | The Name Element | Plain Text | Default | Yes | Yes |
| LLMS File | Enter the code you wish to be output in the LLMS.txt file. | Plain Text | 2000 | Yes | Yes |
Create a LLMS-content content layout
- On the Content Layout tab, click on the "Add content layout" button. Fill in the General settings as shown below.
| Label | Value |
|---|---|
| Name | text/llms-content |
| File extension | (Default) |
| Syntax Type | HTML/XML |
| Content Layout Processor | Handlebars Content |
- Click the "Content Layout Code" Tab and enter the following:
{{publish element="LLMS File"}}
- Click "Save Changes"
Create a default content layout
- On the Content Layout tab, click on the "Add content layout" button. Fill in the General settings as shown below.
| Label | Value |
|---|---|
| Name | text/html |
| File extension | (Default) |
| Syntax Type | HTML/XML |
| Content Layout Processor | Handlebars Content |
- Click the "Content Layout Code" Tab and enter the following:
{{nav id="XXX" name="Create LLMS.txt"}}
- Click "Save Changes"
Note: This Content Layout contains a Navigation Object. We'll modify this later.
Create the Navigation Object: Create LLMS.txt
- You'll need to create a new Navigation by going to Assets > Navigation and clicking on the "Add new navigation" button.
- This uses the Generate file Navigation Object. Set it up using the options in the table below. Unlisted or empty options should be left at their default values.
- Copy the ID generated (or Handlebars tag)
| Label | Value |
|---|---|
| Name | Create LLMS.txt |
| File Name | llms |
| Append the content ID to the name of the file | |
| File extension | txt |
| Output directory | Use the current directory |
| Append the current section path to the base directory | |
| Content layout | text/llms-content |
Update content layout with new Navigation
Once you have the Navigation ID, you'll need to go back and update the default content layout.
- Go to Assets > Content Types
- Search for your new "LLMS.txt file" content type.
- Click "Actions"
- Click "Edit content layouts"
- Choose "text/html"
- Replace the XXX in the nav tag with the ID you copied in the previous step
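For example, if the Navigation ID you copied were 123 (an illustrative value only, yours will differ), the updated text/html Content Layout code would read:

```handlebars
{{nav id="123" name="Create LLMS.txt"}}
```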
Create the file
Now that you've finished your new content type, we need to create the actual LLMS.txt file using it. To do this, be sure you're in your Home section (the file needs to be found at the root of your website).
- Go to your Site Structure
- And navigate to your Home section
- From there, click on the "Content types" tab
- Search for "LLMS.txt file"
- Enable the LLMS.txt file content type for the Section
- Switch to the "Content" tab
- Click "Add content"
- Add a piece of content using the "LLMS.txt file" content type
- Enter Name: "LLMS.txt"
- Paste your rules
Sample LLMS.txt file
The code below is a baseline for a simple LLMS.txt file. Note that it contains core blocks (e.g., description, data_characteristics) that are broken down further down this page.
You may copy and paste it, replace the placeholders with your real information, and add or remove sections to suit your website's characteristics and needs.
One of the most important things is to keep it human-readable (that helps both people and AI).
# Lines starting with '#' are comments
# LLMS.txt - Basic template
site_name: Your Website Name
site_url: https://www.example.com
site_type: (e.g., ecommerce, university, blog, SaaS)
description:
Brief description of what your website does and who it serves.
content_sections:
- Section 1 (e.g., Products, Blog, Services)
- Section 2
- Section 3
data_characteristics:
- Content is (choose: authoritative / editorial / user-generated / demo)
- May include (optional: outdated / incomplete / placeholder content)
llm_guidance:
- Summarization is allowed
- Attribution is appreciated
- Verify critical information before relying on it
allowed_actions:
- Indexing and retrieval
- Summarization of content
- Use in AI-generated responses
disallowed_actions:
- Presenting content as guaranteed factual without verification
- Misrepresenting the site or brand
crawl_policy:
- Allowed: Public pages
- Avoid: Admin, login, or preview areas
contact:
- https://www.example.com/contact
last_updated: YYYY-MM-DD
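Since there is no official parser for this format, any consumer has to improvise. The sketch below shows one hypothetical, permissive way a pipeline might read the key/value and bullet structure used in the sample above; a real implementation should tolerate unknown keys and formatting drift:

```python
def parse_llms_txt(text: str) -> dict:
    """Permissive parser for the sample LLMS.txt structure above.

    A sketch only: there is no official spec, so real consumers
    should tolerate unknown keys and formatting drift.
    """
    data: dict = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        if line.startswith("- ") and current is not None:
            bucket = data.get(current)
            if not isinstance(bucket, list):
                bucket = []
                data[current] = bucket
            bucket.append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            # an empty value means a block of bullets or text follows
            data[current] = value.strip() or []
        elif current is not None:
            # continuation line (e.g. the description body)
            prev = data.get(current)
            data[current] = f"{prev} {line}".strip() if isinstance(prev, str) and prev else line
    return data

sample = """\
# Lines starting with '#' are comments
site_name: Example University
description:
A demo site for illustration.
content_sections:
- Programs
- Admissions
"""
parsed = parse_llms_txt(sample)
```

Note that a continuation line containing a colon would be misread as a new key; a production parser would need a stricter rule, but none exists yet.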
File Breakdown
Comments
The very first line of your file should be # Lines starting with '#' are comments. This is strongly recommended because there is no standard yet for what counts as a comment line; stating it explicitly can stop some AI agents from trying to parse comments as data.
Overall Information and description
These are straightforward fields that will contain the overall info about your website.
content_sections
Bullet points listing the core sections of your website. This is another important list that helps AI agents understand your site, for example:
- Programs (Undergraduate, Postgraduate, A–Z listings)
- Admissions information
data_characteristics
Major characteristics of the type of data your website provides. In this case, because this is a sample site, most of the data is not 'real', so we include bullet points to reflect that, for example:
- Highly structured academic content
- Frequent updates via news and research highlights
llm_guidance
High level guidelines to AI agents on how to use/interpret the data on your site, for example:
- Program, course, and staff data are illustrative
- Dates, events, and news may not be current or real
allowed_actions / disallowed_actions
The kinds of actions AI agents should or should not use your site for, for example:
allowed:
- Summarization of page structures
- Extraction of navigation and IA patterns
disallowed:
- Treating content as real academic offerings
- Using data for decision-making (e.g., applying to programs)
crawl_policy
These are not rules that allow or disallow AI agents from crawling your site; rather, they restate the crawl rules your website already applies to standard search engines, so AI agents understand them and can better respond to related prompts, for example:
- Allowed: Public pages
- Avoid: Admin paths, preview links, or unpublished content
last_updated
Used to signal the recency and accuracy of the information shared
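A consumer that honours this field might, for example, compute the file's age and re-verify anything past some threshold. A small illustrative Python sketch (the threshold idea is an assumption, not part of any spec):

```python
from datetime import date

def days_since_update(last_updated: str, today: str) -> int:
    """Age in days of a last_updated field; both dates are YYYY-MM-DD."""
    return (date.fromisoformat(today) - date.fromisoformat(last_updated)).days

# e.g. a consumer might re-verify anything older than ~180 days
age = days_since_update("2024-01-01", "2024-12-31")
# age == 365
```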
Easy ways to expand it later
When you’re ready, you can build on this file by adding new sections, such as:
intended_use - Bullet points describing the intended use of your site. This helps AI agents narrow responses for specific prompts, for example:
- Demonstration of Platform capabilities
- Example structure for university websites
key_features - Key functionalities and/or features that your website has, for example:
- Structured academic program data
- Search and filtering interfaces