LLMS.txt
A guide to what an LLMS.txt file is and how to implement one
You've put enormous effort into your websites. You've created carefully written course descriptions, faculty profiles, event listings, and research pages. But there's a growing challenge: the way people find information online is changing rapidly, and your website may not be keeping up.
Search engines have always used automated programmes called "crawlers" to read and index websites. In recent years, AI-powered tools (including chatbots like ChatGPT and Gemini, and AI-enhanced search engines) have started using their own crawlers to gather information. These tools don't just look at the words on your page; they try to understand what your content means and how it relates to real-world areas like courses, people, and events.
This documentation will help you create your first LLMS.txt file with the core sections that will make your website more visible to AI agents.
What LLMS.txt actually is
LLMS.txt is currently an emerging convention, not something formally defined or ratified by a standards body like the W3C (web standards such as HTML and CSS) or the IETF (internet protocols such as HTTP). This means there is currently no official specification, no RFC, and no universal agreement on file format, supported fields, parsing rules, enforcement behaviour, etc.
IMPORTANT NOTE: This documentation is based on what we know as of today, so this might change at any point and will require constant reviews as the technology develops and expands.
Who’s using it?
Right now, usage is early-stage and inconsistent, and it's mostly seen in:
- AI-forward companies
- Developer experiments
- RAG / LLM tooling pipelines
Some organizations are starting to explore similar ideas, but there’s no universal adoption yet.
Does any AI actually follow it?
Some custom pipelines and crawlers may read it, but most major LLM providers don't publicly guarantee support. For now, it's best thought of as advisory metadata, not enforceable rules.
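Although support is not guaranteed, a pipeline that chooses to honour the convention would typically look for the file at the site root, mirroring where robots.txt lives. A minimal Python sketch (the helper name is our own, not part of any spec):

```python
from urllib.parse import urlsplit, urlunsplit

def llms_txt_url(site_url: str) -> str:
    """Build the conventional root-level LLMS.txt URL for a site.

    Assumes the emerging convention of serving the file from the
    site root, mirroring where robots.txt lives.
    """
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))

url = llms_txt_url("https://www.example.com/courses/index.html")
# url == "https://www.example.com/llms.txt"
```

A consumer would then fetch this URL and treat the contents as advisory metadata, falling back gracefully if the request returns 404.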
Why are people still using it?
Even without standardization, it’s useful because it:
- Provides explicit context AI systems otherwise lack
- Helps prevent misinterpretation (e.g., demo vs real data)
- Acts as a low-cost signal for future AI tooling
Where it might go
There's a good chance we'll eventually see a more formal specification (perhaps inspired by robots.txt), along with integration into existing standards (schema markup, metadata, HTTP headers) as AI crawlers and agents adopt this convention more widely.
IMPORTANT NOTE:
You must be an Administrator to complete the actions from this point onwards.
Create an LLMS Content Type
Create a new Content Type
You'll need to create a new Content Type by going to Assets > Content Types and clicking on the "Create content type" button. Fill in the settings shown below.
| Name | Description | Minimum User Level |
|---|---|---|
| LLMS.txt file | Used to add an LLMS.txt file to the current section; intended for the root/home section. | Administrator |
Add Elements
Once on the Elements tab, click the "Add Element" button at the bottom. Fill in the settings shown below. Then click "Save Changes".
| Element | Description | Type | Characters | Required | Show |
|---|---|---|---|---|---|
| Name | The Name Element | Plain Text | Default | Yes | Yes |
| LLMS File | Enter the code you wish to be output in the LLMS.txt file. | Plain Text | 2000 | Yes | Yes |
Create a LLMS-content content layout
- On the Content Layout tab, click on the "Add content layout" button. Fill in the General settings as shown below.
| Label | Value |
|---|---|
| Name | text/llms-content |
| File extension | (Default) |
| Syntax Type | HTML/XML |
| Content Layout Processor | Handlebars Content |
- Click the "Content Layout Code" Tab and enter the following:
{{publish element="LLMS File"}}
- Click "Save Changes"
Create a default content layout
- On the Content Layout tab, click on the "Add content layout" button. Fill in the General settings as shown below.
| Label | Value |
|---|---|
| Name | text/html |
| File extension | (Default) |
| Syntax Type | HTML/XML |
| Content Layout Processor | Handlebars Content |
- Click the "Content Layout Code" Tab and enter the following:
{{nav id="XXX" name="Create LLMS.txt"}}
- Click "Save Changes"
Note: This Content Layout contains a Navigation Object. We'll modify this later.
Create the Navigation Object: Create LLMS.txt
- You'll need to create a new Navigation by going to Assets > Navigation and clicking on the "Add new navigation" button.
- This uses the Generate file Navigation Object. Set it up using the options in the table below. Unlisted or empty options should be left at their default values.
- Copy the ID generated (or Handlebars tag)
| Label | Value |
|---|---|
| Name | Create LLMS.txt |
| File Name | llms |
| Append the content ID to the name of the file | |
| File extension | txt |
| Output directory | Use the current directory |
| Append the current section path to the base directory | |
| Content layout | text/llms-content |
Update content layout with new Navigation
Once you have the Navigation ID, you'll need to go back and update the default content layout.
- Go to Assets > Content Types
- Search for your new "LLMS.txt file" content type.
- Click "Actions"
- Click "Edit content layouts"
- Choose "text/html"
- Replace the XXX in the nav tag with the ID you copied in the previous step
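For example, if the Navigation ID you copied were 123 (an illustrative value only, yours will differ), the updated text/html Content Layout code would read:

```handlebars
{{nav id="123" name="Create LLMS.txt"}}
```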
Create the file
Now that you've finished your new content type, we need to create the actual LLMS.txt file using it. To do this, be sure you're in your Home section (the file needs to be found at the root of your website).
- Go to your Site Structure
- And navigate to your Home section
- From there, click on the "Content types" tab
- Search for "LLMS.txt file"
- Enable the LLMS.txt file content type for the Section
- Switch to the "Content" tab
- Click "Add content"
- Add a piece of content using the "LLMS.txt file" content type
- Enter Name: "LLMS.txt"
- Paste your rules
Sample LLMS.txt file
The code below is a baseline for a simple LLMS.txt file. Note that it contains core blocks (e.g., description, data_characteristics) that are broken down further down this page.
You may copy and paste it, replace the placeholders with your real information, and add or remove sections to suit your website's characteristics and needs.
One of the most important things is to keep it human-readable (that helps both people and AI).
# Lines starting with '#' are comments
# LLMS.txt - Basic template
site_name: Your Website Name
site_url: https://www.example.com
site_type: (e.g., ecommerce, university, blog, SaaS)
description:
Brief description of what your website does and who it serves.
content_sections:
- Section 1 (e.g., Products, Blog, Services)
- Section 2
- Section 3
data_characteristics:
- Content is (choose: authoritative / editorial / user-generated / demo)
- May include (optional: outdated / incomplete / placeholder content)
llm_guidance:
- Summarization is allowed
- Attribution is appreciated
- Verify critical information before relying on it
allowed_actions:
- Indexing and retrieval
- Summarization of content
- Use in AI-generated responses
disallowed_actions:
- Presenting content as guaranteed factual without verification
- Misrepresenting the site or brand
crawl_policy:
- Allowed: Public pages
- Avoid: Admin, login, or preview areas
contact:
- https://www.example.com/contact
last_updated: YYYY-MM-DD
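Since there is no official parser for this format, any consumer has to improvise. The sketch below shows one hypothetical, permissive way a pipeline might read the key/value and bullet structure used in the sample above; a real implementation should tolerate unknown keys and formatting drift:

```python
def parse_llms_txt(text: str) -> dict:
    """Permissive parser for the sample LLMS.txt structure above.

    A sketch only: there is no official spec, so real consumers
    should tolerate unknown keys and formatting drift.
    """
    data: dict = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        if line.startswith("- ") and current is not None:
            bucket = data.get(current)
            if not isinstance(bucket, list):
                bucket = []
                data[current] = bucket
            bucket.append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            # an empty value means a block of bullets or text follows
            data[current] = value.strip() or []
        elif current is not None:
            # continuation line (e.g. the description body)
            prev = data.get(current)
            data[current] = f"{prev} {line}".strip() if isinstance(prev, str) and prev else line
    return data

sample = """\
# Lines starting with '#' are comments
site_name: Example University
description:
A demo site for illustration.
content_sections:
- Programs
- Admissions
"""
parsed = parse_llms_txt(sample)
```

Note that a continuation line containing a colon would be misread as a new key; a production parser would need a stricter rule, but none exists yet.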
File Breakdown
Comments
The very first line of your file should be # Lines starting with '#' are comments. This is strongly recommended because there is no standard yet for what counts as a comment line; stating it explicitly can stop some AI agents from trying to parse comments as data.
Overall Information and description
These are straightforward fields that will contain the overall info about your website.
content_sections
Bullet points listing the core sections of your website. This is another important list that helps AI agents understand your site, for example:
- Programs (Undergraduate, Postgraduate, A–Z listings)
- Admissions information
data_characteristics
Major characteristics of the type of data your website provides. In this case, because this is a sample site, most of the data is not 'real', so we include bullet points to reflect that, for example:
- Highly structured academic content
- Frequent updates via news and research highlights
llm_guidance
High level guidelines to AI agents on how to use/interpret the data on your site, for example:
- Program, course, and staff data are illustrative
- Dates, events, and news may not be current or real
allowed_actions / disallowed_actions
The kinds of actions AI agents should or should not use your site for, for example:
allowed:
- Summarization of page structures
- Extraction of navigation and IA patterns
disallowed:
- Treating content as real academic offerings
- Using data for decision-making (e.g., applying to programs)
crawl_policy
These are not rules that allow or disallow AI agents from crawling your site; rather, they restate the crawl rules your website already applies to standard search engines, so AI agents understand them and can better respond to related prompts, for example:
- Allowed: Public pages
- Avoid: Admin paths, preview links, or unpublished content
last_updated
Used to signal the recency and accuracy of the information shared
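A consumer that honours this field might, for example, compute the file's age and re-verify anything past some threshold. A small illustrative Python sketch (the threshold idea is an assumption, not part of any spec):

```python
from datetime import date

def days_since_update(last_updated: str, today: str) -> int:
    """Age in days of a last_updated field; both dates are YYYY-MM-DD."""
    return (date.fromisoformat(today) - date.fromisoformat(last_updated)).days

# e.g. a consumer might re-verify anything older than ~180 days
age = days_since_update("2024-01-01", "2024-12-31")
# age == 365
```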
Easy ways to expand it later
When you’re ready, you can build on this file by adding new sections, such as:
intended_use - Bullet points describing the intended use of your site. This helps AI agents narrow responses for specific prompts, for example:
- Demonstration of Platform capabilities
- Example structure for university websites
key_features - Key functionalities and/or features that your website has, for example:
- Structured academic program data
- Search and filtering interfaces