LLMS.txt Breakdown
As noted in the introduction to the LLMS.txt file, this is not yet a standard but rather a convention that the industry is continually testing and adopting, which makes the file quite fluid and open to interpretation.
We have found a good baseline for quickly kicking off a file for your website, which you can change and tailor to your needs and characteristics.
Sample LLMS.txt file to get you started
The code below is a baseline for a simple LLMS.txt file. Note that it has core blocks (e.g., description, data_characteristics) that are broken down further down this page.
You can copy and paste it, replace the placeholders with your real information, and add or remove sections to suit your website's characteristics and needs.
One of the most important things is to keep it human-readable (that helps both people and AI).
# Lines starting with '#' are comments
# LLMS.txt - Basic template
site_name: Your Website Name
site_url: https://www.example.com
site_type: (e.g., ecommerce, university, blog, SaaS)
description:
Brief description of what your website does and who it serves.
content_sections:
- Section 1 (e.g., Products, Blog, Services)
- Section 2
- Section 3
data_characteristics:
- Content is (choose: authoritative / editorial / user-generated / demo)
- May include (optional: outdated / incomplete / placeholder content)
llm_guidance:
- Summarization is allowed
- Attribution is appreciated
- Verify critical information before relying on it
allowed_actions:
- Indexing and retrieval
- Summarization of content
- Use in AI-generated responses
disallowed_actions:
- Presenting content as guaranteed factual without verification
- Misrepresenting the site or brand
crawl_policy:
- Allowed: Public pages
- Avoid: Admin, login, or preview areas
contact:
- https://www.example.com/contact
last_updated: YYYY-MM-DD
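
To check that a file following this layout stays easy to read for machines as well as people, it can help to load it with a small script. The sketch below is only an illustration: LLMS.txt has no standard parser, the function name parse_llms_txt is hypothetical, and the rules it applies (lowercase keys ending in a colon, bullets starting with "- ", comments starting with "#") are assumptions taken from the template above.

```python
import re
from typing import Dict, List, Union

Value = Union[str, List[str]]

# Assumption: keys are lowercase words with underscores, followed by a colon.
KEY_LINE = re.compile(r"^([a-z_]+):\s*(.*)$")


def parse_llms_txt(text: str) -> Dict[str, Value]:
    """Parse the template layout into a dict of strings and bullet lists."""
    data: Dict[str, Value] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = KEY_LINE.match(line)
        if match:
            current = match.group(1)
            data[current] = match.group(2)  # "" when the value follows below
        elif current is None:
            continue  # ignore stray text before the first key
        elif line.startswith("- "):
            existing = data[current]
            items = existing if isinstance(existing, list) else []
            items.append(line[2:].strip())
            data[current] = items
        else:
            existing = data[current]
            if isinstance(existing, list):
                existing.append(line)  # free text inside a bullet block
            else:
                data[current] = (existing + " " + line).strip()
    return data


SAMPLE = """\
# Lines starting with '#' are comments
site_name: Your Website Name
site_url: https://www.example.com
description:
Brief description of what your website does and who it serves.
content_sections:
- Products
- Blog
crawl_policy:
- Allowed: Public pages
- Avoid: Admin, login, or preview areas
last_updated: YYYY-MM-DD
"""

parsed = parse_llms_txt(SAMPLE)
for key, value in parsed.items():
    print(key, "=>", value)
```

Under these assumed rules, a key written without its trailing colon is not recognized as a key, and its bullets are attached to the previous block, so it is worth checking that every key line ends with a colon.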
File Breakdown
Comments
The very first line of your file should be # Lines starting with '#' are comments. This is strongly recommended because there is no standard yet for what counts as a comment line, so telling AI agents explicitly can prevent some of them from trying to parse comments as content.
Overall Information and description
site_name, site_url, site_type, and description are straightforward fields that hold the general information about your website.
content_sections
Bullet points listing the core sections of your website. This is another important list that helps AI agents understand your site, for example:
- Programs (Undergraduate, Postgraduate, A–Z listings)
- Admissions information
data_characteristics
The main characteristics of the data your website provides. In our case, because this is a sample site, most of the data is not 'real' data, so we include bullet points that reflect that, for example:
- Highly structured academic content
- Frequent updates via news and research highlights
llm_guidance
High-level guidelines for AI agents on how to use and interpret the data on your site, for example:
- Program, course, and staff data are illustrative
- Dates, events, and news may not be current or real
allowed_actions / disallowed_actions
The kinds of actions AI agents should or should not use your site for, for example:
allowed:
- Summarization of page structures
- Extraction of navigation and IA patterns
disallowed:
- Treating content as real academic offerings
- Using data for decision-making (e.g., applying to programs)
crawl_policy
These are not rules that allow or disallow AI agents from crawling your site. Instead, they describe the rules your website already applies to standard search engines, so that AI agents understand those rules and can respond better to related prompts, for example:
- Allowed: Public pages
- Avoid: Admin paths, preview links, or unpublished content
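
Because crawl_policy describes rules that already exist elsewhere, it is worth checking that the two stay in sync. The sketch below assumes those search-engine rules live in a robots.txt file, which this page does not state. The robots.txt content, the path lists, and the function name crawl_policy_mismatches are all hypothetical. It uses Python's standard urllib.robotparser module to compare the paths you describe with what robots.txt actually permits.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for the example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /login
Disallow: /preview/
"""

# Hypothetical paths standing in for the "Avoid" and "Allowed" bullets.
AVOID_PATHS = ["/admin/", "/login", "/preview/"]
ALLOWED_PATHS = ["/", "/blog/"]


def crawl_policy_mismatches(
    robots_txt: str,
    avoid_paths: List[str],
    allowed_paths: List[str],
    base_url: str = "https://www.example.com",
) -> List[str]:
    """Return messages for paths whose robots.txt status contradicts the policy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    problems = []
    for path in avoid_paths:
        if parser.can_fetch("*", base_url + path):
            problems.append(f"{path}: listed as Avoid, but robots.txt allows it")
    for path in allowed_paths:
        if not parser.can_fetch("*", base_url + path):
            problems.append(f"{path}: listed as Allowed, but robots.txt blocks it")
    return problems


mismatches = crawl_policy_mismatches(ROBOTS_TXT, AVOID_PATHS, ALLOWED_PATHS)
print(mismatches or "crawl_policy matches robots.txt")
```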
last_updated
The date the file was last updated. It is used to judge the relevancy and accuracy of the information shared.
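
A date is only useful if it is valid and kept current. As a minimal sketch, assuming the value follows the YYYY-MM-DD pattern from the template, the hypothetical helper below rejects a malformed value (including the unreplaced placeholder) and reports how many days have passed since the update.

```python
from datetime import date


def days_since_update(last_updated: str, today: date) -> int:
    """Return the age of the file in days; raise ValueError for a bad date."""
    updated = date.fromisoformat(last_updated)  # expects YYYY-MM-DD
    return (today - updated).days


# Example values are illustrative only.
age = days_since_update("2025-01-01", date(2025, 1, 31))
print(f"LLMS.txt was last updated {age} days ago")

try:
    days_since_update("YYYY-MM-DD", date(2025, 1, 31))
except ValueError:
    print("last_updated still contains the placeholder")
```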
Easy ways to expand it later
When you’re ready, you can build on this file by adding new sections, such as:
intended_use - Bullet points describing the intended use of your site. This helps AI agents narrow down their answers for specific prompts, for example:
- Demonstration of Platform capabilities
- Example structure for university websites
key_features - Key functionalities and/or features that your website has, for example:
- Structured academic program data
- Search and filtering interfaces
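
If you maintain the file with a script, new sections like these can be generated in the same key-plus-bullets layout as the template. The helper below is a hypothetical sketch, not part of any LLMS.txt tooling. It only formats text, and the section names and bullets are the examples from this page.

```python
from typing import List


def render_section(name: str, items: List[str]) -> str:
    """Format one section as a key line followed by '- ' bullet lines."""
    lines = [f"{name}:"] + [f"- {item}" for item in items]
    return "\n".join(lines) + "\n"


extra_sections = render_section(
    "intended_use",
    ["Demonstration of Platform capabilities", "Example structure for university websites"],
) + render_section(
    "key_features",
    ["Structured academic program data", "Search and filtering interfaces"],
)
print(extra_sections)
```

The resulting text can be pasted above the last_updated line of your file.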