
JuKuCMS

An open source CMS that aims to be the fastest CMS worldwide (WIP).

robots.txt

This CMS automatically generates a robots.txt file for search engines. The rules are loaded from the database table rules.
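As a rough illustration of how rule rows could be turned into robots.txt text, here is a minimal sketch. It is not the actual JuKuCMS implementation: the function name renderRobotsTxt and the row keys option, value and useragent are assumptions made for this example, and the real table layout may differ.

```php
<?php
// Hypothetical sketch: the row layout (option, value, useragent) is an
// assumption for illustration, not the actual JuKuCMS schema.

/**
 * Render rule rows into robots.txt text.
 *
 * @param array $rows each row: ['option' => ..., 'value' => ..., 'useragent' => ...]
 */
function renderRobotsTxt(array $rows): string
{
    $groups = [];   // user agent => list of "Directive: value" lines
    $sitemaps = []; // Sitemap lines do not belong to a user-agent group

    foreach ($rows as $row) {
        $option = strtoupper($row['option']);
        if ($option === 'SITEMAP') {
            $sitemaps[] = 'Sitemap: ' . $row['value'];
            continue;
        }
        // rules without a user agent apply to all crawlers
        $agent = $row['useragent'] ?? '*';
        // "DISALLOW" becomes "Disallow", "ALLOW" becomes "Allow"
        $groups[$agent][] = ucfirst(strtolower($option)) . ': ' . $row['value'];
    }

    $blocks = [];
    foreach ($groups as $agent => $lines) {
        $blocks[] = 'User-agent: ' . $agent . "\n" . implode("\n", $lines);
    }
    if ($sitemaps) {
        $blocks[] = implode("\n", $sitemaps);
    }

    // groups are separated by a blank line
    return implode("\n\n", $blocks) . "\n";
}

// example rows, as they might come back from the rules table
$rows = [
    ['option' => 'DISALLOW', 'value' => '/system/*', 'useragent' => '*'],
    ['option' => 'DISALLOW', 'value' => '/dir1/*', 'useragent' => 'Googlebot'],
    ['option' => 'SITEMAP', 'value' => 'http://www.example.com/sitemap.xml', 'useragent' => '*'],
];

echo renderRobotsTxt($rows);
```

Grouping by user agent first keeps all rules for one crawler in a single block, which is the layout crawlers expect in a robots.txt file.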

API

You can add or remove robots.txt rules with the Robots class.

Allowed options:

value: the directory or other value the rule applies to (for example a sitemap URL)

useragent: the crawler the rule applies to, so you can allow or disallow crawling of a directory for one specific crawler only

Example Usage

//do not allow any robot to crawl directory /system/ or its sub-directories
Robots::addRule("DISALLOW", "/system/*");

//disallow crawling of directory /dir1/ for Googlebot only
Robots::addRule("DISALLOW", "/dir1/*", "Googlebot");

//allow all robots to crawl directory /store/ with all sub-directories
Robots::addRule("ALLOW", "/store/*");

//disallow /dir1/ but allow /dir1/dir2/
Robots::addRule("DISALLOW", "/dir1/");
Robots::addRule("ALLOW", "/dir1/dir2/");

//remove the /dir1/ rule for Googlebot again
Robots::deleteRule("DISALLOW", "/dir1/*", "Googlebot");

//set sitemap
Robots::addRule("SITEMAP", "http://www.example.com/sitemap.xml");