Sitemap generation

Difficulty: Medium

Google launched its Webmaster Tools some time ago. Part of this is sitemap.xml, a file listing all the URLs of your website so that Google and other search engines can index it more easily. It’s XML, so writing it by hand would be a serious PITA. Today I whipped up a small library for Kohana to help with this. In part it uses the get_url() method I posted about earlier.

So, how does it work? We begin by adding a route in /application/config/routes.php:

$config['sitemap.xml'] = 'sitemap';

This will map all requests for example.com/sitemap.xml to the sitemap controller. The next step is creating that controller.

//application/controllers/sitemap.php
class Sitemap_Controller extends Controller{
	public function index(){
		$sitemap=new Sitemap; //create new sitemap
		$sitemap->add_url('http://www.example.com','2008-05-31','weekly',1); //url, last modified, change frequency, priority
		$sitemap->location='http://www.example.com/sitemap.xml'; //not necessary really since this url is assumed
		echo $sitemap->render(); //will output the sitemap and add an xml header
		$sitemap->ping_google(); //tell Google about the sitemap
	}
}

This will output a validating sitemap.xml. Caching is something for the future. Other methods are save(), which saves the sitemap to a file, and get(), which returns the sitemap as a string.
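To give an idea of what “validating” means here: a sitemap is a urlset element containing one url entry per page, as described on sitemaps.org. Below is a minimal sketch of that rendering step. It is not the library’s actual code; render_sitemap() and the array keys are names I made up for illustration.

```php
<?php
// Hypothetical helper (not part of the library): renders url entries
// as a sitemaps.org <urlset> document.
function render_sitemap(array $urls)
{
    $out  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $out .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        $out .= "  <url>\n";
        // Escape &, <, > and quotes so the URL is valid inside XML
        $out .= '    <loc>' . htmlspecialchars($url['loc'], ENT_QUOTES | ENT_XML1, 'UTF-8') . "</loc>\n";
        if (isset($url['lastmod'])) {
            $out .= '    <lastmod>' . $url['lastmod'] . "</lastmod>\n";
        }
        if (isset($url['changefreq'])) {
            $out .= '    <changefreq>' . $url['changefreq'] . "</changefreq>\n";
        }
        if (isset($url['priority'])) {
            // An explicit '.' separator keeps the output independent of the server locale
            $out .= '    <priority>' . number_format($url['priority'], 1, '.', '') . "</priority>\n";
        }
        $out .= "  </url>\n";
    }
    return $out . "</urlset>\n";
}

// Usage: one entry with the same values as the controller example
$xml = render_sitemap(array(
    array('loc' => 'http://www.example.com', 'lastmod' => '2008-05-31', 'changefreq' => 'weekly', 'priority' => 1),
));
echo $xml;
```

Formatting the priority with number_format() and a fixed decimal point is a deliberate choice in this sketch: the protocol expects a dot, whatever the locale settings on the server are.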

I talked about the ORM::get_url() method in an earlier post. It comes in quite handy in this class.

$sitemap=new Sitemap; //create new sitemap
$sitemap->add_model('article');

This code will call add_url($article->get_url()) for each record in the table, where $article is an Article_Model. If your table has a column ‘modified’ and that column returns a timestamp (perhaps using __get), it will also set the lastmod element in the xml file.
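Roughly, the per-record logic looks like the sketch below. This is an illustration under assumptions, not the library source: Article_Stub stands in for a real ORM model, and urls_from_records() is a made-up name.

```php
<?php
// Stand-in for an ORM model; a real Article_Model would come from Kohana's ORM.
class Article_Stub
{
    public $slug;
    public $modified; // Unix timestamp, or null when unknown

    public function __construct($slug, $modified = null)
    {
        $this->slug = $slug;
        $this->modified = $modified;
    }

    public function get_url()
    {
        return 'http://www.example.com/articles/' . $this->slug;
    }
}

// Hypothetical equivalent of the loop add_model() runs over the records.
function urls_from_records(array $records)
{
    $urls = array();
    foreach ($records as $record) {
        $entry = array('loc' => $record->get_url());
        if (!empty($record->modified)) {
            // lastmod uses W3C Datetime; a plain YYYY-MM-DD date is allowed
            $entry['lastmod'] = gmdate('Y-m-d', $record->modified);
        }
        $urls[] = $entry;
    }
    return $urls;
}

// Usage: the second record has no timestamp, so it gets no lastmod
$urls = urls_from_records(array(
    new Article_Stub('hello-world', 1212192000),
    new Article_Stub('no-date'),
));
print_r($urls);
```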

Conditions are also possible:

$sitemap=new Sitemap; //create new sitemap
$sitemap->add_model('article',array('is_published'=>1)); //where condition like in db builder
$xml=$sitemap->get(); //retrieve the sitemap as a string

As you can see, the get_url() method makes it quick and easy to set up a sitemap from your models. Of course, it doesn’t cover complicated cases. Google and other search engines can now access your sitemap at example.com/sitemap.xml.

If you have a lot of records, building the sitemap on every request might be costly, so you should put in some caching.
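A simple way to do that is to keep the rendered XML in a file and only rebuild it once the file is older than some lifetime. Here is a minimal sketch of the idea; cached_sitemap() and its parameters are hypothetical names and not something the library provides.

```php
<?php
// Hypothetical file cache: $build is any callable that returns the sitemap XML string.
function cached_sitemap($cache_file, $lifetime, $build)
{
    // Serve the cached copy while it is younger than $lifetime seconds
    if (is_file($cache_file) && (time() - filemtime($cache_file)) < $lifetime) {
        return file_get_contents($cache_file);
    }

    // Otherwise rebuild and store the result for the next request
    $xml = call_user_func($build);
    file_put_contents($cache_file, $xml, LOCK_EX);
    return $xml;
}
```

In a controller you would pass a callable that builds the sitemap (for example one that calls add_model() and returns get()) and echo whatever comes back. The lifetime is a trade-off: an hour or a day is probably fine for most sites, since search engines don’t fetch the file that often.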

Update
I added caching to the library. For an example, see the repo under controllers.


9 Responses to “Sitemap generation”

  1. Alex Sancho Says:

    Just one word, brilliant. Keep the good work.

  2. theShark Says:

    Great :)

  3. mrks Says:

    +1 :)

    btw
    i had add ‘unique location entry’ functionallity

  4. dlib Says:

    I’ll look into it

  5. dlib Says:

    In SVN there is now a check for url existence, if url exists it’s not added but no exception or anything is thrown.

  6. James Says:

    Great article, thanks for the information. In my view sitemaps are critical to getting your site listed in Google/Yahoo etc. correctly.

  7. 486 Says:

    Very useful library! Unfortunately, google throws a fit if your sitemap exceeds 50k URLs. Any ideas to easily break it down into multiple files and ping them?

  8. barat Says:

    Well … I have a problem … when I do something like this:

    $sitemap->add_url('http://www.example.com','2008-05-31','weekly',0.8)

    I get url’s like

    http://www.organizatorzyimprez.pl/firmy.html
    daily
    0,9

    with comma , and google wants dot …if not -> Warning

  9. LeshkaSaD Says:

    Yeah, it seems to be a sitemap library issue. Google expects a dot-separated number instead of a comma. On my local PC it shows dot-separated numbers, on production a comma :/
