Implementing Sitemaps in Django for Dynamic and Static URLs

A sitemap is an important file for any website because it helps search engines index your pages. It is a simple XML file listing the URLs of your website along with metadata such as priority, change frequency and last-modified date. Search engines use it to learn the site’s structure and crawl it more effectively.
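For reference, the file itself is small and regular. The sketch below builds a two-entry sitemap with Python's standard library, purely to show the shape of the XML. The example.com URLs and the values are made up for illustration, and build_sitemap is a hypothetical helper, not part of Django.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of URL entry dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Each entry becomes a <url> element with <loc>, <lastmod>, ... children.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sample_xml = build_sitemap([
    {"loc": "https://example.com/", "changefreq": "weekly", "priority": 0.8},
    {"loc": "https://example.com/red-shoes/", "lastmod": "2016-05-01",
     "changefreq": "weekly", "priority": 0.7},
])
print(sample_xml)
```

Only loc is required for each entry; the other tags are optional hints for the crawler.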

Generating a sitemap is easy for a static website: many online sitemap generators will crawl your site and produce the XML file, which you can then download and upload to the root folder of your website. This approach breaks down if you have dynamic URLs or if your pages are not well linked to each other, because a crawler cannot discover pages that nothing links to.

The good news for Django users is that Django ships with a built-in sitemap framework that makes this task easy. In this article, I will cover how to generate a sitemap containing both static and dynamic URLs.

 

Installation & Initialization

Modify your INSTALLED_APPS to include the following two apps:

INSTALLED_APPS = (
    ...
    'django.contrib.sites',
    'django.contrib.sitemaps',
    ...
)
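
Because django.contrib.sites is enabled, Django also expects a SITE_ID setting that identifies the current site. A minimal settings fragment follows, assuming the default Site row (normally primary key 1) is the one you want:

```python
# settings.py (fragment)
# Primary key of the row in the django_site table that represents this site.
# 1 is the row Django creates by default; adjust it if yours differs (assumption).
SITE_ID = 1
```

Run python manage.py migrate so the sites tables exist, and set that Site's domain to your real domain in the admin, since the sitemap URLs are built from it.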

Now open your primary URLConf file and define the sitemap URL there:

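from django.conf.urls import url  # on Django 2.0+, use django.urls.re_path instead
from django.contrib.sitemaps.views import sitemap

# "sitemaps" is the dictionary of Sitemap classes built later in this article
url(r'^sitemap\.xml$', sitemap, {'sitemaps': sitemaps}, name='django.contrib.sitemaps.views.sitemap'),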

This tells Django to build the sitemap whenever /sitemap.xml is requested. The key part is the extra sitemaps argument: a dictionary that maps a short section label to its Sitemap class. You define each Sitemap class yourself, and it must subclass django.contrib.sitemaps.Sitemap. We will cover this in detail shortly.

Let's take a use case where you have a mix of static and dynamic URLs for which you need a sitemap. Your static URLs are defined in a separate app named “home”, and the dynamic ones are generated from your “product” model.

 

Generating Sitemap for Dynamic Urls

Assume you have a lot of products on your website and want your sitemap to include links to all active products. First, create a sitemaps.py file with the following content. This file can live anywhere in your codebase; for this example, I am putting it in the “home” app. Here’s how your Sitemap class will look:


from django.contrib.sitemaps import Sitemap
from product.models import Product

class ProductSitemap(Sitemap):
    changefreq = "weekly"
    priority = 0.7

    def items(self):
        # Only active products are listed in the sitemap
        return Product.objects.filter(isActive=True)

    def lastmod(self, item):
        # Date the product was last modified
        return item.modifiedDate

Note that changefreq and priority are class attributes here, but they can also be defined as methods, like lastmod in the above example.
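
As a rough sketch of the method form, the class below rates some products higher than others. TieredProductSitemap and the featured attribute are hypothetical and not part of the model used in this article; items() is omitted for brevity, and the import fallback exists only so the snippet runs without Django installed.

```python
try:
    from django.contrib.sitemaps import Sitemap
except ImportError:
    # Fallback so the sketch can be run without Django installed.
    class Sitemap(object):
        pass


class TieredProductSitemap(Sitemap):
    """Hypothetical variant that rates featured products higher."""

    def priority(self, item):
        # "featured" is an assumed boolean attribute on the product.
        return 0.9 if getattr(item, "featured", False) else 0.5

    def changefreq(self, item):
        # Featured products are assumed to change more often.
        return "daily" if getattr(item, "featured", False) else "weekly"
```

Django calls each of these methods once per object returned by items(), passing the object as item.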

We have also defined the items method, which returns all the active products.

Note that we haven’t specified the URL of any object. There is a location method for this purpose. By default, location() calls the get_absolute_url() method on each object to get its URL, so you have to define get_absolute_url in the Product model class. An example is shown below:


from django.db import models

class Product(models.Model):
    category = models.CharField(max_length=100)
    isActive = models.BooleanField(default=True, db_index=True)
    name = models.CharField(max_length=100)
    ....

    def get_absolute_url(self):
        # Path only; the sitemap framework prepends the protocol and domain
        return '/' + self.name + '-' + self.category + '/'
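
One caveat with plain string concatenation is that names containing spaces or punctuation produce URLs that need escaping. A minimal sketch of a slug-based path builder, using only the standard library, is shown below. Both slug and product_path are hypothetical helpers; inside a Django project, django.utils.text.slugify does a similar job.

```python
import re


def slug(text):
    """Lowercase the text and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def product_path(name, category):
    """Build a '/<name>-<category>/' path from URL-safe slugs."""
    return "/{}-{}/".format(slug(name), slug(category))


print(product_path("Red Shoes", "Foot Wear"))
```

Whatever scheme you choose, the URL pattern that serves the product page has to match the path get_absolute_url returns.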

This takes care of generating the sitemap for your dynamic URLs. We still have to reference this class in the URLConf file, but first let's look at how to create the Sitemap class for static URLs.

 

Generating Sitemap for Static Urls

For static URLs, where the URL doesn’t correspond to a model object, the solution is to explicitly list the URL names in items and call reverse in the location method of the sitemap. The example below defines the Sitemap class for this use case. Let's add the static URLs defined in your home.urls file to the sitemaps file we already created.


from django.urls import reverse  # django.core.urlresolvers on Django < 1.10
from home.urls import urlpatterns as homeUrls

class StaticSitemap(Sitemap):
    priority = 0.8
    changefreq = 'weekly'

    # Returns the namespaced names of all urls defined in home/urls.py.
    # Assumes home.urls is included under the "home" namespace.
    def items(self):
        # Unnamed patterns are skipped because they cannot be reversed by name
        return ['home:' + url.name for url in homeUrls if url.name]

    def location(self, item):
        return reverse(item)

Here I have defined an extra “location” method in the StaticSitemap class to specify the URL of each item. Now that your Sitemap classes are defined, the final step is to put them to use, as explained below.
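
Since reverse() looks routes up by name, only named patterns can appear in this sitemap. The name-collection step can be sketched without Django as follows; Pattern is a stand-in for Django's URL pattern objects, named_routes is a hypothetical helper, and the route names are invented.

```python
from collections import namedtuple

# Stand-in for Django's URL pattern objects, which expose a .name attribute.
Pattern = namedtuple("Pattern", ["name"])


def named_routes(patterns, namespace="home"):
    """Return 'namespace:name' strings for the patterns that have a name."""
    return ["{}:{}".format(namespace, p.name) for p in patterns if p.name]


# Invented route names; the middle pattern has no name and is skipped.
patterns = [Pattern("index"), Pattern(None), Pattern("about")]
print(named_routes(patterns))
```

If only a handful of pages belong in the sitemap, listing their names by hand in items() is a simpler alternative to importing urlpatterns.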

 

Using Sitemap classes to generate sitemap

You have to modify the primary URLConf file where you defined the sitemap URL. Just create a dictionary with your Sitemap classes and pass it to the sitemap view. An example is shown below:


from django.conf.urls import url  # on Django 2.0+, use django.urls.re_path instead
from django.contrib.sitemaps.views import sitemap
from home.sitemaps import ProductSitemap, StaticSitemap

# Dictionary containing your sitemap classes
sitemaps = {
    'products': ProductSitemap(),
    'static': StaticSitemap(),
}

urlpatterns = [
    ....,
    url(r'^sitemap\.xml$', sitemap, {'sitemaps': sitemaps}, name='django.contrib.sitemaps.views.sitemap'),
    ....,
]

Congratulations! Just open /sitemap.xml in your browser and you will see the generated XML file.

 

Pinging Google

One more important thing is to let Google know about changes in your sitemap so that your site can be re-indexed. The sitemaps framework provides a function to do just that: django.contrib.sitemaps.ping_google().

Whenever your sitemap changes, you can call this function. An obvious place to call it is the model’s save method, but that depends on your logic and on what else can change your sitemap. A better approach is to schedule a cron job for it rather than adding network overhead to your model’s save method.

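Make sure to register your site with Google Webmaster Tools first, otherwise the ping will not work. Also note that ping_google() was deprecated in Django 4.2 and removed in Django 5.0, after Google retired its sitemap ping endpoint, so this step applies to older Django versions only.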

 

Creating a Sitemap Index File

Please note that a single sitemap file should not exceed 50,000 URL entries or 50 MB in size (uncompressed). If you need more entries, you need a sitemap index file, which references other sitemap files rather than page URLs directly. The process to generate the sitemap index file is nearly the same; the difference in usage is shown below.
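
To see how the limit plays out, the following sketch splits a list of URLs into files of at most 50,000 entries. It is plain Python rather than Django's implementation (Django's Sitemap class applies a similar page size through its limit attribute), and split_into_files and the example.com URLs are made up for illustration.

```python
MAX_URLS_PER_FILE = 50000  # per-file limit from the sitemap protocol


def split_into_files(urls, size=MAX_URLS_PER_FILE):
    """Split urls into consecutive chunks of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# 120,000 made-up URLs need three sitemap files: 50,000 + 50,000 + 20,000.
urls = ["https://example.com/p/{}/".format(n) for n in range(120000)]
files = split_into_files(urls)
print([len(f) for f in files])
```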

The relevant URLConf lines would look like this:

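from django.conf.urls import url  # on Django 2.0+, use django.urls.re_path instead
from django.contrib.sitemaps import views

urlpatterns = [
    ....,
    url(r'^sitemap\.xml$', views.index, {'sitemaps': sitemaps}),
    # The index view reverses this name to build the link to each section
    url(r'^sitemap-(?P<section>.+)\.xml$', views.sitemap, {'sitemaps': sitemaps},
        name='django.contrib.sitemaps.views.sitemap'),
    ....,
]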

This will automatically generate a sitemap.xml file that references both sitemap-products.xml and sitemap-static.xml. The Sitemap classes and the sitemaps dictionary don’t change at all.
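
The index file has the same flavour as a regular sitemap, except that each entry points at another sitemap file. The sketch below builds one with the standard library to show the shape of the output; build_index is a hypothetical helper and example.com is a placeholder domain.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_index(sitemap_urls):
    """Return sitemap index XML listing the given sitemap file URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = location
    return ET.tostring(index, encoding="unicode")


# One entry per key in the sitemaps dictionary.
index_xml = build_index([
    "https://example.com/sitemap-products.xml",
    "https://example.com/sitemap-static.xml",
])
print(index_xml)
```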

A sitemap index can also include up to 50,000 entries, which theoretically means you can submit up to 2.5 billion URLs (50,000 × 50,000) from your site to search engines.

 

I hope you find this article helpful. Let me know if you have any suggestions or feedback in the comments section below.

Fun Fact: Game of Thrones season 6 is back, and its episode 4 is also titled as the book of stranger 🙂

 
