Variable Visions

XML Site Map Generator

Published Mon. Sep. 24, 2012

Set up a web page of all links for users and an XML site-map for bots. Don't forget to include your site-map file in your robots.txt file.

Your site-map is as important to users as it is to search engine bots and crawlers. For bots, the standard format is an XML document, usually placed in the root directory of your web site, that looks similar to the following:


<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.your-site.com/about.php</loc>
<lastmod>2011-11-16</lastmod>
<changefreq>monthly</changefreq>
<priority>0.9</priority>
</url>
<url>
<loc>http://www.your-site.com/contact.php</loc>
<lastmod>2011-11-16</lastmod>
<changefreq>monthly</changefreq>
<priority>0.9</priority>
</url>
</urlset>



Site-maps are read by web crawlers and are important for SEO. The simplest way to help the bots find your site-map.xml file is to include the following line in your robots.txt file (which also resides in the root directory of your web site):
SITEMAP: http://www.your-site.com/site-map.xml
There are tools that create site-maps from static pages, but what if your content comes from database tables rather than files in standard directories? Or what if you do not want to rely on a provided XML creation tool?
For human visitors, I reused the SQL statement that displays database content in other sections of the web site, changing it to return ALL the records in the table and show only the headlines. (Make sure you wrap anchor tags around these so they are actual links to the articles.) Users can then view this one page and see every article and page published.
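The headline page can be sketched roughly as follows. This is only an illustration, not the code running on this site: the helper name render_article_index, the 'title' and 'slug' array keys, and the base URL are all assumptions, and the sample rows stand in for whatever your "select every headline" query returns.

```php
<?php
// Hypothetical helper: turns database rows into an HTML list of article links.
// The 'title' and 'slug' keys and the base URL are assumptions for illustration.
function render_article_index(array $rows, string $baseUrl): string
{
    $items = '';
    foreach ($rows as $row) {
        // rawurlencode() makes the slug safe as a URL path segment;
        // htmlspecialchars() makes the link and the headline safe inside HTML.
        $href = $baseUrl . rawurlencode($row['slug']);
        $items .= '  <li><a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
                . htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8')
                . "</a></li>\n";
    }
    return "<ul>\n" . $items . "</ul>\n";
}

// Sample rows standing in for the query result.
$rows = [
    ['title' => 'XML Site Map Generator', 'slug' => 'xml-site-map-generator'],
    ['title' => 'Tips & Tricks', 'slug' => 'tips-and-tricks'],
];
echo render_article_index($rows, 'http://www.your-site.com/articles/');
```

Building the list in a function that returns a string keeps the escaping in one place, so a headline containing & or < cannot break the page.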
But how did I create the XML version for the bots? How did I set this up to automatically add new articles and new page links?
The problem with a site-map created once by hand is that any content you add to your site afterwards will not be included. I decided to use PHP, XML, and XHTML to create a page that automatically generates the XML site-map code needed. This code can easily be copied and pasted into the site-map.xml file in the root directory of your web site. I am still working on a PHP script that uses fwrite to write the updated XML directly to the site-map.xml file each time a new article is published.
I used a PHP do/while loop to display the XML site-map code (after replacing every < with &lt; and > with &gt; so that the browser displays the markup instead of interpreting it). Inside the loop, PHP echoes the name of each article as it steps through all the records in the database table; in this case, I used PHP to build the dynamic URL. Keep the site-map header and closing tag outside the loop, since we only need these tags once. Everything else goes in the loop to echo the URL and date modified from the database.


&lt;?xml version="1.0" encoding="UTF-8"?&gt;<br />
&lt;urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"&gt;<br /><br />

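<?php
// do/while runs the body before testing, so $row_new9 must already hold the first fetched row.
// mysql_* was removed in PHP 7; with a mysqli result, use mysqli_fetch_assoc() and mysqli_free_result().
do { ?>
 &lt;url&gt;<br />
    &lt;loc&gt;http://www.variablevisions.com/articles/<?php echo $row_new9['title'];?>&lt;/loc&gt;<br />
    &lt;lastmod&gt;<?php
           // Sitemaps expect W3C dates (YYYY-MM-DD), so use zero-padded 'Y-m-d' rather than 'Y-n-j'.
           $x = strtotime( $row_new9['updated'] );
           echo date( 'Y-m-d', $x );
      ?>&lt;/lastmod&gt;<br />
    &lt;changefreq&gt;monthly&lt;/changefreq&gt;<br />
    &lt;priority&gt;0.8&lt;/priority&gt;<br />
  &lt;/url&gt;<br /><br />
  <?php } while ($row_new9 = mysql_fetch_assoc($new9));
mysql_free_result($new9);
    ?>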
  <?php
  // Same pattern for the main pages; $row_new10 must also hold its first row before the loop.
  do { ?>
 
  &lt;url&gt;<br />
    &lt;loc&gt;http://www.variablevisions.com/<?php echo $row_new10['tag'];?>&lt;/loc&gt;<br />
    &lt;lastmod&gt;<?php
           // Uses today's date as lastmod, in zero-padded YYYY-MM-DD form.
           echo date( 'Y-m-d' );
      ?>&lt;/lastmod&gt;<br />
    &lt;changefreq&gt;monthly&lt;/changefreq&gt;<br />
    &lt;priority&gt;0.9&lt;/priority&gt;<br />
  &lt;/url&gt;<br /><br />
<?php } while ($row_new10 = mysql_fetch_assoc($new10));
mysql_free_result($new10);
   
    ?>
  
    <br />&lt;/urlset&gt;<br />
   
You may need several loops if your content is spread across multiple database tables, or if you also want the main page links (about, contact, etc.) in the site-map.
The benefit of this method is that newly added articles automatically appear in the generated code. Make sure the static XML header stays outside the loop; only the per-URL entries should repeat.
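For comparison, the same idea can be written as a function that builds the XML as a string and escapes it only at display time. Treat this as a rough sketch under assumptions rather than the script used here: build_sitemap, the 'path' and 'updated' keys, and the sample rows are hypothetical, and the two arrays stand in for the rows fetched from two separate tables.

```php
<?php
// Hypothetical helper: builds sitemap XML from rows that each carry a URL path
// and a last-modified date. The 'path' and 'updated' keys are assumptions.
function build_sitemap(array $rows, string $baseUrl): string
{
    // Header and opening tag stay outside the loop: they must appear exactly once.
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($rows as $row) {
        // Escape &, <, > and quotes so the URL is valid inside XML.
        $loc = htmlspecialchars($baseUrl . $row['path'], ENT_QUOTES | ENT_XML1, 'UTF-8');
        // Zero-padded YYYY-MM-DD, the date form the sitemap protocol accepts.
        $lastmod = date('Y-m-d', strtotime($row['updated']));
        $xml .= "  <url>\n";
        $xml .= "    <loc>{$loc}</loc>\n";
        $xml .= "    <lastmod>{$lastmod}</lastmod>\n";
        $xml .= "  </url>\n";
    }
    // Closing tag also goes outside the loop.
    return $xml . "</urlset>\n";
}

// Sample rows standing in for two database tables (articles and main pages).
$articles = [['path' => 'articles/xml-site-map-generator', 'updated' => '2012-09-24 10:30:00']];
$pages    = [['path' => 'about.php', 'updated' => '2011-11-16']];
$xml = build_sitemap(array_merge($articles, $pages), 'http://www.your-site.com/');

// Escape the finished XML so a browser shows the markup instead of parsing it.
echo '<pre>' . htmlspecialchars($xml, ENT_QUOTES, 'UTF-8') . '</pre>';
```

Merging the row sets before the loop means one loop can cover several tables, at the cost of requiring each table's rows to be mapped to the same keys first.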
The last step is to remove the remaining manual work by writing the output of the script directly to the site-map.xml file each time a new article is inserted into the database. I use PHP's fwrite and a do/while loop to go through two tables in my database and get the SEF URLs into the site map.
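A minimal sketch of the file-writing step might look like the following. It is not the finished script: write_sitemap is a hypothetical name, the target path in the system temp directory is a placeholder for your web root, and the file locking is my own addition on the assumption that two publishes could happen at nearly the same time.

```php
<?php
// Hypothetical helper: writes already-generated sitemap XML to disk.
// Returns true only if every byte was written.
function write_sitemap(string $xml, string $path): bool
{
    // Mode 'c' opens for writing without truncating, so the old file is
    // only emptied once we hold the lock.
    $handle = fopen($path, 'c');
    if ($handle === false) {
        return false;
    }
    $ok = false;
    if (flock($handle, LOCK_EX)) {
        ftruncate($handle, 0);                          // drop the previous contents
        $ok = fwrite($handle, $xml) === strlen($xml);   // fwrite returns bytes written
        fflush($handle);
        flock($handle, LOCK_UN);
    }
    fclose($handle);
    return $ok;
}

// Placeholder XML and path; in practice the XML comes from the database loops
// and the path points at site-map.xml in the web root.
$xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
     . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n"
     . "</urlset>\n";
$target = sys_get_temp_dir() . '/site-map.xml';
var_dump(write_sitemap($xml, $target));
```

Calling a function like this right after the INSERT that publishes an article would keep the file current without any copy and paste.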

Keywords: XML, PHP, fwrite, fopen, fseek