In this post, we will see how to create dynamic sitemaps in Next.js.
What Is a Sitemap?
A sitemap is an XML file that provides information about the pages on your website. It tells search engines such as Google which pages on your website are important and need to be indexed.
Add a Sitemap to a Next.js Website
To add a dynamic sitemap to a Next.js website, we will use the next-sitemap package.
Installation
yarn add next-sitemap
# or
npm install next-sitemap
Create a config file
next-sitemap requires a basic config file, next-sitemap.config.js, in your project root:
// next-sitemap.config.js
const siteUrl = 'https://example.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl,
};

export default config;
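If the site URL differs between environments, it can be read from an environment variable instead. This is only a sketch; the SITE_URL variable name is an assumption about your setup, not something next-sitemap requires:
// next-sitemap.config.js — sketch: site URL from an environment variable
// SITE_URL is an assumed variable name; fall back to the production URL.
const siteUrl = process.env.SITE_URL || 'https://example.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl,
};

export default config;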
Build the sitemap
Add next-sitemap as a postbuild script in the scripts section of your package.json:
{
  "build": "next build",
  "postbuild": "next-sitemap"
}
To build the sitemap, run npm run build or yarn build. This builds the website and automatically creates a sitemap.xml file in the public folder.
The sitemap.xml file contains the list of generated URLs.
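As an illustration, the generated sitemap.xml for a site with a couple of pages looks roughly like this. The URLs, dates, and the default changefreq/priority values are placeholders, extra namespace attributes are trimmed, and the exact output depends on your pages and next-sitemap version:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2023-01-01T00:00:00.000Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2023-01-01T00:00:00.000Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>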
To tell Google where to find the sitemap.xml file and which URLs the crawler can access on your website, you have to add a robots.txt file.
Create a robots.txt file
To automatically create a robots.txt file, add the generateRobotsTxt option to the config file:
// next-sitemap.config.js
const siteUrl = 'https://example.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl,
  generateRobotsTxt: true, // generates robots.txt
};

export default config;
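With only generateRobotsTxt enabled, the generated public/robots.txt should look roughly like this (structure based on next-sitemap's default output; the exact result may vary by version):
# *
User-agent: *
Allow: /

# Host
Host: https://example.com

# Sitemaps
Sitemap: https://example.com/sitemap.xml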
Prevent Google from indexing certain web pages
In some cases, we want to prevent Google from indexing certain pages of our website.
The first step is to exclude those page URLs from the sitemap.
// next-sitemap.config.js
const siteUrl = 'https://example.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl,
  generateRobotsTxt: true,
  exclude: ['/protected-page', '/secret-page'], // exclude here
};

export default config;
Even after excluding certain pages from the sitemap, Google can still find and index them. To make sure these pages are completely excluded from indexing, you have to add some policies to the robots.txt file. To do that, add robotsTxtOptions to the config file:
// next-sitemap.config.js
const siteUrl = 'https://example.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl,
  generateRobotsTxt: true,
  exclude: ['/protected-page', '/secret-page'], // exclude here
  robotsTxtOptions: {
    policies: [
      { userAgent: '*', disallow: '/protected-page' }, // not indexed
      { userAgent: '*', disallow: '/secret-page' }, // not indexed
      { userAgent: '*', allow: '/' }, // index the rest of the pages
    ],
  },
};

export default config;
The above configuration will generate a robots.txt file like this:
# *
User-agent: *
Disallow: /protected-page

# *
User-agent: *
Disallow: /secret-page

# *
User-agent: *
Allow: /

# Host
Host: https://example.com

# Sitemaps
Sitemap: https://example.com/sitemap.xml
This robots.txt file will tell Google to index all the URLs listed in the sitemap.xml file.
Generating dynamic/server-side sitemaps
To generate a dynamic/server-side sitemap, create a pages/server-sitemap.xml/index.tsx page and add the following content:
// pages/server-sitemap.xml/index.tsx
import { GetServerSideProps } from 'next';
import { getServerSideSitemap, ISitemapField } from 'next-sitemap';

export const getServerSideProps: GetServerSideProps = async (ctx) => {
  // Method to source URLs from a CMS
  // const response = await fetch('https://example.com/api')
  // const fields = await response.json()

  const fields: ISitemapField[] = [
    {
      loc: 'https://example.com', // Absolute URL
      lastmod: new Date().toISOString(),
    },
    {
      loc: 'https://example.com/dynamic-path-2', // Absolute URL
      lastmod: new Date().toISOString(),
    },
  ];

  return getServerSideSitemap(ctx, fields);
};

// Default export to prevent Next.js errors
export default function Sitemap() {}
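If your URLs come from a CMS or an API, the commented-out fetch above can be fleshed out along these lines. This is only a sketch: the /api/posts endpoint, the { slug, updatedAt } response shape, and the /blog/ path are assumptions about your setup.
// pages/server-sitemap.xml/index.tsx — sketch: sourcing URLs from an API
import { GetServerSideProps } from 'next';
import { getServerSideSitemap, ISitemapField } from 'next-sitemap';

// Assumed response shape of the (hypothetical) /api/posts endpoint
type Post = { slug: string; updatedAt: string };

export const getServerSideProps: GetServerSideProps = async (ctx) => {
  // Fetch the dynamic content from your own API or CMS
  const response = await fetch('https://example.com/api/posts');
  const posts: Post[] = await response.json();

  // Map each post to a sitemap entry
  const fields: ISitemapField[] = posts.map((post) => ({
    loc: `https://example.com/blog/${post.slug}`, // Absolute URL
    lastmod: post.updatedAt,
  }));

  return getServerSideSitemap(ctx, fields);
};

// Default export to prevent Next.js errors
export default function Sitemap() {}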
Now, Next.js serves the dynamic sitemap from http://localhost:3000/server-sitemap.xml.
List the dynamic sitemap page in robotsTxtOptions.additionalSitemaps and exclude this path from the static sitemap list:
// next-sitemap.config.js
const siteUrl = 'https://example.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl,
  generateRobotsTxt: true,
  exclude: ['/server-sitemap.xml'], // exclude here
  robotsTxtOptions: {
    additionalSitemaps: [
      `${siteUrl}/server-sitemap.xml`, // add here
    ],
  },
};

export default config;
In this way, next-sitemap will manage the sitemaps for all your static pages, and your dynamic sitemap will be listed in robots.txt.
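With this configuration, the # Sitemaps section of the generated robots.txt should list both sitemaps, roughly:
# Sitemaps
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/server-sitemap.xml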
Exclude the generated files from Git
The last thing to do is to exclude the generated files from your git commits. New sitemap.xml and robots.txt files will be generated on the server on each build.
# .gitignore
/public/sitemap.xml
/public/robots.txt