Using SPAllSitesJobDefinition to Process Site by Site

Introduction

Timer Jobs are commonly used by the SharePoint platform to perform various tasks, from maintenance such as cleaning up dead sites and old history to processing ratings and notifications. There is even a Timer Job to recycle the timer service that manages the Timer Jobs (Timer Job Recycle). Developers commonly create, deploy and schedule Timer Jobs to solve business requirements that happen on a schedule or need to run without a user or user action. Unlike a Windows Scheduled Task, a Timer Job works with the SharePoint farm and can easily interact with other parts of the farm, making it a better solution for scheduled tasks in SharePoint.

There are over one hundred Job Definitions that define the available Timer Jobs. Many are created for a specific service or process and cannot be extended. Some are abstract, which allows a developer to extend them and create a custom Timer Job. The SPAllSitesJobDefinition class is abstract and can be used to create a custom Timer Job. SPAllSitesJobDefinition's specialty is to process each site contained in the Web application associated with the Timer Job. This post will focus on extending the abstract SPAllSitesJobDefinition to create a simple list of the sites and webs contained in a Web application, to show how simple it is to iterate all sites in a Web application.

Hierarchical View of the SPAllSitesJobDefinition

NOTE: Timer Jobs is the terminology used in the Central Admin interface. Timer Jobs can be enabled, disabled and scheduled. Developers create Job Definitions using SPJobDefinition or a derived class. Custom SPJobDefinitions are added to or removed from the farm. You manage an existing Job Definition using code, PowerShell or the Timer Job interface in Central Admin. For this post I generally use Timer Job to mean the scheduled instance of a developer-created SPJobDefinition.

To understand the SPAllSitesJobDefinition class we should look at the inheritance hierarchy. Each level in the hierarchy provides added functionality to the specific job definition.

SPJobDefinition

The SPJobDefinition class provides us with a base to create, deploy and schedule a job. This class is the base of all Jobs in SharePoint 2010. As an abstract class it does nothing in particular other than form the base of all schedulable jobs. Key to the processing cycle of Job Definitions is the Execute method. It is this method that starts the job-specific processing. Being abstract, this class cannot be instantiated as a usable, concrete job. A derived class must be created. Depending on your farm deployment you may have upwards of one hundred Job Definitions derived directly from SPJobDefinition, many of them abstract as well.

SPPausableJobDefinition

SPPausableJobDefinition derives from SPJobDefinition and is another abstract class. This abstract class provides the base for serializing the state of the processing (depending on the specific concrete SPJobDefinition implementation), allowing the processing to be "paused" and restarted when the timer service stops and restarts.

SPContentDatabaseJobDefinition

Deriving from SPPausableJobDefinition, the SPContentDatabaseJobDefinition is another abstract class. This SPJobDefinition serves up SPContentDatabase objects to the Execute method, allowing a derived class to process each database in an associated Web application. There are many classes deriving from the SPContentDatabaseJobDefinition, including:

  • SPSiteDeletionJobDefinition
  • SPStorageMetricsProcessingJobDefinition
  • SPAllSitesJobDefinition

SPAllSitesJobDefinition

This class is yet another abstract class and extends SPContentDatabaseJobDefinition. This abstract class implements the abstract method Execute(SPContentDatabase, SPJobState) of SPContentDatabaseJobDefinition. This version of the Execute method will be fired once per SPContentDatabase in the associated Web application. For each firing of the Execute method the abstract class will call the abstract method ProcessSite(SPSite, SPJobState). It is this method you will need to implement in your derived class to process each site.

The Execute(SPContentDatabase, SPJobState) method of the SPContentDatabaseJobDefinition class also implements the state tracking of the processing, allowing the job to be paused and restarted. The tracking is at the site level: the abstract class implements code to track the last successfully processed site and will attempt to move to the next site on a restart.

SPAllSitesJobDefinition is an abstract class. It can't be instantiated and, even if it could, it does nothing of value by itself because it has no real processing of its own. When requirements allow for processing site by site, inherit from SPAllSitesJobDefinition. This post will demonstrate extending the SPAllSitesJobDefinition class and implementing the ProcessSite(SPSite, SPJobState) method defined in the abstract class.

Working with Job Definitions

Job definitions provide the framework for scheduled tasks in SharePoint, and therefore working with a job definition is more about the specific business task to be accomplished than the job definition itself. The SharePoint server API provides many abstract job definitions that can be used to create a new job definition. There is usually no reason to inherit directly from SPJobDefinition when so many classes are available for particular scenarios, such as processing all sites in a Web application.

Creating a custom SPAllSitesJobDefinition is similar to creating almost any other custom SPJobDefinition. Simply create a class that extends SPAllSitesJobDefinition and implement the ProcessSite(SPSite, SPJobState) method. It is in the ProcessSite method that your code is handed the current SPSite object, and where you generally call your custom processing code.
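As a minimal sketch of that shape, the class below derives from SPAllSitesJobDefinition and only writes each site URL to the trace output. The class name SiteUrlTraceJobDefinition and its title are hypothetical, and the protected base constructors (a parameterless one and one taking a name and an SPWebApplication) are assumed to match the SharePoint 2010 server object model. It needs a reference to Microsoft.SharePoint.dll and is not the demo class shown later in this post.

```csharp
using System.Diagnostics;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

// Hypothetical job definition used only to illustrate the minimum a derived class needs.
public class SiteUrlTraceJobDefinition : SPAllSitesJobDefinition
{
    // Parameterless constructor so the persisted job can be rehydrated from the configuration database.
    public SiteUrlTraceJobDefinition()
        : base()
    {
    }

    // Constructor used at install time; associates the job with a Web application.
    public SiteUrlTraceJobDefinition(string name, SPWebApplication webApplication)
        : base(name, webApplication)
    {
        Title = "Site URL Trace Job (sketch)";
    }

    // Called once per site collection in each content database of the Web application.
    public override void ProcessSite(SPSite site, SPJobState jobState)
    {
        // The SPSite is handed in by the base class, so this sketch does not dispose it
        // (assumption: the caller owns the object).
        Trace.WriteLine("Processing site collection: " + site.Url);
    }
}
```

Everything else (walking the content databases, opening each site, tracking progress) is left to the base classes, which is the point of choosing this job definition.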

Here are a few points to remember when developing and testing custom Job Definitions, not specific to SPAllSitesJobDefinition development.

  1. Custom Job Definitions must be deployed. Solutions and features with a feature receiver should be used for deployment.
  2. Custom Job Definitions should be removed properly when the feature is deactivated and/or the solution is retracted/removed.
  3. Job Definitions do not run in the W3WP process. Job Definitions run under the OWSTIMER.exe process.
  4. Debugging a custom job definition requires the debugger to attach to the OWSTIMER.exe process.
  5. Debugging the feature receiver deployment is easier if the project is NOT set to activate on deployment. Turning off automatic activation in the SharePoint project allows the developer to attach to the OWSTIMER process before the feature receiver events fire. The feature receiver events install and remove the job definition from the farm. The receiver commonly also sets the schedule and associates the job with a parent object.
  6. Manually restarting the SharePoint Timer Service is common, to allow the process to release the assembly and reload the new assembly after a deployment. If you are not seeing what you expect while processing, the SharePoint Timer Service (OWSTIMER.exe) may not have loaded the newest version of the assembly. A restart of the service will result in a reload of the assembly.
  7. The Job Definition runs under the account of the SharePoint Timer Service and not under the account of any specific SharePoint user.
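Points 1, 2 and 5 can be sketched as a Web application-scoped feature receiver. This is an illustration rather than the demo project's actual receiver: the receiver class name and the job name constant are hypothetical, and SiteListAggregationJobDefinition is assumed to expose a (string, SPWebApplication) constructor and to live in the same project.

```csharp
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

// Hypothetical receiver for a Web application-scoped feature.
public class SiteListJobFeatureReceiver : SPFeatureReceiver
{
    // Hypothetical internal job name; the demo project may use a different one.
    private const string JobName = "Demo-SiteListAggregationJob";

    public override void FeatureActivated(SPFeatureReceiverProperties properties)
    {
        // For a Web application-scoped feature the parent is the SPWebApplication.
        SPWebApplication webApp = properties.Feature.Parent as SPWebApplication;
        if (webApp == null)
        {
            return;
        }

        // Remove any earlier copy so activation can be repeated safely.
        DeleteJob(webApp);

        // Assumed constructor: (string name, SPWebApplication webApplication).
        SiteListAggregationJobDefinition job =
            new SiteListAggregationJobDefinition(JobName, webApp);

        // Run once an hour, at some minute between 0 and 59.
        SPHourlySchedule schedule = new SPHourlySchedule();
        schedule.BeginMinute = 0;
        schedule.EndMinute = 59;
        job.Schedule = schedule;

        // Persist the job definition to the farm.
        job.Update();
    }

    public override void FeatureDeactivating(SPFeatureReceiverProperties properties)
    {
        SPWebApplication webApp = properties.Feature.Parent as SPWebApplication;
        if (webApp != null)
        {
            DeleteJob(webApp);
        }
    }

    private static void DeleteJob(SPWebApplication webApp)
    {
        // Find the job first, then delete it outside the loop.
        SPJobDefinition existing = null;
        foreach (SPJobDefinition job in webApp.JobDefinitions)
        {
            if (job.Name == JobName)
            {
                existing = job;
                break;
            }
        }

        if (existing != null)
        {
            existing.Delete();
        }
    }
}
```

Deleting before adding keeps activation idempotent, which helps during the repeated deploy and debug cycles described in points 5 and 6.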

This post is not about the general details of creating, deploying and scheduling a timer job. This post is about using the SPAllSitesJobDefinition to process each site in a Web application. To learn the basics of creating, deploying and scheduling a timer job, see the MSDN article "Creating Timer Jobs in SharePoint 2010 That Target Specific Web Applications".

 

The Demo Application

The Microsoft Patterns and Practices team created guidelines for SharePoint 2010. I, along with others, reviewed and commented on the guidance as the content was created. The free and, I think, valuable guidance is available at http://msdn.microsoft.com/en-us/library/ff770300.aspx. The book covers data models in Part Two. Specifically, it discusses Aggregating List Views, and in particular it mentions creating a list of lists or a list of sites. These are commonly lists of hyperlinks, usually maintained or populated using a SharePoint workflow or timer job. This demo will use a custom SPJobDefinition inheriting from SPAllSitesJobDefinition to populate a well-known list with site and web information.

Note: The goal of the following demo is to demonstrate creating a new job definition based on the SPAllSitesJobDefinition. The code that follows is not production ready.

I will not discuss each line of code. I have provided a link to MSDN above if you need to understand the basics of creating a new SPJobDefinition. The code for this demo project is available at: https://skydrive.live.com/redir?resid=C1BF05785026FF91!107

Setup

The demo project consists of

  • Custom list definition with two content types, Site Info and Web Info
  • Single class, SiteListAggregationJobDefinition, that extends SPAllSitesJobDefinition
  • Site-scoped feature deploying the list definition and creating a list instance
  • Web application-scoped feature with an event receiver to add, remove and schedule the job and associate the job with a context Web application.

The site-scoped feature should be activated on the Central Admin root site. This is where the job will look for the custom list. The Web application-scoped feature should be activated on any Web applications (using Central Admin or PowerShell) that you want to be included in the list of sites.

SiteListAggregationJobDefinition

The SiteListAggregationJobDefinition class is the class that inherits from SPAllSitesJobDefinition. The only exciting code is the code that implements the Execute and ProcessSite methods. These methods are abstract methods in the various inherited classes.

In this demo there is no actual need to implement the Execute method. It was implemented to allow the reader to add a breakpoint and understand the order of calls, from Execute(SPContentDatabase, SPJobState) to ProcessSite(SPSite, SPJobState). The pseudo code goes like this:

For each SPContentDatabase in Web application
{
   Raise Execute(SPContentDatabase, SPJobState)
   {
     For each SPSite in SPContentDatabase.Sites
     {
        Raise ProcessSite(SPSite, SPJobState)
     }
   }
}

Most of the code in the ProcessSite implementation is standard recursive SharePoint code. Each ProcessSite call provides a single SPSite, the current site. The algorithm is simple.

  1. First we retrieve a reference to the well-known list.
  2. Delete all items with a Site Id matching the current SPSite id.
  3. Add a new list item for the SPSite data
  4. Recursively process each SPWeb, adding a new list item per web

This code is standard SharePoint server object model code. Nothing fancy and for demo purposes no error handling.

Below are the key methods in the SiteListAggregationJobDefinition class:

public override void Execute(SPContentDatabase contentDatabase,
                             SPJobState jobState)
{
   base.Execute(contentDatabase, jobState);
}

public override void ProcessSite(Microsoft.SharePoint.SPSite site,
                                 SPJobState jobState)
{
   ProcessSiteForListItems(site);
}
// Shared with ProcessWeb. Assumed to be a class-level field; its
// declaration is not part of the original listing.
private SPList list;

public void ProcessSiteForListItems(Microsoft.SharePoint.SPSite site)
{
   SPAdministrationWebApplication centralAdminWebApp =
        SPAdministrationWebApplication.Local;

   // Sites[0] returns a new SPSite that must be disposed. Its RootWeb is
   // owned by that SPSite and is released along with it.
   using (SPSite adminSite = centralAdminWebApp.Sites[0])
   {
      list = adminSite.RootWeb.Lists["Site List"];
      Guid siteId = site.ID;

      //Delete existing webs and site collection by site collection id.
      DeleteWebsAndSiteBySiteId(list, siteId.ToString());

      // Add the web items first so the web count is known before the
      // site item is written.
      int webCnt;
      using (SPWeb w = site.OpenWeb())
      {
         webCnt = ProcessWeb(w, siteId);
      }

      SPListItem item = list.AddItem();
      // Stored as text so it matches the CAML query used for deletion.
      item["Aptillon_SiteId"] = siteId.ToString();
      SPFieldUrlValue val = new SPFieldUrlValue();
      val.Description = site.RootWeb.Url;
      val.Url = site.RootWeb.Url;
      item["Aptillon_Url"] = val;
      item["Aptillon_SiteOwner"] = site.Owner.Name;
      item["Aptillon_WebCount"] = webCnt.ToString();
      item["Title"] = string.Format("Site: {0}", site.Url);
      item["ContentType"] = "Site Info";
      item.Update();
   }
}
public int ProcessWeb(SPWeb web, Guid siteId)
{
   // Count this web plus every descendant web.
   int webCnt = 1;
   foreach (SPWeb w in web.Webs)
   {
      try
      {
         webCnt += ProcessWeb(w, siteId);
      }
      finally
      {
         // Webs handed out by the Webs collection must be disposed.
         w.Dispose();
      }
   }

   SPListItem item = list.AddItem();
   item["Title"] = web.Title;
   item["ContentType"] = "Web Info";
   item["Aptillon_SiteId"] = siteId.ToString();
   SPFieldUrlValue val = new SPFieldUrlValue();
   val.Description = web.Url;
   val.Url = web.Url;
   item["Aptillon_Url"] = val;
   item["Aptillon_ListCount"] = web.Lists.Count.ToString();
   item["Aptillon_Description"] = web.Description;
   // Direct child webs only.
   item["Aptillon_WebCount"] = web.Webs.Count.ToString();
   item.Update();

   return webCnt;
}
private void DeleteWebsAndSiteBySiteId(SPList l, string siteId)
{
   SPListItemCollection results;
   // A regular C# string literal cannot span lines, so concatenate.
   string q = string.Format(
           "<Where><Eq><FieldRef Name='Aptillon_SiteId' />" +
           "<Value Type='Text'>{0}</Value></Eq></Where>", siteId);

   var query = new SPQuery
   {
      Query = q
   };

   results = l.GetItems(query);
   if (results.Count > 0)
   {
      for (int x = results.Count - 1; x >= 0; x--)
      {
         results[x].Delete();
      }
   }
}

That is all there is to the code. There is actually more code for recursively processing the webs and adding list items than there is for the framed-out job definition. Using SPAllSitesJobDefinition as the base class for a job definition gives us site-by-site processing for free.

A Look at the Finished Product

After the solution has been deployed, and you have activated the site-scoped feature in Central Admin and the Web application-scoped feature on all Web applications you want processed, you will have an empty list in Central Admin until the timer job executes.

EmptyList

The demo code sets an hourly schedule during the install. To run the timer job manually:

1. Open Central Admin

2. Navigate to _admin/ServiceJobDefinitions.aspx

AllSitesTimerJobs

3. Click on the Aptillon Demo Site List Aggregation Timer Job

TimerJobManagementPage

4. Click Run Now. The timer job may take a minute to start. Immediate is not exactly immediate.

5. Review the Site List for sites and web information.

SiteListWithSitesAndWebs
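The same "Run Now" action can also be requested from code, which may be handy while testing. The sketch below is a small console program, not part of the demo project: it assumes it runs on a farm server under an account with sufficient farm rights, takes the Web application URL and the job's internal name as arguments, and relies on SPJobDefinition.RunNow() from the SharePoint 2010 server object model. The program and class names are hypothetical.

```csharp
using System;
using Microsoft.SharePoint.Administration;

// Hypothetical console helper: RunJobNow.exe <web-application-url> <job-name>
public static class RunJobNow
{
    public static int Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("Usage: RunJobNow <web-application-url> <job-name>");
            return 1;
        }

        // Resolve the Web application the job was associated with at install time.
        SPWebApplication webApp = SPWebApplication.Lookup(new Uri(args[0]));

        foreach (SPJobDefinition job in webApp.JobDefinitions)
        {
            if (job.Name == args[1])
            {
                // Queues a one-time run; as with the Run Now button,
                // the job may not start immediately.
                job.RunNow();
                Console.WriteLine("Requested a run of '{0}'.", job.Title);
                return 0;
            }
        }

        Console.WriteLine("No job named '{0}' was found on {1}.", args[1], args[0]);
        return 2;
    }
}
```

The job is matched on its Name, which can differ from the Title displayed in Central Admin.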

 

Summary

Timer jobs are used to schedule one-time or recurring SharePoint tasks. Developers create classes that inherit from SPJobDefinition or a derived child of SPJobDefinition. This post was an example of extending the abstract SPAllSitesJobDefinition class to process the sites contained in the SPContentDatabases of a specific Web application. Extending SPAllSitesJobDefinition provides the base functionality we need to process site by site. In this post we extended the abstract SPAllSitesJobDefinition and implemented the Execute and ProcessSite methods. The ProcessSite method is the heavy lifter of this demo. The Execute method was implemented only to allow the reader to view the call progression from Execute to ProcessSite.
