Simple Publishing Page Import to SharePoint

This article covers how to import publishing pages to SharePoint.

This can help if you need to bulk create pages in SharePoint from data stored in a database or spreadsheet, or if you need to migrate pages from another content management system such as WordPress.

Scenario

We are going to assume that you have your page data in a database. For this example we are going to use MySQL and the WordPress schema. As you will see in the coming steps, this could just as easily be any OLE DB or ODBC data source and any schema.

Import Tool

We will use the free Import for SharePoint toolset to import the pages.

You can download it from here.

When you download and install the import tool, you will also get full documentation and additional example import configuration files, which will help further.

The Source

Column

The source data must contain a column with the HTML mark-up in it.

In our example content this is called ‘PageContent’.  Inside it looks a bit like this.

Select Statement

Now, from the database source, we need to ‘Select’ the data that will create our pages. Since we are working with WordPress in this example, the data is in the wp_posts table.

This select statement is also, cleverly, giving us the destination page name and setting up the page to be automatically published.
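As a rough illustration, a select statement along these lines would return the page content, the title, a destination file name and a publish flag. The table and column names (wp_posts, post_title, post_content, post_name, post_type, post_status) are from the standard WordPress schema. The aliases DestinationFileName and AutoPublish are hypothetical names used only for this sketch, so check the Import for SharePoint documentation for the column names it actually expects.

```sql
-- Sketch only, assuming the default WordPress table prefix (wp_).
-- DestinationFileName and AutoPublish are hypothetical aliases;
-- the real names are defined by the Import for SharePoint documentation.
SELECT
    post_title,                                        -- source column for the Title field
    post_content               AS PageContent,         -- HTML mark-up for the page body
    CONCAT(post_name, '.aspx') AS DestinationFileName, -- page name built from the WordPress slug
    1                          AS AutoPublish          -- constant flag: publish every page
FROM wp_posts
WHERE post_type = 'page'          -- pages only, not posts or revisions
  AND post_status = 'publish';    -- skip drafts and trashed items
```

Filtering on post_type and post_status keeps revisions, drafts and attachments out of the import, since WordPress stores all of these in the same table.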

Import Configuration File

This is the file that tells Import for SharePoint how to create the pages in SharePoint.

The schema is fully explained in the documentation, but the important bits for this exercise are explained here:

DestinationItemType

We want to create publishing pages. We could alternatively create wiki pages, modern SharePoint pages, site pages or blog posts, but let’s stick to the most common type (publishing pages) for now.

<DestinationItemType>PublishingPage</DestinationItemType>

PageLayoutASPXName

If your source select statement does not have a column of this name, then “ArticleLeft.aspx” will be used. If you want to use another page layout, ensure your select statement returns a column of this name containing the name of your desired page layout.

<PageLayoutASPXName>PageLayoutASPXName</PageLayoutASPXName>

ImportMapping

This bit maps your HTML data (in the column PageContent) to the SharePoint page content (field Page Content).

<ImportMapping xsi:type="ImportMapping_String">
  <DestinationField>Page Content</DestinationField>
  <SourceColumn>PageContent</SourceColumn>
</ImportMapping>

Execution

Ok so rather than reinvent the wheel, we’ll let you read the documentation installed with Import for SharePoint for this one.

Result

Ok so originally in WordPress the page looked like this.

[Image: the source page in WordPress]

And now in out-of-the-box SharePoint it looks like this.

[Image: the resulting page in SharePoint]

Great, but seems a bit simplistic

Ok so we have shown how to import publishing pages into SharePoint.

Realistically a project is always going to be more complicated than that.

So let’s talk about real life.

Targeting Branded SharePoint

Page Layout

So the destination is likely to be branded? That’s no problem. We’ve already talked about PageLayoutASPXName, and custom-branded SharePoint really just means using a different page layout.

Content Type Fields

But the destination page has extra fields, like managed metadata “Tags”, a Byline, an Article Date? Again, no problem. You just need more of these ImportMappings to map data from your source into those additional SharePoint fields.

<ImportMapping xsi:type="ImportMapping_String">
  <DestinationField>Title</DestinationField>
  <SourceColumn>post_title</SourceColumn>
</ImportMapping>

Data Manipulation

So what if the source data is not in the exact format that SharePoint needs?

No problem, this manipulation can be done in the SQL select statement.

WordPress was never going to contain a column giving us a file name like “MyPage.aspx”, so we create one on the fly using CONCAT.

If (when?) your manipulations get too complex to include in the SQL statement on the fly, you can manipulate the source table directly. Just make sure you take precautions, such as working from a copy, if the source data is used by anything else.
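One way to work from a copy is sketched below for MySQL. The table name wp_posts_import and the URL being replaced are made up for illustration, and the column list assumes the standard WordPress schema.

```sql
-- Sketch only: wp_posts_import is a hypothetical staging table name.
-- Copy just the rows and columns needed, leaving the live wp_posts table untouched.
CREATE TABLE wp_posts_import AS
SELECT ID, post_title, post_name, post_content
FROM wp_posts
WHERE post_type = 'page'
  AND post_status = 'publish';

-- Example manipulation applied to the copy only:
-- turn absolute links to the old site (placeholder URL) into site-relative links.
UPDATE wp_posts_import
SET post_content = REPLACE(post_content, 'http://old.example.com/', '/');
```

The import select statement can then read from the staging table instead of wp_posts, so any heavier clean-up never touches data that the WordPress site itself still depends on.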

So what does this get used for?

We have seen this approach used for the following:

  • Legacy Content Management System (CMS) migration
  • Bulk creation of pages from Excel
  • Scan to Mark-Up / Republishing – Loading data that has been scanned and OCR’d into pages.
  • WordPress to SharePoint Migration
  • Drupal to SharePoint Migration
  • Joomla to SharePoint Migration
  • Custom Intranet to SharePoint Migration

Great, makes more sense now, but I’m still a bit unsure

No problem, just get in touch.