Version Information: This was written based on WordPress 4.6.1

I was recently tasked with migrating content from a couple of very old Drupal sites to WordPress: one on Drupal 5 and one on Drupal 6. Using the Digital Ambit Data Integration Framework and mimicking the design patterns of the more modern Drupal Migration API, I put together a quick-and-dirty plug-in that I have put up on GitHub.

It uses a custom MySQL table to track which content has been migrated successfully. This lets me run the migration in batches and also lets me create external processes that act on the data. I used external processes to update URLs to new patterns, create .htaccess redirects, and replace old image links with new image links.
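As an illustration of one such external process, here is a minimal sketch that turns the NID-to-post mapping into .htaccess redirect lines. The function name build_redirect_lines is hypothetical, and the sketch assumes the old sites used Drupal's default node/NID paths and that the new WordPress paths have already been looked up; it is not taken from the plug-in itself.

```php
<?php
/**
 * Hypothetical helper: turns a map of Drupal node IDs to new WordPress
 * paths into Apache mod_alias redirect lines.
 * Assumes the old site used Drupal's default "node/NID" paths.
 */
function build_redirect_lines(array $nid_to_path)
{
    $lines = [];
    foreach ($nid_to_path as $nid => $new_path) {
        // Normalise the target so it always starts with a single slash.
        $target = '/' . ltrim($new_path, '/');
        $lines[] = sprintf('Redirect 301 /node/%d %s', $nid, $target);
    }
    return $lines;
}

// Example: node 42 now lives at /2016/10/hello-world/
// prints: Redirect 301 /node/42 /2016/10/hello-world/
echo implode("\n", build_redirect_lines([42 => '/2016/10/hello-world/'])), "\n";
```

The output can be pasted into the old site's .htaccess file. If the old site used path aliases instead of node/NID URLs, the left-hand side would need to come from Drupal's alias table instead.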

The code portion of the project hinges on WordPress' plug-in system and the wp_insert_post() function. This function takes a data array documented at codex.wordpress.com. When the function succeeds, it returns the new WordPress Post ID, which lets us use the add_post_meta() function to attach additional data. The rest of the plug-in code is there to hook into the WordPress admin menu system and to select and mold the Drupal content into the format WordPress expects. Below are the details.
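The insert-then-record flow described above could be sketched roughly as follows. The function name import_one_row and the drupal_nid meta key are assumptions made for illustration, and the three callables are injected only so the sketch can run outside WordPress; inside the plug-in you would pass the real WordPress functions.

```php
<?php
/**
 * Hypothetical helper: inserts one prepared row and records the
 * Drupal NID to WordPress Post ID mapping. The three callables are
 * injected so the sketch runs outside WordPress.
 */
function import_one_row(array $row, callable $insert_post, callable $add_meta, callable $record_map)
{
    // drupal_nid is bookkeeping, not a WordPress post field.
    $nid = (int) $row['drupal_nid'];
    unset($row['drupal_nid']);

    $post_id = $insert_post($row);

    // wp_insert_post() returns 0 (or a WP_Error object) on failure.
    if (!is_int($post_id) || $post_id <= 0) {
        return 0;
    }

    // Keep the old node ID on the post and in the helper table.
    $add_meta($post_id, 'drupal_nid', $nid);
    $record_map($nid, $post_id);
    return $post_id;
}

// Inside WordPress (assuming $db is a wpdb instance that can reach
// the helper table) the call might look like:
//
// import_one_row($row, 'wp_insert_post', 'add_post_meta',
//     function ($nid, $post_id) use ($db) {
//         $db->insert(
//             'drupal_import_nid_map',
//             ['drupal_nid' => $nid, 'wordpress_post_id' => $post_id],
//             ['%d', '%d']
//         );
//     });
```

Recording the mapping only after a successful insert is what makes batching safe: a row that fails stays unmapped and is picked up again on the next run.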

Data Mapping

| Drupal Database Table.Field | WordPress wp_insert_post Data Array Field | Notes |
| --- | --- | --- |
| node.nid | N/A | This is not stored in WordPress, but is stored in the Drupal NID to WordPress Post ID helper table. |
| N/A | ID | Hardcoded to 0 |
| node.uid | post_author | I use a helper function to map node.uid to the post_author ID, and set up the users in WordPress before running the migration. |
| node_revisions.title | post_title | |
| node_revisions.body | post_content | |
| node_revisions.teaser | post_excerpt | |
| node_revisions.timestamp | post_date | Formatted as Y-m-d |
| node.status | post_status | Map Drupal's 1 or 0 bit to "publish" or "draft" |
| term_data.name | post_category | I use a helper function to map term_data.name to the correct WordPress Category ID, and set up the Categories in WordPress before running the migration. |
| N/A | post_type | Hardcoded to "post" for this migration, but a helper function of node.type could be used. |
| N/A | comment_status | Hardcoded to "closed" |
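The helper table mentioned in the first row is not spelled out in this post, so here is a minimal sketch of what it could look like. The table and column names match the query in load_data() further down; the column types, the primary key, and the function name drupal_import_map_table_sql are assumptions.

```php
<?php
/**
 * Hypothetical helper: returns a CREATE TABLE statement for the
 * NID-to-Post-ID map. Table and column names come from the query in
 * load_data(); the column types and keys are assumptions.
 */
function drupal_import_map_table_sql($table = 'drupal_import_nid_map')
{
    return "CREATE TABLE IF NOT EXISTS `{$table}` (\n"
        . "  drupal_nid INT UNSIGNED NOT NULL,\n"
        . "  wordpress_post_id BIGINT UNSIGNED NULL,\n"
        . "  PRIMARY KEY (drupal_nid)\n"
        . ")";
}

// Print the statement so it can be run once against the database
// that holds the Drupal tables.
echo drupal_import_map_table_sql(), "\n";
```

Making drupal_nid the primary key means a node can only ever be mapped once, which guards against importing the same content twice.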

Standard WordPress Plug-In

I took the part of the code that makes the plug-in show up as a menu item and page in the WordPress admin menu straight from the documentation. I added it to the plugins directory as a single file.

<?php
/*
Plugin Name: Generic Content Import Tool
Description: One Time Import of Old Drupal Content
*/


// standard WordPress menu hook
add_action( 'admin_menu', 'drupal_import_menu' );

/**
 * A standard WordPress plugin menu item
 */
function drupal_import_menu()
{
    // the page callback is passed by name, as a string
    add_menu_page( "Drupal Import", "Drupal Import", 'import', 'drupal-import', 'show_drupal_import_menu' );
}

/**
 * A standard WordPress plugin menu page
 * This has an "if is a POST, process the form"
 * and then shows the form through normal echo of HTML
 */
function show_drupal_import_menu()
{
    if ( !current_user_can( 'import' ) ) {
        wp_die( __( 'You do not have sufficient permissions to access this page.' ) );
    }

    if ( isset( $_POST['drupal_import_hidden'] ) && $_POST['drupal_import_hidden'] == 'Y' ) {

        // the bulk of the logic goes here
    }

    echo '<form method="post" style="padding: 50px">';
    echo '<input type="hidden" name="drupal_import_hidden" value="Y" />';
    echo '<input type="submit" value="Process" />';
    echo '</form>';
}

The Data Selection and Mapping Code

The helper functions that do the data mappings are straightforward if you are familiar with the Drupal node and node_revisions tables.

function load_data($db, $limit = 10)
{
    // get X not-yet-migrated nodes where X is the passed in limit
    $sql = sprintf(
        "select n.nid, n.uid, nr.title, nr.body, nr.teaser, nr.timestamp, n.status "
        . "from node_revisions nr "
        . "join node n on n.nid = nr.nid and n.vid = nr.vid and n.type = 'story' "
        . "left join drupal_import_nid_map m on nr.nid = m.drupal_nid "
        . "where m.wordpress_post_id is null "
        . "order by n.created limit %d",
        $limit
    );
    $rows = $db->get_results($sql);
    $data = [];
    foreach ($rows as $obj) {

        // grab the categories associated with this node and
        // store them in an array of categoryIDs
        $categories = [];
        $get_categories_sql = sprintf("select name from term_data d join term_node n on n.tid = d.tid and n.nid = %d", $obj->nid);
        $cats = $db->get_results($get_categories_sql);
        foreach ($cats as $catObj) {
            $categories[] = get_wp_cat_id($catObj->name);
        }

        $data[] = [
            'ID' => 0,
            'post_author' => get_wp_user_id($obj->uid), // maps Drupal UID to WP User ID
            'post_date' => date('Y-m-d', $obj->timestamp), // formats Drupal timestamp to standard date
            'post_content' => $obj->body, // HTML
            'post_title' => $obj->title,
            'post_excerpt' => $obj->teaser,
            'post_status' => $obj->status == 1 ? 'publish' : 'draft',
            'post_type' => 'post', // I only have one type, but a helper function could be used if you have more
            'comment_status' => 'closed',
            'post_category' => $categories, // see above
            'drupal_nid' => $obj->nid // only used by a later function, ignored by WordPress
        ];
    }
    return $data;
}

function get_wp_cat_id($drupal_cat_name)
{
    switch ($drupal_cat_name) {
        case 'Old Category Name':
            return 2; // New Category ID
        default:
            return 1; // uncategorized
    }
}

function get_wp_user_id($drupal_uid)
{
    switch ($drupal_uid) {
        case 301:
            return 2; // New User ID
        default:
            return 1; // Admin
    }
}
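If you have more than one Drupal content type, the post_type helper mentioned earlier could follow the same pattern. This is a sketch only: the function name get_wp_post_type and the type names in the lookup are assumptions, and load_data() would also need to select n.type and drop its n.type = 'story' filter for it to be useful.

```php
<?php
/**
 * Hypothetical helper: maps a Drupal node.type value to a WordPress
 * post type. The type names in the lookup are examples only.
 */
function get_wp_post_type($drupal_node_type)
{
    $map = [
        'story' => 'post',
        'page'  => 'page',
    ];

    // Fall back to a regular post for any type not listed above.
    return isset($map[$drupal_node_type]) ? $map[$drupal_node_type] : 'post';
}
```

A lookup array is used here instead of a switch because the list tends to grow as more content types turn up, but either form works.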

Full Code Available on GitHub

There is a little more to the code than what I have included here. The full code can be seen on GitHub at https://gist.github.com/carsonevans/eead315018a0c54f21254325c5b5003d.
