Thursday, October 14, 2010

Migrating Umbraco from MySQL to MSSQL

So you've been running the Umbraco CMS for a while on MySQL, but you've been told you need to move it to Microsoft SQL Server. Does it sound intimidating? It doesn't have to be. With the SQL script and instructions in this post, it's not too bad. First, let's start with an overview of the process.

Overview:
1. Use the Umbraco installer scripts to create blank Umbraco tables on your SQL server
2. Set up a linked database in SQL Server to your MySQL database
3. Migrate the data (using my script below)
4. Update your web.config to point to the new database


Step 1 - Create a new database on SQL Server
First off, let's hope you kept the original zip file you downloaded from the Umbraco site. Create a second IIS site and unzip the fresh Umbraco files there. Browse to the /install directory, and run through just enough steps to create a database on your SQL server. Once that database is in place, you can get rid of this new IIS site


Step 2 - Add MySQL database as linked
There is much better information at CodeProject, but the basic idea involes installing the MySQL ODBC driver, add your MySQL server as a datasource, and add a 'Linked Server' in SQL Management Studio to your MySQL server as an ODBC connection.


Step 3 - Migrate Data
You will want to customize the @dbName, @LinkedServerName, and @LinkedDbName variables appropriately, but the rest of it should work automatically. What's happening here is that for each table we use default database collation for comparing strings, turn off IDENTITY_INSERT on tables with identity fields, and determine what fields uniquely identify each table. We then copy all data from the MySQL database to the SQL Server database, except for default records that already exist on tables in SQL Server.
DECLARE @hasIdentity bit
DECLARE @dbName varchar(20)
DECLARE @fieldNames varchar(1000)
DECLARE @LinkedServerName varchar(50)
DECLARE @LinkedDbName varchar(50)
DECLARE @tName sysname
DECLARE @fName sysname
DECLARE @IdentityFieldName sysname
DECLARE @sql varchar(8000)
DECLARE @cSQL nvarchar(255)
DECLARE @iSQL nvarchar(255)
DECLARE @numBefore INT
DECLARE @numAfter INT
DECLARE @numDesired INT
DECLARE @pkFieldName sysname
DECLARE @pkFieldType varchar(50)
DECLARE @collateOptions varchar(100)


SET NOCOUNT ON

SET @dbName = 'Umbraco'
SET @LinkedServerName = 'Umbraco_old'
SET @LinkedDbName = 'wesupport'

-- FOR EACH table @tName in the @dbName database
IF OBJECT_ID('tempdb..#umbTables') IS NOT NULL
DROP TABLE #umbTables
SELECT TABLE_NAME INTO #umbTables
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_CATALOG = @dbName ORDER BY TABLE_NAME
WHILE EXISTS (SELECT TOP 1 TABLE_NAME FROM #umbTables)
BEGIN
SELECT TOP 1 @tName = TABLE_NAME from #umbTables

-- Find out if the table has an identity column
SELECT @hasIdentity = OBJECTPROPERTY(OBJECT_ID(@tName),
'TableHasIdentity')
-- Allow us to mess with the identity field explicitly
SET @sql = ''
IF @hasIdentity = 1 BEGIN
SET @sql = 'SET IDENTITY_INSERT ' +
@tName + ' ON '
END

-- Get a list of field names
-- separated by commas
SET @fieldNames = ''
IF OBJECT_ID('tempdb..#umbFields') IS NOT NULL
DROP TABLE #umbFields
SELECT COLUMN_NAME INTO #umbFields
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_CATALOG = @dbName
AND TABLE_NAME = @tName
WHILE EXISTS (SELECT TOP 1 COLUMN_NAME
FROM #umbFields)
BEGIN
SELECT TOP 1 @fName = COLUMN_NAME
from #umbFields
SET @fieldNames =
@fieldNames + ' [' + @fName + '] '
DELETE FROM #umbFields
WHERE COLUMN_NAME = @fName
IF EXISTS (SELECT TOP 1 COLUMN_NAME
from #umbFields)
SET @fieldNames = @fieldNames + ' , '

-- Figure out if this field
-- is an identity field
IF @hasIdentity = 1 AND
COLUMNPROPERTY(OBJECT_ID(@tName),
@fName, 'IsIdentity') = 1
SET @IdentityFieldName = @fName
END
IF OBJECT_ID('tempdb..#umbFields') IS NOT NULL
DROP TABLE #umbFields

SET @cSQL = 'SELECT @i = COUNT(*) FROM ' + @tName
EXEC sp_executesql @cSQL, N'@i INT OUT',
@numBefore OUTPUT

-- OK, now we copy the data as needed
SET @sql = @sql + CHAR(10) +
' INSERT INTO ' + @tName +
' ( ' + @fieldNames + ' ) ' +
' SELECT ' + @fieldNames + ' FROM ' +
' openquery(' + @LinkedServerName +
', ''SELECT * FROM ' +
@LinkedDbName + '.' + @tName +
''') newstuff ' +
' WHERE NOT EXISTS ( SELECT * FROM ' +
@tName + ' mytable WHERE '

-- Get a list of primary keys into temporary table
IF OBJECT_ID('tempdb..#umbPKeys') IS NOT NULL
DROP TABLE #umbPKeys
CREATE TABLE #umbPKeys
(COLUMN_NAME sysname, DATA_TYPE varchar(50))
IF EXISTS (SELECT B.COLUMN_NAME
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS A,
INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE B
WHERE CONSTRAINT_TYPE = 'PRIMARY KEY'
AND A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
AND A.TABLE_NAME = @tName)
INSERT INTO #umbPKeys SELECT B.COLUMN_NAME,
C.DATA_TYPE
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS A,
INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE B
INNER JOIN INFORMATION_SCHEMA.COLUMNS C
ON B.TABLE_NAME = C.TABLE_NAME
AND B.COLUMN_NAME = C.COLUMN_NAME
AND B.TABLE_CATALOG = C.TABLE_CATALOG
WHERE CONSTRAINT_TYPE = 'PRIMARY KEY'
AND A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
AND A.TABLE_NAME = @tName
ELSE -- The table has no primary keys, so use
-- all fields to match records
INSERT INTO #umbPKeys
SELECT B.COLUMN_NAME, B.DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS B
WHERE TABLE_CATALOG = @dbName
AND TABLE_NAME = @tName
AND DATA_TYPE != 'ntext'

-- For each primary key in this table
WHILE EXISTS (SELECT TOP 1 COLUMN_NAME
FROM #umbPKeys)
BEGIN
SELECT TOP 1 @pkFieldName = COLUMN_NAME,
@pkFieldType = DATA_TYPE from #umbPKeys

SET @collateOptions = ''
IF @pkFieldType NOT IN ('int',
'uniqueidentifier', 'smallint',
'tinyint', 'bigint', 'bit', 'datetime')
SET @collateOptions =
' COLLATE DATABASE_DEFAULT '

SET @sql = @sql + ' mytable.[' + @pkFieldName +
'] ' + @collateOptions +
' = newstuff.[' + @pkFieldName + '] ' +
@collateOptions

DELETE FROM #umbPKeys
WHERE COLUMN_NAME = @pkFieldName
IF EXISTS (SELECT TOP 1 COLUMN_NAME
FROM #umbPKeys)
SET @sql = @sql + ' AND '
END

SET @sql = @sql + ' ) ' + CHAR(10)

IF @hasIdentity = 1
SET @sql = @sql + 'SET IDENTITY_INSERT ' +
@tName + ' OFF '

EXEC(@sql)
-- PRINT @sql

SET @cSQL = 'SELECT @i = COUNT(*) FROM ' + @tName
EXEC sp_executesql @cSQL, N'@i INT OUT',
@numAfter OUTPUT
SET @cSQL = 'SELECT @i = COUNT(*) FROM ' +
' openquery(' + @LinkedServerName + ', ' +
'''SELECT * FROM ' + @LinkedDbName + '.' +
@tName + ''')'
EXEC sp_executesql @cSQL, N'@i INT OUT',
@numDesired OUTPUT

PRINT @tName + ': From ' +
CAST(@numBefore AS VARCHAR) +
' to ' + CAST(@numAfter AS VARCHAR) +
' (Goal: ' + CAST(@numDesired AS VARCHAR)
+ ')'

IF OBJECT_ID('tempdb..#umbPKeys') IS NOT NULL
DROP TABLE #umbPKeys

DELETE FROM #umbTables WHERE TABLE_NAME = @tName
SET @IdentityFieldName = ''
END

-- Cleanup
IF OBJECT_ID('tempdb..#umbTables') IS NOT NULL DROP TABLE #umbTables
SET NOCOUNT OFF

4. Update web.config
Go back to your original website and, after backing up your web.config file, open it up and find the setting under Configuration - AppSettings - umbracoDbDSN. Update it so it looks like this, but have it all on a single line (with your settings, of course):

<add key="umbracoDbDSN" value="server=myServerName; database=myDatabaseName;user id=myUserID; password=myPassword">

Be sure to clear your web browser's cache after doing all of this to make sure that it picks up a fresh copy of your site. And you should now have it connected to SQL Server. If you want, you can remove the linked server in SQL Management Studio to clean things up.

7 comments:

Anonymous said...

I can't wait to try this out. I'd like to migrate simply because sql server seems to be the preferred DB, or at least you know you can always run the betas on it, whereas you may not be able to on mysql without mods.

Tim said...

Let me know how it goes for you, or if you run into any trouble. I've only used this on a single migration (Umbraco v 4.0.2.1), and I'd appreciate getting some feedback on it.

Lars Skjoldby said...

Part 1:

DECLARE @hasIdentity bit
DECLARE @dbName varchar(20)
DECLARE @dbUser varchar(20)
DECLARE @fieldNames varchar(1000)
DECLARE @LinkedServerName varchar(50)
DECLARE @LinkedDbName varchar(50)
DECLARE @tName sysname
DECLARE @fName sysname
DECLARE @IdentityFieldName sysname
DECLARE @sql varchar(8000)
DECLARE @cSQL nvarchar(255)
DECLARE @iSQL nvarchar(255)
DECLARE @numBefore INT
DECLARE @numAfter INT
DECLARE @numDesired INT
DECLARE @pkFieldName sysname
DECLARE @pkFieldType varchar(50)
DECLARE @collateOptions varchar(100)


SET NOCOUNT ON

SET @dbName = 'umbracoDB'
SET @LinkedServerName = 'MysqlServ'
SET @LinkedDbName = 'umbracoDBMySQL'
SET @dbUser = 'umbracoDBuser'

-- FOR EACH table @tName in the @dbName database
IF OBJECT_ID('tempdb..#umbTables') IS NOT NULL
DROP TABLE #umbTables
SELECT TABLE_NAME INTO #umbTables
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_CATALOG = @dbName ORDER BY TABLE_NAME
WHILE EXISTS (SELECT TOP 1 TABLE_NAME FROM #umbTables)
BEGIN
SELECT TOP 1 @tName = TABLE_NAME from #umbTables

-- Find out if the table has an identity column
SELECT @hasIdentity = OBJECTPROPERTY(OBJECT_ID(@dbUser +'.'+ @tName),
'TableHasIdentity')
-- Allow us to mess with the identity field explicitly
SET @sql = ''
IF @hasIdentity = 1 BEGIN
SET @sql = 'SET IDENTITY_INSERT ' +
@dbUser +'.'+ @tName + ' ON '
END

-- Get a list of field names
-- separated by commas
SET @fieldNames = ''
IF OBJECT_ID('tempdb..#umbFields') IS NOT NULL
DROP TABLE #umbFields
SELECT COLUMN_NAME INTO #umbFields
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_CATALOG = @dbName
AND TABLE_NAME = @tName
WHILE EXISTS (SELECT TOP 1 COLUMN_NAME
FROM #umbFields)
BEGIN
SELECT TOP 1 @fName = COLUMN_NAME
from #umbFields
SET @fieldNames =
@fieldNames + ' [' + @fName + '] '
DELETE FROM #umbFields
WHERE COLUMN_NAME = @fName
IF EXISTS (SELECT TOP 1 COLUMN_NAME
from #umbFields)
SET @fieldNames = @fieldNames + ' , '

Lars Skjoldby said...

Part 2:
-- Figure out if this field
-- is an identity field
IF @hasIdentity = 1 AND
COLUMNPROPERTY(OBJECT_ID(@dbUser +'.'+ @tName),
@fName, 'IsIdentity') = 1
SET @IdentityFieldName = @fName
END
IF OBJECT_ID('tempdb..#umbFields') IS NOT NULL
DROP TABLE #umbFields

SET @cSQL = 'SELECT @i = COUNT(*) FROM ' + @dbUser +'.'+ @tName
EXEC sp_executesql @cSQL, N'@i INT OUT',
@numBefore OUTPUT

-- OK, now we copy the data as needed
SET @sql = @sql + CHAR(10) +
'INSERT into ' + @dbUser + '.' + @tName +
' ( ' + @fieldNames + ' ) ' +
' SELECT ' + @fieldNames + ' FROM ' +
' openquery(' + @LinkedServerName +
', ''SELECT * FROM ' +
@LinkedDbName + '.' + UPPER(@tName) +
''') newstuff ' +
' WHERE NOT EXISTS ( SELECT * FROM ' + @dbUser + '.' +
@tName + ' mytable WHERE '

-- Get a list of primary keys into temporary table
IF OBJECT_ID('tempdb..#umbPKeys') IS NOT NULL
DROP TABLE #umbPKeys
CREATE TABLE #umbPKeys
(COLUMN_NAME sysname, DATA_TYPE varchar(50))
IF EXISTS (SELECT B.COLUMN_NAME
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS A,
INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE B
WHERE CONSTRAINT_TYPE = 'PRIMARY KEY'
AND A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
AND A.TABLE_NAME = @tName)
INSERT INTO #umbPKeys SELECT B.COLUMN_NAME,
C.DATA_TYPE
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS A,
INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE B
INNER JOIN INFORMATION_SCHEMA.COLUMNS C
ON B.TABLE_NAME = C.TABLE_NAME
AND B.COLUMN_NAME = C.COLUMN_NAME
AND B.TABLE_CATALOG = C.TABLE_CATALOG
WHERE CONSTRAINT_TYPE = 'PRIMARY KEY'
AND A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
AND A.TABLE_NAME = @tName
ELSE -- The table has no primary keys, so use
-- all fields to match records
INSERT INTO #umbPKeys
SELECT B.COLUMN_NAME, B.DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS B
WHERE TABLE_CATALOG = @dbName
AND TABLE_NAME = @tName
AND DATA_TYPE != 'ntext'

-- For each primary key in this table
WHILE EXISTS (SELECT TOP 1 COLUMN_NAME
FROM #umbPKeys)
BEGIN
SELECT TOP 1 @pkFieldName = COLUMN_NAME,
@pkFieldType = DATA_TYPE from #umbPKeys

SET @collateOptions = ''
IF @pkFieldType NOT IN ('int',
'uniqueidentifier', 'smallint',
'tinyint', 'bigint', 'bit', 'datetime')
SET @collateOptions =
' COLLATE DATABASE_DEFAULT '

SET @sql = @sql + ' mytable.[' + @pkFieldName +
'] ' + @collateOptions +
' = newstuff.[' + @pkFieldName + '] ' +
@collateOptions

DELETE FROM #umbPKeys
WHERE COLUMN_NAME = @pkFieldName
IF EXISTS (SELECT TOP 1 COLUMN_NAME
FROM #umbPKeys)
SET @sql = @sql + ' AND '
END

SET @sql = @sql + ' ) ' + CHAR(10)

IF @hasIdentity = 1
SET @sql = @sql + 'SET IDENTITY_INSERT ' +
@dbUser +'.'+ @tName + ' OFF '

EXEC(@sql)
-- PRINT @sql

SET @cSQL = 'SELECT @i = COUNT(*) FROM ' + @dbUser +'.'+@tName
EXEC sp_executesql @cSQL, N'@i INT OUT',
@numAfter OUTPUT
SET @cSQL = 'SELECT @i = COUNT(*) FROM ' +
' openquery(' + @LinkedServerName + ', ' +
'''SELECT * FROM ' + @LinkedDbName + '.' +
UPPER(@tName) + ''')'
EXEC sp_executesql @cSQL, N'@i INT OUT',
@numDesired OUTPUT

PRINT @tName + ': From ' +
CAST(@numBefore AS VARCHAR) +
' to ' + CAST(@numAfter AS VARCHAR) +
' (Goal: ' + CAST(@numDesired AS VARCHAR)
+ ')'

IF OBJECT_ID('tempdb..#umbPKeys') IS NOT NULL
DROP TABLE #umbPKeys

DELETE FROM #umbTables WHERE TABLE_NAME = @tName
SET @IdentityFieldName = ''
END

-- Cleanup
IF OBJECT_ID('tempdb..#umbTables') IS NOT NULL DROP TABLE #umbTables
SET NOCOUNT OFF

Lars Skjoldby said...

Dont know why but my explanation is missing.

Had problems with your script on an 4.5.2 install.

The MySQL tables are in upper case but i fixed that the UPPER() function.

Newer versions of umbraco uses the dbuser as object owner, not dbo. Fixed that with a new variable @dbUser.

How ever I'm stock at this error:
The INSERT statement conflicted with the FOREIGN KEY constraint "FK_cmsContent_umbracoNode". The conflict occurred in database "UmbracoDB", table "UmbracoUser.umbracoNode", column 'id'.

Can anybody help?

/Lars Skjoldby

steroberts89 said...

Thanks, This really helped!

Anonymous said...

Hi,
Thanks for the great info! I am also seeing the FK errors. Any solution to this? (Umbraco V4.9.1)

Thank you!