Why transfer databases?
When assessing the value of a database system for your server environment, one important objective is to get your existing databases into it, and out of it. This article is all about getting databases from SQL Server to MongoDB – and vice versa.
Whenever I’ve had to change database strategy in a large organisation by introducing a new database system, the same question always comes up:
“How easy is it to move our databases to the new platform, and back again if things don’t work out?”
Many other questions come up too, but there is always that underlying anxiety that any new proprietary system could be one that is difficult to extricate ourselves from.
Data represents the most precious jewels that can get accidentally lost in a swamp. Procedures generally resist being ported because there is so little standardisation in database procedure code, but data is easier to move.
It is always slightly unfair to judge different database systems on their data interchange, especially when there are such fundamental differences between a document and relational database. However, it is always reassuring when you find out that it is possible.
Import from SQL Server to MongoDB
There are a number of approaches to importing data from SQL Server.
You can use a coding approach with the MongoDB .NET driver, together with SqlClient or SMO. You can use Node.js with the Express plugins, you can use a MongoDB GUI tool such as Studio 3T, you can do a file-based transfer, or you can use a commercial ODBC driver.
There may be other techniques that I haven’t tried. In this article, we’ll focus on the versatile file-based transfer, because for me that gets the quickest results.
File-based database transfer
MongoDB is a bit coy about importing and exporting data between systems. The story is that you just import JSON documents into a collection, and export data the same way. This is certainly partly true, but there are snags. The first one is ensuring the right data type.
The data type issues
Whereas one can be fairly relaxed when throwing data at the table of an RDBMS from JSON, the same isn’t true of a BSON collection. Why?
A relational table contains all the rules that govern the type of data that can be inserted. It has defined data types for each column, and constraints that define whether NULLs are allowed, whether values must be unique, and whether the values conform to the rules of the data in the table. You can, for example, check that identification numbers or geographic codes follow their rules as the data is inserted.
MongoDB has two devices for ensuring data integrity with a JSON import. It has adopted the JSON Schema system for checking the data, and it has adopted Extended JSON as its standard for importing and exporting BSON data with data types.
JSON Schema is only useful where a collection is organised enough that you have some idea of what data is in it and which keys you are using. The closer a collection gets to being a table, the simpler and more valuable the schema becomes.
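As a rough illustration (a hand-written sketch, not something generated by any of the tools in this article), a validator for the Addresses collection used in the next example might look something like this, attached when the collection is created (or added later with collMod):

```js
// A minimal $jsonSchema validator; the field names mirror the Addresses CSV below
db.createCollection("Addresses", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["Name", "Address", "City", "State", "ZIP"],
      properties: {
        Name:          { bsonType: "string" },
        Address:       { bsonType: "string" },
        City:          { bsonType: "string" },
        State:         { bsonType: "string" },
        ZIP:           { bsonType: "string" },
        InsertionDate: { bsonType: "date" }
      }
    }
  }
})
```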
Extended JSON defines the data type along with every value that isn’t obvious. It can add another third to the size of the data, but it is safe for relatively unstructured data. Its main advantage is that it can be read by any system that can read standard JSON, because the data type information is recorded in JSON.
Are we being fussy here about data types? Well, I don’t think so. If you import a JSON collection that includes a date, it is interpreted as a string because JSON has no standard way of representing a date. This is OK until you want to sort documents according to date, and then it is tricky.
If you want to index by date, you are lost. There are many differences between a decimal number and a floating-point number, as you will soon find out if you try using floating point for financial calculations! If you store binary information, you need to retain the encoding information with it. Data types really matter.
If the data that you are importing is tabular, then there is a simple way of doing it. You can use CSV or TSV with the --columnsHaveTypes switch and provide field specifications.
In PowerShell we can demonstrate this easily:
```powershell
$mongoDirectory = 'C:\Program Files\MongoDB\Server\4.0\bin'
Set-Alias mongoimport "$mongoDirectory\mongoimport.exe"
$CSVfile = "$env:APPDATA\Addresses.csv"
'Name.string(),Address.string(),City.string(),State.string(),ZIP.string(),InsertionDate.date_ms(yyyy-MM-dd H:mm:ss)
Andy Capp,123 High St,Jacksonville,CA,90213,2012-05-31 12:00:00
Bowler Hatt,243 New Road,New York,NY,10860,2011-07-03 18:00:00' |
  out-file -Force -FilePath $CSVfile -Encoding utf8
mongoimport -d Sample -c Addresses --type csv --file $CSVfile --headerline --columnsHaveTypes
<#
And in the database collection, we have the correct datatypes.
{
    "_id" : ObjectId("5c596f49e3e8eaa58a627635"),
    "Name" : "Bowler Hatt",
    "Address" : "243 New Road",
    "City" : "New York",
    "State" : "NY",
    "ZIP" : "10860",
    "InsertionDate" : ISODate("2011-07-03T18:00:00.000+0000")
}
// ----------------------------------------------
{
    "_id" : ObjectId("5c596f49e3e8eaa58a627636"),
    "Name" : "Andy Capp",
    "Address" : "123 High St",
    "City" : "Jacksonville",
    "State" : "CA",
    "ZIP" : "90213",
    "InsertionDate" : ISODate("2012-05-31T12:00:00.000+0000")
}
#>
```
Actually, mongoimport is a bit broken, in that it sends verbose content and error messages to unusual places, and so will not work properly in PowerShell. (By ‘properly’, I mean that it fires spurious errors in PowerShell, and it is easy to end up not seeing the real errors.)
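If you want to stay in PowerShell, one workaround (a sketch, assuming the noise is simply being written to stderr) is to merge the error stream into the output stream and turn the records back into plain text so they don’t surface as spurious errors:

```powershell
# A sketch: merge mongoimport's stderr (progress and log messages) into the
# output stream, then convert the resulting records back into plain strings.
& mongoimport -d Sample -c Addresses --type csv --file $CSVfile `
              --headerline --columnsHaveTypes 2>&1 |
    ForEach-Object { $_.ToString() }
```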
It is probably easier to use the old Windows command line to try this out:
```batch
Set csvfile=%appdata%\addresses.csv
Set MongoImport=%programFiles%\MongoDB\Server\4.0\bin\mongoimport.exe
chcp 65001
echo Name.string(),Address.string(),City.string(),State.string(),ZIP.string(),InsertionDate.date_ms(yyyy-MM-dd H:mm:ss)>%csvfile%
echo Andy Capp,123 High St,Jacksonville,CA,90213,2012-05-31 12:00:00>>%csvfile%
echo Bowler Hatt,243 New Road,New York,NY,10860,2011-07-03 18:00:00>>%csvfile%
"%mongoimport%" -d Sample -c Addresses --type csv --file %csvfile% --headerline --columnsHaveTypes
```
You wouldn’t want to add a header line to a large pre-existing CSV file, so you can instead put the field specifications in a separate file.
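For example, the field specifications can live in their own file and be passed with mongoimport’s --fieldFile option. A sketch, assuming a data file with no header line (the file addresses_noheader.csv is hypothetical; the field names and types mirror the earlier example):

```batch
Rem A sketch: keep the field specifications (one per line) in their own file and
Rem import a headerless CSV file, using --fieldFile in place of --headerline.
Set fieldfile=%appdata%\addressfields.txt
echo Name.string()>%fieldfile%
echo Address.string()>>%fieldfile%
echo City.string()>>%fieldfile%
echo State.string()>>%fieldfile%
echo ZIP.string()>>%fieldfile%
echo InsertionDate.date_ms(yyyy-MM-dd H:mm:ss)>>%fieldfile%
"%mongoimport%" -d Sample -c Addresses --type csv --file %appdata%\addresses_noheader.csv --fieldFile %fieldfile% --columnsHaveTypes
```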
The main problem we have with CSV is that MongoDB sticks, sensibly, to the Common Format and MIME Type for Comma-Separated Values (CSV) Files (RFC 4180), and SQL Server doesn’t.
CSV generated by SQL Server may or may not work and, where it doesn’t, it messes up without reporting errors. In other words: don’t! TAB-separated files are fine until you have a tab character in your column values (the IANA standard for TSV simply forbids tabs within values).
If you want to import data from MongoDB into SQL Server, all you need to do is to use a JSON export because of all the checking that is done at the receiving end.
To import into MongoDB using mongoimport, the safest approach is Extended JSON.
The MongoDB GUI Studio 3T allows a variety of import methods, including to and from SQL, using a SQL connection that can read two million records in three minutes, but this is another, and swiftly moving, topic.
The primary key issue
Generally speaking, relational database tables have either a primary key or a unique constraint. Without them, you can’t easily retrieve a unique row.
MongoDB collections are built with a clustered index. By default, this is just the _id field, holding a random ObjectId.
This wastes a wonderful indexing opportunity, because a clustered index is usually an ideal candidate for a primary key. If you import a tabular database into MongoDB, it will work surprisingly well if you index it properly: by creating meaningful clustered indexes, you create a free and appropriate index for every table.
I test to see whether the primary key is compound or represented by a single column and, if it is a single column, I use it for the _id field. I also keep the key as-is, so whoever uses the database has both options.
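The check itself is simple enough. A minimal sketch in T-SQL (using the pubs jobs table as the example) asks how many columns of the result set take part in a unique key; the full version appears in the stored procedure later in this article:

```sql
-- A sketch: if exactly one column is part of a unique key, that column can
-- double as the MongoDB _id.
SELECT name, is_part_of_unique_key
  FROM sys.dm_exec_describe_first_result_set(
         N'SELECT * FROM dbo.jobs', NULL, 1)
 WHERE is_part_of_unique_key = 1;
```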
Here is an example of AdventureWorks, the classic SQL Server sample database, ported to MongoDB.
Using Studio 3T, I’ve done the classic NAD (Name And aDdress) view of employees in the SQL Query tab.
```sql
SELECT e.BusinessEntityID, p.Title, p.FirstName, p.MiddleName, p.LastName,
       p.Suffix, e.JobTitle, pp.PhoneNumber, pnt.Name, ea.EmailAddress,
       p.EmailPromotion, a.AddressLine1, a.AddressLine2, a.City, sp.Name,
       a.PostalCode, cr.Name, p.AdditionalContactInfo
  FROM "[HumanResources].[Employee]" AS e
       INNER JOIN "[Person].[Person]" AS p
          ON p._id = e.BusinessEntityID
       INNER JOIN "[Person].[BusinessEntityAddress]" AS bea
          ON bea.BusinessEntityID = e.BusinessEntityID
       INNER JOIN "[Person].[Address]" AS a
          ON a._id = bea.AddressID
       INNER JOIN "[Person].[StateProvince]" AS sp
          ON sp._id = a.StateProvinceID
       INNER JOIN "[Person].[CountryRegion]" AS cr
          ON cr._id = sp.CountryRegionCode
       LEFT OUTER JOIN "[Person].[PersonPhone]" AS pp
          ON pp.BusinessEntityID = p.BusinessEntityID
       LEFT OUTER JOIN "[Person].[PhoneNumberType]" AS pnt
          ON pnt._id = pp.PhoneNumberTypeID
       LEFT OUTER JOIN "[Person].[EmailAddress]" AS ea
          ON p.BusinessEntityID = ea.BusinessEntityID
```
You’ll see that, where possible, I’ve managed to make use of the MongoDB _id clustered index.
The way this database is designed, the primary keys are often wasted, so it isn’t always possible, and indexes have to be created where necessary. The query takes under half a second on my test machine, which isn’t fast, but isn’t unreasonable either.
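Where the _id can’t carry the key, the supporting index has to be created explicitly. In the mongo shell, that might look like this (the collection names are hypothetical; use whatever names your import produced):

```js
// Hypothetical supporting indexes for the joins in the query above,
// for the collections where _id couldn't be used as the key.
db.getCollection("Person_BusinessEntityAddress").createIndex({ BusinessEntityID: 1 });
db.getCollection("Person_PersonPhone").createIndex({ BusinessEntityID: 1 });
db.getCollection("Person_EmailAddress").createIndex({ BusinessEntityID: 1 });
```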
To solve both of these problems, data types and primary keys, use Extended JSON.
Using Extended JSON
Extended JSON is readable JSON that conforms to the JSON RFC, but which introduces extra key/value pairs that define the data type of each value that isn’t obvious.
This format can be read by any process that can consume JSON data, but the type information is understood only by the MongoDB REST interface, mongoimport, Compass, and the mongo shell.
The important common data types are all there, but there are also several data types that are only relevant to MongoDB and are used for transfer between MongoDB databases. It has a lot in common with the data type specifications in the headers of CSV files.
The standard comes with a strict ‘canonical’ mode and a more readable ‘relaxed’ mode; unfortunately, it is the strict canonical form that MongoDB works with. There is a third dialect in MongoDB, mongo shell mode, which has ‘helper’ functions to describe the BSON data type. Both of these can be used by mongoimport, but mongo shell mode is not recognized by standard JSON parsers.
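As a hand-written illustration (not the output of any of the tools here), the same small document looks like this in plain JSON, in Extended JSON, and in mongo shell mode:

```js
// Plain JSON: the date is just a string, and the number could be any numeric type
{ "job_id": 1, "ModifiedDate": "2008-04-30T00:00:00.000Z" }

// Extended JSON: the BSON types travel with the values, and it is still legal JSON
{ "job_id": { "$numberInt": "1" },
  "ModifiedDate": { "$date": "2008-04-30T00:00:00.000Z" } }

// Mongo shell mode: helper functions, readable but not standard JSON
{ job_id: NumberInt(1), ModifiedDate: ISODate("2008-04-30T00:00:00.000Z") }
```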
SQL Server exports standard JSON, though it can have problems with CLR and deprecated data types, and it has to be shown how to write Extended JSON. We therefore hide that complexity in a stored procedure.
At its simplest, it produces output like this in each file (I show just the first few documents). I’ve chosen to check whether there is a primary key based on a single column; if so, I use that as the MongoDB key, indicated by the reserved label ‘_id’. I also map each SQL Server data type to the equivalent MongoDB BSON data type which, in this case, is a 32-bit integer.
```json
[
  {
    "_id": 1,
    "job_id": { "$numberInt": "1" },
    "job_desc": "New Hire - Job not specified",
    "min_lvl": { "$numberInt": "10" },
    "max_lvl": { "$numberInt": "10" }
  },
  {
    "_id": 2,
    "job_id": { "$numberInt": "2" },
    "job_desc": "Chief Executive Officer",
    "min_lvl": { "$numberInt": "200" },
    "max_lvl": { "$numberInt": "250" }
  },
  {
    "_id": 3,
    "job_id": { "$numberInt": "3" },
    "job_desc": "Business Operations Manager",
    "min_lvl": { "$numberInt": "175" },
    "max_lvl": { "$numberInt": "225" }
  }
]
```
Export JSON files through PowerShell
JSON files can be exported from SQL Server as Extended JSON using a modified JSON export, contained in a temporary stored procedure. This can be done from PowerShell or from SQL.
By using PowerShell, you can avoid having to open up SQL Server’s ‘surface area’ to let it run DOS commands that write data to files. I show the simpler technique of using SQL in another article.
Here is a PowerShell version that saves every table in a database to an Extended JSON file. OK, it looks a bit complicated, but in essence it just connects to a database and, for every table, runs the stored procedure to convert the data to JSON, then saves the result to the directory you specify.
```powershell
set-psdebug -strict
$ErrorActionPreference = "stop"
$MS = 'Microsoft.SQLServer'
$My = "$MS.Management.Smo"
$Errors = @()
#You need to fill this lot in
$SourceServer = 'MyServer'
$Username = 'MyUserID'
$Database = 'TheDatabase'
$OutputPath = "MyPathToTheDirectoryForTheOutputFiles"
#Load SMO assemblies and so on
Import-Module sqlserver -DisableNameChecking -ErrorAction silentlycontinue -ErrorVariable +Errors
#Does the output path exist?
if (-not (Test-Path -PathType Container $OutputPath))
{
    # we create the directory if it doesn't already exist
    New-Item -ItemType Directory -Force -Path $OutputPath `
             -ErrorAction silentlycontinue -ErrorVariable +Errors;
}
<# make sure all the connections are serviceable #>
if ($Errors.Count -eq 0)
{
    $conn = new-object "$MS.Management.Common.ServerConnection"
    $conn.ServerInstance = $SourceServer
    if ($username -ieq '')
    {
        $conn.LoginSecure = $true;
    }
    else
    {
        <# This is a more elaborate process than you might expect, because we can't
           assume that we can use Windows authentication, thanks to Azure, remote
           servers outside the domain, and other such complications. We can't ever
           keep passwords for SQL Server authentication as part of the static script
           data. At this stage, we ask for passwords if they aren't known, and
           otherwise store them as secure strings on file in the user area, protected
           by the workstation security. #>
        #create a connection object to manage credentials
        $encryptedPasswordFile = "$env:USERPROFILE\$($Username)-$SourceServer.txt"
        # test to see if we have the password as a secure string stored in the user area
        if (Test-Path -path $encryptedPasswordFile -PathType leaf)
        {
            #we already have this set for this login, so fetch it
            $encrypted = Get-Content $encryptedPasswordFile | ConvertTo-SecureString
            $Credentials = New-Object System.Management.Automation.PsCredential($Username, $encrypted)
        }
        else #then we have to ask the user for it
        {
            #we haven't got this set for this login
            $Credentials = get-credential -Credential $Username
            $Credentials.Password | ConvertFrom-SecureString |
                Set-Content $encryptedPasswordFile
        }
        $conn.LoginSecure = $false;
        $conn.Login = $Credentials.UserName;
        $conn.SecurePassword = $Credentials.Password;
    }
}
if ($errors.Count -eq 0)
#if there were errors, then it gives up at this stage and reports the errors.
{
    $DataSourceInstanceObject = new-object ("$My.Server") $conn
    if ($DataSourceInstanceObject.Version -eq $null)
    { Throw "Can't find the instance $SourceServer" };
    $DataDatabaseObject = $DataSourceInstanceObject.Databases[$Database]
    if ($DataDatabaseObject.name -ne $Database)
    {
        Throw "Can't find the database '$Database' in instance $SourceServer"
    };
    $procedure = @'
CREATE OR ALTER PROCEDURE #SaveExtendedJsonDataFromTable
/**
Summary: >
  This gets the Extended JSON data from a table. You can specify
  it either by the database.schema.table 'tablespec', or do it
  individually.
Author: phil factor
Date: 04/02/2019
Examples: >
  USE pubs
  DECLARE @Json NVARCHAR(MAX)
  EXECUTE #SaveExtendedJsonDataFromTable
    @database='pubs',
    @Schema ='dbo',
    @table= 'jobs',
    @JSONData=@json OUTPUT
  PRINT @Json
Returns: >
  The JSON data
**/
(@database sysname = NULL, @Schema sysname = NULL, @table sysname = NULL,
 @tableSpec sysname = NULL, @jsonData NVARCHAR(MAX) OUTPUT
)
AS
BEGIN
  DECLARE @Data NVARCHAR(MAX);
  IF Coalesce(@table, @Tablespec) IS NULL
     OR Coalesce(@Schema, @Tablespec) IS NULL
    RAISERROR('{"error":"must have the table details"}', 16, 1);
  IF @table IS NULL SELECT @table = ParseName(@Tablespec, 1);
  IF @Schema IS NULL SELECT @Schema = ParseName(@Tablespec, 2);
  IF @database IS NULL
    SELECT @database = Coalesce(ParseName(@Tablespec, 3), Db_Name());
  IF @table IS NULL OR @Schema IS NULL OR @database IS NULL
    RAISERROR('{"error":"must have the table details"}', 16, 1);
  DECLARE @SourceCode NVARCHAR(255) =
    (
      SELECT 'SELECT * FROM ' + QuoteName(@database) + '.'
             + QuoteName(@Schema) + '.' + QuoteName(@table)
    );
  DECLARE @a_unique_key BIT;
  DECLARE @HowManyUniqueKeys INT;
  SELECT @HowManyUniqueKeys = Sum(Convert(INT, is_part_of_unique_key))
    FROM sys.dm_exec_describe_first_result_set(@SourceCode, NULL, 1);
  SELECT @a_unique_key = CASE WHEN @HowManyUniqueKeys = 1 THEN 1 ELSE 0 END;
  DECLARE @params NVARCHAR(MAX);
  SELECT @params =
    String_Agg(
      CASE WHEN system_type_id IN
             ( 35,  -- text
               99,  -- ntext
               98,  -- sql_variant
               167, -- varchar
               231, -- nvarchar
               239, -- nchar
               175, -- char
               36,  -- uniqueidentifier
               59,  -- real
               62,  -- float
               104, -- bit
               241  -- xml
             )
        THEN
          CASE WHEN @a_unique_key = 1 AND is_part_of_unique_key = 1
               THEN name + ' as "_id", ' ELSE '' END + QuoteName(name)
      WHEN user_type_id IN (128, 129, 130) THEN
        'convert(nvarchar(100),' + name + ') as "' + name + '"'
        --hierarchyid (128), geometry (130) and geography (129) types can be coerced.
      --------binary
      WHEN system_type_id IN
             ( 165, -- varbinary
               173  -- binary
             )
        --THEN name + ' as "' + name + '.$binary.hex", ''80'' as "' + name + '.$binary.subType"'
        THEN name --I gave up. Extended JSON binary form is just awful
      WHEN system_type_id = 34 THEN --image
        'convert(varbinary(max),' + name + ') as "' + name + '"'
      WHEN system_type_id IN (35) THEN --text
        'convert(varchar(max),' + name + ') as "' + name + '"'
      WHEN system_type_id IN (99) THEN --ntext
        'convert(nvarchar(max),' + name + ') as "' + name + '"'
      --------numberInt
      WHEN system_type_id IN
             ( 48, -- tinyint
               56, -- int
               52  -- smallint
             )
        THEN
          CASE WHEN @a_unique_key = 1 AND is_part_of_unique_key = 1
               THEN name + ' as "_id", ' ELSE '' END +
          'convert(varchar(50),' + name + ') as "' + name + '.$numberInt"'
      --------numberLong
      WHEN system_type_id = 127 -- bigint
        THEN
          CASE WHEN @a_unique_key = 1 AND is_part_of_unique_key = 1
               THEN name + ' as "_id", ' ELSE '' END +
          'convert(varchar(50),' + name + ') as "' + name + '.$numberLong"'
      --------date
      WHEN system_type_id IN
             ( 40,  -- date
               41,  -- time
               42,  -- datetime2
               43,  -- datetimeoffset
               58,  -- smalldatetime
               61,  -- datetime
               189  -- timestamp
             )
        THEN
          'convert(datetimeoffset,convert(datetime2(0),' + name + ')) as "' + name + '.$date"'
      --------numberDecimal
      WHEN system_type_id IN
             ( 106, -- decimal
               108, -- numeric
               122, -- smallmoney
               60   -- money
             )
        THEN
          'convert(varchar(50),' + name + ') as "' + name + '.$numberDecimal"'
      ELSE QuoteName(name)
      END,
      ', '
    )
    FROM sys.dm_exec_describe_first_result_set(@SourceCode, NULL, 1);
  DECLARE @expression NVARCHAR(MAX) =
    '
USE ' + @database + '
SELECT @TheData=(SELECT ' + @params + ' FROM ' + QuoteName(@database) + '.'
    + QuoteName(@Schema) + '.' + QuoteName(@table)
    + ' FOR JSON PATH)';
  PRINT @expression
  EXECUTE sp_executesql @expression, N'@TheData nvarchar(max) output',
          @TheData = @jsonData OUTPUT;
END;
'@
    <# We create a new connection just so we can be certain of being able to use
       the same connection that has the temporary stored procedure on it. Normally
       we'd just share the server's connection, but this isn't safe and it is
       possible to lose the connection just by doing an SMO action. #>
    $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
    $SqlConnection.ConnectionString = $conn.ConnectionString
    $SqlConnection.Open()
    $sqlCommand = $sqlConnection.CreateCommand()
    try
    {
        $sqlCommand.CommandText = $procedure
        $sqlCommand.ExecuteScalar()
    }
    catch [System.Exception]
    {
        $_.Exception.InnerException.Errors | foreach { write-error "$($_.Message)" }
    }
}
if ($errors.Count -eq 0)
#if there were errors, then it gives up at this stage and reports the errors.
{
    $DataDatabaseObject.Tables | Select Schema, Name |
      foreach {
        $filename = "$($_.schema)_$($_.Name)" -replace '[\\\/\:\.]', '-'
        $Tablename = "[$($_.schema)].[$($_.Name)]"
        $result = $_.Name
        #execute this script, using the existing connection
        $params = @"
DECLARE @Json NVARCHAR(MAX)
EXECUTE #SaveExtendedJsonDataFromTable @TableSpec='$tablename', @JSONData=@json OUTPUT
SELECT @json
"@
        $sqlCommand = $sqlConnection.CreateCommand()
        $sqlCommand.CommandText = $params
        try
        {
            $sqlCommand.ExecuteScalar() |
              Out-File -FilePath "$OutputPath\$filename.json" -Encoding utf8 -Force
        }
        catch [System.Exception]
        {
            $_.Exception.InnerException.Errors | foreach { write-error $_.Message }
        }
      }
}
$SqlConnection.Close()
if ($Errors.count -gt 0) #if we couldn't export something
{
    $Errors | foreach {
        write-warning "There was an error '$($_)'"
    }
}
```
Once you have exported the Extended JSON files, it is the work of a moment to get them in place with Studio 3T.
![JSON import in Studio 3T](https://studio3t.com/wp-content/uploads/2019/02/word-image-144.png)
And the next thing you know, there is the old Pubs database in your MongoDB database, from back in the old Sybase Days:
![](https://studio3t.com/wp-content/uploads/2019/02/word-image-145.png)
The problem is, we don’t always want to rely on doing this interactively. Sometimes, you’ve just got to script things out.
Usually, I advocate PowerShell, but the mongoimport utility is quirky and is best run from the command prompt or from a batch file. Here is the batch file. It is slightly cryptic, but it works!
```batch
Rem -----------------You need to fill in these settings with the correct details
Rem JSONDirectory is the full path to the directory that holds your extended JSON files
Set JSONDirectory=MyPath
Rem MongoImport must be set to the path of the version of MongoImport that you're using
Set MongoImport=%programFiles%\MongoDB\Server\4.0\bin\mongoimport.exe
Rem host is the name of the MongoDB database server
Set host=MyServer
Rem port is the port number that the MongoDB server is listening on
Set port=27017
Rem uid is the UserID
Set uid=MyID
Rem Password. Plain text, blush blush
Set password=MyPassword
Rem ad is the name of the admin database where the credentials are kept
Set ad=admin
Rem db is the name of the user database that these collections are going into
Set db=AdventureWorks
Rem for each file in the directory you specify, run MongoImport, using the filename as the name of the collection
for /f "delims=|" %%f in ('dir /b "%JSONDirectory%\*.JSON"') do "%mongoimport%" --host %host% --port %port% -u %uid% -p "%password%" --authenticationDatabase %ad% --db %db% --collection "%%~nf" --type json --jsonArray --drop "%JSONDirectory%\%%f"
```
Export from MongoDB to SQL Server
The mongoexport utility exports in Extended JSON, rather than plain standard JSON.
To get a plain JSON export, you’ll need a third-party utility such as Studio 3T.
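For reference, a minimal mongoexport call might look like this (the server, credentials, database and collection names are placeholders; --jsonArray wraps the documents in a single JSON array, which is convenient if you plan to feed the result to OPENJSON as shown below):

```batch
Rem A sketch: export one collection as a single Extended JSON array.
"%programFiles%\MongoDB\Server\4.0\bin\mongoexport.exe" --host MyServer --port 27017 -u MyID -p MyPassword --authenticationDatabase admin --db AdventureWorks --collection Sales_Currency --jsonArray --out %temp%\Sales_Currency.json
```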
SQL Server can read Extended JSON but only if you provide an explicit schema for each collection that you are putting in a table. This isn’t entirely satisfactory because it is tricky to automate.
Here is an example using just a sample of the output from mongoexport to illustrate.
```sql
DECLARE @JSONString VARCHAR(MAX) =
'[{
    "_id": "AED",
    "CurrencyCode": "AED",
    "Name": "Emirati Dirham",
    "ModifiedDate": { "$date": "2008-04-30T00:00:00.000Z" }
  },
  {
    "_id": "AFA",
    "CurrencyCode": "AFA",
    "Name": "Afghani",
    "ModifiedDate": { "$date": "2008-04-30T00:00:00.000Z" }
  },
  {
    "_id": "ALL",
    "CurrencyCode": "ALL",
    "Name": "Lek",
    "ModifiedDate": { "$date": "2008-04-30T00:00:00.000Z" }
  },
  {
    "_id": "AMD",
    "CurrencyCode": "AMD",
    "Name": "Armenian Dram",
    "ModifiedDate": { "$date": "2008-04-30T00:00:00.000Z" }
  }]';
INSERT INTO Sales.Currency (CurrencyCode, Name, ModifiedDate)
  SELECT CurrencyCode, Name, ModifiedDate
    FROM OpenJson(@JSONString)
    WITH
    (
      CurrencyCode NCHAR(3), Name NVARCHAR(50),
      ModifiedDate DATETIME '$.ModifiedDate."$date"'
    );
```
Notice that the path to the actual data has double quotes around the $date key. This is because the dollar sign isn’t otherwise legal in the path expression of the WITH clause’s explicit schema.
Conclusions
We can transfer the data of databases between SQL Server and MongoDB remarkably easily, but the tables themselves are normally the least of our problems. There are also the indexes, the views, the procedural logic, the scheduled tasks, and a mass of other code to consider.
I wouldn’t even want to consider porting a relational system to MongoDB unless it was just for an initial staging phase. In that case, I’d create the collections in SQL Server, in the source database, from their constituent tables, making a judgement about the best design for a hierarchical document database.
My conclusion from this article is that Extended JSON provides the best way of transferring data between the two database systems, especially as it solves both the data type and the primary key problems, and can be automated. However, a third-party tool such as Studio 3T makes aspects of the job a great deal easier, especially the import and export of a large collection of Extended JSON files in MongoDB!