Developer42

2015-12-04

SQL Server :: Compare Database Schemas

This is a script to list basic differences in table definitions between two databases.

Background Waffle
When we make database changes in a transactional system, we need to ensure that these won’t adversely affect our BI solution. Normally our developers give us this information when we plan which items to put into a release, so we can liaise with BI; unfortunately the developers machines are down at present due to a disaster at the data center of the company to whom we outsourced development. Since the changes for the release are already in our UAT environment (and only those changes are there), we could determine what would be changing in production by comparing the schemas of the two environments.

My first thought was to use the Database Comparison Tool that comes with Visual Studio. However whilst this does a great job for a developer looking for and resolving differences, it’s not so easy to communicate this information to others (i.e. giving a before and after image of columns side by side. You have to either click on each difference to get a definition of the before and after, then make notes or take screenshots, or you have to generate the DDL, in which case you don’t see the before picture; only the code to amend from the current to the to-be definition.

Instead I realised that I could simply pull the definitions of the tables & columns from SQL, compare these and list the differences; resulting in the code below.
This code can handle where both databases are on the same instance, or where they’re on different instances. For the latter scenario you’d need to create a linked server.
If you frequently compare different combinations of databases and want to avoid amending the instance and catalog names in the code each time you may want to look into views or synonyms (potentially aliases work too if the catalog name is consistent across instances).

Since our BI solution uses replication to get the data from the transactional system, I also included information on which columns were replicated; thus we can filter on those replicated to see just those which BI would be interested in, or can ignore replication to see all differences.

Caveats

  1. We’re not comparing everything; only basic table and column data.
  2. The maxLength value gives length in bytes; not characters (see previous post on how to correct for this).
  3. There are probably other issues; this was a quick fix to resolve an immediate issue which worked for us; there may be issues we’ve not encountered (e.g. maybe the linked server only gives access to a subset of the tables; thus giving false responses)…

Code

set transaction isolation level read uncommitted
go

;with uat as
(
       select t.name TableName
       , c.Name ColumnName
       , ty.name ColType
       , c.max_length ColMaxLength
       , coalesce(c.is_nullable,0) ColIsNullable
       , c.scale ColScale
       , c.[precision] ColPrecision
       , coalesce(c.collation_name,'') ColCollation
       , c.is_replicated
       from [myUatDbServer\uatInstance].[myUatDbCatalog].sys.tables t
       inner join [myUatDbServer\uatInstance].[myUatDbCatalog].sys.columns c
              on c.object_id = t.object_id
       inner join [myUatDbServer\uatInstance].[myUatDbCatalog].sys.types ty
              on ty.system_type_id = c.system_type_id
              and ty.user_type_id = c.user_type_id
)
, prd as
(
       select t.name TableName
       , c.Name ColumnName
       , ty.name ColType
       , c.max_length ColMaxLength
       , coalesce(c.is_nullable,0) ColIsNullable
       , c.scale ColScale
       , c.[precision] ColPrecision
       , coalesce(c.collation_name,'') ColCollation
       , c.is_replicated
       from [myProdDbServer\prodInstance].[myProdDbCatalog].sys.tables t
       inner join [myProdDbServer\prodInstance].[myProdDbCatalog].sys.columns c
              on c.object_id = t.object_id
       inner join [myProdDbServer\prodInstance].[myProdDbCatalog].sys.types ty
              on ty.system_type_id = c.system_type_id
              and ty.user_type_id = c.user_type_id
)
select coalesce(uat.TableName, prd.TableName) TableName
, coalesce(uat.ColumnName, prd.ColumnName) ColumnName
, case
       when prd.TableName is null and not exists (select top 1 1 from prd x where x.TableName = uat.TableName) then 'Add Table'
       when uat.TableName is null and not exists (select top 1 1 from uat x where x.TableName = prd.TableName)  then 'Remove Table'
       when prd.ColumnName is null then 'Add Column'
       when uat.ColumnName is null then 'Remove Column'
       else 'Change Column Definition'
end [AXtion] --our transaction system's Dynamics AX; I'm hilarious :S
, prd.ColType ColTypeFrom
, uat.ColType ColTypeTo
, prd.ColMaxLength ColMaxLengthFrom --not going to fuss about char byte size for now; just want to get a list of changes
, uat.ColMaxLength ColMaxLengthTo --not going to fuss about char byte size for now; just want to get a list of changes
, prd.ColIsNullable ColIsNullableFrom
, uat.ColIsNullable ColIsNullableTo
, prd.ColCollation ColCollationFrom
, uat.ColCollation ColCollationTo
, prd.ColPrecision ColPrecisionFrom
, uat.ColPrecision ColPrecisionTo
, prd.ColScale ColScaleFrom
, uat.ColScale ColScaleTo
, prd.is_replicated PrdIsReplicated --\_these aren't compared; just returned to make it easy to tell what's interesting to BI
, uat.is_replicated UatIsReplicated --/
from uat
full outer join prd
       on prd.TableName = uat.TableName
       and prd.ColumnName = uat.ColumnName
where prd.TableName is null
or uat.TableName is null
or prd.ColumnName is null
or uat.ColumnName is null
or (uat.ColType != prd.ColType)
or (uat.ColMaxLength != prd.ColMaxLength)
or (uat.ColIsNullable != prd.ColIsNullable) 
or (uat.ColCollation != prd.ColCollation) 
or (uat.ColPrecision != prd.ColPrecision) 
or (uat.ColScale != prd.ColScale)
order by coalesce(uat.TableName, prd.TableName) 
, coalesce(uat.ColumnName, prd.ColumnName) 

2010-04-29

Intelligently Splitting an Address

Filed under: SQL Server, Technology, Uncategorized — Tags: , , , , , , , , — Developer42 @ 08:05

An issue recently arose at work where I was asked to take a street address from one field, and split it into two for use with another system. The simple solutions available were:

  • simply splitting on the first comma to appear in the string, and putting each side of this split into the respective fields.
  • breaking the string down the middle.
  • putting everything I could into the first field, then any remainder into the second.

However, none of these seemed to feel right; the first solution would work well, but didn’t take the destination fields’ sizes into account, and the second and third solutions could break half-way through a word / in the case of the third, may not use the second field.
As a result, I came up with the following function; it’s probably not the best solution (since this was written in half an hour on getting home from a night out, and with just a quick check that all worked OK this morning), but hopefully it’ll be useful to some people.

The function takes 3 arguments; the string to be split, the length of the first field to be populated, and the length of the second. It returns a table with an ID field to give the order of the elements (not really required), and a String field to give the two return values in separate rows. This function will always return exactly 2 rows.

The function is written in T-SQL for SQL Server, though I believe it should also work with Sybase.


SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE FUNCTION dbo.SplitSpecial

(

      @stringToSplit nvarchar(max)

      ,@len1 int

      ,@len2 int

)  

RETURNS @returnVal table (id int identity(1,1),string nvarchar(max)) 

AS  

BEGIN 

      declare @i int = 0                        --used as a counter when searching for chars

      , @j int = 0                              --another char counter

      , @ldiff int = 0                    --says how many characters we have spare to play with

      , @mark int = 0                     --a marker

      , @tempmark int   = 0                     --a marker

      , @back nvarchar(max)               --stringToSplit in reverse

      , @temp nvarchar(max)               --for storing works in progress

      , @forward bit = 0                        --used to indicate the direction of the string from the marker

 

      --remove any starting, ending or duplicate spaces should they exist, to try to cram in the characters

      select @stringToSplit = ltrim(rtrim(@stringToSplit))

      while (charindex('  ',@stringToSplit)>0)

            select @stringToSplit = replace(@stringToSplit,N'  ',N' ')

      

      --find out what we have to play with / reduce max size of strings to size of input

      select @tempmark = len(@stringToSplit)

      select @len1  = 0.5*((@len1 + @tempmark) - abs(@len1 - @tempmark))

            ,  @len2  = 0.5*((@len2 + @tempmark) - abs(@len2 - @tempmark))

      select @ldiff = (@len1 + @len2) - @tempmark

      

      if @ldiff > 0

      begin

            --get a good starting point / break point: no longer required

            --select @mark = round(len(@stringToSplit)/2,0) --assume we can get away with halving the string

            --select @mark = 0.5*((@mark + @len1) - abs(@mark - @len1)) --account for first field being too short (least function math hack)

            --select @mark = len(@stringToSplit) - 0.5*(((len(@stringToSplit)-@mark) + @len2) - abs((len(@stringToSplit)-@mark) - @len2)) --account for 2nd field being too short (least function math hack)

            

            --the two fields are big enough for us to try some funky stuff with

            --is there a delimeter we can use?

            select @temp = substring(@stringToSplit,len(@stringToSplit)-@len2,@ldiff+1)--get crossover chars; the +1 accounts for dropping the delim

            while patindex('%[a-z0-9]%', @temp) > 0

                  set @temp = stuff(@temp, patindex('%[a-z0-9]%', @temp), 1, '')

            if (len(@temp)+charindex(N' ',@temp)) > 0 --a suitable delim exists in the crossover area

            begin

                  if charindex(N',',@temp)>0 --if commas are available, these are the preferred delimeter

                        select @temp = N','

                  else

                        if charindex(N' ',@temp)>0  --spaces are the second favourite

                              select @temp =N' '

                  select @back = reverse(@stringToSplit)

                  , @i=1

                  , @j = ceiling(len(@stringToSplit)/2)-1

                  , @forward = 1

                  , @mark = len(@stringToSplit)

                  while @i<=(len(@temp)+charindex(N' ',@temp))--try out all available delimeters (if comma or space are present, others are ignored)

                  begin

                        while @j>0

                        begin

                              select @tempmark = charindex(substring(@temp,@i,1),@stringToSplit,@j)

                              if @tempmark > 0 and @tempmark < @mark

                                    select @mark = @tempmark, @forward = 1

                              select @tempmark = charindex(substring(@temp,@i,1),@back,@j)

                              if @tempmark > 0 and @tempmark < @mark

                                    select @mark = @tempmark, @forward = 0

                              if(@mark<=@len1 and (len(@stringToSplit)-@mark)<=@len2)

                                    break

                              select @j = @j - 1

                        end

                        select @i = @i + 1

                  end

                  --select @mark = case @forward when 0 then len(@stringToSplit)-(@mark+1) else @mark end

                  select @mark = case @forward when 0 then (1+len(@stringToSplit))-@mark else @mark end

            end

      end

      --catch any issues & account for when we can't find a good break point / there isn't space

      if (@mark <= 0 or @mark > @len1)

            select @mark = @len1

            

      insert into @returnVal (string)     values (ltrim(rtrim(substring(@stringToSplit,1,@mark))))

      insert into @returnVal (string)     values (ltrim(rtrim(substring(@stringToSplit,1+@mark,@len2))))

 

      return

            

END

 

GO

 

--examples of the function in action

select * from dbo.SplitSpecial('33 Rue De Maisons, Paris',100,100)

select * from dbo.SplitSpecial('33 Rue De Maisons, Paris, France 123456',100,100)

select * from dbo.SplitSpecial('33,Rue,De,Maisons,Paris,France,123456',100,100)

select * from dbo.SplitSpecial('33 Rue De Maisons Paris France 123456',100,100)

select * from dbo.SplitSpecial('33 Rue De Maisons, Paris, France 123456',8,50)

select * from dbo.SplitSpecial('this string is too big for the fields',10,10)

At some point, I hope to update this to allow it to take any number of fields of different lengths, to make it more general (e.g. should you ever have a full address to be split into 4 or 5 lines), but for now, it works (so far, at least)

Blog at WordPress.com.

%d bloggers like this: