Problem
Every once in awhile a table gets created without a primary key and duplicate records get entered. The problem gets even worse when you have two identical rows in the table and there is no way to distinguish between the two rows. So how do you delete the duplicate record?
Every once in awhile a table gets created without a primary key and duplicate records get entered. The problem gets even worse when you have two identical rows in the table and there is no way to distinguish between the two rows. So how do you delete the duplicate record?
Solution
One option that SQL Server gives you is the ability to set ROWCOUNT which limits the numbers of records affected by a command. The default value is 0 which means all records, but this value can be set prior to running a command. So let's create a table and add 4 records with one duplicate record.
One option that SQL Server gives you is the ability to set ROWCOUNT which limits the numbers of records affected by a command. The default value is 0 which means all records, but this value can be set prior to running a command. So let's create a table and add 4 records with one duplicate record.
Create a table called duplicateTest and add 4 records.
CREATE TABLE dbo.duplicateTest
(
[ID] [int] ,
[FirstName] [varchar](25),
[LastName] [varchar](25)
) ON [PRIMARY]
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones')
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White')
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
(
[ID] [int] ,
[FirstName] [varchar](25),
[LastName] [varchar](25)
) ON [PRIMARY]
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones')
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White')
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
If we select all data we get the following:
SELECT * FROM dbo.duplicateTest
ID
|
FirstName
|
LastName
|
1
|
Bob
|
Smith
|
2
|
Dave
|
Jones
|
3
|
Karen
|
White
|
1
|
Bob
|
Smith
|
If we try to select the record for Bob Smith will all of the
available values such as the following query:
SELECT * FROM dbo.duplicateTest WHERE ID = 1 AND FirstName = 'Bob' AND LastName = 'Smith'
We still get 2 rows of data:
ID
|
FirstName
|
LastName
|
1
|
Bob
|
Smith
|
1
|
Bob
|
Smith
|
So to delete the duplicate record with SQL Server 2000 and 2005 we
can use the SET ROWCOUNT command to limit the number of rows affected by a
query. By setting it to 1 we can just delete one of these rows in the
table. Note: the select commands are just used to show the data prior and
after the delete occurs.
SELECT * FROM dbo.duplicateTest
SET ROWCOUNT 1
DELETE FROM dbo.duplicateTest WHERE ID = 1
SET ROWCOUNT 0
SELECT * FROM dbo.duplicateTest
SET ROWCOUNT 1
DELETE FROM dbo.duplicateTest WHERE ID = 1
SET ROWCOUNT 0
SELECT * FROM dbo.duplicateTest
With SQL Server 2005 we can also use the TOP command when we issue
the delete, such as the following. Note: the select commands are just used to
show the data prior and after the delete occurs.
SELECT * FROM dbo.duplicateTest
DELETE TOP(1) FROM dbo.duplicateTest WHERE ID = 1
SELECT * FROM dbo.duplicateTest
DELETE TOP(1) FROM dbo.duplicateTest WHERE ID = 1
SELECT * FROM dbo.duplicateTest
So as you can see with SQL Server 2005 there are two options to
allow you to delete duplicate identical rows of data in your tables.
Here is one note from Microsoft about using SET ROWCOUNT:
Using SET
ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in the next
release of SQL Server. Avoid using SET ROWCOUNT together with DELETE, INSERT,
and UPDATE statements in new development work, and plan to modify applications
that currently use it. Also, for DELETE, INSERT, and UPDATE statements that
currently use SET ROWCOUNT, we recommend that you rewrite them to use the TOP
syntax.
0 comments:
Post a Comment