Friday, March 30, 2012

Remove Duplicate Rows

I've got the following table data:

116525.99
116520.14
129965.03
129960.12
129967.00

And I need to write a query to return only rows 2 and 4, since the
remaining rows have duplicate IDs. I've tried the Group By, but am
having no luck.

Thanks!dale...@.gmail.com wrote:
> I've got the following table data:
> 116525.99
> 116520.14
> 129965.03
> 129960.12
> 129967.00
> And I need to write a query to return only rows 2 and 4, since the
> remaining rows have duplicate IDs. I've tried the Group By, but am
> having no luck.
> Thanks!

What do you mean by "rows 2 and 4"? Those numbers refer to positions in
the list of values you posted, but SQL Server knows nothing about that
because tables in SQL have no logical order at all. In other words you
haven't given enough information to answer your question.

If these are the only two columns you have then probably the best you
can do is:

SELECT col1, MIN(col2) AS col2
FROM your_table
GROUP BY col1 ;

or:

SELECT col1, MAX(col2) AS col2
FROM your_table
GROUP BY col1 ;

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/...US,SQL.90).aspx
--|||Why 2 and 4? Why not 1 & 3 or 1 & 5? What are you using as your
discriminator?

--
Tom

----------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Toronto, ON Canada
..
<dale.zjc@.gmail.com> wrote in message
news:1144699376.547275.246250@.t31g2000cwb.googlegr oups.com...
I've got the following table data:

11652 5.99
11652 0.14
12996 5.03
12996 0.12
12996 7.00

And I need to write a query to return only rows 2 and 4, since the
remaining rows have duplicate IDs. I've tried the Group By, but am
having no luck.

Thanks!|||Thanks for the quick response!

Here's my complete query:

SELECT Object.Name, Job.JobID, MAX(Data.[Value]) as NewValue,
DATEADD(S, Data.[Time], '1jan1970') AS EventDate,
Job.KSName, GETDATE() AS CURDATE

FROM DataHeader INNER JOIN
Data ON DataHeader.DataID = Data.DataID INNER JOIN
Object INNER JOIN
Job ON Object.ObjID = Job.MachineObjID ON
DataHeader.JobID = Job.JobID

Group By Job.JobID

But I'm getting the following error:

Server: Msg 8120, Level 16, State 1, Line 1
Column 'Object.Name' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
Server: Msg 8120, Level 16, State 1, Line 1
Column 'Data.Time' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
Server: Msg 8120, Level 16, State 1, Line 1
Column 'Job.KSName' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.

David Portas wrote:
> dale...@.gmail.com wrote:
> > I've got the following table data:
> > 116525.99
> > 116520.14
> > 129965.03
> > 129960.12
> > 129967.00
> > And I need to write a query to return only rows 2 and 4, since the
> > remaining rows have duplicate IDs. I've tried the Group By, but am
> > having no luck.
> > Thanks!
> What do you mean by "rows 2 and 4"? Those numbers refer to positions in
> the list of values you posted, but SQL Server knows nothing about that
> because tables in SQL have no logical order at all. In other words you
> haven't given enough information to answer your question.
> If these are the only two columns you have then probably the best you
> can do is:
> SELECT col1, MIN(col2) AS col2
> FROM your_table
> GROUP BY col1 ;
> or:
> SELECT col1, MAX(col2) AS col2
> FROM your_table
> GROUP BY col1 ;
> --
> David Portas, SQL Server MVP
> Whenever possible please post enough code to reproduce your problem.
> Including CREATE TABLE and INSERT statements usually helps.
> State what version of SQL Server you are using and specify the content
> of any error messages.
> SQL Server Books Online:
> http://msdn2.microsoft.com/library/...US,SQL.90).aspx
> --|||dale...@.gmail.com wrote:
> Thanks for the quick response!
> Here's my complete query:
> SELECT Object.Name, Job.JobID, MAX(Data.[Value]) as NewValue,
> DATEADD(S, Data.[Time], '1jan1970') AS EventDate,
> Job.KSName, GETDATE() AS CURDATE
> FROM DataHeader INNER JOIN
> Data ON DataHeader.DataID = Data.DataID INNER JOIN
> Object INNER JOIN
> Job ON Object.ObjID = Job.MachineObjID ON
> DataHeader.JobID = Job.JobID
> Group By Job.JobID
> But I'm getting the following error:
> Server: Msg 8120, Level 16, State 1, Line 1
> Column 'Object.Name' is invalid in the select list because it is not
> contained in either an aggregate function or the GROUP BY clause.
> Server: Msg 8120, Level 16, State 1, Line 1
> Column 'Data.Time' is invalid in the select list because it is not
> contained in either an aggregate function or the GROUP BY clause.
> Server: Msg 8120, Level 16, State 1, Line 1
> Column 'Job.KSName' is invalid in the select list because it is not
> contained in either an aggregate function or the GROUP BY clause.

Any column that you don't want to group by needs to be enclosed in an
aggregate function (MIN or MAX for example). Your problem is obviously
a bit different to what you first asked for. The best way to post a
problem like this is to include enough code so that others can
reproduce it. See:
http://www.aspfaq.com/etiquette.asp?id=5006

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/...US,SQL.90).aspx
--|||(dale.zjc@.gmail.com) writes:
> Thanks for the quick response!
> Here's my complete query:
> SELECT Object.Name, Job.JobID, MAX(Data.[Value]) as NewValue,
> DATEADD(S, Data.[Time], '1jan1970') AS EventDate,
> Job.KSName, GETDATE() AS CURDATE
> FROM DataHeader INNER JOIN
> Data ON DataHeader.DataID = Data.DataID INNER JOIN
> Object INNER JOIN
> Job ON Object.ObjID = Job.MachineObjID ON
> DataHeader.JobID = Job.JobID
> Group By Job.JobID

This is possible correct version of your query, but most probably not.
It's just a piece of guesswork.

SELECT o.Name, j.JobID, mx.NewValue,
DATEADD(ss, d.[Time], '1 jan 1970') AS EventDate,
j.KSName, GETDATE() AS CURDATE
FROM (SELECT j.JobID, NewValue = MAX(d.[Value])
FROM DataHeader dh
JOIN Job j ON dh.JobID = j.JobID
JOIN Data d ON dh.DataID = d.DataID) AS mx
JOIN Job j ON mx.JobID = j.JobID
JOIN DataHeader dh ON dh.JobID = j.JobID
JOIN Data d ON dh.DataID = d.DataID
JOIN Object o ON o.ObjID = j.MachineObjID
GROUP BY j.JobID

For this type of questions it helps if you include descriptions of
your tables, including keys. Preferably in form of CREATE TABLE
statements. Sample data is also a good idea, even better if as
INSERT statements, as that makes it easy to post a tested solution.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

No comments:

Post a Comment