Parse field into multiple rows
Parse field into multiple rows
am 27.06.2007 13:22:45 von rcamarda
Hello,
I am loading data from our MS Active Directory into our data
warehouse. (check out Mircosofts's Logparser, it can pull data from
ADS, server event logs and more. It can also create text files or load
directly to SQL. Its free and a pretty useful tool)
There is a field that contains the direct reports of a manager. The
direct report users are delimited by a pipe symbol.
I want to breakup the field into multple rows. There can be none, one
or many direct report users in this field.
This is a snippet of an example. This is only an example. I know that
I have not defined PK nor indexes. My focus is how to solve a problem
of parsing a field that has multple values into multple rows.
Thanks for any help in advance.
Rob
CREATE TABLE "dbo"."F_ADS_MANAGERS"
(
"MANAGER_KEY" VARCHAR(255) NULL,
"DIRECT_REPORTS_CN" VARCHAR(255) NULL
);
INSERT INTO F_ADS_MANAGERS (MANAGER_KEY, DIRECT_REPORTS_CN)
VALUES ('CN=Marilette, 'CN=Robert
D,OU=TechnologyGroup,DC=strayer,DC=edu|CN=Robert
Camarda,OU=TechnologyGroup,DC=strayer,DC=edu|CN=Michelle
C,OU=TechnologyGroup,DC=strayer,DC=edu|CN=Magnolia
B,OU=TechnologyGroup,DC=strayer,DC=edu|CN=Lee K,OU=TechnologyGroup')
I want to end up with 5 rows, 1 row for each user that is seprated by
the PIPE symbol.
CN=Marilette CN=Robert D,OU=TechnologyGroup,DC=strayer,DC=edu
CN=Marilette CN=Robert
Camarda,OU=TechnologyGroup,DC=strayer,DC=edu
CN=Marilette CN=Michelle C,OU=TechnologyGroup,DC=strayer,DC=edu
CN=Marilette CN=Magnolia B,OU=TechnologyGroup,DC=strayer,DC=edu
CN=Marilette CN=Lee K,OU=TechnologyGroup
Re: Parse field into multiple rows
am 27.06.2007 16:01:36 von Plamen Ratchev
Here is one method to split the list using a recursive CTE (this assumes SQL
Server 2005). It is not the fastest, but with a small size list to split it
should be OK.
WITH Managers
(manager_key, direct_reports_cn, start_pos, end_pos)
AS
(
SELECT manager_key,
direct_reports_cn + '|',
1,
CHARINDEX('|', direct_reports_cn + '|')
FROM F_ADS_MANAGERS
UNION ALL
SELECT manager_key,
direct_reports_cn,
end_pos + 1,
CHARINDEX('|', direct_reports_cn, end_pos + 1)
FROM Managers
WHERE CHARINDEX('|', direct_reports_cn, end_pos + 1) > 0
)
SELECT manager_key,
SUBSTRING(direct_reports_cn,
start_pos,
end_pos - start_pos) AS direct_report
FROM Managers
WHERE end_pos > 0;
Erland Sommarskog has a very good article for arrays and lists that covers
various techniques and analysis:
http://www.sommarskog.se/arrays-in-sql.html
HTH,
Plamen Ratchev
http://www.SQLStudio.com
Re: Parse field into multiple rows
am 27.06.2007 17:15:58 von rcamarda
Thanks for the link to Erland's article's, it is very handy.
I tried your suggestion but I am confused by the WITH.
I tried
SELECT * FROM F_ADS_MANAGERS
WITH Managers
and I got an error. Haven't seen this before so I am not sure how to
trouble shoot.
Error I received:
Msg 336, Level 15, State 1, Line 2
Incorrect syntax near 'Managers'. If this is intended to be a common
table expression, you need to explicitly terminate the previous
statement with a semi-colon.
On Jun 27, 10:01 am, "Plamen Ratchev" wrote:
> Here is one method to split the list using a recursive CTE (this assumes SQL
> Server 2005). It is not the fastest, but with a small size list to split it
> should be OK.
>
> WITH Managers
> (manager_key, direct_reports_cn, start_pos, end_pos)
> AS
> (
> SELECT manager_key,
> direct_reports_cn + '|',
> 1,
> CHARINDEX('|', direct_reports_cn + '|')
> FROM F_ADS_MANAGERS
> UNION ALL
> SELECT manager_key,
> direct_reports_cn,
> end_pos + 1,
> CHARINDEX('|', direct_reports_cn, end_pos + 1)
> FROM Managers
> WHERE CHARINDEX('|', direct_reports_cn, end_pos + 1) > 0
> )
> SELECT manager_key,
> SUBSTRING(direct_reports_cn,
> start_pos,
> end_pos - start_pos) AS direct_report
> FROM Managers
> WHERE end_pos > 0;
>
> Erland Sommarskog has a very good article for arrays and lists that covers
> various techniques and analysis:http://www.sommarskog.se/arrays-in-sql.html
>
> HTH,
>
> Plamen Ratchevhttp://www.SQLStudio.com
Re: Parse field into multiple rows
am 27.06.2007 19:30:34 von Plamen Ratchev
Here are a few notes that will help you understand the common table
expressions and why you get the error:
- It is required to terminate the statement before the WITH keyword defining
the common table expression with a semicolon. It is because the WITH keyword
has other uses. In your case if you end the select statement in the line
before WITH using ; it will work.
- I am not sure if you understand how CTEs work. CTEs are not materialized
and work pretty much like derived tables (with more functionality, like the
recursion used in this method). They only get defined before the statement
the uses them, so you can reference the CTE only in that statement. There is
a lot more to it, and if you decide to use this method I would suggest to
read more on CTEs in SQL Server Books OnLine.
- If you need to materialize the result of the CTE, you can use INSERT to
insert the data into a temporary table, or just define a view based on the
CTE (or you could also create a function based on the CTE).
Here is a complete example using the approach to define a view based on the
CTE:
CREATE TABLE F_ADS_MANAGERS
(
MANAGER_KEY VARCHAR(255) NULL,
DIRECT_REPORTS_CN VARCHAR(255) NULL
);
INSERT INTO F_ADS_MANAGERS (MANAGER_KEY, DIRECT_REPORTS_CN)
VALUES ('CN=Marilette',
'CN=Robert D,OU=TechnologyGroup,DC=strayer,DC=edu|
CN=Robert Camarda,OU=TechnologyGroup,DC=strayer,DC=edu|
CN=Michelle C,OU=TechnologyGroup,DC=strayer,DC=edu|
CN=Magnolia B,OU=TechnologyGroup,DC=strayer,DC=edu|
CN=Lee K,OU=TechnologyGroup');
GO
CREATE VIEW ManagersWithReports
(manager_key, direct_report)
AS
WITH Managers
(manager_key, direct_reports_cn, start_pos, end_pos)
AS
(
SELECT manager_key,
direct_reports_cn + '|',
1,
CHARINDEX('|', direct_reports_cn + '|')
FROM F_ADS_MANAGERS
UNION ALL
SELECT manager_key,
direct_reports_cn,
end_pos + 1,
CHARINDEX('|', direct_reports_cn, end_pos + 1)
FROM Managers
WHERE CHARINDEX('|', direct_reports_cn, end_pos + 1) > 0
)
SELECT manager_key,
SUBSTRING(direct_reports_cn,
start_pos,
end_pos - start_pos) AS direct_report
FROM Managers
WHERE end_pos > 0;
GO
SELECT manager_key,
direct_report
FROM ManagersWithReports;
GO
DROP VIEW ManagersWithReports;
DROP TABLE F_ADS_MANAGERS;
GO
HTH,
Plamen Ratchev
http://www.SQLStudio.com