March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
I have a table with almost 700 distinct email addresses which I need to check against Active Directory forrest. Importing the data from Active Directory gives me like more than 20K records, so I don't want to go that way. Is there another way I can accomplish this?
Thanks,
Sander
Solved! Go to Solution.
For the fun of it try this code and see what it brings. The AD connector does seem to do some sort of caching so it might not be all bad.
let
Source = #table({"email"},{{"valid.email@address.com"},{"invalid.email@address.com"}}),
CheckAD = (email) => let
Source = Table.SelectColumns(ActiveDirectory.Domains("domain.net"){[Domain="domain.net"]}[#"Object Categories"]{[Category="organizationalPerson"]}[Objects],{"distinguishedName"})
in
Table.RowCount(Table.SelectRows(Source, each Text.Contains([distinguishedName], email))),
#"Invoked Custom Function" = Table.AddColumn(Source, "CheckAD", each CheckAD([email]))
in
#"Invoked Custom Function"
No, not too slow, but not allowed in the company. I was thinking about checking each individual address (about 700 addresses). How can that be accomplished?
When you say "not allowed" do you mean there is a business rule? Because technically nobody can prevent you from reading the AD. That would defeat its entire purpose.
Try the AD connector in Power BI, limit it to just the columns you need, and then do a left join from your list to the AD query. That will tell you which of your addresses are not valid etc.
You're completely right: technically it is possible to do this download. But not allowed because of strict business rules on several topics.
Thanks for the hint. The AD connector I already tried; I want to limit the records downloaded from AD. I was thinking about something like:
"for each email-address-to-validate ask AD if it exists in the forrest"
But don't know how to accomplish this or if it is even possible.
"for each email-address-to-validate ask AD if it exists in the forrest"
That would make it even worse, much worse. It would mean that for each email address in your list you would fetch the entire AD, filter it, throw away 99.995% of the data, and then do the same all over again for the other 700 addresses. This will be atrociously slow.
let
Source = ActiveDirectory.Domains("xxx.yyy"),
xxx.yyy = Source{[Domain="xxx.yyy"]}[#"Object Categories"],
organizationalPerson1 = xxx.yyy{[Category="organizationalPerson"]}[Objects],
#"Removed Other Columns" = Table.SelectColumns(organizationalPerson1,{"distinguishedName"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each Text.Contains([distinguishedName], "sample@email.com"))
in
#"Filtered Rows"
You can't even optimize that with Table.Buffer if you do it that way.
Talk to your AD team and discuss options.
Getting the data and throwing it away again and repeating that 700 times was definitely not what I had in mind. Something like AD returning a TRUE or FALSE, but I don't know if that's possible so I ask this forum for help.
Thanks for your help. I'll test the query later today and let you know.
For the fun of it try this code and see what it brings. The AD connector does seem to do some sort of caching so it might not be all bad.
let
Source = #table({"email"},{{"valid.email@address.com"},{"invalid.email@address.com"}}),
CheckAD = (email) => let
Source = Table.SelectColumns(ActiveDirectory.Domains("domain.net"){[Domain="domain.net"]}[#"Object Categories"]{[Category="organizationalPerson"]}[Objects],{"distinguishedName"})
in
Table.RowCount(Table.SelectRows(Source, each Text.Contains([distinguishedName], email))),
#"Invoked Custom Function" = Table.AddColumn(Source, "CheckAD", each CheckAD([email]))
in
#"Invoked Custom Function"
Sorry, it took a bit more time then expected.
This is not really the solution I had in mind. It seems to download everything from AD (under organizationalPerson). It will work, but as you said from performance point of view it does not.
Previously I said something about 20K records... well I let the query run a bit longer, but stopped it at 500K. I guess I have to search for another way to solve this.
Thanks for the help so far!
Have you talked to your AD team? Surely your company has some sort of scheduled AD extract?
Not yet, but I will. Thanks for the help. I'll accept your hint as solution.
" gives me like more than 20K records, so I don't want to go that way"
Why not? Is it too slow?
Do you have another source that pulls the data from AD on a regular basis?
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.