Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!Vote for your favorite vizzies from the Power BI Dataviz World Championship submissions. Vote now!
Hi Guys,
I am trying to retrive a hierarchy beween P/N from a Website by scraping data each individual P/N html page.
I have a list of all the P/N that are active with their name and their unique #id on an Excel spreadsheet like the table below :
| P/N - Name | id |
| Cylinder head | 12 |
| Rod (bearing) | 35 |
| liner type B | 27 |
The html pages of each P/N are build like this : https://<my_domain>/<random_name>/PN=id
and on each page I can easily retrive the P/N that are directly linked to this P/N (see function Get_hierarchy)
On the table below, I would have scrapped the P/N Actuator (in green) which would then have given me the 3 sub components (in blue).
| level | P/N - Name |
| 0 | Actuator |
| 1 | Cylinder |
| 1 | Rod |
| 1 | Sealing system |
I would then need to retrive the ID of each sub component to issue the same request for each of those P/N and get the list of P/N that are directly linked to each each sub component. The end table will look a bit like this :
| level | P/N - Name | id |
| 0 | Actuator | 3 |
| 1 | Cylinder | 46 |
| 2 | Cylinder body | 78 |
| 2 | Cylinder head | 12 |
| 1 | Rod | 33 |
| 2 | Rod (bar) | 35 |
| 2 | Rod (bearing) | 40 |
| 1 | Sealing system | 112 |
Everything is working fine with the code below except when I have more that 8 row to scrap on the same level where the Query freezes and I can't get a decent result. Issue being, I have at least 100.000 P/N to scrap on 15 hierarchy levels so I can't do it manualy.
Code on the Hierachy_full table (for the 1st 2 levels but the remaining lignes are similar) :
let
Source = <Top P/N with its ID>
#"Fonction personnalisée appelée" = Table.AddColumn(Source, "N-1", each Get_hierarchy([N.ID])),
#"N-1 développé" = Table.ExpandTableColumn(#"Fonction personnalisée appelée", "N-1", {"P/N"}, {"N-1.P/N"}),
#"Requêtes fusionnées" = Table.NestedJoin(#"N-1 développé", {"N-1.P/N"}, #"P/N actifs", {"P/N"}, "P/N actifs", JoinKind.LeftOuter),
#"P/N actifs développé" = Table.ExpandTableColumn(#"Requêtes fusionnées", "P/N actifs", {"P/N"}, {"N-1.ID"}),
#"Fonction personnalisée appelée1" = Table.AddColumn(#"P/N actifs développé", "N-2", each #"Get_hierarchy"([#"N-1.ID"])),
#"N-2 développé" = Table.ExpandTableColumn(#"Fonction personnalisée appelée1", "N-2", {"P/N"}, {"N-2.P/N"}),
in
#"N-2 développé"
And the code of the function Get_hierarchy :
let
Source = (#"ID" as any) => let
Source = Web.Page(Web.Contents("https://<my_domain>/<random_name>/PN= " & #"id")),
Data = Source{2}[Data],
#"Type modifié" = Table.TransformColumnTypes(Data,{{"Column1", type text}, {"Column2", type text}}),
#"Diviser la colonne selon les transitions de caractères" = Table.SplitColumn(#"Type modifié", "Column2", Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"}), {"Column2.1", "Column2.2"}),
#"Colonnes renommées" = Table.RenameColumns(#"Diviser la colonne selon les transitions de caractères",{{"Column2.1", "P/N"}}),
#"Autres colonnes supprimées" = Table.SelectColumns(#"Colonnes renommées",{"P/N"})
in
#"Autres colonnes supprimées"
in
Source
Does any of you have an idea of what I did wrong or where does the problem could came from ?
And how to fix it of course.
Many thanks,
Romuald.
Is it possible that the web server is blocking your scraping/slowing you down because you cause too much traffic for them? Maybe you should ask them if they would be willing to give you a data extract.
Vote for your favorite vizzies from the Power BI World Championship submissions!
If you love stickers, then you will definitely want to check out our Community Sticker Challenge!
Check out the January 2026 Power BI update to learn about new features.
| User | Count |
|---|---|
| 15 | |
| 11 | |
| 11 | |
| 8 | |
| 6 |