Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
Hi,
I'm consuming an API which uses paging. I make the first request to page one with a rpp (records per page) of 1000. The first request tells me the count of pages as part of the JSON response. I then build a M list from 1 > Pages. The M code then uses the list to create a Custom Column that is Expanded to a Table.
This all works, but I was getting complaints of dupicate data when trying to create the relationships. It seems I have hundreds of repeated rows of data.
Here is my M code:
let //get page count GetAllPages = Json.Document(Web.Contents((APISettings[ProjectURL]{0}) & "/formchecks?key=" & (APISettings[APIKey]{0}) & "&rpp=1000&page=1&OrderBy=FCID")), #"Converted to Table" = Record.ToTable(GetAllPages), #"Changed Type" = Table.TransformColumnTypes(#"Converted to Table",{{"Value", type text}}), PageCount = #"Converted to Table"{4}[Value], myPageList= List.Generate(()=>1, each _ <= (PageCount), each _ + 1), #"PageListTable" = Table.FromList(myPageList, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Renamed Columns" = Table.RenameColumns(#"PageListTable",{{"Column1", "Pages"}}), #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Pages", type text}}), myFormChecks = (page as text) as table => let Source = Json.Document(Web.Contents((APISettings[ProjectURL]{0}) & "/formchecks?key=" & (APISettings[APIKey]{0}) & "&rpp=1000&OrderBy=FCID&page=" & (page) )), #"Converted to Table" = Record.ToTable(Source), Value1 = #"Converted to Table"{0}[Value], #"Converted to Table1" = Table.FromList(Value1, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table1", "Column1", {"FCID", "SCID", "CheckCode", "FormCode", "Categ", "Description", "srt", "Result", "ResultText", "ResultInt", "ResultDec", "ResultImg", "FIAID", "UserID", "DateStamp", "EmailUserID", "AnsType", "Mandatory", "EmailUser", "MultiChoice", "SCGUID", "PageNo", "DateSync", "ListCode", "LinkCheckCode", "FCGUID", "AutoSnagged", "IsCompleted", "UserRoleID", "Role"}, {"Column1.FCID", "Column1.SCID", "Column1.CheckCode", "Column1.FormCode", "Column1.Categ", "Column1.Description", "Column1.srt", "Column1.Result", "Column1.ResultText", "Column1.ResultInt", "Column1.ResultDec", "Column1.ResultImg", "Column1.FIAID", "Column1.UserID", "Column1.DateStamp", "Column1.EmailUserID", "Column1.AnsType", "Column1.Mandatory", "Column1.EmailUser", "Column1.MultiChoice", "Column1.SCGUID", "Column1.PageNo", "Column1.DateSync", "Column1.ListCode", "Column1.LinkCheckCode", "Column1.FCGUID", "Column1.AutoSnagged", "Column1.IsCompleted", "Column1.UserRoleID", "Column1.Role"}) in #"Expanded Column1", //loop through each page #"Added Custom" = Table.AddColumn(#"Changed Type1", "Custom", each myFormChecks([Pages])), #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1.FCID", "Column1.SCID", "Column1.CheckCode", "Column1.FormCode", "Column1.Categ", "Column1.Description", "Column1.srt", "Column1.Result", "Column1.ResultText", "Column1.ResultInt", "Column1.ResultDec", "Column1.ResultImg", "Column1.FIAID", "Column1.UserID", "Column1.DateStamp", "Column1.EmailUserID", "Column1.AnsType", "Column1.Mandatory", "Column1.EmailUser", "Column1.MultiChoice", "Column1.SCGUID", "Column1.PageNo", "Column1.DateSync", "Column1.ListCode", "Column1.LinkCheckCode", "Column1.FCGUID", "Column1.AutoSnagged", "Column1.IsCompleted", "Column1.UserRoleID", "Column1.Role"}, {"Column1.FCID", "Column1.SCID", "Column1.CheckCode", "Column1.FormCode", "Column1.Categ", "Column1.Description", "Column1.srt", "Column1.Result", "Column1.ResultText", "Column1.ResultInt", "Column1.ResultDec", "Column1.ResultImg", "Column1.FIAID", "Column1.UserID", "Column1.DateStamp", "Column1.EmailUserID", "Column1.AnsType", "Column1.Mandatory", "Column1.EmailUser", "Column1.MultiChoice", "Column1.SCGUID", "Column1.PageNo", "Column1.DateSync", "Column1.ListCode", "Column1.LinkCheckCode", "Column1.FCGUID", "Column1.AutoSnagged", "Column1.IsCompleted", "Column1.UserRoleID", "Column1.Role"}), #"Renamed Columns1" = Table.RenameColumns(#"Expanded Custom",{{"Column1.FCID", "FCID"}, {"Column1.SCID", "SCID"}, {"Column1.CheckCode", "CheckCode"}, {"Column1.FormCode", "FormCode"}, {"Column1.Categ", "Categ"}, {"Column1.Description", "Description"}, {"Column1.srt", "srt"}, {"Column1.Result", "Result"}, {"Column1.ResultText", "ResultText"}, {"Column1.ResultInt", "ResultInt"}, {"Column1.ResultDec", "ResultDec"}, {"Column1.ResultImg", "ResultImg"}, {"Column1.FIAID", "FIAID"}, {"Column1.UserID", "UserID"}, {"Column1.DateStamp", "DateStamp"}, {"Column1.EmailUserID", "EmailUserID"}, {"Column1.AnsType", "AnsType"}, {"Column1.Mandatory", "Mandatory"}, {"Column1.EmailUser", "EmailUser"}, {"Column1.MultiChoice", "MultiChoice"}, {"Column1.SCGUID", "SCGUID"}, {"Column1.PageNo", "PageNo"}, {"Column1.DateSync", "DateSync"}, {"Column1.ListCode", "ListCode"}, {"Column1.LinkCheckCode", "LinkCheckCode"}, {"Column1.FCGUID", "FCGUID"}, {"Column1.AutoSnagged", "AutoSnagged"}, {"Column1.IsCompleted", "IsCompleted"}, {"Column1.UserRoleID", "UserRoleID"}, {"Column1.Role", "Role"}}), #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns1",{"Pages"}), #"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns",{{"SCID", Int64.Type}, {"FCID", Int64.Type}, {"Result", Int64.Type}, {"ResultInt", Int64.Type}, {"FIAID", Int64.Type}, {"UserID", Int64.Type}, {"EmailUserID", Int64.Type}, {"PageNo", Int64.Type}, {"UserRoleID", Int64.Type}, {"Role", Int64.Type}, {"IsCompleted", type logical}, {"AutoSnagged", type logical}, {"MultiChoice", type logical}, {"EmailUser", type logical}, {"Mandatory", type logical}, {"DateStamp", type datetime}, {"DateSync", type datetime}}) in #"Changed Type2"
The top section makes the first request and gets the MaxPages, then creates a list from that.
each myFormChecks([Pages]) is what calls the function repeatedly with the new page.
In Fiddler on the server I can see though that multiple requests to the page are sometimes being made:
Can anyone see where I am going wrong?
Solved! Go to Solution.
@markive Looking at your query there isn't anything that strikes me as wrong. Maybe the duplication is in the server side due to page faults?
It's actually quite normal for PBI to make more than one requests with the same page ID. They maybe used to address different steps since the expansion and etc., isn't foldable to to the service.
One thing you can do is reduce the number of pages and the page size, see if that makes a difference. Simplify the query to eliminate sources of error. If the duplication is indeed server side, then you can add a Remove Duplicate step in the end.
Obvious I don't have access to your data source so I simulated the behavior using OData. For reference, here's a Query that pulls data from http://services.odata.org/v3/northwind/northwind.svc/Orders using JSON document
let NumberOfPages = 10, GetPage = (pageId) => let Source = Json.Document(Web.Contents("http://services.odata.org/v3/northwind/northwind.svc/Orders?$format=json&$top=10&$skip=" & Text.From(pageId * 10))), value = Source[value], #"Converted to Table" = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"OrderID", "CustomerID", "EmployeeID", "OrderDate", "RequiredDate", "ShippedDate", "ShipVia", "Freight", "ShipName", "ShipAddress", "ShipCity", "ShipRegion", "ShipPostalCode", "ShipCountry"}, {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"}) in #"Expanded Column1", Pages = {1..NumberOfPages}, #"Converted to Table" = Table.FromList(Pages, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each GetPage([Column1])), #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"}, {"Custom.Column1.OrderID", "Custom.Column1.CustomerID", "Custom.Column1.EmployeeID", "Custom.Column1.OrderDate", "Custom.Column1.RequiredDate", "Custom.Column1.ShippedDate", "Custom.Column1.ShipVia", "Custom.Column1.Freight", "Custom.Column1.ShipName", "Custom.Column1.ShipAddress", "Custom.Column1.ShipCity", "Custom.Column1.ShipRegion", "Custom.Column1.ShipPostalCode", "Custom.Column1.ShipCountry"}), #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Column1"}) in #"Removed Columns"
@markive Looking at your query there isn't anything that strikes me as wrong. Maybe the duplication is in the server side due to page faults?
It's actually quite normal for PBI to make more than one requests with the same page ID. They maybe used to address different steps since the expansion and etc., isn't foldable to to the service.
One thing you can do is reduce the number of pages and the page size, see if that makes a difference. Simplify the query to eliminate sources of error. If the duplication is indeed server side, then you can add a Remove Duplicate step in the end.
Obvious I don't have access to your data source so I simulated the behavior using OData. For reference, here's a Query that pulls data from http://services.odata.org/v3/northwind/northwind.svc/Orders using JSON document
let NumberOfPages = 10, GetPage = (pageId) => let Source = Json.Document(Web.Contents("http://services.odata.org/v3/northwind/northwind.svc/Orders?$format=json&$top=10&$skip=" & Text.From(pageId * 10))), value = Source[value], #"Converted to Table" = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"OrderID", "CustomerID", "EmployeeID", "OrderDate", "RequiredDate", "ShippedDate", "ShipVia", "Freight", "ShipName", "ShipAddress", "ShipCity", "ShipRegion", "ShipPostalCode", "ShipCountry"}, {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"}) in #"Expanded Column1", Pages = {1..NumberOfPages}, #"Converted to Table" = Table.FromList(Pages, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each GetPage([Column1])), #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"}, {"Custom.Column1.OrderID", "Custom.Column1.CustomerID", "Custom.Column1.EmployeeID", "Custom.Column1.OrderDate", "Custom.Column1.RequiredDate", "Custom.Column1.ShippedDate", "Custom.Column1.ShipVia", "Custom.Column1.Freight", "Custom.Column1.ShipName", "Custom.Column1.ShipAddress", "Custom.Column1.ShipCity", "Custom.Column1.ShipRegion", "Custom.Column1.ShipPostalCode", "Custom.Column1.ShipCountry"}), #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Column1"}) in #"Removed Columns"
I have followed the examples, but I can't make it work. This is the page
It is a website where the plots of Andalusia appear and the GPS location
Up to only 500 of 2.000.000 milion of records appear and you have to increase the "Startindex" to 1000, 1500, etc.
can someone help me?
Are you aware whether or not the Dynamic variables for Paging will break the option to use Schedule Refresh on your Dataset(s).
We are receiving the following error:
You can't schedule refresh for this dataset because one or more sources currently don't support refresh.
let NumberOfPages = 10, GetPage = (pageId) => let Source = Json.Document(Web.Contents(API))), value = Source[value],
@bryw did you ever figre out the refresh issue? I am running into the same thing
Actually I did figure it out, if you use the 'Query' trick with Web.Contents instead of actually changing the base URL, it appears to work with refresh in the Power BI Service.
great you already figured it out!
anyway, there is a working example here in this other post.
indeed, I am getting the same refresh error using another API with page function inside Json.Document(Web.Contents(page_function_here)).
what is the correct work around? i am new to M language...
Hi @pqian
You are right! Thanks for your response. I tried to reply to my own thread yesterday and for some reason it wouldn't let me.
Sometimes the page order requested is random and also repeats probably due to some asynchroniousy but it does sort itself out in the end. The API I was targetting did indeed have an issue.
I didn't find too much information on paging so hopefully our two examples will help others trying to do the same as it must be quite common.
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
111 | |
96 | |
90 | |
79 | |
67 |
User | Count |
---|---|
153 | |
125 | |
114 | |
111 | |
95 |