Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
markive
Advocate II
Advocate II

Custom RESTful API Paging - PowerBI making multiple requests to the same page?!? Causing duplicates

Hi,

 

I'm consuming an API which uses paging. I make the first request to page one with a rpp (records per page) of 1000. The first request tells me the count of pages as part of the JSON response. I then build a M list from 1 > Pages. The M code then uses the list to create a Custom Column that is Expanded to a Table.

 

This all works, but I was getting complaints of dupicate data when trying to create the relationships. It seems I have hundreds of repeated rows of data.

 

Here is my M code:

 

let

	//get page count
    GetAllPages = Json.Document(Web.Contents((APISettings[ProjectURL]{0}) & "/formchecks?key=" & (APISettings[APIKey]{0}) & "&rpp=1000&page=1&OrderBy=FCID")),
    #"Converted to Table" = Record.ToTable(GetAllPages),
    #"Changed Type" = Table.TransformColumnTypes(#"Converted to Table",{{"Value", type text}}),
    PageCount = #"Converted to Table"{4}[Value],
    myPageList= List.Generate(()=>1, each _ <= (PageCount), each _ + 1),
    #"PageListTable" = Table.FromList(myPageList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"PageListTable",{{"Column1", "Pages"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Pages", type text}}),

    myFormChecks = (page as text) as table =>
      let
    	Source = Json.Document(Web.Contents((APISettings[ProjectURL]{0}) & "/formchecks?key=" & (APISettings[APIKey]{0}) & "&rpp=1000&OrderBy=FCID&page=" & (page) )),
    	#"Converted to Table" = Record.ToTable(Source),
    	Value1 = #"Converted to Table"{0}[Value],
    	#"Converted to Table1" = Table.FromList(Value1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    	#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table1", "Column1", {"FCID", "SCID", "CheckCode", "FormCode", "Categ", "Description", "srt", "Result", "ResultText", "ResultInt", "ResultDec", "ResultImg", "FIAID", "UserID", "DateStamp", "EmailUserID", "AnsType", "Mandatory", "EmailUser", "MultiChoice", "SCGUID", "PageNo", "DateSync", "ListCode", "LinkCheckCode", "FCGUID", "AutoSnagged", "IsCompleted", "UserRoleID", "Role"}, {"Column1.FCID", "Column1.SCID", "Column1.CheckCode", "Column1.FormCode", "Column1.Categ", "Column1.Description", "Column1.srt", "Column1.Result", "Column1.ResultText", "Column1.ResultInt", "Column1.ResultDec", "Column1.ResultImg", "Column1.FIAID", "Column1.UserID", "Column1.DateStamp", "Column1.EmailUserID", "Column1.AnsType", "Column1.Mandatory", "Column1.EmailUser", "Column1.MultiChoice", "Column1.SCGUID", "Column1.PageNo", "Column1.DateSync", "Column1.ListCode", "Column1.LinkCheckCode", "Column1.FCGUID", "Column1.AutoSnagged", "Column1.IsCompleted", "Column1.UserRoleID", "Column1.Role"})
     in
    #"Expanded Column1",

	//loop through each page
    #"Added Custom" = Table.AddColumn(#"Changed Type1", "Custom", each myFormChecks([Pages])),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1.FCID", "Column1.SCID", "Column1.CheckCode", "Column1.FormCode", "Column1.Categ", "Column1.Description", "Column1.srt", "Column1.Result", "Column1.ResultText", "Column1.ResultInt", "Column1.ResultDec", "Column1.ResultImg", "Column1.FIAID", "Column1.UserID", "Column1.DateStamp", "Column1.EmailUserID", "Column1.AnsType", "Column1.Mandatory", "Column1.EmailUser", "Column1.MultiChoice", "Column1.SCGUID", "Column1.PageNo", "Column1.DateSync", "Column1.ListCode", "Column1.LinkCheckCode", "Column1.FCGUID", "Column1.AutoSnagged", "Column1.IsCompleted", "Column1.UserRoleID", "Column1.Role"}, {"Column1.FCID", "Column1.SCID", "Column1.CheckCode", "Column1.FormCode", "Column1.Categ", "Column1.Description", "Column1.srt", "Column1.Result", "Column1.ResultText", "Column1.ResultInt", "Column1.ResultDec", "Column1.ResultImg", "Column1.FIAID", "Column1.UserID", "Column1.DateStamp", "Column1.EmailUserID", "Column1.AnsType", "Column1.Mandatory", "Column1.EmailUser", "Column1.MultiChoice", "Column1.SCGUID", "Column1.PageNo", "Column1.DateSync", "Column1.ListCode", "Column1.LinkCheckCode", "Column1.FCGUID", "Column1.AutoSnagged", "Column1.IsCompleted", "Column1.UserRoleID", "Column1.Role"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded Custom",{{"Column1.FCID", "FCID"}, {"Column1.SCID", "SCID"}, {"Column1.CheckCode", "CheckCode"}, {"Column1.FormCode", "FormCode"}, {"Column1.Categ", "Categ"}, {"Column1.Description", "Description"}, {"Column1.srt", "srt"}, {"Column1.Result", "Result"}, {"Column1.ResultText", "ResultText"}, {"Column1.ResultInt", "ResultInt"}, {"Column1.ResultDec", "ResultDec"}, {"Column1.ResultImg", "ResultImg"}, {"Column1.FIAID", "FIAID"}, {"Column1.UserID", "UserID"}, {"Column1.DateStamp", "DateStamp"}, {"Column1.EmailUserID", "EmailUserID"}, {"Column1.AnsType", "AnsType"}, {"Column1.Mandatory", "Mandatory"}, {"Column1.EmailUser", "EmailUser"}, {"Column1.MultiChoice", "MultiChoice"}, {"Column1.SCGUID", "SCGUID"}, {"Column1.PageNo", "PageNo"}, {"Column1.DateSync", "DateSync"}, {"Column1.ListCode", "ListCode"}, {"Column1.LinkCheckCode", "LinkCheckCode"}, {"Column1.FCGUID", "FCGUID"}, {"Column1.AutoSnagged", "AutoSnagged"}, {"Column1.IsCompleted", "IsCompleted"}, {"Column1.UserRoleID", "UserRoleID"}, {"Column1.Role", "Role"}}),
    #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns1",{"Pages"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns",{{"SCID", Int64.Type}, {"FCID", Int64.Type}, {"Result", Int64.Type}, {"ResultInt", Int64.Type}, {"FIAID", Int64.Type}, {"UserID", Int64.Type}, {"EmailUserID", Int64.Type}, {"PageNo", Int64.Type}, {"UserRoleID", Int64.Type}, {"Role", Int64.Type}, {"IsCompleted", type logical}, {"AutoSnagged", type logical}, {"MultiChoice", type logical}, {"EmailUser", type logical}, {"Mandatory", type logical}, {"DateStamp", type datetime}, {"DateSync", type datetime}})
in
    #"Changed Type2"

The top section makes the first request and gets the MaxPages, then creates a list from that.

 

each myFormChecks([Pages]) is what calls the function repeatedly with the new page.

 

In Fiddler on the server I can see though that multiple requests to the page are sometimes being made:

 

 

Can anyone see where I am going wrong?

1 ACCEPTED SOLUTION
pqian
Microsoft Employee
Microsoft Employee

@markive Looking at your query there isn't anything that strikes me as wrong. Maybe the duplication is in the server side due to page faults?

 

It's actually quite normal for PBI to make more than one requests with the same page ID. They maybe used to address different steps since the expansion and etc., isn't foldable to to the service. 

 

One thing you can do is reduce the number of pages and the page size, see if that makes a difference. Simplify the query to eliminate sources of error. If the duplication is indeed server side, then you can add a Remove Duplicate step in the end.

 

Obvious I don't have access to your data source so I simulated the behavior using OData. For reference, here's a Query that pulls data from http://services.odata.org/v3/northwind/northwind.svc/Orders using JSON document

 

let
	NumberOfPages = 10,
	GetPage = (pageId) => let
		Source = Json.Document(Web.Contents("http://services.odata.org/v3/northwind/northwind.svc/Orders?$format=json&$top=10&$skip=" & Text.From(pageId * 10))),
		value = Source[value],
		#"Converted to Table" = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
		#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"OrderID", "CustomerID", "EmployeeID", "OrderDate", "RequiredDate", "ShippedDate", "ShipVia", "Freight", "ShipName", "ShipAddress", "ShipCity", "ShipRegion", "ShipPostalCode", "ShipCountry"}, {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"})
	in
		#"Expanded Column1",

	Pages = {1..NumberOfPages},
    #"Converted to Table" = Table.FromList(Pages, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each GetPage([Column1])),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"}, {"Custom.Column1.OrderID", "Custom.Column1.CustomerID", "Custom.Column1.EmployeeID", "Custom.Column1.OrderDate", "Custom.Column1.RequiredDate", "Custom.Column1.ShippedDate", "Custom.Column1.ShipVia", "Custom.Column1.Freight", "Custom.Column1.ShipName", "Custom.Column1.ShipAddress", "Custom.Column1.ShipCity", "Custom.Column1.ShipRegion", "Custom.Column1.ShipPostalCode", "Custom.Column1.ShipCountry"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Column1"})
in
    #"Removed Columns"

View solution in original post

9 REPLIES 9
markive
Advocate II
Advocate II

 
pqian
Microsoft Employee
Microsoft Employee

@markive Looking at your query there isn't anything that strikes me as wrong. Maybe the duplication is in the server side due to page faults?

 

It's actually quite normal for PBI to make more than one requests with the same page ID. They maybe used to address different steps since the expansion and etc., isn't foldable to to the service. 

 

One thing you can do is reduce the number of pages and the page size, see if that makes a difference. Simplify the query to eliminate sources of error. If the duplication is indeed server side, then you can add a Remove Duplicate step in the end.

 

Obvious I don't have access to your data source so I simulated the behavior using OData. For reference, here's a Query that pulls data from http://services.odata.org/v3/northwind/northwind.svc/Orders using JSON document

 

let
	NumberOfPages = 10,
	GetPage = (pageId) => let
		Source = Json.Document(Web.Contents("http://services.odata.org/v3/northwind/northwind.svc/Orders?$format=json&$top=10&$skip=" & Text.From(pageId * 10))),
		value = Source[value],
		#"Converted to Table" = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
		#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"OrderID", "CustomerID", "EmployeeID", "OrderDate", "RequiredDate", "ShippedDate", "ShipVia", "Freight", "ShipName", "ShipAddress", "ShipCity", "ShipRegion", "ShipPostalCode", "ShipCountry"}, {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"})
	in
		#"Expanded Column1",

	Pages = {1..NumberOfPages},
    #"Converted to Table" = Table.FromList(Pages, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each GetPage([Column1])),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1.OrderID", "Column1.CustomerID", "Column1.EmployeeID", "Column1.OrderDate", "Column1.RequiredDate", "Column1.ShippedDate", "Column1.ShipVia", "Column1.Freight", "Column1.ShipName", "Column1.ShipAddress", "Column1.ShipCity", "Column1.ShipRegion", "Column1.ShipPostalCode", "Column1.ShipCountry"}, {"Custom.Column1.OrderID", "Custom.Column1.CustomerID", "Custom.Column1.EmployeeID", "Custom.Column1.OrderDate", "Custom.Column1.RequiredDate", "Custom.Column1.ShippedDate", "Custom.Column1.ShipVia", "Custom.Column1.Freight", "Custom.Column1.ShipName", "Custom.Column1.ShipAddress", "Custom.Column1.ShipCity", "Custom.Column1.ShipRegion", "Custom.Column1.ShipPostalCode", "Custom.Column1.ShipCountry"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Column1"})
in
    #"Removed Columns"

3.JPG1.JPG0.JPGI have followed the examples, but I can't make it work. This is the page


It is a website where the plots of Andalusia appear and the GPS location

 

http://www.geoportalagriculturaypesca.es/geoide/sigpac/wfs?service=WFS&STARTINDEX=500&request=GetFea...

 


Up to only 500 of 2.000.000 milion of records appear and you have to increase the "Startindex" to 1000, 1500, etc.

 

can someone help me?

 

 
 
 
bryw
Microsoft Employee
Microsoft Employee

 Are you aware whether or not the Dynamic variables for Paging will break the option to use Schedule Refresh on your Dataset(s).

 

We are receiving the following error:

You can't schedule refresh for this dataset because one or more sources currently don't support refresh.

let
	NumberOfPages = 10,
	GetPage = (pageId) => let
		Source = Json.Document(Web.Contents(API))),
		value = Source[value],

 

kbrouder
Frequent Visitor

@bryw did you ever figre out the refresh issue? I am running into the same thing

Actually I did figure it out, if you use the 'Query' trick with Web.Contents instead of actually changing the base URL, it appears to work with refresh in the Power BI Service. 

Anonymous
Not applicable

great you already figured it out!

anyway, there is a working example here in this other post.

https://community.powerbi.com/t5/Service/workaround-for-SCHEDULED-REFRESH-on-HTTP-API-with-pages/td-...

Anonymous
Not applicable

indeed, I am getting the same refresh error using another API with page function inside Json.Document(Web.Contents(page_function_here)).

what is the correct work around? i am new to M language...

Hi @pqian

 

You are right! Thanks for your response. I tried to reply to my own thread yesterday and for some reason it wouldn't let me.

 

Sometimes the page order requested is random and also repeats probably due to some asynchroniousy but it does sort itself out in the end. The API I was targetting did indeed have an issue.

 

I didn't find too much information on paging so hopefully our two examples will help others trying to do the same as it must be quite common.

 

 

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

AugPowerBI_Carousel

Power BI Monthly Update - August 2024

Check out the August 2024 Power BI update to learn about new features.

August Carousel

Fabric Community Update - August 2024

Find out what's new and trending in the Fabric Community.