Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.
Hello,
I am working for a cheminformatics research lab and we are currently trying to automate some of our data entry. Eventually we will most likely move on to using R or SQL, but for right now we are using Excel Power Query to query various REST APIs.
I am querying CrossRef API to get citation information on articles based on a Title. When navigating between steps, I have to wait the full time of the initial query to even preview the data, and likewise for making any new additional steps. There is currently an error (which is not the subject of the post but advice would be appreciated) that is not allowing all the data to be queried, but it is taking so long to even find the step to correct it. I'll put more context/solutions we have tried below, but here is the M code:
let
Source = temp_trimmedRawData,
#"Added Custom" = Table.AddColumn(Source, "Custom", each Function.InvokeAfter(
()=>getCrossRefDOI([Publication Title]),#duration(0,0,0,1))),
#"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"message"}, {"Custom.message"}),
#"Expanded Custom.message" = Table.ExpandRecordColumn(#"Expanded Custom", "Custom.message", {"items"}, {"Custom.message.items"}),
#"Expanded Custom.message.items" = Table.ExpandListColumn(#"Expanded Custom.message", "Custom.message.items"),
#"Expanded Custom.message.items1" = Table.ExpandRecordColumn(#"Expanded Custom.message.items", "Custom.message.items", {"title", "container-title", "DOI", "ISSN"}, {"title", "container-title", "DOI", "ISSN"}),
#"Expanded title" = Table.ExpandListColumn(#"Expanded Custom.message.items1", "title"),
#"Expanded container-title" = Table.ExpandListColumn(#"Expanded title", "container-title"),
#"Expanded ISSN" = Table.ExpandListColumn(#"Expanded container-title", "ISSN"),
#"Reordered Columns" = Table.ReorderColumns(#"Expanded ISSN",{"Index", "title", "container-title", "DOI", "ISSN", "Publication Title", "Journal", "Publication Volume", "Publication Issue", "Publication Year", "Publication Authors", "Publication #(lf)DOI"}),
#"Removed Duplicates" = Table.Distinct(#"Reordered Columns", {"Index"}),
#"Replaced Value" = Table.ReplaceValue(#"Removed Duplicates",null,"",Replacer.ReplaceValue,{"Index", "title", "container-title", "DOI", "ISSN", "Publication Title", "Journal", "Publication Volume", "Publication Issue", "Publication Year", "Publication Authors", "Publication #(lf)DOI"})
in
#"Replaced Value"
The custom function, getCrossRef API is below. I have censored some sensitive information:
let
fxGetCrossRefDOI = (Title) =>
let
PRE = Text.Combine({baseURL,endpt},"/"), // equiv to baseURL & "/" & endpt
baseURL = "https://api.crossref.org",
endpt = "works",
POST = Text.Combine({rows,sort,query,elements,filter},"&"),
// mailto = "mailto=" & email, -> normally 'mailto' is combined in 'POST'
// email = "email@someuniversity.edu",
rows = "rows=1",
sort = "sort=relevance",
query = Text.Combine({query_type,query_value},"="),
query_type = "query.bibliographic",
query_value = title,
title = Uri.EscapeDataString(Title),
elements = "select=title,container-title,DOI,ISSN",
filter = "filter=" & filter_name & ":" & filter_value,
filter_name = "type",
filter_value = "journal-article",
URL = Text.Combine({PRE,POST},"?")
in
Json.Document(Text.FromBinary((Web.Contents(URL))))
in
fxGetCrossRefDOI
I understand batching information is more efficient than delays, which we have done for other APIs, but this one does not allow multiple queries in one request/URL. All privacy levels in each Power Query query are set to anonymous/none.
I want to stop the row by row loading once all the data is loaded. All refresh settings are disabled.
This might not be possible, but just wanted to do one last thing before we just move on to using another language. Thank you!
Solved! Go to Solution.
Hi @pmershon ,
Based on your description and the code provided, since each API call is very time-consuming, please minimize the number of calls by batching requests whenever possible. Although the CrossRef API does not support multiple queries in a single request, you can still optimize by reducing unnecessary steps in your queries.
You can try creating a custom function
let
fxGetCrossRefDOI = (Title) =>
let
baseURL = "https://api.crossref.org/works",
query = Uri.EscapeDataString(Title),
URL = baseURL & "?rows=1&sort=relevance&query.bibliographic=" & query & "&select=title,container-title,DOI,ISSN&filter=type:journal-article",
response = Json.Document(Web.Contents(URL))
in
response
in
fxGetCrossRefDOI
Of course, you can also use Table.Buffer, which helps to minimize the number of times you have to read data from the source. You can use load data into memory, which may speed up subsequent steps
Table.Buffer - PowerQuery M | Microsoft Learn
You can also refer to this document to learn how to use the Rest API
TripPin 2 - REST - Power Query | Microsoft Learn
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Hi @pmershon ,
Based on your description and the code provided, since each API call is very time-consuming, please minimize the number of calls by batching requests whenever possible. Although the CrossRef API does not support multiple queries in a single request, you can still optimize by reducing unnecessary steps in your queries.
You can try creating a custom function
let
fxGetCrossRefDOI = (Title) =>
let
baseURL = "https://api.crossref.org/works",
query = Uri.EscapeDataString(Title),
URL = baseURL & "?rows=1&sort=relevance&query.bibliographic=" & query & "&select=title,container-title,DOI,ISSN&filter=type:journal-article",
response = Json.Document(Web.Contents(URL))
in
response
in
fxGetCrossRefDOI
Of course, you can also use Table.Buffer, which helps to minimize the number of times you have to read data from the source. You can use load data into memory, which may speed up subsequent steps
Table.Buffer - PowerQuery M | Microsoft Learn
You can also refer to this document to learn how to use the Rest API
TripPin 2 - REST - Power Query | Microsoft Learn
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly