Hello,
I am trying to get a raw file from GitHub using a link like:
https://raw.githubusercontent.com/myCompany/cost-mgmt-repo/BranchName/CostManagement/Dim_ManagementGroups.csv
Instead of the CSV content, I am getting:
version https://git-lfs.github.com/spec/v1
oid sha256:3b70bcd441cd74d876c58f9aef3e25c8822cb8bfe5904ed3b6dcaa344445
size 44476
My code looks like this:
AddColumnWithTable = Table.AddColumn(ExpandedColumn, "ResultTable", each
    let
        a = Csv.Document(
            Web.Contents(paramBaseUrlRawGithub, [
                RelativePath = [RelativePath],
                Query = [token = [Token]]
            ]),
            [Delimiter = ",", QuoteStyle = QuoteStyle.Csv]
        ),
        b = Table.PromoteHeaders(a, [PromoteAllScalars=true])
    in
        b
)
This is paramBaseUrlRawGithub:
https://raw.githubusercontent.com/
How can I get the actual file content?
Best,
Jacek
Hi @jaryszek,
Thank you for reaching out to the Microsoft Fabric Forum Community.
If the issue still persists, I’d recommend raising a support ticket with Microsoft. The support team can look into the backend and provide more in-depth assistance tailored to your environment.
https://learn.microsoft.com/en-us/power-bi/support/create-support-ticket
If this solution helped, please consider marking the response as accepted and giving it a thumbs-up so others can benefit as well.
Best regards,
Prasanna Kumar
Hi @jaryszek ,
Thank you for reaching out to the Microsoft Fabric Forum Community.
As we haven’t heard back from you and there are no further queries at this time, we’ll proceed to close this thread for now.
Should you need any additional assistance, please feel free to start a new thread; we’re always happy to help.
Best regards,
Prasanna Kumar
Hi @jaryszek,
Just checking: have you had a chance to open a support ticket, as suggested? If so, we'd love to hear the current status or any updates from that.
If the issue was resolved through the support ticket, it would be great if you could share the solution here as well. It could really help other community members find answers more quickly.
Warm regards,
Prasanna Kumar
OK, so I managed to add a new function. It has to have the same parameters as the first one; this is a huge Microsoft bug, as always: https://www.youtube.com/watch?v=U9lqXCiGa08
// The parameter names match GitHub.Contents, but here they carry different values:
//   relativePath = LFS object id (the sha256 hash from the pointer file)
//   folderPath   = object size in bytes, passed as text
//   branchName   = repository name
[DataSource.Kind="GitHub", Publish="GitHub.Publish"]
shared GitHub.LfsDownloadUrl = (relativePath as text, optional folderPath as text, optional branchName as text) =>
    let
        sizeNumber = Number.FromText(folderPath),
        // Request body for the Git LFS batch API
        body = Json.FromValue([
            operation = "download",
            transfer = {"basic"},
            objects = {
                [ oid = relativePath, size = sizeNumber ]
            }
        ]),
        UrlLfs = "MyCompany" & "/" & branchName & ".git/info/lfs/objects/batch",
        // Supplying Content turns the request into a POST
        result = Json.Document(Web.Contents("https://github.com", [
            RelativePath = UrlLfs,
            Content = body,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"Content-Type" = "application/vnd.git-lfs+json",
                Accept = "application/vnd.git-lfs+json",
                #"User-Agent" = "PowerQuery"
            ]
        ])),
        // Download link for the first (and only) requested object
        url = result[objects]{0}[actions][download][href]
    in
        url;
And this works; I got the full download link in Power Query.
The issue is that I always need to enter credentials for this file. Why is it not using my GitHub credentials?
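For anyone following along, here is a minimal sketch of the JSON body that the function above posts to the Git LFS batch endpoint, decoded to text so it can be inspected in the query editor. The names sampleOid and sampleSize and their values are placeholders I made up for illustration; the real values come from the LFS pointer file.

```powerquery
let
    // Placeholder values (assumptions); use the oid from the pointer file without the "sha256:" prefix.
    sampleOid = Text.Repeat("0", 64),
    sampleSize = 44476,
    // Same record shape as in GitHub.LfsDownloadUrl.
    body = Json.FromValue([
        operation = "download",
        transfer = {"basic"},
        objects = {
            [ oid = sampleOid, size = sampleSize ]
        }
    ]),
    // Json.FromValue returns binary; decode it to text to read the JSON.
    bodyAsText = Text.FromBinary(body, TextEncoding.Utf8)
in
    bodyAsText
```

Pasting this into a blank query is a quick way to check that the body has the operation, transfer, and objects fields before sending it from the connector.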
Best,
Jacek
Hi @jaryszek,
Thank you for reaching out to the Microsoft Fabric Forum Community.
You're getting a Git LFS pointer file instead of actual CSV content because the file is stored using Git Large File Storage (LFS). raw.githubusercontent.com only returns the pointer file, not the actual data.
Best fix: Remove the file from Git LFS and commit it as a regular file.
Alternative: Host the CSV elsewhere (e.g. SharePoint, Azure Blob, or a direct-download server).
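If you want to check in Power Query whether a response is an LFS pointer rather than CSV data, a minimal sketch like the one below may help. ParseLfsPointer is a hypothetical helper name, not part of any connector, and the sample text uses a made-up oid. It assumes the pointer fields are separated by spaces or line breaks, as in the output you posted.

```powerquery
let
    // Hypothetical helper: turns Git LFS pointer text into a record with oid and size.
    ParseLfsPointer = (pointerText as text) as record =>
        let
            trimmed = Text.Trim(pointerText),
            // A pointer file starts with the LFS spec version line.
            isPointer = Text.StartsWith(trimmed, "version https://git-lfs.github.com/spec/v1"),
            // Split on spaces and line breaks, then drop empty tokens.
            tokens = List.Select(Text.SplitAny(trimmed, " #(lf)#(cr)"), each _ <> ""),
            // Each key is followed by its value: oid sha256:<hash>, size <bytes>.
            oidToken = tokens{List.PositionOf(tokens, "oid") + 1},
            sizeToken = tokens{List.PositionOf(tokens, "size") + 1}
        in
            if isPointer then
                [
                    oid = Text.AfterDelimiter(oidToken, "sha256:"),
                    size = Number.FromText(sizeToken)
                ]
            else
                error "The text is not a Git LFS pointer",
    // Made-up sample pointer for illustration only.
    example = ParseLfsPointer("version https://git-lfs.github.com/spec/v1 oid sha256:abc123 size 44476")
in
    example
```

The resulting oid and size are the two values that the Git LFS batch API expects for each object in a download request.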
If this solution helped, please consider marking the response as accepted and giving it a thumbs-up so others can benefit as well.
Best regards,
Prasanna Kumar
Thanks, that helps if you are willing to change your whole model, but it is still not a great solution.
It has to be an LFS file by design: it is larger than 100 MB, and I am looking to get this file using Power Query and the provided SHA.
What about a custom Git connector?
I have one; maybe I could send the request from there, like in this gist:
https://gist.github.com/fkraeutli/66fa741d9a8c2a6a238a01d17ed0edc5
My GitHub connector looks like this:
// GitHub Connector for Power BI (Standard GitHub OAuth)
[Version = "1.0.0"]
section GitHub;

client_id = "minenumber";
client_secret = "minenumber";

[DataSource.Kind="GitHub", Publish="GitHub.Publish"]
shared GitHub.Contents = (relativePath as text, optional folderPath as text, optional branchName as text) =>
    let
        baseUrl = "https://api.github.com/repos/Company/",
        refBranch = if branchName <> null then "?ref=" & branchName else "?ref=develop",
        url = relativePath & "/contents/" & folderPath & refBranch,
        source = Json.Document(Web.Contents(baseUrl, [
            RelativePath = url,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"User-Agent" = "PowerQuery"
            ]
        ]))
    in
        source;

GitHub = [
    Authentication = [
        OAuth = [
            StartLogin = (resourceUrl, state, display) =>
                let
                    authorizeUrl = "https://github.com/login/oauth/authorize?" &
                        "client_id=" & client_id &
                        "&redirect_uri=https://oauth.powerbi.com/views/oauthredirect.html" &
                        "&scope=repo" &
                        "&state=" & state
                in
                    [
                        LoginUri = authorizeUrl,
                        CallbackUri = "https://oauth.powerbi.com/views/oauthredirect.html",
                        WindowHeight = 720,
                        WindowWidth = 1024,
                        Context = null
                    ],
            FinishLogin = (context, callbackUri, state) =>
                let
                    parts = Uri.Parts(callbackUri)[Query],
                    code = parts[code],
                    access_token_response = Json.Document(Web.Contents("https://github.com/login/oauth/access_token", [
                        Content = Text.ToBinary("client_id=" & client_id & "&client_secret=" & client_secret & "&code=" & code),
                        Headers = [
                            #"Content-Type" = "application/x-www-form-urlencoded",
                            Accept = "application/json"
                        ]
                    ])),
                    access_token = access_token_response[access_token]
                in
                    [
                        access_token = access_token
                    ],
            Refresh = (resourceUrl, refresh_token) => error "GitHub does not support refresh tokens.",
            TestConnection = (access_token_record) => {"https://api.github.com/user"},
            AccessToken = (access_token_record) => "Bearer " & access_token_record[access_token]
        ]
    ]
];

GitHub.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = { Extension.LoadString("ButtonTitle"), Extension.LoadString("ButtonHelp") },
    LearnMoreUrl = "https://powerbi.microsoft.com/",
    SourceImage = GitHub.Icons,
    SourceTypeImage = GitHub.Icons
];

GitHub.Icons = [
    Icon16 = { Extension.Contents("GitHub16.png"), Extension.Contents("GitHub20.png"), Extension.Contents("GitHub24.png"), Extension.Contents("GitHub32.png") },
    Icon32 = { Extension.Contents("GitHub32.png"), Extension.Contents("GitHub40.png"), Extension.Contents("GitHub48.png"), Extension.Contents("GitHub64.png") }
];
I still need help with this!
Jacek
OK, I tried using this tip:
https://learn.microsoft.com/en-us/power-query/handling-resource-path
I changed my code to implement it:
shared GitHub.Contents = Value.ReplaceType(GitHubContentsImpl,
    type function (
        repo as (text meta [DataSource.Path = false]),
        filePath as (text meta [DataSource.Path = false]),
        optional branch as nullable text
    ) as any
);
But it didn't work.
It still asks for credentials twice for each repository. Is it possible to turn this off?
Best,
Jacek
OK, so the implementation for the GitHub connector is:
[DataSource.Kind = "GitHub", Publish = "GitHub.Publish"]
shared GitHub.Contents = Value.ReplaceType(
    GitHubContentsImpl,
    type function (
        repo as (type text meta [Documentation.FieldCaption="Repository (e.g. org/repo)", DataSource.Path=false]),
        filePath as (type text meta [Documentation.FieldCaption="File path in repo", DataSource.Path=false]),
        optional branch as (type nullable text meta [Documentation.FieldCaption="Branch (optional)", DataSource.Path=false])
    ) as any
);

GitHubContentsImpl = (repo as text, filePath as text, optional branch as nullable text) =>
    let
        actualBranch = if branch <> null then branch else "main",
        baseUrl = "https://api.github.com",
        relativeUrl = "repos/xxx/" & repo & "/contents/" & filePath & "?ref=" & actualBranch,
        response = Json.Document(Web.Contents(baseUrl, [
            RelativePath = relativeUrl,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"User-Agent" = "PowerQuery"
            ]
        ]))
    in
        response;

GitHub = [
    TestConnection = () => { "https://api.github.com" },
And it worked; it asks only once!
What about the GitHubCloud connector? How can I make it behave the same way?
// GitHubCloud Connector for Power BI (Standard GitHub OAuth)
[Version = "1.0.0"]
section GitHubCloud;

client_id = "11111";
client_secret = "222222";

[DataSource.Kind="GitHubCloud", Publish="GitHubCloud.Publish"]
shared GitHubCloud.LfsCsv = (relativePath as text, optional folderPath as text, optional branchName as text) =>
    let
        downloadUrl =
            let
                sizeNumber = Number.FromText(folderPath),
                body = Json.FromValue([
                    operation = "download",
                    transfer = {"basic"},
                    objects = {
                        [ oid = relativePath, size = sizeNumber ]
                    }
                ]),
                UrlLfs = "xxxx" & "/" & branchName & ".git/info/lfs/objects/batch",
                result = Json.Document(Web.Contents("https://github.com", [
                    RelativePath = UrlLfs,
                    Content = body,
                    Headers = [
                        #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                        #"Content-Type" = "application/vnd.git-lfs+json",
                        Accept = "application/vnd.git-lfs+json",
                        #"User-Agent" = "PowerQuery"
                    ]
                ])),
                url = result[objects]{0}[actions][download][href]
            in
                url,
        csvContent = Csv.Document(Web.Contents(downloadUrl)),
        final = Table.PromoteHeaders(csvContent)
    in
        final;

GitHubCloud = [
    TestConnection = (token) => {"https://github.com"},
    Authentication = [
        OAuth = [
            StartLogin = (resourceUrl, state, display) => [
                LoginUri = "https://github.com/login/oauth/authorize?" &
                    "client_id=" & client_id &
                    "&redirect_uri=https://oauth.powerbi.com/views/oauthredirect.html" &
                    "&scope=repo" &
                    "&state=" & state,
                CallbackUri = "https://oauth.powerbi.com/views/oauthredirect.html",
                WindowHeight = 720,
                WindowWidth = 1024,
                Context = null
            ],
            FinishLogin = (context, callbackUri, state) =>
                let
                    parts = Uri.Parts(callbackUri)[Query],
                    code = parts[code],
                    tokenResponse = Json.Document(Web.Contents("https://github.com/login/oauth/access_token", [
                        Content = Text.ToBinary("client_id=" & client_id & "&client_secret=" & client_secret & "&code=" & code),
                        Headers = [#"Content-Type"="application/x-www-form-urlencoded", Accept="application/json"]
                    ])),
                    access_token = tokenResponse[access_token]
                in
                    [access_token = access_token],
            Refresh = (resourceUrl, refresh_token) => error "GitHub doesn't support refresh tokens",
            AccessToken = (token) => "Bearer " & token[access_token]
        ]
    ],
    Label = "GitHub LFS OAuth Connector"
];

GitHubCloud.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = {"GitHubCloud", "Download LFS files"},
    SourceImage = GitHubCloud.Icons,
    SourceTypeImage = GitHubCloud.Icons
];

GitHubCloud.Icons = [
    Icon16 = { Extension.Contents("GitHub16.png"), Extension.Contents("GitHub20.png"), Extension.Contents("GitHub24.png"), Extension.Contents("GitHub32.png") },
    Icon32 = { Extension.Contents("GitHub32.png"), Extension.Contents("GitHub40.png"), Extension.Contents("GitHub48.png"), Extension.Contents("GitHub64.png") }
];
The problem here is that I am calling GitHub twice. The first call goes to github.com to get the batch response:
result = Json.Document(Web.Contents("https://github.com", [
The next step extracts the raw download URL and requests it:
        url = result[objects]{0}[actions][download][href]
    in
        url,
csvContent = Csv.Document(Web.Contents(downloadUrl)),
final = Table.PromoteHeaders(csvContent)
Here I am calling Web.Contents once again, and I think this is why Power BI has to ask for credentials per table. So if I have 5 tables, each stored as a file in the repo, I am asked 5 times and have to enter credentials 5 times.
How can I avoid this?
Best,
Jacek