jaryszek
Impactful Individual

How to connect to a Git LFS file on GitHub?

Hello, 

I am trying to get a raw file from GitHub using a link like:

https://raw.githubusercontent.com/myCompany/cost-mgmt-repo/BranchName/CostManagement/Dim_ManagementGroups.csv


Instead of the CSV content, I am getting:

version https://git-lfs.github.com/spec/v1
oid sha256:3b70bcd441cd74d876c58f9aef3e25c8822cb8bfe5904ed3b6dcaa344445
size 44476

My code looks like this:

        AddColumnWithTable = Table.AddColumn(ExpandedColumn, "ResultTable", each 
            let
                a = Csv.Document(
                    Web.Contents(paramBaseUrlRawGithub, [
                        RelativePath = [RelativePath],
                        Query = [token = [Token]]
                    ]),
                    [Delimiter = ",", QuoteStyle = QuoteStyle.Csv]
                ),
                b = Table.PromoteHeaders(a, [PromoteAllScalars=true])
            in
                b
            )   

 

paramBaseUrlRawGithub is:
https://raw.githubusercontent.com/

How can I get the actual file content?

Best,
Jacek 
 

1 ACCEPTED SOLUTION
v-pgoloju
Community Support

Hi @jaryszek,

 

Thank you for reaching out to the Microsoft Fabric Forum Community.

 

If the issue still persists, I’d recommend raising a support ticket with Microsoft. The support team can look into the backend and provide more in-depth assistance tailored to your environment.


https://learn.microsoft.com/en-us/power-bi/support/create-support-ticket


If this solution helped, please consider marking the response as accepted and giving it a thumbs-up so others can benefit as well.

 

Best regards,
Prasanna Kumar

 


8 REPLIES
v-pgoloju
Community Support

Hi @jaryszek ,

 

Thank you for reaching out to the Microsoft Fabric Forum Community.

 

As we haven’t heard back from you and there are no further queries at this time, we’ll proceed to close this thread for now.
Should you need any additional assistance, please feel free to start a new thread; we’re always happy to help.

 

Best regards,
Prasanna Kumar

v-pgoloju
Community Support

Hi @jaryszek,

 

Just checking: have you had a chance to open a support ticket, as suggested? If so, we'd love to hear the current status or any updates from that.

If the issue was resolved through the support ticket, it would be great if you could share the solution here as well. It could really help other community members find answers more quickly.

 

Warm regards,
Prasanna Kumar

jaryszek
Impactful Individual

OK, so I managed to add a new function. It has to have the same parameters as the first one, which looks like a huge Microsoft bug, as always (see https://www.youtube.com/watch?v=U9lqXCiGa08):

[DataSource.Kind="GitHub", Publish="GitHub.Publish"]
shared GitHub.LfsDownloadUrl = (relativePath as text, optional folderPath as text, optional branchName as text) =>
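    let
        // The parameter names are reused from GitHub.Contents. Here relativePath
        // carries the LFS oid, folderPath the size in bytes (as text), and
        // branchName the repository name.
        sizeNumber = Number.FromText(folderPath),
        body = Json.FromValue([
            operation = "download",
            transfers = {"basic"},   // the batch request key is "transfers"; "basic" is also the default
            objects = {
                [ oid = relativePath, size = sizeNumber ]
            }
        ]),
        UrlLfs = "MyCompany" & "/" & branchName & ".git/info/lfs/objects/batch",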
        result = Json.Document(Web.Contents("https://github.com", [
            RelativePath = UrlLfs,
            Content = body,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"Content-Type" = "application/vnd.git-lfs+json",
                Accept = "application/vnd.git-lfs+json",
                #"User-Agent" = "PowerQuery"
            ]
        ])),
        url = result[objects]{0}[actions][download][href]
    in
        url;
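
The last step above assumes the batch response always contains a download action. In the Git LFS batch API, an object entry can instead carry an `error` record, and the download action may include a `header` record to send with the download request. Below is a minimal, more defensive sketch of that extraction. `GetLfsDownloadAction` and the sample response are illustrative names and data, not part of the connector.

```powerquery
let
    // Hypothetical helper: extract the download action from a parsed batch response.
    GetLfsDownloadAction = (batchResponse as record) as record =>
        let
            firstObject = batchResponse[objects]{0},
            // Per-object failures are reported in an "error" record instead of "actions".
            objectError = Record.FieldOrDefault(firstObject, "error", null),
            action =
                if objectError <> null then
                    error Error.Record("LfsObjectError", objectError[message], objectError)
                else
                    firstObject[actions][download]
        in
            [
                href = action[href],
                // Headers the server asks the client to send with the download, if any.
                headers = Record.FieldOrDefault(action, "header", [])
            ],

    // Made-up response in the shape described by the Git LFS batch API.
    sampleResponse = [
        objects = {
            [
                oid = "abc123",
                size = 44476,
                actions = [ download = [ href = "https://example.com/lfs/abc123" ] ]
            ]
        }
    ],
    action = GetLfsDownloadAction(sampleResponse)
in
    // A caller could then fetch: Web.Contents(action[href], [Headers = action[headers]])
    action
```

In the connector, `sampleResponse` would be replaced by the `result` value from the function above.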

 

And this works; I got the full download link in Power Query:

[Screenshot: jaryszek_0-1747903861260.png]


The issue is that I always have to enter credentials for this file. Why is it not using my GitHub credentials?

Best,
Jacek

 

v-pgoloju
Community Support

Hi @jaryszek,

 

Thank you for reaching out to the Microsoft Fabric Forum Community.

 

You're getting a Git LFS pointer file instead of actual CSV content because the file is stored using Git Large File Storage (LFS). raw.githubusercontent.com only returns the pointer file, not the actual data.

 

Best fix: Remove the file from Git LFS and commit it as a regular file.

Alternative: Host the CSV elsewhere (e.g. SharePoint, Azure Blob, or a direct-download server).
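
If the file has to stay in LFS, the pointer shown in the question still carries what a later download request needs: the object id and the size. Here is a minimal Power Query sketch, assuming the pointer text has already been fetched. `ParseLfsPointer` and `sample` are hypothetical names, and the sample text is the pointer quoted in the question.

```powerquery
let
    // Hypothetical helper: returns a record for an LFS pointer, or null for ordinary content.
    ParseLfsPointer = (content as text) as nullable record =>
        let
            lines = Lines.FromText(content),
            isPointer = List.Count(lines) > 0
                and Text.StartsWith(lines{0}, "version https://git-lfs.github.com/spec/v1"),
            // Split each "key value" line into a {key, value} pair.
            pairs = List.Transform(
                List.Select(lines, each Text.Contains(_, " ")),
                each { Text.BeforeDelimiter(_, " "), Text.AfterDelimiter(_, " ") }
            ),
            fields = Record.FromList(
                List.Transform(pairs, each _{1}),
                List.Transform(pairs, each _{0})
            ),
            result = [
                // The batch API takes the bare hash, without the "sha256:" prefix.
                oid = Text.AfterDelimiter(fields[oid], "sha256:"),
                size = Number.FromText(fields[size])
            ]
        in
            if isPointer then result else null,

    sample = "version https://git-lfs.github.com/spec/v1#(lf)oid sha256:3b70bcd441cd74d876c58f9aef3e25c8822cb8bfe5904ed3b6dcaa344445#(lf)size 44476#(lf)"
in
    ParseLfsPointer(sample)
```

A query could call this on the text returned from raw.githubusercontent.com and fall back to normal CSV parsing when the result is null.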

 

If this solution helped, please consider marking the response as accepted and giving it a thumbs-up so others can benefit as well.

 

Best regards,
Prasanna Kumar

 

jaryszek
Impactful Individual

Thanks. That helps if you are willing to change your whole model, but it is still not a great solution.
The file has to be in LFS by design: it is larger than 100 MB, and I want to fetch it with Power Query using the SHA provided in the pointer.

What about a custom GitHub connector?

I have one; maybe from it I could post a request like the one here:

https://gist.github.com/fkraeutli/66fa741d9a8c2a6a238a01d17ed0edc5

My GitHub connector looks like this:

// GitHub Connector for Power BI (Standard GitHub OAuth)
[Version = "1.0.0"]
section GitHub;

client_id = "minenumber";
client_secret = "minenumber";

[DataSource.Kind="GitHub", Publish="GitHub.Publish"]
shared GitHub.Contents = (relativePath as text, optional folderPath as text, optional branchName as text) =>
    let
        baseUrl = "https://api.github.com/repos/Company/",
        refBranch = if branchName <> null then "?ref=" & branchName else "?ref=develop",
        url = relativePath & "/contents/" & folderPath & refBranch,

        source = Json.Document(Web.Contents(baseUrl, [
            RelativePath = url,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"User-Agent" = "PowerQuery"
            ]
        ]))
    in
        source;

GitHub = [
    Authentication = [
        OAuth = [
            StartLogin = (resourceUrl, state, display) =>
                let
                    authorizeUrl = "https://github.com/login/oauth/authorize?" &
                        "client_id=" & client_id &
                        "&redirect_uri=https://oauth.powerbi.com/views/oauthredirect.html" &
                        "&scope=repo" &
                        "&state=" & state
                in
                    [
                        LoginUri = authorizeUrl,
                        CallbackUri = "https://oauth.powerbi.com/views/oauthredirect.html",
                        WindowHeight = 720,
                        WindowWidth = 1024,
                        Context = null
                    ],

            FinishLogin = (context, callbackUri, state) =>
                let
                    parts = Uri.Parts(callbackUri)[Query],
                    code = parts[code],
                    access_token_response = Json.Document(Web.Contents("https://github.com/login/oauth/access_token", [
                        Content = Text.ToBinary("client_id=" & client_id & "&client_secret=" & client_secret & "&code=" & code),
                        Headers = [
                            #"Content-Type" = "application/x-www-form-urlencoded",
                            Accept = "application/json"
                        ]
                    ])),
                    access_token = access_token_response[access_token]
                in
                    [
                        access_token = access_token
                    ],

            Refresh = (resourceUrl, refresh_token) => error "GitHub does not support refresh tokens.",
            TestConnection = (access_token_record) => {"https://api.github.com/user"},
            AccessToken = (access_token_record) => "Bearer " & access_token_record[access_token]
        ]
    ]
];

GitHub.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = { Extension.LoadString("ButtonTitle"), Extension.LoadString("ButtonHelp") },
    LearnMoreUrl = "https://powerbi.microsoft.com/",
    SourceImage = GitHub.Icons,
    SourceTypeImage = GitHub.Icons
];

GitHub.Icons = [
    Icon16 = { Extension.Contents("GitHub16.png"), Extension.Contents("GitHub20.png"), Extension.Contents("GitHub24.png"), Extension.Contents("GitHub32.png") },
    Icon32 = { Extension.Contents("GitHub32.png"), Extension.Contents("GitHub40.png"), Extension.Contents("GitHub48.png"), Extension.Contents("GitHub64.png") }
];


Still need help on this!
Jacek

jaryszek
Impactful Individual

OK, I tried using this tip:
https://learn.microsoft.com/en-us/power-query/handling-resource-path

I changed my code to implement it:

shared GitHub.Contents = Value.ReplaceType(GitHubContentsImpl,
type function (
repo as (text meta [DataSource.Path = false]),
filePath as (text meta [DataSource.Path = false]),
optional branch as nullable text
) as any
);

But it didn't work.

It still asks for credentials twice for each repository. Is it possible to turn this off?

Best,
Jacek




jaryszek
Impactful Individual

OK, so the implementation for the GitHub connector is:

[DataSource.Kind = "GitHub", Publish = "GitHub.Publish"]
shared GitHub.Contents = Value.ReplaceType(
    GitHubContentsImpl,
    type function (
        repo     as (type text meta [Documentation.FieldCaption="Repository (e.g. org/repo)", DataSource.Path=false]),
        filePath as (type text meta [Documentation.FieldCaption="File path in repo", DataSource.Path=false]),
        optional branch as (type nullable text meta [Documentation.FieldCaption="Branch (optional)", DataSource.Path=false])
    ) as any
);

GitHubContentsImpl = (repo as text, filePath as text, optional branch as nullable text) =>
    let
        actualBranch = if branch <> null then branch else "main",
        baseUrl = "https://api.github.com",
        relativeUrl  = "repos/xxx/" & repo & "/contents/" & filePath & "?ref=" & actualBranch,

        response = Json.Document(Web.Contents(baseUrl, [
            RelativePath = relativeUrl,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"User-Agent" = "PowerQuery"
            ]
        ]))
    in
        response;

GitHub = [
    TestConnection = () => { "https://api.github.com" },

And it worked; it asks only once!

What about the GitHubCloud connector? How can I make it behave the same way?

// GitHubCloud Connector for Power BI (Standard GitHub OAuth)
[Version = "1.0.0"]
section GitHubCloud;

client_id = "11111";
client_secret = "222222";

[DataSource.Kind="GitHubCloud", Publish="GitHubCloud.Publish"]
shared GitHubCloud.LfsCsv = (relativePath as text, optional folderPath as text, optional branchName as text) =>
    let
        downloadUrl = 
            let
                sizeNumber = Number.FromText(folderPath),
                body = Json.FromValue([
                    operation = "download",
                    transfers = {"basic"},   // the batch request key is "transfers"
                    objects = {
                        [ oid = relativePath, size = sizeNumber ]
                    }
                ]),
                UrlLfs = "xxxx" & "/" & branchName & ".git/info/lfs/objects/batch",
                result = Json.Document(Web.Contents("https://github.com", [
                    RelativePath = UrlLfs,
                    Content = body,
                    Headers = [
                        #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                        #"Content-Type" = "application/vnd.git-lfs+json",
                        Accept = "application/vnd.git-lfs+json",
                        #"User-Agent" = "PowerQuery"
                    ]
                ])),
                url = result[objects]{0}[actions][download][href]
            in
                url,
        csvContent = Csv.Document(Web.Contents(downloadUrl)),
        final = Table.PromoteHeaders(csvContent)
    in
        final;

GitHubCloud = [
    TestConnection = (token) => {"https://github.com"},
    Authentication = [
        OAuth = [
            StartLogin = (resourceUrl, state, display) => [
                LoginUri = "https://github.com/login/oauth/authorize?" &
                    "client_id=" & client_id &
                    "&redirect_uri=https://oauth.powerbi.com/views/oauthredirect.html" &
                    "&scope=repo" &
                    "&state=" & state,
                CallbackUri = "https://oauth.powerbi.com/views/oauthredirect.html",
                WindowHeight = 720,
                WindowWidth = 1024,
                Context = null
            ],
            FinishLogin = (context, callbackUri, state) =>
                let
                    parts = Uri.Parts(callbackUri)[Query],
                    code = parts[code],
                    tokenResponse = Json.Document(Web.Contents("https://github.com/login/oauth/access_token", [
                        Content = Text.ToBinary("client_id=" & client_id & "&client_secret=" & client_secret & "&code=" & code),
                        Headers = [#"Content-Type"="application/x-www-form-urlencoded", Accept="application/json"]
                    ])),
                    access_token = tokenResponse[access_token]
                in
                    [access_token = access_token],
            Refresh = (resourceUrl, refresh_token) => error "GitHub doesn't support refresh tokens",
            AccessToken = (token) => "Bearer " & token[access_token]
        ]
    ],
    Label = "GitHub LFS OAuth Connector"
];

GitHubCloud.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = {"GitHubCloud", "Download LFS files"},
    SourceImage = GitHubCloud.Icons,
    SourceTypeImage = GitHubCloud.Icons
];

GitHubCloud.Icons = [
    Icon16 = { Extension.Contents("GitHub16.png"), Extension.Contents("GitHub20.png"), Extension.Contents("GitHub24.png"), Extension.Contents("GitHub32.png") },
    Icon32 = { Extension.Contents("GitHub32.png"), Extension.Contents("GitHub40.png"), Extension.Contents("GitHub48.png"), Extension.Contents("GitHub64.png") }
];


The problem here is that I call GitHub twice. The first call goes to github.com to get the batch response:

result = Json.Document(Web.Contents("https://github.com", [


The second call uses the download URL from that response:

url = result[objects]{0}[actions][download][href]
in
    url,
csvContent = Csv.Document(Web.Contents(downloadUrl)),
final = Table.PromoteHeaders(csvContent)


Here I call Web.Contents once again, and I think this is why Power BI asks for credentials per table. So if I have 5 tables in the repo, I am asked 5 times and have to enter credentials 5 times.

How can I avoid this?
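
One thing that could be tried is the approach that worked for GitHub.Contents: mark every parameter with DataSource.Path = false so that the parameters no longer form part of the data source path. The sketch below applies it to GitHubCloud.LfsCsv. It is untested, `GitHubCloudLfsCsvImpl` is a hypothetical name, the field captions are illustrative, and whether this also removes the prompt caused by the second Web.Contents call is an assumption.

```powerquery
// Sketch only: would replace the existing GitHubCloud.LfsCsv definition in the section file.
[DataSource.Kind = "GitHubCloud", Publish = "GitHubCloud.Publish"]
shared GitHubCloud.LfsCsv = Value.ReplaceType(
    GitHubCloudLfsCsvImpl,
    type function (
        relativePath as (type text meta [Documentation.FieldCaption = "LFS oid", DataSource.Path = false]),
        optional folderPath as (type nullable text meta [Documentation.FieldCaption = "Size in bytes", DataSource.Path = false]),
        optional branchName as (type nullable text meta [Documentation.FieldCaption = "Repository", DataSource.Path = false])
    ) as table
);

// Hypothetical implementation name; the body is adapted from the function in this post.
GitHubCloudLfsCsvImpl = (relativePath as text, optional folderPath as text, optional branchName as text) =>
    let
        body = Json.FromValue([
            operation = "download",
            transfers = {"basic"},   // request key per the Git LFS batch API
            objects = { [ oid = relativePath, size = Number.FromText(folderPath) ] }
        ]),
        result = Json.Document(Web.Contents("https://github.com", [
            RelativePath = "xxxx" & "/" & branchName & ".git/info/lfs/objects/batch",
            Content = body,
            Headers = [
                #"Authorization" = "Bearer " & Extension.CurrentCredential()[access_token],
                #"Content-Type" = "application/vnd.git-lfs+json",
                Accept = "application/vnd.git-lfs+json",
                #"User-Agent" = "PowerQuery"
            ]
        ])),
        downloadUrl = result[objects]{0}[actions][download][href],
        csvContent = Csv.Document(Web.Contents(downloadUrl))
    in
        Table.PromoteHeaders(csvContent);
```

The parameter names are kept as in the original function, so existing queries that call GitHubCloud.LfsCsv would not need to change.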

Best,
Jacek

 
