Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Power Query showing ghost duplicates in Profil View (PostgreSQL import)

Hi,

Issue with Power Query on Power BI Desktop (February 2022) showing a few duplicates in Profil View when importing a large Table from a PostgreSQL database. Those duplicates do not exist but are being deleted when asked to. 

- Imported a Table called "s_tache" from a PostgreSQL database :

BastienC_0-1646063353998.png

 

- Distribution indicates duplicates in the Primary Key ("tae_id") field. Unique values does not match Number of values :

BastienC_1-1646063354148.png

 

- Checking the source database with a SQL query. "tae_id" is indeed the primary key and does not countain duplicates (Yes, my name is Captain Obvious. How could you tell?)

BastienC_2-1646063354173.png

 

- Deleting rows works. No more duplicates in the Primary Key field :

BastienC_3-1646063354464.png

 

- Keep Rows > Keep duplicates DOES NOT return any row (???):

BastienC_4-1646063354307.png

- After filtering [tae_id] = 4457 only a single row comes up. 

BastienC_5-1646063388742.png

- In report view the table does not seem to have duplicates. :

The data here was manipulated with the following steps (probably irrelevant but just incase):

  • renamed [tae_id] to [id_task] 
  • added prefix "t_" to [id_task]
  • union with another table

BastienC_6-1646063795409.png

 

However I'm not sure if this is only a "visual" bug and if the importation is done properly. My DAX measures do not match the expected results given by SQL (those queries are relatively complex and mistakes could be on me of course). 

 

Regards,
Bastien

Status: Delivered

Glad to hear that it could be fixed by NativeQuery and thanks for your sharing which would help others who may have the same issue in the community.

 

Best Regards,
Community Support Team _ Yingjie Li

Comments
v-yingjl
Community Support
Status changed to: Investigating
However I'm not sure if this is only a "visual" bug and if the importation is done properly

When you imported these tables into Power Query directly without any other operations like removing duplicates, filter data etc., have the table shown duplicated values in the data view after closing and applying Power Query editor although it has distinct values in PostgreSQL?

vyingjl_0-1646101799467.png

 

If the Data view table shows duplicated values, this issue may happen in Power Query editor when importing data from data source; if the Data view table is the same as the source table, it may be a 'visual issue' in Power Query editor.

 

In addition, you have mentioned that this issue happened in Feb 2022 Power BI Desktop version, have you tried or did this issue also happen in previous Desktop version? Besides, what is the version of PostgreSQL database in this issue, you can share more details about it.

 

Best Regards,
Community Support Team _ Yingjie Li

Anonymous
Not applicable

Hi Yingjie,

Thanks for considering the issue. My version of PostgreSQL is 9.6.19 :

BastienC_0-1646123269342.png

 


When you imported these tables into Power Query directly without any other operations like removing duplicates, filter data etc., have the table shown duplicated values in the data view after closing and applying Power Query editor although it has distinct values in PostgreSQL?

Forgot to be clear about it my bad. Without transformations, the data view did not show duplicates when checking the mentioned ones. Therefore, a [1,n] relationship was possible without using a new field (my transformation only applied a prefix and duplicates would still be here).

 

Sadly, I did not start working on the database before this version of PBi Desktop. I am not free to install other versions of Power BI Desktop on my computer for further tests.

 

Regards,
Bastien

v-yingjl
Community Support
Status changed to: Needs Info
my transformation only applied a prefix and duplicates would still be here

Could you please consider sharing the query about applying a prefix for the table in this issue, seems like could not reproduce it in my side with the same version of PostgreSQL and Power BI Desktop.

 

Best Regards,
Community Support Team _ Yingjie Li

Anonymous
Not applicable

Transformations do not matter for the issue as it is happening before. The query will not help you. Screenshots seem clear, duplicates are showing on the field's Profil View before applying any transformation. Duplicates are not appearing in Data View and it is not because of transformations.

The source table [s_tache] has 537365 rows and 33 fields. The primary key [tae_id] is an integer but not a continuous index, it sometimes jumps numbers as you can see in this screenshot of it in an ascending order :

BastienC_0-1646210557204.png

 

 

let
    //Database connection
    Source_fact_task = PostgreSQL.Database(" ", " "),

//Tache
    //Import table from database
    #"Import_s_tache" = Source_fact_task{[Schema="public",Item="s_tache"]}[Data],
    #"Rename_tache" = Table.RenameColumns(Import_s_tache,{
        {"tae_id", "id_task"}, {"tfl_id", "id_flux"}, {"ctc_id_creat", "id_parent"}, {"ctc_id_traite_par", "id_child"},{"tae_libelle_statut","status"},{"ref_id_statut","id_ref_statut"}
        }),
    #"Prefix_tache" = Table.TransformColumns(#"Rename_tache", {
        {"id_task", each "t_" & Text.From(_, "fr-FR"), type text},
        {"id_parent", each "c_" & Text.From(_, "fr-FR"), type text},
        {"id_child", each "c_" & Text.From(_, "fr-FR"), type text}
        }),
    #"Fact_tache" = 
        Table.SelectColumns(
            #"Prefix_tache",
            {"id_task","id_flux","id_parent","id_child","status","id_ref_statut"}
        ),
    
//Contact
    #"Import_contact" = Source_fact_task{[Schema="public",Item="s_contact_simple"]}[Data],
    #"Fusion_contact" = Table.CombineColumns(Table.TransformColumnTypes(Import_contact, {{"ctc_id", type text}, {"cqu_rang", type text}}, "fr-FR"),{"ctc_id", "cqu_rang"},Combiner.CombineTextByDelimiter("_", QuoteStyle.None),"id_task"),
    #"Rename_contact" = Table.RenameColumns(#"Fusion_contact",{
        {"mtfdm_id", "id_motifdm"}, {"mtf_id", "id_motif"}, {"tfl_id", "id_flux"},{"ctc_smiley","smiley"}
        }),
    #"Prefix_contact" = Table.TransformColumns(#"Rename_contact", {{"id_task", each "c_" & Text.From(_, "fr-FR"), type text}}),
    #"Fact_contact" = 
        Table.SelectColumns(
            #"Prefix_contact",
            {"id_task","id_flux","id_motif","id_motifdm","smiley"}
//Union
    #"Fact_task" = 
        Table.Combine({ 
            //Tache
            #"Fact_tache",
            //Contact
            #"Fact_contact"
        }),
    #"Archetype" = Table.AddColumn(Fact_task, "id_arch", each if Text.Contains([id_task], "t") then 0 else if Text.Contains([id_task], "c") then 1 else null)

in
    #"Archetype"

 

 

 

Anonymous
Not applicable

Fixed the problem by using a NativeQuery. This issue is not my concern anymore and I will stop reporting from now on. The report still stands but it's up to you to carry on further investigations.

BastienC_0-1646221401471.png

 


Regards,
Bastien

v-yingjl
Community Support
Status changed to: Delivered

Glad to hear that it could be fixed by NativeQuery and thanks for your sharing which would help others who may have the same issue in the community.

 

Best Regards,
Community Support Team _ Yingjie Li