Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
Anonymous
Not applicable

Combine specific Files from a Folder

I have multiple files with same data structure in a folder and I would like to merge by row but not all files, so I need to filter the files before combine. And each file have more the one sheet, but I want to combine just one  specific sheet from each file.

Any suggestion?

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @Anonymous ,

Please try to use a scripting or programming approach, such as a Python script, which can handle file and sheet selection and then perform the merge operation. Below are the steps you can follow to achieve this:

1. Identify the Files to Merge:
Make a list of the specific files you want to merge. You can do this manually or write a script to filter files based on a naming convention or a pattern.

 

2. Install Required Libraries:
If you're using Python, ensure you have libraries like `pandas` and `openpyxl` installed, which are great for handling Excel files.
You can install them using pip:

 ```bash
     pip install pandas openpyxl
 ```

 

3. Write the Script:
Write a Python script that does the following:
Loops through the list of files.
Opens the specific sheet you want to merge from each file.
Appends the data from the sheet to a master dataframe.
Here's a basic structure of the script:

```python
import pandas as pd
import os

# List of files to merge
files_to_merge = ['file1.xlsx', 'file2.xlsx', ...]

# The name of the sheet you want to merge
sheet_name = 'Sheet1'

# Empty dataframe to hold merged data
merged_data = pd.DataFrame()

for file in files_to_merge:
# Construct the full file path if the files are not in the same directory as the script
file_path = os.path.join('path_to_directory', file)
# Read the specific sheet from the Excel file
data = pd.read_excel(file_path, sheet_name=sheet_name)
# Append the data to the merged_data dataframe
merged_data = merged_data.append(data, ignore_index=True)

# Save the merged data to a new Excel file
merged_data.to_excel('merged_data.xlsx', index=False)
```

 

4. Run the Script:
Execute the script to perform the merge. Ensure that you have the correct file paths and sheet names specified.

 

5. Check the Output:
After running the script, you should have a new Excel file named "merged_data.xlsx" with the combined data from the specified sheets.

After following these steps, you should have successfully merged the specified sheets from the selected files into a single Excel file.

 

Best regards,
Community Support Team_Binbin Yu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
Anonymous
Not applicable

Hi @Anonymous ,

Please try to use a scripting or programming approach, such as a Python script, which can handle file and sheet selection and then perform the merge operation. Below are the steps you can follow to achieve this:

1. Identify the Files to Merge:
Make a list of the specific files you want to merge. You can do this manually or write a script to filter files based on a naming convention or a pattern.

 

2. Install Required Libraries:
If you're using Python, ensure you have libraries like `pandas` and `openpyxl` installed, which are great for handling Excel files.
You can install them using pip:

 ```bash
     pip install pandas openpyxl
 ```

 

3. Write the Script:
Write a Python script that does the following:
Loops through the list of files.
Opens the specific sheet you want to merge from each file.
Appends the data from the sheet to a master dataframe.
Here's a basic structure of the script:

```python
import pandas as pd
import os

# List of files to merge
files_to_merge = ['file1.xlsx', 'file2.xlsx', ...]

# The name of the sheet you want to merge
sheet_name = 'Sheet1'

# Empty dataframe to hold merged data
merged_data = pd.DataFrame()

for file in files_to_merge:
# Construct the full file path if the files are not in the same directory as the script
file_path = os.path.join('path_to_directory', file)
# Read the specific sheet from the Excel file
data = pd.read_excel(file_path, sheet_name=sheet_name)
# Append the data to the merged_data dataframe
merged_data = merged_data.append(data, ignore_index=True)

# Save the merged data to a new Excel file
merged_data.to_excel('merged_data.xlsx', index=False)
```

 

4. Run the Script:
Execute the script to perform the merge. Ensure that you have the correct file paths and sheet names specified.

 

5. Check the Output:
After running the script, you should have a new Excel file named "merged_data.xlsx" with the combined data from the specified sheets.

After following these steps, you should have successfully merged the specified sheets from the selected files into a single Excel file.

 

Best regards,
Community Support Team_Binbin Yu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Greg_Deckler
Community Champion
Community Champion

@Anonymous You should be able to filter the files before combining them. Also, in your transform sample file, you can just select the sheet that you want. Do you have a programmatic way to identify the files that you want to filter out or keep?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors
Top Kudoed Authors