
Ilgar_Zarbali

Data Engineering with Microsoft Fabric: Efficiently Loading Data into a Lakehouse

Introduction to Lakehouses
The Lakehouse in Microsoft Fabric serves as a central repository for your files and tables. When we refer to "tables," we mean data stored in the Delta Parquet format (Parquet files managed by a Delta Lake transaction log). We begin by signing in to Fabric at fabric.microsoft.com and navigating to the Synapse Data Engineering workspace, which provides tools such as Lakehouses, notebooks, Spark job definitions, and data pipelines.

Microsoft Fabric

Setting Up Your Workspace
If you've used Power BI, the concept of workspaces will be familiar. In this guide, we're working in a private workspace where all artifacts—including the Lakehouse we create—will be stored. To enable the full functionality of Fabric, we've assigned a Fabric trial capacity to this workspace, which provides 60 days of access to premium features. To start your trial, click the account manager icon and follow the steps to activate it.

Trial Capacity

Creating a Lakehouse
To create a Lakehouse, navigate to the Data Engineering homepage and select Lakehouse. Assign a name to your Lakehouse and click Create. The Lakehouse will appear in your chosen workspace. This also automatically generates a SQL endpoint, allowing external tools like SQL Server Management Studio or Azure Data Explorer to query the data. You can find this endpoint in the Lakehouse settings.

Lakehouse

Creating Lakehouse

Organizing and Loading Data
1. Files vs. Tables:
- Files folder: Stores raw files like CSV, JSON, or Excel files. These files remain as-is and cannot be queried directly using SQL.
- Tables folder: Stores data in Delta Parquet format, making it accessible for querying and analysis.


Files-SubFolder
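As a minimal sketch of this distinction, here is how each section is typically addressed from a Fabric Spark notebook attached to the Lakehouse (the `Sales` folder and `items` table names follow this guide's example; substitute your own):

```python
# Raw files live under the Files section and are addressed by path;
# Delta tables live under the Tables section and are addressed by name.
raw_file = "Files/Sales/items.csv"  # raw CSV: readable by Spark, not by the SQL endpoint
delta_table = "items"               # Delta table: queryable by both SQL and Spark

# In a Fabric notebook the `spark` session is predefined, so either
# form of the data can be read directly:
# df_raw = spark.read.option("header", True).csv(raw_file)
# df_tbl = spark.read.table(delta_table)
print(raw_file, delta_table)
```

The Spark calls are shown commented because `spark` only exists inside a Fabric (or other Spark) notebook session; the path and table-name conventions are the point of the sketch.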

2. Uploading Files:
- Create a subfolder (e.g., Sales) in the Files folder.
- Upload files by selecting Upload, then choose a file or folder from your local machine. For example, we uploaded a file named `items.csv` containing 400,000 rows.

Upload

3. Loading Files into Tables:
- To query data with SQL, convert files from the Files folder into tables using the Load to Table option, ensuring column names in your file do not contain spaces (replace them with underscores if necessary).
- After successfully loading, the file becomes a Delta Parquet table, ready for SQL queries and other operations.

Load to Table
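Because Load to Table rejects column names containing spaces, it can help to sanitize headers before uploading. A small sketch (the column names below are hypothetical examples, not from `items.csv`):

```python
import re

def sanitize_columns(columns):
    """Replace whitespace in column names with underscores,
    as Load to Table requires names without spaces."""
    return [re.sub(r"\s+", "_", name.strip()) for name in columns]

print(sanitize_columns(["Item ID", "Order Date", "Unit Price"]))
# → ['Item_ID', 'Order_Date', 'Unit_Price']
```

You could apply this to a CSV's header row before saving and uploading the file, so the load succeeds on the first try.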

4. Using Shortcuts:
- Instead of copying data, you can create a shortcut to files stored in other locations like Azure Data Lake Storage or Amazon S3. This approach avoids duplicating data while making it accessible in your Lakehouse.
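One convenient consequence: a shortcut appears under Files (or Tables) like any local folder, so code that reads uploaded data reads shortcut data unchanged. The shortcut name `external_sales` and file below are hypothetical:

```python
# A shortcut to e.g. ADLS Gen2 or Amazon S3 surfaces as an ordinary folder
# in the Lakehouse, so the read path looks identical to local data:
shortcut_file = "Files/external_sales/orders.csv"
# df = spark.read.option("header", True).csv(shortcut_file)  # in a Fabric notebook
print(shortcut_file)
```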


Querying Data in Delta Parquet Format
Once your data is in the Tables folder, you can query it using SQL or other compute engines such as Power BI, Excel, or Spark notebooks. The Delta Parquet format ensures broad engine compatibility and provides features like a transaction log and version history.
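As a sketch of what such a query looks like from a Fabric notebook (the `items` table name is this guide's example), Spark SQL can both query the table and inspect Delta's version history:

```python
# Queries against a Delta table in the Tables section, by table name.
count_query = "SELECT COUNT(*) AS row_count FROM items"
history_query = "DESCRIBE HISTORY items"  # Delta's transaction log / version history

# In a Fabric notebook, where the `spark` session is predefined:
# spark.sql(count_query).show()
# spark.sql(history_query).show()
print(count_query)
```

The same `SELECT` would also work through the SQL endpoint from external tools, since both engines read the same Delta table.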


Next Steps
This guide has covered creating a Lakehouse, uploading files, and converting them into queryable tables. In part two, we’ll explore querying the data using SQL and other tools, along with advanced data engineering techniques.

By following this guide, you’ll gain a comprehensive understanding of managing and querying data in Microsoft Fabric Lakehouses.
