Welcome to our blog featuring free practice questions and answers for the Microsoft Fabric Analytics Engineer Associate (DP-600) exam, for aspiring Microsoft Fabric Analytics Engineer Associates!
This certification journey requires a deep understanding of designing, building, and rolling out large-scale data analytics solutions within enterprises. In this role, your primary focus is on leveraging Microsoft Fabric components to convert data into reusable analytics assets effectively. These components encompass lakehouses, data warehouses, notebooks, dataflows, data pipelines, semantic models, and reports.
Your expertise also extends to implementing analytics best practices in Fabric, emphasizing version control and deployment strategies. Collaboration is key in this role, as you’ll frequently engage with solution architects, data engineers, data scientists, AI engineers, database administrators, and Power BI data analysts.
Aside from mastering the Fabric platform, your experience should encompass data modeling, data transformation, Git-based source control, and exploratory analytics. Proficiency in languages like SQL, DAX, and PySpark is also crucial.
Our blog aims to equip you with the knowledge and insights needed to excel in the DP-600 exam. Dive in, explore, and prepare yourself to plan, implement, and manage robust data analytics solutions; prepare and serve data effectively; implement and manage semantic models; and explore and analyze data proficiently. Let’s embark on this certification journey together!
Case Study
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.
To start the case study –
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.
Overview –
Contoso, Ltd. is a US-based health supplements company. Contoso has two divisions named Sales and Research. The Sales division contains two departments named Online Sales and Retail Sales. The Research division assigns internally developed product lines to individual teams of researchers and analysts.
Existing Environment –
Identity Environment –
Contoso has a Microsoft Entra tenant named contoso.com. The tenant contains two groups named ResearchReviewersGroup1 and ResearchReviewersGroup2.
Data Environment –
Contoso has the following data environment:
The Sales division uses a Microsoft Power BI Premium capacity.
The semantic model of the Online Sales department includes a fact table named Orders that uses Import mode. In the source system, the OrderID value represents the sequence in which orders are created.
The Research department uses an on-premises, third-party data warehousing product.
Fabric is enabled for contoso.com.
An Azure Data Lake Storage Gen2 storage account named storage1 contains Research division data for a product line named Productline1. The data is in the delta format.
A Data Lake Storage Gen2 storage account named storage2 contains Research division data for a product line named Productline2. The data is in the CSV format.
Requirements –
Planned Changes –
Contoso plans to make the following changes:
Enable support for Fabric in the Power BI Premium capacity used by the Sales division.
Make all the data for the Sales division and the Research division available in Fabric.
For the Research division, create two Fabric workspaces named Productline1ws and Productline2ws.
In Productline1ws, create a lakehouse named Lakehouse1.
In Lakehouse1, create a shortcut to storage1 named ResearchProduct.
Data Analytics Requirements –
Contoso identifies the following data analytics requirements:
All the workspaces for the Sales division and the Research division must support all Fabric experiences.
The Research division workspaces must use a dedicated, on-demand capacity that has per-minute billing.
The Research division workspaces must be grouped together logically to support OneLake data hub filtering based on the department name.
For the Research division workspaces, the members of ResearchReviewersGroup1 must be able to read lakehouse and warehouse data and shortcuts by using SQL endpoints.
For the Research division workspaces, the members of ResearchReviewersGroup2 must be able to read lakehouse data by using Lakehouse explorer.
All the semantic models and reports for the Research division must use version control that supports branching.
Data Preparation Requirements –
Contoso identifies the following data preparation requirements:
The Research division data for Productline1 must be retrieved from Lakehouse1 by using Fabric notebooks.
All the Research division data in the lakehouses must be presented as managed tables in Lakehouse explorer.
Semantic Model Requirements –
Contoso identifies the following requirements for implementing and managing semantic models:
The number of rows added to the Orders table during refreshes must be minimized.
The semantic models in the Research division workspaces must use Direct Lake mode.
Question 1
General Requirements –
Contoso identifies the following high-level requirements that must be considered for all solutions:
Follow the principle of least privilege when applicable.
Minimize implementation and maintenance effort when possible.
You need to ensure that Contoso can use version control to meet the data analytics requirements and the general requirements.
What should you do?
A. Store all the semantic models and reports in Data Lake Storage Gen2.
B. Modify the settings of the Research workspaces to use a GitHub repository.
C. Modify the settings of the Research division workspaces to use an Azure Repos repository.
D. Store all the semantic models and reports in Microsoft OneDrive.
Question 2
General Requirements –
Contoso identifies the following high-level requirements that must be considered for all solutions:
Follow the principle of least privilege when applicable.
Minimize implementation and maintenance effort when possible.
You need to recommend a solution to group the Research division workspaces.
What should you include in the recommendation? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Grouping method: Capacity | Domain | Tenant
Tool: OneLake data hub | The Fabric Admin portal | The Microsoft Entra admin center
Question 3
General Requirements –
Contoso identifies the following high-level requirements that must be considered for all solutions:
Follow the principle of least privilege when applicable.
Minimize implementation and maintenance effort when possible.
You need to refresh the Orders table of the Online Sales department. The solution must meet the semantic model requirements.
What should you include in the solution?
A. an Azure Data Factory pipeline that executes a Stored procedure activity to retrieve the maximum value of the OrderID column in the destination lakehouse
B. an Azure Data Factory pipeline that executes a Stored procedure activity to retrieve the minimum value of the OrderID column in the destination lakehouse
C. an Azure Data Factory pipeline that executes a dataflow to retrieve the minimum value of the OrderID column in the destination lakehouse
D. an Azure Data Factory pipeline that executes a dataflow to retrieve the maximum value of the OrderID column in the destination lakehouse
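The pattern behind these options is a high-watermark incremental load: retrieve the maximum OrderID already in the destination, then append only newer rows, which minimizes the rows added per refresh. A minimal plain-Python sketch of that logic (names and data are illustrative, not from the case study):

```python
# Hypothetical illustration of the high-watermark incremental-load pattern.
def incremental_append(destination: list[dict], source: list[dict]) -> list[dict]:
    """Append only source rows whose OrderID exceeds the destination's maximum."""
    watermark = max((row["OrderID"] for row in destination), default=0)
    new_rows = [row for row in source if row["OrderID"] > watermark]
    return destination + new_rows

dest = [{"OrderID": 1}, {"OrderID": 2}]
src = [{"OrderID": 2}, {"OrderID": 3}, {"OrderID": 4}]
result = incremental_append(dest, src)
# Only OrderID 3 and 4 are appended; OrderID 2 already exists.
```

Because OrderID reflects creation order in the source system, the maximum value in the destination is a reliable watermark for this case study.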
Question 4
General Requirements –
Contoso identifies the following high-level requirements that must be considered for all solutions:
Follow the principle of least privilege when applicable.
Minimize implementation and maintenance effort when possible.
Which syntax should you use in a notebook to access the Research division data for Productline1?
A. spark.read.format("delta").load("Tables/productline1/ResearchProduct")
B. spark.sql("SELECT * FROM Lakehouse1.ResearchProduct")
C. external_table('Tables/ResearchProduct')
D. external_table(ResearchProduct)
Question 5
You have an Azure Repos repository named Repo1 and a Fabric-enabled Microsoft Power BI Premium capacity. The capacity contains two workspaces named Workspace1 and Workspace2. Git integration is enabled at the workspace level.
You plan to use Microsoft Power BI Desktop and Workspace1 to make version-controlled changes to a semantic model stored in Repo1. The changes will be built and deployed to Workspace2 by using Azure Pipelines.
You need to ensure that report and semantic model definitions are saved as individual text files in a folder hierarchy. The solution must minimize development and maintenance effort.
In which file format should you save the changes?
Question 6
You have a Fabric tenant that contains a workspace named Workspace1. Workspace1 is assigned to a Fabric capacity.
You need to recommend a solution to provide users with the ability to create and publish custom Direct Lake semantic models by using external tools. The solution must follow the principle of least privilege.
Which three actions in the Fabric Admin portal should you include in the recommendation? Each correct answer presents part of the solution.
NOTE: Each correct answer is worth one point.
Question 7
You have a Microsoft Power BI semantic model that contains measures. The measures use multiple CALCULATE functions and a FILTER function.
You are evaluating the performance of the measures.
In which use case will replacing the FILTER function with the KEEPFILTERS function reduce execution time?
Question 8
You have a Microsoft Power BI semantic model.
You need to identify any surrogate key columns in the model that have the Summarize By property set to a value other than None. The solution must minimize effort.
What should you use?
Question 9
You have a Fabric tenant that contains a semantic model. The model contains 15 tables.
You need to programmatically change each column that ends in the word Key to meet the following requirements:
What should you use?
Question 11
You have a semantic model named Model1. Model1 contains five tables that all use Import mode. Model1 contains a dynamic row-level security (RLS) role named HR. The HR role filters employee data so that HR managers see only the data of the department to which they are assigned.
You publish Model1 to a Fabric tenant and configure RLS role membership. You share the model and related reports to users.
An HR manager reports that the data they see in a report is incomplete.
What should you do to validate the data seen by the HR manager?
Question 11
You have a Fabric tenant that contains a Microsoft Power BI report named Report1.
Report1 is slow to render. You suspect that an inefficient DAX query is being executed.
You need to identify the slowest DAX query, and then review how long the query spends in the formula engine as compared to the storage engine.
Which five actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Actions
Sort the Duration (ms) column in descending order.
View the Server Timings tab.
Answer Area
Question 12
You create a semantic model by using Microsoft Power BI Desktop. The model contains one security role named Sales Region Manager and the following tables:
Sales
You need to modify the model to ensure that users assigned the Sales Region Manager role cannot see a column named Address in SalesAddress.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Actions
Open the model in Tabular Editor.
Set Object Level Security to None for Sales Region Manager.
Set Object Level Security to Default for Sales Region Manager.
Answer Area
1. Open the model in Power BI Desktop.
2. Select the Address column in SalesAddress.
3. Set the Hidden property to True.
Question 13
You have a custom Direct Lake semantic model named Model1 that has one billion rows of data.
You use Tabular Editor to connect to Model1 by using the XMLA endpoint.
You need to ensure that when users interact with reports based on Model1, their queries always use Direct Lake mode.
What should you do?
Question 14
You have a Microsoft Power BI report named Report1 that uses a Fabric semantic model.
Users discover that Report1 renders slowly.
You open Performance analyzer and identify that a visual named Orders By Date is the slowest to render. The duration breakdown for Orders By Date is shown in the following table.
Name | Duration (ms)
DAX query | 27
Visual display | 39
Other | 1047
What will provide the greatest reduction in the rendering duration of Report1?
Question 15
You have a Fabric workspace that contains a DirectQuery semantic model. The model queries a data source that has 500 million rows.
You have a Microsoft Power BI report named Report1 that uses the model. Report1 contains visuals on multiple pages.
You need to reduce the query execution time for the visuals on all the pages.
What are two features that you can use? Each correct answer presents a complete solution.
NOTE: Each correct answer is worth one point.
Question 16
You have a Fabric tenant that contains a lakehouse named Lakehouse1. Lakehouse1 contains a table named Table1.
You are creating a new data pipeline.
You plan to copy external data to Table1. The schema of the external data changes regularly.
You need the copy operation to meet the following requirements:
You add a Copy data activity to the pipeline.
What should you do for the Copy data activity?
Question 17
You have a Fabric tenant that contains a warehouse named Warehouse1. Warehouse1 contains two schemas named schema1 and schema2 and a table named schema1.city.
You need to make a copy of schema1.city in schema2. The solution must minimize the copying of data.
Which T-SQL statement should you run?
Question 18
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1.
In Workspace1, you create a data pipeline named Pipeline1.
You have CSV files stored in an Azure Storage account.
You need to add an activity to Pipeline1 that will copy data from the CSV files to Lakehouse1. The activity must support Power Query M formula language expressions.
Which type of activity should you add?
Question 19
You have a Fabric tenant that contains a lakehouse named Lakehouse1.
You need to prevent new tables added to Lakehouse1 from being added automatically to the default semantic model of the lakehouse.
What should you configure?
Question 20
You have a Fabric workspace named Workspace1 and an Azure SQL database.
You plan to create a dataflow that will read data from the database, and then transform the data by performing an inner join.
You need to ignore spaces in the values when performing the inner join. The solution must minimize development effort.
What should you do?
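Whatever dataflow feature you choose, the effect of ignoring spaces in a join key can be modeled as normalizing both keys before matching. A plain-Python sketch of that idea (illustrative only, not Power Query M; the function and column names are invented):

```python
# Hypothetical sketch: inner join that treats key values as equal when they
# differ only by spaces, by normalizing both sides before matching.
def join_ignoring_spaces(left: list[dict], right: list[dict], key: str) -> list[dict]:
    norm = lambda v: str(v).replace(" ", "")
    right_index = {norm(row[key]): row for row in right}
    return [
        {**l, **right_index[norm(l[key])]}   # merge matching rows
        for l in left
        if norm(l[key]) in right_index
    ]

left = [{"City": "NewYork", "Sales": 10}]
right = [{"City": "New York", "Region": "East"}]
joined = join_ignoring_spaces(left, right, "City")
# "NewYork" and "New York" match once spaces are ignored.
```

In a dataflow, a merge with fuzzy matching that ignores spaces achieves the same result without custom code, which is why it minimizes development effort.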
Question 21
You have an Azure Data Lake Storage Gen2 account named storage1 that contains a Parquet file named sales.parquet.
You have a Fabric tenant that contains a workspace named Workspace1.
Using a notebook in Workspace1, you need to load the content of the file to the default lakehouse. The solution must ensure that the content will display automatically as a table named Sales in Lakehouse explorer.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point
Answer Area
df = spark.read.parquet("abfss://fs1@storage1.dfs.core.windows.net/files/sales.parquet")
df.write.mode("overwrite").format("delta").saveAsTable("sales")
Question 22
You have a Fabric tenant that contains a lakehouse.
You plan to use a visual query to merge two tables.
You need to ensure that the query returns all the rows in both tables.
Which type of join should you use?
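As a reminder of the semantics being tested: a full outer join returns every row from both tables, merging columns where keys match and leaving unmatched rows with only their own columns. A plain-Python sketch (an illustrative stand-in for the visual query editor's merge, with invented table names):

```python
# Hypothetical illustration of full outer join semantics.
def full_outer_join(left: list[dict], right: list[dict], key: str) -> list[dict]:
    left_keys = {row[key] for row in left}
    right_index = {row[key]: row for row in right}
    out = []
    for l in left:
        # Matched rows merge columns; unmatched left rows pass through.
        out.append({**l, **right_index.get(l[key], {})})
    for r in right:
        # Right-only rows are still returned.
        if r[key] not in left_keys:
            out.append(r)
    return out

a = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}]
b = [{"id": 2, "y": "c"}, {"id": 3, "y": "d"}]
rows = full_outer_join(a, b, "id")
# Three rows: id 1 (left only), id 2 (matched), id 3 (right only).
```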
Question 23
You are implementing two dimension tables named Customers and Products in a Fabric warehouse.
You need to use slowly changing dimension (SCD) to manage the versioning of data. The solution must meet the requirements shown in the following table.
Table | Change action
Customers | Create a new version of the row.
Products | Overwrite the existing value in the latest row.
Which type of SCD should you use for each table? To answer, drag the appropriate SCD types to the correct tables. Each SCD type may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
SCD Types
Type 0
Type 1
Type 2
Type 3
Answer Area
Customers: Type 2
Products: Type 1
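To make the answer concrete, here is a plain-Python sketch of the two behaviors: Type 1 overwrites the value in place (no history), while Type 2 closes the current row and inserts a new version. (Illustrative code, not warehouse T-SQL; the column names are invented.)

```python
# Hypothetical illustration of SCD Type 1 vs. Type 2 update behavior.
def scd_type1_update(rows: list[dict], key: str, change: dict) -> None:
    """Type 1: overwrite the existing value in the latest row (no history)."""
    for row in rows:
        if row[key] == change[key]:
            row.update(change)

def scd_type2_update(rows: list[dict], key: str, change: dict) -> None:
    """Type 2: create a new version of the row, flagging the old one inactive."""
    for row in rows:
        if row[key] == change[key] and row["is_current"]:
            row["is_current"] = False
    rows.append({**change, "is_current": True})

products = [{"ProductID": 1, "Price": 10}]
scd_type1_update(products, "ProductID", {"ProductID": 1, "Price": 12})
# Products still has one row; the price was overwritten.

customers = [{"CustomerID": 7, "City": "Boston", "is_current": True}]
scd_type2_update(customers, "CustomerID", {"CustomerID": 7, "City": "Austin"})
# Customers now has two rows; only the new one is current.
```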
Question 23
You have a Fabric tenant that contains a workspace named Workspace1. Workspace1 contains a lakehouse named Lakehouse1 and a warehouse named Warehouse1.
You need to create a new table in Warehouse1 named POSCustomers by querying the data in Lakehouse1.
How should you complete the T-SQL statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer Area
CREATE TABLE dbo.POSCustomers AS SELECT
DISTINCT customerid,
customer,
postalcode,
category
FROM lakehouse1.dbo.customer
Question 24
You have a Fabric tenant that contains a warehouse.
You are designing a star schema model that will contain a customer dimension. The customer dimension table will be a Type 2 slowly changing dimension (SCD).
You need to recommend which columns to add to the table. The columns must not already exist in the source.
Which three types of columns should you recommend? Each correct answer presents part of the solution.
NOTE: Each correct answer is worth one point.
Question 25
You have a Fabric tenant that contains two workspaces named Workspace1 and Workspace2. Workspace1 contains a lakehouse named Lakehouse1. Workspace2 contains a lakehouse named Lakehouse2. Lakehouse1 contains a table named dbo.Sales. Lakehouse2 contains a table named dbo.Customers.
You need to ensure that you can write queries that reference both dbo.Sales and dbo.Customers in the same SQL query without making additional copies of the tables.
What should you use?
Question 26
You have a Fabric tenant.
You are creating an Azure Data Factory pipeline.
You have a stored procedure that returns the number of active customers and their average sales for the current month.
You need to add an activity that will execute the stored procedure in a warehouse. The returned values must be available to the downstream activities of the pipeline.
Which type of activity should you add?
A. KQL
B. Copy data
Question 27
You have a Fabric tenant.
You plan to create a data pipeline named Pipeline1. Pipeline1 will include two activities that will execute in sequence.
You need to ensure that a failure of the first activity will NOT block the second activity.
Which conditional path should you configure between the first activity and the second activity?
Question 28
You have a Fabric tenant that contains a lakehouse named Lakehouse1. Lakehouse1 contains a Delta table with eight columns.
You receive new data that contains the same eight columns and two additional columns.
You create a Spark DataFrame and assign the DataFrame to a variable named df. The DataFrame contains the new data.
You need to add the new data to the Delta table to meet the following requirements:
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point
Answer Area
df.write.format("delta") \
    .mode("append") \
    .option("mergeSchema", "true") \
    .saveAsTable("table")
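For context, here is a plain-Python sketch of what the mergeSchema option accomplishes: appended rows widen the table's schema, and existing rows surface None (null) for the new columns. (An illustrative stand-in for the Delta Lake behavior, not its actual implementation.)

```python
# Hypothetical illustration of schema-merging append behavior.
def append_with_merge_schema(table: list[dict], new_rows: list[dict]) -> list[dict]:
    """Append rows, widening the schema to the union of all columns."""
    columns = set().union(*(row.keys() for row in table + new_rows))
    # Missing values come back as None, like nulls in the widened Delta table.
    return [{col: row.get(col) for col in columns} for row in table + new_rows]

existing = [{"a": 1, "b": 2}]
incoming = [{"a": 3, "b": 4, "c": 5}]   # one extra column, "c"
merged = append_with_merge_schema(existing, incoming)
# The existing row now exposes "c" as None; the incoming row keeps its value.
```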
Question 29
You have a Fabric tenant that contains a lakehouse named Lakehouse1. Lakehouse1 contains a Delta table that has one million Parquet files.
You need to remove files that were NOT referenced by the table during the past 30 days. The solution must ensure that the transaction log remains consistent, and the ACID properties of the table are maintained.
What should you do?
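Conceptually, the operation being tested here (Delta's VACUUM with a retention window) deletes only files that are both unreferenced by the transaction log and older than the retention period, which is why log consistency and ACID guarantees are preserved. A plain-Python sketch of that selection rule (illustrative only, not Delta Lake itself; file names and ages are invented):

```python
# Hypothetical illustration of a retention-based cleanup rule.
def files_to_vacuum(all_files: dict[str, int], referenced: set[str],
                    retention_days: int) -> set[str]:
    """all_files maps file name -> age in days. Return files safe to delete:
    not referenced by the transaction log AND older than the retention window."""
    return {
        name for name, age in all_files.items()
        if name not in referenced and age > retention_days
    }

files = {"part-1": 45, "part-2": 10, "part-3": 60}
live = {"part-3"}                                   # still referenced by the log
removable = files_to_vacuum(files, live, retention_days=30)
# Only part-1 qualifies: unreferenced and older than 30 days.
```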
Question 30
You have a Fabric workspace named Workspace1 that contains a dataflow named Dataflow1. Dataflow1 returns 500 rows of data.
You need to identify the min and max values for each column in the query results.
Which three Data view options should you select? Each correct answer presents part of the solution.
NOTE: Each correct answer is worth one point.
A. Show column value distribution
Question 31
You have a Fabric tenant that contains customer churn data stored as Parquet files in OneLake. The data contains details about customer demographics and product usage.
You create a Fabric notebook to read the data into a Spark DataFrame. You then create column charts in the notebook that show the distribution of retained customers as compared to lost customers based on geography, the number of products purchased, age, and customer tenure.
Which type of analytics are you performing?
A. predictive
B. diagnostic
C. prescriptive
D. descriptive
Question 32
You have a Fabric tenant that contains a machine learning model registered in a Fabric workspace.
You need to use the model to generate predictions by using the PREDICT function in a Fabric notebook.
Which two languages can you use to perform model scoring? Each correct answer presents a complete solution.
NOTE: Each correct answer is worth one point.
Question 33
You have a Fabric tenant that contains JSON files in OneLake. The files have one billion items.
You plan to perform time series analysis of the items.
You need to transform the data, visualize the data to find insights, perform anomaly detection, and share the insights with other business users. The solution must meet the following requirements:
Minimize how long it takes to load the data.
What should you use to transform and visualize the data?
Question 34
You have a Fabric tenant that contains a Microsoft Power BI report.
You are exploring a new semantic model.
You need to display the following column statistics:
Which Power Query function should you run?
Question 35
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have a Fabric tenant that contains a semantic model named Model1.
You discover that the following query performs slowly against Model1.
EVALUATE
FILTER (
    VALUES ( Customer[Customer Name] ),
    CALCULATE ( COUNTROWS ( 'Order Item' ) ) > 0
)
ORDER BY Customer[Customer Name]
You need to reduce the execution time of the query.
Solution: You replace line 4 by using the following code:
NOT ISEMPTY ( CALCULATETABLE ( 'Order Item' ) )
Does this meet the goal?
Question 36
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have a Fabric tenant that contains a semantic model named Model1.
You discover that the following query performs slowly against Model1.
EVALUATE
FILTER (
    VALUES ( Customer[Customer Name] ),
    CALCULATE ( COUNTROWS ( 'Order Item' ) ) > 0
)
ORDER BY Customer[Customer Name]
You need to reduce the execution time of the query.
Solution: You replace line 4 by using the following code:
CALCULATE ( COUNTROWS ( 'Order Item' ) ) >= 0
Does this meet the goal?
Question 37
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have a Fabric tenant that contains a new semantic model in OneLake.
You use a Fabric notebook to read the data into a Spark DataFrame.
You need to evaluate the data to calculate the min, max, mean, and standard deviation values for all the string and numeric columns.
Solution: You use the following PySpark expression:
df.explain()
Does this meet the goal?
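For context: df.explain() only prints the query's execution plan, not column statistics. The statistics the question asks for (min, max, mean, standard deviation) are the kind of summary PySpark's DataFrame.describe() produces. A plain-Python sketch of the equivalent computation for a single numeric column (illustrative values, not from the scenario):

```python
import statistics

# Hypothetical illustration of the summary statistics being requested.
def column_stats(values: list[float]) -> dict[str, float]:
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stddev": statistics.stdev(values),   # sample standard deviation
    }

stats = column_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
# mean is 5.0; sample stddev is roughly 2.14
```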