Free Practice Questions for Databricks Certified Data Engineer Associate Certification
Study with 346 exam-style practice questions designed to help you prepare for the Databricks Certified Data Engineer Associate exam. All questions are aligned with the latest exam guide and include detailed explanations to help you master the material.
Exam Information
Exam Details
Key information about Databricks Certified Data Engineer Associate
Recertification: Required every two years by taking the full exam
Test aids: None allowed
Prerequisites: None required; course attendance and six months of hands-on experience in Databricks are highly recommended
Delivery method: Online or test center
Registration fee: USD 200
Time limit: 90 minutes
Number of questions: 45 scored multiple-choice questions
Validity period: 2 years
Exam Topics & Skills Assessed
Skills measured (from the official study guide)
Domain 1: Databricks Intelligence Platform
Subdomain 1.1: Enable features that simplify data layout decisions and optimize query performance.
Subdomain 1.2: Explain the value of the Data Intelligence Platform.
Subdomain 1.3: Identify the applicable compute to use for a specific use case.
Domain 2: Development and Ingestion
Subdomain 2.1: Use Databricks Connect in a data engineering workflow.
Subdomain 2.2: Determine the capabilities of the Notebooks functionality.
Subdomain 2.3: Classify valid Auto Loader sources and use cases.
Subdomain 2.4: Demonstrate knowledge of Auto Loader syntax.
Subdomain 2.5: Use Databricks' built-in debugging tools to troubleshoot a given issue.
Domain 3: Data Processing & Transformations
Subdomain 3.1: Describe the three layers of the Medallion Architecture and explain the purpose of each layer in a data processing pipeline.
Subdomain 3.2: Classify the type of cluster and configuration for optimal performance based on the scenario in which the cluster is used.
Subdomain 3.3: Emphasize the advantages of Lakeflow Spark Declarative Pipelines (for ETL processes in Databricks).
Subdomain 3.4: Implement data pipelines using Lakeflow Spark Declarative Pipelines.
Subdomain 3.5: Identify DDL (Data Definition Language)/DML features.
Subdomain 3.6: Compute complex aggregations and metrics with PySpark DataFrames.
Domain 4: Productionizing Data Pipelines
Subdomain 4.1: Identify the difference between Databricks Asset Bundles (DAB) and traditional deployment methods.
Subdomain 4.2: Identify the structure of Asset Bundles.
Subdomain 4.3: Deploy a workflow, repair, and rerun a task in case of failure.
Subdomain 4.4: Use serverless for a hands-off, auto-optimized compute managed by Databricks.
Subdomain 4.5: Analyze the Spark UI to optimize the query.
Domain 5: Data Governance & Quality
Subdomain 5.1: Explain the difference between managed and external tables.
Subdomain 5.2: Identify how permissions are granted to users and groups within Unity Catalog (UC).
Subdomain 5.3: Identify key roles in UC.
Subdomain 5.4: Identify how audit logs are stored.
Subdomain 5.5: Use lineage features in Unity Catalog.
Subdomain 5.6: Use the Delta Sharing feature available with Unity Catalog to share data.
Subdomain 5.7: Identify the advantages and limitations of Delta Sharing.
Subdomain 5.8: Identify the types of Delta Sharing: Databricks vs. external systems.
Subdomain 5.9: Analyze the cost considerations of data sharing across clouds.
Subdomain 5.10: Identify use cases of Lakehouse Federation when connected to external sources.
Techniques & products