Something slightly different for this post…

I was asked to put together a set of interview questions that could be used on a cohort of data engineering candidates with varied levels of experience.

This seemed like a good thing to blog about so the wider community can review it and give feedback. Please also accept the obvious caveat that the point of the questions was/is not to have technical debates on what the correct state of a given platform could/should be, but simply to tease out the differences in candidate capabilities, and to see where experience can be used to identify possibly misleading or vague questions. There is some overlap in content/answers.

Also, to award bonus points if these candidates follow my blog posts!

Enjoy 🙂


General and Theoretical questions

What does SQL stand for?

What does MDX stand for?

What does DAX stand for?

What does IaaS mean?

What does PaaS mean?

What does NoSQL mean?

What is meant by the term ‘serverless’?

What is the primary use case for an OLTP database?

What is the primary use case for an OLAP database?

What’s the distinction between an OLTP and an OLAP database?

In computer science database transaction processing, what does ACID stand for?

Why are ACID resilient transactions important for data processing workloads?

What is meant by a heap table?

Explain the main concepts for the creation of a Kimball star schema data model.

How would you decide what grain to apply to a data warehouse fact table?

What is the distinction between scaling up compute and scaling out?

What is the distinction between a Lambda and Kappa architecture?

What are the key characteristics of a landing zone architecture?

What is the distinction between Data Mesh and Data Fabric?

What is the role of the semantic layer in an analytics platform?

Explain what is meant by predicate push down.

In cloud data processing, why is it important to decouple compute and storage?

When can ingested data be described as real-time?

What is meant by stream processing?

What is meant by data egress vs ingress in the context of a cloud platform?

What are the 5 V’s used to characterise big data?

What is big data?


Azure Technology Related Questions

In SQL Databases, what’s the distinction between a clustered and non-clustered index?

What is the role of a DataFrame in Apache Spark?

What is the distinction between an Azure Data Factory Web Activity and a Webhook Activity, and why would you use one over the other?

When does an Azure Function App become durable?

How is a Spark Application handled and executed by a Databricks cluster?

What is the distinction between a Databricks job cluster and an interactive cluster?

What is meant by a cluster’s time to live (TTL)?

What is a Spark Session in Synapse Analytics?

What is the distinction between a Data Lake and Delta Lake?

What is the underlying file system used by most Data Lake product offerings?

How is a Delta Lake entity represented on disk in the storage layer?

For an analytical data platform, how would you support and implement disaster recovery requirements?

What endpoints could be used to handle a real-time data feed or messages?

What is the distinction between a Private Endpoint and a Service Endpoint?

What resources can be used to orchestrate data processing in Azure?

What is the distinction between a Resource Group and a Subscription in Azure?

What is the distinction between a Service Principal and a Managed Identity?

When should we use Power BI Premium data models vs Azure Analysis Services?

What is the distinction between a SQL Database and a Synapse Dedicated SQL Pool, formerly known as a SQL Data Warehouse?


If you found this useful please let me know. I’d like to support the next generation of data engineers as much as possible.

Many thanks for reading

