What is the difference between an orchestration pipeline (job) and a transformation pipeline (job) in Matillion?


In Matillion, an orchestration job (pipeline) controls and automates the workflow: it loads data into the warehouse, calls APIs, runs SQL scripts, triggers other jobs, and manages the dependencies between those steps. Its concern is what happens and when. A transformation job (pipeline) works on data that is already present in the target data warehouse: it filters, joins, aggregates, cleanses, and enriches that data. Its concern is how the data is transformed.
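The split described above can be illustrated with a small analogy in plain Python. This is not Matillion's API: an in-memory SQLite database stands in for the cloud warehouse, and the names `load_raw_orders`, `transform_orders`, and `run_pipeline` are hypothetical. The sketch only shows the division of labor, with one function deciding what runs and in which order, and another expressing the transformation as SQL that executes inside the database.

```python
import sqlite3


def load_raw_orders(conn):
    """Orchestration-style step: land raw rows in the (stand-in) warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER, customer TEXT, amount REAL, status TEXT)"
    )
    conn.execute("DELETE FROM raw_orders")  # keep the load re-runnable
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
        [
            (1, "alice", 120.0, "complete"),
            (2, "bob", 35.5, "cancelled"),
            (3, "alice", 80.0, "complete"),
        ],
    )


def transform_orders(conn):
    """Transformation-style step: filter and aggregate data already in the warehouse."""
    conn.execute("DROP TABLE IF EXISTS customer_totals")
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(amount) AS total_amount
        FROM raw_orders
        WHERE status = 'complete'
        GROUP BY customer
        """
    )


def run_pipeline(conn):
    """Orchestration: decides what runs and when (load first, then transform)."""
    for step in (load_raw_orders, transform_orders):
        step(conn)
    conn.commit()


conn = sqlite3.connect(":memory:")  # assumption: SQLite as a stand-in warehouse
run_pipeline(conn)
rows = conn.execute(
    "SELECT customer, total_amount FROM customer_totals ORDER BY customer"
).fetchall()
print(rows)
```

Here `run_pipeline` plays the orchestration role and contains no transformation logic, while `transform_orders` contains only SQL that the database itself executes, which mirrors the push-down idea the lesson titles refer to.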


Curriculum

20 lessons · 40m

  1. What is Matillion, and what does it mean that it is a "cloud-native, push-down ELT" tool rather than a traditional ETL tool? (02:09)
  2. Explain the difference between ETL and ELT, and why Matillion pushes transformation logic down into the data warehouse. (02:00)
  3. What is the difference between an orchestration pipeline (job) and a transformation pipeline (job) in Matillion? (01:56)
  4. Which cloud data warehouses does Matillion work with, and why does the choice of warehouse matter for how transformations execute? (01:56)
  5. You need to extract data from a REST API, land it in Snowflake, then clean and join it. Which pipeline types and components would you use, and in what order? (02:03)
  6. A transformation pipeline has grown to 40+ components and is hard to read and maintain. How would you restructure it? (01:56)
  7. You need to load data from an API that has no native Matillion connector. What are your options? (01:56)
  8. Your source data is semi-structured JSON nested several levels deep. How would you make it queryable as flat relational columns in Matillion? (01:58)
  9. A single orchestration pipeline needs to run several transformation pipelines, some in parallel and some in sequence. How would you wire this up? (02:10)
  10. You want to reuse the same load-and-transform logic for 50 different source tables. How would you avoid building 50 near-identical pipelines? (01:56)
  11. The same pipeline must run against dev, test, and prod warehouses with different schemas and credentials. How would you structure this without duplicating pipelines? (02:01)
  12. You need a pipeline to behave differently based on a value computed at runtime (e.g., a row count). How would you use variables to control the flow? (01:41)
  13. Explain the difference between job variables and environment variables, and when you'd use a grid variable. (01:47)
  14. You need to loop over a list of table names and run the same extract for each one. How would you implement iteration in Matillion? (02:05)
  15. A variable's value set inside an iterator isn't persisting the way you expect across loop runs. What variable behavior (scope/branch) is likely causing this? (02:05)
  16. A source table has 500 million rows and you only want to load records changed since the last run. How would you design an incremental load? (01:59)
  17. You need to implement slowly changing dimension (SCD) Type 2 logic in a Matillion transformation pipeline. How would you approach it? (02:06)
  18. Your full reload of a large table is slow and expensive every night. What strategies would you use to reduce load time and warehouse cost? (01:59)
  19. A nightly load partially failed and left the target table in an inconsistent state. How would you make the load idempotent and safely re-runnable? (01:59)
  20. You need to capture and process only new files arriving in S3/Blob storage. How would you orchestrate this? (01:58)