The difference is that the capacity is abstracted away from us.

Even though the capacity is abstracted away from you, you can still think about the underlying capacity in terms of worker capacity and pre-initialized capacity, which is similar to provisioned concurrency in Lambda.
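To make the idea of worker capacity and pre-initialized capacity concrete, here is a minimal sketch of what a CreateApplication request with pre-initialized workers might look like. The helper name, the application name, the release label, and the 2 vCPU / 4 GB sizing are all hypothetical, and the field names (initialCapacity, workerCount, workerConfiguration) should be checked against the EMR Serverless API reference before use.

```python
def build_create_application_request(name, release_label, driver_count, executor_count):
    """Build keyword arguments for a CreateApplication call (hypothetical helper).

    Pre-initialized capacity is optional; omitting "initialCapacity" leaves
    all capacity decisions to the service.
    """
    # Hypothetical worker size, chosen only for illustration.
    worker = {"cpu": "2vCPU", "memory": "4GB"}
    return {
        "name": name,
        "releaseLabel": release_label,
        "type": "SPARK",
        # Workers kept warm ahead of time, similar in spirit to
        # provisioned concurrency in Lambda.
        "initialCapacity": {
            "DRIVER": {"workerCount": driver_count, "workerConfiguration": dict(worker)},
            "EXECUTOR": {"workerCount": executor_count, "workerConfiguration": dict(worker)},
        },
    }


# Example values are assumptions, not recommendations.
request = build_create_application_request("demo-app", "emr-6.6.0", 1, 4)
```

Assuming boto3 is installed and credentials are configured, a dictionary like this could be passed as `boto3.client("emr-serverless").create_application(**request)`.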

You no longer need to think about the number of workers (provisioned shards), as it's taken care of by EMR now.

Application Lifecycle

  1. The EMR application has to be created with the CreateApplication API

At this stage there are three possible states: creating, created, and terminated.

  1. The application has to be started with the StartApplication API, but it can also be terminated right after creation with the DeleteApplication API

  1. After the application is started with the start command, there are two possible outcomes
    1. if there are no failures, it goes through started -> stopping -> stopped
    2. if there are failures, it goes straight to stopping -> stopped
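The lifecycle steps above can be sketched as a small state machine. This is a simplified model built from these notes, not the service's authoritative state diagram: the `TRANSITIONS` table and the `is_valid_path` helper are hypothetical names, and the intermediate STARTING state is included on the assumption that a start request passes through it before reaching started or failing.

```python
# Allowed transitions, following the lifecycle described in the notes.
TRANSITIONS = {
    "CREATING": {"CREATED"},
    "CREATED": {"STARTING", "TERMINATED"},   # start it, or delete it right away
    "STARTING": {"STARTED", "STOPPING"},     # success, or failure during start
    "STARTED": {"STOPPING"},
    "STOPPING": {"STOPPED"},
    "STOPPED": set(),                        # what follows is not covered in these notes
    "TERMINATED": set(),
}


def is_valid_path(states):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all(nxt in TRANSITIONS.get(cur, set()) for cur, nxt in zip(states, states[1:]))


# No failures: the application reaches started before it is stopped.
happy_path = ["CREATING", "CREATED", "STARTING", "STARTED", "STOPPING", "STOPPED"]

# Failure during start: the application goes straight to stopping, then stopped.
failure_path = ["CREATING", "CREATED", "STARTING", "STOPPING", "STOPPED"]

# Deleted right after creation, without ever being started.
deleted_path = ["CREATING", "CREATED", "TERMINATED"]
```

A table like this is handy when polling an application's state, since it makes it easy to flag a transition the notes do not expect.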

Pre-initialized capacity