Below is a table containing all the details for the property group: Bulk Import
Property Name | Description | Default Value | Run CdkDeploy When Changed |
---|---|---|---|
sleeper.table.bulk.import.emr.instance.architecture | (Non-persistent EMR mode only) The architecture to use for EC2 instance types in the EMR cluster. Must be one of "x86_64", "arm64" or "x86_64,arm64". For more information, see the Bulk import using EMR - Instance types section in docs/usage/ingest.md | arm64 | false |
sleeper.table.bulk.import.emr.master.x86.instance.types | (Non-persistent EMR mode only) The EC2 x86_64 instance types and weights to be used for the master node of the EMR cluster. For more information, see the Bulk import using EMR - Instance types section in docs/usage/ingest.md | m7i.xlarge | false |
sleeper.table.bulk.import.emr.executor.x86.instance.types | (Non-persistent EMR mode only) The EC2 x86_64 instance types and weights to be used for the executor nodes of the EMR cluster. For more information, see the Bulk import using EMR - Instance types section in docs/usage/ingest.md | m7i.4xlarge | false |
sleeper.table.bulk.import.emr.master.arm.instance.types | (Non-persistent EMR mode only) The EC2 ARM64 instance types and weights to be used for the master node of the EMR cluster. For more information, see the Bulk import using EMR - Instance types section in docs/usage/ingest.md | m7g.xlarge | false |
sleeper.table.bulk.import.emr.executor.arm.instance.types | (Non-persistent EMR mode only) The EC2 ARM64 instance types and weights to be used for the executor nodes of the EMR cluster. For more information, see the Bulk import using EMR - Instance types section in docs/usage/ingest.md | m7g.4xlarge | false |
sleeper.table.bulk.import.emr.executor.market.type | (Non-persistent EMR mode only) The purchasing option to be used for the executor nodes of the EMR cluster. Valid values are ON_DEMAND or SPOT. | SPOT | false |
sleeper.table.bulk.import.emr.executor.initial.capacity | (Non-persistent EMR mode only) The initial number of capacity units to provision as EC2 instances for executors in the EMR cluster. This is measured in instance fleet capacity units. These are declared alongside the requested instance types, as each type will count for a certain number of units. By default the units are the number of instances. This value overrides the default value in the instance properties. It can be overridden by a value in the bulk import job specification. | 2 | false |
sleeper.table.bulk.import.emr.executor.max.capacity | (Non-persistent EMR mode only) The maximum number of capacity units to provision as EC2 instances for executors in the EMR cluster. This is measured in instance fleet capacity units. These are declared alongside the requested instance types, as each type will count for a certain number of units. By default the units are the number of instances. This value overrides the default value in the instance properties. It can be overridden by a value in the bulk import job specification. | 10 | false |
sleeper.table.bulk.import.emr.release.label | (Non-persistent EMR mode only) The EMR release label to be used when creating an EMR cluster for bulk importing data using Spark running on EMR. This value overrides the default value in the instance properties. It can be overridden by a value in the bulk import job specification. | emr-7.2.0 | false |
sleeper.table.bulk.import.min.leaf.partitions | Specifies the minimum number of leaf partitions that are needed to run a bulk import job. If this minimum has not been reached, bulk import jobs will refuse to start | 64 | false |
sleeper.table.bulk.import.job.files.commit.async | If true, bulk import will add files via requests sent to the state store committer lambda asynchronously. If false, bulk import will commit new files at the end of the job synchronously. This is only applied if async commits are enabled for the table. The default value is set in an instance property. | true | false |
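
As an illustration of how the properties above might be combined, here is a minimal sketch of a table properties fragment. It assumes the table's properties are set in a standard Java-style properties file (`key=value` lines, `#` for comments). The values are illustrative choices, not recommendations. Any property left out keeps the default listed in the table.

```properties
# Allow both architectures for the non-persistent EMR cluster.
sleeper.table.bulk.import.emr.instance.architecture=x86_64,arm64

# Use on-demand instances for executors instead of the default SPOT.
sleeper.table.bulk.import.emr.executor.market.type=ON_DEMAND

# Capacity is measured in instance fleet capacity units
# (by default, one unit per instance).
sleeper.table.bulk.import.emr.executor.initial.capacity=4
sleeper.table.bulk.import.emr.executor.max.capacity=20

# Refuse to start bulk import jobs until the table has at least
# this many leaf partitions.
sleeper.table.bulk.import.min.leaf.partitions=128
```

According to the descriptions above, the two capacity values can still be overridden by a value in an individual bulk import job specification.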