Below is a table containing all the details for the property group: Ingest
Property Name | Description | Default Value | Run CdkDeploy When Changed |
---|---|---|---|
sleeper.table.ingest.file.writing.strategy | Specifies the strategy that ingest uses to create files and references in partitions. Valid values are: [one_file_per_leaf, one_reference_per_leaf] | one_reference_per_leaf | false |
sleeper.table.ingest.record.batch.type | The way in which records are held in memory before they are written to a local store. Valid values are 'arraylist' and 'arrow'. The arraylist method is simpler, but it is slower and requires careful tuning of the number of records in each batch. | arrow | false |
sleeper.table.ingest.partition.file.writer.type | The way in which partition files are written to the main Sleeper store. Valid values are 'direct' (which writes using the s3a Hadoop file system) and 'async' (which writes locally and then copies the completed Parquet file asynchronously into S3). The direct method is simpler but the async method should provide better performance when the number of partitions is large. | async | false |
sleeper.table.ingest.job.files.commit.async | If true, ingest tasks will add files via requests sent to the state store committer lambda asynchronously. If false, ingest tasks will commit new files synchronously. This is only applied if async commits are enabled for the table. The default value is set in an instance property. | true | false |
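
As a minimal sketch, the properties in this group could be set explicitly in a properties file as shown below. The values are the defaults listed in the table. Where this fragment lives (for example, a per-table properties file) is an assumption and is not stated in this section.

```properties
# Illustrative fragment only: each value is the default from the table above.
sleeper.table.ingest.file.writing.strategy=one_reference_per_leaf
sleeper.table.ingest.record.batch.type=arrow
sleeper.table.ingest.partition.file.writer.type=async
# Takes effect only if async commits are enabled for the table.
sleeper.table.ingest.job.files.commit.async=true
```

None of these properties is marked as requiring CdkDeploy when changed.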