Component | Version |
---|---|
HDI | 3.6 |
HDP | 2.6.5.3008-11 |
Hadoop | 2.7.3.2.6.5.3008-11 |
HBase | 1.1.2.2.6.5.3008-11 |
Nodes | Cores | Memory(GB) | Type | |
---|---|---|---|---|
Hadoop Master | 3 | 4 | 28 | Standard_D12_V2 |
Hadoop Worker | 10 | 8 | 56 | Standard_DS13_V2 |
Loading Queue | 20 | 4 | 16 | Standard_D4s_v3 |
Number of loaded files across time. We can differentiate some sections with different performance.
The more representative section is the last one, where we upgraded the disk to speed up the reading. In average, with the improved disk, processing up to 20 files simultaneously:
Time | Time/nodes | |
---|---|---|
Transform | 00:29:36 | 00:01:28 |
Load | 00:46:19 | 00:02:19 |
Total | 1:15:55 | 00:03:48 |
Index speed: 15.8 files/h or 79.0GB/h
#Files Day Hour 10 2019-07-10 19 4 2019-07-10 20 15 2019-07-10 21 30 2019-07-10 22 26 2019-07-10 23 28 2019-07-11 00 25 2019-07-11 01 23 2019-07-11 02 25 2019-07-11 03 28 2019-07-11 04 30 2019-07-11 05 30 2019-07-11 06 31 2019-07-11 07 30 2019-07-11 08 28 2019-07-11 09 28 2019-07-11 10 30 2019-07-11 11 32 2019-07-11 12 27 2019-07-11 13 27 2019-07-11 14 45 2019-07-11 15 14 2019-07-11 16 29 2019-07-11 17 35 2019-07-11 18 39 2019-07-11 19 11 2019-07-11 20 |
74.096.015 variants
Prepare: 529.303s [ 00:08:49 ]
Aggregate: 9591.626s [ 02:39:52 ]
Write: 7012.733s [ 01:56:53 ] -> Size : 59.5 GiB
1352.675s [ 00:22:33 ]
Prepare: 722.327s [ 00:12:02 ]
Annot: 50384.383s [ 13:59:44 ]
Load: 28204.666s [ 07:44:04 ]
SampleIndex: 12403.542s [ 03:26:44 ]
.....
Table of Contents: