Recommended size for yarn.nodemanager.resource.local-dirs?

Folks,

What is the recommended value for "yarn.nodemanager.resource.local-dirs"?

We have only one directory configured for this property, and that directory is 200 GB in size.

Our Hive jobs' map/reduce tasks fill this folder up, and YARN then places the node on the blocklist. Moving to the Tez engine and/or increasing the quota size may fix this, but we'd like to know the recommended value.

Best answer

Answer by Sourygna Luangsay

If you use the same partitions for YARN intermediate data as for the HDFS blocks, then you might also consider setting the dfs.datanode.du.reserved property, which reserves some space on those partitions for non-HDFS use (such as intermediate YARN data).

One basic recommendation I saw in my first Hadoop training a long time ago was to dedicate 25% of the "data disks" to that kind of intermediate data. I guess the optimal answer should consider the maximum amount of intermediate data you can have at any one time (when launching a job, do you use all the data in HDFS as input?) and dedicate space to yarn.nodemanager.resource.local-dirs accordingly.
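To make the 25% rule of thumb concrete, here is a rough sizing sketch. The function name and the 12 x 4096 GiB node are made up for illustration; only the 25% figure comes from the answer above. When the disks are shared with HDFS, the per-disk result is one candidate value for the reserve property (named dfs.datanode.du.reserved in stock Hadoop, where it is given in bytes per volume).

```python
# Sketch: apply the 25% rule of thumb to a node's data disks.
# Disk counts and sizes in the example are hypothetical.

GIB = 1024 ** 3


def intermediate_budget(disk_sizes_gib, fraction=0.25):
    """Return (per_disk_bytes, total_bytes) set aside for YARN intermediate data."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    per_disk = [int(size * GIB * fraction) for size in disk_sizes_gib]
    return per_disk, sum(per_disk)


if __name__ == "__main__":
    # Hypothetical node: 12 data disks of 4096 GiB each.
    per_disk, total = intermediate_budget([4096] * 12)
    print("per disk (bytes):", per_disk[0])
    print("node total (GiB):", total // GIB)
```

This is only a starting point. As noted above, the real requirement depends on how much intermediate data your largest concurrent jobs produce.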

I would also recommend turning on the property mapreduce.map.output.compress in order to reduce the size of the intermediate data.
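For reference, a small sketch that prints the corresponding entries in Hadoop's property format, for example for mapred-site.xml. The helper name is hypothetical, and the Snappy codec line is my own assumption rather than part of the recommendation above (it needs the native Snappy libraries on the nodes).

```python
# Sketch: render Hadoop <property> entries as text.
from xml.sax.saxutils import escape


def render_property(name, value):
    """Return one <property> element in Hadoop's configuration format."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    settings = {
        "mapreduce.map.output.compress": "true",
        # Assumption: Snappy as the codec; any installed codec would do.
        "mapreduce.map.output.compress.codec":
            "org.apache.hadoop.io.compress.SnappyCodec",
    }
    for name, value in settings.items():
        print(render_property(name, value))
```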

Answer by Jean-Philippe Player

You would assign one folder to each of the datanode disks, closely mirroring dfs.datanode.data.dir. On a 12-disk system you would have 12 YARN local-dir locations.
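A quick sketch of that layout: build both comma-separated values from the same list of mount points so they stay aligned. The /grid/N mount points and the subdirectory names are hypothetical; substitute whatever your nodes actually use.

```python
# Sketch: one YARN local dir per DataNode data disk.
# Mount points and subdirectory names are hypothetical.
import posixpath


def per_disk_dirs(mount_points, subdir):
    """Join mount_point/subdir for every disk into a comma-separated value."""
    return ",".join(posixpath.join(mount, subdir) for mount in mount_points)


if __name__ == "__main__":
    mounts = [f"/grid/{i}" for i in range(12)]  # 12-disk node
    # Value for dfs.datanode.data.dir
    print(per_disk_dirs(mounts, "hadoop/hdfs/data"))
    # Value for the NodeManager local-dirs property
    # (yarn.nodemanager.local-dirs in stock Hadoop)
    print(per_disk_dirs(mounts, "hadoop/yarn/local"))
```

Spreading the local dirs over all disks also spreads the intermediate I/O, instead of funnelling everything through a single 200 GB directory.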
