Configuring NameNode Heap Size
Okay, so my Hadoop cluster has already reached more than 6 million files, and the current settings are no longer suitable for running the NameNode. The total Java heap needs to be increased to about 6 GB.
For details, refer to this article: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_command-line-installation/content/configuring-namenode-heap-size.html
NameNode heap size depends on many factors, such as the number of files, the number of blocks, and the load on the system. The following table provides recommendations for NameNode heap size configuration. These settings should work for typical Hadoop clusters in which the number of blocks is very close to the number of files (generally, the average ratio of number of blocks per file in a system is 1.1 to 1.2).
Some clusters might require further tweaking of the following settings. Also, it is generally better to set the total Java heap to a higher value.
Table 1.11. Recommended NameNode Heap Size Settings
| Number of Files, in Millions | Total Java Heap (Xmx and Xms) | Young Generation Size (-XX:NewSize -XX:MaxNewSize) |
|---|---|---|
| < 1 | 1126m | 128m |
| 1-5 | 3379m | 512m |
| 5-10 | 5913m | 768m |
| 10-20 | 10982m | 1280m |
| 20-30 | 16332m | 2048m |
| 30-40 | 21401m | 2560m |
| 40-50 | 26752m | 3072m |
| 50-70 | 36889m | 4352m |
| 70-100 | 52659m | 6144m |
| 100-125 | 65612m | 7680m |
| 125-150 | 78566m | 8960m |
| 150-200 | 104473m | 8960m |

Note: Hortonworks recommends a maximum of 300 million files on the NameNode.
You should also set -XX:PermSize to 128m and -XX:MaxPermSize to 256m.
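To make the table easier to apply, here is a small lookup helper (a hypothetical script, not part of any Hadoop tooling) that returns the recommended total heap and young generation size for a given file count, using the rows above:

```python
# Recommended NameNode heap sizes, transcribed from the table above.
# Each row: (upper bound in millions of files, total heap MB, young gen MB).
HEAP_TABLE = [
    (1, 1126, 128),
    (5, 3379, 512),
    (10, 5913, 768),
    (20, 10982, 1280),
    (30, 16332, 2048),
    (40, 21401, 2560),
    (50, 26752, 3072),
    (70, 36889, 4352),
    (100, 52659, 6144),
    (125, 65612, 7680),
    (150, 78566, 8960),
    (200, 104473, 8960),
]

def recommended_heap(files_millions):
    """Return (total_heap_mb, young_gen_mb) for the given number of files, in millions."""
    for upper_bound, total_mb, young_mb in HEAP_TABLE:
        if files_millions <= upper_bound:
            return total_mb, young_mb
    raise ValueError("over 200 million files; size the heap manually")
```

For the 6-million-file case described at the top, `recommended_heap(6)` returns `(5913, 768)`, i.e. roughly the 6 GB total heap mentioned above.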
Following are the recommended settings for HADOOP_NAMENODE_OPTS in the hadoop-env.sh file (replacing the ##### placeholder for -XX:NewSize, -XX:MaxNewSize, -Xms, and -Xmx with the recommended values from the table):
-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=##### -XX:MaxNewSize=##### -Xms##### -Xmx##### -XX:PermSize=128m -XX:MaxPermSize=256m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_NAMENODE_OPTS}

If the cluster uses a secondary NameNode, you should also set HADOOP_SECONDARYNAMENODE_OPTS to HADOOP_NAMENODE_OPTS in the hadoop-env.sh file:
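As a worked example for the ~6 million file case above, the hadoop-env.sh entry might look like the following sketch. The heap values are taken from the 5-10 million row of the table (5913m total, 768m young generation); the log paths follow the recommended opts string, so adjust them to your environment.

```shell
# Sketch of a hadoop-env.sh entry for ~6 million files (5-10 million row).
# Placeholder values from the table: -Xms/-Xmx 5913m, NewSize/MaxNewSize 768m.
export HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC \
  -XX:ErrorFile=/var/log/hadoop/\$USER/hs_err_pid%p.log \
  -XX:NewSize=768m -XX:MaxNewSize=768m -Xms5913m -Xmx5913m \
  -XX:PermSize=128m -XX:MaxPermSize=256m \
  -Xloggc:/var/log/hadoop/\$USER/gc.log-\`date +'%Y%m%d%H%M'\` -verbose:gc \
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps \
  -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT \
  \${HADOOP_NAMENODE_OPTS}"
```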
HADOOP_SECONDARYNAMENODE_OPTS=$HADOOP_NAMENODE_OPTS

Another useful HADOOP_NAMENODE_OPTS setting is -XX:+HeapDumpOnOutOfMemoryError. This option specifies that a heap dump should be written when an out-of-memory error occurs. You should also use -XX:HeapDumpPath to specify the location for the heap dump file:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./etc/heapdump.hprof