Sunday, August 28, 2011

High NFS load causing the "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" hung-task message

Do note that numerous simultaneous writes from NFS clients to the NFS server can cause a severe performance penalty and the system lock-up described below. If you run the "top" utility, you will notice that the load average can be extremely high, because many tasks are queued waiting on locks.
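If you prefer to check this from a script rather than watching "top", here is a minimal sketch that compares the 1-minute load average with the number of CPUs. The function name load_is_high and the factor of 4 are my own assumptions for illustration, not recommended thresholds.

```python
import os


def load_is_high(load_1min, cpu_count, factor=4.0):
    """Return True when the 1-minute load exceeds factor * cpu_count.

    The factor is an arbitrary illustrative threshold (an assumption).
    """
    return load_1min > factor * cpu_count


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    load_1min, load_5min, load_15min = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load averages: {load_1min:.2f} {load_5min:.2f} {load_15min:.2f} on {cpus} CPU(s)")
    if load_is_high(load_1min, cpus):
        print("Load is far above the CPU count; tasks are probably queued.")
```

On Linux the load average also counts tasks in uninterruptible sleep, so nfsd threads blocked on disk or locks push the number up even when the CPUs are mostly idle.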

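One of my researchers was running an intense workload against the NFS server, which eventually triggered the kernel's hung-task warning (the one that tells you "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables the message). Before that, I saw this in the log file: "rpc-srv/tcp: nfsd: got error -104 when sending 140 bytes - shutting down socket". Error -104 is ECONNRESET on Linux, i.e. the connection was reset by the peer.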
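To spot both symptoms quickly, you can scan the log for them. The sketch below is only an illustration: the hung-task pattern assumes the usual kernel wording "task NAME:PID blocked for more than N seconds", the default path /var/log/messages is an assumption about your syslog setup, and the function name scan is my own.

```python
import re
import sys

# Assumed kernel hung-task wording: "task NAME:PID blocked for more than N seconds"
HUNG_TASK = re.compile(r"task (\S+):(\d+) blocked for more than (\d+) seconds")
# nfsd socket error in the form quoted above; -104 is ECONNRESET on Linux.
NFSD_ERROR = re.compile(r"nfsd: got error (-\d+) when sending (\d+) bytes")


def scan(lines):
    """Return (hung_tasks, nfsd_errors) found in an iterable of log lines."""
    hung, errors = [], []
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            # (task name, pid, seconds blocked)
            hung.append((m.group(1), int(m.group(2)), int(m.group(3))))
            continue
        m = NFSD_ERROR.search(line)
        if m:
            # (error code, bytes being sent)
            errors.append((int(m.group(1)), int(m.group(2))))
    return hung, errors


if __name__ == "__main__":
    # Log path is an assumption; pass your own as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as fh:
        hung, errors = scan(fh)
    print(f"{len(hung)} hung-task warning(s), {len(errors)} nfsd send error(s)")
    for name, pid, secs in hung:
        print(f"  task {name} (pid {pid}) blocked for more than {secs}s")
```

If the nfsd send errors show up shortly before the hung-task warnings, that matches the sequence I saw.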

To solve the problem, you have to lighten the load on the NFS server or tune its settings. You may want to take a look at Configuring NFS Server for Performance. A longer-term solution is to move to a parallel file system.

Sometimes it can be caused by other factors, such as drivers. You may want to take a look at Upgrading of Broadcom Drivers to resolve eth0 NIC SerDES Link is Down.
