Some thoughts on NFS

	Some time back, a few years now, I did some NFS performance testing
	with a Linux box running a software RAID array. The box was a
	Dell 2300, if I recall correctly, and ran the 2.4 Linux kernel.
	It was basically an mp3 server for a bunch of developers to store
	their audio files. Note to the RIAA: this was not a p2p sharing
	system.

	At the time we saw greatly improved NFS performance with the 2.4
	Linux kernel. I'm a BSD guy myself, and a while back I acquired an
	LSI Logic SCSI->IDE RAID controller for my own home RAID array.
	Until recently I had never tweaked the server/client connections;
	the array saw little use beyond serving as a fancy home directory.
	I have since deployed some scripts I am developing that do more
	reads and writes to the array, and I have also noticed that Mozilla
	and Pine, both of which read and write on an NFS mount point, have
	gotten slower. That prompted me to investigate further.


	I run several FreeBSD versions at home. My RAID server recently
	suffered the death of its primary disk (note to self: investigate
	NFS root), and I had to upgrade to 4.9-RELEASE (now cvsup'd to
	-STABLE). I have two test mules: a Dell 600SC running 5.1-RELEASE
	and my workstation running 4.9-STABLE (another note to self:
	update 5.1-RELEASE). In the past I had only mounted my RAID array
	using minimal options:

		rw   -> read/write
		soft -> soft mount (deprecated, use -s)
		intr -> interruptible (deprecated, use -i)
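
	For the sake of a complete example, a mount using just those
	options looks like the line below. The server name and mount
	point are hypothetical placeholders; the rest of this write-up
	omits them entirely:

		# hypothetical server and mount point
		mount -t nfs -o soft,intr,rw raid:/export/array /mnt/array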

	By default an NFS mount here uses NFSv2 over UDP. In the past I have
	always been hesitant to use v3 over TCP because of my perception
	that the extra overhead would hurt performance. Interestingly, my
	simple testing does not bear that out. It is quite possible that,
	depending on the data, UDP could outperform TCP, but with my
	current setup I do not see much difference.

	While there are numerous articles on NFS tuning, none seemed to
	answer the questions I had; if they are answered somewhere, I
	could not easily find it. My test method is not exact either, but
	I just wanted a usable baseline against which to judge my results.
	I used dd to create files of 100 MB and 1 MB in size, which should
	give a reasonable comparison of large and small files for my
	needs, since my array holds audio files (1-8 MB) and large ISO
	images and backups (100-650 MB). I chose the dd block size using
	the WAG method.
	
	For each test I perform seven passes, note the time reported by
	the dd command, throw out the highest and lowest results, and
	average the remaining five. A script along these lines is
	sketched below.
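
	The sort of harness I have in mind looks something like this. It
	is only a rough /bin/sh sketch under a few assumptions: the array
	is mounted on /mnt/array, each pass is timed with time -p rather
	than dd's own summary line, and the counts match the tests below;
	adjust to suit:

		#!/bin/sh
		# Sketch: run dd against the NFS mount several times, drop the
		# fastest and slowest pass, and average the remaining passes.
		MNT=/mnt/array        # assumed mount point
		PASSES=7
		BS=1024
		COUNT=102400          # 102400 x 1k blocks = 100 MB (use 1024 for 1 MB)

		times=""
		i=1
		while [ $i -le $PASSES ]; do
			# time -p prints "real <seconds>" on stderr
			t=`/usr/bin/time -p dd if=/dev/zero of=$MNT/nfs.test bs=$BS count=$COUNT 2>&1 | awk '/^real/ {print $2}'`
			echo "pass $i: $t sec"
			times="$times $t"
			i=`expr $i + 1`
		done

		# sort the timings, drop the first (lowest) and last (highest),
		# and average what is left
		echo $times | tr ' ' '\n' | sort -n | sed '1d;$d' | \
			awk '{ sum += $1 } END { printf "average: %.4f sec\n", sum/NR }'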

	-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
	test 1: standard setup (for me)

	mount -t nfs -o soft,intr,rw
	dd if=/dev/zero of=./nfs.test bs=1024 count=102400
	average time: 27 seconds
	highest time: 34 (discarded)
	lowest time : 25 (discarded)

 	dd if=/dev/zero of=./nfs.test bs=1024 count=1024	
 	average time: 0.0535 seconds
	highest time: 0.0546 (discarded)
	lowest time : 0.0533 (discarded)	

	conclusion: none; this run just establishes the baseline.
	
        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
	test 2: setting the read/write sizes (-r/-w) to 8192

	mount -t nfs -o -r=8192,-w=8192,soft,intr,rw
	dd if=/dev/zero of=nfs.test bs=1024 count=102400
	average time: 27
	highest time: 29.03 (discarded)
	lowest time : 26.17 (discarded)

	dd if=/dev/zero of=nfs.test bs=1024 count=1024
	average time: 0.0534
	highest time: 0.0671 (discarded)
	lowest time : 0.0532 (discarded)

	conclusion: the 8192-byte read/write sizes did not noticeably
	improve performance over the defaults.
	
	-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
	test 3: increasing the read/write sizes to 32768
	
	mount -t nfs -o -r=32768,-w=32768,soft,intr,rw
	dd if=/dev/zero of=nfs.test bs=1024 count=102400
	average time: 24 sec
        highest time: 25.65 (discarded)
        lowest time : 24.14 (discarded)

        dd if=/dev/zero of=nfs.test bs=1024 count=1024
        average time: 0.012206 sec
        highest time: 0.012690 (discarded)
        lowest time : 0.011679 (discarded)

        conclusion: the larger read/write sizes improved large-file
	performance slightly but dramatically improved the small-file
	times.

        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
        test 4: NFSv3, standard options
        
        mount -t nfs -o -3,-T,soft,intr,rw
        dd if=/dev/zero of=nfs.test bs=1024 count=102400
        average time: 26.5
        highest time: 32.03 (discarded)
        lowest time : 26.24 (discarded)

        dd if=/dev/zero of=nfs.test bs=1024 count=1024
        average time: 0.0627
        highest time: 0.0698 (discarded)
        lowest time : 0.0568 (discarded)

        conclusion: large-file performance was on par with v2;
	small-file performance seems a bit worse.

        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
        test 5: NFSv3, with 8192 read/write sizes
        
        mount -t nfs -o -3,-T,-r=8192,-w=8192,soft,intr,rw
        dd if=/dev/zero of=nfs.test bs=1024 count=102400
        average time: 25.46
        highest time: 28.83 (discarded)
        lowest time : 24.54 (discarded)
        
        dd if=/dev/zero of=nfs.test bs=1024 count=1024
        average time: 0.06405
        highest time: 0.07798 (discarded)
        lowest time : 0.05759 (discarded)

        conclusion: again v2 holds its own; v3 was slightly faster on
	the large file but slower on the small one.

        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
        test 6: NFSv3, with 32768 read/write sizes
        
        mount -t nfs -o -3,-T,-r=32768,-w=32768,soft,intr,rw
        dd if=/dev/zero of=nfs.test bs=1024 count=102400
        average time: 22.99
        highest time: 24.10 (discarded)
        lowest time : 22.46 (discarded)
        
        dd if=/dev/zero of=nfs.test bs=1024 count=1024
        average time: 0.01534
        highest time: 0.01587 (discarded)
        lowest time : 0.01518 (discarded)

        conclusion: again we see results similar to v2 at the same
	read/write sizes.

	-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


        Conclusions: v3 does seem to add some overhead at the file
	sizes I am working with, but overall the difference is slight.
	With either v2 or v3, performance is affected far more by the
	read and write size settings (-r/-w) than by anything else.
	v3 with the large read/write sizes seemed to perform best; an
	fstab entry for that combination is sketched below.
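
	For reference, something along these lines in /etc/fstab would
	make that combination the default for the mount. The server name
	and paths are hypothetical placeholders:

		# NFSv3 over TCP with 32k read/write sizes
		raid:/export/array  /mnt/array  nfs  rw,-3,-T,-r=32768,-w=32768,soft,intr  0  0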

	There is still much to be learned here, though. I would like to
	rerun these tests with a bit more automation. Other variables I
	would like to play with include smaller output file sizes and
	larger dd block sizes.

 	I also want to try v4 and see how it does, but that will
	require an upgrade of my file server.

UPDATE 13 Feb 2004

	I felt it necessary to report some more details on my NFSv3
	experiments. I have been running v3 over TCP for a while with
	mixed results. Opening large files over NFS, such as mail
	spools in Pine, resulted in NFS timeout error messages.

	I tweaked these FreeBSD sysctls:

		vfs.nfs.access_cache_timeout=6 (default was 2)
		vfs.nfs.gatherdelay_v3=10000 (default was 0)

	but I was still seeing timeouts. Even worse, after altering
	the sysctls I actually saw NFS "deafness" occur at random.
	I also increased the number of nfsd daemons on my server and
	nfsiod daemons on my client (see the sketch below); neither
	seemed to fix the issues. I changed back to NFSv2 over UDP on
	my workstation and stability has improved.
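
	For the record, the daemon counts were bumped along the lines of
	the rc.conf and sysctl.conf entries below. The counts shown are
	examples rather than the exact values I settled on:

		# /etc/rc.conf on the server: serve UDP and TCP with
		# six nfsd processes (example count)
		nfs_server_enable="YES"
		nfs_server_flags="-u -t -n 6"

		# /etc/rc.conf on the client: run six nfsiod processes
		# (example count)
		nfs_client_enable="YES"
		nfs_client_flags="-n 6"

		# /etc/sysctl.conf on the client: the sysctl tweaks above
		vfs.nfs.access_cache_timeout=6
		vfs.nfs.gatherdelay_v3=10000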

	I suspect the issue might be related to the 4.x code base, so
	I want to try upgrading to 5.x once it has a little age on it.

	I will also add that the performance results above are not a
	proper test of stability or of overall performance; they are
	only intended to show how the various settings affected
	transfers. For future investigation I am writing a test script
	that will better cover various file sizes and allow for
	multiple clients.



Copyright © 2003,2004 - Steven St.Laurent - steven@403forbidden.net