[lug] shmem/mmap/HUGETLB question

Doug Pintar ratnip3 at gmail.com
Sat Oct 15 19:32:46 MDT 2011

----- Original Message -----
> madvise() will clear the young bit and put the pages onto the active list.
> But... trying to outsmart the page cache in this way is just never going
> to lead to great joy; you're trying to out-optimize some of the most
> heavily optimized code in the whole system.  Basic streaming I/O is a
> pretty common use case, after all.
> If you're having performance problems, it is almost certainly at a lower
> level.  Which filesystem are you using on each side?  Are there a lot of
> seeks going on?  Are your disks reasonable, and are you getting a whole
> lot worse bandwidth than hdparm tells you is possible?  Is the CPU
> pegged?  If not, no amount of MM trickery will help you.  Run iostat and
> see if one drive is far busier than the other.  You will almost certainly
> find that your problem is not copying pages.
> But, should I be wrong and you *really* think you need to do zerocopy file
> copies, have a look at the splice() system call.
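
For reference, the splice() route suggested above might look roughly like
the sketch below. splice() needs a pipe on one side, so a file-to-file copy
goes file -> pipe -> file. The helper name splice_copy and the 64 KiB chunk
size are assumptions made for illustration, not anything from this thread,
and the sketch has not been benchmarked against a plain read()/write() loop.

```c
#define _GNU_SOURCE   /* splice() is a Linux-specific call */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 65536   /* assumed chunk size; typically the default pipe capacity */

/* Hypothetical helper: copy from in_fd to out_fd until EOF, moving the
 * data through a pipe with splice().  Returns the number of bytes copied,
 * or -1 on failure. */
ssize_t splice_copy(int in_fd, int out_fd)
{
    int p[2];
    int saved_errno;
    int failed = 0;
    ssize_t total = 0;

    if (pipe(p) < 0)
        return -1;

    while (!failed) {
        /* Stage 1: file -> pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, CHUNK, SPLICE_F_MOVE);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            failed = 1;
        if (n <= 0)
            break;            /* 0 means end of input */

        /* Stage 2: pipe -> file; drain everything stage 1 queued. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m < 0 && errno == EINTR)
                continue;
            if (m <= 0) {
                failed = 1;
                break;
            }
            n -= m;
            total += m;
        }
    }

    /* Keep errno from the failing call intact across close(). */
    saved_errno = errno;
    close(p[0]);
    close(p[1]);
    errno = saved_errno;
    return failed ? -1 : total;
}
```

A caller would open the source with O_RDONLY and the destination with
O_WRONLY | O_CREAT | O_TRUNC, then call splice_copy(in_fd, out_fd).  The
destination must not be opened with O_APPEND, since splice() rejects that
with EINVAL.
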
Thanks for your helpful and well-considered answer.  I'm using ext4 on both
disks, there doesn't seem to be any heavy seeking, and transfer bandwidth is
about half the maximum of the slower of the two devices.  The CPU is not
busy and the system is pretty idle.

I'm not a newbie at this stuff; I've been doing device drivers and
performance optimization since I helped in the original port of AT&T SVr3
to the Intel 386 way back in 1985.  I wrote the Interactive Systems High
Performance Device Driver and, working with engineers at Adaptec, helped
pioneer the first SCSI controller that could do
multi-discontiguous-physical-page I/O, speeding up transfer rates by a
factor of 8.

I understand a lot of work has been done on the paging code, but one man's
optimized is another's ripe field.  No disrespect to all the fine Linux
folks whose fraternity I'm glad to be (re-)joining after some years'
absence, but a great many of the "tricks" that were absolutely _necessary_
back in the old 16-bit minicomputer days have simply been lost or have
faded into obscurity.  Given that my current processor is about 1,000 times
faster and the disk access at least 50 times faster than my old X-windows
16MB-RAM i386 from 1987, it just seems that system performance could have
improved by considerably more than it has.  It takes longer to boot Linux
and get to the window manager than SVr3 did on that old box.  I know, I
know, there's a lot more functionality and flexibility than the old system
had; I just suspect that some of us "old dogs" might surprise you with some
old (but new every generation) tricks.

I'll look into this more; this was just a lark I started on a couple of
days ago, being bored and needing something to write.  Thank you again for
your assistance, and I'll keep the group posted.
Doug Pintar