bob wrote: The alternative is doing a _ton_ of I/O that is totally unnecessary. I'd rather swap out the 2 4K pages out of 64 that were actually modified, rather than all 64 of them. Etc...
If writing/reading the 64 pages takes as long as, or 1% longer than, reading/writing the single page, I would prefer writing the 64 pages.
It is clear that demand paging does not work acceptably if the working set is scattered out in infinitesimal chunks over the entire address space (e.g. hash tables in a Chess engine). Neither with 4KB paging, nor with 256KB paging.
First, I thought we were discussing this in the context of a chess engine, not in terms of general-purpose applications?
Secondly, operating systems definitely use a fault-in-on-demand strategy at program startup and while the program is running. I do not want to page my program, and therefore I limit the size of everything to be certain it will fit in physical memory, assuming nothing else needs to run along with me. But I know that on a hash probe, I do not want to read 1/4 megabyte of data when I am only going to use 16 bytes. On the large magic number tables, I do not want to read in 1/4 megabyte to get to a 64 bit or 32 bit value. The list goes on. Because if I do have to page, I do not want to have to page out 256K chunks when only 1, 2 or 16 bytes have been changed.
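To make that access pattern concrete, here is a minimal sketch of a transposition-table probe, assuming a 16-byte entry; the names are hypothetical and this is not Crafty's actual code. Each probe lands at an essentially random index and touches exactly one entry, so faulting in a 256KB page would drag in roughly 16,000 entries to use one of them.

[code]
#include <stdint.h>

typedef struct {
    uint64_t key;   /* Zobrist signature of the position */
    uint64_t data;  /* packed score, depth, bound, move  */
} hash_entry_t;     /* 16 bytes total                    */

hash_entry_t *hash_table;  /* allocated elsewhere            */
uint64_t      hash_mask;   /* (table size, a power of 2) - 1 */

int probe(uint64_t key, uint64_t *data)
{
    /* effectively random index into a table that may span gigabytes */
    hash_entry_t *h = hash_table + (key & hash_mask);
    if (h->key == key) {   /* one 16-byte read decides hit or miss */
        *data = h->data;
        return 1;
    }
    return 0;
}
[/code]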
Large page sizes have their place, but then so do small ones. The O/S does its file I/O buffering with page-sized buffers, and 256K is too big for that unless your program spends all its time reading/writing.
In general, if we have to use a "one-size-fits-all" approach, smaller is better, because _most_ running applications are very small.
So programmers will always be able to write programs that will make performance drop to 0.0000000001 if they are not the only program running on the machine, or if the machine has too little memory. The burden to design their software such that it does not run like crap under adverse conditions is on them, in particular to design it with some locality in mind. Now that typical software is so bloated that you need GB memories to run it, working sets will also be larger. How many different stages does a program go through, and how many different working sets will it have during its execution? Millions? Or dozens? 256KB pages would offer enough flexibility to handle decently designed programs, which would put their initialization code in a contiguous block.
How would you do that? One large function? Poor software engineering. Lots of small functions? No way to put 'em into one contiguous block then, since we don't have any semantics to cause that to happen.
Again, the general feeling among computer architecture and O/S designers is that if the choice is large or small, small is better. Were it dynamically variable, then yes, you could tune page size to a specific application's behavior and help things.
But right now it is more about memory wastage than paging costs, since most systems are configured to minimize actual paging I/O. And that's the bad mark on the 4M page size: it greatly reduces TLB misses on very large programs, but if you use it on small programs, memory wastage is very high.
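For what it's worth, that kind of per-application tuning is possible on Linux without forcing large pages on everything: a single allocation can opt in to huge pages while the rest of the system stays on 4KB pages. A sketch, assuming a 2.6.32+ kernel with huge pages reserved by the administrator; this is an illustration only, not how Crafty actually allocates its table.

[code]
/* Opt one allocation (e.g. the hash table) in to large pages on Linux,
   leaving everything else in the process on ordinary 4KB pages.
   Assumes MAP_HUGETLB support (kernel 2.6.32+) and huge pages reserved
   via /proc/sys/vm/nr_hugepages. */
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>

void *alloc_hash(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        /* no huge pages available: fall back to normal 4KB pages */
        p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            exit(1);
        }
    }
    return p;
}
[/code]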
Your Crafty example is totally misleading. How much of the memory footprint do you save because the initialization code can be swapped out? I would be surprised if it is more than 1%. You don't have heavy swapping because you set your hash-table size to a value where you don't have it! Not because there is demand paging on your code segment.
I will guarantee you that you will see paging in any program. It is unavoidable; we just work to minimize it. But we do have to start a game from scratch, and we have to page there. But as I mentioned, I worry far less about that than I do about the idea of wasting memory in the _general_ case. For Crafty, I certainly prefer 4M pages. I have reported on that here in the past and pointed out that performance was _higher_ because of the limited TLB we have. And TLB misses are _bad_. But I would not want to run with 4mb pages exclusively, because of the 100+ system daemons/tasks that run: I don't want every one of them to have a 4mb page for code, a 4mb page for read-only data, and a 4mb page for read/write data, when 128K will do for the whole thing. That's where large page sizes break down and hurt performance, if they _must_ be used everywhere.
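A back-of-the-envelope calculation of that waste, using the figures above (roughly 100 daemons, each with three separately protected regions that together fit in 128K); the numbers are illustrative, not measurements.

[code]
#include <stdio.h>

int main(void)
{
    long daemons  = 100;              /* small background tasks          */
    long regions  = 3;                /* code, read-only data, r/w data  */
    long need     = 128L * 1024;      /* real footprint per daemon       */
    long small_pg = 4L * 1024;
    long huge_pg  = 4L * 1024 * 1024;

    /* with 4KB pages each daemon rounds up to its ~128K footprint */
    long small_use = daemons * ((need + small_pg - 1) / small_pg) * small_pg;
    /* with 4MB pages each protected region rounds up to a full 4MB */
    long huge_use  = daemons * regions * huge_pg;

    printf("4KB pages: %ld MB resident\n", small_use >> 20);  /* ~12 MB   */
    printf("4MB pages: %ld MB resident\n", huge_use  >> 20);  /* ~1200 MB */
    return 0;
}
[/code]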
If I were to design an OS with 256KB pages, of course it would assign contiguous 256KB blocks in the swap file to each page. Not doing so would be asking for trouble. Disks typically run at 7200 rpm, i.e. 8.33 ms per rotation, so the average rotational delay would be 4.16 ms. Transfer rates are 50-100 MB/s, so rotational delay alone on a random access corresponds to 200-400KB of data, and a full rotation to 400-800KB. Wasting that I/O bandwidth to read a 4KB page would be madness; it would cut the amount of data you could page in per second a hundredfold! Even if there was only a 1% probability that a neighboring page would be needed in the near future, it would pay to read it speculatively, since the read head is already positioned to pick it up along with the page we wanted.
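Working that arithmetic out explicitly, with the drive parameters quoted above (the exact figures are illustrative):

[code]
#include <stdio.h>

int main(void)
{
    double rpm       = 7200.0;
    double rot_ms    = 60000.0 / rpm;   /* 8.33 ms per full rotation        */
    double avg_delay = rot_ms / 2.0;    /* 4.17 ms average rotational delay */
    double rates[]   = { 50e6, 100e6 }; /* transfer rates in bytes/second   */

    for (int i = 0; i < 2; i++) {
        /* data that could have streamed past during the rotational delay */
        double wasted_kb = rates[i] * (avg_delay / 1000.0) / 1024.0;
        double t_4k   = avg_delay + 4096.0   / rates[i] * 1000.0;
        double t_256k = avg_delay + 262144.0 / rates[i] * 1000.0;
        printf("%3.0f MB/s: rotational delay alone covers %4.0f KB; "
               "4KB read %.2f ms, 256KB read %.2f ms\n",
               rates[i] / 1e6, wasted_kb, t_4k, t_256k);
    }
    return 0;
}
[/code]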
None of that matters much in the grand scheme of an operating system. The issue is how much of main memory is actually _used_ vs. how much is wasted because of the large page sizes. A large number of small jobs, such as those spawned by unix or windows, increases waste significantly. And since a single page must have a single access control type, the typical program will require at least two different access controls, sometimes 3. Bringing in such large chunks also invalidates a lot of cache, and requires overhead to force dirty lines back to memory before the I/O is started. Not much is free... So yes, for a chess engine, which is a large application, large page sizes are good. I've already measured that a year or two back. But if you force it on everyone by supporting just a 256K (or whatever) page size, it is not so good.