Ideas for Async-IO model

Summary:

Select not used for disk I/O. Use a short select timeout if there are I/O operations pending, to keep things flowing quickly.
Never have more than one outstanding I/O operation on one FD
Multiple writes should be joined into one larger write
I/O buffers locked by the I/O functions. Some kind of lock-count needed. All locking/unlocking done in the main thread when queueing operations or receiving completed operations.
Try to balance the I/O load on the available disks.
Make larger composite I/O operations, to avoid unnecessary inter-thread communication.
Use some kind of internal "file" to keep state info. Not based on UNIX FD. A "file" does not necessarily have a filesystem file connected to it at all times.
One FD for each client, to avoid file pointer issues
Disk cache filedescriptors are expendable. It is allowable to scrap "idle" cache filedescriptors if we run low on filedescriptors. The filedescriptor in itself does not contain any stateful information.
The main code should NOT care about disk filedescriptors. This is only a matter for disk I/O routines.
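
A minimal sketch of the internal "file" idea, to make the expendable-descriptor point concrete. The names disk_file, disk_file_attach, disk_file_scrap and disk_file_read are hypothetical, and the use of pread() to keep the offset out of the descriptor is an assumption of this sketch, not something the notes prescribe.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical internal "file": all state lives here, none in the FD. */
typedef struct {
    const char *path;   /* where the object lives on disk */
    off_t offset;       /* next position to read, kept outside the FD */
    int fd;             /* -1 when no descriptor is currently attached */
    int pending;        /* non-zero while an I/O operation is outstanding */
} disk_file;

/* Attach a descriptor on demand; returns 0 on success, -1 on failure. */
static int disk_file_attach(disk_file *f)
{
    assert(f != NULL);
    if (f->fd >= 0)
        return 0;
    f->fd = open(f->path, O_RDONLY);
    return f->fd >= 0 ? 0 : -1;
}

/* Scrap the descriptor of an idle file; returns 1 if one was released. */
static int disk_file_scrap(disk_file *f)
{
    assert(f != NULL);
    if (f->fd < 0 || f->pending)
        return 0;
    close(f->fd);
    f->fd = -1;
    return 1;
}

/* Read from the remembered offset, reattaching a descriptor if needed. */
static ssize_t disk_file_read(disk_file *f, void *buf, size_t len)
{
    ssize_t n;
    if (disk_file_attach(f) < 0)
        return -1;
    n = pread(f->fd, buf, len, f->offset);
    if (n > 0)
        f->offset += n;
    return n;
}
```

Because the offset is stored in the "file" and not in the descriptor, scrapping an idle descriptor loses nothing: the next read simply reopens the path and continues where it left off.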

open read-only
open write-only
open write-only-append
read up to 32K. Less if only header is wanted. More if both header and object wanted.
write any amount. Queue writes up to 32K before scheduling a write.
open readonly+stat+read(+close if object small enough)
open writeonly+write(+close if object small enough)
write+close
abort? (may not be needed)
mmap()+write() for sending data to the client?
fsync on close? (may be a good idea in order to limit buffer cache).
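
The "queue writes up to 32K" rule could be sketched as follows. This is only an illustration: write_queue, queue_write and schedule_write are hypothetical names, and schedule_write merely counts batches where a real implementation would hand the joined buffer to an I/O thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WRITE_BATCH 32768   /* the 32K threshold from the notes */

typedef struct {
    char data[WRITE_BATCH]; /* small writes are joined here */
    size_t used;            /* bytes currently queued */
    int scheduled;          /* batches handed to the I/O layer so far */
} write_queue;

/* Stand-in for handing one joined write to an I/O thread. */
static void schedule_write(write_queue *q)
{
    if (q->used == 0)
        return;             /* nothing queued, nothing to schedule */
    q->scheduled++;
    q->used = 0;
}

/* Queue any amount; a write is scheduled each time 32K has accumulated. */
static void queue_write(write_queue *q, const char *buf, size_t len)
{
    while (len > 0) {
        size_t room = WRITE_BATCH - q->used;
        size_t n = len < room ? len : room;
        memcpy(q->data + q->used, buf, n);
        q->used += n;
        buf += n;
        len -= n;
        assert(q->used <= WRITE_BATCH);
        if (q->used == WRITE_BATCH)
            schedule_write(q);
    }
}
```

On close, one final schedule_write() call flushes whatever is left, which corresponds to the write+close operation in the list above.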

Use the disk with the most available space and an idle thread. The definition of idle can be that no requests are queued on the thread.
If there is no idle thread then put the request on a global swapout queue. Threads check this global queue before idling. The chances of a stall (swapout requests queued while all threads are idle) caused by not having a signal tied to this queue are close to non-existent, as requests are only queued here when all threads have requests queued. If it should occur it is non-fatal, as the next request gets things going again.
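
The selection rule above can be sketched like this. disk_thread, pick_disk and GLOBAL_QUEUE are hypothetical names, and the sketch assumes one thread per disk for simplicity.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long free_kb;   /* available space on this disk */
    int queued;     /* requests currently queued on this disk's thread */
} disk_thread;

#define GLOBAL_QUEUE (-1)   /* "no idle thread: use the global swapout queue" */

/* Return the index of the idle disk with the most free space,
 * or GLOBAL_QUEUE when every thread already has requests queued. */
static int pick_disk(const disk_thread *d, size_t n)
{
    int best = GLOBAL_QUEUE;
    size_t i;
    assert(d != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (d[i].queued != 0)
            continue;       /* not idle */
        if (best == GLOBAL_QUEUE || d[i].free_kb > d[best].free_kb)
            best = (int)i;
    }
    return best;
}
```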

Use one thread pool for each cache_dir. Runtime configurable amount of threads for each cache_dir.
Use one I/O queue per disk or thread. If I/O operations to one FD should be isolated to one thread then one queue per thread is needed, but due to thread communication races it may be more efficient to have one queue per disk, and allow filedescriptors to be shared between them. The combined operations listed above should effectively cut down the number of inter-thread operations per swapin/swapout to one for most objects. (open-stat-read-close, or open-write-close).
Use a configurable amount of "done" queues. Initially one, but it should be possible to increase the amount if locking congestion gets to be a problem.
Keep mutex locks for as short period of time as possible.
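
One way to keep the "done" queue lock short is to let the main thread detach the whole list in one step and process it after unlocking. A sketch, assuming POSIX threads; done_op, done_queue, done_push and done_take_all are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct done_op {
    struct done_op *next;
    int result;             /* outcome of the completed I/O operation */
} done_op;

typedef struct {
    pthread_mutex_t lock;
    done_op *head;
} done_queue;

/* I/O thread side: the lock is held for two pointer assignments only. */
static void done_push(done_queue *q, done_op *op)
{
    assert(op != NULL);
    pthread_mutex_lock(&q->lock);
    op->next = q->head;
    q->head = op;
    pthread_mutex_unlock(&q->lock);
}

/* Main thread side: detach everything at once, process outside the lock. */
static done_op *done_take_all(done_queue *q)
{
    done_op *list;
    pthread_mutex_lock(&q->lock);
    list = q->head;
    q->head = NULL;
    pthread_mutex_unlock(&q->lock);
    return list;
}
```

Note that this sketch returns completions newest first. If completion order matters, the main thread can reverse the detached list, still without holding the lock.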

High overhead for setting up and tearing down mmap() entries
No read-ahead done by the OS. Most OSes only demand-page mmap()ed files.
mmap()+write() makes it even harder for people to change Squid to process body content when data is sent down to clients.
Maybe this does not matter; such processing should be done before storing the object, so that the result of the processing is what gets cached.

Always do the processing before storage
Do not post-process objects when sent to the client from cache. Future versions of Squid may make it impossible to change the object body of a cached object as it is sent directly from disk to the waiting client and never even seen by the main Squid process.
Store rewritten objects using a special store key format. Do not store rewritten objects using the same store key as non-rewritten objects. This is important for adequate peering relationships.
Change client_side.c to look up rewritten objects where appropriate.
When peering, do so using store digests and Squids using the same kind of rewrites. Look up store digests in the same way as local storage is queried, with the exception that non-modified objects are looked up even if local storage never contains non-rewritten objects.
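
To illustrate keeping rewritten and non-rewritten objects under different store keys, here is a sketch with a purely hypothetical key layout ("rw=<rewrite-id> <url>"). The notes do not define the special key format, and make_store_key and is_rewritten_key are invented names for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: rewritten objects get "rw=<rewrite-id> <url>",
 * non-rewritten objects keep the plain URL as their key.
 * Returns 0 on success, -1 if the key does not fit in out. */
static int make_store_key(char *out, size_t outlen,
                          const char *url, const char *rewrite_id)
{
    int n;
    assert(out != NULL && url != NULL);
    if (rewrite_id != NULL && rewrite_id[0] != '\0')
        n = snprintf(out, outlen, "rw=%s %s", rewrite_id, url);
    else
        n = snprintf(out, outlen, "%s", url);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/* True if the key was built for a rewritten object. A URL cannot start
 * with "rw=", since a scheme name must be followed by ':'. */
static int is_rewritten_key(const char *key)
{
    return strncmp(key, "rw=", 3) == 0;
}
```

With such a scheme, a peer applying the same rewrite builds the same key and can match it in a store digest, while a lookup for the non-rewritten object uses the plain key and never collides with the rewritten copy.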

Written by Henrik Nordström <hno@squid-cache.org>, last changed 1998-09-20