This patch is generated from the sfs branch of HEAD in squid
Wed Sep 29 01:30:51 2004 GMT
See http://devel.squid-cache.org/

Index: squid/doc/debug-sections.txt
diff -u squid/doc/debug-sections.txt:1.3 squid/doc/debug-sections.txt:1.3.6.1
--- squid/doc/debug-sections.txt:1.3 Sun Jan 7 15:53:36 2001
+++ squid/doc/debug-sections.txt Thu Jan 25 07:15:19 2001
@@ -86,3 +86,4 @@
 section 79 HTTP Meter Header
 section 80 WCCP
 section 81 Store Removal/Replacement policy
+section 82 SFS
Index: squid/src/mime.c
diff -u squid/src/mime.c:1.8 squid/src/mime.c:1.8.4.1
--- squid/src/mime.c:1.8 Fri Jan 12 00:20:33 2001
+++ squid/src/mime.c Tue Feb 6 07:43:36 2001
@@ -449,4 +449,5 @@
     debug(25, 3) ("Loaded icon %s\n", url);
     storeUnlockObject(e);
     memFree(buf, MEM_4K_BUF);
+    storeDirSync();	/* to handle flushing IO and calling completions */
 }
Index: squid/src/fs/sfs/CHANGELOG
diff -u /dev/null squid/src/fs/sfs/CHANGELOG:1.1.2.1
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/CHANGELOG Wed Jan 24 06:11:54 2001
@@ -0,0 +1,52 @@
+Changelog for sfs.
+
+---
+sfs-0.2 - 19990202
+---
+
+Altered types to use uintxx_t types - should be more portable, hopefully.
+I've a small concern about some of them still, especially sscanf places, but
+I'll sort that out eventually. Also cleared last of purify-related errors,
+and all bar one compiler warning - Solaris 'cc' warns about incorrect
+type when passing (void *)function into one of the pthread functions.
+
+Using uintxx_t also means I _know_ how big each data structure/variable is.
+This is becoming increasingly important ;)
+
+---
+sfs-0.1 - 19990201
+---
+
+First 'versioned' file - this contains a cleaned up Makefile and some cleaned
+up dependencies (thanks to Oskar Pearson). List of changes from his patch:
+
+o A real makefile
+o Makefile includes linux and solaris sections. Defines _REENTRANT
+  for Linux. You have to manually specify this.
+  Currently linux just core-dumps with me immediately. I will try
+  and track this down.
+o Compiles almost without warnings with 'gcc -Wall'. There are a
+  couple of things that I was not up to fixing immediately: noted them
+  with 'XXX'. I removed some unused variables (and commented out
+  others): if they were 'for future use', sorry.
+o Now have $Id$ entries in every file, so that changes can be
+  tracked.
+o Fixed recursive includes of header files
+o sfs_seek code doesn't match documentation. Kludged its variable
+  list in the meantime
+o Replaced broken squid_curtime definition. It was the same as saying
+  'exit + 1;'
+o Included headers to get rid of silly warnings
+
+In addition, I've cleaned up the above-mentioned core dump, and the -Wall
+warnings. Have also run through purify on Solaris - cleared most bugs. This
+release is mainly to get the cleaning stuff back out there, so I can start
+on making the interface match throughout.
+
+---
+First version:
+---
+
+This is a fairly scrappy version - I haven't even changed the attribution
+headers on the files ;) It was compiling and running when I put it on
+the web site, at least, but is very untested, and extremely alpha.
Index: squid/src/fs/sfs/DESIGN
diff -u /dev/null squid/src/fs/sfs/DESIGN:1.1.2.2
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/DESIGN Sat Feb 3 17:16:04 2001
@@ -0,0 +1,245 @@
+$Id$
+
+		SQUIDFS README
+
+This file outlines the design of the SquidFS filesystem:
+
+Analysis of a running Squid internet object cache has found the following
+tidbits of information:
+
+Quantile   Object Size
+25%      < 1677 bytes
+50%      < 3710 bytes
+75%      < 9304 bytes
+90%      < 21360 bytes
+
+It is proposed that we use 4K fragments and 8K chunks.
+
+At the beginning of each file will be stored that file's inode. This inode
+consists of the following information:
+
+4 byte file length
+63 x 4 byte block pointers - each pointer indexes an 8K chunk or 4K fragment
+64 x 4 byte indirect block pointers - each pointer indexes an 8K chunk that
+itself contains a list of 4 byte pointers to fragments/chunks.
The remainder of the inode fragment/chunk will hold the first portion of
+data for that file.
+
+A file will consist of 0 or more 8K file chunks and 0 or 1 4K fragment.
+
+An inode number is actually the index of the 4K fragment that contains the
+inode on the disk. In this way indirect references to inodes are removed.
+
+*** Filesystem Bitmaps ***
+
+The filesystem will have 3 bitmaps that index all the fragments on a
+filesystem.
+
+FBM - (on disk) Indexes fragments with valid data in them.
+IBM - (on disk) Indexes inode fragments
+MHB - (memory) Indexes blocks that are allocated but not necessarily written
+to disk.
+
+States of the on-disk bitmaps are:
+
+FBM IBM
+ 0   0   Fragment free
+ 1   0   Fragment contains valid file data
+ 0   1   Fragment contains complete inode data that references blocks
+         that may not be allocated.
+ 1   1   Fragment contains completed inode data
+
+*** file writes, closes and bitmap updates ***
+
+The first block (inode) is never written until a close is issued.
+
+Whenever a subsequent block is written MHB is set - the block is flushed
+on demand.
+
+Whenever a file close occurs the current block and the completed inode block
+are flushed to disk immediately and IBM and MHB are set for the inode block.
+When the IBM flush is complete FBM is set for all non-inode blocks.
+When the FBM flush is complete FBM is then set for the inode block and flushed
+to disk. Once the inode FBM flush is complete the file is valid.
+
+Bitmap flushes should be scheduled events and should not occur on demand.
+A file that is waiting for a bitmap flush to occur should register itself
+to be called back when the flushes complete so that it may move on to
+the next stage of bitmap updates. It is suggested that dirty bitmap pages
+be flushed to disk every 10 seconds. In this instance it may take up to
+30 seconds before a file close results in a completely valid file on disk.
+In reality the file will be recoverable from on-disk data from 0-10 seconds
+after the file close was issued however.
+
+*** Filesystem rebuilds ***
+
+The filesystem may be left in an inconsistent state in the event of a
+power failure or system crash. In this event the following algorithms
+are used to return the filesystem to a consistent state:
+
+FBM IBM
+ 0   0   Fragment is free - ignore
+ 1   0   Fragment contains valid data - ignore
+ 0   1   Fragment is an inode that references data blocks that have not
+         completed their bitmap. Scan through the inode and set all blocks
+         referenced to be valid data blocks.
+ 1   1   Fragment is a valid inode block.
+
+*** Inode notes ***
+
+When a file is being written to, the 8K inode block is held in memory,
+unallocated from disk, until a file close is issued. If the inode + file
+data is < 4K, the 4K fragment is allocated from the free bitmap, preferably
+as the last 4K of an 8K chunk. If the inode + file data is > 8K, the inode
+4K portion must be allocated on a free 8K chunk boundary and the data 4K
+portion will be the last 4K of the 8K chunk. This 4K fragment is implied and
+is not referenced by the direct block pointers in the inode.
+
+*** Performance Analysis ***
+
+By running a set of squid object sizes for cache misses and cache hits through
+the above filesystem in a simulated environment we find the following:
+
+Internal block fragmentation: 17%
+
+File writes:
+
+    Disk    Objects
+Accesses       %
+    1         70%
+    2         15%
+    3          6%
+    4          3%
+    5+         6%
+Average write accesses: 2.2
+
+Bitmap update accesses per avg.
object - assuming 10 secs between bitmap flushes:
+(worst case): 2 * inode updates + 1.2 * block updates = 3.2
+(average - 20% chance of disk bitmap locality @ 50 objects/sec): 0.05
+(best @ 50 objects/sec): 0.003
+
+File reads:
+
+    Disk    Objects
+Accesses       %
+    1         86%
+    2          8%
+    3          2%
+    4          1%
+    5+         2%
+Average read accesses: 1.5
+Bitmap update accesses: 0
+
+File unlinks:
+
+1.05 disk accesses on average to retrieve block pointer information + worst
+case 2 bitmap update accesses assuming no locality in a 10 second window
+
+*** Notes ***
+
+Average bitmap update accesses is hard to measure but assumes we are usually
+writing to blocks that are close together for many files, so bitmap updates
+will get clustered together a fair portion of the time.
+
+*** Filesystem IO ***
+
+There is one thread per mounted filesystem.
+Blocks that are queued to be written are placed onto a thread's service
+queue. Each thread inserts blocks into a doubly linked list ordered by
+the location of each block on the disk. The thread scans backwards and
+forwards along the list writing out blocks and removing them from the
+write queue. The blocks may still remain "owned" by an open file and
+the data within them may be modified at any time. The thread just writes
+out what it sees. This MAY cause inconsistencies, but the theory is that
+when the last write of that block is queued, the data will be consistent
+then anyway. Since we're dealing with scenarios where this is acceptable,
+this is not an issue. The requesting write will return straight away
+and the write will continue in the background. If the O_SYNC or O_WSYNC flag
+is set the requestor will wait until the write request is finished. If the
+O_NONBLOCK flag is also set along with O_SYNC or O_WSYNC the requestor keeps
+a record of blocks it had queued for writing and returns EWOULDBLOCK.
On a subsequent write attempt by the parent process the requestor checks to
+see if the last issued write for that block has finished and simply returns
+the result of the write.
+
+Blocks that are to be read are placed onto the service queue for a filesystem's
+service thread. The requestor then has the option of either waiting for
+the IO to complete or coming back to it later. If the O_NONBLOCK flag
+is set the read returns immediately and the requestor collects the result
+on a later attempt, as for writes.
+
+Open file closes are also placed onto a separate queue for the thread.
+The filesystem thread is responsible for setting the bitmaps and for flushing
+out dirty pages of the bitmap as required. Bitmap pages are flushed out
+only every N seconds, where N defaults to 15 but is user-modifiable
+to any value. Every N seconds any dirty bitmap blocks are placed onto
+the write queue and are flushed out like normal data.
+
+A filesystem IO thread's service loop looks like:
+
+lastbmflushtime = 0
+loop:
+    now = time()
+    if(service queue is not empty)
+        acquire mutex for service queue
+        grab head of service queue and set service queue head pointer to NULL
+        release mutex
+        for each item on service queue
+            if item is marked as O_SYNC or O_WSYNC
+                flush out to disk immediately and report back
+            else
+                insert into disk queue ordered by disk location
+
+    if(file close queue is not empty)
+        acquire mutex for file close queue
+        grab head of file close queue and set fcq head pointer to NULL
+        release fcq mutex
+        for each item in file close queue
+            add item to pending file close queue
+
+    if(now >= lastbmflushtime + bmflushinterval)
+        lastbmflushtime = now
+        for each dirty bitmap block
+            issue immediate write of each dirty bitmap block
+        for each item in pending file close queue
+            advance state and modify necessary bitmap blocks
+            if(state == done)
+                free pending file close
+
+    if(write queue is empty)
+        sleep(min((lastbmflushtime+bmflushinterval)-now, blockflushinterval))
+        goto loop
+
+    write out block pointed to by file queue scan pointer (fqsp)
+    tmp = fqsp
+    /* Below is a simple SCAN algorithm; adding VSCAN capability */
+    /* is easy to do and should be done in the final implementation */
+    if(direction == forward)
+        if fqsp->next == NULL
+            direction = backward
+            fqsp = fqsp->prev
+        else
+            fqsp = fqsp->next
+    else
+        if fqsp->prev == NULL
+            direction = forward
+            fqsp = fqsp->next
+        else
+            fqsp = fqsp->prev
+    free tmp
+    goto loop
+
+-----
+
+Grit:
+
+    This is a record of the typing decisions, as they're made.
+
+sfsfd - file descriptor. This will be a uint32_t; the top 8 bits will be
+    the sfsid, the bottom 24 bits will be the 'thread-specific identifier'.
+    Essentially, the first byte tells us which drive, the other three
+    which file descriptor on that drive.
+
+sfsid - drive id. This will be one byte. Note, that limits us to 255 drives
+    - this is not a hard limit to raise.
+
+sfsinode - inode. Position of the first inode for this file on disk. These
+    will be uint32_t - that gives us a fair whack of space to play with.
Index: squid/src/fs/sfs/Makefile.in diff -u /dev/null squid/src/fs/sfs/Makefile.in:1.1.2.8 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/Makefile.in Wed Feb 7 01:01:15 2001 @@ -0,0 +1,70 @@ +# +# Makefile for the sfs storage driver for the Squid Object Cache server +# +# $Id$ +# + +FS = sfs + +top_srcdir = @top_srcdir@ +VPATH = @srcdir@ + +CC = @CC@ +MAKEDEPEND = @MAKEDEPEND@ +AR_R = @AR_R@ +RANLIB = @RANLIB@ +AC_CFLAGS = @CFLAGS@ +SHELL = /bin/sh +LDFLAGS = @LDFLAGS@ + +INCLUDE = -I../../../include -I$(top_srcdir)/include -I$(top_srcdir)/src/ +CFLAGS = $(AC_CFLAGS) $(INCLUDE) $(DEFINES) + +OUT = ../$(FS).a + +SFSOBJS = \ + sfs_fslo.o \ + sfs_interface.o \ + sfs_llo.o \ + sfs_splay.o \ + sfs_util.o \ + +OBJS = $(SFSOBJS) \ + store_dir_sfs.o \ + store_io_sfs.o + +all: $(OUT) + +test: $(SFSOBJS) sfs_test.o + $(CC) $(CFLAGS) $(LDFLAGS) -DSFS_TEST -L../../../lib $(SFSOBJS) sfs_test.o sfs_shim.c ../../globals.o -lmiscutil -lm -o sfs_test + +read: $(SFSOBJS) sfs_read.o + $(CC) $(CFLAGS) $(LDFLAGS) -DSFS_TEST -L../../../lib $(SFSOBJS) sfs_read.o sfs_shim.c ../../globals.o -lmiscutil -lm -o sfs_read + +$(OUT): $(OBJS) + @rm -f ../stamp + $(AR_R) $(OUT) $(OBJS) + $(RANLIB) $(OUT) + +$(OBJS): $(top_srcdir)/include/version.h ../../../include/autoconf.h +$(OBJS): store_sfs.h + +.c.o: + @rm -f ../stamp + $(CC) $(CFLAGS) -c $< + +clean: + -rm -rf *.o *pure_* core ../$(FS).a + +distclean: clean + -rm -f Makefile + -rm -f Makefile.bak + -rm -f tags + +install: + +tags: + ctags *.[ch] $(top_srcdir)/src/*.[ch] $(top_srcdir)/include/*.h $(top_srcdir)/lib/*.[ch] + +depend: + $(MAKEDEPEND) $(INCLUDE) -fMakefile *.c Index: squid/src/fs/sfs/sfs.h diff -u /dev/null squid/src/fs/sfs/sfs.h:1.1.2.2 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs.h Sat Feb 3 17:16:04 2001 @@ -0,0 +1,41 @@ +/* $Id$ */ + +/* NOT USED */ + +#ifndef SFS_H +#define SFS_H + +typedef struct sfs_statistic_t { + uint numreads; +} sfs_statistic_t; + +typedef struct sfs_stat_t { + uint 
sfs_ino;
+    uint sfs_numblocks;
+    uint sfs_len;
+} sfs_stat_t;
+
+/* Library level user-callable functions */
+
+int sfs_setoptions(int op, int opdata);
+
+/* Filesystem level user-callable functions */
+
+int sfs_format(const char *path, ulong);
+int sfs_mount(const char *path);
+int sfs_fsck(int sfsid, int fscktype);
+int sfs_unmount(int sfsid);
+
+/* File level user-callable functions */
+
+int sfs_open(const char *path, int oflag, mode_t mode);
+int sfs_close(int fd);
+int sfs_sync(int fd);
+int sfs_read(int fd, void *buf, int buflen);
+int sfs_write(int fd, void *buf, int buflen);
+int sfs_seek(int fd, int offset, int whence);
+int sfs_unlink(int sfsid, uint sfsinode);
+int sfs_truncate(int sfsid, uint sfsinode, int newlen);
+int sfs_stat(int sfsid, uint sfsinode, sfs_stat_t *statbuf);
+
+#endif /* !SFS_H */
Index: squid/src/fs/sfs/sfs_defines.h
diff -u /dev/null squid/src/fs/sfs/sfs_defines.h:1.1.2.10
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/sfs_defines.h Sat Feb 3 17:16:04 2001
@@ -0,0 +1,198 @@
+/* $Id$ */
+
+#ifndef SFS_DEFINES_H
+#define SFS_DEFINES_H
+
+#include <sys/types.h>
+#include <pthread.h>
+
+/* Possibly bogus defines? */
+
+#ifndef uint8_t
+#define uint8_t unsigned char
+#endif
+
+#ifndef uint32_t
+#define uint32_t unsigned int
+#endif
+
+#ifndef uint64_t
+#define uint64_t unsigned long long
+#endif
+
+#define sfsfd_t uint32_t
+#define sfsid_t int8_t
+#define sfsblock_t uint32_t
+
+/* Code assumes CHUNKSIZE is twice FRAGSIZE. If it isn't, things will break */
+/* very badly. */
+
+#define FRAGSIZE 4096
+#define CHUNKSIZE 8192
+#define MINFSFRAGS 1024		/* Minimum acceptable number of FS frags */
+#define MAXFILESYS 127		/* Maximum number of mounted filesystems */
+
+#define NUMDIP 62
+#define NUMSIN 64
+
+#define BITINBYTE 8
+
+/* Magic!
*/ +#define SFS_MAGIC 0xdeadf00d + +/* The below defines assume there are 8 bits in a byte */ + +#define TSTBIT(a, b) (((a[b>>3]) << (b & 0x7)) & 0x80) +#define SETBIT(a, b) ((a[b>>3]) |= (0x80 >> (b & 0x7))) +#define CLRBIT(a, b) ((a[b>>3]) &= (~(0x80 >> (b & 0x7)))) + +enum sfs_request_type { + _SFS_OP_NONE = 0, + _SFS_OP_READ, + _SFS_OP_WRITE, + _SFS_OP_OPEN_READ, + _SFS_OP_OPEN_WRITE, + _SFS_OP_CLOSE, + _SFS_OP_UNLINK, + _SFS_OP_SYNC, + _SFS_OP_UMOUNT, + _SFS_OP_SEEK, +}; + +enum sfs_request_state { + _SFS_PENDING = 0, + _SFS_IN_PROGRESS, + _SFS_DONE +}; + +enum sfs_block_type { + _SFS_UNKNOWN = 0, + _SFS_DATA, + _SFS_INODE +}; + +enum sfs_io_type { + _SFS_IO_SYNC = 0, + _SFS_IO_ASYNC +}; + +typedef struct sfs_requestor { + dlink_node node; /* List position */ + void *dataptr; /* Used by higher levels .. */ + enum sfs_request_type request_type; + enum sfs_request_state request_state; + enum sfs_io_type io_type; /* sync or async */ + pthread_cond_t done_signal; + pthread_mutex_t done_signal_lock; + sfsid_t sfsid; + sfsfd_t sfsfd; + sfsblock_t sfsinode; + ssize_t offset; /* The block inside the file in question (0..x) */ + ssize_t buflen; /* The length of the buffer, if pre-allocated, or the read */ + void *buf; + int ret; +} sfs_requestor; + +/* This corresponds to the structure as it's stored on disk. */ +typedef struct sfs_inode_t { + sfsblock_t len; + sfsblock_t dip[NUMDIP]; /* Direct block pointers */ + sfsblock_t sin[NUMSIN]; /* Single Indirect Pointers */ +} sfs_inode_t; + +/* This is the structure stored mid-fs */ +/* I have some doubts as to the correctness of the dealing with this structure +throughout the code - will check up on it. 
*/ +typedef struct sfs_rootblock_t { + sfsblock_t numfrags; + sfsblock_t ibmpos; + sfsblock_t fbmpos; + uint32_t bmlen; + uint32_t magic; +} sfs_rootblock_t; + +/* These structures exist as members of linked lists hanging off a */ +/* sfs_openfile_t structure, except in the case of the inode block */ +/* and double indirect block of a file which are pointed to directly */ +/* since only one of these types of blocks can exist per file. */ +/* The buf points to a structure in either clean or dirty splay tree */ +typedef struct sfs_openblock_list { + struct sfs_blockbuf_t *buf; + struct sfs_openblock_list *next; + struct sfs_openblock_list *prev; +} sfs_openblock_list; + +/* Will be a chained hash table of open file descriptions */ +/* These can also be referenced by the background file flush daemon */ +/* that is in control of correctly flushing out all pending data and */ +/* bitmap updates when the file is closed or synced */ +typedef struct sfs_openfile_t { + sfsid_t sfsid; + sfsblock_t sfsinode; + sfsfd_t sfsfd; /* Fake fd for reference - unique per open file*/ + uint64_t pos; /* Position in the file, for partial reads */ + int rd_refcount; + int wr_refcount; + int flushonclose; + struct sfs_inode_t *inode; /* The block pointed to by inodebuf_p->buf */ + struct sfs_blockbuf_t *inodebuf_p; /* Pointer to blockbuf_t of inode */ + struct sfs_openblock_list *rwbuf_list_p; /* List of RW open blocks */ + struct sfs_openblock_list *sibuf_list_p; /* List of single indirect open blocks*/ + struct sfs_blockbuf_t *dibuf_p; /* Double indirect open block */ + struct sfs_openfile_t *prev; + struct sfs_openfile_t *next; +} sfs_openfile_t; + +/* This structure references blocks held in the buffer cache */ +/* There will be an array indexed by sfsid that points to two ordered */ +/* splay trees of blocks for that sfsid, one of dirty pages, and one */ +/* of clean pages. 
Periodically the dirty pages are flushed to disk */ +/* I'm going to add prev and next, and keep these as a list in time */ +/* order, also. */ +typedef struct sfs_blockbuf_t { + struct sfs_blockbuf_t *left; /* Splay left & right pointers */ + struct sfs_blockbuf_t *right; + struct sfs_blockbuf_t *prev; + struct sfs_blockbuf_t *next; + sfsid_t sfsid; + sfsblock_t sfsinode; + sfsblock_t diskpos; /* Position of this CHUNK on disk */ + int refcount; /* How many people holding this page */ + uint8_t dirty; /* Is page dirty */ + uint8_t type; /* Inode or data - required for updating bitmaps */ + int buflen; /* Length of buffer (*buf) (FRAGSIZE max) */ + char *buf; /* Pointer to page data */ +} sfs_blockbuf_t; + +/* This structure is the mount point parent information for a mounted */ +/* filesystem */ +typedef struct sfs_mountfs_t { + sfs_rootblock_t *path; + sfs_rootblock_t *rootblock; + sfsid_t sfsid; + sfsfd_t fd; /* Filedescriptor used for writing to this filesystem */ + char *fbm; /* Fragment allocation bitmap */ + char *ibm; /* Inode allocation bitmap */ + char *mhb; /* Memory holding fragment bitmap for disk bitmap updates */ + sfs_blockbuf_t *dirty; + sfs_blockbuf_t *clean; + sfs_blockbuf_t *head[2]; /* These are used for keeping a time-ordered */ + sfs_blockbuf_t *tail[2]; /* list of blocks */ + int accepting_requests; + int pending_requests; + dlink_list *request_queue; + pthread_mutex_t req_lock; + pthread_cond_t req_signal; + pthread_mutex_t req_signal_lock; + pthread_t thread_id; + + dlink_list done_queue; + int done_requests; + pthread_mutex_t done_lock; +} sfs_mountfs_t; + +#ifndef max +#define max(x,y) ((x)<(y)? 
(y) : (x)) +#endif + +#endif /* !SFS_DEFINES_H */ Index: squid/src/fs/sfs/sfs_fslo.c diff -u /dev/null squid/src/fs/sfs/sfs_fslo.c:1.1.2.2 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_fslo.c Wed Jan 24 08:06:38 2001 @@ -0,0 +1,117 @@ +/* sfs_fslo.c,v 1.17 2001/01/24 12:49:58 adrian Exp */ + +/* Squid FS */ +/* */ +/* Squid FS - Filesystem Level Operations */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. */ + +/* Note, the types in here are possibly wrong - this needs to be gone over */ + +#include "squid.h" + +#include "sfs_defines.h" + +int +sfs_format(const char *rawdevpath, u_int32_t numfrags) +{ + char *fbm; + int bmlen; + /* XXX - not sure if this next variable is to be used, but it's not at + * the moment, remove? */ + /* int bitsinfrag; */ + int fbmpos; + int ibmpos; + int fd; + int i; + sfs_rootblock_t *rblock; + char *rbbuf; + uint64_t os; + + if(numfrags < MINFSFRAGS) { + errno = ERANGE; + return -1; + } +/* Work out how long the bitmaps should be (in bytes) */ + bmlen = numfrags / BITINBYTE; + if(numfrags % BITINBYTE) + bmlen++; +/* Position them half-way through the fs */ + fbmpos = numfrags >> 1; + ibmpos = fbmpos - bmlen; + + if((fd = open(rawdevpath, O_RDWR)) < 0) + return -1; + + /* Write out the root block */ + + if((rbbuf = (char *)xcalloc(1, CHUNKSIZE)) == NULL) { + close(fd); + return -1; + } + rblock = (sfs_rootblock_t *)rbbuf; + rblock->numfrags = numfrags; + rblock->ibmpos = ibmpos; + rblock->fbmpos = fbmpos; + rblock->bmlen = bmlen; + rblock->magic = SFS_MAGIC; + os = 0; + if(lseek(fd, os, SEEK_SET) < 0) { + xfree(rbbuf); + close(fd); + return -1; + } + if(write(fd, rbbuf, CHUNKSIZE) < 0) { + xfree(rbbuf); + close(fd); + return -1; + } + xfree(rbbuf); + + /* Write out the inode bitmap. 
This will be all zeros. Since we */
+    /* xcalloc()ed the buffer and the two bitmaps are the same size, */
+    /* just write it out */
+
+    os = ibmpos;
+    os *= FRAGSIZE;
+    if(lseek(fd, os, SEEK_SET) < 0) {
+	close(fd);
+	return -1;
+    }
+    if((fbm = (char *)xcalloc(1, bmlen)) == NULL) {
+	close(fd);
+	return -1;
+    }
+    if(write(fd, fbm, bmlen) < 0) {
+	xfree(fbm);
+	close(fd);
+	return -1;
+    }
+
+    /* set all the blocks that contain */
+    /* the bitmaps as allocated (not free). Also set as used the first */
+    /* two fragments which will contain the filesystem root block */
+
+    for(i = ibmpos; i < (ibmpos + (2 * bmlen)); i++)
+	SETBIT(fbm, i);
+    SETBIT(fbm, 0);
+    SETBIT(fbm, 1);
+
+    /* Write out the frag bitmap. We will already be at the right */
+    /* location after writing out the inode bitmap */
+
+    if(write(fd, fbm, bmlen) < 0) {
+	xfree(fbm);
+	close(fd);
+	return -1;
+    }
+    xfree(fbm);
+
+    close(fd);		/* Done! So simple */
+    return 0;
+}	/* sfs_format */
Index: squid/src/fs/sfs/sfs_interface.c
diff -u /dev/null squid/src/fs/sfs/sfs_interface.c:1.1.2.18
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/sfs_interface.c Tue Feb 6 07:57:40 2001
@@ -0,0 +1,481 @@
+/* sfs_interface.c,v 1.58 2001/01/24 12:50:20 adrian Exp */
+
+/* These functions comprise the interface portion of squidFS - the bits that
+   outside functions can call.
+   I think I'll make the interfaces as identical to normal interfaces as
+   possible - not overly happy about that, as it means juggling things into
+   and out of strings, but until I have time to clean up squid's own fs
+   interfaces, that's the best that can be done.
+   The above changes in the light of the new store_* stuff in squid.
+*/
+
+/*
+ * DEBUG 82
+ */
+
+#include "squid.h"
+
+#include "store_sfs.h"
+
+/* Define this if you want to compile sfs_test, the test program
+ * - it will remove the references to cbdata* functions, which cause
+ * linking problems otherwise.
*/ +#undef SFS_TEST + +/* Public interfaces - the ones squid requires us to provide */ + +sfsfd_t +sfs_open(sfsid_t sfsid, sfsblock_t sfsinode, int oflag, mode_t mode, + enum sfs_io_type io_type, void *dataptr) +{ + struct sfs_requestor *req; + enum sfs_request_type rt; + sfsfd_t ret; + + /* Currently, you have to specify either an inode, or O_CREAT. + * We also make the rather brash assumption that if we're opening to + * write, we're creating a new file - that assumption can change. + * Could do with error checking on the sscanf... */ + if (oflag & O_CREAT) { + rt = _SFS_OP_OPEN_WRITE; + } else { + rt = _SFS_OP_OPEN_READ; + /* If we're trying to open something that's not an inode, return. */ + if (!(CBIT_TEST(_sfs_mounted[sfsid].ibm, sfsinode))) { + printf("ERR: sfs_open opening non-inode\n"); + return -1; + } + } + if (!(req = _sfs_create_requestor(sfsid,rt, io_type))) { + return -1; + } + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsinode = sfsinode; + req->dataptr = dataptr; + _sfs_submit_request(req); + _sfs_print_request(req); + if (io_type != _SFS_IO_SYNC) { + return 0; + } + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +int +sfs_close(sfsfd_t sfsfd, enum sfs_io_type io_type, void *dataptr) +{ + /* Need to flush the file to disk and remove the structure. 
*/ + sfs_requestor *req; + int ret; + + if(!(req = _sfs_create_requestor(sfsfd >> 24, _SFS_OP_CLOSE, io_type))) + return -1; + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) { + return 0; + } + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +ssize_t +sfs_read(sfsfd_t sfsfd, void *buf, ssize_t buflen, enum sfs_io_type io_type, + void *dataptr) +/* Takes: sfsfd, a pointer to pre-allocated space, and length of said + space. + Returns: number of bytes read. + (Note, on Solaris 2.6, ssize_t is 4 bytes, and I believe signed) +*/ +{ + sfs_requestor *req; + ssize_t ret; + sfsid_t sfsid; + + sfsid = sfsfd >> 24; + if(!(req = _sfs_create_requestor(sfsid, _SFS_OP_READ, io_type))) { + return -1; + } + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + req->offset = -1; + req->buflen = buflen; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + if ((!buf) || (req->ret == 0)) + return 0; + if (req->ret < 0) + return req->ret; + ret = req->buflen; + if (req->buf) { + memcpy(buf,req->buf,ret+1); + xfree(req->buf); + } + _sfs_remove_request(req); + return ret; +} + +ssize_t +sfs_write(sfsfd_t sfsfd, const void *buf, ssize_t buflen, + enum sfs_io_type io_type, void *dataptr) +{ + sfs_requestor *req; + ssize_t ret; + sfsid_t sfsid; + + sfsid = sfsfd >> 24; + if (!(req = _sfs_create_requestor(sfsid,_SFS_OP_WRITE, io_type))) { + return -1; + } + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + if 
(!(req->buf = xstrdup(buf))) { +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + return -1; + } + req->buflen = buflen; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + if (req->buf) + xfree(req->buf); + _sfs_remove_request(req); + return ret; +} + +int +sfs_unlink(sfsid_t sfsid, sfsblock_t sfsinode, enum sfs_io_type io_type, + void *dataptr) +{ +/* Should really take a full filename, by rights */ +/* Here's the trick with this one: You don't unlink a file till _after_ + you've closed it (normally). That means I can't take the normal sfsfd + and extract the relevant info :( */ + sfs_requestor *req; + int ret; + + if (!(req = _sfs_create_requestor(sfsid, _SFS_OP_UNLINK, io_type))) { + return -1; + } + /* + * We don't do this because its valid to have an async unlink without + * any notification info. Eww. -- adrian + */ +#if 0 + assert((io_type == _SFS_IO_SYNC) || dataptr); +#endif +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsinode = sfsinode; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +/* Private-ish interfaces - the ones people can call, but squid doesn't use */ +/* directly. */ + +void sfs_thread_loop(sfs_mountfs_t *mount_point); + +int +sfs_umount(sfsid_t sfsid, enum sfs_io_type io_type) +/* As noted below, mount and umount need to be called only from a single +thread - preferably the thread that calls init. I _can_ fix this, with +YALock, but I've chosen not to at this time. 
*/ +{ + sfs_requestor *req; + int ret; + + if (sfsid >= MAXFILESYS) + return -1; + if (_sfs_mounted[sfsid].rootblock == NULL) + return 0; +/* Send a umount, and wait for the return. */ +/* The umount request simply tells the fs not to accept any more requests, +and to sync all changes to disk, close the fd, and remove itself from the +list of mounted fs'es. Basically, all the important stuff is done in the +thread itself. */ + if (!(req = _sfs_create_requestor(sfsid,_SFS_OP_UMOUNT, io_type))) + return -1; + if (_sfs_submit_request(req) < 0) + return -1; + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); + ret = req->ret; + _sfs_remove_request(req); + if (ret == 0) { + if (_sfs_mounted[sfsid].rootblock) { + xfree(_sfs_mounted[sfsid].rootblock); + _sfs_mounted[sfsid].rootblock = NULL; + } + } + return ret; +} + +sfsid_t +sfs_mount(const char *rawdevpath) +{ + sfsid_t i; + sfsblock_t j, bmlen; + sfsblock_t ibmpos, fbmpos; + sfsblock_t magic; + + /* This hunt is not thread-safe - assume only one thread doing these + * things (initialising/mounting) - otherwise bad things happen(tm). 
+ * Fixing this assumption would mean adding a lock over the _sfs_mounted + * array */ + for(i = 1; (_sfs_mounted[i].rootblock != NULL) && (i < MAXFILESYS); i++); + if (i == MAXFILESYS) + return -1; + if ((_sfs_mounted[i].fd = open(rawdevpath, O_RDWR)) < 0) + return -1; + if (lseek(_sfs_mounted[i].fd, (uint64_t)0, SEEK_SET) < (uint64_t)0) { + printf("ERR: Didn't manage to lseek in mount :(\n"); + close(_sfs_mounted[i].fd); + return -1; + } + if ((_sfs_mounted[i].rootblock = (sfs_rootblock_t *)xcalloc(1,CHUNKSIZE)) == NULL) { + close(_sfs_mounted[i].fd); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].rootblock, CHUNKSIZE) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + ibmpos = _sfs_mounted[i].rootblock->ibmpos; + fbmpos = _sfs_mounted[i].rootblock->fbmpos; + bmlen = _sfs_mounted[i].rootblock->bmlen; + magic = _sfs_mounted[i].rootblock->magic; + + printf("DEBUG: sfs root: ibmpos %d, fbmpos %d, bmlen %d, numfrags %d, magic %d\n", + _sfs_mounted[i].rootblock->ibmpos, + _sfs_mounted[i].rootblock->fbmpos, + _sfs_mounted[i].rootblock->bmlen, + _sfs_mounted[i].rootblock->numfrags, + _sfs_mounted[i].rootblock->magic); + + /* Check magic! */ + if (magic != SFS_MAGIC) + return -1; + + /* If any of the rootblock stuff == 0, we have a bad fs */ + if ((ibmpos == 0) || (fbmpos == 0) || (bmlen == 0)) + return -1; + + _sfs_mounted[i].sfsid = i; + _sfs_mounted[i].fbm = (char *)xcalloc(1,bmlen); + _sfs_mounted[i].ibm = (char *)xcalloc(1,bmlen); +/* Seek to the bitmaps, and read them in */ +/* I wonder whether or not it makes sense to have this stuff done in the +fs'es own thread. 
Maybe */ + if (lseek(_sfs_mounted[i].fd, ibmpos, SEEK_SET) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].fbm, bmlen) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].ibm, bmlen) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if ((_sfs_mounted[i].mhb = (char *)xcalloc(1,bmlen)) == NULL) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + for (j = 0; j <= bmlen; j++) { + if (CBIT_TEST(_sfs_mounted[i].ibm, j) || CBIT_TEST(_sfs_mounted[i].fbm, j)) + CBIT_SET(_sfs_mounted[i].mhb, j); + } + _sfs_mounted[i].dirty = NULL; + _sfs_mounted[i].request_queue = xcalloc(1,sizeof(dlink_list)); + _sfs_mounted[i].pending_requests = 0; + pthread_mutex_init(&(_sfs_mounted[i].req_lock), NULL); + pthread_mutex_init(&(_sfs_mounted[i].req_signal_lock), NULL); + pthread_cond_init(&(_sfs_mounted[i].req_signal), NULL); + pthread_mutex_init(&(_sfs_mounted[i].done_lock), NULL); + _sfs_mounted[i].done_queue.head = NULL; + _sfs_mounted[i].done_queue.tail = NULL; + _sfs_mounted[i].done_requests = 0; + + pthread_create(&(_sfs_mounted[i].thread_id), NULL, (void *)&sfs_thread_loop, &(_sfs_mounted[i])); + while (!(_sfs_mounted[i].accepting_requests)) + sleep(1); + /* Return the sfsid */ + return i; +} + +off_t +sfs_seek(sfsfd_t sfsfd, off_t pos, enum sfs_io_type io_type, void *dataptr) +/* Takes: sfsid, fd, and position to seek to. + Returns: 0 for success, -1 for failure. 
+*/ +{ + sfs_requestor *req; + sfsid_t sfsid; + int ret; + + sfsid = sfsfd >> 24; + if(!(req = _sfs_create_requestor(sfsid, _SFS_OP_SEEK, io_type))) + return -1; + + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + req->dataptr = dataptr; + req->offset = pos; + _sfs_submit_request(req); + _sfs_print_request(req); + if (io_type != _SFS_IO_SYNC) { + return 0; + } + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +/* + * sfs_getcompleted - retrieve a single completed async request + * + * This function retrieves a single completed request from the done queue. + * It does not remove it from the done queue - this is the job of + * _sfs_remove_request. + * + * done_lock isn't to be held here - but we grab it whilst walking the list. + */ +sfs_requestor * +sfs_getcompleted(sfsid_t sfsid) +{ + dlink_node *node; + + /* + * Walk the list, find the first async request, and return a pointer + * to it. I'll hold the lock whilst we look through the list. + */ + pthread_mutex_lock(&(_sfs_mounted[sfsid].done_lock)); + node = _sfs_mounted[sfsid].done_queue.head; + + while ((node != NULL) && + (((sfs_requestor *)node->data)->io_type != _SFS_IO_ASYNC)) + node = node->next; + + pthread_mutex_unlock(&(_sfs_mounted[sfsid].done_lock)); + + /* We now have a node. Return it. */ + if (node) + return node->data; + else + return NULL; +} + +int +sfs_filesize(sfsid_t sfsid, sfsblock_t sfsinode) +{ + /* This function returns the size of the given file - file given as + * sfsid and inode. It leans on the block cache to find the info. 
*/ + sfs_blockbuf_t *inode; + + inode = _sfs_read_block(sfsid,sfsinode); + if (inode == NULL) + return -1; + return ((sfs_inode_t *)inode->buf)->len; +} + +int +sfs_openNextInode(sfsid_t sfsid, sfsblock_t *cur) +{ + sfsblock_t i; + int fd; + /* This function walks through a mounted filesystem, returning the + * next inode that's in-use. Used by rebuildDir stuff. + * First block is in use for storing rootblock. */ + for(i=max(3,*cur); i<_sfs_mounted[sfsid].rootblock->numfrags; i++) { + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, i)) { + *cur = i; + printf("DEBUG: next inode %d\n",*cur); + /* Note, should make an sio, but I'm too lazy - SYNC doesn't + * _really_ need one... */ + fd = sfs_open(sfsid,i,0,_SFS_OP_OPEN_READ,_SFS_IO_SYNC,NULL); + return fd; + } + } + *cur = 0; + return -2; +} Index: squid/src/fs/sfs/sfs_lib.h diff -u /dev/null squid/src/fs/sfs/sfs_lib.h:1.1.2.10 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_lib.h Sat Feb 3 17:16:04 2001 @@ -0,0 +1,68 @@ +/* sfs_lib.h,v 1.16 2001/01/24 12:47:58 adrian Exp */ + +/* Squid FS */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. 
*/ + +#ifndef SFS_LIB_H +#define SFS_LIB_H + + +/* The mount list */ +extern sfs_mountfs_t _sfs_mounted[MAXFILESYS]; +extern sfs_openfile_t * _sfs_openfiles[MAXFILESYS]; + +/* Internal functions */ +/* sfs_util.c */ +extern void _sfs_waitfor_request(sfs_requestor *req); +extern int _sfs_remove_request(sfs_requestor *req); +extern void _sfs_done_request(sfs_requestor *req, int retval); +extern sfs_requestor * _sfs_create_requestor(int sfsid, + enum sfs_request_type reqtype, enum sfs_io_type iotype); +extern int _sfs_submit_request(sfs_requestor *req); +extern sfs_blockbuf_t * _sfs_read_block(uint sfsid, uint diskpos); +extern uint _sfs_calculate_diskpos(sfs_openfile_t *openfd, uint offset); +extern void _sfs_commit_block(int sfsid, sfs_blockbuf_t *block); +extern sfs_blockbuf_t * _sfs_write_block(uint sfsid, uint diskpos, + void *buf, int buflen, enum sfs_block_type type); +extern uint _sfs_allocate_fd(sfs_openfile_t *new); +extern uint _sfs_allocate_block(int sfsid, int blocktype); +extern sfs_openfile_t * _sfs_find_fd(int sfsfd); +extern void _sfs_flush_bitmaps(int sfsid); +extern int _sfs_flush_file(int sfsid, sfs_openfile_t *fd); +extern void _sfs_print_request(sfs_requestor *req); + +/* sfs_splay.c */ +extern sfs_blockbuf_t * _sfs_blockbuf_create(); +extern sfs_blockbuf_t *sfs_splay_find(uint diskpos, sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_insert(int sfsid, sfs_blockbuf_t *new, + sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_remove(int sfsid, sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_delete(int sfsid, sfs_blockbuf_t *tree); + + + +/* External stuff */ +extern int sfs_format(const char *, u_int32_t ); +extern sfsfd_t sfs_open(sfsid_t, sfsblock_t, int, mode_t, enum sfs_io_type, + void *); +extern int sfs_umount(sfsid_t, enum sfs_io_type ); +extern sfsid_t sfs_mount(const char * ); +extern int sfs_close(sfsfd_t, enum sfs_io_type, void *); +extern ssize_t sfs_read(sfsfd_t , void * , ssize_t, enum sfs_io_type, + void 
*); +extern off_t sfs_seek(sfsfd_t , off_t , enum sfs_io_type, void *); +extern int sfs_unlink(sfsid_t , sfsblock_t, enum sfs_io_type, void *); +extern ssize_t sfs_write(sfsfd_t , const void * , ssize_t, + enum sfs_io_type, void *); +extern sfs_requestor * sfs_getcompleted(sfsid_t); +int sfs_openNextInode(sfsid_t sfsid, sfsblock_t *cur); + + +#endif /* !SFS_LIB_H */ Index: squid/src/fs/sfs/sfs_llo.c diff -u /dev/null squid/src/fs/sfs/sfs_llo.c:1.1.2.13 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_llo.c Sat Feb 3 17:16:04 2001 @@ -0,0 +1,509 @@ +/* sfs_llo.c,v 1.84 1999/02/03 04:04:06 darius Exp */ + +/* Squid FS */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. */ + +#include "squid.h" + +#include "store_sfs.h" + +sfs_mountfs_t _sfs_mounted[MAXFILESYS]; +sfs_openfile_t *_sfs_openfiles[MAXFILESYS]; +int _sfs_initialised = 0; +int inode_data_size = FRAGSIZE - sizeof(sfs_inode_t); +int direct_pointer_threshold = FRAGSIZE - sizeof(sfs_inode_t) + (NUMDIP * FRAGSIZE); + +void sfs_do_umount(sfs_requestor *req); +void sfs_do_open(sfs_requestor *req); +void sfs_do_read(sfs_requestor *req); +void sfs_do_write(sfs_requestor *req); +void sfs_do_unlink(sfs_requestor *req); +void sfs_do_close(sfs_requestor *req); +void sfs_do_seek(sfs_requestor *req); + +void +sfs_initialise() +{ + int i; + + if (_sfs_initialised) + return; + _sfs_initialised = 1; + for(i = 0; i < MAXFILESYS; i++) { + _sfs_mounted[i].rootblock = NULL; + _sfs_mounted[i].accepting_requests = 0; + _sfs_openfiles[i] = NULL; + } +} /* sfs_initialise */ + +void +sfs_thread_loop(sfs_mountfs_t *mount_point) +{ + sigset_t new; + int i; + sfs_requestor *req; + dlink_node *tnode; + + /* Make sure to ignore signals which may possibly get sent to the parent */ + /* squid thread. 
Causes havoc with mutexes and condition waits otherwise */ + /* (Stolen from aiops.c) - Darius */ + + sigemptyset(&new); + sigaddset(&new, SIGPIPE); + sigaddset(&new, SIGCHLD); +#if (defined(_SQUID_LINUX_) && USE_ASYNC_IO) + sigaddset(&new, SIGQUIT); + sigaddset(&new, SIGTRAP); +#else + sigaddset(&new, SIGUSR1); + sigaddset(&new, SIGUSR2); +#endif + sigaddset(&new, SIGHUP); + sigaddset(&new, SIGTERM); + sigaddset(&new, SIGINT); + sigaddset(&new, SIGALRM); + pthread_sigmask(SIG_BLOCK, &new, NULL); + + /* Set up a condition variable; each time it is signalled, scan the + * request queue. */ + pthread_cond_init(&(mount_point->req_signal), NULL); + pthread_mutex_lock(&(mount_point->req_signal_lock)); + mount_point->accepting_requests = 1; + i = 0; + while (1) { + pthread_cond_wait(&(mount_point->req_signal), &(mount_point->req_signal_lock)); + pthread_mutex_lock(&(mount_point->req_lock)); + while (mount_point->pending_requests > 0) { + printf("Pending Requests: %d\n",mount_point->pending_requests); + tnode = mount_point->request_queue->head; + assert(tnode); + req = tnode->data; + if (req && (req->request_state == _SFS_PENDING)) { + /* If we're not accepting requests, return fail for each + * request. Note, we can't just lock the request queue, as + * things are still being removed from it by other threads. */ + printf("dealing with pending request, %d\n",mount_point->pending_requests); + if (!(mount_point->accepting_requests)) { + _sfs_done_request(req,-1); + } else { + /* This portion sets the state, and works out exactly what + * to do - open, read, write, close, sync, unlink. 
*/ + req->request_state = _SFS_IN_PROGRESS; + mount_point->pending_requests--; + printf("pending requests now %d\n",mount_point->pending_requests); + switch (req->request_type) { + case _SFS_OP_OPEN_READ: + case _SFS_OP_OPEN_WRITE: + sfs_do_open(req); + break; + case _SFS_OP_UNLINK: + sfs_do_unlink(req); + break; + case _SFS_OP_UMOUNT: + sfs_do_umount(req); + break; + case _SFS_OP_READ: + sfs_do_read(req); + break; + case _SFS_OP_WRITE: + sfs_do_write(req); + break; + case _SFS_OP_CLOSE: + sfs_do_close(req); + break; + case _SFS_OP_SEEK: + sfs_do_seek(req); + break; + default: + _sfs_done_request(req,-1); + } + } + } + printf("tnode prev: %p next: %p\n",tnode->prev,tnode->next); + tnode = tnode->next; + } + pthread_mutex_unlock(&(mount_point->req_lock)); + /* Flush the bitmaps every 10 seconds */ + i = (i + 1) % 10; + if (i == 0) { + _sfs_flush_bitmaps(mount_point->sfsid); + } + } +} + +void +sfs_do_read(sfs_requestor *req) +{ + sfs_blockbuf_t *new; + sfs_openfile_t *openfd; + int bytes_read, fragsize; + uint diskpos; + void *buf; + + if (!(openfd = _sfs_find_fd(req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } + + if (req->offset > -1) + openfd->pos = req->offset; + + bytes_read = 0; + buf = NULL; + /* one block at a time */ + /* We _could_ alloc the space required for the entire file in one go, + * at least if the filesize was below a certain watermark - should be + * a significant speed boost if my understanding of realloc issues is + * correct... 
*/ + while ((bytes_read < req->buflen) && (bytes_read < openfd->inode->len)) { + if (!(diskpos = _sfs_calculate_diskpos(openfd,openfd->pos))) { + req->buf = buf; + req->buflen = bytes_read; + _sfs_done_request(req,bytes_read); + return; + } + if (!(new = _sfs_read_block(openfd->sfsid,diskpos))) { + req->buf = buf; + req->buflen = bytes_read; + _sfs_done_request(req,bytes_read); + return; + } + /* In case of request only wanting a certain amount, work out how much + * to copy into req->buf */ + fragsize = min(req->buflen - bytes_read, new->buflen); + fragsize = min(fragsize, openfd->inode->len); + if (new->type == _SFS_INODE) + fragsize = min(fragsize, inode_data_size); + else { + if ((openfd->pos + bytes_read) == openfd->inode->len) { + fragsize = min(fragsize, (openfd->inode->len % FRAGSIZE)); + } + } + buf = (char *)xrealloc(buf, bytes_read + fragsize); + if (new->type == _SFS_INODE) + memcpy(((char *)buf)+bytes_read, new->buf+sizeof(sfs_inode_t), fragsize); + else + memcpy(((char *)buf)+bytes_read, new->buf, fragsize); + bytes_read += fragsize; + openfd->pos += fragsize; + if (fragsize == 0) + abort(); + } + req->buf = buf; + req->buflen = bytes_read; + _sfs_done_request(req,bytes_read); +} + +void +sfs_do_write(sfs_requestor *req) +{ + sfs_openfile_t *openfd; + sfs_blockbuf_t *new, *current = NULL; + int offset, written, fragsize, inblock; + int bytes_left, leader; + sfsblock_t diskpos; + void *buf; + int type; + + if (!(openfd = _sfs_find_fd(req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } + + if (req->offset == -1) + offset = openfd->pos; + else + offset = openfd->pos = req->offset; + + diskpos = 0; + + /* written tracks how much we've written so far in total. */ + /* offset tells us where we start writing, minus the inode. */ + /* bytes_left gets set to the number of bytes left to write. */ + /* leader is the offset plus the inode - actual starting location. 
*/ + written = 0; + leader = offset + sizeof(sfs_inode_t); + bytes_left = req->buflen; + buf = NULL; + /* We write one block at a time - the process to find which block to write + * to next is a little, um, involved at present. */ + while (written < req->buflen) { + /* Work out type and where in the block this write should go. + * If the total file is smaller than the inode block data size, then + * we're best off storing it in an inode data block - fastest retrieval + * and all that. + * + * offset + written + sizeof(sfs_inode_t) = position in file (filepos) + * filepos % FRAGSIZE = remainder (thisblock) + * fragsize indicates the maximum amount of data to write in this + * block. */ + if (openfd->pos < inode_data_size) { + type = _SFS_INODE; + inblock = leader; + fragsize = min(inode_data_size - openfd->pos,bytes_left); + } else { + type = _SFS_DATA; + inblock = leader % FRAGSIZE; + fragsize = min(FRAGSIZE - inblock,bytes_left); + } + current = NULL; + + /* Figure out where on disk it should be... */ + if (!(diskpos = _sfs_calculate_diskpos(openfd, openfd->pos))) { + /* If we're not within the file, allocate a new block */ + if (!(diskpos = _sfs_allocate_block(req->sfsid, type))) { + _sfs_done_request(req,written); + return; + } + if (type != _SFS_INODE) { + if (openfd->pos < direct_pointer_threshold) { + openfd->inode->dip[(openfd->pos - inode_data_size) / FRAGSIZE] = diskpos; + } else { + /* XXX indirect pointer - youch. This has not yet been + * implemented :( */ + } + } + } else { + current = _sfs_read_block(openfd->sfsid,diskpos); + } + + /* How much more to write? 
*/ + if (current) { + buf = current->buf; + } else { + buf = (char *)xcalloc(1, FRAGSIZE); + } + memcpy(((char *)buf)+inblock,((char *)req->buf)+written,fragsize); + if (!current) { + if (!(new = _sfs_write_block(req->sfsid, diskpos, buf, fragsize, type))) { + _sfs_done_request(req,written); + return; + } + new->sfsinode = openfd->sfsinode; + new->type = type; + current = new; + } + written += fragsize; + openfd->pos += fragsize; + bytes_left -= fragsize; + } + /* $DEITY forbid two people try to write at once - maybe I need some + * locking to prevent that... */ + openfd->inode->len = max(openfd->inode->len,openfd->pos); + printf("DEBUG: written %d bytes to fd %d (block %d)\n",req->buflen,req->sfsfd,diskpos); + _sfs_done_request(req,written); +} + +void +sfs_do_umount(sfs_requestor *req) +{ + sfs_mountfs_t *mnt; + sfs_blockbuf_t *block_ptr; + dlink_node *lnode; + sfs_openfile_t *openfd; + int i; + + _sfs_mounted[req->sfsid].accepting_requests = 0; + mnt = &(_sfs_mounted[req->sfsid]); + /* Flush all the dirty blocks out to HDD */ + openfd = _sfs_openfiles[req->sfsid]; + while(openfd) { + /* flush_file has to get rid of stuff then, which is bad :( + * The structures get kinda confused at this point */ + printf("DEBUG: flushing file...\n"); + _sfs_flush_file(req->sfsid,openfd); + _sfs_openfiles[req->sfsid] = openfd->next; + xfree(openfd); + openfd = _sfs_openfiles[req->sfsid]; + } + printf ("DEBUG: umount flushing dirty blocks\n"); + while (mnt->dirty) { + _sfs_commit_block(req->sfsid, mnt->dirty); + mnt->dirty = sfs_splay_delete(req->sfsid, mnt->dirty); + } + _sfs_flush_bitmaps(req->sfsid); + if (mnt->fbm) + xfree(mnt->fbm); + if (mnt->ibm) + xfree(mnt->ibm); + if (mnt->mhb) + xfree(mnt->mhb); + block_ptr = mnt->clean; + /* Need to clean out the clean list */ + while (mnt->clean) { + mnt->clean = sfs_splay_delete(req->sfsid, mnt->clean); + } + if (mnt->request_queue->head != NULL) { + /* Make doubly sure we've cleared any pending requests - shouldn't need to, + * but 
we _are_ umounting... This is actually bodgy code: if it ever does + * anything, then something's gone wrong. Probably shouldn't do this, but + * it saves deadlock, and we'd rather see corruption than hang + * indefinitely. */ + lnode = mnt->request_queue->head; + while (lnode) { + /* _sfs_done_request() unlinks lnode, so grab the next pointer + * first; also don't clobber our own umount req. */ + dlink_node *lnext = lnode->next; + sfs_requestor *r = lnode->data; + if (r->request_state == _SFS_PENDING) { + _sfs_done_request(r,-1); + } + lnode = lnext; + } + i = 0; + /* Waiting for the request queue to be empty of all bar the umount + * request */ + while ((mnt->request_queue->head->next) && (i < 5)) { + i++; + sleep(1); + } + } + /* At this stage, all I need to do is kill the thread :) + * I could shuffle these, do the done_request before actually completely + * finishing - that would guarantee the requests are all collected + * properly */ + _sfs_done_request(req,0); + if (mnt->rootblock) { + xfree(mnt->rootblock); + mnt->rootblock = NULL; + } + /* Should also free all open fd's */ + pthread_exit(NULL); +} + +void +sfs_do_open(sfs_requestor *req) +{ + sfs_openfile_t *fd, *fdptr; + + if ((fd = (sfs_openfile_t *)xcalloc(1, sizeof(sfs_openfile_t))) == NULL) { + _sfs_done_request(req,-1); + return; + } + fd->sfsid = req->sfsid; + fd->sfsfd = _sfs_allocate_fd(fd); + fd->pos = 0; + + if (req->request_type == _SFS_OP_OPEN_READ) { + fd->sfsinode = req->sfsinode; + fd->inodebuf_p = _sfs_read_block(fd->sfsid,fd->sfsinode); + } else { + /* Doesn't need to lock, as all allocation is within thread. 
*/ + if (!(fd->sfsinode = _sfs_allocate_block(req->sfsid, _SFS_INODE))) { + xfree(fd); + printf("ERR: couldn't allocate sfsinode\n"); + _sfs_done_request(req,-1); + return; + } + /* Fill the new inode */ + if (!(fd->inodebuf_p = _sfs_blockbuf_create())) { + printf("DEBUG: Couldn't create a blockbuf\n"); + xfree(fd); + _sfs_done_request(req,-1); + return; + } + fd->inodebuf_p->type = _SFS_INODE; + fd->inodebuf_p->buf = (char *)xcalloc(1, FRAGSIZE); + fd->inodebuf_p->diskpos = fd->inodebuf_p->sfsinode = fd->sfsinode; + fd->inodebuf_p->buflen = 0; + _sfs_mounted[req->sfsid].clean = sfs_splay_insert(req->sfsid, fd->inodebuf_p, _sfs_mounted[req->sfsid].clean); + } + /* Nasty cast */ + fd->inode = (sfs_inode_t *)fd->inodebuf_p->buf; + if (req->request_type != _SFS_OP_OPEN_READ) { + fd->inode->len = 0; + fd->rwbuf_list_p = NULL; + fd->sibuf_list_p = NULL; + fd->dibuf_p = NULL; + } + fd->pos = 0; + fd->next = fd->prev = NULL; + /* Add this one to the sfsid list of open fd's */ + /* Allocating an fd */ + fdptr = _sfs_openfiles[req->sfsid]; + if (fdptr) { + while(fdptr->next) { + fdptr = fdptr->next; + } + fdptr->next = fd; + fd->prev = fdptr; + } else { + _sfs_openfiles[req->sfsid] = fd; + } + req->buf = fd; + _sfs_done_request(req,fd->sfsfd); +} + +void +sfs_do_unlink(sfs_requestor *req) +{ + /* XXX unused at the moment + sfs_openfile_t *ptr; + sfs_blockbuf_t *block; */ + + if (!(CBIT_TEST(_sfs_mounted[req->sfsid].ibm, req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } +/* Check to make sure there's not an open file here - if there is, close and +flush it. Is this correct behaviour? 
At least, we shouldn't flush to disk - +at most, we should do something about the threads trying to hold the file open + while (ptr = _sfs_find_fd(req->sfsid, req->sfsfd)) + _sfs_flush_file(req->sfsid, ptr); +*/ +/* Without opening a file ;) read in the inode, walk the list of blocks, + and CBIT_CLEAR each one from .fbm */ +} + +void +sfs_do_close(sfs_requestor *req) +{ + sfs_openfile_t *ptr; + + printf("DEBUG: closing file %d\n",req->sfsfd); + if (!(ptr = _sfs_find_fd(req->sfsfd))) { + printf("DEBUG: couldn't find fd %d\n",req->sfsfd); + _sfs_done_request(req,-1); + return; + } + printf("DEBUG: flushing file %d\n",req->sfsfd); + _sfs_flush_file(req->sfsid, ptr); + if (ptr) { + /* Assuming _sfs_flush_file clears the other stuff from the openfd - + * will check that later... */ + xfree(ptr); + } + _sfs_done_request(req,0); + return; +} + +void +sfs_do_seek(sfs_requestor *req) +{ + sfs_openfile_t *ptr; + unsigned char sfsid; + + sfsid = req->sfsid; + ptr = _sfs_openfiles[sfsid]; + while (ptr) { + if (ptr->sfsfd == req->sfsfd) + break; + ptr = ptr->next; + } + if (!ptr) { + printf("DEBUG: Can't find an openfile for fd %d!\n", req->sfsfd); + _sfs_done_request(req,-1); + return; + } + if (req->offset > ptr->inode->len) { + printf("DEBUG: seek beyond EOF for fd %d\n",req->sfsfd); + _sfs_done_request(req,-1); + return; + } + ptr->pos = req->offset; + _sfs_done_request(req,0); + return; +} Index: squid/src/fs/sfs/sfs_read.c diff -u /dev/null squid/src/fs/sfs/sfs_read.c:1.1.2.4 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_read.c Wed Feb 7 02:41:49 2001 @@ -0,0 +1,56 @@ +/* $Id$ */ + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> + +#include "squid.h" +#include "sfs_defines.h" +#include "sfs_lib.h" + +int main(int argc, char *argv[]) +{ + int sfsid; + int sfsfd; + char buf[512]; + int inode; + int err; + + /* Check args - we need both a mountpoint and an inode */ + if (argc < 3) { + printf("error: need mountpoint inode..\n"); + exit(1); + } + + /* get inode */ + 
inode = atoi(argv[2]); + if (inode < 0) { + printf("inode '%s' is an invalid number\n", argv[2]); + exit(1); + } + + sfsid = sfs_mount(argv[1]); + printf("sfsid mount = %d\n",sfsid); + if (sfsid < 0) + exit(1); + + printf("opening %d\n", inode); + sfsfd = sfs_open(sfsid, inode, O_RDONLY, 0, _SFS_IO_SYNC, NULL); + if (sfsfd < 0) + exit(1); + + while ((err = sfs_read(sfsfd, buf, 510, _SFS_IO_SYNC, NULL)) > 0) { + /* err is also our write length! */ + write(1, buf, err); + } + printf("Got %d from sfs_read\n", err); + + printf("close result = %d\n",sfs_close(sfsfd, _SFS_IO_SYNC, NULL)); + printf("umount result = %d\n",sfs_umount(sfsid, _SFS_IO_SYNC)); + exit(0); +} Index: squid/src/fs/sfs/sfs_shim.c diff -u /dev/null squid/src/fs/sfs/sfs_shim.c:1.1.2.3 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_shim.c Sat Jan 27 02:20:44 2001 @@ -0,0 +1,48 @@ +#include "squid.h" + +void +dlinkAdd(void *data, dlink_node * m, dlink_list * list) +{ + m->data = data; + m->prev = NULL; + m->next = list->head; + if (list->head) + list->head->prev = m; + list->head = m; + if (list->tail == NULL) + list->tail = m; +} + +void +dlinkAddTail(void *data, dlink_node * m, dlink_list * list) +{ + m->data = data; + m->next = NULL; + m->prev = list->tail; + if (list->tail) + list->tail->next = m; + list->tail = m; + if (list->head == NULL) + list->head = m; +} + +void +dlinkDelete(dlink_node * m, dlink_list * list) +{ + if (m->next) + m->next->prev = m->prev; + if (m->prev) + m->prev->next = m->next; + if (m == list->head) + list->head = m->next; + if (m == list->tail) + list->tail = m->prev; + m->next = m->prev = NULL; +} + +void +xassert(const char *msg, const char *file, int line) +{ + printf("Assertion failed: %s:%d - %s\n", file, line, msg); + abort(); +} Index: squid/src/fs/sfs/sfs_splay.c diff -u /dev/null squid/src/fs/sfs/sfs_splay.c:1.1.2.1 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_splay.c Wed Jan 24 06:11:54 2001 @@ -0,0 +1,157 @@ +/* $Id$ */ + 
+#include "squid.h" + +#include "store_sfs.h" + +sfs_blockbuf_t * +_sfs_blockbuf_create() +{ + sfs_blockbuf_t *new; + if ((new = (sfs_blockbuf_t *)xcalloc(1, sizeof(sfs_blockbuf_t))) == NULL) + return NULL; + new->left = NULL; + new->right = NULL; + new->prev = NULL; + new->next = NULL; + new->sfsid = -1; + new->sfsinode = 0; + new->diskpos = -1; + new->dirty = 0; + new->type = _SFS_UNKNOWN; + new->buf = NULL; + return new; +} + +sfs_blockbuf_t *sfs_splay_find(uint diskpos, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *temp, *l, *r; + sfs_blockbuf_t new; + + if (tree == NULL) + return NULL; + + l = r = &new; + for (;;) { + if (diskpos < tree->diskpos) { + if (!(tree->left)) + break; + if (diskpos < tree->left->diskpos) { + temp = tree->left; + tree->left = temp->right; + temp->right = tree; + tree = temp; + if (tree->left == NULL) + break; + } + r->left = tree; + r = tree; + tree = tree->left; + } else if (diskpos > tree->diskpos) { + if (!(tree->right)) + break; + if (diskpos > tree->right->diskpos) { + temp = tree->right; + tree->right = temp->left; + temp->left = tree; + tree = temp; + if (tree->right == NULL) + break; + } + l->right = tree; + l = tree; + tree = tree->right; + } else { + break; + } + } + l->right = tree->left; + r->left = tree->right; + tree->left = new.right; + tree->right = new.left; + return tree; +} + +sfs_blockbuf_t * +sfs_splay_insert(int sfsid, sfs_blockbuf_t *new, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t **head, **tail; + + if (new == NULL) + return NULL; + head = &(_sfs_mounted[sfsid].head[new->dirty]); + tail = &(_sfs_mounted[sfsid].tail[new->dirty]); + if (tree == NULL) { + new->left = NULL; + new->right = NULL; + *head = *tail = new; + return new; + } + tree = sfs_splay_find(new->diskpos,tree); + if (new->diskpos == tree->diskpos) { + tree->refcount++; + return tree; + } + new->next = *head; + (*head)->prev = new; + *head = new; /* new is the fresh head of the list */ + if (new->diskpos < tree->diskpos) { + new->left = tree->left; + new->right = tree; /* was tree->right, which dropped the old root */ + tree->left = NULL; + } else { 
+ new->right = tree->right; + new->left = tree; + tree->right = NULL; + } + CBIT_SET(_sfs_mounted[sfsid].mhb, new->diskpos); + new->refcount = 1; + return new; +} + +sfs_blockbuf_t * +sfs_splay_remove(int sfsid, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *new; + sfs_blockbuf_t **head, **tail; + + tree->refcount--; + if (tree->refcount > 0) + return tree; + new = NULL; + head = &(_sfs_mounted[sfsid].head[tree->dirty]); + tail = &(_sfs_mounted[sfsid].tail[tree->dirty]); + if (tree->left == NULL) { + new = tree->right; + } else { + /* Splay the removed key itself so the predecessor comes to the + * root with an empty right subtree - splaying tree->left->diskpos + * would lose that node's own right children. */ + new = sfs_splay_find(tree->diskpos,tree->left); + new->right = tree->right; + } + if (*head == tree) + *head = new; + if (*tail == tree) + *tail = new; + if (tree->prev) + tree->prev->next = tree->next; + if (tree->next) + tree->next->prev = tree->prev; + return new; +} + +sfs_blockbuf_t * +sfs_splay_delete(int sfsid, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *old; + + if (tree == NULL) + return NULL; + old = tree; +/* Set this so it _will_ be deleted */ + if (tree->refcount > 1) + tree->refcount = 1; + tree = sfs_splay_remove(sfsid,tree); + if (tree != old) { + xfree(old->buf); + xfree(old); + } + return tree; +} Index: squid/src/fs/sfs/sfs_test.c diff -u /dev/null squid/src/fs/sfs/sfs_test.c:1.1.2.7 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_test.c Sat Feb 3 17:16:04 2001 @@ -0,0 +1,59 @@ +/* $Id$ */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <fcntl.h> + +#include "squid.h" +#include "sfs_defines.h" +#include "sfs_lib.h" + +int main() { + int sfsid; + int sfsfd; + uint sfsinode = 0; /* XXX never set by the create path below */ + char filename[20]; + char buf[80]; + + if (creat("test.drv", 0644) < 0) { + printf("cannot open new file test.drv: %s", strerror(errno)); + exit(1); + } + + if (sfs_format("test.drv",4096) < 0) { + printf("unable to format test.drv! 
%s", strerror(errno)); + exit(0); + } + sfsid = sfs_mount("test.drv"); + printf("sfsid = %d\n",sfsid); + + snprintf(filename,20,"%d/0",sfsid); + printf("opening %s",filename); + sfsfd = sfs_open(sfsid, 0, O_CREAT, 0, _SFS_IO_SYNC, NULL); + + sfs_write(sfsfd,"Hello...\n",strlen("Hello...\n"), _SFS_IO_SYNC, NULL); + sfs_write(sfsfd,"Hello, again!\n",strlen("Hello, again!\n"), _SFS_IO_SYNC, NULL); + printf("close result = %d\n",sfs_close(sfsfd, _SFS_IO_SYNC, NULL)); + printf("umount result = %d\n",sfs_umount(sfsid, _SFS_IO_SYNC)); + printf("About to remount and read...\n"); + sfsid = sfs_mount("test.drv"); + printf("sfsid = %d\n",sfsid); + printf("Opening %d/%d\n",sfsid,sfsinode); + + snprintf(filename,20,"%d/%d",sfsid,sfsinode); + printf("DEBUG: %s\n",filename); + sfsfd = sfs_open(sfsid, sfsinode, O_RDONLY, 0, _SFS_IO_SYNC, NULL); + + printf("sfsfd = %d, sfsinode = %d\n",sfsfd,sfsinode); + if (sfsfd >= 0) { + printf("read result = %d\n",sfs_read(sfsfd,buf,80, _SFS_IO_SYNC, NULL)); + printf("\n***** %s *****\n",buf); + printf("strlen buf = %d\n",strlen(buf)); + printf("close result = %d\n",sfs_close(sfsfd, _SFS_IO_SYNC, NULL)); + } + printf("umount result = %d\n",sfs_umount(sfsid, _SFS_IO_SYNC)); + exit(0); +} Index: squid/src/fs/sfs/sfs_util.c diff -u /dev/null squid/src/fs/sfs/sfs_util.c:1.1.2.15 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_util.c Wed Feb 7 01:49:06 2001 @@ -0,0 +1,406 @@ +/* sfs_util.c,v 1.53 2001/01/24 12:49:34 adrian Exp */ + +#include "squid.h" +#include "store_sfs.h" + +extern int inode_data_size; +extern int direct_pointer_threshold; + +void +_sfs_waitfor_request(sfs_requestor *req) +/* You know, we could count the number of seconds a request has had to wait to + * be serviced here... 
*/ +{ + assert(req->io_type == _SFS_IO_SYNC); + pthread_mutex_lock(&(req->done_signal_lock)); + if (!(req->request_state == _SFS_DONE)) + pthread_cond_wait(&(req->done_signal),&(req->done_signal_lock)); + pthread_mutex_unlock(&(req->done_signal_lock)); +} + +/* + * _sfs_remove_request + * + * Remove the request from the done request list, and deallocate it. + */ +int +_sfs_remove_request(sfs_requestor *req) +/* This doesn't free the buffer - not sure whether we have any need to keep + * the buffer anywhere or not, but the option is there... This will be changed + * by various commloops/modio changes - hopefully, we won't even be supplying + * the buffer. */ +{ + printf("DEBUG: Removing %d request from done queue\n",req->request_type); + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].done_lock)); + dlinkDelete(&req->node, &(_sfs_mounted[req->sfsid].done_queue)); + _sfs_mounted[req->sfsid].done_requests--; + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].done_lock)); + xfree(req); + return (0); +} + +/* + * _sfs_done_request + * + * Move the request from the request queue to the done queue, and then + * signal any sleeping thread(s) that this request has completed. + */ +void +_sfs_done_request(sfs_requestor *req, int retval) +{ + + req->ret = retval; + req->request_state = _SFS_DONE; + + /* Take it off the mount point request queue. Note, requests are + * always marked done from within the thread, within the lock that's + * already in place from the main sfs_thread_loop. Hence, no locking. 
*/ + dlinkDelete(&req->node, _sfs_mounted[req->sfsid].request_queue); + + /* Add it to the squid done queue */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].done_lock)); + dlinkAddTail(req, &req->node, &(_sfs_mounted[req->sfsid].done_queue)); + _sfs_mounted[req->sfsid].done_requests++; + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].done_lock)); + + /* If it's a sync operation, signal it's done - that allows sfs_interface + * to pick up and return */ + if (req->io_type == _SFS_IO_SYNC) { + pthread_mutex_lock(&(req->done_signal_lock)); + pthread_cond_signal(&(req->done_signal)); + pthread_mutex_unlock(&(req->done_signal_lock)); + } +} + +sfs_requestor * +_sfs_create_requestor(int sfsid, enum sfs_request_type reqtype, + enum sfs_io_type iotype) +{ + struct sfs_requestor *req; + + if (!(req = (sfs_requestor *)xcalloc(1, sizeof(sfs_requestor)))) + return NULL; + + req->request_type = reqtype; + req->io_type = iotype; + req->request_state = _SFS_PENDING; + req->sfsid = sfsid; + req->sfsfd = 0; + req->offset = -1; + req->ret = 0; + req->buf = NULL; + pthread_cond_init(&(req->done_signal), NULL); + pthread_mutex_init(&(req->done_signal_lock), NULL); + return req; +} + +int +_sfs_submit_request(sfs_requestor *req) +{ + /* add the request to the end of the list rather than the start */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_lock)); + dlinkAddTail(req, &req->node, _sfs_mounted[req->sfsid].request_queue); + _sfs_mounted[req->sfsid].pending_requests++; + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_lock)); + + /* and signal that the request has been made */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_signal_lock)); + pthread_cond_signal(&(_sfs_mounted[req->sfsid].req_signal)); + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_signal_lock)); + + printf("DEBUG: request %p submitted\n",req); + return(0); +} + +sfs_blockbuf_t * +_sfs_read_block(uint sfsid, uint diskpos) +{ + /* This takes an sfsid, and a diskpos, and returns a blockbuf filled in 
+ with the correct data. */ + sfs_blockbuf_t *new; + uint64_t dpos; + int readlen; + + /* Searching for the appropriate block in the clean list */ + if (_sfs_mounted[sfsid].clean) { + _sfs_mounted[sfsid].clean = sfs_splay_find(diskpos,_sfs_mounted[sfsid].clean); + if (_sfs_mounted[sfsid].clean->diskpos == diskpos) { + return _sfs_mounted[sfsid].clean; + } + } + /* And in the dirty list */ + /* We probably shouldn't find things in the dirty list - they should + * probably be served by squid's own cache first (assuming squid's still + * keeping a cache...) Might be worth measuring the frequency of this + * one. */ + if (_sfs_mounted[sfsid].dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(diskpos,_sfs_mounted[sfsid].dirty); + if (_sfs_mounted[sfsid].dirty->diskpos == diskpos) { + return _sfs_mounted[sfsid].dirty; + } + } + /* Otherwise we're reading a new one in off the disk. */ + dpos = diskpos * FRAGSIZE; + + if (!(new = _sfs_blockbuf_create())) + return NULL; + if (!(new->buf = (char *)xcalloc(1, FRAGSIZE))) { + xfree(new); + return NULL; + } + if (lseek(_sfs_mounted[sfsid].fd, dpos, SEEK_SET) < 0) { + xfree(new->buf); + xfree(new); + return NULL; + } + if ((readlen = read(_sfs_mounted[sfsid].fd, new->buf, FRAGSIZE)) < FRAGSIZE) { + xfree(new->buf); + xfree(new); + return NULL; + } + new->sfsid = sfsid; + new->diskpos = diskpos; + new->buflen = FRAGSIZE; + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, diskpos)) { + new->type = _SFS_INODE; + } else { + new->type = _SFS_DATA; + } + /* Add it to the clean list on the spot. */ + _sfs_mounted[sfsid].clean = sfs_splay_insert(sfsid, new, _sfs_mounted[sfsid].clean); + return new; +} + +uint +_sfs_calculate_diskpos(sfs_openfile_t *openfd, uint offset) +{ +/* This function returns the disk position of the block into which bytes +should be written, or from which bytes should be read. It is granular +to a block level only.
*/ + sfs_blockbuf_t *din; + uint *dinptr; + sfsid_t sfsid; + + sfsid = openfd->sfsid; + if (offset < inode_data_size) + return openfd->inodebuf_p->diskpos; +/* Otherwise subtract inode_data_size, then div by FRAGSIZE to get entry in + direct block pointers */ + else if (offset < direct_pointer_threshold) { + return openfd->inode->dip[((offset - inode_data_size) / FRAGSIZE)]; + } else { +/* This insinuates that we're storing indirect pointers in chunks rather than + frags - I think that's fair, it gives us bigger files ;) Incidentally, + max filesize under this system is: + ((FRAGSIZE*(CHUNKSIZE / sizeof(uint)))*64)+(FRAGSIZE * 63)+inode_data_size + Under current settings, that's 512MB + little bits. Extending this should + be done, but without creating frag problems, and without increasing the + number of indirect pointers required by too much. Having said that, + increasing the number of indirect pointers at that stage is probably + worthwhile - there are very few files that large legitimately. My preference is + to store another pointer in the first chunk of indirect pointers - drops + that down to 2047, and adds another chunk (giving another 512MB easily, + and the option to keep chaining if required). At that stage, we'll want to + store state information so we don't read three times for each distinct + read call.
+*/ + din = _sfs_read_block(sfsid,openfd->inode->sin[(offset - direct_pointer_threshold) / (FRAGSIZE * (CHUNKSIZE / sizeof(uint)))]); + dinptr = (uint *)din->buf; + /* index within this chunk of pointers, not across the whole file */ + return dinptr[((offset - direct_pointer_threshold) / FRAGSIZE) % (CHUNKSIZE / sizeof(uint))]; + } +} + +void +_sfs_commit_block(int sfsid, sfs_blockbuf_t *block) +{ + uint64_t dpos; + + dpos = block->diskpos * FRAGSIZE; + lseek(_sfs_mounted[sfsid].fd, dpos, SEEK_SET); + write(_sfs_mounted[sfsid].fd, block->buf, FRAGSIZE); + if (block->type == _SFS_INODE) { + CBIT_SET(_sfs_mounted[sfsid].ibm, block->diskpos); + } +} + +sfs_blockbuf_t * +_sfs_write_block(uint sfsid, uint diskpos, void *buf, int buflen, enum sfs_block_type type) +{ + sfs_blockbuf_t *new; + sfs_blockbuf_t *old; + + printf("DEBUG: sfsid = %d, diskpos = %d, buflen = %d\n",sfsid,diskpos,buflen); + new = old = NULL; + /* If it's an inode, make sure we have it in clean or dirty - gotta + * preserve the inode data */ + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, diskpos)) + _sfs_read_block(sfsid, diskpos); + /* If it's in the clean list, remove it - it's now incorrect */ + if (_sfs_mounted[sfsid].clean) { + _sfs_mounted[sfsid].clean = sfs_splay_find(diskpos,_sfs_mounted[sfsid].clean); + if (_sfs_mounted[sfsid].clean->diskpos == diskpos) { + old = _sfs_mounted[sfsid].clean; + _sfs_mounted[sfsid].clean->refcount = 1; + _sfs_mounted[sfsid].clean = sfs_splay_remove(sfsid, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].dirty = sfs_splay_insert(sfsid, old, _sfs_mounted[sfsid].dirty); + } + } + /* Likewise the dirty list - we'll be simply replacing the contents if it's + * there */ + if (_sfs_mounted[sfsid].dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(diskpos,_sfs_mounted[sfsid].dirty); + if (_sfs_mounted[sfsid].dirty->diskpos == diskpos) { + xfree(_sfs_mounted[sfsid].dirty->buf); + _sfs_mounted[sfsid].dirty->buf = (char *)xcalloc(1, FRAGSIZE); + new = _sfs_mounted[sfsid].dirty; + } + } + if (!new) { + if (!(new = _sfs_blockbuf_create())) +
return NULL; + new->buf = (char *)xcalloc(1, FRAGSIZE); + new->sfsid = sfsid; + new->diskpos = diskpos; + new->dirty = 1; + new->type = type; + CBIT_SET(_sfs_mounted[sfsid].mhb, diskpos); + printf("DEBUG: inserting block into dirty list\n"); + _sfs_mounted[sfsid].dirty = sfs_splay_insert(sfsid, new, _sfs_mounted[sfsid].dirty); + } + memcpy(new->buf,buf,buflen); + return new; +} + +uint +_sfs_allocate_fd(sfs_openfile_t *new) +/* This is to be called only from within an fs thread - no locking ;) */ +{ + sfs_openfile_t *tmp; + uint maxfd; + + maxfd = new->sfsid << 24; + tmp = _sfs_openfiles[new->sfsid]; + while(tmp) { + if (tmp->sfsfd > maxfd) + maxfd = tmp->sfsfd; + tmp = tmp->next; + } + return maxfd+1; +} + +uint +_sfs_allocate_block(int sfsid, int blocktype) +{ + uint i; + int found; + int blocks; + + if (blocktype == _SFS_INODE) + blocks = 2; + else + blocks = 1; +/* First block is always already used - rootblock */ + for(i=1, found=0; i<_sfs_mounted[sfsid].rootblock->numfrags; i += blocks) { + if (!(CBIT_TEST(_sfs_mounted[sfsid].mhb, i))) { + found = 1; + break; + } + } + if (found) { + CBIT_SET(_sfs_mounted[sfsid].mhb, i); + return i; + } else + return 0; +} + +sfs_openfile_t * +_sfs_find_fd(int sfsfd) +{ + sfs_openfile_t *ptr; + + ptr = _sfs_openfiles[sfsfd >> 24]; + while (ptr) { + if (ptr->sfsfd == sfsfd) + break; + ptr = ptr->next; + } + return ptr; +} + +void +_sfs_flush_bitmaps(int sfsid) +{ + printf("DEBUG: Flushing bitmaps\n"); + lseek(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].rootblock->ibmpos, SEEK_SET); + write(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].ibm, _sfs_mounted[sfsid].rootblock->bmlen); + write(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].fbm, _sfs_mounted[sfsid].rootblock->bmlen); +} + +int +_sfs_flush_file(int sfsid, sfs_openfile_t *fd) +{ + sfs_openblock_list *tmp, *nxt; + sfs_openfile_t *tmpfile; + uint diskpos; + + tmpfile = _sfs_openfiles[sfsid]; + while (tmpfile) { + if (tmpfile->sfsfd == fd->sfsfd) { + break; + } + tmpfile = 
tmpfile->next; + } + if (tmpfile) { + if (tmpfile->next) + tmpfile->next->prev = tmpfile->prev; + if (tmpfile->prev) + tmpfile->prev->next = tmpfile->next; + if (tmpfile->prev == NULL) + _sfs_openfiles[sfsid] = tmpfile->next; + } +/* Flush the inode block */ + _sfs_commit_block(sfsid, fd->inodebuf_p); +/* Flush all the single indirect blocks */ + tmp = fd->sibuf_list_p; + while (tmp) { +/* Two variables for after the splay_delete, in case the structure goes away */ + nxt = tmp->next; + diskpos = tmp->buf->diskpos; + _sfs_commit_block(sfsid, tmp->buf); + if (tmp->buf->dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(tmp->buf->diskpos, _sfs_mounted[sfsid].dirty); + _sfs_mounted[sfsid].dirty = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].dirty); + } else { + _sfs_mounted[sfsid].clean = sfs_splay_find(tmp->buf->diskpos, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].clean = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].clean); + } +/* This isn't strictly how we designed it, but there you go */ + CBIT_SET(_sfs_mounted[sfsid].fbm, diskpos); + xfree(tmp); + tmp = nxt; + } + fd->sibuf_list_p = NULL; +/* Flush all the stuff off the double indirect block */ + if (fd->dibuf_p) { +/* indirect pointer - Panic ;) */ + } +/* Go back and flush the inode properly */ + CBIT_SET(_sfs_mounted[sfsid].fbm, fd->inodebuf_p->diskpos); + if (fd->inodebuf_p->dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(fd->inodebuf_p->diskpos, _sfs_mounted[sfsid].dirty); + _sfs_mounted[sfsid].dirty = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].dirty); + } else { + _sfs_mounted[sfsid].clean = sfs_splay_find(fd->inodebuf_p->diskpos, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].clean = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].clean); + } +/* XXX - nowhere else do we actually check the return value of this code. */ +/* Change function to void? */ + return (0); +} + +/* This function prints out the contents of a request - debug function.
*/ +void +_sfs_print_request(sfs_requestor *req) +{ + printf(" fd %d: %d/%d - type %d, state %d, buflen %d\n", + req->sfsfd,req->sfsid,req->sfsinode,req->request_type, + req->request_state,req->buflen); +} Index: squid/src/fs/sfs/store_dir_sfs.c diff -u /dev/null squid/src/fs/sfs/store_dir_sfs.c:1.1.2.12 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/store_dir_sfs.c Wed Feb 7 02:18:08 2001 @@ -0,0 +1,1798 @@ + +/* + * $Id$ + * + * DEBUG: section 47 Store Directory Routines + * AUTHOR: Duane Wessels + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ * + */ + +#include "squid.h" + +#include "store_sfs.h" + +#define DefaultLevelOneDirs 16 +#define DefaultLevelTwoDirs 256 +#define STORE_META_BSFSZ 4096 + +typedef struct _RebuildState RebuildState; +struct _RebuildState { + SwapDir *sd; + int n_read; + FILE *log; + int speed; + int curlvl1; + int curlvl2; + struct { + unsigned int need_to_validate:1; + unsigned int clean:1; + unsigned int init:1; + } flags; + int done; + int in_dir; + int fn; + struct dirent *entry; + DIR *td; + char fullpath[SQUID_MAXPATHLEN]; + char fullfilename[SQUID_MAXPATHLEN]; + struct _store_rebuild_data counts; +}; + +static int n_sfs_dirs = 0; +static int *sfs_dir_index = NULL; +MemPool *sfs_state_pool = NULL; +static int sfs_initialised = 0; + +static char *storeSfsDirSwapSubDir(SwapDir *, int subdirn); +static int storeSfsDirVerifyCacheDirs(SwapDir *); +static int storeSfsDirVerifyDirectory(const char *path); +static char *storeSfsDirSwapLogFile(SwapDir *, const char *); +static EVH storeSfsDirRebuildFromDirectory; +static EVH storeSfsDirRebuildFromSwapLog; +static int storeSfsDirGetNextFile(RebuildState *, int *sfileno, int *size); +static StoreEntry *storeSfsDirAddDiskRestore(SwapDir * SD, const cache_key * key, + int file_number, + size_t swap_file_sz, + time_t expires, + time_t timestamp, + time_t lastref, + time_t lastmod, + u_num32 refcount, + u_short flags, + int clean); +static void storeSfsDirRebuild(SwapDir * sd); +static void storeSfsDirCloseTmpSwapLog(SwapDir * sd); +static FILE *storeSfsDirOpenTmpSwapLog(SwapDir *, int *, int *); +static STLOGOPEN storeSfsDirOpenSwapLog; +static STINIT storeSfsDirInit; +static STFREE storeSfsDirFree; +static STLOGCLEANSTART storeSfsDirWriteCleanStart; +static STLOGCLEANNEXTENTRY storeSfsDirCleanLogNextEntry; +static STLOGCLEANWRITE storeSfsDirWriteCleanEntry; +static STLOGCLEANDONE storeSfsDirWriteCleanDone; +static STLOGCLOSE storeSfsDirCloseSwapLog; +static STLOGWRITE storeSfsDirSwapLog; +static STNEWFS storeSfsDirNewfs; +static STDUMP 
storeSfsDirDump; +static STMAINTAINFS storeSfsDirMaintain; +static STCHECKOBJ storeSfsDirCheckObj; +static STREFOBJ storeSfsDirRefObj; +static STUNREFOBJ storeSfsDirUnrefObj; +static STSYNC storeSfsSync; +static QS rev_int_sort; +static int storeSfsDirClean(int swap_index); +static EVH storeSfsDirCleanEvent; +static int storeSfsDirIs(SwapDir * sd); +static int storeSfsFilenoBelongsHere(int fn, int F0, int F1, int F2); +static int storeSfsCleanupDoubleCheck(SwapDir *, StoreEntry *); +static void storeSfsDirStats(SwapDir *, StoreEntry *); +static void storeSfsDirInitBitmap(SwapDir *); +static int storeSfsDirValidFileno(SwapDir *, sfileno, int); +static STCALLBACK storeSfsDirCallback; + +/* + * These functions were ripped straight out of the heart of store_dir.c. + * They assume that the given filenum is on an sfs partition, which may or + * may not be true. + * XXX this evilness should be tidied up at a later date! + */ + +int +storeSfsDirMapBitTest(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + return file_map_bit_test(sfsinfo->map, filn); +} + +void +storeSfsDirMapBitSet(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + file_map_bit_set(sfsinfo->map, filn); +} + +void +storeSfsDirMapBitReset(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + /* + * We have to test the bit before calling file_map_bit_reset. + * file_map_bit_reset doesn't do bounds checking. It assumes + * filn is a valid file number, but it might not be because + * the map is dynamic in size. Also clearing an already clear + * bit puts the map counter out-of-whack.
+ */ + if (file_map_bit_test(sfsinfo->map, filn)) + file_map_bit_reset(sfsinfo->map, filn); +} + +int +storeSfsDirMapBitAllocate(SwapDir * SD) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int fn; + fn = file_map_allocate(sfsinfo->map, sfsinfo->suggest); + file_map_bit_set(sfsinfo->map, fn); + sfsinfo->suggest = fn + 1; + return fn; +} + +/* + * Initialise the sfs bitmap + * + * If there already is a bitmap, and the numobjects is larger than currently + * configured, we allocate a new bitmap and 'grow' the old one into it. + */ +static void +storeSfsDirInitBitmap(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + + if (sfsinfo->map == NULL) { + /* First time */ + sfsinfo->map = file_map_create(); + } else if (sfsinfo->map->max_n_files) { + /* it grew, need to expand */ + /* XXX We don't need it anymore .. */ + } + /* else it shrunk, and we leave the old one in place */ +} + +static char * +storeSfsDirSwapSubDir(SwapDir * sd, int subdirn) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + + LOCAL_ARRAY(char, fullfilename, SQUID_MAXPATHLEN); + assert(0 <= subdirn && subdirn < sfsinfo->l1); + snprintf(fullfilename, SQUID_MAXPATHLEN, "%s/%02X", sd->path, subdirn); + return fullfilename; +} + +static int +storeSfsDirVerifyDirectory(const char *path) +{ + struct stat sb; + if (stat(path, &sb) < 0) { + debug(20, 0) ("%s: %s\n", path, xstrerror()); + return -1; + } + if (S_ISDIR(sb.st_mode) == 0) { + debug(20, 0) ("%s is not a directory\n", path); + return -1; + } + return 0; +} + +/* + * This function is called by storeSfsDirInit(). 
If this returns < 0, + * then Squid exits, complains about swap directories not + * existing, and instructs the admin to run 'squid -z' + */ +static int +storeSfsDirVerifyCacheDirs(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + int j; + const char *path = sd->path; + + if (storeSfsDirVerifyDirectory(path) < 0) + return -1; + for (j = 0; j < sfsinfo->l1; j++) { + path = storeSfsDirSwapSubDir(sd, j); + if (storeSfsDirVerifyDirectory(path) < 0) + return -1; + } + return 0; +} + + +static char * +storeSfsDirSwapLogFile(SwapDir * sd, const char *ext) +{ + LOCAL_ARRAY(char, path, SQUID_MAXPATHLEN); + LOCAL_ARRAY(char, pathtmp, SQUID_MAXPATHLEN); + LOCAL_ARRAY(char, digit, 32); + char *pathtmp2; + if (Config.Log.swap) { + xstrncpy(pathtmp, sd->path, SQUID_MAXPATHLEN - 64); + while (index(pathtmp, '/')) + *index(pathtmp, '/') = '.'; + while (strlen(pathtmp) && pathtmp[strlen(pathtmp) - 1] == '.') + pathtmp[strlen(pathtmp) - 1] = '\0'; + for (pathtmp2 = pathtmp; *pathtmp2 == '.'; pathtmp2++); + snprintf(path, SQUID_MAXPATHLEN - 64, Config.Log.swap, pathtmp2); + if (strncmp(path, Config.Log.swap, SQUID_MAXPATHLEN - 64) == 0) { + strcat(path, "."); + snprintf(digit, 32, "%02d", sd->index); + strncat(path, digit, 3); + } + } else { + xstrncpy(path, sd->path, SQUID_MAXPATHLEN - 64); + strcat(path, "/swap.state"); + } + if (ext) + strncat(path, ext, 16); + return path; +} + +static void +storeSfsDirOpenSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *path; + int fd; + path = storeSfsDirSwapLogFile(sd, NULL); + fd = file_open(path, O_WRONLY | O_CREAT | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", path, xstrerror()); + fatal("storeSfsDirOpenSwapLog: Failed to open swap log."); + } + debug(47, 3) ("Cache Dir #%d log opened on FD %d\n", sd->index, fd); + sfsinfo->swaplog_fd = fd; + if (0 == n_sfs_dirs) + assert(NULL == sfs_dir_index); + n_sfs_dirs++; + assert(n_sfs_dirs <= Config.cacheSwap.n_configured); +} + +static 
void +storeSfsDirCloseSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + if (sfsinfo->swaplog_fd < 0) /* not open */ + return; + file_close(sfsinfo->swaplog_fd); + debug(47, 3) ("Cache Dir #%d log closed on FD %d\n", + sd->index, sfsinfo->swaplog_fd); + sfsinfo->swaplog_fd = -1; + n_sfs_dirs--; + assert(n_sfs_dirs >= 0); + if (0 == n_sfs_dirs) + safe_free(sfs_dir_index); +} + +static void +storeSfsDirInit(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = sd->fsdata; + + static int started_clean_event = 0; + static const char *errmsg = + "\tFailed to verify one of the swap directories, Check cache.log\n" + "\tfor details. Run 'squid -z' to create swap directories\n" + "\tif needed, or if running Squid for the first time."; + storeSfsDirInitBitmap(sd); + + /* Mount the FS */ + assert(sfsinfo->sfsid == 0); + sfsinfo->sfsid = sfs_mount(sd->path); + if (sfsinfo->sfsid < 0) + fatalf("Failed to mount %s!\n", sd->path); + +#if 0 + /* We have to verify it some other way */ + if (storeSfsDirVerifyCacheDirs(sd) < 0) + fatal(errmsg); +#endif + storeSfsDirOpenSwapLog(sd); + storeSfsDirRebuild(sd); + if (!started_clean_event) { + eventAdd("storeDirClean", storeSfsDirCleanEvent, NULL, 15.0, 1); + started_clean_event = 1; + } + (void) storeDirGetBlkSize(sd->path, &sd->fs.blksize); +} + +static void +storeSfsDirRebuildFromDirectory(void *data) +{ + RebuildState *rb = data; + SwapDir *SD = rb->sd; + LOCAL_ARRAY(char, hdr_buf, SM_PAGE_SIZE); + StoreEntry *e = NULL; + StoreEntry tmpe; + cache_key key[MD5_DIGEST_CHARS]; + int sfileno = 0; + int count; + int size; + struct stat sb; + int swap_hdr_len; + int fd = -1; + tlv *tlv_list; + tlv *t; + + sfsblock_t currentEntry; + sfsinfo_t *sfsinfo = rb->sd->fsdata; + int filesize; + + assert(rb != NULL); + debug(20, 3) ("storeSfsDirRebuildFromDirectory: DIR #%d\n", rb->sd->index); + + /* We don't do anything right now */ + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + 
cbdataFree(rb); + return; + + currentEntry = 0; + + for (count = 0; count < rb->speed; count++) { + assert(fd == -1); + fd = sfs_openNextInode(sfsinfo->sfsid,&currentEntry); + if (fd == -2) { + debug(20, 1) ("Done scanning %s dir (%d entries)\n", + rb->sd->path, rb->n_read); + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + } else if (fd < 0) { + continue; + } + assert(fd > -1); + filesize = sfs_filesize(sfsinfo->sfsid,currentEntry); + + if ((++rb->counts.scancount & 0xFFFF) == 0) + debug(20, 3) (" %s %7d files opened so far.\n", + rb->sd->path, rb->counts.scancount); + debug(20, 9) ("file_in: fd=%d %08X\n", fd, currentEntry); + statCounter.syscalls.disk.reads++; + if (sfs_read(fd, hdr_buf, SM_PAGE_SIZE, _SFS_IO_SYNC, NULL) < 0) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: sfs_read(FD %d): %s\n", + fd, xstrerror()); + sfs_close(fd,_SFS_IO_SYNC,NULL); + store_open_disk_fd--; + fd = -1; + continue; + } + swap_hdr_len = 0; + tlv_list = storeSwapMetaUnpack(hdr_buf, &swap_hdr_len); + if (tlv_list == NULL) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: failed to get meta data for file %d\n",currentEntry); + sfs_close(fd,_SFS_IO_SYNC,NULL); + store_open_disk_fd--; + fd = -1; + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + continue; + } + sfs_close(fd,_SFS_IO_SYNC,NULL); + store_open_disk_fd--; + fd = -1; + debug(20, 3) ("storeSfsDirRebuildFromDirectory: successful swap meta unpacking\n"); + memset(key, '\0', MD5_DIGEST_CHARS); + memset(&tmpe, '\0', sizeof(StoreEntry)); + for (t = tlv_list; t; t = t->next) { + switch (t->type) { + case STORE_META_KEY: + assert(t->length == MD5_DIGEST_CHARS); + xmemcpy(key, t->value, MD5_DIGEST_CHARS); + break; + case STORE_META_STD: + assert(t->length == STORE_HDR_METASIZE); + xmemcpy(&tmpe.timestamp, t->value, STORE_HDR_METASIZE); + break; + default: + break; + } + } + storeSwapTLVFree(tlv_list); + tlv_list = NULL; + if
(storeKeyNull(key)) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: NULL key\n"); + sfs_close(fd,_SFS_IO_SYNC,NULL); + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + continue; + } + tmpe.hash.key = key; + if (tmpe.swap_file_sz == 0) { + tmpe.swap_file_sz = filesize; + } else if (tmpe.swap_file_sz == filesize - swap_hdr_len) { + tmpe.swap_file_sz = filesize; + } else if (tmpe.swap_file_sz != filesize) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: SIZE MISMATCH %d!=%d\n", + tmpe.swap_file_sz, filesize); + sfs_close(fd,_SFS_IO_SYNC,NULL); + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + continue; + } + if (EBIT_TEST(tmpe.flags, KEY_PRIVATE)) { + sfs_close(fd,_SFS_IO_SYNC,NULL); + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + rb->counts.badflags++; + continue; + } + e = storeGet(key); + if (e && e->lastref >= tmpe.lastref) { + /* key already exists, current entry is newer */ + /* keep old, ignore new */ + rb->counts.dupcount++; + continue; + } else if (NULL != e) { + /* URL already exists, this swapfile not being used */ + /* junk old, load new */ + storeRelease(e); /* release old entry */ + rb->counts.dupcount++; + } + rb->counts.objcount++; + storeEntryDump(&tmpe, 5); + e = storeSfsDirAddDiskRestore(SD, key, + sfileno, + tmpe.swap_file_sz, + tmpe.expires, + tmpe.timestamp, + tmpe.lastref, + tmpe.lastmod, + tmpe.refcount, /* refcount */ + tmpe.flags, /* flags */ + (int) rb->flags.clean); + storeDirSwapLog(e, SWAP_LOG_ADD); + } + eventAdd("storeRebuild", storeSfsDirRebuildFromDirectory, rb, 0.0, 1); +} + +static void +storeSfsDirRebuildFromSwapLog(void *data) +{ + RebuildState *rb = data; + SwapDir *SD = rb->sd; + StoreEntry *e = NULL; + storeSwapLogData s; + size_t ss = sizeof(storeSwapLogData); + int count; + int used; /* is swapfile already in use? */ + int disk_entry_newer; /* is the log entry newer than current entry? 
*/ + double x; + assert(rb != NULL); + + /* We don't do anything right now */ + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + + /* load a number of objects per invocation */ + for (count = 0; count < rb->speed; count++) { + if (fread(&s, ss, 1, rb->log) != 1) { + debug(20, 1) ("Done reading %s swaplog (%d entries)\n", + rb->sd->path, rb->n_read); + fclose(rb->log); + rb->log = NULL; + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + } + rb->n_read++; + if (s.op <= SWAP_LOG_NOP) + continue; + if (s.op >= SWAP_LOG_MAX) + continue; + /* + * BC: during 2.4 development, we changed the way swap file + * numbers are assigned and stored. The high 16 bits used + * to encode the SD index number. There used to be a call + * to storeDirProperFileno here that re-assigned the index + * bits. Now, for backwards compatibility, we just need + * to mask it off. + */ + s.swap_filen &= 0x00FFFFFF; + debug(20, 3) ("storeSfsDirRebuildFromSwapLog: %s %s %08X\n", + swap_log_op_str[(int) s.op], + storeKeyText(s.key), + s.swap_filen); + if (s.op == SWAP_LOG_ADD) { + (void) 0; + } else if (s.op == SWAP_LOG_DEL) { + if ((e = storeGet(s.key)) != NULL) { + /* + * Make sure we don't unlink the file, it might be + * in use by a subsequent entry. Also note that + * we don't have to subtract from store_swap_size + * because adding to store_swap_size happens in + * the cleanup procedure. 
+ */ + storeExpireNow(e); + storeReleaseRequest(e); + storeSfsDirReplRemove(e); + if (e->swap_filen > -1) { + storeSfsDirMapBitReset(SD, e->swap_filen); + e->swap_filen = -1; + e->swap_dirn = -1; + } + storeRelease(e); + rb->counts.objcount--; + rb->counts.cancelcount++; + } + continue; + } else { + x = log(++rb->counts.bad_log_op) / log(10.0); + if (0.0 == x - (double) (int) x) + debug(20, 1) ("WARNING: %d invalid swap log entries found\n", + rb->counts.bad_log_op); + rb->counts.invalid++; + continue; + } + if ((++rb->counts.scancount & 0xFFF) == 0) { + struct stat sb; + if (0 == fstat(fileno(rb->log), &sb)) + storeRebuildProgress(SD->index, + (int) sb.st_size / ss, rb->n_read); + } + if (!storeSfsDirValidFileno(SD, s.swap_filen, 0)) { + rb->counts.invalid++; + continue; + } + if (EBIT_TEST(s.flags, KEY_PRIVATE)) { + rb->counts.badflags++; + continue; + } + e = storeGet(s.key); + used = storeSfsDirMapBitTest(SD, s.swap_filen); + /* If this URL already exists in the cache, does the swap log + * appear to have a newer entry? Compare 'lastref' from the + * swap log to e->lastref. */ + disk_entry_newer = e ? (s.lastref > e->lastref ? 1 : 0) : 0; + if (used && !disk_entry_newer) { + /* log entry is old, ignore it */ + rb->counts.clashcount++; + continue; + } else if (used && e && e->swap_filen == s.swap_filen && e->swap_dirn == SD->index) { + /* swapfile taken, same URL, newer, update meta */ + if (e->store_status == STORE_OK) { + e->lastref = s.timestamp; + e->timestamp = s.timestamp; + e->expires = s.expires; + e->lastmod = s.lastmod; + e->flags = s.flags; + e->refcount += s.refcount; + storeSfsDirUnrefObj(SD, e); + } else { + debug_trap("storeSfsDirRebuildFromSwapLog: bad condition"); + debug(20, 1) ("\tSee %s:%d\n", __FILE__, __LINE__); + } + continue; + } else if (used) { + /* swapfile in use, not by this URL, log entry is newer */ + /* This is sorta bad: the log entry should NOT be newer at this + * point. 
If the log is dirty, the filesize check should have + * caught this. If the log is clean, there should never be a + * newer entry. */ + debug(20, 1) ("WARNING: newer swaplog entry for dirno %d, fileno %08X\n", + SD->index, s.swap_filen); + /* I'm tempted to remove the swapfile here just to be safe, + * but there is a bad race condition in the NOVM version if + * the swapfile has recently been opened for writing, but + * not yet opened for reading. Because we can't map + * swapfiles back to StoreEntrys, we don't know the state + * of the entry using that file. */ + /* We'll assume the existing entry is valid, probably because + * we're in a slow rebuild and the swap file number got taken + * and the validation procedure hasn't run. */ + assert(rb->flags.need_to_validate); + rb->counts.clashcount++; + continue; + } else if (e && !disk_entry_newer) { + /* key already exists, current entry is newer */ + /* keep old, ignore new */ + rb->counts.dupcount++; + continue; + } else if (e) { + /* key already exists, this swapfile not being used */ + /* junk old, load new */ + storeExpireNow(e); + storeReleaseRequest(e); + storeSfsDirReplRemove(e); + if (e->swap_filen > -1) { + /* Make sure we don't actually unlink the file */ + storeSfsDirMapBitReset(SD, e->swap_filen); + e->swap_filen = -1; + e->swap_dirn = -1; + } + storeRelease(e); + rb->counts.dupcount++; + } else { + /* URL doesn't exist, swapfile not in use */ + /* load new */ + (void) 0; + } + /* update store_swap_size */ + rb->counts.objcount++; + e = storeSfsDirAddDiskRestore(SD, s.key, + s.swap_filen, + s.swap_file_sz, + s.expires, + s.timestamp, + s.lastref, + s.lastmod, + s.refcount, + s.flags, + (int) rb->flags.clean); + storeDirSwapLog(e, SWAP_LOG_ADD); + } + eventAdd("storeRebuild", storeSfsDirRebuildFromSwapLog, rb, 0.0, 1); +} + +static int +storeSfsDirGetNextFile(RebuildState * rb, int *sfileno, int *size) +{ + SwapDir *SD = rb->sd; + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int fd = -1; + int used
= 0; + int dirs_opened = 0; + debug(20, 3) ("storeSfsDirGetNextFile: flag=%d, %d: /%02X/%02X\n", + rb->flags.init, + rb->sd->index, + rb->curlvl1, rb->curlvl2); + if (rb->done) + return -2; + while (fd < 0 && rb->done == 0) { + fd = -1; + if (0 == rb->flags.init) { /* initialize, open first file */ + rb->done = 0; + rb->curlvl1 = 0; + rb->curlvl2 = 0; + rb->in_dir = 0; + rb->flags.init = 1; + assert(Config.cacheSwap.n_configured > 0); + } + if (0 == rb->in_dir) { /* we need to read in a new directory */ + snprintf(rb->fullpath, SQUID_MAXPATHLEN, "%s/%02X/%02X", + rb->sd->path, + rb->curlvl1, + rb->curlvl2); + if (rb->flags.init && rb->td != NULL) + closedir(rb->td); + rb->td = NULL; + if (dirs_opened) + return -1; + rb->td = opendir(rb->fullpath); + dirs_opened++; + if (rb->td == NULL) { + debug(50, 1) ("storeSfsDirGetNextFile: opendir: %s: %s\n", + rb->fullpath, xstrerror()); + } else { + rb->entry = readdir(rb->td); /* skip . and .. */ + rb->entry = readdir(rb->td); + if (rb->entry == NULL && errno == ENOENT) + debug(20, 1) ("storeSfsDirGetNextFile: directory does not exist!.\n"); + debug(20, 3) ("storeSfsDirGetNextFile: Directory %s\n", rb->fullpath); + } + } + if (rb->td != NULL && (rb->entry = readdir(rb->td)) != NULL) { + rb->in_dir++; + if (sscanf(rb->entry->d_name, "%x", &rb->fn) != 1) { + debug(20, 3) ("storeSfsDirGetNextFile: invalid %s\n", + rb->entry->d_name); + continue; + } + if (!storeSfsFilenoBelongsHere(rb->fn, rb->sd->index, rb->curlvl1, rb->curlvl2)) { + debug(20, 3) ("storeSfsDirGetNextFile: %08X does not belong in %d/%d/%d\n", + rb->fn, rb->sd->index, rb->curlvl1, rb->curlvl2); + continue; + } + used = storeSfsDirMapBitTest(SD, rb->fn); + if (used) { + debug(20, 3) ("storeSfsDirGetNextFile: Locked, continuing with next.\n"); + continue; + } + snprintf(rb->fullfilename, SQUID_MAXPATHLEN, "%s/%s", + rb->fullpath, rb->entry->d_name); + debug(20, 3) ("storeSfsDirGetNextFile: Opening %s\n", rb->fullfilename); + fd = file_open(rb->fullfilename, 
O_RDONLY | O_BINARY); + if (fd < 0) + debug(50, 1) ("storeSfsDirGetNextFile: %s: %s\n", rb->fullfilename, xstrerror()); + else + store_open_disk_fd++; + continue; + } + rb->in_dir = 0; + if (++rb->curlvl2 < sfsinfo->l2) + continue; + rb->curlvl2 = 0; + if (++rb->curlvl1 < sfsinfo->l1) + continue; + rb->curlvl1 = 0; + rb->done = 1; + } + *sfileno = rb->fn; + return fd; +} + +/* Add a new object to the cache with empty memory copy and pointer to disk + * use to rebuild store from disk. */ +static StoreEntry * +storeSfsDirAddDiskRestore(SwapDir * SD, const cache_key * key, + int file_number, + size_t swap_file_sz, + time_t expires, + time_t timestamp, + time_t lastref, + time_t lastmod, + u_num32 refcount, + u_short flags, + int clean) +{ + StoreEntry *e = NULL; + debug(20, 5) ("storeSfsAddDiskRestore: %s, fileno=%08X\n", storeKeyText(key), file_number); + /* if you call this you'd better be sure file_number is not + * already in use! */ + e = new_StoreEntry(STORE_ENTRY_WITHOUT_MEMOBJ, NULL, NULL); + e->store_status = STORE_OK; + storeSetMemStatus(e, NOT_IN_MEMORY); + e->swap_status = SWAPOUT_DONE; + e->swap_filen = file_number; + e->swap_dirn = SD->index; + e->swap_file_sz = swap_file_sz; + e->lock_count = 0; + e->lastref = lastref; + e->timestamp = timestamp; + e->expires = expires; + e->lastmod = lastmod; + e->refcount = refcount; + e->flags = flags; + EBIT_SET(e->flags, ENTRY_CACHABLE); + EBIT_CLR(e->flags, RELEASE_REQUEST); + EBIT_CLR(e->flags, KEY_PRIVATE); + e->ping_status = PING_NONE; + EBIT_CLR(e->flags, ENTRY_VALIDATED); + storeSfsDirMapBitSet(SD, e->swap_filen); + storeHashInsert(e, key); /* do it after we clear KEY_PRIVATE */ + storeSfsDirReplAdd(SD, e); + return e; +} + +CBDATA_TYPE(RebuildState); +static void +storeSfsDirRebuild(SwapDir * sd) +{ + RebuildState *rb; + int clean = 0; + int zero = 0; + FILE *fp; + EVH *func = NULL; + CBDATA_INIT_TYPE(RebuildState); + rb = CBDATA_ALLOC(RebuildState, NULL); + rb->sd = sd; + rb->speed = opt_foreground_rebuild 
? 1 << 30 : 50; + /* + * If the swap.state file exists in the cache_dir, then + * we'll use storeSfsDirRebuildFromSwapLog(), otherwise we'll + * use storeSfsDirRebuildFromDirectory() to open up each file + * and suck in the meta data. + */ + fp = storeSfsDirOpenTmpSwapLog(sd, &clean, &zero); + if (fp == NULL || zero) { + if (fp != NULL) + fclose(fp); + func = storeSfsDirRebuildFromDirectory; + } else { + func = storeSfsDirRebuildFromSwapLog; + rb->log = fp; + rb->flags.clean = (unsigned int) clean; + } + if (!clean) + rb->flags.need_to_validate = 1; + debug(20, 1) ("Rebuilding storage in %s (%s)\n", + sd->path, clean ? "CLEAN" : "DIRTY"); + store_dirs_rebuilding++; + eventAdd("storeRebuild", func, rb, 0.0, 1); +} + +static void +storeSfsDirCloseTmpSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *swaplog_path = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + char *new_path = xstrdup(storeSfsDirSwapLogFile(sd, ".new")); + int fd; + file_close(sfsinfo->swaplog_fd); +#if defined (_SQUID_OS2_) || defined (_SQUID_CYGWIN_) + if (unlink(swaplog_path) < 0) { + debug(50, 0) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("storeSfsDirCloseTmpSwapLog: unlink failed"); + } +#endif + if (xrename(new_path, swaplog_path) < 0) { + fatal("storeSfsDirCloseTmpSwapLog: rename failed"); + } + fd = file_open(swaplog_path, O_WRONLY | O_CREAT | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("storeSfsDirCloseTmpSwapLog: Failed to open swap log."); + } + safe_free(swaplog_path); + safe_free(new_path); + sfsinfo->swaplog_fd = fd; + debug(47, 3) ("Cache Dir #%d log opened on FD %d\n", sd->index, fd); +} + +static FILE * +storeSfsDirOpenTmpSwapLog(SwapDir * sd, int *clean_flag, int *zero_flag) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *swaplog_path = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + char *clean_path = xstrdup(storeSfsDirSwapLogFile(sd, ".last-clean")); + char *new_path = 
xstrdup(storeSfsDirSwapLogFile(sd, ".new")); + struct stat log_sb; + struct stat clean_sb; + FILE *fp; + int fd; + if (stat(swaplog_path, &log_sb) < 0) { + debug(47, 1) ("Cache Dir #%d: No log file\n", sd->index); + safe_free(swaplog_path); + safe_free(clean_path); + safe_free(new_path); + return NULL; + } + *zero_flag = log_sb.st_size == 0 ? 1 : 0; + /* close the existing write-only FD */ + if (sfsinfo->swaplog_fd >= 0) + file_close(sfsinfo->swaplog_fd); + /* open a write-only FD for the new log */ + fd = file_open(new_path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", new_path, xstrerror()); + fatal("storeDirOpenTmpSwapLog: Failed to open swap log."); + } + sfsinfo->swaplog_fd = fd; + /* open a read-only stream of the old log */ + fp = fopen(swaplog_path, "r"); + if (fp == NULL) { + debug(50, 0) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("Failed to open swap log for reading"); + } +#if defined(_SQUID_CYGWIN_) + setmode(fileno(fp), O_BINARY); +#endif + memset(&clean_sb, '\0', sizeof(struct stat)); + if (stat(clean_path, &clean_sb) < 0) + *clean_flag = 0; + else if (clean_sb.st_mtime < log_sb.st_mtime) + *clean_flag = 0; + else + *clean_flag = 1; + safeunlink(clean_path, 1); + safe_free(swaplog_path); + safe_free(clean_path); + safe_free(new_path); + return fp; +} + +struct _clean_state { + char *cur; + char *new; + char *cln; + char *outbuf; + off_t outbuf_offset; + int fd; + RemovalPolicyWalker *walker; +}; + +#define CLEAN_BUF_SZ 16384 +/* + * Begin the process to write clean cache state. For SFS this means + * opening some log files and allocating write buffers. Return 0 if + * we succeed, and assign the 'func' and 'data' return pointers. 
+ */ +static int +storeSfsDirWriteCleanStart(SwapDir * sd) +{ + struct _clean_state *state = xcalloc(1, sizeof(*state)); + struct stat sb; + sd->log.clean.write = NULL; + sd->log.clean.state = NULL; + state->new = xstrdup(storeSfsDirSwapLogFile(sd, ".clean")); + state->fd = file_open(state->new, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + if (state->fd < 0) { + xfree(state->new); + xfree(state); + return -1; + } + state->cur = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + state->cln = xstrdup(storeSfsDirSwapLogFile(sd, ".last-clean")); + state->outbuf = xcalloc(CLEAN_BUF_SZ, 1); + state->outbuf_offset = 0; + state->walker = sd->repl->WalkInit(sd->repl); +#if !(defined(_SQUID_OS2_) || defined (_SQUID_CYGWIN_)) + unlink(state->new); +#endif + unlink(state->cln); + debug(20, 3) ("storeDirWriteCleanLogs: opened %s, FD %d\n", + state->new, state->fd); +#if HAVE_FCHMOD + if (stat(state->cur, &sb) == 0) + fchmod(state->fd, sb.st_mode); +#endif + sd->log.clean.write = storeSfsDirWriteCleanEntry; + sd->log.clean.state = state; + return 0; +} + +/* + * Get the next entry that is a candidate for clean log writing + */ +const StoreEntry * +storeSfsDirCleanLogNextEntry(SwapDir * sd) +{ + const StoreEntry *entry = NULL; + struct _clean_state *state = sd->log.clean.state; + if (state->walker) + entry = state->walker->Next(state->walker); + return entry; +} + +/* + * "write" an entry to the clean log file. 
+ */
+static void
+storeSfsDirWriteCleanEntry(SwapDir * sd, const StoreEntry * e)
+{
+    storeSwapLogData s;
+    static size_t ss = sizeof(storeSwapLogData);
+    struct _clean_state *state = sd->log.clean.state;
+    memset(&s, '\0', ss);
+    s.op = (char) SWAP_LOG_ADD;
+    s.swap_filen = e->swap_filen;
+    s.timestamp = e->timestamp;
+    s.lastref = e->lastref;
+    s.expires = e->expires;
+    s.lastmod = e->lastmod;
+    s.swap_file_sz = e->swap_file_sz;
+    s.refcount = e->refcount;
+    s.flags = e->flags;
+    xmemcpy(&s.key, e->hash.key, MD5_DIGEST_CHARS);
+    xmemcpy(state->outbuf + state->outbuf_offset, &s, ss);
+    state->outbuf_offset += ss;
+    /* buffered write */
+    if (state->outbuf_offset + ss > CLEAN_BUF_SZ) {
+        if (write(state->fd, state->outbuf, state->outbuf_offset) < 0) {
+            debug(50, 0) ("storeDirWriteCleanLogs: %s: write: %s\n",
+                state->new, xstrerror());
+            debug(20, 0) ("storeDirWriteCleanLogs: Current swap logfile not replaced.\n");
+            file_close(state->fd);
+            state->fd = -1;
+            unlink(state->new);
+            safe_free(state);
+            sd->log.clean.state = NULL;
+            sd->log.clean.write = NULL;
+            return;	/* state was freed above; don't touch it again */
+        }
+        state->outbuf_offset = 0;
+    }
+}
+
+static void
+storeSfsDirWriteCleanDone(SwapDir * sd)
+{
+    int fd;
+    struct _clean_state *state = sd->log.clean.state;
+    if (NULL == state)
+        return;
+    if (state->fd < 0)
+        return;
+    state->walker->Done(state->walker);
+    if (write(state->fd, state->outbuf, state->outbuf_offset) < 0) {
+        debug(50, 0) ("storeDirWriteCleanLogs: %s: write: %s\n",
+            state->new, xstrerror());
+        debug(20, 0) ("storeDirWriteCleanLogs: Current swap logfile "
+            "not replaced.\n");
+        file_close(state->fd);
+        state->fd = -1;
+        unlink(state->new);
+    }
+    safe_free(state->outbuf);
+    /*
+     * You can't rename open files on Microsoft "operating systems"
+     * so we have to close before renaming.
+ */ + storeSfsDirCloseSwapLog(sd); + /* save the fd value for a later test */ + fd = state->fd; + /* rename */ + if (state->fd >= 0) { +#if defined(_SQUID_OS2_) || defined (_SQUID_CYGWIN_) + file_close(state->fd); + state->fd = -1; + if (unlink(state->cur) < 0) + debug(50, 0) ("storeDirWriteCleanLogs: unlinkd failed: %s, %s\n", + xstrerror(), state->cur); +#endif + xrename(state->new, state->cur); + } + /* touch a timestamp file if we're not still validating */ + if (store_dirs_rebuilding) + (void) 0; + else if (fd < 0) + (void) 0; + else + file_close(file_open(state->cln, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY)); + /* close */ + safe_free(state->cur); + safe_free(state->new); + safe_free(state->cln); + if (state->fd >= 0) + file_close(state->fd); + state->fd = -1; + safe_free(state); + sd->log.clean.state = NULL; + sd->log.clean.write = NULL; +} + +static void +storeSwapLogDataFree(void *s) +{ + memFree(s, MEM_SWAP_LOG_DATA); +} + +static void +storeSfsDirSwapLog(const SwapDir * sd, const StoreEntry * e, int op) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + storeSwapLogData *s = memAllocate(MEM_SWAP_LOG_DATA); + s->op = (char) op; + s->swap_filen = e->swap_filen; + s->timestamp = e->timestamp; + s->lastref = e->lastref; + s->expires = e->expires; + s->lastmod = e->lastmod; + s->swap_file_sz = e->swap_file_sz; + s->refcount = e->refcount; + s->flags = e->flags; + xmemcpy(s->key, e->hash.key, MD5_DIGEST_CHARS); + file_write(sfsinfo->swaplog_fd, + -1, + s, + sizeof(storeSwapLogData), + NULL, + NULL, + (FREE *) storeSwapLogDataFree); +} + +static void +storeSfsDirNewfs(SwapDir * sd) +{ + int sfsid; + + debug(47, 3) ("Creating swap space in %s\n", sd->path); + + /* Check to see whether we have a sfs store. */ + sfsid = sfs_mount(sd->path); + if (sfsid < 0) { + /* it failed, we can do stuff.. */ + /* + * note - the FS *data* size will be max_size, but sfs metadata + * will make it bigger. Just like normal FSes. 
+         */
+        if (sfs_format(sd->path, (sd->max_size * 1024) / FRAGSIZE) < 0)
+            fatalf("error whilst formatting %s! : (%d) %s\n", sd->path,
+                errno, strerror(errno));
+    } else {
+        /* it succeeded, unmount */
+        debug(47, 3) ("Swap space in %s is already formatted\n", sd->path);
+        sfs_umount(sfsid, _SFS_IO_SYNC);
+    }
+
+
+}
+
+static int
+rev_int_sort(const void *A, const void *B)
+{
+    const int *i1 = A;
+    const int *i2 = B;
+    return *i2 - *i1;
+}
+
+static int
+storeSfsDirClean(int swap_index)
+{
+    DIR *dp = NULL;
+    struct dirent *de = NULL;
+    LOCAL_ARRAY(char, p1, MAXPATHLEN + 1);
+    LOCAL_ARRAY(char, p2, MAXPATHLEN + 1);
+#if USE_TRUNCATE
+    struct stat sb;
+#endif
+    int files[20];
+    int swapfileno;
+    int fn;			/* same as swapfileno, but with dirn bits set */
+    int n = 0;
+    int k = 0;
+    int N0, N1, N2;
+    int D0, D1, D2;
+    SwapDir *SD;
+    sfsinfo_t *sfsinfo;
+    N0 = n_sfs_dirs;
+    D0 = sfs_dir_index[swap_index % N0];
+    SD = &Config.cacheSwap.swapDirs[D0];
+    sfsinfo = (sfsinfo_t *) SD->fsdata;
+    N1 = sfsinfo->l1;
+    D1 = (swap_index / N0) % N1;
+    N2 = sfsinfo->l2;
+    D2 = ((swap_index / N0) / N1) % N2;
+    snprintf(p1, SQUID_MAXPATHLEN, "%s/%02X/%02X",
+        Config.cacheSwap.swapDirs[D0].path, D1, D2);
+    debug(36, 3) ("storeDirClean: Cleaning directory %s\n", p1);
+    dp = opendir(p1);
+    if (dp == NULL) {
+        if (errno == ENOENT) {
+            debug(36, 0) ("storeDirClean: WARNING: Creating %s\n", p1);
+            if (mkdir(p1, 0777) == 0)
+                return 0;
+        }
+        debug(50, 0) ("storeDirClean: %s: %s\n", p1, xstrerror());
+        safeunlink(p1, 1);
+        return 0;
+    }
+    while ((de = readdir(dp)) != NULL && k < 20) {
+        if (sscanf(de->d_name, "%X", &swapfileno) != 1)
+            continue;
+        fn = swapfileno;	/* XXX should remove this cruft !
*/ + if (storeSfsDirValidFileno(SD, fn, 1)) + if (storeSfsDirMapBitTest(SD, fn)) + if (storeSfsFilenoBelongsHere(fn, D0, D1, D2)) + continue; +#if USE_TRUNCATE + if (!stat(de->d_name, &sb)) + if (sb.st_size == 0) + continue; +#endif + files[k++] = swapfileno; + } + closedir(dp); + if (k == 0) + return 0; + qsort(files, k, sizeof(int), rev_int_sort); + if (k > 10) + k = 10; + for (n = 0; n < k; n++) { + debug(36, 3) ("storeDirClean: Cleaning file %08X\n", files[n]); + snprintf(p2, MAXPATHLEN + 1, "%s/%08X", p1, files[n]); +#if USE_TRUNCATE + truncate(p2, 0); +#else + safeunlink(p2, 0); +#endif + statCounter.swap.files_cleaned++; + } + debug(36, 3) ("Cleaned %d unused files from %s\n", k, p1); + return k; +} + +static void +storeSfsDirCleanEvent(void *unused) +{ + static int swap_index = 0; + int i; + int j = 0; + int n = 0; + + /* We don't do anything right now */ + return; + /* + * Assert that there are SFS cache_dirs configured, otherwise + * we should never be called. + */ + assert(n_sfs_dirs); + if (NULL == sfs_dir_index) { + SwapDir *sd; + sfsinfo_t *sfsinfo; + /* + * Initialize the little array that translates SFS cache_dir + * number into the Config.cacheSwap.swapDirs array index. + */ + sfs_dir_index = xcalloc(n_sfs_dirs, sizeof(*sfs_dir_index)); + for (i = 0, n = 0; i < Config.cacheSwap.n_configured; i++) { + sd = &Config.cacheSwap.swapDirs[i]; + if (!storeSfsDirIs(sd)) + continue; + sfs_dir_index[n++] = i; + sfsinfo = (sfsinfo_t *) sd->fsdata; + j += (sfsinfo->l1 * sfsinfo->l2); + } + assert(n == n_sfs_dirs); + /* + * Start the storeSfsDirClean() swap_index with a random + * value. 
j equals the total number of SFS level 2 + * swap directories + */ + swap_index = (int) (squid_random() % j); + } + if (0 == store_dirs_rebuilding) { + n = storeSfsDirClean(swap_index); + swap_index++; + } + eventAdd("storeDirClean", storeSfsDirCleanEvent, NULL, + 15.0 * exp(-0.25 * n), 1); +} + +static int +storeSfsDirIs(SwapDir * sd) +{ + if (strncmp(sd->type, "sfs", 3) == 0) + return 1; + return 0; +} + +/* + * Does swapfile number 'fn' belong in cachedir #F0, + * level1 dir #F1, level2 dir #F2? + */ +static int +storeSfsFilenoBelongsHere(int fn, int F0, int F1, int F2) +{ + int D1, D2; + int L1, L2; + int filn = fn; + sfsinfo_t *sfsinfo; + assert(F0 < Config.cacheSwap.n_configured); + sfsinfo = (sfsinfo_t *) Config.cacheSwap.swapDirs[F0].fsdata; + L1 = sfsinfo->l1; + L2 = sfsinfo->l2; + D1 = ((filn / L2) / L2) % L1; + if (F1 != D1) + return 0; + D2 = (filn / L2) % L2; + if (F2 != D2) + return 0; + return 1; +} + +int +storeSfsDirValidFileno(SwapDir * SD, sfileno filn, int flag) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + if (filn < 0) + return 0; + /* + * If flag is set it means out-of-range file number should + * be considered invalid. + */ + if (flag) + if (filn > sfsinfo->map->max_n_files) + return 0; + return 1; +} + +void +storeSfsDirMaintain(SwapDir * SD) +{ + StoreEntry *e = NULL; + int removed = 0; + int max_scan; + int max_remove; + double f; + RemovalPurgeWalker *walker; + /* We can't delete objects while rebuilding swap */ + if (store_dirs_rebuilding) { + return; + } else { + f = (double) (SD->cur_size - SD->low_size) / (SD->max_size - SD->low_size); + f = f < 0.0 ? 0.0 : f > 1.0 ? 1.0 : f; + max_scan = (int) (f * 400.0 + 100.0); + max_remove = (int) (f * 70.0 + 10.0); + /* + * This is kinda cheap, but so we need this priority hack? 
+ */ + } + debug(20, 3) ("storeMaintainSwapSpace: f=%f, max_scan=%d, max_remove=%d\n", f, max_scan, max_remove); + walker = SD->repl->PurgeInit(SD->repl, max_scan); + while (1) { + if (SD->cur_size < SD->low_size) + break; + if (removed >= max_remove) + break; + e = walker->Next(walker); + if (!e) + break; /* no more objects */ + removed++; + storeRelease(e); + } + walker->Done(walker); + debug(20, (removed ? 2 : 3)) ("storeSfsDirMaintain: %s removed %d/%d f=%.03f max_scan=%d\n", + SD->path, removed, max_remove, f, max_scan); +} + +/* + * storeSfsDirCheckObj + * + * This routine is called by storeDirSelectSwapDir to see if the given + * object is able to be stored on this filesystem. SFS filesystems will + * happily store anything as long as the LRU time isn't too small. + */ +int +storeSfsDirCheckObj(SwapDir * SD, const StoreEntry * e) +{ +#if OLD_UNUSED_CODE + if (storeSfsDirExpiredReferenceAge(SD) < 300) { + debug(20, 3) ("storeSfsDirCheckObj: NO: LRU Age = %d\n", + storeSfsDirExpiredReferenceAge(SD)); + /* store_check_cachable_hist.no.lru_age_too_low++; */ + return -1; + } +#endif + /* Return 999 (99.9%) constant load */ + return 0; +} + +/* + * storeSfsDirRefObj + * + * This routine is called whenever an object is referenced, so we can + * maintain replacement information within the storage fs. + */ +void +storeSfsDirRefObj(SwapDir * SD, StoreEntry * e) +{ + debug(1, 3) ("storeSfsDirRefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); + if (SD->repl->Referenced) + SD->repl->Referenced(SD->repl, e, &e->repl); +} + +/* + * storeSfsDirUnrefObj + * This routine is called whenever the last reference to an object is + * removed, to maintain replacement information within the storage fs. 
+ */ +void +storeSfsDirUnrefObj(SwapDir * SD, StoreEntry * e) +{ + debug(1, 3) ("storeSfsDirUnrefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); + if (SD->repl->Dereferenced) + SD->repl->Dereferenced(SD->repl, e, &e->repl); +} + +/* + * storeSfsSync + * + * Sync the filesystem + */ +void +storeSfsSync(SwapDir *SD) +{ + /* Sync the FS */ + /* Handle any pending callbacks */ + while (storeSfsDirCallback(SD) > 0); +} + +/* + * storeSfsDirUnlinkFile + * + * This routine unlinks a file and pulls it out of the bitmap. + * It used to be in storeSfsUnlink(), however an interface change + * forced this bit of code here. Eeek. + */ +void +storeSfsDirUnlinkFile(SwapDir * SD, sfileno f) +{ + sfsinfo_t *sfsinfo = SD->fsdata; + int retval; + + debug(79, 3) ("storeSfsDirUnlinkFile: unlinking fileno %08X\n", f); + /* storeSfsDirMapBitReset(SD, f); */ + retval = sfs_unlink(sfsinfo->sfsid, (sfsblock_t)f, _SFS_IO_ASYNC, NULL); + if (retval < 0) { + debug(79, 1) ("storeSfsDirUnlinkFile: Can't unlink %d/%08X!\n", + SD->index, f); + } +} + +/* + * Add and remove the given StoreEntry from the replacement policy in + * use. + */ + +void +storeSfsDirReplAdd(SwapDir * SD, StoreEntry * e) +{ + debug(20, 4) ("storeSfsDirReplAdd: added node %p to dir %d\n", e, + SD->index); + SD->repl->Add(SD->repl, e, &e->repl); +} + + +void +storeSfsDirReplRemove(StoreEntry * e) +{ + SwapDir *SD = INDEXSD(e->swap_dirn); + debug(20, 4) ("storeSfsDirReplRemove: remove node %p from dir %d\n", e, + SD->index); + SD->repl->Remove(SD->repl, e, &e->repl); +} + +/* + * storeSfsDirCallback + * + * Handle pending IO operations that have completed + */ +int +storeSfsDirCallback(SwapDir *SD) +{ + int retval; + sfsinfo_t *sfsinfo = SD->fsdata; + sfs_requestor *req; + storeIOState *sio; + enum sfs_request_type rtype; + int ops = 0; + + /* XXX using sfs_requestor in here might be considered layer-breaking! 
*/ + while((req = sfs_getcompleted(sfsinfo->sfsid)) != NULL) { + /* Find the sio in question */ + sio = req->dataptr; + rtype = req->request_type; + retval = req->ret; + + /* Remove the requestor from the list */ + _sfs_remove_request(req); + + if (cbdataValid(sio)) { + /* Callback time */ + switch (rtype) { + case _SFS_OP_READ: + storeSfsReadDone(sio, retval); + break; + + case _SFS_OP_WRITE: + storeSfsWriteDone(sio, retval); + break; + + case _SFS_OP_CLOSE: + storeSfsCloseDone(sio, retval); + break; + + case _SFS_OP_OPEN_READ: + case _SFS_OP_OPEN_WRITE: + case _SFS_OP_UNLINK: + case _SFS_OP_SYNC: + case _SFS_OP_UMOUNT: + break; + + default: + debug(20, 1) ("storeSfsDirCallback: unknown op %d\n", + req->request_type); + } + /* Tag that we've done an IO */ + ops = 1; + cbdataUnlock(sio); + } + } + + return ops; +} + + + + +/* ========== LOCAL FUNCTIONS ABOVE, GLOBAL FUNCTIONS BELOW ========== */ + +void +storeSfsDirStats(SwapDir * SD, StoreEntry * sentry) +{ + sfsinfo_t *sfsinfo = SD->fsdata; + int totl_kb = 0; + int free_kb = 0; + int totl_in = 0; + int free_in = 0; + int x; + storeAppendPrintf(sentry, "First level subdirectories: %d\n", sfsinfo->l1); + storeAppendPrintf(sentry, "Second level subdirectories: %d\n", sfsinfo->l2); + storeAppendPrintf(sentry, "Maximum Size: %d KB\n", SD->max_size); + storeAppendPrintf(sentry, "Current Size: %d KB\n", SD->cur_size); + storeAppendPrintf(sentry, "Percent Used: %0.2f%%\n", + 100.0 * SD->cur_size / SD->max_size); + storeAppendPrintf(sentry, "Filemap bits in use: %d of %d (%d%%)\n", + sfsinfo->map->n_files_in_map, sfsinfo->map->max_n_files, + percent(sfsinfo->map->n_files_in_map, sfsinfo->map->max_n_files)); + x = storeDirGetUFSStats(SD->path, &totl_kb, &free_kb, &totl_in, &free_in); + if (0 == x) { + storeAppendPrintf(sentry, "Filesystem Space in use: %d/%d KB (%d%%)\n", + totl_kb - free_kb, + totl_kb, + percent(totl_kb - free_kb, totl_kb)); + storeAppendPrintf(sentry, "Filesystem Inodes in use: %d/%d (%d%%)\n", + totl_in 
- free_in, + totl_in, + percent(totl_in - free_in, totl_in)); + } + storeAppendPrintf(sentry, "Flags:"); + if (SD->flags.selected) + storeAppendPrintf(sentry, " SELECTED"); + if (SD->flags.read_only) + storeAppendPrintf(sentry, " READ-ONLY"); + storeAppendPrintf(sentry, "\n"); +#if OLD_UNUSED_CODE +#if !HEAP_REPLACEMENT + storeAppendPrintf(sentry, "LRU Expiration Age: %6.2f days\n", + (double) storeSfsDirExpiredReferenceAge(SD) / 86400.0); +#else + storeAppendPrintf(sentry, "Storage Replacement Threshold:\t%f\n", + heap_peepminkey(sd.repl.heap.heap)); +#endif +#endif /* OLD_UNUSED_CODE */ +} + +/* + * storeSfsDirReconfigure + * + * This routine is called when the given swapdir needs reconfiguring + */ +void +storeSfsDirReconfigure(SwapDir * sd, int index, char *path) +{ + char *token; + int i; + int size; + int l1; + int l2; + unsigned int read_only = 0; + + i = GetInteger(); + size = i << 10; /* Mbytes to kbytes */ + if (size <= 0) + fatal("storeSfsDirReconfigure: invalid size value"); + i = GetInteger(); + l1 = i; + if (l1 <= 0) + fatal("storeSfsDirReconfigure: invalid level 1 directories value"); + i = GetInteger(); + l2 = i; + if (l2 <= 0) + fatal("storeSfsDirReconfigure: invalid level 2 directories value"); + if ((token = strtok(NULL, w_space))) + if (!strcasecmp(token, "read-only")) + read_only = 1; + + /* just reconfigure it */ + if (size == sd->max_size) + debug(3, 1) ("Cache dir '%s' size remains unchanged at %d KB\n", + path, size); + else + debug(3, 1) ("Cache dir '%s' size changed to %d KB\n", + path, size); + sd->max_size = size; + if (sd->flags.read_only != read_only) + debug(3, 1) ("Cache dir '%s' now %s\n", + path, read_only ? 
"Read-Only" : "Read-Write"); + sd->flags.read_only = read_only; + return; +} + +void +storeSfsDirDump(StoreEntry * entry, const char *name, SwapDir * s) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) s->fsdata; + storeAppendPrintf(entry, "%s %s %s %d %d %d\n", + name, + "sfs", + s->path, + s->max_size >> 10, + sfsinfo->l1, + sfsinfo->l2); +} + +/* + * Only "free" the filesystem specific stuff here + */ +static void +storeSfsDirFree(SwapDir * s) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) s->fsdata; + if (sfsinfo->swaplog_fd > -1) { + file_close(sfsinfo->swaplog_fd); + sfsinfo->swaplog_fd = -1; + } + + /* Sync the FS and handle pending callbacks */ + storeSfsSync(s); + + /* Unmount the FS */ + sfs_umount(sfsinfo->sfsid, _SFS_IO_SYNC); + sfsinfo->sfsid = -1; + + filemapFreeMemory(sfsinfo->map); + xfree(sfsinfo); + s->fsdata = NULL; /* Will aid debugging... */ +} + +char * +storeSfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath) +{ + LOCAL_ARRAY(char, fullfilename, SQUID_MAXPATHLEN); + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int L1 = sfsinfo->l1; + int L2 = sfsinfo->l2; + if (!fullpath) + fullpath = fullfilename; + fullpath[0] = '\0'; + snprintf(fullpath, SQUID_MAXPATHLEN, "%s/%02X/%02X/%08X", + SD->path, + ((filn / L2) / L2) % L1, + (filn / L2) % L2, + filn); + return fullpath; +} + +/* + * storeSfsCleanupDoubleCheck + * + * This is called by storeCleanup() if -S was given on the command line. 
+ */ +static int +storeSfsCleanupDoubleCheck(SwapDir * sd, StoreEntry * e) +{ + struct stat sb; + if (stat(storeSfsDirFullPath(sd, e->swap_filen, NULL), &sb) < 0) { + debug(20, 0) ("storeSfsCleanupDoubleCheck: MISSING SWAP FILE\n"); + debug(20, 0) ("storeSfsCleanupDoubleCheck: FILENO %08X\n", e->swap_filen); + debug(20, 0) ("storeSfsCleanupDoubleCheck: PATH %s\n", + storeSfsDirFullPath(sd, e->swap_filen, NULL)); + storeEntryDump(e, 0); + return -1; + } + if (e->swap_file_sz != sb.st_size) { + debug(20, 0) ("storeSfsCleanupDoubleCheck: SIZE MISMATCH\n"); + debug(20, 0) ("storeSfsCleanupDoubleCheck: FILENO %08X\n", e->swap_filen); + debug(20, 0) ("storeSfsCleanupDoubleCheck: PATH %s\n", + storeSfsDirFullPath(sd, e->swap_filen, NULL)); + debug(20, 0) ("storeSfsCleanupDoubleCheck: ENTRY SIZE: %d, FILE SIZE: %d\n", + e->swap_file_sz, (int) sb.st_size); + storeEntryDump(e, 0); + return -1; + } + return 0; +} + +/* + * storeSfsDirParse + * + * Called when a *new* fs is being setup. + */ +void +storeSfsDirParse(SwapDir * sd, int index, char *path) +{ + char *token; + int i; + int size; + int l1; + int l2; + unsigned int read_only = 0; + sfsinfo_t *sfsinfo; + + i = GetInteger(); + size = i << 10; /* Mbytes to kbytes */ + if (size <= 0) + fatal("storeSfsDirParse: invalid size value"); + i = GetInteger(); + l1 = i; + if (l1 <= 0) + fatal("storeSfsDirParse: invalid level 1 directories value"); + i = GetInteger(); + l2 = i; + if (l2 <= 0) + fatal("storeSfsDirParse: invalid level 2 directories value"); + if ((token = strtok(NULL, w_space))) + if (!strcasecmp(token, "read-only")) + read_only = 1; + + sfsinfo = xmalloc(sizeof(sfsinfo_t)); + if (sfsinfo == NULL) + fatal("storeSfsDirParse: couldn't xmalloc() sfsinfo_t!\n"); + + sd->index = index; + sd->path = xstrdup(path); + sd->max_size = size; + sd->fsdata = sfsinfo; + sfsinfo->l1 = l1; + sfsinfo->l2 = l2; + sfsinfo->swaplog_fd = -1; + sfsinfo->map = NULL; /* Debugging purposes */ + sfsinfo->suggest = 0; + sd->flags.read_only = 
read_only; + sd->init = storeSfsDirInit; + sd->newfs = storeSfsDirNewfs; + sd->dump = storeSfsDirDump; + sd->freefs = storeSfsDirFree; + sd->dblcheck = storeSfsCleanupDoubleCheck; + sd->statfs = storeSfsDirStats; + sd->maintainfs = storeSfsDirMaintain; + sd->checkobj = storeSfsDirCheckObj; + sd->refobj = storeSfsDirRefObj; + sd->unrefobj = storeSfsDirUnrefObj; + sd->callback = storeSfsDirCallback; + sd->sync = storeSfsSync; + sd->obj.create = storeSfsCreate; + sd->obj.open = storeSfsOpen; + sd->obj.close = storeSfsClose; + sd->obj.read = storeSfsRead; + sd->obj.write = storeSfsWrite; + sd->obj.unlink = storeSfsUnlink; + sd->log.open = storeSfsDirOpenSwapLog; + sd->log.close = storeSfsDirCloseSwapLog; + sd->log.write = storeSfsDirSwapLog; + sd->log.clean.start = storeSfsDirWriteCleanStart; + sd->log.clean.nextentry = storeSfsDirCleanLogNextEntry; + sd->log.clean.done = storeSfsDirWriteCleanDone; + + /* Initialise replacement policy stuff */ + sd->repl = createRemovalPolicy(Config.replPolicy); +} + +/* + * Initial setup / end destruction + */ +void +storeSfsDirDone(void) +{ + memPoolDestroy(sfs_state_pool); + sfs_initialised = 0; +} + +void +storeFsSetup_sfs(storefs_entry_t * storefs) +{ + assert(!sfs_initialised); + storefs->parsefunc = storeSfsDirParse; + storefs->reconfigurefunc = storeSfsDirReconfigure; + storefs->donefunc = storeSfsDirDone; + sfs_state_pool = memPoolCreate("SFS IO State data", sizeof(sfsstate_t)); + sfs_initialised = 1; +} Index: squid/src/fs/sfs/store_io_sfs.c diff -u /dev/null squid/src/fs/sfs/store_io_sfs.c:1.1.2.8 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/sfs/store_io_sfs.c Tue Feb 6 07:43:37 2001 @@ -0,0 +1,330 @@ + +/* + * $Id$ + * + * DEBUG: section 79 Storage Manager SFS Interface + * AUTHOR: Duane Wessels + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see 
the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ *
+ */
+
+#include "squid.h"
+#include "store_sfs.h"
+
+
+static void storeSfsIOCallback(storeIOState * sio, int errflag);
+static CBDUNL storeSfsIOFreeEntry;
+
+/* === PUBLIC =========================================================== */
+
+storeIOState *
+storeSfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback,
+    STIOCB * callback, void *callback_data)
+{
+    sfsinfo_t *sfsinfo = SD->fsdata;
+
+    sfileno f = e->swap_filen;
+    storeIOState *sio;
+    sfsfd_t fd;
+
+    debug(79, 3) ("storeSfsOpen: fileno %08X\n", f);
+    sio = NULL;
+
+    sio = CBDATA_ALLOC(storeIOState, storeSfsIOFreeEntry);
+    sio->fsstate = memPoolAlloc(sfs_state_pool);
+
+    fd = sfs_open(sfsinfo->sfsid, f, O_RDONLY, 0, _SFS_IO_SYNC, sio);
+
+    if (fd < 0) {
+        debug(79, 3) ("storeSfsOpen: got failure (%d)\n", errno);
+        cbdataFree(sio);	/* release the half-constructed sio */
+        return NULL;
+    }
+
+    debug(79, 3) ("storeSfsOpen: opened FD %d\n", fd);
+
+    sio->swap_filen = f;
+    sio->swap_dirn = SD->index;
+    sio->mode = O_RDONLY;
+    sio->callback = callback;
+    sio->callback_data = callback_data;
+    cbdataLock(callback_data);
+    sio->e = e;
+    ((sfsstate_t *) (sio->fsstate))->fd = fd;
+    ((sfsstate_t *) (sio->fsstate))->flags.writing = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.reading = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.close_request = 0;
+    ((sfsstate_t *) (sio->fsstate))->swap_filen = -1;
+
+    /* We should update the heap/dlink position here !
*/
+    return sio;
+}
+
+storeIOState *
+storeSfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, STIOCB * callback, void *callback_data)
+{
+    storeIOState *sio;
+    sfsfd_t fd;
+    sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata;
+    sfileno filn;
+    sdirno dirn;
+
+    sio = NULL;
+
+    /* Allocate a number */
+    dirn = SD->index;
+    filn = storeSfsDirMapBitAllocate(SD);
+    sfsinfo->suggest = filn + 1;
+
+    debug(79, 3) ("storeSfsCreate: fileno %08X\n", filn);
+
+    sio = CBDATA_ALLOC(storeIOState, storeSfsIOFreeEntry);
+    sio->fsstate = memPoolAlloc(sfs_state_pool);
+    fd = sfs_open(sfsinfo->sfsid, filn, O_CREAT | O_RDWR, 0, _SFS_IO_SYNC,
+        sio);
+
+    if (fd < 0) {
+        debug(79, 3) ("storeSfsCreate: got failure (%d)\n", errno);
+        storeSfsDirMapBitReset(SD, filn);	/* give back the fileno we allocated */
+        cbdataFree(sio);	/* release the half-constructed sio */
+        return NULL;
+    }
+
+    debug(79, 3) ("storeSfsCreate: opened FD %d\n", fd);
+
+    sio->swap_filen = -1;	/* Defer the actual allocation */
+    sio->swap_dirn = dirn;
+    sio->mode = O_CREAT | O_RDWR;
+    sio->callback = callback;
+    sio->callback_data = callback_data;
+    sio->file_callback = file_callback;
+    cbdataLock(callback_data);
+    sio->e = (StoreEntry *) e;
+    ((sfsstate_t *) (sio->fsstate))->fd = fd;
+    ((sfsstate_t *) (sio->fsstate))->flags.writing = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.reading = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.close_request = 0;
+    ((sfsstate_t *) (sio->fsstate))->swap_filen = filn;
+
+    /* now insert into the replacement policy */
+    storeSfsDirReplAdd(SD, e);
+    return sio;
+}
+
+void
+storeSfsClose(SwapDir * SD, storeIOState * sio)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+
+    debug(79, 3) ("storeSfsClose: dirno %d, fileno %08X, FD %d\n",
+        sio->swap_dirn, sio->swap_filen, sfsstate->fd);
+    /* storeSfsIOCallback calls sfs_close as part of its normal operation
+     * - who said this interface was untidy?
:( */
+    storeSfsIOCallback(sio, 0);
+}
+
+void
+storeSfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, STRCB * callback, void *callback_data)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+    int retval;
+
+    assert(sio->read.callback == NULL);
+    assert(sio->read.callback_data == NULL);
+    sio->read.callback = callback;
+    sio->read.callback_data = callback_data;
+    cbdataLock(callback_data);
+    debug(79, 3) ("storeSfsRead: dirno %d, fileno %08X, FD %d\n",
+        sio->swap_dirn, sio->swap_filen, sfsstate->fd);
+    sio->offset = offset;
+    sfsstate->flags.reading = 1;
+    assert(sfsstate->read_buf == NULL);
+    sfsstate->read_buf = buf;
+
+    if (offset > -1) {
+        debug(79, 3) ("storeSfsRead: seeking to %d\n", offset);
+        retval = sfs_seek(sfsstate->fd, offset, _SFS_IO_SYNC, NULL);
+        if (retval < 0) {
+            debug(79, 2) ("storeSfsRead: sfs_seek on %d failed!\n",
+                sfsstate->fd);
+            storeSfsIOCallback(sio, DISK_ERROR);
+        }
+    }
+    retval = sfs_read(sfsstate->fd, buf, size, _SFS_IO_ASYNC, sio);
+    if (retval < 0) {
+        debug(79, 2) ("storeSfsRead: sfs_read on %d failed!\n", sfsstate->fd);
+        storeSfsIOCallback(sio, DISK_ERROR);
+    }
+}
+
+void
+storeSfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, FREE * free_func)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+    int retval;
+
+    debug(79, 3) ("storeSfsWrite: dirn %d, fileno %08X, FD %d\n", sio->swap_dirn, sio->swap_filen, sfsstate->fd);
+    sfsstate->flags.writing = 1;
+
+    if (offset > -1) {
+        debug(79, 3) ("storeSfsWrite: seeking to %d\n", offset);
+        retval = sfs_seek(sfsstate->fd, offset, _SFS_IO_SYNC, NULL);
+        if (retval < 0) {
+            debug(79, 2) ("storeSfsWrite: sfs_seek on %d failed!\n",
+                sfsstate->fd);
+            storeSfsIOCallback(sio, DISK_ERROR);
+        }
+    }
+    retval = sfs_write(sfsstate->fd, buf, size, _SFS_IO_ASYNC, sio);
+    if (retval < 0) {
+        debug(79, 2) ("storeSfsWrite: sfs_write on %d failed!\n", sfsstate->fd);
+        storeSfsIOCallback(sio, DISK_ERROR);
+    }
+}
+
+void
+storeSfsUnlink(SwapDir * SD, StoreEntry * e) +{ + debug(79, 3) ("storeSfsUnlink: fileno %08X\n", e->swap_filen); + storeSfsDirReplRemove(e); + storeSfsDirMapBitReset(SD, e->swap_filen); + storeSfsDirUnlinkFile(SD, e->swap_filen); +} + + +void +storeSfsReadDone(storeIOState *sio, int retval) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + STRCB *callback = sio->read.callback; + void *their_data = sio->read.callback_data; + ssize_t rlen; + char *buf = sfsstate->read_buf; + + debug(79, 3) ("storeSfsReadDone: dirno %d, fileno %08X, FD %d, len %d\n", + sio->swap_dirn, sio->swap_filen, sfsstate->fd, retval); + sfsstate->flags.reading = 0; + if (retval < 0) { + debug(79, 3) ("storeSfsReadDone: got failure\n"); + rlen = -1; + } else { + rlen = (ssize_t) retval; + sio->offset += retval; + } + assert(callback); + assert(their_data); + sio->read.callback = NULL; + sio->read.callback_data = NULL; + sfsstate->read_buf = NULL; + + if (cbdataValid(their_data)) + callback(their_data, buf, (size_t) rlen); + cbdataUnlock(their_data); +} + +void +storeSfsWriteDone(storeIOState *sio, int retval) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + debug(79, 3) ("storeSfsWriteDone: dirno %d, fileno %08X, FD %d, len %d\n", + sio->swap_dirn, sio->swap_filen, sfsstate->fd, retval); + sfsstate->flags.writing = 0; + if (retval < 0) { + debug(79, 0) ("storeSfsWriteDone: got failure\n"); + storeSfsIOCallback(sio, DISK_ERROR); + return; + } + sio->offset += retval; +} + +/* + * storeSfsCloseDone - called when we complete the CLOSE op + * Note that if we get here and the sio is invalid we don't + * call the file_callback to notify the upper layers of the + * change in swap filenumber. This has the side effect that + * if we are called due to a sfs_close() done because of an + * error during swapin/out, we don't notify the layers of a + * change in swap filenumber, which is ok. 
+ * :-)
+ * -- adrian, typing justified text again
+ */
+void
+storeSfsCloseDone(storeIOState *sio, int retval)
+{
+    sfsstate_t *sfsstate = sio->fsstate;
+    int errflag;
+
+    debug(79, 3) ("storeSfsCloseDone: dirno %d, fileno %08X\n",
+	sio->swap_dirn, sfsstate->swap_filen);
+
+    if (retval == 0)
+	errflag = DISK_OK;
+    else
+	errflag = DISK_ERROR;
+
+    /* Call back the filen notify */
+    if ((retval == 0) && sio->file_callback && cbdataValid(sio) &&
+	cbdataValid(sio->callback_data)) {
+	sio->swap_filen = sfsstate->swap_filen;
+	sio->file_callback(sio->callback_data, 0, sio);
+    }
+
+    if (cbdataValid(sio->callback_data))
+	sio->callback(sio->callback_data, errflag, sio);
+    cbdataUnlock(sio->callback_data);
+    sio->callback_data = NULL;
+    sio->callback = NULL;
+    cbdataFree(sio);
+}
+
+
+/* === STATIC =========================================================== */
+
+static void
+storeSfsIOCallback(storeIOState * sio, int errflag)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+    int retval;
+
+    debug(79, 3) ("storeSfsIOCallback: errflag=%d\n", errflag);
+    if (sfsstate->fd > -1) {
+	retval = sfs_close(sfsstate->fd, _SFS_IO_ASYNC, sio);
+	if (retval < 0) {
+	    debug(79, 1) ("storeSfsIOCallback: Can't close %d/%08X!\n",
+		sio->swap_dirn, sfsstate->fd);
+	}
+    }
+
+    /* The rest of the shutdown will get run in storeSfsCloseDone() */
+}
+
+
+/*
+ * Clean up any references from the SIO before it gets released.
+ */ +static void +storeSfsIOFreeEntry(void *sio) +{ + memPoolFree(sfs_state_pool, ((storeIOState *) sio)->fsstate); +} Index: squid/src/fs/sfs/store_sfs.h diff -u /dev/null squid/src/fs/sfs/store_sfs.h:1.1.2.5 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/sfs/store_sfs.h Tue Feb 6 07:43:37 2001 @@ -0,0 +1,62 @@ +/* + * store_sfs.h + * + * Internal declarations for the sfs routines + */ + +#ifndef __STORE_SFS_H__ +#define __STORE_SFS_H__ + +#include "sfs_defines.h" +#include "sfs_lib.h" + +struct _sfsinfo_t { + int swaplog_fd; + int l1; + int l2; + fileMap *map; + int suggest; + sfsid_t sfsid; /* The SFS mount id .. */ +}; + +struct _sfsstate_t { + sfsfd_t fd; + char *read_buf; + struct { + unsigned int close_request:1; + unsigned int reading:1; + unsigned int writing:1; + } flags; + int swap_filen; +}; + +typedef struct _sfsinfo_t sfsinfo_t; +typedef struct _sfsstate_t sfsstate_t; + +/* The sfs_state memory pool */ +extern MemPool *sfs_state_pool; + +/* + * store dir stuff + */ +extern void storeSfsDirMapBitReset(SwapDir *, sfileno); +extern int storeSfsDirMapBitAllocate(SwapDir *); +extern char *storeSfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath); +extern void storeSfsDirUnlinkFile(SwapDir *, sfileno); +extern void storeSfsDirReplAdd(SwapDir * SD, StoreEntry *); +extern void storeSfsDirReplRemove(StoreEntry *); +extern int sfs_openNextInode(sfsid_t sfsid, sfsblock_t *cur); + +/* + * Store IO stuff + */ +extern STOBJCREATE storeSfsCreate; +extern STOBJOPEN storeSfsOpen; +extern STOBJCLOSE storeSfsClose; +extern STOBJREAD storeSfsRead; +extern STOBJWRITE storeSfsWrite; +extern STOBJUNLINK storeSfsUnlink; +extern void storeSfsReadDone(storeIOState *, int); +extern void storeSfsWriteDone(storeIOState *, int); +extern void storeSfsCloseDone(storeIOState *, int); +#endif Index: squid/src/fs/ufs/Makefile.in diff -u squid/src/fs/ufs/Makefile.in:1.2 squid/src/fs/ufs/Makefile.in:1.2.34.1 --- squid/src/fs/ufs/Makefile.in:1.2 Sat Oct 21 09:44:46 2000 
+++ squid/src/fs/ufs/Makefile.in	Sat Apr 14 02:46:36 2001
@@ -1,10 +1,10 @@
 #
-#  Makefile for the UFS storage driver for the Squid Object Cache server
+#  Makefile for the AUFS storage driver for the Squid Object Cache server
 #
 #  $Id$
 #
 
-FS		= ufs
+FS		= aufs
 
 top_srcdir	= @top_srcdir@
 VPATH		= @srcdir@
@@ -22,11 +22,15 @@
 
 OUT		= ../$(FS).a
 
 OBJS		= \
+		aiops.o \
+		async_io.o \
 		store_dir_ufs.o \
-		store_io_ufs.o
+		store_io_ufs.o \
+		fs_aufs.o \
+		fs_ufs.o
 
-all: $(OUT)
+all: $(OUT)
 
 $(OUT): $(OBJS)
 	@rm -f ../stamp
@@ -34,6 +38,7 @@
 	$(RANLIB) $(OUT)
 
 $(OBJS): $(top_srcdir)/include/version.h ../../../include/autoconf.h
+$(OBJS): fs_structs.h
 
 .c.o:
 	@rm -f ../stamp
Index: squid/src/fs/ufs/README
diff -u /dev/null squid/src/fs/ufs/README:1.1.2.1
--- /dev/null	Tue Sep 28 18:35:35 2004
+++ squid/src/fs/ufs/README	Sat Apr 14 02:46:36 2001
@@ -0,0 +1,23 @@
+Ok, quick run-down of contents:
+
+store_io_ufs.c - this holds the store* interface functions, as per the
+programming guide. These are shared functions between aio and io (old
+aufs and ufs) - they feed into the new io layer.
+
+fs_ufs.c - this holds the basic ufs functions - the buildRequest and submitRequest
+functions, essentially the new interface. These functions are leaned on heavily
+by aufs/aio; they represent the bulk of the shared code.
+
+fs_aufs.c - this is the async covering over ufs; it handles the ctrlp structure
+and feeds requests into the queueing that aio requires. It should be a fairly
+small shim over fs_ufs, and parts of it may vanish as time goes on.
+
+store_dir_ufs.c - the old store_dir_ufs/store_dir_aufs file, as yet untranslated
+in terms of function/structure names. This again is pretty much all shared
+code - see http://www.squid-cache.org/mail-archive/squid-dev/200001/0043.html
+for more details ;)
+
+fs_structs.h - structure definitions.
+
+aiops.c and async_io.c - from old aufs, untranslated as yet, but will make up
+the body of the queueing code for aio.
Index: squid/src/fs/ufs/aiops.c diff -u /dev/null squid/src/fs/ufs/aiops.c:1.1.2.1 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/aiops.c Sat Apr 14 02:46:36 2001 @@ -0,0 +1,904 @@ +/* + * $Id$ + * + * DEBUG: section 43 AIOPS + * AUTHOR: Stewart Forster + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ *
+ */
+
+#include "squid.h"
+#include "store_asyncufs.h"
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <errno.h>
+#include <dirent.h>
+#include <signal.h>
+#if HAVE_SCHED_H
+#include <sched.h>
+#endif
+
+#define RIDICULOUS_LENGTH	4096
+
+enum _aio_thread_status {
+    _THREAD_STARTING = 0,
+    _THREAD_WAITING,
+    _THREAD_BUSY,
+    _THREAD_FAILED,
+    _THREAD_DONE
+};
+
+enum _aio_request_type {
+    _AIO_OP_NONE = 0,
+    _AIO_OP_OPEN,
+    _AIO_OP_READ,
+    _AIO_OP_WRITE,
+    _AIO_OP_CLOSE,
+    _AIO_OP_UNLINK,
+    _AIO_OP_TRUNCATE,
+    _AIO_OP_OPENDIR,
+    _AIO_OP_STAT
+};
+
+typedef struct aio_request_t {
+    struct aio_request_t *next;
+    enum _aio_request_type request_type;
+    int cancelled;
+    char *path;
+    int oflag;
+    mode_t mode;
+    int fd;
+    char *bufferp;
+    char *tmpbufp;
+    int buflen;
+    off_t offset;
+    int whence;
+    int ret;
+    int err;
+    struct stat *tmpstatp;
+    struct stat *statp;
+    aio_result_t *resultp;
+} aio_request_t;
+
+typedef struct aio_request_queue_t {
+    pthread_mutex_t mutex;
+    pthread_cond_t cond;
+    aio_request_t *volatile head;
+    aio_request_t *volatile *volatile tailp;
+    unsigned long requests;
+    unsigned long blocked;	/* main failed to lock the queue */
+} aio_request_queue_t;
+
+typedef struct aio_thread_t aio_thread_t;
+struct aio_thread_t {
+    aio_thread_t *next;
+    pthread_t thread;
+    enum _aio_thread_status status;
+    struct aio_request_t *current_req;
+    unsigned long requests;
+};
+
+int aio_cancel(aio_result_t *);
+int aio_open(const char *, int, mode_t, aio_result_t *);
+int aio_read(int, char *, int, off_t, int, aio_result_t *);
+int aio_write(int, char *, int, off_t, int, aio_result_t *);
+int aio_close(int, aio_result_t *);
+int aio_unlink(const char *, aio_result_t *);
+int aio_truncate(const char *, off_t length, aio_result_t *);
+int aio_opendir(const char *, aio_result_t *);
+aio_result_t *aio_poll_done();
+int aio_sync(void);
+
+static void aio_init(void);
+static void aio_queue_request(aio_request_t *);
+static void aio_cleanup_request(aio_request_t *);
+static void
*aio_thread_loop(void *); +static void aio_do_open(aio_request_t *); +static void aio_do_read(aio_request_t *); +static void aio_do_write(aio_request_t *); +static void aio_do_close(aio_request_t *); +static void aio_do_stat(aio_request_t *); +static void aio_do_unlink(aio_request_t *); +static void aio_do_truncate(aio_request_t *); +#if AIO_OPENDIR +static void *aio_do_opendir(aio_request_t *); +#endif +static void aio_debug(aio_request_t *); +static void aio_poll_queues(void); + +static aio_thread_t *threads = NULL; +static int aio_initialised = 0; + + +#define AIO_LARGE_BUFS 16384 +#define AIO_MEDIUM_BUFS AIO_LARGE_BUFS >> 1 +#define AIO_SMALL_BUFS AIO_LARGE_BUFS >> 2 +#define AIO_TINY_BUFS AIO_LARGE_BUFS >> 3 +#define AIO_MICRO_BUFS 128 + +static MemPool *aio_large_bufs = NULL; /* 16K */ +static MemPool *aio_medium_bufs = NULL; /* 8K */ +static MemPool *aio_small_bufs = NULL; /* 4K */ +static MemPool *aio_tiny_bufs = NULL; /* 2K */ +static MemPool *aio_micro_bufs = NULL; /* 128K */ + +static int request_queue_len = 0; +static MemPool *aio_request_pool = NULL; +static MemPool *aio_thread_pool = NULL; +static aio_request_queue_t request_queue; +static struct { + aio_request_t *head, **tailp; +} request_queue2 = { + + NULL, &request_queue2.head +}; +static aio_request_queue_t done_queue; +static struct { + aio_request_t *head, **tailp; +} done_requests = { + + NULL, &done_requests.head +}; +static pthread_attr_t globattr; +static struct sched_param globsched; +static pthread_t main_thread; + +static MemPool * +aio_get_pool(int size) +{ + MemPool *p; + if (size <= AIO_LARGE_BUFS) { + if (size <= AIO_MICRO_BUFS) + p = aio_micro_bufs; + else if (size <= AIO_TINY_BUFS) + p = aio_tiny_bufs; + else if (size <= AIO_SMALL_BUFS) + p = aio_small_bufs; + else if (size <= AIO_MEDIUM_BUFS) + p = aio_medium_bufs; + else + p = aio_large_bufs; + } else + p = NULL; + return p; +} + +static void * +aio_xmalloc(int size) +{ + void *p; + MemPool *pool; + + if ((pool = 
aio_get_pool(size)) != NULL) { + p = memPoolAlloc(pool); + } else + p = xmalloc(size); + + return p; +} + +static char * +aio_xstrdup(const char *str) +{ + char *p; + int len = strlen(str) + 1; + + p = aio_xmalloc(len); + strncpy(p, str, len); + + return p; +} + +static void +aio_xfree(void *p, int size) +{ + MemPool *pool; + + if ((pool = aio_get_pool(size)) != NULL) { + memPoolFree(pool, p); + } else + xfree(p); +} + +static void +aio_xstrfree(char *str) +{ + MemPool *pool; + int len = strlen(str) + 1; + + if ((pool = aio_get_pool(len)) != NULL) { + memPoolFree(pool, str); + } else + xfree(str); +} + +static void +aio_init(void) +{ + int i; + aio_thread_t *threadp; + + if (aio_initialised) + return; + + pthread_attr_init(&globattr); +#if HAVE_PTHREAD_ATTR_SETSCOPE + pthread_attr_setscope(&globattr, PTHREAD_SCOPE_SYSTEM); +#endif + globsched.sched_priority = 1; + main_thread = pthread_self(); +#if HAVE_PTHREAD_SETSCHEDPARAM + pthread_setschedparam(main_thread, SCHED_OTHER, &globsched); +#endif + globsched.sched_priority = 2; +#if HAVE_PTHREAD_ATTR_SETSCHEDPARAM + pthread_attr_setschedparam(&globattr, &globsched); +#endif + + /* Initialize request queue */ + if (pthread_mutex_init(&(request_queue.mutex), NULL)) + fatal("Failed to create mutex"); + if (pthread_cond_init(&(request_queue.cond), NULL)) + fatal("Failed to create condition variable"); + request_queue.head = NULL; + request_queue.tailp = &request_queue.head; + request_queue.requests = 0; + request_queue.blocked = 0; + + /* Initialize done queue */ + if (pthread_mutex_init(&(done_queue.mutex), NULL)) + fatal("Failed to create mutex"); + if (pthread_cond_init(&(done_queue.cond), NULL)) + fatal("Failed to create condition variable"); + done_queue.head = NULL; + done_queue.tailp = &done_queue.head; + done_queue.requests = 0; + done_queue.blocked = 0; + + /* Create threads and get them to sit in their wait loop */ + aio_thread_pool = memPoolCreate("aio_thread", sizeof(aio_thread_t)); + for (i = 0; i < 
NUMTHREADS; i++) { + threadp = memPoolAlloc(aio_thread_pool); + threadp->status = _THREAD_STARTING; + threadp->current_req = NULL; + threadp->requests = 0; + threadp->next = threads; + threads = threadp; + if (pthread_create(&threadp->thread, &globattr, aio_thread_loop, threadp)) { + fprintf(stderr, "Thread creation failed\n"); + threadp->status = _THREAD_FAILED; + continue; + } + } + + /* Create request pool */ + aio_request_pool = memPoolCreate("aio_request", sizeof(aio_request_t)); + aio_large_bufs = memPoolCreate("aio_large_bufs", AIO_LARGE_BUFS); + aio_medium_bufs = memPoolCreate("aio_medium_bufs", AIO_MEDIUM_BUFS); + aio_small_bufs = memPoolCreate("aio_small_bufs", AIO_SMALL_BUFS); + aio_tiny_bufs = memPoolCreate("aio_tiny_bufs", AIO_TINY_BUFS); + aio_micro_bufs = memPoolCreate("aio_micro_bufs", AIO_MICRO_BUFS); + + aio_initialised = 1; +} + + +static void * +aio_thread_loop(void *ptr) +{ + aio_thread_t *threadp = ptr; + aio_request_t *request; + sigset_t new; + + /* + * Make sure to ignore signals which may possibly get sent to + * the parent squid thread. 
Causes havoc with mutex's and + * condition waits otherwise + */ + + sigemptyset(&new); + sigaddset(&new, SIGPIPE); + sigaddset(&new, SIGCHLD); +#ifdef _SQUID_LINUX_THREADS_ + sigaddset(&new, SIGQUIT); + sigaddset(&new, SIGTRAP); +#else + sigaddset(&new, SIGUSR1); + sigaddset(&new, SIGUSR2); +#endif + sigaddset(&new, SIGHUP); + sigaddset(&new, SIGTERM); + sigaddset(&new, SIGINT); + sigaddset(&new, SIGALRM); + pthread_sigmask(SIG_BLOCK, &new, NULL); + + while (1) { + threadp->current_req = request = NULL; + request = NULL; + /* Get a request to process */ + threadp->status = _THREAD_WAITING; + pthread_mutex_lock(&request_queue.mutex); + while (!request_queue.head) { + pthread_cond_wait(&request_queue.cond, &request_queue.mutex); + } + request = request_queue.head; + if (request) + request_queue.head = request->next; + if (!request_queue.head) + request_queue.tailp = &request_queue.head; + pthread_mutex_unlock(&request_queue.mutex); + /* process the request */ + threadp->status = _THREAD_BUSY; + request->next = NULL; + threadp->current_req = request; + errno = 0; + if (!request->cancelled) { + switch (request->request_type) { + case _AIO_OP_OPEN: + aio_do_open(request); + break; + case _AIO_OP_READ: + aio_do_read(request); + break; + case _AIO_OP_WRITE: + aio_do_write(request); + break; + case _AIO_OP_CLOSE: + aio_do_close(request); + break; + case _AIO_OP_UNLINK: + aio_do_unlink(request); + break; + case _AIO_OP_TRUNCATE: + aio_do_truncate(request); + break; +#if AIO_OPENDIR /* Opendir not implemented yet */ + case _AIO_OP_OPENDIR: + aio_do_opendir(request); + break; +#endif + case _AIO_OP_STAT: + aio_do_stat(request); + break; + default: + request->ret = -1; + request->err = EINVAL; + break; + } + } else { /* cancelled */ + request->ret = -1; + request->err = EINTR; + } + threadp->status = _THREAD_DONE; + /* put the request in the done queue */ + pthread_mutex_lock(&done_queue.mutex); + *done_queue.tailp = request; + done_queue.tailp = &request->next; + 
pthread_mutex_unlock(&done_queue.mutex); + threadp->requests++; + } /* while forever */ + return NULL; +} /* aio_thread_loop */ + +static void +aio_queue_request(aio_request_t * request) +{ + static int high_start = 0; + debug(41, 9) ("aio_queue_request: %p type=%d result=%p\n", + request, request->request_type, request->resultp); + /* Mark it as not executed (failing result, no error) */ + request->ret = -1; + request->err = 0; + /* Internal housekeeping */ + request_queue_len += 1; + request->resultp->_data = request; + /* Play some tricks with the request_queue2 queue */ + request->next = NULL; + if (!request_queue2.head) { + if (pthread_mutex_trylock(&request_queue.mutex) == 0) { + /* Normal path */ + *request_queue.tailp = request; + request_queue.tailp = &request->next; + pthread_cond_signal(&request_queue.cond); + pthread_mutex_unlock(&request_queue.mutex); + } else { + /* Oops, the request queue is blocked, use request_queue2 */ + *request_queue2.tailp = request; + request_queue2.tailp = &request->next; + } + } else { + /* Secondary path. 
We have blocked requests to deal with */ + /* add the request to the chain */ + *request_queue2.tailp = request; + if (pthread_mutex_trylock(&request_queue.mutex) == 0) { + /* Ok, the queue is no longer blocked */ + *request_queue.tailp = request_queue2.head; + request_queue.tailp = &request->next; + pthread_cond_signal(&request_queue.cond); + pthread_mutex_unlock(&request_queue.mutex); + request_queue2.head = NULL; + request_queue2.tailp = &request_queue2.head; + } else { + /* still blocked, bump the blocked request chain */ + request_queue2.tailp = &request->next; + } + } + if (request_queue2.head) { + static int filter = 0; + static int filter_limit = 8; + if (++filter >= filter_limit) { + filter_limit += filter; + filter = 0; + debug(43, 1) ("aio_queue_request: WARNING - Queue congestion\n"); + } + } + /* Warn if out of threads */ + if (request_queue_len > MAGIC1) { + static int last_warn = 0; + static int queue_high, queue_low; + if (high_start == 0) { + high_start = squid_curtime; + queue_high = request_queue_len; + queue_low = request_queue_len; + } + if (request_queue_len > queue_high) + queue_high = request_queue_len; + if (request_queue_len < queue_low) + queue_low = request_queue_len; + if (squid_curtime >= (last_warn + 15) && + squid_curtime >= (high_start + 5)) { + debug(43, 1) ("aio_queue_request: WARNING - Disk I/O overloading\n"); + if (squid_curtime >= (high_start + 15)) + debug(43, 1) ("aio_queue_request: Queue Length: current=%d, high=%d, low=%d, duration=%d\n", + request_queue_len, queue_high, queue_low, squid_curtime - high_start); + last_warn = squid_curtime; + } + } else { + high_start = 0; + } + /* Warn if seriously overloaded */ + if (request_queue_len > RIDICULOUS_LENGTH) { + debug(43, 0) ("aio_queue_request: Async request queue growing uncontrollably!\n"); + debug(43, 0) ("aio_queue_request: Syncing pending I/O operations.. 
(blocking)\n"); + aio_sync(); + debug(43, 0) ("aio_queue_request: Synced\n"); + } +} /* aio_queue_request */ + +static void +aio_cleanup_request(aio_request_t * requestp) +{ + aio_result_t *resultp = requestp->resultp; + int cancelled = requestp->cancelled; + + /* Free allocated structures and copy data back to user space if the */ + /* request hasn't been cancelled */ + switch (requestp->request_type) { + case _AIO_OP_STAT: + if (!cancelled && requestp->ret == 0) + xmemcpy(requestp->statp, requestp->tmpstatp, sizeof(struct stat)); + aio_xfree(requestp->tmpstatp, sizeof(struct stat)); + aio_xstrfree(requestp->path); + break; + case _AIO_OP_OPEN: + if (cancelled && requestp->ret >= 0) + /* The open() was cancelled but completed */ + close(requestp->ret); + aio_xstrfree(requestp->path); + break; + case _AIO_OP_CLOSE: + if (cancelled && requestp->ret < 0) + /* The close() was cancelled and never got executed */ + close(requestp->fd); + break; + case _AIO_OP_UNLINK: + case _AIO_OP_TRUNCATE: + case _AIO_OP_OPENDIR: + aio_xstrfree(requestp->path); + break; + case _AIO_OP_READ: + if (!cancelled && requestp->ret > 0) + xmemcpy(requestp->bufferp, requestp->tmpbufp, requestp->ret); + aio_xfree(requestp->tmpbufp, requestp->buflen); + break; + case _AIO_OP_WRITE: + aio_xfree(requestp->tmpbufp, requestp->buflen); + break; + default: + break; + } + if (resultp != NULL && !cancelled) { + resultp->aio_return = requestp->ret; + resultp->aio_errno = requestp->err; + } + memPoolFree(aio_request_pool, requestp); +} /* aio_cleanup_request */ + + +int +aio_cancel(aio_result_t * resultp) +{ + aio_request_t *request = resultp->_data; + + if (request && request->resultp == resultp) { + debug(41, 9) ("aio_cancel: %p type=%d result=%p\n", + request, request->request_type, request->resultp); + request->cancelled = 1; + request->resultp = NULL; + resultp->_data = NULL; + return 0; + } + return 1; +} /* aio_cancel */ + + +int +aio_open(const char *path, int oflag, mode_t mode, aio_result_t * 
resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = (char *) aio_xstrdup(path); + requestp->oflag = oflag; + requestp->mode = mode; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_OPEN; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_open(aio_request_t * requestp) +{ + requestp->ret = open(requestp->path, requestp->oflag, requestp->mode); + requestp->err = errno; +} + + +int +aio_read(int fd, char *bufp, int bufs, off_t offset, int whence, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->fd = fd; + requestp->bufferp = bufp; + requestp->tmpbufp = (char *) aio_xmalloc(bufs); + requestp->buflen = bufs; + requestp->offset = offset; + requestp->whence = whence; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_READ; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_read(aio_request_t * requestp) +{ + lseek(requestp->fd, requestp->offset, requestp->whence); + requestp->ret = read(requestp->fd, requestp->tmpbufp, requestp->buflen); + requestp->err = errno; +} + + +int +aio_write(int fd, char *bufp, int bufs, off_t offset, int whence, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->fd = fd; + requestp->tmpbufp = (char *) aio_xmalloc(bufs); + xmemcpy(requestp->tmpbufp, bufp, bufs); + requestp->buflen = bufs; + requestp->offset = offset; + requestp->whence = whence; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_WRITE; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_write(aio_request_t * requestp) +{ + requestp->ret = write(requestp->fd, requestp->tmpbufp, requestp->buflen); + 
requestp->err = errno; +} + + +int +aio_close(int fd, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->fd = fd; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_CLOSE; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_close(aio_request_t * requestp) +{ + requestp->ret = close(requestp->fd); + requestp->err = errno; +} + + +int +aio_stat(const char *path, struct stat *sb, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = (char *) aio_xstrdup(path); + requestp->statp = sb; + requestp->tmpstatp = (struct stat *) aio_xmalloc(sizeof(struct stat)); + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_STAT; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_stat(aio_request_t * requestp) +{ + requestp->ret = stat(requestp->path, requestp->tmpstatp); + requestp->err = errno; +} + + +int +aio_unlink(const char *path, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = aio_xstrdup(path); + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_UNLINK; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_unlink(aio_request_t * requestp) +{ + requestp->ret = unlink(requestp->path); + requestp->err = errno; +} + +int +aio_truncate(const char *path, off_t length, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = (char *) aio_xstrdup(path); + requestp->offset = length; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_TRUNCATE; + requestp->cancelled = 0; + + 
aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_truncate(aio_request_t * requestp) +{ + requestp->ret = truncate(requestp->path, requestp->offset); + requestp->err = errno; +} + + +#if AIO_OPENDIR +/* XXX aio_opendir NOT implemented yet.. */ + +int +aio_opendir(const char *path, aio_result_t * resultp) +{ + aio_request_t *requestp; + int len; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + return -1; +} + +static void +aio_do_opendir(aio_request_t * requestp) +{ + /* NOT IMPLEMENTED */ +} + +#endif + +static void +aio_poll_queues(void) +{ + /* kick "overflow" request queue */ + if (request_queue2.head && + pthread_mutex_trylock(&request_queue.mutex) == 0) { + *request_queue.tailp = request_queue2.head; + request_queue.tailp = request_queue2.tailp; + pthread_cond_signal(&request_queue.cond); + pthread_mutex_unlock(&request_queue.mutex); + request_queue2.head = NULL; + request_queue2.tailp = &request_queue2.head; + } + /* poll done queue */ + if (done_queue.head && pthread_mutex_trylock(&done_queue.mutex) == 0) { + struct aio_request_t *requests = done_queue.head; + done_queue.head = NULL; + done_queue.tailp = &done_queue.head; + pthread_mutex_unlock(&done_queue.mutex); + *done_requests.tailp = requests; + request_queue_len -= 1; + while (requests->next) { + requests = requests->next; + request_queue_len -= 1; + } + done_requests.tailp = &requests->next; + } + /* Give up the CPU to allow the threads to do their work */ + /* + * For Andres thoughts about yield(), see + * http://www.squid-cache.org/mail-archive/squid-dev/200012/0001.html + */ + if (done_queue.head || request_queue.head) +#ifndef _SQUID_SOLARIS_ + sched_yield(); +#else + yield(); +#endif +} + +aio_result_t * +aio_poll_done(void) +{ + aio_request_t *request; + aio_result_t *resultp; + int cancelled; + int polled = 0; + + AIO_REPOLL: + request = done_requests.head; + if (request == NULL && !polled) { + aio_poll_queues(); + polled = 1; + request = 
done_requests.head;
+    }
+    if (!request) {
+	return NULL;
+    }
+    debug(41, 9) ("aio_poll_done: %p type=%d result=%p\n",
+	request, request->request_type, request->resultp);
+    done_requests.head = request->next;
+    if (!done_requests.head)
+	done_requests.tailp = &done_requests.head;
+    resultp = request->resultp;
+    cancelled = request->cancelled;
+    aio_debug(request);
+    debug(43, 5) ("DONE: %d -> %d\n", request->ret, request->err);
+    aio_cleanup_request(request);
+    if (cancelled)
+	goto AIO_REPOLL;
+    return resultp;
+}				/* aio_poll_done */
+
+int
+aio_operations_pending(void)
+{
+    return request_queue_len + (done_requests.head ? 1 : 0);
+}
+
+int
+aio_sync(void)
+{
+    /* XXX This might take a while if the queue is large.. */
+    do {
+	aio_poll_queues();
+    } while (request_queue_len > 0);
+    return aio_operations_pending();
+}
+
+int
+aio_get_queue_len(void)
+{
+    return request_queue_len;
+}
+
+static void
+aio_debug(aio_request_t * request)
+{
+    switch (request->request_type) {
+    case _AIO_OP_OPEN:
+	debug(43, 5) ("OPEN of %s to FD %d\n", request->path, request->ret);
+	break;
+    case _AIO_OP_READ:
+	debug(43, 5) ("READ on fd: %d\n", request->fd);
+	break;
+    case _AIO_OP_WRITE:
+	debug(43, 5) ("WRITE on fd: %d\n", request->fd);
+	break;
+    case _AIO_OP_CLOSE:
+	debug(43, 5) ("CLOSE of fd: %d\n", request->fd);
+	break;
+    case _AIO_OP_UNLINK:
+	debug(43, 5) ("UNLINK of %s\n", request->path);
+	break;
+    case _AIO_OP_TRUNCATE:
+	debug(43, 5) ("TRUNCATE of %s\n", request->path);
+	break;
+    default:
+	break;
+    }
+}
Index: squid/src/fs/ufs/async_io.c
diff -u /dev/null squid/src/fs/ufs/async_io.c:1.1.2.1
--- /dev/null	Tue Sep 28 18:35:35 2004
+++ squid/src/fs/ufs/async_io.c	Sat Apr 14 02:46:36 2001
@@ -0,0 +1,365 @@
+
+/*
+ * $Id$
+ *
+ * DEBUG: section 32    Asynchronous Disk I/O
+ * AUTHOR: Pete Bentley
+ * AUTHOR: Stewart Forster
+ *
+ * SQUID Web Proxy Cache          http://www.squid-cache.org/
+ * ----------------------------------------------------------
+ *
+ * Squid is the result of
efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ * + */ + +#include "squid.h" +#include "store_asyncufs.h" + +#define _AIO_OPEN 0 +#define _AIO_READ 1 +#define _AIO_WRITE 2 +#define _AIO_CLOSE 3 +#define _AIO_UNLINK 4 +#define _AIO_TRUNCATE 7 /* was 4, which collided with _AIO_UNLINK */ +#define _AIO_OPENDIR 5 +#define _AIO_STAT 6 + +typedef struct aio_ctrl_t { + struct aio_ctrl_t *next; + int fd; + int operation; + AIOCB *done_handler; + void *done_handler_data; + aio_result_t result; + char *bufp; + FREE *free_func; + dlink_node node; +} aio_ctrl_t; + +struct { + int open; + int close; + int cancel; + int write; + int read; + int stat; + int unlink; + int check_callback; +} aio_counts; + +typedef struct aio_unlinkq_t { + char *path; + struct aio_unlinkq_t *next; +} aio_unlinkq_t; + +static dlink_list used_list; +static int initialised = 0; +static OBJH aioStats; +static MemPool *aio_ctrl_pool; +static void aioFDWasClosed(int fd); + +static void +aioFDWasClosed(int fd) +{ + if (fd_table[fd].flags.closing) + fd_close(fd); +} + +void +aioInit(void) +{ + if (initialised) + return; + aio_ctrl_pool = memPoolCreate("aio_ctrl", sizeof(aio_ctrl_t)); + cachemgrRegister("aio_counts", "Async IO Function Counters", + aioStats, 0, 1); + initialised = 1; + comm_quick_poll_required(); +} + +void +aioDone(void) +{ + memPoolDestroy(aio_ctrl_pool); + initialised = 0; +} + +void +aioOpen(const char *path, int oflag, mode_t mode, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + + assert(initialised); + aio_counts.open++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_OPEN; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_open(path, oflag, mode, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} + +void +aioClose(int fd) +{ + aio_ctrl_t *ctrlp; + + assert(initialised); + aio_counts.close++; + aioCancel(fd); + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = fd; + ctrlp->done_handler = NULL; + 
ctrlp->done_handler_data = NULL; + ctrlp->operation = _AIO_CLOSE; + ctrlp->result.data = ctrlp; + aio_close(fd, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} + +void +aioCancel(int fd) +{ + aio_ctrl_t *curr; + AIOCB *done_handler; + void *their_data; + dlink_node *m, *next; + + assert(initialised); + aio_counts.cancel++; + for (m = used_list.head; m; m = next) { + while (m) { + curr = m->data; + if (curr->fd == fd) + break; + m = m->next; + } + if (m == NULL) + break; + + aio_cancel(&curr->result); + + if ((done_handler = curr->done_handler)) { + their_data = curr->done_handler_data; + curr->done_handler = NULL; + curr->done_handler_data = NULL; + debug(32, 3) ("aioCancel: cancelling callback for FD %d\n", fd); + if (cbdataValid(their_data)) + done_handler(fd, their_data, -2, -2); + cbdataUnlock(their_data); + } + next = m->next; + dlinkDelete(m, &used_list); + memPoolFree(aio_ctrl_pool, curr); + } +} + + +void +aioWrite(int fd, int offset, char *bufp, int len, AIOCB * callback, void *callback_data, FREE * free_func) +{ + aio_ctrl_t *ctrlp; + int seekmode; + + assert(initialised); + aio_counts.write++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = fd; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_WRITE; + ctrlp->bufp = bufp; + ctrlp->free_func = free_func; + if (offset >= 0) + 
seekmode = SEEK_SET; + else { + seekmode = SEEK_CUR; + offset = 0; + } + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_read(fd, bufp, len, offset, seekmode, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} /* aioRead */ + +void +aioStat(char *path, struct stat *sb, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + + assert(initialised); + aio_counts.stat++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_STAT; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_stat(path, sb, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} /* aioStat */ + +void +aioUnlink(const char *path, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + assert(initialised); + aio_counts.unlink++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_UNLINK; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_unlink(path, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); +} /* aioUnlink */ + +void +aioTruncate(const char *path, off_t length, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + assert(initialised); + aio_counts.unlink++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_TRUNCATE; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_truncate(path, length, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); +} /* aioTruncate */ + + +int +aioCheckCallbacks(SwapDir * SD) +{ + aio_result_t *resultp; + aio_ctrl_t *ctrlp; + AIOCB *done_handler; + void *their_data; + int retval = 0; + + assert(initialised); + aio_counts.check_callback++; + for (;;) { + if ((resultp = aio_poll_done()) == NULL) + 
break; + ctrlp = (aio_ctrl_t *) resultp->data; + if (ctrlp == NULL) + continue; /* XXX Should not happen */ + dlinkDelete(&ctrlp->node, &used_list); + if ((done_handler = ctrlp->done_handler)) { + their_data = ctrlp->done_handler_data; + ctrlp->done_handler = NULL; + ctrlp->done_handler_data = NULL; + if (cbdataValid(their_data)) { + retval = 1; /* Return that we've actually done some work */ + done_handler(ctrlp->fd, their_data, + ctrlp->result.aio_return, ctrlp->result.aio_errno); + } + cbdataUnlock(their_data); + } + /* free data if requested to aioWrite() */ + if (ctrlp->free_func) + ctrlp->free_func(ctrlp->bufp); + if (ctrlp->operation == _AIO_CLOSE) + aioFDWasClosed(ctrlp->fd); + memPoolFree(aio_ctrl_pool, ctrlp); + } + return retval; +} + +void +aioStats(StoreEntry * sentry) +{ + storeAppendPrintf(sentry, "ASYNC IO Counters:\n"); + storeAppendPrintf(sentry, "open\t%d\n", aio_counts.open); + storeAppendPrintf(sentry, "close\t%d\n", aio_counts.close); + storeAppendPrintf(sentry, "cancel\t%d\n", aio_counts.cancel); + storeAppendPrintf(sentry, "write\t%d\n", aio_counts.write); + storeAppendPrintf(sentry, "read\t%d\n", aio_counts.read); + storeAppendPrintf(sentry, "stat\t%d\n", aio_counts.stat); + storeAppendPrintf(sentry, "unlink\t%d\n", aio_counts.unlink); + storeAppendPrintf(sentry, "check_callback\t%d\n", aio_counts.check_callback); + storeAppendPrintf(sentry, "queue\t%d\n", aio_get_queue_len()); +} + +/* Flush all pending I/O */ +void +aioSync(SwapDir * SD) +{ + if (!initialised) + return; /* nothing to do then */ + /* Flush all pending operations */ + debug(32, 1) ("aioSync: flushing pending I/O operations\n"); + do { + aioCheckCallbacks(SD); + } while (aio_sync()); + debug(32, 1) ("aioSync: done\n"); +} + +int +aioQueueSize(void) +{ + return memPoolInUseCount(aio_ctrl_pool); +} Index: squid/src/fs/ufs/fs_aufs.c diff -u /dev/null squid/src/fs/ufs/fs_aufs.c:1.1.2.2 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/fs_aufs.c Sun Apr 29 03:50:11 2001 
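The completion path above (aioCheckCallbacks/aioSync) follows one pattern throughout: pop a finished request off the done queue, fire its callback at most once and only if the callback data is still valid, then free the control block. A minimal standalone sketch of that drain loop follows; the types, the `data_valid` flag (a stand-in for `cbdataValid()`), and the helper names are illustrative, not Squid's own:

```c
#include <stdlib.h>

typedef void DONECB(void *data, int ret, int err);

/* Illustrative stand-in for aio_ctrl_t: one block per in-flight request. */
typedef struct ctrl {
    struct ctrl *next;
    DONECB *done_handler;
    void *done_handler_data;
    int data_valid;             /* stand-in for cbdataValid() */
    int ret, err;
} ctrl_t;

/* Push a finished request onto the head of the done queue. */
static ctrl_t *
push_done(ctrl_t *head, DONECB *cb, void *data, int valid)
{
    ctrl_t *c = malloc(sizeof *c);
    c->next = head;
    c->done_handler = cb;
    c->done_handler_data = data;
    c->data_valid = valid;
    c->ret = c->err = 0;
    return c;
}

/* Demo callback: counts invocations in *data. */
static void
count_cb(void *data, int ret, int err)
{
    (void) ret;
    (void) err;
    ++*(int *) data;
}

/* Drain the done queue, as aioCheckCallbacks does: clear the handler
 * before calling it so it can never fire twice, skip callbacks whose
 * data has gone invalid, free every block either way.
 * Returns 1 if any callback actually ran. */
static int
check_callbacks(ctrl_t **done_queue)
{
    int retval = 0;
    ctrl_t *c;
    while ((c = *done_queue) != NULL) {
        *done_queue = c->next;
        if (c->done_handler != NULL) {
            DONECB *cb = c->done_handler;
            c->done_handler = NULL;     /* never fire twice */
            if (c->data_valid) {
                retval = 1;
                cb(c->done_handler_data, c->ret, c->err);
            }
        }
        free(c);
    }
    return retval;
}
```

The same shape explains why `aioSync()` loops on `aioCheckCallbacks()`: cancelled entries still occupy the queue and must be popped and freed even though their callbacks are suppressed.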
@@ -0,0 +1,84 @@ +/* async-io specific functions */ + +void * +aio_buildOpenRequest(char *path, int flags, int mode, + STIOCB * callback, void *callback_data) +{ + ufs_request_t *requestp; + ufs_ctrl_t *ctrlp; + + if (!aio_initialised) + aio_init(); + + /* XXX counts need to be per-SD, in fsdata */ + aio_counts.open++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_OPEN; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + + requestp = (ufs_request_t *)io_buildOpenRequest(path, flags, mode, + callback, callback_data); + /* Link ctrlp and requestp together */ + requestp->resultp = &ctrlp->result; + requestp->resultp->_data = requestp; + + return (void *)requestp; +} + +void +aio_open(aio_request_t *requestp) +{ + aio_ctrl_t *ctrlp; + + ctrlp = requestp->resultp->data; + aio_queue_request(requestp); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} + +void +aio_cleanupRequest(aio_request_t *requestp) +{ + aio_result_t *resultp = requestp->resultp; + + if (resultp != NULL && !requestp->cancelled) { + resultp->aio_return = requestp->ret; + resultp->aio_errno = requestp->err; + } + ufs_cleanupRequest(requestp); +} + +ufs_request_t * +aio_buildReadRequest() +{ + return NULL; /* XXX stub, not written yet */ +} + +void +aio_read(ufs_request_t *requestp) +{ + /* This code stolen from the store_io_aufs.c code. Note, this changes + * the timing of this snippet slightly, but hopefully not enough to + * count. Still got to think through the whole "pending requests" + * thing *frown* + * Obviously, this is just a placeholder snippet, not functional code. 
*/ + if (fsstate->fd < 0) { + struct _queued_read *q; + debug(78, 3) ("storeUfsRead: queueing read because FD < 0\n"); + assert(fsstate->flags.opening); + assert(fsstate->pending_reads == NULL); + assert(fsstate->async); + q = memPoolAlloc(ufs_qread_pool); + q->buf = buf; + q->size = size; + q->offset = offset; + q->callback = callback; + q->callback_data = callback_data; + linklistPush(&(fsstate->pending_reads), q); + return; + } + +} Index: squid/src/fs/ufs/fs_structs.h diff -u /dev/null squid/src/fs/ufs/fs_structs.h:1.1.2.2 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/fs_structs.h Sun Apr 29 03:50:11 2001 @@ -0,0 +1,105 @@ +/* Request Types - fairly self explanatory, these are set in the request_t + * "objects". */ +enum _ufs_request_type { + _OP_NONE = 0, + _OP_OPEN, + _OP_CREATE, + _OP_READ, + _OP_WRITE, + _OP_CLOSE, + _OP_UNLINK, + _OP_TRUNCATE, + _OP_OPENDIR, + _OP_STAT +}; + +/* ufs_request_t is the structure that defines a request "object". One of + * these is created for every request squid makes of the fs. */ +struct _ufs_request_t { + struct _ufs_request_t *next; + enum _ufs_request_type request_type; + int cancelled; + char *path; + int oflag; + mode_t mode; + int fd; + char *bufferp; + char *tmpbufp; + int buflen; + off_t offset; + int whence; + int ret; + int err; + struct stat *tmpstatp; + struct stat *statp; + storeIOState *sio; + /* completion callback, as used by io_doCallback() */ + STIOCB *done_handler; + void *done_handler_data; + /* the aufs layer's link back to its result structure */ + aio_result_t *resultp; +}; + +/* _ufs_fsinfo_t defines the fs-specific information hanging off the SwapDir + * structure (called "fsdata"). */ +struct _ufs_fsinfo_t { + int swaplog_fd; + int l1; + int l2; + fileMap *map; + int suggest; + int async; +}; + +/* _ufs_state_t records the state of any particular request. It resides in + the StoreIOState structure, called fsstate. 
*/ +struct _ufs_state_t { + int fd; + struct { + unsigned int close_request:1; + unsigned int reading:1; + unsigned int writing:1; + unsigned int opening:1; + unsigned int write_kicking:1; + unsigned int read_kicking:1; + unsigned int inreaddone:1; + } flags; + const char *read_buf; + link_list *pending_writes; + link_list *pending_reads; +}; + +typedef enum _ufs_request_type ufs_request_type; +typedef struct _ufs_request_t ufs_request_t; +typedef struct _ufs_fsinfo_t ufs_fsinfo_t; +typedef struct _ufs_state_t ufs_state_t; + +/* The ufs_state memory pools. These are/were mostly used by aufs, but + * that may change slightly with the new layout, not sure yet. */ +extern MemPool *ufs_state_pool; +extern MemPool *ufs_qread_pool; +extern MemPool *ufs_qwrite_pool; + +/* + * Notes on the various data structures used within ufs: + * + * Incoming requests usually carry a SwapDir pointer, pointing at the swap + * directory we're using, and either a StoreEntry pointer (for open/close/ + * unlink), or a storeIOState pointer (for read and write). They will generate + * ufs_request_t structures, which are then passed into the "submitRequest" + * style functions. From there, things diverge slightly - at the moment, + * aufs also creates a "ctrlp" structure, which is used to hold the real + * callback, while the aufs-specific callback is stored in the request. This + * should change before the code is usable, I suspect. + * + * I'm trying to set a standard naming convention within ufs itself - in + * most cases, you'll see functions called "io_*", which are the vanilla + * non-async versions, and "aio_*", which are the async versions. I'm not + * sure that's appropriate atm, especially in light of the existence of + * the aio libraries/functions in glibc, but we'll see. + * + * Structures defined in here usually have an "_ufs_" prefix, to indicate + * they belong to the ufs file system. 
Note, however, elsewhere they're not + * referred to with the _ufs_* prefix, but as their base name - I need to + * make static definitions in each file for the data structures needed, just + * because it looks cleaner. C not being my prime language, I'm still a touch + * shaky on this. + * + * ufs_request_pool exists, and is a memory pool for ufs_request_t structures. + * This needs to be set up as a global thing, not a per-fs thing. + * */ Index: squid/src/fs/ufs/fs_ufs.c diff -u /dev/null squid/src/fs/ufs/fs_ufs.c:1.1.2.3 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/fs_ufs.c Sun Apr 29 03:50:11 2001 @@ -0,0 +1,166 @@ +/* sync-io specific functions - a lot of this actually gets shared by the + * aufs code, by virtue of aufs being a queueing layer over ufs. */ + +/* doCallback handles executing any given callback - short, simple, and used + * in lots of places */ +void +io_doCallback(ufs_request_t *requestp) +{ + STIOCB *done_handler; + void *their_data; + + if ((done_handler = requestp->done_handler)) { + their_data = requestp->done_handler_data; + requestp->done_handler = NULL; + requestp->done_handler_data = NULL; + if (cbdataValid(their_data)) { + done_handler(requestp->fd, their_data, requestp->ret, requestp->err); + } + cbdataUnlock(their_data); + } + if (requestp->fd > 0) { + io_close(io_buildCloseRequest(requestp->fd)); + /* + close(requestp->fd); + fd_close(requestp->fd); + requestp->fd = -1; + */ + } +} + +/* buildOpenRequest is called by ufs_storeCreate and ufs_storeOpen - it + * builds an "open" request */ +void * +io_buildOpenRequest(char *path, int flags, int mode, + STIOCB *callback, void *callback_data) +{ + ufs_request_t *requestp; + + requestp = memPoolAlloc(ufs_request_pool); + requestp->path = (char *) xstrdup(path); + requestp->oflag = flags; + requestp->mode = mode; + requestp->request_type = _OP_OPEN; + requestp->cancelled = 0; + requestp->done_handler = callback; + requestp->done_handler_data = callback_data; + return (void *)requestp; +} + +/* io_open actions an open request, from ufs_storeOpen and 
ufs_storeCreate */ +void +io_open(ufs_request_t *requestp) +{ + ufs_do_open(requestp); + io_openDone(requestp); +} + +void +io_openDone(ufs_request_t *requestp) +{ + storeIOState *sio = (storeIOState *)requestp->sio; + ufs_state_t *aiostate = (ufs_state_t *) sio->fsstate; + + aiostate->flags.opening = 0; + if (requestp->err || (requestp->ret < 0)) { + debug(50, 3) ("io_openDone: error opening file %s: %s\n", requestp->path, + xstrerror()); + io_doCallback(requestp); + } else { + debug(6, 5) ("io_openDone: FD %d\n", requestp->ret); + aiostate->fd = requestp->fd = requestp->ret; + fd_open(requestp->fd, FD_FILE, requestp->path); + if (_OP_CREATE == requestp->request_type) { + /* XXX I'm not happy about where SD comes from below */ + storeUfsDirReplAdd(&Config.cacheSwap.swapDirs[sio->swap_dirn], + sio->e); + } else { + /* Here, I believe we need to make sure the file size is filled + * in in the sio. Old ufs code did an fstat. */ + } + } + io_cleanupRequest(requestp); +} + +void * +io_buildCloseRequest(int fd) +{ + ufs_request_t *requestp; + + requestp = memPoolAlloc(ufs_request_pool); + requestp->fd = fd; + requestp->request_type = _OP_CLOSE; + requestp->cancelled = 0; + return (void *)requestp; +} + +/* io_close actions a close request */ +void +io_close(ufs_request_t *requestp) +{ + ufs_do_close(requestp); + io_closeDone(requestp); +} + +/* This is mainly here to maintain symmetry. */ +void +io_closeDone(ufs_request_t *requestp) +{ + fd_close(requestp->fd); + io_cleanupRequest(requestp); +} + +/* cleanupRequest de-allocates the various structures once we're completely + * done with the request. Also copies any data accumulated back into + * squid's purview. Note, there's some copying of data going on in here that + * should not be - we _should_ be reading data straight into squid's own + * buffers, not into our own temporary buffers. 
*/ +void +io_cleanupRequest(ufs_request_t *requestp) +{ + int cancelled = requestp->cancelled; + + /* Free allocated structures and copy data back to user space if the */ + /* request hasn't been cancelled */ + switch (requestp->request_type) { + case _OP_STAT: + if (!cancelled && requestp->ret == 0) + xmemcpy(requestp->statp, requestp->tmpstatp, sizeof(struct stat)); + xfree(requestp->tmpstatp); + xfree(requestp->path); + break; + case _OP_OPEN: + if (cancelled && requestp->ret >= 0) + /* The open() was cancelled but completed */ + close(requestp->ret); + xfree(requestp->path); + break; + case _OP_CLOSE: + if (cancelled && requestp->ret < 0) + /* The close() was cancelled and never got executed */ + close(requestp->fd); + break; + case _OP_UNLINK: + case _OP_TRUNCATE: + case _OP_OPENDIR: + xfree(requestp->path); + break; + case _OP_READ: + if (!cancelled && requestp->ret > 0) + xmemcpy(requestp->bufferp, requestp->tmpbufp, requestp->ret); + xfree(requestp->tmpbufp); + break; + case _OP_WRITE: + xfree(requestp->tmpbufp); + break; + default: + break; + } + memPoolFree(ufs_request_pool, requestp); +} + +static int +ufsSomethingPending(storeIOState * sio) +{ + ufs_state_t *aiostate = (ufs_state_t *) sio->fsstate; + return (aiostate->flags.reading || aiostate->flags.writing || + aiostate->flags.opening || aiostate->flags.inreaddone); +} Index: squid/src/fs/ufs/store_dir_ufs.c diff -u squid/src/fs/ufs/store_dir_ufs.c:1.10 squid/src/fs/ufs/store_dir_ufs.c:1.10.4.1 --- squid/src/fs/ufs/store_dir_ufs.c:1.10 Fri Jan 12 00:20:36 2001 +++ squid/src/fs/ufs/store_dir_ufs.c Sat Apr 14 02:46:36 2001 @@ -1348,20 +1348,19 @@ * This routine is called by storeDirSelectSwapDir to see if the given * object is able to be stored on this filesystem. UFS filesystems will * happily store anything as long as the LRU time isn't too small. 
+ * (Darius was here) */ int -storeUfsDirCheckObj(SwapDir * SD, const StoreEntry * e) +ufsCheckObj(SwapDir * SD, const StoreEntry * e) { -#if OLD_UNUSED_CODE - if (storeUfsDirExpiredReferenceAge(SD) < 300) { - debug(20, 3) ("storeUfsDirCheckObj: NO: LRU Age = %d\n", - storeUfsDirExpiredReferenceAge(SD)); - /* store_check_cachable_hist.no.lru_age_too_low++; */ - return -1; - } -#endif - /* Return 999 (99.9%) constant load */ - return 999; + int loadav, ql; + + ql = ufsQueueSize(); + if (ql == 0) + return 0; + loadav = ql * 1000 / MAXQUEUED; + debug(41, 9) ("ufsCheckObj: load=%d\n", loadav); + return loadav; } /* @@ -1369,28 +1368,30 @@ * * This routine is called whenever an object is referenced, so we can * maintain replacement information within the storage fs. + * (Darius was here) */ void -storeUfsDirRefObj(SwapDir * SD, StoreEntry * e) +ufsRefObj(SwapDir * SD, StoreEntry * e) { - debug(1, 3) ("storeUfsDirRefObj: referencing %p %d/%d\n", e, e->swap_dirn, - e->swap_filen); + debug(1, 3) ("fsRefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); if (SD->repl->Referenced) - SD->repl->Referenced(SD->repl, e, &e->repl); + SD->repl->Referenced(SD->repl, e, &e->repl); } /* * storeUfsDirUnrefObj * This routine is called whenever the last reference to an object is * removed, to maintain replacement information within the storage fs. 
+ * (Darius was here) */ void -storeUfsDirUnrefObj(SwapDir * SD, StoreEntry * e) +ufsUnrefObj(SwapDir * SD, StoreEntry * e) { - debug(1, 3) ("storeUfsDirUnrefObj: referencing %p %d/%d\n", e, e->swap_dirn, - e->swap_filen); + debug(1, 3) ("fsUnrefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); if (SD->repl->Dereferenced) - SD->repl->Dereferenced(SD->repl, e, &e->repl); + SD->repl->Dereferenced(SD->repl, e, &e->repl); } /* Index: squid/src/fs/ufs/store_io_ufs.c diff -u squid/src/fs/ufs/store_io_ufs.c:1.5 squid/src/fs/ufs/store_io_ufs.c:1.5.4.3 --- squid/src/fs/ufs/store_io_ufs.c:1.5 Fri Jan 12 00:20:36 2001 +++ squid/src/fs/ufs/store_io_ufs.c Sun Apr 29 03:50:11 2001 @@ -1,264 +1,173 @@ - -/* - * $Id$ - * - * DEBUG: section 79 Storage Manager UFS Interface - * AUTHOR: Duane Wessels - * - * SQUID Web Proxy Cache http://www.squid-cache.org/ - * ---------------------------------------------------------- - * - * Squid is the result of efforts by numerous individuals from - * the Internet community; see the CONTRIBUTORS file for full - * details. Many organizations have provided support for Squid's - * development; see the SPONSORS file for full details. Squid is - * Copyrighted (C) 2001 by the Regents of the University of - * California; see the COPYRIGHT file for full details. Squid - * incorporates software developed and/or copyrighted by other - * sources; see the CREDITS file for full details. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. - * - */ - -#include "squid.h" -#include "store_ufs.h" - - -static DRCB storeUfsReadDone; -static DWCB storeUfsWriteDone; -static void storeUfsIOCallback(storeIOState * sio, int errflag); -static CBDUNL storeUfsIOFreeEntry; - -/* === PUBLIC =========================================================== */ - +/* This is the guts of the open and create store* functions - the only + * significant differences between the two are mode and flags, and what + * happens at the other end of the operation. */ storeIOState * -storeUfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, - STIOCB * callback, void *callback_data) +_storeUfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data, int flags, int filen) { - sfileno f = e->swap_filen; - char *path = storeUfsDirFullPath(SD, f, NULL); storeIOState *sio; - struct stat sb; - int fd; - debug(79, 3) ("storeUfsOpen: fileno %08X\n", f); - fd = file_open(path, O_RDONLY | O_BINARY); - if (fd < 0) { - debug(79, 3) ("storeUfsOpen: got failure (%d)\n", errno); - return NULL; - } - debug(79, 3) ("storeUfsOpen: opened FD %d\n", fd); + int fd; + int mode = 0644; + void *requestp; + char *path; + fsinfo_t *fsinfo; + fsstate_t *fsstate; + + /* We need to check for "too many files open" here, on a per-fs basis */ + /* We should probably also handle failures to get a new filen better */ + + /* filen and flags arrive as arguments from the callers - path + * generation should maybe happen in the fs's open? 
*/ + path = fsDirFullPath(SD, filen, NULL); + fd = -1; + + /* Setup the state object to return */ sio = CBDATA_ALLOC(storeIOState, storeUfsIOFreeEntry); - sio->fsstate = memPoolAlloc(ufs_state_pool); + sio->fsstate = memPoolAlloc(fs_state_pool); + + /* Convenience pointers for the fs state and other data */ + fsstate = ((fsstate_t *)(sio->fsstate)); + fsinfo = ((fsinfo_t *)(SD->fsdata)); - sio->swap_filen = f; + /* fill in the sio for this request in particular */ + sio->swap_filen = filen; sio->swap_dirn = SD->index; - sio->mode = O_RDONLY; + /* mode and flags are confused here */ + sio->mode = flags; sio->callback = callback; sio->callback_data = callback_data; cbdataLock(callback_data); sio->e = e; - ((ufsstate_t *) (sio->fsstate))->fd = fd; - ((ufsstate_t *) (sio->fsstate))->flags.writing = 0; - ((ufsstate_t *) (sio->fsstate))->flags.reading = 0; - ((ufsstate_t *) (sio->fsstate))->flags.close_request = 0; - if (fstat(fd, &sb) == 0) - sio->st_size = sb.st_size; - store_open_disk_fd++; + /* This needs to be moved somewhere io-specific, maybe */ + fsstate->flags.writing = 0; + fsstate->flags.reading = 0; + fsstate->flags.close_request = 0; + fsstate->flags.opening = 1; + + debug(79, 3) ("_storeUfsOpen: file %08X\n", filen); + requestp = fsinfo->buildOpenRequest(path, flags, mode, + callback, callback_data); + fsinfo->submitOpen(requestp); - /* We should update the heap/dlink position here ! 
*/ return sio; } storeIOState * -storeUfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, STIOCB * callback, void *callback_data) +storeUfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data) { - storeIOState *sio; - int fd; - int mode = (O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); - char *path; - ufsinfo_t *ufsinfo = (ufsinfo_t *) SD->fsdata; - sfileno filn; - sdirno dirn; - - /* Allocate a number */ - dirn = SD->index; - filn = storeUfsDirMapBitAllocate(SD); - ufsinfo->suggest = filn + 1; - /* Shouldn't we handle a 'bitmap full' error here? */ - path = storeUfsDirFullPath(SD, filn, NULL); - - debug(79, 3) ("storeUfsCreate: fileno %08X\n", filn); - fd = file_open(path, mode); - if (fd < 0) { - debug(79, 3) ("storeUfsCreate: got failure (%d)\n", errno); - return NULL; - } - debug(79, 3) ("storeUfsCreate: opened FD %d\n", fd); - sio = CBDATA_ALLOC(storeIOState, storeUfsIOFreeEntry); - sio->fsstate = memPoolAlloc(ufs_state_pool); + int flags = (O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + int filen = fsNewFileNum(SD); - sio->swap_filen = filn; - sio->swap_dirn = dirn; - sio->mode = mode; - sio->callback = callback; - sio->callback_data = callback_data; - cbdataLock(callback_data); - sio->e = (StoreEntry *) e; - ((ufsstate_t *) (sio->fsstate))->fd = fd; - ((ufsstate_t *) (sio->fsstate))->flags.writing = 0; - ((ufsstate_t *) (sio->fsstate))->flags.reading = 0; - ((ufsstate_t *) (sio->fsstate))->flags.close_request = 0; - store_open_disk_fd++; + return _storeUfsOpen(SD,e,file_callback,callback,callback_data,flags,filen); +} - /* now insert into the replacement policy */ - storeUfsDirReplAdd(SD, e); - return sio; +storeIOState * +storeUfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data) +{ + int flags = (O_RDONLY | O_BINARY); + int filen = e->swap_filen; + + return _storeUfsOpen(SD,e,file_callback,callback,callback_data,flags,filen); } void storeUfsClose(SwapDir 
* SD, storeIOState * sio) { - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; + fsstate_t *fsstate = (fsstate_t *) sio->fsstate; + fsinfo_t *fsinfo = (fsinfo_t *) SD->fsdata; + void *requestp; debug(79, 3) ("storeUfsClose: dirno %d, fileno %08X, FD %d\n", - sio->swap_dirn, sio->swap_filen, ufsstate->fd); - if (ufsstate->flags.reading || ufsstate->flags.writing) { - ufsstate->flags.close_request = 1; - return; + sio->swap_dirn, sio->swap_filen, fsstate->fd); + if (ufsSomethingPending(sio)) { + /* The IO callback routines will close a file if close_request is set + * - this is kinda useful here. */ + fsstate->flags.close_request = 1; + return; } - storeUfsIOCallback(sio, 0); + requestp = fsinfo->buildCloseRequest(fsstate->fd); + fsinfo->submitClose(requestp); } +/* Not done past here - Darius */ + void -storeUfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, STRCB * callback, void *callback_data) +storeUfsUnlink(SwapDir * SD, StoreEntry * e) { - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; + debug(78, 3) ("storeUfsUnlink: dirno %d, fileno %08X\n", SD->index, + e->swap_filen); + storeUfsDirReplRemove(e); + storeUfsDirMapBitReset(SD, e->swap_filen); + storeUfsDirUnlinkFile(SD, e->swap_filen); +} +void +storeUfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, + off_t offset, STRCB * callback, void *callback_data) +{ + /* This stuff needs a lot of sorting out, it's here that the + * two-level callback thing starts to bite */ + fsstate_t *fsstate = (fsstate_t *) sio->fsstate; + fsinfo_t *fsinfo = (fsinfo_t *) SD->fsdata; + void *requestp; assert(sio->read.callback == NULL); assert(sio->read.callback_data == NULL); + assert(!fsstate->flags.reading); sio->read.callback = callback; sio->read.callback_data = callback_data; + fsstate->read_buf = buf; cbdataLock(callback_data); - debug(79, 3) ("storeUfsRead: dirno %d, fileno %08X, FD %d\n", - sio->swap_dirn, sio->swap_filen, ufsstate->fd); + debug(78, 3) ("storeUfsRead: dirno %d, fileno %08X, FD %d\n", + sio->swap_dirn, sio->swap_filen, fsstate->fd); sio->offset = offset; - 
ufsstate->flags.reading = 1; - file_read(ufsstate->fd, - buf, - size, - offset, - storeUfsReadDone, - sio); + fsstate->flags.reading = 1; + requestp = fsinfo->buildReadRequest(fsstate->fd,offset,buf,size); + fsinfo->submitRead(requestp); } void -storeUfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, FREE * free_func) -{ - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - debug(79, 3) ("storeUfsWrite: dirn %d, fileno %08X, FD %d\n", sio->swap_dirn, sio->swap_filen, ufsstate->fd); - ufsstate->flags.writing = 1; - file_write(ufsstate->fd, - offset, - buf, - size, - storeUfsWriteDone, - sio, - free_func); -} - -void -storeUfsUnlink(SwapDir * SD, StoreEntry * e) -{ - debug(79, 3) ("storeUfsUnlink: fileno %08X\n", e->swap_filen); - storeUfsDirReplRemove(e); - storeUfsDirMapBitReset(SD, e->swap_filen); - storeUfsDirUnlinkFile(SD, e->swap_filen); -} - -/* === STATIC =========================================================== */ - -static void -storeUfsReadDone(int fd, const char *buf, int len, int errflag, void *my_data) -{ - storeIOState *sio = my_data; - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - STRCB *callback = sio->read.callback; - void *their_data = sio->read.callback_data; - ssize_t rlen; - - debug(79, 3) ("storeUfsReadDone: dirno %d, fileno %08X, FD %d, len %d\n", - sio->swap_dirn, sio->swap_filen, fd, len); - ufsstate->flags.reading = 0; - if (errflag) { - debug(79, 3) ("storeUfsReadDone: got failure (%d)\n", errflag); - rlen = -1; - } else { - rlen = (ssize_t) len; - sio->offset += len; - } - assert(callback); - assert(their_data); - sio->read.callback = NULL; - sio->read.callback_data = NULL; - if (cbdataValid(their_data)) - callback(their_data, buf, (size_t) rlen); - cbdataUnlock(their_data); -} - -static void -storeUfsWriteDone(int fd, int errflag, size_t len, void *my_data) -{ - storeIOState *sio = my_data; - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - debug(79, 3) ("storeUfsWriteDone: dirno %d, fileno 
%08X, FD %d, len %d\n", - sio->swap_dirn, sio->swap_filen, fd, len); - ufsstate->flags.writing = 0; - if (errflag) { - debug(79, 0) ("storeUfsWriteDone: got failure (%d)\n", errflag); - storeUfsIOCallback(sio, errflag); - return; - } - sio->offset += len; - if (ufsstate->flags.close_request) - storeUfsIOCallback(sio, errflag); -} - -static void -storeUfsIOCallback(storeIOState * sio, int errflag) -{ - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - debug(79, 3) ("storeUfsIOCallback: errflag=%d\n", errflag); - if (ufsstate->fd > -1) { - file_close(ufsstate->fd); - store_open_disk_fd--; - } - if (cbdataValid(sio->callback_data)) - sio->callback(sio->callback_data, errflag, sio); - cbdataUnlock(sio->callback_data); - sio->callback_data = NULL; - sio->callback = NULL; - cbdataFree(sio); -} - - -/* - * Clean up any references from the SIO before it get's released. - */ -static void -storeUfsIOFreeEntry(void *sio) +storeUfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t size, + off_t offset, FREE * free_func) { - memPoolFree(ufs_state_pool, ((storeIOState *) sio)->fsstate); + aiostate_t *aiostate = (aiostate_t *) sio->fsstate; + debug(78, 3) ("storeAufsWrite: dirno %d, fileno %08X, FD %d\n", + sio->swap_dirn, sio->swap_filen, aiostate->fd); + if (aiostate->fd < 0) { + /* disk file not opened yet */ + struct _queued_write *q; + assert(aiostate->flags.opening); + q = memPoolAlloc(aio_qwrite_pool); + q->buf = buf; + q->size = size; + q->offset = offset; + q->free_func = free_func; + linklistPush(&(aiostate->pending_writes), q); + return; + } +#if ASYNC_WRITE + if (aiostate->flags.writing) { + struct _queued_write *q; + debug(78, 3) ("storeAufsWrite: queuing write\n"); + q = memPoolAlloc(aio_qwrite_pool); + q->buf = buf; + q->size = size; + q->offset = offset; + q->free_func = free_func; + linklistPush(&(aiostate->pending_writes), q); + return; + } + aiostate->flags.writing = 1; + /* + * XXX it might be nice if aioWrite() gave is immediate + * feedback here 
about EWOULDBLOCK instead of in the + * callback function + * XXX Should never give EWOULDBLOCK under normal operations + * if it does then the MAGIC1/2 tuning is wrong. + */ + aioWrite(aiostate->fd, offset, buf, size, storeAufsWriteDone, sio, + free_func); +#else + file_write(aiostate->fd, offset, buf, size, storeAufsWriteDone, sio, + free_func); +#endif } Index: squid/src/fs/ufs/store_ufs.h diff -u squid/src/fs/ufs/store_ufs.h:1.2 squid/src/fs/ufs/store_ufs.h:removed --- squid/src/fs/ufs/store_ufs.h:1.2 Sat Oct 21 09:44:46 2000 +++ squid/src/fs/ufs/store_ufs.h Tue Sep 28 18:35:35 2004 @@ -1,50 +0,0 @@ -/* - * store_ufs.h - * - * Internal declarations for the ufs routines - */ - -#ifndef __STORE_UFS_H__ -#define __STORE_UFS_H__ - -struct _ufsinfo_t { - int swaplog_fd; - int l1; - int l2; - fileMap *map; - int suggest; -}; - -struct _ufsstate_t { - int fd; - struct { - unsigned int close_request:1; - unsigned int reading:1; - unsigned int writing:1; - } flags; -}; - -typedef struct _ufsinfo_t ufsinfo_t; -typedef struct _ufsstate_t ufsstate_t; - -/* The ufs_state memory pool */ -extern MemPool *ufs_state_pool; - -extern void storeUfsDirMapBitReset(SwapDir *, sfileno); -extern int storeUfsDirMapBitAllocate(SwapDir *); -extern char *storeUfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath); -extern void storeUfsDirUnlinkFile(SwapDir *, sfileno); -extern void storeUfsDirReplAdd(SwapDir * SD, StoreEntry *); -extern void storeUfsDirReplRemove(StoreEntry *); - -/* - * Store IO stuff - */ -extern STOBJCREATE storeUfsCreate; -extern STOBJOPEN storeUfsOpen; -extern STOBJCLOSE storeUfsClose; -extern STOBJREAD storeUfsRead; -extern STOBJWRITE storeUfsWrite; -extern STOBJUNLINK storeUfsUnlink; - -#endif
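storeUfsWrite above defers writes that arrive before the async open completes: while `fd < 0`, each (buf, size, offset) tuple is queued on `pending_writes`, and the open-completion path is expected to replay the queue in order. A standalone sketch of that defer-and-flush pattern follows; everything here is a simplified stand-in for Squid's `_queued_write`/`linklistPush` machinery, not its actual API:

```c
#include <stdlib.h>

/* Stand-in for Squid's _queued_write: one deferred write request. */
struct queued_write {
    struct queued_write *next;
    const char *buf;
    size_t size;
    long offset;
};

/* Stand-in for the sio/fsstate pair: fd < 0 means the open is in flight. */
struct wstate {
    int fd;
    struct queued_write *pending_writes;
    struct queued_write **tailp;    /* keeps the queue FIFO */
};

static void
wstate_init(struct wstate *s)
{
    s->fd = -1;                     /* open not yet complete */
    s->pending_writes = NULL;
    s->tailp = &s->pending_writes;
}

/* Queue the write while fd < 0, as storeUfsWrite does when the file
 * isn't open yet; return 1 if the write was deferred, 0 if it could
 * go straight to the file (a real version would issue it here). */
static int
submit_write(struct wstate *s, const char *buf, size_t size, long offset)
{
    struct queued_write *q;
    if (s->fd >= 0)
        return 0;                   /* would call file_write()/aioWrite() */
    q = malloc(sizeof *q);
    q->next = NULL;
    q->buf = buf;
    q->size = size;
    q->offset = offset;
    *s->tailp = q;                  /* append, preserving arrival order */
    s->tailp = &q->next;
    return 1;
}

/* Called from the open-completion path: record the new fd and replay
 * the queued writes in order.  Returns the total bytes replayed. */
static size_t
flush_pending(struct wstate *s, int fd)
{
    size_t flushed = 0;
    struct queued_write *q;
    s->fd = fd;
    while ((q = s->pending_writes) != NULL) {
        s->pending_writes = q->next;
        flushed += q->size;         /* a real flush would issue the write */
        free(q);
    }
    s->tailp = &s->pending_writes;
    return flushed;
}
```

The FIFO tail pointer matters: replaying deferred writes out of order would corrupt the object, since each queued entry carries the offset it was issued against.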