This patch is generated from the sfs branch of HEAD in squid
Wed Sep 29 01:30:51 2004 GMT
See http://devel.squid-cache.org/

Index: squid/doc/debug-sections.txt
diff -u squid/doc/debug-sections.txt:1.3 squid/doc/debug-sections.txt:1.3.6.1
--- squid/doc/debug-sections.txt:1.3 Sun Jan 7 15:53:36 2001
+++ squid/doc/debug-sections.txt Thu Jan 25 07:15:19 2001
@@ -86,3 +86,4 @@
 section 79 HTTP Meter Header
 section 80 WCCP
 section 81 Store Removal/Replacement policy
+section 82 SFS
Index: squid/src/mime.c
diff -u squid/src/mime.c:1.8 squid/src/mime.c:1.8.4.1
--- squid/src/mime.c:1.8 Fri Jan 12 00:20:33 2001
+++ squid/src/mime.c Tue Feb 6 07:43:36 2001
@@ -449,4 +449,5 @@
     debug(25, 3) ("Loaded icon %s\n", url);
     storeUnlockObject(e);
     memFree(buf, MEM_4K_BUF);
+    storeDirSync();	/* to handle flushing IO and calling completions */
 }
Index: squid/src/fs/sfs/CHANGELOG
diff -u /dev/null squid/src/fs/sfs/CHANGELOG:1.1.2.1
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/CHANGELOG Wed Jan 24 06:11:54 2001
@@ -0,0 +1,52 @@
+Changelog for sfs.
+
+---
+sfs-0.2 - 19990202
+---
+
+Altered types to use uintxx_t types - should be more portable, hopefully.
+I've a small concern about some of them still, especially sscanf places, but
+I'll sort that out eventually. Also cleared last of purify-related errors,
+and all bar one compiler warning - Solaris 'cc' warns about incorrect
+type when passing (void *)function into one of the pthread functions.
+
+Using uintxx_t also means I _know_ how big each data structure/variable is.
+This is becoming increasingly important ;)
+
+---
+sfs-0.1 - 19990201
+---
+
+First 'versioned' file - this contains a cleaned up Makefile and some cleaned
+up dependencies (thanks to Oskar Pearson). List of changes from his patch:
+
+o A real makefile
+o Makefile includes linux and solaris sections. Defines _REENTRANT
+  for Linux. You have to manually specify this.
+  Currently linux just core-dumps with me immediately. I will try
+  and track this down.
+o Compiles almost without warnings with 'gcc -Wall'. There are a
+  couple of things that I was not up to fixing immediately: noted them
+  with 'XXX'. I removed some unused variables (and commented out
+  others): if they were 'for future use', sorry.
+o Now have $Id$ entries in every file, so that changes can be
+  tracked.
+o Fixed recursive includes of header files
+o sfs_seek code doesn't match documentation. Kludged its variable
+  list in the meantime
+o Replaced broken squid_curtime definition. It was the same as saying
+  'exit + 1;'
+o Included headers to get rid of silly warnings
+
+In addition, I've cleaned up the above-mentioned core dump, and the -Wall
+warnings. Have also run through purify on Solaris - cleared most bugs. This
+release is mainly to get the cleaning stuff back out there, so I can start
+on making the interface match throughout.
+
+---
+First version:
+---
+
+This is a fairly scrappy version - I haven't even changed the attribution
+headers on the files ;) It was compiling and running when I put it on
+the web site, at least, but is very untested, and extremely alpha.
Index: squid/src/fs/sfs/DESIGN
diff -u /dev/null squid/src/fs/sfs/DESIGN:1.1.2.2
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/DESIGN Sat Feb 3 17:16:04 2001
@@ -0,0 +1,245 @@
+$Id$
+
+		SQUIDFS README
+
+This file outlines the design of the SquidFS filesystem:
+
+Analysis of a running Squid internet object cache has found the following
+tidbits of information:
+
+Quantile   Object Size
+25%      < 1677 bytes
+50%      < 3710 bytes
+75%      < 9304 bytes
+90%      < 21360 bytes
+
+It is proposed that we use 4K fragments and 8K chunks.
+
+At the beginning of each file will be stored that file's inode. This inode
+consists of the following information:
+
+4 byte file length
+63 x 4 byte block pointers - each pointer indexes an 8K chunk or 4K fragment
+64 x 4 byte indirect block pointers - each pointer indexes an 8K chunk that
+itself contains a list of 4 byte pointers to fragments/chunks.
The remainder of the inode fragment/chunk will hold the first portion of
+data for that file.
+
+A file will consist of 0 or more 8K file chunks and 0 or 1 4K fragment.
+
+An inode number is actually the index of the 4K fragment that contains the
+inode on the disk. In this way indirect references to inodes are removed.
+
+*** Filesystem Bitmaps ***
+
+The filesystem will have 3 bitmaps that index all the fragments on a
+filesystem.
+
+FBM - (on disk) Indexes fragments with valid data in them.
+IBM - (on disk) Indexes inode fragments
+MHB - (memory) Indexes blocks that are allocated but not necessarily written
+to disk.
+
+States of the on-disk bitmaps are:
+
+FBM IBM
+ 0   0   Fragment free
+ 1   0   Fragment contains valid file data
+ 0   1   Fragment contains complete inode data that references blocks
+         that may not be allocated.
+ 1   1   Fragment contains completed inode data
+
+*** file writes, closes and bitmap updates ***
+
+The first block (inode) is never written until a close is issued.
+
+Whenever a subsequent block is written MHB is set - the block is flushed
+on demand.
+
+Whenever a file close occurs the current block and the completed inode block
+are flushed to disk immediately and IBM and MHB are set for the inode block.
+When the IBM flush is complete FBM is set for all non-inode blocks.
+When the FBM flush is complete FBM is then set for the inode block and flushed
+to disk. Once the inode FBM flush is complete the file is valid.
+
+Bitmap flushes should be scheduled events and should not occur on demand.
+A file that is waiting for a bitmap flush to occur should register itself
+to be called back when the flushes complete so that it may move on to
+the next stage of bitmap updates. It is suggested that dirty bitmap pages
+be flushed to disk every 10 seconds. In this instance it may take up to
+30 seconds before a file close results in a completely valid file on disk.
+In reality the file will be recoverable from on-disk data from 0-10 seconds
+after the file close was issued however.
+
+*** Filesystem rebuilds ***
+
+The filesystem may be left in an inconsistent state in the event of a
+power failure or system crash. In this event the following algorithms
+are used to return the filesystem to a consistent state:
+
+FBM IBM
+ 0   0   Fragment is free - ignore
+ 1   0   Fragment contains valid data - ignore
+ 0   1   Fragment is an inode that references data blocks that have not
+         completed their bitmap. Scan through the inode and set all blocks
+         referenced to be valid data blocks.
+ 1   1   Fragment is a valid inode block.
+
+*** Inode notes ***
+
+When a file is being written to, the 8K inode block is held in memory,
+unallocated from disk, until a file close is issued. If the inode + file
+data is < 4K, the 4K fragment is allocated from the free bitmap, preferably
+as the last 4K of an 8K chunk. If the inode + file data is > 8K, the inode
+4K portion must be allocated on a free 8K chunk boundary and the data 4K
+portion will be the last 4K of the 8K chunk. This 4K fragment is implied and
+is not referenced by the direct block pointers in the inode.
+
+*** Performance Analysis ***
+
+By running a set of squid object sizes for cache misses and cache hits through
+the above filesystem in a simulated environment we find the following:
+
+Internal block fragmentation: 17%
+
+File writes:
+
+    Disk    Objects
+Accesses       %
+    1         70%
+    2         15%
+    3          6%
+    4          3%
+    5+         6%
+Average write accesses: 2.2
+
+Bitmap update accesses per avg.
object - assuming 10 secs between bitmap flushes:
+(worst case): 2 * inode updates + 1.2 * block updates = 3.2
+(average - 20% chance of disk bitmap locality @ 50 objects/sec): 0.05
+(best @ 50 objects/sec): 0.003
+
+File reads:
+
+    Disk    Objects
+Accesses       %
+    1         86%
+    2          8%
+    3          2%
+    4          1%
+    5+         2%
+Average read accesses: 1.5
+Bitmap update accesses: 0
+
+File unlinks:
+
+1.05 disk accesses on average to retrieve block pointer information + worst
+case 2 bitmap update accesses assuming no locality in a 10 second window
+
+*** Notes ***
+
+Average bitmap update accesses is hard to measure but assumes we are usually
+writing to blocks that are close together for many files, so bitmap updates
+will get clustered together a fair portion of the time.
+
+*** Filesystem IO ***
+
+There is one thread per mounted filesystem.
+Blocks that are queued to be written are placed onto a thread's service
+queue. Each thread inserts blocks into a doubly linked list ordered by
+the location of each block on the disk. The thread scans backwards and
+forwards along the list writing out blocks and removing them from the
+write queue. The blocks may still remain "owned" by an open file and
+the data within them may be modified at any time. The thread just writes
+out what it sees. This MAY cause inconsistencies, but the theory is that
+when the last write of that block is queued, the data will be consistent
+then anyway. Since we're dealing with scenarios where this is acceptable,
+this is not an issue. The requesting write will return straight away
+and the write will continue in the background. If the O_SYNC or O_WSYNC flag
+is set the requestor will wait until the write request is finished. If the
+O_NONBLOCK flag is also set along with O_SYNC or O_WSYNC the requestor keeps
+a record of blocks it had queued for writing and returns EWOULDBLOCK.
On a subsequent write attempt by the parent process the requestor checks to
+see if the last issued write for that block has finished and simply returns
+the result of the write.
+
+Blocks that are to be read are placed onto the service queue for a filesystem's
+service thread. The requestor then has the option of either waiting for
+the IO to complete or coming back to it later. If the O_NONBLOCK flag
+is set the read returns immediately and the requestor collects the result
+on a later attempt, as for writes.
+
+Open file closes are also placed onto a separate queue for the thread.
+The filesystem thread is responsible for setting the bitmaps and for flushing
+out dirty pages of the bitmap as required. Bitmap pages are flushed out
+only every N seconds, where N defaults to 15 but is user-modifiable
+to any value. Every N seconds any dirty bitmap blocks are placed onto
+the write queue and are flushed out like normal data.
+
+A filesystem IO thread's service loop looks like:
+
+lastbmflushtime = 0
+loop:
+    now = time()
+    if(service queue is not empty)
+        acquire mutex for service queue
+        grab head of service queue and set service queue head pointer to NULL
+        release mutex
+        for each item on service queue
+            if item is marked as O_SYNC or O_WSYNC
+                flush out to disk immediately and report back
+            else
+                insert into disk queue ordered by disk location
+
+    if(file close queue is not empty)
+        acquire mutex for file close queue
+        grab head of file close queue and set fcq head pointer to NULL
+        release fcq mutex
+        for each item in file close queue
+            add item to pending file close queue
+
+    if(now >= lastbmflushtime + bmflushinterval)
+        lastbmflushtime = now
+        for each dirty bitmap block
+            issue immediate write of each dirty bitmap block
+        for each item in pending file close queue
+            advance state and modify necessary bitmap blocks
+            if(state == done)
+                free pending file close
+
+    if(write queue is empty)
+        sleep(min((lastbmflushtime+bmflushinterval)-now, blockflushinterval))
+        goto loop
+
+    write out block pointed to by file queue scan pointer (fqsp)
+    tmp = fqsp
+    /* Below is a simple SCAN algorithm; adding VSCAN capability */
+    /* is easy to do and should be done in the final implementation */
+    if(direction == forward)
+        if fqsp->next == NULL
+            direction = backward
+            fqsp = fqsp->prev
+        else
+            fqsp = fqsp->next
+    else
+        if fqsp->prev == NULL
+            direction = forward
+            fqsp = fqsp->next
+        else
+            fqsp = fqsp->prev
+    free tmp
+    goto loop
+
+-----
+
+Grit:
+
+    This is a record of the typing decisions, as they're made.
+
+sfsfd - file descriptor. This will be a uint32_t; the top 8 bits will be
+    the sfsid, the bottom 24 bits will be the 'thread-specific identifier'.
+    Essentially, the first byte tells us which drive, the other three
+    which file descriptor on that drive.
+
+sfsid - drive id. This will be one byte. Note, that limits us to 255 drives
+    - this is not a hard limit to raise.
+
+sfsinode - inode. Position of the first inode for this file on disk. These
+    will be uint32_t - that gives us a fair whack of space to play with.
Index: squid/src/fs/sfs/Makefile.in diff -u /dev/null squid/src/fs/sfs/Makefile.in:1.1.2.8 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/Makefile.in Wed Feb 7 01:01:15 2001 @@ -0,0 +1,70 @@ +# +# Makefile for the sfs storage driver for the Squid Object Cache server +# +# $Id$ +# + +FS = sfs + +top_srcdir = @top_srcdir@ +VPATH = @srcdir@ + +CC = @CC@ +MAKEDEPEND = @MAKEDEPEND@ +AR_R = @AR_R@ +RANLIB = @RANLIB@ +AC_CFLAGS = @CFLAGS@ +SHELL = /bin/sh +LDFLAGS = @LDFLAGS@ + +INCLUDE = -I../../../include -I$(top_srcdir)/include -I$(top_srcdir)/src/ +CFLAGS = $(AC_CFLAGS) $(INCLUDE) $(DEFINES) + +OUT = ../$(FS).a + +SFSOBJS = \ + sfs_fslo.o \ + sfs_interface.o \ + sfs_llo.o \ + sfs_splay.o \ + sfs_util.o \ + +OBJS = $(SFSOBJS) \ + store_dir_sfs.o \ + store_io_sfs.o + +all: $(OUT) + +test: $(SFSOBJS) sfs_test.o + $(CC) $(CFLAGS) $(LDFLAGS) -DSFS_TEST -L../../../lib $(SFSOBJS) sfs_test.o sfs_shim.c ../../globals.o -lmiscutil -lm -o sfs_test + +read: $(SFSOBJS) sfs_read.o + $(CC) $(CFLAGS) $(LDFLAGS) -DSFS_TEST -L../../../lib $(SFSOBJS) sfs_read.o sfs_shim.c ../../globals.o -lmiscutil -lm -o sfs_read + +$(OUT): $(OBJS) + @rm -f ../stamp + $(AR_R) $(OUT) $(OBJS) + $(RANLIB) $(OUT) + +$(OBJS): $(top_srcdir)/include/version.h ../../../include/autoconf.h +$(OBJS): store_sfs.h + +.c.o: + @rm -f ../stamp + $(CC) $(CFLAGS) -c $< + +clean: + -rm -rf *.o *pure_* core ../$(FS).a + +distclean: clean + -rm -f Makefile + -rm -f Makefile.bak + -rm -f tags + +install: + +tags: + ctags *.[ch] $(top_srcdir)/src/*.[ch] $(top_srcdir)/include/*.h $(top_srcdir)/lib/*.[ch] + +depend: + $(MAKEDEPEND) $(INCLUDE) -fMakefile *.c Index: squid/src/fs/sfs/sfs.h diff -u /dev/null squid/src/fs/sfs/sfs.h:1.1.2.2 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs.h Sat Feb 3 17:16:04 2001 @@ -0,0 +1,41 @@ +/* $Id$ */ + +/* NOT USED */ + +#ifndef SFS_H +#define SFS_H + +typedef struct sfs_statistic_t { + uint numreads; +} sfs_statistic_t; + +typedef struct sfs_stat_t { + uint 
sfs_ino;
+    uint sfs_numblocks;
+    uint sfs_len;
+} sfs_stat_t;
+
+/* Library level user-callable functions */
+
+int sfs_setoptions(int op, int opdata);
+
+/* Filesystem level user-callable functions */
+
+int sfs_format(const char *path, ulong);
+int sfs_mount(const char *path);
+int sfs_fsck(int sfsid, int fscktype);
+int sfs_unmount(int sfsid);
+
+/* File level user-callable functions */
+
+int sfs_open(const char *path, int oflag, mode_t mode);
+int sfs_close(int fd);
+int sfs_sync(int fd);
+int sfs_read(int fd, void *buf, int buflen);
+int sfs_write(int fd, void *buf, int buflen);
+int sfs_seek(int fd, int offset, int whence);
+int sfs_unlink(int sfsid, uint sfsinode);
+int sfs_truncate(int sfsid, uint sfsinode, int newlen);
+int sfs_stat(int sfsid, uint sfsinode, sfs_stat_t *statbuf);
+
+#endif /* !SFS_H */
Index: squid/src/fs/sfs/sfs_defines.h
diff -u /dev/null squid/src/fs/sfs/sfs_defines.h:1.1.2.10
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/sfs_defines.h Sat Feb 3 17:16:04 2001
@@ -0,0 +1,198 @@
+/* $Id$ */
+
+#ifndef SFS_DEFINES_H
+#define SFS_DEFINES_H
+
+#include <sys/types.h>
+#include <pthread.h>
+
+/* Possibly bogus defines? */
+
+#ifndef uint8_t
+#define uint8_t unsigned char
+#endif
+
+#ifndef uint32_t
+#define uint32_t unsigned int
+#endif
+
+#ifndef uint64_t
+#define uint64_t unsigned long long
+#endif
+
+#define sfsfd_t uint32_t
+#define sfsid_t int8_t
+#define sfsblock_t uint32_t
+
+/* Code assumes CHUNKSIZE is twice FRAGSIZE. If it isn't, things will break */
+/* very badly. */
+
+#define FRAGSIZE 4096
+#define CHUNKSIZE 8192
+#define MINFSFRAGS 1024		/* Minimum acceptable number of FS frags */
+#define MAXFILESYS 127		/* Maximum number of mounted filesystems */
+
+#define NUMDIP 62
+#define NUMSIN 64
+
+#define BITINBYTE 8
+
+/* Magic!
*/ +#define SFS_MAGIC 0xdeadf00d + +/* The below defines assume there are 8 bits in a byte */ + +#define TSTBIT(a, b) (((a[b>>3]) << (b & 0x7)) & 0x80) +#define SETBIT(a, b) ((a[b>>3]) |= (0x80 >> (b & 0x7))) +#define CLRBIT(a, b) ((a[b>>3]) &= (~(0x80 >> (b & 0x7)))) + +enum sfs_request_type { + _SFS_OP_NONE = 0, + _SFS_OP_READ, + _SFS_OP_WRITE, + _SFS_OP_OPEN_READ, + _SFS_OP_OPEN_WRITE, + _SFS_OP_CLOSE, + _SFS_OP_UNLINK, + _SFS_OP_SYNC, + _SFS_OP_UMOUNT, + _SFS_OP_SEEK, +}; + +enum sfs_request_state { + _SFS_PENDING = 0, + _SFS_IN_PROGRESS, + _SFS_DONE +}; + +enum sfs_block_type { + _SFS_UNKNOWN = 0, + _SFS_DATA, + _SFS_INODE +}; + +enum sfs_io_type { + _SFS_IO_SYNC = 0, + _SFS_IO_ASYNC +}; + +typedef struct sfs_requestor { + dlink_node node; /* List position */ + void *dataptr; /* Used by higher levels .. */ + enum sfs_request_type request_type; + enum sfs_request_state request_state; + enum sfs_io_type io_type; /* sync or async */ + pthread_cond_t done_signal; + pthread_mutex_t done_signal_lock; + sfsid_t sfsid; + sfsfd_t sfsfd; + sfsblock_t sfsinode; + ssize_t offset; /* The block inside the file in question (0..x) */ + ssize_t buflen; /* The length of the buffer, if pre-allocated, or the read */ + void *buf; + int ret; +} sfs_requestor; + +/* This corresponds to the structure as it's stored on disk. */ +typedef struct sfs_inode_t { + sfsblock_t len; + sfsblock_t dip[NUMDIP]; /* Direct block pointers */ + sfsblock_t sin[NUMSIN]; /* Single Indirect Pointers */ +} sfs_inode_t; + +/* This is the structure stored mid-fs */ +/* I have some doubts as to the correctness of the dealing with this structure +throughout the code - will check up on it. 
*/ +typedef struct sfs_rootblock_t { + sfsblock_t numfrags; + sfsblock_t ibmpos; + sfsblock_t fbmpos; + uint32_t bmlen; + uint32_t magic; +} sfs_rootblock_t; + +/* These structures exist as members of linked lists hanging off a */ +/* sfs_openfile_t structure, except in the case of the inode block */ +/* and double indirect block of a file which are pointed to directly */ +/* since only one of these types of blocks can exist per file. */ +/* The buf points to a structure in either clean or dirty splay tree */ +typedef struct sfs_openblock_list { + struct sfs_blockbuf_t *buf; + struct sfs_openblock_list *next; + struct sfs_openblock_list *prev; +} sfs_openblock_list; + +/* Will be a chained hash table of open file descriptions */ +/* These can also be referenced by the background file flush daemon */ +/* that is in control of correctly flushing out all pending data and */ +/* bitmap updates when the file is closed or synced */ +typedef struct sfs_openfile_t { + sfsid_t sfsid; + sfsblock_t sfsinode; + sfsfd_t sfsfd; /* Fake fd for reference - unique per open file*/ + uint64_t pos; /* Position in the file, for partial reads */ + int rd_refcount; + int wr_refcount; + int flushonclose; + struct sfs_inode_t *inode; /* The block pointed to by inodebuf_p->buf */ + struct sfs_blockbuf_t *inodebuf_p; /* Pointer to blockbuf_t of inode */ + struct sfs_openblock_list *rwbuf_list_p; /* List of RW open blocks */ + struct sfs_openblock_list *sibuf_list_p; /* List of single indirect open blocks*/ + struct sfs_blockbuf_t *dibuf_p; /* Double indirect open block */ + struct sfs_openfile_t *prev; + struct sfs_openfile_t *next; +} sfs_openfile_t; + +/* This structure references blocks held in the buffer cache */ +/* There will be an array indexed by sfsid that points to two ordered */ +/* splay trees of blocks for that sfsid, one of dirty pages, and one */ +/* of clean pages. 
Periodically the dirty pages are flushed to disk */ +/* I'm going to add prev and next, and keep these as a list in time */ +/* order, also. */ +typedef struct sfs_blockbuf_t { + struct sfs_blockbuf_t *left; /* Splay left & right pointers */ + struct sfs_blockbuf_t *right; + struct sfs_blockbuf_t *prev; + struct sfs_blockbuf_t *next; + sfsid_t sfsid; + sfsblock_t sfsinode; + sfsblock_t diskpos; /* Position of this CHUNK on disk */ + int refcount; /* How many people holding this page */ + uint8_t dirty; /* Is page dirty */ + uint8_t type; /* Inode or data - required for updating bitmaps */ + int buflen; /* Length of buffer (*buf) (FRAGSIZE max) */ + char *buf; /* Pointer to page data */ +} sfs_blockbuf_t; + +/* This structure is the mount point parent information for a mounted */ +/* filesystem */ +typedef struct sfs_mountfs_t { + sfs_rootblock_t *path; + sfs_rootblock_t *rootblock; + sfsid_t sfsid; + sfsfd_t fd; /* Filedescriptor used for writing to this filesystem */ + char *fbm; /* Fragment allocation bitmap */ + char *ibm; /* Inode allocation bitmap */ + char *mhb; /* Memory holding fragment bitmap for disk bitmap updates */ + sfs_blockbuf_t *dirty; + sfs_blockbuf_t *clean; + sfs_blockbuf_t *head[2]; /* These are used for keeping a time-ordered */ + sfs_blockbuf_t *tail[2]; /* list of blocks */ + int accepting_requests; + int pending_requests; + dlink_list *request_queue; + pthread_mutex_t req_lock; + pthread_cond_t req_signal; + pthread_mutex_t req_signal_lock; + pthread_t thread_id; + + dlink_list done_queue; + int done_requests; + pthread_mutex_t done_lock; +} sfs_mountfs_t; + +#ifndef max +#define max(x,y) ((x)<(y)? 
(y) : (x)) +#endif + +#endif /* !SFS_DEFINES_H */ Index: squid/src/fs/sfs/sfs_fslo.c diff -u /dev/null squid/src/fs/sfs/sfs_fslo.c:1.1.2.2 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_fslo.c Wed Jan 24 08:06:38 2001 @@ -0,0 +1,117 @@ +/* sfs_fslo.c,v 1.17 2001/01/24 12:49:58 adrian Exp */ + +/* Squid FS */ +/* */ +/* Squid FS - Filesystem Level Operations */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. */ + +/* Note, the types in here are possibly wrong - this needs to be gone over */ + +#include "squid.h" + +#include "sfs_defines.h" + +int +sfs_format(const char *rawdevpath, u_int32_t numfrags) +{ + char *fbm; + int bmlen; + /* XXX - not sure if this next variable is to be used, but it's not at + * the moment, remove? */ + /* int bitsinfrag; */ + int fbmpos; + int ibmpos; + int fd; + int i; + sfs_rootblock_t *rblock; + char *rbbuf; + uint64_t os; + + if(numfrags < MINFSFRAGS) { + errno = ERANGE; + return -1; + } +/* Work out how long the bitmaps should be (in bytes) */ + bmlen = numfrags / BITINBYTE; + if(numfrags % BITINBYTE) + bmlen++; +/* Position them half-way through the fs */ + fbmpos = numfrags >> 1; + ibmpos = fbmpos - bmlen; + + if((fd = open(rawdevpath, O_RDWR)) < 0) + return -1; + + /* Write out the root block */ + + if((rbbuf = (char *)xcalloc(1, CHUNKSIZE)) == NULL) { + close(fd); + return -1; + } + rblock = (sfs_rootblock_t *)rbbuf; + rblock->numfrags = numfrags; + rblock->ibmpos = ibmpos; + rblock->fbmpos = fbmpos; + rblock->bmlen = bmlen; + rblock->magic = SFS_MAGIC; + os = 0; + if(lseek(fd, os, SEEK_SET) < 0) { + xfree(rbbuf); + close(fd); + return -1; + } + if(write(fd, rbbuf, CHUNKSIZE) < 0) { + xfree(rbbuf); + close(fd); + return -1; + } + xfree(rbbuf); + + /* Write out the inode bitmap. 
This will be all zeros. Since we */
+    /* xcalloc()ed the buffer and the two bitmaps are the same size, */
+    /* just write it out */
+
+    os = ibmpos;
+    os *= FRAGSIZE;
+    if(lseek(fd, os, SEEK_SET) < 0) {
+	close(fd);
+	return -1;
+    }
+    if((fbm = (char *)xcalloc(1, bmlen)) == NULL) {
+	close(fd);
+	return -1;
+    }
+    if(write(fd, fbm, bmlen) < 0) {
+	xfree(fbm);
+	close(fd);
+	return -1;
+    }
+
+    /* set all the blocks that contain */
+    /* the bitmaps as allocated (not free). Also set as used the first */
+    /* two fragments which will contain the filesystem root block */
+
+    for(i = ibmpos; i < (ibmpos + (2 * bmlen)); i++)
+	SETBIT(fbm, i);
+    SETBIT(fbm, 0);
+    SETBIT(fbm, 1);
+
+    /* Write out the frag bitmap. We will already be at the right */
+    /* location after writing out the inode bitmap */
+
+    if(write(fd, fbm, bmlen) < 0) {
+	xfree(fbm);
+	close(fd);
+	return -1;
+    }
+    xfree(fbm);
+
+    close(fd);		/* Done! So simple */
+    return 0;
+}	/* sfs_format */
Index: squid/src/fs/sfs/sfs_interface.c
diff -u /dev/null squid/src/fs/sfs/sfs_interface.c:1.1.2.18
--- /dev/null Tue Sep 28 18:35:34 2004
+++ squid/src/fs/sfs/sfs_interface.c Tue Feb 6 07:57:40 2001
@@ -0,0 +1,481 @@
+/* sfs_interface.c,v 1.58 2001/01/24 12:50:20 adrian Exp */
+
+/* These functions comprise the interface portion of squidFS - the bits that
+   outside functions can call.
+   I think I'll make the interfaces as identical to normal interfaces as
+   possible - not overly happy about that, as it means juggling things into
+   and out of strings, but until I have time to clean up squid's own fs
+   interfaces, that's the best that can be done.
+   The above changes in the light of the new store_* stuff in squid.
+*/
+
+/*
+ * DEBUG 82
+ */
+
+#include "squid.h"
+
+#include "store_sfs.h"
+
+/* Define this if you want to compile sfs_test, the test program
+ * - it will remove the references to cbdata* functions, which cause
+ * linking problems otherwise.
*/ +#undef SFS_TEST + +/* Public interfaces - the ones squid requires us to provide */ + +sfsfd_t +sfs_open(sfsid_t sfsid, sfsblock_t sfsinode, int oflag, mode_t mode, + enum sfs_io_type io_type, void *dataptr) +{ + struct sfs_requestor *req; + enum sfs_request_type rt; + sfsfd_t ret; + + /* Currently, you have to specify either an inode, or O_CREAT. + * We also make the rather brash assumption that if we're opening to + * write, we're creating a new file - that assumption can change. + * Could do with error checking on the sscanf... */ + if (oflag & O_CREAT) { + rt = _SFS_OP_OPEN_WRITE; + } else { + rt = _SFS_OP_OPEN_READ; + /* If we're trying to open something that's not an inode, return. */ + if (!(CBIT_TEST(_sfs_mounted[sfsid].ibm, sfsinode))) { + printf("ERR: sfs_open opening non-inode\n"); + return -1; + } + } + if (!(req = _sfs_create_requestor(sfsid,rt, io_type))) { + return -1; + } + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsinode = sfsinode; + req->dataptr = dataptr; + _sfs_submit_request(req); + _sfs_print_request(req); + if (io_type != _SFS_IO_SYNC) { + return 0; + } + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +int +sfs_close(sfsfd_t sfsfd, enum sfs_io_type io_type, void *dataptr) +{ + /* Need to flush the file to disk and remove the structure. 
*/ + sfs_requestor *req; + int ret; + + if(!(req = _sfs_create_requestor(sfsfd >> 24, _SFS_OP_CLOSE, io_type))) + return -1; + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) { + return 0; + } + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +ssize_t +sfs_read(sfsfd_t sfsfd, void *buf, ssize_t buflen, enum sfs_io_type io_type, + void *dataptr) +/* Takes: sfsfd, a pointer to pre-allocated space, and length of said + space. + Returns: number of bytes read. + (Note, on Solaris 2.6, ssize_t is 4 bytes, and I believe signed) +*/ +{ + sfs_requestor *req; + ssize_t ret; + sfsid_t sfsid; + + sfsid = sfsfd >> 24; + if(!(req = _sfs_create_requestor(sfsid, _SFS_OP_READ, io_type))) { + return -1; + } + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + req->offset = -1; + req->buflen = buflen; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + if ((!buf) || (req->ret == 0)) + return 0; + if (req->ret < 0) + return req->ret; + ret = req->buflen; + if (req->buf) { + memcpy(buf,req->buf,ret+1); + xfree(req->buf); + } + _sfs_remove_request(req); + return ret; +} + +ssize_t +sfs_write(sfsfd_t sfsfd, const void *buf, ssize_t buflen, + enum sfs_io_type io_type, void *dataptr) +{ + sfs_requestor *req; + ssize_t ret; + sfsid_t sfsid; + + sfsid = sfsfd >> 24; + if (!(req = _sfs_create_requestor(sfsid,_SFS_OP_WRITE, io_type))) { + return -1; + } + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + if 
(!(req->buf = xstrdup(buf))) { +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + return -1; + } + req->buflen = buflen; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + if (req->buf) + xfree(req->buf); + _sfs_remove_request(req); + return ret; +} + +int +sfs_unlink(sfsid_t sfsid, sfsblock_t sfsinode, enum sfs_io_type io_type, + void *dataptr) +{ +/* Should really take a full filename, by rights */ +/* Here's the trick with this one: You don't unlink a file till _after_ + you've closed it (normally). That means I can't take the normal sfsfd + and extract the relevant info :( */ + sfs_requestor *req; + int ret; + + if (!(req = _sfs_create_requestor(sfsid, _SFS_OP_UNLINK, io_type))) { + return -1; + } + /* + * We don't do this because its valid to have an async unlink without + * any notification info. Eww. -- adrian + */ +#if 0 + assert((io_type == _SFS_IO_SYNC) || dataptr); +#endif +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsinode = sfsinode; + req->dataptr = dataptr; + _sfs_submit_request(req); + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +/* Private-ish interfaces - the ones people can call, but squid doesn't use */ +/* directly. */ + +void sfs_thread_loop(sfs_mountfs_t *mount_point); + +int +sfs_umount(sfsid_t sfsid, enum sfs_io_type io_type) +/* As noted below, mount and umount need to be called only from a single +thread - preferably the thread that calls init. I _can_ fix this, with +YALock, but I've chosen not to at this time. 
*/ +{ + sfs_requestor *req; + int ret; + + if (sfsid >= MAXFILESYS) + return -1; + if (_sfs_mounted[sfsid].rootblock == NULL) + return 0; +/* Send a umount, and wait for the return. */ +/* The umount request simply tells the fs not to accept any more requests, +and to sync all changes to disk, close the fd, and remove itself from the +list of mounted fs'es. Basically, all the important stuff is done in the +thread itself. */ + if (!(req = _sfs_create_requestor(sfsid,_SFS_OP_UMOUNT, io_type))) + return -1; + if (_sfs_submit_request(req) < 0) + return -1; + if (io_type != _SFS_IO_SYNC) + return 0; + _sfs_waitfor_request(req); + ret = req->ret; + _sfs_remove_request(req); + if (ret == 0) { + if (_sfs_mounted[sfsid].rootblock) { + xfree(_sfs_mounted[sfsid].rootblock); + _sfs_mounted[sfsid].rootblock = NULL; + } + } + return ret; +} + +sfsid_t +sfs_mount(const char *rawdevpath) +{ + sfsid_t i; + sfsblock_t j, bmlen; + sfsblock_t ibmpos, fbmpos; + sfsblock_t magic; + + /* This hunt is not thread-safe - assume only one thread doing these + * things (initialising/mounting) - otherwise bad things happen(tm). 
+ * Fixing this assumption would mean adding a lock over the _sfs_mounted + * array */ + for(i = 1; (_sfs_mounted[i].rootblock != NULL) && (i < MAXFILESYS); i++); + if (i == MAXFILESYS) + return -1; + if ((_sfs_mounted[i].fd = open(rawdevpath, O_RDWR)) < 0) + return -1; + if (lseek(_sfs_mounted[i].fd, (uint64_t)0, SEEK_SET) < (uint64_t)0) { + printf("ERR: Didn't manage to lseek in mount :(\n"); + close(_sfs_mounted[i].fd); + return -1; + } + if ((_sfs_mounted[i].rootblock = (sfs_rootblock_t *)xcalloc(1,CHUNKSIZE)) == NULL) { + close(_sfs_mounted[i].fd); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].rootblock, CHUNKSIZE) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + ibmpos = _sfs_mounted[i].rootblock->ibmpos; + fbmpos = _sfs_mounted[i].rootblock->fbmpos; + bmlen = _sfs_mounted[i].rootblock->bmlen; + magic = _sfs_mounted[i].rootblock->magic; + + printf("DEBUG: sfs root: ibmpos %d, fbmpos %d, bmlen %d, numfrags %d, magic %d\n", + _sfs_mounted[i].rootblock->ibmpos, + _sfs_mounted[i].rootblock->fbmpos, + _sfs_mounted[i].rootblock->bmlen, + _sfs_mounted[i].rootblock->numfrags, + _sfs_mounted[i].rootblock->magic); + + /* Check magic! */ + if (magic != SFS_MAGIC) + return -1; + + /* If any of the rootblock stuff == 0, we have a bad fs */ + if ((ibmpos == 0) || (fbmpos == 0) || (bmlen == 0)) + return -1; + + _sfs_mounted[i].sfsid = i; + _sfs_mounted[i].fbm = (char *)xcalloc(1,bmlen); + _sfs_mounted[i].ibm = (char *)xcalloc(1,bmlen); +/* Seek to the bitmaps, and read them in */ +/* I wonder whether or not it makes sense to have this stuff done in the +fs'es own thread. 
Maybe */ + if (lseek(_sfs_mounted[i].fd, ibmpos, SEEK_SET) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].fbm, bmlen) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].ibm, bmlen) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + if ((_sfs_mounted[i].mhb = (char *)xcalloc(1,bmlen)) == NULL) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return -1; + } + for (j = 0; j <= bmlen; j++) { + if (CBIT_TEST(_sfs_mounted[i].ibm, j) || CBIT_TEST(_sfs_mounted[i].fbm, j)) + CBIT_SET(_sfs_mounted[i].mhb, j); + } + _sfs_mounted[i].dirty = NULL; + _sfs_mounted[i].request_queue = xcalloc(1,sizeof(dlink_list)); + _sfs_mounted[i].pending_requests = 0; + pthread_mutex_init(&(_sfs_mounted[i].req_lock), NULL); + pthread_mutex_init(&(_sfs_mounted[i].req_signal_lock), NULL); + pthread_cond_init(&(_sfs_mounted[i].req_signal), NULL); + pthread_mutex_init(&(_sfs_mounted[i].done_lock), NULL); + _sfs_mounted[i].done_queue.head = NULL; + _sfs_mounted[i].done_queue.tail = NULL; + _sfs_mounted[i].done_requests = 0; + + pthread_create(&(_sfs_mounted[i].thread_id), NULL, (void *)&sfs_thread_loop, &(_sfs_mounted[i])); + while (!(_sfs_mounted[i].accepting_requests)) + sleep(1); + /* Return the sfsid */ + return i; +} + +off_t +sfs_seek(sfsfd_t sfsfd, off_t pos, enum sfs_io_type io_type, void *dataptr) +/* Takes: sfsid, fd, and position to seek to. + Returns: 0 for success, -1 for failure. 
+*/ +{ + sfs_requestor *req; + sfsid_t sfsid; + int ret; + + sfsid = sfsfd >> 24; + if(!(req = _sfs_create_requestor(sfsid, _SFS_OP_SEEK, io_type))) + return -1; + + assert((io_type == _SFS_IO_SYNC) || dataptr); +#ifndef SFS_TEST + if (dataptr) + cbdataLock(dataptr); +#endif + req->sfsfd = sfsfd; + req->dataptr = dataptr; + req->offset = pos; + _sfs_submit_request(req); + _sfs_print_request(req); + if (io_type != _SFS_IO_SYNC) { + return 0; + } + _sfs_waitfor_request(req); +#ifndef SFS_TEST + if (dataptr) + cbdataUnlock(dataptr); +#endif + ret = req->ret; + _sfs_remove_request(req); + return ret; +} + +/* + * sfs_getcompleted - retrieve a single completed async request + * + * This function retrieves a single completed request from the done queue. + * It does not remove it from the done queue - this is the job of + * _sfs_remove_request. + * + * done_lock isn't to be held here - but we grab it whilst walking the list. + */ +sfs_requestor * +sfs_getcompleted(sfsid_t sfsid) +{ + dlink_node *node; + + /* + * Walk the list, find the first async request, and return a pointer + * to it. I'll hold the lock whilst we look through the list. + */ + pthread_mutex_lock(&(_sfs_mounted[sfsid].done_lock)); + node = _sfs_mounted[sfsid].done_queue.head; + + while ((node != NULL) && + (((sfs_requestor *)node->data)->io_type != _SFS_IO_ASYNC)) + node = node->next; + + pthread_mutex_unlock(&(_sfs_mounted[sfsid].done_lock)); + + /* We now have a node. Return it. */ + if (node) + return node->data; + else + return NULL; +} + +int +sfs_filesize(sfsid_t sfsid, sfsblock_t sfsinode) +{ + /* This function returns the size of the given file - file given as + * sfsid and inode. It leans on the block cache to find the info. 
*/ + sfs_blockbuf_t *inode; + + inode = _sfs_read_block(sfsid,sfsinode); + if (inode == NULL) + return -1; + return ((sfs_inode_t *)inode->buf)->len; +} + +int +sfs_openNextInode(sfsid_t sfsid, sfsblock_t *cur) +{ + sfsblock_t i; + int fd; + /* This function walks through a mounted filesystem, returning the + * next inode that's in-use. Used by rebuildDir stuff. + * First block is in use for storing rootblock. */ + for(i=max(3,*cur); i<_sfs_mounted[sfsid].rootblock->numfrags; i++) { + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, i)) { + *cur = i; + printf("DEBUG: next inode %d\n",*cur); + /* Note, should make an sio, but I'm too lazy - SYNC doesn't + * _really_ need one... */ + fd = sfs_open(sfsid,i,0,_SFS_OP_OPEN_READ,_SFS_IO_SYNC,NULL); + return fd; + } + } + *cur = 0; + return -2; +} Index: squid/src/fs/sfs/sfs_lib.h diff -u /dev/null squid/src/fs/sfs/sfs_lib.h:1.1.2.10 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_lib.h Sat Feb 3 17:16:04 2001 @@ -0,0 +1,68 @@ +/* sfs_lib.h,v 1.16 2001/01/24 12:47:58 adrian Exp */ + +/* Squid FS */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. 
*/ + +#ifndef SFS_LIB_H +#define SFS_LIB_H + + +/* The mount list */ +extern sfs_mountfs_t _sfs_mounted[MAXFILESYS]; +extern sfs_openfile_t * _sfs_openfiles[MAXFILESYS]; + +/* Internal functions */ +/* sfs_util.c */ +extern void _sfs_waitfor_request(sfs_requestor *req); +extern int _sfs_remove_request(sfs_requestor *req); +extern void _sfs_done_request(sfs_requestor *req, int retval); +extern sfs_requestor * _sfs_create_requestor(int sfsid, + enum sfs_request_type reqtype, enum sfs_io_type iotype); +extern int _sfs_submit_request(sfs_requestor *req); +extern sfs_blockbuf_t * _sfs_read_block(uint sfsid, uint diskpos); +extern uint _sfs_calculate_diskpos(sfs_openfile_t *openfd, uint offset); +extern void _sfs_commit_block(int sfsid, sfs_blockbuf_t *block); +extern sfs_blockbuf_t * _sfs_write_block(uint sfsid, uint diskpos, + void *buf, int buflen, enum sfs_block_type type); +extern uint _sfs_allocate_fd(sfs_openfile_t *new); +extern uint _sfs_allocate_block(int sfsid, int blocktype); +extern sfs_openfile_t * _sfs_find_fd(int sfsfd); +extern void _sfs_flush_bitmaps(int sfsid); +extern int _sfs_flush_file(int sfsid, sfs_openfile_t *fd); +extern void _sfs_print_request(sfs_requestor *req); + +/* sfs_splay.c */ +extern sfs_blockbuf_t * _sfs_blockbuf_create(); +extern sfs_blockbuf_t *sfs_splay_find(uint diskpos, sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_insert(int sfsid, sfs_blockbuf_t *new, + sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_remove(int sfsid, sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_delete(int sfsid, sfs_blockbuf_t *tree); + + + +/* External stuff */ +extern int sfs_format(const char *, u_int32_t ); +extern sfsfd_t sfs_open(sfsid_t, sfsblock_t, int, mode_t, enum sfs_io_type, + void *); +extern int sfs_umount(sfsid_t, enum sfs_io_type ); +extern sfsid_t sfs_mount(const char * ); +extern int sfs_close(sfsfd_t, enum sfs_io_type, void *); +extern ssize_t sfs_read(sfsfd_t , void * , ssize_t, enum sfs_io_type, + void 
*); +extern off_t sfs_seek(sfsfd_t , off_t , enum sfs_io_type, void *); +extern int sfs_unlink(sfsid_t , sfsblock_t, enum sfs_io_type, void *); +extern ssize_t sfs_write(sfsfd_t , const void * , ssize_t, + enum sfs_io_type, void *); +extern sfs_requestor * sfs_getcompleted(sfsid_t); +int sfs_openNextInode(sfsid_t sfsid, sfsblock_t *cur); + + +#endif /* !SFS_LIB_H */ Index: squid/src/fs/sfs/sfs_llo.c diff -u /dev/null squid/src/fs/sfs/sfs_llo.c:1.1.2.13 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_llo.c Sat Feb 3 17:16:04 2001 @@ -0,0 +1,509 @@ +/* sfs_llo.c,v 1.84 1999/02/03 04:04:06 darius Exp */ + +/* Squid FS */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. */ + +#include "squid.h" + +#include "store_sfs.h" + +sfs_mountfs_t _sfs_mounted[MAXFILESYS]; +sfs_openfile_t *_sfs_openfiles[MAXFILESYS]; +int _sfs_initialised = 0; +int inode_data_size = FRAGSIZE - sizeof(sfs_inode_t); +int direct_pointer_threshold = FRAGSIZE - sizeof(sfs_inode_t) + (NUMDIP * FRAGSIZE); + +void sfs_do_umount(sfs_requestor *req); +void sfs_do_open(sfs_requestor *req); +void sfs_do_read(sfs_requestor *req); +void sfs_do_write(sfs_requestor *req); +void sfs_do_unlink(sfs_requestor *req); +void sfs_do_close(sfs_requestor *req); +void sfs_do_seek(sfs_requestor *req); + +void +sfs_initialise() +{ + int i; + + if (_sfs_initialised) + return; + _sfs_initialised = 1; + for(i = 0; i < MAXFILESYS; i++) { + _sfs_mounted[i].rootblock = NULL; + _sfs_mounted[i].accepting_requests = 0; + _sfs_openfiles[i] = NULL; + } +} /* sfs_initialise */ + +void +sfs_thread_loop(sfs_mountfs_t *mount_point) +{ + sigset_t new; + int i; + sfs_requestor *req; + dlink_node *tnode; + + /* Make sure to ignore signals which may possibly get sent to the parent */ + /* squid thread. 
Causes havoc with mutexes and condition waits otherwise */ + /* (Stolen from aiops.c) - Darius */ + + sigemptyset(&new); + sigaddset(&new, SIGPIPE); + sigaddset(&new, SIGCHLD); +#if (defined(_SQUID_LINUX_) && USE_ASYNC_IO) + sigaddset(&new, SIGQUIT); + sigaddset(&new, SIGTRAP); +#else + sigaddset(&new, SIGUSR1); + sigaddset(&new, SIGUSR2); +#endif + sigaddset(&new, SIGHUP); + sigaddset(&new, SIGTERM); + sigaddset(&new, SIGINT); + sigaddset(&new, SIGALRM); + pthread_sigmask(SIG_BLOCK, &new, NULL); + + /* Set up a condition variable; each time it is signalled, scan the + * request queue. */ + pthread_cond_init(&(mount_point->req_signal), NULL); + pthread_mutex_lock(&(mount_point->req_signal_lock)); + mount_point->accepting_requests = 1; + i = 0; + while (1) { + pthread_cond_wait(&(mount_point->req_signal), &(mount_point->req_signal_lock)); + pthread_mutex_lock(&(mount_point->req_lock)); + while (mount_point->pending_requests > 0) { + printf("Pending Requests: %d\n",mount_point->pending_requests); + tnode = mount_point->request_queue->head; + assert(tnode); + req = tnode->data; + if (req && (req->request_state == _SFS_PENDING)) { + /* If we're not accepting requests, return fail for each + * request. Note, we can't just lock the request queue, as + * things are still being removed from it by other threads. */ + printf("dealing with pending request, %d\n",mount_point->pending_requests); + if (!(mount_point->accepting_requests)) { + _sfs_done_request(req,-1); + } else { + /* This portion sets the state, and works out exactly what + * to do - open, read, write, close, sync, unlink. 
*/ + req->request_state = _SFS_IN_PROGRESS; + mount_point->pending_requests--; + printf("pending requests now %d\n",mount_point->pending_requests); + switch (req->request_type) { + case _SFS_OP_OPEN_READ: + case _SFS_OP_OPEN_WRITE: + sfs_do_open(req); + break; + case _SFS_OP_UNLINK: + sfs_do_unlink(req); + break; + case _SFS_OP_UMOUNT: + sfs_do_umount(req); + break; + case _SFS_OP_READ: + sfs_do_read(req); + break; + case _SFS_OP_WRITE: + sfs_do_write(req); + break; + case _SFS_OP_CLOSE: + sfs_do_close(req); + break; + case _SFS_OP_SEEK: + sfs_do_seek(req); + break; + default: + _sfs_done_request(req,-1); + } + } + } + printf("tnode prev: %p next: %p\n",tnode->prev,tnode->next); + tnode = tnode->next; + } + pthread_mutex_unlock(&(mount_point->req_lock)); + /* Flush the bitmaps every 10 seconds */ + i = (i + 1) % 10; + if (i == 0) { + _sfs_flush_bitmaps(mount_point->sfsid); + } + } +} + +void +sfs_do_read(sfs_requestor *req) +{ + sfs_blockbuf_t *new; + sfs_openfile_t *openfd; + int bytes_read, fragsize; + uint diskpos; + void *buf; + + if (!(openfd = _sfs_find_fd(req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } + + if (req->offset > -1) + openfd->pos = req->offset; + + bytes_read = 0; + buf = NULL; + /* one block at a time */ + /* We _could_ alloc the space required for the entire file in one go, + * at least if the filesize was below a certain watermark - should be + * a significant speed boost if my understanding of realloc issues is + * correct... 
*/ + while ((bytes_read < req->buflen) && (bytes_read < openfd->inode->len)) { + if (!(diskpos = _sfs_calculate_diskpos(openfd,openfd->pos))) { + req->buf = buf; + req->buflen = bytes_read; + _sfs_done_request(req,bytes_read); + return; + } + if (!(new = _sfs_read_block(openfd->sfsid,diskpos))) { + req->buf = buf; + req->buflen = bytes_read; + _sfs_done_request(req,bytes_read); + return; + } + /* In case of request only wanting a certain amount, work out how much + * to copy into req->buf */ + fragsize = min(req->buflen - bytes_read, new->buflen); + fragsize = min(fragsize, openfd->inode->len); + if (new->type == _SFS_INODE) + fragsize = min(fragsize, inode_data_size); + else { + if ((openfd->pos + bytes_read) == openfd->inode->len) { + fragsize = min(fragsize, (openfd->inode->len % FRAGSIZE)); + } + } + buf = (char *)xrealloc(buf, bytes_read + fragsize); + if (new->type == _SFS_INODE) + memcpy(((char *)buf)+bytes_read, new->buf+sizeof(sfs_inode_t), fragsize); + else + memcpy(((char *)buf)+bytes_read, new->buf, fragsize); + bytes_read += fragsize; + openfd->pos += fragsize; + if (fragsize == 0) + abort(); + } + req->buf = buf; + req->buflen = bytes_read; + _sfs_done_request(req,bytes_read); +} + +void +sfs_do_write(sfs_requestor *req) +{ + sfs_openfile_t *openfd; + sfs_blockbuf_t *new, *current = NULL; + int offset, written, fragsize, inblock; + int bytes_left, leader; + sfsblock_t diskpos; + void *buf; + int type; + + if (!(openfd = _sfs_find_fd(req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } + + if (req->offset == -1) + offset = openfd->pos; + else + offset = openfd->pos = req->offset; + + diskpos = 0; + + /* written tracks how much we've written so far in total. */ + /* offset tells us where we start writing, minus the inode. */ + /* bytes_left gets set to the number of bytes left to write. */ + /* leader is the offset plus the inode - actual starting location. 
*/ + written = 0; + leader = offset + sizeof(sfs_inode_t); + bytes_left = req->buflen; + buf = NULL; + /* We write one block at a time - the process to find which block to write + * to next is a little, um, involved at present. */ + while (written < req->buflen) { + /* Work out type and where in the block this write should go. + * If the total file is smaller than the inode block data size, then + * we're best off storing it in an inode data block - fastest retrieval + * and all that. + * + * offset + written + sizeof(sfs_inode_t) = position in file (filepos) + * filepos % FRAGSIZE = remainder (thisblock) + * fragsize indicates the maximum amount of data to write in this + * block. */ + if (openfd->pos < inode_data_size) { + type = _SFS_INODE; + inblock = leader; + fragsize = min(inode_data_size - openfd->pos,bytes_left); + } else { + type = _SFS_DATA; + inblock = leader % FRAGSIZE; + fragsize = min(FRAGSIZE - inblock,bytes_left); + } + current = NULL; + + /* Figure out where on disk it should be... */ + if (!(diskpos = _sfs_calculate_diskpos(openfd, openfd->pos))) { + /* If we're not within the file, allocate a new block */ + if (!(diskpos = _sfs_allocate_block(req->sfsid, type))) { + _sfs_done_request(req,written); + return; + } + if (type != _SFS_INODE) { + if (openfd->pos < direct_pointer_threshold) { + openfd->inode->dip[(openfd->pos - inode_data_size) / FRAGSIZE] = diskpos; + } else { + /* XXX indirect pointer - youch. This has not yet been + * implemented :( */ + } + } + } else { + current = _sfs_read_block(openfd->sfsid,diskpos); + } + + /* How much more to write? 
*/ + if (current) { + buf = current->buf; + } else { + buf = (char *)xcalloc(1, FRAGSIZE); + } + memcpy(((char *)buf)+inblock,((char *)req->buf)+written,fragsize); + if (!current) { + if (!(new = _sfs_write_block(req->sfsid, diskpos, buf, fragsize, type))) { + _sfs_done_request(req,written); + return; + } + new->sfsinode = openfd->sfsinode; + new->type = type; + current = new; + } + written += fragsize; + openfd->pos += fragsize; + bytes_left -= fragsize; + } + /* $DEITY forbid two people try to write at once - maybe I need some + * locking to prevent that... */ + openfd->inode->len = max(openfd->inode->len,openfd->pos); + printf("DEBUG: written %d bytes to fd %d (block %d)\n",req->buflen,req->sfsfd,diskpos); + _sfs_done_request(req,written); +} + +void +sfs_do_umount(sfs_requestor *req) +{ + sfs_mountfs_t *mnt; + sfs_blockbuf_t *block_ptr; + dlink_node *lnode; + sfs_openfile_t *openfd; + int i; + + _sfs_mounted[req->sfsid].accepting_requests = 0; + mnt = &(_sfs_mounted[req->sfsid]); + /* Flush all the dirty blocks out to HDD */ + openfd = _sfs_openfiles[req->sfsid]; + while(openfd) { + /* flush_file has to get rid of stuff then, which is bad :( + * The structures get kinda confused at this point */ + printf("DEBUG: flushing file...\n"); + _sfs_flush_file(req->sfsid,openfd); + _sfs_openfiles[req->sfsid] = openfd->next; + xfree(openfd); + openfd = _sfs_openfiles[req->sfsid]; + } + printf ("DEBUG: umount flushing dirty blocks\n"); + while (mnt->dirty) { + _sfs_commit_block(req->sfsid, mnt->dirty); + mnt->dirty = sfs_splay_delete(req->sfsid, mnt->dirty); + } + _sfs_flush_bitmaps(req->sfsid); + if (mnt->fbm) + xfree(mnt->fbm); + if (mnt->ibm) + xfree(mnt->ibm); + if (mnt->mhb) + xfree(mnt->mhb); + block_ptr = mnt->clean; + /* Need to clean out the clean list */ + while (mnt->clean) { + mnt->clean = sfs_splay_delete(req->sfsid, mnt->clean); + } + if (mnt->request_queue->head != NULL) { + /* Make doubly sure we've cleared any pending requests - shouldn't need to, + * but 
we _are_ umounting... This is actually bodgy code: if it ever does + * anything, then something's gone wrong. Probably shouldn't do this, but + * it saves deadlock, and we'd rather see corruption than hang + * indefinitely. */ + lnode = mnt->request_queue->head; + while (lnode) { + /* _sfs_done_request() unlinks lnode, so grab the next pointer + * first; also don't clobber our own umount req. */ + dlink_node *lnext = lnode->next; + sfs_requestor *r = lnode->data; + if (r->request_state == _SFS_PENDING) { + _sfs_done_request(r,-1); + } + lnode = lnext; + } + i = 0; + /* Waiting for the request queue to be empty of all bar the umount + * request */ + while ((mnt->request_queue->head->next) && (i < 5)) { + i++; + sleep(1); + } + } + /* At this stage, all I need to do is kill the thread :) + * I could shuffle these, do the done_request before actually completely + * finishing - that would guarantee the requests are all collected + * properly */ + _sfs_done_request(req,0); + if (mnt->rootblock) { + xfree(mnt->rootblock); + mnt->rootblock = NULL; + } + /* Should also free all open fd's */ + pthread_exit(NULL); +} + +void +sfs_do_open(sfs_requestor *req) +{ + sfs_openfile_t *fd, *fdptr; + + if ((fd = (sfs_openfile_t *)xcalloc(1, sizeof(sfs_openfile_t))) == NULL) { + _sfs_done_request(req,-1); + return; + } + fd->sfsid = req->sfsid; + fd->sfsfd = _sfs_allocate_fd(fd); + fd->pos = 0; + + if (req->request_type == _SFS_OP_OPEN_READ) { + fd->sfsinode = req->sfsinode; + fd->inodebuf_p = _sfs_read_block(fd->sfsid,fd->sfsinode); + } else { + /* Doesn't need to lock, as all allocation is within thread. 
*/ + if (!(fd->sfsinode = _sfs_allocate_block(req->sfsid, _SFS_INODE))) { + xfree(fd); + printf("ERR: couldn't allocate sfsinode\n"); + _sfs_done_request(req,-1); + return; + } + /* Fill the new inode */ + if (!(fd->inodebuf_p = _sfs_blockbuf_create())) { + printf("DEBUG: Couldn't create a blockbuf\n"); + xfree(fd); + _sfs_done_request(req,-1); + return; + } + fd->inodebuf_p->type = _SFS_INODE; + fd->inodebuf_p->buf = (char *)xcalloc(1, FRAGSIZE); + fd->inodebuf_p->diskpos = fd->inodebuf_p->sfsinode = fd->sfsinode; + fd->inodebuf_p->buflen = 0; + _sfs_mounted[req->sfsid].clean = sfs_splay_insert(req->sfsid, fd->inodebuf_p, _sfs_mounted[req->sfsid].clean); + } + /* Nasty cast */ + fd->inode = (sfs_inode_t *)fd->inodebuf_p->buf; + if (req->request_type != _SFS_OP_OPEN_READ) { + fd->inode->len = 0; + fd->rwbuf_list_p = NULL; + fd->sibuf_list_p = NULL; + fd->dibuf_p = NULL; + } + fd->pos = 0; + fd->next = fd->prev = NULL; + /* Add this one to the sfsid list of open fd's */ + /* Allocating an fd */ + fdptr = _sfs_openfiles[req->sfsid]; + if (fdptr) { + while(fdptr->next) { + fdptr = fdptr->next; + } + fdptr->next = fd; + fd->prev = fdptr; + } else { + _sfs_openfiles[req->sfsid] = fd; + } + req->buf = fd; + _sfs_done_request(req,fd->sfsfd); +} + +void +sfs_do_unlink(sfs_requestor *req) +{ + /* XXX unused at the moment + sfs_openfile_t *ptr; + sfs_blockbuf_t *block; */ + + if (!(CBIT_TEST(_sfs_mounted[req->sfsid].ibm, req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } +/* Check to make sure there's not an open file here - if there is, close and +flush it. Is this correct behaviour? 
At least, we shouldn't flush to disk - +at most, we should do something about the threads trying to hold the file open + while (ptr = _sfs_find_fd(req->sfsid, req->sfsfd)) + _sfs_flush_file(req->sfsid, ptr); +*/ +/* Without opening a file ;) read in the inode, walk the list of blocks, + and CBIT_CLEAR each one from .fbm */ +} + +void +sfs_do_close(sfs_requestor *req) +{ + sfs_openfile_t *ptr; + + printf("DEBUG: closing file %d\n",req->sfsfd); + if (!(ptr = _sfs_find_fd(req->sfsfd))) { + printf("DEBUG: couldn't find fd %d\n",req->sfsfd); + _sfs_done_request(req,-1); + return; + } + printf("DEBUG: flushing file %d\n",req->sfsfd); + _sfs_flush_file(req->sfsid, ptr); + if (ptr) { + /* Assuming _sfs_flush_file clears the other stuff from the openfd - + * will check that later... */ + xfree(ptr); + } + _sfs_done_request(req,0); + return; +} + +void +sfs_do_seek(sfs_requestor *req) +{ + sfs_openfile_t *ptr; + unsigned char sfsid; + + sfsid = req->sfsid; + ptr = _sfs_openfiles[sfsid]; + while (ptr) { + if (ptr->sfsfd == req->sfsfd) + break; + ptr = ptr->next; + } + if (!ptr) { + printf("DEBUG: Can't find an openfile for fd %d!\n", req->sfsfd); + _sfs_done_request(req,-1); + return; + } + if (req->offset > ptr->inode->len) { + printf("DEBUG: seek beyond EOF for fd %d\n",req->sfsfd); + _sfs_done_request(req,-1); + return; + } + ptr->pos = req->offset; + _sfs_done_request(req,0); + return; +} Index: squid/src/fs/sfs/sfs_read.c diff -u /dev/null squid/src/fs/sfs/sfs_read.c:1.1.2.4 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_read.c Wed Feb 7 02:41:49 2001 @@ -0,0 +1,56 @@ +/* $Id$ */ + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> + +#include "squid.h" +#include "sfs_defines.h" +#include "sfs_lib.h" + +int main(int argc, char *argv[]) +{ + int sfsid; + int sfsfd; + char buf[512]; + int inode; + int err; + + /* Check args - we need both a mountpoint and an inode */ + if (argc < 3) { + printf("error: need mountpoint inode..\n"); + exit(1); + } + + /* get inode */ + 
inode = atoi(argv[2]); + if (inode < 0) { + printf("inode '%s' is an invalid number\n", argv[2]); + exit(1); + } + + sfsid = sfs_mount(argv[1]); + printf("sfsid mount = %d\n",sfsid); + if (sfsid < 0) + exit(1); + + printf("opening %d\n", inode); + sfsfd = sfs_open(sfsid, inode, O_RDONLY, 0, _SFS_IO_SYNC, NULL); + if (sfsfd < 0) + exit(1); + + while ((err = sfs_read(sfsfd, buf, 510, _SFS_IO_SYNC, NULL)) > 0) { + /* err is also our write length! */ + write(1, buf, err); + } + printf("Got %d from sfs_read\n", err); + + printf("close result = %d\n",sfs_close(sfsfd, _SFS_IO_SYNC, NULL)); + printf("umount result = %d\n",sfs_umount(sfsid, _SFS_IO_SYNC)); + exit(0); +} Index: squid/src/fs/sfs/sfs_shim.c diff -u /dev/null squid/src/fs/sfs/sfs_shim.c:1.1.2.3 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_shim.c Sat Jan 27 02:20:44 2001 @@ -0,0 +1,48 @@ +#include "squid.h" + +void +dlinkAdd(void *data, dlink_node * m, dlink_list * list) +{ + m->data = data; + m->prev = NULL; + m->next = list->head; + if (list->head) + list->head->prev = m; + list->head = m; + if (list->tail == NULL) + list->tail = m; +} + +void +dlinkAddTail(void *data, dlink_node * m, dlink_list * list) +{ + m->data = data; + m->next = NULL; + m->prev = list->tail; + if (list->tail) + list->tail->next = m; + list->tail = m; + if (list->head == NULL) + list->head = m; +} + +void +dlinkDelete(dlink_node * m, dlink_list * list) +{ + if (m->next) + m->next->prev = m->prev; + if (m->prev) + m->prev->next = m->next; + if (m == list->head) + list->head = m->next; + if (m == list->tail) + list->tail = m->prev; + m->next = m->prev = NULL; +} + +void +xassert(const char *msg, const char *file, int line) +{ + printf("Assertion failed: %s:%d - %s\n", file, line, msg); + abort(); +} Index: squid/src/fs/sfs/sfs_splay.c diff -u /dev/null squid/src/fs/sfs/sfs_splay.c:1.1.2.1 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_splay.c Wed Jan 24 06:11:54 2001 @@ -0,0 +1,157 @@ +/* $Id$ */ + 
+#include "squid.h" + +#include "store_sfs.h" + +sfs_blockbuf_t * +_sfs_blockbuf_create() +{ + sfs_blockbuf_t *new; + if ((new = (sfs_blockbuf_t *)xcalloc(1, sizeof(sfs_blockbuf_t))) == NULL) + return NULL; + new->left = NULL; + new->right = NULL; + new->prev = NULL; + new->next = NULL; + new->sfsid = -1; + new->sfsinode = 0; + new->diskpos = -1; + new->dirty = 0; + new->type = _SFS_UNKNOWN; + new->buf = NULL; + return new; +} + +sfs_blockbuf_t *sfs_splay_find(uint diskpos, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *temp, *l, *r; + sfs_blockbuf_t new; + + if (tree == NULL) + return NULL; + + l = r = &new; + for (;;) { + if (diskpos < tree->diskpos) { + if (!(tree->left)) + break; + if (diskpos < tree->left->diskpos) { + temp = tree->left; + tree->left = temp->right; + temp->right = tree; + tree = temp; + if (tree->left == NULL) + break; + } + r->left = tree; + r = tree; + tree = tree->left; + } else if (diskpos > tree->diskpos) { + if (!(tree->right)) + break; + if (diskpos > tree->right->diskpos) { + temp = tree->right; + tree->right = temp->left; + temp->left = tree; + tree = temp; + if (tree->right == NULL) + break; + } + l->right = tree; + l = tree; + tree = tree->right; + } else { + break; + } + } + l->right = tree->left; + r->left = tree->right; + tree->left = new.right; + tree->right = new.left; + return tree; +} + +sfs_blockbuf_t * +sfs_splay_insert(int sfsid, sfs_blockbuf_t *new, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t **head, **tail; + + if (new == NULL) + return NULL; + head = &(_sfs_mounted[sfsid].head[new->dirty]); + tail = &(_sfs_mounted[sfsid].tail[new->dirty]); + if (tree == NULL) { + new->left = NULL; + new->right = NULL; + *head = *tail = new; + return new; + } + tree = sfs_splay_find(new->diskpos,tree); + if (new->diskpos == tree->diskpos) { + tree->refcount++; + return tree; + } + new->next = *head; + (*head)->prev = new; + *head = new; /* new is the fresh head of the list */ + if (new->diskpos < tree->diskpos) { + new->left = tree->left; + new->right = tree; /* was tree->right, which dropped the old root */ + tree->left = NULL; + } else { 
+ new->right = tree->right; + new->left = tree; + tree->right = NULL; + } + CBIT_SET(_sfs_mounted[sfsid].mhb, new->diskpos); + new->refcount = 1; + return new; +} + +sfs_blockbuf_t * +sfs_splay_remove(int sfsid, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *new; + sfs_blockbuf_t **head, **tail; + + tree->refcount--; + if (tree->refcount > 0) + return tree; + new = NULL; + head = &(_sfs_mounted[sfsid].head[tree->dirty]); + tail = &(_sfs_mounted[sfsid].tail[tree->dirty]); + if (tree->left == NULL) { + new = tree->right; + } else { + /* Splay the removed key itself so the predecessor comes to the + * root with an empty right subtree - splaying tree->left->diskpos + * would lose that node's own right children. */ + new = sfs_splay_find(tree->diskpos,tree->left); + new->right = tree->right; + } + if (*head == tree) + *head = new; + if (*tail == tree) + *tail = new; + if (tree->prev) + tree->prev->next = tree->next; + if (tree->next) + tree->next->prev = tree->prev; + return new; +} + +sfs_blockbuf_t * +sfs_splay_delete(int sfsid, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *old; + + if (tree == NULL) + return NULL; + old = tree; +/* Set this so it _will_ be deleted */ + if (tree->refcount > 1) + tree->refcount = 1; + tree = sfs_splay_remove(sfsid,tree); + if (tree != old) { + xfree(old->buf); + xfree(old); + } + return tree; +} Index: squid/src/fs/sfs/sfs_test.c diff -u /dev/null squid/src/fs/sfs/sfs_test.c:1.1.2.7 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_test.c Sat Feb 3 17:16:04 2001 @@ -0,0 +1,59 @@ +/* $Id$ */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <fcntl.h> + +#include "squid.h" +#include "sfs_defines.h" +#include "sfs_lib.h" + +int main() { + int sfsid; + int sfsfd; + uint sfsinode = 0; /* XXX never set by the create path below */ + char filename[20]; + char buf[80]; + + if (creat("test.drv", 0644) < 0) { + printf("cannot open new file test.drv: %s", strerror(errno)); + exit(1); + } + + if (sfs_format("test.drv",4096) < 0) { + printf("unable to format test.drv! 
%s", strerror(errno)); + exit(0); + } + sfsid = sfs_mount("test.drv"); + printf("sfsid = %d\n",sfsid); + + snprintf(filename,20,"%d/0",sfsid); + printf("opening %s",filename); + sfsfd = sfs_open(sfsid, 0, O_CREAT, 0, _SFS_IO_SYNC, NULL); + + sfs_write(sfsfd,"Hello...\n",strlen("Hello...\n"), _SFS_IO_SYNC, NULL); + sfs_write(sfsfd,"Hello, again!\n",strlen("Hello, again!\n"), _SFS_IO_SYNC, NULL); + printf("close result = %d\n",sfs_close(sfsfd, _SFS_IO_SYNC, NULL)); + printf("umount result = %d\n",sfs_umount(sfsid, _SFS_IO_SYNC)); + printf("About to remount and read...\n"); + sfsid = sfs_mount("test.drv"); + printf("sfsid = %d\n",sfsid); + printf("Opening %d/%d\n",sfsid,sfsinode); + + snprintf(filename,20,"%d/%d",sfsid,sfsinode); + printf("DEBUG: %s\n",filename); + sfsfd = sfs_open(sfsid, sfsinode, O_RDONLY, 0, _SFS_IO_SYNC, NULL); + + printf("sfsfd = %d, sfsinode = %d\n",sfsfd,sfsinode); + if (sfsfd >= 0) { + printf("read result = %d\n",sfs_read(sfsfd,buf,80, _SFS_IO_SYNC, NULL)); + printf("\n***** %s *****\n",buf); + printf("strlen buf = %d\n",strlen(buf)); + printf("close result = %d\n",sfs_close(sfsfd, _SFS_IO_SYNC, NULL)); + } + printf("umount result = %d\n",sfs_umount(sfsid, _SFS_IO_SYNC)); + exit(0); +} Index: squid/src/fs/sfs/sfs_util.c diff -u /dev/null squid/src/fs/sfs/sfs_util.c:1.1.2.15 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/sfs_util.c Wed Feb 7 01:49:06 2001 @@ -0,0 +1,406 @@ +/* sfs_util.c,v 1.53 2001/01/24 12:49:34 adrian Exp */ + +#include "squid.h" +#include "store_sfs.h" + +extern int inode_data_size; +extern int direct_pointer_threshold; + +void +_sfs_waitfor_request(sfs_requestor *req) +/* You know, we could count the number of seconds a request has had to wait to + * be serviced here... 
*/ +{ + assert(req->io_type == _SFS_IO_SYNC); + pthread_mutex_lock(&(req->done_signal_lock)); + if (!(req->request_state == _SFS_DONE)) + pthread_cond_wait(&(req->done_signal),&(req->done_signal_lock)); + pthread_mutex_unlock(&(req->done_signal_lock)); +} + +/* + * _sfs_remove_request + * + * Remove the request from the done request list, and deallocate it. + */ +int +_sfs_remove_request(sfs_requestor *req) +/* This doesn't free the buffer - not sure whether we have any need to keep + * the buffer anywhere or not, but the option is there... This will be changed + * by various commloops/modio changes - hopefully, we won't even be supplying + * the buffer. */ +{ + printf("DEBUG: Removing %d request from done queue\n",req->request_type); + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].done_lock)); + dlinkDelete(&req->node, &(_sfs_mounted[req->sfsid].done_queue)); + _sfs_mounted[req->sfsid].done_requests--; + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].done_lock)); + xfree(req); + return (0); +} + +/* + * _sfs_done_request + * + * Move the request from the request queue to the done queue, and then + * signal any sleeping thread(s) that this request has completed. + */ +void +_sfs_done_request(sfs_requestor *req, int retval) +{ + + req->ret = retval; + req->request_state = _SFS_DONE; + + /* Take it off the mount point request queue. Note, requests are + * always marked done from within the thread, within the lock that's + * already in place from the main sfs_thread_loop. Hence, no locking. 
*/ + dlinkDelete(&req->node, _sfs_mounted[req->sfsid].request_queue); + + /* Add it to the squid done queue */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].done_lock)); + dlinkAddTail(req, &req->node, &(_sfs_mounted[req->sfsid].done_queue)); + _sfs_mounted[req->sfsid].done_requests++; + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].done_lock)); + + /* If it's a sync operation, signal it's done - that allows sfs_interface + * to pick up and return */ + if (req->io_type == _SFS_IO_SYNC) { + pthread_mutex_lock(&(req->done_signal_lock)); + pthread_cond_signal(&(req->done_signal)); + pthread_mutex_unlock(&(req->done_signal_lock)); + } +} + +sfs_requestor * +_sfs_create_requestor(int sfsid, enum sfs_request_type reqtype, + enum sfs_io_type iotype) +{ + struct sfs_requestor *req; + + if (!(req = (sfs_requestor *)xcalloc(1, sizeof(sfs_requestor)))) + return NULL; + + req->request_type = reqtype; + req->io_type = iotype; + req->request_state = _SFS_PENDING; + req->sfsid = sfsid; + req->sfsfd = 0; + req->offset = -1; + req->ret = 0; + req->buf = NULL; + pthread_cond_init(&(req->done_signal), NULL); + pthread_mutex_init(&(req->done_signal_lock), NULL); + return req; +} + +int +_sfs_submit_request(sfs_requestor *req) +{ + /* add the request to the end of the list rather than the start */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_lock)); + dlinkAddTail(req, &req->node, _sfs_mounted[req->sfsid].request_queue); + _sfs_mounted[req->sfsid].pending_requests++; + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_lock)); + + /* and signal that the request has been made */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_signal_lock)); + pthread_cond_signal(&(_sfs_mounted[req->sfsid].req_signal)); + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_signal_lock)); + + printf("DEBUG: request %p submitted\n",req); + return(0); +} + +sfs_blockbuf_t * +_sfs_read_block(uint sfsid, uint diskpos) +{ + /* This takes an sfsid, and a diskpos, and returns a blockbuf filled in 
+ with the correct data. */ + sfs_blockbuf_t *new; + uint64_t dpos; + int readlen; + + /* Searching for the appropriate block in the clean list */ + if (_sfs_mounted[sfsid].clean) { + _sfs_mounted[sfsid].clean = sfs_splay_find(diskpos,_sfs_mounted[sfsid].clean); + if (_sfs_mounted[sfsid].clean->diskpos == diskpos) { + return _sfs_mounted[sfsid].clean; + } + } + /* And in the dirty list */ + /* We probably shouldn't find things in the dirty list - they should + * probably be served by squid's own cache first (assuming squid's still + * keeping a cache...) Might be worth measuring the frequency of this + * one. */ + if (_sfs_mounted[sfsid].dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(diskpos,_sfs_mounted[sfsid].dirty); + if (_sfs_mounted[sfsid].dirty->diskpos == diskpos) { + return _sfs_mounted[sfsid].dirty; + } + } + /* Otherwise we're reading a new one in off the disk. */ + dpos = diskpos * FRAGSIZE; + + if (!(new = _sfs_blockbuf_create())) + return NULL; + if (!(new->buf = (char *)xcalloc(1, FRAGSIZE))) { + xfree(new); + return NULL; + } + if (lseek(_sfs_mounted[sfsid].fd, dpos, SEEK_SET) < 0) { + xfree(new->buf); + xfree(new); + return NULL; + } + if ((readlen = read(_sfs_mounted[sfsid].fd, new->buf, FRAGSIZE)) < FRAGSIZE) { + xfree(new->buf); + xfree(new); + return NULL; + } + new->sfsid = sfsid; + new->diskpos = diskpos; + new->buflen = FRAGSIZE; + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, diskpos)) { + new->type = _SFS_INODE; + } else { + new->type = _SFS_DATA; + } + /* Add it to the clean list on the spot. */ + _sfs_mounted[sfsid].clean = sfs_splay_insert(sfsid, new, _sfs_mounted[sfsid].clean); + return new; +} + +uint +_sfs_calculate_diskpos(sfs_openfile_t *openfd, uint offset) +{ +/* This function returns the disk position of the block into which bytes +should be written, or from which bytes should be read. It is granular +to a block level only.
*/ + sfs_blockbuf_t *din; + uint *dinptr; + sfsid_t sfsid; + + sfsid = openfd->sfsid; + if (offset < inode_data_size) + return openfd->inodebuf_p->diskpos; +/* Otherwise subtract inode_data_size, then div by FRAGSIZE to get entry in + direct block pointers */ + else if (offset < direct_pointer_threshold) { + return openfd->inode->dip[((offset - inode_data_size) / FRAGSIZE)]; + } else { +/* This insinuates that we're storing indirect pointers in chunks rather than + frags - I think that's fair, it gives us bigger files ;) Incidentally, + max filesize under this system is: + ((FRAGSIZE*(CHUNKSIZE / sizeof(uint)))*64)+(FRAGSIZE * 63)+inode_data_size + Under current settings, that's 512MB + little bits. Extending this should + be done, but without creating frag problems, and without increasing the + number of indirect pointers required by too much. Having said that, + increasing the number of indirect pointers at that stage is probably + worthwhile - there are very few files that large legitimately. My preference is + to store another pointer in the first chunk of indirect pointers - drops + that down to 2047, and adds another chunk (giving another 512MB easily, + and the option to keep chaining if required). At that stage, we'll want to + store state information so we don't read three times for each distinct + read call.
+*/ + din = _sfs_read_block(sfsid,openfd->inode->sin[(offset - direct_pointer_threshold) / (FRAGSIZE * (CHUNKSIZE / sizeof(uint)))]); + dinptr = (uint *)din->buf; + /* index within this chunk of pointers, not across the whole file */ + return dinptr[((offset - direct_pointer_threshold) / FRAGSIZE) % (CHUNKSIZE / sizeof(uint))]; + } +} + +void +_sfs_commit_block(int sfsid, sfs_blockbuf_t *block) +{ + uint64_t dpos; + + dpos = block->diskpos * FRAGSIZE; + lseek(_sfs_mounted[sfsid].fd, dpos, SEEK_SET); + write(_sfs_mounted[sfsid].fd, block->buf, FRAGSIZE); + if (block->type == _SFS_INODE) { + CBIT_SET(_sfs_mounted[sfsid].ibm, block->diskpos); + } +} + +sfs_blockbuf_t * +_sfs_write_block(uint sfsid, uint diskpos, void *buf, int buflen, enum sfs_block_type type) +{ + sfs_blockbuf_t *new; + sfs_blockbuf_t *old; + + printf("DEBUG: sfsid = %d, diskpos = %d, buflen = %d\n",sfsid,diskpos,buflen); + new = old = NULL; + /* If it's an inode, make sure we have it in clean or dirty - gotta + * preserve the inode data */ + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, diskpos)) + _sfs_read_block(sfsid, diskpos); + /* If it's in the clean list, remove it - it's now incorrect */ + if (_sfs_mounted[sfsid].clean) { + _sfs_mounted[sfsid].clean = sfs_splay_find(diskpos,_sfs_mounted[sfsid].clean); + if (_sfs_mounted[sfsid].clean->diskpos == diskpos) { + old = _sfs_mounted[sfsid].clean; + _sfs_mounted[sfsid].clean->refcount = 1; + _sfs_mounted[sfsid].clean = sfs_splay_remove(sfsid, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].dirty = sfs_splay_insert(sfsid, old, _sfs_mounted[sfsid].dirty); + } + } + /* Likewise the dirty list - we'll be simply replacing the contents if it's + * there */ + if (_sfs_mounted[sfsid].dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(diskpos,_sfs_mounted[sfsid].dirty); + if (_sfs_mounted[sfsid].dirty->diskpos == diskpos) { + xfree(_sfs_mounted[sfsid].dirty->buf); + _sfs_mounted[sfsid].dirty->buf = (char *)xcalloc(1, FRAGSIZE); + new = _sfs_mounted[sfsid].dirty; + } + } + if (!new) { + if (!(new = _sfs_blockbuf_create())) +
return NULL; + new->buf = (char *)xcalloc(1, FRAGSIZE); + new->sfsid = sfsid; + new->diskpos = diskpos; + new->dirty = 1; + new->type = type; + CBIT_SET(_sfs_mounted[sfsid].mhb, diskpos); + printf("DEBUG: inserting block into dirty list\n"); + _sfs_mounted[sfsid].dirty = sfs_splay_insert(sfsid, new, _sfs_mounted[sfsid].dirty); + } + memcpy(new->buf,buf,buflen); + return new; +} + +uint +_sfs_allocate_fd(sfs_openfile_t *new) +/* This is to be called only from within an fs thread - no locking ;) */ +{ + sfs_openfile_t *tmp; + uint maxfd; + + maxfd = new->sfsid << 24; + tmp = _sfs_openfiles[new->sfsid]; + while(tmp) { + if (tmp->sfsfd > maxfd) + maxfd = tmp->sfsfd; + tmp = tmp->next; + } + return maxfd+1; +} + +uint +_sfs_allocate_block(int sfsid, int blocktype) +{ + uint i; + int found; + int blocks; + + if (blocktype == _SFS_INODE) + blocks = 2; + else + blocks = 1; +/* First block is always already used - rootblock */ + for(i=1, found=0; i<_sfs_mounted[sfsid].rootblock->numfrags; i += blocks) { + if (!(CBIT_TEST(_sfs_mounted[sfsid].mhb, i))) { + found = 1; + break; + } + } + if (found) { + CBIT_SET(_sfs_mounted[sfsid].mhb, i); + return i; + } else + return 0; +} + +sfs_openfile_t * +_sfs_find_fd(int sfsfd) +{ + sfs_openfile_t *ptr; + + ptr = _sfs_openfiles[sfsfd >> 24]; + while (ptr) { + if (ptr->sfsfd == sfsfd) + break; + ptr = ptr->next; + } + return ptr; +} + +void +_sfs_flush_bitmaps(int sfsid) +{ + printf("DEBUG: Flushing bitmaps\n"); + lseek(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].rootblock->ibmpos, SEEK_SET); + write(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].ibm, _sfs_mounted[sfsid].rootblock->bmlen); + write(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].fbm, _sfs_mounted[sfsid].rootblock->bmlen); +} + +int +_sfs_flush_file(int sfsid, sfs_openfile_t *fd) +{ + sfs_openblock_list *tmp, *nxt; + sfs_openfile_t *tmpfile; + uint diskpos; + + tmpfile = _sfs_openfiles[sfsid]; + while (tmpfile) { + if (tmpfile->sfsfd == fd->sfsfd) { + break; + } + tmpfile = 
tmpfile->next; + } + if (tmpfile) { + if (tmpfile->next) + tmpfile->next->prev = tmpfile->prev; + if (tmpfile->prev) + tmpfile->prev->next = tmpfile->next; + if (tmpfile->prev == NULL) + _sfs_openfiles[sfsid] = tmpfile->next; + } +/* Flush the inode block */ + _sfs_commit_block(sfsid, fd->inodebuf_p); +/* Flush all the single indirect blocks */ + tmp = fd->sibuf_list_p; + while (tmp) { +/* Two variables for after the splay_delete, in case the structure goes away */ + nxt = tmp->next; + diskpos = tmp->buf->diskpos; + _sfs_commit_block(sfsid, tmp->buf); + if (tmp->buf->dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(tmp->buf->diskpos, _sfs_mounted[sfsid].dirty); + _sfs_mounted[sfsid].dirty = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].dirty); + } else { + _sfs_mounted[sfsid].clean = sfs_splay_find(tmp->buf->diskpos, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].clean = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].clean); + } +/* This isn't strictly how we designed it, but there you go */ + CBIT_SET(_sfs_mounted[sfsid].fbm, diskpos); + xfree(tmp); + tmp = nxt; + } + fd->sibuf_list_p = NULL; +/* Flush all the stuff off the double indirect block */ + if (fd->dibuf_p) { +/* indirect pointer - Panic ;) */ + } +/* Go back and flush the inode properly */ + CBIT_SET(_sfs_mounted[sfsid].fbm, fd->inodebuf_p->diskpos); + if (fd->inodebuf_p->dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(fd->inodebuf_p->diskpos, _sfs_mounted[sfsid].dirty); + _sfs_mounted[sfsid].dirty = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].dirty); + } else { + _sfs_mounted[sfsid].clean = sfs_splay_find(fd->inodebuf_p->diskpos, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].clean = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].clean); + } +/* XXX - nowhere else do we actually check the return value of this code. */ +/* Change function to void? */ + return (0); +} + +/* This function prints out the contents of a request - debug function.
*/ +void +_sfs_print_request(sfs_requestor *req) +{ + printf(" fd %d: %d/%d - type %d, state %d, buflen %d\n", + req->sfsfd,req->sfsid,req->sfsinode,req->request_type, + req->request_state,req->buflen); +} Index: squid/src/fs/sfs/store_dir_sfs.c diff -u /dev/null squid/src/fs/sfs/store_dir_sfs.c:1.1.2.12 --- /dev/null Tue Sep 28 18:35:34 2004 +++ squid/src/fs/sfs/store_dir_sfs.c Wed Feb 7 02:18:08 2001 @@ -0,0 +1,1798 @@ + +/* + * $Id$ + * + * DEBUG: section 47 Store Directory Routines + * AUTHOR: Duane Wessels + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ * + */ + +#include "squid.h" + +#include "store_sfs.h" + +#define DefaultLevelOneDirs 16 +#define DefaultLevelTwoDirs 256 +#define STORE_META_BSFSZ 4096 + +typedef struct _RebuildState RebuildState; +struct _RebuildState { + SwapDir *sd; + int n_read; + FILE *log; + int speed; + int curlvl1; + int curlvl2; + struct { + unsigned int need_to_validate:1; + unsigned int clean:1; + unsigned int init:1; + } flags; + int done; + int in_dir; + int fn; + struct dirent *entry; + DIR *td; + char fullpath[SQUID_MAXPATHLEN]; + char fullfilename[SQUID_MAXPATHLEN]; + struct _store_rebuild_data counts; +}; + +static int n_sfs_dirs = 0; +static int *sfs_dir_index = NULL; +MemPool *sfs_state_pool = NULL; +static int sfs_initialised = 0; + +static char *storeSfsDirSwapSubDir(SwapDir *, int subdirn); +static int storeSfsDirVerifyCacheDirs(SwapDir *); +static int storeSfsDirVerifyDirectory(const char *path); +static char *storeSfsDirSwapLogFile(SwapDir *, const char *); +static EVH storeSfsDirRebuildFromDirectory; +static EVH storeSfsDirRebuildFromSwapLog; +static int storeSfsDirGetNextFile(RebuildState *, int *sfileno, int *size); +static StoreEntry *storeSfsDirAddDiskRestore(SwapDir * SD, const cache_key * key, + int file_number, + size_t swap_file_sz, + time_t expires, + time_t timestamp, + time_t lastref, + time_t lastmod, + u_num32 refcount, + u_short flags, + int clean); +static void storeSfsDirRebuild(SwapDir * sd); +static void storeSfsDirCloseTmpSwapLog(SwapDir * sd); +static FILE *storeSfsDirOpenTmpSwapLog(SwapDir *, int *, int *); +static STLOGOPEN storeSfsDirOpenSwapLog; +static STINIT storeSfsDirInit; +static STFREE storeSfsDirFree; +static STLOGCLEANSTART storeSfsDirWriteCleanStart; +static STLOGCLEANNEXTENTRY storeSfsDirCleanLogNextEntry; +static STLOGCLEANWRITE storeSfsDirWriteCleanEntry; +static STLOGCLEANDONE storeSfsDirWriteCleanDone; +static STLOGCLOSE storeSfsDirCloseSwapLog; +static STLOGWRITE storeSfsDirSwapLog; +static STNEWFS storeSfsDirNewfs; +static STDUMP 
storeSfsDirDump; +static STMAINTAINFS storeSfsDirMaintain; +static STCHECKOBJ storeSfsDirCheckObj; +static STREFOBJ storeSfsDirRefObj; +static STUNREFOBJ storeSfsDirUnrefObj; +static STSYNC storeSfsSync; +static QS rev_int_sort; +static int storeSfsDirClean(int swap_index); +static EVH storeSfsDirCleanEvent; +static int storeSfsDirIs(SwapDir * sd); +static int storeSfsFilenoBelongsHere(int fn, int F0, int F1, int F2); +static int storeSfsCleanupDoubleCheck(SwapDir *, StoreEntry *); +static void storeSfsDirStats(SwapDir *, StoreEntry *); +static void storeSfsDirInitBitmap(SwapDir *); +static int storeSfsDirValidFileno(SwapDir *, sfileno, int); +static STCALLBACK storeSfsDirCallback; + +/* + * These functions were ripped straight out of the heart of store_dir.c. + * They assume that the given filenum is on an sfs partition, which may or + * may not be true. + * XXX this evilness should be tidied up at a later date! + */ + +int +storeSfsDirMapBitTest(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + return file_map_bit_test(sfsinfo->map, filn); +} + +void +storeSfsDirMapBitSet(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + file_map_bit_set(sfsinfo->map, filn); +} + +void +storeSfsDirMapBitReset(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + /* + * We have to test the bit before calling file_map_bit_reset. + * file_map_bit_reset doesn't do bounds checking. It assumes + * filn is a valid file number, but it might not be because + * the map is dynamic in size. Also clearing an already clear + * bit puts the map counter out-of-whack.
+ */ + if (file_map_bit_test(sfsinfo->map, filn)) + file_map_bit_reset(sfsinfo->map, filn); +} + +int +storeSfsDirMapBitAllocate(SwapDir * SD) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int fn; + fn = file_map_allocate(sfsinfo->map, sfsinfo->suggest); + file_map_bit_set(sfsinfo->map, fn); + sfsinfo->suggest = fn + 1; + return fn; +} + +/* + * Initialise the sfs bitmap + * + * If there already is a bitmap, and the numobjects is larger than currently + * configured, we allocate a new bitmap and 'grow' the old one into it. + */ +static void +storeSfsDirInitBitmap(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + + if (sfsinfo->map == NULL) { + /* First time */ + sfsinfo->map = file_map_create(); + } else if (sfsinfo->map->max_n_files) { + /* it grew, need to expand */ + /* XXX We don't need it anymore .. */ + } + /* else it shrunk, and we leave the old one in place */ +} + +static char * +storeSfsDirSwapSubDir(SwapDir * sd, int subdirn) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + + LOCAL_ARRAY(char, fullfilename, SQUID_MAXPATHLEN); + assert(0 <= subdirn && subdirn < sfsinfo->l1); + snprintf(fullfilename, SQUID_MAXPATHLEN, "%s/%02X", sd->path, subdirn); + return fullfilename; +} + +static int +storeSfsDirVerifyDirectory(const char *path) +{ + struct stat sb; + if (stat(path, &sb) < 0) { + debug(20, 0) ("%s: %s\n", path, xstrerror()); + return -1; + } + if (S_ISDIR(sb.st_mode) == 0) { + debug(20, 0) ("%s is not a directory\n", path); + return -1; + } + return 0; +} + +/* + * This function is called by storeSfsDirInit(). 
If this returns < 0, + * then Squid exits, complains about swap directories not + * existing, and instructs the admin to run 'squid -z' + */ +static int +storeSfsDirVerifyCacheDirs(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + int j; + const char *path = sd->path; + + if (storeSfsDirVerifyDirectory(path) < 0) + return -1; + for (j = 0; j < sfsinfo->l1; j++) { + path = storeSfsDirSwapSubDir(sd, j); + if (storeSfsDirVerifyDirectory(path) < 0) + return -1; + } + return 0; +} + + +static char * +storeSfsDirSwapLogFile(SwapDir * sd, const char *ext) +{ + LOCAL_ARRAY(char, path, SQUID_MAXPATHLEN); + LOCAL_ARRAY(char, pathtmp, SQUID_MAXPATHLEN); + LOCAL_ARRAY(char, digit, 32); + char *pathtmp2; + if (Config.Log.swap) { + xstrncpy(pathtmp, sd->path, SQUID_MAXPATHLEN - 64); + while (index(pathtmp, '/')) + *index(pathtmp, '/') = '.'; + while (strlen(pathtmp) && pathtmp[strlen(pathtmp) - 1] == '.') + pathtmp[strlen(pathtmp) - 1] = '\0'; + for (pathtmp2 = pathtmp; *pathtmp2 == '.'; pathtmp2++); + snprintf(path, SQUID_MAXPATHLEN - 64, Config.Log.swap, pathtmp2); + if (strncmp(path, Config.Log.swap, SQUID_MAXPATHLEN - 64) == 0) { + strcat(path, "."); + snprintf(digit, 32, "%02d", sd->index); + strncat(path, digit, 3); + } + } else { + xstrncpy(path, sd->path, SQUID_MAXPATHLEN - 64); + strcat(path, "/swap.state"); + } + if (ext) + strncat(path, ext, 16); + return path; +} + +static void +storeSfsDirOpenSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *path; + int fd; + path = storeSfsDirSwapLogFile(sd, NULL); + fd = file_open(path, O_WRONLY | O_CREAT | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", path, xstrerror()); + fatal("storeSfsDirOpenSwapLog: Failed to open swap log."); + } + debug(47, 3) ("Cache Dir #%d log opened on FD %d\n", sd->index, fd); + sfsinfo->swaplog_fd = fd; + if (0 == n_sfs_dirs) + assert(NULL == sfs_dir_index); + n_sfs_dirs++; + assert(n_sfs_dirs <= Config.cacheSwap.n_configured); +} + +static 
void +storeSfsDirCloseSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + if (sfsinfo->swaplog_fd < 0) /* not open */ + return; + file_close(sfsinfo->swaplog_fd); + debug(47, 3) ("Cache Dir #%d log closed on FD %d\n", + sd->index, sfsinfo->swaplog_fd); + sfsinfo->swaplog_fd = -1; + n_sfs_dirs--; + assert(n_sfs_dirs >= 0); + if (0 == n_sfs_dirs) + safe_free(sfs_dir_index); +} + +static void +storeSfsDirInit(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = sd->fsdata; + + static int started_clean_event = 0; + static const char *errmsg = + "\tFailed to verify one of the swap directories, Check cache.log\n" + "\tfor details. Run 'squid -z' to create swap directories\n" + "\tif needed, or if running Squid for the first time."; + storeSfsDirInitBitmap(sd); + + /* Mount the FS */ + assert(sfsinfo->sfsid == 0); + sfsinfo->sfsid = sfs_mount(sd->path); + if (sfsinfo->sfsid < 0) + fatalf("Failed to mount %s!\n", sd->path); + +#if 0 + /* We have to verify it some other way */ + if (storeSfsDirVerifyCacheDirs(sd) < 0) + fatal(errmsg); +#endif + storeSfsDirOpenSwapLog(sd); + storeSfsDirRebuild(sd); + if (!started_clean_event) { + eventAdd("storeDirClean", storeSfsDirCleanEvent, NULL, 15.0, 1); + started_clean_event = 1; + } + (void) storeDirGetBlkSize(sd->path, &sd->fs.blksize); +} + +static void +storeSfsDirRebuildFromDirectory(void *data) +{ + RebuildState *rb = data; + SwapDir *SD = rb->sd; + LOCAL_ARRAY(char, hdr_buf, SM_PAGE_SIZE); + StoreEntry *e = NULL; + StoreEntry tmpe; + cache_key key[MD5_DIGEST_CHARS]; + int sfileno = 0; + int count; + int size; + struct stat sb; + int swap_hdr_len; + int fd = -1; + tlv *tlv_list; + tlv *t; + + sfsblock_t currentEntry; + sfsinfo_t *sfsinfo = rb->sd->fsdata; + int filesize; + + assert(rb != NULL); + debug(20, 3) ("storeSfsDirRebuildFromDirectory: DIR #%d\n", rb->sd->index); + + /* We don't do anything right now */ + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + 
cbdataFree(rb); + return; + + currentEntry = 0; + + for (count = 0; count < rb->speed; count++) { + assert(fd == -1); + fd = sfs_openNextInode(sfsinfo->sfsid,&currentEntry); + if (fd == -2) { + debug(20, 1) ("Done scanning %s dir (%d entries)\n", + rb->sd->path, rb->n_read); + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + } else if (fd < 0) { + continue; + } + assert(fd > -1); + filesize = sfs_filesize(sfsinfo->sfsid,currentEntry); + + if ((++rb->counts.scancount & 0xFFFF) == 0) + debug(20, 3) (" %s %7d files opened so far.\n", + rb->sd->path, rb->counts.scancount); + debug(20, 9) ("file_in: fd=%d %08X\n", fd, currentEntry); + statCounter.syscalls.disk.reads++; + if (sfs_read(fd, hdr_buf, SM_PAGE_SIZE, _SFS_IO_SYNC, NULL) < 0) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: sfs_read(FD %d): %s\n", + fd, xstrerror()); + sfs_close(fd,_SFS_IO_SYNC,NULL); + store_open_disk_fd--; + fd = -1; + continue; + } + swap_hdr_len = 0; + tlv_list = storeSwapMetaUnpack(hdr_buf, &swap_hdr_len); + if (tlv_list == NULL) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: failed to get meta data for file %d\n",currentEntry); + sfs_close(fd,_SFS_IO_SYNC,NULL); + store_open_disk_fd--; + fd = -1; + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + continue; + } + sfs_close(fd,_SFS_IO_SYNC,NULL); + store_open_disk_fd--; + fd = -1; + debug(20, 3) ("storeSfsDirRebuildFromDirectory: successful swap meta unpacking\n"); + memset(key, '\0', MD5_DIGEST_CHARS); + memset(&tmpe, '\0', sizeof(StoreEntry)); + for (t = tlv_list; t; t = t->next) { + switch (t->type) { + case STORE_META_KEY: + assert(t->length == MD5_DIGEST_CHARS); + xmemcpy(key, t->value, MD5_DIGEST_CHARS); + break; + case STORE_META_STD: + assert(t->length == STORE_HDR_METASIZE); + xmemcpy(&tmpe.timestamp, t->value, STORE_HDR_METASIZE); + break; + default: + break; + } + } + storeSwapTLVFree(tlv_list); + tlv_list = NULL; + if
(storeKeyNull(key)) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: NULL key\n"); + sfs_close(fd,_SFS_IO_SYNC,NULL); + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + continue; + } + tmpe.hash.key = key; + if (tmpe.swap_file_sz == 0) { + tmpe.swap_file_sz = filesize; + } else if (tmpe.swap_file_sz == filesize - swap_hdr_len) { + tmpe.swap_file_sz = filesize; + } else if (tmpe.swap_file_sz != filesize) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: SIZE MISMATCH %d!=%d\n", + tmpe.swap_file_sz, filesize); + sfs_close(fd,_SFS_IO_SYNC,NULL); + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + continue; + } + if (EBIT_TEST(tmpe.flags, KEY_PRIVATE)) { + sfs_close(fd,_SFS_IO_SYNC,NULL); + sfs_unlink(sfsinfo->sfsid,currentEntry,_SFS_IO_SYNC,NULL); + rb->counts.badflags++; + continue; + } + e = storeGet(key); + if (e && e->lastref >= tmpe.lastref) { + /* key already exists, current entry is newer */ + /* keep old, ignore new */ + rb->counts.dupcount++; + continue; + } else if (NULL != e) { + /* URL already exists, this swapfile not being used */ + /* junk old, load new */ + storeRelease(e); /* release old entry */ + rb->counts.dupcount++; + } + rb->counts.objcount++; + storeEntryDump(&tmpe, 5); + e = storeSfsDirAddDiskRestore(SD, key, + sfileno, + tmpe.swap_file_sz, + tmpe.expires, + tmpe.timestamp, + tmpe.lastref, + tmpe.lastmod, + tmpe.refcount, /* refcount */ + tmpe.flags, /* flags */ + (int) rb->flags.clean); + storeDirSwapLog(e, SWAP_LOG_ADD); + } + eventAdd("storeRebuild", storeSfsDirRebuildFromDirectory, rb, 0.0, 1); +} + +static void +storeSfsDirRebuildFromSwapLog(void *data) +{ + RebuildState *rb = data; + SwapDir *SD = rb->sd; + StoreEntry *e = NULL; + storeSwapLogData s; + size_t ss = sizeof(storeSwapLogData); + int count; + int used; /* is swapfile already in use? */ + int disk_entry_newer; /* is the log entry newer than current entry? 
*/ + double x; + assert(rb != NULL); + + /* We don't do anything right now */ + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + + /* load a number of objects per invocation */ + for (count = 0; count < rb->speed; count++) { + if (fread(&s, ss, 1, rb->log) != 1) { + debug(20, 1) ("Done reading %s swaplog (%d entries)\n", + rb->sd->path, rb->n_read); + fclose(rb->log); + rb->log = NULL; + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + } + rb->n_read++; + if (s.op <= SWAP_LOG_NOP) + continue; + if (s.op >= SWAP_LOG_MAX) + continue; + /* + * BC: during 2.4 development, we changed the way swap file + * numbers are assigned and stored. The high 16 bits used + * to encode the SD index number. There used to be a call + * to storeDirProperFileno here that re-assigned the index + * bits. Now, for backwards compatibility, we just need + * to mask it off. + */ + s.swap_filen &= 0x00FFFFFF; + debug(20, 3) ("storeSfsDirRebuildFromSwapLog: %s %s %08X\n", + swap_log_op_str[(int) s.op], + storeKeyText(s.key), + s.swap_filen); + if (s.op == SWAP_LOG_ADD) { + (void) 0; + } else if (s.op == SWAP_LOG_DEL) { + if ((e = storeGet(s.key)) != NULL) { + /* + * Make sure we don't unlink the file, it might be + * in use by a subsequent entry. Also note that + * we don't have to subtract from store_swap_size + * because adding to store_swap_size happens in + * the cleanup procedure. 
+ */ + storeExpireNow(e); + storeReleaseRequest(e); + storeSfsDirReplRemove(e); + if (e->swap_filen > -1) { + storeSfsDirMapBitReset(SD, e->swap_filen); + e->swap_filen = -1; + e->swap_dirn = -1; + } + storeRelease(e); + rb->counts.objcount--; + rb->counts.cancelcount++; + } + continue; + } else { + x = log(++rb->counts.bad_log_op) / log(10.0); + if (0.0 == x - (double) (int) x) + debug(20, 1) ("WARNING: %d invalid swap log entries found\n", + rb->counts.bad_log_op); + rb->counts.invalid++; + continue; + } + if ((++rb->counts.scancount & 0xFFF) == 0) { + struct stat sb; + if (0 == fstat(fileno(rb->log), &sb)) + storeRebuildProgress(SD->index, + (int) sb.st_size / ss, rb->n_read); + } + if (!storeSfsDirValidFileno(SD, s.swap_filen, 0)) { + rb->counts.invalid++; + continue; + } + if (EBIT_TEST(s.flags, KEY_PRIVATE)) { + rb->counts.badflags++; + continue; + } + e = storeGet(s.key); + used = storeSfsDirMapBitTest(SD, s.swap_filen); + /* If this URL already exists in the cache, does the swap log + * appear to have a newer entry? Compare 'lastref' from the + * swap log to e->lastref. */ + disk_entry_newer = e ? (s.lastref > e->lastref ? 1 : 0) : 0; + if (used && !disk_entry_newer) { + /* log entry is old, ignore it */ + rb->counts.clashcount++; + continue; + } else if (used && e && e->swap_filen == s.swap_filen && e->swap_dirn == SD->index) { + /* swapfile taken, same URL, newer, update meta */ + if (e->store_status == STORE_OK) { + e->lastref = s.timestamp; + e->timestamp = s.timestamp; + e->expires = s.expires; + e->lastmod = s.lastmod; + e->flags = s.flags; + e->refcount += s.refcount; + storeSfsDirUnrefObj(SD, e); + } else { + debug_trap("storeSfsDirRebuildFromSwapLog: bad condition"); + debug(20, 1) ("\tSee %s:%d\n", __FILE__, __LINE__); + } + continue; + } else if (used) { + /* swapfile in use, not by this URL, log entry is newer */ + /* This is sorta bad: the log entry should NOT be newer at this + * point. 
If the log is dirty, the filesize check should have + * caught this. If the log is clean, there should never be a + * newer entry. */ + debug(20, 1) ("WARNING: newer swaplog entry for dirno %d, fileno %08X\n", + SD->index, s.swap_filen); + /* I'm tempted to remove the swapfile here just to be safe, + * but there is a bad race condition in the NOVM version if + * the swapfile has recently been opened for writing, but + * not yet opened for reading. Because we can't map + * swapfiles back to StoreEntrys, we don't know the state + * of the entry using that file. */ + /* We'll assume the existing entry is valid, probably because + * we're in a slow rebuild and the swap file number got taken + * and the validation procedure hasn't run. */ + assert(rb->flags.need_to_validate); + rb->counts.clashcount++; + continue; + } else if (e && !disk_entry_newer) { + /* key already exists, current entry is newer */ + /* keep old, ignore new */ + rb->counts.dupcount++; + continue; + } else if (e) { + /* key already exists, this swapfile not being used */ + /* junk old, load new */ + storeExpireNow(e); + storeReleaseRequest(e); + storeSfsDirReplRemove(e); + if (e->swap_filen > -1) { + /* Make sure we don't actually unlink the file */ + storeSfsDirMapBitReset(SD, e->swap_filen); + e->swap_filen = -1; + e->swap_dirn = -1; + } + storeRelease(e); + rb->counts.dupcount++; + } else { + /* URL doesn't exist, swapfile not in use */ + /* load new */ + (void) 0; + } + /* update store_swap_size */ + rb->counts.objcount++; + e = storeSfsDirAddDiskRestore(SD, s.key, + s.swap_filen, + s.swap_file_sz, + s.expires, + s.timestamp, + s.lastref, + s.lastmod, + s.refcount, + s.flags, + (int) rb->flags.clean); + storeDirSwapLog(e, SWAP_LOG_ADD); + } + eventAdd("storeRebuild", storeSfsDirRebuildFromSwapLog, rb, 0.0, 1); +} + +static int +storeSfsDirGetNextFile(RebuildState * rb, int *sfileno, int *size) +{ + SwapDir *SD = rb->sd; + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int fd = -1; + int used
= 0; + int dirs_opened = 0; + debug(20, 3) ("storeSfsDirGetNextFile: flag=%d, %d: /%02X/%02X\n", + rb->flags.init, + rb->sd->index, + rb->curlvl1, rb->curlvl2); + if (rb->done) + return -2; + while (fd < 0 && rb->done == 0) { + fd = -1; + if (0 == rb->flags.init) { /* initialize, open first file */ + rb->done = 0; + rb->curlvl1 = 0; + rb->curlvl2 = 0; + rb->in_dir = 0; + rb->flags.init = 1; + assert(Config.cacheSwap.n_configured > 0); + } + if (0 == rb->in_dir) { /* we need to read in a new directory */ + snprintf(rb->fullpath, SQUID_MAXPATHLEN, "%s/%02X/%02X", + rb->sd->path, + rb->curlvl1, + rb->curlvl2); + if (rb->flags.init && rb->td != NULL) + closedir(rb->td); + rb->td = NULL; + if (dirs_opened) + return -1; + rb->td = opendir(rb->fullpath); + dirs_opened++; + if (rb->td == NULL) { + debug(50, 1) ("storeSfsDirGetNextFile: opendir: %s: %s\n", + rb->fullpath, xstrerror()); + } else { + rb->entry = readdir(rb->td); /* skip . and .. */ + rb->entry = readdir(rb->td); + if (rb->entry == NULL && errno == ENOENT) + debug(20, 1) ("storeSfsDirGetNextFile: directory does not exist!.\n"); + debug(20, 3) ("storeSfsDirGetNextFile: Directory %s\n", rb->fullpath); + } + } + if (rb->td != NULL && (rb->entry = readdir(rb->td)) != NULL) { + rb->in_dir++; + if (sscanf(rb->entry->d_name, "%x", &rb->fn) != 1) { + debug(20, 3) ("storeSfsDirGetNextFile: invalid %s\n", + rb->entry->d_name); + continue; + } + if (!storeSfsFilenoBelongsHere(rb->fn, rb->sd->index, rb->curlvl1, rb->curlvl2)) { + debug(20, 3) ("storeSfsDirGetNextFile: %08X does not belong in %d/%d/%d\n", + rb->fn, rb->sd->index, rb->curlvl1, rb->curlvl2); + continue; + } + used = storeSfsDirMapBitTest(SD, rb->fn); + if (used) { + debug(20, 3) ("storeSfsDirGetNextFile: Locked, continuing with next.\n"); + continue; + } + snprintf(rb->fullfilename, SQUID_MAXPATHLEN, "%s/%s", + rb->fullpath, rb->entry->d_name); + debug(20, 3) ("storeSfsDirGetNextFile: Opening %s\n", rb->fullfilename); + fd = file_open(rb->fullfilename, 
O_RDONLY | O_BINARY); + if (fd < 0) + debug(50, 1) ("storeSfsDirGetNextFile: %s: %s\n", rb->fullfilename, xstrerror()); + else + store_open_disk_fd++; + continue; + } + rb->in_dir = 0; + if (++rb->curlvl2 < sfsinfo->l2) + continue; + rb->curlvl2 = 0; + if (++rb->curlvl1 < sfsinfo->l1) + continue; + rb->curlvl1 = 0; + rb->done = 1; + } + *sfileno = rb->fn; + return fd; +} + +/* Add a new object to the cache with empty memory copy and pointer to disk + * use to rebuild store from disk. */ +static StoreEntry * +storeSfsDirAddDiskRestore(SwapDir * SD, const cache_key * key, + int file_number, + size_t swap_file_sz, + time_t expires, + time_t timestamp, + time_t lastref, + time_t lastmod, + u_num32 refcount, + u_short flags, + int clean) +{ + StoreEntry *e = NULL; + debug(20, 5) ("storeSfsAddDiskRestore: %s, fileno=%08X\n", storeKeyText(key), file_number); + /* if you call this you'd better be sure file_number is not + * already in use! */ + e = new_StoreEntry(STORE_ENTRY_WITHOUT_MEMOBJ, NULL, NULL); + e->store_status = STORE_OK; + storeSetMemStatus(e, NOT_IN_MEMORY); + e->swap_status = SWAPOUT_DONE; + e->swap_filen = file_number; + e->swap_dirn = SD->index; + e->swap_file_sz = swap_file_sz; + e->lock_count = 0; + e->lastref = lastref; + e->timestamp = timestamp; + e->expires = expires; + e->lastmod = lastmod; + e->refcount = refcount; + e->flags = flags; + EBIT_SET(e->flags, ENTRY_CACHABLE); + EBIT_CLR(e->flags, RELEASE_REQUEST); + EBIT_CLR(e->flags, KEY_PRIVATE); + e->ping_status = PING_NONE; + EBIT_CLR(e->flags, ENTRY_VALIDATED); + storeSfsDirMapBitSet(SD, e->swap_filen); + storeHashInsert(e, key); /* do it after we clear KEY_PRIVATE */ + storeSfsDirReplAdd(SD, e); + return e; +} + +CBDATA_TYPE(RebuildState); +static void +storeSfsDirRebuild(SwapDir * sd) +{ + RebuildState *rb; + int clean = 0; + int zero = 0; + FILE *fp; + EVH *func = NULL; + CBDATA_INIT_TYPE(RebuildState); + rb = CBDATA_ALLOC(RebuildState, NULL); + rb->sd = sd; + rb->speed = opt_foreground_rebuild 
? 1 << 30 : 50; + /* + * If the swap.state file exists in the cache_dir, then + * we'll use storeSfsDirRebuildFromSwapLog(), otherwise we'll + * use storeSfsDirRebuildFromDirectory() to open up each file + * and suck in the meta data. + */ + fp = storeSfsDirOpenTmpSwapLog(sd, &clean, &zero); + if (fp == NULL || zero) { + if (fp != NULL) + fclose(fp); + func = storeSfsDirRebuildFromDirectory; + } else { + func = storeSfsDirRebuildFromSwapLog; + rb->log = fp; + rb->flags.clean = (unsigned int) clean; + } + if (!clean) + rb->flags.need_to_validate = 1; + debug(20, 1) ("Rebuilding storage in %s (%s)\n", + sd->path, clean ? "CLEAN" : "DIRTY"); + store_dirs_rebuilding++; + eventAdd("storeRebuild", func, rb, 0.0, 1); +} + +static void +storeSfsDirCloseTmpSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *swaplog_path = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + char *new_path = xstrdup(storeSfsDirSwapLogFile(sd, ".new")); + int fd; + file_close(sfsinfo->swaplog_fd); +#if defined (_SQUID_OS2_) || defined (_SQUID_CYGWIN_) + if (unlink(swaplog_path) < 0) { + debug(50, 0) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("storeSfsDirCloseTmpSwapLog: unlink failed"); + } +#endif + if (xrename(new_path, swaplog_path) < 0) { + fatal("storeSfsDirCloseTmpSwapLog: rename failed"); + } + fd = file_open(swaplog_path, O_WRONLY | O_CREAT | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("storeSfsDirCloseTmpSwapLog: Failed to open swap log."); + } + safe_free(swaplog_path); + safe_free(new_path); + sfsinfo->swaplog_fd = fd; + debug(47, 3) ("Cache Dir #%d log opened on FD %d\n", sd->index, fd); +} + +static FILE * +storeSfsDirOpenTmpSwapLog(SwapDir * sd, int *clean_flag, int *zero_flag) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *swaplog_path = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + char *clean_path = xstrdup(storeSfsDirSwapLogFile(sd, ".last-clean")); + char *new_path = 
xstrdup(storeSfsDirSwapLogFile(sd, ".new")); + struct stat log_sb; + struct stat clean_sb; + FILE *fp; + int fd; + if (stat(swaplog_path, &log_sb) < 0) { + debug(47, 1) ("Cache Dir #%d: No log file\n", sd->index); + safe_free(swaplog_path); + safe_free(clean_path); + safe_free(new_path); + return NULL; + } + *zero_flag = log_sb.st_size == 0 ? 1 : 0; + /* close the existing write-only FD */ + if (sfsinfo->swaplog_fd >= 0) + file_close(sfsinfo->swaplog_fd); + /* open a write-only FD for the new log */ + fd = file_open(new_path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", new_path, xstrerror()); + fatal("storeDirOpenTmpSwapLog: Failed to open swap log."); + } + sfsinfo->swaplog_fd = fd; + /* open a read-only stream of the old log */ + fp = fopen(swaplog_path, "r"); + if (fp == NULL) { + debug(50, 0) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("Failed to open swap log for reading"); + } +#if defined(_SQUID_CYGWIN_) + setmode(fileno(fp), O_BINARY); +#endif + memset(&clean_sb, '\0', sizeof(struct stat)); + if (stat(clean_path, &clean_sb) < 0) + *clean_flag = 0; + else if (clean_sb.st_mtime < log_sb.st_mtime) + *clean_flag = 0; + else + *clean_flag = 1; + safeunlink(clean_path, 1); + safe_free(swaplog_path); + safe_free(clean_path); + safe_free(new_path); + return fp; +} + +struct _clean_state { + char *cur; + char *new; + char *cln; + char *outbuf; + off_t outbuf_offset; + int fd; + RemovalPolicyWalker *walker; +}; + +#define CLEAN_BUF_SZ 16384 +/* + * Begin the process to write clean cache state. For SFS this means + * opening some log files and allocating write buffers. Return 0 if + * we succeed, and assign the 'func' and 'data' return pointers. 
+ */ +static int +storeSfsDirWriteCleanStart(SwapDir * sd) +{ + struct _clean_state *state = xcalloc(1, sizeof(*state)); + struct stat sb; + sd->log.clean.write = NULL; + sd->log.clean.state = NULL; + state->new = xstrdup(storeSfsDirSwapLogFile(sd, ".clean")); + state->fd = file_open(state->new, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + if (state->fd < 0) { + xfree(state->new); + xfree(state); + return -1; + } + state->cur = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + state->cln = xstrdup(storeSfsDirSwapLogFile(sd, ".last-clean")); + state->outbuf = xcalloc(CLEAN_BUF_SZ, 1); + state->outbuf_offset = 0; + state->walker = sd->repl->WalkInit(sd->repl); +#if !(defined(_SQUID_OS2_) || defined (_SQUID_CYGWIN_)) + unlink(state->new); +#endif + unlink(state->cln); + debug(20, 3) ("storeDirWriteCleanLogs: opened %s, FD %d\n", + state->new, state->fd); +#if HAVE_FCHMOD + if (stat(state->cur, &sb) == 0) + fchmod(state->fd, sb.st_mode); +#endif + sd->log.clean.write = storeSfsDirWriteCleanEntry; + sd->log.clean.state = state; + return 0; +} + +/* + * Get the next entry that is a candidate for clean log writing + */ +const StoreEntry * +storeSfsDirCleanLogNextEntry(SwapDir * sd) +{ + const StoreEntry *entry = NULL; + struct _clean_state *state = sd->log.clean.state; + if (state->walker) + entry = state->walker->Next(state->walker); + return entry; +} + +/* + * "write" an entry to the clean log file. 
+ */
+static void
+storeSfsDirWriteCleanEntry(SwapDir * sd, const StoreEntry * e)
+{
+    storeSwapLogData s;
+    static size_t ss = sizeof(storeSwapLogData);
+    struct _clean_state *state = sd->log.clean.state;
+    memset(&s, '\0', ss);
+    s.op = (char) SWAP_LOG_ADD;
+    s.swap_filen = e->swap_filen;
+    s.timestamp = e->timestamp;
+    s.lastref = e->lastref;
+    s.expires = e->expires;
+    s.lastmod = e->lastmod;
+    s.swap_file_sz = e->swap_file_sz;
+    s.refcount = e->refcount;
+    s.flags = e->flags;
+    xmemcpy(&s.key, e->hash.key, MD5_DIGEST_CHARS);
+    xmemcpy(state->outbuf + state->outbuf_offset, &s, ss);
+    state->outbuf_offset += ss;
+    /* buffered write */
+    if (state->outbuf_offset + ss > CLEAN_BUF_SZ) {
+        if (write(state->fd, state->outbuf, state->outbuf_offset) < 0) {
+            debug(50, 0) ("storeDirWriteCleanLogs: %s: write: %s\n",
+                state->new, xstrerror());
+            debug(20, 0) ("storeDirWriteCleanLogs: Current swap logfile not replaced.\n");
+            file_close(state->fd);
+            state->fd = -1;
+            unlink(state->new);
+            safe_free(state);
+            sd->log.clean.state = NULL;
+            sd->log.clean.write = NULL;
+            return;	/* state was freed above; don't touch it again */
+        }
+        state->outbuf_offset = 0;
+    }
+}
+
+static void
+storeSfsDirWriteCleanDone(SwapDir * sd)
+{
+    int fd;
+    struct _clean_state *state = sd->log.clean.state;
+    if (NULL == state)
+        return;
+    if (state->fd < 0)
+        return;
+    state->walker->Done(state->walker);
+    if (write(state->fd, state->outbuf, state->outbuf_offset) < 0) {
+        debug(50, 0) ("storeDirWriteCleanLogs: %s: write: %s\n",
+            state->new, xstrerror());
+        debug(20, 0) ("storeDirWriteCleanLogs: Current swap logfile "
+            "not replaced.\n");
+        file_close(state->fd);
+        state->fd = -1;
+        unlink(state->new);
+    }
+    safe_free(state->outbuf);
+    /*
+     * You can't rename open files on Microsoft "operating systems"
+     * so we have to close before renaming.
+ */ + storeSfsDirCloseSwapLog(sd); + /* save the fd value for a later test */ + fd = state->fd; + /* rename */ + if (state->fd >= 0) { +#if defined(_SQUID_OS2_) || defined (_SQUID_CYGWIN_) + file_close(state->fd); + state->fd = -1; + if (unlink(state->cur) < 0) + debug(50, 0) ("storeDirWriteCleanLogs: unlinkd failed: %s, %s\n", + xstrerror(), state->cur); +#endif + xrename(state->new, state->cur); + } + /* touch a timestamp file if we're not still validating */ + if (store_dirs_rebuilding) + (void) 0; + else if (fd < 0) + (void) 0; + else + file_close(file_open(state->cln, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY)); + /* close */ + safe_free(state->cur); + safe_free(state->new); + safe_free(state->cln); + if (state->fd >= 0) + file_close(state->fd); + state->fd = -1; + safe_free(state); + sd->log.clean.state = NULL; + sd->log.clean.write = NULL; +} + +static void +storeSwapLogDataFree(void *s) +{ + memFree(s, MEM_SWAP_LOG_DATA); +} + +static void +storeSfsDirSwapLog(const SwapDir * sd, const StoreEntry * e, int op) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + storeSwapLogData *s = memAllocate(MEM_SWAP_LOG_DATA); + s->op = (char) op; + s->swap_filen = e->swap_filen; + s->timestamp = e->timestamp; + s->lastref = e->lastref; + s->expires = e->expires; + s->lastmod = e->lastmod; + s->swap_file_sz = e->swap_file_sz; + s->refcount = e->refcount; + s->flags = e->flags; + xmemcpy(s->key, e->hash.key, MD5_DIGEST_CHARS); + file_write(sfsinfo->swaplog_fd, + -1, + s, + sizeof(storeSwapLogData), + NULL, + NULL, + (FREE *) storeSwapLogDataFree); +} + +static void +storeSfsDirNewfs(SwapDir * sd) +{ + int sfsid; + + debug(47, 3) ("Creating swap space in %s\n", sd->path); + + /* Check to see whether we have a sfs store. */ + sfsid = sfs_mount(sd->path); + if (sfsid < 0) { + /* it failed, we can do stuff.. */ + /* + * note - the FS *data* size will be max_size, but sfs metadata + * will make it bigger. Just like normal FSes. 
+         */
+        if (sfs_format(sd->path, (sd->max_size * 1024) / FRAGSIZE) < 0)
+            fatalf("error whilst formatting %s! : (%d) %s\n", sd->path,
+                errno, strerror(errno));
+    } else {
+        /* it succeeded, unmount */
+        debug(47, 3) ("Swap space in %s is already formatted\n", sd->path);
+        sfs_umount(sfsid, _SFS_IO_SYNC);
+    }
+
+
+}
+
+static int
+rev_int_sort(const void *A, const void *B)
+{
+    const int *i1 = A;
+    const int *i2 = B;
+    return *i2 - *i1;
+}
+
+static int
+storeSfsDirClean(int swap_index)
+{
+    DIR *dp = NULL;
+    struct dirent *de = NULL;
+    LOCAL_ARRAY(char, p1, MAXPATHLEN + 1);
+    LOCAL_ARRAY(char, p2, MAXPATHLEN + 1);
+#if USE_TRUNCATE
+    struct stat sb;
+#endif
+    int files[20];
+    int swapfileno;
+    int fn;			/* same as swapfileno, but with dirn bits set */
+    int n = 0;
+    int k = 0;
+    int N0, N1, N2;
+    int D0, D1, D2;
+    SwapDir *SD;
+    sfsinfo_t *sfsinfo;
+    N0 = n_sfs_dirs;
+    D0 = sfs_dir_index[swap_index % N0];
+    SD = &Config.cacheSwap.swapDirs[D0];
+    sfsinfo = (sfsinfo_t *) SD->fsdata;
+    N1 = sfsinfo->l1;
+    D1 = (swap_index / N0) % N1;
+    N2 = sfsinfo->l2;
+    D2 = ((swap_index / N0) / N1) % N2;
+    snprintf(p1, SQUID_MAXPATHLEN, "%s/%02X/%02X",
+        Config.cacheSwap.swapDirs[D0].path, D1, D2);
+    debug(36, 3) ("storeDirClean: Cleaning directory %s\n", p1);
+    dp = opendir(p1);
+    if (dp == NULL) {
+        if (errno == ENOENT) {
+            debug(36, 0) ("storeDirClean: WARNING: Creating %s\n", p1);
+            if (mkdir(p1, 0777) == 0)
+                return 0;
+        }
+        debug(50, 0) ("storeDirClean: %s: %s\n", p1, xstrerror());
+        safeunlink(p1, 1);
+        return 0;
+    }
+    while ((de = readdir(dp)) != NULL && k < 20) {
+        if (sscanf(de->d_name, "%X", &swapfileno) != 1)
+            continue;
+        fn = swapfileno;	/* XXX should remove this cruft !
*/ + if (storeSfsDirValidFileno(SD, fn, 1)) + if (storeSfsDirMapBitTest(SD, fn)) + if (storeSfsFilenoBelongsHere(fn, D0, D1, D2)) + continue; +#if USE_TRUNCATE + if (!stat(de->d_name, &sb)) + if (sb.st_size == 0) + continue; +#endif + files[k++] = swapfileno; + } + closedir(dp); + if (k == 0) + return 0; + qsort(files, k, sizeof(int), rev_int_sort); + if (k > 10) + k = 10; + for (n = 0; n < k; n++) { + debug(36, 3) ("storeDirClean: Cleaning file %08X\n", files[n]); + snprintf(p2, MAXPATHLEN + 1, "%s/%08X", p1, files[n]); +#if USE_TRUNCATE + truncate(p2, 0); +#else + safeunlink(p2, 0); +#endif + statCounter.swap.files_cleaned++; + } + debug(36, 3) ("Cleaned %d unused files from %s\n", k, p1); + return k; +} + +static void +storeSfsDirCleanEvent(void *unused) +{ + static int swap_index = 0; + int i; + int j = 0; + int n = 0; + + /* We don't do anything right now */ + return; + /* + * Assert that there are SFS cache_dirs configured, otherwise + * we should never be called. + */ + assert(n_sfs_dirs); + if (NULL == sfs_dir_index) { + SwapDir *sd; + sfsinfo_t *sfsinfo; + /* + * Initialize the little array that translates SFS cache_dir + * number into the Config.cacheSwap.swapDirs array index. + */ + sfs_dir_index = xcalloc(n_sfs_dirs, sizeof(*sfs_dir_index)); + for (i = 0, n = 0; i < Config.cacheSwap.n_configured; i++) { + sd = &Config.cacheSwap.swapDirs[i]; + if (!storeSfsDirIs(sd)) + continue; + sfs_dir_index[n++] = i; + sfsinfo = (sfsinfo_t *) sd->fsdata; + j += (sfsinfo->l1 * sfsinfo->l2); + } + assert(n == n_sfs_dirs); + /* + * Start the storeSfsDirClean() swap_index with a random + * value. 
j equals the total number of SFS level 2 + * swap directories + */ + swap_index = (int) (squid_random() % j); + } + if (0 == store_dirs_rebuilding) { + n = storeSfsDirClean(swap_index); + swap_index++; + } + eventAdd("storeDirClean", storeSfsDirCleanEvent, NULL, + 15.0 * exp(-0.25 * n), 1); +} + +static int +storeSfsDirIs(SwapDir * sd) +{ + if (strncmp(sd->type, "sfs", 3) == 0) + return 1; + return 0; +} + +/* + * Does swapfile number 'fn' belong in cachedir #F0, + * level1 dir #F1, level2 dir #F2? + */ +static int +storeSfsFilenoBelongsHere(int fn, int F0, int F1, int F2) +{ + int D1, D2; + int L1, L2; + int filn = fn; + sfsinfo_t *sfsinfo; + assert(F0 < Config.cacheSwap.n_configured); + sfsinfo = (sfsinfo_t *) Config.cacheSwap.swapDirs[F0].fsdata; + L1 = sfsinfo->l1; + L2 = sfsinfo->l2; + D1 = ((filn / L2) / L2) % L1; + if (F1 != D1) + return 0; + D2 = (filn / L2) % L2; + if (F2 != D2) + return 0; + return 1; +} + +int +storeSfsDirValidFileno(SwapDir * SD, sfileno filn, int flag) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + if (filn < 0) + return 0; + /* + * If flag is set it means out-of-range file number should + * be considered invalid. + */ + if (flag) + if (filn > sfsinfo->map->max_n_files) + return 0; + return 1; +} + +void +storeSfsDirMaintain(SwapDir * SD) +{ + StoreEntry *e = NULL; + int removed = 0; + int max_scan; + int max_remove; + double f; + RemovalPurgeWalker *walker; + /* We can't delete objects while rebuilding swap */ + if (store_dirs_rebuilding) { + return; + } else { + f = (double) (SD->cur_size - SD->low_size) / (SD->max_size - SD->low_size); + f = f < 0.0 ? 0.0 : f > 1.0 ? 1.0 : f; + max_scan = (int) (f * 400.0 + 100.0); + max_remove = (int) (f * 70.0 + 10.0); + /* + * This is kinda cheap, but so we need this priority hack? 
+ */ + } + debug(20, 3) ("storeMaintainSwapSpace: f=%f, max_scan=%d, max_remove=%d\n", f, max_scan, max_remove); + walker = SD->repl->PurgeInit(SD->repl, max_scan); + while (1) { + if (SD->cur_size < SD->low_size) + break; + if (removed >= max_remove) + break; + e = walker->Next(walker); + if (!e) + break; /* no more objects */ + removed++; + storeRelease(e); + } + walker->Done(walker); + debug(20, (removed ? 2 : 3)) ("storeSfsDirMaintain: %s removed %d/%d f=%.03f max_scan=%d\n", + SD->path, removed, max_remove, f, max_scan); +} + +/* + * storeSfsDirCheckObj + * + * This routine is called by storeDirSelectSwapDir to see if the given + * object is able to be stored on this filesystem. SFS filesystems will + * happily store anything as long as the LRU time isn't too small. + */ +int +storeSfsDirCheckObj(SwapDir * SD, const StoreEntry * e) +{ +#if OLD_UNUSED_CODE + if (storeSfsDirExpiredReferenceAge(SD) < 300) { + debug(20, 3) ("storeSfsDirCheckObj: NO: LRU Age = %d\n", + storeSfsDirExpiredReferenceAge(SD)); + /* store_check_cachable_hist.no.lru_age_too_low++; */ + return -1; + } +#endif + /* Return 999 (99.9%) constant load */ + return 0; +} + +/* + * storeSfsDirRefObj + * + * This routine is called whenever an object is referenced, so we can + * maintain replacement information within the storage fs. + */ +void +storeSfsDirRefObj(SwapDir * SD, StoreEntry * e) +{ + debug(1, 3) ("storeSfsDirRefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); + if (SD->repl->Referenced) + SD->repl->Referenced(SD->repl, e, &e->repl); +} + +/* + * storeSfsDirUnrefObj + * This routine is called whenever the last reference to an object is + * removed, to maintain replacement information within the storage fs. 
+ */ +void +storeSfsDirUnrefObj(SwapDir * SD, StoreEntry * e) +{ + debug(1, 3) ("storeSfsDirUnrefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); + if (SD->repl->Dereferenced) + SD->repl->Dereferenced(SD->repl, e, &e->repl); +} + +/* + * storeSfsSync + * + * Sync the filesystem + */ +void +storeSfsSync(SwapDir *SD) +{ + /* Sync the FS */ + /* Handle any pending callbacks */ + while (storeSfsDirCallback(SD) > 0); +} + +/* + * storeSfsDirUnlinkFile + * + * This routine unlinks a file and pulls it out of the bitmap. + * It used to be in storeSfsUnlink(), however an interface change + * forced this bit of code here. Eeek. + */ +void +storeSfsDirUnlinkFile(SwapDir * SD, sfileno f) +{ + sfsinfo_t *sfsinfo = SD->fsdata; + int retval; + + debug(79, 3) ("storeSfsDirUnlinkFile: unlinking fileno %08X\n", f); + /* storeSfsDirMapBitReset(SD, f); */ + retval = sfs_unlink(sfsinfo->sfsid, (sfsblock_t)f, _SFS_IO_ASYNC, NULL); + if (retval < 0) { + debug(79, 1) ("storeSfsDirUnlinkFile: Can't unlink %d/%08X!\n", + SD->index, f); + } +} + +/* + * Add and remove the given StoreEntry from the replacement policy in + * use. + */ + +void +storeSfsDirReplAdd(SwapDir * SD, StoreEntry * e) +{ + debug(20, 4) ("storeSfsDirReplAdd: added node %p to dir %d\n", e, + SD->index); + SD->repl->Add(SD->repl, e, &e->repl); +} + + +void +storeSfsDirReplRemove(StoreEntry * e) +{ + SwapDir *SD = INDEXSD(e->swap_dirn); + debug(20, 4) ("storeSfsDirReplRemove: remove node %p from dir %d\n", e, + SD->index); + SD->repl->Remove(SD->repl, e, &e->repl); +} + +/* + * storeSfsDirCallback + * + * Handle pending IO operations that have completed + */ +int +storeSfsDirCallback(SwapDir *SD) +{ + int retval; + sfsinfo_t *sfsinfo = SD->fsdata; + sfs_requestor *req; + storeIOState *sio; + enum sfs_request_type rtype; + int ops = 0; + + /* XXX using sfs_requestor in here might be considered layer-breaking! 
*/ + while((req = sfs_getcompleted(sfsinfo->sfsid)) != NULL) { + /* Find the sio in question */ + sio = req->dataptr; + rtype = req->request_type; + retval = req->ret; + + /* Remove the requestor from the list */ + _sfs_remove_request(req); + + if (cbdataValid(sio)) { + /* Callback time */ + switch (rtype) { + case _SFS_OP_READ: + storeSfsReadDone(sio, retval); + break; + + case _SFS_OP_WRITE: + storeSfsWriteDone(sio, retval); + break; + + case _SFS_OP_CLOSE: + storeSfsCloseDone(sio, retval); + break; + + case _SFS_OP_OPEN_READ: + case _SFS_OP_OPEN_WRITE: + case _SFS_OP_UNLINK: + case _SFS_OP_SYNC: + case _SFS_OP_UMOUNT: + break; + + default: + debug(20, 1) ("storeSfsDirCallback: unknown op %d\n", + req->request_type); + } + /* Tag that we've done an IO */ + ops = 1; + cbdataUnlock(sio); + } + } + + return ops; +} + + + + +/* ========== LOCAL FUNCTIONS ABOVE, GLOBAL FUNCTIONS BELOW ========== */ + +void +storeSfsDirStats(SwapDir * SD, StoreEntry * sentry) +{ + sfsinfo_t *sfsinfo = SD->fsdata; + int totl_kb = 0; + int free_kb = 0; + int totl_in = 0; + int free_in = 0; + int x; + storeAppendPrintf(sentry, "First level subdirectories: %d\n", sfsinfo->l1); + storeAppendPrintf(sentry, "Second level subdirectories: %d\n", sfsinfo->l2); + storeAppendPrintf(sentry, "Maximum Size: %d KB\n", SD->max_size); + storeAppendPrintf(sentry, "Current Size: %d KB\n", SD->cur_size); + storeAppendPrintf(sentry, "Percent Used: %0.2f%%\n", + 100.0 * SD->cur_size / SD->max_size); + storeAppendPrintf(sentry, "Filemap bits in use: %d of %d (%d%%)\n", + sfsinfo->map->n_files_in_map, sfsinfo->map->max_n_files, + percent(sfsinfo->map->n_files_in_map, sfsinfo->map->max_n_files)); + x = storeDirGetUFSStats(SD->path, &totl_kb, &free_kb, &totl_in, &free_in); + if (0 == x) { + storeAppendPrintf(sentry, "Filesystem Space in use: %d/%d KB (%d%%)\n", + totl_kb - free_kb, + totl_kb, + percent(totl_kb - free_kb, totl_kb)); + storeAppendPrintf(sentry, "Filesystem Inodes in use: %d/%d (%d%%)\n", + totl_in 
- free_in, + totl_in, + percent(totl_in - free_in, totl_in)); + } + storeAppendPrintf(sentry, "Flags:"); + if (SD->flags.selected) + storeAppendPrintf(sentry, " SELECTED"); + if (SD->flags.read_only) + storeAppendPrintf(sentry, " READ-ONLY"); + storeAppendPrintf(sentry, "\n"); +#if OLD_UNUSED_CODE +#if !HEAP_REPLACEMENT + storeAppendPrintf(sentry, "LRU Expiration Age: %6.2f days\n", + (double) storeSfsDirExpiredReferenceAge(SD) / 86400.0); +#else + storeAppendPrintf(sentry, "Storage Replacement Threshold:\t%f\n", + heap_peepminkey(sd.repl.heap.heap)); +#endif +#endif /* OLD_UNUSED_CODE */ +} + +/* + * storeSfsDirReconfigure + * + * This routine is called when the given swapdir needs reconfiguring + */ +void +storeSfsDirReconfigure(SwapDir * sd, int index, char *path) +{ + char *token; + int i; + int size; + int l1; + int l2; + unsigned int read_only = 0; + + i = GetInteger(); + size = i << 10; /* Mbytes to kbytes */ + if (size <= 0) + fatal("storeSfsDirReconfigure: invalid size value"); + i = GetInteger(); + l1 = i; + if (l1 <= 0) + fatal("storeSfsDirReconfigure: invalid level 1 directories value"); + i = GetInteger(); + l2 = i; + if (l2 <= 0) + fatal("storeSfsDirReconfigure: invalid level 2 directories value"); + if ((token = strtok(NULL, w_space))) + if (!strcasecmp(token, "read-only")) + read_only = 1; + + /* just reconfigure it */ + if (size == sd->max_size) + debug(3, 1) ("Cache dir '%s' size remains unchanged at %d KB\n", + path, size); + else + debug(3, 1) ("Cache dir '%s' size changed to %d KB\n", + path, size); + sd->max_size = size; + if (sd->flags.read_only != read_only) + debug(3, 1) ("Cache dir '%s' now %s\n", + path, read_only ? 
"Read-Only" : "Read-Write"); + sd->flags.read_only = read_only; + return; +} + +void +storeSfsDirDump(StoreEntry * entry, const char *name, SwapDir * s) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) s->fsdata; + storeAppendPrintf(entry, "%s %s %s %d %d %d\n", + name, + "sfs", + s->path, + s->max_size >> 10, + sfsinfo->l1, + sfsinfo->l2); +} + +/* + * Only "free" the filesystem specific stuff here + */ +static void +storeSfsDirFree(SwapDir * s) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) s->fsdata; + if (sfsinfo->swaplog_fd > -1) { + file_close(sfsinfo->swaplog_fd); + sfsinfo->swaplog_fd = -1; + } + + /* Sync the FS and handle pending callbacks */ + storeSfsSync(s); + + /* Unmount the FS */ + sfs_umount(sfsinfo->sfsid, _SFS_IO_SYNC); + sfsinfo->sfsid = -1; + + filemapFreeMemory(sfsinfo->map); + xfree(sfsinfo); + s->fsdata = NULL; /* Will aid debugging... */ +} + +char * +storeSfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath) +{ + LOCAL_ARRAY(char, fullfilename, SQUID_MAXPATHLEN); + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int L1 = sfsinfo->l1; + int L2 = sfsinfo->l2; + if (!fullpath) + fullpath = fullfilename; + fullpath[0] = '\0'; + snprintf(fullpath, SQUID_MAXPATHLEN, "%s/%02X/%02X/%08X", + SD->path, + ((filn / L2) / L2) % L1, + (filn / L2) % L2, + filn); + return fullpath; +} + +/* + * storeSfsCleanupDoubleCheck + * + * This is called by storeCleanup() if -S was given on the command line. 
+ */ +static int +storeSfsCleanupDoubleCheck(SwapDir * sd, StoreEntry * e) +{ + struct stat sb; + if (stat(storeSfsDirFullPath(sd, e->swap_filen, NULL), &sb) < 0) { + debug(20, 0) ("storeSfsCleanupDoubleCheck: MISSING SWAP FILE\n"); + debug(20, 0) ("storeSfsCleanupDoubleCheck: FILENO %08X\n", e->swap_filen); + debug(20, 0) ("storeSfsCleanupDoubleCheck: PATH %s\n", + storeSfsDirFullPath(sd, e->swap_filen, NULL)); + storeEntryDump(e, 0); + return -1; + } + if (e->swap_file_sz != sb.st_size) { + debug(20, 0) ("storeSfsCleanupDoubleCheck: SIZE MISMATCH\n"); + debug(20, 0) ("storeSfsCleanupDoubleCheck: FILENO %08X\n", e->swap_filen); + debug(20, 0) ("storeSfsCleanupDoubleCheck: PATH %s\n", + storeSfsDirFullPath(sd, e->swap_filen, NULL)); + debug(20, 0) ("storeSfsCleanupDoubleCheck: ENTRY SIZE: %d, FILE SIZE: %d\n", + e->swap_file_sz, (int) sb.st_size); + storeEntryDump(e, 0); + return -1; + } + return 0; +} + +/* + * storeSfsDirParse + * + * Called when a *new* fs is being setup. + */ +void +storeSfsDirParse(SwapDir * sd, int index, char *path) +{ + char *token; + int i; + int size; + int l1; + int l2; + unsigned int read_only = 0; + sfsinfo_t *sfsinfo; + + i = GetInteger(); + size = i << 10; /* Mbytes to kbytes */ + if (size <= 0) + fatal("storeSfsDirParse: invalid size value"); + i = GetInteger(); + l1 = i; + if (l1 <= 0) + fatal("storeSfsDirParse: invalid level 1 directories value"); + i = GetInteger(); + l2 = i; + if (l2 <= 0) + fatal("storeSfsDirParse: invalid level 2 directories value"); + if ((token = strtok(NULL, w_space))) + if (!strcasecmp(token, "read-only")) + read_only = 1; + + sfsinfo = xmalloc(sizeof(sfsinfo_t)); + if (sfsinfo == NULL) + fatal("storeSfsDirParse: couldn't xmalloc() sfsinfo_t!\n"); + + sd->index = index; + sd->path = xstrdup(path); + sd->max_size = size; + sd->fsdata = sfsinfo; + sfsinfo->l1 = l1; + sfsinfo->l2 = l2; + sfsinfo->swaplog_fd = -1; + sfsinfo->map = NULL; /* Debugging purposes */ + sfsinfo->suggest = 0; + sd->flags.read_only = 
read_only; + sd->init = storeSfsDirInit; + sd->newfs = storeSfsDirNewfs; + sd->dump = storeSfsDirDump; + sd->freefs = storeSfsDirFree; + sd->dblcheck = storeSfsCleanupDoubleCheck; + sd->statfs = storeSfsDirStats; + sd->maintainfs = storeSfsDirMaintain; + sd->checkobj = storeSfsDirCheckObj; + sd->refobj = storeSfsDirRefObj; + sd->unrefobj = storeSfsDirUnrefObj; + sd->callback = storeSfsDirCallback; + sd->sync = storeSfsSync; + sd->obj.create = storeSfsCreate; + sd->obj.open = storeSfsOpen; + sd->obj.close = storeSfsClose; + sd->obj.read = storeSfsRead; + sd->obj.write = storeSfsWrite; + sd->obj.unlink = storeSfsUnlink; + sd->log.open = storeSfsDirOpenSwapLog; + sd->log.close = storeSfsDirCloseSwapLog; + sd->log.write = storeSfsDirSwapLog; + sd->log.clean.start = storeSfsDirWriteCleanStart; + sd->log.clean.nextentry = storeSfsDirCleanLogNextEntry; + sd->log.clean.done = storeSfsDirWriteCleanDone; + + /* Initialise replacement policy stuff */ + sd->repl = createRemovalPolicy(Config.replPolicy); +} + +/* + * Initial setup / end destruction + */ +void +storeSfsDirDone(void) +{ + memPoolDestroy(sfs_state_pool); + sfs_initialised = 0; +} + +void +storeFsSetup_sfs(storefs_entry_t * storefs) +{ + assert(!sfs_initialised); + storefs->parsefunc = storeSfsDirParse; + storefs->reconfigurefunc = storeSfsDirReconfigure; + storefs->donefunc = storeSfsDirDone; + sfs_state_pool = memPoolCreate("SFS IO State data", sizeof(sfsstate_t)); + sfs_initialised = 1; +} Index: squid/src/fs/sfs/store_io_sfs.c diff -u /dev/null squid/src/fs/sfs/store_io_sfs.c:1.1.2.8 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/sfs/store_io_sfs.c Tue Feb 6 07:43:37 2001 @@ -0,0 +1,330 @@ + +/* + * $Id$ + * + * DEBUG: section 79 Storage Manager SFS Interface + * AUTHOR: Duane Wessels + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see 
the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ *
+ */
+
+#include "squid.h"
+#include "store_sfs.h"
+
+
+static void storeSfsIOCallback(storeIOState * sio, int errflag);
+static CBDUNL storeSfsIOFreeEntry;
+
+/* === PUBLIC =========================================================== */
+
+storeIOState *
+storeSfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback,
+    STIOCB * callback, void *callback_data)
+{
+    sfsinfo_t *sfsinfo = SD->fsdata;
+
+    sfileno f = e->swap_filen;
+    storeIOState *sio;
+    sfsfd_t fd;
+
+    debug(79, 3) ("storeSfsOpen: fileno %08X\n", f);
+    sio = NULL;
+
+    sio = CBDATA_ALLOC(storeIOState, storeSfsIOFreeEntry);
+    sio->fsstate = memPoolAlloc(sfs_state_pool);
+
+    fd = sfs_open(sfsinfo->sfsid, f, O_RDONLY, 0, _SFS_IO_SYNC, sio);
+
+    if (fd < 0) {
+        debug(79, 3) ("storeSfsOpen: got failure (%d)\n", errno);
+        cbdataFree(sio);	/* release the half-constructed sio */
+        return NULL;
+    }
+
+    debug(79, 3) ("storeSfsOpen: opened FD %d\n", fd);
+
+    sio->swap_filen = f;
+    sio->swap_dirn = SD->index;
+    sio->mode = O_RDONLY;
+    sio->callback = callback;
+    sio->callback_data = callback_data;
+    cbdataLock(callback_data);
+    sio->e = e;
+    ((sfsstate_t *) (sio->fsstate))->fd = fd;
+    ((sfsstate_t *) (sio->fsstate))->flags.writing = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.reading = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.close_request = 0;
+    ((sfsstate_t *) (sio->fsstate))->swap_filen = -1;
+
+    /* We should update the heap/dlink position here !
*/
+    return sio;
+}
+
+storeIOState *
+storeSfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, STIOCB * callback, void *callback_data)
+{
+    storeIOState *sio;
+    sfsfd_t fd;
+    sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata;
+    sfileno filn;
+    sdirno dirn;
+
+    sio = NULL;
+
+    /* Allocate a number */
+    dirn = SD->index;
+    filn = storeSfsDirMapBitAllocate(SD);
+    sfsinfo->suggest = filn + 1;
+
+    debug(79, 3) ("storeSfsCreate: fileno %08X\n", filn);
+
+    sio = CBDATA_ALLOC(storeIOState, storeSfsIOFreeEntry);
+    sio->fsstate = memPoolAlloc(sfs_state_pool);
+    fd = sfs_open(sfsinfo->sfsid, filn, O_CREAT | O_RDWR, 0, _SFS_IO_SYNC,
+        sio);
+
+    if (fd < 0) {
+        debug(79, 3) ("storeSfsCreate: got failure (%d)\n", errno);
+        storeSfsDirMapBitReset(SD, filn);	/* give back the fileno we allocated */
+        cbdataFree(sio);	/* release the half-constructed sio */
+        return NULL;
+    }
+
+    debug(79, 3) ("storeSfsCreate: opened FD %d\n", fd);
+
+    sio->swap_filen = -1;	/* Defer the actual allocation */
+    sio->swap_dirn = dirn;
+    sio->mode = O_CREAT | O_RDWR;
+    sio->callback = callback;
+    sio->callback_data = callback_data;
+    sio->file_callback = file_callback;
+    cbdataLock(callback_data);
+    sio->e = (StoreEntry *) e;
+    ((sfsstate_t *) (sio->fsstate))->fd = fd;
+    ((sfsstate_t *) (sio->fsstate))->flags.writing = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.reading = 0;
+    ((sfsstate_t *) (sio->fsstate))->flags.close_request = 0;
+    ((sfsstate_t *) (sio->fsstate))->swap_filen = filn;
+
+    /* now insert into the replacement policy */
+    storeSfsDirReplAdd(SD, e);
+    return sio;
+}
+
+void
+storeSfsClose(SwapDir * SD, storeIOState * sio)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+
+    debug(79, 3) ("storeSfsClose: dirno %d, fileno %08X, FD %d\n",
+        sio->swap_dirn, sio->swap_filen, sfsstate->fd);
+    /* storeSfsIOCallback calls sfs_close as part of its normal operation
+     * - who said this interface was untidy?
:( */
+    storeSfsIOCallback(sio, 0);
+}
+
+void
+storeSfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, STRCB * callback, void *callback_data)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+    int retval;
+
+    assert(sio->read.callback == NULL);
+    assert(sio->read.callback_data == NULL);
+    sio->read.callback = callback;
+    sio->read.callback_data = callback_data;
+    cbdataLock(callback_data);
+    debug(79, 3) ("storeSfsRead: dirno %d, fileno %08X, FD %d\n",
+        sio->swap_dirn, sio->swap_filen, sfsstate->fd);
+    sio->offset = offset;
+    sfsstate->flags.reading = 1;
+    assert(sfsstate->read_buf == NULL);
+    sfsstate->read_buf = buf;
+
+    if (offset > -1) {
+        debug(79, 3) ("storeSfsRead: seeking to %d\n", offset);
+        retval = sfs_seek(sfsstate->fd, offset, _SFS_IO_SYNC, NULL);
+        if (retval < 0) {
+            debug(79, 2) ("storeSfsRead: sfs_seek on %d failed!\n",
+                sfsstate->fd);
+            storeSfsIOCallback(sio, DISK_ERROR);
+        }
+    }
+    retval = sfs_read(sfsstate->fd, buf, size, _SFS_IO_ASYNC, sio);
+    if (retval < 0) {
+        debug(79, 2) ("storeSfsRead: sfs_read on %d failed!\n", sfsstate->fd);
+        storeSfsIOCallback(sio, DISK_ERROR);
+    }
+}
+
+void
+storeSfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, FREE * free_func)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+    int retval;
+
+    debug(79, 3) ("storeSfsWrite: dirn %d, fileno %08X, FD %d\n", sio->swap_dirn, sio->swap_filen, sfsstate->fd);
+    sfsstate->flags.writing = 1;
+
+    if (offset > -1) {
+        debug(79, 3) ("storeSfsWrite: seeking to %d\n", offset);
+        retval = sfs_seek(sfsstate->fd, offset, _SFS_IO_SYNC, NULL);
+        if (retval < 0) {
+            debug(79, 2) ("storeSfsWrite: sfs_seek on %d failed!\n",
+                sfsstate->fd);
+            storeSfsIOCallback(sio, DISK_ERROR);
+        }
+    }
+    retval = sfs_write(sfsstate->fd, buf, size, _SFS_IO_ASYNC, sio);
+    if (retval < 0) {
+        debug(79, 2) ("storeSfsWrite: sfs_write on %d failed!\n", sfsstate->fd);
+        storeSfsIOCallback(sio, DISK_ERROR);
+    }
+}
+
+void
+storeSfsUnlink(SwapDir * SD, StoreEntry * e) +{ + debug(79, 3) ("storeSfsUnlink: fileno %08X\n", e->swap_filen); + storeSfsDirReplRemove(e); + storeSfsDirMapBitReset(SD, e->swap_filen); + storeSfsDirUnlinkFile(SD, e->swap_filen); +} + + +void +storeSfsReadDone(storeIOState *sio, int retval) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + STRCB *callback = sio->read.callback; + void *their_data = sio->read.callback_data; + ssize_t rlen; + char *buf = sfsstate->read_buf; + + debug(79, 3) ("storeSfsReadDone: dirno %d, fileno %08X, FD %d, len %d\n", + sio->swap_dirn, sio->swap_filen, sfsstate->fd, retval); + sfsstate->flags.reading = 0; + if (retval < 0) { + debug(79, 3) ("storeSfsReadDone: got failure\n"); + rlen = -1; + } else { + rlen = (ssize_t) retval; + sio->offset += retval; + } + assert(callback); + assert(their_data); + sio->read.callback = NULL; + sio->read.callback_data = NULL; + sfsstate->read_buf = NULL; + + if (cbdataValid(their_data)) + callback(their_data, buf, (size_t) rlen); + cbdataUnlock(their_data); +} + +void +storeSfsWriteDone(storeIOState *sio, int retval) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + debug(79, 3) ("storeSfsWriteDone: dirno %d, fileno %08X, FD %d, len %d\n", + sio->swap_dirn, sio->swap_filen, sfsstate->fd, retval); + sfsstate->flags.writing = 0; + if (retval < 0) { + debug(79, 0) ("storeSfsWriteDone: got failure\n"); + storeSfsIOCallback(sio, DISK_ERROR); + return; + } + sio->offset += retval; +} + +/* + * storeSfsCloseDone - called when we complete the CLOSE op + * Note that if we get here and the sio is invalid we don't + * call the file_callback to notify the upper layers of the + * change in swap filenumber. This has the side effect that + * if we are called due to a sfs_close() done because of an + * error during swapin/out, we don't notify the layers of a + * change in swap filenumber, which is ok. 
+ * :-)
+ * -- adrian, typing justified text again
+ */
+void
+storeSfsCloseDone(storeIOState *sio, int retval)
+{
+    sfsstate_t *sfsstate = sio->fsstate;
+    int errflag;
+
+    debug(79, 3) ("storeSfsCloseDone: dirno %d, fileno %08X\n",
+	sio->swap_dirn, sfsstate->swap_filen);
+
+    if (retval == 0)
+	errflag = DISK_OK;
+    else
+	errflag = DISK_ERROR;
+
+    /* Call back the filen notify */
+    if ((retval == 0) && sio->file_callback && cbdataValid(sio) &&
+	cbdataValid(sio->callback_data)) {
+	sio->swap_filen = sfsstate->swap_filen;
+	sio->file_callback(sio->callback_data, 0, sio);
+    }
+
+    if (cbdataValid(sio->callback_data))
+	sio->callback(sio->callback_data, errflag, sio);
+    cbdataUnlock(sio->callback_data);
+    sio->callback_data = NULL;
+    sio->callback = NULL;
+    cbdataFree(sio);
+}
+
+
+/* === STATIC =========================================================== */
+
+static void
+storeSfsIOCallback(storeIOState * sio, int errflag)
+{
+    sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate;
+    int retval;
+
+    debug(79, 3) ("storeSfsIOCallback: errflag=%d\n", errflag);
+    if (sfsstate->fd > -1) {
+	retval = sfs_close(sfsstate->fd, _SFS_IO_ASYNC, sio);
+	if (retval < 0) {
+	    debug(79, 1) ("storeSfsIOCallback: Can't close %d/%08X!\n",
+		sio->swap_dirn, sfsstate->fd);
+	}
+    }
+
+    /* The rest of the shutdown will get run in storeSfsCloseDone() */
+}
+
+
+/*
+ * Clean up any references from the SIO before it gets released.
+ */ +static void +storeSfsIOFreeEntry(void *sio) +{ + memPoolFree(sfs_state_pool, ((storeIOState *) sio)->fsstate); +} Index: squid/src/fs/sfs/store_sfs.h diff -u /dev/null squid/src/fs/sfs/store_sfs.h:1.1.2.5 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/sfs/store_sfs.h Tue Feb 6 07:43:37 2001 @@ -0,0 +1,62 @@ +/* + * store_sfs.h + * + * Internal declarations for the sfs routines + */ + +#ifndef __STORE_SFS_H__ +#define __STORE_SFS_H__ + +#include "sfs_defines.h" +#include "sfs_lib.h" + +struct _sfsinfo_t { + int swaplog_fd; + int l1; + int l2; + fileMap *map; + int suggest; + sfsid_t sfsid; /* The SFS mount id .. */ +}; + +struct _sfsstate_t { + sfsfd_t fd; + char *read_buf; + struct { + unsigned int close_request:1; + unsigned int reading:1; + unsigned int writing:1; + } flags; + int swap_filen; +}; + +typedef struct _sfsinfo_t sfsinfo_t; +typedef struct _sfsstate_t sfsstate_t; + +/* The sfs_state memory pool */ +extern MemPool *sfs_state_pool; + +/* + * store dir stuff + */ +extern void storeSfsDirMapBitReset(SwapDir *, sfileno); +extern int storeSfsDirMapBitAllocate(SwapDir *); +extern char *storeSfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath); +extern void storeSfsDirUnlinkFile(SwapDir *, sfileno); +extern void storeSfsDirReplAdd(SwapDir * SD, StoreEntry *); +extern void storeSfsDirReplRemove(StoreEntry *); +extern int sfs_openNextInode(sfsid_t sfsid, sfsblock_t *cur); + +/* + * Store IO stuff + */ +extern STOBJCREATE storeSfsCreate; +extern STOBJOPEN storeSfsOpen; +extern STOBJCLOSE storeSfsClose; +extern STOBJREAD storeSfsRead; +extern STOBJWRITE storeSfsWrite; +extern STOBJUNLINK storeSfsUnlink; +extern void storeSfsReadDone(storeIOState *, int); +extern void storeSfsWriteDone(storeIOState *, int); +extern void storeSfsCloseDone(storeIOState *, int); +#endif Index: squid/src/fs/ufs/Makefile.in diff -u squid/src/fs/ufs/Makefile.in:1.2 squid/src/fs/ufs/Makefile.in:1.2.34.1 --- squid/src/fs/ufs/Makefile.in:1.2 Sat Oct 21 09:44:46 2000 
+++ squid/src/fs/ufs/Makefile.in	Sat Apr 14 02:46:36 2001
@@ -1,10 +1,10 @@
 #
-#  Makefile for the UFS storage driver for the Squid Object Cache server
+#  Makefile for the AUFS storage driver for the Squid Object Cache server
 #
 #  $Id$
 #
 
-FS		= ufs
+FS		= aufs
 
 top_srcdir	= @top_srcdir@
 VPATH		= @srcdir@
@@ -22,11 +22,15 @@
 
 OUT		= ../$(FS).a
 
 OBJS		= \
+		aiops.o \
+		async_io.o \
 		store_dir_ufs.o \
-		store_io_ufs.o
+		store_io_ufs.o \
+		fs_aufs.o \
+		fs_ufs.o
 
-all: $(OUT)
+all: $(OUT)
 
 $(OUT): $(OBJS)
 	@rm -f ../stamp
@@ -34,6 +38,7 @@
 	$(RANLIB) $(OUT)
 
 $(OBJS): $(top_srcdir)/include/version.h ../../../include/autoconf.h
+$(OBJS): fs_structs.h
 
 .c.o:
 	@rm -f ../stamp
Index: squid/src/fs/ufs/README
diff -u /dev/null squid/src/fs/ufs/README:1.1.2.1
--- /dev/null	Tue Sep 28 18:35:35 2004
+++ squid/src/fs/ufs/README	Sat Apr 14 02:46:36 2001
@@ -0,0 +1,23 @@
+Ok, quick run-down of contents:
+
+store_io_ufs.c - this holds the store* interface functions, as per the
+programming guide. These are shared functions between aio and io (old
+aufs and ufs) - they feed into the new io layer.
+
+fs_ufs.c - this holds the basic ufs functions - the buildRequest and submitRequest
+functions, essentially the new interface. These functions are leaned on heavily
+by aufs/aio; they represent the bulk of the shared code.
+
+fs_aufs.c - this is the async covering over ufs; it handles the ctrlp structure
+and feeds requests into the queueing that aio requires. It should be a fairly
+small shim over fs_ufs, and parts of it may vanish as time goes on.
+
+store_dir_ufs.c - the old store_dir_ufs/store_dir_aufs file, as yet untranslated
+in terms of function/structure names. This again is pretty much all shared
+code - see http://www.squid-cache.org/mail-archive/squid-dev/200001/0043.html
+for more details ;)
+
+fs_structs.h - structure definitions.
+
+aiops.c and async_io.c - from old aufs, untranslated as yet, but will make up
+the body of the queueing code for aio.
Index: squid/src/fs/ufs/aiops.c diff -u /dev/null squid/src/fs/ufs/aiops.c:1.1.2.1 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/aiops.c Sat Apr 14 02:46:36 2001 @@ -0,0 +1,904 @@ +/* + * $Id$ + * + * DEBUG: section 43 AIOPS + * AUTHOR: Stewart Forster + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ *
+ */
+
+#include "squid.h"
+#include "store_asyncufs.h"
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <errno.h>
+#include <dirent.h>
+#include <signal.h>
+#if HAVE_SCHED_H
+#include <sched.h>
+#endif
+
+#define RIDICULOUS_LENGTH	4096
+
+enum _aio_thread_status {
+    _THREAD_STARTING = 0,
+    _THREAD_WAITING,
+    _THREAD_BUSY,
+    _THREAD_FAILED,
+    _THREAD_DONE
+};
+
+enum _aio_request_type {
+    _AIO_OP_NONE = 0,
+    _AIO_OP_OPEN,
+    _AIO_OP_READ,
+    _AIO_OP_WRITE,
+    _AIO_OP_CLOSE,
+    _AIO_OP_UNLINK,
+    _AIO_OP_TRUNCATE,
+    _AIO_OP_OPENDIR,
+    _AIO_OP_STAT
+};
+
+typedef struct aio_request_t {
+    struct aio_request_t *next;
+    enum _aio_request_type request_type;
+    int cancelled;
+    char *path;
+    int oflag;
+    mode_t mode;
+    int fd;
+    char *bufferp;
+    char *tmpbufp;
+    int buflen;
+    off_t offset;
+    int whence;
+    int ret;
+    int err;
+    struct stat *tmpstatp;
+    struct stat *statp;
+    aio_result_t *resultp;
+} aio_request_t;
+
+typedef struct aio_request_queue_t {
+    pthread_mutex_t mutex;
+    pthread_cond_t cond;
+    aio_request_t *volatile head;
+    aio_request_t *volatile *volatile tailp;
+    unsigned long requests;
+    unsigned long blocked;	/* main failed to lock the queue */
+} aio_request_queue_t;
+
+typedef struct aio_thread_t aio_thread_t;
+struct aio_thread_t {
+    aio_thread_t *next;
+    pthread_t thread;
+    enum _aio_thread_status status;
+    struct aio_request_t *current_req;
+    unsigned long requests;
+};
+
+int aio_cancel(aio_result_t *);
+int aio_open(const char *, int, mode_t, aio_result_t *);
+int aio_read(int, char *, int, off_t, int, aio_result_t *);
+int aio_write(int, char *, int, off_t, int, aio_result_t *);
+int aio_close(int, aio_result_t *);
+int aio_unlink(const char *, aio_result_t *);
+int aio_truncate(const char *, off_t length, aio_result_t *);
+int aio_opendir(const char *, aio_result_t *);
+aio_result_t *aio_poll_done();
+int aio_sync(void);
+
+static void aio_init(void);
+static void aio_queue_request(aio_request_t *);
+static void aio_cleanup_request(aio_request_t *);
+static void
*aio_thread_loop(void *); +static void aio_do_open(aio_request_t *); +static void aio_do_read(aio_request_t *); +static void aio_do_write(aio_request_t *); +static void aio_do_close(aio_request_t *); +static void aio_do_stat(aio_request_t *); +static void aio_do_unlink(aio_request_t *); +static void aio_do_truncate(aio_request_t *); +#if AIO_OPENDIR +static void *aio_do_opendir(aio_request_t *); +#endif +static void aio_debug(aio_request_t *); +static void aio_poll_queues(void); + +static aio_thread_t *threads = NULL; +static int aio_initialised = 0; + + +#define AIO_LARGE_BUFS 16384 +#define AIO_MEDIUM_BUFS AIO_LARGE_BUFS >> 1 +#define AIO_SMALL_BUFS AIO_LARGE_BUFS >> 2 +#define AIO_TINY_BUFS AIO_LARGE_BUFS >> 3 +#define AIO_MICRO_BUFS 128 + +static MemPool *aio_large_bufs = NULL; /* 16K */ +static MemPool *aio_medium_bufs = NULL; /* 8K */ +static MemPool *aio_small_bufs = NULL; /* 4K */ +static MemPool *aio_tiny_bufs = NULL; /* 2K */ +static MemPool *aio_micro_bufs = NULL; /* 128K */ + +static int request_queue_len = 0; +static MemPool *aio_request_pool = NULL; +static MemPool *aio_thread_pool = NULL; +static aio_request_queue_t request_queue; +static struct { + aio_request_t *head, **tailp; +} request_queue2 = { + + NULL, &request_queue2.head +}; +static aio_request_queue_t done_queue; +static struct { + aio_request_t *head, **tailp; +} done_requests = { + + NULL, &done_requests.head +}; +static pthread_attr_t globattr; +static struct sched_param globsched; +static pthread_t main_thread; + +static MemPool * +aio_get_pool(int size) +{ + MemPool *p; + if (size <= AIO_LARGE_BUFS) { + if (size <= AIO_MICRO_BUFS) + p = aio_micro_bufs; + else if (size <= AIO_TINY_BUFS) + p = aio_tiny_bufs; + else if (size <= AIO_SMALL_BUFS) + p = aio_small_bufs; + else if (size <= AIO_MEDIUM_BUFS) + p = aio_medium_bufs; + else + p = aio_large_bufs; + } else + p = NULL; + return p; +} + +static void * +aio_xmalloc(int size) +{ + void *p; + MemPool *pool; + + if ((pool = 
aio_get_pool(size)) != NULL) { + p = memPoolAlloc(pool); + } else + p = xmalloc(size); + + return p; +} + +static char * +aio_xstrdup(const char *str) +{ + char *p; + int len = strlen(str) + 1; + + p = aio_xmalloc(len); + strncpy(p, str, len); + + return p; +} + +static void +aio_xfree(void *p, int size) +{ + MemPool *pool; + + if ((pool = aio_get_pool(size)) != NULL) { + memPoolFree(pool, p); + } else + xfree(p); +} + +static void +aio_xstrfree(char *str) +{ + MemPool *pool; + int len = strlen(str) + 1; + + if ((pool = aio_get_pool(len)) != NULL) { + memPoolFree(pool, str); + } else + xfree(str); +} + +static void +aio_init(void) +{ + int i; + aio_thread_t *threadp; + + if (aio_initialised) + return; + + pthread_attr_init(&globattr); +#if HAVE_PTHREAD_ATTR_SETSCOPE + pthread_attr_setscope(&globattr, PTHREAD_SCOPE_SYSTEM); +#endif + globsched.sched_priority = 1; + main_thread = pthread_self(); +#if HAVE_PTHREAD_SETSCHEDPARAM + pthread_setschedparam(main_thread, SCHED_OTHER, &globsched); +#endif + globsched.sched_priority = 2; +#if HAVE_PTHREAD_ATTR_SETSCHEDPARAM + pthread_attr_setschedparam(&globattr, &globsched); +#endif + + /* Initialize request queue */ + if (pthread_mutex_init(&(request_queue.mutex), NULL)) + fatal("Failed to create mutex"); + if (pthread_cond_init(&(request_queue.cond), NULL)) + fatal("Failed to create condition variable"); + request_queue.head = NULL; + request_queue.tailp = &request_queue.head; + request_queue.requests = 0; + request_queue.blocked = 0; + + /* Initialize done queue */ + if (pthread_mutex_init(&(done_queue.mutex), NULL)) + fatal("Failed to create mutex"); + if (pthread_cond_init(&(done_queue.cond), NULL)) + fatal("Failed to create condition variable"); + done_queue.head = NULL; + done_queue.tailp = &done_queue.head; + done_queue.requests = 0; + done_queue.blocked = 0; + + /* Create threads and get them to sit in their wait loop */ + aio_thread_pool = memPoolCreate("aio_thread", sizeof(aio_thread_t)); + for (i = 0; i < 
NUMTHREADS; i++) { + threadp = memPoolAlloc(aio_thread_pool); + threadp->status = _THREAD_STARTING; + threadp->current_req = NULL; + threadp->requests = 0; + threadp->next = threads; + threads = threadp; + if (pthread_create(&threadp->thread, &globattr, aio_thread_loop, threadp)) { + fprintf(stderr, "Thread creation failed\n"); + threadp->status = _THREAD_FAILED; + continue; + } + } + + /* Create request pool */ + aio_request_pool = memPoolCreate("aio_request", sizeof(aio_request_t)); + aio_large_bufs = memPoolCreate("aio_large_bufs", AIO_LARGE_BUFS); + aio_medium_bufs = memPoolCreate("aio_medium_bufs", AIO_MEDIUM_BUFS); + aio_small_bufs = memPoolCreate("aio_small_bufs", AIO_SMALL_BUFS); + aio_tiny_bufs = memPoolCreate("aio_tiny_bufs", AIO_TINY_BUFS); + aio_micro_bufs = memPoolCreate("aio_micro_bufs", AIO_MICRO_BUFS); + + aio_initialised = 1; +} + + +static void * +aio_thread_loop(void *ptr) +{ + aio_thread_t *threadp = ptr; + aio_request_t *request; + sigset_t new; + + /* + * Make sure to ignore signals which may possibly get sent to + * the parent squid thread. 
Causes havoc with mutex's and + * condition waits otherwise + */ + + sigemptyset(&new); + sigaddset(&new, SIGPIPE); + sigaddset(&new, SIGCHLD); +#ifdef _SQUID_LINUX_THREADS_ + sigaddset(&new, SIGQUIT); + sigaddset(&new, SIGTRAP); +#else + sigaddset(&new, SIGUSR1); + sigaddset(&new, SIGUSR2); +#endif + sigaddset(&new, SIGHUP); + sigaddset(&new, SIGTERM); + sigaddset(&new, SIGINT); + sigaddset(&new, SIGALRM); + pthread_sigmask(SIG_BLOCK, &new, NULL); + + while (1) { + threadp->current_req = request = NULL; + request = NULL; + /* Get a request to process */ + threadp->status = _THREAD_WAITING; + pthread_mutex_lock(&request_queue.mutex); + while (!request_queue.head) { + pthread_cond_wait(&request_queue.cond, &request_queue.mutex); + } + request = request_queue.head; + if (request) + request_queue.head = request->next; + if (!request_queue.head) + request_queue.tailp = &request_queue.head; + pthread_mutex_unlock(&request_queue.mutex); + /* process the request */ + threadp->status = _THREAD_BUSY; + request->next = NULL; + threadp->current_req = request; + errno = 0; + if (!request->cancelled) { + switch (request->request_type) { + case _AIO_OP_OPEN: + aio_do_open(request); + break; + case _AIO_OP_READ: + aio_do_read(request); + break; + case _AIO_OP_WRITE: + aio_do_write(request); + break; + case _AIO_OP_CLOSE: + aio_do_close(request); + break; + case _AIO_OP_UNLINK: + aio_do_unlink(request); + break; + case _AIO_OP_TRUNCATE: + aio_do_truncate(request); + break; +#if AIO_OPENDIR /* Opendir not implemented yet */ + case _AIO_OP_OPENDIR: + aio_do_opendir(request); + break; +#endif + case _AIO_OP_STAT: + aio_do_stat(request); + break; + default: + request->ret = -1; + request->err = EINVAL; + break; + } + } else { /* cancelled */ + request->ret = -1; + request->err = EINTR; + } + threadp->status = _THREAD_DONE; + /* put the request in the done queue */ + pthread_mutex_lock(&done_queue.mutex); + *done_queue.tailp = request; + done_queue.tailp = &request->next; + 
pthread_mutex_unlock(&done_queue.mutex); + threadp->requests++; + } /* while forever */ + return NULL; +} /* aio_thread_loop */ + +static void +aio_queue_request(aio_request_t * request) +{ + static int high_start = 0; + debug(41, 9) ("aio_queue_request: %p type=%d result=%p\n", + request, request->request_type, request->resultp); + /* Mark it as not executed (failing result, no error) */ + request->ret = -1; + request->err = 0; + /* Internal housekeeping */ + request_queue_len += 1; + request->resultp->_data = request; + /* Play some tricks with the request_queue2 queue */ + request->next = NULL; + if (!request_queue2.head) { + if (pthread_mutex_trylock(&request_queue.mutex) == 0) { + /* Normal path */ + *request_queue.tailp = request; + request_queue.tailp = &request->next; + pthread_cond_signal(&request_queue.cond); + pthread_mutex_unlock(&request_queue.mutex); + } else { + /* Oops, the request queue is blocked, use request_queue2 */ + *request_queue2.tailp = request; + request_queue2.tailp = &request->next; + } + } else { + /* Secondary path. 
We have blocked requests to deal with */ + /* add the request to the chain */ + *request_queue2.tailp = request; + if (pthread_mutex_trylock(&request_queue.mutex) == 0) { + /* Ok, the queue is no longer blocked */ + *request_queue.tailp = request_queue2.head; + request_queue.tailp = &request->next; + pthread_cond_signal(&request_queue.cond); + pthread_mutex_unlock(&request_queue.mutex); + request_queue2.head = NULL; + request_queue2.tailp = &request_queue2.head; + } else { + /* still blocked, bump the blocked request chain */ + request_queue2.tailp = &request->next; + } + } + if (request_queue2.head) { + static int filter = 0; + static int filter_limit = 8; + if (++filter >= filter_limit) { + filter_limit += filter; + filter = 0; + debug(43, 1) ("aio_queue_request: WARNING - Queue congestion\n"); + } + } + /* Warn if out of threads */ + if (request_queue_len > MAGIC1) { + static int last_warn = 0; + static int queue_high, queue_low; + if (high_start == 0) { + high_start = squid_curtime; + queue_high = request_queue_len; + queue_low = request_queue_len; + } + if (request_queue_len > queue_high) + queue_high = request_queue_len; + if (request_queue_len < queue_low) + queue_low = request_queue_len; + if (squid_curtime >= (last_warn + 15) && + squid_curtime >= (high_start + 5)) { + debug(43, 1) ("aio_queue_request: WARNING - Disk I/O overloading\n"); + if (squid_curtime >= (high_start + 15)) + debug(43, 1) ("aio_queue_request: Queue Length: current=%d, high=%d, low=%d, duration=%d\n", + request_queue_len, queue_high, queue_low, squid_curtime - high_start); + last_warn = squid_curtime; + } + } else { + high_start = 0; + } + /* Warn if seriously overloaded */ + if (request_queue_len > RIDICULOUS_LENGTH) { + debug(43, 0) ("aio_queue_request: Async request queue growing uncontrollably!\n"); + debug(43, 0) ("aio_queue_request: Syncing pending I/O operations.. 
(blocking)\n"); + aio_sync(); + debug(43, 0) ("aio_queue_request: Synced\n"); + } +} /* aio_queue_request */ + +static void +aio_cleanup_request(aio_request_t * requestp) +{ + aio_result_t *resultp = requestp->resultp; + int cancelled = requestp->cancelled; + + /* Free allocated structures and copy data back to user space if the */ + /* request hasn't been cancelled */ + switch (requestp->request_type) { + case _AIO_OP_STAT: + if (!cancelled && requestp->ret == 0) + xmemcpy(requestp->statp, requestp->tmpstatp, sizeof(struct stat)); + aio_xfree(requestp->tmpstatp, sizeof(struct stat)); + aio_xstrfree(requestp->path); + break; + case _AIO_OP_OPEN: + if (cancelled && requestp->ret >= 0) + /* The open() was cancelled but completed */ + close(requestp->ret); + aio_xstrfree(requestp->path); + break; + case _AIO_OP_CLOSE: + if (cancelled && requestp->ret < 0) + /* The close() was cancelled and never got executed */ + close(requestp->fd); + break; + case _AIO_OP_UNLINK: + case _AIO_OP_TRUNCATE: + case _AIO_OP_OPENDIR: + aio_xstrfree(requestp->path); + break; + case _AIO_OP_READ: + if (!cancelled && requestp->ret > 0) + xmemcpy(requestp->bufferp, requestp->tmpbufp, requestp->ret); + aio_xfree(requestp->tmpbufp, requestp->buflen); + break; + case _AIO_OP_WRITE: + aio_xfree(requestp->tmpbufp, requestp->buflen); + break; + default: + break; + } + if (resultp != NULL && !cancelled) { + resultp->aio_return = requestp->ret; + resultp->aio_errno = requestp->err; + } + memPoolFree(aio_request_pool, requestp); +} /* aio_cleanup_request */ + + +int +aio_cancel(aio_result_t * resultp) +{ + aio_request_t *request = resultp->_data; + + if (request && request->resultp == resultp) { + debug(41, 9) ("aio_cancel: %p type=%d result=%p\n", + request, request->request_type, request->resultp); + request->cancelled = 1; + request->resultp = NULL; + resultp->_data = NULL; + return 0; + } + return 1; +} /* aio_cancel */ + + +int +aio_open(const char *path, int oflag, mode_t mode, aio_result_t * 
resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = (char *) aio_xstrdup(path); + requestp->oflag = oflag; + requestp->mode = mode; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_OPEN; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_open(aio_request_t * requestp) +{ + requestp->ret = open(requestp->path, requestp->oflag, requestp->mode); + requestp->err = errno; +} + + +int +aio_read(int fd, char *bufp, int bufs, off_t offset, int whence, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->fd = fd; + requestp->bufferp = bufp; + requestp->tmpbufp = (char *) aio_xmalloc(bufs); + requestp->buflen = bufs; + requestp->offset = offset; + requestp->whence = whence; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_READ; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_read(aio_request_t * requestp) +{ + lseek(requestp->fd, requestp->offset, requestp->whence); + requestp->ret = read(requestp->fd, requestp->tmpbufp, requestp->buflen); + requestp->err = errno; +} + + +int +aio_write(int fd, char *bufp, int bufs, off_t offset, int whence, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->fd = fd; + requestp->tmpbufp = (char *) aio_xmalloc(bufs); + xmemcpy(requestp->tmpbufp, bufp, bufs); + requestp->buflen = bufs; + requestp->offset = offset; + requestp->whence = whence; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_WRITE; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_write(aio_request_t * requestp) +{ + requestp->ret = write(requestp->fd, requestp->tmpbufp, requestp->buflen); + 
requestp->err = errno; +} + + +int +aio_close(int fd, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->fd = fd; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_CLOSE; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_close(aio_request_t * requestp) +{ + requestp->ret = close(requestp->fd); + requestp->err = errno; +} + + +int +aio_stat(const char *path, struct stat *sb, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = (char *) aio_xstrdup(path); + requestp->statp = sb; + requestp->tmpstatp = (struct stat *) aio_xmalloc(sizeof(struct stat)); + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_STAT; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_stat(aio_request_t * requestp) +{ + requestp->ret = stat(requestp->path, requestp->tmpstatp); + requestp->err = errno; +} + + +int +aio_unlink(const char *path, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = aio_xstrdup(path); + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_UNLINK; + requestp->cancelled = 0; + + aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_unlink(aio_request_t * requestp) +{ + requestp->ret = unlink(requestp->path); + requestp->err = errno; +} + +int +aio_truncate(const char *path, off_t length, aio_result_t * resultp) +{ + aio_request_t *requestp; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + requestp->path = (char *) aio_xstrdup(path); + requestp->offset = length; + requestp->resultp = resultp; + requestp->request_type = _AIO_OP_TRUNCATE; + requestp->cancelled = 0; + + 
aio_queue_request(requestp); + return 0; +} + + +static void +aio_do_truncate(aio_request_t * requestp) +{ + requestp->ret = truncate(requestp->path, requestp->offset); + requestp->err = errno; +} + + +#if AIO_OPENDIR +/* XXX aio_opendir NOT implemented yet.. */ + +int +aio_opendir(const char *path, aio_result_t * resultp) +{ + aio_request_t *requestp; + int len; + + if (!aio_initialised) + aio_init(); + requestp = memPoolAlloc(aio_request_pool); + return -1; +} + +static void +aio_do_opendir(aio_request_t * requestp) +{ + /* NOT IMPLEMENTED */ +} + +#endif + +static void +aio_poll_queues(void) +{ + /* kick "overflow" request queue */ + if (request_queue2.head && + pthread_mutex_trylock(&request_queue.mutex) == 0) { + *request_queue.tailp = request_queue2.head; + request_queue.tailp = request_queue2.tailp; + pthread_cond_signal(&request_queue.cond); + pthread_mutex_unlock(&request_queue.mutex); + request_queue2.head = NULL; + request_queue2.tailp = &request_queue2.head; + } + /* poll done queue */ + if (done_queue.head && pthread_mutex_trylock(&done_queue.mutex) == 0) { + struct aio_request_t *requests = done_queue.head; + done_queue.head = NULL; + done_queue.tailp = &done_queue.head; + pthread_mutex_unlock(&done_queue.mutex); + *done_requests.tailp = requests; + request_queue_len -= 1; + while (requests->next) { + requests = requests->next; + request_queue_len -= 1; + } + done_requests.tailp = &requests->next; + } + /* Give up the CPU to allow the threads to do their work */ + /* + * For Andres thoughts about yield(), see + * http://www.squid-cache.org/mail-archive/squid-dev/200012/0001.html + */ + if (done_queue.head || request_queue.head) +#ifndef _SQUID_SOLARIS_ + sched_yield(); +#else + yield(); +#endif +} + +aio_result_t * +aio_poll_done(void) +{ + aio_request_t *request; + aio_result_t *resultp; + int cancelled; + int polled = 0; + + AIO_REPOLL: + request = done_requests.head; + if (request == NULL && !polled) { + aio_poll_queues(); + polled = 1; + request = 
done_requests.head;
+    }
+    if (!request) {
+	return NULL;
+    }
+    debug(41, 9) ("aio_poll_done: %p type=%d result=%p\n",
+	request, request->request_type, request->resultp);
+    done_requests.head = request->next;
+    if (!done_requests.head)
+	done_requests.tailp = &done_requests.head;
+    resultp = request->resultp;
+    cancelled = request->cancelled;
+    aio_debug(request);
+    debug(43, 5) ("DONE: %d -> %d\n", request->ret, request->err);
+    aio_cleanup_request(request);
+    if (cancelled)
+	goto AIO_REPOLL;
+    return resultp;
+}				/* aio_poll_done */
+
+int
+aio_operations_pending(void)
+{
+    return request_queue_len + (done_requests.head ? 1 : 0);
+}
+
+int
+aio_sync(void)
+{
+    /* XXX This might take a while if the queue is large.. */
+    do {
+	aio_poll_queues();
+    } while (request_queue_len > 0);
+    return aio_operations_pending();
+}
+
+int
+aio_get_queue_len(void)
+{
+    return request_queue_len;
+}
+
+static void
+aio_debug(aio_request_t * request)
+{
+    switch (request->request_type) {
+    case _AIO_OP_OPEN:
+	debug(43, 5) ("OPEN of %s to FD %d\n", request->path, request->ret);
+	break;
+    case _AIO_OP_READ:
+	debug(43, 5) ("READ on fd: %d\n", request->fd);
+	break;
+    case _AIO_OP_WRITE:
+	debug(43, 5) ("WRITE on fd: %d\n", request->fd);
+	break;
+    case _AIO_OP_CLOSE:
+	debug(43, 5) ("CLOSE of fd: %d\n", request->fd);
+	break;
+    case _AIO_OP_UNLINK:
+	debug(43, 5) ("UNLINK of %s\n", request->path);
+	break;
+    case _AIO_OP_TRUNCATE:
+	debug(43, 5) ("TRUNCATE of %s\n", request->path);
+	break;
+    default:
+	break;
+    }
+}
Index: squid/src/fs/ufs/async_io.c
diff -u /dev/null squid/src/fs/ufs/async_io.c:1.1.2.1
--- /dev/null	Tue Sep 28 18:35:35 2004
+++ squid/src/fs/ufs/async_io.c	Sat Apr 14 02:46:36 2001
@@ -0,0 +1,365 @@
+
+/*
+ * $Id$
+ *
+ * DEBUG: section 32    Asynchronous Disk I/O
+ * AUTHOR: Pete Bentley
+ * AUTHOR: Stewart Forster
+ *
+ * SQUID Web Proxy Cache          http://www.squid-cache.org/
+ * ----------------------------------------------------------
+ *
+ * Squid is the result of
efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ * + */ + +#include "squid.h" +#include "store_asyncufs.h" + +#define _AIO_OPEN 0 +#define _AIO_READ 1 +#define _AIO_WRITE 2 +#define _AIO_CLOSE 3 +#define _AIO_UNLINK 4 +#define _AIO_TRUNCATE 7 /* was 4, which collided with _AIO_UNLINK */ +#define _AIO_OPENDIR 5 +#define _AIO_STAT 6 + +typedef struct aio_ctrl_t { + struct aio_ctrl_t *next; + int fd; + int operation; + AIOCB *done_handler; + void *done_handler_data; + aio_result_t result; + char *bufp; + FREE *free_func; + dlink_node node; +} aio_ctrl_t; + +struct { + int open; + int close; + int cancel; + int write; + int read; + int stat; + int unlink; + int check_callback; +} aio_counts; + +typedef struct aio_unlinkq_t { + char *path; + struct aio_unlinkq_t *next; +} aio_unlinkq_t; + +static dlink_list used_list; +static int initialised = 0; +static OBJH aioStats; +static MemPool *aio_ctrl_pool; +static void aioFDWasClosed(int fd); + +static void +aioFDWasClosed(int fd) +{ + if (fd_table[fd].flags.closing) + fd_close(fd); +} + +void +aioInit(void) +{ + if (initialised) + return; + aio_ctrl_pool = memPoolCreate("aio_ctrl", sizeof(aio_ctrl_t)); + cachemgrRegister("aio_counts", "Async IO Function Counters", + aioStats, 0, 1); + initialised = 1; + comm_quick_poll_required(); +} + +void +aioDone(void) +{ + memPoolDestroy(aio_ctrl_pool); + initialised = 0; +} + +void +aioOpen(const char *path, int oflag, mode_t mode, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + + assert(initialised); + aio_counts.open++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_OPEN; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_open(path, oflag, mode, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} + +void +aioClose(int fd) +{ + aio_ctrl_t *ctrlp; + + assert(initialised); + aio_counts.close++; + aioCancel(fd); + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = fd; + ctrlp->done_handler = NULL; + 
ctrlp->done_handler_data = NULL; + ctrlp->operation = _AIO_CLOSE; + ctrlp->result.data = ctrlp; + aio_close(fd, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} + +void +aioCancel(int fd) +{ + aio_ctrl_t *curr; + AIOCB *done_handler; + void *their_data; + dlink_node *m, *next; + + assert(initialised); + aio_counts.cancel++; + for (m = used_list.head; m; m = next) { + while (m) { + curr = m->data; + if (curr->fd == fd) + break; + m = m->next; + } + if (m == NULL) + break; + + aio_cancel(&curr->result); + + if ((done_handler = curr->done_handler)) { + their_data = curr->done_handler_data; + curr->done_handler = NULL; + curr->done_handler_data = NULL; + debug(32, 3) ("aioCancel: cancelling callback for FD %d\n", fd); + if (cbdataValid(their_data)) + done_handler(fd, their_data, -2, -2); + cbdataUnlock(their_data); + } + next = m->next; + dlinkDelete(m, &used_list); + memPoolFree(aio_ctrl_pool, curr); + } +} + + +void +aioWrite(int fd, int offset, char *bufp, int len, AIOCB * callback, void *callback_data, FREE * free_func) +{ + aio_ctrl_t *ctrlp; + int seekmode; + + assert(initialised); + aio_counts.write++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = fd; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_WRITE; + ctrlp->bufp = bufp; + ctrlp->free_func = free_func; + if (offset >= 0) + 
seekmode = SEEK_SET; + else { + seekmode = SEEK_CUR; + offset = 0; + } + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_read(fd, bufp, len, offset, seekmode, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} /* aioRead */ + +void +aioStat(char *path, struct stat *sb, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + + assert(initialised); + aio_counts.stat++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_STAT; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_stat(path, sb, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} /* aioStat */ + +void +aioUnlink(const char *path, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + assert(initialised); + aio_counts.unlink++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_UNLINK; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_unlink(path, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); +} /* aioUnlink */ + +void +aioTruncate(const char *path, off_t length, AIOCB * callback, void *callback_data) +{ + aio_ctrl_t *ctrlp; + assert(initialised); + aio_counts.unlink++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_TRUNCATE; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + aio_truncate(path, length, &ctrlp->result); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); +} /* aioTruncate */ + + +int +aioCheckCallbacks(SwapDir * SD) +{ + aio_result_t *resultp; + aio_ctrl_t *ctrlp; + AIOCB *done_handler; + void *their_data; + int retval = 0; + + assert(initialised); + aio_counts.check_callback++; + for (;;) { + if ((resultp = aio_poll_done()) == NULL) + 
break; + ctrlp = (aio_ctrl_t *) resultp->data; + if (ctrlp == NULL) + continue; /* XXX Should not happen */ + dlinkDelete(&ctrlp->node, &used_list); + if ((done_handler = ctrlp->done_handler)) { + their_data = ctrlp->done_handler_data; + ctrlp->done_handler = NULL; + ctrlp->done_handler_data = NULL; + if (cbdataValid(their_data)) { + retval = 1; /* Return that we've actually done some work */ + done_handler(ctrlp->fd, their_data, + ctrlp->result.aio_return, ctrlp->result.aio_errno); + } + cbdataUnlock(their_data); + } + /* free data if requested to aioWrite() */ + if (ctrlp->free_func) + ctrlp->free_func(ctrlp->bufp); + if (ctrlp->operation == _AIO_CLOSE) + aioFDWasClosed(ctrlp->fd); + memPoolFree(aio_ctrl_pool, ctrlp); + } + return retval; +} + +void +aioStats(StoreEntry * sentry) +{ + storeAppendPrintf(sentry, "ASYNC IO Counters:\n"); + storeAppendPrintf(sentry, "open\t%d\n", aio_counts.open); + storeAppendPrintf(sentry, "close\t%d\n", aio_counts.close); + storeAppendPrintf(sentry, "cancel\t%d\n", aio_counts.cancel); + storeAppendPrintf(sentry, "write\t%d\n", aio_counts.write); + storeAppendPrintf(sentry, "read\t%d\n", aio_counts.read); + storeAppendPrintf(sentry, "stat\t%d\n", aio_counts.stat); + storeAppendPrintf(sentry, "unlink\t%d\n", aio_counts.unlink); + storeAppendPrintf(sentry, "check_callback\t%d\n", aio_counts.check_callback); + storeAppendPrintf(sentry, "queue\t%d\n", aio_get_queue_len()); +} + +/* Flush all pending I/O */ +void +aioSync(SwapDir * SD) +{ + if (!initialised) + return; /* nothing to do then */ + /* Flush all pending operations */ + debug(32, 1) ("aioSync: flushing pending I/O operations\n"); + do { + aioCheckCallbacks(SD); + } while (aio_sync()); + debug(32, 1) ("aioSync: done\n"); +} + +int +aioQueueSize(void) +{ + return memPoolInUseCount(aio_ctrl_pool); +} Index: squid/src/fs/ufs/fs_aufs.c diff -u /dev/null squid/src/fs/ufs/fs_aufs.c:1.1.2.2 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/fs_aufs.c Sun Apr 29 03:50:11 2001 
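The completion path above (aioCheckCallbacks/aioSync) follows one pattern throughout: pop a finished request off the done queue, fire its callback at most once and only if the callback data is still valid, then free the control block. A minimal standalone sketch of that drain loop follows; the types, the `data_valid` flag (a stand-in for `cbdataValid()`), and the helper names are illustrative, not Squid's own:

```c
#include <stdlib.h>

typedef void DONECB(void *data, int ret, int err);

/* Illustrative stand-in for aio_ctrl_t: one block per in-flight request. */
typedef struct ctrl {
    struct ctrl *next;
    DONECB *done_handler;
    void *done_handler_data;
    int data_valid;             /* stand-in for cbdataValid() */
    int ret, err;
} ctrl_t;

/* Push a finished request onto the head of the done queue. */
static ctrl_t *
push_done(ctrl_t *head, DONECB *cb, void *data, int valid)
{
    ctrl_t *c = malloc(sizeof *c);
    c->next = head;
    c->done_handler = cb;
    c->done_handler_data = data;
    c->data_valid = valid;
    c->ret = c->err = 0;
    return c;
}

/* Demo callback: counts invocations in *data. */
static void
count_cb(void *data, int ret, int err)
{
    (void) ret;
    (void) err;
    ++*(int *) data;
}

/* Drain the done queue, as aioCheckCallbacks does: clear the handler
 * before calling it so it can never fire twice, skip callbacks whose
 * data has gone invalid, free every block either way.
 * Returns 1 if any callback actually ran. */
static int
check_callbacks(ctrl_t **done_queue)
{
    int retval = 0;
    ctrl_t *c;
    while ((c = *done_queue) != NULL) {
        *done_queue = c->next;
        if (c->done_handler != NULL) {
            DONECB *cb = c->done_handler;
            c->done_handler = NULL;     /* never fire twice */
            if (c->data_valid) {
                retval = 1;
                cb(c->done_handler_data, c->ret, c->err);
            }
        }
        free(c);
    }
    return retval;
}
```

The same shape explains why `aioSync()` loops on `aioCheckCallbacks()`: cancelled entries still occupy the queue and must be popped and freed even though their callbacks are suppressed.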
@@ -0,0 +1,84 @@ +/* async-io specific functions */ + +void * +aio_buildOpenRequest(char *path, int flags, int mode, + STIOCB * callback, void *callback_data) +{ + ufs_request_t *requestp; + ufs_ctrl_t *ctrlp; + + if (!aio_initialised) + aio_init(); + + /* XXX counts need to be per-SD, in fsdata */ + aio_counts.open++; + ctrlp = memPoolAlloc(aio_ctrl_pool); + ctrlp->fd = -2; + ctrlp->done_handler = callback; + ctrlp->done_handler_data = callback_data; + ctrlp->operation = _AIO_OPEN; + cbdataLock(callback_data); + ctrlp->result.data = ctrlp; + + requestp = (ufs_request_t *)io_buildOpenRequest(path, flags, mode, + callback, callback_data); + /* Link ctrlp and requestp together */ + requestp->resultp = &ctrlp->result; + requestp->resultp->_data = requestp; + + return (void *)requestp; +} + +void +aio_open(aio_request_t *requestp) +{ + aio_ctrl_t *ctrlp; + + ctrlp = requestp->resultp->data; + aio_queue_request(requestp); + dlinkAdd(ctrlp, &ctrlp->node, &used_list); + return; +} + +void +aio_cleanupRequest(aio_request_t *requestp) +{ + aio_result_t *resultp = requestp->resultp; + + if (resultp != NULL && !requestp->cancelled) { + resultp->aio_return = requestp->ret; + resultp->aio_errno = requestp->err; + } + ufs_cleanupRequest(requestp); +} + +ufs_request_t * +aio_buildReadRequest() +{ + return NULL; /* XXX stub, not written yet */ +} + +void +aio_read(ufs_request_t *requestp) +{ + /* This code stolen from the store_io_aufs.c code. Note, this changes + * the timing of this snippet slightly, but hopefully not enough to + * count. Still got to think through the whole "pending requests" + * thing *frown* + * Obviously, this is just a placeholder snippet, not functional code. 
*/ + if (fsstate->fd < 0) { + struct _queued_read *q; + debug(78, 3) ("storeUfsRead: queueing read because FD < 0\n"); + assert(fsstate->flags.opening); + assert(fsstate->pending_reads == NULL); + assert(fsstate->async); + q = memPoolAlloc(ufs_qread_pool); + q->buf = buf; + q->size = size; + q->offset = offset; + q->callback = callback; + q->callback_data = callback_data; + linklistPush(&(fsstate->pending_reads), q); + return; + } + +} Index: squid/src/fs/ufs/fs_structs.h diff -u /dev/null squid/src/fs/ufs/fs_structs.h:1.1.2.2 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/fs_structs.h Sun Apr 29 03:50:11 2001 @@ -0,0 +1,105 @@ +/* Request Types - fairly self explanatory, these are set in the request_t + * "objects". */ +enum _ufs_request_type { + _OP_NONE = 0, + _OP_OPEN, + _OP_CREATE, + _OP_READ, + _OP_WRITE, + _OP_CLOSE, + _OP_UNLINK, + _OP_TRUNCATE, + _OP_OPENDIR, + _OP_STAT +}; + +/* ufs_request_t is the structure that defines a request "object". One of + * these is created for every request squid makes of the fs. */ +struct _ufs_request_t { + struct _ufs_request_t *next; + enum _ufs_request_type request_type; + int cancelled; + char *path; + int oflag; + mode_t mode; + int fd; + char *bufferp; + char *tmpbufp; + int buflen; + off_t offset; + int whence; + int ret; + int err; + struct stat *tmpstatp; + struct stat *statp; + storeIOState *sio; + /* completion callback, as used by io_doCallback() */ + STIOCB *done_handler; + void *done_handler_data; + /* the aufs layer's link back to its result structure */ + aio_result_t *resultp; +}; + +/* _ufs_fsinfo_t defines the fs-specific information hanging off the SwapDir + * structure (called "fsdata"). */ +struct _ufs_fsinfo_t { + int swaplog_fd; + int l1; + int l2; + fileMap *map; + int suggest; + int async; +}; + +/* _ufs_state_t records the state of any particular request. It resides in + the StoreIOState structure, called fsstate. 
*/ +struct _ufs_state_t { + int fd; + struct { + unsigned int close_request:1; + unsigned int reading:1; + unsigned int writing:1; + unsigned int opening:1; + unsigned int write_kicking:1; + unsigned int read_kicking:1; + unsigned int inreaddone:1; + } flags; + const char *read_buf; + link_list *pending_writes; + link_list *pending_reads; +}; + +typedef enum _ufs_request_type ufs_request_type; +typedef struct _ufs_request_t ufs_request_t; +typedef struct _ufs_fsinfo_t ufs_fsinfo_t; +typedef struct _ufs_state_t ufs_state_t; + +/* The ufs_state memory pools. These are/were mostly used by aufs, but + * that may change slightly with the new layout, not sure yet. */ +extern MemPool *ufs_state_pool; +extern MemPool *ufs_qread_pool; +extern MemPool *ufs_qwrite_pool; + +/* + * Notes on the various data structures used within ufs: + * + * Incoming requests usually carry a SwapDir pointer, pointing at the swap + * directory we're using, and either a StoreEntry pointer (for open/close/ + * unlink), or a storeIOState pointer (for read and write). They will generate + * ufs_request_t structures, which are then passed into the "submitRequest" + * style functions. From there, things diverge slightly - at the moment, + * aufs also creates a "ctrlp" structure, which is used to hold the real + * callback, while the aufs-specific callback is stored in the request. This + * should change before the code is usable, I suspect. + * + * I'm trying to set a standard naming convention within ufs itself - in + * most cases, you'll see functions called "io_*", which are the vanilla + * non-async versions, and "aio_*", which are the async versions. I'm not + * sure that's appropriate atm, especially in light of the existence of + * the aio libraries/functions in glibc, but we'll see. + * + * Structures defined in here usually have an "_ufs_" prefix, to indicate + * they belong to the ufs file system. 
Note, however, elsewhere they're not + * referred to with the _ufs_* prefix, but as their base name - I need to + * make static definitions in each file for the data structures needed, just + * because it looks cleaner. C not being my prime language, I'm still a touch + * shaky on this. + * + * ufs_request_pool exists, and is a memory pool for ufs_request_t structures. + * This needs to be set up as a global thing, not a per-fs thing. + * */ Index: squid/src/fs/ufs/fs_ufs.c diff -u /dev/null squid/src/fs/ufs/fs_ufs.c:1.1.2.3 --- /dev/null Tue Sep 28 18:35:35 2004 +++ squid/src/fs/ufs/fs_ufs.c Sun Apr 29 03:50:11 2001 @@ -0,0 +1,166 @@ +/* sync-io specific functions - a lot of this actually gets shared by the + * aufs code, by virtue of aufs being a queueing layer over ufs. */ + +/* doCallback handles executing any given callback - short, simple, and used + * in lots of places */ +void +io_doCallback(ufs_request_t *requestp) +{ + STIOCB *done_handler; + void *their_data; + + if ((done_handler = requestp->done_handler)) { + their_data = requestp->done_handler_data; + requestp->done_handler = NULL; + requestp->done_handler_data = NULL; + if (cbdataValid(their_data)) { + done_handler(requestp->fd, their_data, requestp->ret, requestp->err); + } + cbdataUnlock(their_data); + } + if (requestp->fd > 0) { + io_close(io_buildCloseRequest(requestp->fd)); + /* + close(requestp->fd); + fd_close(requestp->fd); + requestp->fd = -1; + */ + } +} + +/* buildOpenRequest is called by ufs_storeCreate and ufs_storeOpen - it + * builds an "open" request */ +void * +io_buildOpenRequest(char *path, int flags, int mode, + STIOCB *callback, void *callback_data) +{ + ufs_request_t *requestp; + + requestp = memPoolAlloc(ufs_request_pool); + requestp->path = (char *) xstrdup(path); + requestp->oflag = flags; + requestp->mode = mode; + requestp->request_type = _OP_OPEN; + requestp->cancelled = 0; + requestp->done_handler = callback; + requestp->done_handler_data = callback_data; + return (void *)requestp; +} + +/* io_open actions an open request, from ufs_storeOpen and 
ufs_storeCreate */ +void +io_open(ufs_request_t *requestp) +{ + ufs_do_open(requestp); + io_openDone(requestp); +} + +void +io_openDone(ufs_request_t *requestp) +{ + storeIOState *sio = (storeIOState *)requestp->sio; + ufs_state_t *aiostate = (ufs_state_t *) sio->fsstate; + + aiostate->flags.opening = 0; + if (requestp->err || (requestp->ret < 0)) { + debug(50, 3) ("io_openDone: error opening file %s: %s\n", requestp->path, + xstrerror()); + io_doCallback(requestp); + } else { + debug(6, 5) ("io_openDone: FD %d\n", requestp->ret); + aiostate->fd = requestp->fd = requestp->ret; + fd_open(requestp->fd, FD_FILE, requestp->path); + if (_OP_CREATE == requestp->request_type) { + /* XXX I'm not happy about where SD comes from below */ + storeUfsDirReplAdd(&Config.cacheSwap.swapDirs[sio->swap_dirn], + sio->e); + } else { + /* Here, I believe we need to make sure the file size is filled + * in in the sio. Old ufs code did an fstat. */ + } + } + io_cleanupRequest(requestp); +} + +void * +io_buildCloseRequest(int fd) +{ + ufs_request_t *requestp; + + requestp = memPoolAlloc(ufs_request_pool); + requestp->fd = fd; + requestp->request_type = _OP_CLOSE; + requestp->cancelled = 0; + return (void *)requestp; +} + +/* io_close actions a close request */ +void +io_close(ufs_request_t *requestp) +{ + ufs_do_close(requestp); + io_closeDone(requestp); +} + +/* This is mainly here to maintain symmetry. */ +void +io_closeDone(ufs_request_t *requestp) +{ + fd_close(requestp->fd); + io_cleanupRequest(requestp); +} + +/* cleanupRequest de-allocates the various structures once we're completely + * done with the request. Also copies any data accumulated back into + * squid's purview. Note, there's some copying of data going on in here that + * should not be - we _should_ be reading data straight into squid's own + * buffers, not into our own temporary buffers. 
*/ +void +io_cleanupRequest(ufs_request_t *requestp) +{ + int cancelled = requestp->cancelled; + + /* Free allocated structures and copy data back to user space if the */ + /* request hasn't been cancelled */ + switch (requestp->request_type) { + case _OP_STAT: + if (!cancelled && requestp->ret == 0) + xmemcpy(requestp->statp, requestp->tmpstatp, sizeof(struct stat)); + xfree(requestp->tmpstatp); + xfree(requestp->path); + break; + case _OP_OPEN: + if (cancelled && requestp->ret >= 0) + /* The open() was cancelled but completed */ + close(requestp->ret); + xfree(requestp->path); + break; + case _OP_CLOSE: + if (cancelled && requestp->ret < 0) + /* The close() was cancelled and never got executed */ + close(requestp->fd); + break; + case _OP_UNLINK: + case _OP_TRUNCATE: + case _OP_OPENDIR: + xfree(requestp->path); + break; + case _OP_READ: + if (!cancelled && requestp->ret > 0) + xmemcpy(requestp->bufferp, requestp->tmpbufp, requestp->ret); + xfree(requestp->tmpbufp); + break; + case _OP_WRITE: + xfree(requestp->tmpbufp); + break; + default: + break; + } + memPoolFree(ufs_request_pool, requestp); +} + +static int +ufsSomethingPending(storeIOState * sio) +{ + ufs_state_t *aiostate = (ufs_state_t *) sio->fsstate; + return (aiostate->flags.reading || aiostate->flags.writing || + aiostate->flags.opening || aiostate->flags.inreaddone); +} Index: squid/src/fs/ufs/store_dir_ufs.c diff -u squid/src/fs/ufs/store_dir_ufs.c:1.10 squid/src/fs/ufs/store_dir_ufs.c:1.10.4.1 --- squid/src/fs/ufs/store_dir_ufs.c:1.10 Fri Jan 12 00:20:36 2001 +++ squid/src/fs/ufs/store_dir_ufs.c Sat Apr 14 02:46:36 2001 @@ -1348,20 +1348,19 @@ * This routine is called by storeDirSelectSwapDir to see if the given * object is able to be stored on this filesystem. UFS filesystems will * happily store anything as long as the LRU time isn't too small. 
+ * (Darius was here) */ int -storeUfsDirCheckObj(SwapDir * SD, const StoreEntry * e) +ufsCheckObj(SwapDir * SD, const StoreEntry * e) { -#if OLD_UNUSED_CODE - if (storeUfsDirExpiredReferenceAge(SD) < 300) { - debug(20, 3) ("storeUfsDirCheckObj: NO: LRU Age = %d\n", - storeUfsDirExpiredReferenceAge(SD)); - /* store_check_cachable_hist.no.lru_age_too_low++; */ - return -1; - } -#endif - /* Return 999 (99.9%) constant load */ - return 999; + int loadav, ql; + + ql = ufsQueueSize(); + if (ql == 0) + return 0; + loadav = ql * 1000 / MAXQUEUED; + debug(41, 9) ("ufsCheckObj: load=%d\n", loadav); + return loadav; } /* @@ -1369,28 +1368,30 @@ * * This routine is called whenever an object is referenced, so we can * maintain replacement information within the storage fs. + * (Darius was here) */ void -storeUfsDirRefObj(SwapDir * SD, StoreEntry * e) +ufsRefObj(SwapDir * SD, StoreEntry * e) { - debug(1, 3) ("storeUfsDirRefObj: referencing %p %d/%d\n", e, e->swap_dirn, - e->swap_filen); + debug(1, 3) ("fsRefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); if (SD->repl->Referenced) - SD->repl->Referenced(SD->repl, e, &e->repl); + SD->repl->Referenced(SD->repl, e, &e->repl); } /* * storeUfsDirUnrefObj * This routine is called whenever the last reference to an object is * removed, to maintain replacement information within the storage fs. 
+ * (Darius was here) */ void -storeUfsDirUnrefObj(SwapDir * SD, StoreEntry * e) +ufsUnrefObj(SwapDir * SD, StoreEntry * e) { - debug(1, 3) ("storeUfsDirUnrefObj: referencing %p %d/%d\n", e, e->swap_dirn, - e->swap_filen); + debug(1, 3) ("fsUnrefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); if (SD->repl->Dereferenced) - SD->repl->Dereferenced(SD->repl, e, &e->repl); + SD->repl->Dereferenced(SD->repl, e, &e->repl); } /* Index: squid/src/fs/ufs/store_io_ufs.c diff -u squid/src/fs/ufs/store_io_ufs.c:1.5 squid/src/fs/ufs/store_io_ufs.c:1.5.4.3 --- squid/src/fs/ufs/store_io_ufs.c:1.5 Fri Jan 12 00:20:36 2001 +++ squid/src/fs/ufs/store_io_ufs.c Sun Apr 29 03:50:11 2001 @@ -1,264 +1,173 @@ - -/* - * $Id$ - * - * DEBUG: section 79 Storage Manager UFS Interface - * AUTHOR: Duane Wessels - * - * SQUID Web Proxy Cache http://www.squid-cache.org/ - * ---------------------------------------------------------- - * - * Squid is the result of efforts by numerous individuals from - * the Internet community; see the CONTRIBUTORS file for full - * details. Many organizations have provided support for Squid's - * development; see the SPONSORS file for full details. Squid is - * Copyrighted (C) 2001 by the Regents of the University of - * California; see the COPYRIGHT file for full details. Squid - * incorporates software developed and/or copyrighted by other - * sources; see the CREDITS file for full details. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. - * - */ - -#include "squid.h" -#include "store_ufs.h" - - -static DRCB storeUfsReadDone; -static DWCB storeUfsWriteDone; -static void storeUfsIOCallback(storeIOState * sio, int errflag); -static CBDUNL storeUfsIOFreeEntry; - -/* === PUBLIC =========================================================== */ - +/* This is the guts of the open and create store* functions - the only + * significant differences between the two are mode and flags, and what + * happens at the other end of the operation. */ storeIOState * -storeUfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, - STIOCB * callback, void *callback_data) +_storeUfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data, int flags, int filen) { - sfileno f = e->swap_filen; - char *path = storeUfsDirFullPath(SD, f, NULL); storeIOState *sio; - struct stat sb; - int fd; - debug(79, 3) ("storeUfsOpen: fileno %08X\n", f); - fd = file_open(path, O_RDONLY | O_BINARY); - if (fd < 0) { - debug(79, 3) ("storeUfsOpen: got failure (%d)\n", errno); - return NULL; - } - debug(79, 3) ("storeUfsOpen: opened FD %d\n", fd); + int fd; + int mode = 0644; + void *requestp; + char *path; + fsinfo_t *fsinfo; + fsstate_t *fsstate; + + /* We need to check for "too many files open" here, on a per-fs basis */ + /* We should probably also handle failures to get a new filen better */ + + /* filen and flags arrive as arguments from the callers - path + * generation should maybe happen in the fs's open? 
*/ + path = fsDirFullPath(SD, filen, NULL); + fd = -1; + + /* Setup the state object to return */ sio = CBDATA_ALLOC(storeIOState, storeUfsIOFreeEntry); - sio->fsstate = memPoolAlloc(ufs_state_pool); + sio->fsstate = memPoolAlloc(fs_state_pool); + + /* Convenience pointers for the fs state and other data */ + fsstate = ((fsstate_t *)(sio->fsstate)); + fsinfo = ((fsinfo_t *)(SD->fsdata)); - sio->swap_filen = f; + /* fill in the sio for this request in particular */ + sio->swap_filen = filen; sio->swap_dirn = SD->index; - sio->mode = O_RDONLY; + /* mode and flags are confused here */ + sio->mode = flags; sio->callback = callback; sio->callback_data = callback_data; cbdataLock(callback_data); sio->e = e; - ((ufsstate_t *) (sio->fsstate))->fd = fd; - ((ufsstate_t *) (sio->fsstate))->flags.writing = 0; - ((ufsstate_t *) (sio->fsstate))->flags.reading = 0; - ((ufsstate_t *) (sio->fsstate))->flags.close_request = 0; - if (fstat(fd, &sb) == 0) - sio->st_size = sb.st_size; - store_open_disk_fd++; + /* This needs to be moved somewhere io-specific, maybe */ + fsstate->flags.writing = 0; + fsstate->flags.reading = 0; + fsstate->flags.close_request = 0; + fsstate->flags.opening = 1; + + debug(79, 3) ("_storeUfsOpen: file %08X\n", filen); + requestp = fsinfo->buildOpenRequest(path, flags, mode, + callback, callback_data); + fsinfo->submitOpen(requestp); - /* We should update the heap/dlink position here ! 
*/ return sio; } storeIOState * -storeUfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, STIOCB * callback, void *callback_data) +storeUfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data) { - storeIOState *sio; - int fd; - int mode = (O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); - char *path; - ufsinfo_t *ufsinfo = (ufsinfo_t *) SD->fsdata; - sfileno filn; - sdirno dirn; - - /* Allocate a number */ - dirn = SD->index; - filn = storeUfsDirMapBitAllocate(SD); - ufsinfo->suggest = filn + 1; - /* Shouldn't we handle a 'bitmap full' error here? */ - path = storeUfsDirFullPath(SD, filn, NULL); - - debug(79, 3) ("storeUfsCreate: fileno %08X\n", filn); - fd = file_open(path, mode); - if (fd < 0) { - debug(79, 3) ("storeUfsCreate: got failure (%d)\n", errno); - return NULL; - } - debug(79, 3) ("storeUfsCreate: opened FD %d\n", fd); - sio = CBDATA_ALLOC(storeIOState, storeUfsIOFreeEntry); - sio->fsstate = memPoolAlloc(ufs_state_pool); + int flags = (O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + int filen = fsNewFileNum(SD); - sio->swap_filen = filn; - sio->swap_dirn = dirn; - sio->mode = mode; - sio->callback = callback; - sio->callback_data = callback_data; - cbdataLock(callback_data); - sio->e = (StoreEntry *) e; - ((ufsstate_t *) (sio->fsstate))->fd = fd; - ((ufsstate_t *) (sio->fsstate))->flags.writing = 0; - ((ufsstate_t *) (sio->fsstate))->flags.reading = 0; - ((ufsstate_t *) (sio->fsstate))->flags.close_request = 0; - store_open_disk_fd++; + return _storeUfsOpen(SD,e,file_callback,callback,callback_data,flags,filen); +} - /* now insert into the replacement policy */ - storeUfsDirReplAdd(SD, e); - return sio; +storeIOState * +storeUfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data) +{ + int flags = (O_RDONLY | O_BINARY); + int filen = e->swap_filen; + + return _storeUfsOpen(SD,e,file_callback,callback,callback_data,flags,filen); } void storeUfsClose(SwapDir 
* SD, storeIOState * sio) { - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; + fsstate_t *fsstate = (fsstate_t *) sio->fsstate; + fsinfo_t *fsinfo = (fsinfo_t *) SD->fsdata; + void *requestp; debug(79, 3) ("storeUfsClose: dirno %d, fileno %08X, FD %d\n", - sio->swap_dirn, sio->swap_filen, ufsstate->fd); - if (ufsstate->flags.reading || ufsstate->flags.writing) { - ufsstate->flags.close_request = 1; - return; + sio->swap_dirn, sio->swap_filen, fsstate->fd); + if (ufsSomethingPending(sio)) { + /* The IO callback routines will close a file if close_request is set + * - this is kinda useful here. */ + fsstate->flags.close_request = 1; + return; } - storeUfsIOCallback(sio, 0); + requestp = fsinfo->buildCloseRequest(fsstate->fd); + fsinfo->submitClose(requestp); } +/* Not done past here - Darius */ + void -storeUfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, STRCB * callback, void *callback_data) +storeUfsUnlink(SwapDir * SD, StoreEntry * e) { - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; + debug(78, 3) ("storeUfsUnlink: dirno %d, fileno %08X\n", SD->index, + e->swap_filen); + storeUfsDirReplRemove(e); + storeUfsDirMapBitReset(SD, e->swap_filen); + storeUfsDirUnlinkFile(SD, e->swap_filen); +} +void +storeUfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, + off_t offset, STRCB * callback, void *callback_data) +{ + /* This stuff needs a lot of sorting out, it's here that the + * two-level callback thing starts to bite */ + fsstate_t *fsstate = (fsstate_t *) sio->fsstate; + fsinfo_t *fsinfo = (fsinfo_t *) SD->fsdata; + void *requestp; assert(sio->read.callback == NULL); assert(sio->read.callback_data == NULL); + assert(!fsstate->flags.reading); sio->read.callback = callback; sio->read.callback_data = callback_data; + fsstate->read_buf = buf; cbdataLock(callback_data); - debug(79, 3) ("storeUfsRead: dirno %d, fileno %08X, FD %d\n", - sio->swap_dirn, sio->swap_filen, ufsstate->fd); + debug(78, 3) ("storeUfsRead: dirno %d, fileno %08X, FD %d\n", + sio->swap_dirn, sio->swap_filen, fsstate->fd); sio->offset = offset; - 
ufsstate->flags.reading = 1; - file_read(ufsstate->fd, - buf, - size, - offset, - storeUfsReadDone, - sio); + fsstate->flags.reading = 1; + requestp = fsinfo->buildReadRequest(fsstate->fd,offset,buf,size); + fsinfo->submitRead(requestp); } void -storeUfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, FREE * free_func) -{ - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - debug(79, 3) ("storeUfsWrite: dirn %d, fileno %08X, FD %d\n", sio->swap_dirn, sio->swap_filen, ufsstate->fd); - ufsstate->flags.writing = 1; - file_write(ufsstate->fd, - offset, - buf, - size, - storeUfsWriteDone, - sio, - free_func); -} - -void -storeUfsUnlink(SwapDir * SD, StoreEntry * e) -{ - debug(79, 3) ("storeUfsUnlink: fileno %08X\n", e->swap_filen); - storeUfsDirReplRemove(e); - storeUfsDirMapBitReset(SD, e->swap_filen); - storeUfsDirUnlinkFile(SD, e->swap_filen); -} - -/* === STATIC =========================================================== */ - -static void -storeUfsReadDone(int fd, const char *buf, int len, int errflag, void *my_data) -{ - storeIOState *sio = my_data; - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - STRCB *callback = sio->read.callback; - void *their_data = sio->read.callback_data; - ssize_t rlen; - - debug(79, 3) ("storeUfsReadDone: dirno %d, fileno %08X, FD %d, len %d\n", - sio->swap_dirn, sio->swap_filen, fd, len); - ufsstate->flags.reading = 0; - if (errflag) { - debug(79, 3) ("storeUfsReadDone: got failure (%d)\n", errflag); - rlen = -1; - } else { - rlen = (ssize_t) len; - sio->offset += len; - } - assert(callback); - assert(their_data); - sio->read.callback = NULL; - sio->read.callback_data = NULL; - if (cbdataValid(their_data)) - callback(their_data, buf, (size_t) rlen); - cbdataUnlock(their_data); -} - -static void -storeUfsWriteDone(int fd, int errflag, size_t len, void *my_data) -{ - storeIOState *sio = my_data; - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - debug(79, 3) ("storeUfsWriteDone: dirno %d, fileno 
%08X, FD %d, len %d\n", - sio->swap_dirn, sio->swap_filen, fd, len); - ufsstate->flags.writing = 0; - if (errflag) { - debug(79, 0) ("storeUfsWriteDone: got failure (%d)\n", errflag); - storeUfsIOCallback(sio, errflag); - return; - } - sio->offset += len; - if (ufsstate->flags.close_request) - storeUfsIOCallback(sio, errflag); -} - -static void -storeUfsIOCallback(storeIOState * sio, int errflag) -{ - ufsstate_t *ufsstate = (ufsstate_t *) sio->fsstate; - debug(79, 3) ("storeUfsIOCallback: errflag=%d\n", errflag); - if (ufsstate->fd > -1) { - file_close(ufsstate->fd); - store_open_disk_fd--; - } - if (cbdataValid(sio->callback_data)) - sio->callback(sio->callback_data, errflag, sio); - cbdataUnlock(sio->callback_data); - sio->callback_data = NULL; - sio->callback = NULL; - cbdataFree(sio); -} - - -/* - * Clean up any references from the SIO before it get's released. - */ -static void -storeUfsIOFreeEntry(void *sio) +storeUfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t size, + off_t offset, FREE * free_func) { - memPoolFree(ufs_state_pool, ((storeIOState *) sio)->fsstate); + aiostate_t *aiostate = (aiostate_t *) sio->fsstate; + debug(78, 3) ("storeAufsWrite: dirno %d, fileno %08X, FD %d\n", + sio->swap_dirn, sio->swap_filen, aiostate->fd); + if (aiostate->fd < 0) { + /* disk file not opened yet */ + struct _queued_write *q; + assert(aiostate->flags.opening); + q = memPoolAlloc(aio_qwrite_pool); + q->buf = buf; + q->size = size; + q->offset = offset; + q->free_func = free_func; + linklistPush(&(aiostate->pending_writes), q); + return; + } +#if ASYNC_WRITE + if (aiostate->flags.writing) { + struct _queued_write *q; + debug(78, 3) ("storeAufsWrite: queuing write\n"); + q = memPoolAlloc(aio_qwrite_pool); + q->buf = buf; + q->size = size; + q->offset = offset; + q->free_func = free_func; + linklistPush(&(aiostate->pending_writes), q); + return; + } + aiostate->flags.writing = 1; + /* + * XXX it might be nice if aioWrite() gave is immediate + * feedback here 
about EWOULDBLOCK instead of in the + * callback function + * XXX Should never give EWOULDBLOCK under normal operations + * if it does then the MAGIC1/2 tuning is wrong. + */ + aioWrite(aiostate->fd, offset, buf, size, storeAufsWriteDone, sio, + free_func); +#else + file_write(aiostate->fd, offset, buf, size, storeAufsWriteDone, sio, + free_func); +#endif } Index: squid/src/fs/ufs/store_ufs.h diff -u squid/src/fs/ufs/store_ufs.h:1.2 squid/src/fs/ufs/store_ufs.h:removed --- squid/src/fs/ufs/store_ufs.h:1.2 Sat Oct 21 09:44:46 2000 +++ squid/src/fs/ufs/store_ufs.h Tue Sep 28 18:35:35 2004 @@ -1,50 +0,0 @@ -/* - * store_ufs.h - * - * Internal declarations for the ufs routines - */ - -#ifndef __STORE_UFS_H__ -#define __STORE_UFS_H__ - -struct _ufsinfo_t { - int swaplog_fd; - int l1; - int l2; - fileMap *map; - int suggest; -}; - -struct _ufsstate_t { - int fd; - struct { - unsigned int close_request:1; - unsigned int reading:1; - unsigned int writing:1; - } flags; -}; - -typedef struct _ufsinfo_t ufsinfo_t; -typedef struct _ufsstate_t ufsstate_t; - -/* The ufs_state memory pool */ -extern MemPool *ufs_state_pool; - -extern void storeUfsDirMapBitReset(SwapDir *, sfileno); -extern int storeUfsDirMapBitAllocate(SwapDir *); -extern char *storeUfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath); -extern void storeUfsDirUnlinkFile(SwapDir *, sfileno); -extern void storeUfsDirReplAdd(SwapDir * SD, StoreEntry *); -extern void storeUfsDirReplRemove(StoreEntry *); - -/* - * Store IO stuff - */ -extern STOBJCREATE storeUfsCreate; -extern STOBJOPEN storeUfsOpen; -extern STOBJCLOSE storeUfsClose; -extern STOBJREAD storeUfsRead; -extern STOBJWRITE storeUfsWrite; -extern STOBJUNLINK storeUfsUnlink; - -#endif
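storeUfsWrite above defers writes that arrive before the async open completes: while `fd < 0`, each (buf, size, offset) tuple is queued on `pending_writes`, and the open-completion path is expected to replay the queue in order. A standalone sketch of that defer-and-flush pattern follows; everything here is a simplified stand-in for Squid's `_queued_write`/`linklistPush` machinery, not its actual API:

```c
#include <stdlib.h>

/* Stand-in for Squid's _queued_write: one deferred write request. */
struct queued_write {
    struct queued_write *next;
    const char *buf;
    size_t size;
    long offset;
};

/* Stand-in for the sio/fsstate pair: fd < 0 means the open is in flight. */
struct wstate {
    int fd;
    struct queued_write *pending_writes;
    struct queued_write **tailp;    /* keeps the queue FIFO */
};

static void
wstate_init(struct wstate *s)
{
    s->fd = -1;                     /* open not yet complete */
    s->pending_writes = NULL;
    s->tailp = &s->pending_writes;
}

/* Queue the write while fd < 0, as storeUfsWrite does when the file
 * isn't open yet; return 1 if the write was deferred, 0 if it could
 * go straight to the file (a real version would issue it here). */
static int
submit_write(struct wstate *s, const char *buf, size_t size, long offset)
{
    struct queued_write *q;
    if (s->fd >= 0)
        return 0;                   /* would call file_write()/aioWrite() */
    q = malloc(sizeof *q);
    q->next = NULL;
    q->buf = buf;
    q->size = size;
    q->offset = offset;
    *s->tailp = q;                  /* append, preserving arrival order */
    s->tailp = &q->next;
    return 1;
}

/* Called from the open-completion path: record the new fd and replay
 * the queued writes in order.  Returns the total bytes replayed. */
static size_t
flush_pending(struct wstate *s, int fd)
{
    size_t flushed = 0;
    struct queued_write *q;
    s->fd = fd;
    while ((q = s->pending_writes) != NULL) {
        s->pending_writes = q->next;
        flushed += q->size;         /* a real flush would issue the write */
        free(q);
    }
    s->tailp = &s->pending_writes;
    return flushed;
}
```

The FIFO tail pointer matters: replaying deferred writes out of order would corrupt the object, since each queued entry carries the offset it was issued against.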