---------------------
PatchSet 1375
Date: 2001/01/24 14:11:54
Author: adri
Branch: sfs
Tag: (none)
Log:
Add the initial port of the sfs-0.3 code into the source tree.
This code is originally from Stewart Forster and has been modified
by Kevin Littlejohn.

This filesystem compiles but the dir/io routines simply implement UFS
right now. This will change, obviously. :)

Members:
	src/fs/sfs/CHANGELOG:1.1->1.1.2.1
	src/fs/sfs/DESIGN:1.1->1.1.2.1
	src/fs/sfs/Makefile.in:1.1->1.1.2.1
	src/fs/sfs/defines.h:1.1->1.1.2.1
	src/fs/sfs/sfs.h:1.1->1.1.2.1
	src/fs/sfs/sfs_defines.h:1.1->1.1.2.1
	src/fs/sfs/sfs_fslo.c:1.1->1.1.2.1
	src/fs/sfs/sfs_interface.c:1.1->1.1.2.1
	src/fs/sfs/sfs_lib.h:1.1->1.1.2.1
	src/fs/sfs/sfs_llo.c:1.1->1.1.2.1
	src/fs/sfs/sfs_splay.c:1.1->1.1.2.1
	src/fs/sfs/sfs_test.c:1.1->1.1.2.1
	src/fs/sfs/sfs_util.c:1.1->1.1.2.1
	src/fs/sfs/store_dir_sfs.c:1.1->1.1.2.1
	src/fs/sfs/store_io_sfs.c:1.1->1.1.2.1
	src/fs/sfs/store_sfs.h:1.1->1.1.2.1

--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/CHANGELOG Wed Feb 14 00:49:24 2007
@@ -0,0 +1,52 @@
+Changelog for sfs.
+
+---
+sfs-0.2 - 19990202
+---
+
+Altered types to use uintxx_t types - should be more portable, hopefully.
+I've a small concern about some of them still, especially sscanf places, but
+I'll sort that out eventually. Also cleared last of purify-related errors,
+and all bar one compiler warning - Solaris 'cc' warns about incorrect
+type when passing (void *)function into one of the pthread functions.
+
+Using uintx_t also means I _know_ how big each data structure/variable is.
+This is becoming increasingly important ;)
+
+---
+sfs-0.1 - 19990201
+---
+
+First 'versioned' file - this contains a cleaned up Makefile and some cleaned
+up dependencies (thanks to Oskar Pearson). List of changes from his patch:
+
+o A real makefile
+o Makefile includes linux and solaris sections. Defines _REENTRANT
+  for Linux. You have to manually specify this.
+  Currently linux just core-dumps with me immediately. I will try
+  and track this down.
+o Compiles almost without warnings with 'gcc -Wall'. There are a
+  couple of things that I was not up to fixing immediately: noted them
+  with 'XXX'. I removed some unused variables (and commented out
+  others): if they were 'for future use', sorry.
+o Now have $Id: CHANGELOG,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ entries in every file, so that changes can be
+  tracked.
+o Fixed recursive includes of header files
+o sfs_seek code doesn't match documentation. Kludged its variable
+  list in the meantime
+o replaced broken squid_curtime definition. was the same as saying
+  'exit + 1;'
+o Included headers to get rid of silly warnings
+
+In addition, I've cleaned up the above-mentioned core dump, and the -Wall
+warnings. Have also run through purify on Solaris - cleared most bugs. This
+release is mainly to get the cleaning stuff back out there, so I can start
+on making the interface match throughout.
+
+---
+First version:
+---
+
+This is a fairly scrappy version - I haven't even changed the attribution
+headers on the files ;) It was compiling and running when I put it on
+the web site, at least, but is very untested, and extremely alpha.
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/DESIGN Wed Feb 14 00:49:24 2007
@@ -0,0 +1,245 @@
+$Id: DESIGN,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $
+
+                           SQUIDFS README
+
+This file outlines the design of the SquidFS filesystem:
+
+Analysis of a running Squid internet object cache has found the following
+tidbits of information:
+
+Quantile    Object Size
+25%         < 1677 bytes
+50%         < 3710 bytes
+75%         < 9304 bytes
+90%         < 21360 bytes
+
+It is proposed that we use 4K fragments and 8K chunks.
+
+At the beginning of each file will be stored that file's inode. This inode
+consists of the following information:
+
+4 byte file length
+63 x 4 bytes block pointers - each pointer indexes an 8K chunk or 4K fragment
+64 x 4 byte indirect block pointers - each pointer indexes an 8K chunk that
+itself contains a list of 4 byte pointers to fragments/chunks. The remainder
+of the inode fragment/chunk will hold the first portion of data for that file.
+
+A file will consist of 0 or more 8K file chunks and 0 or 1 4K fragments.
+
+An inode number is actually the index of the 4K fragment that contains the
+inode on the disk. In this way indirect references to inodes are removed.
+
+*** Filesystem Bitmaps ***
+
+The filesystem will have 3 bitmaps that index all the fragments on a
+filesystem.
+
+BM1 - (on disk) Indexes fragments with valid data in them.
+BM2 - (on disk) Indexes inode fragments
+IMB - (memory) Indexes blocks that are allocated but not necessarily written
+to disk.
+
+States of the on-disk bitmaps are:
+
+BM1  BM2
+0    0    Fragment free
+1    0    Fragment contains valid file data
+0    1    Fragment contains complete inode data that references blocks
+          that may not be allocated.
+1    1    Fragment contains completed inode data
+
+*** file writes, closes and bitmap updates ***
+
+The first block (inode) is never written until a close is issued.
+
+Whenever a subsequent block is written IMB is set - block is flushed
+on demand.
+
+Whenever a file close occurs the current block and the completed inode block are
+flushed to disk immediately and BM2 and IMB are set for the inode block.
+When BM2 flush is complete BM1 is set for all non-inode blocks.
+When BM1 flush is complete BM1 is then set for inode block and flushed to
+disk. Once inode BM1 flush is complete the file is valid.
+
+Bitmap flushes should be scheduled events and should not occur on demand.
+A file that is waiting for a bitmap flush to occur should register itself
+to be called back when the flushes complete so that it may move onto
+the next stage of bitmap updates. It is suggested that dirty bitmap pages
+be flushed to disk every 10 seconds. In this instance it may take up to
+30 seconds before a file close results in a completely valid file on disk.
+In reality the file will be recoverable from on-disk data from 0-10 seconds
+after the file close was issued however.
+
+*** Filesystem rebuilds ***
+
+The filesystem may be left in an inconsistent state in the event of a
+power failure or system crash. In this event the following algorithms
+are used to return the filesystem to a consistent state:
+
+BM1  BM2
+0    0    Fragment is free - ignore
+1    0    Fragment contains valid data - ignore
+0    1    Fragment is an inode that references data blocks that have not
+          completed their bitmap. Scan through inode and set all blocks
+          referenced to be valid data blocks.
+1    1    Fragment is a valid inode block.
+
+*** Inode notes ***
+
+When a file is being written to the 8K inode block is held in memory,
+unallocated from disk until a file close is issued. If the inode + file
+data is < 4K, the 4K fragment is allocated from the free bitmap, preferably
+as the last 4K of an 8K chunk. If the inode + file data is >8K, the inode
+4K portion must be allocated on a free 8K chunk boundary and the data 4k
+portion will be the last 4K of the 8K chunk. This 4K chunk is implied and
+is not referenced by the direct block pointers in the inode.
+
+*** Performance Analysis ***
+
+By running a set of squid object sizes for cache misses and cache hits through
+the above filesystem in a simulated environment we find the following:
+
+Internal block fragmentation: 17%
+
+File writes:
+
+  Disk     Objects
+Accesses      %
+   1         70%
+   2         15%
+   3          6%
+   4          3%
+   5+         6%
+Average write accesses: 2.2
+
+Bitmap update accesses per avg. object - assuming 10 secs between bitmap flushes
+(worst case): 2 * inode updates + 1.2 * block updates = 3.2
+(average - 20% chance of disk bitmap locality @ 50 objects/sec): 0.05
+(best @ 50 objects/sec): 0.003
+
+File reads:
+
+  Disk     Objects
+Accesses      %
+   1         86%
+   2          8%
+   3          2%
+   4          1%
+   5+         2%
+Average read accesses: 1.5
+Bitmap update accesses: 0
+
+File unlinks:
+
+1.05 disk accesses on average to retrieve block pointer information + worst
+case 2 bitmap update accesses assuming no locality in a 10 second window
+
+*** Notes ***
+
+Average bitmap update accesses is hard to measure but assumes we are usually
+writing to blocks that are close together for many files and so bitmap updates
+will get clustered together a fair portion of the time.
+
+*** Filesystem IO ***
+
+There is one thread per mounted filesystem.
+Blocks that are queued to be written are placed onto a thread's service
+queue. Each thread inserts blocks into a doubly linked list ordered by
+the location of each block in the disk. The thread scans backwards and
+forwards along the list writing out blocks and removing them from the
+write queue. The blocks may still remain "owned" by an open file and
+the data within them may be modified at any time. The thread just writes
+out what it sees. This MAY cause inconsistencies but the theory is that
+when the last write of that block is queued, the data will be consistent
+then anyway. Since we're dealing with scenarios where this is acceptable
+then this is not an issue. The requesting write will return straight away
+and the write will continue in the background. If O_SYNC or O_WSYNC flag
+is set the requestor will wait until the write request is finished. If the
+O_NONBLOCK flag is also set along with O_SYNC or O_WSYNC the requestor keeps
+a record of blocks it had queued for writing and returns EWOULDBLOCK. On a
+subsequent write attempt by the parent process the requestor checks to see
+if the last issued write for that block has finished and simply returns the
+result of the write.
+
+Blocks that are to be read are placed onto the service queue for a filesystem's
+service thread. The requestor then has the option of either waiting for
+the IO to complete or coming back to it later. If the O_NONBLOCK flag
+is set
+
+Open file closes are also placed onto a separate queue for the thread.
+The filesystem thread is responsible for setting the bitmaps and flushing
+out dirty pages of the bitmap as required. Bitmap pages are flushed out
+only every N seconds, where N is a default of 15 but is user-modifiable
+to any value. Every N seconds any dirty bitmap blocks are placed onto
+the write queue and are flushed out like normal data.
+
+A filesystem IO thread's service queue looks like:
+
+lastbmflushtime = 0
+loop:
+    now = time()
+    if(service queue is not empty)
+        acquire mutex for service queue.
+        grab head of service queue and set service queue head pointer to NULL
+        release mutex
+        for each item on service queue
+            if item is marked as O_SYNC or O_WSYNC
+                flush out to disk immediately and report back
+            else
+                insert into disk queue ordered by disk location
+
+    if(file close queue is not empty)
+        acquire mutex for file close queue
+        grab head of file close queue and set fcq head pointer to NULL
+        release fcq mutex
+        for each item in file close queue
+            add item to pending file close queue
+
+    if(now >= lastbmflushtime + bmflushinterval)
+        lastbmflushtime = now;
+        for each dirty bitmap block
+            issue immediate write of each dirty bitmap block
+        for each item in pending file close queue
+            advance state and modify necessary bitmap blocks
+            if(state == done)
+                free pending file close
+
+    if(write queue is empty)
+        sleep(min((lastbmflushtime+bmflushinterval)-now, blockflushinterval))
+        goto loop:
+
+    write out block pointed to by file queue scan pointer (fqsp)
+    tmp = fqsp
+    /* Below is a simple SCAN algorithm however adding in VSCAN capability */
+    /* is easy to do and should be done on final implementation */
+    if(direction == forward)
+        if fqsp->next == NULL
+            direction = backward
+            fqsp = fqsp->prev
+        else
+            fqsp = fqsp->next
+    else
+        if fqsp->prev == NULL
+            direction = forward
+            fqsp = fqsp->next
+        else
+            fqsp = fqsp->prev
+    free tmp
+    goto loop:
+
+-----
+
+Grit:
+
+    This is a record of the typing decisions, as they're made.
+
+sfsfd - file descriptor. This will be a uint32_t, the top 8 bits will be
+    the sfsid, the bottom 24 bits will be the 'thread-specific identifier'.
+    Essentially, the first byte tells us which drive, the other three
+    which file descriptor on that drive.
+
+sfsid - drive id. This will be one byte. Note, that limits us to 255 drives
+    - this is not a hard limit to raise.
+
+sfsinode - inode. Position of first inode for this file on disk. These will
+    be uint32_t - that gives us a fair whack of space to play with.
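
Note on the sfsfd layout recorded under "Grit" in DESIGN: the split into a top byte (sfsid) and a bottom 24 bits (per-drive descriptor) can be sketched in C as below. This is an illustration only, not code from this patchset. The helper names sfsfd_pack, sfsfd_sfsid and sfsfd_local and the two macros are hypothetical; only the 8/24-bit split and the "sfsfd >> 24" extraction (as used in sfs_interface.c) come from the source. The sketch assumes an unsigned 32-bit type from <stdint.h>.

```c
#include <stdint.h>

/* Hypothetical names: only the 8/24-bit split comes from the DESIGN notes. */
#define SFSFD_ID_SHIFT   24
#define SFSFD_LOCAL_MASK 0x00FFFFFFu

/* Build an sfsfd from a drive id and a per-drive descriptor.
 * Bits of local_fd above the low 24 are discarded. */
static uint32_t
sfsfd_pack(uint8_t sfsid, uint32_t local_fd)
{
    return ((uint32_t)sfsid << SFSFD_ID_SHIFT) | (local_fd & SFSFD_LOCAL_MASK);
}

/* Top byte: which mounted filesystem (drive) the descriptor belongs to. */
static uint8_t
sfsfd_sfsid(uint32_t sfsfd)
{
    return (uint8_t)(sfsfd >> SFSFD_ID_SHIFT);
}

/* Bottom 24 bits: the descriptor number on that drive. */
static uint32_t
sfsfd_local(uint32_t sfsfd)
{
    return sfsfd & SFSFD_LOCAL_MASK;
}
```

With these helpers, sfsfd_sfsid(fd) gives the same value as the "sfsfd >> 24" expressions in sfs_interface.c, provided the descriptor type is unsigned; with a signed 32-bit type the result of the shift is implementation-defined for drive ids of 128 and above.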
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/Makefile.in Wed Feb 14 00:49:24 2007
@@ -0,0 +1,62 @@
+#
+#  Makefile for the sfs storage driver for the Squid Object Cache server
+#
+#  $Id: Makefile.in,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $
+#
+
+FS = sfs
+
+top_srcdir = @top_srcdir@
+VPATH = @srcdir@
+
+CC = @CC@
+MAKEDEPEND = @MAKEDEPEND@
+AR_R = @AR_R@
+RANLIB = @RANLIB@
+AC_CFLAGS = @CFLAGS@
+SHELL = /bin/sh
+
+INCLUDE = -I../../../include -I$(top_srcdir)/include -I$(top_srcdir)/src/
+CFLAGS = $(AC_CFLAGS) $(INCLUDE) $(DEFINES)
+
+OUT = ../$(FS).a
+
+OBJS = \
+	sfs_fslo.o \
+	sfs_interface.o \
+	sfs_llo.o \
+	sfs_splay.o \
+	sfs_util.o \
+	store_dir_sfs.o \
+	store_io_sfs.o
+
+
+all: $(OUT)
+
+$(OUT): $(OBJS)
+	@rm -f ../stamp
+	$(AR_R) $(OUT) $(OBJS)
+	$(RANLIB) $(OUT)
+
+$(OBJS): $(top_srcdir)/include/version.h ../../../include/autoconf.h
+$(OBJS): store_sfs.h
+
+.c.o:
+	@rm -f ../stamp
+	$(CC) $(CFLAGS) -c $<
+
+clean:
+	-rm -rf *.o *pure_* core ../$(FS).a
+
+distclean: clean
+	-rm -f Makefile
+	-rm -f Makefile.bak
+	-rm -f tags
+
+install:
+
+tags:
+	ctags *.[ch] $(top_srcdir)/src/*.[ch] $(top_srcdir)/include/*.h $(top_srcdir)/lib/*.[ch]
+
+depend:
+	$(MAKEDEPEND) $(INCLUDE) -fMakefile *.c
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/defines.h Wed Feb 14 00:49:24 2007
@@ -0,0 +1,260 @@
+
+/*
+ * $Id: defines.h,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $
+ *
+ *
+ * SQUID Internet Object Cache  http://squid.nlanr.net/Squid/
+ * ----------------------------------------------------------
+ *
+ *  Squid is the result of efforts by numerous individuals from the
+ *  Internet community.  Development is led by Duane Wessels of the
+ *  National Laboratory for Applied Network Research and funded by the
+ *  National Science Foundation.  Squid is Copyrighted (C) 1998 by
+ *  Duane Wessels and the University of California San Diego.  Please
+ *  see the COPYRIGHT file for full details.  Squid incorporates
+ *  software developed and/or copyrighted by other sources.  Please see
+ *  the CREDITS file for full details.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
+ *
+ */
+
+#ifndef TRUE
+#define TRUE 1
+#endif
+#ifndef FALSE
+#define FALSE 0
+#endif
+
+#define ACL_NAME_SZ 32
+#define BROWSERNAMELEN 128
+
+#define ACL_SUNDAY 0x01
+#define ACL_MONDAY 0x02
+#define ACL_TUESDAY 0x04
+#define ACL_WEDNESDAY 0x08
+#define ACL_THURSDAY 0x10
+#define ACL_FRIDAY 0x20
+#define ACL_SATURDAY 0x40
+#define ACL_ALLWEEK 0x7F
+#define ACL_WEEKDAYS 0x3E
+
+#define DefaultDnsChildrenMax 32 /* 32 processes */
+#define DefaultRedirectChildrenMax 32 /* 32 processes */
+#define DefaultAuthenticateChildrenMax 32 /* 32 processes */
+#define MAXHTTPPORTS 12
+
+#define COMM_OK (0)
+#define COMM_ERROR (-1)
+#define COMM_NOMESSAGE (-3)
+#define COMM_TIMEOUT (-4)
+#define COMM_SHUTDOWN (-5)
+#define COMM_INPROGRESS (-6)
+#define COMM_ERR_CONNECT (-7)
+#define COMM_ERR_DNS (-8)
+#define COMM_ERR_CLOSING (-9)
+
+/* Select types. */
+#define COMM_SELECT_READ (0x1)
+#define COMM_SELECT_WRITE (0x2)
+#define MAX_DEBUG_SECTIONS 100
+
+#define COMM_NONBLOCKING 0x01
+#define COMM_NOCLOEXEC 0x02
+#define COMM_REUSEADDR 0x04
+
+#define debug(SECTION, LEVEL) \
+        ((_db_level = (LEVEL)) > debugLevels[SECTION]) ? (void) 0 : _db_print
+
+#define safe_free(x) if (x) { xxfree(x); x = NULL; }
+
+#define DISK_OK (0)
+#define DISK_ERROR (-1)
+#define DISK_EOF (-2)
+#define DISK_NO_SPACE_LEFT (-6)
+
+#define DNS_INBUF_SZ 4096
+
+#define FD_DESC_SZ 64
+
+#define FQDN_LOOKUP_IF_MISS 0x01
+#define FQDN_MAX_NAMES 5
+
+#define HTTP_REPLY_FIELD_SZ 128
+
+#define BUF_TYPE_8K 1
+#define BUF_TYPE_MALLOC 2
+
+#define ANONYMIZER_NONE 0
+#define ANONYMIZER_STANDARD 1
+#define ANONYMIZER_PARANOID 2
+
+#define USER_IDENT_SZ 64
+#define IDENT_NONE 0
+#define IDENT_PENDING 1
+#define IDENT_DONE 2
+
+#define IP_LOOKUP_IF_MISS 0x01
+
+#define MAX_MIME 4096
+
+/* Mark a neighbor cache as dead if it doesn't answer this many pings */
+#define HIER_MAX_DEFICIT 20
+
+#define ICP_FLAG_HIT_OBJ 0x80000000ul
+#define ICP_FLAG_SRC_RTT 0x40000000ul
+
+/* Version */
+#define ICP_VERSION_2 2
+#define ICP_VERSION_3 3
+#define ICP_VERSION_CURRENT ICP_VERSION_2
+
+#define DIRECT_NO 0
+#define DIRECT_MAYBE 1
+#define DIRECT_YES 2
+
+#define REDIRECT_AV_FACTOR 1000
+
+#define REDIRECT_NONE 0
+#define REDIRECT_PENDING 1
+#define REDIRECT_DONE 2
+
+#define AUTHENTICATE_AV_FACTOR 1000
+
+#define AUTHENTICATE_NONE 0
+#define AUTHENTICATE_PENDING 1
+#define AUTHENTICATE_DONE 2
+
+#define CONNECT_PORT 443
+
+#define current_stacksize(stack) ((stack)->top - (stack)->base)
+
+/* logfile status */
+#define LOG_ENABLE 1
+#define LOG_DISABLE 0
+
+#define SM_PAGE_SIZE 4096
+#define DISK_PAGE_SIZE 8192
+
+#define EBIT_SET(flag, bit) ((void)((flag) |= ((1<<(bit)))))
+#define EBIT_CLR(flag, bit) ((void)((flag) &= ~((1<<(bit)))))
+#define EBIT_TEST(flag, bit) ((flag) & ((1<<(bit))))
+
+/* bit operations on a char[] mask of unlimited length */
+#define CBIT_BIT(bit) (1<<((bit)%8))
+#define CBIT_BIN(mask, bit) (mask)[(bit)>>3]
+#define CBIT_SET(mask, bit) ((void)(CBIT_BIN(mask, bit) |= CBIT_BIT(bit)))
+#define CBIT_CLR(mask, bit) ((void)(CBIT_BIN(mask, bit) &= ~CBIT_BIT(bit)))
+#define CBIT_TEST(mask, bit) (CBIT_BIN(mask, bit) & CBIT_BIT(bit))
+
+#define MAX_FILES_PER_DIR (1<<20)
+
+#define MAX_URL 4096
+#define MAX_LOGIN_SZ 128
+
+#define PEER_MAX_ADDRESSES 10
+#define RTT_AV_FACTOR 50
+
+#define PEER_DEAD 0
+#define PEER_ALIVE 1
+
+#define AUTH_MSG_SZ 4096
+#define HTTP_REPLY_BUF_SZ 4096
+
+#if !defined(ERROR_BUF_SZ) && defined(MAX_URL)
+#define ERROR_BUF_SZ (MAX_URL << 2)
+#endif
+
+#define READ_AHEAD_GAP (1<<14)
+
+#if SQUID_SNMP
+#define VIEWINCLUDED 1
+#define VIEWEXCLUDED 2
+#endif
+
+#define STORE_META_OK 0x03
+#define STORE_META_DIRTY 0x04
+#define STORE_META_BAD 0x05
+
+#define IPC_NONE 0
+#define IPC_TCP_SOCKET 1
+#define IPC_UDP_SOCKET 2
+#define IPC_FIFO 3
+
+#define STORE_META_KEY STORE_META_KEY_MD5
+
+#define STORE_META_TLD_START sizeof(int)+sizeof(char)
+#define STORE_META_TLD_SIZE STORE_META_TLD_START
+#define SwapMetaType(x) (char)x[0]
+#define SwapMetaSize(x) &x[sizeof(char)]
+#define SwapMetaData(x) &x[STORE_META_TLD_START]
+#define STORE_HDR_METASIZE (4*sizeof(time_t)+2*sizeof(u_short)+sizeof(int))
+
+#define STORE_ENTRY_WITH_MEMOBJ 1
+#define STORE_ENTRY_WITHOUT_MEMOBJ 0
+#define STORE_SWAP_BUF DISK_PAGE_SIZE
+#define VM_WINDOW_SZ DISK_PAGE_SIZE
+
+#define SKIP_BASIC_SZ ((size_t) 6)
+
+#define PINGER_PAYLOAD_SZ 8192
+
+#define COUNT_INTERVAL 60
+/*
+ * keep 60 minutes' worth of per-minute readings (+ current reading)
+ */
+#define N_COUNT_HIST (3600 / COUNT_INTERVAL) + 1
+/*
+ * keep 3 days' (72 hours) worth of hourly readings
+ */
+#define N_COUNT_HOUR_HIST (86400 * 3) / (60 * COUNT_INTERVAL)
+
+/* where to look for errors if config path fails */
+#define DEFAULT_SQUID_ERROR_DIR "/usr/local/squid/etc/errors"
+
+/* gb_type operations */
+#define gb_flush_limit (0x3FFFFFFF)
+#define gb_inc(gb, delta) { if ((gb)->bytes > gb_flush_limit || delta > gb_flush_limit) gb_flush(gb); (gb)->bytes += delta; (gb)->count++; }
+
+/* iteration for HttpHdrRange */
+#define HttpHdrRangeInitPos (-1)
+
+/* use this and only this to initialize HttpHeaderPos */
+#define HttpHeaderInitPos (-1)
+
+/* handy to determine the #elements in a static array */
+#define countof(arr) (sizeof(arr)/sizeof(*arr))
+
+/* to initialize static variables (see also MemBufNull) */
+#define MemBufNULL { NULL, 0, 0, 0, NULL }
+
+/*
+ * Max number of ICP messages to receive per call to icpHandleUdp
+ */
+#define INCOMING_ICP_MAX 15
+/*
+ * Max number of HTTP connections to accept per call to httpAccept
+ * and PER HTTP PORT
+ */
+#define INCOMING_HTTP_MAX 10
+#define INCOMING_TOTAL_MAX (INCOMING_ICP_MAX+INCOMING_HTTP_MAX)
+
+/*
+ * This many TCP connections must FAIL before we mark the
+ * peer as DEAD
+ */
+#define PEER_TCP_MAGIC_COUNT 10
+
+#define CLIENT_SOCK_SZ 4096
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/sfs.h Wed Feb 14 00:49:24 2007
@@ -0,0 +1,39 @@
+/* $Id: sfs.h,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ */
+
+#ifndef SFS_H
+#define SFS_H
+
+typedef struct sfs_statistic_t {
+    uint numreads;
+} sfs_statistic_t;
+
+typedef struct sfs_stat_t {
+    uint sfs_ino;
+    uint sfs_numblocks;
+    uint sfs_len;
+} sfs_stat_t;
+
+/* Library level user-callable functions */
+
+int sfs_setoptions(int op, int opdata);
+
+/* Filesystem level user-callable functions */
+
+int sfs_format(const char *path, ulong);
+int sfs_mount(const char *path);
+int sfs_fsck(int sfsid, int fscktype);
+int sfs_unmount(int sfsid);
+
+/* File level user-callable functions */
+
+int sfs_open(const char *path, int oflag, mode_t mode);
+int sfs_close(int fd);
+int sfs_sync(int fd);
+int sfs_read(int fd, void *buf, int buflen);
+int sfs_write(int fd, void *buf, int buflen);
+int sfs_seek(int fd, int offset, int whence);
+int sfs_unlink(int sfsid, uint sfsinode);
+int sfs_truncate(int sfsid, uint sfsinode, int newlen);
+int sfs_stat(int sfsid, uint sfsinode, sfs_stat_t *statbuf);
+
+#endif /* !SFS_H */
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/sfs_defines.h Wed Feb 14 00:49:24 2007
@@ -0,0 +1,186 @@
+/* $Id: sfs_defines.h,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ */
+
+#ifndef SFS_DEFINES_H
+#define SFS_DEFINES_H
+
+#include
+#include
+
+/* Possibly bogus defines? */
+
+#ifndef uint8_t
+#define uint8_t unsigned char
+#endif
+
+#ifndef uint32_t
+#define uint32_t int
+#endif
+
+#ifndef uint64_t
+#define uint64_t long long
+#endif
+
+#define sfsfd_t uint32_t
+#define sfsid_t uint8_t
+#define sfsblock_t uint32_t
+
+/* Code assumes CHUNKSIZE is twice FRAGSIZE. If it isn't things will break */
+/* very badly. */
+
+#define FRAGSIZE 4096
+#define CHUNKSIZE 8192
+#define POINTERSIZE 4 /* Number of bytes in a pointer */
+#define MINFSFRAGS 1024 /* Minimum acceptable number of FS frags */
+#define MAXFILESYS 512 /* Maximum number of mounted filesystems */
+
+#define NUMDIP 62
+#define NUMSIN 64
+
+#define BITINBYTE 8
+
+/* The below defines assume there are 8 bits in a byte */
+
+#define TSTBIT(a, b) (((a[b>>3]) << (b & 0x7)) & 0x80)
+#define SETBIT(a, b) ((a[b>>3]) |= (0x80 >> (b & 0x7)))
+#define CLRBIT(a, b) ((a[b>>3]) &= (~(0x80 >> (b & 0x7))))
+
+enum sfs_request_type {
+    _SFS_OP_NONE = 0,
+    _SFS_OP_READ,
+    _SFS_OP_WRITE,
+    _SFS_OP_OPEN_READ,
+    _SFS_OP_OPEN_WRITE,
+    _SFS_OP_CLOSE,
+    _SFS_OP_UNLINK,
+    _SFS_OP_SYNC,
+    _SFS_OP_UMOUNT
+};
+
+enum sfs_request_state {
+    _SFS_PENDING = 0,
+    _SFS_IN_PROGRESS,
+    _SFS_DONE
+};
+
+enum sfs_block_type {
+    _SFS_UNKNOWN = 0,
+    _SFS_DATA,
+    _SFS_INODE
+};
+
+typedef struct sfs_requestor {
+    enum sfs_request_type request_type;
+    enum sfs_request_state request_state;
+    pthread_cond_t done_signal;
+    pthread_mutex_t done_signal_lock;
+    int orphan; /* Set this, and the thread will clean this requestor up */
+    sfsid_t sfsid;
+    sfsfd_t sfsfd;
+    sfsblock_t sfsinode;
+    ssize_t offset; /* The block inside the file in question (0..x) */
+    ssize_t buflen; /* The length of the buffer, if pre-allocated, or the read */
+    void *buf;
+    int ret;
+} sfs_requestor;
+
+typedef struct sfs_requestor_list {
+    struct sfs_requestor_list *prev;
+    struct sfs_requestor_list *next;
+    struct sfs_requestor *req;
+} sfs_requestor_list;
+
+/* This corresponds to the structure as it's stored on disk. */
+typedef struct sfs_inode_t {
+    sfsblock_t len;
+    sfsblock_t dip[NUMDIP]; /* Direct block pointers */
+    sfsblock_t sin[NUMSIN]; /* Single Indirect Pointers */
+} sfs_inode_t;
+
+/* This is the structure stored mid-fs */
+/* I have some doubts as to the correctness of the dealing with this structure
+throughout the code - will check up on it. */
+typedef struct sfs_rootblock_t {
+    sfsblock_t numfrags;
+    sfsblock_t ibmpos;
+    sfsblock_t fbmpos;
+    uint32_t bmlen;
+} sfs_rootblock_t;
+
+/* These structures exist as members of linked lists hanging off a */
+/* sfs_openfile_t structure, except in the case of the inode block */
+/* and double indirect block of a file which are pointed to directly */
+/* since only one of these types of blocks can exist per file. */
+/* The buf points to a structure in either clean or dirty splay tree */
+typedef struct sfs_openblock_list {
+    struct sfs_blockbuf_t *buf;
+    struct sfs_openblock_list *next;
+    struct sfs_openblock_list *prev;
+} sfs_openblock_list;
+
+/* Will be a chained hash table of open file descriptions */
+/* These can also be referenced by the background file flush daemon */
+/* that is in control of correctly flushing out all pending data and */
+/* bitmap updates when the file is closed or synced */
+typedef struct sfs_openfile_t {
+    sfsid_t sfsid;
+    sfsblock_t sfsinode;
+    sfsfd_t sfsfd; /* Fake fd for reference - unique per open file*/
+    uint64_t pos; /* Position in the file, for partial reads */
+    int rd_refcount;
+    int wr_refcount;
+    int flushonclose;
+    struct sfs_inode_t *inode; /* The block pointed to by inodebuf_p->buf */
+    struct sfs_blockbuf_t *inodebuf_p; /* Pointer to blockbuf_t of inode */
+    struct sfs_openblock_list *rwbuf_list_p; /* List of RW open blocks */
+    struct sfs_openblock_list *sibuf_list_p; /* List of single indirect open blocks*/
+    struct sfs_blockbuf_t *dibuf_p; /* Double indirect open block */
+    struct sfs_openfile_t *prev;
+    struct sfs_openfile_t *next;
+} sfs_openfile_t;
+
+/* This structure references blocks held in the buffer cache */
+/* There will be an array indexed by sfsid that points to two ordered */
+/* splay trees of blocks for that sfsid, one of dirty pages, and one */
+/* of clean pages. Periodically the dirty pages are flushed to disk */
+/* I'm going to add prev and next, and keep these as a list in time */
+/* order, also. */
+typedef struct sfs_blockbuf_t {
+    struct sfs_blockbuf_t *left; /* Splay left & right pointers */
+    struct sfs_blockbuf_t *right;
+    struct sfs_blockbuf_t *prev;
+    struct sfs_blockbuf_t *next;
+    sfsid_t sfsid;
+    sfsblock_t sfsinode;
+    sfsblock_t diskpos; /* Position of this CHUNK on disk */
+    int refcount; /* How many people holding this page */
+    uint8_t dirty; /* Is page dirty */
+    uint8_t type; /* Inode or data - required for updating bitmaps */
+    int buflen; /* Length of buffer (*buf) (FRAGSIZE max) */
+    char *buf; /* Pointer to page data */
+} sfs_blockbuf_t;
+
+/* This structure is the mount point parent information for a mounted */
+/* filesystem */
+typedef struct sfs_mountfs_t {
+    sfs_rootblock_t *path;
+    sfs_rootblock_t *rootblock;
+    sfsid_t sfsid;
+    sfsfd_t fd; /* Filedescriptor used for writing to this filesystem */
+    char *fbm; /* Fragment allocation bitmap */
+    char *ibm; /* Inode allocation bitmap */
+    char *mhb; /* Memory holding fragment bitmap for disk bitmap updates */
+    sfs_blockbuf_t *dirty;
+    sfs_blockbuf_t *clean;
+    sfs_blockbuf_t *head[2]; /* These are used for keeping a time-ordered */
+    sfs_blockbuf_t *tail[2]; /* list of blocks */
+    int accepting_requests;
+    int pending_requests;
+    sfs_requestor_list *request_queue;
+    pthread_mutex_t req_lock;
+    pthread_cond_t req_signal;
+    pthread_mutex_t req_signal_lock;
+    pthread_mutex_t openfiles_lock; /* Lock for the 'other' array */
+    pthread_t thread_id;
+} sfs_mountfs_t;
+
+#endif /* !SFS_DEFINES_H */
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/sfs_fslo.c Wed Feb 14 00:49:24 2007
@@ -0,0 +1,116 @@
+/* sfs_fslo.c,v 1.17 2001/01/24 12:49:58 adrian Exp */
+
+/* Squid FS */
+/* */
+/* Squid FS - Filesystem Level Operations */
+/* */
+/* Authors: Stew Forster (slf) - Original version */
+/* Kevin Littlejohn (darius@bofh.net.au) */
+/* */
+
+/* A very simple stripped down UFS style filesystem that makes a lot */
+/* of assumptions based on the needs of the Squid web proxy caching */
+/* software. */
+
+/* Note, the types in here are possibly wrong - this needs to be gone over */
+
+#include "squid.h"
+
+#include "sfs_defines.h"
+
+int
+sfs_format(const char *rawdevpath, u_int32_t numfrags)
+{
+    char *fbm;
+    int bmlen;
+    /* XXX - not sure if this next variable is to be used, but it's not at
+     * the moment, remove? */
+    /* int bitsinfrag; */
+    int fbmpos;
+    int ibmpos;
+    int fd;
+    int i;
+    sfs_rootblock_t *rblock;
+    char *rbbuf;
+    uint64_t os;
+
+    if(numfrags < MINFSFRAGS) {
+        errno = ERANGE;
+        return -1;
+    }
+/* Work out how long the bitmaps should be (in bytes) */
+    bmlen = numfrags / BITINBYTE;
+    if(numfrags % BITINBYTE)
+        bmlen++;
+/* Position them half-way through the fs */
+    fbmpos = numfrags >> 1;
+    ibmpos = fbmpos - bmlen;
+
+    if((fd = open(rawdevpath, O_RDWR)) < 0)
+        return -1;
+
+    /* Write out the root block */
+
+    if((rbbuf = (char *)xcalloc(1, CHUNKSIZE)) == NULL) {
+        close(fd);
+        return -1;
+    }
+    rblock = (sfs_rootblock_t *)rbbuf;
+    rblock->numfrags = numfrags;
+    rblock->ibmpos = ibmpos;
+    rblock->fbmpos = fbmpos;
+    rblock->bmlen = bmlen;
+    os = 0;
+    if(lseek(fd, os, SEEK_SET) < 0) {
+        xfree(rbbuf);
+        close(fd);
+        return -1;
+    }
+    if(write(fd, rbbuf, CHUNKSIZE) < 0) {
+        xfree(rbbuf);
+        close(fd);
+        return -1;
+    }
+    xfree(rbbuf);
+
+    /* Write out the inode bitmap. This will be all zeros. Since we */
+    /* xcalloc()ed the file bitmap and it's the same size as the file */
+    /* bitmap, just write it out */
+
+    os = ibmpos;
+    os *= FRAGSIZE;
+    if(lseek(fd, os, SEEK_SET) < 0) {
+        return -1;
+    }
+    if((fbm = (char *)xcalloc(1, bmlen)) == NULL) {
+        close(fd);
+        return -1;
+    }
+    if(write(fd, fbm, bmlen) < 0) {
+        xfree(fbm);
+        close(fd);
+        return -1;
+    }
+
+    /* set all the blocks that contain */
+    /* the bitmaps as allocated (not free). Also set as used the first */
+    /* two fragments which will contain the filesystem root block */
+
+    for(i = ibmpos; i < (ibmpos + (2 * bmlen)); i++)
+        SETBIT(fbm, i);
+    SETBIT(fbm, 0);
+    SETBIT(fbm, 1);
+
+    /* Write out the frag bitmap. We will already be at the right */
+    /* location after writing out the inode bitmap */
+
+    if(write(fd, fbm, bmlen) < 0) {
+        xfree(fbm);
+        close(fd);
+        return -1;
+    }
+    xfree(fbm);
+
+    close(fd); /* Done! So simple */
+    return 0;
+} /* sfs_format */
--- /dev/null Wed Feb 14 00:48:56 2007
+++ squid/src/fs/sfs/sfs_interface.c Wed Feb 14 00:49:24 2007
@@ -0,0 +1,349 @@
+/* sfs_interface.c,v 1.58 2001/01/24 12:50:20 adrian Exp */
+
+/* These functions comprise the interface portion of squidFS - the bits that
+   outside functions can call.
+   I think I'll make the interfaces as identical to normal interfaces as
+   possible - not overly happy about that, as it means juggling things into
+   and out of strings, but until I have time to clean up squid's own fs
+   interfaces, that's the best that can be done.
+   The above changes in the light of the new store_* stuff in squid.
+*/
+
+/*
+ * DEBUG 78
+ */
+
+#include "squid.h"
+
+#include "store_sfs.h"
+
+/* Public interfaces - the ones squid requires us to provide */
+
+sfsfd_t
+sfs_open(const char *path, int oflag, mode_t mode)
+{
+    struct sfs_requestor *req;
+    enum sfs_request_type rt;
+    sfsfd_t ret;
+    sfsid_t sfsid;
+    sfsblock_t sfsinode;
+    uint temp_sfsid;
+
+/* Currently, you have to specify either an inode, or O_CREAT */
+/* We also make the rather brash assumption that if we're opening to write,
+   we're creating a new file - that assumption can change. */
+/* Could do with error checking on the sscanf... */
+    sfsid = 0;
+    sfsinode = 0;
+    if (path != (char *)NULL) {
+        sscanf(path,"%u/%u",&temp_sfsid,&sfsinode);
+        sfsid = (sfsid_t)temp_sfsid;
+    }
+    if (oflag & O_CREAT) {
+        rt = _SFS_OP_OPEN_WRITE;
+    } else {
+        rt = _SFS_OP_OPEN_READ;
+/* If we're trying to open something that's not an inode, return. */
+        if (!(CBIT_TEST(_sfs_mounted[sfsid].ibm, sfsinode))) {
+            printf("DEBUG: sfs_open opening non-inode\n");
+            return -1;
+        }
+    }
+    if (!(req = _sfs_create_requestor(sfsid,rt)))
+        return -1;
+    req->sfsinode = sfsinode;
+    _sfs_submit_request(req);
+    _sfs_waitfor_request(req);
+    ret = req->ret;
+    _sfs_remove_request(&_sfs_mounted[sfsid].request_queue,req);
+    return ret;
+}
+
+int
+sfs_close(sfsfd_t sfsfd)
+{
+/* Need to flush the file to disk and remove the structure. */
+    sfs_requestor *req;
+    int ret;
+
+    if(!(req = _sfs_create_requestor(sfsfd >> 24, _SFS_OP_CLOSE)))
+        return -1;
+    req->sfsfd = sfsfd;
+    _sfs_submit_request(req);
+    _sfs_waitfor_request(req);
+    ret = req->ret;
+    _sfs_remove_request(&_sfs_mounted[sfsfd >> 24].request_queue,req);
+    return ret;
+}
+
+ssize_t
+sfs_read(sfsfd_t sfsfd, void *buf, ssize_t buflen)
+/* Takes: sfsfd, a pointer to pre-allocated space, and length of said
+   space.
+   Returns: number of bytes read.
+   (Note, on Solaris 2.6, ssize_t is 4 bytes, and I believe signed)
+*/
+{
+    sfs_requestor *req;
+    ssize_t ret;
+    sfsid_t sfsid;
+
+    sfsid = sfsfd >> 24;
+    if(!(req = _sfs_create_requestor(sfsid, _SFS_OP_READ)))
+        return -1;
+    req->sfsfd = sfsfd;
+    req->offset = -1;
+    req->buflen = buflen;
+    _sfs_submit_request(req);
+    _sfs_waitfor_request(req);
+    if ((!buf) || (req->ret == 0))
+        return 0;
+    if (req->ret < 0)
+        return req->ret;
+    ret = req->buflen;
+    if (req->buf) {
+        memcpy(buf,req->buf,ret+1);
+        xfree(req->buf);
+    }
+/* I think I'm munging things by adding a '\0', but you get that. */
+    *((char *)(buf)+ret+1) = '\0';
+    _sfs_remove_request(&_sfs_mounted[sfsid].request_queue,req);
+    return ret;
+}
+
+ssize_t
+sfs_write(sfsfd_t sfsfd, const void *buf, size_t buflen)
+{
+    sfs_requestor *req;
+    ssize_t ret;
+    sfsid_t sfsid;
+
+    sfsid = sfsfd >> 24;
+    if (!(req = _sfs_create_requestor(sfsid,_SFS_OP_WRITE)))
+        return -1;
+    req->sfsfd = sfsfd;
+    if (!(req->buf = xstrdup(buf))) {
+        /* Panic! */
+        return -1;
+    }
+    req->buflen = buflen;
+    _sfs_submit_request(req);
+    _sfs_waitfor_request(req);
+    ret = req->ret;
+    if (req->buf)
+        xfree(req->buf);
+    _sfs_remove_request(&_sfs_mounted[sfsid].request_queue,req);
+    return ret;
+}
+
+int
+sfs_unlink(sfsid_t sfsid, sfsblock_t sfsinode)
+{
+/* Should really take a full filename, by rights */
+/* Here's the trick with this one: You don't unlink a file till _after_
+   you've closed it (normally). That means I can't take the normal sfsfd
+   and extract the relevant info :( */
+    sfs_requestor *req;
+    int ret;
+
+    if (!(req = _sfs_create_requestor(sfsid, _SFS_OP_UNLINK)))
+        return -1;
+    req->sfsinode = sfsinode;
+    _sfs_submit_request(req);
+    _sfs_waitfor_request(req);
+    ret = req->ret;
+    _sfs_remove_request(&_sfs_mounted[sfsid].request_queue,req);
+    return ret;
+}
+
+/* Private-ish interfaces - the ones people can call, but squid doesn't use */
+/* directly. */
+
+void sfs_thread_loop(sfs_mountfs_t *mount_point);
+
+sfsblock_t
+sfs_get_inode(sfsfd_t sfsfd)
+{
+/* This function returns the inode position for a given open fd. The theory
+   is when you open a new file, you don't know what it's 'called'. Given the
+   filedescriptor, this will return the canonical 'name' (number) of the file
+   - which is also the block location on disk. */
+/* Note also, this one employs thread-level locks without going into the thread
+   - I judge it faster this way. */
+    sfsid_t sfsid;
+    sfs_openfile_t *tmp;
+    sfsblock_t sfsinode;
+
+    sfsid = sfsfd >> 24;
+    pthread_mutex_lock(&(_sfs_mounted[sfsid].openfiles_lock));
+    tmp = _sfs_openfiles[sfsid];
+/* This while loop could be optimised */
+    while (tmp) {
+        if (tmp->sfsfd == sfsfd) {
+            sfsinode = tmp->sfsinode;
+            pthread_mutex_unlock(&(_sfs_mounted[sfsid].openfiles_lock));
+            return sfsinode;
+        }
+        tmp = tmp->next;
+    }
+    pthread_mutex_unlock(&(_sfs_mounted[sfsid].openfiles_lock));
+/* block 0 should never be an inode - therefore can be used as an error value,
+   in place of negative values (unsigned :(. */
+    return 0;
+}
+
+int
+sfs_umount(sfsid_t sfsid)
+/* As noted below, mount and umount need to be called only from a single
+thread - preferably the thread that calls init. I _can_ fix this, with
+YALock, but I've chosen not to at this time. */
+{
+    sfs_requestor *req;
+    int ret;
+
+    if (sfsid >= MAXFILESYS)
+        return -1;
+    if (_sfs_mounted[sfsid].rootblock == NULL)
+        return 0;
+/* Send a umount, and wait for the return. */
+/* The umount request simply tells the fs not to accept any more requests,
+and to sync all changes to disk, close the fd, and remove itself from the
+list of mounted fs'es. Basically, all the important stuff is done in the
+thread itself.
*/ + if (!(req = _sfs_create_requestor(sfsid,_SFS_OP_UMOUNT))) + return -1; + if (_sfs_submit_request(req) < 0) + return -1; + _sfs_waitfor_request(req); + ret = req->ret; + _sfs_remove_request(&_sfs_mounted[sfsid].request_queue,req); + if (ret == 0) { + if (_sfs_mounted[sfsid].rootblock) { + xfree(_sfs_mounted[sfsid].rootblock); + _sfs_mounted[sfsid].rootblock = NULL; + } + } + return ret; +} + +/* I believe the return value on this to be bogus - but I need an error value +to return. Oh for exceptions, hey? */ +/* Using 0 as the error value is also pretty crufty :( I'm not sure what + else to do here - I really want to use an unsigned int for this... */ +sfsid_t +sfs_mount(const char *rawdevpath) +{ + sfsid_t i; + sfsblock_t j, bmlen; + sfsblock_t ibmpos, fbmpos; + +/* This hunt is not thread-safe - assume only one thread doing these +things (initialising/mounting) - otherwise bad things happen(tm). +Fixing this assumption would mean adding a lock over the _sfs_mounted +array */ + for(i = 1; (i < MAXFILESYS) && (_sfs_mounted[i].rootblock != NULL); i++); + if (i == MAXFILESYS) + return 0; + if ((_sfs_mounted[i].fd = open(rawdevpath, O_RDWR)) < 0) + return 0; + if (lseek(_sfs_mounted[i].fd, (off_t)0, SEEK_SET) < 0) { + printf("Didn't manage to lseek in mount :(\n"); + close(_sfs_mounted[i].fd); + return 0; + } + if ((_sfs_mounted[i].rootblock = (sfs_rootblock_t *)xcalloc(1,CHUNKSIZE)) == NULL) { + close(_sfs_mounted[i].fd); + _sfs_mounted[i].rootblock = NULL; + return 0; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].rootblock, CHUNKSIZE) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return 0; + } + ibmpos = _sfs_mounted[i].rootblock->ibmpos; + fbmpos = _sfs_mounted[i].rootblock->fbmpos; + bmlen = _sfs_mounted[i].rootblock->bmlen; + _sfs_mounted[i].sfsid = i; + _sfs_mounted[i].fbm = (char *)xcalloc(1,bmlen); + _sfs_mounted[i].ibm = (char *)xcalloc(1,bmlen); +/* Seek to the bitmaps, and
read them in */ +/* I wonder whether or not it makes sense to have this stuff done in the +fs'es own thread. Maybe */ + if (lseek(_sfs_mounted[i].fd, ibmpos, SEEK_SET) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return 0; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].fbm, bmlen) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return 0; + } + if (read(_sfs_mounted[i].fd, _sfs_mounted[i].ibm, bmlen) < 0) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return 0; + } + if ((_sfs_mounted[i].mhb = (char *)xcalloc(1,bmlen)) == NULL) { + close(_sfs_mounted[i].fd); + xfree(_sfs_mounted[i].rootblock); + _sfs_mounted[i].rootblock = NULL; + return 0; + } + for (j = 0; j <= bmlen; j++) { + if (CBIT_TEST(_sfs_mounted[i].ibm, j) || CBIT_TEST(_sfs_mounted[i].fbm, j)) + CBIT_SET(_sfs_mounted[i].mhb, j); + } + _sfs_mounted[i].dirty = NULL; + _sfs_mounted[i].request_queue = NULL; + _sfs_mounted[i].pending_requests = 0; + pthread_mutex_init(&(_sfs_mounted[i].req_lock), NULL); + pthread_mutex_init(&(_sfs_mounted[i].req_signal_lock), NULL); + pthread_cond_init(&(_sfs_mounted[i].req_signal), NULL); + pthread_mutex_init(&(_sfs_mounted[i].openfiles_lock), NULL); + _sfs_mounted[i].accepting_requests = 1; + pthread_create(&(_sfs_mounted[i].thread_id), NULL, (void *)&sfs_thread_loop, &(_sfs_mounted[i])); +/* Return the sfsid */ + return i; +} + +off_t +sfs_seek(sfsfd_t sfsfd, off_t pos, int whence) +/* Takes: sfsid, fd, and position to seek to. + Returns: 0 for success, -1 for failure. + + This can all be done inside or outside the thread - I chose outside + atm, for no particularly good reason (less requests == good?) + + whence is probably broken atm - I don't like the idea of supporting + seeks from the end. squid doesn't need it either, AFAICS. 
+*/ +{ + sfs_openfile_t *tmp; + unsigned char sfsid; + + sfsid = sfsfd >> 24; + pthread_mutex_lock(&(_sfs_mounted[sfsid].openfiles_lock)); + tmp = _sfs_openfiles[sfsid]; + while (tmp) { + if (tmp->sfsfd == sfsfd) + break; + tmp = tmp->next; + } + if (!tmp) { + pthread_mutex_unlock(&(_sfs_mounted[sfsid].openfiles_lock)); + return -1; + } + if (pos > tmp->inode->len) { + pthread_mutex_unlock(&(_sfs_mounted[sfsid].openfiles_lock)); + return -1; + } + tmp->pos = pos; + pthread_mutex_unlock(&(_sfs_mounted[sfsid].openfiles_lock)); + return 0; +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/sfs_lib.h Wed Feb 14 00:49:24 2007 @@ -0,0 +1,66 @@ +/* sfs_lib.h,v 1.16 2001/01/24 12:47:58 adrian Exp */ + +/* Squid FS */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. 
*/ + +#ifndef SFS_LIB_H +#define SFS_LIB_H + +#include "squid.h" +#include "store_sfs.h" + + +/* The mount list */ +extern sfs_mountfs_t _sfs_mounted[MAXFILESYS]; +extern sfs_openfile_t * _sfs_openfiles[MAXFILESYS]; + +/* Internal functions */ +/* sfs_util.c */ +extern void _sfs_waitfor_request(sfs_requestor *req); +extern int _sfs_remove_request(sfs_requestor_list **lreq, sfs_requestor *req); +extern void _sfs_done_request(sfs_requestor *req, int retval); +extern sfs_requestor * _sfs_create_requestor(int sfsid, + enum sfs_request_type reqtype); +extern int _sfs_submit_request(sfs_requestor *req); +extern sfs_blockbuf_t * _sfs_read_block(int sfsid, uint diskpos); +extern uint _sfs_calculate_diskpos(int sfsid, sfs_openfile_t *openfd, + uint offset); +extern void _sfs_commit_block(int sfsid, sfs_blockbuf_t *block); +extern sfs_blockbuf_t * _sfs_write_block(int sfsid, uint diskpos, void *buf, + int buflen, enum sfs_block_type type); +extern uint _sfs_allocate_fd(sfs_openfile_t *new); +extern uint _sfs_allocate_block(int sfsid, int blocktype); +extern sfs_openfile_t * _sfs_find_fd(int sfsfd); +extern void _sfs_flush_bitmaps(int sfsid); +extern int _sfs_flush_file(int sfsid, sfs_openfile_t *fd); + +/* sfs_splay.c */ +extern sfs_blockbuf_t * _sfs_blockbuf_create(); +extern sfs_blockbuf_t *sfs_splay_find(uint diskpos, sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_insert(int sfsid, sfs_blockbuf_t *new, + sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_remove(int sfsid, sfs_blockbuf_t *tree); +extern sfs_blockbuf_t * sfs_splay_delete(int sfsid, sfs_blockbuf_t *tree); + + + +/* External stuff */ +extern int sfs_format(const char *, u_int32_t ); +extern sfsfd_t sfs_open(const char *, int , mode_t ); +extern int sfs_umount(sfsid_t ); +extern sfsid_t sfs_mount(const char * ); +extern int sfs_close(sfsfd_t ); +extern ssize_t sfs_read(sfsfd_t , void * , ssize_t ); +extern off_t sfs_seek(sfsfd_t , off_t , int ); +extern int sfs_unlink(sfsid_t , sfsblock_t ); 
+extern ssize_t sfs_write(sfsfd_t , const void * , size_t ); +extern sfsblock_t sfs_get_inode(sfsfd_t ); + +#endif /* !SFS_LIB_H */ --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/sfs_llo.c Wed Feb 14 00:49:24 2007 @@ -0,0 +1,435 @@ +/* sfs_llo.c,v 1.84 1999/02/03 04:04:06 darius Exp */ + +/* Squid FS */ +/* */ +/* Authors: Stew Forster (slf) - Original version */ +/* Kevin Littlejohn (darius@bofh.net.au) */ +/* */ + +/* A very simple stripped down UFS style filesystem that makes a lot */ +/* of assumptions based on the needs of the Squid web proxy caching */ +/* software. */ + +#include "squid.h" + +#include "store_sfs.h" + +sfs_mountfs_t _sfs_mounted[MAXFILESYS]; +sfs_openfile_t *_sfs_openfiles[MAXFILESYS]; +int _sfs_initialised = 0; +int inode_data_size = FRAGSIZE - sizeof(sfs_inode_t); +int direct_pointer_threshold = FRAGSIZE - sizeof(sfs_inode_t) + (NUMDIP * FRAGSIZE); + +void sfs_do_umount(sfs_requestor *req); +void sfs_do_open(sfs_requestor *req); +void sfs_do_read(sfs_requestor *req); +void sfs_do_write(sfs_requestor *req); +void sfs_do_unlink(sfs_requestor *req); +void sfs_do_close(sfs_requestor *req); + +void +sfs_initialise() +{ + int i; + + if (_sfs_initialised) + return; + _sfs_initialised = 1; + for(i = 0; i < MAXFILESYS; i++) { + _sfs_mounted[i].rootblock = NULL; + _sfs_mounted[i].accepting_requests = 0; + _sfs_openfiles[i] = NULL; + } +} /* sfs_initialise */ + +void +sfs_thread_loop(sfs_mountfs_t *mount_point) +{ + sigset_t new; + int i; + sfs_requestor_list *temp; + + /* Make sure to ignore signals which may possibly get sent to the parent */ + /* squid thread.
Causes havoc with mutex's and condition waits otherwise */ + /* (Stolen from aiops.c) - Darius */ + + sigemptyset(&new); + sigaddset(&new, SIGPIPE); + sigaddset(&new, SIGCHLD); +#if (defined(_SQUID_LINUX_) && USE_ASYNC_IO) + sigaddset(&new, SIGQUIT); + sigaddset(&new, SIGTRAP); +#else + sigaddset(&new, SIGUSR1); + sigaddset(&new, SIGUSR2); +#endif + sigaddset(&new, SIGHUP); + sigaddset(&new, SIGTERM); + sigaddset(&new, SIGINT); + sigaddset(&new, SIGALRM); + pthread_sigmask(SIG_BLOCK, &new, NULL); + +/* Set a conditional, when it's realised scan through the service list. */ + pthread_cond_init(&(mount_point->req_signal), NULL); + pthread_mutex_lock(&(mount_point->req_signal_lock)); + i = 0; + while (1) { + printf("DEBUG: Going into wait...\n"); + pthread_cond_wait(&(mount_point->req_signal), &(mount_point->req_signal_lock)); + printf("DEBUG: Coming out of wait...\n"); + temp = mount_point->request_queue; +/* Should I lock the request queue while I cycle through these? Probably + wise, but slow... */ + while ((mount_point->pending_requests > 0) && (temp)) { + printf("DEBUG: %d pending requests\n",mount_point->pending_requests); + if (temp->req->request_state == _SFS_PENDING) { +/* If we're not accepting requests, return fail for each request. */ +/* Note, we can't just lock the request queue, as things are still being +removed from it by other threads. */ + if (!(mount_point->accepting_requests)) { + _sfs_done_request(temp->req,-1); + } else { +/* This portion sets the state, and works out exactly what to do - open, + read, write, close, sync, unlink. 
*/ + temp->req->request_state = _SFS_IN_PROGRESS; + mount_point->pending_requests--; + switch (temp->req->request_type) { + case _SFS_OP_UNLINK: + sfs_do_unlink(temp->req); + break; + case _SFS_OP_UMOUNT: + sfs_do_umount(temp->req); + break; + case _SFS_OP_READ: + sfs_do_read(temp->req); + break; + case _SFS_OP_WRITE: + sfs_do_write(temp->req); + break; + case _SFS_OP_OPEN_READ: + case _SFS_OP_OPEN_WRITE: + sfs_do_open(temp->req); + break; + case _SFS_OP_CLOSE: + sfs_do_close(temp->req); + break; + default: + _sfs_done_request(temp->req,-1); + } + } + } + if (temp) + temp = temp->next; + } + i = (i + 1) % 10; + if (i == 0) { + _sfs_flush_bitmaps(mount_point->sfsid); + } + } +} + +void +sfs_do_read(sfs_requestor *req) +{ + sfs_blockbuf_t *new; + sfs_openfile_t *openfd; + sfsblock_t diskpos; + int offset, retlen, fragsize; + void *buf, *tmp; + + if (!(openfd = _sfs_find_fd(req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } + + if (req->offset == -1) + offset = openfd->pos; + else + offset = req->offset; + + retlen = 0; + buf = NULL; + while (retlen < req->buflen) { + if (!(diskpos = _sfs_calculate_diskpos(req->sfsid, openfd, offset+retlen))) { + _sfs_done_request(req,retlen); + return; + } + if (!(new = _sfs_read_block(req->sfsid, diskpos))) { + req->buf = buf; + _sfs_done_request(req,retlen); + return; + } + tmp = buf; + if ((retlen + new->buflen) > req->buflen) + fragsize = req->buflen - retlen; + else + fragsize = new->buflen; + buf = (char *)xcalloc(1, retlen + fragsize + 1); + memcpy(buf, tmp, retlen); + if (new->type == _SFS_INODE) { + if (fragsize > inode_data_size) + fragsize = inode_data_size; + memcpy(((char *)buf)+retlen, new->buf+sizeof(sfs_inode_t), fragsize); + } else { + memcpy(((char *)buf)+retlen, new->buf, fragsize); + } + if (tmp) + xfree(tmp); + retlen += fragsize; + openfd->pos += fragsize; + } + req->buf = buf; + _sfs_done_request(req,retlen); +} + +void +sfs_do_write(sfs_requestor *req) +{ + sfs_openfile_t *openfd; + sfs_blockbuf_t 
*new, *current = NULL; + int offset, retlen, fragsize, inblock, maxfragsize; + sfsblock_t diskpos; + void *buf, *tmp; + int type; + + if (!(openfd = _sfs_find_fd(req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } + + if (req->offset == -1) + offset = openfd->pos; + else + offset = req->offset; + + retlen = 0; + buf = tmp = NULL; +/* Work out type and where in the block this write should go */ + while (retlen < req->buflen) { + if (offset + retlen < inode_data_size) { + type = _SFS_INODE; + inblock = offset + retlen + sizeof(sfs_inode_t); + } else { + type = _SFS_DATA; + inblock = ((offset + retlen) - sizeof(sfs_inode_t)) - ((((offset + retlen) - sizeof(sfs_inode_t)) / FRAGSIZE) * FRAGSIZE); + } + maxfragsize = FRAGSIZE - inblock; + current = NULL; + +/* Figure out where on disk it should be... */ + if (!(diskpos = _sfs_calculate_diskpos(req->sfsid, openfd, offset+retlen))) { + if (!(diskpos = _sfs_allocate_block(req->sfsid, type))) { + _sfs_done_request(req,retlen); + return; + } + if ((offset + retlen) < 261632) { + openfd->inode->dip[(offset + retlen - inode_data_size) / FRAGSIZE] = diskpos; + } else { +/* XXX indirect pointer - youch */ + } + } else { + printf("DEBUG: sfs_do_write Reading in old block\n"); + current = _sfs_read_block(req->sfsid,diskpos); + } + +/* How much more to write? 
*/ + if ((req->buflen - retlen) >= maxfragsize) { + fragsize = maxfragsize; + } else { + fragsize = req->buflen - retlen; + } + if (current) { + printf("DEBUG: current block found\n"); + buf = current->buf; + } else { + printf("DEBUG: sfs_do_write new buf created\n"); + buf = (char *)xcalloc(1, FRAGSIZE); + } + printf("DEBUG: buf = %p, inblock = %d, fragsize = %d, retlen = %d, buflen = %d\n",buf,inblock,fragsize,retlen,req->buflen); + memcpy(((char *)buf)+inblock,((char *)req->buf)+retlen,fragsize); + if (!current) { + if (!(new = _sfs_write_block(req->sfsid, diskpos, buf, fragsize, type))) { + _sfs_done_request(req,retlen); + return; + } + new->sfsinode = openfd->sfsinode; + new->type = type; + current = new; + } + retlen += fragsize; + openfd->pos += fragsize; + } + printf("DEBUG: written %d bytes to fd %d (%s)\n",req->buflen,req->sfsfd,current->buf+sizeof(sfs_inode_t)); + _sfs_done_request(req,retlen); +} + +void +sfs_do_umount(sfs_requestor *req) +{ + sfs_mountfs_t *mnt; + sfs_blockbuf_t *block_ptr; + sfs_requestor_list *lreq; + sfs_openfile_t *openfd; + int i; + + _sfs_mounted[req->sfsid].accepting_requests = 0; + mnt = &(_sfs_mounted[req->sfsid]); +/* Flush all the dirty blocks out to HDD */ + openfd = _sfs_openfiles[req->sfsid]; + while(openfd) { +/* flush_file has to get rid of stuff then, which is bad :( */ +/* The structures get kinda confused at this point */ + printf("DEBUG: flushing file...\n"); + _sfs_flush_file(req->sfsid,openfd); + _sfs_openfiles[req->sfsid] = openfd->next; + xfree(openfd); + openfd = _sfs_openfiles[req->sfsid]; + } + printf ("DEBUG: umount flushing dirty blocks\n"); + while (mnt->dirty) { + _sfs_commit_block(req->sfsid, mnt->dirty); + mnt->dirty = sfs_splay_delete(req->sfsid, mnt->dirty); + } + _sfs_flush_bitmaps(req->sfsid); + if (mnt->fbm) + xfree(mnt->fbm); + if (mnt->ibm) + xfree(mnt->ibm); + if (mnt->mhb) + xfree(mnt->mhb); + block_ptr = mnt->clean; +/* Need to clean out the clean list */ + while (mnt->clean) { + mnt->clean = 
sfs_splay_delete(req->sfsid, mnt->clean); + } + if (mnt->request_queue) { +/* Make doubly sure we've cleared any pending requests - shouldn't need to, +but we _are_ umounting... This is actually bodgy code, if it ever does anything, +then something's gone wrong. Probably shouldn't do this, but it saves deadlock, +and we'd rather see corruption than hang indefinitely.*/ + lreq = mnt->request_queue; + while (lreq) { + if (lreq->req->request_state == _SFS_PENDING) { + _sfs_done_request(lreq->req,-1); + } + lreq = lreq->next; + } + i = 0; + while ((mnt->request_queue->next) && (i < 5)) { +/* Waiting for the request queue to be empty of all bar the umount request */ + i++; + sleep(1); + } + } +/* At this stage, all I need to do is kill the thread :) */ +/* I could shuffle these, do the done_request before actually completely +finishing - that would guarantee the requests are all collected properly */ + _sfs_done_request(req,0); + if (mnt->rootblock) { + xfree(mnt->rootblock); + mnt->rootblock = NULL; + } +/* Should also free all open fd's */ + pthread_exit(NULL); +} + +void +sfs_do_open(sfs_requestor *req) +{ + sfs_openfile_t *fd, *fdptr; + + if ((fd = (sfs_openfile_t *)xcalloc(1, sizeof(sfs_openfile_t))) == NULL) { + _sfs_done_request(req,-1); + return; + } + fd->sfsid = req->sfsid; + fd->sfsfd = _sfs_allocate_fd(fd); + + if (req->request_type == _SFS_OP_OPEN_READ) { + fd->sfsinode = req->sfsinode; + fd->inodebuf_p = _sfs_read_block(req->sfsid, req->sfsinode); + } else { +/* Doesn't need to lock, as all allocation is within thread.
*/ + if (!(fd->sfsinode = _sfs_allocate_block(req->sfsid, _SFS_INODE))) { + xfree(fd); + _sfs_done_request(req,-1); + return; + } +/* Fill the new inode */ + if (!(fd->inodebuf_p = _sfs_blockbuf_create())) { + printf("DEBUG: Couldn't create a blockbuf\n"); + xfree(fd); + _sfs_done_request(req,-1); + return; + } + fd->inodebuf_p->type = _SFS_INODE; + fd->inodebuf_p->buf = (char *)xcalloc(1, FRAGSIZE); + fd->inodebuf_p->diskpos = fd->inodebuf_p->sfsinode = fd->sfsinode; + _sfs_mounted[req->sfsid].clean = sfs_splay_insert(req->sfsid, fd->inodebuf_p, _sfs_mounted[req->sfsid].clean); + } + fd->inode = (sfs_inode_t *)fd->inodebuf_p->buf; + if (req->request_type != _SFS_OP_OPEN_READ) { + fd->inode->len = 1; + fd->rwbuf_list_p = NULL; + fd->sibuf_list_p = NULL; + fd->dibuf_p = NULL; + } + fd->pos = 0; + fd->next = fd->prev = NULL; +/* Add this one to the sfsid list of open fd's */ +/* Allocating an fd */ + fdptr = _sfs_openfiles[req->sfsid]; + if (fdptr) { + while(fdptr->next) { + fdptr = fdptr->next; + } + fdptr->next = fd; + fd->prev = fdptr; + } else { + _sfs_openfiles[req->sfsid] = fd; + } + req->buf = fd; + _sfs_done_request(req,fd->sfsfd); +} + +void +sfs_do_unlink(sfs_requestor *req) +{ + /* XXX unused at the moment + sfs_openfile_t *ptr; + sfs_blockbuf_t *block; */ + + if (!(CBIT_TEST(_sfs_mounted[req->sfsid].ibm, req->sfsfd))) { + _sfs_done_request(req,-1); + return; + } +/* Check to make sure there's not an open file here - if there is, close and +flush it. Is this correct behaviour? 
At least, we shouldn't flush to disk - +at most, we should do something about the threads trying to hold the file open + while (ptr = _sfs_find_fd(req->sfsid, req->sfsfd)) + _sfs_flush_file(req->sfsid, ptr); +*/ +/* Without opening a file ;) read in the inode, walk the list of blocks, + and CBIT_CLEAR each one from .fbm */ +} + +void +sfs_do_close(sfs_requestor *req) +{ + sfs_openfile_t *ptr; + + if (!(ptr = _sfs_find_fd(req->sfsfd))) { + printf("DEBUG: couldn't find fd %d\n",req->sfsfd); + _sfs_done_request(req,-1); + return; + } + printf("DEBUG: flushing file %d\n",req->sfsfd); + _sfs_flush_file(req->sfsid, ptr); + if (ptr) { +/* Assuming _sfs_flush_file clears the other stuff from the openfd - will + check that later... */ + xfree(ptr); + } + _sfs_done_request(req,0); + return; +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/sfs_splay.c Wed Feb 14 00:49:24 2007 @@ -0,0 +1,157 @@ +/* $Id: sfs_splay.c,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ */ + +#include "squid.h" + +#include "store_sfs.h" + +sfs_blockbuf_t * +_sfs_blockbuf_create() +{ + sfs_blockbuf_t *new; + if ((new = (sfs_blockbuf_t *)xcalloc(1, sizeof(sfs_blockbuf_t))) == NULL) + return NULL; + new->left = NULL; + new->right = NULL; + new->prev = NULL; + new->next = NULL; + new->sfsid = -1; + new->sfsinode = 0; + new->diskpos = -1; + new->dirty = 0; + new->type = _SFS_UNKNOWN; + new->buf = NULL; + return new; +} + +sfs_blockbuf_t *sfs_splay_find(uint diskpos, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *temp, *l, *r; + sfs_blockbuf_t new; + + if (tree == NULL) + return NULL; + + l = r = &new; + for (;;) { + if (diskpos < tree->diskpos) { + if (!(tree->left)) + break; + if (diskpos < tree->left->diskpos) { + temp = tree->left; + tree->left = temp->right; + temp->right = tree; + tree = temp; + if (tree->left == NULL) + break; + } + r->left = tree; + r = tree; + tree = tree->left; + } else if (diskpos > tree->diskpos) { + if (!(tree->right)) + break; + if (diskpos > tree->right->diskpos) { + temp 
= tree->right; + tree->right = temp->left; + temp->left = tree; + tree = temp; + if (tree->right == NULL) + break; + } + l->right = tree; + l = tree; + tree = tree->right; + } else { + break; + } + } + l->right = tree->left; + r->left = tree->right; + tree->left = new.right; + tree->right = new.left; + return tree; +} + +sfs_blockbuf_t * +sfs_splay_insert(int sfsid, sfs_blockbuf_t *new, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t **head, **tail; + + if (new == NULL) + return NULL; + head = &(_sfs_mounted[sfsid].head[new->dirty]); + tail = &(_sfs_mounted[sfsid].tail[new->dirty]); + if (tree == NULL) { + new->left = NULL; + new->right = NULL; + *head = *tail = new; + return new; + } + tree = sfs_splay_find(new->diskpos,tree); + if (new->diskpos == tree->diskpos) { + tree->refcount++; + return tree; + } + new->next = *head; + (*head)->prev = new; + if (new->diskpos < tree->diskpos) { + new->left = tree->left; + new->right = tree->right; + tree->left = NULL; + } else { + new->right = tree->right; + new->left = tree; + tree->right = NULL; + } + CBIT_SET(_sfs_mounted[sfsid].mhb, new->diskpos); + new->refcount = 1; + return new; +} + +sfs_blockbuf_t * +sfs_splay_remove(int sfsid, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *new; + sfs_blockbuf_t **head, **tail; + + tree->refcount--; + if (tree->refcount > 0) + return tree; + new = NULL; + head = &(_sfs_mounted[sfsid].head[tree->dirty]); + tail = &(_sfs_mounted[sfsid].tail[tree->dirty]); + if (tree->left == NULL) { + new = tree->right; + } else { + new = sfs_splay_find(tree->left->diskpos,tree->left); + new->right = tree->right; + } + if (*head == tree) + *head = new; + if (*tail == tree) + *tail = new; + if (tree->prev) + tree->prev->next = tree->next; + if (tree->next) + tree->next->prev = tree->prev; + return new; +} + +sfs_blockbuf_t * +sfs_splay_delete(int sfsid, sfs_blockbuf_t *tree) +{ + sfs_blockbuf_t *old; + + if (tree == NULL) + return NULL; + old = tree; +/* Set this so it _will_ be deleted */ + if (tree->refcount > 
1) + tree->refcount = 1; + tree = sfs_splay_remove(sfsid,tree); + if (tree != old) { + xfree(old->buf); + xfree(old); + } + return tree; +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/sfs_test.c Wed Feb 14 00:49:24 2007 @@ -0,0 +1,55 @@ +/* $Id: sfs_test.c,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ */ + +#include "sfs_lib.h" +#include +#include +#include +#include +#include +#include +#include + +int main() { + int sfsid; + uint sfsfd, sfsinode; + char filename[20]; + char buf[80]; + + if (creat("test.drv", 0644) < 0) { + printf("cannot open new file test.drv: %s", strerror(errno)); + exit(0); + } + + if (sfs_format("test.drv",4096) < 0) { + printf("unable to format test.drv! %s", strerror(errno)); + exit(0); + } + sfsid = sfs_mount("test.drv"); + printf("sfsid = %d\n",sfsid); + sleep(1); + sprintf(filename,"%d/0",sfsid); + sfsfd = sfs_open(filename, O_CREAT, 0); + sfsinode = sfs_get_inode(sfsfd); + printf("sfsfd = %d, sfsinode = %d\n",sfsfd,sfsinode); + sfs_write(sfsfd,"Hello...\n",strlen("Hello...\n")); + sfs_write(sfsfd,"Hello, again!\n",strlen("Hello, again!\n")); + printf("close result = %d\n",sfs_close(sfsfd)); + printf("umount result = %d\n",sfs_umount(sfsid)); + sleep(1); + printf("About to remount and read...\n"); + sfsid = sfs_mount("test.drv"); + printf("sfsid = %d\n",sfsid); + sleep(1); + printf("Opening %d/%d\n",sfsid,sfsinode); + sprintf(filename,"%d/%d",sfsid,sfsinode); + printf("DEBUG: %s\n",filename); + sfsfd = sfs_open(filename, O_RDONLY, 0); + printf("sfsfd = %d, sfsinode = %d\n",sfsfd,sfsinode); + if (sfsfd >= 0) { + printf("read result = %d\n",sfs_read(sfsfd,buf,80)); + printf("\n***** %s *****\n",buf); + printf("close result = %d\n",sfs_close(sfsfd)); + } + printf("umount result = %d\n",sfs_umount(sfsid)); + exit(0); +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/sfs_util.c Wed Feb 14 00:49:24 2007 @@ -0,0 +1,438 @@ +/* sfs_util.c,v 1.53 2001/01/24 12:49:34 adrian Exp */ + +#include "sfs_lib.h" +#include +#include 
+#include +#include +#ifdef LINUX +#include +#endif + +extern int inode_data_size; +extern int direct_pointer_threshold; + +/* Temporary, for testing/compiling without squid */ +#define squid_curtime time(NULL); + +void +_sfs_waitfor_request(sfs_requestor *req) +/* You know, we could count the number of seconds a request has had to wait to + be serviced here... */ +{ + struct timespec waittime; + + while ((req) && (req->request_state != _SFS_DONE)) { + waittime.tv_sec = squid_curtime; + waittime.tv_sec += 1; + waittime.tv_nsec = 0; + pthread_cond_timedwait(&(req->done_signal),&(req->done_signal_lock),&waittime); + } +} + +int +_sfs_remove_request(sfs_requestor_list **lreq,sfs_requestor *req) +/* This doesn't free the buffer - not sure whether we have any need to keep +the buffer anywhere or not, but the option is there... */ +{ + sfs_requestor_list *ptr, *x; + + printf("DEBUG: Removing %d request from queue\n",req->request_type); + ptr = *lreq; + while((ptr->next != NULL) && (ptr->req != req)) + ptr = ptr->next; + if (ptr->req == req) { + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_lock)); + if (ptr->prev) + ptr->prev->next = ptr->next; + if (ptr->next) + ptr->next->prev = ptr->prev; +/* Could probably do this with a while over *lreq, actually */ + if (*lreq == ptr) { + x = ptr; + while (x->prev) + x = x->prev; + if (x == ptr) + *lreq = x->next; + else + *lreq = x; + } + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_lock)); + + xfree(req); + xfree(ptr); + return (0); + } else { + return -1; + } +} + +void +_sfs_done_request(sfs_requestor *req, int retval) +{ + + printf ("DEBUG: _sfs_done_request %d\n",req->request_type); + if (req->orphan) { + _sfs_remove_request(&_sfs_mounted[req->sfsid].request_queue,req); + } else { + req->ret = retval; + req->request_state = _SFS_DONE; + pthread_cond_signal(&(req->done_signal)); + } +} + +sfs_requestor * +_sfs_create_requestor(int sfsid, enum sfs_request_type reqtype) +{ + struct sfs_requestor *req; + + if (!(req = 
(sfs_requestor *)xcalloc(1, sizeof(sfs_requestor)))) + return NULL; + + req->request_type = reqtype; + req->request_state = _SFS_PENDING; + req->sfsid = sfsid; + req->sfsfd = 0; + req->offset = -1; + req->ret = 0; + req->orphan = 0; + req->buf = NULL; + pthread_cond_init(&(req->done_signal), NULL); + pthread_mutex_init(&(req->done_signal_lock), NULL); + pthread_mutex_lock(&(req->done_signal_lock)); + return req; +} + +int +_sfs_submit_request(sfs_requestor *req) +{ + sfs_requestor_list *lreq; + sfs_requestor_list *ptr; + + if ((lreq =(sfs_requestor_list *) xcalloc(1, sizeof(sfs_requestor_list))) == NULL) + return -1; + + lreq->req = req; + lreq->next = NULL; + printf("DEBUG: Locking req_lock\n"); + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_lock)); + printf("DEBUG: Locked req_lock\n"); + ptr = _sfs_mounted[req->sfsid].request_queue; + if (ptr) { + while (ptr->next) + ptr = ptr->next; + ptr->next = lreq; + lreq->prev = ptr; + } else { + _sfs_mounted[req->sfsid].request_queue = lreq; + lreq->prev = NULL; + } + _sfs_mounted[req->sfsid].pending_requests++; + printf("DEBUG: Added %d request\n",req->request_type); + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_lock)); +/* And signal that the request has been made */ + pthread_mutex_lock(&(_sfs_mounted[req->sfsid].req_signal_lock)); + pthread_cond_signal(&(_sfs_mounted[req->sfsid].req_signal)); + pthread_mutex_unlock(&(_sfs_mounted[req->sfsid].req_signal_lock)); + + return(0); +} + +sfs_blockbuf_t * +_sfs_read_block(int sfsid, uint diskpos) +{ +/* This takes an sfsid, and a diskpos, and returns a blockbuf filled in +with the correct data. 
*/ + sfs_blockbuf_t *new; + uint64_t dpos; + int readlen; + + printf("DEBUG: _sfs_read_block\n"); +/* Searching for the appropriate block in the clean list */ + if (_sfs_mounted[sfsid].clean) { + _sfs_mounted[sfsid].clean = sfs_splay_find(diskpos,_sfs_mounted[sfsid].clean); + if (_sfs_mounted[sfsid].clean->diskpos == diskpos) { + return _sfs_mounted[sfsid].clean; + } + } +/* And in the dirty list */ +/* We probably shouldn't find things in the dirty list - they should probably +be served by squid's own cache first. Might be worth measuring the frequency +of this one... */ + if (_sfs_mounted[sfsid].dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(diskpos,_sfs_mounted[sfsid].dirty); + if (_sfs_mounted[sfsid].dirty->diskpos == diskpos) { + return _sfs_mounted[sfsid].dirty; + } + } +/* Otherwise we're reading a new one in off the disk. */ + dpos = diskpos * FRAGSIZE; + + if (!(new = _sfs_blockbuf_create())) + return NULL; + if (!(new->buf = (char *)xcalloc(1, FRAGSIZE))) + return NULL; + if (lseek(_sfs_mounted[sfsid].fd, dpos, SEEK_SET) < 0) { + xfree(new); + return NULL; + } + if ((readlen = read(_sfs_mounted[sfsid].fd, new->buf, FRAGSIZE)) < FRAGSIZE) { + xfree(new); + return NULL; + } + printf("DEBUG: read %d bytes (fragsize = %d)\n",readlen,FRAGSIZE); + new->sfsid = sfsid; + new->diskpos = diskpos; + new->buflen = FRAGSIZE; + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, diskpos)) { + new->type = _SFS_INODE; + printf("DEBUG: just read inode %s\n",new->buf+sizeof(sfs_inode_t)); + } else { + new->type = _SFS_DATA; + printf("DEBUG: just read data %s\n",new->buf); + } +/* Add it to the clean list on the spot. */ + _sfs_mounted[sfsid].clean = sfs_splay_insert(sfsid, new, _sfs_mounted[sfsid].clean); + return new; +} + +uint +_sfs_calculate_diskpos(int sfsid, sfs_openfile_t *openfd, uint offset) +{ +/* This function returns the disk position of the block into which bytes +should be written, or from which bytes should be read. It is granular +to a block level only. 
*/ + sfs_blockbuf_t *din; + uint *dinptr; + uint ret; + + if (offset < inode_data_size) + return openfd->inodebuf_p->diskpos; +/* Otherwise subtract inode_data_size, then div by FRAGSIZE to get entry in + direct block pointers */ + else if (offset < direct_pointer_threshold) { + return openfd->inode->dip[((offset - inode_data_size) / FRAGSIZE)]; + } else { +/* This insinuates that we're storing indirect pointers in chunks rather than + frags - I think that's fair, it gives us bigger files ;) Incidentally, + max filesize under this system is: + ((FRAGSIZE*(CHUNKSIZE / sizeof(uint)))*64)+(FRAGSIZE * 63)+inode_data_size + Under current settings, that's 512Mb + little bits. Extending this should + be done, but without creating frag problems, and without increasing the + number of indirect pointers required by too much. Having said that, + increasing number of indirect pointers at that stage is probably worth- + while - there's very few files that large legitimately. My preference is + to store another pointer in the first chunk of indirect pointers - drops + that down to 2047, and adds another chunk (giving another 512Mb easily, + and the option to keep chaining if required). At that stage, we'll want to + store state information so we don't read three times for each distinct + read call. 
+*/ + din = _sfs_read_block(sfsid, openfd->inode->sin[(offset - direct_pointer_threshold) / (FRAGSIZE * (CHUNKSIZE / sizeof(uint)))]); + dinptr = (uint *)din->buf; + ret = dinptr[(offset - direct_pointer_threshold) / FRAGSIZE]; + return ret; + } +} + +void +_sfs_commit_block(int sfsid, sfs_blockbuf_t *block) +{ + char *i; + uint64_t dpos; + + i = (char *)block->buf; + dpos = (uint64_t)block->diskpos * FRAGSIZE; + lseek(_sfs_mounted[sfsid].fd, dpos, SEEK_SET); + write(_sfs_mounted[sfsid].fd, block->buf, FRAGSIZE); + if (block->type == _SFS_INODE) { + CBIT_SET(_sfs_mounted[sfsid].ibm, block->diskpos); + } +} + +sfs_blockbuf_t * +_sfs_write_block(int sfsid, uint diskpos, void *buf, int buflen, enum sfs_block_type type) +{ + sfs_blockbuf_t *new; + sfs_blockbuf_t *old; + sfs_inode_t temp_inode; + + printf("DEBUG: sfsid = %d, diskpos = %d, buflen = %d\n",sfsid,diskpos,buflen); + new = old = NULL; +/* If it's an inode, make sure we have it in clean or dirty - gotta preserve + the inode data */ + if (CBIT_TEST(_sfs_mounted[sfsid].ibm, diskpos)) + _sfs_read_block(sfsid, diskpos); +/* If it's in the clean list, remove it - it's now incorrect */ + if (_sfs_mounted[sfsid].clean) { + _sfs_mounted[sfsid].clean = sfs_splay_find(diskpos,_sfs_mounted[sfsid].clean); + if (_sfs_mounted[sfsid].clean->diskpos == diskpos) { + old = _sfs_mounted[sfsid].clean; + _sfs_mounted[sfsid].clean->refcount = 1; + _sfs_mounted[sfsid].clean = sfs_splay_remove(sfsid, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].dirty = sfs_splay_insert(sfsid, old, _sfs_mounted[sfsid].dirty); + } + } +/* Likewise the dirty list - we'll be simply replacing the contents if it's +there */ + if (_sfs_mounted[sfsid].dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(diskpos,_sfs_mounted[sfsid].dirty); + if (_sfs_mounted[sfsid].dirty->diskpos == diskpos) { + if (type == _SFS_INODE) { + memcpy(&temp_inode,_sfs_mounted[sfsid].dirty->buf,sizeof(sfs_inode_t)); + if (temp_inode.len == 0) + temp_inode.len = 1; + } + 
xfree(_sfs_mounted[sfsid].dirty->buf); + _sfs_mounted[sfsid].dirty->buf = (char *)xcalloc(1, FRAGSIZE); + new = _sfs_mounted[sfsid].dirty; + } + } + if (!new) { + if (!(new = _sfs_blockbuf_create())) + return NULL; + new->buf = (char *)xcalloc(1, FRAGSIZE); + new->sfsid = sfsid; + new->diskpos = diskpos; + new->dirty = 1; + CBIT_SET(_sfs_mounted[sfsid].mhb, diskpos); + printf("DEBUG: inserting block into dirty list\n"); + _sfs_mounted[sfsid].dirty = sfs_splay_insert(sfsid, new, _sfs_mounted[sfsid].dirty); + } + if (type == _SFS_INODE) { + new->type = _SFS_INODE; + memcpy(new->buf,&temp_inode,sizeof(sfs_inode_t)); + memcpy(new->buf+sizeof(sfs_inode_t),buf,buflen); + } else { + memcpy(new->buf,buf,buflen); + } + return new; +} + +uint +_sfs_allocate_fd(sfs_openfile_t *new) +/* This is to be called only from within an fs thread - no locking ;) */ +{ + sfs_openfile_t *tmp; + uint maxfd; + + maxfd = new->sfsid << 24; + tmp = _sfs_openfiles[new->sfsid]; + while(tmp) { + if (tmp->sfsfd > maxfd) + maxfd = tmp->sfsfd; + tmp = tmp->next; + } + return maxfd+1; +} + +uint +_sfs_allocate_block(int sfsid, int blocktype) +{ + uint i; + int found; + int blocks; + + if (blocktype == _SFS_INODE) + blocks = 2; + else + blocks = 1; +/* First two blocks are always already used? 
*/ + for(i=1, found=0; i<_sfs_mounted[sfsid].rootblock->numfrags; i += blocks) { + if (!(CBIT_TEST(_sfs_mounted[sfsid].mhb, i))) { + found = 1; + break; + } + } + if (found) { + CBIT_SET(_sfs_mounted[sfsid].mhb, i); + return i; + } else + return 0; +} + +sfs_openfile_t * +_sfs_find_fd(int sfsfd) +{ + sfs_openfile_t *ptr; + + ptr = _sfs_openfiles[sfsfd >> 24]; + while (ptr) { + if (ptr->sfsfd == sfsfd) + break; + ptr = ptr->next; + } + return ptr; +} + +void +_sfs_flush_bitmaps(int sfsid) +{ + printf("DEBUG: Flushing bitmaps\n"); + lseek(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].rootblock->ibmpos, SEEK_SET); + write(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].ibm, _sfs_mounted[sfsid].rootblock->bmlen); + write(_sfs_mounted[sfsid].fd, _sfs_mounted[sfsid].fbm, _sfs_mounted[sfsid].rootblock->bmlen); +} + +int +_sfs_flush_file(int sfsid, sfs_openfile_t *fd) +{ + sfs_openblock_list *tmp, *nxt; + sfs_openfile_t *tmpfile; + uint diskpos; + + pthread_mutex_lock(&(_sfs_mounted[sfsid].openfiles_lock)); + tmpfile = _sfs_openfiles[sfsid]; + while (tmpfile) { + if (tmpfile->sfsfd == fd->sfsfd) { + break; + } + tmpfile = tmpfile->next; + } + if (tmpfile) { + if (tmpfile->next) + tmpfile->next->prev = tmpfile->prev; + if (tmpfile->prev) + tmpfile->prev->next = tmpfile->next; + if ((tmpfile->next == tmpfile->prev) && (tmpfile->prev == NULL)) + _sfs_openfiles[sfsid] = NULL; + } +/* Flush the inode block */ + _sfs_commit_block(sfsid, fd->inodebuf_p); +/* Flush all the single indirect blocks */ + tmp = fd->sibuf_list_p; + while (tmp) { +/* Two variables for after the splay_delete, in case the structure goes away */ + nxt = tmp->next; + diskpos = tmp->buf->diskpos; + _sfs_commit_block(sfsid, tmp->buf); + if (tmp->buf->dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(tmp->buf->diskpos, _sfs_mounted[sfsid].dirty); + _sfs_mounted[sfsid].dirty = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].dirty); + } else { + _sfs_mounted[sfsid].clean = sfs_splay_find(tmp->buf->diskpos, 
_sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].clean = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].clean); + } +/* This isn't strictly how we designed it, but there you go */ + CBIT_SET(_sfs_mounted[sfsid].fbm, diskpos); + xfree(tmp); + tmp = nxt; + } + fd->sibuf_list_p = NULL; +/* Flush all the stuff off the double indirect block */ + if (fd->dibuf_p) { +/* indirect pointer - Panic ;) */ + } +/* Go back and flush the inode properly */ + CBIT_SET(_sfs_mounted[sfsid].fbm, fd->inodebuf_p->diskpos); + if (fd->inodebuf_p->dirty) { + _sfs_mounted[sfsid].dirty = sfs_splay_find(fd->inodebuf_p->diskpos, _sfs_mounted[sfsid].dirty); + _sfs_mounted[sfsid].dirty = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].dirty); + } else { + _sfs_mounted[sfsid].clean = sfs_splay_find(fd->inodebuf_p->diskpos, _sfs_mounted[sfsid].clean); + _sfs_mounted[sfsid].clean = sfs_splay_delete(sfsid, _sfs_mounted[sfsid].clean); + } +/* XXX - nowhere else do we actually check the return value of this code. */ +/* Change function to void? */ + return (0); +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/store_dir_sfs.c Wed Feb 14 00:49:24 2007 @@ -0,0 +1,1702 @@ + +/* + * $Id: store_dir_sfs.c,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ + * + * DEBUG: section 47 Store Directory Routines + * AUTHOR: Duane Wessels + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. + * + */ + +#include "squid.h" + +#include "store_sfs.h" + +#define DefaultLevelOneDirs 16 +#define DefaultLevelTwoDirs 256 +#define STORE_META_BSFSZ 4096 + +typedef struct _RebuildState RebuildState; +struct _RebuildState { + SwapDir *sd; + int n_read; + FILE *log; + int speed; + int curlvl1; + int curlvl2; + struct { + unsigned int need_to_validate:1; + unsigned int clean:1; + unsigned int init:1; + } flags; + int done; + int in_dir; + int fn; + struct dirent *entry; + DIR *td; + char fullpath[SQUID_MAXPATHLEN]; + char fullfilename[SQUID_MAXPATHLEN]; + struct _store_rebuild_data counts; +}; + +static int n_sfs_dirs = 0; +static int *sfs_dir_index = NULL; +MemPool *sfs_state_pool = NULL; +static int sfs_initialised = 0; + +static char *storeSfsDirSwapSubDir(SwapDir *, int subdirn); +static int storeSfsDirCreateDirectory(const char *path, int); +static int storeSfsDirVerifyCacheDirs(SwapDir *); +static int storeSfsDirVerifyDirectory(const char *path); +static void storeSfsDirCreateSwapSubDirs(SwapDir *); +static char *storeSfsDirSwapLogFile(SwapDir *, const char *); +static EVH storeSfsDirRebuildFromDirectory; +static EVH storeSfsDirRebuildFromSwapLog; +static int storeSfsDirGetNextFile(RebuildState *, int *sfileno, int *size); +static StoreEntry 
*storeSfsDirAddDiskRestore(SwapDir * SD, const cache_key * key, + int file_number, + size_t swap_file_sz, + time_t expires, + time_t timestamp, + time_t lastref, + time_t lastmod, + u_num32 refcount, + u_short flags, + int clean); +static void storeSfsDirRebuild(SwapDir * sd); +static void storeSfsDirCloseTmpSwapLog(SwapDir * sd); +static FILE *storeSfsDirOpenTmpSwapLog(SwapDir *, int *, int *); +static STLOGOPEN storeSfsDirOpenSwapLog; +static STINIT storeSfsDirInit; +static STFREE storeSfsDirFree; +static STLOGCLEANSTART storeSfsDirWriteCleanStart; +static STLOGCLEANNEXTENTRY storeSfsDirCleanLogNextEntry; +static STLOGCLEANWRITE storeSfsDirWriteCleanEntry; +static STLOGCLEANDONE storeSfsDirWriteCleanDone; +static STLOGCLOSE storeSfsDirCloseSwapLog; +static STLOGWRITE storeSfsDirSwapLog; +static STNEWFS storeSfsDirNewfs; +static STDUMP storeSfsDirDump; +static STMAINTAINFS storeSfsDirMaintain; +static STCHECKOBJ storeSfsDirCheckObj; +static STREFOBJ storeSfsDirRefObj; +static STUNREFOBJ storeSfsDirUnrefObj; +static QS rev_int_sort; +static int storeSfsDirClean(int swap_index); +static EVH storeSfsDirCleanEvent; +static int storeSfsDirIs(SwapDir * sd); +static int storeSfsFilenoBelongsHere(int fn, int F0, int F1, int F2); +static int storeSfsCleanupDoubleCheck(SwapDir *, StoreEntry *); +static void storeSfsDirStats(SwapDir *, StoreEntry *); +static void storeSfsDirInitBitmap(SwapDir *); +static int storeSfsDirValidFileno(SwapDir *, sfileno, int); + +/* + * These functions were ripped straight out of the heart of store_dir.c. + * They assume that the given filenum is on an sfs partition, which may or + * may not be true. + * XXX this evilness should be tidied up at a later date! 
+ */ + +int +storeSfsDirMapBitTest(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + return file_map_bit_test(sfsinfo->map, filn); +} + +void +storeSfsDirMapBitSet(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + file_map_bit_set(sfsinfo->map, filn); +} + +void +storeSfsDirMapBitReset(SwapDir * SD, int fn) +{ + sfileno filn = fn; + sfsinfo_t *sfsinfo; + sfsinfo = (sfsinfo_t *) SD->fsdata; + /* + * We have to test the bit before calling file_map_bit_reset. + * file_map_bit_reset doesn't do bounds checking. It assumes + * filn is a valid file number, but it might not be because + * the map is dynamic in size. Also, clearing an already clear + * bit puts the map counter out-of-whack. + */ + if (file_map_bit_test(sfsinfo->map, filn)) + file_map_bit_reset(sfsinfo->map, filn); +} + +int +storeSfsDirMapBitAllocate(SwapDir * SD) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int fn; + fn = file_map_allocate(sfsinfo->map, sfsinfo->suggest); + file_map_bit_set(sfsinfo->map, fn); + sfsinfo->suggest = fn + 1; + return fn; +} + +/* + * Initialise the sfs bitmap + * + * If there already is a bitmap, and the numobjects is larger than currently + * configured, we allocate a new bitmap and 'grow' the old one into it. + */ +static void +storeSfsDirInitBitmap(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + + if (sfsinfo->map == NULL) { + /* First time */ + sfsinfo->map = file_map_create(); + } else if (sfsinfo->map->max_n_files) { + /* it grew, need to expand */ + /* XXX We don't need it anymore .. 
*/ + } + /* else it shrunk, and we leave the old one in place */ +} + +static char * +storeSfsDirSwapSubDir(SwapDir * sd, int subdirn) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + + LOCAL_ARRAY(char, fullfilename, SQUID_MAXPATHLEN); + assert(0 <= subdirn && subdirn < sfsinfo->l1); + snprintf(fullfilename, SQUID_MAXPATHLEN, "%s/%02X", sd->path, subdirn); + return fullfilename; +} + +static int +storeSfsDirCreateDirectory(const char *path, int should_exist) +{ + int created = 0; + struct stat st; + getCurrentTime(); + if (0 == stat(path, &st)) { + if (S_ISDIR(st.st_mode)) { + debug(20, should_exist ? 3 : 1) ("%s exists\n", path); + } else { + fatalf("Swap directory %s is not a directory.", path); + } + } else if (0 == mkdir(path, 0755)) { + debug(20, should_exist ? 1 : 3) ("%s created\n", path); + created = 1; + } else { + fatalf("Failed to make swap directory %s: %s", + path, xstrerror()); + } + return created; +} + +static int +storeSfsDirVerifyDirectory(const char *path) +{ + struct stat sb; + if (stat(path, &sb) < 0) { + debug(20, 0) ("%s: %s\n", path, xstrerror()); + return -1; + } + if (S_ISDIR(sb.st_mode) == 0) { + debug(20, 0) ("%s is not a directory\n", path); + return -1; + } + return 0; +} + +/* + * This function is called by storeSfsDirInit(). 
If this returns < 0, + * then Squid exits, complains about swap directories not + * existing, and instructs the admin to run 'squid -z' + */ +static int +storeSfsDirVerifyCacheDirs(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + int j; + const char *path = sd->path; + + if (storeSfsDirVerifyDirectory(path) < 0) + return -1; + for (j = 0; j < sfsinfo->l1; j++) { + path = storeSfsDirSwapSubDir(sd, j); + if (storeSfsDirVerifyDirectory(path) < 0) + return -1; + } + return 0; +} + +static void +storeSfsDirCreateSwapSubDirs(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + int i, k; + int should_exist; + LOCAL_ARRAY(char, name, MAXPATHLEN); + for (i = 0; i < sfsinfo->l1; i++) { + snprintf(name, MAXPATHLEN, "%s/%02X", sd->path, i); + if (storeSfsDirCreateDirectory(name, 0)) + should_exist = 0; + else + should_exist = 1; + debug(47, 1) ("Making directories in %s\n", name); + for (k = 0; k < sfsinfo->l2; k++) { + snprintf(name, MAXPATHLEN, "%s/%02X/%02X", sd->path, i, k); + storeSfsDirCreateDirectory(name, should_exist); + } + } +} + +static char * +storeSfsDirSwapLogFile(SwapDir * sd, const char *ext) +{ + LOCAL_ARRAY(char, path, SQUID_MAXPATHLEN); + LOCAL_ARRAY(char, pathtmp, SQUID_MAXPATHLEN); + LOCAL_ARRAY(char, digit, 32); + char *pathtmp2; + if (Config.Log.swap) { + xstrncpy(pathtmp, sd->path, SQUID_MAXPATHLEN - 64); + while (index(pathtmp, '/')) + *index(pathtmp, '/') = '.'; + while (strlen(pathtmp) && pathtmp[strlen(pathtmp) - 1] == '.') + pathtmp[strlen(pathtmp) - 1] = '\0'; + for (pathtmp2 = pathtmp; *pathtmp2 == '.'; pathtmp2++); + snprintf(path, SQUID_MAXPATHLEN - 64, Config.Log.swap, pathtmp2); + if (strncmp(path, Config.Log.swap, SQUID_MAXPATHLEN - 64) == 0) { + strcat(path, "."); + snprintf(digit, 32, "%02d", sd->index); + strncat(path, digit, 3); + } + } else { + xstrncpy(path, sd->path, SQUID_MAXPATHLEN - 64); + strcat(path, "/swap.state"); + } + if (ext) + strncat(path, ext, 16); + return path; +} + +static void 
+storeSfsDirOpenSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *path; + int fd; + path = storeSfsDirSwapLogFile(sd, NULL); + fd = file_open(path, O_WRONLY | O_CREAT | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", path, xstrerror()); + fatal("storeSfsDirOpenSwapLog: Failed to open swap log."); + } + debug(47, 3) ("Cache Dir #%d log opened on FD %d\n", sd->index, fd); + sfsinfo->swaplog_fd = fd; + if (0 == n_sfs_dirs) + assert(NULL == sfs_dir_index); + n_sfs_dirs++; + assert(n_sfs_dirs <= Config.cacheSwap.n_configured); +} + +static void +storeSfsDirCloseSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + if (sfsinfo->swaplog_fd < 0) /* not open */ + return; + file_close(sfsinfo->swaplog_fd); + debug(47, 3) ("Cache Dir #%d log closed on FD %d\n", + sd->index, sfsinfo->swaplog_fd); + sfsinfo->swaplog_fd = -1; + n_sfs_dirs--; + assert(n_sfs_dirs >= 0); + if (0 == n_sfs_dirs) + safe_free(sfs_dir_index); +} + +static void +storeSfsDirInit(SwapDir * sd) +{ + static int started_clean_event = 0; + static const char *errmsg = + "\tFailed to verify one of the swap directories, Check cache.log\n" + "\tfor details. 
Run 'squid -z' to create swap directories\n" + "\tif needed, or if running Squid for the first time."; + storeSfsDirInitBitmap(sd); + if (storeSfsDirVerifyCacheDirs(sd) < 0) + fatal(errmsg); + storeSfsDirOpenSwapLog(sd); + storeSfsDirRebuild(sd); + if (!started_clean_event) { + eventAdd("storeDirClean", storeSfsDirCleanEvent, NULL, 15.0, 1); + started_clean_event = 1; + } + (void) storeDirGetBlkSize(sd->path, &sd->fs.blksize); +} + +static void +storeSfsDirRebuildFromDirectory(void *data) +{ + RebuildState *rb = data; + SwapDir *SD = rb->sd; + LOCAL_ARRAY(char, hdr_buf, SM_PAGE_SIZE); + StoreEntry *e = NULL; + StoreEntry tmpe; + cache_key key[MD5_DIGEST_CHARS]; + int sfileno = 0; + int count; + int size; + struct stat sb; + int swap_hdr_len; + int fd = -1; + tlv *tlv_list; + tlv *t; + assert(rb != NULL); + debug(20, 3) ("storeSfsDirRebuildFromDirectory: DIR #%d\n", rb->sd->index); + for (count = 0; count < rb->speed; count++) { + assert(fd == -1); + fd = storeSfsDirGetNextFile(rb, &sfileno, &size); + if (fd == -2) { + debug(20, 1) ("Done scanning %s swaplog (%d entries)\n", + rb->sd->path, rb->n_read); + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + } else if (fd < 0) { + continue; + } + assert(fd > -1); + /* lets get file stats here */ + if (fstat(fd, &sb) < 0) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: fstat(FD %d): %s\n", + fd, xstrerror()); + file_close(fd); + store_open_disk_fd--; + fd = -1; + continue; + } + if ((++rb->counts.scancount & 0xFFFF) == 0) + debug(20, 3) (" %s %7d files opened so far.\n", + rb->sd->path, rb->counts.scancount); + debug(20, 9) ("file_in: fd=%d %08X\n", fd, sfileno); + statCounter.syscalls.disk.reads++; + if (read(fd, hdr_buf, SM_PAGE_SIZE) < 0) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: read(FD %d): %s\n", + fd, xstrerror()); + file_close(fd); + store_open_disk_fd--; + fd = -1; + continue; + } + file_close(fd); + 
store_open_disk_fd--; + fd = -1; + swap_hdr_len = 0; +#if USE_TRUNCATE + if (sb.st_size == 0) + continue; +#endif + tlv_list = storeSwapMetaUnpack(hdr_buf, &swap_hdr_len); + if (tlv_list == NULL) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: failed to get meta data\n"); + /* XXX shouldn't this be a call to storeSfsUnlink ? */ + storeSfsDirUnlinkFile(SD, sfileno); + continue; + } + debug(20, 3) ("storeSfsDirRebuildFromDirectory: successful swap meta unpacking\n"); + memset(key, '\0', MD5_DIGEST_CHARS); + memset(&tmpe, '\0', sizeof(StoreEntry)); + for (t = tlv_list; t; t = t->next) { + switch (t->type) { + case STORE_META_KEY: + assert(t->length == MD5_DIGEST_CHARS); + xmemcpy(key, t->value, MD5_DIGEST_CHARS); + break; + case STORE_META_STD: + assert(t->length == STORE_HDR_METASIZE); + xmemcpy(&tmpe.timestamp, t->value, STORE_HDR_METASIZE); + break; + default: + break; + } + } + storeSwapTLVFree(tlv_list); + tlv_list = NULL; + if (storeKeyNull(key)) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: NULL key\n"); + storeSfsDirUnlinkFile(SD, sfileno); + continue; + } + tmpe.hash.key = key; + /* check sizes */ + if (tmpe.swap_file_sz == 0) { + tmpe.swap_file_sz = sb.st_size; + } else if (tmpe.swap_file_sz == sb.st_size - swap_hdr_len) { + tmpe.swap_file_sz = sb.st_size; + } else if (tmpe.swap_file_sz != sb.st_size) { + debug(20, 1) ("storeSfsDirRebuildFromDirectory: SIZE MISMATCH %d!=%d\n", + tmpe.swap_file_sz, (int) sb.st_size); + storeSfsDirUnlinkFile(SD, sfileno); + continue; + } + if (EBIT_TEST(tmpe.flags, KEY_PRIVATE)) { + storeSfsDirUnlinkFile(SD, sfileno); + rb->counts.badflags++; + continue; + } + e = storeGet(key); + if (e && e->lastref >= tmpe.lastref) { + /* key already exists, current entry is newer */ + /* keep old, ignore new */ + rb->counts.dupcount++; + continue; + } else if (NULL != e) { + /* URL already exists, this swapfile not being used */ + /* junk old, load new */ + storeRelease(e); /* release old entry */ + rb->counts.dupcount++; + } + 
rb->counts.objcount++; + storeEntryDump(&tmpe, 5); + e = storeSfsDirAddDiskRestore(SD, key, + sfileno, + tmpe.swap_file_sz, + tmpe.expires, + tmpe.timestamp, + tmpe.lastref, + tmpe.lastmod, + tmpe.refcount, /* refcount */ + tmpe.flags, /* flags */ + (int) rb->flags.clean); + storeDirSwapLog(e, SWAP_LOG_ADD); + } + eventAdd("storeRebuild", storeSfsDirRebuildFromDirectory, rb, 0.0, 1); +} + +static void +storeSfsDirRebuildFromSwapLog(void *data) +{ + RebuildState *rb = data; + SwapDir *SD = rb->sd; + StoreEntry *e = NULL; + storeSwapLogData s; + size_t ss = sizeof(storeSwapLogData); + int count; + int used; /* is swapfile already in use? */ + int disk_entry_newer; /* is the log entry newer than current entry? */ + double x; + assert(rb != NULL); + /* load a number of objects per invocation */ + for (count = 0; count < rb->speed; count++) { + if (fread(&s, ss, 1, rb->log) != 1) { + debug(20, 1) ("Done reading %s swaplog (%d entries)\n", + rb->sd->path, rb->n_read); + fclose(rb->log); + rb->log = NULL; + store_dirs_rebuilding--; + storeSfsDirCloseTmpSwapLog(rb->sd); + storeRebuildComplete(&rb->counts); + cbdataFree(rb); + return; + } + rb->n_read++; + if (s.op <= SWAP_LOG_NOP) + continue; + if (s.op >= SWAP_LOG_MAX) + continue; + /* + * BC: during 2.4 development, we changed the way swap file + * numbers are assigned and stored. The high 16 bits used + * to encode the SD index number. There used to be a call + * to storeDirProperFileno here that re-assigned the index + * bits. Now, for backwards compatibility, we just need + * to mask it off. + */ + s.swap_filen &= 0x00FFFFFF; + debug(20, 3) ("storeSfsDirRebuildFromSwapLog: %s %s %08X\n", + swap_log_op_str[(int) s.op], + storeKeyText(s.key), + s.swap_filen); + if (s.op == SWAP_LOG_ADD) { + (void) 0; + } else if (s.op == SWAP_LOG_DEL) { + if ((e = storeGet(s.key)) != NULL) { + /* + * Make sure we don't unlink the file, it might be + * in use by a subsequent entry. 
Also note that + * we don't have to subtract from store_swap_size + * because adding to store_swap_size happens in + * the cleanup procedure. + */ + storeExpireNow(e); + storeReleaseRequest(e); + storeSfsDirReplRemove(e); + if (e->swap_filen > -1) { + storeSfsDirMapBitReset(SD, e->swap_filen); + e->swap_filen = -1; + e->swap_dirn = -1; + } + storeRelease(e); + rb->counts.objcount--; + rb->counts.cancelcount++; + } + continue; + } else { + x = log(++rb->counts.bad_log_op) / log(10.0); + if (0.0 == x - (double) (int) x) + debug(20, 1) ("WARNING: %d invalid swap log entries found\n", + rb->counts.bad_log_op); + rb->counts.invalid++; + continue; + } + if ((++rb->counts.scancount & 0xFFF) == 0) { + struct stat sb; + if (0 == fstat(fileno(rb->log), &sb)) + storeRebuildProgress(SD->index, + (int) sb.st_size / ss, rb->n_read); + } + if (!storeSfsDirValidFileno(SD, s.swap_filen, 0)) { + rb->counts.invalid++; + continue; + } + if (EBIT_TEST(s.flags, KEY_PRIVATE)) { + rb->counts.badflags++; + continue; + } + e = storeGet(s.key); + used = storeSfsDirMapBitTest(SD, s.swap_filen); + /* If this URL already exists in the cache, does the swap log + * appear to have a newer entry? Compare 'lastref' from the + * swap log to e->lastref. */ + disk_entry_newer = e ? (s.lastref > e->lastref ? 
1 : 0) : 0; + if (used && !disk_entry_newer) { + /* log entry is old, ignore it */ + rb->counts.clashcount++; + continue; + } else if (used && e && e->swap_filen == s.swap_filen && e->swap_dirn == SD->index) { + /* swapfile taken, same URL, newer, update meta */ + if (e->store_status == STORE_OK) { + e->lastref = s.timestamp; + e->timestamp = s.timestamp; + e->expires = s.expires; + e->lastmod = s.lastmod; + e->flags = s.flags; + e->refcount += s.refcount; + storeSfsDirUnrefObj(SD, e); + } else { + debug_trap("storeSfsDirRebuildFromSwapLog: bad condition"); + debug(20, 1) ("\tSee %s:%d\n", __FILE__, __LINE__); + } + continue; + } else if (used) { + /* swapfile in use, not by this URL, log entry is newer */ + /* This is sorta bad: the log entry should NOT be newer at this + * point. If the log is dirty, the filesize check should have + * caught this. If the log is clean, there should never be a + * newer entry. */ + debug(20, 1) ("WARNING: newer swaplog entry for dirno %d, fileno %08X\n", + SD->index, s.swap_filen); + /* I'm tempted to remove the swapfile here just to be safe, + * but there is a bad race condition in the NOVM version if + * the swapfile has recently been opened for writing, but + * not yet opened for reading. Because we can't map + * swapfiles back to StoreEntrys, we don't know the state + * of the entry using that file. */ + /* We'll assume the existing entry is valid, probably because + * we're in a slow rebuild and the swap file number got taken + * and the validation procedure hasn't run. 
*/ + assert(rb->flags.need_to_validate); + rb->counts.clashcount++; + continue; + } else if (e && !disk_entry_newer) { + /* key already exists, current entry is newer */ + /* keep old, ignore new */ + rb->counts.dupcount++; + continue; + } else if (e) { + /* key already exists, this swapfile not being used */ + /* junk old, load new */ + storeExpireNow(e); + storeReleaseRequest(e); + storeSfsDirReplRemove(e); + if (e->swap_filen > -1) { + /* Make sure we don't actually unlink the file */ + storeSfsDirMapBitReset(SD, e->swap_filen); + e->swap_filen = -1; + e->swap_dirn = -1; + } + storeRelease(e); + rb->counts.dupcount++; + } else { + /* URL doesn't exist, swapfile not in use */ + /* load new */ + (void) 0; + } + /* update store_swap_size */ + rb->counts.objcount++; + e = storeSfsDirAddDiskRestore(SD, s.key, + s.swap_filen, + s.swap_file_sz, + s.expires, + s.timestamp, + s.lastref, + s.lastmod, + s.refcount, + s.flags, + (int) rb->flags.clean); + storeDirSwapLog(e, SWAP_LOG_ADD); + } + eventAdd("storeRebuild", storeSfsDirRebuildFromSwapLog, rb, 0.0, 1); +} + +static int +storeSfsDirGetNextFile(RebuildState * rb, int *sfileno, int *size) +{ + SwapDir *SD = rb->sd; + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int fd = -1; + int used = 0; + int dirs_opened = 0; + debug(20, 3) ("storeSfsDirGetNextFile: flag=%d, %d: /%02X/%02X\n", + rb->flags.init, + rb->sd->index, + rb->curlvl1, rb->curlvl2); + if (rb->done) + return -2; + while (fd < 0 && rb->done == 0) { + fd = -1; + if (0 == rb->flags.init) { /* initialize, open first file */ + rb->done = 0; + rb->curlvl1 = 0; + rb->curlvl2 = 0; + rb->in_dir = 0; + rb->flags.init = 1; + assert(Config.cacheSwap.n_configured > 0); + } + if (0 == rb->in_dir) { /* we need to read in a new directory */ + snprintf(rb->fullpath, SQUID_MAXPATHLEN, "%s/%02X/%02X", + rb->sd->path, + rb->curlvl1, + rb->curlvl2); + if (rb->flags.init && rb->td != NULL) + closedir(rb->td); + rb->td = NULL; + if (dirs_opened) + return -1; + rb->td = 
opendir(rb->fullpath); + dirs_opened++; + if (rb->td == NULL) { + debug(50, 1) ("storeSfsDirGetNextFile: opendir: %s: %s\n", + rb->fullpath, xstrerror()); + } else { + rb->entry = readdir(rb->td); /* skip . and .. */ + rb->entry = readdir(rb->td); + if (rb->entry == NULL && errno == ENOENT) + debug(20, 1) ("storeSfsDirGetNextFile: directory does not exist!.\n"); + debug(20, 3) ("storeSfsDirGetNextFile: Directory %s\n", rb->fullpath); + } + } + if (rb->td != NULL && (rb->entry = readdir(rb->td)) != NULL) { + rb->in_dir++; + if (sscanf(rb->entry->d_name, "%x", &rb->fn) != 1) { + debug(20, 3) ("storeSfsDirGetNextFile: invalid %s\n", + rb->entry->d_name); + continue; + } + if (!storeSfsFilenoBelongsHere(rb->fn, rb->sd->index, rb->curlvl1, rb->curlvl2)) { + debug(20, 3) ("storeSfsDirGetNextFile: %08X does not belong in %d/%d/%d\n", + rb->fn, rb->sd->index, rb->curlvl1, rb->curlvl2); + continue; + } + used = storeSfsDirMapBitTest(SD, rb->fn); + if (used) { + debug(20, 3) ("storeSfsDirGetNextFile: Locked, continuing with next.\n"); + continue; + } + snprintf(rb->fullfilename, SQUID_MAXPATHLEN, "%s/%s", + rb->fullpath, rb->entry->d_name); + debug(20, 3) ("storeSfsDirGetNextFile: Opening %s\n", rb->fullfilename); + fd = file_open(rb->fullfilename, O_RDONLY | O_BINARY); + if (fd < 0) + debug(50, 1) ("storeSfsDirGetNextFile: %s: %s\n", rb->fullfilename, xstrerror()); + else + store_open_disk_fd++; + continue; + } + rb->in_dir = 0; + if (++rb->curlvl2 < sfsinfo->l2) + continue; + rb->curlvl2 = 0; + if (++rb->curlvl1 < sfsinfo->l1) + continue; + rb->curlvl1 = 0; + rb->done = 1; + } + *sfileno = rb->fn; + return fd; +} + +/* Add a new object to the cache with empty memory copy and pointer to disk + * use to rebuild store from disk. 
*/ +static StoreEntry * +storeSfsDirAddDiskRestore(SwapDir * SD, const cache_key * key, + int file_number, + size_t swap_file_sz, + time_t expires, + time_t timestamp, + time_t lastref, + time_t lastmod, + u_num32 refcount, + u_short flags, + int clean) +{ + StoreEntry *e = NULL; + debug(20, 5) ("storeSfsAddDiskRestore: %s, fileno=%08X\n", storeKeyText(key), file_number); + /* if you call this you'd better be sure file_number is not + * already in use! */ + e = new_StoreEntry(STORE_ENTRY_WITHOUT_MEMOBJ, NULL, NULL); + e->store_status = STORE_OK; + storeSetMemStatus(e, NOT_IN_MEMORY); + e->swap_status = SWAPOUT_DONE; + e->swap_filen = file_number; + e->swap_dirn = SD->index; + e->swap_file_sz = swap_file_sz; + e->lock_count = 0; + e->lastref = lastref; + e->timestamp = timestamp; + e->expires = expires; + e->lastmod = lastmod; + e->refcount = refcount; + e->flags = flags; + EBIT_SET(e->flags, ENTRY_CACHABLE); + EBIT_CLR(e->flags, RELEASE_REQUEST); + EBIT_CLR(e->flags, KEY_PRIVATE); + e->ping_status = PING_NONE; + EBIT_CLR(e->flags, ENTRY_VALIDATED); + storeSfsDirMapBitSet(SD, e->swap_filen); + storeHashInsert(e, key); /* do it after we clear KEY_PRIVATE */ + storeSfsDirReplAdd(SD, e); + return e; +} + +CBDATA_TYPE(RebuildState); +static void +storeSfsDirRebuild(SwapDir * sd) +{ + RebuildState *rb; + int clean = 0; + int zero = 0; + FILE *fp; + EVH *func = NULL; + CBDATA_INIT_TYPE(RebuildState); + rb = CBDATA_ALLOC(RebuildState, NULL); + rb->sd = sd; + rb->speed = opt_foreground_rebuild ? 1 << 30 : 50; + /* + * If the swap.state file exists in the cache_dir, then + * we'll use storeSfsDirRebuildFromSwapLog(), otherwise we'll + * use storeSfsDirRebuildFromDirectory() to open up each file + * and suck in the meta data. 
+ */ + fp = storeSfsDirOpenTmpSwapLog(sd, &clean, &zero); + if (fp == NULL || zero) { + if (fp != NULL) + fclose(fp); + func = storeSfsDirRebuildFromDirectory; + } else { + func = storeSfsDirRebuildFromSwapLog; + rb->log = fp; + rb->flags.clean = (unsigned int) clean; + } + if (!clean) + rb->flags.need_to_validate = 1; + debug(20, 1) ("Rebuilding storage in %s (%s)\n", + sd->path, clean ? "CLEAN" : "DIRTY"); + store_dirs_rebuilding++; + eventAdd("storeRebuild", func, rb, 0.0, 1); +} + +static void +storeSfsDirCloseTmpSwapLog(SwapDir * sd) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *swaplog_path = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + char *new_path = xstrdup(storeSfsDirSwapLogFile(sd, ".new")); + int fd; + file_close(sfsinfo->swaplog_fd); +#if defined (_SQUID_OS2_) || defined (_SQUID_CYGWIN_) + if (unlink(swaplog_path) < 0) { + debug(50, 0) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("storeSfsDirCloseTmpSwapLog: unlink failed"); + } +#endif + if (xrename(new_path, swaplog_path) < 0) { + fatal("storeSfsDirCloseTmpSwapLog: rename failed"); + } + fd = file_open(swaplog_path, O_WRONLY | O_CREAT | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("storeSfsDirCloseTmpSwapLog: Failed to open swap log."); + } + safe_free(swaplog_path); + safe_free(new_path); + sfsinfo->swaplog_fd = fd; + debug(47, 3) ("Cache Dir #%d log opened on FD %d\n", sd->index, fd); +} + +static FILE * +storeSfsDirOpenTmpSwapLog(SwapDir * sd, int *clean_flag, int *zero_flag) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + char *swaplog_path = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + char *clean_path = xstrdup(storeSfsDirSwapLogFile(sd, ".last-clean")); + char *new_path = xstrdup(storeSfsDirSwapLogFile(sd, ".new")); + struct stat log_sb; + struct stat clean_sb; + FILE *fp; + int fd; + if (stat(swaplog_path, &log_sb) < 0) { + debug(47, 1) ("Cache Dir #%d: No log file\n", sd->index); + safe_free(swaplog_path); + 
safe_free(clean_path); + safe_free(new_path); + return NULL; + } + *zero_flag = log_sb.st_size == 0 ? 1 : 0; + /* close the existing write-only FD */ + if (sfsinfo->swaplog_fd >= 0) + file_close(sfsinfo->swaplog_fd); + /* open a write-only FD for the new log */ + fd = file_open(new_path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + if (fd < 0) { + debug(50, 1) ("%s: %s\n", new_path, xstrerror()); + fatal("storeDirOpenTmpSwapLog: Failed to open swap log."); + } + sfsinfo->swaplog_fd = fd; + /* open a read-only stream of the old log */ + fp = fopen(swaplog_path, "r"); + if (fp == NULL) { + debug(50, 0) ("%s: %s\n", swaplog_path, xstrerror()); + fatal("Failed to open swap log for reading"); + } +#if defined(_SQUID_CYGWIN_) + setmode(fileno(fp), O_BINARY); +#endif + memset(&clean_sb, '\0', sizeof(struct stat)); + if (stat(clean_path, &clean_sb) < 0) + *clean_flag = 0; + else if (clean_sb.st_mtime < log_sb.st_mtime) + *clean_flag = 0; + else + *clean_flag = 1; + safeunlink(clean_path, 1); + safe_free(swaplog_path); + safe_free(clean_path); + safe_free(new_path); + return fp; +} + +struct _clean_state { + char *cur; + char *new; + char *cln; + char *outbuf; + off_t outbuf_offset; + int fd; + RemovalPolicyWalker *walker; +}; + +#define CLEAN_BUF_SZ 16384 +/* + * Begin the process to write clean cache state. For SFS this means + * opening some log files and allocating write buffers. Return 0 if + * we succeed, and assign the 'func' and 'data' return pointers. 
+ */ +static int +storeSfsDirWriteCleanStart(SwapDir * sd) +{ + struct _clean_state *state = xcalloc(1, sizeof(*state)); + struct stat sb; + sd->log.clean.write = NULL; + sd->log.clean.state = NULL; + state->new = xstrdup(storeSfsDirSwapLogFile(sd, ".clean")); + state->fd = file_open(state->new, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + if (state->fd < 0) { + xfree(state->new); + xfree(state); + return -1; + } + state->cur = xstrdup(storeSfsDirSwapLogFile(sd, NULL)); + state->cln = xstrdup(storeSfsDirSwapLogFile(sd, ".last-clean")); + state->outbuf = xcalloc(CLEAN_BUF_SZ, 1); + state->outbuf_offset = 0; + state->walker = sd->repl->WalkInit(sd->repl); +#if !(defined(_SQUID_OS2_) || defined (_SQUID_CYGWIN_)) + unlink(state->new); +#endif + unlink(state->cln); + debug(20, 3) ("storeDirWriteCleanLogs: opened %s, FD %d\n", + state->new, state->fd); +#if HAVE_FCHMOD + if (stat(state->cur, &sb) == 0) + fchmod(state->fd, sb.st_mode); +#endif + sd->log.clean.write = storeSfsDirWriteCleanEntry; + sd->log.clean.state = state; + return 0; +} + +/* + * Get the next entry that is a candidate for clean log writing + */ +const StoreEntry * +storeSfsDirCleanLogNextEntry(SwapDir * sd) +{ + const StoreEntry *entry = NULL; + struct _clean_state *state = sd->log.clean.state; + if (state->walker) + entry = state->walker->Next(state->walker); + return entry; +} + +/* + * "write" an entry to the clean log file. 
+ */ +static void +storeSfsDirWriteCleanEntry(SwapDir * sd, const StoreEntry * e) +{ + storeSwapLogData s; + static size_t ss = sizeof(storeSwapLogData); + struct _clean_state *state = sd->log.clean.state; + memset(&s, '\0', ss); + s.op = (char) SWAP_LOG_ADD; + s.swap_filen = e->swap_filen; + s.timestamp = e->timestamp; + s.lastref = e->lastref; + s.expires = e->expires; + s.lastmod = e->lastmod; + s.swap_file_sz = e->swap_file_sz; + s.refcount = e->refcount; + s.flags = e->flags; + xmemcpy(&s.key, e->hash.key, MD5_DIGEST_CHARS); + xmemcpy(state->outbuf + state->outbuf_offset, &s, ss); + state->outbuf_offset += ss; + /* buffered write */ + if (state->outbuf_offset + ss > CLEAN_BUF_SZ) { + if (write(state->fd, state->outbuf, state->outbuf_offset) < 0) { + debug(50, 0) ("storeDirWriteCleanLogs: %s: write: %s\n", + state->new, xstrerror()); + debug(20, 0) ("storeDirWriteCleanLogs: Current swap logfile not replaced.\n"); + file_close(state->fd); + state->fd = -1; + unlink(state->new); + safe_free(state); + sd->log.clean.state = NULL; + sd->log.clean.write = NULL; + } + state->outbuf_offset = 0; + } +} + +static void +storeSfsDirWriteCleanDone(SwapDir * sd) +{ + int fd; + struct _clean_state *state = sd->log.clean.state; + if (NULL == state) + return; + if (state->fd < 0) + return; + state->walker->Done(state->walker); + if (write(state->fd, state->outbuf, state->outbuf_offset) < 0) { + debug(50, 0) ("storeDirWriteCleanLogs: %s: write: %s\n", + state->new, xstrerror()); + debug(20, 0) ("storeDirWriteCleanLogs: Current swap logfile " + "not replaced.\n"); + file_close(state->fd); + state->fd = -1; + unlink(state->new); + } + safe_free(state->outbuf); + /* + * You can't rename open files on Microsoft "operating systems" + * so we have to close before renaming. 
+ */ + storeSfsDirCloseSwapLog(sd); + /* save the fd value for a later test */ + fd = state->fd; + /* rename */ + if (state->fd >= 0) { +#if defined(_SQUID_OS2_) || defined (_SQUID_CYGWIN_) + file_close(state->fd); + state->fd = -1; + if (unlink(state->cur) < 0) + debug(50, 0) ("storeDirWriteCleanLogs: unlinkd failed: %s, %s\n", + xstrerror(), state->cur); +#endif + xrename(state->new, state->cur); + } + /* touch a timestamp file if we're not still validating */ + if (store_dirs_rebuilding) + (void) 0; + else if (fd < 0) + (void) 0; + else + file_close(file_open(state->cln, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY)); + /* close */ + safe_free(state->cur); + safe_free(state->new); + safe_free(state->cln); + if (state->fd >= 0) + file_close(state->fd); + state->fd = -1; + safe_free(state); + sd->log.clean.state = NULL; + sd->log.clean.write = NULL; +} + +static void +storeSwapLogDataFree(void *s) +{ + memFree(s, MEM_SWAP_LOG_DATA); +} + +static void +storeSfsDirSwapLog(const SwapDir * sd, const StoreEntry * e, int op) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) sd->fsdata; + storeSwapLogData *s = memAllocate(MEM_SWAP_LOG_DATA); + s->op = (char) op; + s->swap_filen = e->swap_filen; + s->timestamp = e->timestamp; + s->lastref = e->lastref; + s->expires = e->expires; + s->lastmod = e->lastmod; + s->swap_file_sz = e->swap_file_sz; + s->refcount = e->refcount; + s->flags = e->flags; + xmemcpy(s->key, e->hash.key, MD5_DIGEST_CHARS); + file_write(sfsinfo->swaplog_fd, + -1, + s, + sizeof(storeSwapLogData), + NULL, + NULL, + (FREE *) storeSwapLogDataFree); +} + +static void +storeSfsDirNewfs(SwapDir * sd) +{ + debug(47, 3) ("Creating swap space in %s\n", sd->path); + storeSfsDirCreateDirectory(sd->path, 0); + storeSfsDirCreateSwapSubDirs(sd); +} + +static int +rev_int_sort(const void *A, const void *B) +{ + const int *i1 = A; + const int *i2 = B; + return *i2 - *i1; +} + +static int +storeSfsDirClean(int swap_index) +{ + DIR *dp = NULL; + struct dirent *de = NULL; + 
LOCAL_ARRAY(char, p1, MAXPATHLEN + 1); + LOCAL_ARRAY(char, p2, MAXPATHLEN + 1); +#if USE_TRUNCATE + struct stat sb; +#endif + int files[20]; + int swapfileno; + int fn; /* same as swapfileno, but with dirn bits set */ + int n = 0; + int k = 0; + int N0, N1, N2; + int D0, D1, D2; + SwapDir *SD; + sfsinfo_t *sfsinfo; + N0 = n_sfs_dirs; + D0 = sfs_dir_index[swap_index % N0]; + SD = &Config.cacheSwap.swapDirs[D0]; + sfsinfo = (sfsinfo_t *) SD->fsdata; + N1 = sfsinfo->l1; + D1 = (swap_index / N0) % N1; + N2 = sfsinfo->l2; + D2 = ((swap_index / N0) / N1) % N2; + snprintf(p1, SQUID_MAXPATHLEN, "%s/%02X/%02X", + Config.cacheSwap.swapDirs[D0].path, D1, D2); + debug(36, 3) ("storeDirClean: Cleaning directory %s\n", p1); + dp = opendir(p1); + if (dp == NULL) { + if (errno == ENOENT) { + debug(36, 0) ("storeDirClean: WARNING: Creating %s\n", p1); + if (mkdir(p1, 0777) == 0) + return 0; + } + debug(50, 0) ("storeDirClean: %s: %s\n", p1, xstrerror()); + safeunlink(p1, 1); + return 0; + } + while ((de = readdir(dp)) != NULL && k < 20) { + if (sscanf(de->d_name, "%X", &swapfileno) != 1) + continue; + fn = swapfileno; /* XXX should remove this cruft ! 
*/ + if (storeSfsDirValidFileno(SD, fn, 1)) + if (storeSfsDirMapBitTest(SD, fn)) + if (storeSfsFilenoBelongsHere(fn, D0, D1, D2)) + continue; +#if USE_TRUNCATE + if (!stat(de->d_name, &sb)) + if (sb.st_size == 0) + continue; +#endif + files[k++] = swapfileno; + } + closedir(dp); + if (k == 0) + return 0; + qsort(files, k, sizeof(int), rev_int_sort); + if (k > 10) + k = 10; + for (n = 0; n < k; n++) { + debug(36, 3) ("storeDirClean: Cleaning file %08X\n", files[n]); + snprintf(p2, MAXPATHLEN + 1, "%s/%08X", p1, files[n]); +#if USE_TRUNCATE + truncate(p2, 0); +#else + safeunlink(p2, 0); +#endif + statCounter.swap.files_cleaned++; + } + debug(36, 3) ("Cleaned %d unused files from %s\n", k, p1); + return k; +} + +static void +storeSfsDirCleanEvent(void *unused) +{ + static int swap_index = 0; + int i; + int j = 0; + int n = 0; + /* + * Assert that there are SFS cache_dirs configured, otherwise + * we should never be called. + */ + assert(n_sfs_dirs); + if (NULL == sfs_dir_index) { + SwapDir *sd; + sfsinfo_t *sfsinfo; + /* + * Initialize the little array that translates SFS cache_dir + * number into the Config.cacheSwap.swapDirs array index. + */ + sfs_dir_index = xcalloc(n_sfs_dirs, sizeof(*sfs_dir_index)); + for (i = 0, n = 0; i < Config.cacheSwap.n_configured; i++) { + sd = &Config.cacheSwap.swapDirs[i]; + if (!storeSfsDirIs(sd)) + continue; + sfs_dir_index[n++] = i; + sfsinfo = (sfsinfo_t *) sd->fsdata; + j += (sfsinfo->l1 * sfsinfo->l2); + } + assert(n == n_sfs_dirs); + /* + * Start the storeSfsDirClean() swap_index with a random + * value. 
j equals the total number of SFS level 2 + * swap directories + */ + swap_index = (int) (squid_random() % j); + } + if (0 == store_dirs_rebuilding) { + n = storeSfsDirClean(swap_index); + swap_index++; + } + eventAdd("storeDirClean", storeSfsDirCleanEvent, NULL, + 15.0 * exp(-0.25 * n), 1); +} + +static int +storeSfsDirIs(SwapDir * sd) +{ + if (strncmp(sd->type, "sfs", 3) == 0) + return 1; + return 0; +} + +/* + * Does swapfile number 'fn' belong in cachedir #F0, + * level1 dir #F1, level2 dir #F2? + */ +static int +storeSfsFilenoBelongsHere(int fn, int F0, int F1, int F2) +{ + int D1, D2; + int L1, L2; + int filn = fn; + sfsinfo_t *sfsinfo; + assert(F0 < Config.cacheSwap.n_configured); + sfsinfo = (sfsinfo_t *) Config.cacheSwap.swapDirs[F0].fsdata; + L1 = sfsinfo->l1; + L2 = sfsinfo->l2; + D1 = ((filn / L2) / L2) % L1; + if (F1 != D1) + return 0; + D2 = (filn / L2) % L2; + if (F2 != D2) + return 0; + return 1; +} + +int +storeSfsDirValidFileno(SwapDir * SD, sfileno filn, int flag) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + if (filn < 0) + return 0; + /* + * If flag is set, an out-of-range file number is + * considered invalid. + */ + if (flag) + if (filn > sfsinfo->map->max_n_files) + return 0; + return 1; +} + +void +storeSfsDirMaintain(SwapDir * SD) +{ + StoreEntry *e = NULL; + int removed = 0; + int max_scan; + int max_remove; + double f; + RemovalPurgeWalker *walker; + /* We can't delete objects while rebuilding swap */ + if (store_dirs_rebuilding) { + return; + } else { + f = (double) (SD->cur_size - SD->low_size) / (SD->max_size - SD->low_size); + f = f < 0.0 ? 0.0 : f > 1.0 ? 1.0 : f; + max_scan = (int) (f * 400.0 + 100.0); + max_remove = (int) (f * 70.0 + 10.0); + /* + * This is kinda cheap, but do we need this priority hack?
+ */ + } + debug(20, 3) ("storeMaintainSwapSpace: f=%f, max_scan=%d, max_remove=%d\n", f, max_scan, max_remove); + walker = SD->repl->PurgeInit(SD->repl, max_scan); + while (1) { + if (SD->cur_size < SD->low_size) + break; + if (removed >= max_remove) + break; + e = walker->Next(walker); + if (!e) + break; /* no more objects */ + removed++; + storeRelease(e); + } + walker->Done(walker); + debug(20, (removed ? 2 : 3)) ("storeSfsDirMaintain: %s removed %d/%d f=%.03f max_scan=%d\n", + SD->path, removed, max_remove, f, max_scan); +} + +/* + * storeSfsDirCheckObj + * + * This routine is called by storeDirSelectSwapDir to see if the given + * object is able to be stored on this filesystem. SFS filesystems will + * happily store anything as long as the LRU time isn't too small. + */ +int +storeSfsDirCheckObj(SwapDir * SD, const StoreEntry * e) +{ +#if OLD_UNUSED_CODE + if (storeSfsDirExpiredReferenceAge(SD) < 300) { + debug(20, 3) ("storeSfsDirCheckObj: NO: LRU Age = %d\n", + storeSfsDirExpiredReferenceAge(SD)); + /* store_check_cachable_hist.no.lru_age_too_low++; */ + return -1; + } +#endif + /* Return 999 (99.9%) constant load */ + return 999; +} + +/* + * storeSfsDirRefObj + * + * This routine is called whenever an object is referenced, so we can + * maintain replacement information within the storage fs. + */ +void +storeSfsDirRefObj(SwapDir * SD, StoreEntry * e) +{ + debug(1, 3) ("storeSfsDirRefObj: referencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); + if (SD->repl->Referenced) + SD->repl->Referenced(SD->repl, e, &e->repl); +} + +/* + * storeSfsDirUnrefObj + * This routine is called whenever the last reference to an object is + * removed, to maintain replacement information within the storage fs. 
+ */ +void +storeSfsDirUnrefObj(SwapDir * SD, StoreEntry * e) +{ + debug(1, 3) ("storeSfsDirUnrefObj: dereferencing %p %d/%d\n", e, e->swap_dirn, + e->swap_filen); + if (SD->repl->Dereferenced) + SD->repl->Dereferenced(SD->repl, e, &e->repl); +} + +/* + * storeSfsDirUnlinkFile + * + * This routine unlinks a file and pulls it out of the bitmap. + * It used to be in storeSfsUnlink(), however an interface change + * forced this bit of code here. Eeek. + */ +void +storeSfsDirUnlinkFile(SwapDir * SD, sfileno f) +{ + debug(79, 3) ("storeSfsDirUnlinkFile: unlinking fileno %08X\n", f); + /* storeSfsDirMapBitReset(SD, f); */ + unlinkdUnlink(storeSfsDirFullPath(SD, f, NULL)); +} + +/* + * Add and remove the given StoreEntry from the replacement policy in + * use. + */ + +void +storeSfsDirReplAdd(SwapDir * SD, StoreEntry * e) +{ + debug(20, 4) ("storeSfsDirReplAdd: added node %p to dir %d\n", e, + SD->index); + SD->repl->Add(SD->repl, e, &e->repl); +} + + +void +storeSfsDirReplRemove(StoreEntry * e) +{ + SwapDir *SD = INDEXSD(e->swap_dirn); + debug(20, 4) ("storeSfsDirReplRemove: remove node %p from dir %d\n", e, + SD->index); + SD->repl->Remove(SD->repl, e, &e->repl); +} + + + +/* ========== LOCAL FUNCTIONS ABOVE, GLOBAL FUNCTIONS BELOW ========== */ + +void +storeSfsDirStats(SwapDir * SD, StoreEntry * sentry) +{ + sfsinfo_t *sfsinfo = SD->fsdata; + int totl_kb = 0; + int free_kb = 0; + int totl_in = 0; + int free_in = 0; + int x; + storeAppendPrintf(sentry, "First level subdirectories: %d\n", sfsinfo->l1); + storeAppendPrintf(sentry, "Second level subdirectories: %d\n", sfsinfo->l2); + storeAppendPrintf(sentry, "Maximum Size: %d KB\n", SD->max_size); + storeAppendPrintf(sentry, "Current Size: %d KB\n", SD->cur_size); + storeAppendPrintf(sentry, "Percent Used: %0.2f%%\n", + 100.0 * SD->cur_size / SD->max_size); + storeAppendPrintf(sentry, "Filemap bits in use: %d of %d (%d%%)\n", + sfsinfo->map->n_files_in_map, sfsinfo->map->max_n_files, + percent(sfsinfo->map->n_files_in_map,
sfsinfo->map->max_n_files)); + x = storeDirGetUFSStats(SD->path, &totl_kb, &free_kb, &totl_in, &free_in); + if (0 == x) { + storeAppendPrintf(sentry, "Filesystem Space in use: %d/%d KB (%d%%)\n", + totl_kb - free_kb, + totl_kb, + percent(totl_kb - free_kb, totl_kb)); + storeAppendPrintf(sentry, "Filesystem Inodes in use: %d/%d (%d%%)\n", + totl_in - free_in, + totl_in, + percent(totl_in - free_in, totl_in)); + } + storeAppendPrintf(sentry, "Flags:"); + if (SD->flags.selected) + storeAppendPrintf(sentry, " SELECTED"); + if (SD->flags.read_only) + storeAppendPrintf(sentry, " READ-ONLY"); + storeAppendPrintf(sentry, "\n"); +#if OLD_UNUSED_CODE +#if !HEAP_REPLACEMENT + storeAppendPrintf(sentry, "LRU Expiration Age: %6.2f days\n", + (double) storeSfsDirExpiredReferenceAge(SD) / 86400.0); +#else + storeAppendPrintf(sentry, "Storage Replacement Threshold:\t%f\n", + heap_peepminkey(sd.repl.heap.heap)); +#endif +#endif /* OLD_UNUSED_CODE */ +} + +/* + * storeSfsDirReconfigure + * + * This routine is called when the given swapdir needs reconfiguring + */ +void +storeSfsDirReconfigure(SwapDir * sd, int index, char *path) +{ + char *token; + int i; + int size; + int l1; + int l2; + unsigned int read_only = 0; + + i = GetInteger(); + size = i << 10; /* Mbytes to kbytes */ + if (size <= 0) + fatal("storeSfsDirReconfigure: invalid size value"); + i = GetInteger(); + l1 = i; + if (l1 <= 0) + fatal("storeSfsDirReconfigure: invalid level 1 directories value"); + i = GetInteger(); + l2 = i; + if (l2 <= 0) + fatal("storeSfsDirReconfigure: invalid level 2 directories value"); + if ((token = strtok(NULL, w_space))) + if (!strcasecmp(token, "read-only")) + read_only = 1; + + /* just reconfigure it */ + if (size == sd->max_size) + debug(3, 1) ("Cache dir '%s' size remains unchanged at %d KB\n", + path, size); + else + debug(3, 1) ("Cache dir '%s' size changed to %d KB\n", + path, size); + sd->max_size = size; + if (sd->flags.read_only != read_only) + debug(3, 1) ("Cache dir '%s' now 
%s\n", + path, read_only ? "Read-Only" : "Read-Write"); + sd->flags.read_only = read_only; + return; +} + +void +storeSfsDirDump(StoreEntry * entry, const char *name, SwapDir * s) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) s->fsdata; + storeAppendPrintf(entry, "%s %s %s %d %d %d\n", + name, + "sfs", + s->path, + s->max_size >> 10, + sfsinfo->l1, + sfsinfo->l2); +} + +/* + * Only "free" the filesystem specific stuff here + */ +static void +storeSfsDirFree(SwapDir * s) +{ + sfsinfo_t *sfsinfo = (sfsinfo_t *) s->fsdata; + if (sfsinfo->swaplog_fd > -1) { + file_close(sfsinfo->swaplog_fd); + sfsinfo->swaplog_fd = -1; + } + filemapFreeMemory(sfsinfo->map); + xfree(sfsinfo); + s->fsdata = NULL; /* Will aid debugging... */ + +} + +char * +storeSfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath) +{ + LOCAL_ARRAY(char, fullfilename, SQUID_MAXPATHLEN); + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + int L1 = sfsinfo->l1; + int L2 = sfsinfo->l2; + if (!fullpath) + fullpath = fullfilename; + fullpath[0] = '\0'; + snprintf(fullpath, SQUID_MAXPATHLEN, "%s/%02X/%02X/%08X", + SD->path, + ((filn / L2) / L2) % L1, + (filn / L2) % L2, + filn); + return fullpath; +} + +/* + * storeSfsCleanupDoubleCheck + * + * This is called by storeCleanup() if -S was given on the command line. 
+ */ +static int +storeSfsCleanupDoubleCheck(SwapDir * sd, StoreEntry * e) +{ + struct stat sb; + if (stat(storeSfsDirFullPath(sd, e->swap_filen, NULL), &sb) < 0) { + debug(20, 0) ("storeSfsCleanupDoubleCheck: MISSING SWAP FILE\n"); + debug(20, 0) ("storeSfsCleanupDoubleCheck: FILENO %08X\n", e->swap_filen); + debug(20, 0) ("storeSfsCleanupDoubleCheck: PATH %s\n", + storeSfsDirFullPath(sd, e->swap_filen, NULL)); + storeEntryDump(e, 0); + return -1; + } + if (e->swap_file_sz != sb.st_size) { + debug(20, 0) ("storeSfsCleanupDoubleCheck: SIZE MISMATCH\n"); + debug(20, 0) ("storeSfsCleanupDoubleCheck: FILENO %08X\n", e->swap_filen); + debug(20, 0) ("storeSfsCleanupDoubleCheck: PATH %s\n", + storeSfsDirFullPath(sd, e->swap_filen, NULL)); + debug(20, 0) ("storeSfsCleanupDoubleCheck: ENTRY SIZE: %d, FILE SIZE: %d\n", + e->swap_file_sz, (int) sb.st_size); + storeEntryDump(e, 0); + return -1; + } + return 0; +} + +/* + * storeSfsDirParse + * + * Called when a *new* fs is being setup. + */ +void +storeSfsDirParse(SwapDir * sd, int index, char *path) +{ + char *token; + int i; + int size; + int l1; + int l2; + unsigned int read_only = 0; + sfsinfo_t *sfsinfo; + + i = GetInteger(); + size = i << 10; /* Mbytes to kbytes */ + if (size <= 0) + fatal("storeSfsDirParse: invalid size value"); + i = GetInteger(); + l1 = i; + if (l1 <= 0) + fatal("storeSfsDirParse: invalid level 1 directories value"); + i = GetInteger(); + l2 = i; + if (l2 <= 0) + fatal("storeSfsDirParse: invalid level 2 directories value"); + if ((token = strtok(NULL, w_space))) + if (!strcasecmp(token, "read-only")) + read_only = 1; + + sfsinfo = xmalloc(sizeof(sfsinfo_t)); + if (sfsinfo == NULL) + fatal("storeSfsDirParse: couldn't xmalloc() sfsinfo_t!\n"); + + sd->index = index; + sd->path = xstrdup(path); + sd->max_size = size; + sd->fsdata = sfsinfo; + sfsinfo->l1 = l1; + sfsinfo->l2 = l2; + sfsinfo->swaplog_fd = -1; + sfsinfo->map = NULL; /* Debugging purposes */ + sfsinfo->suggest = 0; + sd->flags.read_only = 
read_only; + sd->init = storeSfsDirInit; + sd->newfs = storeSfsDirNewfs; + sd->dump = storeSfsDirDump; + sd->freefs = storeSfsDirFree; + sd->dblcheck = storeSfsCleanupDoubleCheck; + sd->statfs = storeSfsDirStats; + sd->maintainfs = storeSfsDirMaintain; + sd->checkobj = storeSfsDirCheckObj; + sd->refobj = storeSfsDirRefObj; + sd->unrefobj = storeSfsDirUnrefObj; + sd->callback = NULL; + sd->sync = NULL; + sd->obj.create = storeSfsCreate; + sd->obj.open = storeSfsOpen; + sd->obj.close = storeSfsClose; + sd->obj.read = storeSfsRead; + sd->obj.write = storeSfsWrite; + sd->obj.unlink = storeSfsUnlink; + sd->log.open = storeSfsDirOpenSwapLog; + sd->log.close = storeSfsDirCloseSwapLog; + sd->log.write = storeSfsDirSwapLog; + sd->log.clean.start = storeSfsDirWriteCleanStart; + sd->log.clean.nextentry = storeSfsDirCleanLogNextEntry; + sd->log.clean.done = storeSfsDirWriteCleanDone; + + /* Initialise replacement policy stuff */ + sd->repl = createRemovalPolicy(Config.replPolicy); +} + +/* + * Initial setup / end destruction + */ +void +storeSfsDirDone(void) +{ + memPoolDestroy(sfs_state_pool); + sfs_initialised = 0; +} + +void +storeFsSetup_sfs(storefs_entry_t * storefs) +{ + assert(!sfs_initialised); + storefs->parsefunc = storeSfsDirParse; + storefs->reconfigurefunc = storeSfsDirReconfigure; + storefs->donefunc = storeSfsDirDone; + sfs_state_pool = memPoolCreate("SFS IO State data", sizeof(sfsstate_t)); + sfs_initialised = 1; +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/store_io_sfs.c Wed Feb 14 00:49:24 2007 @@ -0,0 +1,264 @@ + +/* + * $Id: store_io_sfs.c,v 1.1.2.1 2001/01/24 14:11:54 adri Exp $ + * + * DEBUG: section 79 Storage Manager SFS Interface + * AUTHOR: Duane Wessels + * + * SQUID Web Proxy Cache http://www.squid-cache.org/ + * ---------------------------------------------------------- + * + * Squid is the result of efforts by numerous individuals from + * the Internet community; see the CONTRIBUTORS file for full + * details. 
Many organizations have provided support for Squid's + * development; see the SPONSORS file for full details. Squid is + * Copyrighted (C) 2001 by the Regents of the University of + * California; see the COPYRIGHT file for full details. Squid + * incorporates software developed and/or copyrighted by other + * sources; see the CREDITS file for full details. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA. 
+ * + */ + +#include "squid.h" +#include "store_sfs.h" + + +static DRCB storeSfsReadDone; +static DWCB storeSfsWriteDone; +static void storeSfsIOCallback(storeIOState * sio, int errflag); +static CBDUNL storeSfsIOFreeEntry; + +/* === PUBLIC =========================================================== */ + +storeIOState * +storeSfsOpen(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, + STIOCB * callback, void *callback_data) +{ + sfileno f = e->swap_filen; + char *path = storeSfsDirFullPath(SD, f, NULL); + storeIOState *sio; + struct stat sb; + int fd; + debug(79, 3) ("storeSfsOpen: fileno %08X\n", f); + fd = file_open(path, O_RDONLY | O_BINARY); + if (fd < 0) { + debug(79, 3) ("storeSfsOpen: got failure (%d)\n", errno); + return NULL; + } + debug(79, 3) ("storeSfsOpen: opened FD %d\n", fd); + sio = CBDATA_ALLOC(storeIOState, storeSfsIOFreeEntry); + sio->fsstate = memPoolAlloc(sfs_state_pool); + + sio->swap_filen = f; + sio->swap_dirn = SD->index; + sio->mode = O_RDONLY; + sio->callback = callback; + sio->callback_data = callback_data; + cbdataLock(callback_data); + sio->e = e; + ((sfsstate_t *) (sio->fsstate))->fd = fd; + ((sfsstate_t *) (sio->fsstate))->flags.writing = 0; + ((sfsstate_t *) (sio->fsstate))->flags.reading = 0; + ((sfsstate_t *) (sio->fsstate))->flags.close_request = 0; + if (fstat(fd, &sb) == 0) + sio->st_size = sb.st_size; + store_open_disk_fd++; + + /* We should update the heap/dlink position here ! */ + return sio; +} + +storeIOState * +storeSfsCreate(SwapDir * SD, StoreEntry * e, STFNCB * file_callback, STIOCB * callback, void *callback_data) +{ + storeIOState *sio; + int fd; + int mode = (O_WRONLY | O_CREAT | O_TRUNC | O_BINARY); + char *path; + sfsinfo_t *sfsinfo = (sfsinfo_t *) SD->fsdata; + sfileno filn; + sdirno dirn; + + /* Allocate a number */ + dirn = SD->index; + filn = storeSfsDirMapBitAllocate(SD); + sfsinfo->suggest = filn + 1; + /* Shouldn't we handle a 'bitmap full' error here? 
*/ + path = storeSfsDirFullPath(SD, filn, NULL); + + debug(79, 3) ("storeSfsCreate: fileno %08X\n", filn); + fd = file_open(path, mode); + if (fd < 0) { + debug(79, 3) ("storeSfsCreate: got failure (%d)\n", errno); + return NULL; + } + debug(79, 3) ("storeSfsCreate: opened FD %d\n", fd); + sio = CBDATA_ALLOC(storeIOState, storeSfsIOFreeEntry); + sio->fsstate = memPoolAlloc(sfs_state_pool); + + sio->swap_filen = filn; + sio->swap_dirn = dirn; + sio->mode = mode; + sio->callback = callback; + sio->callback_data = callback_data; + cbdataLock(callback_data); + sio->e = (StoreEntry *) e; + ((sfsstate_t *) (sio->fsstate))->fd = fd; + ((sfsstate_t *) (sio->fsstate))->flags.writing = 0; + ((sfsstate_t *) (sio->fsstate))->flags.reading = 0; + ((sfsstate_t *) (sio->fsstate))->flags.close_request = 0; + store_open_disk_fd++; + + /* now insert into the replacement policy */ + storeSfsDirReplAdd(SD, e); + return sio; +} + +void +storeSfsClose(SwapDir * SD, storeIOState * sio) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + + debug(79, 3) ("storeSfsClose: dirno %d, fileno %08X, FD %d\n", + sio->swap_dirn, sio->swap_filen, sfsstate->fd); + if (sfsstate->flags.reading || sfsstate->flags.writing) { + sfsstate->flags.close_request = 1; + return; + } + storeSfsIOCallback(sio, 0); +} + +void +storeSfsRead(SwapDir * SD, storeIOState * sio, char *buf, size_t size, off_t offset, STRCB * callback, void *callback_data) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + + assert(sio->read.callback == NULL); + assert(sio->read.callback_data == NULL); + sio->read.callback = callback; + sio->read.callback_data = callback_data; + cbdataLock(callback_data); + debug(79, 3) ("storeSfsRead: dirno %d, fileno %08X, FD %d\n", + sio->swap_dirn, sio->swap_filen, sfsstate->fd); + sio->offset = offset; + sfsstate->flags.reading = 1; + file_read(sfsstate->fd, + buf, + size, + offset, + storeSfsReadDone, + sio); +} + +void +storeSfsWrite(SwapDir * SD, storeIOState * sio, char *buf, size_t 
size, off_t offset, FREE * free_func) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + debug(79, 3) ("storeSfsWrite: dirn %d, fileno %08X, FD %d\n", sio->swap_dirn, sio->swap_filen, sfsstate->fd); + sfsstate->flags.writing = 1; + file_write(sfsstate->fd, + offset, + buf, + size, + storeSfsWriteDone, + sio, + free_func); +} + +void +storeSfsUnlink(SwapDir * SD, StoreEntry * e) +{ + debug(79, 3) ("storeSfsUnlink: fileno %08X\n", e->swap_filen); + storeSfsDirReplRemove(e); + storeSfsDirMapBitReset(SD, e->swap_filen); + storeSfsDirUnlinkFile(SD, e->swap_filen); +} + +/* === STATIC =========================================================== */ + +static void +storeSfsReadDone(int fd, const char *buf, int len, int errflag, void *my_data) +{ + storeIOState *sio = my_data; + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + STRCB *callback = sio->read.callback; + void *their_data = sio->read.callback_data; + ssize_t rlen; + + debug(79, 3) ("storeSfsReadDone: dirno %d, fileno %08X, FD %d, len %d\n", + sio->swap_dirn, sio->swap_filen, fd, len); + sfsstate->flags.reading = 0; + if (errflag) { + debug(79, 3) ("storeSfsReadDone: got failure (%d)\n", errflag); + rlen = -1; + } else { + rlen = (ssize_t) len; + sio->offset += len; + } + assert(callback); + assert(their_data); + sio->read.callback = NULL; + sio->read.callback_data = NULL; + if (cbdataValid(their_data)) + callback(their_data, buf, (size_t) rlen); + cbdataUnlock(their_data); +} + +static void +storeSfsWriteDone(int fd, int errflag, size_t len, void *my_data) +{ + storeIOState *sio = my_data; + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + debug(79, 3) ("storeSfsWriteDone: dirno %d, fileno %08X, FD %d, len %d\n", + sio->swap_dirn, sio->swap_filen, fd, len); + sfsstate->flags.writing = 0; + if (errflag) { + debug(79, 0) ("storeSfsWriteDone: got failure (%d)\n", errflag); + storeSfsIOCallback(sio, errflag); + return; + } + sio->offset += len; + if (sfsstate->flags.close_request) + 
storeSfsIOCallback(sio, errflag); +} + +static void +storeSfsIOCallback(storeIOState * sio, int errflag) +{ + sfsstate_t *sfsstate = (sfsstate_t *) sio->fsstate; + debug(79, 3) ("storeSfsIOCallback: errflag=%d\n", errflag); + if (sfsstate->fd > -1) { + file_close(sfsstate->fd); + store_open_disk_fd--; + } + if (cbdataValid(sio->callback_data)) + sio->callback(sio->callback_data, errflag, sio); + cbdataUnlock(sio->callback_data); + sio->callback_data = NULL; + sio->callback = NULL; + cbdataFree(sio); +} + + +/* + * Clean up any references from the SIO before it gets released. + */ +static void +storeSfsIOFreeEntry(void *sio) +{ + memPoolFree(sfs_state_pool, ((storeIOState *) sio)->fsstate); +} --- /dev/null Wed Feb 14 00:48:56 2007 +++ squid/src/fs/sfs/store_sfs.h Wed Feb 14 00:49:24 2007 @@ -0,0 +1,56 @@ +/* + * store_sfs.h + * + * Internal declarations for the sfs routines + */ + +#ifndef __STORE_SFS_H__ +#define __STORE_SFS_H__ + +#include "sfs_defines.h" +#include "sfs_lib.h" + +struct _sfsinfo_t { + int swaplog_fd; + int l1; + int l2; + fileMap *map; + int suggest; +}; + +struct _sfsstate_t { + int fd; + struct { + unsigned int close_request:1; + unsigned int reading:1; + unsigned int writing:1; + } flags; +}; + +typedef struct _sfsinfo_t sfsinfo_t; +typedef struct _sfsstate_t sfsstate_t; + +/* The sfs_state memory pool */ +extern MemPool *sfs_state_pool; + +/* + * store dir stuff + */ +extern void storeSfsDirMapBitReset(SwapDir *, sfileno); +extern int storeSfsDirMapBitAllocate(SwapDir *); +extern char *storeSfsDirFullPath(SwapDir * SD, sfileno filn, char *fullpath); +extern void storeSfsDirUnlinkFile(SwapDir *, sfileno); +extern void storeSfsDirReplAdd(SwapDir * SD, StoreEntry *); +extern void storeSfsDirReplRemove(StoreEntry *); + +/* + * Store IO stuff + */ +extern STOBJCREATE storeSfsCreate; +extern STOBJOPEN storeSfsOpen; +extern STOBJCLOSE storeSfsClose; +extern STOBJREAD storeSfsRead; +extern STOBJWRITE storeSfsWrite; +extern STOBJUNLINK
storeSfsUnlink; + +#endif