---------------------
PatchSet 5050
Date: 2007/07/12 00:27:59
Author: amosjeffries
Branch: docs
Tag: (none)
Log:
Split Storage FS API and Removal Policy API documentation

Members:
    doc/Programming-Guide/12_StorageInterface.dox:1.1.2.2->1.1.2.3
    doc/Programming-Guide/12b_RemovalPolicy.dox:1.1->1.1.2.1

Index: squid3/doc/Programming-Guide/12_StorageInterface.dox
===================================================================
RCS file: /cvsroot/squid-sf//squid3/doc/Programming-Guide/Attic/12_StorageInterface.dox,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -u -r1.1.2.2 -r1.1.2.3
--- squid3/doc/Programming-Guide/12_StorageInterface.dox 11 Jul 2007 23:35:00 -0000 1.1.2.2
+++ squid3/doc/Programming-Guide/12_StorageInterface.dox 12 Jul 2007 00:27:59 -0000 1.1.2.3
@@ -1,7 +1,7 @@
 /**
 \page 12_StorageInterface Storage Interface
-\section Itroduction Introduction
+\section Introduction Introduction
 
 \par
 Traditionally, Squid has always used the Unix filesystem (UFS) to store
 cache objects on disk. Over the years, the
@@ -21,10 +21,11 @@
 \section BuildStructure Build structure
 
 \par
- The storage types live in squid/src/fs/ . Each subdirectory corresponds
+ The storage types live in src/fs/ . Each subdirectory corresponds
 to the name of the storage type. When a new storage type is implemented
 configure.in must be updated to autogenerate a Makefile in
- squid/src/fs/$type/ from a Makefile.in file.
+ src/fs/foo/ from a Makefile.in file.
+\todo DOCS: add template addition to configure.in for storage module addition.
 
 \par
 configure will take a list of storage types through the
@@ -34,15 +35,17 @@
 \par
 Each storage type must create an archive file
- in squid/src/fs/$type/.a . This file is automatically linked into
+ in src/fs/foo/.a . This file is automatically linked into
 squid at compile time.
 
 \par
- Each storefs must export a function named storeFsSetup_$type().
+ Each storefs must export a function named storeFsSetup_foo().
 This function is called at runtime to initialise each storage type.
 The list of storage types is passed through store_modules.sh
 to generate the initialisation function storeFsSetup(). This
 function lives in store_modules.c.
+\todo DOCS: find out what has replaced storeFsSetup() and store_modules.c
+
 \par
 Example of the automatically generated file:
 automatically generated by ./store_modules.sh ufs coss
 do not edit
@@ -53,29 +56,29 @@
 extern STSETUP storeFsSetup_coss;
 void storeFsSetup(void)
 {
- storeFsAdd("ufs", storeFsSetup_ufs);
- storeFsAdd("coss", storeFsSetup_coss);
+ storeFsAdd("ufs", storeFsSetup_ufs);
+ storeFsAdd("coss", storeFsSetup_coss);
 }
 \endcode
 
 \section InitStorageType Initialization of a storage type
 \par
- Each storage type initializes through the storeFsSetup_$type()
- function. The storeFsSetup_$type() function takes a single
+ Each storage type initializes through the storeFsSetup_foo()
+ function. The storeFsSetup_foo() function takes a single
 argument - a storefs_entry_t pointer. This pointer references
 the storefs_entry to initialise. A typical setup function is as
 follows:
 \code
 void
- storeFsSetup_ufs(storefs_entry_t *storefs)
+ storeFsSetup_foo(storefs_entry_t *storefs)
 {
- assert(!ufs_initialised);
- storefs->parsefunc = storeUfsDirParse;
- storefs->reconfigurefunc = storeUfsDirReconfigure;
- storefs->donefunc = storeUfsDirDone;
- ufs_state_pool = memPoolCreate("UFS IO State data", sizeof(ufsstate_t));
- ufs_initialised = 1;
+ assert(!foo_initialised);
+ storefs->parsefunc = storeFooDirParse;
+ storefs->reconfigurefunc = storeFooDirReconfigure;
+ storefs->donefunc = storeFooDirDone;
+ foo_state_pool = memPoolCreate("FOO IO State data", sizeof(foostate_t));
+ foo_initialised = 1;
 }
 \endcode
@@ -774,310 +777,4 @@
 move them to become the official state-holding log ready to be opened.
-
-\section ReplacementPolicyImplementation Replacement Policy Implementation
-
-\par
-The replacement policy can be updated during STOBJREAD/STOBJWRITE/STOBJOPEN/
-STOBJCLOSE as well as STREFOBJ and STUNREFOBJ. Care should be taken to
-only modify the relevant replacement policy entries in the StoreEntry.
-The responsibility of replacement policy maintainence has been moved into
-each SwapDir so that the storage code can have tight control of the
-replacement policy. Cyclic filesystems such as COSS require this tight
-coupling between the storage layer and the replacement policy.
-
-\section RemovalPolicyAPI Removal policy API
-
-\par
- The removal policy is responsible for determining in which order
- objects are deleted when Squid needs to reclaim space for new objects.
- Such a policy is used by a object storage for maintaining the stored
- objects and determining what to remove to reclaim space for new objects.
- (together they implements a replacement policy)
-
-\subsection API API
-
-\par
- It is implemented as a modular API where a storage directory or
- memory creates a policy of choice for maintaining it's objects,
- and modules registering to be used by this API.
-
-\subsubsection createRemovalPolicy createRemovalPolicy()
-
-\code
- RemovalPolicy policy = createRemovalPolicy(cons char *type, cons char *args)
-\endcode
-
-\par
- Creates a removal policy instance where object priority can be
- maintained
-
-\par
- The returned RemovalPolicy instance is cbdata registered
-
-\subsubsection policy.free policy.Free()
-
-\code
- policy->Free(RemovalPolicy *policy)
-\endcode
-
-\par
- Destroys the policy instance and frees all related memory.
-
-\subsubsection policy.Add policy.Add()
-
-\code
- policy->Add(RemovalPolicy *policy, StoreEntry *, RemovalPolicyNode *node)
-\endcode
-
-\par
- Adds a StoreEntry to the policy instance.
-
-\par
- datap is a pointer to where policy specific data can be stored
- for the store entry, currently the size of one (void *) pointer.
-
-\subsubsection policy.Remove policy.Remove()
-\code
- policy->Remove(RemovalPolicy *policy, StoreEntry *, RemovalPolicyNode *node)
-\endcode
-
-\par
- Removes a StoreEntry from the policy instance out of
- policy order. For example when an object is replaced
- by a newer one or is manually purged from the store.
-
-\par
- datap is a pointer to where policy specific data is stored
- for the store entry, currently the size of one (void *) pointer.
-
-\subsubsection policy.Referenced policy.Referenced()
-\code
- policy->Referenced(RemovalPolicy *policy, const StoreEntry *, RemovalPolicyNode *node)
-\endcode
-
-\par
- Tells the policy that a StoreEntry is going to be referenced. Called
- whenever a entry gets locked.
-
-\par
- node is a pointer to where policy specific data is stored
- for the store entry, currently the size of one (void *) pointer.
-
-\subsubsection policy.Dereferenced policy.Dereferenced()
-\code
- policy->Dereferenced(RemovalPolicy *policy, const StoreEntry *, RemovalPolicyNode *node)
-\endcode
-
-\par
- Tells the policy that a StoreEntry has been referenced. Called when
- an access to the entry has finished.
-
-\par
- node is a pointer to where policy specific data is stored
- for the store entry, currently the size of one (void *) pointer.
-
-\subsubsection policy.WalkInit policy.WalkInit()
-\code
- RemovalPolicyWalker walker = policy->WalkInit(RemovalPolicy *policy)
-\endcode
-
-\par
- Initiates a walk of all objects in the policy instance.
- The objects is returned in an order suitable for using
- as reinsertion order when rebuilding the policy.
-
-\par
- The returned RemovalPolicyWalker instance is cbdata registered
-
-\note The walk must be performed as an atomic operation
- with no other policy actions intervening, or the outcome
- will be undefined.
-
-\subsubsection walker.Next walker.Next()
-\code
- const StoreEntry *entry = walker->Next(RemovalPolicyWalker *walker)
-\endcode
-
-\par
- Gets the next object in the walk chain
-
-\par
- Return NULL when there is no further objects
-
-\subsubsectino walker.Done walker.Done()
-\code
- walker->Done(RemovalPolicyWalker *walker)
-\endcode
-
-\par
- Finishes a walk of the maintained objects, destroys
- walker.
-
-\subsubsection policy.PurgeInit policy.PurgeInit()
-\code
- RemovalPurgeWalker purgewalker = policy->PurgeInit(RemovalPolicy *policy, int max_scan)
-\endcode
-
-\par
- Initiates a search for removal candidates. Search depth is indicated
- by max_scan.
-
-\par
- The returned RemovalPurgeWalker instance is cbdata registered
-
-\note The walk must be performed as an atomic operation
- with no other policy actions intervening, or the outcome
- will be undefined.
-
-\subsubsection purgewalker.Next purgewalker.Next()
-\code
- StoreEntry *entry = purgewalker->Next(RemovalPurgeWalker *purgewalker)
-\endcode
-
-\par
- Gets the next object to purge. The purgewalker will remove each
- returned object from the policy.
-
-\par
- It is the polices responsibility to verify that the object
- isn't locked or otherwise prevented from being removed. What this
- means is that the policy must not return objects where
- storeEntryLocked() is true.
-
-\par
- Return NULL when there is no further purgeable objects in the policy.
-
-\subsubsection purgewalker.Done purgewalker.Done()
-
-\code
- purgewalker->Done(RemovalPurgeWalker *purgewalker)
-\endcode
-
-\par
- Finishes a walk of the maintained objects, destroys
- walker and restores the policy to it's normal state.
-
-\subsubsection policy.Stats policy.Stats()
-
-\code
- purgewalker->Stats(RemovalPurgeWalker *purgewalker, StoreEntry *entry)
-\endcode
-
-\par
- Appends statistics about the policy to the given entry.
-
-\subsection SourceLayout Source layout
-
-\par
- Policy implementations resides in src/repl/<name>/, and a make in
- such a directory must result in a object archive src/repl/<name>.a
- containing all the objects implementing the policy.
-
-\subsection InternalStructures Internal structures
-
-\subsubsection RemovalPolicy RemovalPolicy
-
-\code
- typedef struct _RemovalPolicy RemovalPolicy;
- struct _RemovalPolicy {
- char *_type;
- void *_data;
- void (*add)(RemovalPolicy *policy, StoreEntry *);
- ... // see the API definition above
- };
-\endcode
-
-\par
- The _type member is mainly for debugging and diagnostics purposes, and
- should be a pointer to the name of the policy (same name as used for
- creation)
-
-\par
- The _data member is for storing policy specific information.
-
-\subsubsection RemvalPolicyWalker RemovalPolicyWalker
-
-\code
- typedef struct _RemovalPolicyWalker RemovalPolicyWalker;
- struct _RemovalPolicyWalker {
- RemovalPolicy *_policy;
- void *_data;
- StoreEntry *(*next)(RemovalPolicyWalker *);
- ... // see the API definition above
- };
-\endcode
-
-
-\subsubsection RemovalPolicyNode _RemovalPolicyNode
-
-\code
- typedef struct _RemovalPolicyNode RemovalPolicyNode;
- struct _RemovalPolicyNode {
- void *data;
- };
-\endcode
-
-\par
- Stores policy specific information about a entry. Currently
- there is only space for a single pointer, but plans are to
- maybe later provide more space here to allow simple policies
- to store all their data "inline" to preserve some memory.
-
-\subsection PolicyRegistration Policy Registration
-
-\par
- Policies are automatically registered in the Squid binary from the
- policy selection made by the user building Squid. In the future this
- might get extended to support loadable modules. All registered
- policies are available to object stores which wishes to use them.
-
-\subsection PolicyInstanceCreation >Policy instance creation
-
-\par
- Each policy must implement a "create/new" function "RemovalPolicy *
- createRemovalPolicy_<name>(char *arguments)". This function
- creates the policy instance and populates it with at least the API
- methods supported. Currently all API calls are mandatory, but the
- policy implementation must make sure to NULL fill the structure prior
- to populating it in order to assure future API compability.
-
-\par
- It should also populate the _data member with a pointer to policy
- specific data.
-
-\subsection Walker Walker
-
-\par
- When a walker is created the policy populates it with at least the API
- methods supported. Currently all API calls are mandatory, but the
- policy implementation must make sure to NULL fill the structure prior
- to populating it in order to assure future API compatibility.
-
-\subsection DesignNotes Design notes/bugs
-
-\par
- The RemovalPolicyNode design is incomplete/insufficient. The intention
- was to abstract the location of the index pointers from the policy
- implementation to allow the policy to work on both on-disk and memory
- caches, but unfortunately the purge method for HEAP based policies
- needs to update this, and it is also preferable if the purge method
- in general knows how to clear the information. I think the agreement
- was that the current design of tightly coupling the two together
- on one StoreEntry is not the best design possible.
-
-\par
- It is debated if the design in having the policy index control the
- clean index writes is the correct approach. Perhaps not. Perhaps a
- more appropriate design is probably to do the store indexing
- completely outside the policy implementation (i.e. using the hash
- index), and only ask the policy to dump it's state somehow.
-
-\par
- The Referenced/Dereferenced() calls is today mapped to lock/unlock
- which is an approximation of when they are intended to be called.
- However, the real intention is to have Referenced() called whenever
- an object is referenced, and Dereferenced() only called when the
- object has actually been used for anything good.
-
 */
--- /dev/null Mon Jul 23 00:19:28 2007
+++ squid3/doc/Programming-Guide/12b_RemovalPolicy.dox Mon Jul 23 00:19:28 2007
@@ -0,0 +1,305 @@
+/**
+\page 12b_RemovalPolicy Replacement Policy Implementation
+
+\par
+The replacement policy can be updated during STOBJREAD/STOBJWRITE/STOBJOPEN/
+STOBJCLOSE as well as STREFOBJ and STUNREFOBJ. Care should be taken to
+only modify the relevant replacement policy entries in the StoreEntry.
+The responsibility of replacement policy maintainence has been moved into
+each SwapDir so that the storage code can have tight control of the
+replacement policy. Cyclic filesystems such as COSS require this tight
+coupling between the storage layer and the replacement policy.
+
+\section RemovalPolicyAPI Removal policy API
+
+\par
+ The removal policy is responsible for determining in which order
+ objects are deleted when Squid needs to reclaim space for new objects.
+ Such a policy is used by a object storage for maintaining the stored
+ objects and determining what to remove to reclaim space for new objects.
+ (together they implements a replacement policy)
+
+\par
+ It is implemented as a modular API where a storage directory or
+ memory creates a policy of choice for maintaining it's objects,
+ and modules registering to be used by this API.
+
+\subsection createRemovalPolicy createRemovalPolicy()
+
+\code
+ RemovalPolicy policy = createRemovalPolicy(cons char *type, cons char *args)
+\endcode
+
+\par
+ Creates a removal policy instance where object priority can be
+ maintained
+
+\par
+ The returned RemovalPolicy instance is cbdata registered
+
+\subsection policy.free policy.Free()
+
+\code
+ policy->Free(RemovalPolicy *policy)
+\endcode
+
+\par
+ Destroys the policy instance and frees all related memory.
+
+\subsection policy.Add policy.Add()
+
+\code
+ policy->Add(RemovalPolicy *policy, StoreEntry *, RemovalPolicyNode *node)
+\endcode
+
+\par
+ Adds a StoreEntry to the policy instance.
+
+\par
+ datap is a pointer to where policy specific data can be stored
+ for the store entry, currently the size of one (void *) pointer.
+
+\subsection policy.Remove policy.Remove()
+\code
+ policy->Remove(RemovalPolicy *policy, StoreEntry *, RemovalPolicyNode *node)
+\endcode
+
+\par
+ Removes a StoreEntry from the policy instance out of
+ policy order. For example when an object is replaced
+ by a newer one or is manually purged from the store.
+
+\par
+ datap is a pointer to where policy specific data is stored
+ for the store entry, currently the size of one (void *) pointer.
+
+\subsection policy.Referenced policy.Referenced()
+\code
+ policy->Referenced(RemovalPolicy *policy, const StoreEntry *, RemovalPolicyNode *node)
+\endcode
+
+\par
+ Tells the policy that a StoreEntry is going to be referenced. Called
+ whenever a entry gets locked.
+
+\par
+ node is a pointer to where policy specific data is stored
+ for the store entry, currently the size of one (void *) pointer.
+
+\subsection policy.Dereferenced policy.Dereferenced()
+\code
+ policy->Dereferenced(RemovalPolicy *policy, const StoreEntry *, RemovalPolicyNode *node)
+\endcode
+
+\par
+ Tells the policy that a StoreEntry has been referenced. Called when
+ an access to the entry has finished.
+
+\par
+ node is a pointer to where policy specific data is stored
+ for the store entry, currently the size of one (void *) pointer.
+
+\subsection policy.WalkInit policy.WalkInit()
+\code
+ RemovalPolicyWalker walker = policy->WalkInit(RemovalPolicy *policy)
+\endcode
+
+\par
+ Initiates a walk of all objects in the policy instance.
+ The objects is returned in an order suitable for using
+ as reinsertion order when rebuilding the policy.
+
+\par
+ The returned RemovalPolicyWalker instance is cbdata registered
+
+\note The walk must be performed as an atomic operation
+ with no other policy actions intervening, or the outcome
+ will be undefined.
+
+\subsection walker.Next walker.Next()
+\code
+ const StoreEntry *entry = walker->Next(RemovalPolicyWalker *walker)
+\endcode
+
+\par
+ Gets the next object in the walk chain
+
+\par
+ Return NULL when there is no further objects
+
+\subsection walker.Done walker.Done()
+\code
+ walker->Done(RemovalPolicyWalker *walker)
+\endcode
+
+\par
+ Finishes a walk of the maintained objects, destroys
+ walker.
+
+\subsection policy.PurgeInit policy.PurgeInit()
+\code
+ RemovalPurgeWalker purgewalker = policy->PurgeInit(RemovalPolicy *policy, int max_scan)
+\endcode
+
+\par
+ Initiates a search for removal candidates. Search depth is indicated
+ by max_scan.
+
+\par
+ The returned RemovalPurgeWalker instance is cbdata registered
+
+\note The walk must be performed as an atomic operation
+ with no other policy actions intervening, or the outcome
+ will be undefined.
+
+\subsection purgewalker.Next purgewalker.Next()
+\code
+ StoreEntry *entry = purgewalker->Next(RemovalPurgeWalker *purgewalker)
+\endcode
+
+\par
+ Gets the next object to purge. The purgewalker will remove each
+ returned object from the policy.
+
+\par
+ It is the polices responsibility to verify that the object
+ isn't locked or otherwise prevented from being removed. What this
+ means is that the policy must not return objects where
+ storeEntryLocked() is true.
+
+\par
+ Return NULL when there is no further purgeable objects in the policy.
+
+\subsection purgewalker.Done purgewalker.Done()
+
+\code
+ purgewalker->Done(RemovalPurgeWalker *purgewalker)
+\endcode
+
+\par
+ Finishes a walk of the maintained objects, destroys
+ walker and restores the policy to it's normal state.
+
+\subsection policy.Stats policy.Stats()
+
+\code
+ purgewalker->Stats(RemovalPurgeWalker *purgewalker, StoreEntry *entry)
+\endcode
+
+\par
+ Appends statistics about the policy to the given entry.
+
+\section SourceLayout Source layout
+
+\par
+ Policy implementations resides in src/repl/<name>/, and a make in
+ such a directory must result in a object archive src/repl/<name>.a
+ containing all the objects implementing the policy.
+
+\section InternalStructures Internal structures
+
+\subsection RemovalPolicy RemovalPolicy
+
+\code
+ typedef struct _RemovalPolicy RemovalPolicy;
+ struct _RemovalPolicy {
+ char *_type;
+ void *_data;
+ void (*add)(RemovalPolicy *policy, StoreEntry *);
+ ... // see the API definition above
+ };
+\endcode
+
+\par
+ The _type member is mainly for debugging and diagnostics purposes, and
+ should be a pointer to the name of the policy (same name as used for
+ creation)
+
+\par
+ The _data member is for storing policy specific information.
+
+\subsection RemvalPolicyWalker RemovalPolicyWalker
+
+\code
+ typedef struct _RemovalPolicyWalker RemovalPolicyWalker;
+ struct _RemovalPolicyWalker {
+ RemovalPolicy *_policy;
+ void *_data;
+ StoreEntry *(*next)(RemovalPolicyWalker *);
+ ... // see the API definition above
+ };
+\endcode
+
+
+\subsection RemovalPolicyNode _RemovalPolicyNode
+
+\code
+ typedef struct _RemovalPolicyNode RemovalPolicyNode;
+ struct _RemovalPolicyNode {
+ void *data;
+ };
+\endcode
+
+\par
+ Stores policy specific information about a entry. Currently
+ there is only space for a single pointer, but plans are to
+ maybe later provide more space here to allow simple policies
+ to store all their data "inline" to preserve some memory.
+
+\section PolicyRegistration Policy Registration
+
+\par
+ Policies are automatically registered in the Squid binary from the
+ policy selection made by the user building Squid. In the future this
+ might get extended to support loadable modules. All registered
+ policies are available to object stores which wishes to use them.
+
+\section PolicyInstanceCreation >Policy instance creation
+
+\par
+ Each policy must implement a "create/new" function "RemovalPolicy *
+ createRemovalPolicy_<name>(char *arguments)". This function
+ creates the policy instance and populates it with at least the API
+ methods supported. Currently all API calls are mandatory, but the
+ policy implementation must make sure to NULL fill the structure prior
+ to populating it in order to assure future API compability.
+
+\par
+ It should also populate the _data member with a pointer to policy
+ specific data.
+
+\section Walker Walker
+
+\par
+ When a walker is created the policy populates it with at least the API
+ methods supported. Currently all API calls are mandatory, but the
+ policy implementation must make sure to NULL fill the structure prior
+ to populating it in order to assure future API compatibility.
+
+\section DesignNotes Design notes/bugs
+
+\par
+ The RemovalPolicyNode design is incomplete/insufficient. The intention
+ was to abstract the location of the index pointers from the policy
+ implementation to allow the policy to work on both on-disk and memory
+ caches, but unfortunately the purge method for HEAP based policies
+ needs to update this, and it is also preferable if the purge method
+ in general knows how to clear the information. I think the agreement
+ was that the current design of tightly coupling the two together
+ on one StoreEntry is not the best design possible.
+
+\par
+ It is debated if the design in having the policy index control the
+ clean index writes is the correct approach. Perhaps not. Perhaps a
+ more appropriate design is probably to do the store indexing
+ completely outside the policy implementation (i.e. using the hash
+ index), and only ask the policy to dump it's state somehow.
+
+\par
+ The Referenced/Dereferenced() calls is today mapped to lock/unlock
+ which is an approximation of when they are intended to be called.
+ However, the real intention is to have Referenced() called whenever
+ an object is referenced, and Dereferenced() only called when the
+ object has actually been used for anything good.
+
+ */
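
The purge walk that the patched documentation describes for policy.PurgeInit(), purgewalker.Next() and purgewalker.Done() can be sketched as a small, self-contained C example. This is a minimal sketch under stated assumptions, not Squid code and not part of PatchSet 5050: ToyEntry, ToyList, ToyPolicy, ToyPurgeWalker, createToyPolicy() and demo_purge() are hypothetical names invented for illustration, the `locked` flag stands in for storeEntryLocked(), cbdata registration is left out, and a plain singly linked list stands in for whatever ordering a real policy maintains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT Squid's real definitions. */
typedef struct ToyEntry {
    int id;
    int locked;             /* plays the role of storeEntryLocked() */
    struct ToyEntry *next;  /* removal order, oldest entry first */
} ToyEntry;

typedef struct { ToyEntry *head; } ToyList;
typedef struct ToyPolicy ToyPolicy;
typedef struct ToyPurgeWalker ToyPurgeWalker;

struct ToyPolicy {
    const char *_type;      /* policy name, for diagnostics */
    void *_data;            /* policy specific data: a ToyList here */
    void (*Free)(ToyPolicy *);
    void (*Add)(ToyPolicy *, ToyEntry *);
    ToyPurgeWalker *(*PurgeInit)(ToyPolicy *, int max_scan);
};

struct ToyPurgeWalker {
    ToyPolicy *_policy;
    ToyEntry **link;        /* link field holding the next candidate */
    int max_scan, scanned;
    ToyEntry *(*Next)(ToyPurgeWalker *);
    void (*Done)(ToyPurgeWalker *);
};

static void toy_free(ToyPolicy *p) { free(p->_data); free(p); }

static void toy_add(ToyPolicy *p, ToyEntry *e)
{
    ToyList *list = p->_data;
    ToyEntry **link = &list->head;
    while (*link != NULL)           /* append at the tail */
        link = &(*link)->next;
    e->next = NULL;
    *link = e;
}

static ToyEntry *toy_purge_next(ToyPurgeWalker *w)
{
    while (*w->link != NULL && w->scanned < w->max_scan) {
        ToyEntry *e = *w->link;
        w->scanned++;
        if (e->locked) {            /* never hand back a locked entry */
            w->link = &e->next;
            continue;
        }
        *w->link = e->next;         /* returned entries leave the policy */
        e->next = NULL;
        return e;
    }
    return NULL;                    /* nothing purgeable within max_scan */
}

static void toy_purge_done(ToyPurgeWalker *w) { free(w); }

static ToyPurgeWalker *toy_purge_init(ToyPolicy *p, int max_scan)
{
    ToyPurgeWalker *w = calloc(1, sizeof(*w));  /* zero-fill, then populate */
    assert(w != NULL);
    w->_policy = p;
    w->link = &((ToyList *)p->_data)->head;
    w->max_scan = max_scan;
    w->Next = toy_purge_next;
    w->Done = toy_purge_done;
    return w;
}

static ToyPolicy *createToyPolicy(void)
{
    ToyPolicy *p = calloc(1, sizeof(*p));       /* zero-fill, then populate */
    assert(p != NULL);
    p->_data = calloc(1, sizeof(ToyList));
    assert(p->_data != NULL);
    p->_type = "toy";
    p->Free = toy_free;
    p->Add = toy_add;
    p->PurgeInit = toy_purge_init;
    return p;
}

/* Adds four entries (the second one locked), purges with the given scan
 * depth, and returns how many entries were handed back for removal. */
int demo_purge(int max_scan)
{
    ToyEntry e[4] = { {1, 0, NULL}, {2, 1, NULL}, {3, 0, NULL}, {4, 0, NULL} };
    ToyPolicy *p = createToyPolicy();
    int purged = 0;
    for (int i = 0; i < 4; i++)
        p->Add(p, &e[i]);
    ToyPurgeWalker *w = p->PurgeInit(p, max_scan);
    for (ToyEntry *victim; (victim = w->Next(w)) != NULL; purged++)
        assert(!victim->locked);
    w->Done(w);
    p->Free(p);
    return purged;
}
```

With these four entries, demo_purge(3) examines entries 1, 2 and 3, skips the locked entry 2, and returns 2. A deeper scan such as demo_purge(10) also reaches entry 4 and returns 3. The walk runs to completion before any other policy call is made, in line with the atomicity note in the documentation.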