Sunday, March 8, 2009

Session-based object instantiation with memcached

Every so often I can't help but ponder the number of objects that get instantiated for a typical request once a framework-based application grows large. At runtime, a script sets up everything it requires, allocating memory for variables and objects, and all of this gets torn down at the end of execution. There is no persistence beyond the notion of a session. So when a user requests login.php and subsequently requests, say, MenuController.php, PHP in essence runs two disparate scripts without caring whether they come from the same chap; the only link is the shared session and whatever information is stored in it. This implies that an object instantiated in login.php will no longer be present in MenuController.php and needs to be instantiated again, unless it is stored in the session (and the code explicitly uses the object from the session). So there will be a lot of object construction and destruction going on while the user of a PHP web app goes about her business within the same app. This is where object caching with memcached helps, but I thought another paradigm on top of it might be useful or feasible:

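class SessionObjectFactory
{
    /**
     * Returns the session-wide instance of $class, creating it on first use.
     * Assumes session_start() has already been called and that $class is
     * loaded (or autoloadable) before the session data is unserialized.
     */
    public static function getInstance($class)
    {
        if (!isset($_SESSION[$class])) {
            $_SESSION[$class] = new $class;
        }
        return $_SESSION[$class];
    }
}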
So, every time an object needs to be instantiated, we write $newObj = SessionObjectFactory::getInstance("Some_class_name"); instead of $newObj = new Some_class_name;. This is typical of a singleton, but because a session is used as the store, we get an object that persists along with the session and is specific to it. Hence, the object can be user-specific and still persist. Garbage collection is inherently managed when session_destroy() is called as usual. Furthermore, memcache can be used as the session handler, speeding up access to instantiated objects. (For more information on doing this, see phpslacker's session-clustering with memcache.)

Anyway, the objective of the whole shebang is to minimise resource use in situations where objects are instantiated exuberantly. This paradigm has the utter inconvenience of having to instantiate objects without the perennial new keyword, but heck, other solutions seem to require expressive effort as well. It is also more useful for applications that need to instantiate a lot of complex objects and maintain sessions that involve heavy state switching (e.g. back-end CMSes). It is of course not relevant for applications that shouldn't be bothered with sessions in the first place. What do you think?
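
To make the persistence concrete, here is a minimal sketch of what happens to a factory-created object across two requests. It is only an illustration under a few assumptions: Menu is a hypothetical class, $_SESSION is used as a plain array without session_start() so the script runs from the command line, and a serialize()/unserialize() round trip stands in (roughly) for what the session handler does between requests.

```php
<?php
// Hypothetical example class; the static counter records constructor calls.
class Menu
{
    public static $constructed = 0;
    public $items = array();

    public function __construct()
    {
        self::$constructed++;
        $this->items = array('Home', 'Settings');
    }
}

class SessionObjectFactory
{
    public static function getInstance($class)
    {
        if (!isset($_SESSION[$class])) {
            $_SESSION[$class] = new $class;
        }
        return $_SESSION[$class];
    }
}

// "Request 1": assumption, no session_start(), so $_SESSION is a plain array.
$_SESSION = array();
$first = SessionObjectFactory::getInstance('Menu');
$again = SessionObjectFactory::getInstance('Menu');
$sameInstance = ($first === $again);

// Between requests: roughly what the session handler does with the data.
$stored = serialize($_SESSION);
$_SESSION = unserialize($stored);

// "Request 2": the factory finds the revived object and skips the constructor.
$second = SessionObjectFactory::getInstance('Menu');
$constructorRuns = Menu::$constructed;
```

Within one request the factory hands back the very same instance. After the round trip, $second is a different instance with equal state, and the constructor has still run only once; the cost that remains is the unserialize step itself.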

8 comments:

  1. Sessions do not store "objects"; they store a serialized string representing that object.

    It still has to be "created" and if that object held any resources they are no longer valid and have to be redone.

    Complex objects will still be complex and will still carry overhead, as they are re-created from the serialized data.

    But in the end only objects that store data pulled from the database or other "slow" sources and therefore can skip talking to those sources will benefit from such things.

    If you used this method for all factory methods, it would add unneeded overhead and an overload of useless default properties and data in your memcache.

  2. The question is: What do you want to achieve?

    Your code implements the singleton pattern with session persistence. If this is what you need then fine.

    But I understand your post to be aimed more at performance considerations, and until proven otherwise, I doubt you'll gain much. Each object you want to either instantiate or revive from persistent memory needs its class code to be parsed first.

    If that's your problem because your code base is getting bigger, then I'd rather try an opcode cache such as APC.

  3. Have you actually come across performance problems with the default behaviour, or do you just have a hunch it must be bad so you'll "optimise" it out just in case?

  4. Data in the session gets serialized and torn down with the PHP process just as much as any other object. Unless your object instantiation has more overhead than PHP object deserialization (which in itself creates a "new" instance anyway), this is just going to make it worse.

    Serializing the object into memcached is much cleaner and won't pollute session data with things that don't naturally belong in it.

  5. This seems like an interesting concept, but I worry that it sacrifices too much in the way of loose coupling if you're using this throughout an entire framework. If you're working in a framework like Zend, you lose a lot of what makes it unique and powerful if all of your classes depend on a SessionObjectFactory class to be instantiated.

  6. Where objects have significant cost for initialization (lotta function calls/parsing files/DB queries), this is a good thing. E.g. a permissions/roles object for a user (store in session), or an app config object that has to parse ini/yaml/xml files (store serialized, but not in sessions!).

    Where this won't get you anything is simple objects with low construction cost (e.g. controller objects). Remember a memcache get doesn't do any magic: it just pulls a string and calls unserialize(). I.e. PHP still has to execute the bytecode to define the class and functions in your app, and this may take quite a bit of the request time.

    Guess I'm saying, don't throw out the "new" operator. :)

  7. Thanks for all the feedback! Some further thoughts:

    Deserialization from memcached involves memory, while new object instantiation may require file access (barring opcode caches).
    Opcode caches are great, but we need further session specificity if we want to give resource priority to users with long logins and plenty of activity.

  8. I'm actually working on a very similar idea at the moment. Only I skip the session part and save/restore directly from memcached. I agree with others here in that this should not be used blindly with every object. I use this sort of system mainly for business domain objects, because often these objects hit the database to fetch business-specific data.

    For example, I use this for a Customer object. This type of object will often have to select a row from a DB associated with a Customer ID. This happens on every request. But by caching the object in memcached, I save that DB call. Care must be taken when modifying the data and invalidating the cached object.

    For a set of Customers, I only need to hold the set of IDs in a list, as opposed to querying all data for all customers and holding it in a multi-dimensional array.

    I also use magic methods (__get, __set) to access object fields so I can intercept these requests and use lazy-loading. That way, if I want $customer->orders, I won't query the DB for orders upon instantiation. Oh, and each subsequent request for that customer's orders is pulled from cache, even across requests.

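
The lazy-loading idea from the last comment (intercepting $customer->orders with __get and serving repeat lookups from cache) might look roughly like the sketch below. Everything here is hypothetical: ArrayCache is an in-memory stand-in for a memcached client, FakeOrderTable stands in for the database, and the cache key format is made up for illustration. It is not the commenter's actual code.

```php
<?php
// Hypothetical stand-in for a memcached client: get() returns false on a miss.
class ArrayCache
{
    private $data = array();

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : false;
    }

    public function set($key, $value)
    {
        $this->data[$key] = $value;
    }
}

// Hypothetical stand-in for the database; counts how often it is queried.
class FakeOrderTable
{
    public $queries = 0;

    public function findByCustomer($customerId)
    {
        $this->queries++;
        return array('order-' . $customerId . '-1', 'order-' . $customerId . '-2');
    }
}

class Customer
{
    private $id;
    private $cache;
    private $orderTable;

    public function __construct($id, $cache, $orderTable)
    {
        // Nothing is loaded here; orders are fetched on first access.
        $this->id = $id;
        $this->cache = $cache;
        $this->orderTable = $orderTable;
    }

    // Called for reads of inaccessible properties such as $customer->orders.
    public function __get($name)
    {
        if ($name !== 'orders') {
            throw new InvalidArgumentException("Unknown field: $name");
        }
        $key = 'customer:' . $this->id . ':orders';
        $orders = $this->cache->get($key);
        if ($orders === false) {
            // Cache miss: query the "database" once and remember the result.
            $orders = $this->orderTable->findByCustomer($this->id);
            $this->cache->set($key, $orders);
        }
        return $orders;
    }
}

$cache = new ArrayCache();
$table = new FakeOrderTable();

$customer = new Customer(42, $cache, $table);
$queriesAfterConstruct = $table->queries; // construction alone queries nothing
$orders = $customer->orders;              // first access queries the table
$ordersAgain = $customer->orders;         // second access comes from the cache

// A fresh Customer sharing the cache, as a later request would, skips the query.
$later = new Customer(42, $cache, $table);
$ordersLater = $later->orders;
$totalQueries = $table->queries;
```

As the comment warns, any code path that modifies a customer's orders would also have to delete or overwrite the cached entry, otherwise later requests keep reading stale data.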