GAE

Magic Properties on Google AppEngine

Update to my previous Google AppEngine auto-updating properties post.

In my last post I wrote: "[...] however there is still an issue, for now the MagicProperty is only updated the first time the property it affects is set"

I fixed that issue. MagicProperty now keeps a local cache on the model: if the value of the property it depends on changes, the cache is no longer valid and the value is recomputed; otherwise the already computed value is returned. There was another issue I had to deal with, though. The cache was not being written to the datastore, which meant that upon getting the data back from the datastore we recomputed the value anyway. Fixing that means the programmer now has to do some extra work: they have to add an unindexed cache property to the model and pass it into the MagicProperty to be used as the caching variable. Within save() (put() in the docstring example below) they also have to prime the cache by reading the to-be-computed property into a temporary variable, just in case the cache variables have not been primed yet. This has not yet been documented other than in my SVN history.
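
To make the caching idea concrete, here is a minimal sketch in plain Python, without the datastore. The names DerivedValue and Post are made up for illustration and are not part of MagicProperty; the only point is that the derived value is recomputed when the SHA-1 of the source value no longer matches the stored hash.

```python
import hashlib


class DerivedValue(object):
    """Descriptor that recomputes func(source) only when the source changes."""

    def __init__(self, source_attr, func):
        self.source_attr = source_attr
        self.func = func
        self.calls = 0  # counts recomputations, for demonstration only

    def __get__(self, instance, owner):
        if instance is None:
            return self
        current = getattr(instance, self.source_attr)
        digest = hashlib.sha1(current.encode("utf-8")).hexdigest()
        # One cache dict per instance, keyed by descriptor object.
        cache = instance.__dict__.setdefault("_derived_cache", {})
        entry = cache.get(self)
        if entry is not None and entry[0] == digest:
            return entry[1]  # cache hit: the source has not changed
        value = self.func(current)  # cache miss: recompute and store
        cache[self] = (digest, value)
        self.calls += 1
        return value


class Post(object):
    chars = DerivedValue("title", len)

    def __init__(self, title):
        self.title = title


post = Post(u"Hello")
first = post.chars    # computed
second = post.chars   # served from the cache
post.title = u"How are you?"
third = post.chars    # title changed, so it is recomputed
```

In the real property the hash lives in a datastore-backed property rather than in the instance dictionary, which is what lets the cache survive a round trip through the datastore.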

We still don't allow setting the magic property directly. However, when an entity comes back from the datastore, make_value_from_datastore is called and returns a _MagicDatastore. As long as the property is set with an instance of _MagicDatastore, __set__ lets it through, so we can take the value from the datastore without having to recompute it.
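
The set-side trick can be sketched the same way. This is a plain-Python illustration with made-up names (_FromStore, ReadOnlyDerived, Record), not the AppEngine API: ordinary assignment is refused, but a value wrapped in the private marker class is let through.

```python
class _FromStore(object):
    """Marker wrapper: only the loading code is expected to create these."""

    def __init__(self, value):
        self.value = value


class ReadOnlyDerived(object):
    """Data descriptor that rejects assignment unless the value is wrapped."""

    def __init__(self, name):
        self.slot = "_" + name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return getattr(instance, self.slot, None)

    def __set__(self, instance, value):
        if isinstance(value, _FromStore):
            # Unwrap and store: this path stands in for a datastore load.
            setattr(instance, self.slot, value.value)
        else:
            raise AttributeError("derived value may not be set directly")


class Record(object):
    chars = ReadOnlyDerived("chars")


record = Record()
record.chars = _FromStore(5)      # allowed: wrapped value
try:
    record.chars = 99             # refused: plain value
    rejected = False
except AttributeError:
    rejected = True
```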

Read the code for an example on how to use this.

Code is added below, feel free to use it as you wish under the license that is attached.

###
 # Copyright (c) 2010 Bert JW Regeer;
 #
 # Permission to use, copy, modify, and distribute this software for any
 # purpose with or without fee is hereby granted, provided that the above
 # copyright notice and this permission notice appear in all copies.
 #
 # THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 # WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 # MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 # ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 # WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 # ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 # OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 #
###

import logging
import hashlib
from google.appengine.ext import db


def MagicProperty(prop, magic_func=None, cache_prop=None, pass_instance=False, *args, **kw):
	if magic_func:
		# No pants required.
		return _MagicProperty(prop, magic_func, cache_prop, pass_instance, *args, **kw)
	else:
		# We are putting some pants on the function.
		def pants(magic_func):
			return _MagicProperty(prop, magic_func, cache_prop, pass_instance, *args, **kw)
		return pants

class _MagicDatastore():
	"""This is an internal class for _MagicProperty."""
	
	def __init__(self, val):
		self.value = val
	
	def retval(self):
		return self.value

class _MagicProperty(db.Property):
	"""MagicProperty which will modify output based on a function that it is given.
	
	This has several ways in which it may be called:
	
	In this example we create the model, no caching is done, so any data that is returned from the datastore
	will be erased the first time that the MagicProperty is accessed and recomputed.
	
		class MagicTest(db.Model):
			title = db.StringProperty(required=True)
			chars = utils.MagicProperty(title, len, required=True)
	
		mytest = MagicTest(title="It was for the good of the school!")
		
		>>> print mytest.title
		It was for the good of the school!
		>>> print mytest.chars
		34
		
		mytest = MagicTest.all().get()
		
		>>> print mytest.title
		Hello
		>>> print mytest.chars		
		5	# Do note, this is recalculated the first time it is called, as long as title does not change it won't be recomputed.
	
	In this example we create the model, and we also create a caching property so that even when we get values back 
	from the datastore we use the cached computed value rather than running the function again. Do note that this requires overriding put()
	to prime the cache as there is currently no way to specify that certain properties should be "saved" before others.
	
		class MagicTesting(db.Model):
			title = db.StringProperty(required=True)
			cache = db.UnindexedProperty()
			chars = utils.MagicProperty(title, len, cache_prop=cache, required=True)
		
			def put(self, *args, **kw):
				prime_cache = self.chars
				return super(MagicTesting, self).put(*args, **kw)
		
		mytesting = MagicTesting(title="Get the pocket knife out of my boot.")
	
		>>> print mytesting.title
		Get the pocket knife out of my boot.
		>>> print mytesting.chars
		36
		
		mytesting = MagicTesting.all().get()
		
		>>> print mytesting.title
		How are you?
		>>> print mytesting.chars	
		12	# This is not recomputed so long as the title has not changed, however it uses more datastore space to store a hash.
		
		
	Inspired by: 
	http://appengine-cookbook.appspot.com/recipe/custom-model-properties-are-cute
	http://code.google.com/appengine/articles/extending_models.html
	http://googleappengine.blogspot.com/2009/07/writing-custom-property-classes.html
	
	"""
	def __init__(self, prop, magic_func, cache_prop, pass_instance, *args, **kw):
		"""
		Extra parameters you can give this initializer.
		
			prop		= Property to be acted upon
			magic_func	= The function to be called when the property is accessed
			cache_prop	= The property that can hold our cache, I suggest it is a db.UnindexedProperty() since it just stores sha1 hashes
		"""
		super(_MagicProperty, self).__init__(*args, **kw)
		self.magic_func = magic_func
		self.magic_prop = prop
		self.magic_cache = cache_prop
		self.magic_pass = pass_instance
		
	def get_cache_val(self, model_instance, class_instance):
		if self.magic_cache is not None:
			return self.magic_cache.__get__(model_instance, class_instance)
		return getattr(model_instance, self.attr_name() + "orig", None)
			
	def set_cache_val(self, model_instance, val):
		val = hashlib.sha1(val).hexdigest()
		
		if self.magic_cache is not None:
			self.magic_cache.__set__(model_instance, val)
		setattr(model_instance, self.attr_name() + "orig", val)
	
	def attr_name(self):
		# In google.appengine.ext.db there is an explicit warning not to use this method, so we test for it first.
		if getattr(self, "_attr_name", None):
			return self._attr_name()
		else:
			return "_" + self.name
	
	def __get__(self, model_instance, class_instance):
		if model_instance is None:
			return self
		
		cur = self.magic_prop.__get__(model_instance, class_instance)
		cur = cur.encode('utf-8')		
		last = self.get_cache_val(model_instance, class_instance)
		if last == hashlib.sha1(cur).hexdigest():
			logging.info("Cache hit: %s" % (cur))
			return getattr(model_instance, self.attr_name(), None)
		
		logging.info("Cache miss: %s" % (cur))
		
		magic_done = u""
		if self.magic_pass:
			magic_done = self.magic_func(model_instance, cur)
		else:
			magic_done = self.magic_func(cur)
		
		# Set the attribute in the model
		setattr(model_instance, self.attr_name(), magic_done)
		self.set_cache_val(model_instance, cur)
		
		return magic_done
		
	
	def __set__(self, model_instance, value):
		if isinstance(value, _MagicDatastore):
			setattr(model_instance, self.attr_name(), value.retval())
		else:
			raise db.DerivedPropertyError("MagicProperty is magic. Magic may not be modified.")
	
	def make_value_from_datastore(self, value):
		return _MagicDatastore(value)
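
The pants wrapper in MagicProperty is what lets the same function be used either directly, as in MagicProperty(title, len), or as a decorator when magic_func is left out. A stripped-down sketch of that pattern follows, using a hypothetical derived helper that just returns a dict instead of a property; the slug function is likewise only an illustration.

```python
def derived(source, func=None, **options):
    """Return a spec directly, or a decorator when func is omitted."""
    def build(f):
        return {"source": source, "func": f, "options": options}
    if func is not None:
        return build(func)      # direct form: derived("title", len)
    return build                # decorator form: @derived("title")


direct = derived("title", len, required=True)


@derived("title", required=True)
def slug(value):
    # Hypothetical slug function, for illustration only.
    return value.lower().replace(" ", "-")
```

After decoration, slug is whatever the factory built (here a dict), in the same way that a decorated function in a model body would end up being the _MagicProperty instance.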

Auto-updating Properties in Google AppEngine Models

EDIT: I have an updated version of this code available in a newer post; please go to Magic Properties on Google AppEngine for the updated information. For the generic information that has not changed, and the idea behind the code, please continue below.

----

In Google AppEngine a model may contain many different properties, and from this model you can generate a form using djangoforms. This is all well and good until you have a required property on a model that needs to be calculated from something the user types in. For example, a blog entry generally has a "slug" which is generated from the title. In Google AppEngine's models, a required property has to be set at object creation time.

Djangoforms has a form.save(commit=False) that allows one to get back an unsaved Model. However, since slug is not in the form (it is auto-generated), it cannot be set at model creation time. Even adding a hidden field to the form with the name of the property is not going to satisfy the model, since a value of None is considered to be empty, which means it does not satisfy required=True.

After trying a multitude of different options I ran across an article about extending models that specifically talked about adding custom properties that are saved into the DB and can be searched on. This got me digging, and I found a post on the Google AppEngine blog about the exact same thing, using yet another example. The one that got me on the right track was the post by Rodrigo Moraes on his custom model properties.

I came up with the following fairly quickly after also taking a look at google.appengine.ext.db within the Google AppEngine framework. However, there is still an issue: for now the MagicProperty is only updated the first time the property it affects is set. I am not yet sure how to work around that. I will have to do some more digging; maybe there is a way to get notified if another property gets modified or updated.

import logging
from google.appengine.ext import db


def MagicProperty(prop, magic_func=None, *args, **kw):
	if magic_func:
		# No pants required.
		return _MagicProperty(prop, magic_func, *args, **kw)
	else:
		# We are putting some pants on the function.
		def pants(magic_func):
			return _MagicProperty(prop, magic_func, *args, **kw)
		return pants


class _MagicProperty(db.Property):
	"""MagicProperty which will modify output based on a function that it is given.
	
	Inspired by: 
	http://appengine-cookbook.appspot.com/recipe/custom-model-properties-are-cute
	http://code.google.com/appengine/articles/extending_models.html
	http://googleappengine.blogspot.com/2009/07/writing-custom-property-classes.html
	
	"""
	def __init__(self, prop, magic_func, *args, **kw):
		super(_MagicProperty, self).__init__(*args, **kw)
		self.magic_func = magic_func
		self.magic_prop = prop
	
	def attr_name(self):
		# In google.appengine.ext.db there is an explicit warning not to use this method, so we test for it first.
		if getattr(self, "_attr_name", None):
			return self._attr_name()
		else:
			return "_" + self.name
	
	def __get__(self, model_instance, class_instance):
		if model_instance is None:
			return self
		
		# TODO: If the property that this one is functioning on is changed, we need a way to know that so that we recalculate
		
		# Original __get__ in google.appengine.ext.db has getattr to retrieve the value from the model instance
		magic_done = getattr(model_instance, self.attr_name(), None)
		if magic_done is None:
			magic_done = self.magic_func(self.magic_prop.__get__(model_instance, class_instance))
			
			# Set the attribute in the model
			setattr(model_instance, self.attr_name(), magic_done)
		return magic_done
		
	
	def __set__(self, *args):
		raise db.DerivedPropertyError("MagicProperty is magic. Magic may not be modified.")

The approach that the DateTimeProperty takes is one that I can't use, because my class is dependent upon an external property, not on whether or not a value was set when the object was instantiated.

Google AppEngine is posing some rather interesting challenges. Although, for slugs it may not be entirely bad that it can only be set once, as that means title changes on the article won't cause the slug to change possibly breaking old links. This definitely requires more research and more thinking.

Interesting bug

I was starting to see something interesting in the output from my Google AppEngine based project. I was automatically creating a link based on a config setting, named SETTINGS["contact"], and the code would work perfectly the first time the page was created and loaded, however multiple refreshes of the same page would show that the link was gaining slashes even when it shouldn't have.

The logic was as follows:

if contact is "mailto:" then do nothing
else
        if contact does not start with "/" or not with "http"
                contact = "/" + contact

This code would work perfectly the first time around. If I had tested properly by also setting contact = "/contact", I might have caught this bug faster.

However, since that piece of code was only being called once per page view, it worked perfectly as far as I was aware. The thing is, I started seeing more and more slashes being prepended to contact each time I refreshed the page. I looked at the code above and did not see that my logic should have contained an "and" instead of an "or". Even so, if the code is being run once per page view, the variable "contact" that started off containing the word "contactme" should only turn into "/contactme", and not "////contactme" on the fourth page load.

Each time I added a logging or debugging statement and refreshed the page, it would go back to "/contactme". I would change another thing, or add another debug statement, and "/contactme" was there staring me in the face. It took me a little while before I hit refresh a couple of times and noticed that my debug output now contained what I had seen beforehand. That is when it hit me: Google AppEngine caches imported modules, and that includes my global variables. As long as I did not modify any of the files in my project it would happily use the cached variable, which after the first load would contain "/contactme" and then, because of my use of "or" instead of "and", would become "//contactme" on the second.

Google AppEngine has some very aggressive caching, which may well be required to have it run at the scale Google wants it to run. This is just another warning for myself that I need to watch out when I modify global variables, because the modified values stick around across different requests.
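
The effect is easy to reproduce outside AppEngine. In this sketch a plain module-level dict stands in for the cached settings module, and buggy_contact_link is a made-up name for the original logic; reading 'is "mailto:"' as 'starts with "mailto:"' is an assumption.

```python
SETTINGS = {"contact": "contactme"}  # module-level: survives between requests


def buggy_contact_link():
    """Original logic, including the 'or' that should have been 'and'."""
    contact = SETTINGS["contact"]
    if not contact.startswith("mailto:"):
        # Always true for "/contactme": it does not start with "http".
        if not contact.startswith("/") or not contact.startswith("http"):
            contact = "/" + contact
    SETTINGS["contact"] = contact  # the mutation that leaks into later requests
    return contact


# Four "page loads" served by the same cached module.
links = [buggy_contact_link() for _ in range(4)]
```

Each call sees the value left behind by the previous one, so the link gains one slash per page load until the module is reloaded.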

This is the logic I ended up with:

if contact is "mailto:" or contact starts with "http" then do nothing
else
        if contact does not start with "/"
                contact = "/" + contact

Works exactly how I had expected it to work.
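
As a rough Python sketch of that final logic: contact_link is a hypothetical name, 'is "mailto:"' is again read as 'starts with "mailto:"', and returning the result instead of writing it back into the global is a choice made for this sketch, not necessarily what the project code does.

```python
SETTINGS = {"contact": "contactme"}


def contact_link(contact):
    """Return a usable link without touching the global settings."""
    if contact.startswith("mailto:") or contact.startswith("http"):
        return contact
    if not contact.startswith("/"):
        return "/" + contact
    return contact


# Repeated "page loads" now give the same answer every time.
links = [contact_link(SETTINGS["contact"]) for _ in range(4)]
```

Because the function gives the same result when fed its own output, it would stay correct even if the value were written back into the cached global.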

Followup on my previous post; workaround choices not accepting tuples in AppEngine

In my previous post I was contemplating moving the forms back in with the model, and I have now completed that transition. The form now lives in the same file as the model itself, which has made maintaining the two much easier since they are practically tied together. Google AppEngine also does not allow one to set the choices of a model property to something along the lines of (("item", "user friendly item"), ("item2", "User friendly item 2")), which would let a form show the user the friendly names while the backend stores "item". The following is a work-around that solves the problem.

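from django import forms
from google.appengine.ext import db
from google.appengine.ext.db import djangoforms

REDIRECT_CHOICES = (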
	(301, "301 - Permanent Redirect"),
	(302, "302 - Temporary Redirect"),
	(403, "403 - Access Denied"),
	(404, "404 - File Not Found"),
	(500, "500 - Internal Server Error")
)

class Shorturl(db.Model):
	uripath = db.StringProperty(verbose_name="Path", required=True)
	httpcode = db.IntegerProperty(verbose_name="HTTP code", required=True, default=301, choices=[x[0] for x in REDIRECT_CHOICES])
	location = db.StringProperty(verbose_name="Location")

class ShorturlForm(djangoforms.ModelForm):
	httpcode = forms.IntegerField(widget=forms.Select(choices=REDIRECT_CHOICES), required=True, label='HTTP code')

	class Meta:
		model = Shorturl

Given the above tuple, the list comprehension sets the choices in the model for Google AppEngine to the bare codes, while the full tuple is passed to the field on the Django form, allowing user-friendly output. This allows the user to pick the right redirect choice without having to know the various redirect codes.
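
The split between stored values and display labels can be seen in isolation with plain Python. This is a small sketch with a shortened tuple; the labels dict and the describe helper are hypothetical additions for turning a stored code back into its friendly name.

```python
REDIRECT_CHOICES = (
    (301, "301 - Permanent Redirect"),
    (302, "302 - Temporary Redirect"),
    (404, "404 - File Not Found"),
)

# What the model's choices= argument receives: the stored values only.
stored_values = [code for code, label in REDIRECT_CHOICES]

# A lookup for turning a stored value back into its friendly label,
# e.g. when rendering a list of short URLs (hypothetical helper).
labels = dict(REDIRECT_CHOICES)


def describe(code):
    """Return the friendly label, falling back to the bare code."""
    return labels.get(code, str(code))
```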

Project file structure

I'm currently working on a new website that will be hosted on Google AppEngine. I am in the middle of revamping some older code and have realised that maybe, just maybe, the form code belongs with the model rather than in a separate directory.

The current setup is to have a models directory and a forms directory, containing the models and the forms respectively. However, I am starting to realise more and more that the form is a direct extension of the validation required in the model (which is not possible with the current database stuff that Google AppEngine offers): the form takes user input, makes sure that it is correct, and then just passes it off to the model to check it into the datastore.

I'm thinking of merging the form into the model source code so that I can remove the forms folder entirely, since it just doesn't make sense to have the two in separate locations where one may be changed but the other may not. I'm afraid that they may get out of sync and cause issues.

Will have to think about it for a little bit, no need to make rash decisions this late into the evening and blow away part of my source tree.
