Thursday, July 01, 2010

Performance bug in AppStats

I was trying to upload a CSV file with 100 entries to my local dev_appserver, but it was way too slow (30s). I first thought it was due to the datastore file implementation, and tried using the sqlite one, but it still took 30s to complete.

After a few hours putting time.clock all around the place, it turns out the problem was in AppStats. When AppStats collects all the info about a request, it stores in memcache the trace of every AppEngine API call. My code is like this:

@local_or_admin_required
def post(self):
# csv_data comes from FieldStorage, and these fields are not converted to
# Unicode by AppEngine / WebOb. So csv_data is a utf-8 bytestring.
csv_data = self.request.get('file_to_convert')
read_pos = csv.DictReader(csv_data.split('\n'))
generated_pos = [self.convert_press_office(po) for po in read_pos]
po_with_errors = []

if self.request.get('import_file_after_conversion') == 'true':
# Write the result to the datastore
for po in generated_pos:
name = po.name
del po.name

try:
update.add_press_office(name, **po)
except update.DuplicatedPressOfficeName:
...

add_press_office makes 3 API calls, so in total I'm making 300 API calls. For each call AppStats records the entire stack frame, with local variables. This includes csv_data, generated_pos, that are 50K and around 100K respectively. Total more than ~45M.

The 300 stack traces with pointers to local variables (csv_data, generated_pos, ...) are sharing the 50K and 100K of these variables, but as soon as you try to serialize it in a protocol buffer, it copies this information for each stack. And this happends in appstats/recording.py@356 in the function Recorder::get_both_protos_encoded. This function first encodes the full protocol buffer, and wastes 30s doing it. Then it discovers that "oh no! this thing is too big to keep it in memcache!" and it deletes the local variables from the stack traces. But at this point is too late and you already paid the 30s.

If you have this problem, add "appstats_MAX_LOCALS = 0" to your appengine_config.py, you will lose the value of local variables in AppStats but it will be faster.


No comments: