Journey With Django Signals

mourya venkat
5 min read · Apr 11, 2020


Before getting into what signals are, I would like to give a brief overview of:

  • What my use case was and why I chose to use Django signals.
  • The roadblocks I hit and how I solved them.
  • How Django opens a transaction on save.
  • How signals pulled us into infinite recursion.

Background

I am part of the GoIbibo discounting team, where our day-to-day job is to provide a way to give discounts to users and charge them a convenience fee based on certain criteria. Those criteria, from which we decide the discounts and convenience fee, are evaluated against some hundreds of rules set by the revenue team.

Obviously, for the revenue team to configure rules, they need a panel. Instead of building one from scratch, we went ahead with Django, as it provides an admin panel that is very easy to build and use for different sets of use cases.

We all know Django is usually coupled with SQL databases, or with NoSQL databases like Mongo or Couch, but at the scale we operate (100–120K RPM) none of the above is suitable. So we chose Aerospike (an SSD-backed persistent store) as our primary datastore, where the data is stored after being transformed into key-value form by our backend Go microservice. Now that I've given enough background on what we were doing, let's get to the problem statement.

Problem Statement

Now, on every save of a new rule or update of an existing one, data gets stored in the MySQL DB and then has to be transformed and sent to the backend service for storage in Aerospike, which brings a couple of problems related to partial failures:

  • If, due to some error in the transformation logic (at either of the two services), the save happens in MySQL but the backend service never receives the call, we have a partial failure: data stored at one end and not at the other. With hundreds of rules, it's nearly impossible to debug an issue caused by these sorts of partial failures.

Approaches

  • Approach-1: save the rule to our backend service first, and only if that succeeds, store the rule in the admin. The problem here is that we have deeply nested models (with many-to-many, one-to-one, and many-to-one relationships), and before the data is stored in MySQL there is no way for us to get the instance's nested data.
  • Approach-2: save the rule to the admin first with sync_status set to False, make a network call to our backend service, and when it returns a success status, update sync_status to True for that rule. If there is an intermediate failure, we know for sure whether the rule synced or not. (If the backend service returns success but updating the sync status fails on the admin end, that's not a problem, as we simply re-sync the same data next time.)

Implementations and road blocks

Moving ahead with the advantages of approach-2, this is how Django signals came into the picture. This is what our initial app layout looked like.

The code inside a model looked this way.

If you look at the piece of code shared above, we were writing all the application logic in our models.py, which is an anti-pattern. Ideally, the save and the application logic should be decoupled from each other, which means there should be some sort of event that triggers the start of the transformation process and calls our backend service. Here come Django signals.

With Django signals we completely decoupled our application logic from the models. Here is an overview of how signals work.

We now had one more file in our directory, signals.py, whose sole responsibility is to listen for different sets of events.

Here came one of the more interesting problems.

Who should update the sync_status flag: our admin service or the backend service? We decided that our backend service would update the flag once the data is successfully stored in Aerospike.

Road Block -1

When updating the sync status from our backend service, I kept getting a lock timeout error. On further analysis I figured out that when Django's save() gets called, it starts a transaction and locks the record against any updates until the post-save signal returns. As updating the DB from some other service is an anti-pattern anyway, and it wasn't working as expected, we halted this approach.

To overcome this, we spun up a thread with a delay of 5 seconds so that the post-save signal doesn't get blocked and the transaction can complete. We perform all the transformation logic in the background and update sync_status once the backend service returns a success status.
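The deferral can be sketched like this (sync_rule and schedule_sync are hypothetical names; the delay is a parameter so the snippet is easy to exercise):

```python
import threading

results = []  # stands in for the sync_status update after the backend succeeds


def sync_rule(rule_id):
    # Placeholder for: transform the rule, call the backend service,
    # and update sync_status once it reports success.
    results.append(rule_id)


def schedule_sync(rule_id, delay=5.0):
    # Called from the post-save receiver. Returning immediately lets the
    # signal finish, so the transaction opened by save() can commit before
    # the background work touches the row again.
    timer = threading.Timer(delay, sync_rule, args=(rule_id,))
    timer.daemon = True
    timer.start()
    return timer
```

In the receiver you would call `schedule_sync(instance.id)`, and sync_rule would run 5 seconds later on a background thread.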

Road Block -2

This resulted in infinite recursion, one of the silliest mistakes I've made. As this save is done inside a post-save signal handler, the save invokes the post-save signal again, and that signal invokes save again, in a never-ending loop.

At this point, I was tired of solutions and losing hope. This is where I came across Django's queryset filter-and-update.

With this, instead of saving the record (which invokes the post-save signal), I went with a filter update, which updates the record directly instead of saving it. With that, the infinite recursion ended and the partial-failure problem of syncing data between the two services was finally solved.

Please feel free to raise any concerns.
