Go: A Beginner’s Guide to Record UDFs in Aerospike

What I cover in this article

  • What a UDF is, and when and how to use one.
  • A real-world scenario where a UDF fits in.
  • How to write, register, and execute a UDF in Aerospike.

What is a UDF?

UDF stands for User Defined Function: a function written by the user that the database runs in its own context at run time, returning the result to the client.

A user-defined function can be written in any language that the database understands. In Aerospike, UDFs are written in Lua, a scripting language. Aerospike ships with a Lua interpreter to interpret and execute these scripts.

When to use a UDF in Aerospike?

Aerospike is a distributed key-value store where the value can be any of the Aerospike data types such as Int, Float, List, Map, Bool, and Bytes. Aerospike provides many operations that the client can run on these data types, such as AutoIncr for Int, MapRemoveByKeys and MapGetByKeys for maps, and ListAppend and ListPrepend for lists, all of which are atomic and thread-safe.

When we pick Aerospike as our data store, we should model the data so that the work can be done with the built-in atomic operations that Aerospike provides.

But as the application grows and requirements change, we may need operations that must be atomic but that Aerospike does not provide. That is when we need a User Defined Function: we write custom code and hand it to Aerospike, which executes it under a record lock, making it thread-safe.

Real-World Use Case

At MakeMyTrip & GoIbibo we work on the Discounting Service. One of its features: if a customer isn’t satisfied with a booking (by various measures) and reaches out to our service desk, the customer care team can, depending on the scenario, give the user a unique voucher to apply on their next booking.

We use Aerospike to store all our unique voucher details, structured as follows.

The User ID is the key, each Line of Business (LOB: Flights, Hotels, etc.) is a bin, and each bin holds a Map whose keys are the unique vouchers and whose values are metadata describing the voucher.
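The layout described above can be modelled in plain Go as a nested map. This is only a sketch for picturing the data: the type name `UserRecord`, the method `AddVoucher`, and the sample bin names and voucher codes are all made up for illustration and are not part of Aerospike or of our service.

```go
package main

import "fmt"

// UserRecord models one record (keyed by User ID elsewhere):
// bin name (LOB) -> voucher code -> voucher metadata.
// Hypothetical type, used only to picture the layout.
type UserRecord map[string]map[string]string

// AddVoucher stores a voucher under the given LOB bin,
// creating the bin's map on first use.
func (r UserRecord) AddVoucher(lob, code, meta string) {
	if r[lob] == nil {
		r[lob] = map[string]string{}
	}
	r[lob][code] = meta
}

func main() {
	rec := UserRecord{}
	rec.AddVoucher("flights", "VOUCHER-1", "flat 500 off")
	rec.AddVoucher("hotels", "VOUCHER-2", "10 percent off")
	fmt.Println(len(rec), rec["flights"]["VOUCHER-1"])
}
```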

When we started, a voucher could be used only on a specific LOB, and when a user used it, we removed the key from the Map in that specific bin using Aerospike’s MapDeleteByKey functionality.

But then we got a use case where certain users can use a promo code in any LOB once it is distributed, and once the promo code is used in one LOB, it cannot be used in any other LOB.

The initial design was to replicate such promo codes in all bins and, when one is used, delete the map key from all bins in that user’s record.

Just when we thought everything was going as planned, we discovered that Aerospike doesn’t provide the functionality to delete a map key across all the available bins of a record.

This means that if we create a promo code across 10 LOBs, we need to make 10 Aerospike calls to MapDeleteByKey, sending a different bin name in each call.

  • Aerospike doesn’t support application-level transactions. So if one of the 10 network calls returns an error, the application has to handle the partial failure.
  • Should the application keep retrying until the operation succeeds? What happens when the retry limit is exceeded and the operation still hasn’t succeeded? There can be many such cases of partial failure, and as the project is revenue-related, we can’t allow them.
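To make the partial-failure problem concrete, here is a small in-memory Go simulation. It does not talk to Aerospike: `deleteFromBin` is a hypothetical stand-in for one per-bin network call, and the `failOn` parameter exists only to inject an error for one bin.

```go
package main

import (
	"errors"
	"fmt"
)

// record models bin -> voucher code -> metadata.
type record map[string]map[string]string

// deleteFromBin stands in for ONE network call that removes a map key
// from ONE bin. failOn names a bin whose call fails (illustration only).
func deleteFromBin(rec record, bin, code, failOn string) error {
	if bin == failOn {
		return errors.New("simulated network error on bin " + bin)
	}
	delete(rec[bin], code)
	return nil
}

// deleteEverywhere issues one call per bin and stops at the first error,
// returning the bins that were already modified before the failure.
func deleteEverywhere(rec record, bins []string, code, failOn string) ([]string, error) {
	var done []string
	for _, bin := range bins {
		if err := deleteFromBin(rec, bin, code, failOn); err != nil {
			return done, err
		}
		done = append(done, bin)
	}
	return done, nil
}

func main() {
	rec := record{
		"flights": {"PROMO": "meta"},
		"hotels":  {"PROMO": "meta"},
		"cabs":    {"PROMO": "meta"},
	}
	done, err := deleteEverywhere(rec, []string{"flights", "hotels", "cabs"}, "PROMO", "hotels")
	fmt.Println("deleted from:", done, "error:", err)
	_, stillInCabs := rec["cabs"]["PROMO"]
	fmt.Println("promo still usable in cabs:", stillInCabs)
}
```

After the second call fails, the promo code is gone from one bin but still present in the others, which is exactly the inconsistent state the application would have to detect and repair.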

This is when we thought of a custom UDF that operates at the record level, deletes the map key across all bins, and returns a comma-separated string of the bins from which the promo code was deleted. The map key is trimmed from all bins in one Aerospike call, which either succeeds completely or fails completely.

How to write a UDF?

As mentioned earlier, UDFs are written in Lua. For our use case, below is the UDF that trims the map key from all available bins and returns the bins from which the promo code was deleted.

Let’s understand each step here.

  • Line 1: When we define the function, the first argument must be the record. Aerospike passes it to the UDF based on the key that the UDF is executed on. The remaining arguments are custom arguments passed by the client for use in our UDF. These custom arguments have to be of data types that Aerospike supports.
  • We then check whether the record exists; if not, we return an error.
  • If it exists, we loop over all the bins and remove the key from each map.
  • Finally, we update the Aerospike record with the trimmed bin data.
  • While all of this runs, Aerospike holds a record lock at the database layer, so any Get or Set operation on this record key has to wait for the UDF to complete. **No dirty reads or writes**
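For readers who think in Go, the steps above can be mirrored in a small in-memory model. This is not the Lua UDF and not Aerospike code: the `lockedRecord` type and `trimKey` method are hypothetical, a `sync.Mutex` plays the role of the record lock, and the map is mutated in place instead of calling an explicit update.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// lockedRecord models a record together with its record lock.
// A nil bins map means the record does not exist.
type lockedRecord struct {
	mu   sync.Mutex
	bins map[string]map[string]string // bin -> voucher code -> metadata
}

// trimKey removes code from every bin's map while holding the lock and
// returns a comma-separated list of the bins it was removed from.
func (r *lockedRecord) trimKey(code string) (string, error) {
	r.mu.Lock() // other readers and writers wait until we finish
	defer r.mu.Unlock()

	if r.bins == nil {
		return "", errors.New("record not found")
	}
	var trimmed []string
	for bin, vouchers := range r.bins {
		if _, ok := vouchers[code]; ok {
			delete(vouchers, code)
			trimmed = append(trimmed, bin)
		}
	}
	// Go map iteration order is random; sort so the result is stable.
	sort.Strings(trimmed)
	return strings.Join(trimmed, ","), nil
}

func main() {
	rec := &lockedRecord{bins: map[string]map[string]string{
		"flights": {"PROMO": "meta", "OTHER": "meta"},
		"hotels":  {"PROMO": "meta"},
	}}
	bins, err := rec.trimKey("PROMO")
	fmt.Println(bins, err)
}
```

The point of the model is that the existence check, the loop over bins, and the write all happen inside one critical section, so no other operation can observe the record with the key removed from only some bins.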

Now that we have written our Lua script to trim the map keys across bins, it’s time to register the script with the Aerospike node.

This can be done in two ways.

  • The database administrator can manually log in to the node and copy the Lua content into /opt/aerospike/usr/udf/lua, which is the default directory for Lua scripts. (This path can be changed in the Aerospike config file used when starting the server.)
  • The client application can register the file when the server starts. How this is done depends on the programming language of the client. **It’s always better for an administrator to register the modules rather than doing it from code**

Let us first register the UDF from the terminal (command line). Copy the content of the above UDF and create a file called test.lua in /opt/aerospike/usr/udf/lua

Register your module from aql with the register module command, specifying the path where the Lua file is present on the Aerospike node.

Check whether your module registered properly. If the Lua code has syntax issues, Aerospike throws an error during registration. If the module registered successfully, it should appear in show modules.

When we register a module, Aerospike compiles the Lua code into byte code, which resides on the server node.

Now that we have registered our module with Aerospike, let us see how to execute these Lua functions to delete the data across bins.

This is the sample bin data before executing the script. (For security reasons I’m not able to share the namespace, set names, and map keys.)

The execute command takes the file name and the function name, the arguments to pass to the function (the promo code in our case), and the namespace, set name, and record key on which the UDF should run.

Once the command has executed and we check the record, the map key is deleted from all the bins of the record in a single, atomic Aerospike call.

What happens when a UDF is deleted and there are some requests currently executing that UDF?

  • Aerospike makes sure that once the delete is issued, no further calls can execute the UDF. It finishes all the requests that were already accepted, and once they are done, the module is removed.

Let us also see how to execute a UDF from the Golang client.
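The official Go client exposes an Execute method on its client type that takes a write policy, a key, the package (module) name, the function name, and the arguments. So that the sketch below runs without a server or the client library, it uses a hypothetical `udfClient` interface and an in-memory `fakeClient` in place of the real client. The namespace `ns`, the set `vouchers`, and the function name `trim_promo` are placeholders; only the module name `test` comes from the test.lua file registered earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// udfClient is a hypothetical, narrowed-down view of a client that can
// run a record UDF. It is NOT the real Aerospike client API.
type udfClient interface {
	Execute(namespace, set, userKey, module, function string, args ...interface{}) (interface{}, error)
}

// errSimulated is used to exercise the error path of the sketch.
var errSimulated = errors.New("simulated failure")

// fakeClient is an in-memory stand-in that returns canned values.
type fakeClient struct {
	result interface{}
	err    error
}

func (f fakeClient) Execute(namespace, set, userKey, module, function string, args ...interface{}) (interface{}, error) {
	return f.result, f.err
}

// redeemPromo runs the UDF once for the user's record and interprets the
// reply as the comma-separated list of bins the promo was removed from.
func redeemPromo(c udfClient, userID, promo string) (string, error) {
	// "ns", "vouchers" and "trim_promo" are placeholder names.
	res, err := c.Execute("ns", "vouchers", userID, "test", "trim_promo", promo)
	if err != nil {
		return "", fmt.Errorf("udf execute failed: %w", err)
	}
	bins, ok := res.(string)
	if !ok {
		return "", errors.New("unexpected UDF return type")
	}
	return bins, nil
}

func main() {
	bins, err := redeemPromo(fakeClient{result: "flights,hotels"}, "user-1", "PROMO")
	fmt.Println(bins, err)
}
```

With the real client, the body of redeemPromo would build a key from the namespace, set, and user ID and call the client’s Execute method; the error handling and the type check on the returned value stay the same.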


Just like Record UDFs, Aerospike also supports Stream UDFs, which are really useful for aggregation. We will cover them in another article.

Hope you enjoyed the article. In case of any queries or suggestions, please use the comments section and I’ll try to reply asap.

I will starve to death if you don’t feed me some code. Quora : https://www.quora.com/profile/Mourya-Venkat-1