Building a fully serverless invoicing data pipeline for Transval

Published in Technology

Written by

Jouni Tenhunen
Junior Software and Support Engineer

Jouni Tenhunen works for Nitor part-time while completing his master's studies at the University of Helsinki. Jouni repaired power plants and large motors in his previous life but now enjoys developing backend services in modern cloud environments.

October 20, 2021 · 3 min read time

Transval, the leading Finnish logistics and warehouse outsourcing company, needed an automatic way to move invoices from an old invoicing system used by its parent company to a new internal ERP system. Together with the customer, we planned and implemented a fully serverless pipeline built with AWS technologies. This was the first real-world customer software project for two junior developers.

The development process was guided by two principles: keeping the structure simple and asynchronous, and keeping costs low. During initial planning, it was decided that, as a proof of concept, we would build a pipeline to handle the simplest form of invoices from the legacy ERP system. After this, the application could easily be expanded.

Collaboration between developers was easy because each piece of functionality was developed as a separate function. We could both work on different components largely independently.

Transval also wanted a JSON REST API so that they could define their own JSON invoice format and have clients use the API as well. This way they are not tied to a single ERP for invoice handling. We agreed to use only AWS serverless components to keep things simple and costs low. The infrastructure and code for this minimum viable product were deployed within a few weeks.

Once the system was seen to perform as expected, we started working on the more complex forms of invoices. The whole infrastructure was defined as CloudFormation and Serverless Framework templates. Deployment was handled by the Nitor-developed Nameless Deploy Tools (NDT). Automated deployment and testing were done with a combination of Bitbucket webhooks, AWS CodeBuild, NDT, and Serverless Framework.

The pipeline has now been in use for over a year. Every month it handles hundreds of invoices from three different sources, with total invoiced revenue in the millions. Using only serverless pay-as-you-go services keeps the running costs of the pipeline very low.

How it works

Invoices can enter the pipeline in two ways: as an XML invoice uploaded by the legacy ERP via SFTP to an S3 bucket, or as a JSON invoice sent with an HTTP POST to API Gateway. All stages of processing are decoupled with SQS message queues, and error handling is implemented with SQS dead-letter queues.
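To make the decoupling concrete, here is a minimal sketch of what one queue-driven processing stage could look like in Python. It is not the project's actual code: the process_invoice function and the invoiceNumber field are hypothetical, and only the shape of the SQS event (a list of Records, each with a string body) comes from how Lambda delivers SQS messages.

```python
import json


def process_invoice(invoice: dict) -> dict:
    """Hypothetical processing step: checks one required field and tags the invoice."""
    if "invoiceNumber" not in invoice:  # field name is an assumption
        raise ValueError("invoice is missing 'invoiceNumber'")
    return {**invoice, "status": "processed"}


def handler(event, context=None):
    """Lambda entry point for an SQS event source.

    An uncaught exception fails the whole batch, so SQS makes the messages
    visible again and, after enough failed receives, moves them to the
    dead-letter queue.
    """
    results = []
    for record in event["Records"]:
        invoice = json.loads(record["body"])  # SQS delivers the payload as a string
        results.append(process_invoice(invoice))
    return results
```

Because the stage only reads from a queue and raises on failure, retries and dead-lettering are left entirely to the queue configuration rather than to application code.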

Errors are monitored automatically: when messages end up in a dead-letter queue, a message is published to an SNS topic. The interim invoice formats produced by the different stages of processing are stored in S3 buckets.

We used a DynamoDB table to hold the customer and article data needed for mapping between the systems. API Gateway validates all requests against a JSON Schema and passes the invoices through SQS to a Lambda function, which converts the internal JSON invoice format into yet another XML format that the Transval API understands.
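The mapping and conversion step can be illustrated with a small Python sketch. Everything specific in it is an assumption made for illustration: the element names, the JSON field names, and the two dictionaries that stand in for the DynamoDB table. The real invoice formats are not described in this article.

```python
import xml.etree.ElementTree as ET

# Stand-ins for the DynamoDB mapping table; all keys and values are invented.
CUSTOMER_MAP = {"LEGACY-100": "ERP-7001"}
ARTICLE_MAP = {"OLD-PALLET": "NEW-PALLET"}


def invoice_to_xml(invoice: dict) -> str:
    """Convert a (hypothetical) internal JSON invoice into an XML string."""
    root = ET.Element("Invoice")
    # Translate the source system's customer id to the target system's id.
    ET.SubElement(root, "CustomerId").text = CUSTOMER_MAP[invoice["customerId"]]
    rows = ET.SubElement(root, "Rows")
    for row in invoice["rows"]:
        row_el = ET.SubElement(rows, "Row")
        ET.SubElement(row_el, "ArticleId").text = ARTICLE_MAP[row["articleId"]]
        ET.SubElement(row_el, "Quantity").text = str(row["quantity"])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = {
        "customerId": "LEGACY-100",
        "rows": [{"articleId": "OLD-PALLET", "quantity": 3}],
    }
    print(invoice_to_xml(example))
```

An unknown customer or article raises a KeyError in this sketch. In a queue-driven stage, that would let the message be retried and eventually land in the dead-letter queue instead of producing a half-mapped invoice.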

What we learned

All of the AWS services used are pay-for-use, except the DynamoDB tables, which have a small monthly cost. As the system is used for only a few days every month but needs to be responsive during those days, serverless architecture fits the project really well.

SQS queues are a great way to decouple parts of your software and to control concurrency in different parts of the application. Dead-letter queues with SNS topics tied to them give meaningful alerts when something goes wrong.

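Limiting the concurrency of Lambdas can be problematic with SQS as an event source: when many messages arrive at once, some might go straight to dead-letter queues. Messages are still received from the queue while the function is throttled, each throttled attempt counts as a receive, and a message that exceeds the queue's maximum receive count is moved to the dead-letter queue without ever being processed. This is a known limitation when combining SQS with a Lambda that has a low concurrency limit. Raising the number of retries solved this problem for us.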
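In SQS terms, the number of retries is the maxReceiveCount value in the source queue's redrive policy. The following Python sketch builds the queue attributes for such a policy. The function name is hypothetical, and the default values (a receive count of 5 and a visibility timeout of six times the function timeout) follow what we understand to be AWS's general guidance for SQS event sources. They are illustrative starting points, not the values used in this project.

```python
import json


def source_queue_attributes(
    dlq_arn: str, function_timeout_s: int, max_receive_count: int = 5
) -> dict:
    """Build SQS queue attributes for a source queue with a dead-letter queue.

    SQS expects attribute values as strings, and the redrive policy
    as a JSON-encoded string.
    """
    return {
        # Give throttled or slow invocations time to finish before a retry.
        "VisibilityTimeout": str(6 * function_timeout_s),
        "RedrivePolicy": json.dumps(
            {
                "deadLetterTargetArn": dlq_arn,
                # How many receives are allowed before the message is dead-lettered.
                "maxReceiveCount": max_receive_count,
            }
        ),
    }
```

The resulting dictionary could be passed as the Attributes argument of boto3's set_queue_attributes call, or the same two values could be written into a CloudFormation queue definition.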

The project was a fun way to get to know some technologies that were completely new to us and to create something worthwhile at the same time. The lessons learned and the serverless mindset have been useful in later projects as well.

"The billing pipeline solution Nitor provided, has been really reliable and has not required additional maintenance or development tasks. We have also extended the original scope by adding new source system to the solution, which has been quite effortless. So in general, we are really pleased with the solution", says Jani Laine, IT Solutions Architect at Transval Group.

Image: Transval
