The Hours of Work Behind a Simple Feature

Ever since I started this website in April 2021, I’ve known users would upload password-protected PDF files. I figured supporting them would be too annoying, so I put it off. Recently a user asked me to support password-protected PDFs, so I did it. It’s a simple feature, and it works pretty well. I spent all of yesterday working on it and pushed it into production at 2:32 AM this morning.

Read more →

PDF Bank Statement JSON API

1. Get the Authorization Token
Go to https://bankstatementconverter.com/ in Google Chrome
Log in
Hit F12 to open up the developer tools
Click on the “Application” tab
On the left panel click on “Storage” -> “Local Storage” -> “https://bankstatementconverter.com/”
You should see a key called “bsc-authToken” and a corresponding value. The value is the Authorization Token. Copy the value.

2. Upload PDF API
Method: POST
URL: https://api2.bankstatementconverter.com/api/v1/BankStatement
Headers: { Authorization: AUTH_TOKEN }
Body: Multipart Form Data
Request Headers
Request Body
Response Body

3.
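The upload call in step 2 could be sketched in Kotlin roughly like this. It is only an illustration built on the JDK's java.net.http client: the URL, method and Authorization header come from the steps above, but the form field name "file", the boundary format and the helper names buildMultipartBody and buildUploadRequest are assumptions, not something the API documentation confirms.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import java.nio.file.Files
import java.nio.file.Path
import java.util.UUID

// Builds a multipart/form-data body containing a single PDF file part.
fun buildMultipartBody(boundary: String, fieldName: String, fileName: String, content: ByteArray): ByteArray {
    val header = "--$boundary\r\n" +
        "Content-Disposition: form-data; name=\"$fieldName\"; filename=\"$fileName\"\r\n" +
        "Content-Type: application/pdf\r\n\r\n"
    val footer = "\r\n--$boundary--\r\n"
    return header.toByteArray(Charsets.UTF_8) + content + footer.toByteArray(Charsets.UTF_8)
}

// Builds the POST request from step 2. Nothing is sent until the request is handed to an HttpClient.
// ASSUMPTION: the form field name "file" is a guess; the steps above do not name it.
fun buildUploadRequest(authToken: String, fileName: String, pdfBytes: ByteArray): HttpRequest {
    val boundary = "----bsc-" + UUID.randomUUID()
    return HttpRequest.newBuilder(URI.create("https://api2.bankstatementconverter.com/api/v1/BankStatement"))
        .header("Authorization", authToken)
        .header("Content-Type", "multipart/form-data; boundary=$boundary")
        .POST(HttpRequest.BodyPublishers.ofByteArray(buildMultipartBody(boundary, "file", fileName, pdfBytes)))
        .build()
}

fun main(args: Array<String>) {
    if (args.size < 2) {
        println("usage: <authToken> <path-to-pdf>")
        return
    }
    val path = Path.of(args[1])
    val request = buildUploadRequest(args[0], path.fileName.toString(), Files.readAllBytes(path))
    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    println("${response.statusCode()} ${response.body()}")
}
```

Pass the token copied in step 1 as the first argument and the path to a PDF as the second. The shape of the response body is not shown here, so the sketch just prints whatever comes back.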

Read more →

HSBC Mole Whacking Development

Last week a customer asked me to help him process a few hundred of his documents. He had PDFs for several bank accounts going back to 2018. I had a look at the documents, arranged a price, and then got to work processing them. My idea was to run them through Bank Statement Converter’s (BSC) PDF to CSV processor. However, a few errors occurred. BSC has a generic algorithm for detecting transaction records in a PDF file.

Read more →

What Year Is It?

Lately I’ve been getting a lot of complaints that the year is wrong in Bank Statement Converter’s resulting CSV. At first when I got these complaints I thought “What the hell are you talking about? All we do is find the transaction data and then write it out to a CSV file. How can the year be wrong?” Let’s walk through what’s going on with one of my HSBC bank statements.

Read more →

Automatically Get your Bank Transactions

The other day I had a bit of a revelation about what problem Bank Statement Converter solves. The obvious answer seems to be “it solves the problem of extracting transaction data from PDF bank statements”. That’s true, but you could also say it solves a more general problem: “it gives users access to their bank transaction data”. However, it’s a bit of a pain to use. To get your 2021 transaction data you need to:

Read more →

Getting to $1000 MRR in ten months

A few weeks ago BankStatementConverter (BSC) exceeded $1000 in Monthly Recurring Revenue (MRR). I figured you lot would be interested to hear the story.

$0 MRR, March to July 2021

I got the idea to build BSC to help users process PDFs. I spent about a week playing around in Kotlin to see if my idea was feasible; it was. Soon after, I met up with a friend for beers and told him my idea.

Read more →

Detecting Headers of the Transaction Table

This is the second part in a series of blog posts where I explain how Bank Statement Converter works. In the previous article I talked about how I extract each character and its bounding box from a PDF. In this article I’ll talk about how I use the characters and bounding boxes to detect the headers of the transaction table.

val pageRegion = Rectangle(0f, 0f, page.cropBox.width, page.cropBox.height)
val lines = LineExtractor(page).

Read more →

Extracting Text and Bounding Boxes from a PDF

This is part one of a series of blog posts where I explain how BankStatementConverter works. In this post I’m going to explain the code that figures out the bounding boxes and other attributes of characters on a page. Lots of this code was lifted from DrawPrintTextLocations and PDFTextStripper. The first thing we do is load the PDF file using PDFBox, and then we process the document page by page. The PDFs are processed page by page so that we don’t run out of memory. Most documents are less than ten pages long, but there are documents out there that are over 10,000 pages long. If we tried to load all the data from a large document into memory, we would quickly run out and crash our app.
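The memory argument above can be illustrated with a small Kotlin sketch that uses only the standard library. It is not the PDFBox code from the post: PageText, pages, countCharacters and the loadPage callback are all hypothetical stand-ins, with loadPage playing the role of whatever extracts a single page's characters. The point is only that a lazy Sequence produces one page's data at a time, so earlier pages can be garbage collected as soon as they have been processed.

```kotlin
// Hypothetical stand-in for the data extracted from one page; not a PDFBox type.
data class PageText(val pageNumber: Int, val characters: List<Char>)

// Lazily yields pages one at a time. A page is only loaded when the consumer asks for it.
fun pages(pageCount: Int, loadPage: (Int) -> PageText): Sequence<PageText> =
    (1..pageCount).asSequence().map(loadPage)

// Example consumer: folds over the pages while keeping only a running total,
// so no more than one page's characters needs to be held at once.
fun countCharacters(pageCount: Int, loadPage: (Int) -> PageText): Long =
    pages(pageCount, loadPage).fold(0L) { total, page -> total + page.characters.size }

fun main() {
    // Simulates a 10,000 page document with a fake loader.
    val total = countCharacters(10_000) { n -> PageText(n, "page $n".toList()) }
    println(total)
}
```

Collecting every page into a list first (for example with toList()) would bring back the problem described above, because all pages would then be held in memory together.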

Read more →

Analysing My HSBC Bank Statements

Originally I wanted to analyse my HSBC bank statements from 2014. 2014 was a great year for me: I started off the year by launching a new app, joined a game development company and rented an apartment with a friend. Unfortunately my 2014 bank statements from HSBC’s internet banking are no longer available; it seems they only go back a few years. So let’s go through my 2015 bank statements instead.

Read more →

Five ways I handled my OutOfMemoryErrors

I use Grafana to create graphs that show me various business and performance metrics for Bank Statement Converter. One of the graphs I created tracks the number of Internal Server Errors the server returns to its clients. I do this by writing a record into the database whenever a 500 is sent to the client. This graph has been really helpful for ironing out bugs I didn’t anticipate. Last Thursday at 12:55 AM HKT my servers started throwing Java’s infamous OutOfMemoryErrors.

Read more →
