- Samkit Jain
A lending company needs the answer to just one question when giving a loan: "Will the borrower be able to repay the loan?"
Arriving at a binary (yes or no) answer to that question involves a lot of work, both manual and automated. Lending companies go through various financial documents of the borrower and perform a thorough analysis before reaching a conclusion. With the advancements in artificial intelligence and machine learning, some companies have been able to reduce the time to approve a loan to a day (but don't share the percentage of bad loans, probably because OCR systems are still not as good as they want them to be). To understand the past spending behaviour of a borrower and predict their future ability to repay, one of the financial documents every lending company asks for is a bank account statement. A bank account statement can be tens or hundreds of pages long with thousands of transactions: an individual's statement may contain just a few hundred transactions, while a corporate's can run into thousands.
Bank statements come in every type imaginable. You'll find PDFs (both bank-generated and scanned copies), CSVs, images and, in rare cases, even HTML files. In this post, we'll talk about the most common type: images. Even within images, you'll see a lot of variety.
OCR to the rescue!
Optical Character Recognition, or OCR, is a technology that recognises text within an image. Humans can easily understand the text in an image, however complex (after all, we are the masters of the sacred texts!).
Over the past few years, OCR solutions have gotten much better. They are able to recognise handwritten text with a good degree of accuracy. Giants like Google and Microsoft have also invested in the field and have come up with their own text recognition products.
It's a known fact that OCR works well when the characters are printed, the image quality is high and the lighting is ideal. Bank statement images shared by borrowers have one thing going for them: they contain printed text. But even that, under ideal conditions, is not enough.
The majority of errors in OCR systems come from incorrect classification, which usually happens when a letter and a number share the same features. Some of the ambiguous cases are:
Images shared by borrowers are usually far from ideal: bad lighting, blur, low resolution, pen or pencil markings, folded pages, and so on. All these factors compound and lead to more and more incorrectly classified characters.
Some examples where OCR didn’t work for us
Current state of OCR systems
When Inkredo was into P2P lending, we too accepted images of bank account statements and manually typed every entry into an Excel sheet. Expectedly, this was a time-consuming process, so we tried multiple OCR solutions (in-house, open-source and paid), but none of them gave us the desired result. Some worked really well with bad-quality photos, but all of them struggled with ambiguous characters.
To conclude, OCR is not reliable for text detection in financial documents, where reading a comma as a dot (or vice versa) can make a significant difference. A PDF (containing text, not scanned images) should be the preferred type: it's not as easy to manipulate as a CSV, and it's easier to extract text from a text-based PDF than from an image. Plus, you won't have to buy expensive OCR solutions.
Do you think OCR is reliable when it comes to credit risk assessment? Share your thoughts in the comments.
In case you are interested in this problem, we're hiring machine learning engineers.
India produces lakhs of engineers every year, and most of them know that coding is a surefire way to get an impressive job. Hence, hiring a programmer shouldn't be difficult. But hiring a good programmer, or figuring out how to hire one, is extremely challenging. It becomes even more mind-boggling when you are hiring a full-time programmer while bootstrapping in the early stages. In addition to coding skills, lots of other things come into play: culture fit, flexibility to change, optimising for speed and quality, and work ethic. How do you vet for such traits? Paying a high salary alone is not the answer to finding a fit for your team! How do you know you are not throwing away cash?
You do the best interviewing you can and then bring the most qualified candidate on board. If it doesn't work, you try another candidate until one sticks. The problem is that this spray-and-pray method doesn't build fundamentally strong teams. We're sharing our approach to hiring our first two teammates, who have stuck with us for almost a year now. And this is the same approach we're going to use to hire programmers, be it remote or on-site.
4-Step hiring process:
The result of this process tells us the following about the candidate
There is a minimum 24-hour gap between steps. The buffer allows us to evaluate the candidate's response and decide if we want to move to the next step. We do not spend more than one meeting (usually on Slack) deciding; that's the discipline. It's an agile process where each step builds on the success of the last. If at any stage the company decides to terminate the process, we inform the candidate via email and then continue with our day.
Overall, the goal of the process is to save time, both ours and the developer's. A candidate who is actively seeking a job will care to spend time with us; window shoppers are automatically weeded out. We invest time in meeting a candidate only after they have impressed the team in the programming challenge. Why? Time is valuable. It's the one resource we cannot renew, and if I can save our team the hassle of meetings, travelling, etc., then I will do so. That's what this accomplishes. The candidate usually appreciates it as well, as it saves them time and hassle. Saving time for my company and employees is what I'm going for in these early steps.
Step 1: Statement of purpose
Objective: We are looking for wordsmiths.
Step 2: Programming Challenge
Objective: Identify if the candidate can solve problems with code and gain insights into the quality of code that is produced.
We specify the programming language to the candidate well in advance so they can prepare for the programming challenge. The length of the challenge depends on the candidate's experience: a fresher usually gets 8 hours, while an experienced programmer (3-5 years) should be able to complete it in less than 6. Candidates take the challenge at home, in the comfort of their own surroundings, so they can focus on solving the problem rather than getting acquainted with a new environment.
What happens if they cheat? They might call friends and family to complete the challenge, hunt for a solution on the internet or find an open-source project that does the same job. That's a real possibility, and that's the world programmers live in. If programmers do the same at work, who cares? If the candidate is truly a fraud, we'll know fairly quickly: our team is going to sniff it out, and at that point we'll decide whether to retain them.
The programming challenge is a great equaliser, not only because it gives you insight into the quality of code the person writes, but also into how effectively they communicate, their best practices, their comprehension of instructions, their use of git and their follow-through.
This challenge has turned out to be a moment of truth for both the candidate and us. More than 80% of candidates never finish the challenge; usually, they get lucky with a job offer from another company precisely during the programming challenge!
A good enough reason why you should take this challenge!
Of the remaining 20%, only 10% complete the challenge, and we're left with 1 or 2 good candidates. If there are more, that's a good problem to have.
We hire the top candidate and let the others know that we decided to go with someone else, but that we may be in touch shortly, because we were impressed and may need to grow our team. This leaves the door open for future conversations with those candidates.
Step 3: Technical screening over Skype/Hangouts
Objective: Check the candidate's problem-solving skills and ability to perform the job at a quiz level.
This technical screening is a follow-up to the programming challenge, where most of the questions revolve around the candidate's coding practices during the challenge and the reasoning behind them. We keep the questions open-ended so that the candidate gets to explain themselves. Another programmer on the team conducts these interviews. If the candidate starts failing on too many occasions, it's a red flag. The point is to save time during this process: cut the interview short and move on.
Step 4: Culture fit
Objective: Can we spend informal time with the candidate?
At this point, most candidates are probably certain that they're through the process. However, we don't let the candidate's excitement be the determining factor: we still need to determine if this is a person our team can work with and enjoy being around. At this point I like to bring the candidate into the office to meet the team and work with us for a day or two. This is an excellent way to let the candidate get a taste of the things and people to come. We usually have lunch and snacks together; this allows us to gauge their personality and hopefully identify any traits that would not be desirable in our team. We need to be able to get along with this person and enjoy their company. We're going to be spending a lot of time together, and if we don't like each other, that probably won't work out too well. We also check with the team at the end of the day: at what point might this person leave the company? If it is something that requires more of a mental shift than a physical effort, it's a red flag.
For example, we had someone who got through the first three steps with the desired results. We were all excited to get him on board. But we found out that he had not taken up his previous job in Pune because he wanted to live with his parents in Delhi. Not that his parents needed his support; he was just not comfortable living away from them. We decided that he was not someone who would be comfortable with the dynamics of an early-stage company and hence would not click with us. In the end, we did not offer him the job.
Finding good talent is hard, and it is even harder when you have limited means while bootstrapping an early-stage company. The steps are indeed time-consuming, but they have helped us save time in finding the developers our team needs. It's about matchmaking, and we take our time. We believe it's better to wait for the right candidate than to hire a mismatch. Step 2 is the equaliser: this is where the wheat is separated from the chaff, where the "can-do" and "cannot-do" emerge. I'm always surprised to see how many people with strong resumes never make it past this stage. The harder it is to get through, the stronger the chances that the candidate will emerge as a champion in the team.
By Samkit Jain
Deploying version one of an app to production for the first time is never a walk in the park. We too faced obstacles when deploying the backend of 91paisa (written in Go) on Amazon Web Services (AWS) Elastic Beanstalk. After multiple trials and help from AWS Support, we finally deployed the initial version of 91paisa. This article will help you if you too are figuring out how to deploy your Go web app on AWS Elastic Beanstalk.
You can deploy Go web applications on AWS Elastic Beanstalk in a couple of ways:
Uploading the source code
Create a Zip file which contains application.go (the entry point of your application) at the root, along with the source code. Also create three additional files: build.sh, Buildfile and Procfile.
The directory structure will look like (package names other than datastore are illustrative):

├── application.go
├── build.sh
├── Buildfile
├── Procfile
├── datastore
│   ├── db.go
│   └── ...
├── api
│   ├── api.go
│   └── ...
└── assets
    └── index.html
Contents of build.sh
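A minimal build.sh, assuming the standard Elastic Beanstalk Go workflow of compiling the app into bin/, could be:

```sh
#!/usr/bin/env bash
# Compile the app; Elastic Beanstalk invokes this script via the Buildfile.
go build -o bin/application application.go
```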
Contents of Buildfile
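Following the standard Elastic Beanstalk convention, the Buildfile simply points the platform at the build script:

```
make: ./build.sh
```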
Contents of Procfile
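Again per the standard convention, the Procfile tells Elastic Beanstalk how to start the compiled binary:

```
web: bin/application
```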
Now, upload the Zip file to Elastic Beanstalk and wait. If everything works, congratulations! If not, read on.
We followed the same process, but the environment health immediately degraded. Digging through the logs, we found:
The build command was failing because of the project layout.
We ran eb ssh to SSH into the instance and found the files downloaded at /var/app/current/ (the GOPATH), meaning that the datastore folder was located at /var/app/current/datastore while the application was looking for it at /var/app/current/src/github.com/91paisa/backend/datastore.
We follow the standard structure (and naming convention) where the source code is located at go/src/github.com/91paisa/backend/, so packages are imported by specifying the complete path, i.e. import "github.com/91paisa/backend/datastore". Since Beanstalk does not know about this project layout, it does not recreate it, and hence the build was failing.
You can get around this by either (a) using relative paths in your import statements or (b) splitting the application into packages that can be installed using go get.
Note: Elastic Beanstalk does not delete previous files. For example, say your source code initially contained just three files (application.go, api.go and cmd.go), you uploaded the Zip and everything worked great. Now you no longer need cmd.go, so you delete it and re-upload the Zip. On your environment, cmd.go will still be present, and you'll have to remove it manually or restart the environment. Whenever the load balancer creates a new instance, though, it contains only the latest files.
Uploading the binary
Create binary using
GOARCH=amd64 GOOS=linux go build -o bin/application application.go
This will create a bin folder containing the binary. Now, create a Zip file containing the bin folder. You can also add extra items, such as an assets folder and an .ebextensions folder.
The directory structure of the Zip will look like:

├── bin
│   └── application
├── assets
│   └── index.html
└── .ebextensions
You can also create a shell script to automate the process. A basic shell script that fits my use case:
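One plausible version of such a script (a sketch; the zip name and the folders included are assumptions based on the steps above):

```sh
#!/bin/sh
# Build a linux/amd64 binary and zip up everything Beanstalk needs.
GOARCH=amd64 GOOS=linux go build -o bin/application application.go
zip -r deploy.zip bin assets .ebextensions
```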
Don't forget to chmod +x create_zip.sh. Run it using ./create_zip.sh.
We hope this article helped you deploy your Go app to AWS Elastic Beanstalk. If you are looking to deploy on EC2 or DigitalOcean, the process is much easier: all you have to do is build the executable binary and run it as a service on the server.
Know a better way to deploy? Please share your knowledge in the comments.
Wanna chat? Drop me a message on Twitter or connect with me on LinkedIn!
How we achieved an accuracy of over 90% after reading 800+ transactions
- Samkit Jain
At Inkredo, we perform flow-based credit assessment to determine the monthly repaying capacity of a customer. Our customers are small and underbanked retailers who run bootstrapped, consistently profitable businesses, yet remain excluded from formal credit. Formal institutions have shied away from lending to the lower-middle income group because the benefits of lending and collections do not offset the cost of origination and recovery. There is no cost-effective way to monitor income and solvency and to ensure timely repayment.
The assessment involves deriving useful analyses from the customer's bank statement. This task requires copy-pasting every transaction from the PDF of the bank statement (tens of pages with hundreds of rows) into an Excel file, cleaning the copied data, and then using Excel wizardry to perform statistical operations. Imagine using Ctrl+C and Ctrl+V almost a thousand times every other day. As you might have guessed, this involves a lot of human interaction and typically takes us a day for a single bank statement.
With our growing user base, we needed a solution to reduce the effort and time required: a smart system that generates insights within seconds with minimal human interaction.
Bank statement in PDF
Before starting development, we tackled the problem manually. For each bank statement (from various banks), we read all the transactions, highlighted keywords and assigned appropriate labels and categories to each. Then, with the generated mapping, we created a set of keywords for each category, with priorities assigned.
A bank statement covering more than six months of a person running a business is usually over 20 pages long, with around 1,000 transactions. The columns are generally date, particular, balance, deposit, withdrawal, etc. For a specific bank the format is consistent and easy to work with, but every bank has its own format: the number of columns, their positioning, separators, text formats and abbreviations all vary.
The columns we require are the date, particular, withdrawal, deposit and closing balance.
These columns are found in every bank statement. For example:
Date | Narration | Chq./Ref.No. | Value Dt | Withdrawal Amt. | Deposit Amt. | Closing Balance
Srl | Txn Date | Value Date | Description | Cheque No | CR/DR | CCY | Trxn Amount | Balance
Tran Date | Value Date | Particulars | Location | Chq.No | Withdrawals | Deposits | Balance (INR)
The naming conventions might differ, but the purpose of each column remains the same. We created a dictionary called BANK_DETAILS that contains the position of each required column. For example:
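A minimal sketch of such a mapping in Python (the HDFC indices follow the header layout shown above; the structure and names are otherwise assumptions, since the original snippet isn't shown):

```python
# Column positions per bank; the indices refer to the header layouts above.
BANK_DETAILS = {
    "HDFC": {
        "date": 0,         # Date
        "particular": 1,   # Narration
        "withdrawal": 4,   # Withdrawal Amt.
        "deposit": 5,      # Deposit Amt.
        "balance": 6,      # Closing Balance
    },
    # ...one entry per supported bank format
}
```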
Reading the bank statement
Reading tables from PDF documents is not an easy task; even copying data from tables doesn't work properly most of the time. Thankfully, there's an open-source library called tabula that can extract tables from a PDF with fairly accurate results. We used its Python wrapper, tabula-py, for the data extraction.
Making the extracted data consistent
In an ICICI bank statement, every page of the transactions table contains this header row. The row is useless to the system, as we are only targeting transactions.
Aim: Remove header rows from the list of transactions.
Solution: From reading transactions in numerous bank statements, we realised that the closing balance column is always the last one. So the header can be taken as the rows (why plural? see the next task) from the first row up to the first row whose closing balance is not null. Then go through all the rows and remove any row that is part of the header. In the end, we have rows without any headers.
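The approach above can be sketched in plain Python (a simplification: real rows come from tabula as pandas values, and column layouts vary per bank):

```python
import math

def is_null(value):
    """Treat None, NaN and empty strings as empty cells."""
    return value is None or value == "" or (
        isinstance(value, float) and math.isnan(value))

def strip_header_rows(rows):
    """Collect the leading rows whose last (closing balance) column is
    empty, then drop every row in the statement that matches one of them."""
    header = []
    for row in rows:
        if not is_null(row[-1]):
            break
        header.append(tuple(row))
    header_set = set(header)
    return [row for row in rows if tuple(row) not in header_set]
```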
In the image, we can see that a particular can span multiple lines yet belong to the same row. Tabula cannot tell whether multiple lines belong to the same row; it treats them as separate rows, and as a result we get the following output:
# First line read from HDFC statement
['22/06/17', 'IMPS-7-RAHUL-HDFC-XXXXXXXX', 'XXXX7', '22/06/17', nan, '1,000.00', '14,904.08']
# Second line read from HDFC statement
[nan, '8-XXXX', nan, nan, nan, nan, nan]
Aim: Convert the same particular from multiple rows into one.
Solution: The first line of every entry contains the particular, date, balance, transaction amount and cheque number. Only the particular can span multiple lines, so a multiline particular always lies between two date entries.
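A sketch of this merging step in plain Python (the column indices are illustrative assumptions):

```python
def merge_multiline_particulars(rows, date_col=0, particular_col=1):
    """Fold continuation rows (rows with no date) into the previous
    transaction's particular."""
    merged = []
    for row in rows:
        if row[date_col] is None and merged:
            # Continuation of the previous particular: glue the text on.
            merged[-1][particular_col] += row[particular_col]
        else:
            merged.append(list(row))
    return merged
```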
Credit, debit and default
As we saw in the columns of various bank statements, the differentiation between credit, debit and default is based on whether the entry is in the deposit column or the withdrawal column, or in some cases whether it is marked CR/DR.
Aim: Classify every transaction as credit, debit or default.
Solution: Classify all deposits as credits and withdrawals as debits. An event of default occurs when a withdrawal leads to a negative closing balance and is immediately followed by a deposit of the same amount.
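This rule can be sketched as follows (a simplified illustration using plain dicts; the real code operates on the parsed statement rows):

```python
def classify_transactions(txns):
    """Label each transaction as credit, debit or default.

    txns: list of dicts with 'deposit', 'withdrawal' and 'balance' keys
    (None where the column is empty)."""
    labels = []
    for i, txn in enumerate(txns):
        if txn["withdrawal"] is None:
            labels.append("credit")
            continue
        nxt = txns[i + 1] if i + 1 < len(txns) else None
        # A default: the withdrawal drives the balance negative and is
        # immediately reversed by a deposit of the same amount.
        if txn["balance"] < 0 and nxt and nxt["deposit"] == txn["withdrawal"]:
            labels.append("default")
        else:
            labels.append("debit")
    return labels
```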
To perform analysis on the bank transactions, we need to categorise each one. Categorising enables us to perform category-specific operations and answer questions such as "how much does the business spend on operations?" or "what are the different channels of earning?". A category can be ATM, Shopping, IMPS, NEFT, etc.
Aim: Categorise every transaction.
Solution: For every transaction, tokenise the particular and assign a category based on the occurrence and position of keywords.
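A minimal sketch of keyword-based categorisation (the keyword map here is hypothetical; the real keyword sets and their priorities are tuned per bank and aren't shown in the post):

```python
# Hypothetical keyword-to-category map.
CATEGORY_KEYWORDS = {
    "NEFT": "NEFT",
    "IMPS": "IMPS",
    "ATM": "ATM",
    "POS": "Shopping",
}

def categorise(particular):
    """Tokenise the particular on common separators and return the
    category of the first matching keyword."""
    tokens = particular.upper().replace("-", " ").replace("/", " ").split()
    for token in tokens:
        if token in CATEGORY_KEYWORDS:
            return CATEGORY_KEYWORDS[token]
    return "Others"
```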
Now that we have read, cleaned and categorised transactions from the bank statement, it’s time to generate some insights. After all, what’s data without information?
Cashflow analysis helps in
This analysis gives an overall view of the total number and amount of credits, debits and defaults in the bank statement. It also contains a categorical breakdown of cash and non-cash transactions.
This analysis is a month-wise breakdown of the overall analysis of the bank statement. It helps in calculating the growth of the business.
This analysis shows the total number and amount of defaults in the bank statement along with the details of every default.
To understand the spending behaviour of the user, we need to know the most common transactions, answering questions like "Are there multiple NEFT transactions to/from the same person or company?" or "Is he an IRCTC agent?". We used the Ratcliff-Obershelp algorithm to club together transactions with more than 85% similarity. For better results, we removed numbers and special characters from the strings.
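A sketch of this clubbing step: Python's difflib.SequenceMatcher implements the Ratcliff-Obershelp algorithm, so we can lean on it (the greedy grouping strategy shown is an illustrative simplification, not the exact production code):

```python
import re
from difflib import SequenceMatcher  # uses the Ratcliff-Obershelp algorithm

def normalise(particular):
    """Drop digits and special characters, as described above."""
    return re.sub(r"[^A-Za-z ]", "", particular).upper().strip()

def club_similar(particulars, threshold=0.85):
    """Greedily group particulars whose normalised forms are at least
    `threshold` similar to a group's first member."""
    groups = []
    for p in particulars:
        key = normalise(p)
        for group in groups:
            if SequenceMatcher(None, key, normalise(group[0])).ratio() >= threshold:
                group.append(p)
                break
        else:
            groups.append([p])
    return groups
```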
Note: Code snippets mentioned above are pseudocodes to demonstrate the idea and may not contain all the edge cases.
India is home to the largest underbanked population in the world: 25% of all underbanked people on the planet are in India. That alone says that the financial tools in the market aren't designed for the lower income group, despite all the push by the system over the last few decades. Existing financial institutions have always focused on serving those who can give banks enough liquidity by maintaining an average monthly balance; credit score comes after that.
With the increasing penetration of the internet, mobile is about to become the first truly universal device. Credit goes to the telecom industry, which has shown the world how to deliver services at high volume and low cost, overcoming geographical constraints along the way. Why can't banking services do the same? It's time we put the power in the hands of individuals instead of the system. Hence, we decided to build a mobile-first financial tool to empower them.
The predominant USSD interface is clumsy, text-heavy, hierarchical and a barrier to uptake. Smartphones open a whole new range of interface options that can leverage touch screens, images, graphics and sound. A well-designed interface can affect millions of customers in their day-to-day interactions with finance. Many market signs point to rising smartphone usage in the next 5-10 years. Smartphone interfaces could be a key to unlocking value for low-literate consumers, overcoming the communication barriers imposed by early-stage, feature-phone-based models.
While working with informal lenders in the low-income geographies where micro-entrepreneurs operate, we identified key principles of design that drive engagement with a financial tool.
It is not meant to be comprehensive, but is intended as a starting set of principles that will improve smartphone interfaces for basic mobile financial tools in low-income geographies.