How to leverage OCR and AI for financial data extraction?

Alexandre Abu-Jamra, Founder & CEO ● Jun 28th, 2024

The full transcript

Oleg

Hi everybody! Welcome to the Devico Breakfast Bar! Here we speak with different people involved in the business landscape, share their expertise, delve into the latest tech trends, and explore the ins and outs of IT outsourcing. I'm Oleg Sadikov, and today I'm excited to have Alexandre, a founder at Klooks. Don't forget to subscribe and hit the notification bell so you don't miss new episodes. Hi, Alexandre! Can you share with us your journey into the business world, from your early experience to your current role as a co-founder?

Alexandre

Hello, Oleg! Thanks for inviting me to have this talk with you guys here at Devico. Well, my journey in the business world. I started as an intern in M&A advisory, and then I became an associate. The M&A advisory business is where I saw the opportunity that we are pursuing nowadays at Klooks. We worked on mergers and acquisitions. So, we used to help companies to be sold to other companies. But to sell a company, we needed to find suitable buyers. What is that? We needed to know companies that would have the capabilities to buy other companies. And to know that, we pretty much needed financial information from private companies. The information we had from the stock exchange and from public companies wasn't enough. So, in M&A advisory we had a lot of people working to gather financial information on private companies, which was data hidden in some hard-to-find government sources. And that's where I started. After I quit that job, I started as a CFO in my family's business, which is a rope manufacturing company. But I kept in my mind the problem I faced when I was working in M&A: finding private financial data in Brazil. While I was working as a CFO, I started talking to people and trying to find people to help me solve that problem. And I found two partners who helped me build our MVP and start a business at that point. While I was CFO, for like five years, I think, Klooks was my part-time job, and my full-time job was to be a CFO. And in 2017, I quit my job and went full-time with Klooks. So, I think that's pretty much my journey until I joined Klooks full-time.

Oleg

Okay, then the logical question is: what about Klooks? What is it? What is its mission? And what inspired its creation? If I understand you correctly, that problem was the origin of Klooks?

Alexandre

Yeah, that's right. So, what is Klooks? I think it's a good idea to give you an overview of what we do. We specialize in gathering financial statements of private companies from the web. So, we have a lot of robots and crawlers going over the web and finding PDF files that are financial statements. What is a financial statement? It's a file that has information about the company: how much in assets the company has, how much profit the company made last year, how much in expenses the company had, what the company has in real estate, in inventories, in debts. So, that's a financial statement. It's usually an unstandardized document in PDF format. So, it's kind of a nightmare to find it, and worse still to take the data out of it. Because extracting data from PDFs, I think, is worse than rocket science. It's really a nightmare. So, we have crawlers specialized in finding financial statements. They are trained to find PDFs that look like financial statements. Once they find one, they just grab it and send it to our database. And once it is in our database, we start extracting the data. We have a first step, which is an automatic way to extract the data. We use OCR and a lot of AI to try to make the job easier. But nowadays there is no way to extract financial data from PDFs 100% automatically with the quality that our clients in the financial market need. So, we need to have a layer of human labor in the middle for quality assurance. So, that's pretty much what we do. We find financial statements, we extract data, and then we sell the data to a lot of people. We sell the data to Capital IQ, to Moody's Analytics, to the London Stock Exchange. So, for financial data from private companies in Brazil, we turned into a hub for the world, for the international financial platforms. Bloomberg has our data as well. So, that's pretty much what we do. And our mission, that was the original question, right?
Well, when I started the business, this data was not public to anyone. I mean, it was public, but it wasn't available to anyone. So, any international investor that was looking into Brazil didn't have it. That was a problem for external investment in the country, right? So, our mission and our vision at that point, our purpose, was to make Brazil more attractive to international investors, financially and in terms of data, so we could boost investment in the country. That was a big dream at the start. Nowadays we are starting to work with data from other countries as well. And we got so good at extracting data from PDFs that nowadays we are also offering this as a service to banks and insurance companies. So most of the Brazilian banks nowadays hire us to turn financial PDFs into data to feed their credit models.
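The two-step process described above (an automatic OCR/AI pass followed by a human quality-assurance layer) can be illustrated with a minimal sketch. Nothing here is Klooks' actual system: the `ExtractedField` type, the `confidence` score, and the `0.98` threshold are all assumptions made up for illustration. The idea is only that fields the automatic step is unsure about are routed to a person instead of being published.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str          # hypothetical line-item name, e.g. "total_assets"
    value: float       # number read from the PDF by the automatic step
    confidence: float  # hypothetical 0..1 score reported by that step


def route_for_review(fields, threshold=0.98):
    """Split fields into (auto-accepted, needs-human-review) lists."""
    accepted, review = [], []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field)
        else:
            review.append(field)
    return accepted, review


# Illustrative values only, not real data.
fields = [
    ExtractedField("total_assets", 1500.0, 0.99),
    ExtractedField("inventories", 320.0, 0.91),
]
accepted, review = route_for_review(fields)
```

In a setup like this, the threshold would be the knob that trades human workload against the risk of publishing a wrong number.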

Oleg

Why don't you create this as a SaaS program?

Alexandre

We do have a SaaS. We have two products, right? One is what we call our DaaS, which is data as a service. We just send data in large volumes to distributors, to value-added resellers like Bloomberg, or Capital IQ, or Moody's. We have the service to turn financial PDFs into data. And also, we have a SaaS where you can access our data and see industry analysis and access information on Brazilian companies and stuff. So, we do have the three business models. Our revenue breakdown would be like 60% data as a service to large platforms; 30 to 35% our spreading service to banks; and 5 to 10% would be our SaaS. So, we didn't manage to scale our SaaS. We had to look at lower-hanging fruit and invest in that.

Oleg

What were some of the challenges you faced in bootstrapping Klooks, and how did you overcome them?

Alexandre

I think that the main challenge was at the start. It always is, right? The larger risks are the risks at the start, when you don't know much about the business and the market, and you make a lot of assumptions that are usually wrong. So, our wrong assumption at the start was, ‘Well, let's build the Brazilian Capital IQ!’ I don't know if your audience is familiar with Capital IQ, but Capital IQ is like a huge platform owned by Standard & Poor's, with a lot of features and capabilities. But essentially, what they have is financial data from private companies. And they didn't have that for Brazilian companies. So, our view at the start was, ‘Let's get all this data and make it like a big Capital IQ.’ And we did gather the data. We were successful in the first step. But on making the SaaS, and the UX, and the marketing, and all that stuff to attract our end users, like consulting companies, M&A advisors, and so on, we were unsuccessful. We built our bots, we had the data, and we built the UX and the SaaS platform. But that was a disaster. After one year, we had like six clients, and that would never pay the bills. The challenge there was to start thinking outside the box. All right, what can we do to pay the bills? Because that's not going to last forever. Our strategy at that point was to try to find a value-added reseller, someone that would buy our data in volume, that would pay our bills, and that would be able to make our business flourish. But that wasn't in the SaaS market. It was someone that would buy our raw data, not the UX, not the features, not the analysis, etc. They would buy the raw data. That was our view at that point. All right, SaaS didn't work out, and users are not buying from us. Let's find someone to buy from us in volume. And well, we didn't have many options. So, the first one I called was Capital IQ, which was obvious. That was the tool that I used when I was an M&A advisor.
So that was like my top-of-mind. So, I called them, like a cold call, and just asked to talk with the Brazilian director, and to my surprise, he took the call and listened to me. My pitch was, ‘Hey, I've got financial data from private Brazilian companies. Are you interested?’ And his answer was, ‘Yes. I'm very much interested. That would save my life. I’m having a lot of trouble selling our products because I don't have that data. So, yes, I want it, but that's not in our roadmap right now.’ Headquarters was looking more into Asian data at that point. So, he said, ‘I will call you in six months, and you can wait for my call. I want your data. I'm going to fight internally here to buy that.’ Well, I thought he was never going to call me again. And then six months passed, and he called me. ‘Hey! This is Pedro. Do you remember me? From Capital IQ? Well, I think this is a good time to talk about the matter we spoke about.’ So, then we sold our database to Capital IQ, and that was our first big client. Capital IQ was a billion-dollar, US-based company, and their product is financial data from around the world. And then, after Capital IQ, we kept the strategy of trying to sell to the ‘sharks’ and not to the small fish. And that created a business. I mean, it's not a big business, at least not yet. But it's a business, at least. And that strategy was what managed to keep us alive. We were talking about challenges, right? This challenge of what to do when we weren't selling was the main challenge.

Oleg

Thanks for this story. Everyone loves stories. Are there any emerging technologies or methodologies that you believe will have a significant impact on marketing and other platforms in the near future? And how is Klooks prepared for these changes?

Alexandre

Oh, sure. I think that the main one is quite obvious, right? Large language models are going to revolutionize the way we understand data. In our market nowadays, people use the data we provide, and the data that Capital IQ provides, and Bloomberg provides, to make decisions, right? And decisions usually are made out of a lot of study and report building. People make reports, they give them to their director, the director reads them, and then they make a decision. Well, large language models are going to revolutionize pretty much all of this process. First of all, reports don't need to be written by analysts anymore. It can be done by a model. So, that's the first. I haven't seen anyone doing that quite well until now. We are investigating it. Our IT team is already doing the prompt engineering to create standard reports. Our next step will be to support decisions from a credit point of view, from an investment point of view, from a corporate point of view. So, we're going to work on that a lot. Large language models are quite obvious. And the second one, I think, is the evolution of OCRs, because, as I mentioned, nowadays we get financial PDFs and we pass an OCR over them. OCR is a technology that reads characters in images and in PDFs to extract the text and turn it into machine-readable text. And OCRs are quite troublesome at this moment. They don't work very well. And I think that as OCR evolves, and as OCR is able to interconnect meanings inside the text, it will get better and better, and that might make a lot of difference in our business and in intelligence platforms.

Oleg

How does Apple extract data from PDF images?

Alexandre

You can extract data…

Oleg

They extract it in a pretty detailed and pretty comprehensive view.

Alexandre

Yeah, the OCRs do work for less complex data. Like, you can have a text, and you extract it. It works well unless the resolution is too bad. Or if there's a coffee mark on the paper, all right, that will not work. But in general, it works, right? The thing is that when you talk about financial statements, there are tables. And, I know it sounds stupid, but extracting tables correctly is a huge challenge. Because the OCRs usually confuse lines. And when you are analyzing a financial statement, you can't have the values for the debt in the line for accounts payable. That would make a credit analysis completely wrong. It has to be totally perfect. That's problem one: lines, they confuse lines. And problem two: they usually confuse numbers. So, the OCR might look at a nine and think it is an eight. And in credit analysis, you can't make that mistake. So, those tiny mistakes are what make the current OCR technology not enough for financial analysis. It's like it's 99% good. But, you know, like water when it is 99 degrees Celsius, it’s hot water, right? When it gets to 100, it boils, and the vapor makes the train work. So, it's like that one degree that we are missing on OCR to make it usable for credit analysis and investment analysis. But right now, it's still not ready for that.
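One common way to catch the misread digits mentioned above is to test the extracted numbers against accounting identities, since a balance sheet must satisfy total assets = total liabilities + equity. The sketch below is only an illustration under assumptions: the field names, the sample figures, and the `tolerance` value are hypothetical, and it is not a description of how Klooks validates data.

```python
def check_balance_sheet(rows, tolerance=0.5):
    """Return a list of problems found in extracted balance-sheet rows.

    rows maps hypothetical line-item names to the values read by OCR.
    """
    problems = []

    # Identity 1: total assets must equal total liabilities plus equity.
    assets = rows["total_assets"]
    funding = rows["total_liabilities"] + rows["equity"]
    if abs(assets - funding) > tolerance:
        problems.append("total assets do not match liabilities plus equity")

    # Identity 2: the itemized liabilities must add up to their total.
    itemized = rows["accounts_payable"] + rows["debt"]
    if abs(itemized - rows["total_liabilities"]) > tolerance:
        problems.append("liability items do not add up to total liabilities")

    return problems


# Made-up figures: a consistent statement, and the same statement with
# accounts payable misread as 180 instead of 190 (a nine read as an eight).
good = {"total_assets": 1000.0, "total_liabilities": 600.0, "equity": 400.0,
        "accounts_payable": 190.0, "debt": 410.0}
misread = dict(good, accounts_payable=180.0)

good_problems = check_balance_sheet(good)
misread_problems = check_balance_sheet(misread)
```

Note the limit of this kind of check: if the OCR swaps the values of debt and accounts payable (the line confusion described above), the sum is unchanged and both identities still hold, so the error passes unnoticed. That is one reason a human review layer can still be needed.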

Oleg

Okay, got it. We cannot be half-pregnant. As someone who has navigated the startup landscape, what advice would you offer to aspiring entrepreneurs working to make their mark in the industry?

Alexandre

I like it a lot. I think what I would say is quite cliché, but I like the philosophy of selling something you don't have, but that you know you can make, before making it. So, it's that famous ‘fake it till you make it’ sentence, right? But it doesn't need to be fake. You have to have confidence that you can deliver it. So, I would give this advice. When you are starting a business, you're very inclined to create prototypes for clients and try things out without charging, and to build stuff, which takes a lot of time and effort, before being hired. So, I think that the advice that I would give is to do it the other way around. Before you start doing the work and put your hands in the dirt, sell it, get the contract done. And it's not as easy as selling something you already have, all right, it's not. But you can always sell a promise like, ‘All right, I will make this for you. And once I deliver this, you will pay me that. You don't need to pay me before I start doing it. You can pay me after I do. But we need to have agreed that once I do, you will pay me.’ So, I think that can save people a lot of time.

Oleg

Can you share a funny moment or experience that shaped your outlook on entrepreneurship?

Alexandre

The first moment, the first challenge, was to pivot our channel, to give up – we’ve never totally given up – but to kind of give up on SaaS, and turn that into DaaS, data as a service. I think that moment was very special. And then the moment when I quit my job as a CFO and went to Klooks full-time. That had a lot of personal stuff involved because that was my family's company. My dad was the CFO and CEO. And well, family businesses are not easy. So, when I decided to give up on my family business, it was a very important moment.

Oleg

Are there any professionals or leaders in your network who inspire you in your professional journey?

Alexandre

Yes, actually, in my network I've got a lot of entrepreneurs and professionals that created interesting businesses. In 2008, some high school friends and I founded an association here in our state to help entrepreneurs. That association has trained more than 400 entrepreneurs so far. And a lot of those are quite successful. I get a lot of inspiration from those guys. There's a huge M&A advisor who was a founder of the association with me. There are some tech entrepreneurs that are quite big right now who founded this association as well. There are a lot of people that I admire there. I don't know if there is any sense in saying names, because you definitely won't know them. They are local entrepreneurs from Brazil. But I can mention a few, if you like.

Oleg

You can name a few names.

Alexandre

All right. Joao, this is John in Portuguese. John Spingular. He built a business on gift cards, and pretty much no one did that in Brazil when he started. And he sold his business to Income, which is an American company in that field. Gustavo Reyes, he built an e-commerce company. He helps companies to run their own e-commerce. So, it's kind of a Shopify from Brazil, let's say, something like that. He sold his business to a large Brazilian corporation. Gustavo Reis, he was partnered with Deniz Osorio. Angel Umoratori, he built one of the largest financial advisory firms in the states. Those are the ones I remember. Colls Refil created a large data business. So, it's just names in the wind, right? But they inspire a lot.

Oleg

Based in Brazil, do you notice any unique challenges or opportunities in the tech industry in your region compared to the other parts of the world?

Alexandre

Yeah, the challenge and opportunity in Brazil is that data is not accessible. So, the guys that make the data accessible, they have an opportunity. So, that's what we do. Financial data on private companies is not easy to find here. And what we do is we find it, and then we sell it in an easier manner. Other countries, like Portugal, or Spain, or even Colombia, are much more public on financial data of companies, so there are smaller challenges and smaller opportunities as well.

Oleg

Could you comment on the challenges associated with the shortage of qualified specialists in the IT sector, particularly in relation to your business?

Alexandre

Oh, there is an interesting story there. There's a big shortage of qualified professionals. Some of our products… We had devs, but at the start they were working on our SaaS and our DaaS, getting the data out and sending it to people. And we had banks knocking on our door and saying, ‘Hey, can you spread some financial statements?’ We weren't prepared to do that for banks in a customized manner. We had our standard process. We didn't want to put something out of the blue inside the process. And we didn't have devs for that at the moment. We were a bootstrapped startup needing an MVP, with clients asking for data, and we didn't have people to do that. So, we started prototyping in Google Sheets. For the first clients we sold to, the whole process would run in Google Sheets, and we would send the data to them in whatever manner they wanted. It could be a JSON file, by FTP, or by API, whatever. But the extraction process would be done in Google Sheets. And afterwards we created our system, once we started selling that in volume to other banks. But at the start, we found a way to deal with the shortage of devs by using these spreadsheets. And since we deal with data, nowadays you can prototype a lot of things with sheets and BI tools like Power BI, or whatever. So, data prototyping is quite easy nowadays. When we started, it wasn't that easy. That's how we deal with the shortage.

Oleg

I know that you have an in-house development team. What factors have led you to abstain from considering IT outsourcing so far? And do you foresee any circumstances in which that decision might change in the future?

Alexandre

We are a tech company, right? So, it is quite core for us to have technology and to have the capabilities to build technology. So, it's kind of a strategic thing to have the devs inside. We would be open to outsourcing some parts of our system that are less core. Say I need to make a crawler for Romania. I don't know Romanian financial data. So, we can scope that and outsource it. We could do that. But nowadays, our system is still not prepared for it. We could outsource, but it would be a big pain to integrate into our system because it’s not prepared to take scripts and developments from outside and to ingest data from other providers. We are preparing for that. So, probably next year we will be able to outsource more, some parts of our system that are not so core.

Oleg

Maintaining an in-house development team can be challenging, definitely. What strategies have you employed to effectively manage and nurture its internal development talent?

Alexandre

The team is not that big. It's like seven people in our IT team. So, our CTO is pretty close to them, and he tries to understand their needs, how they're feeling, their purpose in life, what makes Klooks an interesting place for them. And he tries to run some interesting initiatives in parallel to the business. I think that in the end, it helps to keep people. I lead an NGO that teaches financial education in public schools using ChatGPT. So, we built a method to basically help the students to learn by themselves, studying with ChatGPT. It's not rocket science. It's just interacting with ChatGPT to learn something. Having that contact with students makes a lot of difference for the students, because they learn a lot, and that's something that they need to know for their future, and for our employees as well. They love the initiative, and some of them are also professors and teachers in the initiative. So, that's something that happened organically, but in the end, it helped our HR.

Oleg

Reflecting on your business operations, are there specific tasks or projects that you believe could potentially benefit from outsourcing? And what criteria would determine their suitability for external collaboration?

Alexandre

Once we are technologically prepared to outsource, when our system is not so monolithic and is more flexible, I think that there are a lot of opportunities to outsource, as I mentioned, data from other countries. We have a lot of countries that are mapped. Let’s say India, for instance. India has a lot of data. I think I definitely could outsource the bots to get the financial statements there. Or Portugal and Spain, as I mentioned before. Those we don't have yet, and we have clients that want that. And we have the data, and it's not in our short-term roadmap. So, we could grow quicker if we outsourced these lower-priority tasks.

Oleg

India is a pretty big market. Why is it not a priority for you?

Alexandre

Yes, it's a huge market. But the market is kind of red, right? There are some Indian companies doing that. And also, all the data companies have established themselves in Bangalore, or Mumbai, or other cities in India, because they are very good at data crawling and data extraction.

Oleg

As we conclude our discussion, drawing from your own perspective, what advice would you give to aspiring entrepreneurs or individuals looking to outsource their technical needs?

Alexandre

I am not really a good guy to talk about outsourcing because I've never outsourced much stuff. But I think it can be a really good boost to growth. But you have to be prepared for it, because you need to design the architecture of the systems, and you need to ingest the data or integrate the features afterwards. So, you need to be very much prepared. Even when you get it done, there will be a lot of trouble on your side as well.

Oleg

Alexandre, thanks for your time, joining me today. It was a nice conversation. Thanks for sharing the stories, the valuable insights from the data industry. I do believe that you still have great potential in your business. I wish you to move faster and to acquire more and more new opportunities.

Alexandre

Great, Oleg. Thank you! Thank you for the opportunity and talking to you and to your audience. It was a great pleasure being here.

Oleg

If you enjoyed our discussion and want to stay updated on future episodes, don't forget to subscribe and hit the notification bell. This way, you will not miss out on the latest insights and conversations from the Devico Breakfast Bar. See you in a week!
