Reverse Engineering a Credit Card Processing API
This isn't some multi-million dollar heist or massive hacking attempt; this is a real story about me working to update an internal API used to charge credit cards. The previous processor was shutting down on a specified date and alerted us that we'd need to find a new processor. Easy enough: there are plenty of processors out there, so we compared our options and found one. Now all I have to do is create a drop-in replacement to avoid needing to push updates to the applications using the legacy API.
However, between no real documentation on the existing system and nearly a decade of changes since the original implementation, what was budgeted as a two or three week project ballooned into six weeks' worth of work: late nights and early mornings of poking and prodding existing systems to see how they worked together. Once again, this experience demonstrated the accuracy of a certain project estimation rule.
I'm only going to focus on the development side of things, but the moral of the story is that when you have a hard deadline on an important project, that project needs to be moved to priority #1. If you don't have everything needed to complete it, you should still do all you can with what you do have; that way, once you get the remaining pieces, you can continue from where you left off.
What I Knew Starting Out
- Two applications reference this single API to process credit card transactions
- From working on updates in one of the applications over the years, I had mapped out API endpoints and their usage
- The new processor has an SDK and documentation
- There was a SQL Server database that stored tokenized credit card numbers, and a way to export those cards for adding to the new system
- This would run on Windows Server and had to be written in C#
The Plan
Naively:
- Assess each endpoint for expected inputs and outputs
- Match current endpoint functions to the new processor SDK
- Test endpoints
- All good, push to production!
The Beginning
I didn't have access to our actual account yet, but a sandbox was available, which I took advantage of. Thanks to previous work with C#, I had an environment set up, though by no means was I that experienced with the ecosystem. That saved around 36 hours; however, there was still a lot to learn in terms of the setup, project structure, and how to convert examples from the SDK into an ASP.NET Core Web API project. The processor offered a quick "Hello World" equivalent class to test the system, which I tried to use to no avail. I couldn't figure out how to make it work with my ASP.NET Core Web API.
For the time being, I needed to start somewhere, so I created a new console application without any templating, threw the code from the class into it, and tried to run it. Once again, I got compiler errors. This was particularly frustrating because I was using exactly the code that was provided to me, yet it wasn't working.
I know I don't confidently know C#, but a "hello world" copy and paste example should work!
I threw the errors I was getting and the name of the processor into a search box and hoped to find something. The first blue link felt promising: it led to an enterprise-y discussion forum where someone had this exact same issue! Better yet, there was a solution; I just needed to...add some other random code for some reason. Of course, why didn't I think of that?
Later on I'd discover that the code examples in the actual API documentation were up to date; however, this one introduction page existed outside of it and referenced an older version of the SDK. That would be forgivable if it weren't the very first impression of working with the processor. Spirits were dampened a bit, but at least now I knew the change examples might need in order to work.
This was great and all, but it was still a console application and not an API, which is what I needed. Thankfully, ChatGPT helped significantly when I asked how I would convert the class to be used in the API. The model responded with some good starting points. I followed up with a question asking how I'd add that to a project, and ChatGPT provided an example structure. The structure seemed to match other C# projects I've seen, and over time I did begin to better understand where pieces should go. Granted, I still have opinions about the opinionated structure, but at least I can work with it now.
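For reference, the suggested layout looked roughly like a standard ASP.NET Core Web API arrangement, something along these lines (project and folder names here are illustrative, not the actual project):
MyPaymentApi/
  Controllers/   // one controller per group of endpoints
  Models/        // request and response classes, including serialization attributes
  Services/      // wrappers around the processor SDK
  Helpers/       // odds and ends, like the request-logging middleware that shows up later
  Program.cs     // startup, dependency injection, and middleware registration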
At this point I could take examples from the processor SDK and nearly copy/paste them into endpoints to test. If only it remained that easy.
Did You Know It Could Do That?
Every time I hit endpoints in the legacy API, XML was returned, so I made sure to match that in the new API. The Model-View-Controller concept did work well here, as I defined the models I needed and was then able to add XML-specific attributes and instructions for how each should be serialized. Since I worked on one of the two applications that uses the legacy API, I was able to add logging to see how the requests and responses were structured. I fired off a request from the application and...what??
It's JSON. Why is it JSON?
I looked all through the code I had; sure enough, it seemed to be expecting a JSON response. But then why was it XML when I visited the endpoint in the browser? How could the same endpoint yield two different formats for the same request?
Content-Type Can Dictate The Response
It turns out servers can be configured to return different response formats based on headers in the request, such as Content-Type. I had always thought those headers were more for compatibility, not a way to change the response from the server. The browser was asking for application/xml, while the application wasn't requesting any particular format. The legacy API was configured to return XML or JSON based on the request.
So great, now I need to support two formats for everything. I tried several different options, then settled on an if statement in every controller that detected the content type and returned serialized XML or JSON. Time to try the request again, and yay, it works now! After several days of work I had a functional example that could be called by a request to an API to trigger an action at the credit card processor. All that's left is to match the endpoints, generate some models, and I should be good to go, right?
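The real models and routes aren't shown here, but the shape of it was something like this sketch, with hypothetical names: a model decorated with XML serialization attributes, and a controller action whose if statement keys off the incoming Content-Type to return either XML or JSON.
using System.IO;
using System.Xml.Serialization;
using Microsoft.AspNetCore.Mvc;

[XmlRoot("Transaction")]
public class TransactionResponse
{
    [XmlElement("Id")]
    public string Id { get; set; } = "";

    [XmlElement("Amount")]
    public decimal Amount { get; set; }
}

[ApiController]
[Route("api/[controller]")]
public class TransactionController : ControllerBase
{
    [HttpGet("{id}")]
    public IActionResult Get(string id)
    {
        var result = new TransactionResponse { Id = id, Amount = 0m };

        // Mimic the legacy behavior: XML when the caller asks for it, JSON otherwise
        if (Request.ContentType?.Contains("xml") == true)
        {
            var serializer = new XmlSerializer(typeof(TransactionResponse));
            using var writer = new StringWriter();
            serializer.Serialize(writer, result);
            return Content(writer.ToString(), "application/xml");
        }

        return new JsonResult(result);
    }
}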
Wrong
A big part of this project was importing all credit card information from the legacy system into the new system. I needed to create an ongoing process that would use the tokens from the SQL database to retrieve credit card information, then, entirely in memory to avoid saving anything locally, pass it directly to the new processor. This part was fine and straightforward, but when I got into the SQL database myself I noticed there were a few other tables:
- CreditCards
- Orders
- Transactions
- Emails
Why were these other tables here? Maybe they're extremely legacy, since I know the one application I've worked on has its own order system. Let's check:
SELECT * FROM Orders
ORDER BY Id DESC
Oh no.
These tables all had recent entries! Was the other application using them as a database for all of its order information? Was this all actually legacy, but still populated just because? I thought everything was stored by the processor, mainly because, when looking at the new processor, I found I could store pretty much all of these items there, including custom user-defined fields[1]!
There were two major problems:
- The legacy API and the applications, to differing extents, used a mixture of information pulled directly from the legacy API and from the database
- Despite the new processor's API reference saying I could use either a customer id or a payment profile id to retrieve information, the SDK would not allow me to pass only a payment profile id, so I needed to map that relationship locally to match the existing design of the legacy API
I tried several different designs to avoid using a database, like those user-defined fields. Unfortunately, the fields are not stored and associated with an order; they only pass information through from one part to another. That was a real downer, since it was a main reason for even choosing this processor. Maybe I could get clever[2] and use some of the other fields I wouldn't need to store this information. Well, other limits quickly came up and I had to make a new database.
Data All Day
It was time to create a new database and tables for the information I needed. However, I was able to store significantly more information within the processor than before while still allowing the endpoints to return the same effective data, which made my database organization simpler. Also, I knew how data were used in each application[3], so I could make a few changes for simplicity without detriment. There were a couple of benefits here:
- Tokenized credit cards and credit card information no longer needed to be stored on my side, so this database could be limited to extraneous order and transaction information and didn't require the same kind of lockdown the legacy system had
- New fields captured by the processor as part of each transaction eliminated the need for about half of the information previously stored in SQL
I made a new database, created the stored procedures, and wired it all up, and it worked. Good! From this point on, for real this time, I'd only need to match the endpoints, generate some models, and I should be good to go, right?
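The data access itself was plain ADO.NET against SQL Server. A minimal sketch of the idea, using a hypothetical stored procedure and parameter names rather than the real schema:
using System.Data;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient; // assumes the Microsoft.Data.SqlClient package

public class OrderStore
{
    private readonly string _connectionString;

    public OrderStore(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Saves the extraneous order info the processor doesn't hold, including the
    // local mapping between customer id and payment profile id
    public async Task SaveOrderAsync(string orderId, string customerId, string paymentProfileId)
    {
        using var connection = new SqlConnection(_connectionString);
        using var command = new SqlCommand("dbo.SaveOrder", connection)
        {
            CommandType = CommandType.StoredProcedure
        };

        command.Parameters.AddWithValue("@OrderId", orderId);
        command.Parameters.AddWithValue("@CustomerId", customerId);
        command.Parameters.AddWithValue("@PaymentProfileId", paymentProfileId);

        await connection.OpenAsync();
        await command.ExecuteNonQueryAsync();
    }
}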
Still Wrong
The validation was based on the same assumptions[4] as development:
- There is one way the API is used, as it's a standard API used across multiple applications
- Implementation would be similar between applications
After two weeks of rapid work, importing a bunch of customer information, and several very late nights and early days (wake up at 4am and magically your work day can reach 12 hours by just 4pm!), I was ready to have our validation engineer test the code to make sure it worked. I set up the new API to be used by a validation version of one of our applications and she went to test it out. It passed! It validated! Yay!
That was on Friday. On Saturday I woke up with a nagging feeling that I really should double-check the other application, having just received information on where its source code lived, to make sure it worked the same.
Why are their test functions all sending JSON?
Every request up until now had been a form-encoded request; this was JSON! Why?? In the end, this was a bit of a red herring: by creating a middleware function for my C# API[5] to output all information about incoming requests, I found that the other application didn't actually use JSON, though sometimes it would send information as a form and sometimes through the body of a request. Anyway, this led to learning that in C#, when handling requests, you can also set up attributes indicating what content each handler should consume based on the request method. It took a while to get there, but the end implementation was simple enough.
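The attributes in question are ASP.NET Core's [Consumes], [FromForm], and [FromBody]. A small sketch with hypothetical endpoint and model names:
using Microsoft.AspNetCore.Mvc;

public class ProfileRequest
{
    public string Email { get; set; } = "";
}

[ApiController]
[Route("api/[controller]")]
public class ProfileController : ControllerBase
{
    // Accepts classic form posts (application/x-www-form-urlencoded)
    [HttpPost("form")]
    [Consumes("application/x-www-form-urlencoded")]
    public IActionResult FromFormPost([FromForm] ProfileRequest request)
        => Ok(request.Email);

    // Accepts a JSON body instead
    [HttpPost("json")]
    [Consumes("application/json")]
    public IActionResult FromJsonPost([FromBody] ProfileRequest request)
        => Ok(request.Email);
}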
While trying to navigate all of these issues, there was also the problem that every single test I ran, for some reason, emailed our production sales inbox. I had reached the "sshing into random servers and making edits with vim" stage trying to fix things, since I didn't want to spam our sales team. I found where the email was configured to be sent to them, changed it to my own address, and it did nothing. I reached out to the team who created this particular application for assistance; they updated the exact same line and it still didn't work. I never figured out what the issue was there, but I knew I had to limit tests until I was as confident as I could be when working with the application, so as not to flood their inbox.
I continued a run-through of the other application to make sure the new API worked. Then a new request showed up:
https://server.com/api/ProfileFromEmail
That's not an endpoint I have made. That's not an endpoint we tested. That's not an endpoint I even knew existed!
Two-Faced Servers
Each application interacted with the same API in a unique way. There was some overlap, but a good chunk was distinct. Using the middleware, I decided to go through all the different pages in each application and see what requests were sent to the API.
The middleware looked like this. It was triggered on every single request, making it an excellent catch-all for seeing what was trying to talk to the server and what it was saying:
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public class RequestLoggingMiddleware
{
    private readonly RequestDelegate _next;

    public RequestLoggingMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        Console.WriteLine($"Handling request: {context.Request.Method} {context.Request.Path}");

        // Log query string parameters
        foreach (var param in context.Request.Query)
        {
            Console.WriteLine($"Query Parameter: {param.Key} = {param.Value}");
        }

        // Log request headers
        foreach (var header in context.Request.Headers)
        {
            Console.WriteLine($"Header: {header.Key} = {header.Value}");
        }

        // Log the request body for POSTs, rewinding the stream so later handlers can still read it
        if (context.Request.Method == HttpMethods.Post)
        {
            context.Request.EnableBuffering();
            var body = await new StreamReader(context.Request.Body).ReadToEndAsync();
            context.Request.Body.Position = 0;
            Console.WriteLine($"Body: {body}");
        }

        await _next(context);
    }
}
Then a quick app.UseMiddleware<RequestLoggingMiddleware>(); in Program.cs made it all happen.
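For context, that registration in a minimal Program.cs looks roughly like this (an illustrative sketch rather than the project's actual startup code):
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();

var app = builder.Build();

// Run the logging middleware before anything else so every request is captured
app.UseMiddleware<RequestLoggingMiddleware>();

app.MapControllers();
app.Run();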
Over time I built a better picture of how each application intended to use the API. Now I knew:
- The list of endpoints each application used
- Requests each application made to those endpoints
- When and how endpoints were called
What I didn't have yet was the responses the applications wanted. I could have gone through the source code for each application along with what information I had for the legacy API, tried to find where they were parsing information, and then rebuilt new models from that.
However, that felt like it would take a long time. Instead, I've still got a validation version of the legacy API and I have captured real requests made to the server. I'll curl some requests at it and see what it returns!
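I used curl, but for illustration the same replay could be done from C# with HttpClient; a rough sketch with a placeholder URL and form fields:
using System;
using System.Collections.Generic;
using System.Net.Http;

// Replay a captured form-encoded request against the validation copy of the
// legacy API and dump whatever comes back. The URL and fields are placeholders.
var client = new HttpClient();
var form = new FormUrlEncodedContent(new Dictionary<string, string>
{
    ["OrderId"] = "12345",
    ["Amount"] = "10.00"
});

var response = await client.PostAsync("https://validation.example.com/api/SomeEndpoint", form);
Console.WriteLine(await response.Content.ReadAsStringAsync());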
And with that, I finally had all the pieces, so surely now all that's left is to match the endpoints, generate some models, and I should be good to go, right?
Wrong Wrong Wrong
Once again, not quite. I got a lot further this time, nearly matching everything, and then found out that at the last step of finalizing an order, one application used information from the previous response to build another request. My captured request for the last step was missing a significant chunk of incoming data, because that step required a correct request and response from the previous step, which the application would then reference to make the final request. This required reworking the models, the requests, and the final responses so everything would flow gracefully. Without this reworking, the application could charge a credit card, but the system wouldn't acknowledge that an order had been made.
I did all the work, fixed it up, and boom, it all seemed to work. For real this time! The new API populates correctly, I can save and retrieve information, and most importantly, cards are successfully charged.
What We Have Learned
Despite not having documentation, each application interacting with the legacy API in distinct ways, and the odd duplication of information between SQL and each system, the existing API had worked solidly for nearly 10 years. In my time, only a single, minor update was ever required; it was so minor I can't remember what it was, but I do remember having to log into a special server, compile everything into a DLL, then add that to Windows Server, which felt weird. The existing system itself worked, and seemed to work well, but without documentation the entire project was lost to time and became a black box.
For new development, starting earlier would have massively helped. The original timeline had room for improvement:
- Early August: Notified we would need to make a change in processor by the end of the year
- End of August: I provided my recommendations for a couple candidate processors
- Middle of September: I strongly suggested the single processor that we ended up going with
- End of October: Discussions started with the processor
- Middle of December: Official access to our account, leaving a couple weeks to complete the changeover
Hindsight is 20/20 and shoulda/woulda/coulda is time's endless taunt. A better timeline would have been:
- End of October: After initial discussions and opting to move forward, start developing based on API and Sandbox access, even without an account
- November-December: Build as much as possible without an official account
- Middle of December: With an official account, import information over from existing system, change from dev to production environment, fix any lingering bugs
There would have been a little more risk in developing for a system we weren't 100% sure we would use; however, by the end of October I believe we were at least 90% sure we'd go with them, and in the best case this would have given us an extra month and a half. In the worst case, I would have discovered everything I did about the legacy API and each application back then, and been able to anticipate and handle it more smoothly, even if the actual API I had developed would have needed to change.
Prioritization and Planning Are Worthwhile
Detailed planning is critical to adhering to timelines and managing stress, and prioritization can lead you astray if it isn't considered carefully. I was balancing three projects: one huge one that we're all ready to be done with, one that would help in the near term, and one that had an absolute hard deadline. The hard deadline should have come first, followed by the near-term work; then, once those more immediate concerns were done, I could head back to the big project to finish it up. It's very much worth spending the time planning, even if you aren't completely sure how something may work out, in order to keep stress down, morale up, and projects running smoothly.
Footnotes 🐾
[1] I was a fool. Had I completely read the documentation, I would have known what we find out shortly: user-definable fields are a quick passthrough and nothing more. I suppose it was too much to expect the processor to inadvertently also be a customizable database.
[2] If you are trying to be clever, 99% of the time you are doing it incorrectly, and 1% of the time you're actually doing something worth the logistical overhead.
[3] I didn't really, as I would find out later, but I got lucky in that I knew enough to make decent guesses here.
[4] Yes, I know what assuming does, and yes, it did do just that.
[5] This is hands down one of my favorite features of making an API in C#. It made it so easy to see exactly what requests looked like so I could ensure my application could handle them. All it required was a helper class (placed in the Helpers folder, of course) and a line in Program.cs activating the middleware.