Advanced Security Testing in SIBANGKU: Finding CSV Formula Injection Before It Reached Users


Unit tests are good at checking whether internal logic behaves correctly. But not every risk in a web application is visible from ordinary unit tests.

In the SIBANGKU project, I learned this through the CSV upload feature. The feature worked functionally: admins could upload schedule data, the system parsed it, and the records were stored. But a working feature is not always a safe feature.

The more important question became: what happens if the CSV itself is malicious?

That question pushed me beyond standard functional testing into security testing, automated vulnerability checks, and manual penetration testing.

The gap: strip_tags() was not enough

Before the security-focused testing, the upload flow already performed basic sanitization. User-controlled CSV fields were cleaned using strip_tags() before being stored or rendered.

That helped with HTML/script injection risks, but it did not cover a different class of attack: CSV Formula Injection.

CSV Formula Injection happens when a spreadsheet application interprets a cell value as a formula. If a field begins with characters such as:

=, +, -, @

then Excel or Google Sheets may treat the value as executable spreadsheet logic instead of plain text.

This matters because an attacker could upload a formula-like value, let it sit in the database, and wait until an admin exports the data and opens it in a spreadsheet. At that point, the risk moves from the web application into a downstream component: the spreadsheet application.

This was the key lesson: sanitizing HTML is not the same as neutralizing spreadsheet formulas.
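The difference can be seen in a small sketch. Here `strip_html_tags` is a simplified regex stand-in for a function like strip_tags() (an assumption, so the snippet runs without any framework), and both payloads are illustrative values rather than ones taken from the project.

```python
import re


def strip_html_tags(value: str) -> str:
    """Simplified stand-in for an HTML tag stripper (not a real library's implementation)."""
    return re.sub(r"<[^>]*>", "", value)


# Illustrative payloads, not taken from the project.
html_payload = "<script>alert(1)</script>Room A"
formula_payload = '=HYPERLINK("http://example.invalid", "Click")'

# The HTML tags are removed...
print(strip_html_tags(html_payload))
# ...but the formula contains no tags, so it passes through unchanged.
print(strip_html_tags(formula_payload))
```

The second value still begins with `=` after sanitization, which is exactly the condition a spreadsheet application may treat as a formula.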

Finding the vulnerability

I combined automated DAST with OWASP ZAP and manual penetration testing of the manager upload flow.

The manual test was especially important because CSV Formula Injection is not always obvious from normal web UI behavior. The payload can look harmless inside the web app, but become dangerous once exported or opened in Excel.

The finding was clear: the existing sanitization did not neutralize fields that started with formula-triggering characters. This meant the application could store dangerous spreadsheet payloads even though the web page itself did not execute JavaScript.

In OWASP terms, this relates to injection risk because untrusted input is passed into a downstream interpreter without proper neutralization.

Remediation: neutralize dangerous prefixes

To mitigate the issue, the parser needed to treat spreadsheet-control characters as unsafe when they appeared at the beginning of a field.

The remediation strategy was to prefix dangerous values with a literal marker, such as a single quote, when the value begins with:

=, +, -, @

The goal is not to delete the user's data. The goal is to preserve the content while forcing spreadsheet applications to interpret it as plain text.

This distinction matters. If we simply strip characters, we may silently change legitimate data. If we prefix the value safely, the data remains visible but loses its ability to execute as a spreadsheet formula.
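A minimal sketch of that strategy, assuming the four trigger characters =, +, -, and @ and a single quote as the literal marker. The names `FORMULA_PREFIXES` and `neutralize_formula_prefix` are hypothetical and not the project's actual helper.

```python
# Characters that may cause a spreadsheet to evaluate a cell as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralize_formula_prefix(value: str) -> str:
    """Prefix a single quote when the value starts with a formula trigger.

    The original content is kept intact; only a leading marker is added.
    """
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


print(neutralize_formula_prefix("=SUM(A1:A2)"))  # gets a leading single quote
print(neutralize_formula_prefix("Room A"))       # left as-is
```

Because a neutralized value starts with a single quote, passing it through the helper a second time does not add another marker. Note that this sketch also prefixes legitimate values that begin with these characters, such as negative numbers; whether that is acceptable depends on the data the field is expected to hold.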

Turning the finding into automated tests

A security finding is not fully useful if it only exists as a one-time manual discovery. It needs to become a regression test.

That is why I built an automated security test suite for the upload flow. The tests covered two major areas:

  • CSV Formula Injection bypass attempts;

  • Broken Access Control on schedule modification flows.

For access control, the expected behavior was explicit: non-admin roles must receive HTTP 403 Forbidden when trying to modify schedule data.

This moved the security work from "I found a problem once" into "the pipeline can keep checking this behavior".
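The shape of such an access-control regression test can be sketched without any framework. Everything here is hypothetical: the role names, the `schedule_update_status` function standing in for the server-side check, and the test name. In the real project the assertion would run against the actual schedule modification endpoint.

```python
from http import HTTPStatus

ADMIN_ROLE = "admin"  # hypothetical role name


def schedule_update_status(role: str) -> HTTPStatus:
    """Stand-in for the server-side check on a schedule-modifying request."""
    if role != ADMIN_ROLE:
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK


def test_non_admin_roles_are_forbidden():
    # Hypothetical non-admin roles; each must be rejected with 403.
    for role in ("viewer", "guest", ""):
        assert schedule_update_status(role) == 403, role
    # The admin role is still allowed through.
    assert schedule_update_status(ADMIN_ROLE) == 200


test_non_admin_roles_are_forbidden()
```

The point of the sketch is that the rule "non-admin means 403" is asserted for every role, rather than inferred from which buttons the UI happens to show.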

In the broader security section of the IR work, this effort also included a larger set of automated checks, in files such as test_security.py, along with security header regression tests. The important part is that the security assumptions became executable.

Manual penetration testing: clickjacking

The next issue came from manual penetration testing around browser-level protection.

I audited the application settings and found that X-Frame-Options was not configured explicitly. Django's XFrameOptionsMiddleware, when enabled, sets the header using a default value (DENY since Django 3.0), but for sensitive application pages, relying on an implicit default is weaker than stating the security policy directly.

The risk here is clickjacking or UI redressing. In a clickjacking attack, a sensitive page is embedded inside an iframe and visually manipulated so the user clicks something different from what they think they are clicking.

To validate the risk, I created a local proof-of-concept HTML scenario using iframe opacity manipulation. The goal was not to attack production, but to examine the UI from an attacker's perspective.

The remediation was to set:

X_FRAME_OPTIONS = "DENY"

This explicitly prevents the page from being embedded in frames. To make sure the behavior does not accidentally regress in the future, I also added security header regression tests.
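A header regression check can be sketched as a small function over a response's headers. This is a framework-free sketch: `frame_protection_problems` is a hypothetical helper name, and in the project the header mapping would come from a real response in the test suite.

```python
def frame_protection_problems(headers) -> list:
    """Return a list of problems with the anti-framing header; empty means OK."""
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("x-frame-options")
    if value is None:
        return ["X-Frame-Options header is missing"]
    if value.strip().upper() != "DENY":
        return [f"X-Frame-Options is {value!r}, expected 'DENY'"]
    return []


print(frame_protection_problems({"X-Frame-Options": "DENY"}))        # no problems
print(frame_protection_problems({"Content-Type": "text/html"}))      # header missing
print(frame_protection_problems({"x-frame-options": "SAMEORIGIN"}))  # weaker policy
```

A regression test would then assert that the returned list is empty for each sensitive page, so a later settings change that drops or weakens the header fails the build.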

Mapping the work to OWASP

The testing and remediation aligned with several OWASP references.

First, CSV Formula Injection relates to OWASP Top 10 A03:2021 Injection. The core issue is improper neutralization of special elements before data is consumed by a downstream interpreter.

Second, the access-control tests relate to OWASP Top 10 A01:2021 Broken Access Control. The system must not only hide UI controls from unauthorized users; it must reject unauthorized requests at the server level.

Third, the manual penetration testing approach followed the spirit of the OWASP Web Security Testing Guide, especially around information gathering and input validation testing.

The important part is that OWASP was not used only as a citation. It helped structure what to look for:

  • Where does untrusted input enter?

  • What component interprets it later?

  • What roles are allowed to perform state-changing operations?

  • What browser security headers must be explicit?

  • Which findings should become regression tests?

Criticizing my previous testing approach

Before this work, the project already had functional tests and some sanitization. But the testing mindset was still too focused on whether the feature worked under expected usage.

The CSV Formula Injection finding showed that "valid input" and "safe input" are not the same thing.

The previous approach had several weaknesses:

  • it treated CSV mostly as a data format, not as a possible attack vector;

  • it focused on web rendering risks and overlooked spreadsheet execution as a downstream risk;

  • it relied on strip_tags() as if HTML sanitization covered all input threats;

  • it did not initially encode security findings as regression tests;

  • it did not make browser-level protections such as frame restrictions explicit enough.

This critique is useful because it changed how I think about QA. Advanced testing is not just adding another tool. It is changing the threat model.

What I would improve next

There are several improvements I would still make.

First, I would integrate the security test suite more tightly into CI/CD so every merge request automatically checks security regressions.

Second, I would combine DAST with SAST or dependency scanning. OWASP ZAP is useful for observing a running application, but it does not replace source-level and dependency-level checks.

Third, I would document security assumptions beside the tests. For example, the reason why =, +, -, and @ are dangerous in CSV fields should be written close to the helper or test cases.

Fourth, I would make the export flow part of the test strategy. CSV Formula Injection becomes most dangerous when data moves into spreadsheet tools, so export behavior deserves its own tests.
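One way an export test could look, as a sketch only: `export_rows` and `neutralize` are hypothetical helpers built on Python's standard csv module, and the rows are made-up sample data. The test writes the CSV, reads it back, and checks that no exported cell begins with a formula trigger.

```python
import csv
import io

FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralize(value: str) -> str:
    """Prefix a single quote when the value starts with a formula trigger."""
    return "'" + value if value.startswith(FORMULA_PREFIXES) else value


def export_rows(rows) -> str:
    """Write rows to CSV text, neutralizing every cell on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize(str(cell)) for cell in row])
    return buffer.getvalue()


def exported_cells(csv_text: str) -> list:
    """Parse CSV text back into a flat list of cell values."""
    return [cell for row in csv.reader(io.StringIO(csv_text)) for cell in row]


# Made-up sample data containing one formula-like value.
rows = [["Monday", "=1+1"], ["Tuesday", "Room A"]]
cells = exported_cells(export_rows(rows))
assert not any(cell.startswith(FORMULA_PREFIXES) for cell in cells)
```

Neutralizing at export time, in addition to upload time, also covers values that reached the database through some other path.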

Conclusion

The biggest lesson from this work is that advanced testing is not about running a tool and collecting a screenshot. It is about finding the gap between "the feature works" and "the feature is safe".

In SIBANGKU, security testing revealed that basic HTML sanitization was not enough to protect the CSV upload flow from formula injection. By combining OWASP ZAP, manual penetration testing, automated security tests, access-control checks, and security header regression tests, I turned a hidden risk into an explicit engineering contract.

That is the kind of testing that actually improves a project: not just more tests, but better questions.