Why Threat Intelligence Triage Breaks at the Adapter Layer

An indicator lands in your queue. Maybe it's a suspicious IP from a SIEM alert. Maybe it's a file hash pulled out of a phishing attachment. Maybe it's a domain a user tried to reach and the proxy flagged. Whatever the source, the next five minutes look the same in most SOCs.

You open a tab. You paste the indicator into VirusTotal, wait for the results to load, scroll through the detection stats, and try to form a quick read. You open a second tab. GreyNoise. You skim whether the IP is noise, RIOT-tagged, or actually classified as malicious. You open a third tab. AbuseIPDB. You check the confidence score and the report count. Then you mentally reconcile three different scoring schemes into a single sentence: probably fine, probably worth watching, or probably bad. Then you write a note, update the ticket, and move on to the next indicator. By the end of the shift you have done this twenty or fifty or a hundred times.

The work itself is not hard. It is just repetitive in a way that quietly burns hours that should have gone to actual analysis. And if you are an engineer trying to put an agent in front of that workflow, it gets worse. Every provider has a different API shape, a different auth scheme, a different rate limit, a different way of saying "malicious," and a different definition of what a clean verdict looks like. Your agent code ends up being three adapters, a retry loop, and a hand-rolled aggregator before you have written a single line of actual logic.

What existing tools actually solve

The threat intelligence space is not short on options. Commercial aggregator platforms stitch together dozens of feeds, add enrichment pipelines, and charge accordingly. SOAR platforms provide playbook engines that can call provider APIs in sequence. Open-source SDKs wrap individual providers in convenient clients. All of these are useful for teams that have already made the platform decision and have a budget to match. But most of them assume the existence of a bigger system around them. They expect a SOAR engine, an enrichment pipeline, or a threat intelligence platform as the home base. They are not designed for the analyst sitting in front of a Claude Code session who just wants to ask, in plain English, is this hash bad, or for the engineer building a small internal agent who needs one clean tool call instead of three provider integrations.

That layer, the one between "I have three API keys" and "I have a normalized answer I can reason about," is where most agent-adjacent security work wastes time. It is adapter glue, and it is the work nobody wants to maintain and nobody puts in the demo video.

What we built

menagos-ioc-mcp is an open-source Model Context Protocol server that collapses that adapter layer into a single tool. It is deliberately narrow. It does not try to be a threat intelligence platform, a SOAR engine, or a hunting interface. It exposes exactly one tool, lookup_ioc, and it does one thing well:

  • Accepts an indicator of compromise: an IPv4 or IPv6 address, a domain name, or a file hash (MD5, SHA-1, or SHA-256)
  • Classifies the indicator type so callers don't have to
  • Fans out to VirusTotal, GreyNoise, and AbuseIPDB in parallel and within a strict per-provider timeout
  • Normalizes every provider response into a shared shape with a reputation score in [0, 1]
  • Aggregates the per-provider reports into one verdict with a classification, a confidence rating, and a plain-English summary
  • Degrades gracefully when a provider fails, times out, or is rate limited, and tells the caller exactly which sources responded

The server is written in Python, uses FastMCP for the protocol layer, and runs in two shapes from one codebase: stdio (for Claude Desktop and Claude Code) and local HTTP (for browsers and other agents on the same machine). A small React + TypeScript + Tailwind web UI ships in the same repo for humans who want to drive the tool without an agent in front of it. A single lookup_ioc call takes about half a second against all three live providers, returning one structured JSON response and one verdict.

menagos-ioc-mcp web UI showing a file hash lookup returning a MALICIOUS verdict; hash value blurred

Why a single normalized tool matters

The value is not really about saving tabs. It is about what happens when an agent can reason over a stable response shape. When VirusTotal says malicious: 55, suspicious: 2, harmless: 0, undetected: 10, that is a piece of evidence. When GreyNoise says classification: benign, riot: true, name: "Google Public DNS", that is a different piece of evidence. When AbuseIPDB says abuseConfidenceScore: 92, totalReports: 147, that is a third. A human analyst can hold all three in their head and produce a verdict. A naive agent usually cannot, because the three shapes look nothing like each other and the agent's prompt has to account for every combination of fields.

Normalizing them into a single SourceReport with a score in [0, 1], a classification drawn from a four-value enum, and a status drawn from a six-value enum is not a cosmetic choice. It is what lets the agent's downstream logic actually work: you can chain this tool's output into another tool, write a rule that says "if verdict is malicious and confidence is high, open a ticket," and reason about coverage honestly, because the response tells you how many providers actually answered and which ones were skipped.

The aggregation is also honest about what it does not know. If all three providers fail, the verdict is unknown with low confidence, not a silently-made-up benign. If only one out of three responds, the confidence is downgraded even if the single response is clean. Coverage shapes confidence, not the other way around.
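The coverage-shapes-confidence rule can be sketched as follows. The field names, enum values, and thresholds here are illustrative stand-ins, not the repo's actual schema or scoring rules:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SourceReport:
    source: str
    status: str                    # e.g. "ok", "timeout", "rate_limited"
    score: Optional[float] = None  # reputation in [0, 1]; None if no answer
    classification: str = "unknown"

def aggregate(reports: list[SourceReport]) -> dict:
    answered = [r for r in reports if r.status == "ok" and r.score is not None]
    total = len(reports)
    if not answered:
        # No provider responded: say so, don't invent a benign verdict.
        return {"classification": "unknown", "confidence": "low",
                "sources_responded": 0, "sources_queried": total}
    # Lean toward the worst evidence (illustrative thresholds).
    worst = max(answered, key=lambda r: r.score)
    if worst.score >= 0.7:
        classification = "malicious"
    elif worst.score >= 0.3:
        classification = "suspicious"
    else:
        classification = "benign"
    # Coverage shapes confidence: fewer answers means lower confidence,
    # even when the answers that did arrive look clean.
    coverage = len(answered) / total
    confidence = "high" if coverage == 1.0 else ("medium" if coverage >= 0.5 else "low")
    return {"classification": classification, "confidence": confidence,
            "sources_responded": len(answered), "sources_queried": total}
```

With one clean answer out of three, this returns a benign classification at low confidence, which is exactly the downgrade behavior the paragraph above describes.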

Why local-first matters

Indicators are not always harmless strings. An internal IP tied to a specific host, a domain that a user on your network tried to resolve, a file hash from a device under investigation: each of these can reveal more than it appears to. Sending them through a third-party enrichment SaaS creates a data handling question that most teams do not want to answer every time they triage an alert. Local-first means the only traffic that leaves your machine is the three direct HTTPS calls to the providers the tool is built to query. No proxy, no aggregation layer, no vendor between you and the threat intel. The API keys sit in a .env file on your disk, never get logged (the structlog processor redacts them by design), and never touch any infrastructure that Menagos controls.

The same property matters for agents: a Claude Code session that spawns the server over stdio runs it inside your environment with your keys. A remote deployment is possible if you want one, but the repo does not ship an opinionated hosting layer. If you want to put a reverse proxy and an auth boundary in front of it, that is a decision you make in your own infrastructure, not one we made for you.
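The key-redaction idea mentioned above is easy to sketch, because a structlog processor is just a callable that receives and returns the event dict. The field names below are illustrative, not the repo's actual processor:

```python
# A structlog "processor" is a plain callable over the event dict, so key
# redaction can live in one small function with no framework machinery.
SENSITIVE_KEYS = {"api_key", "x-apikey", "key", "authorization"}  # illustrative

def redact_keys(logger, method_name, event_dict):
    """Replace any secret-looking field with a fixed placeholder before
    the event is rendered, so API keys can never reach log output."""
    for field in list(event_dict):
        if field.lower() in SENSITIVE_KEYS:
            event_dict[field] = "***REDACTED***"
    return event_dict

# Wiring it in would look something like:
#   structlog.configure(processors=[redact_keys, structlog.processors.JSONRenderer()])

event = redact_keys(None, "info", {"event": "provider_call", "api_key": "vt-secret"})
```

Because the processor runs before any renderer, redaction happens regardless of where the log line ultimately goes.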

What is in the v0.1 release

The initial release supports IPs, domains, and file hashes. URL lookups are deferred to v0.2 because VirusTotal's URL endpoint requires a base64url-encoded URL identifier and the extra input handling was not worth holding up the release. The GreyNoise adapter uses the Community API and will work with either a free-tier key or no key at all. The VirusTotal and AbuseIPDB adapters expect free-tier keys, which take about two minutes each to obtain. The repo ships with 65 pytest tests covering the classifier, the aggregation rules, every provider happy path and failure mode, and the orchestrator under partial-failure conditions. The web UI provides a search input, a verdict card with a color-coded classification and reputation score bar, three per-source cards with expandable raw signals, an error panel for invalid input, and a metadata footer with the query ID and duration for every lookup.
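For context on the deferred URL support: VirusTotal's v3 API identifies a URL by the unpadded base64url encoding of the URL string itself. The encoding is one line; the input validation and canonicalization around it is the part that was not worth holding up v0.1. A minimal sketch (the function name is ours, not the repo's):

```python
import base64

def vt_url_id(url: str) -> str:
    """VirusTotal API v3 identifies a URL by the unpadded base64url
    encoding of the URL string, usable as /api/v3/urls/{id}."""
    return base64.urlsafe_b64encode(url.encode()).decode().rstrip("=")

uid = vt_url_id("http://example.com/")
```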

What this is not

This tool is for fast, low-friction IOC triage. It is important to be explicit about what it does not do.

  • It is not a threat intelligence platform: there is no case management, no historical storage, no hunting interface, no correlation engine. Every lookup is a point-in-time question.
  • It is not a block-or-allow decision engine: the verdict is an aggregation of three external sources at a single moment, one input into a decision rather than the decision itself. If you are going to automatically block traffic based on this output, wrap it in your own policy layer and handle the edge cases.
  • It is not a substitute for deeper analysis: a clean verdict from three providers does not mean an indicator is safe; it means three specific sources had nothing bad to say about it. A malicious verdict means at least one of those sources thinks you should look closer, not that you should stop looking.
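What a policy layer around the verdict might look like, in the simplest possible form. This is a hypothetical consumer of the tool's output, with invented field names and thresholds, not anything shipped in the repo:

```python
def should_open_ticket(verdict: dict) -> bool:
    """Illustrative policy rule: treat the verdict as one input, and
    require both a bad classification and reasonable source coverage
    before acting on it automatically."""
    return (
        verdict.get("classification") == "malicious"
        and verdict.get("confidence") == "high"
        and verdict.get("sources_responded", 0) >= 2  # don't act on one voice
    )
```

The point is that the thresholds, the coverage requirement, and the action all live in your code, where you can audit and change them, rather than inside the lookup tool.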

Is this the problem you are solving right now?

Are your analysts still tabbing between threat intel sites for every alert? Are you building a Claude-based agent and stuck wiring up three different provider SDKs before you can write any real logic? Is your existing enrichment pipeline a shell script, a Python notebook, and a Slack bot that mostly works? Do you need a clean, auditable tool call you can drop into a workflow without adopting a whole platform to get it?

If any of that sounds familiar, menagos-ioc-mcp exists for exactly that reason. It is open source, MIT licensed, and runs on a laptop. Clone it, paste in three API keys, and you have a working MCP server in under five minutes. Menagos LLC is a cybersecurity consultancy focused on security data engineering, GRC operations, and agentic tooling for small defense contractors and mid-market teams. We help teams wire up the operational layer between their security stack and the agents they want running on top of it, so the analysts can spend their time on the work that actually requires a human. If you want to talk about what that looks like for your environment, reach out through our contact form or send a note to info@menagos.com. No pitch deck. Just a conversation about where the friction is.
