Thesis context (M.Tech project)

This product is documented for operators and engineers. This page summarizes the academic framing from the degree project report so the same ideas appear in one place.

Report title: AI-Driven Endpoint Detection and Response (EDR) System with RAG-Powered Automation
Author: Ravi Sarode (Roll No. 2023MCS220005)
Institution: Indian Institute of Information Technology Kottayam — M.Tech in Cyber Security (submitted November 2025)
Supervisor: Dr Ragesh G K

The formal degree report is not published on this documentation site; this page is the public summary of its framing and scope.


Abstract (from the report)

The system automates post-detection triage that analysts usually do manually. Compared with EDR tools focused on ML or static rules alone, this work integrates Retrieval-Augmented Generation (RAG) with Amazon Bedrock to produce contextual incident analyses. Telemetry comes from Sysmon; components include a WPF agent, a React dashboard, and AWS serverless pieces (Lambda, API Gateway, DynamoDB) for scalable processing. The goal is to reduce manual work, speed investigations, and standardize explanation quality using AI-assisted automation.


Problem the project addresses

  • Alert volume vs manual triage — Behavioural EDR generates many alerts; understanding lineage, correlation, and severity still falls heavily on analysts.
  • Detection without interpretation — Finding suspicious behaviour is not the same as explaining why it matters or how events connect.
  • Weak or missing correlation — Real attacks are sequences; presenting isolated events forces manual stitching.
  • AI without grounding — LLM summaries are unreliable unless tied to verified telemetry and rule metadata.
  • Scale — Serverless backends fit bursty alert traffic better than fixed-size servers for this design.

Objectives (aligned with implementation)

  1. Automated, consistent alert interpretation — Structured packages from Sysmon + JSON rules + correlation feed analysis.
  2. Faster triage — Less manual reconstruction of process lineage and timelines.
  3. Scalable architecture — Agent on the endpoint; serverless cloud for ingest, storage, and AI.
  4. Maintainable rules — JSON rule sets updated without rebuilding the agent binary.
  5. Grounded AI — RAG supplies MITRE / similar-event / policy context so Bedrock outputs are evidence-linked, not generic prose.

For how this maps to code paths, see End-to-end, AI & enrichment, and Bedrock & AI.
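
As a rough illustration of objectives 1 and 4, the sketch below loads a rule pack from JSON text at runtime and matches it against a single Sysmon event, so rules can change without touching the matching code. The rule schema (`id`, `event_id`, `field`, `pattern`, `mitre`, `severity`) and the function names are assumptions made for this example, not the product's actual format; the event field names (`CommandLine`, `DestinationPort`) follow Sysmon's own naming.

```python
import json
import re

# Hypothetical rule pack. In practice this text would be read from a file
# that can be replaced without rebuilding the agent.
RULE_PACK_JSON = """
[
  {"id": "R-0001", "event_id": 1, "field": "CommandLine",
   "pattern": "powershell.*-enc", "mitre": "T1059.001", "severity": "high"},
  {"id": "R-0002", "event_id": 3, "field": "DestinationPort",
   "pattern": "^4444$", "mitre": "T1571", "severity": "medium"}
]
"""


def load_rules(text):
    """Parse the JSON rule pack and precompile each rule's regex."""
    rules = json.loads(text)
    for rule in rules:
        rule["_regex"] = re.compile(rule["pattern"], re.IGNORECASE)
    return rules


def match_event(event, rules):
    """Return one hit per rule whose event ID and field pattern match the event."""
    hits = []
    for rule in rules:
        if rule["event_id"] != event.get("EventID"):
            continue
        value = str(event.get(rule["field"], ""))
        if rule["_regex"].search(value):
            hits.append({
                "rule_id": rule["id"],
                "mitre": rule["mitre"],
                "severity": rule["severity"],
            })
    return hits


# Example: a process-creation event (Sysmon Event ID 1) with an encoded command.
rules = load_rules(RULE_PACK_JSON)
event = {"EventID": 1, "CommandLine": "powershell.exe -NoP -enc SQBFAFgA"}
hits = match_event(event, rules)
```

Keeping the MITRE technique ID and severity on the rule itself means each hit already carries the metadata that later enrichment steps need.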


Three-layer model (thesis ↔ docs)

Thesis layer | Product documentation
Endpoint (agent) | Agent overview, Detection pipeline
Cloud | Architecture, Backend overview
Analyst dashboard | Console features

Agent internals (thesis Figure 5.2)

The report describes the Windows agent as combining:

  • Monitors — File, process, and network-oriented paths feeding Sysmon-driven telemetry.
  • Detection engine — JSON rules and MITRE ATT&CK-style mappings at match time.
  • Forensic data collector — Builds the process tree and gathers signature checks, file hashes, and system metadata for richer alerts.
  • Correlation — Groups related activity (e.g. lineage) before cloud upload.
  • Queue / uploader — Resilient delivery to the API when the network is intermittent.

This matches the implementation story in From agent to console and Agent operations.
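
The lineage-based correlation step listed above can be pictured with a minimal sketch that groups events under their topmost observed ancestor process. It assumes events have already been normalized into dicts carrying Sysmon's `ProcessGuid` and, for process creation (Event ID 1), `ParentProcessGuid`. The function names and the grouping strategy are illustrative only and are not taken from the agent's code.

```python
from collections import defaultdict


def root_of(guid, parent_of):
    """Walk up the parent chain to the topmost process whose creation was observed."""
    seen = {guid}
    while True:
        parent = parent_of.get(guid)
        # Stop when the parent is unknown, was never observed, or would form a loop.
        if parent is None or parent not in parent_of or parent in seen:
            return guid
        seen.add(parent)
        guid = parent


def correlate_by_lineage(events):
    """Group events by the root of the process tree they belong to."""
    # Only process-creation events (Event ID 1) define parent-child edges.
    parent_of = {
        e["ProcessGuid"]: e.get("ParentProcessGuid")
        for e in events
        if e.get("EventID") == 1
    }
    groups = defaultdict(list)
    for e in events:
        groups[root_of(e["ProcessGuid"], parent_of)].append(e)
    return dict(groups)


# Example: a document process spawns a shell, which then opens a connection.
events = [
    {"EventID": 1, "ProcessGuid": "A", "ParentProcessGuid": "X", "Image": "winword.exe"},
    {"EventID": 1, "ProcessGuid": "B", "ParentProcessGuid": "A", "Image": "powershell.exe"},
    {"EventID": 3, "ProcessGuid": "B", "DestinationIp": "203.0.113.10"},
    {"EventID": 1, "ProcessGuid": "C", "ParentProcessGuid": "Y", "Image": "notepad.exe"},
]
groups = correlate_by_lineage(events)
```

Stopping at the topmost observed process, rather than at an unobserved parent such as the shell that launched everything, keeps unrelated activity from collapsing into a single group.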


Sysmon event IDs captured (thesis Table 5.1)

The thesis lists the following Sysmon event types as part of the telemetry baseline for behavioural detection and AI analysis:

Sysmon Event ID | Event type | Role
1 | Process creation | Command-line and parent-child visibility
3 | Network connection | Outbound TCP/UDP from processes
7 | Image load | DLL / module loads
10 | Process access | Cross-process access (e.g. credential theft patterns)
11 | File create | New files on disk
12 | Registry create/delete | Registry object changes
13 | Registry value set | Value modifications
14 | Registry rename | Key/value renames
22 | DNS query | DNS lookups by process
23 | File delete (archived) | Deleted files with retained metadata
25 | Process tampering | Image/memory tampering attempts

Tune your Sysmon config in production to balance visibility and noise; the agent’s rule packs assume these classes of events are available where rules reference them.
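
One way to sanity-check that dependency is to compare the event IDs a rule pack references against the event IDs your Sysmon configuration actually collects. The sketch below is a hypothetical helper, not part of the product: it assumes each rule exposes an `event_id` field and that you supply the set of enabled IDs yourself.

```python
# Event types from the thesis baseline (Table 5.1), keyed by Sysmon Event ID.
SYSMON_EVENT_TYPES = {
    1: "Process creation",
    3: "Network connection",
    7: "Image load",
    10: "Process access",
    11: "File create",
    12: "Registry create/delete",
    13: "Registry value set",
    14: "Registry rename",
    22: "DNS query",
    23: "File delete (archived)",
    25: "Process tampering",
}


def missing_event_ids(rules, enabled_ids):
    """Return the event IDs that rules reference but the config does not collect."""
    needed = {rule["event_id"] for rule in rules}
    return sorted(needed - set(enabled_ids))


def describe(event_ids):
    """Render event IDs with their baseline names for a readable report."""
    return [f"{i}: {SYSMON_EVENT_TYPES.get(i, 'unknown')}" for i in event_ids]


# Example: rules need IDs 1, 10 and 22, but only 1, 3 and 11 are enabled.
rules = [{"event_id": 1}, {"event_id": 22}, {"event_id": 10}]
missing = missing_event_ids(rules, enabled_ids={1, 3, 11})
report = describe(missing)
```

A rule whose event type is filtered out never fires, so a check like this is a cheap way to spot detection gaps after tightening a Sysmon config.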


AI analysis pipeline (thesis §5.4)

RAG retrieves context (e.g. MITRE, similar events) and passes it with the structured alert to Bedrock so the model’s narrative stays grounded in evidence. Outputs are stored and shown in the console (classification, reasoning, severity, MITRE references, etc.). See AI & enrichment and Bedrock & AI.
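
To make the retrieve-then-ground flow concrete, here is a minimal sketch that selects context for an alert and assembles a prompt asking the model to cite that context. Everything in it is an assumption for illustration: the in-memory `KNOWLEDGE` list with paraphrased technique descriptions, the keyword-overlap scoring (a simple stand-in for whatever retrieval the system actually uses), and the prompt wording. The call to Bedrock itself is deliberately left out.

```python
import json
import re

# Tiny stand-in knowledge base; descriptions are abbreviated paraphrases.
KNOWLEDGE = [
    {"id": "T1059.001",
     "text": "PowerShell: adversaries may abuse PowerShell commands and scripts for execution."},
    {"id": "T1003.001",
     "text": "LSASS Memory: adversaries may access credential material stored in LSASS process memory."},
    {"id": "T1071.004",
     "text": "DNS: adversaries may communicate using the DNS application layer protocol."},
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query_text, knowledge, top_k=2):
    """Rank entries by token overlap with the query and keep the best non-zero matches."""
    query = tokenize(query_text)
    scored = [(len(query & tokenize(doc["text"])), doc) for doc in knowledge]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(alert, context_docs):
    """Combine retrieved context and the structured alert into one grounded prompt."""
    context = "\n".join(f"[{d['id']}] {d['text']}" for d in context_docs)
    return (
        "Use only the alert and context below. Cite context IDs for each claim.\n\n"
        f"Context:\n{context}\n\n"
        f"Alert:\n{json.dumps(alert, sort_keys=True)}\n"
    )


# Example: ground an alert about an encoded PowerShell command.
alert = {"rule_id": "R-0001", "Image": "powershell.exe",
         "CommandLine": "powershell.exe -enc SQBFAFgA"}
docs = retrieve(alert["CommandLine"], KNOWLEDGE)
prompt = build_prompt(alert, docs)
```

Passing the alert as serialized JSON alongside labelled context entries gives the model explicit identifiers to cite, which makes its classification and reasoning easier to check against the evidence.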


Future work (from the report)

  • Device-level correlation across multiple alerts from the same host (broader narratives than single-event focus).
  • Richer rules and correlation logic as threats evolve.
  • More event types where telemetry cost allows.
  • Analyst feedback into the model or rule tuning (human-in-the-loop learning).

Further reading (bibliography highlights)

The thesis cites industry reports (breach costs, EDR ML), Sysmon documentation, AWS / Bedrock, IBM on RAG, MITRE ATT&CK, and academic work on EDR correlation and RAG. See the report's bibliography for full citations.


How this page relates to the rest of the site

  • Operators can skip this page and use Console features and End-to-end.
  • Stakeholders and reviewers can use this page to see why the architecture and AI choices exist without reading the full academic report.