revise RAG readme

2026-04-10 11:23:42 +00:00 · 2025-04-07 11:40:15 -04:00
parent 59442dd546
commit fa99f9d99a
1 changed file with 22 additions and 28 deletions
--- a/lab03_RAG/README.md
+++ b/lab03_RAG/README.md
@@ -1,4 +1,17 @@
-# RAG-based Cyber Forensics Investigation Tool [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+# RAG-based Cyber Forensics Investigation Tool
+
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
+## Author
+
+**Mohit Ajaykumar Dhabuwala**
+
+- M.S. in Cyber Forensics and Counterterrorism
+- Specialization: Digital Forensics & Incident Response (DFIR)
+- Proficient in:
+  - Memory, Windows, mobile, and network forensics
+  - Forensic tools: Magnet AXIOM, EnCase, Volatility, Wireshark
+  - Programming languages: Python, Bash, PowerShell for forensic data parsing and automation

 ## What is RAG?

@@ -72,7 +85,7 @@ The code comprises:

 RAG offers these advantages:

- **Contextualized responses:** Answers are grounded in the provided cyber forensics document.
+- **Contextualized responses:** Answers are based on the provided cyber forensics document.
 - **Interactive interface:** User-friendly chat interaction.
 - **Efficiency:** FAISS enables fast retrieval.
 - **Cloud-based execution:** Google Colab provides a convenient environment.
@@ -82,7 +95,7 @@ RAG offers these advantages:

 _(Flowchart image included here)_

-![Flowchart](Colab_RAG.png)
+![Flowchart](image_f6fb04.png-8c6bf71b-bc0b-4179-93a2-cc646df542c9)

 ## Setup and Usage

@@ -97,38 +110,19 @@ _(Flowchart image included here)_
 4.  **Install Python dependencies:** Execute these commands in a Colab cell:

    ```bash
-    !pip install -U langchain langchain-core langchain-huggingface langchain_community faiss-cpu huggingface_hub
+    !pip install -U transformers langchain langchain_community langchain-huggingface faiss-cpu huggingface_hub pypdf pymupdf
    !pip install --upgrade langchain
    ```

-5.  **Provide Hugging Face API Token:** Add a code cell to set the `HUGGINGFACEHUB_API_TOKEN` environment variable:
+5.  **Provide Hugging Face API Token:** Add a code cell that sets the `HUGGINGFACEHUB_API_TOKEN` environment variable to your token:

    ```python
-    api_token = "ENTER THE API KEY"  # Replace 'ENTER THE API KEY' with your actual token
+    import os
+    os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'hf_your_token'  # Replace 'hf_your_token' with your actual token
    ```

-6.  **Provide Your Knowledge Base:** Add a cell to define `scenario_text` (Any passage of your choice).
-7.  **Run the Code:** Execute the cells in order to interact with the RAG system.
-
-## Background Story Used
-
-This project utilizes a futuristic cyberpunk scenario to simulate a cybercrime investigation. Detective Y investigates a complex ransomware attack targeting robotics engineer Z by "The Serpent," who employs advanced techniques to encrypt and steal research data. This scenario serves as the knowledge base for the RAG system.
-
-## Story based Questions
-
-The RAG system answers questions based on the provided cyber forensics scenario. Examples:
-
-**In-Text Questions:**
-
-1.  What type of cyberattack did Detective Y investigate?
-2.  What was the victim's profession?
-3.  Where was the remote server located that led to the perpetrator's arrest?
-
-**Out-of-Text Questions (Answers not in the text):**
-
-1.  What specific encryption algorithm did The Serpent use?
-2.  What was the name of the university where the security breach occurred?
-3.  Did Detective Y's team collaborate with external experts?
+6.  **Provide your knowledge base:** Add a cell that defines `document_text`, the scenario text used as the knowledge base.
+7.  **Run the code:** Execute the cells in order to interact with the RAG system.

 ## Features