Building a RAG Assistant to explore my technical Portfolio
Nov 10, 2025
3 min read
In this project, I developed a Retrieval-Augmented Generation (RAG) assistant that lets an AI understand, explore, and answer questions about my software and data project portfolio as if it were a human technical reviewer.
The main motivation was simple:
If a recruiter can ask an AI "how did you implement data ingestion?" or "what does this project do?", then my portfolio stops being static and becomes interactive.
In this post, I explain how I designed it, the decisions I made, and what I learned during the process.
What problem did I want to solve?
A traditional technical portfolio has several limitations:
- It requires the reader to manually navigate between repositories, READMEs, and notebooks.
- It doesn't scale well when many projects accumulate.
- It doesn't allow open-ended questions in natural language.
My goal was to create a system that indexes my projects, understands both documentation and code, and allows flexible natural language queries.
General Approach: RAG on my Portfolio
The system follows the classic Retrieval-Augmented Generation pattern, divided into three stages:
- Document Ingestion
- Semantic Retrieval
- Response Generation with an LLM
All knowledge lives in a local vector store, and the model only receives the strictly necessary context to answer each question.
Ingestion: How I turn my projects into knowledge
Ingestion was the most critical part of the project.
My portfolio includes multiple formats: Markdown, Python, PDF, DOCX, plain text, and Jupyter notebooks.
Key decisions:
- Use dedicated loaders per file type.
- Notebooks are loaded at the cell level, including Markdown and code.
- Irrelevant folders like `.git` or `__pycache__` are ignored.
- Metadata is normalized to track the origin of each fragment.
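As a rough sketch of how that ingestion loop can look (assuming LangChain-style loaders per extension and a small custom routine for notebook cells; the loader mapping and folder list here are illustrative, not the exact code I run):

```python
import json
from pathlib import Path
from langchain_community.document_loaders import TextLoader, PyPDFLoader, Docx2txtLoader

# Hypothetical extension-to-loader mapping; adjust to your own stack.
LOADERS = {".md": TextLoader, ".py": TextLoader, ".txt": TextLoader,
           ".pdf": PyPDFLoader, ".docx": Docx2txtLoader}
IGNORED_DIRS = {".git", "__pycache__"}

def load_notebook_cells(path: Path):
    """Yield one fragment per notebook cell, keeping both Markdown and code cells."""
    nb = json.loads(path.read_text(encoding="utf-8"))
    for i, cell in enumerate(nb.get("cells", [])):
        text = "".join(cell.get("source", []))
        if text.strip():
            yield text, {"source": str(path), "cell": i, "cell_type": cell["cell_type"]}

def ingest(root: str):
    fragments = []
    for path in Path(root).rglob("*"):
        if any(part in IGNORED_DIRS for part in path.parts) or not path.is_file():
            continue  # skip irrelevant folders
        if path.suffix == ".ipynb":
            fragments.extend(load_notebook_cells(path))
        elif path.suffix.lower() in LOADERS:
            for doc in LOADERS[path.suffix.lower()](str(path)).load():
                # Normalize metadata so every fragment can be traced to its origin
                fragments.append((doc.page_content, {**doc.metadata, "source": str(path)}))
    return fragments
```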
Intelligent Segmentation: Prose ≠ Code
Not all content is treated the same:
- Prose and documentation use larger chunks to preserve coherence.
- Python code is divided into smaller chunks using language-aware segmentation.
This significantly improves the quality of retrieval and responses.
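To make the distinction concrete, here is a minimal sketch of the two splitting strategies, assuming LangChain's text splitters; the chunk sizes are illustrative, not the exact values I use:

```python
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

# Larger chunks for prose and documentation: keeps paragraphs coherent.
prose_splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=150)

# Smaller, language-aware chunks for Python: splits on class/function boundaries first.
code_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=400, chunk_overlap=40
)

def split_fragment(text: str, metadata: dict) -> list[str]:
    splitter = code_splitter if metadata.get("source", "").endswith(".py") else prose_splitter
    return splitter.split_text(text)
```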
Embeddings and Vector Store
- Embeddings: BAAI/bge-m3 (multilingual and normalized).
- Vector store: ChromaDB, persistent and local.
Changes to the projects are reflected simply by re-running the ingestion process.
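A minimal sketch of this step, using the sentence-transformers and chromadb packages directly (collection name and persistence path are placeholders):

```python
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-m3")
client = chromadb.PersistentClient(path="./vector_store")   # persistent and local
collection = client.get_or_create_collection("portfolio")

def index_chunks(ids: list[str], texts: list[str], metadatas: list[dict]):
    # Normalized embeddings so cosine similarity reduces to a dot product
    vectors = embedder.encode(texts, normalize_embeddings=True)
    # Upsert so re-running ingestion overwrites existing fragments in place
    collection.upsert(ids=ids, embeddings=vectors.tolist(),
                      documents=texts, metadatas=metadatas)
```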
Retrieval: Maximal Marginal Relevance
To avoid redundant fragments, the system uses MMR, which allows retrieving diverse and complementary context within the same project.
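For readers unfamiliar with MMR, the selection rule looks roughly like this (a hand-rolled sketch of the scoring logic; in practice many vector-store wrappers expose it directly as an "mmr" search type):

```python
import numpy as np

def mmr_select(query_vec, doc_vecs, k=6, lambda_mult=0.5):
    """Pick k chunks balancing relevance to the query against redundancy.

    Assumes query_vec (d,) and doc_vecs (n, d) are L2-normalized,
    so the dot product equals cosine similarity.
    """
    relevance = doc_vecs @ query_vec          # similarity of each chunk to the question
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        if selected:
            # Redundancy = max similarity to anything already selected
            redundancy = np.max(doc_vecs[candidates] @ doc_vecs[selected].T, axis=1)
            scores = lambda_mult * relevance[candidates] - (1 - lambda_mult) * redundancy
        else:
            scores = relevance[candidates]
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        candidates.remove(best)
    return selected
```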
Generation and Privacy
Generation is performed with a DeepSeek model.
The model only receives:
- The user's question.
- The retrieved fragments.
It does not have direct access to the files or the complete vector store, maintaining data control and privacy.
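A simplified sketch of the generation step, assuming DeepSeek's OpenAI-compatible API (the model name and prompt wording are illustrative):

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

def answer(question: str, fragments: list[str]) -> str:
    # The model only ever sees the question and the retrieved fragments
    context = "\n\n---\n\n".join(fragments)
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "Answer using only the provided portfolio context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```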
Interfaces: CLI and Web App
The system can be used from:
- A command-line interface.
- A web application developed with Dash and Bootstrap.
Both share exactly the same RAG pipeline.
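Sharing the pipeline mostly means exposing one entry point and calling it from both places; a rough sketch, where `answer_question` is a hypothetical function wrapping retrieval plus generation:

```python
from rag_pipeline import answer_question  # hypothetical shared entry point: retrieval + generation

# --- CLI ---
def cli() -> None:
    import sys
    print(answer_question(" ".join(sys.argv[1:])))

# --- Web app (Dash + Bootstrap) ---
import dash_bootstrap_components as dbc
from dash import Dash, Input, Output, State, dcc, html

app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])
app.layout = dbc.Container([
    dcc.Input(id="question", type="text", placeholder="Ask about a project..."),
    dbc.Button("Ask", id="ask"),
    html.Div(id="reply"),
])

@app.callback(
    Output("reply", "children"),
    Input("ask", "n_clicks"),
    State("question", "value"),
    prevent_initial_call=True,
)
def on_ask(_, question):
    return answer_question(question)  # same pipeline as the CLI
```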
Initial Validation
To test the system, I initially indexed a single real repository from my portfolio: evalcards, a Python library for generating model evaluation reports.
This allowed me to validate that the assistant understands metrics, documentation, and real code.
What this project demonstrates
- Design of real RAG pipelines.
- Technical judgment in chunking and retrieval.
- Integration of AI with UX.
- Product-oriented thinking and scalability.
Conclusion
This project transforms my portfolio into an AI-queryable knowledge base, allowing me to explain technical decisions and architecture interactively.
It is not just an AI demo; it is a new way to present technical expertise.