
Building a RAG Assistant to explore my technical Portfolio

Nov 10, 2025

3 min read

RAG Assistant

In this project, I developed a RAG (Retrieval-Augmented Generation) Assistant that lets an AI understand, explore, and answer questions about my software and data project portfolio as if it were a human technical reviewer.

The main motivation was simple:
If a recruiter can ask an AI "how did you implement data ingestion?" or "what does this project do?", then my portfolio stops being static and becomes interactive.

In this post, I explain how I designed it, the decisions I made, and what I learned during the process.


What problem did I want to solve?

A traditional technical portfolio has several limitations:

  • It requires the reader to manually navigate between repositories, READMEs, and notebooks.
  • It doesn't scale well when many projects accumulate.
  • It doesn't allow open-ended questions in natural language.

My goal was to create a system that indexes my projects, understands both documentation and code, and allows flexible natural language queries.


General Approach: RAG on my Portfolio

The system follows the classic Retrieval-Augmented Generation pattern, divided into three stages:

  1. Document Ingestion
  2. Semantic Retrieval
  3. Response Generation with an LLM

All knowledge lives in a local vector store, and the model only receives the strictly necessary context to answer each question.


Ingestion: How I turn my projects into knowledge

Ingestion was the most critical part of the project.

My portfolio includes multiple formats: Markdown, Python, PDF, DOCX, plain text, and Jupyter notebooks.

Key decisions (sketched in the snippet after this list):

  • Use dedicated loaders per file type.
  • Notebooks are loaded at the cell level, including Markdown and code.
  • Irrelevant folders like .git or __pycache__ are ignored.
  • Metadata is normalized to track the origin of each fragment.
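A minimal sketch of that dispatch logic, assuming a plain-Python ingestion script. The folder list, the `Fragment` structure, and the function names are illustrative, and notebooks are read cell by cell with `nbformat`; PDF and DOCX go through their own loaders, omitted here:

```python
from dataclasses import dataclass
from pathlib import Path

import nbformat  # pip install nbformat

IGNORED_DIRS = {".git", "__pycache__", ".venv", "node_modules"}  # skip non-content folders


@dataclass
class Fragment:
    """One piece of raw content plus the metadata that tracks its origin."""
    text: str
    source: str  # relative file path
    kind: str    # "prose", "code", "notebook-markdown", "notebook-code", ...


def load_notebook(path: Path) -> list[Fragment]:
    """Load a Jupyter notebook at the cell level, keeping Markdown and code cells."""
    nb = nbformat.read(path, as_version=4)
    return [
        Fragment(cell.source, str(path), f"notebook-{cell.cell_type}")
        for cell in nb.cells
        if cell.cell_type in ("markdown", "code") and cell.source.strip()
    ]


def load_repo(root: Path) -> list[Fragment]:
    """Walk a project folder and dispatch each file to the right loader."""
    frags: list[Fragment] = []
    for path in root.rglob("*"):
        if any(part in IGNORED_DIRS for part in path.parts) or not path.is_file():
            continue
        if path.suffix == ".ipynb":
            frags.extend(load_notebook(path))
        elif path.suffix in {".md", ".txt"}:
            frags.append(Fragment(path.read_text(encoding="utf-8"), str(path), "prose"))
        elif path.suffix == ".py":
            frags.append(Fragment(path.read_text(encoding="utf-8"), str(path), "code"))
    return frags
```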

Intelligent Segmentation: Prose ≠ Code

Not all content is treated the same:

  • Prose and documentation use larger chunks to preserve coherence.
  • Python code is divided into smaller chunks using language-aware segmentation.

This significantly improves the quality of retrieval and responses.
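A sketch of this dual strategy, assuming LangChain's text splitters; the chunk sizes are illustrative, not the exact values I use, and `frag` refers to the `Fragment` structure from the ingestion sketch above:

```python
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

# Larger chunks for prose keep explanations and their context together.
prose_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)

# Smaller, language-aware chunks for Python split on class/function boundaries first.
code_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=600, chunk_overlap=60
)


def split_fragment(frag) -> list[str]:
    """Route each fragment to the splitter that matches its content type."""
    splitter = code_splitter if frag.kind.endswith("code") else prose_splitter
    return splitter.split_text(frag.text)
```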


Embeddings and Vector Store

  • Embeddings: BAAI/bge-m3 (multilingual and normalized).
  • Vector store: ChromaDB, persistent and local.

Any change in the projects is picked up simply by re-running the ingestion process.
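A minimal indexing sketch, assuming the LangChain wrappers for HuggingFace embeddings and Chroma; the collection name and persistence path are placeholders:

```python
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

# BAAI/bge-m3 is multilingual; normalized embeddings keep cosine similarity well-behaved.
embeddings = HuggingFaceEmbeddings(
    model_name="BAAI/bge-m3",
    encode_kwargs={"normalize_embeddings": True},
)

# Local, persistent vector store: re-running ingestion refreshes it in place.
vector_store = Chroma(
    collection_name="portfolio",
    embedding_function=embeddings,
    persist_directory="./chroma_store",
)

# `chunks` and `metadatas` come from the loading and splitting steps above:
# vector_store.add_texts(chunks, metadatas=metadatas)
```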


Retrieval: Maximal Marginal Relevance

To avoid redundant fragments, the system retrieves with MMR (Maximal Marginal Relevance), which selects diverse, complementary context within the same project instead of several near-duplicate chunks.
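With the Chroma wrapper from the previous sketch, MMR is a configuration switch on the retriever; the `k`, `fetch_k`, and `lambda_mult` values below are illustrative:

```python
retriever = vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={
        "k": 6,              # fragments actually passed to the LLM
        "fetch_k": 30,       # candidates fetched before MMR re-ranking
        "lambda_mult": 0.5,  # 1.0 = pure relevance, 0.0 = maximum diversity
    },
)

docs = retriever.invoke("How did you implement data ingestion?")
```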


Generation and Privacy

Generation is performed with a DeepSeek model.

The model only receives:

  • The user's question.
  • The retrieved fragments.

It does not have direct access to the files or to the complete vector store, which keeps the data under my control and preserves privacy.
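A sketch of the generation step, assuming DeepSeek is called through its OpenAI-compatible API; the model name and prompt wording are illustrative, and only the question plus the retrieved fragments are ever sent:

```python
from openai import OpenAI

client = OpenAI(api_key="...", base_url="https://api.deepseek.com")


def answer(question: str, docs) -> str:
    """Build the prompt from retrieved fragments only and ask the model."""
    context = "\n\n".join(
        f"[{d.metadata.get('source', 'unknown')}]\n{d.page_content}" for d in docs
    )
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system",
             "content": "Answer questions about the portfolio using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```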


Interfaces: CLI and Web App

The system can be used from:

  • A command-line interface.
  • A web application developed with Dash and Bootstrap.

Both share exactly the same RAG pipeline.
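A sketch of how both interfaces can wrap that single pipeline; the component ids and function names are illustrative, and `retriever` and `answer` refer to the retrieval and generation sketches above (Bootstrap styling via dash-bootstrap-components is omitted):

```python
import argparse

from dash import Dash, Input, Output, State, dcc, html


def ask(question: str) -> str:
    """Single entry point shared by the CLI and the web app."""
    docs = retriever.invoke(question)  # retrieval (MMR sketch above)
    return answer(question, docs)      # generation (DeepSeek sketch above)


# --- Command-line interface ---
def main() -> None:
    parser = argparse.ArgumentParser(description="Ask the portfolio assistant")
    parser.add_argument("question")
    print(ask(parser.parse_args().question))


# --- Web app (Dash) ---
app = Dash(__name__)
app.layout = html.Div(
    [dcc.Input(id="question"), html.Button("Ask", id="ask"), html.Div(id="reply")]
)


@app.callback(
    Output("reply", "children"),
    Input("ask", "n_clicks"),
    State("question", "value"),
    prevent_initial_call=True,
)
def on_ask(_, question):
    return ask(question)
```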


Initial Validation

To test the system, I initially indexed a single real repository from my portfolio: evalcards, a Python library for generating model evaluation reports.

This allowed me to validate that the assistant understands metrics, documentation, and real code.


What this project demonstrates

  • Design of real RAG pipelines.
  • Technical judgment in chunking and retrieval.
  • Integration of AI with UX.
  • Product-oriented thinking and scalability.

Conclusion

This project transforms my portfolio into an AI-queryable knowledge base, allowing me to explain technical decisions and architecture interactively.

It is not just an AI demo; it is a new way to present technical expertise.