RAG Enhancement Feature Release

For a detailed comparison of analysis quality before and after this improvement, including specific test cases and PRs, please see the Detailed Comparative Analysis Report.

Overview

We are excited to announce the release of our enhanced Retrieval-Augmented Generation (RAG) architecture for CodeSherlock PR Reviews. This upgrade significantly improves how contextual information is retrieved and understood across the entire repository during code analysis.

The Problem

One of the key improvements focuses on reducing false positives related to imported files that are not part of the pull request but exist in the overall repository. Previously, this scenario contributed to unnecessary alerts because the analysis pipeline lacked visibility into those external dependencies. It could only analyze the files present directly within the PR.

The Solution: Enhanced Repository-Level Understanding

With our enhanced repository-level contextual understanding, the analysis pipeline can now fetch the necessary context by resolving dependencies from the main repository branch. This gives the AI a complete picture of the codebase structure and validation rules inherited from other parts of the system.
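To make the idea concrete, here is a minimal sketch of resolving a PR's imports against the main branch. It is an illustration under stated assumptions, not the CodeSherlock implementation: the function names (`imported_modules`, `module_to_path`, `resolve_external_context`) are hypothetical, the repository is modeled as an in-memory mapping of path to source text, and only absolute Python imports are handled.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the absolute module names imported by a Python source string."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def module_to_path(module: str) -> str:
    """Map a dotted module name to a repo-relative path (assumed layout)."""
    return module.replace(".", "/") + ".py"


def resolve_external_context(
    pr_files: dict[str, str], main_branch: dict[str, str]
) -> dict[str, str]:
    """Collect files imported by the PR that are absent from the PR
    but present on the main branch, so the reviewer can see them."""
    context: dict[str, str] = {}
    for source in pr_files.values():
        for module in imported_modules(source):
            path = module_to_path(module)
            if path not in pr_files and path in main_branch:
                context[path] = main_branch[path]
    return context


# Hypothetical example: the PR touches handlers.py, which imports a
# validator that lives only on the main branch.
pr_files = {"app/handlers.py": "from app.validators import check_email\nimport os\n"}
main_branch = {
    "app/handlers.py": "",
    "app/validators.py": "def check_email(value):\n    return '@' in value\n",
}
context = resolve_external_context(pr_files, main_branch)
```

In this sketch, `context` holds only `app/validators.py`: the standard-library import `os` has no matching file on the main branch and is skipped. A real pipeline would also need to handle relative imports, packages, and other languages.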

We have achieved approximately an 80% reduction in such false positives. Previously, we observed a false positive rate of around 30%; it is now at most 5%.
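As a quick check of how the two rates relate to the headline figure, the relative reduction can be computed as shown here. This assumes both percentages are measured over the same base; `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a rate dropped, relative to its starting value."""
    return (before - after) / before


# Figures from the text: about 30% before, at most 5% after.
reduction = relative_reduction(0.30, 0.05)
print(f"{reduction:.0%}")
```

Going from 30% to 5% is a relative reduction of about 83%, in line with the "approximately 80%" figure quoted above.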

Benefits

With this release, you get:

More accurate code reviews
Cleaner and more relevant findings
Reduced noise for developers
Faster and more confident merge decisions
