Increase Recommendation Systems’ Precision with LLMs, Using Python

A clear and practical article about artificial intelligence for a professional audience.


Traditional recommendation engines have long relied on collaborative filtering, matrix factorization, and deep learning architectures to predict user preferences. While these methods excel at exploiting large behavioral datasets, they often struggle with cold-start items, sparse interaction histories, and the semantic nuances buried inside unstructured metadata such as product descriptions, reviews, or content summaries. Large Language Models (LLMs) offer a compelling complement to classical pipelines. Because they bring natural-language understanding, zero-shot reasoning, and contextual relevance scoring, LLMs can bridge the gap between raw text and user intent.

The convergence of generative AI and recommendation infrastructure is a recurring theme across leading industry channels, including the [Google AI Blog](https://blog.google/technology/ai/), the [Microsoft AI Blog](https://www.microsoft.com/en-us/ai/blog/), [OpenAI News](https://openai.com/news/), and the [Towards Data Science](https://towardsdatascience.com/) community. Rather than replacing proven retrieval layers, practitioners are increasingly adopting hybrid architectures: classical models generate a broad set of candidates, and LLMs refine, rerank, or explain those candidates to boost precision. This article provides a practical guide to building such hybrid workflows in Python, covering environment setup, embedding-based enrichment, and LLM-driven reranking.
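
The hybrid flow described above (a classical model produces a broad shortlist, and a second stage reorders it) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a production design: the function names `retrieve_candidates` and `rerank`, the item IDs, and all scores are hypothetical, and the LLM call is replaced by a hard-coded lookup so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, float]  # (item_id, retrieval_score)


def retrieve_candidates(scores: Dict[str, float], k: int) -> List[Candidate]:
    """Stage 1: keep the k items with the highest classical-model score."""
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def rerank(
    candidates: List[Candidate],
    relevance_fn: Callable[[str], float],
    top_n: int,
) -> List[str]:
    """Stage 2: reorder the shortlist with a more expensive relevance scorer."""
    rescored = [(item_id, relevance_fn(item_id)) for item_id, _ in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in rescored[:top_n]]


# Hypothetical scores from a collaborative-filtering model.
cf_scores = {"book_a": 0.91, "book_b": 0.88, "book_c": 0.40, "book_d": 0.35}

# Stand-in for an LLM relevance judgment; hard-coded for illustration only.
fake_llm_relevance = {"book_a": 0.2, "book_b": 0.9, "book_c": 0.7, "book_d": 0.1}

shortlist = retrieve_candidates(cf_scores, k=3)
final = rerank(shortlist, lambda item_id: fake_llm_relevance[item_id], top_n=2)
print(final)  # ['book_b', 'book_c']
```

Keeping the second stage behind a plain callable means the expensive scorer only ever sees `k` items, and it can be swapped for an embedding model or an API call without touching the retrieval code.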

Requirements

Before proceeding, ensure your workstation meets the following prerequisites:

  • **Python 3.9 or newer** installed and available in your system path.
  • **pip**, the Python package installer, updated to a recent version.
  • **Virtual environment tooling** (`venv` is sufficient) to isolate project dependencies.
  • An **OpenAI API key** if you intend to run the GPT-based reranking examples. You can obtain one from the OpenAI platform.
  • **Hardware**: A minimum of 4 GB RAM is adequate for small local embedding models. If you plan to experiment with larger open-source LLMs locally, 16 GB RAM and a modern GPU are recommended; however, the examples below use API-based models or lightweight sentence transformers to remain accessible.

Step-by-step installation

Start by creating a dedicated directory to house your project files and code.

mkdir llm-recommender && cd llm-recommender

Create a Python virtual environment named `venv` to keep dependencies isolated from your global Python installation.

python -m venv venv

Activate the virtual environment. On Linux or macOS, run the following command.

source venv/bin/activate

On Windows, use the analogous activation script.

venv\Scripts\activate

Upgrade `pip` to prevent compatibility issues when installing scientific computing wheels.

pip install --upgrade pip

Install the core libraries required for data manipulation, traditional similarity metrics, local embedding inference, and OpenAI API access.

pip install pandas numpy scikit-learn sentence-transformers openai python-dotenv

Create a hidden environment file named `.env` to store your API credentials. Keep this file out of version control, for example by listing `.env` in your `.gitignore`.

touch .env

Open `.env` in your text editor and add your OpenAI API key as shown below. Replace the placeholder with your actual key.

OPENAI_API_KEY=sk-your-key-here

Your environment is now ready. The installed stack allows you to run local embedding models without a GPU, interact with OpenAI’s API for heavy generative lifting, and manage tabular data with `pandas`.
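
Before moving on, it can help to confirm that the packages are importable and that the key is visible to Python. The script below is a small sketch of such a check, not part of the original setup: `missing_modules` and `check_environment` are hypothetical helper names, and the script assumes it is run from the project directory that contains `.env`.

```python
import importlib.util
import os

# Import names for the packages installed earlier. Note that scikit-learn
# imports as "sklearn" and python-dotenv imports as "dotenv".
REQUIRED_MODULES = [
    "pandas",
    "numpy",
    "sklearn",
    "sentence_transformers",
    "openai",
    "dotenv",
]


def missing_modules(names):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Report missing packages and whether OPENAI_API_KEY is set."""
    missing = missing_modules(REQUIRED_MODULES)
    if "dotenv" not in missing:
        from dotenv import load_dotenv

        # Load variables from .env in the current working directory, if present.
        load_dotenv(".env")
    return {
        "missing_modules": missing,
        "has_api_key": bool(os.getenv("OPENAI_API_KEY")),
    }


print(check_environment())
```

An empty `missing_modules` list together with `has_api_key: True` indicates that the steps above completed as intended. The check only confirms that the variable is set; it does not validate the key against the API.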

Usage examples

The following sections demonstrate three distinct ways to inject LLM capabilities into a recommendation pipeline. Each example is self-contained and can be adapted to your own catalog data.

Example 1: Semantic item embeddings with sentence transformers

Classical collaborative filtering cannot score new items when they enter the catalog because there are no user interactions to learn from. A robust first step is to encode item metadata (titles, descriptions, or tags) into dense vectors using a transformer model. By computing similarity between user profile vectors and item vectors, you can surface semantically relevant content even without historical signals.

The code below loads a lightweight model, encodes a toy catalog, and ranks items against a user interest string.

import pandas as pd
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load a compact, high-quality embedding model suitable for CPU inference.
model = SentenceTransformer('all-MiniLM-L6-v2')

# Build a toy catalog where each item has unstructured descriptive text.
items = pd.DataFrame({
    'item_id': [1, 2, 3, 4],
    'description': [
        'A cyberpunk thriller set in a dystopian Tokyo underworld',
        'A hands-on guide to Python machine learning and pandas',
        'A romantic comedy about two rival pastry chefs in Paris',
        'An introductory textbook on deep learning and neural networks'
    ]
})

# Encode the catalog descriptions into 384-dimensional dense vectors.
item_embeddings = model.encode(
    items['description'].tolist(),
    convert_to_numpy=True,
    show_progress_bar=False
)

# Simulate a user profile expressed in natural language.
user_query = "I want to learn about artificial intelligence and neural nets"

# Encode the user query into the same semantic space.
user_embedding = model.encode([user_query], convert_to_numpy=True)

# Compute cosine similarity between the user and each item.
similarity_scores = cosine_similarity(user_embedding, item_embeddings).flatten()

# Attach scores and sort to produce the final ranking.
items['similarity'] = similarity_scores
ranked_items = items.sort_values('similarity', ascending=False)

print(ranked_items[['item_id', 'description', 'similarity']])
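
The listing above scores items against a single query string. When a user has an interaction history instead, one simple way to obtain a user profile vector is to average the embeddings of the items they engaged with and rank the remaining items against that average. The sketch below illustrates this under stated assumptions: mean pooling is one common choice rather than something the listing above prescribes, `build_user_profile` and `rank_unseen` are hypothetical helper names, and the tiny 3-dimensional vectors stand in for the 384-dimensional `item_embeddings` so that the example runs without downloading a model.

```python
import numpy as np


def build_user_profile(item_vectors: np.ndarray, liked_rows: list) -> np.ndarray:
    """Average the embeddings of liked items and L2-normalize the result."""
    profile = item_vectors[liked_rows].mean(axis=0)
    norm = np.linalg.norm(profile)
    return profile / norm if norm > 0 else profile


def rank_unseen(item_vectors: np.ndarray, profile: np.ndarray, seen_rows: list) -> list:
    """Return unseen row indices ordered by cosine similarity to the profile."""
    norms = np.linalg.norm(item_vectors, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero vectors
    scores = (item_vectors @ profile) / norms
    order = np.argsort(-scores)  # highest similarity first
    seen = set(seen_rows)
    return [int(i) for i in order if int(i) not in seen]


# Hand-made vectors for illustration; in practice pass `item_embeddings`.
toy_vectors = np.array([
    [1.0, 0.0, 0.0],  # row 0
    [0.0, 1.0, 0.1],  # row 1
    [0.0, 0.0, 1.0],  # row 2
    [0.1, 0.9, 0.2],  # row 3
])

# The user liked the item in row 1; rank everything they have not seen.
profile = build_user_profile(toy_vectors, liked_rows=[1])
print(rank_unseen(toy_vectors, profile, seen_rows=[1]))  # [3, 2, 0]
```

Row 3 ranks first because its vector points in nearly the same direction as the liked item. With real embeddings, the same two functions could take `item_embeddings` and the row positions of a user's past interactions.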

In this pattern, the LLM-derived embeddings capture semantic relationships that item IDs alone cannot. A matrix factorization model might associate item 2 and item 4 only if users co-clicked them, whereas the embedding model recognizes their topical similarity immediately. In production, you can store these vectors in a vector
