How to classify text with embeddings in n8n

intermediate 12 min read Updated 2026-06-01

Quick Answer

Classifying text with embeddings in n8n means building a workflow that calls an AI service to embed both the input text and each category label, then compares the vectors in Code nodes and picks the closest category. You'll need to configure AI service credentials, set up text preprocessing, and write the comparison logic, typically cosine similarity.

Prerequisites

Basic understanding of n8n workflows
OpenAI API key or similar AI service credentials
Knowledge of text classification concepts
Familiarity with JSON data structures

Step-by-Step Instructions

Set up AI Service Credentials

Navigate to Settings > Credentials in your n8n workspace. Click + Add Credential and select your AI service (OpenAI, Cohere, or Hugging Face). Enter your API key and test the connection. Save the credential with a descriptive name like OpenAI-Embeddings.

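Store credentials for more than one AI service as a fallback for outages. Embeddings from different providers or models are not comparable, so a fallback must re-embed the categories with the same model it uses for the text.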

Create Input Data Node

Add a Manual Trigger node (for testing) or a Webhook node (for live requests) to start your workflow. The workflow expects input in the following JSON structure; note that a Webhook node places the POSTed JSON under its body field:

{
  "text": "Your text to classify",
  "categories": ["category1", "category2"]
}
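
Because every later node assumes both fields are present, it can help to validate the payload right after the trigger. The sketch below is one possible check, not an n8n built-in: the name validatePayload is hypothetical, and it assumes a Webhook trigger nests the JSON under body while other triggers expose the fields at the top level.

```javascript
// Hypothetical validator for the trigger payload (not part of n8n).
function validatePayload(raw) {
  // Webhook items carry the POSTed JSON under `body`; other triggers may not.
  const nested = raw && typeof raw.body === 'object' && raw.body !== null;
  const payload = nested ? raw.body : raw;

  if (!payload || typeof payload.text !== 'string' || payload.text.trim() === '') {
    throw new Error('Payload needs a non-empty "text" string');
  }
  if (!Array.isArray(payload.categories) || payload.categories.length === 0) {
    throw new Error('Payload needs a non-empty "categories" array');
  }
  return { text: payload.text, categories: payload.categories.map(String) };
}

// In an n8n Code node ("Run Once for All Items") this could be wired up as:
// return [{ json: validatePayload($input.first().json) }];
```

Failing early here gives a clearer error than a failed API call several nodes later.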

Add Text Preprocessing Node

Insert a Code node after your trigger. Add JavaScript code to clean and prepare your text:

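// Runs in "Run Once for All Items" mode and reads the first incoming item.
const input = $input.first().json;
// A Webhook node nests the POSTed JSON under `body`; other triggers do not.
const payload = input.body ?? input;

const cleanText = String(payload.text ?? '')
  .toLowerCase()
  .replace(/[^\p{L}\p{N}\s]/gu, '') // drop punctuation but keep non-ASCII letters
  .replace(/\s+/g, ' ')             // collapse repeated whitespace
  .trim();

// n8n expects an array of items, each wrapped in a `json` key.
return [{
  json: {
    originalText: payload.text,
    cleanedText: cleanText,
    categories: payload.categories,
  },
}];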

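Consistent preprocessing keeps the text and category inputs comparable. Embedding models generally cope well with case and punctuation, so compare results with and without aggressive cleaning on your own data before settling on one approach.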

Configure OpenAI Embeddings Node

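Add an OpenAI node and select the Get Embeddings operation. If the OpenAI node in your n8n version does not offer that operation, an HTTP Request node that sends a POST to https://api.openai.com/v1/embeddings with your OpenAI credential works as a substitute. Set the Input Text field to {{ $json.cleanedText }}. Choose the embedding model, for example text-embedding-ada-002 or the newer text-embedding-3-small. Select your previously created credentials.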

Use the same embedding model for the input text and for the category references. Vectors from different models are not comparable and may not even have the same number of dimensions.
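
If you call the embeddings endpoint through an HTTP Request node or a Code node, it helps to know the request and response shapes. The sketch below assumes an OpenAI-style API, where the body carries model and input and each vector comes back under data[i].embedding; the helper names buildEmbeddingRequest and extractEmbeddings are hypothetical.

```javascript
// Builds the JSON body for an OpenAI-style embeddings request.
// `input` may be a single string or an array of strings.
function buildEmbeddingRequest(input, model = 'text-embedding-ada-002') {
  return { model, input };
}

// Pulls the vectors out of an OpenAI-style response, ordered by `index`
// so they line up with the order of the inputs that were sent.
function extractEmbeddings(response) {
  if (!response || !Array.isArray(response.data)) {
    throw new Error('Unexpected embeddings response shape');
  }
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((entry) => entry.embedding);
}

// Example with a hand-written response in the assumed shape.
const sampleResponse = {
  data: [
    { index: 1, embedding: [0.3, 0.4] },
    { index: 0, embedding: [0.1, 0.2] },
  ],
};
console.log(extractEmbeddings(sampleResponse)); // vectors for input 0, then input 1
```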

Create Reference Embeddings

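Add a second OpenAI node to generate embeddings for your classification categories. Use an Item Lists node (Split Out in newer n8n versions) to split the categories array into one item per category, then connect it to the OpenAI node. Set the input to the field that holds the category name, for example {{ $json.categories }} when splitting on the categories field, so that each category name is embedded separately.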
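
If you prefer to do the split in a Code node rather than a dedicated node, a minimal sketch could look like the following. The helper name categoriesToItems and the output field names category and index are assumptions made for illustration.

```javascript
// Turns one categories array into one n8n-style item per category.
function categoriesToItems(categories) {
  return categories
    .map((name) => String(name).trim())
    .filter((name) => name.length > 0)          // skip empty labels
    .map((category, index) => ({ json: { category, index } }));
}

// In a Code node ("Run Once for All Items"), assuming the previous node
// outputs a `categories` array:
// return categoriesToItems($input.first().json.categories);

console.log(categoriesToItems(['billing', ' support ', '']).length); // 2
```

With this variant, the embeddings node would read {{ $json.category }} instead.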

Calculate Similarity Scores

Add a Code node to compute cosine similarity between text and category embeddings:

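// 'OpenAI', 'OpenAI1' and 'Code' are n8n's default node names; adjust to your workflow.
// Assumption: the vector sits at json.embedding, or at json.data[0].embedding
// when the raw OpenAI API response is passed through unchanged.
const getEmbedding = (item) => item.json.embedding ?? item.json.data?.[0]?.embedding;

const textEmbedding = getEmbedding($('OpenAI').first());
const categoryItems = $('OpenAI1').all();
// Fallback names, in case the embedding items do not echo their input text.
const categoryNames = $('Code').first().json.categories;

function cosineSimilarity(a, b) {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  // Guard against zero vectors to avoid dividing by zero.
  return magnitudeA && magnitudeB ? dotProduct / (magnitudeA * magnitudeB) : 0;
}

// Items keep their order, so the i-th embedding belongs to the i-th category.
const similarities = categoryItems.map((cat, i) => ({
  category: cat.json.input ?? categoryNames[i],
  similarity: cosineSimilarity(textEmbedding, getEmbedding(cat)),
}));

return [{ json: { similarities } }];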

Cosine similarity values range from -1 to 1, with higher values indicating better matches.
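
To sanity-check the function before wiring in real embeddings, you can run it on toy vectors whose similarity is known in advance. This is a standalone sketch of the same formula with an added length check; it runs in plain Node.js and uses no n8n-specific variables.

```javascript
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0; // zero vector: report 0 instead of NaN
  return dot / Math.sqrt(magA * magB);
}

console.log(cosineSimilarity([1, 0], [1, 0]));  // 1  (same direction)
console.log(cosineSimilarity([1, 0], [0, 1]));  // 0  (orthogonal)
console.log(cosineSimilarity([1, 0], [-1, 0])); // -1 (opposite direction)
```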

Determine Classification Result

Add a final Code node to select the highest similarity score:

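// Reads the single item produced by the similarity node.
const similarities = $input.first().json.similarities;
if (!similarities || similarities.length === 0) {
  throw new Error('No similarity scores to compare');
}

// Keep whichever entry has the highest similarity.
const bestMatch = similarities.reduce((best, current) =>
  current.similarity > best.similarity ? current : best
);

// 'Code' is the default name of the preprocessing node; rename if yours differs.
return [{
  json: {
    originalText: $('Code').first().json.originalText,
    predictedCategory: bestMatch.category,
    confidence: bestMatch.similarity,
    allScores: similarities,
  },
}];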

Set a minimum confidence threshold, chosen by testing on labeled examples, and route anything below it to a fallback such as an "uncategorized" label or manual review.
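
One way to apply thresholds is to check both the top score and its lead over the runner-up. The sketch below is illustrative only: the function name decide, the status labels, and the values 0.75 and 0.02 are placeholders that should be tuned on labeled samples from your own data.

```javascript
// Hypothetical thresholds; tune both on your own labeled data.
const MIN_SCORE = 0.75;   // lowest top score accepted as a match
const MIN_MARGIN = 0.02;  // required lead over the second-best category

function decide(similarities, minScore = MIN_SCORE, minMargin = MIN_MARGIN) {
  const ranked = [...similarities].sort((a, b) => b.similarity - a.similarity);
  if (ranked.length === 0) {
    return { predictedCategory: null, status: 'no_categories' };
  }
  const [best, runnerUp] = ranked;
  // With a single category there is no runner-up, so the margin check is skipped.
  const margin = runnerUp ? best.similarity - runnerUp.similarity : null;
  const confident =
    best.similarity >= minScore && (margin === null || margin >= minMargin);
  return {
    predictedCategory: confident ? best.category : null,
    status: confident ? 'classified' : 'needs_review',
    confidence: best.similarity,
    margin,
  };
}
```

An If or Switch node downstream can then branch on the status field.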

Add Output and Error Handling

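Connect your final node to a Respond to Webhook node (when the workflow starts from a Webhook) or a Set node to output results. To handle API failures, create a separate error workflow that starts with an Error Trigger node and assign it in this workflow's settings; use a Stop and Error node where the workflow should fail deliberately, for example on empty input. In each OpenAI node's Settings tab, enable Retry On Fail and set Max Tries to 3.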

Common Issues & Troubleshooting

API rate limit errors

Add Wait nodes between API calls or implement exponential backoff in your Code nodes. Consider upgrading your API plan for higher rate limits.
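
Exponential backoff in a Code node can be sketched as follows. The callApi argument is a placeholder for whatever function performs the request, and the check on err.status === 429 is an assumption about the error shape; adapt it to the errors your HTTP client actually throws.

```javascript
// Delay in ms before retry number `attempt` (0-based), doubling up to a cap.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries `callApi` when it fails with a rate-limit error, waiting longer each time.
async function withBackoff(callApi, maxRetries = 5, sleep = defaultSleep) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await callApi();
    } catch (err) {
      const rateLimited = Boolean(err) && err.status === 429; // assumed error shape
      if (!rateLimited || attempt >= maxRetries) throw err;
      await sleep(backoffDelay(attempt));
    }
  }
}

console.log([0, 1, 2, 3].map((n) => backoffDelay(n))); // [ 1000, 2000, 4000, 8000 ]
```

Adding a small random jitter to each delay is a common refinement when many executions retry at once.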

Low classification accuracy

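Handle special characters consistently in preprocessing. Stemming or lemmatization can be tried, but embedding models are trained on natural text, so measure whether either helps before adopting it. Use more descriptive category names or add example texts for each category.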
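
If you add example texts for each category, one simple way to use them is to embed every example and average the vectors into a single reference vector per category. This is a sketch of that averaging step under the assumption that all examples were embedded with the same model; the function name centroid is hypothetical.

```javascript
// Averages several example embeddings into one reference vector.
function centroid(vectors) {
  if (vectors.length === 0) throw new Error('Need at least one example vector');
  const dim = vectors[0].length;
  const sum = new Array(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) throw new Error('All vectors must share one dimension');
    for (let i = 0; i < dim; i++) sum[i] += v[i];
  }
  return sum.map((x) => x / vectors.length);
}

console.log(centroid([[1, 0], [0, 1]])); // [ 0.5, 0.5 ]
```

The resulting vector can replace the category-name embedding in the similarity step.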

High API costs

Cache embeddings using a Redis or Database node to avoid recalculating identical text embeddings. Batch process multiple texts in single API calls.
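
Caching and batching combine naturally: look up every text first, then send only the misses in one request. The sketch below uses an in-memory Map as a stand-in for Redis or a database table, and the key format and helper names are assumptions. It relies on OpenAI-style embedding endpoints accepting an array of strings as input.

```javascript
// Hypothetical key format: model name plus the exact text that would be sent.
function cacheKey(model, text) {
  return `${model}:${text}`;
}

// Separates texts with a cached vector from texts that still need an API call.
function partitionByCache(cache, model, texts) {
  const hits = new Map();
  const misses = [];
  for (const text of new Set(texts)) { // de-duplicate repeated inputs
    const key = cacheKey(model, text);
    if (cache.has(key)) hits.set(text, cache.get(key));
    else misses.push(text);
  }
  return { hits, misses };
}

// Example with an in-memory Map standing in for Redis or a database.
const model = 'text-embedding-ada-002';
const cache = new Map([[cacheKey(model, 'hello'), [0.1, 0.2]]]);
const { hits, misses } = partitionByCache(cache, model, ['hello', 'world', 'world']);

// `misses` can go to the API in one batched request: { model, input: misses }
console.log(hits.size, misses); // 1 [ 'world' ]
```

Including the model name in the key prevents vectors from one model being reused after you switch to another.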

Workflow timeout issues

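Check the Timeout Workflow value in the workflow settings and raise it if long runs are being cut off. Enabling Save execution progress in the same settings helps you see which node a failed run reached. Split large inputs into smaller batches using the Loop Over Items (Split in Batches) node.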
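
The same batching idea can be expressed in a Code node when you need custom batch logic. This is a minimal sketch; the function name chunk and the batch size of 2 are arbitrary choices for illustration.

```javascript
// Splits an array into consecutive batches of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

console.log(chunk(['a', 'b', 'c', 'd', 'e'], 2)); // [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```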
