How to classify text with embeddings in n8n
Text embeddings classification in n8n means building a workflow that calls an AI service to generate embeddings for the input text and for each candidate category, then assigns the text to the category whose embedding is most similar. You'll need to configure AI service credentials, set up text preprocessing, and write the comparison logic in Code nodes.
Prerequisites
- Basic understanding of n8n workflows
- OpenAI API key or similar AI service credentials
- Knowledge of text classification concepts
- Familiarity with JSON data structures
Step-by-Step Instructions
Set up AI Service Credentials
Create a credential for your AI service (for example, an OpenAI API key) and name it OpenAI-Embeddings.
Create Input Data Node
{
  "text": "Your text to classify",
  "categories": ["category1", "category2"]
}
Add Text Preprocessing Node
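// Code node, assuming "Run Once for Each Item" mode so that $json is the current item.
const cleanText = $json.text
  .toLowerCase()
  .replace(/[^\w\s]/g, '')  // drop punctuation; \w is ASCII-only, so accented letters are removed too
  .replace(/\s+/g, ' ')     // collapse runs of whitespace
  .trim();
return {
  json: {
    originalText: $json.text,
    cleanedText: cleanText,
    categories: $json.categories
  }
};
Configure OpenAI Embeddings Node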
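Set the node's text input to the expression {{ $json.cleanedText }}. Choose the embedding model (recommended: text-embedding-ada-002; newer models such as text-embedding-3-small can be used the same way). Select your previously created credentials.
Create Reference Embeddings
Add a second embeddings node (the similarity code refers to it as OpenAI1) with its input set to {{ $json }} to process each category name into embeddings.
Calculate Similarity Scores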
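The score computed in this step is the cosine similarity between two embedding vectors. As a worked illustration, here it is on toy three-dimensional vectors; the values are made up, and real embeddings have many more dimensions:

```latex
\[
\cos(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
\]
\[
\mathbf{a} = (1, 2, 0),\quad \mathbf{b} = (2, 4, 0):\qquad
\cos(\mathbf{a}, \mathbf{b}) = \frac{1 \cdot 2 + 2 \cdot 4 + 0 \cdot 0}{\sqrt{5}\,\sqrt{20}} = \frac{10}{10} = 1
\]
\[
\mathbf{a} = (1, 2, 0),\quad \mathbf{c} = (0, 0, 5):\qquad
\cos(\mathbf{a}, \mathbf{c}) = \frac{1 \cdot 0 + 2 \cdot 0 + 0 \cdot 5}{\sqrt{5} \cdot 5} = 0
\]
```

A score near 1 means the two vectors point in nearly the same direction, and a score near 0 means they share no direction, so the category with the highest score is the closest match.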
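// Code node, assuming "Run Once for All Items" mode.
// Assumes each embeddings node outputs its vector at json.embedding; if your node
// returns the raw OpenAI response, the vector is at json.data[0].embedding instead.
const textEmbedding = $('OpenAI').first().json.embedding;
const categoryEmbeddings = $('OpenAI1').all();

function cosineSimilarity(a, b) {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  // Guard against zero vectors, which would otherwise produce NaN.
  if (magnitudeA === 0 || magnitudeB === 0) return 0;
  return dotProduct / (magnitudeA * magnitudeB);
}

// One score per category; json.input is assumed to hold the category name.
const similarities = categoryEmbeddings.map(cat => ({
  category: cat.json.input,
  similarity: cosineSimilarity(textEmbedding, cat.json.embedding)
}));

return [{ json: { similarities } }];
Determine Classification Result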
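// Code node, assuming "Run Once for Each Item" mode.
const similarities = $json.similarities;
if (!Array.isArray(similarities) || similarities.length === 0) {
  throw new Error('No similarity scores to compare');
}
// Keep the category with the highest similarity score.
const bestMatch = similarities.reduce((best, current) =>
  current.similarity > best.similarity ? current : best
);
return {
  json: {
    // 'Code' is the default name of the preprocessing node; change it if you renamed the node.
    originalText: $('Code').first().json.originalText,
    predictedCategory: bestMatch.category,
    confidence: bestMatch.similarity, // raw cosine similarity, not a probability
    allScores: similarities
  }
};
Add Output and Error Handling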
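One way this final step could look is a Code node that flags low-confidence or malformed results instead of passing them on silently. The sketch below is plain JavaScript; the function name `finalizeResult`, the status values, and the 0.75 cutoff are all hypothetical and not part of n8n.

```javascript
// Hypothetical cutoff; tune it against your own labelled examples.
const MIN_CONFIDENCE = 0.75;

// Turns a classification result into an output record with an explicit status.
function finalizeResult(result, minConfidence = MIN_CONFIDENCE) {
  // Reject malformed input early so a failed upstream call stays visible.
  if (!result || typeof result.confidence !== 'number' || Number.isNaN(result.confidence)) {
    return { status: 'error', message: 'Missing or invalid confidence score' };
  }
  // Scores below the cutoff are routed to manual review rather than trusted.
  if (result.confidence < minConfidence) {
    return { status: 'needs_review', predictedCategory: null, confidence: result.confidence };
  }
  return {
    status: 'ok',
    predictedCategory: result.predictedCategory,
    confidence: result.confidence
  };
}
```

Inside a Code node, the returned object could be wrapped as `{ json: finalizeResult($json) }`, and an IF node could then branch on `status`.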
Common Issues & Troubleshooting
API rate limit errors
Add Wait nodes between API calls or implement exponential backoff in your Code nodes. Consider upgrading your API plan for higher rate limits.
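As a minimal sketch of exponential backoff in plain JavaScript: the helper names `backoffDelayMs` and `withBackoff` are hypothetical, `fn` stands in for whatever function makes the API call, and the default retry check assumes the thrown error carries an HTTP status in `err.status` (429 is the standard "Too Many Requests" code). Whether timers are available inside a Code node depends on your n8n setup, so treat this as an illustration of the logic.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it succeeds, a non-retryable error occurs,
// or `maxRetries` retries have been used up.
async function withBackoff(fn, options = {}) {
  const {
    maxRetries = 4,
    baseMs = 500,
    capMs = 8000,
    wait = sleep, // injectable so the delay can be replaced in tests
    isRetryable = (err) => Boolean(err) && err.status === 429 // assumed error shape
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries || !isRetryable(err)) throw err;
      await wait(backoffDelayMs(attempt, baseMs, capMs));
    }
  }
}
```

With the defaults, the waits between attempts are 500 ms, 1000 ms, 2000 ms, and 4000 ms before the last error is rethrown.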
Low classification accuracy
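Review the text preprocessing step: handle special characters consistently, and test whether stemming or lemmatization actually helps, since embedding models are trained on natural text and aggressive cleanup can remove useful signal. Use more descriptive category names or add example texts for each category.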
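If you add example texts for each category, one common way to use them is to average their embeddings into a single reference vector per category and compare the input text against that. The sketch below assumes the example embeddings have already been generated; the category names and three-dimensional vectors are made-up placeholders, and `meanVector` is a hypothetical helper.

```javascript
// Element-wise mean of several equal-length vectors.
function meanVector(vectors) {
  if (vectors.length === 0) throw new Error('Need at least one vector');
  const dim = vectors[0].length;
  const sum = new Array(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) throw new Error('Vectors must have the same length');
    for (let i = 0; i < dim; i++) sum[i] += v[i];
  }
  return sum.map((x) => x / vectors.length);
}

// Hypothetical input: category name -> embeddings of a few example texts.
const exampleEmbeddings = {
  billing: [[1, 0, 0], [0.8, 0.2, 0]],
  support: [[0, 1, 0], [0, 0.6, 0.4]]
};

// One averaged reference vector per category.
const categoryCentroids = Object.fromEntries(
  Object.entries(exampleEmbeddings).map(([name, vecs]) => [name, meanVector(vecs)])
);
```

The averaged vectors can then take the place of the single category-name embeddings in the similarity step.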
High API costs
Cache embeddings using a Redis or database node to avoid recalculating embeddings for identical text. Batch multiple texts into a single API call.
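The caching and batching ideas can be combined: look up each text by a hash, then send only the texts that are missing in one batched request. In this sketch a `Map` stands in for Redis or a database, and `embedBatch` is a placeholder for a function that makes a single API call for an array of texts and resolves to the vectors in the same order (the OpenAI embeddings endpoint accepts an array of strings as `input`). The names `cacheKey` and `embedWithCache` are hypothetical, and using `require('crypto')` inside a Code node may need to be enabled on self-hosted n8n.

```javascript
const crypto = require('crypto');

// Stable cache key: SHA-256 of the model name plus the text,
// so vectors from different models never collide.
function cacheKey(model, text) {
  return crypto.createHash('sha256').update(`${model}\n${text}`).digest('hex');
}

// Returns one vector per input text, calling `embedBatch` at most once
// and only for texts that are not in the cache yet.
async function embedWithCache(texts, model, cache, embedBatch) {
  const missing = [...new Set(texts.filter((t) => !cache.has(cacheKey(model, t))))];
  if (missing.length > 0) {
    const vectors = await embedBatch(missing);
    missing.forEach((t, i) => cache.set(cacheKey(model, t), vectors[i]));
  }
  return texts.map((t) => cache.get(cacheKey(model, t)));
}
```

Duplicate texts within one call are sent only once, and a later call with the same text is served from the cache.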
Workflow timeout issues
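Enable Save execution progress (the workflow setting that stores intermediate results) so a run that times out can still be inspected. Split large inputs into smaller batches using the SplitInBatches node (named Loop Over Items in recent n8n versions).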
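When the batching happens inside a Code node rather than in a dedicated node, the same idea can be sketched in a few lines of plain JavaScript. The helper name `chunk` and the batch size used below are hypothetical.

```javascript
// Splits an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Example: five texts in batches of two.
const batches = chunk(['t1', 't2', 't3', 't4', 't5'], 2);
// batches is [['t1', 't2'], ['t3', 't4'], ['t5']]
```

Each batch can then be sent as one embeddings request, which keeps individual calls short and makes a failure cheaper to retry.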