Glossary · E-commerce ML
Product Embeddings
also: item embeddings · transformer embeddings
Definition
Product embeddings are dense vector representations of items in a learned semantic space, in which geometrically close items are similar in behavior, in content, or in both. Transformer-based embeddings trained on session sequences can capture nuanced substitute/complement relationships that simple collaborative filtering misses.
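As a rough illustration of "geometrically close items are similar," the sketch below ranks items by cosine similarity to a query item. The item IDs and three-dimensional vectors are made up for the example; real embeddings come from a trained model and typically have tens to hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_items(query_id, embeddings, k=2):
    """Return the k items closest to query_id, most similar first."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, not the output of any real model.
EMBEDDINGS = {
    "trail_shoe_a": [0.9, 0.1, 0.0],
    "trail_shoe_b": [0.8, 0.2, 0.1],
    "running_socks": [0.5, 0.7, 0.1],
    "coffee_maker": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for item_id, score in nearest_items("trail_shoe_a", EMBEDDINGS):
        print(item_id, round(score, 3))
```

With these toy vectors, the other trail shoe ranks first and the socks second, while the coffee maker is nearly orthogonal to the query. At catalog scale, this brute-force scan would normally be replaced by an approximate nearest-neighbor index.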
Early item embeddings (item2vec, an adaptation of word2vec) learned from co-occurrence within baskets or sessions. Modern transformer architectures such as BERT4Rec and SASRec model sequential intent ('what did the user view next') and can produce embeddings where vector arithmetic partly reflects substitution and complement structure. Production systems such as Pinterest's PinnerSage build user representations on top of learned item embeddings. Embeddings also enable zero-shot or few-shot recommendation for cold-start items via content-derived vectors.
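One simple way to give a cold-start item a usable vector is to borrow from existing items that share its content attributes. The sketch below is a deliberately naive stand-in for a learned content encoder: it averages catalog embeddings, weighted by Jaccard overlap of attribute sets. All names, attributes, and vectors here are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two attribute collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def cold_start_vector(new_attrs, catalog_attrs, catalog_vectors):
    """Overlap-weighted average of existing item vectors.

    Returns None when the new item shares no attributes with the catalog,
    so the caller can fall back to another strategy.
    """
    dim = len(next(iter(catalog_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for item_id, attrs in catalog_attrs.items():
        weight = jaccard(new_attrs, attrs)
        if weight == 0.0:
            continue
        vec = catalog_vectors[item_id]
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return None
    return [x / weight_sum for x in total]


# Hypothetical catalog: attribute tags plus already-trained embeddings.
CATALOG_ATTRS = {
    "trail_shoe_a": {"footwear", "running", "trail"},
    "running_socks": {"apparel", "running"},
    "coffee_maker": {"kitchen", "appliance"},
}
CATALOG_VECTORS = {
    "trail_shoe_a": [0.9, 0.1, 0.0],
    "running_socks": [0.5, 0.7, 0.1],
    "coffee_maker": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    # A new road-running shoe with no interaction history yet.
    new_item_attrs = {"footwear", "running", "road"}
    print(cold_start_vector(new_item_attrs, CATALOG_ATTRS, CATALOG_VECTORS))
```

The new shoe lands between the existing trail shoe and the socks, closer to the shoe because it shares more attributes with it. A production system would more likely map text or image features into the embedding space with a trained model; the weighted average only shows the idea of placing an unseen item near its content neighbors.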
Essays on this concept
- E-commerce ML
Transformer-Based Product Embeddings: Outperforming Collaborative Filtering with Multimodal Representations
Collaborative filtering needs a user to buy before it can recommend. Transformer-based embeddings understand products from their descriptions, images, and the behavioral context of browsing sessions — no purchase history required.
- E-commerce ML
Cold-Start Problem Solved: Few-Shot Learning for New Product Recommendations Using Meta-Learning
New products get no recommendations. No recommendations means no clicks. No clicks means no data. No data means no recommendations. Meta-learning breaks this loop by transferring knowledge from products that came before.
- E-commerce ML
Graph Neural Networks for Cross-Sell: Modeling the Product Co-Purchase Network at Scale
Association rules find that beer and diapers are co-purchased. Graph neural networks understand why — the underlying structure of complementary needs, occasion-based shopping, and brand affinity networks that connect products across categories.
- E-commerce ML
LLM-Powered Catalog Enrichment: Automated Attribute Extraction, Taxonomy Mapping, and SEO Generation
The average e-commerce catalog has 40% missing attributes, inconsistent taxonomy, and product descriptions written by suppliers who don't speak the customer's language. LLMs can fix all three — if you build the right quality assurance pipeline around them.
- Marketing Engineering
Building a Real-Time Personalization Engine: From Contextual Bandits to Deep Reinforcement Learning
A/B tests answer 'which variant is best on average.' Contextual bandits answer 'which variant is best for this user right now.' The difference in cumulative regret — and revenue — compounds daily.
- E-commerce ML
Search Ranking as a Revenue Optimization Problem: Learning-to-Rank with Business Objective Regularization
E-commerce search is not Google search. When a user types 'running shoes,' the goal isn't to find the most relevant document — it's to surface the product most likely to be purchased at the highest margin. This reframes ranking as a constrained revenue optimization problem.
Related concepts