Understanding Chunks and Products

This guide explains the advanced concepts behind how your AI agent finds and delivers accurate answers: vectorized chunks for semantic understanding and supplement products for precise filtering.

How Your AI Finds Answers

Your AI uses a hybrid search system that combines two powerful approaches:

  1. Semantic Search - Understands meaning and context from your content chunks
  2. Structured Search - Filters and sorts using your product catalog data

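Semantic search interprets what a customer means, while structured search enforces exact constraints such as price. Together they deliver results that neither approach could achieve alone.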


Vectorized Chunks

What Are Chunks?

When you upload content (web pages, documents, PDFs), the system breaks it into smaller, manageable pieces called chunks. Each chunk typically contains a paragraph or section of related information.

Example: A 10-page PDF about your return policy might become 25-30 chunks, each covering a specific topic like "refund timeframes," "exchange process," or "exceptions."

What Is Vectorization?

Each chunk is converted into a vector embedding - a numerical representation that captures its semantic meaning. Think of it as translating text into coordinates in a multi-dimensional space where similar meanings are located near each other.

How it works:

"affordable engagement rings" → [0.23, -0.45, 0.67, 0.12, ...]
"budget-friendly diamond bands" → [0.25, -0.43, 0.80, 0.14, ...]

These two phrases have similar vectors because they have similar meanings, even though they use different words.
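To make "similar vectors" concrete, here is a minimal sketch that compares the two example vectors using cosine similarity, a common closeness measure for embeddings. It uses only the four components shown above (the remaining components are elided in the example), and the function name `cosine_similarity` is illustrative rather than part of any product API.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Only the four components shown in the example; the rest are elided.
affordable_rings = [0.23, -0.45, 0.67, 0.12]
budget_bands = [0.25, -0.43, 0.80, 0.14]

score = cosine_similarity(affordable_rings, budget_bands)
print(f"similarity: {score:.3f}")  # values near 1.0 indicate similar direction
```

A score near 1.0 means the vectors point in almost the same direction, which is how "similar meaning" shows up numerically.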

How Semantic Search Works

When a customer asks a question:

  1. Query vectorization - The question is converted to a vector
  2. Similarity search - The system finds chunks with similar vectors
  3. Relevance ranking - Most relevant chunks are selected
  4. Answer generation - AI uses these chunks to craft a response

Example:

  • Customer asks: "What's your policy on returns?"
  • System finds chunks about refunds, exchanges, return windows
  • AI synthesizes an accurate answer from your actual content
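The four steps can be sketched roughly as follows. This is a toy illustration, not the platform's implementation: `embed` is a stand-in that counts words, so unlike a real embedding model it only matches shared words and does not capture synonyms. The sample chunks, the function names, and the prompt wording are all invented for the example, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Steps 1-3: vectorize the query, score every chunk, keep the top k."""
    query_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]

# Invented sample chunks for illustration only.
chunks = [
    "Returns are accepted within 30 days of delivery.",
    "Our showroom is open Monday through Saturday.",
    "Refunds for returns are issued to the original payment method.",
]

question = "What's your policy on returns?"
top_chunks = retrieve(question, chunks)

# Step 4: the selected chunks become the context handed to the answer model.
context = "\n".join(top_chunks)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The unrelated showroom chunk scores zero and never reaches the prompt, which is the point of retrieving before generating.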

Why Chunks Matter

Benefit                Description
Contextual answers     Responses based on your actual content, not generic AI knowledge
Accurate information   AI cites your policies, products, and services correctly
Reduced hallucination  Grounded in real data rather than making things up
Up-to-date responses   Reflects your latest uploaded content

Chunk Limits by Plan

Plan          Chunk Limit
Free          500
Professional  1,000
Premium       2,000
Enterprise    Custom

Managing Your Chunks

What counts toward your limit:

  • Uploaded documents (PDFs, DOCs, TXT files)
  • Crawled web pages
  • Manually added content

Tips for optimization:

  • Remove duplicate or outdated content
  • Focus on high-value information customers ask about
  • Consolidate similar pages when possible
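One way to spot duplicate content before uploading is to hash a normalized copy of each document and look for collisions. The sketch below is an offline helper you might run yourself, not a platform feature. The file names and texts are invented, and it only catches copies that are identical after whitespace and case normalization; near-duplicates would need a different technique.

```python
import hashlib

def normalize(text):
    """Collapse whitespace and case so trivially different copies match."""
    return " ".join(text.lower().split())

def find_duplicates(documents):
    """Map each repeated document name to the first document with the same content."""
    seen = {}
    duplicates = {}
    for name, text in documents.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest in seen:
            duplicates[name] = seen[digest]
        else:
            seen[digest] = name
    return duplicates

# Invented sample documents for illustration only.
documents = {
    "returns.pdf": "Items may be returned in original condition.",
    "returns-copy.txt": "Items may be  returned in ORIGINAL condition.\n",
    "shipping.pdf": "Orders ship within two business days.",
}
print(find_duplicates(documents))  # {'returns-copy.txt': 'returns.pdf'}
```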

Supplement Products

The Limitation of Semantic Search

While semantic search excels at understanding meaning, it struggles with numerical precision.

The problem:

  • "Affordable rings" and "luxury rings" have different meanings ✓
  • "$500 rings" and "$5,000 rings" may appear similar if descriptions match ✗

Vector embeddings don't encode numerical relationships well. A budget ring and an expensive ring might look similar in vector space if they're both described as "beautiful diamond engagement rings."

What Are Supplement Products?

Supplement products are the records in your structured product catalog. They add precise, filterable data on top of semantic search, "supplementing" the vector search with exact values.

Product Data Structure

Each product record contains:

Field        Description            Example
Title        Product name           "1.5ct Oval Diamond Ring"
URL          Product page link      "https://yoursite.com/rings/oval-15ct"
Price        Numeric price value    2499.00
Numeric 1-3  Custom numeric fields  Carat: 1.5, Length: 8mm
Categories   Classification tags    "Engagement, Oval, Natural"
Keywords     Searchable attributes  "solitaire, platinum, certified"
Media URL    Product image          "https://yoursite.com/images/ring.jpg"
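As a rough mental model, a product record could be represented like this. The attribute names (`numeric_1`, `media_url`, and so on) are assumptions chosen for the sketch and may not match the names used by the actual import format. The values come from the example column above, with carat mapped to the first numeric field and length to the second.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Product:
    """Illustrative shape of one catalog record; attribute names are assumed."""
    title: str
    url: str
    price: float
    numeric_1: Optional[float] = None  # custom numeric field, e.g. carat weight
    numeric_2: Optional[float] = None  # custom numeric field, e.g. length in mm
    numeric_3: Optional[float] = None  # unused in this example
    categories: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)
    media_url: Optional[str] = None

ring = Product(
    title="1.5ct Oval Diamond Ring",
    url="https://yoursite.com/rings/oval-15ct",
    price=2499.00,
    numeric_1=1.5,
    numeric_2=8.0,
    categories=["Engagement", "Oval", "Natural"],
    keywords=["solitaire", "platinum", "certified"],
    media_url="https://yoursite.com/images/ring.jpg",
)
print(ring.title, ring.price)
```

Storing price and the custom fields as numbers rather than text is what makes exact comparisons such as "under $2,000" possible.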

How Structured Search Works

The product catalog enables:

Capability           Example Query
Price filtering      "under $2,000" or "between $500-$1,000"
Numeric comparisons  "at least 1 carat" or "under 50,000 miles"
Category filtering   "show me oval diamonds" or "exclude SUVs"
Keyword matching     "with heated seats" or "certified pre-owned"
Budget awareness     "within my $3,000 budget"
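Conceptually, each capability above is an exact test against a product field. The sketch below shows one possible shape for such a filter. The parameter names, the dictionary keys, and the second catalog entry are all hypothetical, and the step that turns a customer's sentence into these parameters is not shown.

```python
def matches(product, price_under=None, price_at_least=None, numeric_1_at_least=None,
            include_category=None, exclude_category=None, keyword=None):
    """Return True if the product satisfies every constraint that was given."""
    if price_under is not None and product["price"] >= price_under:
        return False
    if price_at_least is not None and product["price"] < price_at_least:
        return False
    if numeric_1_at_least is not None and product["numeric_1"] < numeric_1_at_least:
        return False
    if include_category is not None and include_category not in product["categories"]:
        return False
    if exclude_category is not None and exclude_category in product["categories"]:
        return False
    if keyword is not None and keyword not in product["keywords"]:
        return False
    return True

# The second record is invented for illustration.
catalog = [
    {"title": "1.5ct Oval Diamond Ring", "price": 2499.00, "numeric_1": 1.5,
     "categories": ["Engagement", "Oval", "Natural"],
     "keywords": ["solitaire", "platinum", "certified"]},
    {"title": "0.8ct Round Diamond Ring", "price": 1450.00, "numeric_1": 0.8,
     "categories": ["Engagement", "Round", "Natural"],
     "keywords": ["halo", "gold"]},
]

under_2000 = [p["title"] for p in catalog if matches(p, price_under=2000)]
at_least_1ct = [p["title"] for p in catalog if matches(p, numeric_1_at_least=1.0)]
certified_oval = [p["title"] for p in catalog
                  if matches(p, include_category="Oval", keyword="certified")]
print(under_2000, at_least_1ct, certified_oval)
```

Unlike a similarity score, each test here is either true or false, so a product priced at $2,499 can never slip into an "under $2,000" result.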

The Hybrid Search in Action

Customer query: "Show me oval diamond rings under $3,000"

What happens:

  1. Semantic search analyzes chunks
    • Finds content about oval diamonds, ring styles, quality factors
    • Understands "oval" refers to shape, not just any mention of the word
  2. Structured search filters products
    • Filters shape category = "oval"
    • Filters price < $3,000
    • Returns matching products with exact prices
  3. Combined results
    • AI presents relevant products with accurate pricing
    • Provides context from your content about oval diamonds
    • Respects the exact budget constraint

Result: Customer sees oval rings priced $1,200 to $2,950 - not a $4,500 ring that happens to mention "affordable" in its description.
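The walkthrough can be approximated in a few lines. This is a sketch under stated assumptions, not the actual combination logic: the similarity scores are hard-coded rather than computed from embeddings, the chunk texts and product titles are invented, and `gather_answer_inputs` is a hypothetical helper name. Only the three prices come from the example above.

```python
def gather_answer_inputs(scored_chunks, products, category, price_under, k=2):
    """Collect top-k context chunks plus the products that pass the exact filters."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    context = [text for text, _score in ranked[:k]]
    matches = [
        p for p in products
        if category in p["categories"] and p["price"] < price_under
    ]
    matches.sort(key=lambda p: p["price"])
    return {"context": context, "products": matches}

# Hypothetical (chunk text, similarity score) pairs for the query.
scored_chunks = [
    ("Guide to oval diamond shapes and proportions.", 0.88),
    ("Store opening hours and directions.", 0.12),
    ("How to choose a setting for an oval stone.", 0.84),
]
products = [
    {"title": "Classic Oval Solitaire", "price": 1200.00, "categories": ["Oval"]},
    {"title": "Oval Halo Ring", "price": 2950.00, "categories": ["Oval"]},
    {"title": "Affordable Luxury Oval", "price": 4500.00, "categories": ["Oval"]},
    {"title": "Round Pave Band", "price": 900.00, "categories": ["Round"]},
]

inputs = gather_answer_inputs(scored_chunks, products, category="Oval", price_under=3000)
for p in inputs["products"]:
    print(p["title"], p["price"])
```

The $4,500 ring is dropped by the price test even though its title contains "Affordable", and the round band is dropped by the category test even though it is within budget.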

Industry Examples

Jewelry Store:

  • Price, carat weight, ring size
  • Categories: shape, metal, stone type
  • Keywords: certified, natural, lab-grown

Car Dealership:

  • Price, mileage, year
  • Categories: make, model, body style
  • Keywords: AWD, leather, sunroof

Real Estate:

  • Price, square footage, bedrooms
  • Categories: property type, neighborhood
  • Keywords: pool, garage, renovated

Restaurant:

  • Price, calories, prep time
  • Categories: cuisine, meal type, dietary
  • Keywords: gluten-free, spicy, vegetarian

Product Limits by Plan

Plan          Product Limit
Free          500
Professional  1,000
Premium       2,000
Enterprise    Custom

Chunks vs. Products: When to Use Each

Content Type             Use Chunks        Use Products
Policies and FAQs
Company information
Blog posts and articles
Product catalog          ✓ (descriptions)  ✓ (structured data)
Service listings
Pricing tables
Best practice: Use both together. Upload product descriptions as documents (creates chunks for semantic understanding) AND import your catalog as products (enables precise filtering).


Practical Tips

Optimizing Chunks

  1. Quality over quantity - Well-written content creates better chunks
  2. Clear structure - Use headings and paragraphs for logical chunking
  3. Avoid duplication - Uploading the same content twice wastes your chunk limit
  4. Update regularly - Remove outdated content, add new information

Optimizing Products

  1. Complete data - Fill in all available fields
  2. Consistent categories - Use the same terms across products
  3. Rich keywords - Include features customers search for
  4. Accurate pricing - Keep prices current for budget filtering
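Before importing a catalog, a quick audit can catch the first two problems. The sketch below flags empty fields and category spellings that differ only by case. The field names in `REQUIRED_FIELDS` and the sample records are assumptions for illustration, and note that a price of 0 would be reported as missing.

```python
from collections import defaultdict

REQUIRED_FIELDS = ("title", "url", "price", "categories", "keywords")  # assumed names

def audit(products):
    """Return (missing, variants): empty fields per title, and case-only category variants."""
    missing = {}
    spellings = defaultdict(set)
    for p in products:
        empty = [f for f in REQUIRED_FIELDS if not p.get(f)]
        if empty:
            missing[p.get("title", "<untitled>")] = empty
        for cat in p.get("categories", []):
            spellings[cat.strip().lower()].add(cat)
    variants = {key: sorted(found) for key, found in spellings.items() if len(found) > 1}
    return missing, variants

# Invented sample records for illustration only.
products = [
    {"title": "Ring A", "url": "https://yoursite.com/a", "price": 1200.0,
     "categories": ["Oval"], "keywords": ["certified"]},
    {"title": "Ring B", "url": "", "price": 950.0,
     "categories": ["oval"], "keywords": []},
]

missing, variants = audit(products)
print(missing)   # {'Ring B': ['url', 'keywords']}
print(variants)  # {'oval': ['Oval', 'oval']}
```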

Monitoring Usage

Check your usage in Admin > Billing Usage:

  • Current chunk count vs. limit
  • Current product count vs. limit
  • Progress bars show utilization percentage
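The utilization percentage is simply the current count divided by the plan limit. A minimal sketch, using the limits from the plan tables in this guide; the counts passed in are made-up examples, and Enterprise is omitted because its limit is custom.

```python
# Limits from the plan tables in this guide; Enterprise limits are custom.
PLAN_LIMITS = {"Free": 500, "Professional": 1000, "Premium": 2000}

def utilization(count, plan):
    """Return usage as a percentage of the plan limit, rounded to one decimal."""
    limit = PLAN_LIMITS[plan]
    return round(100 * count / limit, 1)

print(utilization(850, "Professional"))  # 85.0
```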

Technical Deep Dive

Vector Dimensions

Embeddings are high-dimensional vectors (typically 768 or 1536 dimensions) that encode semantic meaning. The AI model generating these embeddings was trained on billions of text examples to understand language nuances.

Similarity Metrics

When searching, the system calculates cosine similarity between the query vector and all chunk vectors. Higher similarity scores indicate more relevant content.
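For reference, the standard definition of cosine similarity between a query vector q and a chunk vector c, each with d dimensions, is shown below. The symbols q, c, and d are notation introduced here for the formula.

```latex
\[
\operatorname{sim}(\mathbf{q}, \mathbf{c})
  = \frac{\mathbf{q} \cdot \mathbf{c}}{\lVert \mathbf{q} \rVert \, \lVert \mathbf{c} \rVert}
  = \frac{\sum_{i=1}^{d} q_i c_i}{\sqrt{\sum_{i=1}^{d} q_i^{2}} \, \sqrt{\sum_{i=1}^{d} c_i^{2}}}
\]
```

The value ranges from -1 to 1. Because both vectors are divided by their lengths, only their direction matters, not their magnitude.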

Chunk Size

Chunks are sized to balance:

  • Context - Large enough to contain meaningful information
  • Precision - Small enough to match specific queries
  • Performance - Optimized for fast retrieval

Typical chunk size: 200-500 tokens (roughly 150-400 words).
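The platform's actual chunking algorithm is not described here, but the trade-off can be illustrated with a simple paragraph-based splitter. This sketch counts words as a rough proxy for tokens, and both the `max_words` parameter and the sample text are assumptions made for the example.

```python
def chunk_paragraphs(text, max_words=300):
    """Group consecutive paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words becomes its own oversized chunk.
    """
    chunks, current, count = [], [], 0
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        words = len(paragraph.split())
        # Start a new chunk when adding this paragraph would exceed the limit.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks

# Invented sample text; a tiny limit is used so the split is visible.
policy = (
    "Refunds are issued within a set window.\n\n"
    "Exchanges follow the same process.\n\n"
    "Some items are excluded."
)
result = chunk_paragraphs(policy, max_words=12)
print(len(result))  # 2
```

Splitting on paragraph boundaries keeps related sentences together, which is also why clearly structured source content tends to chunk well.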

Retrieval Count

For each query, the system retrieves the top-k most similar chunks (typically 3-10) to provide context for answer generation. This balances accuracy with response speed.
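Selecting the top-k items from a list of scored chunks is straightforward with Python's standard library. In this sketch the chunk IDs and scores are made up, and `top_k` is an illustrative helper rather than a platform function.

```python
import heapq

def top_k(scored_chunks, k=5):
    """Return the k (chunk_id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[1])

# Hypothetical similarity scores for five chunks.
scored = [("c1", 0.42), ("c2", 0.91), ("c3", 0.15), ("c4", 0.78), ("c5", 0.66)]
print(top_k(scored, k=3))  # [('c2', 0.91), ('c4', 0.78), ('c5', 0.66)]
```

A larger k gives the answer step more context to draw on but also more text to process, which is the accuracy versus speed trade-off mentioned above.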


Summary

Concept        Purpose                 Strength
Chunks         Semantic understanding  Meaning, context, natural language
Products       Structured filtering    Numbers, categories, exact values
Hybrid Search  Combined approach       Accurate, relevant, precise results

Your AI agent uses both systems together to deliver answers that understand what customers mean AND respect exact constraints like budgets and specifications.
