
Kuldeep Singh Sidhu

singhsidhukuldeep

AI & ML interests

Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io


Posts 63

Although this might sound like another way to make money on LLM API calls...

Good folks at @AnthropicAI just introduced Contextual Retrieval, and it's a significant yet logical step up from simple Retrieval-Augmented Generation (RAG)!

Here are the steps to implement Contextual Retrieval based on Anthropic's approach:

1. Preprocess the knowledge base:
- Break down documents into smaller chunks (typically a few hundred tokens each).
- Generate contextual information for each chunk using Claude 3 Haiku with a specific prompt.
- Prepend the generated context (usually 50-100 tokens) to each chunk.
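The preprocessing step above can be sketched as follows. This is a minimal illustration: `generate_context` is a placeholder for the actual Claude 3 Haiku call, and word-based chunking stands in for token-based chunking.

```python
def chunk_document(text: str, chunk_size: int = 200) -> list[str]:
    """Split a document into chunks of roughly `chunk_size` words
    (a stand-in for token-based chunking of a few hundred tokens)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def generate_context(document: str, chunk: str) -> str:
    # Placeholder: in the real pipeline this calls Claude 3 Haiku with a
    # prompt asking it to situate the chunk within the whole document,
    # returning ~50-100 tokens of context.
    return f"This chunk is from a document about {document.split()[0]}."

def contextualize(document: str) -> list[str]:
    """Step 1: prepend generated context to each chunk."""
    return [generate_context(document, c) + " " + c
            for c in chunk_document(document)]
```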

2. Create embeddings and a BM25 index:
- Use an embedding model (Gemini or Voyage recommended) to convert contextualized chunks into vector embeddings.
- Create a BM25 index using the contextualized chunks.
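For the BM25 side of step 2, here is a self-contained Okapi BM25 index over the contextualized chunks. In practice you would use a search engine or library rather than hand-rolling this; it is shown only to make the scoring concrete.

```python
import math
from collections import Counter

class BM25Index:
    """Minimal BM25 (Okapi) index over contextualized chunks."""

    def __init__(self, chunks: list[str], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.docs = [c.lower().split() for c in chunks]
        self.doc_len = [len(d) for d in self.docs]
        self.avgdl = sum(self.doc_len) / len(self.docs)
        # Document frequency of each term (number of chunks containing it).
        self.df = Counter(t for d in self.docs for t in set(d))
        self.n = len(self.docs)

    def idf(self, term: str) -> float:
        df = self.df.get(term, 0)
        return math.log((self.n - df + 0.5) / (df + 0.5) + 1)

    def score(self, query: str, i: int) -> float:
        freqs = Counter(self.docs[i])
        s = 0.0
        for t in query.lower().split():
            f = freqs[t]
            if not f:
                continue
            denom = f + self.k1 * (1 - self.b + self.b * self.doc_len[i] / self.avgdl)
            s += self.idf(t) * f * (self.k1 + 1) / denom
        return s

    def search(self, query: str, top_k: int = 5) -> list[int]:
        """Return indices of the top_k chunks for the query."""
        ranked = sorted(range(self.n), key=lambda i: self.score(query, i),
                        reverse=True)
        return ranked[:top_k]
```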

3. Set up the retrieval process:
- Implement a system to search both the vector embeddings and the BM25 index.
- Use rank fusion techniques to combine and deduplicate results from both searches.
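Anthropic does not specify the exact fusion method, but reciprocal rank fusion (RRF) is a common choice for combining a vector ranking with a BM25 ranking; a sketch under that assumption:

```python
def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Combine several ranked lists of chunk ids into one deduplicated list.
    Each id earns 1 / (k + rank) from every list it appears in, so chunks
    ranked highly by both searches rise to the top."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Chunk 2 below wins because it appears near the top of both the embedding and the BM25 rankings, while 1 and 4 each appear in only one.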

4. Implement reranking (optional but recommended):
- Retrieve the top 150 potentially relevant chunks initially.
- Use a reranking model (e.g., Cohere reranker) to score these chunks based on relevance to the query.
- Select the top 20 chunks after reranking.
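The rerank-and-trim flow can be sketched like this. `score_relevance` here is a toy lexical-overlap scorer standing in for a real reranking model such as Cohere's cross-encoder; only the surrounding top-150-in, top-20-out flow matches the step above.

```python
def score_relevance(query: str, chunk: str) -> float:
    # Toy stand-in for a reranking model: fraction of query terms
    # present in the chunk. A real reranker is a learned cross-encoder.
    q, d = set(query.lower().split()), set(chunk.lower().split())
    return len(q & d) / (len(q) or 1)

def rerank(query: str, candidates: list[str], top_k: int = 20) -> list[str]:
    """Score each candidate chunk (e.g., the top 150 from retrieval)
    against the query and keep the top_k."""
    scored = [(score_relevance(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```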

5. Integrate with the generative model:
- Add the top 20 chunks (or top K, based on your specific needs) to the prompt sent to the generative model.
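A minimal way to assemble the final prompt from the reranked chunks (the `<chunk>` wrapper and instruction wording are illustrative choices, not a prescribed format):

```python
def build_prompt(query: str, top_chunks: list[str]) -> str:
    """Step 5: place the top-K retrieved chunks ahead of the user's
    question in the prompt sent to the generative model."""
    context = "\n\n".join(f"<chunk>\n{c}\n</chunk>" for c in top_chunks)
    return (
        "Answer the question using only the retrieved chunks below.\n\n"
        f"{context}\n\n"
        f"Question: {query}"
    )
```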

6. Optimize for your use case:
- Experiment with chunk sizes, boundary selection, and overlap.
- Consider creating custom contextualizer prompts for your specific domain.
- Test different numbers of retrieved chunks (5, 10, 20) to find the optimal balance.

7. Leverage prompt caching:
- Use Claude's prompt caching feature to reduce costs when generating contextualized chunks.
- Cache the reference document once and reference it for each chunk, rather than passing it repeatedly.
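The caching idea can be sketched as a Messages API request body where the full document sits in a cached system block and only the per-chunk instruction varies. Field names follow Anthropic's prompt-caching documentation, but treat the exact shape (and the model string) as assumptions to verify against the current API reference; no request is actually sent here.

```python
def contextualizer_request(document: str, chunk: str) -> dict:
    """Build a request payload that caches the reference document once
    (`cache_control`) so each per-chunk call reuses it cheaply."""
    return {
        "model": "claude-3-haiku-20240307",
        "max_tokens": 150,
        "system": [
            {
                "type": "text",
                "text": f"<document>\n{document}\n</document>",
                # Marks this block as cacheable across the per-chunk calls.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": (
                    "Here is a chunk from the document above:\n"
                    f"<chunk>\n{chunk}\n</chunk>\n"
                    "Write a short context situating this chunk "
                    "within the overall document."
                ),
            }
        ],
    }
```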

8. Evaluate and iterate:
- Measure retrieval quality on representative queries and refine the pipeline based on the results.
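One concrete way to evaluate step 8 is recall@20 over a labeled query set, i.e., what fraction of the known-relevant chunks make it into the top 20 retrieved (Anthropic reports a closely related retrieval-failure metric); a sketch:

```python
def recall_at_k(retrieved: list[int], relevant: set[int], k: int = 20) -> float:
    """Fraction of relevant chunk ids that appear in the top-k retrieved."""
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def evaluate(results: list[tuple[list[int], set[int]]], k: int = 20) -> float:
    """Average recall@k over (retrieved, relevant) pairs for a query set."""
    return sum(recall_at_k(r, rel, k) for r, rel in results) / len(results)
```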
It's not every day you see a research paper named "Alice's Adventures in a Differentiable Wonderland," and when you open it, it's a 281-page book!

I haven't completed it yet, but this amazing work, written by Simone Scardapane, is a fascinating introduction to deep neural networks and differentiable programming.

Some key technical highlights:

• Covers core concepts like automatic differentiation, stochastic optimization, and activation functions in depth

• Explains modern architectures like convolutional networks, transformers, and graph neural networks

• Provides mathematical foundations including linear algebra, gradients, and probability theory

• Discusses implementation details in PyTorch and JAX

• Explores advanced topics like Bayesian neural networks and neural scaling laws

The book takes a unique approach, framing neural networks as compositions of differentiable primitives rather than biological analogs. It provides both theoretical insights and practical coding examples.

I especially enjoyed the sections on:

• Vector-Jacobian products and reverse-mode autodiff
• Stochastic gradient descent and mini-batch optimization
• ReLU, GELU, and other modern activation functions
• Universal approximation capabilities of MLPs

Whether you're new to deep learning or an experienced practitioner, this book offers valuable insights into the fundamentals and latest developments. Highly recommended for anyone working with neural networks!
