All HF Hub posts

davanstrien posted an update 1 day ago
ColPali is revolutionizing multimodal retrieval, but could it be even more effective with domain-specific fine-tuning?

Check out my latest blog post, where I guide you through creating a ColPali fine-tuning dataset using Qwen/Qwen2-VL-7B-Instruct to generate queries for a collection of UFO documents sourced from the Internet Archive.

The post covers:
- Introduction to data for ColPali models
- Using Qwen2-VL for retrieval query generation
- Tips for better query generation

Read the full post here:
https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html

The resulting Hugging Face dataset: davanstrien/ufo-ColPali
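
For a sense of what the query-generation step looks like, here is a minimal sketch using Qwen2-VL via Hugging Face transformers. The prompt wording, generation settings, and the page filename are my own assumptions, not necessarily what was used to build the dataset.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")

# One page image from the document collection (filename is illustrative)
image = Image.open("ufo_document_page.png")
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Write a short search query a user might type "
                                 "into a retrieval system to find this document page."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens to get the synthetic query
query = processor.batch_decode(
    output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(query)
```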
enzostvs posted an update 1 day ago
Looking for a logo idea 👀?
I made a cool new Space, enzostvs/Logo.Ai, to help you design a great logo in seconds!

Here are some examples of what you can do, feel free to share yours too! 🚀
davidberenstein1957 posted an update 1 day ago
🎉 Exciting News: Argilla 2.2.0 is Here! 🚀

We're thrilled to announce the release of Argilla 2.2.0, packed with powerful new features to enhance your data annotation and LLM workflows:

🗨️ ChatField: Work with text conversations natively in Argilla. Perfect for building datasets for conversational LLMs! (See the sketch at the end of this post.)
⚙️ Adjustable Task Distribution: Modify settings on the fly and automatically recalculate completed and pending records.
📊 Progress Tracking: Monitor annotation progress directly from the SDK, including user-specific metrics.
🧠 Automatic Settings Inference: Importing datasets from the Hugging Face Hub just got easier with automatic settings detection.
📋 Task Templates: Jump-start your projects with pre-built templates for common dataset types.
🔧 Background Jobs Support: Improved performance for long-running tasks (requires Redis).

Upgrade now and supercharge your data workflows!

Check out our full changelog for more details: https://github.com/argilla-io/argilla/compare/v2.1.0...v2.2.0
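
As a rough sketch of what the new ChatField might look like in practice, following Argilla 2.x SDK conventions; the server URL, API key, dataset name, and question choice are placeholders, not an official example.

```python
import argilla as rg

# Connect to your Argilla server (URL and key are placeholders)
client = rg.Argilla(api_url="http://localhost:6900", api_key="argilla.apikey")

# A chat-annotation dataset: one ChatField plus a simple rating question
settings = rg.Settings(
    fields=[rg.ChatField(name="conversation")],
    questions=[rg.RatingQuestion(name="quality", values=[1, 2, 3, 4, 5])],
)
dataset = rg.Dataset(name="chat-annotation-demo", settings=settings, client=client)
dataset.create()

# ChatField records are lists of {"role": ..., "content": ...} turns
dataset.records.log([
    {
        "conversation": [
            {"role": "user", "content": "What's new in Argilla 2.2.0?"},
            {"role": "assistant", "content": "ChatField, task templates, progress tracking, and more."},
        ]
    }
])
```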
fdaudens posted an update 1 day ago
IBM & NASA just released an open-source AI model for weather & climate on Hugging Face.

Prithvi WxC offers insights beyond forecasting, tackling challenges from local weather to global climate. Potential apps: targeted forecasts, severe weather detection & more. https://huggingface.co/Prithvi-WxC

This is impressive. Check out this comparison of Hurricane Ida between the ground truth and the AI model's prediction.
MonsterMMORPG posted an update 2 days ago
Detailed Comparison of JoyCaption Alpha One vs JoyCaption Pre-Alpha: 10 Different-Style Amazing Images. I think JoyCaption Alpha One is the very best image captioning model for model training at the moment. It works very fast and requires as little as 8.5 GB VRAM.

Where To Download And Install

You can download our app from here: https://www.patreon.com/posts/110613301

1-click install on Windows, RunPod, and Massed Compute
The official app, where you can try it out, is here: fancyfeast/joy-caption-alpha-one

The App Has The Following Features

- Auto-downloads meta-llama/Meta-Llama-3.1-8B into your Hugging Face cache folder and other necessary models into the installation folder
- Uses 4-bit quantization: 8.5 GB VRAM total (see the sketch after this list)
- Overwrite existing caption file
- Append new caption to existing caption
- Remove newlines from generated captions
- Cut off at last complete sentence
- Discard repeating sentences
- Don't save processed image
- Caption Prefix
- Caption Suffix
- Custom System Prompt (Optional)
- Input Folder for Batch Processing
- Output Folder for Batch Processing (Optional)
- Fully supported multi-GPU captioning: GPU IDs (comma-separated, e.g., 0,1,2)
- Batch Size for batch captioning
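
For context on the 4-bit point above, here is a minimal sketch of loading the Llama backbone in 4-bit with Hugging Face transformers and bitsandbytes. The exact quantization settings the app uses are an assumption, and JoyCaption adds a vision encoder on top of this LLM, so this only illustrates the memory-saving idea.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization keeps the 8B Llama backbone well under 10 GB of VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on whichever GPU IDs are visible
)
```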
singhsidhukuldeep posted an update 1 day ago
Although this might sound like another way to make money on LLM API calls...

Good folks at @AnthropicAI just introduced Contextual Retrieval, and it's a significant yet logical step up from simple Retrieval-Augmented Generation (RAG)!

Here are the steps to implement Contextual Retrieval based on Anthropic's approach:

1. Preprocess the knowledge base:
- Break down documents into smaller chunks (typically a few hundred tokens each).
- Generate contextual information for each chunk using Claude 3 Haiku with a specific prompt.
- Prepend the generated context (usually 50-100 tokens) to each chunk.

2. Create embeddings and a BM25 index:
- Use an embedding model (Gemini or Voyage recommended) to convert contextualized chunks into vector embeddings.
- Create a BM25 index using the contextualized chunks.

3. Set up the retrieval process:
- Implement a system to search both the vector embeddings and the BM25 index.
- Use rank fusion techniques to combine and deduplicate results from both searches.

4. Implement reranking (optional but recommended):
- Retrieve the top 150 potentially relevant chunks initially.
- Use a reranking model (e.g., Cohere reranker) to score these chunks based on relevance to the query.
- Select the top 20 chunks after reranking.

5. Integrate with the generative model:
- Add the top 20 chunks (or top K, based on your specific needs) to the prompt sent to the generative model.

6. Optimize for your use case:
- Experiment with chunk sizes, boundary selection, and overlap.
- Consider creating custom contextualizer prompts for your specific domain.
- Test different numbers of retrieved chunks (5, 10, 20) to find the optimal balance.

7. Leverage prompt caching:
- Use Claude's prompt caching feature to reduce costs when generating contextualized chunks.
- Cache the reference document once and reference it for each chunk, rather than passing it repeatedly.

8. Evaluate and iterate
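
A rough sketch of steps 1-3 above: contextualize chunks with Claude 3 Haiku, index them with BM25 and embeddings, then fuse the two rankings. The prompt wording, the naive chunking, the use of sentence-transformers as a stand-in embedding model (the post recommends Gemini or Voyage), and reciprocal rank fusion as the fusion method are all my own assumptions, not Anthropic's exact implementation.

```python
import anthropic
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model

def contextualize(document: str, chunk: str) -> str:
    """Step 1: ask Claude 3 Haiku for a short context blurb and prepend it to the chunk."""
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=150,
        messages=[{
            "role": "user",
            "content": (
                f"<document>\n{document}\n</document>\n"
                f"Here is a chunk from the document:\n<chunk>\n{chunk}\n</chunk>\n"
                "Give a short (50-100 token) context situating this chunk within the "
                "document, to improve search retrieval. Answer with only the context."
            ),
        }],
    )
    return response.content[0].text.strip() + "\n" + chunk

def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Step 3: fuse ranked lists of chunk ids into one deduplicated ranking."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# --- index time (steps 1-2) ---
document = open("my_document.txt").read()  # illustrative source file
chunks = [document[i:i + 1500] for i in range(0, len(document), 1500)]  # naive chunking
ctx_chunks = [contextualize(document, c) for c in chunks]
bm25 = BM25Okapi([c.lower().split() for c in ctx_chunks])
embeddings = embedder.encode(ctx_chunks, normalize_embeddings=True)

# --- query time (step 3) ---
query = "What does the report conclude?"
bm25_ranking = np.argsort(bm25.get_scores(query.lower().split()))[::-1].tolist()
query_emb = embedder.encode([query], normalize_embeddings=True)[0]
vector_ranking = np.argsort(embeddings @ query_emb)[::-1].tolist()
top_chunk_ids = reciprocal_rank_fusion([bm25_ranking[:150], vector_ranking[:150]])[:20]
top_chunks = [ctx_chunks[i] for i in top_chunk_ids]  # goes into the prompt (step 5)
```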
rwightman posted an update 2 days ago
A 'small' MobileNet-V4 update: I just pushed weights for the smallest model I've trained in the series, a 0.5 width-multiplier version of MobileNet-V4 Conv Small.

Now you may look at this and ask, why is this impressive? 64.8% top-1 and 2.2M params? MobileNetV3-Small 0.75 and MobileNet-V2 0.5 both have fewer params (~2M) and over 65% top-1, so what gives? Well, this is where MobileNet-V4 differs from the previous versions of the model family: it trades off (gives up) a little parameter efficiency for some computational efficiency.

So, let's look at the speed. On a 4090 w/ torch.compile:
* 98K img/sec - timm/mobilenetv4_conv_small_050.e3000_r224_in1k
* 58K img/sec - timm/mobilenetv3_small_075.lamb_in1k
* 37K img/sec - timm/mobilenetv2_050.lamb_in1k

And there you go: if you have a need for speed, MNV4 is the better option.
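
For reference, a minimal sketch of loading the new model with timm and torch.compile. The batch size and precision behind the throughput numbers above aren't stated, so this only shows the loading/inference pattern, not the exact benchmark setup.

```python
import timm
import torch

# The new 0.5-width MobileNet-V4 Conv Small with its pretrained ImageNet-1k weights
model = timm.create_model(
    "mobilenetv4_conv_small_050.e3000_r224_in1k", pretrained=True
).cuda().eval()
model = torch.compile(model)

x = torch.randn(256, 3, 224, 224, device="cuda")  # batch size is illustrative
with torch.inference_mode():
    logits = model(x)  # (256, 1000) ImageNet-1k logits
```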