Xenova's activity
We have Transformers.js, the JavaScript/WASM/WebGPU port of the Python library, which supports ~100 different architectures.
Docs: https://huggingface.co/docs/transformers.js
Repo: http://github.com/xenova/transformers.js
Is that the kind of thing you're looking for? :)
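For a quick start, the pipeline API is the easiest entry point; a minimal sketch (using the default model for the task) looks like this:
// npm i @xenova/transformers
import { pipeline } from '@xenova/transformers';
// Create a sentiment-analysis pipeline; the model is downloaded and cached on first use
const classifier = await pipeline('sentiment-analysis');
// Run inference
const output = await classifier('I love Transformers.js!');
console.log(output); // e.g. [{ label: 'POSITIVE', score: 0.99... }]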
- Demo: webml-community/phi-3.5-webgpu
- Source code: https://github.com/huggingface/transformers.js-examples/tree/main/phi-3.5-webgpu
Install it from NPM with:
npm i @huggingface/transformers
or via CDN, for example: https://v2.scrimba.com/s0lmm0qh1q
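When loading from a CDN, you import the library directly from a URL inside a module script; a minimal sketch (the jsDelivr URL and version pin here are illustrative) looks like this:
// Inside a <script type="module"> tag
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
// Example: compute a sentence embedding entirely in the browser
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const embedding = await embedder('Hello from the browser!', { pooling: 'mean', normalize: true });
console.log(embedding);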
Segment Anything demo: webml-community/segment-anything-webgpu
Tested on this iconic Letterman interview w/ Grace Hopper from 1983!
- Demo: Xenova/whisper-speaker-diarization
- Source code: Xenova/whisper-speaker-diarization
Xenova/whisper-word-level-timestamps
This unlocks a world of possibilities for in-browser video editing! What will you build?
Source code: https://github.com/xenova/transformers.js/tree/v3/examples/whisper-word-timestamps
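For reference, word-level timestamps can be requested directly through the pipeline API; a minimal sketch (the model, the output_attentions revision needed for word alignment, and the audio URL are illustrative) looks like this:
import { pipeline } from '@xenova/transformers';
// Load a Whisper export that returns cross-attentions (needed for word-level alignment)
const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', {
  revision: 'output_attentions',
});
// Transcribe with word-level timestamps
const output = await transcriber(
  'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav',
  { return_timestamps: 'word' },
);
console.log(output.chunks); // e.g. [{ text: ' And', timestamp: [0, 0.78] }, ...]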
Note: Since the API is experimental, you will need to install Chrome Dev/Canary version 127 or higher, and enable a few flags to get it working (see blog post for more detailed instructions)
The window.ai feature is going to change the web forever! It allows you to run Gemini Nano, a powerful 3.25B parameter LLM, 100% locally in your browser! We've also added experimental support to Transformers.js!
- Demo: Xenova/experimental-built-in-ai-chat
- Blog post: https://huggingface.co/blog/Xenova/run-gemini-nano-in-your-browser
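For context, here is a rough sketch of the experimental Prompt API as it was shaped in Chrome Dev/Canary 127; the exact surface has changed since, so treat these names as assumptions (the blog post covers the Transformers.js integration):
// Experimental Prompt API (Chrome Dev/Canary 127-era shape; method names are assumptions)
if (window.ai && (await window.ai.canCreateTextSession()) === 'readily') {
  const session = await window.ai.createTextSession();
  const reply = await session.prompt('Summarize why on-device LLMs matter, in one sentence.');
  console.log(reply);
}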
It supports tasks like image captioning, optical character recognition, object detection, and many more!
- Demo: Xenova/florence2-webgpu
- Models: https://huggingface.co/models?library=transformers.js&other=florence2
- Source code: https://github.com/xenova/transformers.js/tree/v3/examples/florence2-webgpu
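Here is a sketch of running Florence-2 with the v3 branch, loosely following the onnx-community model card; the model id, dtype, and the construct_prompts / post_process_generation helpers are assumptions on my part:
import {
  Florence2ForConditionalGeneration,
  AutoProcessor,
  AutoTokenizer,
  RawImage,
} from '@huggingface/transformers';
// Load model, processor, and tokenizer (model id and dtype are illustrative)
const model_id = 'onnx-community/Florence-2-base-ft';
const model = await Florence2ForConditionalGeneration.from_pretrained(model_id, { dtype: 'fp32' });
const processor = await AutoProcessor.from_pretrained(model_id);
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
// Load an image and prepare vision inputs
const image = await RawImage.fromURL('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg');
const vision_inputs = await processor(image);
// Choose a task and build the matching prompt
const task = '<MORE_DETAILED_CAPTION>';
const text_inputs = tokenizer(processor.construct_prompts(task));
// Generate, decode, and post-process into a structured result
const generated_ids = await model.generate({ ...text_inputs, ...vision_inputs, max_new_tokens: 100 });
const generated_text = tokenizer.batch_decode(generated_ids, { skip_special_tokens: false })[0];
console.log(processor.post_process_generation(generated_text, task, image.size));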
The model runs locally, meaning no data leaves your device!
Check it out!
- Demo: Xenova/whisper-webgpu
- Source code: https://github.com/xenova/whisper-web/tree/experimental-webgpu
The model might be a bit large, but it could be something to try!
- On-device inference: no data sent to a server
- WebGPU-accelerated (> 20 tokens/sec)
- Model downloaded once and cached
Try it out: Xenova/experimental-phi3-webgpu
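For reference, a text-generation pipeline can target WebGPU directly in v3; a minimal sketch (the model id and quantization are assumptions, and the demo ships its own build):
import { pipeline } from '@huggingface/transformers';
// Load a chat model onto the GPU (model id and dtype are illustrative)
const generator = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
  device: 'webgpu',
  dtype: 'q4f16',
});
// Chat-style input; the assistant reply is the last message in generated_text
const messages = [{ role: 'user', content: 'Explain WebGPU in one sentence.' }];
const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text.at(-1).content);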
Indeed! The model is cached on first load and will be reused once you refresh the page.
Everything runs 100% locally, meaning there are no calls to an API! Since it's served as a static HF space, it costs $0 to host and run!
We also added the ability to share your generated music to the discussion tab, so give it a try!
Xenova/musicgen-web
Xenova/webgpu-embedding-benchmark
On my device, I was able to achieve a 64.04x speedup over WASM! How much does WebGPU speed up ML models running locally in your browser? Try it out and share your results!
Sure! Here's the PR for it: https://github.com/xenova/transformers.js/pull/607. Vite + vanilla JS.
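To get a rough feel for the WASM vs. WebGPU gap on your own machine, a naive micro-benchmark might look like the sketch below; the model id and methodology are illustrative, and the Space above does proper warm-up and averaging:
import { pipeline } from '@huggingface/transformers';
// Time the same embedding model on each backend (very rough: single run, no warm-up)
const model_id = 'Xenova/all-MiniLM-L6-v2';
const text = 'WebGPU makes in-browser inference fast.';
for (const device of ['wasm', 'webgpu']) {
  const extractor = await pipeline('feature-extraction', model_id, { device });
  const start = performance.now();
  await extractor(text, { pooling: 'mean', normalize: true });
  console.log(`${device}: ${(performance.now() - start).toFixed(1)} ms`);
}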
Try it out yourself: Xenova/video-object-detection
(Model used + example code: Xenova/gelan-c_all)
This demo shows why on-device ML is so important:
1. Privacy - local inference means no user data is sent to the cloud
2. No server latency - empowers developers to build real-time applications
3. Lower costs - no need to pay for bandwidth and processing of streamed video
I can't wait to see what you build with it!
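For a simpler starting point than the demo's YOLOv9 setup, the object-detection pipeline works out of the box; here is a minimal sketch using a different, off-the-shelf model rather than Xenova/gelan-c_all:
import { pipeline } from '@xenova/transformers';
// Off-the-shelf detector (the demo itself uses Xenova/gelan-c_all with custom pre/post-processing)
const detector = await pipeline('object-detection', 'Xenova/detr-resnet-50');
// Run detection on a single image or video frame
const detections = await detector('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg', {
  threshold: 0.9,
});
console.log(detections); // e.g. [{ label: 'dog', score: 0.99, box: { xmin, ymin, xmax, ymax } }, ...]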
The source code can be found here.
Note that it's just vanilla JavaScript, but you can integrate it into your own React application.
Hi there! I suppose you could do this with custom configs and by specifying the model_file_name (see here). Feel free to open an issue on GitHub and I'll be happy to try to provide example code. Alternatively, you can find the .onnx files here, then use onnxruntime-node on the server and onnxruntime-web on the client to load the models. You can use the SamProcessor (provided by Transformers.js) to do the pre- and post-processing (see the model card for example usage).
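As a sketch of the model_file_name override: the repo and file names below are hypothetical, and the convention of dropping the .onnx suffix is an assumption on my part:
import { SamModel, AutoProcessor } from '@xenova/transformers';
// Hypothetical repo containing a custom SAM export
const repo = 'your-username/your-sam-export';
const model = await SamModel.from_pretrained(repo, {
  model_file_name: 'my_custom_export', // assumption: file name given without the '.onnx' suffix
});
const processor = await AutoProcessor.from_pretrained(repo);
// The processor handles pre-/post-processing, as in the SlimSAM example further down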
Everything runs 100% locally, meaning none of your images are uploaded to a server! At only ~45MB, the 8-bit quantized version of the model is perfect for in-browser usage (it even works on mobile).
Check it out!
Demo: Xenova/remove-background-web
Model: briaai/RMBG-1.4
Coming soon!
This means you can now generate high-quality segmentation masks for objects in a scene, directly in your browser!
Demo (+ source code): Xenova/segment-anything-web
Model: Xenova/slimsam-77-uniform
But how does this differ from Meta's original demo? Didn't that also run in-browser?
Well, in their demo, the image embeddings are computed server-side, then sent to the client for decoding. Trying to do this all client-side would be completely impractical: taking minutes per image!
That's where SlimSAM comes to the rescue! SlimSAM is a novel SAM compression method, able to shrink the model over 100x (637M → 5.5M params), while still achieving remarkable results!
The best part? You can get started in a few lines of JavaScript code, thanks to Transformers.js!
// npm i @xenova/transformers
import { SamModel, AutoProcessor, RawImage } from '@xenova/transformers';
// Load model and processor
const model = await SamModel.from_pretrained('Xenova/slimsam-77-uniform');
const processor = await AutoProcessor.from_pretrained('Xenova/slimsam-77-uniform');
// Prepare image and input points
const img_url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg';
const raw_image = await RawImage.read(img_url);
const input_points = [[[340, 250]]];
// Process inputs and perform mask generation
const inputs = await processor(raw_image, input_points);
const outputs = await model(inputs);
// Post-process masks
const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
console.log(masks);
// Visualize the mask
const image = RawImage.fromTensor(masks[0][0].mul(255));
image.save('mask.png');
I can't wait to see what you build with it!