Xenova's activity
We have Transformers.js, the JavaScript/WASM/WebGPU port of the Python library, which supports ~100 different architectures.
Docs: https://huggingface.co/docs/transformers.js
Repo: http://github.com/xenova/transformers.js
Is that the kind of thing you're looking for? :)
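For a quick start, the pipeline API is the easiest entry point; a minimal sketch (using the default model for the task) looks like this:
// npm i @xenova/transformers
import { pipeline } from '@xenova/transformers';
// Create a sentiment-analysis pipeline; the model is downloaded and cached on first use
const classifier = await pipeline('sentiment-analysis');
// Run inference
const output = await classifier('I love Transformers.js!');
console.log(output); // e.g. [{ label: 'POSITIVE', score: 0.99... }]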
- Demo: webml-community/phi-3.5-webgpu
- Source code: https://github.com/huggingface/transformers.js-examples/tree/main/phi-3.5-webgpu
Install it from NPM with:
npm i @huggingface/transformers
or via CDN, for example: https://v2.scrimba.com/s0lmm0qh1q
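When loading from a CDN, you import the library directly from a URL inside a module script; a minimal sketch (the jsDelivr URL and version pin here are illustrative) looks like this:
// Inside a <script type="module"> tag
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
// Example: compute a sentence embedding entirely in the browser
const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const embedding = await embedder('Hello from the browser!', { pooling: 'mean', normalize: true });
console.log(embedding);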
Segment Anything demo: webml-community/segment-anything-webgpu
Tested on this iconic Letterman interview w/ Grace Hopper from 1983!
- Demo: Xenova/whisper-speaker-diarization
- Source code: Xenova/whisper-speaker-diarization
Xenova/whisper-word-level-timestamps
This unlocks a world of possibilities for in-browser video editing! What will you build?
Source code: https://github.com/xenova/transformers.js/tree/v3/examples/whisper-word-timestamps
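For reference, word-level timestamps can be requested directly through the pipeline API; a minimal sketch (the model, the output_attentions revision needed for word alignment, and the audio URL are illustrative) looks like this:
import { pipeline } from '@xenova/transformers';
// Load a Whisper export that returns cross-attentions (needed for word-level alignment)
const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', {
  revision: 'output_attentions',
});
// Transcribe with word-level timestamps
const output = await transcriber(
  'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav',
  { return_timestamps: 'word' },
);
console.log(output.chunks); // e.g. [{ text: ' And', timestamp: [0, 0.78] }, ...]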
Note: Since the API is experimental, you will need to install Chrome Dev/Canary version 127 or higher, and enable a few flags to get it working (see blog post for more detailed instructions)
The window.ai feature is going to change the web forever! It allows you to run Gemini Nano, a powerful 3.25B parameter LLM, 100% locally in your browser! We've also added experimental support to Transformers.js!
- Demo: Xenova/experimental-built-in-ai-chat
- Blog post: https://huggingface.co/blog/Xenova/run-gemini-nano-in-your-browser
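For context, here is a rough sketch of the experimental Prompt API as it was shaped in Chrome Dev/Canary 127; the exact surface has changed since, so treat these names as assumptions (the blog post covers the Transformers.js integration):
// Experimental Prompt API (Chrome Dev/Canary 127-era shape; method names are assumptions)
if (window.ai && (await window.ai.canCreateTextSession()) === 'readily') {
  const session = await window.ai.createTextSession();
  const reply = await session.prompt('Summarize why on-device LLMs matter, in one sentence.');
  console.log(reply);
}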
It supports tasks like image captioning, optical character recognition, object detection, and many more!
- Demo: Xenova/florence2-webgpu
- Models: https://huggingface.co/models?library=transformers.js&other=florence2
- Source code: https://github.com/xenova/transformers.js/tree/v3/examples/florence2-webgpu
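Here is a sketch of running Florence-2 with the v3 branch, loosely following the onnx-community model card; the model id, dtype, and the construct_prompts / post_process_generation helpers are assumptions on my part:
import {
  Florence2ForConditionalGeneration,
  AutoProcessor,
  AutoTokenizer,
  RawImage,
} from '@huggingface/transformers';
// Load model, processor, and tokenizer (model id and dtype are illustrative)
const model_id = 'onnx-community/Florence-2-base-ft';
const model = await Florence2ForConditionalGeneration.from_pretrained(model_id, { dtype: 'fp32' });
const processor = await AutoProcessor.from_pretrained(model_id);
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
// Load an image and prepare vision inputs
const image = await RawImage.fromURL('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg');
const vision_inputs = await processor(image);
// Choose a task and build the matching prompt
const task = '<MORE_DETAILED_CAPTION>';
const text_inputs = tokenizer(processor.construct_prompts(task));
// Generate, decode, and post-process into a structured result
const generated_ids = await model.generate({ ...text_inputs, ...vision_inputs, max_new_tokens: 100 });
const generated_text = tokenizer.batch_decode(generated_ids, { skip_special_tokens: false })[0];
console.log(processor.post_process_generation(generated_text, task, image.size));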
The model runs locally, meaning no data leaves your device!
Check it out!
- Demo: Xenova/whisper-webgpu
- Source code: https://github.com/xenova/whisper-web/tree/experimental-webgpu
The model might be a bit large, but it could be something to try!
- On-device inference: no data sent to a server
- WebGPU-accelerated (> 20 tokens/sec)
- Model downloaded once and cached
Try it out: Xenova/experimental-phi3-webgpu
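For reference, a text-generation pipeline can target WebGPU directly in v3; a minimal sketch (the model id and quantization are assumptions, and the demo ships its own build):
import { pipeline } from '@huggingface/transformers';
// Load a chat model onto the GPU (model id and dtype are illustrative)
const generator = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
  device: 'webgpu',
  dtype: 'q4f16',
});
// Chat-style input; the assistant reply is the last message in generated_text
const messages = [{ role: 'user', content: 'Explain WebGPU in one sentence.' }];
const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text.at(-1).content);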
Indeed! The model is cached on first load and will be reused once you refresh the page.
Everything runs 100% locally, meaning there are no calls to an API! Since it's served as a static HF space, it costs $0 to host and run!
We also added the ability to share your generated music to the discussion tab, so give it a try!
Xenova/musicgen-web
Xenova/webgpu-embedding-benchmark
On my device, I was able to achieve a 64.04x speedup over WASM! How much does WebGPU speed up ML models running locally in your browser? Try it out and share your results!
Sure! Here's the PR for it: https://github.com/xenova/transformers.js/pull/607. Vite + vanilla JS.
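To get a rough feel for the WASM vs. WebGPU gap on your own machine, a naive micro-benchmark might look like the sketch below; the model id and methodology are illustrative, and the Space above does proper warm-up and averaging:
import { pipeline } from '@huggingface/transformers';
// Time the same embedding model on each backend (very rough: single run, no warm-up)
const model_id = 'Xenova/all-MiniLM-L6-v2';
const text = 'WebGPU makes in-browser inference fast.';
for (const device of ['wasm', 'webgpu']) {
  const extractor = await pipeline('feature-extraction', model_id, { device });
  const start = performance.now();
  await extractor(text, { pooling: 'mean', normalize: true });
  console.log(`${device}: ${(performance.now() - start).toFixed(1)} ms`);
}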
Try it out yourself: Xenova/video-object-detection
(Model used + example code: Xenova/gelan-c_all)
This demo shows why on-device ML is so important:
1. Privacy - local inference means no user data is sent to the cloud
2. No server latency - empowers developers to build real-time applications
3. Lower costs - no need to pay for bandwidth and processing of streamed video
I can't wait to see what you build with it!
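For a simpler starting point than the demo's YOLOv9 setup, the object-detection pipeline works out of the box; here is a minimal sketch using a different, off-the-shelf model rather than Xenova/gelan-c_all:
import { pipeline } from '@xenova/transformers';
// Off-the-shelf detector (the demo itself uses Xenova/gelan-c_all with custom pre/post-processing)
const detector = await pipeline('object-detection', 'Xenova/detr-resnet-50');
// Run detection on a single image or video frame
const detections = await detector('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg', {
  threshold: 0.9,
});
console.log(detections); // e.g. [{ label: 'dog', score: 0.99, box: { xmin, ymin, xmax, ymax } }, ...]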
The source code can be found here.
Note that it's just vanilla JavaScript, but you can integrate it into your own React application.
Hi there! I suppose you could do this with custom configs and by specifying the model_file_name (see here). Feel free to open an issue on GitHub and I'll be happy to try to provide example code. Alternatively, you can find the .onnx files here, then use onnxruntime-node on the server and onnxruntime-web on the client to load the models. You can use the SamProcessor (provided by Transformers.js) to do the pre- and post-processing (see the model card for example usage).
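As a sketch of the model_file_name override: the repo and file names below are hypothetical, and the convention of dropping the .onnx suffix is an assumption on my part:
import { SamModel, AutoProcessor } from '@xenova/transformers';
// Hypothetical repo containing a custom SAM export
const repo = 'your-username/your-sam-export';
const model = await SamModel.from_pretrained(repo, {
  model_file_name: 'my_custom_export', // assumption: file name given without the '.onnx' suffix
});
const processor = await AutoProcessor.from_pretrained(repo);
// The processor handles pre-/post-processing, as in the SlimSAM example further down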
Everything runs 100% locally, meaning none of your images are uploaded to a server! At only ~45MB, the 8-bit quantized version of the model is perfect for in-browser usage (it even works on mobile).
Check it out!
Demo: Xenova/remove-background-web
Model: briaai/RMBG-1.4
Coming soon!
This means you can now generate high-quality segmentation masks for objects in a scene, directly in your browser!
Demo (+ source code): Xenova/segment-anything-web
Model: Xenova/slimsam-77-uniform
But how does this differ from Meta's original demo? Didn't that also run in-browser?
Well, in their demo, the image embeddings are computed server-side, then sent to the client for decoding. Trying to do this all client-side would be completely impractical: taking minutes per image!
That's where SlimSAM comes to the rescue! SlimSAM is a novel SAM compression method, able to shrink the model over 100x (637M → 5.5M params), while still achieving remarkable results!
The best part? You can get started in a few lines of JavaScript code, thanks to Transformers.js!
// npm i @xenova/transformers
import { SamModel, AutoProcessor, RawImage } from '@xenova/transformers';
// Load model and processor
const model = await SamModel.from_pretrained('Xenova/slimsam-77-uniform');
const processor = await AutoProcessor.from_pretrained('Xenova/slimsam-77-uniform');
// Prepare image and input points
const img_url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg';
const raw_image = await RawImage.read(img_url);
const input_points = [[[340, 250]]];
// Process inputs and perform mask generation
const inputs = await processor(raw_image, input_points);
const outputs = await model(inputs);
// Post-process masks
const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
console.log(masks);
// Visualize the mask
const image = RawImage.fromTensor(masks[0][0].mul(255));
image.save('mask.png');
I can't wait to see what you build with it!