LED-Based Summarization Model: Condensing Long and Technical Information

This model is a Longformer Encoder-Decoder (LED) that I fine-tuned from allenai/led-base-16384 for narrative-style long-text summarization. It condenses extensive technical, academic, and narrative content in a fairly generalizable way.

Key Features and Use Cases

- Ideal for summarizing long narratives, articles, papers, textbooks, and other documents.
- The SparkNotes-esque style leads to 'explanations' in the summarized content, so the output interprets the source as well as condensing it.
- High capacity: handles up to 16,384 tokens per input.
- Demos: try it out in the accompanying notebook or in the demo on Spaces.

Note: The API widget has a max length of ~96 tokens due to inference timeout constraints.
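
Before sending a document to the model, it can help to check whether it fits in the 16,384-token input limit. The snippet below is a minimal sketch of that check. The helper names (`count_tokens`, `fits_in_context`) are hypothetical, and the whitespace tokenizer is only a rough stand-in for the model's real tokenizer, so treat the counts as approximate.

```python
from typing import Callable, List

MAX_INPUT_TOKENS = 16_384  # input limit stated in this card


def count_tokens(text: str, tokenize: Callable[[str], List]) -> int:
    """Count tokens using whatever tokenizer function is supplied."""
    return len(tokenize(text))


def fits_in_context(text: str, tokenize: Callable[[str], List],
                    limit: int = MAX_INPUT_TOKENS) -> bool:
    """Return True if the text fits within the token limit."""
    return count_tokens(text, tokenize) <= limit


# Assumption: whitespace splitting is only a rough stand-in. For real counts,
# pass a function backed by the model's tokenizer, e.g.
#   tok = AutoTokenizer.from_pretrained("pszemraj/led-base-book-summary")
#   tokenize = lambda text: tok(text)["input_ids"]
rough_tokenize = str.split

short_doc = "word " * 1_000
long_doc = "word " * 20_000
print(fits_in_context(short_doc, rough_tokenize))  # True
print(fits_in_context(long_doc, rough_tokenize))   # False
```

Subword tokenizers usually produce more tokens than a whitespace split, so a document that passes this rough check may still exceed the limit once the real tokenizer is used.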

Training Details

The model was trained on the BookSum dataset released by Salesforce, which is why it carries the bsd-3-clause license. Training ran for 16 epochs, with hyperparameters chosen for very gentle fine-tuning (a very low learning rate).

Model checkpoint: pszemraj/led-base-16384-finetuned-booksum.

Other Related Checkpoints

This model is the smallest and fastest BookSum-tuned model I have worked on. If you're looking for higher-quality summaries, the larger BookSum-tuned checkpoints on my HF profile are a better fit.

There are also variants trained on other datasets on my HF profile; feel free to try them out :)

Basic Usage

I recommend passing encoder_no_repeat_ngram_size=3 when calling the pipeline object. It prevents any 3-gram that appears in the input from being copied verbatim into the output, which pushes the model toward new wording and a more abstractive summary.
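
To illustrate what this setting guards against, here is a small conceptual sketch that finds the 3-grams a summary shares with its source. It works on whitespace-split words for readability; during generation the constraint is applied to token IDs, and this is not the transformers implementation. The function names are made up for this example.

```python
def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def copied_ngrams(source_tokens, summary_tokens, n=3):
    """Return the n-grams of the summary that also occur in the source."""
    return ngrams(source_tokens, n) & ngrams(summary_tokens, n)


source = "the quick brown fox jumps over the lazy dog".split()
extractive = "the quick brown fox sleeps".split()
abstractive = "a fast fox leaps over a sleepy dog".split()

# The extractive candidate copies two 3-grams from the source.
print(sorted(copied_ngrams(source, extractive)))
# The abstractive candidate shares no 3-gram with the source.
print(sorted(copied_ngrams(source, abstractive)))  # []
```

With encoder_no_repeat_ngram_size=3, a candidate like the first one could not be generated, because each copied 3-gram would be blocked at decoding time.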

Create the pipeline object:

import torch
from transformers import pipeline

hf_name = "pszemraj/led-base-book-summary"

summarizer = pipeline(
    "summarization",
    hf_name,
    device=0 if torch.cuda.is_available() else -1,
)

Feed the text into the pipeline object:

wall_of_text = "your words here"

result = summarizer(
    wall_of_text,
    min_length=8,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    do_sample=False,
    early_stopping=True,
)
# the summarization pipeline returns its output under the "summary_text" key
print(result[0]["summary_text"])

Simplified Usage with TextSum

To streamline the process of using this and other models, I've developed a Python utility package named textsum. It offers simple interfaces for applying summarization models to text documents of arbitrary length.

Install TextSum:

pip install textsum

Then use it in Python with this model:

from textsum.summarize import Summarizer

model_name = "pszemraj/led-base-book-summary"
summarizer = Summarizer(
    model_name_or_path=model_name,  # you can use any Seq2Seq model on the Hub
    token_batch_length=4096,  # how many tokens to batch summarize at a time
)
long_string = "This is a long string of text that will be summarized."
out_str = summarizer.summarize_string(long_string)
print(f"summary: {out_str}")
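
To make the token_batch_length setting more concrete, the sketch below shows the general idea of splitting a long token sequence into fixed-size batches that are summarized one at a time. This is an assumption about the overall approach, not textsum's actual code; `split_into_batches` is a hypothetical helper, and the real package may handle details such as overlap between batches differently.

```python
from typing import List


def split_into_batches(token_ids: List[int], batch_length: int = 4096) -> List[List[int]]:
    """Split a token ID sequence into consecutive chunks of at most batch_length."""
    if batch_length <= 0:
        raise ValueError("batch_length must be positive")
    return [
        token_ids[i:i + batch_length]
        for i in range(0, len(token_ids), batch_length)
    ]


# Fake token IDs stand in for a tokenized 10,000-token document.
fake_token_ids = list(range(10_000))
batches = split_into_batches(fake_token_ids, batch_length=4096)
print([len(b) for b in batches])  # [4096, 4096, 1808]
```

Each batch would then be summarized separately and the partial summaries joined. Smaller batches reduce memory use but give the model less context per pass.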

Currently implemented interfaces include a Python API, a Command-Line Interface (CLI), and a shareable demo/web UI.

For detailed explanations and documentation, see the textsum README or wiki.

Model size: 162M params (Safetensors, tensor type F32)

Evaluation results

All results are verified on the test set of each dataset.

Dataset          Metric        Value
kmfoda/booksum   ROUGE-1       33.454
kmfoda/booksum   ROUGE-2       5.223
kmfoda/booksum   ROUGE-L       16.204
kmfoda/booksum   ROUGE-LSUM    29.977
kmfoda/booksum   loss          3.199
kmfoda/booksum   gen_len       191.978
samsum           ROUGE-1       32.000
samsum           ROUGE-2       10.078
samsum           ROUGE-L       23.633
samsum           ROUGE-LSUM    28.783