Myself (@Steelskull) and @elinas have been working on a new rendition of the Aethora-15B model. It's built on the Llama 3 architecture and optimized especially for creative writing tasks (both kinds ;D) while maintaining strong general intelligence.
Model: L3-Aethora-15B-V2
ZeusLabs/L3-Aethora-15B-V2
Dataset: Aether-Lite-v1.8.1
TheSkullery/Aether-Lite-v1.8.1
What we've built:
A modified DUS (Depth Up-Scaled) model (originally created by Elinas): a passthrough merge producing a 15B model, with specific adjustments (zeroing of 'o_proj' and 'down_proj') that improve efficiency and reduce perplexity
Trained for 17.5 hours on 4 x A100 GPUs (huge thanks to g4rg for sponsoring the compute!)
Uses our Aether-Lite-V1.8.1 dataset with 125k high-quality samples
Focuses on creative writing and storytelling, with robust general intelligence
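To make the depth up-scaling idea above concrete, here's a toy sketch of the general technique: stack two overlapping layer ranges of the base model, then zero the 'o_proj' and 'down_proj' weights in the duplicated layers so the seam starts out close to a no-op. The layer counts, slice points, and which copy gets zeroed are illustrative assumptions, not the actual L3-Aethora-15B merge recipe (layers are stand-in dicts rather than real tensors):

```python
def depth_up_scale(layers, slice_a, slice_b, zero_names=("o_proj", "down_proj")):
    """Passthrough-style depth up-scaling sketch.

    Concatenates two (overlapping) layer ranges into a deeper stack,
    then zeroes the named projections in the second copy of each
    duplicated layer so the duplicated block initially contributes
    little, keeping perplexity from blowing up at the seam.
    """
    a = [dict(l) for l in layers[slice_a[0]:slice_a[1]]]
    b = [dict(l) for l in layers[slice_b[0]:slice_b[1]]]
    new_layers = a + b

    # Layers in [slice_b[0], slice_a[1]) appear in both slices; their
    # second copies sit at the start of b in the new stack.
    overlap = slice_a[1] - slice_b[0]
    dup_start = len(a)
    for i in range(dup_start, dup_start + overlap):
        for name in zero_names:
            new_layers[i][name] = 0.0  # zero the duplicated projection
    return new_layers


# Illustrative numbers only: 32 base layers up-scaled to 48.
base = [{"o_proj": 1.0, "down_proj": 1.0, "q_proj": 1.0} for _ in range(32)]
merged = depth_up_scale(base, (0, 24), (8, 32))
```

In practice a merge like this is expressed as a mergekit-style config over real checkpoints rather than hand-rolled code; the sketch just shows the slicing-plus-zeroing arithmetic.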
What makes L3-Aethora-15B v2 unique:
Creative Writing: We've really pushed its capabilities in generating engaging narratives and poetry, and in adapting to various writing styles, RP, and genres.
Versatile Intelligence: While we focused on creative tasks, it still handles scientific discussions, problem-solving, and educational content creation like a champ.
Long Context Understanding: Trained on the full sequence length of 8192 tokens, it maintains coherent conversations over extended interactions.
Carefully Curated Dataset: A lot of work was put into Aether-Lite-V1.8.1, our training dataset. It combines creative writing, instructional content, and specialized knowledge from various high-quality sources, all brought together by a custom data pipeline (more information on the process is available on the dataset page).
Open Source: We've made both the model and the full dataset available to the community.
We'd love your ideas and recommendations for further improvements!