
Pinned

  1. understand-r1-zero Public

    Understanding R1-Zero-Like Training: A Critical Perspective

    Python 1.2k 56

  2. zero-bubble-pipeline-parallelism Public

    Forked from NVIDIA/Megatron-LM

    Zero Bubble Pipeline Parallelism

    Python 449 31

  3. lorahub Public

    [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

    Python 667 39

  4. oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python 624 59

  5. stde Public

    Official implementation of Stochastic Taylor Derivative Estimator (STDE), NeurIPS 2024

    Python 128 10

  6. feedback-conditional-policy Public

    Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

    Python 58 2

Repositories

Showing 10 of 100 repositories
  • Stable-RL Public

    Rethinking the Trust Region in LLM Reinforcement Learning

    Python 8 Apache-2.0 0 0 5 Updated Feb 5, 2026
  • jrystal Public

    A JAX-based Differentiable Density Functional Theory Framework for Materials

    Python 43 Apache-2.0 1 5 0 Updated Feb 4, 2026
  • odc Public

    On-demand communication

    Python 29 2 0 5 Updated Feb 4, 2026
  • oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python 624 Apache-2.0 59 6 1 Updated Jan 29, 2026
  • LifelongSafetyAlignment
    Python 11 0 1 0 Updated Jan 13, 2026
  • feedback-conditional-policy Public

    Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

    Python 58 2 0 0 Updated Jan 5, 2026
  • InfNeRF Public

    InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity

    Python 11 Apache-2.0 1 1 0 Updated Jan 3, 2026
  • SkyLadder Public Forked from jzhang38/TinyLlama

    The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling

    Python 42 Apache-2.0 603 1 0 Updated Dec 29, 2025
  • d4ft Public

    A JAX library for Density Functional Theory.

    Python 54 Apache-2.0 5 16 0 Updated Nov 25, 2025
  • Precision-RL Public

    Defeating the Training-Inference Mismatch via FP16

    Python 180 MIT 15 4 0 Updated Nov 14, 2025