Fixing DoRA docs, adding to mem opt tutorial #1918

Merged: 9 commits into pytorch:main on Oct 29, 2024

Conversation

SalmanMohammadi (Collaborator)

Context

What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • other (please add here)

Fix DoRA docstring, add to the rest of the docs

                          ooo
                         $ o$
                        o $$
              ""$$$    o" $$ oo "
          " o$"$oo$$$"o$$o$$"$$$$$ o
         $" "o$$$$$$o$$$$$$$$$$$$$$o     o
      o$"    "$$$$$$$$$$$$$$$$$$$$$$o" "oo  o
     " "     o  "$$$o   o$$$$$$$$$$$oo$$
    " $     " "o$$$$$ $$$$$$$$$$$"$$$$$$$o
  o  $       o o$$$$$"$$$$$$$$$$$o$$"""$$$$o " "
 o          o$$$$$"    "$$$$$$$$$$ "" oo $$   o $
 $  $       $$$$$  $$$oo "$$$$$$$$o o $$$o$$oo o o
o        o $$$$$oo$$$$$$o$$$$ ""$$oo$$$$$$$$"  " "o
"   o    $ ""$$$$$$$$$$$$$$  o  "$$$$$$$$$$$$   o "
"   $      "$$$$$$$$$$$$$$   "   $$$"$$$$$$$$o  o
$   o      o$"""""$$$$$$$$    oooo$$ $$$$$$$$"  "
$      o""o $$o    $$$$$$$$$$$$$$$$$ ""  o$$$   $ o
 o     " "o "$$$$  $$$$$""""""""""" $  o$$$$$"" o o
 "  " o  o$o" $$$$o   ""           o  o$$$$$"   o
  $         o$$$$$$$oo            "oo$$$$$$$"    o
  "$   o o$o $o o$$$$$"$$$$oooo$$$$$$$$$$$$$$"o$o
    "o oo  $o$"oo$$$$$o$$$$$$$$$$$$"$$$$$$$$"o$"
     "$ooo $$o$   $$$$$$$$$$$$$$$$ $$$$$$$$o"
        "" $$$$$$$$$$$$$$$$$$$$$$" """"
                         """"""

@facebook-github-bot added the CLA Signed label on Oct 29, 2024
pytorch-bot (bot) commented on Oct 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1918

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 0d76a65 with merge base 1bbd749:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@@ -21,6 +21,7 @@ To make things easy, we've summarized these components in the following table:
":ref:`glossary_opt_in_bwd`", "Helps reduce memory usage when using stateful optimizers, particularly when full-finetuning large models with high gradient memory usage. This is not compatible with ``gradient_accumulation_steps``, so training may slow down due to reduced model throughput."
":ref:`glossary_lora`", "When you want to significantly reduce the number of trainable parameters, saving gradient and optimizer memory during training, and significantly speeding up training."
":ref:`glossary_qlora`", "When you need even more memory savings than LoRA, at the potential cost of some training speed. Useful for very large models or limited hardware."
":ref:`glossary_dora`", "Like LoRA, DoRA can provide significant memory savings and training speed-ups. DoRA may improve performance over LoRA, particularly when using small rank updates."
Contributor (inline comment on the diff above):
do you know when someone would choose dora over lora?

SalmanMohammadi (Collaborator, Author):
honestly not sure, according to the paper it's just straight up better
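
For anyone who wants to try it, here is a minimal sketch of opting in to DoRA with a torchtune LoRA model builder. The builder path and the `use_dora` flag are assumptions drawn from the docs this PR edits; verify the exact signature against the API reference for your torchtune version.

```python
# Hedged sketch: switching a torchtune LoRA builder over to DoRA.
# `use_dora` is assumed from the docs this PR touches; check your
# torchtune version for the real builder signature.
from torchtune.models.llama3 import lora_llama3_8b

model = lora_llama3_8b(
    lora_attn_modules=["q_proj", "v_proj"],  # projections that receive adapters
    lora_rank=8,      # low ranks are where DoRA is reported to help most
    lora_alpha=16,
    use_dora=True,    # swap the plain LoRA layers for DoRA layers
)
```

Everything else in a LoRA recipe (frozen base weights, adapter-only optimizer state) should carry over unchanged.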

From docs/source/tutorials/memory_optimizations.rst:

…the magnitude component is a scalar vector that adjusts the scale, while the direction component corresponds to the original LoRA decomposition and updates the orientation of the weights.

DoRA adds a small overhead to LoRA training due to the addition of the magnitude parameter, but it has been shown to improve the performance of LoRA, particularly at low ranks.
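
To make the magnitude/direction split concrete, here is an illustrative sketch of the DoRA reparameterization as described in the paper; `dora_weight` is a hypothetical helper, not torchtune's implementation.

```python
import torch

def dora_weight(w0, lora_a, lora_b, magnitude):
    """Illustrative DoRA weight: W' = m * (W0 + B @ A) / ||W0 + B @ A||_col.

    w0:        frozen base weight, shape (out_dim, in_dim)
    lora_a:    (rank, in_dim) and lora_b: (out_dim, rank), the usual LoRA pair
    magnitude: (1, in_dim) learned per-column scale, the extra DoRA parameter
    """
    combined = w0 + lora_b @ lora_a                             # base plus low-rank update
    direction = combined / combined.norm(dim=0, keepdim=True)   # unit-norm columns
    return magnitude * direction                                # rescale by learned magnitude
```

The lone extra trainable tensor over LoRA is `magnitude`, which is why the memory overhead is small; the column-wise normalization is the extra computation, which squares with the overhead discussion below.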
Contributor (inline comment on the excerpt above):
perf or memory overhead?

SalmanMohammadi (Collaborator, Author):
not 100% but there's an added parameter and extra computation so I'd say both

@RdoubleA (Contributor) left a review:

Few more typos, stamping to unblock

@codecov-commenter commented

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.63%. Comparing base (1bbd749) to head (0d76a65).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1918      +/-   ##
==========================================
- Coverage   67.77%   67.63%   -0.15%     
==========================================
  Files         304      304              
  Lines       16199    16241      +42     
==========================================
+ Hits        10979    10984       +5     
- Misses       5220     5257      +37     

☔ View full report in Codecov by Sentry.

@SalmanMohammadi merged commit 48a8449 into pytorch:main on Oct 29, 2024
16 checks passed
@SalmanMohammadi deleted the dora_docs branch on October 29, 2024 23:15
@SalmanMohammadi mentioned this pull request on Oct 30, 2024
Labels: CLA Signed (managed by the Facebook bot; authors must sign the CLA before a PR can be reviewed)

4 participants