Release Stability and additional improvements · Lightning-AI/pytorch-lightning

App

Added

Added the ability to set up basic authentication for Lightning apps (#16105)

Changed

The LoadBalancer now uses internal ip + port instead of the exposed URL (#16119)
Added support for logging in different trainer stages with DeviceStatsMonitor (#16002)
Changed lightning_app.components.serve.gradio to lightning_app.components.serve.gradio_server (#16201)
Made cluster creation/deletion async by default (#16185)

Fixed

Fixed not being able to run multiple lightning apps locally due to port collision (#15819)
Avoided a relpath bug on Windows (#16164)
Stopped using the deprecated LooseVersion (#16162)
Ported fixes to the autoscaler component (#16249)
Fixed a bug where lightning login with env variables would not correctly save the credentials (#16339)

Fabric

Added

Added Fabric.launch() to programmatically launch processes (e.g. in Jupyter notebook) (#14992)
Added the option to launch Fabric scripts from the CLI, without the need to wrap the code into the run method (#14992)
Added Fabric.setup_module() and Fabric.setup_optimizers() to support strategies that need to set up the model before an optimizer can be created (#15185)
Added support for Fully Sharded Data Parallel (FSDP) training in Lightning Lite (#14967)
Added lightning_fabric.accelerators.find_usable_cuda_devices utility function (#16147)
Added basic support for LightningModules (#16048)
Added support for managing callbacks via Fabric(callbacks=...) and emitting events through Fabric.call() (#16074)
Added Logger support (#16121)
- Added Fabric(loggers=...) to support different Logger frameworks in Fabric
- Added Fabric.log for logging scalars using multiple loggers
- Added Fabric.log_dict for logging a dictionary of multiple metrics at once
- Added Fabric.loggers and Fabric.logger attributes to access the individual logger instances
- Added support for calling self.log and self.log_dict in a LightningModule when using Fabric
- Added access to self.logger and self.loggers in a LightningModule when using Fabric
Added lightning_fabric.loggers.TensorBoardLogger (#16121)
Added lightning_fabric.loggers.CSVLogger (#16346)
Added support for a consistent .zero_grad(set_to_none=...) on the wrapped optimizer regardless of which strategy is used (#16275)

Changed

Renamed the class LightningLite to Fabric (#15932, #15938)
The Fabric.run() method is no longer abstract (#14992)
The XLAStrategy now inherits from ParallelStrategy instead of DDPSpawnStrategy (#15838)
Merged the implementation of DDPSpawnStrategy into DDPStrategy and removed DDPSpawnStrategy (#14952)
The dataloader wrapper returned from .setup_dataloaders() now calls .set_epoch() on the distributed sampler if one is used (#16101)
Renamed Strategy.reduce to Strategy.all_reduce in all strategies (#16370)
When using multiple devices, the strategy now defaults to "ddp" instead of "ddp_spawn" when none is set (#16388)

Removed

Removed support for FairScale's sharded training (strategy='ddp_sharded'|'ddp_sharded_spawn'). Use Fully-Sharded Data Parallel instead (strategy='fsdp') (#16329)

Fixed

Restored sampling parity between PyTorch and Fabric dataloaders when using the DistributedSampler (#16101)
Fixed an issue where the error message wouldn't tell the user the real value that was passed through the CLI (#16334)
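
The .set_epoch() change and the sampling-parity fix (#16101) both concern how a distributed sampler reshuffles between epochs. The following is a minimal pure-Python sketch of that idea, not Fabric's or PyTorch's implementation: ToyDistributedSampler is a hypothetical stand-in that, by assumption, derives its shuffle order from seed + epoch.

```python
import random


class ToyDistributedSampler:
    """Hypothetical stand-in for a distributed sampler (not the PyTorch class)."""

    def __init__(self, dataset_size, num_replicas, rank, seed=0):
        self.dataset_size = dataset_size
        self.num_replicas = num_replicas
        self.rank = rank
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        # Assumption: the shuffle order is derived from seed + epoch, so it
        # only changes between epochs if this method is called.
        self.epoch = epoch

    def __iter__(self):
        indices = list(range(self.dataset_size))
        random.Random(self.seed + self.epoch).shuffle(indices)
        # Each rank takes every num_replicas-th index, starting at its rank.
        return iter(indices[self.rank :: self.num_replicas])


sampler = ToyDistributedSampler(dataset_size=100, num_replicas=2, rank=0)

# Without set_epoch(), every epoch replays the same order.
first = list(sampler)
second = list(sampler)

# After set_epoch(1), the sampler shuffles with a different seed.
sampler.set_epoch(1)
third = list(sampler)
```

In this sketch, a wrapper that forgets to call set_epoch() would feed the model the same sample order in every epoch, which is the kind of mismatch with a hand-written PyTorch loop that the entries above describe.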

PyTorch

Added

Added support for native logging of MetricCollection with enabled compute groups (#15580)
Added support for custom artifact names in pl.loggers.WandbLogger (#16173)
Added support for DDP with LRFinder (#15304)
Added utilities to migrate checkpoints from one Lightning version to another (#15237)
Added support to upgrade all checkpoints in a folder using the pl.utilities.upgrade_checkpoint script (#15333)
Added an axes argument ax to .lr_find().plot() to enable writing to user-defined axes in a matplotlib figure (#15652)
Added log_model parameter to MLFlowLogger (#9187)
Added a check to validate that wrapped FSDP models are used while initializing optimizers (#15301)
Added a warning when self.log(..., logger=True) is called without a configured logger (#15814)
Added support for colossalai 0.1.11 (#15888)
Added LightningCLI support for optimizer and learning schedulers via callable type dependency injection (#15869)
Added support for activation checkpointing for the DDPFullyShardedNativeStrategy strategy (#15826)
Added the option to set DDPFullyShardedNativeStrategy(cpu_offload=True|False) via bool instead of needing to pass a configuration object (#15832)
Added info message for Ampere CUDA GPU users to enable tf32 matmul precision (#16037)
Added support for returning optimizer-like classes in LightningModule.configure_optimizers (#16189)

Changed

Switched from tensorboard to tensorboardX in TensorBoardLogger (#15728)
From now on, Lightning Trainer and LightningModule.load_from_checkpoint automatically upgrade the loaded checkpoint if it was produced in an old version of Lightning (#15237)
Trainer.{validate,test,predict}(ckpt_path=...) no longer restores the Trainer.global_step and Trainer.current_epoch values from the checkpoint. From now on, only Trainer.fit restores these values (#15532)
The ModelCheckpoint.save_on_train_epoch_end attribute is now computed dynamically every epoch, accounting for changes to the validation dataloaders (#15300)
The Trainer now raises an error if it is given multiple stateful callbacks of the same type with colliding state keys (#15634)
MLFlowLogger now logs hyperparameters and metrics in batched API calls (#15915)
Overriding the on_train_batch_{start,end} hooks in conjunction with taking a dataloader_iter in the training_step no longer errors out and instead shows a warning (#16062)
Moved tensorboardX to extra dependencies. The CSVLogger is now used by default (#16349)
Dropped PyTorch 1.9 support (#15347)
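
To illustrate the colliding state keys mentioned in #15634, here is a small pure-Python sketch. ToyCallback and find_colliding_state_keys are hypothetical names introduced only for this example, and the key format (class name plus distinguishing arguments) is an assumption, not a copy of Lightning's Callback class.

```python
class ToyCallback:
    """Hypothetical stateful callback (not Lightning's Callback class)."""

    def __init__(self, monitor=None):
        self.monitor = monitor

    @property
    def state_key(self):
        # Assumption: the key combines the class name with the arguments
        # that distinguish one instance from another.
        return f"{type(self).__name__}(monitor={self.monitor!r})"


def find_colliding_state_keys(callbacks):
    """Return each state key that is shared by more than one callback."""
    seen, collisions = set(), []
    for callback in callbacks:
        key = callback.state_key
        if key in seen and key not in collisions:
            collisions.append(key)
        seen.add(key)
    return collisions


# Two callbacks of the same type with identical arguments share a key.
colliding = find_colliding_state_keys(
    [ToyCallback(monitor="val_loss"), ToyCallback(monitor="val_loss")]
)

# Different arguments yield different keys, so nothing collides.
distinct = find_colliding_state_keys(
    [ToyCallback(monitor="val_loss"), ToyCallback(monitor="val_acc")]
)
```

Under these assumptions, the first pair could not be told apart when saving or restoring state from a checkpoint, which is the situation the Trainer now rejects with an error.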

Deprecated

Deprecated description, env_prefix and env_parse parameters in LightningCLI.__init__ in favour of giving them through parser_kwargs (#15651)
Deprecated pytorch_lightning.profiler in favor of pytorch_lightning.profilers (#16059)
Deprecated Trainer(auto_select_gpus=...) in favor of pytorch_lightning.accelerators.find_usable_cuda_devices (#16147)
Deprecated pytorch_lightning.tuner.auto_gpu_select.{pick_single_gpu,pick_multiple_gpus} in favor of pytorch_lightning.accelerators.find_usable_cuda_devices (#16147)
nvidia/apex deprecation (#16039)
- Deprecated pytorch_lightning.plugins.NativeMixedPrecisionPlugin in favor of pytorch_lightning.plugins.MixedPrecisionPlugin
- Deprecated the LightningModule.optimizer_step(using_native_amp=...) argument
- Deprecated the Trainer(amp_backend=...) argument
- Deprecated the Trainer.amp_backend property
- Deprecated the Trainer(amp_level=...) argument
- Deprecated the pytorch_lightning.plugins.ApexMixedPrecisionPlugin class
- Deprecated the pytorch_lightning.utilities.enums.AMPType enum
- Deprecated the DeepSpeedPrecisionPlugin(amp_type=..., amp_level=...) arguments
horovod deprecation (#16141)
- Deprecated Trainer(strategy="horovod")
- Deprecated the HorovodStrategy class
Deprecated pytorch_lightning.lite.LightningLite in favor of lightning.fabric.Fabric (#16314)
FairScale deprecation (in favor of PyTorch's FSDP implementation) (#16353)
- Deprecated the pytorch_lightning.overrides.fairscale.LightningShardedDataParallel class
- Deprecated the pytorch_lightning.plugins.precision.fully_sharded_native_amp.FullyShardedNativeMixedPrecisionPlugin class
- Deprecated the pytorch_lightning.plugins.precision.sharded_native_amp.ShardedNativeMixedPrecisionPlugin class
- Deprecated the pytorch_lightning.strategies.fully_sharded.DDPFullyShardedStrategy class
- Deprecated the pytorch_lightning.strategies.sharded.DDPShardedStrategy class
- Deprecated the pytorch_lightning.strategies.sharded_spawn.DDPSpawnShardedStrategy class

Removed

Removed deprecated pytorch_lightning.utilities.memory.get_gpu_memory_map in favor of pytorch_lightning.accelerators.cuda.get_nvidia_gpu_stats (#15617)
Temporarily removed support for Hydra multi-run (#15737)
Removed deprecated pytorch_lightning.profiler.base.AbstractProfiler in favor of pytorch_lightning.profilers.profiler.Profiler (#15637)
Removed deprecated pytorch_lightning.profiler.base.BaseProfiler in favor of pytorch_lightning.profilers.profiler.Profiler (#15637)
Removed deprecated code in pytorch_lightning.utilities.meta (#16038)
Removed the deprecated LightningDeepSpeedModule (#16041)
Removed the deprecated pytorch_lightning.accelerators.GPUAccelerator in favor of pytorch_lightning.accelerators.CUDAAccelerator (#16050)
Removed the deprecated pytorch_lightning.profiler.* classes in favor of pytorch_lightning.profilers (#16059)
Removed the deprecated pytorch_lightning.utilities.cli module in favor of pytorch_lightning.cli (#16116)
Removed the deprecated pytorch_lightning.loggers.base module in favor of pytorch_lightning.loggers.logger (#16120)
Removed the deprecated pytorch_lightning.loops.base module in favor of pytorch_lightning.loops.loop (#16142)
Removed the deprecated pytorch_lightning.core.lightning module in favor of pytorch_lightning.core.module (#16318)
Removed the deprecated pytorch_lightning.callbacks.base module in favor of pytorch_lightning.callbacks.callback (#16319)
Removed the deprecated Trainer.reset_train_val_dataloaders() in favor of Trainer.reset_{train,val}_dataloader (#16131)
Removed support for LightningCLI(seed_everything_default=None) (#16131)
Removed support in LightningLite for FairScale's sharded training (strategy='ddp_sharded'|'ddp_sharded_spawn'). Use Fully-Sharded Data Parallel instead (strategy='fsdp') (#16329)
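
Several of the removals above are plain module moves. As a rough migration aid, the sketch below rewrites the old module paths to their successors in source text. It is an illustration only: rewrite_imports and MODULE_MOVES are hypothetical names, the mapping covers only the five moves listed above, and names inside the modules may have changed as well, so the result should still be reviewed by hand.

```python
import re

# Module moves taken from the "Removed" entries above (old path -> new path).
MODULE_MOVES = {
    "pytorch_lightning.utilities.cli": "pytorch_lightning.cli",
    "pytorch_lightning.loggers.base": "pytorch_lightning.loggers.logger",
    "pytorch_lightning.loops.base": "pytorch_lightning.loops.loop",
    "pytorch_lightning.core.lightning": "pytorch_lightning.core.module",
    "pytorch_lightning.callbacks.base": "pytorch_lightning.callbacks.callback",
}


def rewrite_imports(source: str) -> str:
    """Replace removed module paths with their successors in source text."""
    for old, new in MODULE_MOVES.items():
        # The lookarounds keep "loggers.base" from matching inside a longer
        # name such as "loggers.baseline" or "my_pytorch_lightning...".
        pattern = r"(?<![\w.])" + re.escape(old) + r"(?!\w)"
        source = re.sub(pattern, new, source)
    return source


sample = (
    "from pytorch_lightning.callbacks.base import Callback\n"
    "from pytorch_lightning.loggers.baseline import Something\n"
)
migrated = rewrite_imports(sample)
```

Here only the first line of sample is rewritten; the second refers to a different (made-up) module and is left untouched.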

Fixed

Enhanced reduce_boolean_decision to accommodate any-analogous semantics expected by the EarlyStopping callback (#15253)
Fixed the incorrect optimizer step synchronization when running across multiple TPU devices (#16020)
Fixed a type error when dividing the chunk size in the ColossalAI strategy (#16212)
Fixed a bug where the interval key of the scheduler would be ignored during manual optimization, making the LearningRateMonitor callback fail to log the learning rate (#16308)
Fixed an issue with MLFlowLogger not finalizing correctly when status code 'finished' was passed (#16340)
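
The "any-analogous semantics" in #15253 can be pictured with a pure-Python simulation. This is a sketch under assumptions, not the library's code: combine_decisions is a hypothetical helper, and the list of booleans stands in for the per-process values that a real strategy would sum across ranks.

```python
def combine_decisions(decisions, require_all=True):
    """Combine one boolean decision per rank into a single decision.

    Simulates a SUM reduction over ranks: with require_all=True every rank
    must vote True; with require_all=False a single True vote is enough.
    """
    world_size = len(decisions)
    total = sum(int(d) for d in decisions)  # what a SUM all-reduce would yield
    if require_all:
        return total == world_size
    return total > 0


# One of four ranks wants to stop training.
votes = [False, True, False, False]
stop_if_all = combine_decisions(votes, require_all=True)
stop_if_any = combine_decisions(votes, require_all=False)
```

With these votes, the all-style reduction keeps training while the any-style reduction stops it, which is the behavior an early-stopping check would want if a single rank hitting the stopping criterion should be enough.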

Contributors

@1SAA, @akihironitta, @AlessioQuercia, @awaelchli, @bipinKrishnan, @Borda, @carmocca, @dmitsf, @erhoo82, @ethanwharris, @Forbu, @hhsecond, @justusschock, @lantiga, @lightningforever, @Liyang90, @manangoel99, @mauvilsa, @nicolai86, @nohalon, @rohitgr7, @schmidt-jake, @speediedan, @yMayanand

If we forgot someone due to not matching commit email with GitHub account, let us know :]
