[Concept Entry] Sklearn: Linear Discriminant Analysis #5824
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
--- | ||
Title: 'Linear Discriminant Analysis' | ||
Description: 'Linear Discriminant Analysis aims to project data onto a lower-dimensional space while preserving the information that discriminates between different classes.' | ||
Subjects: | ||
- 'Data Science' | ||
- 'Machine Learning' | ||
Tags: | ||
- 'Machine Learning' | ||
- 'Scikit-learn' | ||
- 'Supervised Learning' | ||
CatalogContent: | ||
- 'learn-python-3' | ||
- 'paths/computer-science' | ||
--- | ||
|
||
In Sklearn, **Linear Discriminant Analysis (LDA)** is a supervised algorithm that projects data onto a lower-dimensional space while preserving the information that discriminates between classes. LDA finds a set of directions in the original feature space, called discriminant directions, that maximize the separation between classes relative to the spread within each class. Projecting the data onto these directions reduces dimensionality while retaining the information most relevant for classification. | ||
|
||
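As a minimal sketch of the projection described above, the snippet below reduces the four Iris features to two discriminant directions. The dataset choice and `n_components=2` are assumptions made for illustration and are not part of the entry under review.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Iris has 4 features and 3 classes, so LDA can keep at most 3 - 1 = 2 directions
X, y = load_iris(return_X_y=True)

# Fit LDA and project the data onto the discriminant directions
lda = LinearDiscriminantAnalysis(n_components=2)
X_projected = lda.fit_transform(X, y)

print("Original shape:", X.shape)
print("Projected shape:", X_projected.shape)
print("Explained variance ratio:", lda.explained_variance_ratio_)
```

The number of directions LDA can keep is bounded by the smaller of the feature count and the class count minus one, which is why two is the maximum here.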
## Syntax | ||
|
||
```pseudo | ||
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis | ||
|
||
# Create an LDA model | ||
model = LinearDiscriminantAnalysis() | ||
|
||
# Fit the model to the training data | ||
model.fit(X_train, y_train) | ||
|
||
# Make predictions on the new data | ||
y_pred = model.predict(X_test) | ||
``` | ||
|
||
## Example | ||
|
||
The following example demonstrates the implementation of LDA: | ||
|
||
```py | ||
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis | ||
from sklearn.datasets import load_iris | ||
from sklearn.model_selection import train_test_split | ||
from sklearn.metrics import accuracy_score | ||
|
||
# Load the Iris dataset | ||
iris = load_iris() | ||
X = iris.data | ||
y = iris.target | ||
|
||
# Create training and testing sets by splitting the dataset | ||
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) | ||
|
||
# Create an LDA model | ||
model = LinearDiscriminantAnalysis() | ||
|
||
# Fit the model to the training data | ||
model.fit(X_train, y_train) | ||
|
||
# Make predictions on the test set | ||
y_pred = model.predict(X_test) | ||
|
||
# Evaluate the model | ||
print("Accuracy:", accuracy_score(y_test, y_pred)) | ||
``` | ||
|
||
The above code produces the following output: | ||
|
||
```shell | ||
Accuracy: 1.0 | ||
``` | ||
|
||
## Codebyte Example | ||
|
||
The following codebyte example demonstrates the implementation of LDA: | ||
|
||
```codebyte/python | ||
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis | ||
from sklearn.datasets import load_diabetes | ||
from sklearn.model_selection import train_test_split | ||
from sklearn.metrics import accuracy_score | ||
|
||
# Load the Diabetes dataset | ||
diabetes = load_diabetes() | ||
X = diabetes.data | ||
y = diabetes.target | ||
|
||
# Create training and testing sets by splitting the dataset | ||
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=44) | ||
|
||
# Create an LDA model | ||
model = LinearDiscriminantAnalysis() | ||
|
||
# Fit the model to the training data | ||
model.fit(X_train, y_train) | ||
|
||
# Make predictions on the test set | ||
y_pred = model.predict(X_test) | ||
|
||
# Evaluate the model | ||
print("Accuracy:", accuracy_score(y_test, y_pred)) | ||
||
``` |
I guess we should use the Iris dataset here, because the Diabetes dataset from …
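For context on the dataset concern raised in this thread: scikit-learn documents `load_diabetes` as a regression dataset, so its target is a quantitative measure rather than a class label. The quick check below is offered as a sketch, not as the reviewer's own reasoning. It compares the number of distinct target values in the two datasets.

```python
import numpy as np
from sklearn.datasets import load_diabetes, load_iris

# Load only the targets of both datasets
_, y_diabetes = load_diabetes(return_X_y=True)
_, y_iris = load_iris(return_X_y=True)

# Count the distinct target values a classifier would treat as classes
n_diabetes = np.unique(y_diabetes).size
n_iris = np.unique(y_iris).size

print("Distinct diabetes targets:", n_diabetes)
print("Distinct iris targets:", n_iris)
```

A classifier such as LDA treats every distinct target value as its own class, so an accuracy score computed on a target with many distinct values is hard to interpret.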
Should we add parameters for the `LinearDiscriminantAnalysis` to increase its readability? Reference - https://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html
Yup, we can certainly do that.
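One way the suggestion in this thread could look, as a sketch: the constructor call spelled out with keyword arguments from the scikit-learn reference linked above. The values chosen here (`solver="svd"`, `n_components=2`, `store_covariance=True`) are illustrative assumptions, not what the PR author committed.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Explicit keyword arguments; the values are illustrative, not recommendations
model = LinearDiscriminantAnalysis(
    solver="svd",           # default solver; does not compute the covariance matrix unless asked
    shrinkage=None,         # shrinkage is only supported by the 'lsqr' and 'eigen' solvers
    priors=None,            # class priors are inferred from the training data
    n_components=2,         # number of discriminant directions kept by transform()
    store_covariance=True,  # keep the shared covariance estimate in model.covariance_
    tol=1e-4,               # rank-estimation threshold used by the 'svd' solver
)
model.fit(X_train, y_train)

print("Test accuracy:", model.score(X_test, y_test))
print("Covariance shape:", model.covariance_.shape)
```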