[Concept Entry] Sklearn multilabel-classification #5817

Merged
mamtawardhani merged 16 commits into Codecademy:main on Dec 20, 2024

Conversation

SaviDahegaonkar
Collaborator

@SaviDahegaonkar SaviDahegaonkar commented Dec 13, 2024

Description

Created a new entry on the Multilabel Classification concept under Sklearn.

Issue Solved

Closes #4967

Type of Change

  • Adding a new entry

Checklist

  • All writings are my own.
  • My entry follows the Codecademy Docs style guide.
  • My changes generate no new warnings.
  • I have performed a self-review of my own writing and code.
  • I have checked my entry and corrected any misspellings.
  • I have made corresponding changes to the documentation if needed.
  • I have confirmed my changes are not being pushed from my forked main branch.
  • I have confirmed that I'm pushing from a new branch named after the changes I'm making.
  • I have linked any issues that are relevant to this PR in the Issues Solved section.

@mamtawardhani self-assigned this Dec 14, 2024
@mamtawardhani added the "new entry", "status: under review", and "sklearn" labels Dec 14, 2024
Collaborator

@mamtawardhani mamtawardhani left a comment

Thank you for contributing to Codecademy Docs @SaviDahegaonkar 😄

The entry looks good to be merged! 🚀

@@ -0,0 +1,120 @@
---
Title: 'Multilabel Classification'
Description: 'Multilabel classification assigns multiple labels to a single instance, using scikit-learn tools and models.'

Suggested change
Description: 'Multilabel classification is a machine learning task where each instance can be assigned multiple labels or categories simultaneously.'

Comment on lines 19 to 21
In Sklearn, **Multilabel classification** assigns multiple labels to a single instance using scikit-learn tools and models. This method differs from traditional classification, where each instance belongs to one class, and this technique predicts multiple outputs simultaneously.

Scikit-learn offers tools like `OneVsRestClassifier`, `ClassifierChain`, and `MultiOutputClassifier` to facilitate multilabel classification and enable efficient model training and evaluation.

Suggested change
In sklearn, **Multilabel classification** assigns multiple labels to a single instance, allowing models to predict multiple outputs simultaneously. This method differs from traditional classification, where each instance belongs to only one class.
Scikit-learn offers tools like `OneVsRestClassifier`, `ClassifierChain`, and `MultiOutputClassifier` to handle multilabel classification and enable efficient model training and evaluation.
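
To make the three wrappers named above concrete, here is a minimal sketch that fits each one on synthetic data. The choice of `LogisticRegression` as the base estimator, the dataset sizes, and the `shapes` dictionary are assumptions for illustration and are not part of the entry under review.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Synthetic data: y is a binary indicator matrix of shape (n_samples, n_classes)
X, y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=3, random_state=0
)

# Assumed base estimator; any scikit-learn classifier could be used instead
base = LogisticRegression(max_iter=1000)

wrappers = {
    # Fits one independent binary classifier per label
    "OneVsRestClassifier": OneVsRestClassifier(base),
    # Fits one independent classifier per output column
    "MultiOutputClassifier": MultiOutputClassifier(base),
    # Fits classifiers in sequence; each one also sees the earlier labels
    "ClassifierChain": ClassifierChain(base, random_state=0),
}

shapes = {}
for name, clf in wrappers.items():
    clf.fit(X, y)
    pred = clf.predict(X[:5])
    shapes[name] = pred.shape
    print(name, pred.shape)
```

All three return one column per label. `ClassifierChain` is the only one of the three that can use correlations between labels, because later classifiers in the chain receive earlier labels as extra features.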

Comment on lines 28 to 44
from sklearn.multioutput import MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier

# Step 1: Initialize the base classifier
base_model = RandomForestClassifier(random_state=42)

# Step 2: Create a MultiOutputClassifier wrapper for multilabel classification
multi_label_model = MultiOutputClassifier(base_model)

# Step 3: Train the model using the training dataset
multi_label_model.fit(X_train, y_train)

# Step 4: Make predictions on the test dataset
predicted_labels = multi_label_model.predict(X_test)

# Step 5: Evaluate predictions or use the results
print(predicted_labels)

Suggested change
from sklearn.multioutput import MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
# Step 1: Initialize the base classifier
base_model = RandomForestClassifier(random_state=42)
# Step 2: Create a MultiOutputClassifier wrapper for multilabel classification
multi_label_model = MultiOutputClassifier(base_model)
# Step 3: Train the model using the training dataset
multi_label_model.fit(X_train, y_train)
# Step 4: Make predictions on the test dataset
predicted_labels = multi_label_model.predict(X_test)
# Step 5: Evaluate predictions or use the results
print(predicted_labels)
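
The syntax snippet above refers to `X_train`, `y_train`, and `X_test` without defining them, and the suggested `train_test_split` import is not called. A runnable sketch of how those names might be produced follows; the synthetic dataset and the 75/25 split are assumptions, not part of the entry.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Assumed stand-in data: 200 samples, 10 features, 3 labels
X, y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=3, random_state=42
)

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Same steps as the snippet under review
base_model = RandomForestClassifier(random_state=42)
multi_label_model = MultiOutputClassifier(base_model)
multi_label_model.fit(X_train, y_train)
predicted_labels = multi_label_model.predict(X_test)

# One row per test sample, one column per label
print(predicted_labels.shape)
```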

Comment on lines 47 to 50
- `RandomForestClassifier`: Used as the base estimator, and can be replaced with any scikit-learn classifier.
- `MultiOutputClassifier`: Wraps the base model to handle multiple output labels.
- `fit()`: Method used to train the model on training data.
- `predict()`: Method that makes predictions on the test data.

Suggested change
- `RandomForestClassifier`: The base classifier for multilabel classification.
- `MultiOutputClassifier`: A wrapper to extend the base classifier for multilabel tasks.
- `Training and testing`: The model is trained with `fit()` and predictions are made using `predict()`.
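
As a rough illustration of what "wrapper" means in the bullets above, the sketch below inspects the fitted `estimators_` attribute of `MultiOutputClassifier`, which holds one clone of the base classifier per label column. The data and the small `n_estimators` value are assumptions chosen to keep the example fast.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Assumed synthetic data with 3 labels
X, y = make_multilabel_classification(
    n_samples=100, n_features=10, n_classes=3, random_state=0
)

model = MultiOutputClassifier(RandomForestClassifier(n_estimators=20, random_state=0))
model.fit(X, y)

# After fit(), the wrapper holds one fitted base classifier per label column
n_fitted = len(model.estimators_)
print(n_fitted)

# predict() stacks each per-label prediction into one column of the output
per_label = np.column_stack([est.predict(X[:5]) for est in model.estimators_])
same = np.array_equal(per_label, model.predict(X[:5]))
print(same)
```

Because each label gets its own independent model, this wrapper does not share information between labels.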

Comment on lines 57 to 78
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Generate synthetic multilabel data
X, y = make_multilabel_classification(n_samples=100, n_features=10, n_classes=3, n_labels=2, random_state=42)

# Initialize a base classifier
base_classifier = RandomForestClassifier()

# Wrap the base classifier for multilabel classification
model = MultiOutputClassifier(base_classifier)

# Train the model
model.fit(X, y)

# Predict labels for new data
predictions = model.predict(X[:5])

# Display predictions
print("Predicted Labels for First 5 Samples:")
print(predictions)

Suggested change
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier
from sklearn.metrics import classification_report
# Generate synthetic multilabel data
X, y = make_multilabel_classification(n_samples=100, n_features=10, n_classes=3, n_labels=2, random_state=42)
# Initialize a base classifier
base_classifier = RandomForestClassifier()
# Wrap the base classifier for multilabel classification
model = MultiOutputClassifier(base_classifier)
# Train the model
model.fit(X, y)
# Predict labels for new data
predictions = model.predict(X[:5])
# Display predictions
print("Predicted Labels for First 5 Samples:")
print(predictions)
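
The suggested example imports `classification_report` but never calls it. One way it could be used is sketched below, assuming a held-out test split and `zero_division=0` to silence warnings for labels with no predictions. The split sizes and the extra metrics (`accuracy_score`, `hamming_loss`) are additions for illustration only.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Assumed synthetic data, larger than in the entry so a test split is meaningful
X, y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=3, n_labels=2, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = MultiOutputClassifier(RandomForestClassifier(random_state=42))
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Per-label precision, recall, and F1, plus micro/macro/weighted/samples averages
print(classification_report(y_test, predictions, zero_division=0))

# Subset accuracy: fraction of samples whose whole label set is predicted exactly
subset_acc = accuracy_score(y_test, predictions)
# Hamming loss: fraction of individual label slots predicted wrongly
h_loss = hamming_loss(y_test, predictions)
print(subset_acc, h_loss)
```

Subset accuracy is strict, since a sample counts as correct only if every label matches, so it is often reported alongside Hamming loss or the per-label report.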

Comment on lines 84 to 89
Predicted Labels for First 5 Samples:
[[1 1 1]
[1 1 0]
[1 1 1]
[1 1 0]
[0 1 0]]

Suggested change
Predicted Labels for First 5 Samples:
[[1 1 0]
[1 1 0]
[0 0 1]
[1 1 1]
[0 1 0]]
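
The two output listings in this thread differ. One plausible reason (an assumption, not something stated in the review) is that the example creates `RandomForestClassifier()` without a `random_state`, so results can vary between runs. The sketch below shows how fixing the seed makes the printed predictions repeatable; `first_five_predictions` is a hypothetical helper name.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier


def first_five_predictions(seed):
    """Hypothetical helper: rerun the entry's example with a fixed forest seed."""
    X, y = make_multilabel_classification(
        n_samples=100, n_features=10, n_classes=3, n_labels=2, random_state=42
    )
    model = MultiOutputClassifier(RandomForestClassifier(random_state=seed))
    model.fit(X, y)
    return model.predict(X[:5])


# Two independent runs with the same seed give identical predictions
run_a = first_five_predictions(0)
run_b = first_five_predictions(0)
repeatable = np.array_equal(run_a, run_b)
print(repeatable)
```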

@mamtawardhani added the "status: review 1️⃣ completed" label and removed the "status: under review" label Dec 20, 2024
@mamtawardhani mamtawardhani merged commit a198871 into Codecademy:main Dec 20, 2024
6 checks passed

👋 @SaviDahegaonkar
You have contributed to Codecademy Docs, and we would like to know more about you and your experience.
Please take a minute to fill out this four-question survey to help us better understand Docs contributions and how we can improve the experience for you and our learners.
Thank you for your help!

🎉 Your contribution(s) can be seen here:

https://www.codecademy.com/resources/docs/sklearn/multilabel-classification

Please note it may take a little while for changes to become visible.
If you're appearing as anonymous and want to be credited, see here.

Successfully merging this pull request may close these issues.

[Concept Entry] Sklearn multilabel-classification
3 participants