comments | difficulty | edit_url | tags | |
---|---|---|---|---|
true |
Easy |
|
Table: Tweets
+----------------+---------+ | Column Name | Type | +----------------+---------+ | tweet_id | int | | content | varchar | +----------------+---------+ tweet_id is the primary key (column with unique values) for this table. This table contains all the tweets in a social media app.
Write a solution to find invalid tweets. A tweet is considered invalid if it meets any of the following criteria:
- It exceeds
140
characters in length. - It has more than
3
mentions. - It includes more than
3
hashtags.
Return the result table ordered by tweet_id
in ascending order.
The result format is in the following example.
Example:
Input:
Tweets table:
+----------+-----------------------------------------------------------------------------------+ | tweet_id | content | +----------+-----------------------------------------------------------------------------------+ | 1 | Traveling, exploring, and living my best life @JaneSmith @SaraJohnson @LisaTaylor | | | @MikeBrown #Foodie #Fitness #Learning | | 2 | Just had the best dinner with friends! #Foodie #Friends #Fun | | 4 | Working hard on my new project #Work #Goals #Productivity #Fun | +----------+-----------------------------------------------------------------------------------+
Output:
+----------+ | tweet_id | +----------+ | 1 | | 4 | +----------+
Explanation:
- tweet_id 1 contains 4 mentions.
- tweet_id 4 contains 4 hashtags.
We can use the LENGTH()
function to calculate the length of the string, calculate the length after excluding @
or #
, then use the OR
operator to connect these three conditions, filter out the corresponding tweet_id, and sort by tweet_id in ascending order.
# Write your MySQL query statement below
SELECT tweet_id
FROM Tweets
WHERE
LENGTH(content) > 140
OR (LENGTH(content) - LENGTH(REPLACE(content, '@', ''))) > 3
OR (LENGTH(content) - LENGTH(REPLACE(content, '#', ''))) > 3
ORDER BY 1;
import pandas as pd
def find_invalid_tweets(tweets: pd.DataFrame) -> pd.DataFrame:
invalid_tweets = tweets[
(tweets["content"].str.len() > 140)
| (tweets["content"].str.count("@") > 3)
| (tweets["content"].str.count("#") > 3)
].sort_values(by="tweet_id")
return invalid_tweets[["tweet_id"]]