Skip to content

Latest commit

 

History

History
126 lines (92 loc) · 3.63 KB

File metadata and controls

126 lines (92 loc) · 3.63 KB
comments difficulty edit_url tags
true
Easy
Database

中文文档

Description

Table: Tweets

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| tweet_id       | int     |
| content        | varchar |
+----------------+---------+
tweet_id is the primary key (column with unique values) for this table.
This table contains all the tweets in a social media app.

Write a solution to find invalid tweets. A tweet is considered invalid if it meets any of the following criteria:

  • It exceeds 140 characters in length.
  • It has more than 3 mentions.
  • It includes more than 3 hashtags.

Return the result table ordered by tweet_id in ascending order.

The result format is in the following example.

 

Example:

Input:

Tweets table:

  +----------+-----------------------------------------------------------------------------------+
  | tweet_id | content                                                                           |
  +----------+-----------------------------------------------------------------------------------+
  | 1        | Traveling, exploring, and living my best life @JaneSmith @SaraJohnson @LisaTaylor |
  |          | @MikeBrown #Foodie #Fitness #Learning                                             | 
  | 2        | Just had the best dinner with friends! #Foodie #Friends #Fun                      |
  | 4        | Working hard on my new project #Work #Goals #Productivity #Fun                    |
  +----------+-----------------------------------------------------------------------------------+
  

Output:

  +----------+
  | tweet_id |
  +----------+
  | 1        |
  | 4        |
  +----------+
  

Explanation:

  • tweet_id 1 contains 4 mentions.
  • tweet_id 4 contains 4 hashtags.
Output table is ordered by tweet_id in ascending order.

Solutions

Solution 1: LENGTH() Function + REPLACE() Function

We can use the LENGTH() function to calculate the length of the string, calculate the length after excluding @ or #, then use the OR operator to connect these three conditions, filter out the corresponding tweet_id, and sort by tweet_id in ascending order.

MySQL

# Write your MySQL query statement below
SELECT tweet_id
FROM Tweets
WHERE
    LENGTH(content) > 140
    OR (LENGTH(content) - LENGTH(REPLACE(content, '@', ''))) > 3
    OR (LENGTH(content) - LENGTH(REPLACE(content, '#', ''))) > 3
ORDER BY 1;

Python3

import pandas as pd


def find_invalid_tweets(tweets: pd.DataFrame) -> pd.DataFrame:
    invalid_tweets = tweets[
        (tweets["content"].str.len() > 140)
        | (tweets["content"].str.count("@") > 3)
        | (tweets["content"].str.count("#") > 3)
    ].sort_values(by="tweet_id")
    return invalid_tweets[["tweet_id"]]