Is TiDB suitable for handling massive IoT time series data #7779
@jinhuang415 TiDB isn't currently optimized for time series data (and a heavily written timestamp index is not recommended, see https://zhuanlan.zhihu.com/p/25574778 ), so it may not be the best solution for storing time series data right now, compared with a classical TSDB. But TiDB does well on scalability, reliability, and SQL and complex query support, and we may improve this in the future, much like Timescale. For now, if you need complex SQL support, your write requests can spread across multiple regions so that TiDB can scale (e.g. by device), and you don't care much about data compression, you can give it a try. If you run into any questions, we are glad to help 😁
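To illustrate the "spread writes by device" point, here is a toy Python sketch, not TiDB code. It assumes a hypothetical store that splits sorted keys into contiguous regions of equal row count, and compares where fresh sensor readings land when the key is `(timestamp, device)` versus `(device, timestamp)`. All names and sizes are made up for the illustration.

```python
from bisect import bisect_right

def build_split_points(keys, num_regions):
    """Split sorted keys into num_regions contiguous ranges of equal row count."""
    ordered = sorted(keys)
    step = len(ordered) // num_regions
    return [ordered[i * step] for i in range(1, num_regions)]

def regions_hit(new_keys, split_points):
    """Return the set of region indexes that receive at least one new key."""
    return {bisect_right(split_points, key) for key in new_keys}

# Hypothetical workload: 100 devices, 1000 old timestamps, 10 new timestamps.
devices = range(100)
old_ts = range(1000)
new_ts = range(1000, 1010)

# Existing rows under the two key layouts.
ts_first_old = [(t, d) for t in old_ts for d in devices]
dev_first_old = [(d, t) for d in devices for t in old_ts]

# Incoming rows: every device reports for each new timestamp.
ts_first_new = [(t, d) for t in new_ts for d in devices]
dev_first_new = [(d, t) for d in devices for t in new_ts]

ts_splits = build_split_points(ts_first_old, 10)
dev_splits = build_split_points(dev_first_old, 10)

ts_first_regions = regions_hit(ts_first_new, ts_splits)
dev_first_regions = regions_hit(dev_first_new, dev_splits)

print("timestamp-first key, regions written:", len(ts_first_regions))
print("device-first key, regions written:", len(dev_first_regions))
```

In this toy model the timestamp-first layout sends every new row to the last region, while the device-first layout spreads the same rows over all ten. Real region splitting and scheduling in TiDB are more involved, so treat this only as intuition for the key design.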
@lysu Given the new TiFlash addition, I'm wondering whether TiDB is still not a good fit for time series data?
TiFlash may be competitive with some time series databases for some queries. Databases optimized for time series will perform a lot better in terms of storage overhead, and they have better optimizations for certain time-series-specific queries. On the other hand, you might prefer some of the scale-out characteristics of TiDB, since they are missing from many time series databases. Or you might prefer to have all your data in one system that can also do SQL. In the end the judgement is up to you, but you can see there are tradeoffs. If you benchmark a time series workload with TiDB/TiFlash, let us know about your results.
Just a recommendation: I'm using TiDB as my main transactional database and for minor OLAP tasks (most analytics workloads are done using Elasticsearch), and for time series VictoriaMetrics is my choice. The use case is multi-tenant, multi-source metrics and resource states from multiple cloud providers with global availability. My team is very happy with both ❤️
There have been some improvements in TiDB 3.1+ for scenarios where the "front most" region is very hot. In addition to auto_increment, TiDB now supports auto_random. This helps distribute the inserts among multiple regions to lessen the hot spot. As others have indicated, TiDB will require more disk space than specialized time series databases. Time series is really a special workload, since inserts are always at the front-most partition and updates are typically not required. With the original question answered, I am going to close this issue for now. Please feel free to create a new issue if you have additional questions. Thanks!
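For intuition on why random IDs lessen the hot spot, here is a toy Python sketch. It is not TiDB's actual auto_random implementation or bit layout: it assumes a hypothetical 63-bit key space divided evenly into regions, and an ID made of a few random high "shard" bits followed by an incrementing counter. `SHARD_BITS`, `region_of`, and the other names are invented for the illustration.

```python
import random

SHARD_BITS = 5                 # assumption for this sketch only
SEQ_BITS = 63 - SHARD_BITS     # remaining bits hold the incrementing counter

def sequential_ids(n, start=1):
    """IDs as a plain incrementing counter (auto_increment-like)."""
    return [start + i for i in range(n)]

def sharded_ids(n, rng, start=1):
    """IDs with random high bits and an incrementing low part."""
    ids = []
    for i in range(n):
        shard = rng.getrandbits(SHARD_BITS)
        ids.append((shard << SEQ_BITS) | (start + i))
    return ids

def region_of(key, num_regions):
    """Map a key to a region, assuming the key space is split evenly."""
    width = (1 << 63) // num_regions
    return min(key // width, num_regions - 1)

def regions_touched(ids, num_regions=32):
    """Count how many distinct regions receive at least one insert."""
    return len({region_of(k, num_regions) for k in ids})

rng = random.Random(0)  # fixed seed so the run is repeatable
seq_count = regions_touched(sequential_ids(10_000))
rnd_count = regions_touched(sharded_ids(10_000, rng))

print("sequential IDs, regions written:", seq_count)
print("random-prefixed IDs, regions written:", rnd_count)
```

Under these assumptions the sequential IDs all fall into a single region, while the random-prefixed IDs are spread over every region. The tradeoff is that IDs are no longer ordered by insertion time, so range scans by ID stop corresponding to time ranges.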
I found one previous question that might be related, but it asks about TTL (#6218). I'm not sure it is the same thing as my question, so I raised a new one. We plan to use TiDB as a scalable RDBMS to replace MySQL, but we also have massive IoT sensor time series data. Can we also use TiDB to handle and analyze such time series data (considering factors such as write/query speed, database size, etc.), or is a dedicated time series DB (like OpenTSDB, InfluxDB, etc.) recommended? Thanks.