What is the most lucrative industry on earth: oil, gas, or arms? None of the above. The correct answer is tobacco. Today, at least a quarter of the world’s people smoke, and cigarettes have become the ‘new currency’ of the world.
Cigarettes cost very little to produce but sell at a high price, and the tobacco industry brings enormous fiscal revenue to many countries. In China, annual tobacco tax revenue is roughly equal to the country’s annual national defense and military expenditure, which has given rise to a joke: if nobody smokes, who will pay for national security?
Most tobacco companies are state-owned monopolies, and many cigarette factories have adopted industrialized production methods with a high level of informatization. Generally speaking, a cigarette factory is divided into two parts: a tobacco leaf unit and a wrapping unit. Operators in the tobacco leaf unit mainly handle the feeding, flavoring, and shredding of tobacco leaves, while operators in the wrapping unit run equipment such as cigarette rolling machines and packaging machines.
Tobacco leaf production mainly involves data such as temperature and humidity. These data are generated frequently and in real time, so each monitoring point produces a large amount of data every second.
The tobacco leaf unit has a lot of production equipment and complex processes, so it typically has about 100,000 monitoring points. Their data consume a lot of disk space, and storing them sensibly has become a challenge.
These data share two characteristics. First, they depend on collection time: each data point has a unique timestamp. Second, because most of the data is generated by devices and recorded in real time in timestamp order, the workload consists mostly of inserts rather than frequent updates and deletes.
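The two characteristics above (every reading has its own timestamp, and the workload is almost entirely inserts) can be illustrated with a minimal sketch. The `Series` class, the `dryer_temp` name, and the sample values are hypothetical and chosen only for illustration; this is not how any particular time-series database is implemented.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class Series:
    """Append-only series for one monitoring point (hypothetical structure)."""
    timestamps: list = field(default_factory=list)  # epoch milliseconds, ascending
    values: list = field(default_factory=list)

    def append(self, ts_ms: int, value: float) -> None:
        # Readings arrive in time order; refuse anything that would rewrite history.
        if self.timestamps and ts_ms <= self.timestamps[-1]:
            raise ValueError("timestamps must be strictly increasing")
        self.timestamps.append(ts_ms)
        self.values.append(value)

    def range(self, start_ms: int, end_ms: int) -> list:
        # Binary search is possible because timestamps are kept sorted.
        lo = bisect_left(self.timestamps, start_ms)
        hi = bisect_right(self.timestamps, end_ms)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))


# Assumed example: a dryer temperature sampled every 100 ms.
dryer_temp = Series()
for i, v in enumerate([61.8, 62.0, 62.3, 62.1]):
    dryer_temp.append(1_700_000_000_000 + i * 100, v)

window = dryer_temp.range(1_700_000_000_100, 1_700_000_000_200)
```

Because data is only ever appended in time order, a time-range query reduces to two binary searches and a slice, with no per-row index to maintain.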
Data storage and access in the tobacco leaf unit
For data storage, the tobacco leaf unit currently relies mainly on a relational database, but this storage scheme brings several problems.
Firstly, the relational table model is a row-column structure in which each row represents one record and carries a primary key or one or more indexes to identify it. This leads to a large amount of redundant attribute data, which consumes a lot of disk space.
Secondly, the sampling frequency is usually reduced to keep the data volume manageable (even then, about 1 TB of data is generated per month). This lowers the accuracy and reliability of real-time data, which hinders data monitoring and analysis.
Finally, once the stored data reaches a certain volume, historical data is often deleted to improve system performance. This wastes much of the collected data and makes the system and its data harder to maintain.
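To see how row-oriented storage, sampling frequency, and data volume interact, a back-of-the-envelope estimate helps. In the sketch below, only the 100,000 monitoring points come from the text; the 40 bytes per row, the 30-day month, and the sampling intervals are assumptions picked for illustration, not measurements from a real plant.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def monthly_bytes(points: int, interval_s: float, bytes_per_row: int) -> int:
    """Estimate bytes written per month when every reading is stored as one row."""
    rows_per_point = int(SECONDS_PER_MONTH / interval_s)
    return points * rows_per_point * bytes_per_row


POINTS = 100_000     # monitoring points, as stated in the text
BYTES_PER_ROW = 40   # assumed: key, timestamp, value, and per-row overhead

# Assumed intervals: one reading every 10 seconds versus one per second.
reduced = monthly_bytes(POINTS, interval_s=10, bytes_per_row=BYTES_PER_ROW)
full = monthly_bytes(POINTS, interval_s=1, bytes_per_row=BYTES_PER_ROW)

print(f"10 s interval: {reduced / 1e12:.2f} TB per month")
print(f" 1 s interval: {full / 1e12:.2f} TB per month")
```

Under these assumptions, even a 10-second interval yields roughly a terabyte per month, and sampling every second multiplies that by ten. This is the pressure that pushes operators toward lower sampling rates and toward deleting history.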
For data access, the central control management system of the tobacco leaf unit should be real-time, automated, and intelligent. It needs to provide real-time monitoring of the production process, thorough quality analysis, fault monitoring and reporting, comprehensive management, and other functions, all of which depend on efficient processing of massive real-time information. To keep production running in an orderly manner, the system must monitor equipment operating parameters in real time, produce statistical production reports, and perform equipment load analysis and prediction. To support these functions, the database has to run fast and under high load for long periods. In such an environment, a relational database runs into serious performance bottlenecks. When the database stays unresponsive for a long time, it can easily cause server downtime, which seriously affects production.
TSDB in the Tobacco Leaf Unit
For real-time, high-frequency, high-volume write and storage requirements, a time-series database is an ideal choice. Compared with a relational database, it greatly reduces storage space and storage costs. Time-series databases not only offer superior write performance but also deliver faster queries and retain historical data for longer, which greatly increases the value of the data. They also support large numbers of monitoring points and can handle millions of them on ordinary servers, with write throughput of 100,000–600,000 events per second and read throughput of 1 million to 8 million events per second. This performance comfortably meets the requirements of the central control management system of the tobacco leaf unit.
At present, the tobacco leaf unit uses the time-series database mainly for the following applications:
- Trend curve: With a relational database, data is collected over long cycles, so each collection cycle spans a long time during which values can change considerably. Data collected this way cannot accurately reflect the instantaneous state of equipment in production, and a trend curve drawn from it diverges from the actual trend, so it is unreliable when a precise curve is needed. A time-series database solves this problem well: its sampling frequency can reach thousands of hertz, so it can complete one or more sampling cycles within a millisecond. The resulting trend curve restores the actual state more faithfully and reproduces real-time changes more completely. This greatly improves data accuracy and gives analysts a high-precision, high-density data source for analyzing product quality and equipment quality.
- Alarm prompt information: The tobacco leaf unit places strict requirements on the production environment. Each product has different requirements for temperature, humidity, and so on in each process section, so these environmental factors must be monitored in real time to ensure that the production environment meets the process requirements. When it does not, the system needs to notify the relevant personnel promptly so they can deal with it. Besides environmental factors, many production parameters also need real-time monitoring, such as flavoring/feeding accuracy, the instantaneous value of each flow scale, equipment voltage and current, and motor frequency. When a parameter is abnormal, the management system can evaluate its value, then issue and record an alarm in time. Because the time-series database records real-time data with high density and high precision, personnel can analyze the data or trend curve before and after the abnormality to find its cause and formulate accurate and effective countermeasures.
- Multi-platform and multi-system data sharing: Most time-series databases support data integration, collaboration, and service sharing, and provide rich application programming interfaces and shared service calls. They are compatible with operating systems such as Windows and Linux and support multiple programming languages, such as C, C#, PHP, and Java. This compatibility makes it easy to share data across systems: each application can quickly obtain the data it needs from the data center directly through the API, which enables integration and data interconnection between systems. For example, the warehousing and logistics system can be connected to share material transportation and storage information, and an integrated MES system can share production and manufacturing execution data. Integrating multiple systems makes workshop production and management more diversified and intelligent, and makes data analysis more accurate and comprehensive.
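The trend-curve and alarm applications above both depend on dense sampling, which a small sketch can make concrete. Everything here is assumed for illustration: the synthetic humidity trace, the 16.0 to 20.0 limits, and the helper names `downsample`, `find_alarms`, and `context` are hypothetical and do not come from any specific product.

```python
def downsample(samples, step):
    """Keep every `step`-th sample, imitating a long collection cycle."""
    return samples[::step]


def find_alarms(samples, low, high):
    """Return indices of samples whose value falls outside [low, high]."""
    return [i for i, (_, v) in enumerate(samples) if not (low <= v <= high)]


def context(samples, index, before, after):
    """Return the samples surrounding an alarm for later analysis."""
    start = max(0, index - before)
    return samples[start:index + after + 1]


# Synthetic humidity trace: one reading per millisecond with a brief excursion.
trace = [(t, 18.0) for t in range(1000)]
for t in range(503, 507):
    trace[t] = (t, 25.0)

# Dense data catches the excursion; data kept once every 100 ms misses it.
dense_alarms = find_alarms(trace, low=16.0, high=20.0)
coarse_alarms = find_alarms(downsample(trace, 100), low=16.0, high=20.0)

# Readings just before and after the first alarm, for root-cause analysis.
around_first = context(trace, dense_alarms[0], before=2, after=2)
```

In this sketch the four-millisecond excursion appears in the dense trace but vanishes entirely once only every hundredth reading is kept, so both the trend curve and the alarm would miss it.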