Outlook 2015: Big analogue data is the biggest of big data

In test and measurement applications, engineers and scientists can collect vast amounts of data in short periods. For every second that an experiment runs at the Large Hadron Collider at CERN, 40Tbyte of data are generated.

For every 30 minutes that a Boeing jet engine runs, the system creates 10Tbyte of information; one journey across the Atlantic by a four-engined jumbo jet can create 640Tbyte of data. Multiply that by more than 25,000 flights flown each day and you get an understanding of the enormous amount of data being generated. Meanwhile, large gas turbine manufacturers report that instrumented electricity generating turbines produce more than 10Tbyte of data per day during manufacturing test.

These are all examples of the 'big data' trend. According to a study on digital data by technology research firm IDC, data production is doubling every two years, mimicking electronics' most famous and enduring law: Moore's Law. In 1965, Gordon Moore observed that the number of components on an IC was doubling roughly every year and expected the trend to continue 'for at least ten years'; in 1975, he revised the rate to a doubling approximately every two years. Almost 50 years after his original observation, Moore's Law still influences many aspects of IT and electronics. As a consequence, technology is more affordable and the latest innovations help engineers and scientists capture, analyse and store data at rates faster than ever before. If the production of digital data continues to mimic Moore's Law, an organisation's success will hinge on the speed at which it can turn acquired data into useful knowledge.

But volume is not the only trait of big data. In general, big data is characterised by a combination of three or four Vs – volume, variety, velocity and value. An additional V – visibility – is emerging as a key defining characteristic. That is, global corporations increasingly need geographically dispersed access to business, engineering and scientific data. For example, data acquired from instrumented agricultural equipment in a field in the rural US Midwest may be analysed by data scientists in Europe. Or product test engineers on manufacturing lines in South America and China may need access to each other's data to conduct comparative analysis.
This need for dispersed access creates demand for interconnected IT systems, such as the cloud, that are intimately connected to data acquisition (DAQ) systems.

Characterising big analogue data information

Big Analogue Data information differs from other big data, such as that derived from IT systems or social data. It includes analogue data on voltage, pressure, acceleration, vibration, temperature, sound and so on from the physical world. Big Analogue Data is generated by the environment, nature, people and electrical and mechanical machines. In addition, it's the fastest of all big data, since analogue signals are generally continuous waveforms that require digitising at rates as fast as tens of gigahertz, often at large bit widths. And it's the biggest type, because this kind of information is generated constantly from natural and man-made sources. According to IBM, a large portion of the big data today is from the environment, 'including images, light, sound, and even the radio signals – and it's all analogue'. And the analogue data collected from deep space by the Square Kilometre Array is expected to be 10 times the volume of global Internet traffic.

The three tier big analogue data solution

Drawing accurate and meaningful conclusions from such high-speed and high-volume analogue data is a growing problem. It adds new challenges to data analysis, search, data integration, reporting and system maintenance, all of which must be met to keep pace with the exponential growth of data. Solutions for capturing, analysing and sharing Big Analogue Data have to address both conventional big data issues and the difficulties of managing analogue data. To cope with these challenges – and to harness the value in analogue data sources – engineers are seeking end-to-end solutions.
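
To give a sense of scale for the 'high-speed and high-volume' analogue data discussed above, the short Python sketch below estimates the raw data rate of a digitiser as sample rate times bit width times channel count. The function name and the example figures (10 gigasamples per second, 12 bits, one channel) are illustrative assumptions, not values taken from any specific instrument.

```python
def raw_data_rate_bytes_per_s(sample_rate_hz: float,
                              bits_per_sample: int,
                              channels: int = 1) -> float:
    """Raw (uncompressed) output of a digitiser in bytes per second.

    Hypothetical helper: ignores packing overhead, headers and compression.
    """
    return sample_rate_hz * bits_per_sample * channels / 8


# Assumed example: one channel sampled at 10 GS/s with 12-bit resolution.
rate = raw_data_rate_bytes_per_s(sample_rate_hz=10e9, bits_per_sample=12)
per_day = rate * 24 * 60 * 60  # bytes produced if acquisition runs all day

print(f"{rate / 1e9:.1f} Gbyte/s, {per_day / 1e15:.3f} Pbyte/day")
```

Under these assumptions, a single continuously sampled channel already produces data on the petabyte-per-day scale, which is why analysis close to the point of acquisition matters.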
Specifically, engineers are looking for three-tier architectures to create a single, integrated solution that delivers insight all the way from real-time capture at the sensors to the analytics in the back-end IT infrastructure. The data flow starts in tier 1 at the sensor and is captured in tier 2 system nodes. These nodes perform the initial real-time, in-motion and early-life data analysis. Information deemed important flows across 'The Edge' to traditional IT equipment. In the IT infrastructure, or tier 3, servers, storage and networking equipment manage, organise and further analyse the early-life or at-rest data. Finally, data is archived for later use.

Through the stages of data flow, the growing field of big data analytics is generating never-before-seen insights. For example, real-time analytics are needed to determine the immediate response of a precision motion control system. At the other end, at-rest data can be retrieved for analysis against newer in-motion data; for example, to gain insight into the seasonal behaviour of a power generating turbine. Throughout tiers 2 and 3, data visualisation products and technologies help realise the benefits of the acquired information.

Because Big Analogue Data solutions typically involve many data acquisition channels connected to many system nodes, the capabilities of reliability, availability, serviceability and manageability (RASM) are becoming more important. In general, RASM expresses the robustness of a system, related to how well it performs its intended function. The RASM characteristics of a system are therefore crucial to the quality of the mission for which the system is deployed, and they have a great impact on both technical and business outcomes. For example, RASM functions can help establish when preventive maintenance or replacement should take place.
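
The three-tier data flow described above can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in chosen for illustration: a simulated sensor for tier 1, an RMS threshold as the tier 2 rule for deciding which information is 'important' enough to cross the edge, and an in-memory list as the tier 3 archive. It is not a description of any particular vendor's implementation.

```python
import math
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Summary:
    """Reduced result that a tier 2 node forwards across the edge."""
    block_index: int
    rms: float


def tier1_sensor(blocks: int, block_size: int) -> Iterator[List[float]]:
    """Tier 1: simulated digitised sensor yielding blocks of samples."""
    for b in range(blocks):
        amplitude = 5.0 if b == 2 else 1.0  # block 2 simulates an unusual event
        yield [amplitude * math.sin(2 * math.pi * n / block_size)
               for n in range(block_size)]


def tier2_node(stream: Iterable[List[float]], threshold: float) -> Iterator[Summary]:
    """Tier 2: in-motion analysis; only blocks above the threshold are forwarded."""
    for i, block in enumerate(stream):
        rms = math.sqrt(sum(x * x for x in block) / len(block))
        if rms > threshold:
            yield Summary(block_index=i, rms=rms)


@dataclass
class Tier3Archive:
    """Tier 3: at-rest storage available for later analysis."""
    records: List[Summary] = field(default_factory=list)

    def ingest(self, summaries: Iterable[Summary]) -> None:
        self.records.extend(summaries)


archive = Tier3Archive()
archive.ingest(tier2_node(tier1_sensor(blocks=4, block_size=64), threshold=2.0))
print(archive.records)
```

The design point the sketch tries to capture is that raw samples stay in tiers 1 and 2, while only compact summaries of noteworthy blocks reach the IT infrastructure.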
Scheduling maintenance in this way can convert a surprise outage into a manageable, planned event, maintaining smoother service delivery and improving business continuity. The serviceability and manageability functions are similar to those needed for PCs and servers. They include discovery, deployment, health status, updates, security, diagnostics, calibration and event logging. RASM capabilities are critical for reducing integration risks and lowering the total cost of ownership because these system nodes integrate with tier 3 IT infrastructures.

The oldest, fastest and biggest big data – Big Analogue Data – harbours great scientific, engineering and business insight. To tap this vast resource, developers are turning to solutions powered by tools and platforms that integrate well with each other and with a wide range of other partners. Demand for this three-tier Big Analogue Data solution is growing as it solves problems in key application areas such as scientific research, product test, and machine condition and asset monitoring.

National Instruments

For nearly 40 years, NI has worked with engineers and scientists to provide answers to the most challenging questions. Through these pursuits, NI customers have brought hundreds of thousands of products to market, overcome innumerable technological roadblocks and engineered a better life for us all. NI provides powerful, flexible technology solutions for measurement and control that accelerate productivity and drive rapid innovation. From daily tasks to grand challenges, NI's integrated hardware and software platform helps engineers and scientists in nearly every industry, from healthcare and automotive to consumer electronics and particle physics, to improve the world we live in. If you can turn it on, connect it, drive it or launch it, chances are NI technology helped make it happen.

Francis Griffiths is senior vice president, regional sales and marketing, for National Instruments.