Huawei's HPC storage solution has helped the oil exploration industry
The energy industry enters the era of big data
With the continuous growth and rapid expansion of data volumes, the era of big data has arrived, and the development and application of big data has begun in energy sub-industries such as oil and electricity. The goal of energy enterprises working with big data is to extract information efficiently from massive data, process it in depth, and finally obtain useful data.
In the oil industry, many enterprises are applying new technologies to strategic decision-making, research and development, production and operation, safety and environmental protection, and other fields in order to extract more value from big data resources. The application of big data is an inevitable result of deepening informatization and of the close integration of IT and business in the petroleum industry, and its prospects in China's petroleum and petrochemical industry will continue to broaden. As oil reserves gradually decline, exploration and development in the petroleum and petrochemical industrial chain are becoming more difficult, and the maturity of informatization has become the primary factor affecting the industry's growth rate. Accurate and fast geological surveying has become one of the core competitive strengths of the world's energy giants, and the application of high-performance computing and big data technology is the key factor in it.
Massive data processing in oil and gas exploration requires high-performance computing
Geophysical methods are currently the most commonly used in oil exploration. They are geological exploration methods based on modern physics, including the electrical, magnetic, gravity, radioactive, and seismic wave methods, of which the seismic wave method is the most important. To understand and simulate geological structures thousands of meters underground, massive data are collected through seismic wave reflection. Two-dimensional data typically reach 1 to 2 TB, and three-dimensional data can reach hundreds of TB or even PB. A large number of intensive calculations and simulations are then carried out, and the results are converted into intuitive visual images that help experts interpret the data and provide a reference for positioning oil and gas drilling. Only with the help of high-performance computing can the processing of these massive data achieve the best exploration benefits, which is the main reason for the demand for high-performance computing in oil exploration.
Because of the particularity and complexity of the industry, petroleum exploration places very stringent requirements on high-performance computing. Over the past decade, mainframes and high-performance computers have been widely used for exploration calculation and processing, but these systems now face many problems in computing performance, system construction, and operating cost. The problems that trouble oil exploration enterprises come down to three major dilemmas. First, the gap between the demand for computing power and the performance of CPUs keeps widening, and the approach of improving performance by continuously raising CPU clock frequency is nearing its limit. Second, the rapid growth of exploration data is increasingly out of step with storage expansion. Third, energy constraints are becoming more severe: the large size and high power consumption of high-performance computers, together with their demand for machine room space, air conditioning, and electricity, have become a major challenge for exploration data processing.
Big data characteristics of seismic data
BGP (CNPC Oriental Geophysical Company) is a geophysical technical service company wholly owned by China National Petroleum Corporation. It is mainly engaged in the acquisition, processing, and interpretation of land and shallow-sea seismic exploration data and in the development of geophysical exploration equipment and software. Its business spans 34 countries, and its share of the land seismic exploration market ranks first in the world. It has 26,000 employees, including more than 3,000 IT personnel and more than 300 software developers. It operates 23 processing centers around the world, with about 90,000 CPU cores and 800,000 GPU cores, a computing capacity of about 2 PFLOPS, and a storage capacity of more than 25 PB.
Mr. Lai Nenghe, chief engineer of the Research Institute of CNPC Oriental Geophysical Company, gave a keynote speech on the application of big data technology in oil exploration at Huawei Cloud Computing Conference 2013 and explained in detail how massive seismic data are acquired and processed.
The big data generated in oil exploration has its own unique 4V characteristics:
1. Massive data: at BGP, for example, more than 7 TB of production data is generated every day, and seismic data processing produces a large amount of intermediate data
2. Single data source: seismic data is generated by artificially induced seismic waves and recorded by acquisition instruments at fixed points, so the data source and data format are relatively uniform
3. Large amount of calculation: at BGP, for example, processing 54 TB of raw data on a computing cluster of 4,000 CPUs takes more than 50 days
4. Complex processing flow: seismic data processing involves frequent IO and database access, and the operations are complex
The rapid growth of seismic data not only creates great demand for storage, but also poses new challenges to traditional HPC software and hardware architectures.
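The BGP figures above can be turned into average data rates with a little arithmetic. The following is a minimal sketch, not part of the original talk: it assumes decimal units (1 TB = 10^12 bytes) and rounds "more than 50 days" down to exactly 50 days, so the cluster figures are upper bounds on the average rate. The function and variable names are illustrative.

```python
# Back-of-the-envelope rates derived from the BGP figures quoted in the article.
# Assumptions (not from the source): decimal units, and exactly 50 days for the
# processing run that the article describes as taking "more than 50 days".

TB = 10**12                      # bytes in one decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_s(total_bytes: float, days: float) -> float:
    """Average rate in MB/s needed to move total_bytes within the given days."""
    return total_bytes / (days * SECONDS_PER_DAY) / 10**6


# 7 TB of production data generated per day.
daily_ingest_mb_s = sustained_rate_mb_s(7 * TB, 1)

# 54 TB of raw data consumed by a 4,000-CPU cluster over 50 days.
cluster_input_mb_s = sustained_rate_mb_s(54 * TB, 50)
per_cpu_input_kb_s = cluster_input_mb_s * 1000 / 4000

print(f"Daily ingest:        {daily_ingest_mb_s:.1f} MB/s sustained")
print(f"Cluster raw input:   {cluster_input_mb_s:.1f} MB/s averaged over the run")
print(f"Per-CPU raw input:   {per_cpu_input_kb_s:.3f} KB/s averaged over the run")
```

Under these assumptions the raw input is consumed slowly on average, which is consistent with the point that the workload is dominated by computation and by the intermediate data it produces rather than by the size of the raw input alone.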
Huawei's petroleum exploration HPC solution delivers high computing power and large storage capacity
Based on the characteristics and needs of the petroleum industry, Huawei has proposed a petroleum exploration HPC solution that consists of the following parts:
1. Computing cluster system
Computing nodes and fat nodes use Huawei blade servers to provide strong computing power, especially floating-point computing power, for the huge computing tasks of seismic data processing
2. Storage system
The Huawei OceanStor 9000 big data storage system (hereinafter referred to as OceanStor 9000) is used as the storage system
Unlike traditional NFS and Lustre solutions, OceanStor 9000 adopts a fully symmetric distributed architecture. Every node provides both IO and storage, and handles business access, data processing, and storage. Nodes can therefore be added easily, and system performance and capacity expand linearly
Its fully symmetric scale-out architecture pools and manages system resources through clustering, automatic load balancing, global cache, and other technologies, which improves storage performance and shortens the processing cycle for seismic data. OceanStor 9000 also offers high reliability and hardware fault tolerance to keep operations running normally, and it supports flexible networking modes. Both the front-end and back-end networks support InfiniBand or 10GE Ethernet high-speed interconnection, which meets the high-bandwidth, low-latency requirements of oil exploration HPC scenarios
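The claim of linear expansion can be illustrated with a small sizing model. This is only a sketch under an idealized assumption that every added node contributes the same capacity and bandwidth; the per-node figures, the targets, and the names `NodeSpec`, `cluster_totals`, and `nodes_needed` are hypothetical and are not OceanStor 9000 specifications.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures. The values used below are hypothetical."""
    capacity_tb: float
    bandwidth_gb_s: float


def cluster_totals(node: NodeSpec, node_count: int) -> tuple[float, float]:
    """Aggregate (capacity in TB, bandwidth in GB/s) under ideal linear scaling."""
    return node.capacity_tb * node_count, node.bandwidth_gb_s * node_count


def nodes_needed(node: NodeSpec, capacity_tb: float, bandwidth_gb_s: float) -> int:
    """Smallest node count that meets both targets under ideal linear scaling."""
    by_capacity = math.ceil(capacity_tb / node.capacity_tb)
    by_bandwidth = math.ceil(bandwidth_gb_s / node.bandwidth_gb_s)
    return max(by_capacity, by_bandwidth)


# Hypothetical node: 100 TB usable capacity and 1 GB/s of throughput.
hypothetical_node = NodeSpec(capacity_tb=100.0, bandwidth_gb_s=1.0)

for count in (3, 6, 12):
    capacity, bandwidth = cluster_totals(hypothetical_node, count)
    print(f"{count:>2} nodes: {capacity:.0f} TB, {bandwidth:.0f} GB/s")

# Hypothetical target: 500 TB of capacity and 8 GB/s of aggregate throughput.
print(nodes_needed(hypothetical_node, capacity_tb=500.0, bandwidth_gb_s=8.0))
```

In this model the larger of the two requirements decides the node count, so a bandwidth-bound workload may need more nodes than its capacity alone would suggest. Real clusters rarely scale perfectly linearly, so the result is best read as a lower bound.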
3. Network interconnection
The computing network, storage network, and management network are kept separate. The computing network uses 10 Gigabit Ethernet to carry data communication during parallel computing. The management network uses Gigabit Ethernet to manage and monitor the HPC cluster system. The storage network uses 10GE Ethernet or 40 Gbit/s InfiniBand to provide high-speed interconnection for hosts accessing data files