A REVIEW ON HIGH-PERFORMANCE COMPUTING

*-** Department of Computer Science, Govt. E. Raghvendra Rao P.G. Science College, Bilaspur, Chhattisgarh, India.

Abstract

High-Performance Computing (HPC) has evolved into a tool that is essential to every researcher's work. The vast majority of problems that arise in contemporary research can be modeled, explained, or tested through computer simulations. Researchers often struggle with computational issues when they should be concentrating on the research questions themselves. Because most researchers have little or no understanding of low-level computer science, they tend to view computer programs as extensions of their own minds and bodies rather than as fully independent systems. Because computers do not function in the same manner as people do, the typical outcome is low-performance computing in situations where high-performance computing would be expected.

Keywords:

High-Performance Computing (HPC),
Computer programs,
Fog Computing,
Cloud Computing,
Machine Learning,
Artificial Intelligence.

Introduction

The delivery of computing services is shifting to mimic that of other utilities like water, electricity, gas, and telephones, creating a model in which computing is just another commodity. In this paradigm, customers gain access to services according to their needs rather than the location of the hosting infrastructure. This utility computing vision has been a goal of several computing paradigms, such as grid computing. The next paradigm shift, cloud computing, holds the greatest potential to finally make the concept of computing utilities a reality. One of the founding scientists of the Advanced Research Projects Agency Network (ARPANET), the forerunner of the Internet, Leonard Kleinrock, predicted in 1969 that computer utilities, similar to today's electric and telephone utilities, would spread across the country as computer networks matured.

In the twenty-first century, the entire computing industry is expected to undergo a radical transformation. This vision of computing utilities, based on a service-provisioning model, anticipates that computing services will be readily available on demand, much like water, electricity, telephone, and gas are today. In a similar way, customers need to pay service providers only when they actually use the computing resources. Furthermore, customers no longer have to make costly investments or struggle with constructing and maintaining sophisticated Information Technology (IT) infrastructure. Service consumers under such a paradigm get access to the services they need, regardless of the physical location of the service providers. Historically, this approach has been known as utility computing, but more recently (after 2007), it has been referred to as cloud computing. The latter term refers to the underlying infrastructure as a "cloud" through which consumers and companies may access applications on demand from any location on the globe. Thus, cloud computing may be defined as a new paradigm for the on-demand delivery of computing services hosted in highly sophisticated data centers that make use of virtualization technologies to maximize efficiency and resource usage. In the cloud, resources such as servers and software can be rented on a pay-as-you-go basis. The Chief Information Officers (CIOs) of large companies see an opportunity to scale their infrastructure on demand and tailor it to the specific requirements of their operations.

Users who take advantage of cloud computing services can reach their files at any time, from any place, and via any Internet-connected device. There are many different perspectives on cloud computing, but the user's view may be summed up as follows:

Where the data is kept or where the programs run is completely irrelevant to them.
The services need to be permanently accessible from any Internet-connected device.
Moreover, they are willing to continue paying for the service for as long as it is required.

It is commonly assumed that a computer program running on a High-Performance Computing (HPC) system will simply complete a task more quickly than a person can. However, this assumption is misleading, since computers and humans approach problem-solving in fundamentally different ways (Buyya, 1999). Studying, designing, and building an inherently multi-tasking instrument can be difficult because people tend to be "mono-task" focused, especially when it comes to choosing a problem-solving paradigm. This paper draws heavily on the work done at the Baltasar Sete Sóis computer cluster of the Instituto Superior Técnico (IST) Gravity Group and on the related experiences and use cases. Professor Vitor Cardoso was awarded the "DyBHo" European Research Council (ERC) Starting Grant to investigate and better comprehend dynamical black holes, and Baltasar is part of this effort. Baltasar was built to help solve a particular problem with a Cactus implementation. The target infrastructure included 4 GB of Random-Access Memory (RAM) per core, 200 cores or more, and 500 MB/s of network storage throughput. It was intended for a particular use case, but it performed so well that the team now regards it as a general-purpose cluster. This review begins with a discussion of the fundamental computing concepts necessary for grasping the parallel processing paradigms. It also outlines a few of the key resources needed for an effective HPC process. Finally, it attempts to simplify the use of a shared computer cluster by providing illustrative examples.

1. Objective

To study High-Performance Computing (HPC)
To study cloud computing, Fog Computing, Big Data, Data Mining, Machine Learning, and Artificial Intelligence.

2. The Vision of Cloud Computing

Cloud computing is the practice of providing virtual hardware, runtime environments, and resources to users in exchange for payment. No up-front commitment is required, and users may keep these resources for as long as they wish. The entire collection of computing hardware is turned into a set of utilities that can be provisioned and assembled in a matter of hours rather than days, so that systems can be deployed without incurring maintenance costs. The long-term goal of cloud computing is for information technology services to be traded as utilities on an open market, free of technological and regulatory barriers.

A worldwide digital market for cloud computing services will make it feasible to automate the processes of discovery and integration with existing software systems. A digital cloud trading platform is already accessible, which allows service providers to increase not only their income but also their profits. A cloud provider could also become a customer of a rival's service in order to fulfill its promises to its own consumers.

Both company and personal data are readily available in a variety of organized formats, which enables access and interaction on an ever-expanding scale. The security and reliability of cloud computing will continue to advance, making it increasingly safe through the implementation of a broad variety of new strategies. The significance of the cloud lies primarily in the kinds of services and applications it supports. When wearable technology, the "bring your own device" (BYOD) movement, cloud computing, and the Internet of Things (IoT) all come together, cloud computing will become so commonplace in both personal and professional life that it is no longer noticed as an enabler (Buyya et al., 2009). Figure 1 shows the emergence of cloud computing as a result of the convergence of several innovations.

Figure 1. Emergence of Cloud Computing

2.1 Defining a Cloud

"Cloud computing" is a relatively new catchphrase in the information technology industry. Its inception followed several decades of technological advancement in areas such as virtualization, utility computing, distributed computing, networking, and software services. The cloud is an information technology system that was developed to deliver measurable and scalable resources remotely. It has developed into a contemporary paradigm for the sharing of information and the provision of internet services. As a result, customers benefit from services that are more secure, versatile, and scalable. It is used as a service-oriented architecture that reduces the information overhead placed on end users.

2.2 Challenges Ahead

Most businesses rely on cloud computing for data storage. Businesses generate and store massive amounts of data and, as a result, face many security concerns. Organizations turn to the cloud in order to simplify, improve, and optimize their processes, as well as to enhance the administration of their computing resources. The risks and difficulties associated with cloud computing are:

Security & Privacy
Interoperability & Portability
Reliable and Flexible
Downtime
Lack of resources
Dealing with Multi-Cloud Environments

2.2.1 Security and Privacy of Cloud

It is essential that the cloud data repository maintains both privacy and security. Customers are extremely reliant on the cloud service provider; in other words, the provider is obligated to implement the safety precautions needed to protect its customers' data. Customers are also responsible for security: they are required to use a strong password, refrain from disclosing it to anyone else, and change it on a regular basis. If the data are located outside the firewall, a number of potential issues may arise, which the cloud provider is expected to resolve. Because they can harm a large number of clients at once, hacking and malware are among the most significant challenges. They can lead to data loss, disruption of the encrypted file system, and a number of other problems.

2.2.2 Interoperability and Portability

Customers should be offered migration services both into and out of the cloud. Because lock-in can be burdensome for clients, no bond period should be imposed. The cloud should also be able to provide facilities for on-premises use. Another challenge is remote access, which can restrict the cloud provider's ability to access the cloud from any location.

2.2.3 Reliable and Flexible

Reliability and flexibility are also challenging for cloud customers: a dependable service prevents data sent to the cloud from being leaked and gives the consumer confidence in the provider. Overcoming this obstacle requires monitoring the services provided by third parties and keeping an eye on the performance, resiliency, and dependability of the businesses involved.

2.2.4 Downtime

The most common problem with cloud computing is downtime, because no cloud provider can guarantee a platform that is always available. Internet connectivity also plays a significant role: an unreliable internet connection can itself result in downtime for a business.
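The effect of an availability guarantee can be made concrete with simple arithmetic: the permitted downtime is the fraction of time not covered by the guarantee multiplied by the length of the period. The following is a minimal sketch; the function name is hypothetical and the availability levels are example figures chosen for illustration, not values quoted from any provider.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the downtime (in hours) permitted by an availability level."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_hours


if __name__ == "__main__":
    # Example availability levels, chosen for illustration only.
    for level in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(level)
        print(f"{level}% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, a platform that is available 99.9% of the time may still be unreachable for roughly 8.76 hours per year, before any outage of the customer's own internet connection is taken into account.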

2.2.5 Lack of Resources

The cloud sector needs to hire experienced staff to address a shortage of resources and skills. Such staff not only help find solutions to the problems that a firm is facing but also train other employees so that the whole organization benefits. At the moment, a large number of IT professionals are working to improve their cloud computing capabilities; however, this presents a challenge for the Chief Executive Officer (CEO) because many employees have limited qualifications. Organizations therefore gain more from employees with experience in the latest technologies and innovations.

2.2.6 Dealing with Multi-Cloud

According to a survey compiled by RightScale, over 84 percent of businesses utilize a multi-cloud strategy, and 58 percent of those businesses utilize hybrid cloud strategies that include both public and private cloud components. In addition, companies take advantage of five distinct public and private cloud computing environments. When it comes to making long-term projections regarding the development of cloud computing technologies, IT infrastructure teams have a harder time than usual. Experts have proposed various techniques for addressing this issue, such as redesigning procedures, educating staff, utilizing the appropriate technologies, actively managing vendor relationships, and doing research.

3. Fog Computing

Fog computing is an extension of cloud computing that is designed to work with Internet of Things (IoT) data. It acts as a mediator between the cloud and end devices (Yi et al., 2015) by moving storage, networking, and computation resources closer to the end devices. Because fog nodes are localized, fog computing offers reduced latency and improved location awareness. It provides a highly virtualized platform that delivers services such as storage, computing, and networking between end devices and conventional cloud data centers. Its main characteristics are as follows:

Low Latency and Location Awareness: Because fog nodes are close to the devices to which they are coupled, reaction time is reduced significantly and data can be processed more quickly. Fog nodes can be placed in a wide variety of locations, which supports location awareness.
Mobility: Fog nodes and devices can communicate directly with each other, which offers improved mobility.
Interaction in Real Time: Due to the proximity of devices and fog nodes, interactions can take place in real time. This is in contrast to the batch processing that is performed by cloud computing.
Geographical Dispersion: Because fog computing operates in a dispersed environment, its geographic spread allows it to provide high-quality streaming services in a timely manner. Fog computing is used because the cloud alone offers insufficient support for quality of service at the edge of the network.
Protection and Privacy: The Internet of Things plays an important role in sectors such as the military, where the data contain sensitive information. This sensitivity makes protection against online attackers necessary, and fog computing features its own implementation of security measures.
Caching: Content caching is a method that can improve response time while simultaneously lowering the amount of network traffic.
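The content caching characteristic listed above can be illustrated with a small sketch. It assumes a least-recently-used (LRU) eviction policy, which is one common choice and is not prescribed by the text; the class name `FogContentCache` and the `fetch_from_cloud` callback are hypothetical names introduced for illustration.

```python
from collections import OrderedDict


class FogContentCache:
    """A fog node that keeps recently requested content close to the devices."""

    def __init__(self, capacity, fetch_from_cloud):
        self.capacity = capacity
        self.fetch_from_cloud = fetch_from_cloud  # stand-in for a slow cloud request
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            # Served locally: no traffic to the cloud and a short response time.
            self._store.move_to_end(key)
            self.hits += 1
            return self._store[key]
        # Not cached: fetch from the cloud and remember the result.
        self.misses += 1
        value = self.fetch_from_cloud(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used item
        return value


if __name__ == "__main__":
    cloud_requests = []

    def fetch(key):
        cloud_requests.append(key)
        return f"content:{key}"

    cache = FogContentCache(capacity=2, fetch_from_cloud=fetch)
    for key in ["a", "b", "a", "a", "c", "a"]:
        cache.get(key)
    print(f"hits={cache.hits}, misses={cache.misses}, cloud requests={cloud_requests}")
```

In this run, six device requests result in only three requests to the cloud, which is the reduction in network traffic that the caching characteristic describes.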

3.1 Architecture of Fog

Fog computing acts as a link between the devices at the edge and the cloud. It forms a network that combines edge devices with those in the cloud. One of the most prevalent designs is a three-layer architecture. Layer 1 is the most fundamental level, and it comprises all the Internet of Things devices, which are able to store raw data and transfer it to the higher layers of the network.

Layer 2, known as the "middle layer," is made up of a variety of network devices, such as routers and switches, that are able to process and temporarily store information. These devices are connected to the cloud network and upload data to the cloud at predetermined intervals.

Layer 3 is the topmost layer, and it consists of a large number of servers and data centers that are able to store and process large quantities of data.
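The flow of data through the three layers can be sketched as follows. This is a simplified illustration, not a reference design: the class names are hypothetical, the "predetermined interval" is modeled as a fixed number of readings rather than a period of time, and the summary that the fog node uploads (count, mean, and maximum) is an assumed form of processing.

```python
from statistics import mean


class CloudDataCenter:
    """Layer 3: stores the data that arrives from the fog layer."""

    def __init__(self):
        self.archive = []

    def upload(self, summary):
        self.archive.append(summary)


class FogNode:
    """Layer 2: buffers readings temporarily and uploads summaries periodically."""

    def __init__(self, cloud, upload_every):
        self.cloud = cloud
        self.upload_every = upload_every  # assumed interval, measured in readings
        self.buffer = []

    def receive(self, reading):
        self.buffer.append(reading)
        if len(self.buffer) >= self.upload_every:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        summary = {
            "count": len(self.buffer),
            "mean": mean(self.buffer),
            "max": max(self.buffer),
        }
        self.cloud.upload(summary)
        self.buffer.clear()


class IoTDevice:
    """Layer 1: produces raw data and passes it to the layer above."""

    def __init__(self, fog_node):
        self.fog_node = fog_node

    def emit(self, value):
        self.fog_node.receive(value)


if __name__ == "__main__":
    cloud = CloudDataCenter()
    sensor = IoTDevice(FogNode(cloud, upload_every=3))
    for temperature in [20.0, 22.0, 24.0, 30.0]:
        sensor.emit(temperature)
    print(cloud.archive)
```

The fog node forwards one summary per three readings instead of every raw value, so the cloud receives less traffic while the most recent readings remain available near the device.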

3.2 Fog Deployment Models

Fog models can be differentiated by the ownership of the fog infrastructure and its underlying features. There are four distinct fog deployment models.

Private Fog: A private fog is created, owned, maintained, and controlled by an organization, a third party, or some combination of the two. It may be installed on-site or off-site. Private fog services are provisioned for the exclusive use of a single organization.
Public Fog: A public fog is owned, created, and controlled by a corporation, an academic institution, a government body, or a combination of these. It is deployed on the premises of the fog provider. The general public is allowed open access to the public fog services that are supplied.
Community Fog: A community fog is created, managed, and controlled by several organizations within a community, by a third party, or by a combination of them. It may be installed on or off the premises, and its services are offered for the exclusive use of consumers from a specific set of organizations that have shared interests.
Hybrid Fog: A hybrid fog combines public, private, or community fog with public or private cloud computing (i.e., a hybrid cloud). Because fog nodes have limited physical resources, this combination can be useful: the fog platform is extended into the hybrid cloud in order to scale performance. The hybrid cloud offers adaptability and modularity, and its services may be accessed whenever necessary.

4. Big Data

The term "Big Data" has emerged as a popular phrase in recent times. It is used by almost everyone, including academics and professionals in various fields. The idea of big data can be traced back to 2001, when Laney presented the 3Vs (volume, variety, and velocity) model as a response to the problems caused by the growing volume of data.

In 2010, Apache Hadoop defined "big data" as "datasets that could not be acquired, handled, and analyzed by normal computers within an acceptable scope". In 2011, the McKinsey Global Institute defined "big data" as "datasets whose size is beyond the capabilities of standard database software tools to acquire, store, manage, and analyze". According to the definition provided by the International Data Corporation (IDC), "big data technologies" are a "new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis".

4.1 Dimensions of Big Data

Big data is classified according to the following dimensions. The first three are frequently referred to as the 3V model, and low value density is often added as a fourth.

Volume: The term "volume" refers to the amount of data that is being produced and gathered. The progression from terabytes to petabytes (1 petabyte = 1024 terabytes) is taking place at an increasingly rapid pace. Data that cannot be collected and stored at this time may become manageable in the future as storage capabilities increase. Whether a dataset counts as big data in terms of volume depends on the kind of data and the period over which it was collected. The same volume of two distinct forms of data, such as text and video, may require two separate data management methods.
Velocity: The term "velocity" refers to the rate at which new data is produced. The conventional method of data analysis is predicated on regular updates, such as daily, weekly, or monthly checks. In order to make decisions that are based on accurate information, it is necessary to process and examine large amounts of data in real time or as close to real time as possible. The importance of time cannot be overstated in this context. Several industries generate high-frequency data, including retail, telecommunications, and finance. It is possible to provide clients with individualized services in real time by making use of the data that is created by mobile applications. This data may include, for example, demographic information, geographic location, and transaction history. Both the consumer base and the quality of the service might improve as a result.
Variety: The term "variety" refers to the many distinct kinds of data that are being produced and stored. It goes beyond structured data to include semi-structured and unstructured data. The term "structured data" refers to data that can be organized using a predefined data model, such as the tabular data in relational databases and Excel spreadsheets. However, structured data make up just 5% of all the data that is currently available. Unstructured data, such as text, video, and audio, cannot be organized using such predefined models. Data that are neither completely structured nor completely unstructured are known as semi-structured data. This class includes the Extensible Markup Language (XML), among other related technologies.
Low Value Density: This means that the data cannot be used directly in its original form; it must be analyzed to find the items of exceptionally high value. For instance, a business cannot derive any commercial value from the raw data that website logs generate. The logs have to be analyzed in order to forecast how customers will act.
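The difference between the three kinds of data described under "variety" shows up directly in how each is read. The sketch below uses small invented samples (an order table, an XML fragment, and a line of free text) and hypothetical function names; it only illustrates the distinction and is not a recommended processing pipeline.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample data, for illustration only.
CSV_SAMPLE = "order_id,amount\n1,19.5\n2,5.25\n"
XML_SAMPLE = "<orders><order id='1'><amount>19.5</amount></order><order id='2'/></orders>"
TEXT_SAMPLE = "Fast delivery, fast support. Delivery was free."


def total_from_structured(csv_text):
    """Structured data: every row follows the same predefined columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["amount"]) for row in reader)


def amounts_from_semi_structured(xml_text):
    """Semi-structured data: tags give structure, but fields may be missing."""
    root = ET.fromstring(xml_text)
    amounts = {}
    for order in root.findall("order"):
        amount = order.find("amount")
        amounts[order.get("id")] = float(amount.text) if amount is not None else None
    return amounts


def word_counts_from_unstructured(text):
    """Unstructured data: there is no data model, so structure must be derived."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


if __name__ == "__main__":
    print(total_from_structured(CSV_SAMPLE))
    print(amounts_from_semi_structured(XML_SAMPLE))
    print(word_counts_from_unstructured(TEXT_SAMPLE).most_common(2))
```

The structured sample can be summed immediately, the semi-structured sample needs a check for missing fields, and the unstructured sample yields nothing until some analysis (here, a simple word count) imposes a structure on it.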

4.2 Sources of Big Data

The digitization of material produced by many businesses has become the primary source of data. The development of new data at a rapid rate is another effect of advances in technology.

5. Data Mining

Data mining can turn a large collection of data into knowledge that can help to solve a problem that affects the entire world.

Although data mining is only one essential step in the process of knowledge discovery, many people treat the term as a synonym for another commonly used phrase, Knowledge Discovery in Databases (KDD).
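The relationship between the two terms can be illustrated by placing a mining step inside a small discovery pipeline. The sketch below assumes a simplified sequence of stages (cleaning, transformation, mining, and evaluation) as they are commonly described for KDD; the function names, the sample transactions, and the choice of counting co-occurring item pairs as the mining technique are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def clean(raw_transactions):
    """Cleaning: drop empty records and normalize item names."""
    cleaned = []
    for transaction in raw_transactions:
        if not transaction:
            continue
        items = [item.strip().lower() for item in transaction if item and item.strip()]
        if items:
            cleaned.append(items)
    return cleaned


def transform(transactions):
    """Transformation: represent each record as a sorted tuple of unique items."""
    return [tuple(sorted(set(items))) for items in transactions]


def mine_pairs(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(items, 2))
    return counts


def evaluate(pair_counts, total, min_support):
    """Evaluation: keep only the pairs that are frequent enough to be of interest."""
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


if __name__ == "__main__":
    raw = [["Bread", "milk"], ["bread", "Milk ", "eggs"], [], ["milk", "eggs"], None]
    records = transform(clean(raw))
    patterns = evaluate(mine_pairs(records), total=len(records), min_support=0.6)
    print(patterns)
```

Only `mine_pairs` corresponds to data mining in the narrow sense; the surrounding functions are the other stages of knowledge discovery that give its output meaning.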

5.1 Data Mining as the Evolution of Information Technology

The development of information technology inevitably led to the creation of new opportunities, such as data mining. The database and data management sector evolved through the development of several essential capabilities, including data gathering and database creation, data administration (including data storage and retrieval as well as database transaction processing), and advanced data analysis (involving data warehousing and data mining). The early development of methods for data gathering and database creation served as a predecessor for the subsequent development of efficient mechanisms for the storage and retrieval of data, as well as the processing of queries and transactions. A large number of database management systems now provide query and transaction processing as standard practice.

6. Machine Learning

A significant part of the art of machine learning consists of reducing a wide variety of distinct problems to a fairly narrow set of prototypes. Much of the research in the field then focuses on solving these prototype problems and providing solid guarantees for the solutions.

6.1 Models of Machine Learning

Machine learning is currently coming together from a variety of different directions. Many traditions contribute to its unique strategies and vocabularies, which are currently being integrated into a more coherent discipline in order to make it more accessible (Jordan & Mitchell, 2015). The following fields have contributed to machine learning,

Statistical Models: A long-standing problem in statistics is how best to use samples drawn from probability distributions. Decision and estimation procedures depend on a corpus of samples obtained from the problem environment, and the statistical approaches used to deal with these problems can be regarded as examples of machine learning.
Brain Models: Non-linear elements with weighted inputs have been proposed as simple models of biological neurons. Brain modelers are interested in how closely networks of these elements approximate the learning processes that occur in living brains. Several important machine learning techniques are founded on networks of non-linear elements, which are more commonly referred to as "neural networks."
Adaptive Control Theory: Control theorists study the problem of controlling a process with unknown parameters that must be estimated while it is running. The parameters often change during operation, so the control mechanism must track these changes. This kind of problem arises in several forms when a robot is controlled on the basis of sensory inputs.
Psychological Models: Psychologists have investigated the performance of people in a variety of learning tasks and have used the data to develop psychological models. One of the earliest examples is the Elementary Perceiver and Memorizer (EPAM) network, which was developed by Feigenbaum in 1961 for storing and retrieving one member of a pair of words when given the other. Work in this area contributed to the development of a number of early approaches, such as decision trees and semantic networks.
Evolutionary Models: Techniques that mirror specific features of biological evolution have been proposed as learning approaches to enhance the overall performance of computer programs, because the difference between evolving and learning can become blurred in computer systems. The most widely used computational methods for evolution are genetic algorithms and genetic programming.
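The "non-linear element with weighted inputs" mentioned under brain models can be shown in a few lines. The sketch below trains a single threshold unit with the classic perceptron learning rule, which is used here only as a convenient illustration and is not a method described in this review; the logical AND task, the learning rate, and the number of epochs are arbitrary choices.

```python
def predict(weights, bias, inputs):
    """A non-linear element: a weighted sum of inputs followed by a threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Adjust the weights after every mistake (perceptron learning rule)."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training data for the logical AND function: (inputs, expected output).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train_perceptron(AND_SAMPLES)
    for inputs, target in AND_SAMPLES:
        print(inputs, "->", predict(weights, bias, inputs), "(expected", target, ")")
```

The unit starts with all weights at zero and learns from its own errors on the samples, which is the sense in which such elements are used as simple models of learning.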

Machine learning can target a wide range of computational structures, including:

Functions
Logic programs and rule sets
Finite-state machines
Grammars
Problem solving systems

7. Artificial Intelligence

The term "Artificial Intelligence" (AI) refers to a category of computing technologies that have become increasingly advanced in recent years.

Because AI may refer to such a wide variety of methods and circumstances, greater precision is essential for meaningful and productive discussion of it. For instance, arguments about simple "expert systems" used in advisory roles need to be differentiated from those about complex data-driven algorithms that automatically implement decisions about individuals. In a similar vein, it is essential to differentiate between arguments concerning hypothetical developments in the distant future that may never come to pass and those concerning existing AI that already has an impact on society in the present day.

Symbolic AI works best in small, controlled situations that do not change much over time, where the rules are clear and unambiguous and the variables can be measured (Haefner et al., 2021). The present resurgence of Artificial Intelligence may be attributed in large part to the more recent development of "data-driven" techniques, which form the second wave of AI and have seen remarkable growth over the past two decades. By automating the learning process of algorithms, these techniques remove the need for the human specialists on whom the initial wave of AI depended. The capabilities of the brain have served as a model for the development of Artificial Neural Networks (ANNs). The inputs are first converted into signals, which are then sent through a network of artificial neurons in order to create outputs. These outputs are then interpreted as responses to the inputs. ANNs are able to solve increasingly complicated problems when they are given additional neurons and layers. Deep learning refers to the use of ANNs that have many layers. In this context, Machine Learning (ML) refers to the process of adjusting a network so that its outputs become helpful or intelligent responses to the information provided by the inputs. This learning process may be automated by Machine Learning algorithms, either by making modest changes to individual ANNs or by applying evolutionary concepts to large populations of ANNs in order to achieve progressive gains.
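The claim that additional neurons and layers let an ANN solve harder problems can be illustrated with the exclusive-or (XOR) function, a standard example that a single threshold neuron cannot compute but a network with one hidden layer can. In the sketch below the weights are set by hand rather than learned, and the function names are hypothetical; it shows only how signals pass through the layers.

```python
def step(value):
    """Threshold activation: the neuron fires only for a positive signal."""
    return 1 if value > 0 else 0


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, then the threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_network(x1, x2):
    """Two hidden neurons feed one output neuron (weights chosen by hand)."""
    hidden_or = neuron([x1, x2], [1, 1], -0.5)    # fires if at least one input is 1
    hidden_and = neuron([x1, x2], [1, 1], -1.5)   # fires only if both inputs are 1
    return neuron([hidden_or, hidden_and], [1, -1], -0.5)  # "or" but not "and"


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", xor_network(x1, x2))
```

A learning algorithm would arrive at suitable weights by adjusting them gradually instead of having them written in by hand, which is the automation that the data-driven approach provides.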

Conclusion

The delivery of computing services is shifting to mimic that of other utilities like water, electricity, gas, and telephones, creating a model in which computing is just another commodity. In this paradigm, customers gain access to services according to their needs rather than the location of the hosting infrastructure. This utility computing vision has been a goal of several computing paradigms, such as grid computing. The next paradigm shift, cloud computing, holds the greatest potential to finally make the concept of computing utilities a reality. Instead of focusing on the problems that need to be solved, researchers often get stuck on computational problems. Because computers do not function like people, low-performance computing is typically the result in situations where High-Performance Computing (HPC) would be expected.

References

[1]. Buyya, R. (1999). High-Performance Cluster Computing: Architectures and Systems. Prentice Hall.

[2]. Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., & Brandic, I. (2009). Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems, 25(6), 599-616. https://doi.org/10.1016/j.future.2008.12.001

[3]. Chen, M. S., Han, J., & Yu, P. S. (1996). Data mining: an overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering, 8(6), 866-883. https://doi.org/10.1109/69.553155

[4]. Haefner, N., Wincent, J., Parida, V., & Gassmann, O. (2021). Artificial intelligence and innovation management: A review, framework, and research agenda. Technological Forecasting and Social Change, 162, 120392. https://doi.org/10.1016/j.techfore.2020.120392

[5]. Haenlein, M., & Kaplan, A. (2019). A brief history of artificial intelligence: On the past, present, and future of artificial intelligence. California Management Review, 61(4), 5-14. https://doi.org/10.1177/0008125619864925

[6]. Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260. https://doi.org/10.1126/science.aaa8415

[7]. Sagiroglu, S., & Sinanc, D. (2013, May). Big data: A review. In 2013 International Conference on Collaboration Technologies and Systems (CTS) (pp. 42-47). IEEE.

[8]. Yi, S., Li, C., & Li, Q. (2015, June). A survey of fog computing: concepts, applications and issues. In Proceedings of the 2015 Workshop on Mobile Big Data (pp. 37-42). https://doi.org/10.1145/2757384.2757397