Cloud Computing - Overview
-I-
“Alexandru Ioan Cuza” University of Iasi Faculty of Computer Science
Prof. Lenuța Alboaie
Content
• Why Cloud Computing?
• History & Evolution
• Grid/Cluster computing – general aspects
• Cloud Computing – definitions
• Grid versus Cloud
• Cloud Computing - aspects
Cloud Computing
• Do you use Cloud Computing?
Cloud Computing
• Cloud computing “in your pocket”?
Why Cloud Computing?
• Understanding the basic principles
– How can something scalable be built?
– Various development environments
• What is behind a Cloud Platform?
– How does it work? Advantages? Disadvantages?
– Technologies: Web Services, SOA, Ajax, XML, NoSQL, MapReduce,….
• Would you like to build the ‘next’ Facebook?
– Scalability, efficiency, fault tolerance, security,…
• Knowing the impact on society
– Vulnerabilities, security issues, …
• Anticipating a possible future
How do we reach Cloud Computing? (Now)
History & Evolution
• 1945-1985: “computers were large and expensive”
• … improvements:
– Processors
– Memory
– Networking
– Storage
– Protocols
History & Evolution
• The microprocessor industry (8-bit, 16, 32, 64, …) has evolved rapidly
• Computers have become
– Smaller
– Cheaper
– Faster
• “…from a machine that cost 10 million dollars and executed 1 instruction per second (IPS) we have come to machines that cost 1000 dollars and are able to execute 1 billion instructions per second, a price/performance gain of 10^13”
• “In 2019, Google announced that its Sycamore quantum computer had completed a task in 200 seconds that would take a conventional computer 10,000 years.”
– IBM's 127-qubit Eagle processor (Nov 2021)
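As a quick check of the price/performance figure quoted above, the ratio can be computed directly. The following is a minimal sketch in Python; the function name `price_performance_gain` is an illustrative choice, and the inputs are simply the numbers from the quotation.

```python
from fractions import Fraction

def price_performance_gain(old_cost, old_ips, new_cost, new_ips):
    """Return how many times more instructions per second one dollar buys."""
    old_ips_per_dollar = Fraction(old_ips, old_cost)
    new_ips_per_dollar = Fraction(new_ips, new_cost)
    return new_ips_per_dollar / old_ips_per_dollar

# Figures from the quotation: $10 million for 1 IPS versus $1000 for 1 billion IPS.
gain = price_performance_gain(10_000_000, 1, 1_000, 1_000_000_000)
print(gain == 10**13)  # True
```

Exact fractions are used instead of floating point so that the comparison with 10^13 is not affected by rounding.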
Year   Cost ($/MB)   Capacity (average)
1977   $32,000       16K
1987   $250          640K-2MB
1997   $2            64MB-256MB
2007   $0.06         512MB-2GB+
2014   $0.0091       8GB->…
2022   $0.000…       16GB->
[http://www.cs.rutgers.edu/~pxk/]
History & Evolution
• 1977: 310KB floppy drive ~ $1480
• 1987: 40 MB drive ~ $679
• 2008: 750 GB drive ~ $99
• 2022: 3-4TB drive ~ $100
• “Areal density is a measure of the quantity of information bits that can be stored on a given length of track, area of surface, or in a given volume of a computer storage medium - TPI (tracks per inch) or bits per inch.”
• “Recording density increased over 60,000,000 times over
History & Evolution
1961-1972: first attempts at communication using packet switching
• 1961: Kleinrock – proposed a theoretical model
• 1964: Baran – proposed packet switching for communication among US military computers
• 1967: ARPAnet was designed by the Advanced Research Projects Agency
• 1969: first ARPAnet node became operational; the network was formed by 4 computers
• 1972:
• public demonstration of ARPAnet technologies
• NCP (Network Control Protocol) – the first host-host protocol
• First program for electronic mail (e-mail)
• The sign @ is introduced
• ARPAnet contains 15 nodes
History & Evolution
• 1972-1980: The Internetworking concept appeared. Also, proprietary networks appeared.
• 1973: DARPA (Defense Advanced Research Projects Agency) – interconnected networks; Robert Metcalfe (Harvard) developed Ethernet technology, which allowed data transfer over coaxial cable
• 1974: Cerf and Kahn – proposed a communication protocol called TCP (Transmission Control Protocol)
• 1978: the TCP/IP protocol stack was standardized via RFC (Request For Comments) documents
• Late 70s: proprietary network stacks appeared: DECnet, SNA, XNA
• 1979: ARPAnet contained 200 nodes
History & Evolution
• 1980-1990: new protocols, a growing number of networks, the Internet
• 1982: SMTP (Simple Mail Transfer Protocol) was defined
• 1983: TCP/IP was deployed
• 1983: DNS (translation of host names into IP addresses and vice versa) appeared
• 1985: FTP (File Transfer Protocol) was defined
• 1986: Internet backbone appeared
• 1988: some congestion control mechanisms for TCP were introduced
History & Evolution
LAN – speed:
– Original Ethernet: 2.94 Mbps
– 1985: thick Ethernet: 10 Mbps; 1 Mbps with twisted pair networking
– 1991: 10BaseT - twisted pair: 10 Mbps
– 1995: 100 Mbps Ethernet
– 1998: 1 Gbps (Gigabit) Ethernet
– 1999: 802.11b (wireless Ethernet) standardized
– 2001: 10 Gbps introduced
– 2005: 100 Gbps (over optical link)
– 2022: … Gbps
History & Evolution
[https://worldpopulationreview.com/country-rankings/internet-speeds-by-country]
“According to internet speed specialists Ookla (https://www.speedtest.net/) the global average download speed on fixed broadband as of September 2021 was 113.25 Mbps on fixed broadband and 63.15 Mbps on mobile. These are both notable improvements over the scores of 85.73 Mbps broadband and 35.96 Mbps mobile just one year earlier in September 2020”
History & Evolution
Figure. Number of hosts from January 1994 to January 2019. Source (Feb 2020): https://www.isc.org/network/survey/
Trends
* From supercomputers to workstations that can be connected together
What does computing mean?
• Computing
• The way one thinks
In computer science?
• “we can define computing to mean any goal-oriented activity requiring, benefiting from, or creating computers.”
… Computing?
“… computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry.” − John McCarthy (a professor at MIT), 1961
“As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of computer utilities which, like present electric and telephone utilities, will service individual homes and offices across the country.” − L. Kleinrock (one of the chief scientists of the original ARPANET project), 1969
… Computing?
“it was transformed in a model consisting of consumer services (commodity computing) and can be provided in a manner similar to traditional utilities”
Fifth utility -> Utility Computing or “Computing as a Utility”
Computing Power ?
Required:
• Solving problems involving modeling, simulation and analysis
• Using unoccupied resources:
– in the 90s almost 90% of a processor's power went unused
– the possibility to solve a wide variety of problems at affordable prices
– cost/performance ratio compared with a supercomputer (HPC - high performance computer) => ….
Grid Computing
• The Grid concept appeared in the 90s
In analogy with electric power grids ~ 1910
Grid Computing
• Foster and Kesselman (1998): “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
• “The Grid is an emerging infrastructure that will fundamentally change the way we think – and use – computing. The word Grid is used by analogy with the electric power grid, which provides pervasive access to electricity and, like the computer and a small number of other advances has had a dramatic impact on human capabilities and society. Many believe that by allowing all components of our information technology infrastructure – computational capabilities, databases, sensors, and people – to be shared flexibly as true collaborative tools, the Grid will have a similar transforming effect, allowing new classes of application to emerge.” (Foster and Kesselman 2004)
Grid Computing
Distributed computing architecture originally designed for scientific projects and later for industrial ones
Offers a software and hardware infrastructure which allows permanent and affordable access, in a consistent manner, to computing resources
Offers various mechanisms to process data in a distributed manner
Allows the execution of tasks on multiple machines that can be viewed as a single computer
Offers support for searching and retrieving information, regardless of its physical location
Offers the context to create VOs (virtual organizations), which share applications and data in an open and heterogeneous environment in order to solve various complex problems
What is shared:
Computing/processing power, Data storage/networked file systems,
Grid Computing
Terminology:
Grid middleware – software layer providing the functionality needed for sharing heterogeneous resources and creating a virtual organization
Grid infrastructure – the combination of hardware and Grid middleware which transforms disparate and heterogeneous computing resources into a virtual infrastructure that offers the end user the view of a single machine
Utility computing – Grid Computing and applications are provided as services (e.g. hosting solutions for VOs, etc.)
- Utility computing is based on a pay-per-use business model
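The pay-per-use idea behind utility computing can be illustrated with a small calculation. This is only a sketch: the prices, the `Usage` record and the function name are hypothetical and do not come from any real provider's price list. Amounts are kept in integer cents to avoid rounding issues.

```python
from dataclasses import dataclass

# Hypothetical prices in cents, chosen only for illustration.
CPU_HOUR_CENTS = 5
STORAGE_GB_MONTH_CENTS = 2
FLAT_LICENSE_CENTS = 50_000  # hypothetical one-time payment (traditional model)

@dataclass
class Usage:
    cpu_hours: int
    storage_gb_months: int

def pay_per_use_cents(usage: Usage) -> int:
    """Bill only for the resources that were actually consumed."""
    return (usage.cpu_hours * CPU_HOUR_CENTS
            + usage.storage_gb_months * STORAGE_GB_MONTH_CENTS)

light = Usage(cpu_hours=100, storage_gb_months=50)
heavy = Usage(cpu_hours=20_000, storage_gb_months=1_000)
print(pay_per_use_cents(light))  # 600
print(pay_per_use_cents(heavy))  # 102000
```

With these assumed numbers a light user pays far less than the flat license, while a heavy user pays more, which is the trade-off the pay-per-use model introduces.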
Grid Computing| Architecture
Grid Architectures use simultaneously a large number of resources (hardware, software, logical)
Resource – a shareable entity that can be present in a Grid infrastructure:
Computation: PDA, PC, workstation, server, cluster
Storage: hard disk, RAID, SAN, …
I/O type: sensors, networks, printers, etc.
Logical: timers, …
• Note: systems such as scientific instruments or HPC can be part of a Grid
A Grid architecture focuses on interoperability issues and on communication protocols between resource providers and resource users, in order to establish sharing relationships
Grid Computing| Architecture
Generic Grid architecture
Grid Computing|Classification
Classifications:
In relation to the type of managed resources
Compute Grid – used to share computing resources (e.g. CPU); example: intensive graphics processing
Data Grid – focused on storage, management and sharing of distributed and heterogeneous resources
Application Grid – focused on application management and on transparently providing remote access to software and libraries; example: grids in the bioinformatics or earth science fields
Service Grid – resulting from the convergence of Grid and SOA; offers support for sharing services efficiently
In relation to the resource sharing domain:
Cluster Grid
Enterprise Grid
Utility Grid Services
Grid Computing| Evolution
Generation 1 – Globus project (Goble & Foster)
Applications requiring high computing power
Includes protocols (LDAP, FTP) and heterogeneous development tools
Support for file access and transfer
Uses Internet technologies, but ignores the Web
Employed mainly in academic environments
Sharing resources is achieved via GridFTP
Implementations: …Legion, Condor, Unicore, ….
Grid Computing| Evolution
Generation 2 – OGSA (Open Grid Services Architecture)
Service-oriented computing (SOC) and Grid Computing converge
• SOC's vision of interoperability and sharing is at the application level, whereas the Grid Computing vision is mainly at the hardware level
Generation 1: the Grid Computing architecture consists of protocols and services used to describe and share available physical resources
By using Web Services standards (such as WSDL, SOAP, BPEL4WS, …), Grid protocols and services can be described in a standardized manner
Grid Computing| Evolution
Generation 2 – OGSA (Open Grid Services Architecture)
OGSA:
Using the same standards
=> convergence between Grid Computing and SOC became possible
=> besides hardware and system resources, applications have become shareable
Implementation
Generation 2 – OGSA (Open Grid Services Architecture)
Grid services must be:
– Dynamic and volatile – set of composed services that can be invoked or removed “on the fly”
– Ad-hoc – there is no central location or central control
– Widespread – orchestrating a large number of services (> 100) should be possible at any time
– Available – potentially long-term (e.g. a simulation can take weeks)
– OGSI (Open Grid Services Infrastructure)
OGSA infrastructure - “accommodates” interactions between Grid resources and Web Services
Model implemented by Globus Toolkit 3.0
» OGSI was replaced by WSRF (Web Services Resource Framework): WS-Security, WS-Management and other standards for Web
Grid Computing| Evolution
Generation 3 – present and future
Convergence of Grid Computing and SaaS (Software-as-a-Service) paradigm
SaaS
Designates software that is owned, delivered and managed by a provider
It is used on a pay-per-use basis via a Web browser or APIs
Versus traditional software
The user pays for the time of use
The user does not own the software and does not invest in infrastructure or licenses
History: Application Service Provisioning (ASP) – appeared in 1988
It was a step for IT outsourcing and it came with the idea of Web
applications that could be provided by a central supplier (one-to-many delivery model)
The main problem: the inability to provide personalized services
Issues regarding scalability, robustness,….
Grid Computing| Evolution
Generation 3 – present and future
ASP problems can be solved by using Grid Computing + Web Services
Web Services allow service personalization
The Grid environment offers flexibility and scalability
=> many-to-many delivery model
[Grid and Cloud Computing - A Business Perspective on Technology and Applications, 2010]
Present and future
Overview
Two directions of evolution:
Grid Computing
Mature technology
It provides computational power in pay-per-use manner => new business models for utility computing
There were many initiatives at hardware level: Sun, IBM, etc.
There were many initiatives at software level -> SaaS
Microsoft, SAP, et al.
? Next step…
A scalable, robust and reliable physical infrastructure,
Services that provide developers access to infrastructure by manipulating abstracted interfaces
SaaS running on a flexible and scalable infrastructure
Cloud Computing
What is?
Larry Ellison, founder of
“We’ve redefined Cloud Computing to include everything that we already do. . . . I don’t understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads.”
Cloud Computing
What is?
Richard Stallman
• “cloud computing is evil”
• “I think that marketers like cloud computing because it is devoid of substantive meaning. The term’s meaning is not substance, it’s an attitude: ‘Let any Tom, Dick and Harry hold your data, let any Tom, Dick and Harry do your computing for you (and control it).’ Perhaps the term ‘careless
Cloud Computing
Definition from the end user perspective:
• “the idea of delivering personal (e.g., email, word processing, presentations) and business productivity applications (e.g., sales force automation, customer service, accounting) from centralized servers” (Merrill Lynch)
Definition that contains architectural aspects:
• “a service model that combines a general organizing principle for IT delivery, infrastructure components, an
architectural approach and an economic model – basically,
a confluence of grid computing, virtualization, utility
Cloud Computing
Definition that contains both architectural and final use aspects:
• “Cloud Computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the datacenters that provide those services. The services themselves have long been referred to as Software as a Service (SaaS). The datacenter hardware and software is what we will call a Cloud. When a Cloud is made available in a pay-as-you-go manner to the general public, we call it a Public Cloud; the service being sold is Utility Computing. We use the term Private Cloud to refer to internal datacenters of a business or other organization, not made available to the general public. Thus, Cloud Computing is the sum of SaaS and Utility Computing, but does not include Private Clouds. People can be users or providers of SaaS, or users or providers of Utility Computing.” (Berkeley Lab, 2009)
Cloud Computing
Definitions
“a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.” (Foster et al. (2008))
• http://jameskaskade.com/?p=594
• “a style of computing in which massively scalable IT-related capabilities are provided “as a service” using Internet
technologies to multiple external
customers” (Gartner)
Cloud Computing
The relation with Grid Computing:
• “We argue that Cloud Computing not only overlaps with Grid Computing, it is indeed evolved out of
Grid Computing and relies on Grid Computing as its backbone and infrastructure support. The
evolution has been a result of a shift in focus from an infrastructure that delivers storage and
compute resources (such is the case in Grids) to
one that is economy based aiming to deliver more
abstract resources and services (such is the case in
Clouds).” (Foster et al., 2008)
Cloud Computing
Versus Grid Computing
Grid Computing Cloud Computing
Business Model
(Traditional: pay only once for unlimited use of the software)
Grid: project-oriented; resources are negotiated and allocated depending on the level of the provided services
Cloud: payment depends on consumption (computing, storage, …)
Architecture
Fabric Level – consists of resources, similar to the Grid
Unified Resource Level – resources that have been encapsulated (e.g. through virtualization) – cluster or virtual system, logical file system, etc.
Platform Level – environment for hosting web applications, developing services, etc.
Cloud Computing
Versus Grid Computing
Grid Computing Cloud Computing
Computing Model
Grid: batch-scheduled (queued systems); multiple resources/servers are assigned to a single task
Cloud: resources are shared by users simultaneously, as opposed to dedicated; challenge: QoS
Exploitation pattern
Grid: running programs for a limited amount of time
Cloud: frequently used for “long-running services”
Various relationships between resource providers
Grid: main purpose – creating => user policies and agreements (multiple domains)
Cloud: overrides this necessity (single domain)
Different purpose
Grid: provides infrastructure as a service
Cloud: provides IaaS, PaaS, SaaS
Final user perspective
Grid: interfaces are based on protocols and APIs oriented to expert users
Cloud: provides interfaces available in a browser or via API
Cloud Computing
Versus Grid Computing
Grid Computing Cloud Computing
Data location – in order to achieve better scalability, the data are distributed over several computers
Grid: based on distributed file systems (NFS, GPFS, PVFS, Lustre)
Cloud: based on map-reduce mechanisms
Monitoring
Grid: monitoring tools: Ganglia (http://meta.rocksclusters.org/ganglia/) - Grid Report for Sun, 19 Feb 2012
Cloud: fine-grained control is difficult to achieve because of virtualization (issues for users and administrators); vision: self-maintained autonomous clouds
Programming models
Grid: employs flow control tools to manage large quantities of data and tasks (MPICH-G2, GridRPC, …)
Cloud: employs map-reduce models. Implementation examples: Hadoop using Pig as programming
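The map-reduce model mentioned in the comparison above can be illustrated with the classic word-count example. This is a minimal single-process sketch in plain Python, not the Hadoop or Pig API; the function names are illustrative. In a real deployment the map and reduce phases would run in parallel on many machines, with the framework handling the shuffle step.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    pairs = []
    for doc in documents:  # each document could be mapped on a different node
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle(pairs))

print(word_count(["grid cloud", "Cloud cloud"]))
```

Because each document is mapped independently and each key is reduced independently, both phases can be distributed without changing the logic.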
Cloud Computing
• “How big is the Cloud?”
Cloud Computing
• “How big is the Cloud?”
[https://zephoria.com/top-15-valuable-facebook-statistics/]
Cloud Computing
• “How big is the Cloud?”
Cloud Computing
• “How big is the Cloud?” 2019
[http://www.internetlivestats.com/one-second/#google-band]
Cloud Computing
• “How big is the Cloud?” 2022
Cloud Computing
• “How big is the Cloud?” Flickr
[http://www.live-counter.com/how-big-is-the-internet/]
(years shown: 2015, 2016, 2017, 2019, 2020, 2021, 2022)
Cloud Computing
• The trend: data-centric computing – Big data
• Current currency on the Internet?
– Users “pay” for using Facebook, Google, Instagram, …: all actions, links and shares are recorded
– Also, data have another dimension (besides economic)
• Better answers to various questions, validate hypotheses on various social interactions,….
• Example: Online Social Network research
Cloud Computing
• At this point, besides search engines, there are other Big Data “players”
– banks, academia, financial environment, government, ….
Everything is possible thanks to the new generation of “hardware hosting services” (cloud) and new programming models
• Cloud Services are deeply embedded in modern society
– Communication: Twitter, Facebook, Skype, IM, …
– Media: iTunes, Netflix,….
– Market: Amazon, eBay, stock exchanges, advertising,…
– ….
• True understanding: understanding the interactions between technology, systems, networks and people; purpose of this
Bibliography
• Katarina Stanoevska-Slabeva, Thomas Wozniak, Grid and Cloud Computing - A Business Perspective on Technology and Applications, 2010, Editors Santi Ristol, Springer-Verlag Berlin Heidelberg
• Massimo Cafaro, Giovanni Aloisio, Grids, Clouds and Virtualization, 2011
• Foster I., Kesselman C., Tuecke S. (2001) The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of High Performance Computing Applications 15(3):200-222
• DMTF - http://dmtf.org/standards/cloud
• LIBVIRT - http://libvirt.org/apps.html
• 2016 - http://expandedramblings.com/index.php/flickr-stats/
• 2016 - http://expandedramblings.com/index.php/by-the-numbers-a-gigantic-list-of-google-stats-and-facts/2/
• https://www.computerworld.com/article/3030642/flash-memorys-density-surpasses-hard-