PC clients communicating via network with a web server serving static content only.
The inside and front of a Dell PowerEdge server, a computer designed to be mounted in a rack-mount environment. It is frequently used as a web server.
Multiple web servers may be used for a high-traffic website.
Web server farm with thousands of web servers used for super-high-traffic websites.

A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so. [1] [2]

The hardware used to run a web server can vary according to the volume of requests that it needs to handle. At the low end of the range are embedded systems, such as a router that runs a small web server as its configuration interface. A high-traffic Internet website might handle requests with hundreds of servers that run on racks of high-speed computers.

A resource sent from a web server can be a pre-existing file (static content) available to the web server, or it can be generated at the time of the request (dynamic content) by another program that communicates with the server software. The former usually can be served faster and can be more easily cached for repeated requests, while the latter supports a broader range of applications. Technologies such as REST and SOAP, which use HTTP as a basis for general computer-to-computer communication, as well as support for WebDAV extensions, have extended the application of web servers well beyond their original purpose of serving human-readable pages.
History
The first web proposal (1989), evaluated as "vague but exciting..."
The world's first web server, a NeXT Computer workstation with Ethernet, 1990. The case label reads: "This machine is a server. DO NOT POWER IT DOWN!!"
This is a very brief history of web server programs, so some information necessarily overlaps with the histories of web browsers, the World Wide Web and the Internet; therefore, for the sake of clarity and comprehensibility, some key historical information reported below may be similar to that found in one or more of the above-mentioned history articles.
Initial WWW project (1989–1991)
In March 1989, Sir Tim Berners-Lee proposed a new project to his employer CERN, with the goal of easing the exchange of information between scientists by using a hypertext system. The proposal, titled "HyperText and CERN", asked for comments and it was read by several people. In October 1990 the proposal was reformulated and enriched (having as co-author Robert Cailliau), and finally it was approved. [3] [4] [5] Between late 1990 and early 1991 the project resulted in Berners-Lee and his developers writing and testing several software libraries along with three programs, which initially ran on NeXTSTEP OS installed on NeXT workstations: [6] [7] [5]
Those early browsers retrieved web pages from web server(s) using a new basic communication protocol that was named HTTP 0.9. In August 1991 Tim Berners-Lee announced the birth of WWW technology and encouraged scientists to adopt and develop it. [8] Soon after, those programs, along with their source code, were made available to people interested in their usage. [6] In practice CERN informally allowed other people, including developers, etc., to play with and possibly further develop what had been made till that moment. This was the official birth of CERN httpd. Since then Berners-Lee started promoting the adoption and the usage of those programs along with their porting to other OSs. [5]
Fast and wild development (1991–1995)
Number of active web sites (1991–1996) [9] [10]
In December 1991 the first web server outside Europe was installed at SLAC (U.S.A.). [7] This was a very important event because it started trans-continental web communications between web browsers and web servers. In 1991–1993 the CERN web server platform continued to be actively developed by the WWW group; meanwhile, thanks to the availability of its source code and the public specifications of the HTTP protocol, many other implementations of web servers started to be developed. In April 1993 CERN issued a public official statement stating that the three components of Web software (the basic line-mode client, the web server and the library of common code), along with their source code, were put in the public domain. [11] This statement freed web server developers from any possible legal issue about the development of derivative work based on that source code (a threat that in practice never existed). At the beginning of 1994, the most notable among new web servers was NCSA httpd, which ran on a variety of Unix-based OSs and could serve dynamically generated content by implementing the POST
HTTP method and the CGI to communicate with external programs. These capabilities, along with the multimedia features of NCSA's Mosaic browser (also able to manage HTML FORMs in order to send data to a web server), highlighted the potential of web technology for publishing and distributed computing applications.

In the second half of 1994, the development of NCSA httpd stalled to the point that a group of external software developers, webmasters and other professional figures interested in that server started to write and collect patches, thanks to the NCSA httpd source code being available in the public domain. At the beginning of 1995 those patches were all applied to the last release of the NCSA source code and, after several tests, the Apache HTTP server project was started. [12] [13]

At the end of 1994 a new commercial web server, named Netsite, was released with specific features. It was the first one of many similar products that were developed first by Netscape, then also by Sun Microsystems and finally by Oracle Corporation. In mid-1995 the first version of IIS was released, for Windows NT OS, by Microsoft. This was a notable event because it marked the entrance, in the field of World Wide Web technologies, of a very important commercial developer and vendor that has played and still is playing a key role on both sides (client and server) of the web. In the second half of 1995 CERN and NCSA web servers started to decline (in global percentage usage) because of the widespread adoption of new web servers which had a much faster development cycle along with more features, more fixes applied and better performance than the previous ones.
Explosive growth and competition (1996–2014)
Number of active web sites (1996–2002) [10] [14]
At the end of 1996 there were already over fifty known (different) web server software programs that were available to everybody who wanted to own an Internet domain name and/or to host websites. [15] Many of them lived only shortly and were replaced by other web servers. The publication of RFCs about protocol versions HTTP/1.0 (1996) and HTTP/1.1 (1997, 1999) forced most web servers to comply (not always completely) with those standards. The use of TCP/IP persistent connections (HTTP/1.1) required web servers both to increase a lot the maximum number of concurrent connections allowed and to improve their level of scalability.

Between 1996 and 1999 Netscape Enterprise Server and Microsoft's IIS emerged among the leading commercial options, whereas among the freely available and open-source programs Apache HTTP Server held the lead as the preferred server (because of its reliability and its many features). In those years there was also another commercial, highly innovative and thus notable web server called Zeus (now discontinued) that was known as one of the fastest and most scalable web servers available on the market, at least till the first decade of the 2000s, despite its low percentage of usage.

Apache resulted in the most used web server from mid-1996 to the end of 2015 when, after a few years of decline, it was surpassed initially by IIS and then by Nginx. Afterward IIS dropped to much lower percentages of usage than Apache (see also market share).

From 2005–2006 Apache started to improve its speed and its scalability level by introducing new performance features (e.g. event MPM and new content cache). [16] [17] As those new performance improvements initially were marked as experimental, they were not enabled by its users for a long time and so Apache suffered even more the competition of commercial servers and, above all, of other open-source servers which meanwhile had already achieved far superior performances (mostly when serving static content) since the beginning of their development and which, at the time of the Apache decline, were able to offer also a long enough list of well tested advanced features.

In fact, a few years after 2000, not only other commercial and highly competitive web servers started to slowly emerge, e.g. LiteSpeed, but also many other open-source programs, often of excellent quality and very high performances, among which should be noted Hiawatha, Cherokee HTTP server, Lighttpd, Nginx and other derived/related products also available with commercial support.

Around 2007–2008 most popular web browsers increased their previous default limit of 2 persistent connections per host-domain (a limit recommended by RFC 2616) [18] to 4, 6 or 8 persistent connections per host-domain, in order to speed up the retrieval of heavy web pages with lots of images, and to mitigate the problem of the shortage of persistent connections dedicated to dynamic objects used for bi-directional notifications of events in web pages. [19] Within a year, these changes, on average, nearly tripled the maximum number of persistent connections that web servers had to manage.
This trend (of increasing the number of persistent connections) definitely gave a strong impulse to the adoption of reverse proxies in front of slower web servers and it gave also one more chance to the emerging new web servers that could show all their speed and their capability to handle very high numbers of concurrent connections without requiring too many HW resources (expensive computers with lots of CPUs, RAM and fast disks). [20]
New challenges (2015 and later years)
In 2015, RFCs about the new protocol version HTTP/2 were published and, as the implementation of the new specifications was not trivial at all, a dilemma arose among developers of less popular web servers (e.g. with a percentage of usage lower than 1%–2%) about adding or not adding support for that new protocol version. [21] [22] In fact supporting HTTP/2 often required radical changes to their internal implementation due to many factors (practically always required encrypted connections, capability to distinguish between HTTP/1.x and HTTP/2 connections on the same TCP port, binary representation of HTTP messages, message priority, compression of HTTP headers, use of streams also known as TCP/IP sub-connections and related flow-control, etc.) and so a few developers of those web servers opted for not supporting the new HTTP/2 version (at least in the near future), also because of these main reasons: [21] [22]
- protocols HTTP/1.x would have been supported anyway by browsers for a very long time (maybe forever) so that there would be no incompatibility between clients and servers in the near future;
- implementing HTTP/2 was considered a task of overwhelming complexity that could open the door to a whole new class of bugs that till 2015 did not exist and so it would have required notable investments in developing and testing the implementation of the new protocol;
- adding HTTP/2 support could always be done in the future if the effort would be justified.
Instead, developers of the most popular web servers rushed to offer the availability of the new protocol, not only because they had the workforce and the time to do so, but also because usually their previous implementation of the SPDY protocol could be reused as a starting point and because the most used web browsers implemented it very quickly for the same reason. Another reason that prompted those developers to act quickly was that webmasters felt the pressure of the ever-increasing web traffic and they really wanted to install and to try – as soon as possible – something that could drastically lower the number of TCP/IP connections and speed up accesses to hosted websites. [23] In 2020–2021 the HTTP/2 dynamics about its implementation (by top web servers and popular web browsers) were partially replicated after the publication of advanced drafts of the future RFC about the HTTP/3 protocol.
Technical overview
PC clients connected to a web server via Internet.

The following technical overview should be considered only as an attempt to give a few very limited examples about some features that may be implemented in a web server and some of the tasks that it may perform, in order to have a sufficiently wide scenario about the subject.

A web server program plays the role of a server in a client-server model by implementing one or more versions of the HTTP protocol, often including the HTTPS secure variant and other features and extensions that are considered useful for its planned usage. The complexity and the efficiency of a web server program may vary a lot depending on (e.g.): [1]
- common features implemented;
- common tasks performed;
- performance and scalability level aimed at as a goal;
- software model and techniques adopted to achieve the desired performance and scalability level;
- target HW and category of usage, e.g. embedded system, low-medium traffic web server, high traffic Internet web server.
Common features
Although web server programs differ in how they are implemented, most of them offer the following common features. These are basic features that most web servers usually have.
- Static content serving: to be able to serve static content (web files) to clients via HTTP protocol.
- HTTP: support for one or more versions of HTTP protocol in order to send versions of HTTP responses compatible with versions of client HTTP requests, e.g. HTTP/1.0, HTTP/1.1 (eventually also with encrypted connections HTTPS), plus, if available, HTTP/2, HTTP/3.
- Logging: usually web servers have also the capability of logging some information, about client requests and server responses, to log files for security and statistical purposes.
A few other more advanced and popular features (only a very short selection) are the following ones.
Common tasks
A web server program, when it is running, usually performs several general tasks, (e.g.): [1]
Read request message
Web server programs are able: [24] [25] [26]
- to read an HTTP request message;
- to interpret it;
- to verify its syntax;
- to identify known HTTP headers and to extract their values from them.
Once a request message has been decoded and verified, its values can be used to determine whether that request can be satisfied or not, and so many other steps are performed to do so, including security checks.
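As a concrete illustration of the reading and verification steps above, here is a minimal Python sketch (an illustrative toy, not the parser of any real web server) that splits a raw HTTP/1.1 request into its request line, headers and body and performs basic syntax checks:

```python
# Minimal sketch of HTTP request-message parsing (illustrative only).

def parse_http_request(raw: bytes) -> dict:
    """Split a raw HTTP request into method, target, version and headers."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line, e.g. "GET /path/file.html HTTP/1.1"
    try:
        method, target, version = lines[0].split(" ")
    except ValueError:
        raise ValueError("malformed request line")
    if not version.startswith("HTTP/"):
        raise ValueError("unknown protocol version")

    # Header lines are "Name: value" pairs; names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header: {line!r}")
        headers[name.strip().lower()] = value.strip()

    return {"method": method, "target": target,
            "version": version, "headers": headers, "body": body}

raw = (b"GET /path/file.html HTTP/1.1\r\n"
       b"Host: www.example.com\r\nConnection: keep-alive\r\n\r\n")
print(parse_http_request(raw)["headers"]["host"])  # -> www.example.com
```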
URL normalization
Web server programs usually perform some type of URL normalization (of the URL found in most HTTP request messages) in order:
- to make resource path always a clean uniform path from root directory of website;
- to lower security risks (e.g. by intercepting more easily attempts to access static resources outside the root directory of the website or to access to portions of path below website root directory that are forbidden or which require authorization);
- to make path of web resources more recognizable by human beings and web log analysis programs (also known as log analyzers / statistical applications).
The term URL normalization refers to the process of modifying and standardizing a URL in a consistent manner. There are several types of normalization that may be performed, including the conversion of the URL's domain name to lowercase; the most important are the removal of "." and ".." path segments and the addition of trailing slashes to the non-empty path component.
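A minimal Python sketch of the normalizations just named (the rule set shown is a simplified assumption; real servers handle more cases):

```python
# Sketch of simple URL normalization (illustrative rule set only).
from urllib.parse import urlsplit, urlunsplit
import posixpath

def normalize_url(url: str) -> str:
    parts = urlsplit(url)
    host = parts.netloc.lower()             # domain names are case-insensitive
    path = posixpath.normpath(parts.path)   # removes "." and ".." segments
    if parts.path.endswith("/") and not path.endswith("/"):
        path += "/"                         # preserve a meaningful trailing slash
    if not path or path == ".":
        path = "/"                          # empty path component -> "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))

print(normalize_url("http://WWW.Example.COM/a/b/../c/./d.html"))
# -> http://www.example.com/a/c/d.html
```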
URL mapping
"URL mapping is the process by which a URL is analyzed to figure out what resource it is referring to, so that that resource can be returned to the requesting client. This process is performed with every request that is made to a web server, with some of the requests being served with a file, such as an HTML document, or a gif image, others with the results of running a CGI program, and others by some other process, such as a built-in module handler, a PHP document, or a Java servlet." [27] In practice, web server programs that implement advanced features, beyond simple static content serving (e.g. URL rewrite engine, dynamic content serving), usually have to figure out how that URL has to be handled, e.g.:
- as a URL redirection, a redirection to another URL;
- as a static request of file content;
- as a dynamic request of:
- directory listing of files or other sub-directories contained in that directory;
- other types of dynamic request in order to identify the program / module processor able to handle that kind of URL path and to pass to it other URL parts, i.e. usually path-info and query string variables.
One or more configuration files of the web server may specify the mapping of parts of the URL path (e.g. initial parts of the file path, filename extension and other path components) to a specific URL handler (file, directory, external program or internal module). [28] When a web server implements one or more of the above-mentioned advanced features then the path part of a valid URL may not always match an existing file system path under the website directory tree (a file or a directory in the file system) because it can refer to a virtual name of an internal or external module processor for dynamic requests.
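A sketch of what such a mapping could look like in Python; the table entries and handler names are hypothetical examples, not any particular server's configuration format:

```python
# Hypothetical URL-to-handler mapping table (names are illustrative).
HANDLERS_BY_PREFIX = {
    "/cgi-bin/": "external-cgi-program",
    "/status":   "internal-status-module",
}
HANDLERS_BY_EXTENSION = {
    ".php": "php-module",
    ".cgi": "external-cgi-program",
}
DEFAULT_HANDLER = "static-file-handler"

def map_url(path: str) -> str:
    """Pick the handler for a URL path: prefix rules first, then extension."""
    for prefix, handler in HANDLERS_BY_PREFIX.items():
        if path.startswith(prefix):
            return handler
    for ext, handler in HANDLERS_BY_EXTENSION.items():
        if path.endswith(ext):
            return handler
    return DEFAULT_HANDLER

print(map_url("/cgi-bin/forum.php"))  # -> external-cgi-program
print(map_url("/path/file.html"))     # -> static-file-handler
```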
URL path translation to file system
Web server programs are able to translate a URL path (all or part of it), that refers to a physical file system path, to an absolute path under the target website's root directory. [28] The website's root directory may be specified by a configuration file or by some internal rule of the web server by using the name of the website, which is the host part of the URL found in the HTTP client request. [28] Path translation to file system is done for the following types of web resources:
- a local, usually non-executable, file (static request for file content);
- a local directory (dynamic request: directory listing generated on the fly);
- a program name (dynamic request that is executed using CGI or SCGI interface and whose output is read by the web server and resent to the client who made the HTTP request).
The web server takes the path found in the requested URL (HTTP request message) and appends it to the path of the (Host) website root directory. On an Apache server, this is commonly /home/www/website
(on Unix machines, usually it is: /var/www/website
). See the following examples of how it may result.

URL path translation for a static file request

Example of a static request of an existing file specified by the following URL:
http://www.example.com/path/file.html
The client's user agent connects to www.example.com
and then sends the following HTTP/1.1 request:
GET /path/file.html HTTP/1.1
Host: www.example.com
Connection: keep-alive
The result is the local file system resource:
/home/www/www.example.com/path/file.html
The web server then reads the file, if it exists, and sends a response to the client's web browser. The response will describe the content of the file and contain the file itself, or an error message will return saying that the file does not exist or its access is forbidden.

URL path translation for a directory request (without a static index file)

Example of an implicit dynamic request of an existing directory specified by the following URL:
http://www.example.com/directory1/directory2/
The client's user agent connects to www.example.com
and then sends the following HTTP/1.1 request:
GET /directory1/directory2 HTTP/1.1
Host: www.example.com
Connection: keep-alive
The result is the local directory path:
/home/www/www.example.com/directory1/directory2/
The web server then verifies the existence of the directory and, if it exists and can be accessed, tries to find an index file (which in this case does not exist); it then passes the request to an internal module or a program dedicated to directory listings, finally reads the data output and sends a response to the client's web browser. The response will describe the content of the directory (list of contained subdirectories and files), or an error message will return saying that the directory does not exist or its access is forbidden.

URL path translation for a dynamic program request

For a dynamic request the URL path specified by the client should refer to an existing external program (usually an executable file with a CGI) used by the web server to generate dynamic content. [29] Example of a dynamic request using a program file to generate output:
http://www.example.com/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15
The client's user agent connects to www.example.com
and then sends the following HTTP/1.1 request:
GET /cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15 HTTP/1.1
Host: www.example.com
Connection: keep-alive
The result is the local file path of the program (in this example a PHP program):
/home/www/www.example.com/cgi-bin/forum.php
The web server executes that program, passing to it the path-info and the query string action=view&orderby=thread&date=2021-10-15
so that the program knows what to do (in this case, to return, as an HTML document, a view of forum entries ordered by thread since October 15, 2021). Besides this, the web server reads the data sent by that external program and resends that data to the client which made the request.
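The common part of all three examples, translating the URL path to a file system path under the website root, can be sketched as follows in Python (the document root constant and the traversal guard are illustrative assumptions, not a prescribed implementation):

```python
# Sketch of URL path translation with a basic directory-traversal guard.
import os

DOCUMENT_ROOT = "/home/www/www.example.com"  # assumed root for this Host

def translate_path(url_path: str) -> str:
    """Translate a URL path to an absolute file system path under the root."""
    # Strip the query string; it is passed to dynamic handlers separately.
    path = url_path.split("?", 1)[0]
    candidate = os.path.normpath(os.path.join(DOCUMENT_ROOT, path.lstrip("/")))
    # Refuse paths that escape the website root (e.g. via "..").
    if os.path.commonpath([DOCUMENT_ROOT, candidate]) != DOCUMENT_ROOT:
        raise PermissionError("path escapes document root")
    return candidate

print(translate_path("/path/file.html"))
# -> /home/www/www.example.com/path/file.html
print(translate_path("/cgi-bin/forum.php?action=view"))
# -> /home/www/www.example.com/cgi-bin/forum.php
```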
Manage request message
Once a request has been read, interpreted and verified, it has to be managed depending on its method, its URL and its parameters, which may include values of HTTP headers. In practice the web server has to handle the request by using one of these response paths (sketched in code after the list): [28]
- if something in request was not acceptable (in status line or message headers), web server already sent an error response;
- if request has a method (e.g. OPTIONS) that can be satisfied by general code of web server then a successful response is sent;
- if URL requires authorization then an authorization error message is sent;
- if URL maps to a redirection then a redirect message is sent;
- if URL maps to a dynamic resource (a virtual path or a directory listing) then its handler (an internal module or an external program) is called and request parameters (query string and path info) are passed to it in order to allow it to reply to that request;
- if URL maps to a static resource (usually a file on file system) then the internal static handler is called to send that file;
- if request method is not known or if there is some other unacceptable condition (e.g. resource not found, internal server error, etc.) then an error response is sent.
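A runnable toy sketch of that decision chain; the rule tables (redirects, protected paths, static files) are hypothetical stand-ins for a real server's configuration and handlers:

```python
# Sketch of the response-path decision described above (illustrative only).

REDIRECTS = {"/old": "/new/"}
PROTECTED = {"/private"}
STATIC_FILES = {"/path/file.html": "<html>...</html>"}

def handle(method: str, path: str, authorized: bool = False):
    """Return (status_code, body) following the response paths above."""
    if method not in ("GET", "HEAD", "OPTIONS", "POST"):
        return 501, "Not Implemented"      # unknown request method
    if method == "OPTIONS":
        return 204, ""                     # satisfied by general code
    if path in PROTECTED and not authorized:
        return 401, "Unauthorized"         # authorization error message
    if path in REDIRECTS:
        return 301, REDIRECTS[path]        # redirect message
    if path in STATIC_FILES:
        return 200, STATIC_FILES[path]     # static resource
    return 404, "Not Found"                # other unacceptable condition

print(handle("GET", "/old"))      # -> (301, '/new/')
print(handle("GET", "/missing"))  # -> (404, 'Not Found')
```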
Serve static content
PC clients communicating via network with a web server serving static content only.

If a web server program is capable of serving static content and it has been configured to do so, then it is able to send file content whenever a request message has a valid URL path matching (after URL mapping, URL translation and URL redirection) that of an existing file under the root directory of a website, and the file has attributes which match those required by the internal rules of the web server program. [28] That kind of content is called static because usually it is not changed by the web server when it is sent to clients and because it remains the same until it is modified (file modification) by some program. Note: when serving static content only, a web server program usually does not change the file contents of the served websites (as they are only read and never written) and so it suffices to support only these HTTP methods:
OPTIONS
HEAD
GET
The serving of static file content can be sped up by a file cache.
Directory index files
If a web server program receives a client request message with a URL whose path matches that of an existing directory, that directory is accessible, and serving directory index file(s) is enabled, then a web server program may try to serve the first of the known (or configured) static index file names (a regular file) found in that directory; if no index file is found or other conditions are not met then an error message is returned. The most used names for static index files are: index.html, index.htm and Default.htm.
Regular files
If a web server program receives a client request message with a URL whose path matches the file name of an existing file, that file is accessible by the web server program and its attributes match the internal rules of the web server program, then the web server program can send that file to the client. Usually, for security reasons, most web server programs are pre-configured to serve only regular files or to avoid using special file types like device files, along with symbolic links or hard links to them. The aim is to avoid undesirable side effects when serving static web resources. [30]
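A sketch in Python of a "regular files only" rule of this kind (the policy itself is an assumption; real servers usually make it configurable):

```python
# Sketch of a "regular files only" rule for static serving.
import os
import stat

def can_serve(path: str) -> bool:
    """Allow only plain regular files; refuse symlinks, devices, FIFOs, etc."""
    try:
        st = os.lstat(path)    # lstat: do not follow symbolic links
    except OSError:
        return False           # missing file or permission problem
    return stat.S_ISREG(st.st_mode)

# Hypothetical usage:
# if can_serve("/home/www/www.example.com/path/file.html"): ...
```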
Serve dynamic content
PC clients communicating via network with a web server serving static and dynamic content.

If a web server program is capable of serving dynamic content and it has been configured to do so, then it is able to communicate with the proper internal module or external program (associated with the requested URL path) in order to pass to it the parameters of the client request; after that, the web server program reads from it its data response (that it has generated, often on the fly) and then it resends it to the client program which made the request. [citation needed] Note: when serving static and dynamic content, a web server program usually has to support also the following HTTP method in order to be able to safely receive data from client(s) and so to be able to host also websites with interactive form(s) that may send large data sets (e.g. lots of data entry or file uploads) to the web server / external programs / modules:
POST
In order to be able to communicate with its internal modules and/or external programs, a web server program must have implemented one or more of the many available gateway interface(s) (see also Web Server Gateway Interfaces used for dynamic content).
The three standard and historical gateway interfaces are the following ones (a minimal CGI sketch follows the list).
- CGI
- An external CGI program is run by web server program for each dynamic request, then web server program reads from it the generated data response and then resends it to client.
- SCGI
- An external SCGI program (it usually is a process) is started once by the web server program or by some other program / process and then it waits for network connections; every time there is a new request for it, the web server program makes a new network connection to it in order to send the request parameters and to read its data response, then the network connection is closed.
- FastCGI
- An external FastCGI program (it usually is a process) is started once by the web server program or by some other program / process and then it waits for a network connection which is established permanently by the web server; through that connection the request parameters are sent and the data responses are read.
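As an illustration of the first (CGI) case, a minimal Python sketch that spawns one external process per request, passes the request parameters through standard CGI environment variables and reads the generated response from the program's standard output (the script path shown in the usage comment is hypothetical):

```python
# Minimal sketch of the CGI model: one external process per request.
import subprocess

def run_cgi(script_path: str, query_string: str, method: str = "GET") -> bytes:
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",  # standard CGI environment variables
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "SERVER_PROTOCOL": "HTTP/1.1",
    }
    # The web server reads the program's stdout (headers + body)
    # and resends it to the client.
    result = subprocess.run([script_path], env=env,
                            capture_output=True, timeout=30)
    return result.stdout

# Hypothetical usage:
# output = run_cgi("/home/www/www.example.com/cgi-bin/forum.cgi",
#                  "action=view&orderby=thread&date=2021-10-15")
```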
Directory listings
A directory listing dynamically generated by a web server.

A web server program may be capable of managing the dynamic generation (on the fly) of a directory index list of files and sub-directories. [31] If a web server program is configured to do so and a requested URL path matches an existing directory, its access is allowed and no static index file is found under that directory, then a web page (usually in HTML format), containing the list of files and/or subdirectories of the above-mentioned directory, is dynamically generated (on the fly). If it cannot be generated an error is returned. Some web server programs allow the customization of directory listings by allowing the usage of a web page template (an HTML document containing placeholders, e.g. $(FILE_NAME), $(FILE_SIZE), etc., that are replaced with the field values of each file entry found in the directory by the web server), e.g. index.tpl, or the usage of HTML and embedded source code that is interpreted and executed on the fly, e.g. index.asp, and / or by supporting the usage of dynamic index programs such as CGIs, SCGIs, FCGIs, e.g. index.cgi, index.php, index.fcgi. The usage of dynamically generated directory listings is usually avoided or limited to a few selected directories of a website because that generation takes many more OS resources than sending a static index page. The main usage of directory listings is to allow the download of files (usually when their names, sizes, modification date-times or file attributes may change randomly / frequently) as they are, without requiring further information from the requesting user. [32]
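A minimal sketch of such on-the-fly generation in Python (the HTML layout is an arbitrary choice):

```python
# Sketch of a dynamically generated directory index page.
import html
import os

def directory_listing(fs_path: str, url_path: str) -> str:
    """Build a simple HTML page listing files and subdirectories."""
    rows = []
    for name in sorted(os.listdir(fs_path)):
        full = os.path.join(fs_path, name)
        size = os.path.getsize(full) if os.path.isfile(full) else "-"
        display = name + "/" if os.path.isdir(full) else name
        rows.append(f"<li><a href='{html.escape(display)}'>"
                    f"{html.escape(display)}</a> ({size})</li>")
    return (f"<html><head><title>Index of {html.escape(url_path)}</title>"
            f"</head><body><h1>Index of {html.escape(url_path)}</h1>"
            f"<ul>{''.join(rows)}</ul></body></html>")

# Hypothetical usage:
# print(directory_listing("/home/www/www.example.com/downloads", "/downloads/"))
```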
Program or module processing
An external program or an internal module (processing unit) can execute some sort of application function that may be used to get data from or to store data to one or more data repositories, e.g.: [citation needed]
- files (file system);
- databases (DBs);
- other sources located in local computer or in other computers.
A processing unit can return any kind of web content, also by using data retrieved from a data repository, e.g.: [citation needed]
- a document (e.g. HTML, XML, etc.);
- an image;
- a video;
- structured data, e.g. that may be used to update one or more values displayed by a dynamic page (DHTML) of a web interface and that maybe was requested by an XMLHttpRequest API (see also: dynamic page).
In practice, whenever there is content that may vary, depending on one or more parameters contained in the client request or in configuration settings, then, usually, it is generated dynamically.
Send response message
Web server programs are able to send response messages as replies to client request messages. [24] An error response message may be sent because a request message could not be successfully read or decoded or analyzed or executed. [25] Note: the following sections are reported only as examples to help to understand what a web server, more or less, does; these sections are by any means neither exhaustive nor complete.
Error message
A web server program may reply to a client request message with many kinds of error messages; anyhow these errors are divided mainly into two categories:
- HTTP client errors, due to the type of request message or to the availability of requested web resource;[33]
- HTTP server errors, due to internal server errors.[34]
When an error response / message is received by a client browser, then, if it is related to the main user request (e.g. a URL of a web resource such as a web page), that error message is usually shown in some browser window / message.
Authorization

A web server program may be able to verify whether the requested URL path: [35]
- can be freely accessed by everybody;
- requires a user authentication (request of user credentials, e.g. such as user name and password);
- access is forbidden to some or all kind of users.
If the authorization / access rights feature has been implemented and enabled and access to the web resource is not granted, then, depending on the required access rights, a web server program (see the sketch after this list):
- can deny access by sending a specific error message (e.g. access forbidden);
- may deny access by sending a specific error message (e.g. access unauthorized) that usually forces the client browser to ask human user to provide required user credentials; if authentication credentials are provided then web server program verifies and accepts or rejects them.
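A toy sketch of the second case using HTTP Basic authentication in Python; the credential store is a deliberately naive assumption (real servers store password hashes, not plaintext passwords):

```python
# Sketch of HTTP Basic authentication checking (toy credential store).
import base64

USERS = {"alice": "secret"}  # illustrative only; do not store plaintext

def check_auth(authorization_header):
    """Return (status, extra_headers) for a protected resource."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        # The 401 + WWW-Authenticate pair forces the browser to ask
        # the human user for credentials.
        return 401, {"WWW-Authenticate": 'Basic realm="example"'}
    decoded = base64.b64decode(authorization_header[len("Basic "):]).decode()
    user, _, password = decoded.partition(":")
    if USERS.get(user) == password:
        return 200, {}
    return 401, {"WWW-Authenticate": 'Basic realm="example"'}

print(check_auth(None))  # -> (401, {'WWW-Authenticate': ...})
print(check_auth("Basic " + base64.b64encode(b"alice:secret").decode()))
# -> (200, {})
```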
URL redirection
A web server program may have the capability of doing URL redirections to new URLs (new locations), which consists in replying to a client request message with a response message containing a new URL suited to access a valid or an existing web resource (the client should redo the request with the new URL). [36] URL redirection of location is used: [36]
- to fix a directory name by adding a final slash ‘/’;[31]
- to give a new URL for a no longer existing URL path, pointing to a new path where that kind of web resource can be found;
- to give a new URL to another domain when current domain has too much load.
Example 1: a URL path points to a directory name but it does not have a final slash '/', so the web server sends a redirect to the client in order to instruct it to redo the request with the fixed path name. [31] From:
/directory1/directory2
To:
/directory1/directory2/
Example 2: a whole set of documents has been moved inside the website in order to reorganize their file system paths. From:
/directory1/directory2/2021-10-08/
To:
/directory1/directory2/2021/10/08/
Example 3: a whole set of documents has been moved to a new website and now it is mandatory to use secure HTTPS connections to access them. From:
http://www.example.com/directory1/directory2/2021-10-08/
To:
https://docs.example.com/directory1/2021-10-08/
The above examples are only a few of the possible kinds of redirections.
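Example 1 rendered as a minimal response-building sketch in Python (the choice of status code 301 is one common policy, not the only possible one):

```python
# Sketch of a redirect response that fixes a missing trailing slash.

def trailing_slash_redirect(url_path: str):
    """Return a minimal 301 response redirecting /dir to /dir/ (sketch)."""
    if url_path.endswith("/"):
        return None                        # nothing to fix
    location = url_path + "/"
    return ("HTTP/1.1 301 Moved Permanently\r\n"
            f"Location: {location}\r\n"
            "Content-Length: 0\r\n"
            "\r\n")

print(trailing_slash_redirect("/directory1/directory2"))
```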
Successful message
A web server program is able to reply to a valid client request message with a successful message, optionally containing the requested web resource data. [37] If web resource data is sent back to the client, then it can be static content or dynamic content depending on how it has been retrieved (from a file or from the output of some program / module).
Content cache
In order to speed up web server responses by lowering average HTTP response times and the HW resources used, many popular web servers implement one or more content caches, each one specialized in a content category. [38] [39] Content is usually cached by its origin, e.g.:
- static content:
- file cache;
- dynamic content:
- dynamic cache (module / program output).
File cache
Historically, static contents found in files which had to be accessed frequently, randomly and quickly have been stored mostly on electro-mechanical disks since the mid-late 1960s / 1970s; unfortunately reads from and writes to those kinds of devices have always been considered very slow operations when compared to RAM speed and so, since early OSs, first disk caches and then also OS file cache sub-systems were developed to speed up I/O operations of frequently accessed data / files. Even with the help of an OS file cache, the relative / occasional slowness of I/O operations involving directories and files stored on disks soon became a bottleneck in the increase of performances expected from top-level web servers, specially since the mid-late 1990s, when web Internet traffic started to grow exponentially along with the constant increase of speed of Internet / network lines. The problem of how to further efficiently speed up the serving of static files, thus increasing the maximum number of requests/responses per second (RPS), started to be studied / researched since the mid-1990s, with the aim of proposing useful cache models that could be implemented in web server programs. [40] [41] In practice, nowadays, many popular / high-performance web server programs include their own userland file cache, tailored for a web server usage and using their specific implementation and parameters. [42] [43] [44] The widespread adoption of RAID and/or fast solid-state drives (storage HW with very high I/O speed) has slightly reduced but of course not eliminated the advantage of having a file cache incorporated in a web server.
Dynamic cache
Dynamic content, output by an internal module or an external program, may not always change very frequently (given a unique URL with keys / parameters) and so, maybe for a while (e.g. from 1 second to several hours or more), the resulting output can be cached in RAM or even on a fast disk. [45] The typical usage of a dynamic cache is when a website has dynamic web pages about news, weather, images, maps, etc. that do not change frequently (e.g. every n minutes) and that are accessed by a huge number of clients per minute / hour; in those cases it is useful to return cached content too (without calling the internal module or the external program) because clients often do not have an updated copy of the requested content in their browser caches. [46] Anyhow, in most cases those kinds of caches are implemented by external servers (e.g. reverse proxy) or by storing dynamic data output in separate computers, managed by specific applications (e.g. memcached), in order not to compete for HW resources (CPU, RAM, disks) with web server(s). [47] [48]
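A minimal in-RAM sketch of such a time-to-live (TTL) cache in Python (the TTL value, the key format and the idea of keying by full URL are illustrative choices):

```python
# Sketch of a small in-RAM dynamic-content cache with a time-to-live.
import time

class TtlCache:
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.store = {}                  # url -> (expiry_time, content)

    def get(self, url: str):
        entry = self.store.get(url)
        if entry and entry[0] > time.monotonic():
            return entry[1]              # fresh cached copy
        self.store.pop(url, None)        # expired or missing
        return None

    def put(self, url: str, content: bytes):
        self.store[url] = (time.monotonic() + self.ttl, content)

cache = TtlCache(ttl_seconds=300)        # e.g. cache weather pages 5 minutes
url = "/weather?city=rome"
page = cache.get(url)
if page is None:
    # In a real server this would call the internal module / external program.
    page = b"<html>...generated on the fly...</html>"
    cache.put(url, page)
```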
Kernel-mode and user-mode web servers
A web server software can be either incorporated into the OS and executed in kernel space, or it can be executed in user space (like other regular applications). Web servers that run in kernel mode (usually called kernel space web servers) can have direct access to kernel resources and so they can be, in theory, faster than those running in user mode; anyhow there are disadvantages in running a web server in kernel mode, e.g.: difficulties in developing (debugging) software, whereas run-time critical errors may lead to serious problems in the OS kernel. Web servers that run in user mode have to ask the system for permission to use more memory or more CPU resources. Not only do these requests to the kernel take time, but they might not always be satisfied because the system reserves resources for its own usage and has the responsibility to share hardware resources with all the other running applications. Executing in user mode can also mean using more buffer/data copies (between user space and kernel space) which can lead to a decrease in the performance of a user-mode web server. Nowadays almost all web server software is executed in user mode (because many of the aforementioned small disadvantages have been overcome by faster hardware, new OS versions, much faster OS system calls and new optimized web server software). See also comparison of web server software to discover which of them run in kernel mode or in user mode (also referred to as kernel space or user space).
Performances
To improve the user experience (on client / browser side), a web server should reply quickly (as soon as possible) to client requests; unless content response is throttled (by configuration) for some type of files (e.g. big or huge files), also returned data content should be sent as fast as possible (high transfer speed). In other words, a web server should always be very responsive, even under high load of web traffic, in order to keep the total user's wait (sum of browser time + network time + web server response time) for a response as low as possible.
Performance metrics
For web server software, the main key performance metrics (measured under varying operating conditions) usually are at least the following ones (i.e.): [49] [50]
- number of requests per second (RPS, similar to QPS, depending on HTTP version and configuration, type of HTTP requests and other operating conditions);
- number of connections per second (CPS), i.e. the number of connections per second accepted by the web server (useful when using HTTP/1.0 or HTTP/1.1 with a very low limit of requests / responses per connection, i.e. 1 .. 20);
- network latency + response time for each new client request; usually benchmark tool shows how many requests have been satisfied within a scale of time laps (e.g. within 1ms, 3ms, 5ms, 10ms, 20ms, 30ms, 40ms) and / or the shortest, the average and the longest response time;
- throughput of responses, in bytes per second.
Among the operating conditions, the number (1 .. n) of concurrent client connections used during a test is an important parameter because it allows correlating the concurrency level supported by the web server with the results of the tested performance metrics.
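A toy sketch of how a few of these metrics could be measured in Python against a locally running server (the URL and request count are placeholders; serious measurements use the dedicated load-testing tools mentioned under Benchmarking, with controlled concurrency):

```python
# Toy benchmark sketch: sequentially measures latency and derives RPS.
import time
import urllib.request

def mini_benchmark(url: str, n_requests: int = 100):
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()                       # consume the full response body
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "rps": n_requests / elapsed,
        "avg_ms": 1000 * sum(latencies) / n_requests,
        "p95_ms": 1000 * latencies[int(0.95 * n_requests) - 1],
        "max_ms": 1000 * latencies[-1],
    }

# Hypothetical usage against a local test server:
# print(mini_benchmark("http://localhost:8080/path/file.html"))
```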
Software efficiency
The specific web server software design and model adopted (e.g.):
- single process or multi-process;
- single thread (no thread) or multi-thread for each process;
- usage of coroutines or not;
… and other programming techniques, such as ( e.g. ) :
- zero copy;
- minimization of possible CPU cache misses;
- minimization of possible CPU branch mispredictions in critical paths for speed;
- minimization of the number of system calls used to perform a certain function / task;
- other tricks;
… used to implement a web server program, can bias a lot the performances and in particular the scalability level that can be achieved under heavy load or when using high-end hardware (many CPUs, disks and lots of RAM). In practice some web server software models may require more OS resources (specially more CPUs and more RAM) than others to be able to work well and so to achieve target performances.
Operating conditions
There are many operating conditions that can affect the performances of a web server; performance values may vary depending on (i.e.):
- the settings of web server (including the fact that log file is or is not enabled, etc.);
- the HTTP version used by client requests;
- the average HTTP request type (method, length of HTTP headers and optional body);
- whether the requested content is static or dynamic;
- whether the content is cached or not cached (by server and/or by client);
- whether the content is compressed on the fly (when transferred), pre-compressed (i.e. when a file resource is stored on disk already compressed so that web server can send that file directly to the network with the only indication that its content is compressed) or not compressed at all;
- whether the connections are or are not encrypted;
- the average network speed between web server and its clients;
- the number of active TCP connections;
- the number of active processes managed by web server (including external CGI, SCGI, FCGI programs);
- the hardware and software limitations or settings of the OS of the computer(s) on which the web server runs;
- other minor conditions.
Benchmarking
Performances of a web server are typically benchmarked by using one or more of the available automated load-testing tools.
Load limits
A web server (program installation) usually has pre-defined load limits for each combination of operating conditions, also because it is limited by OS resources and because it can handle only a limited number of concurrent client connections (usually between 2 and several tens of thousands for each active web server process, see also the C10k problem and the C10M problem). When a web server is near to or over its load limits, it gets overloaded and so it may become unresponsive.
Causes of overload
At any time web servers can be overloaded due to one or more of the following causes (e.g.).
- Excess legitimate web traffic. Thousands or even millions of clients connecting to the website in a short amount of time, e.g., Slashdot effect.
- Distributed Denial of Service attacks. A denial-of-service attack (DoS attack) or distributed denial-of-service attack (DDoS attack) is an attempt to make a computer or network resource unavailable to its intended users.
- Computer worms that sometimes cause abnormal traffic because of millions of infected computers (not coordinated among them).
- XSS worms can cause high traffic because of millions of infected browsers or web servers.
- Internet bots. Traffic not filtered/limited on large websites with very few network resources (e.g. bandwidth) and/or HW resources (OS CPUs, RAM, disks).
- Internet (network) slowdowns (e.g. due to packet losses) so that client requests are served more slowly and the number of connections increases so much that server limits are reached.
- Web servers, serving dynamic content, waiting for slow responses coming from back-end computer(s) (e.g. databases), maybe because of too many queries mixed with too many inserts or updates of DB data; in these cases web servers have to wait for back-end data responses before replying to HTTP clients but during these waits too many new client connections / requests arrive and so they become overloaded.
- Web servers (computers) partial unavailability. This can happen because of required or urgent maintenance or upgrade, hardware or software failures such as back-end (e.g. database) failures; in these cases the remaining web servers may get too much traffic and become overloaded.
Symptoms of overload
The symptoms of an overloaded web server are usually the following ones (e.g.).
- Requests are served with (possibly long) delays (from 1 second to a few hundred seconds).
- The web server returns an HTTP error code, such as 500, 502,[51] 503,[53] 504,[54] 408, or even an intermittent 404.
- The web server refuses or resets (interrupts) TCP connections before it returns any content.
- In very rare cases, the web server returns only a part of the requested content. This behavior can be considered a bug, even if it usually arises as a symptom of overload.
Anti-overload techniques
To partially overcome above-average load limits and to prevent overload, most popular websites use common techniques like the following ones (e.g.).
- Tuning OS parameters for hardware capabilities and usage.
- Tuning web server(s) parameters to improve their security and performances.
- Deploying web cache techniques (not only for static contents but, whenever possible, for dynamic contents too).
- Managing network traffic, by using:
- Firewalls to block unwanted traffic coming from bad IP sources or having bad patterns;
- HTTP traffic managers to drop, redirect or rewrite requests having bad HTTP patterns;
- Bandwidth management and traffic shaping, in order to smooth down peaks in network usage.
- Using different domain names, IP addresses and computers to serve different kinds (static and dynamic) of content; the aim is to separate big or huge files (download.*) (that domain might be replaced also by a CDN) from small and medium-sized files (static.*) and from the main dynamic site (maybe where some contents are stored in a backend database) (www.*); the idea is to be able to efficiently serve big or huge (over 10 – 1000 MB) files (maybe throttling downloads) and to fully cache small and medium-sized files, without affecting performances of the dynamic site under heavy load, by using different settings for each (group) of web server computers, e.g.:
  https://download.example.com
  https://static.example.com
  https://www.example.com
- Using many web servers (computers) that are grouped together behind a load balancer so that they act or are seen as one big web server.
- Adding more hardware resources (i.e. RAM, fast disks) to each computer.
- Using more efficient computer programs for web servers (see also: software efficiency).
- Using the most efficient Web Server Gateway Interface to process dynamic requests (spawning one or more external programs every time a dynamic page is retrieved kills performance).
- Using other programming techniques and workarounds, especially if dynamic content is involved, to speed up the HTTP responses (i.e. by avoiding dynamic calls to retrieve objects, such as style sheets, images and scripts, that never change or change very rarely, by copying that content to static files once and then keeping it synchronized with the dynamic content).
- Using latest efficient versions of HTTP (e.g. beyond using common HTTP/1.1 also by enabling HTTP/2 and maybe HTTP/3 too, whenever available web server software has reliable support for the latter two protocols) in order to reduce a lot the number of TCP/IP connections started by each client and the size of data exchanged (because of more compact HTTP headers representation and maybe data compression).
Caveats about using HTTP/2 and HTTP/3 protocols: even if the newer HTTP (2 and 3) protocols usually generate less network traffic for each request / response data, they may require more OS resources (i.e. RAM and CPU) used by the web server software (because of encrypted data, lots of stream buffers and other implementation details); besides this, HTTP/2 and maybe HTTP/3 too, depending also on the settings of the web server and the client program, may not be the best options for data upload of big or huge files at very high speed because their data streams are optimized for concurrency of requests and so, in many cases, using HTTP/1.1 TCP/IP connections may lead to better results / higher upload speeds (your mileage may vary). [55] [56]
Further information on HTTP server programs: Category: Web server software
Market share

Chart: Market share of all sites for most popular web servers 2005–2021
Chart: Market share of all sites for most popular web servers 1995–2005
October 2021
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft October 2021 Web Server Survey.
All other web servers are used by less than 22% of all websites. Note: (*) percentage rounded to an integer number, because its decimal values are not publicly reported by the source page (only its approximate value is reported in the graph).
February 2021
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft February 2021 Web Server Survey.
All other web servers are used by less than 18% of all websites.
February 2020
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft February 2020 Web Server Survey.
All other web servers are used by less than 15% of all websites.
February 2019
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft February 2019 Web Server Survey.
All other web servers are used by less than 19% of all websites.
February 2018
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft February 2018 Web Server Survey.
All other web servers are used by less than 13% of all websites.
February 2017
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft February 2017 Web Server Survey.
All other web servers are used by less than 15% of all websites.
February 2016
Below are the latest statistics of the market share of all sites of the top web servers on the Internet by the Netcraft February 2016 Web Server Survey.
All other web servers are used by less than 19% of all websites. Apache, IIS and Nginx are the most used web servers on the World Wide Web. [57] [58]
See also
Standard Web Server Gateway Interfaces used for dynamic contents:
- CGI Common Gateway Interface
- SCGI Simple Common Gateway Interface
- FastCGI Fast Common Gateway Interface
A few other Web Server Interfaces (server or programming language specific) used for dynamic contents:
- SSI Server Side Includes, rarely used, static HTML documents containing SSI directives are interpreted by server software to include small dynamic data on the fly when pages are served, e.g. date and time, other static file contents, etc.
- SAPI Server Application Programming Interface:
- ISAPI Internet Server Application Programming Interface
- NSAPI Netscape Server Application Programming Interface
- PSGI Perl Web Server Gateway Interface
- WSGI Python Web Server Gateway Interface
- Rack Rack Web Server Gateway Interface
- JSGI JavaScript Web Server Gateway Interface
- Java Servlet, JavaServer Pages
- Active Server Pages, ASP.NET