4.1 Introduction to the Web & HTTP Protocol


Introduction

Web applications are software programs that run on web servers and are accessible over the internet through web browsers. They provide interactive and dynamic functionality, allowing users to perform tasks, access information and interact with data online.

Web applications follow the client-server model where the application's logic and data are hosted on a web server and users access it using web browsers on their devices. The user interface (UI) is usually presented through a combination of HTML, CSS and JS to create dynamic and interactive interfaces.

The primary objective of web application security is to ensure the confidentiality, integrity and availability of data processed by web applications while mitigating the risk of unauthorised access, data breaches and service disruptions. Web applications are attractive targets for attackers due to their public accessibility and the potential for gaining access to sensitive data, such as personal information, financial data or intellectual property.

Some web application security best practices:

  • Authentication and authorisation

  • Input validation

  • Secure communication

  • Secure coding practices

  • Regular security updates

  • Least privilege principle

  • Web application firewalls (WAF)

  • Session management

Web App Security Testing

This is the process of evaluating and assessing the security aspects of web applications to identify vulnerabilities, weaknesses and potential security risks. It involves conducting various tests and assessments to ensure that web applications are resistant to security threats. The primary goal is to uncover security flaws before they are exploited by attackers.

Security testing typically involves a combination of automated scanning tools and manual testing techniques. Common types of security testing include:

  • Vulnerability scanning

  • Penetration testing

  • Code review and static analysis

  • Authentication and authorisation testing

  • Input validation and output encoding testing

  • Session management testing

  • API security testing

Web App Penetration Testing

This is a subset of security testing that goes beyond identifying vulnerabilities by actively attempting to exploit them. It is a simulated attack, carried out in a systematic and controlled way, to assess the application's security.


Common Threats & Risks

A threat refers to any potential source of harm or adverse event that may exploit a vulnerability in a system or in an organization's security measures. Threats can be human-made or natural.

A risk is the potential for loss or harm resulting from a threat exploiting a vulnerability in a system or organization. It is usually measured as a combination of the likelihood of an incident happening and the potential magnitude of its impact.

In summary, a threat can exist without posing a significant risk; that depends on the vulnerabilities present and the security measures in place to mitigate them.

Threat / Risk
Description

Cross-Site Scripting (XSS)

Attackers inject malicious scripts into web pages viewed by other users, leading to unauthorised access to user data, session hijacking and browser manipulation.

SQL Injection (SQLi)

Attackers manipulate user input to inject malicious SQL code into the queries the application sends to its database, leading to unauthorised data access or database compromise.

Cross-Site Request Forgery (CSRF)

Attackers trick authenticated users into unknowingly performing actions on a web application by exploiting their active sessions.

Security Misconfigurations

Improperly configured servers, databases or application frameworks can expose sensitive data.

Sensitive Data Exposure

Failure to adequately protect sensitive data can lead to breaches and identity theft.

Brute-force & Credential Stuffing Attacks

Attackers use automated tools to guess usernames and passwords (brute force) or to replay credentials leaked from other sites (credential stuffing) in an attempt to gain unauthorised access.

File Upload Vulnerabilities

Insecure file upload mechanisms can enable attackers to upload malicious files leading to remote code execution or unauthorised access to the server.

DoS and DDoS Attacks

Attackers overwhelm web application servers with traffic, causing service disruptions and denying legitimate users access.

Server-Side Request Forgery (SSRF)

Attackers trick the server into making requests to internal resources or external networks on their behalf, potentially leading to data theft and/or unauthorised access.

Using Components with Known Vulnerabilities

Integrating third-party apps or components with known security flaws can introduce weaknesses into the web application.
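
To make the first two rows of the table concrete, here is a minimal sketch of SQL injection and XSS, together with one common mitigation for each (parameterised queries and output encoding). The table layout, the function names and the payloads are assumptions made up for this illustration; it uses only Python's standard library and an in-memory SQLite database.

```python
import html
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is concatenated straight into the SQL text.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the value is passed as a bound parameter, never as SQL text.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_comment(comment: str) -> str:
    # Output encoding: special characters are escaped before reaching HTML.
    return "<p>" + html.escape(comment) + "</p>"


conn = build_demo_db()
sqli_payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, sqli_payload)  # every row leaks
safe_rows = find_user_safe(conn, sqli_payload)      # no user has this name

xss_payload = "<script>alert(1)</script>"
safe_html = render_comment(xss_payload)
```

The unsafe query returns every row because the payload turns the WHERE clause into a condition that is always true, while the parameterised version treats the whole payload as a literal name.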


Web Application Architecture

Component
Function

User Interface (UI)

Visual presentation of the web application seen and interacted with by users. It includes elements such as web pages, forms, menus, buttons, etc.

Client-Side Technologies

HTML, CSS and JS are used to create the user interface and handle interactions directly within the user's web browser.

Server-Side Technologies

Programming languages and frameworks to implement the application's business logic, process requests from clients, access databases and generate dynamic content to be sent back to the client.

Databases

Used to store and manage the web application's data.

Application Logic

Represents the rules and procedures that govern how the web application functions (e.g. data validation and security checks).

Web Servers

Handle the initial request and serve the client-side components.

Application Servers

Execute the server-side code and handle the dynamic content processing of client-side requests.

Client-Server Model

Typically, web applications are built on the client-server model.

The client represents the user interface and user interaction with the web application. It is the front end of the application that users access through their web browsers. The client is responsible for displaying the web pages, handling user input and sending requests to the server for data or actions.

The server represents the back end of the web application. It processes client requests, executes the application's business logic, communicates with databases and other services and generates responses to be sent back to the client.

Communication & Data Flow

Web applications communicate using HTTP/S. When a user interacts with the web application by clicking on links or submitting forms, the client sends HTTP requests to the server.

The server then processes these requests, interacts with the database if necessary and generates the HTTP response. The response is sent back to the client, which renders the content and presents it to the user.

We will discuss this in more depth later on.
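
The request/response flow described above can be sketched with Python's standard library alone. This is only an illustration under stated assumptions: the handler class, the page content and the use of a throwaway server on 127.0.0.1 are all invented for the example, and a real application would sit behind a production web server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: turns a GET request into an HTML response."""

    def do_GET(self):
        body = b"<h1>Hello from the server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: send the request and read the response.
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (
            response.status,
            response.getheader("Content-Type"),
            response.read(),
        )
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch_once()
```

A browser does the same thing as the client half of this sketch, and then renders the returned HTML instead of storing it in a variable.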


Web Application Technologies

Client-Side Technologies

HTML is the markup language used to structure and define the content of web pages. It provides the foundation for creating the layout and structure of the UI.

CSS is used to define the presentation and styling of web pages. It allows developers to control the colours, fonts, layout and other visual aspects of the UI.

JavaScript is a scripting language that enables interactivity in web applications. It is used to create dynamic and responsive UI elements, handle user interactions and perform client-side validations.

Cookies and local storage are client-side mechanisms for storing small amounts of data in the user's browser. They are often used for session management and remembering user preferences.
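
As a rough sketch of how a session cookie travels, the snippet below builds the value a server might send in a Set-Cookie header and then parses the Cookie header a browser would send back. The cookie names and values are made up for the example; `http.cookies.SimpleCookie` is part of Python's standard library.

```python
from http.cookies import SimpleCookie

# Server side: build the value for a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"           # hypothetical session identifier
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["httponly"] = True   # hide the cookie from JavaScript
outgoing["session_id"]["secure"] = True     # only send it over HTTPS
set_cookie_value = outgoing["session_id"].OutputString()

# Server side, next request: parse the Cookie header sent by the browser.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
cookie_values = {name: morsel.value for name, morsel in incoming.items()}
```

The HttpOnly and Secure attributes are set here because session cookies are a common target for the XSS and interception attacks mentioned earlier.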

Server-Side Technologies

The web server is responsible for receiving and responding to HTTP requests from clients. It hosts the web application's files, processes requests and sends responses back to clients.

The application server runs the business logic of the web application. It processes user requests, accesses databases and performs computation to generate dynamic content that the web server can serve to clients.

The database server stores and manages the web application's data. It stores user information, content, configurations and other relevant data required for the application's operation.

Server-side scripting languages are used to handle server-side processing. They interact with databases, perform validations and generate dynamic content before sending it to the client.

Data Interchange

This refers to the process of exchanging data between different computer systems or applications, allowing them to communicate and share information. It involves the conversion of data from one format to another, making it compatible with the receiving system. This ensures that data can be interpreted and utilized correctly by the recipient, regardless of the differences in their data structures, programming languages or operating systems.

APIs (Application Programming Interfaces) allow different software systems to interact and exchange data. Web applications use APIs to integrate with external services, share data and provide functionalities to other applications.

JSON (JavaScript Object Notation) is a lightweight and widely used data interchange format that is easy for both humans and machines to read and write. It is based on JavaScript syntax and is primarily used for transmitting data between a server and a web application as an alternative to XML.

XML (eXtensible Markup Language) is a versatile data interchange format that uses tags to define the structure of the data. It allows users to create their own custom tags and define complex hierarchical data structures. It is commonly used for configuration files, web services and data exchange between different systems.
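
To compare the two formats side by side, the sketch below serialises the same record as JSON and as XML and reads it back. The record, the tag names and the choice to store the id as an XML attribute are assumptions made for the example, not a fixed convention.

```python
import json
import xml.etree.ElementTree as ET

# A made-up record to exchange between two systems.
user = {"id": 7, "name": "Alice", "roles": ["admin", "editor"]}

# JSON: the dictionary maps directly onto the format.
json_text = json.dumps(user)
decoded = json.loads(json_text)

# XML: the structure is expressed with custom tags and attributes.
root = ET.Element("user", attrib={"id": str(user["id"])})
ET.SubElement(root, "name").text = user["name"]
roles_element = ET.SubElement(root, "roles")
for role in user["roles"]:
    ET.SubElement(roles_element, "role").text = role
xml_text = ET.tostring(root, encoding="unicode")

# Reading the XML back requires walking the tag hierarchy.
parsed = ET.fromstring(xml_text)
parsed_id = parsed.get("id")
parsed_roles = [element.text for element in parsed.findall("./roles/role")]
```

Note that XML carries everything as text, so the id comes back as the string "7", whereas JSON preserves it as a number. Parsing XML from untrusted sources needs extra care, since the standard library parser is not hardened against maliciously constructed documents.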

REST (Representational State Transfer) is a software architectural style that uses standard HTTP methods for data interchange. It is widely used for creating web APIs that allow applications to interact and exchange data over the internet.

SOAP (Simple Object Access Protocol) is a protocol for exchanging structured information in the implementation of web services. It uses XML as the data interchange format and provides a standardized method for communication between different systems.

Security Technologies

Authentication verifies the identity of users, while authorisation controls access to different parts of the web application based on user roles and permissions.
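
The difference between the two checks can be sketched as follows. Everything here is hypothetical: the user store, the role names, the permission sets and the iteration count are invented for the illustration, and a real application would normally rely on its framework's authentication and access-control features.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    # Salted, deliberately slow hash; the iteration count is an assumption.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user(password: str, role: str) -> dict:
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_password(password, salt), "role": role}


# Hypothetical in-memory user store and role-to-permission mapping.
USERS = {
    "alice": make_user("correct horse", "admin"),
    "bob": make_user("battery staple", "viewer"),
}
PERMISSIONS = {"admin": {"read", "write", "delete"}, "viewer": {"read"}}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this really the user they claim to be?"""
    record = USERS.get(username)
    if record is None:
        return False
    candidate = hash_password(password, record["salt"])
    return hmac.compare_digest(record["hash"], candidate)


def is_authorised(username: str, action: str) -> bool:
    """Authorisation: is this (already identified) user allowed to do this?"""
    record = USERS.get(username)
    if record is None:
        return False
    return action in PERMISSIONS.get(record["role"], set())
```

In this sketch bob can log in successfully but is still refused a delete, which is the case that authorisation testing looks for.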

Encryption (TLS, the successor to the now-deprecated SSL) is used to encrypt data transmitted between the client and the server, protecting it from eavesdropping and tampering in transit.
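
On the client side, a reasonably strict TLS configuration can be sketched with Python's `ssl` module. The minimum version chosen here is an assumption for the example rather than a recommendation from these notes, and the commented-out connection uses a placeholder hostname.

```python
import ssl

# Default context: verifies the server certificate and its hostname.
context = ssl.create_default_context()

# Assumption for this sketch: refuse anything older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

# The context would then be handed to an HTTPS client, for example:
#   import http.client
#   conn = http.client.HTTPSConnection("example.com", context=context)
```

Certificate and hostname verification are what stop an attacker in the middle from impersonating the server, so they should stay enabled outside of controlled test setups.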

External Technologies

CDNs (Content Delivery Networks) are used to distribute static content to multiple servers located worldwide, improving the web application's performance and reliability.

Third-party libraries and frameworks are often used by web applications to speed up development and access advanced features.


HTTP/S Protocol

HTTP is a stateless application layer protocol used for the transmission of resources like web application data and runs on top of TCP. Resources are uniquely identified with a URL/URI.

HTTPS (HTTP Secure) is HTTP sent over a connection encrypted with SSL/TLS.

These notes cover two versions of HTTP (newer versions, HTTP/2 and HTTP/3, also exist):

  1. HTTP 1.1 - widely used; can re-use the same connection to request multiple URIs or resources

  2. HTTP 1.0 - by default, a new connection has to be established for each request
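
The difference in connection handling can be observed with a small experiment. The sketch below starts a throwaway local server that speaks either protocol version, sends two requests through one client object and counts how many TCP connections the server actually saw. The function and handler names are invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(protocol_version: str, requests: int = 2) -> int:
    """Return how many TCP connections were needed for the given requests."""
    connections = []

    class Handler(BaseHTTPRequestHandler):
        def setup(self):
            super().setup()
            # setup() runs once per accepted TCP connection.
            connections.append(self.client_address)

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    Handler.protocol_version = protocol_version
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        for _ in range(requests):
            conn.request("GET", "/")
            conn.getresponse().read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(connections)


http11_connections = count_connections("HTTP/1.1")
http10_connections = count_connections("HTTP/1.0")
```

With the HTTP/1.1 server both requests travel over a single connection, while the HTTP/1.0 server closes the connection after each response and the client has to reconnect.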

You can view the notes for the OSI Model here.

HTTP Requests

The request line is the first line of an HTTP request and contains the following three components:

  1. HTTP method (e.g. GET, POST, PUT or DELETE)

  2. URL (Uniform Resource Locator) - the address of the resource that the client wants to access

  3. HTTP version - e.g. HTTP/1.0 or HTTP/1.1

Request headers provide additional information about the request to the web server:

  1. User-Agent - information about the client making the request

  2. Host - the hostname of the server or domain

  3. Accept - the media types the client can handle in the response (e.g. HTML or JSON)

  4. Authorization - credentials for authentication if required (the header name always uses this spelling)

  5. Cookie - information stored on the client-side and sent back to the server with each request

The request body is optional and is used by some HTTP methods like POST or PUT where data is sent to the server, typically in JSON or form data format.
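
The three parts of a request (request line, headers and optional body) can be seen by splitting a raw request by hand. The request below is invented for the example, and the parser is a deliberately simplified sketch that ignores many details a real server has to handle.

```python
# A made-up raw HTTP request: request line, headers, blank line, body.
RAW_REQUEST = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: demo-client/1.0\r\n"
    "Accept: application/json\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 27\r\n"
    "\r\n"
    "username=alice&password=pw1"
)


def parse_request(raw: str) -> dict:
    # A blank line separates the head (request line + headers) from the body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")

    # Request line: method, request target and HTTP version.
    method, target, version = request_line.split(" ", 2)

    # Header names are case-insensitive, so normalise them to lower case.
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


request = parse_request(RAW_REQUEST)
```

Intercepting proxies used in web application testing display requests in exactly this raw form, which is why it is worth being able to read it.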

Method
Explanation

GET

Used to retrieve data from the server. It requests the resource specified in the URL and does not modify the server's state.

POST

Used to submit data to be processed by the server. Typically includes data in the request body and the server may perform actions based on that data.

PUT

Used to update or create a resource on the server at the specified URL. It replaces the entire resource with the new representation provided in the request body.

DELETE

Used to remove the resource specified by the URL from the server.

PATCH

Used to apply partial modifications to a resource but doesn't fully replace a resource.

HEAD

Similar to GET but only retrieves the response headers and not the response body. Often used to check for things like resource existence or modification dates.

OPTIONS

Used to retrieve information about the communication options available for the target resource. It allows clients to determine the supported methods and headers.
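
The difference between PUT (replace the whole resource) and PATCH (change part of it) is easiest to see on a toy resource store. The dispatcher below is a hypothetical sketch: the paths, the dictionary-based store and the specific status codes returned (including 201, 204 and 405, which are not covered in these notes) are assumptions for the illustration.

```python
# Hypothetical in-memory "server state": path -> resource representation.
resources = {}


def handle(method: str, path: str, body: dict = None) -> tuple:
    """Return (status_code, representation) for a simplified request."""
    if method == "GET":
        # Safe: reads the resource without changing server state.
        return (200, resources[path]) if path in resources else (404, None)
    if method == "PUT":
        # Replaces the entire resource, or creates it if missing.
        created = path not in resources
        resources[path] = dict(body)
        return (201 if created else 200, resources[path])
    if method == "PATCH":
        # Applies a partial modification to an existing resource.
        if path not in resources:
            return (404, None)
        resources[path].update(body)
        return (200, resources[path])
    if method == "DELETE":
        removed = resources.pop(path, None)
        return (204, None) if removed is not None else (404, None)
    return (405, None)  # method not supported by this sketch
```

For example, after a PUT of a user with a name and an email, a PATCH that sends only a new email keeps the name, whereas a second PUT that sends only a name drops the email entirely.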

HTTP Responses

Response headers are similar to request headers and provide additional information about the response. Some common headers include:

  1. Content-Type - the media type of the response content (e.g. text/html)

  2. Content-Length - the size of the response body in bytes

  3. Set-Cookie - used to set cookies on the client-side for subsequent requests

  4. Cache-Control - directives for caching behaviour

The response body is optional but is normally present, as it contains the actual content of the response. For example, in the case of an HTML page, the response body will contain the HTML markup.
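
A response has the same overall shape as a request, except that it begins with a status line (HTTP version, status code and reason phrase) instead of a request line. The helper below is a minimal sketch that assembles such a response by hand; the function name and the example body are made up, and real servers add further headers.

```python
from http import HTTPStatus


def build_response(status: HTTPStatus, body: bytes, content_type: str) -> bytes:
    """Assemble a minimal raw HTTP/1.1 response (illustrative only)."""
    head_lines = [
        f"HTTP/1.1 {status.value} {status.phrase}",  # status line
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",              # body size in bytes
    ]
    # A blank line separates the headers from the body.
    return ("\r\n".join(head_lines) + "\r\n\r\n").encode("ascii") + body


page = b"<p>Hello</p>"
raw_response = build_response(HTTPStatus.OK, page, "text/html")
```

The Content-Length value is computed from the encoded body rather than typed by hand, since a mismatch between the header and the actual body length confuses clients.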

Status Code
Explanation

200 OK

Successful request and the server has returned the requested data

301 Moved Permanently

Requested resource has been moved permanently to a new URL and the client should use the new URL for all future requests

302 Found

Requested resource is temporarily located at a different URL

400 Bad Request

Server cannot process the request due to a client error

401 Unauthorized

Authentication is required and you must provide credentials

403 Forbidden

Server understood the request, but you don't have the permissions to access that resource

404 Not Found

Requested resource could not be found

500 Internal Server Error

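Server encountered an unexpected error while processing the request (typically an unhandled failure in the application; an overloaded or unavailable server more often returns 503 Service Unavailable or does not respond at all)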
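
The first digit of a status code identifies its class, which is often all a client needs in order to decide what to do next. The sketch below uses `http.HTTPStatus` from Python's standard library to look up the standard reason phrase; the `describe` helper and the category labels are assumptions for the example.

```python
from http import HTTPStatus

# Class of a status code, keyed by its first digit.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase and its class."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return f"{code} {status.phrase} ({CATEGORIES[code // 100]})"


summary = [describe(code) for code in (200, 301, 404, 500)]
```

During testing, the 4xx codes are particularly informative: a 401 versus a 403 on the same URL, for instance, reveals whether the server distinguishes between missing credentials and insufficient permissions.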

Cache-Control Directives

Directive
Explanation

public

The response can be cached by any intermediary caches and shared across different users.

private

The response is intended for a specific user and should not be stored by shared (intermediate) caches, although the user's own browser may still cache it.

no-cache

Client has to re-validate the response with the server before using the cached version. Doesn't prevent caching but requires re-validation

no-store

Directs the client and intermediate caches to not store any version of the response

max-age=<seconds>

Specifies the maximum time in seconds for which a cached response is considered fresh. After this period, the cache has to re-validate the response with the server before reusing it.
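
How a cache might act on these directives can be sketched as follows. This is a simplified model, not a full implementation of HTTP caching: the function names are invented, only the directives from the table are considered, and the sketch assumes that a response without max-age is never reused without re-validation.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header value into a directive dictionary."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, separator, value = part.partition("=")
        # Flags such as no-store have no value; store them as True.
        directives[name.strip().lower()] = value.strip() if separator else True
    return directives


def can_reuse_without_revalidation(
    header: str, age_seconds: int, shared_cache: bool = False
) -> bool:
    """Decide whether a stored response may be served as-is."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # assumption: no freshness information, so re-validate
    try:
        return age_seconds < int(max_age)
    except ValueError:
        return False  # malformed max-age value
```

For sensitive pages such as account details, testers usually expect to see no-store, because anything a browser or proxy keeps on disk can later be read by someone else using the same machine or network.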
