Thursday, January 30, 2025

Software Engineering

 Software Engineering is the process of designing, developing, testing, and maintaining software. It is a systematic and disciplined approach to software development that aims to create high-quality, reliable, and maintainable software.

The term software engineering is the product of two words: software and engineering. Software is a collection of integrated programs, consisting of carefully organized instructions and code written by developers in any of various programming languages, together with related documentation such as requirements specifications, design models, and user manuals.

Engineering is the application of scientific and practical knowledge to invent, design, build, maintain, and improve frameworks, processes, etc.

Software engineering includes a variety of techniques, tools, and methodologies, including requirements analysis, design, testing, and maintenance. It is a rapidly evolving field, and new tools and technologies are constantly being developed to improve the software development process. By following the principles of software engineering and using the appropriate tools and methodologies, software developers can create high-quality, reliable, and maintainable software that meets the needs of its users. Software Engineering is mainly used for large projects based on software systems rather than single programs or applications.

The main goal of Software Engineering is to develop software applications while improving quality, budget, and time efficiency. Software Engineering ensures that the software being built is consistent, correct, delivered on budget and on time, and meets its requirements. Software is the program that performs the input, processing, output, storage, and control functions.


Why is Software Engineering required?

  • To manage Large software
  • For more Scalability
  • Cost Management
  • To manage the dynamic nature of software
  • For better quality Management


Need of Software Engineering

The need for software engineering arises because of the high rate of change in user requirements and in the environment in which the software operates.

Huge Programming: It is easier to build a wall than a house or a building; likewise, as the size of software grows large, engineering must step in to give development a scientific, disciplined process.

Adaptability: If the software process were not based on scientific and engineering principles, it would often be easier to rebuild software from scratch than to scale or adapt an existing system.

Cost: The hardware industry has shown its maturity, and mass manufacturing has driven down the cost of computer and electronic hardware. The cost of software, however, remains high if a proper development process is not followed.

Dynamic Nature: Software must continually grow and adapt, depending heavily on the environment in which the user works. If the requirements keep changing, new upgrades must be made to the existing software.

Quality Management: A better process of software development provides a better-quality software product.


Types of Software

Software can be categorized into different types:

  1. Based on Application
  2. Based on Copyright


1. Based on Application

Software can be classified on the basis of its application, as follows.

1. System Software:

System Software is necessary to manage computer resources and support the execution of application programs. Software like operating systems, compilers, editors and drivers, etc., come under this category. A computer cannot function without the presence of these. Operating systems are needed to link the machine-dependent needs of a program with the capabilities of the machine on which it runs. Compilers translate programs from high-level language to machine language. 

2. Application Software:

Application software is designed to fulfill the user's requirements by interacting with the user directly. It can be classified into two major categories: generic and customized. Generic software is open to all and behaves the same for every user; its function is limited and not customized to a user's changing requirements. Customized software, on the other hand, comprises products designed per a client's requirements and not available to everyone.

3. Networking and Web Applications Software:

Networking Software provides the support necessary for computers to interact with each other and with data storage facilities. Networking software is also needed when software runs on a network of computers (such as the World Wide Web). It includes all network management software, server software, security and encryption software, and software used to develop web-based applications with technologies like HTML, PHP, and XML.

4. Embedded Software:

This type of software is embedded into the hardware, normally in Read-Only Memory (ROM), as part of a larger system, and is used to support certain functionality under controlled conditions. Examples are software used in instrumentation and control applications like washing machines, satellites, microwaves, etc.

5. Reservation Software:

A Reservation system is primarily used to store and retrieve information and perform transactions related to air travel, car rental, hotels, or other activities. They also provide access to bus and railway reservations, although these are not always integrated with the main system. These are also used to relay computerized information for users in the hotel industry, making a reservation and ensuring that the hotel is not overbooked. 

6. Business Software:

This category of software is used to support business applications and is the most widely used category of software. Examples are software for inventory management, accounts, banking, hospitals, schools, stock markets, etc. 

7. Entertainment Software:

Education and Entertainment software provides a powerful tool for educational agencies, especially those that deal with educating young children. There is a wide range of entertainment software such as computer games, educational games, translation software, mapping software, etc.  

8. Artificial Intelligence Software:

Software like expert systems, decision support systems, pattern recognition software, artificial neural networks, etc. comes under this category. Such software tackles complex problems using non-numerical algorithms rather than straightforward numerical computation.


2. Based on Copyright

Classification of Software can be done based on copyright. These are stated as follows:

1. Commercial Software:

It represents the majority of software that we purchase from software companies, commercial computer stores, etc. In this case, when a user buys software, they acquire a license key to use it. Users are not allowed to make copies of the software. The company owns the copyright of the program.

2. Shareware Software:

Shareware software is also covered under copyright, but the purchasers are allowed to make and distribute copies with the condition that after testing the software, if the purchaser adopts it for use, then they must pay for it. In both of the above types of software, changes to the software are not allowed. 

3. Freeware Software:

In general, according to freeware software licenses, copies of the software can be made both for archival and distribution purposes, but here, distribution cannot be for making a profit. Derivative works and modifications to the software are allowed and encouraged. Decompiling of the program code is also allowed without the explicit permission of the copyright holder.

4. Public Domain Software:

In the case of public domain software, the original copyright holder explicitly relinquishes all rights to the software. Hence, software copies can be made both for archival and distribution purposes with no restrictions on distribution. Modifications to the software and reverse engineering are also allowed.


Software Characteristics

Functionality:

It refers to the degree of performance of the software against its intended purpose. 

Functionality refers to the set of features and capabilities that a software program or system provides to its users. It is one of the most important characteristics of software, as it determines the usefulness of the software for the intended purpose. Examples of functionality in software include:

  • Data storage and retrieval
  • Data processing and manipulation
  • User interface and navigation
  • Communication and networking
  • Security and access control
  • Reporting and visualization
  • Automation and scripting

Reliability:

A set of attributes that bears on the capability of software to maintain its level of performance under the given condition for a stated period of time. 

Reliability is a characteristic of software that refers to its ability to perform its intended functions correctly and consistently over time. Reliability is an important aspect of software quality, as it helps ensure that the software will work correctly and not fail unexpectedly.


Examples of factors that can affect the reliability of software include:

  • Bugs and errors in the code
  • Lack of testing and validation
  • Poorly designed algorithms and data structures
  • Inadequate error handling and recovery
  • Incompatibilities with other software or hardware


Efficiency:

It refers to the ability of the software to use system resources in the most effective and efficient manner. The software should make effective use of storage space and execute commands as per desired timing requirements.

Efficiency is a characteristic of software that refers to its ability to use resources such as memory, processing power, and network bandwidth in an optimal way. High efficiency means that a software program can perform its intended functions quickly and with minimal use of resources, while low efficiency means that a software program may be slow or consume excessive resources.

Examples of factors that can affect the efficiency of the software include:

  • Poorly designed algorithms and data structures
  • Inefficient use of memory and processing power
  • High network latency or bandwidth usage
  • Unnecessary processing or computation
  • Unoptimized code


Usability:

It refers to the extent to which the software can be used with ease, i.e., the amount of effort or time required to learn how to use the software.


Maintainability:

It refers to the ease with which modifications can be made in a software system to extend its functionality, improve its performance, or correct errors. 


Portability:

A set of attributes that bears on the ability of software to be transferred from one environment to another with minimal changes.


Software Development Life Cycle

Software Development Life Cycle (SDLC) is a well-defined, structured sequence of stages in software engineering to develop the intended software product.


Communication

This is the first step, where the user initiates the request for a desired software product. The user contacts the service provider, tries to negotiate the terms, and submits the request to the service-providing organization in writing.

Requirement Gathering

From this step onwards, the software development team works to carry the project forward. The team holds discussions with various stakeholders from the problem domain and tries to bring out as much information as possible on their requirements. The requirements are contemplated and segregated into user requirements, system requirements, and functional requirements. The requirements are collected using a number of practices, as given below –

  • studying the existing or obsolete system and software,
  • conducting interviews of users and developers,
  • referring to the database, or
  • collecting answers from questionnaires.

Feasibility Study

After requirement gathering, the team comes up with a rough plan of the software process. At this step the team analyzes whether software can be designed to fulfill all the requirements of the user, and whether there is any possibility of the software becoming no longer useful. It is also analyzed whether the project is financially, practically, and technologically feasible for the organization to take up. There are many algorithms available which help the developers conclude the feasibility of a software project.

System Analysis

At this step the developers decide on a roadmap for their plan and try to select the best software model suitable for the project. System analysis includes understanding the software product's limitations, learning system-related problems or changes to be made in existing systems beforehand, and identifying and addressing the impact of the project on the organization and personnel. The project team analyzes the scope of the project and plans the schedule and resources accordingly.

Software Design

The next step is to bring the whole body of knowledge from requirements and analysis to the desk and design the software product.

The inputs from users and the information gathered in the requirement gathering phase are the inputs of this step. The output of this step comes in the form of two designs: logical design and physical design. Engineers produce meta-data and data dictionaries, logical diagrams, data-flow diagrams, and in some cases pseudo code.

Coding

This step is also known as the programming phase. The implementation of the software design starts here: program code is written in a suitable programming language and efficient, error-free executable programs are developed.

Testing

An estimate says that about 50% of the whole software development effort should be spent on testing. Errors can damage the software, ranging from critical failures to its outright removal from use. Software testing is done while coding by the developers, and thorough testing is conducted by testing experts at various levels of the code, such as module testing, program testing, product testing, in-house testing, and testing the product at the user's end. Early discovery of errors and their remedy is the key to reliable software.
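
As a small illustration of module-level testing, here is a sketch using Python's built-in unittest module (the function under test is hypothetical):

    import unittest

    def apply_discount(price, percent):
        """Module under test (hypothetical): reduce price by a percentage."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        return round(price * (1 - percent / 100), 2)

    class TestApplyDiscount(unittest.TestCase):
        def test_typical_discount(self):
            self.assertEqual(apply_discount(200.0, 10), 180.0)

        def test_invalid_percent_is_rejected(self):
            with self.assertRaises(ValueError):
                apply_discount(100.0, 150)

    if __name__ == "__main__":
        unittest.main()

Catching an error at this level is far cheaper than discovering it during product testing or at the user's end.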

Integration

Software may need to be integrated with the libraries, databases, and other program(s). This stage of SDLC is involved in the integration of software with outer world entities.

Implementation

This means installing the software on user machines. At times, software needs post-installation configurations at user end. 

Software is tested for portability and adaptability and integration related issues are solved during implementation.

Operation and Maintenance

This phase keeps the software operating efficiently and with fewer errors. If required, the users are trained or aided with documentation on how to operate the software and how to keep it operational. The software is maintained over time by updating the code according to changes taking place in the user's environment or in technology. This phase may face challenges from hidden bugs and unidentified real-world problems.


Waterfall Model

The waterfall model is the simplest model of the software development paradigm. All the phases of the SDLC function one after another in a linear manner: only when the first phase is finished does the second phase start, and so on.

This model assumes that everything was carried out perfectly as planned in the previous stage, so no issues carry over into the next phase. The model does not work smoothly if issues are left over from a previous step. Its sequential nature does not allow us to go back and undo or redo our actions.

Advantage:

This model is best suited when developers already have designed and developed similar software in the past and are aware of all its domains.

Drawback:

The sequential nature of the model does not allow going back to undo or redo actions.


Iterative Model

This model leads the software development process through iterations. It projects the development process in a cyclic manner, repeating every step after every cycle of the SDLC process.

The software is first developed on a very small scale, and all the steps under consideration are followed. Then, on every subsequent iteration, more features and modules are designed, coded, tested, and added to the software. Every cycle produces software that is complete in itself and has more features and capabilities than the previous one.

After each iteration, the management team can work on risk management and prepare for the next iteration. Because each cycle covers a small portion of the whole software process, the development process is easier to manage, but it consumes more resources.

Advantage:

Because each cycle covers a small portion of the whole software process, the development process is easier to manage.

Drawback:

Since more features are added to the software on every iteration, it consumes more resources.


Spiral Model

The spiral model is a combination of the iterative model and one of the other SDLC models. It can be seen as choosing one SDLC model and combining it with a cyclic process (the iterative model).

This model considers risk, which often goes unnoticed in most other models. An iteration starts with determining the objectives and constraints of the software. The next phase is prototyping the software, which includes risk analysis. Then one standard SDLC model is used to build the software. In the fourth phase, the plan for the next iteration is prepared.

Advantage:

1. Additional functionality or changes can be introduced at a later stage

2. Cost estimation becomes easy

Drawback:

1. Not advisable for smaller projects, as it might cost more

2. Demands risk assessment expertise



V – model

The V-model is a type of SDLC model in which the process executes sequentially in a V-shape.

The major drawback of the waterfall model is that we move to the next stage only when the previous one is finished, with no chance to go back if something is found wrong in later stages. The V-model provides a means of testing the software at each stage in a reverse manner.

At every stage, test plans and test cases are created to verify and validate the product according to the requirements of that stage. For example, in the requirement gathering stage the test team prepares all the test cases in correspondence to the requirements. Later, when the product is developed and is ready for testing, the test cases of this stage verify the software against the requirements of that stage.

This makes both verification and validation go in parallel. This model is also known as verification and validation model.

Advantage:

1. Each phase has specific deliverables.

2. Works well for small projects where requirements are easily understood.

3. Utility of the resources is high.

Drawback:

1. Very rigid, like the waterfall model.

2. Little flexibility; adjusting scope is difficult and expensive.



Thursday, January 23, 2025

Blockchain Technology

 


Blockchain is a decentralized distributed database (ledger) of immutable records accessed by various business applications over the network. Client applications of related businesses can read or append transaction records to the blockchain. Transaction records submitted to any node are validated and committed to the ledger database on all the nodes of blockchain network. Committed transactions are immutable because each block is linked with its previous block by means of hash and signature values. Protocols such as Gossip and Consensus ensure that the submitted transactions are transferred to all nodes and committed on all blockchain nodes consistently.

The blockchain ecosystem consists of the blockchain client, blockchain nodes, the blockchain network, the transaction processor, and the consensus process.


A blockchain client is an application that creates a transaction message in a prescribed format and submits it to a blockchain node through a web API. It may be any existing application that posts transaction messages to a blockchain node. Clients are restricted using Public Key Infrastructure (PKI) technology at the blockchain node level.

A blockchain node is a server that runs blockchain services, receiving transactions and transmitting them to other blockchain nodes. Depending on the design, the node participates in the consensus process to commit blocks of transaction data to the ledger database.

A blockchain network is a network of linked nodes used to read and write transactions in the ledger database. The topology is based on the nodes participating in the consensus process. Traditional systems are centralized, where all data and decision-making are concentrated on a single node or cluster of nodes. In decentralized systems, the data and decision-making are spread out among a large number of nodes. These nodes maintain copies of the shared database and decide among themselves which data is to be committed to the database using a consensus mechanism. Decentralized networks can be an interconnection of centralized or hub-and-spoke type networks. A distributed network is a special case of a decentralized system where every single node in the network maintains the shared database and participates in consensus to determine which data is to be committed to the database.

There are at least four types of blockchain networks:


  • public blockchains, 
  • private blockchains, 
  • consortium blockchains and 
  • hybrid blockchains.

Public blockchains

A public blockchain has absolutely no access restrictions. Anyone with an Internet connection can send transactions to it as well as become a validator (i.e., participate in the execution of a consensus protocol). Usually, such networks offer economic incentives for those who secure them and utilize some type of a proof-of-stake or proof-of-work algorithm.

Some of the largest, most known public blockchains are the bitcoin blockchain and the Ethereum blockchain.

Private blockchains

A private blockchain is permissioned. One cannot join it unless invited by the network administrators. Participant and validator access is restricted. To distinguish between open blockchains and other peer-to-peer decentralized database applications that are not open ad-hoc compute clusters, the terminology Distributed Ledger Technology (DLT) is normally used for private blockchains.

Hybrid blockchains

A hybrid blockchain has a combination of centralized and decentralized features. The exact workings of the chain can vary based on which portions of centralization and decentralization are used.

Sidechains

A sidechain is a designation for a blockchain ledger that runs in parallel to a primary blockchain. Entries from the primary blockchain (where said entries typically represent digital assets) can be linked to and from the sidechain; this allows the sidechain to otherwise operate independently of the primary blockchain (e.g., by using an alternate means of record keeping, alternate consensus algorithm, etc.).

Consortium blockchain

A consortium blockchain is a type of blockchain that combines elements of both public and private blockchains. In a consortium blockchain, a group of organizations come together to create and operate the blockchain, rather than a single entity. The consortium members jointly manage the blockchain network and are responsible for validating transactions. Consortium blockchains are permissioned, meaning that only certain individuals or organizations are allowed to participate in the network. This allows for greater control over who can access the blockchain and helps to ensure that sensitive information is kept confidential.


Bitcoin and blockchain are not the same. Blockchain provides the means to record and store bitcoin transactions, but blockchain has many uses beyond bitcoin. Bitcoin is only the first use case for blockchain.


Proof of work

  • A proof of work is a piece of data which is difficult (costly, time consuming) to produce but easy for others to verify and which satisfies certain requirements.
  • In order for a block to be accepted by network participants, a miner must complete a proof of work which covers all of the data in the block (a toy sketch follows this list).
  • The difficulty of this work is adjusted so as to limit the rate at which new blocks can be generated by the network to one every 10 minutes.
  • Due to the very low probability of successful generation, this makes it unpredictable which worker computer in the network will be able to generate the next block.
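
The following toy proof-of-work loop in Python makes the idea concrete (a sketch only: real networks hash complete block headers and use far higher difficulty):

    import hashlib

    def proof_of_work(block_data, difficulty=4):
        """Search for a nonce whose SHA-256 hash of block_data + nonce
        starts with `difficulty` zero hex digits: costly to find,
        but trivial for anyone else to verify with a single hash."""
        prefix = "0" * difficulty
        nonce = 0
        while True:
            digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
            if digest.startswith(prefix):
                return nonce, digest
            nonce += 1

    nonce, digest = proof_of_work("block #1: Alice pays Bob 5")
    print(nonce, digest)

Raising the difficulty by one hex digit makes the search roughly 16 times harder, which is how a network tunes its block rate.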


Ethereum

  • Functions as a platform through which people can use tokens to create and run applications and create smart contracts
  • Ethereum allows people to connect directly through a powerful decentralized supercomputer
  • Language - Solidity
  • Currency - Ether
  • Consensus - Proof of Stake (PoS)


Smart Contracts

  • A smart contract is an agreement or set of rules that govern a business transaction;
  • It is stored on the blockchain and is executed automatically as part of a transaction;
  • Its purpose is to provide security superior to traditional contract law while reducing the costs and delays associated with traditional contracts.


Hyperledger 

  • Hyperledger is an open source collaborative effort created to advance cross-industry blockchain technologies. 
  • It is a global collaboration, hosted by The Linux Foundation, including leaders in finance, banking, the Internet of Things, supply chains, manufacturing, and technology.


Consensus 

Consensus is a procedure for selecting a leader node, which decides whether a block of transactions is to be committed or rejected. Earlier blockchain systems used Proof of Work (PoW) for the consensus process: every participating node is given a mining task, and the node that completes the mining task first is elected leader. The mining task is to find, by adding a nonce to the current block data, a hash value that matches a certain pattern. Nodes that participate in the mining process require heavy computing resources. A more recent consensus protocol is PoET, “Proof of Elapsed Time”: every node in the consensus process selects a random wait time and counts it down, and the node that reaches zero first is selected as leader.
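
A toy sketch of the PoET idea (in real PoET, trusted hardware proves that the wait was honest; here the wait is simply simulated):

    import random

    def elect_leader(nodes):
        """Each node draws a random wait time; the node whose timer
        would expire first (the minimum wait) becomes the leader."""
        waits = {node: random.uniform(0, 10) for node in nodes}
        return min(waits, key=waits.get)

    print(elect_leader(["node-A", "node-B", "node-C"]))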

Transaction

A transaction is a unit of business data within a block. A block is a set of transactions bundled with signatures and the hash value of the previous block. The genesis block is the first block of the chain, created during installation and configuration.
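
A minimal sketch of how blocks chain together through the previous block's hash (field names here are illustrative, not taken from any particular platform):

    import hashlib, json

    def make_block(transactions, prev_hash):
        """Bundle transactions with the previous block's hash; the new
        block's own hash covers both, so altering any earlier block
        breaks every hash that follows it."""
        block = {"transactions": transactions, "prev_hash": prev_hash}
        block["hash"] = hashlib.sha256(
            json.dumps(block, sort_keys=True).encode()).hexdigest()
        return block

    genesis = make_block(["genesis"], prev_hash="0" * 64)  # first block
    block_1 = make_block(["Alice->Bob: 5"], prev_hash=genesis["hash"])
    print(block_1["prev_hash"] == genesis["hash"])  # True: the link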

Merkle Tree

A Merkle tree is a tree data structure in which each leaf node holds the hash of a transaction and each intermediate node holds a hash calculated from its immediate child nodes. In blockchain, a block consists of one or more transactions and its respective tree of hashes. In a distributed system, this tree is used to maintain data consistency among all participating nodes.
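
A compact sketch of computing a Merkle root, assuming SHA-256 and duplicating the last hash when a level has an odd number of nodes (one common convention):

    import hashlib

    def merkle_root(transactions):
        """Hash each transaction (the leaf level), then repeatedly hash
        adjacent pairs until a single root hash remains.
        Assumes at least one transaction."""
        level = [hashlib.sha256(tx).digest() for tx in transactions]
        while len(level) > 1:
            if len(level) % 2:              # odd count: duplicate last hash
                level.append(level[-1])
            level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                     for i in range(0, len(level), 2)]
        return level[0]

    print(merkle_root([b"tx1", b"tx2", b"tx3"]).hex())

Two nodes can compare just the root hash to check that their copies of a block's transactions are identical.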

Ledger

Ledger/ Chain Database is a key-value database for a chain of serialized blocks. One block may contain one or more transactions.

State Database is a key-value database for storing transaction state and links of its related transactions.

Thursday, December 26, 2024

Cloud computing


Cloud computing is the delivery of computing resources, such as storage, databases, and software, over the internet. It allows users to access these resources on-demand, without the need to buy and maintain physical infrastructure. 





Cloud computing means storing and accessing data and programs on remote servers hosted on the internet instead of on the computer's hard drive or a local server. Cloud computing is also referred to as Internet-based computing; it is a technology where resources are provided as a service through the Internet to the user. The stored data can be files, images, documents, or any other kind of storable data.


Benefits:

Cost-effectiveness: Users only pay for what they use, which can help lower operating costs. 

Scalability: Users can scale up or down their resources as their needs change. 

Flexibility: Users can access resources from anywhere, on any device. 

Innovation: Users can access resources faster, which can lead to faster innovation. 


There are different types of cloud computing, including private clouds, public clouds, and hybrid clouds.

Public Cloud

A Public Cloud is cloud computing in which the infrastructure and services are owned and operated by a third-party provider and made available to the public over the internet. The public can access and use shared resources such as servers, storage, and applications, and, importantly, you pay only for what you use. Examples of public cloud providers are Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

Advantages

Cost Efficient: In the public cloud, we pay only for what we use, so it is more cost-efficient than maintaining physical servers or our own infrastructure.

Automatic Software Updates: In the public cloud, software updates are automatic; we don't have to update the software manually.

Accessibility: Public clouds allow users to access their resources and applications from anywhere in the world. We just need an internet connection to access it.

Disadvantages

Security and Privacy Concerns: Public clouds can be vulnerable to data breaches, cyber attacks, and other security risks. Since data is stored on servers owned by a third-party provider, there is always a risk that confidential or sensitive data may be exposed or compromised.

Limited Control: With public cloud services, users have limited control over the infrastructure and resources used to run their applications. This can make it difficult to customize the environment to meet specific requirements.

Reliance on Internet Connectivity: Public cloud services require a reliable and stable internet connection to access the resources and applications hosted in the cloud. If the internet connection is slow or unstable, it can affect the performance and availability of the services.

Service Downtime: Public cloud providers may experience service downtime due to hardware failures, software issues, or maintenance activities. This can result in temporary loss of access to applications and data.

Compliance and Regulatory Issues: Public cloud services may not meet certain compliance or regulatory requirements, such as those related to data privacy or security. This can create legal or contractual issues for businesses that are subject to these requirements.

Cost Overruns: Public cloud services are typically billed on a pay-per-use basis, which can result in unexpected cost overruns if usage exceeds anticipated levels. Additionally, the cost of using public cloud services may increase over time, as providers adjust their pricing models or add new features and services.

Private Cloud

A Private Cloud is a cloud computing environment in which the infrastructure and services are owned and operated by a single organization, for example, a company or government, and it is accessed only by authorized users within that organization. Private cloud organizations have their own data centers, and a private cloud provides a higher level of security. Examples – HPE, Dell, VMware, etc.

Advantages

Security Status: Private clouds provide a higher level of security, as the organization has full control over the cloud service and can customize the servers to manage security.

Customization of Service: Private clouds allow organizations to customize the infrastructure and services, including security, to meet their specific requirements.

Privacy: Private clouds provide increased privacy, as the organization (company or government) has more control over who has access to its data and resources.

Disadvantages

Higher Cost: Private clouds require dedicated hardware, software, and networking infrastructure, which can be expensive to acquire and maintain. This can make it challenging for smaller businesses or organizations with limited budgets to implement a private cloud.

Limited Scalability: Private clouds are designed to serve a specific organization, which means that they may not be as scalable as public cloud services. This can make it difficult to quickly add or remove resources in response to changes in demand.

Technical Complexity: Setting up and managing a private cloud infrastructure requires technical expertise and specialized skills. This can be a challenge for organizations that lack in-house IT resources or expertise.

Security Risks: Private clouds are typically considered more secure than public clouds since they are operated within an organization’s own infrastructure. However, they can still be vulnerable to security risks such as data breaches or cyber attacks.

Lack of Standardization: Private clouds are often built using proprietary hardware and software, which can make it challenging to integrate with other cloud services or migrate to a different cloud provider in the future.

Maintenance and Upgrades: Maintaining and upgrading a private cloud infrastructure can be time-consuming and resource-intensive. This can be a challenge for organizations that need to focus on other core business activities.

Hybrid Cloud

A hybrid cloud is a combination of both public and private cloud environments that allows organizations to take advantage of the benefits of both types of clouds. It manages traffic levels during peak usage periods, and it can provide greater flexibility, scalability, and cost-effectiveness than a single cloud environment. Examples – IBM, DataCore Software, Rackspace, Threat Stack, Infinidat, etc.

Advantages

Flexibility: A hybrid cloud stores its data, including sensitive data, on private cloud servers, while the public cloud provides flexibility and scalability.

Scalability: A hybrid cloud enables organizations to move workloads back and forth between their private and public clouds depending on their needs.

Security: A hybrid cloud keeps control over highly sensitive data and provides high-level security, while still taking advantage of the public cloud's cost savings.

Disadvantages

Complexity: Hybrid clouds are complex to set up and manage since they require integration between different cloud environments. This can require specialized technical expertise and resources.

Cost: Hybrid clouds can be more expensive to implement and manage than either public or private clouds alone, due to the need for additional hardware, software, and networking infrastructure.

Security Risks: Hybrid clouds are vulnerable to security risks such as data breaches or cyber attacks, particularly when there is a lack of standardization and consistency between the different cloud environments.

Data Governance: Managing data across different cloud environments can be challenging, particularly when it comes to ensuring compliance with regulations such as GDPR or HIPAA.

Network Latency: Hybrid clouds rely on communication between different cloud environments, which can result in network latency and performance issues.

Integration Challenges: Integrating different cloud environments can be challenging, particularly when it comes to ensuring compatibility between different applications and services.

Vendor Lock-In: Hybrid clouds may require organizations to work with multiple cloud providers, which can result in vendor lock-in and limit the ability to switch providers in the future.


Some examples of cloud computing include: 

  • Google Docs and Microsoft 365
  • Email, calendar, Skype, and WhatsApp
  • Zoom
  • Microsoft Teams
  • AWS Lambda
  • Salesforce


Characteristics

On-demand self-service: "A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider."

Broad network access: "Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations)."

Resource pooling: "The provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand."

Rapid elasticity: "Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear unlimited and can be appropriated in any quantity at any time."

Measured service: "Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service."


The following are some of the Operations that can be performed with Cloud Computing

  • Storage, backup, and recovery of data
  • Delivery of software on demand
  • Development of new applications and services
  • Streaming videos and audio


What is Virtualization In Cloud Computing?

Virtualization is the software technology that provides logical isolation of physical resources. Creating logical isolation of physical resources such as RAM, CPU, and storage over the cloud is known as virtualization in cloud computing. Put simply, it means creating virtual instances of computing resources over the cloud. It provides better management and utilization of hardware resources, with logical isolation making applications independent of one another. It streamlines resource allocation and enhances scalability, allowing multiple virtual computers within a single physical machine and offering cost-effectiveness and better optimization of resources.


Architecture Of Cloud Computing

Cloud computing architecture refers to the components and sub-components required for cloud computing. 

  1. Front end (fat client, thin client)
  2. Back-end platforms (servers, storage)
  3. Cloud-based delivery and a network (Internet, Intranet, Intercloud)


1. Front End (User Interaction Enhancement)

The user interface of cloud computing consists of two kinds of clients. Thin clients use web browsers, providing portable and lightweight access, while fat clients are fuller-featured applications offering a stronger user experience.


2. Back-end Platforms (Cloud Computing Engine)

The core of cloud computing is built on back-end platforms, with several servers for processing and storage for data. Application logic is managed by the servers, while storage provides effective data handling. Together, these back-end platforms offer the processing power and the capacity to manage and store data behind the cloud.


3. Cloud-Based Delivery and Network

On-demand access to computing resources is provided over the Internet, Intranet, and Intercloud. The Internet offers global accessibility, the Intranet supports internal communication of services within the organization, and the Intercloud enables interoperability across various cloud services. This dynamic network connectivity is an essential component of cloud computing architecture, guaranteeing easy access and data transfer.


Types of Cloud Computing Services

The following are the types of Cloud Computing:

  • Infrastructure as a Service (IaaS)
  • Platform as a Service (PaaS)
  • Software as a Service (SaaS)

1. Infrastructure as a Service (IaaS)

Flexibility and Control: IaaS provides virtualized computing resources such as VMs, storage, and networks, giving users control over the operating system and applications.

Reducing Expenses of Hardware: IaaS saves businesses money by eliminating the need to invest in physical infrastructure, making it cost-effective.

Scalability of Resources: The cloud allows hardware resources to be scaled up or down on demand, facilitating optimal performance with cost efficiency.
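
As an illustration of the IaaS model, here is a minimal sketch using AWS's boto3 SDK (it assumes AWS credentials are already configured; the AMI ID is a placeholder):

    import boto3  # pip install boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Provision one virtual machine on demand -- pay only while it runs.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder image ID
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
    )
    instance_id = response["Instances"][0]["InstanceId"]
    print("launched", instance_id)

    # Scale back down when demand drops.
    ec2.terminate_instances(InstanceIds=[instance_id])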

2. Platform as a Service (PaaS)

Simplifying the Development: PaaS simplifies application development by keeping the underlying infrastructure as an abstraction. It helps developers focus entirely on application logic (code), while background operations are managed by the platform provider.

Enhancing Efficiency and Productivity: PaaS lowers the complexity of infrastructure management, speeds up execution, and brings updates to market quickly by streamlining the development process.

Automation of Scaling: PaaS manages resource scaling, guaranteeing that the program handles its workload efficiently.

3. Software as a Service (SaaS)

Collaboration And Accessibility: Software as a Service (SaaS) lets users easily access applications without requiring local installation. The software is fully managed by the provider and works as a service over the internet, encouraging effortless collaboration and ease of access.

Automation of Updates: SaaS providers handle software maintenance with automatic updates, ensuring users always have the latest features and security patches.

Cost Efficiency: SaaS acts as a cost-effective solution by reducing the overhead of IT support by eliminating the need for individual software licenses.


What Are Cloud Deployment Models?

The following are the Cloud Deployment Models:

1. Private Deployment Model

It provides enhanced protection and customization, with cloud resources used according to particular specified requirements. It is well suited to companies with strict security and compliance needs.

2. Public Deployment Model

It offers a pay-as-you-go principle with scalability and accessibility of cloud resources for numerous users. It ensures cost-effectiveness while providing the services enterprises need.

3. Hybrid Deployment Model

It combines elements of both private and public clouds, providing seamless data and application processing between environments. It offers flexibility in placing resources optimally, such as keeping sensitive data in the private cloud and highly scalable applications in the public cloud.


Advantages of Cloud Computing

The following are main advantages of Cloud Computing:

  • Cost Efficiency: Cloud computing provides flexible pricing to users with the pay-as-you-go model. It helps lessen capital expenditure on infrastructure, particularly for small and medium-sized businesses.
  • Flexibility and Scalability: Cloud services facilitate the scaling of resources based on demand, ensuring that businesses can handle varying workloads without large investments in hardware during periods of low demand.
  • Collaboration and Accessibility: Cloud computing provides easy access to data and applications from anywhere over the internet. This encourages collaborative teamwork from different locations through shared documents and projects in real time, resulting in higher-quality, more productive output.
  • Automatic Maintenance and Updates: The cloud provider takes care of infrastructure management and automatically applies updates as new software versions appear. This guarantees that companies always have access to the newest technologies and can focus completely on business operations and innovation.


Disadvantages Of Cloud Computing

The following are the main disadvantages of Cloud Computing:

  • Security Concerns: Storing sensitive data on external servers raises security concerns, which is one of the main drawbacks of cloud computing.
  • Downtime and Reliability: Even though cloud services are usually dependable, they may suffer unexpected interruptions and downtime. These can arise from server problems, network issues, or maintenance disruptions at the cloud provider, negatively affecting business operations and blocking users from their applications.
  • Dependency on Internet Connectivity: Cloud computing services rely heavily on internet connectivity. Users need a stable, high-speed internet connection to access and use cloud resources. In regions with limited connectivity, users may face challenges in reaching their data and applications.
  • Cost Management Complexity: The pay-as-you-go pricing model is a key benefit of cloud services, but it also leads to cost-management complexity. Without careful monitoring and resource optimization, organizations may end up with unexpected costs as their usage scales. Understanding and controlling cloud usage requires ongoing attention.

Monday, December 23, 2024

Operating System (OS)

An Operating System is a type of system software. It basically manages all the resources of the computer. An operating system acts as an interface between the software and different parts of the computer or the computer hardware. The operating system is designed in such a way that it can manage the overall resources and operations of the computer.

Operating System is a fully integrated set of specialized programs that handle all the operations of the computer. It controls and monitors the execution of all other programs that reside in the computer, which also includes application programs and other system software of the computer. Examples of Operating Systems are Windows, Linux, Mac OS, etc.

An Operating System (OS) is a collection of software that manages computer hardware resources and provides common services for computer programs. In this article we will see basic of operating system in detail.

What is an Operating System Used for?

The operating system helps make effective use of the computer's software as well as its hardware. Without an OS, it is very difficult for any application to be user-friendly. The operating system provides the user with an interface that makes any application attractive and user-friendly. The operating system comes with a large number of device drivers that make OS services reachable to the hardware environment. Each and every application present in the system requires the operating system. The operating system works as a communication channel between system hardware and system software. The operating system lets an application work with the hardware without knowing the actual hardware configuration. It is one of the most important parts of the system, and hence it is present in every device, whether large or small.

An Operating System can be defined as an interface between user and hardware. It is responsible for the execution of all the processes, Resource Allocation, CPU management, File Management and many other tasks.



The purpose of an operating system is to provide an environment in which a user can execute programs in a convenient and efficient manner.


Functions of the Operating System

Resource Management:  The operating system manages and allocates memory, CPU time, and other hardware resources among the various programs and processes running on the computer.

Process Management:  The operating system is responsible for starting, stopping, and managing processes and programs. It also controls the scheduling of processes and allocates resources to them.
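
A user-level glimpse of process management can be sketched with Python's subprocess module (the sketch assumes a Unix-like system, since it launches the sleep command):

    import subprocess, time

    child = subprocess.Popen(["sleep", "30"])  # the OS creates and schedules it
    print("started pid", child.pid)

    time.sleep(1)
    child.terminate()   # ask the OS to stop the process (SIGTERM)
    child.wait()        # reap the child and collect its exit status
    print("exit code", child.returncode)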

Memory Management: The operating system manages the computer’s primary memory and provides mechanisms for optimizing memory usage.

Security:  The operating system provides a secure environment for the user, applications, and data by implementing security policies and mechanisms such as access controls and encryption.

Job Accounting:  It keeps track of time and resources used by various jobs or users.

File Management:  The operating system is responsible for organizing and managing the file system, including the creation, deletion, and manipulation of files and directories.

Device Management: The operating system manages input/output devices such as printers, keyboards, mice, and displays. It provides the necessary drivers and interfaces to enable communication between the devices and the computer.

Networking: The operating system provides networking capabilities such as establishing and managing network connections, handling network protocols, and sharing resources such as printers and files over a network.
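
Programs reach these networking services through the OS's socket interface. A minimal sketch (example.com is used purely as an illustrative host):

    import socket

    # The OS handles name resolution, routing, and the TCP protocol itself.
    with socket.create_connection(("example.com", 80), timeout=5) as sock:
        sock.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\n\r\n")
        print(sock.recv(200).decode(errors="replace"))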

User Interface: The operating system provides a user interface that enables users to interact with the computer system. This can be a Graphical User Interface (GUI), a Command-Line Interface (CLI), or a combination of both.

Backup and Recovery: The operating system provides mechanisms for backing up data and recovering it in case of system failures, errors, or disasters.

Virtualization: The operating system provides virtualization capabilities that allow multiple operating systems or applications to run on a single physical machine. This can enable efficient use of resources and flexibility in managing workloads.

Performance Monitoring: The operating system provides tools for monitoring and optimizing system performance, including identifying bottlenecks, optimizing resource usage, and analyzing system logs and metrics.

Time-Sharing: The operating system enables multiple users to share a computer system and its resources simultaneously by providing time-sharing mechanisms that allocate resources fairly and efficiently.
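
A toy round-robin simulation of time-sharing (a sketch only: each process receives a fixed time quantum in turn until its CPU burst is done):

    from collections import deque

    def round_robin(burst_times, quantum=2):
        """Give each process up to `quantum` time units, then rotate to
        the next, until every process finishes its CPU burst."""
        ready = deque(burst_times.items())
        clock = 0
        while ready:
            name, remaining = ready.popleft()
            run = min(quantum, remaining)
            clock += run
            if remaining > run:
                ready.append((name, remaining - run))  # back of the queue
            else:
                print(f"{name} finished at t={clock}")

    round_robin({"P1": 5, "P2": 3, "P3": 1})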

System Calls: The operating system provides a set of system calls that enable applications to interact with the operating system and access its resources. System calls provide a standardized interface between applications and the operating system, enabling portability and compatibility across different hardware and software platforms.
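
In Python, the os module exposes thin wrappers that map almost directly onto these system calls; a small sketch:

    import os

    pid = os.getpid()                                     # getpid()
    fd = os.open("demo.txt", os.O_WRONLY | os.O_CREAT)    # open()
    os.write(fd, f"written by process {pid}\n".encode())  # write()
    os.close(fd)                                          # close()
    print(os.stat("demo.txt").st_size, "bytes")           # stat()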

Error-detecting Aids: The operating system provides error messages and other debugging and error-detecting methods.


Types of Operating Systems

Batch Operating System: A Batch Operating System is a type of operating system that does not interact with the computer directly. There is an operator who takes similar jobs having the same requirements and groups them into batches.

Time-sharing Operating System: Time-sharing Operating System is a type of operating system that allows many users to share computer resources (maximum utilization of the resources).

Distributed Operating System: A Distributed Operating System manages a group of different computers and makes them appear to be a single computer. These operating systems are designed to operate on a network of computers. They allow multiple users to access shared resources and communicate with each other over the network. Examples include Microsoft Windows Server and various distributions of Linux designed for servers.

Network Operating System: Network Operating System is a type of operating system that runs on a server and provides the capability to manage data, users, groups, security, applications, and other networking functions.

Real-time Operating System: Real-time Operating System is a type of operating system that serves a real-time system and the time interval required to process and respond to inputs is very small. These operating systems are designed to respond to events in real time. They are used in applications that require quick and deterministic responses, such as embedded systems, industrial control systems, and robotics.

i. Hard Real-Time Systems

In hard real-time systems, failing to meet a deadline is never acceptable. The consequence of missing a deadline may be disastrous, even failing the system or causing loss of life.

Examples: Aircraft flight navigation and control, and medical devices like pacemakers.

It is required that strict guarantees be established so that the jobs will definitely be executed on time. Failure can be very disastrous.

ii. Soft Real-Time Systems

In this system type, occasional missed deadlines may not result in catastrophic failure but can degrade the performance or usability of the system.

Examples: Video streaming (occasional delays or buffering are acceptable) and online transaction systems (like bank ATMs, where slight delays are tolerable).

The system may still be functionally operational and workable even if some deadlines are missed, though performance is adversely affected. The focus is on efficiency and the reduction of delays rather than absolute precision.

iii. Firm Real-Time Systems

Between hard and soft real-time systems sit firm real-time systems. Here a missed deadline does not crash the system, but the result of the task becomes worthless once the deadline passes. The penalty is minimal, but the value of completing the task drops significantly if it is not finished in time.

Multiprocessing Operating System: Multiprocessing operating systems use multiple CPUs within a single computer system to boost performance. The CPUs are linked together so that a job can be divided among them and executed more quickly.
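
Dividing a job across CPUs can be sketched with Python's multiprocessing module, which starts one worker process per processor core by default:

    from multiprocessing import Pool

    def square(n):
        return n * n

    if __name__ == "__main__":   # guard required for safe process spawning
        with Pool() as pool:     # one worker per CPU core by default
            results = pool.map(square, range(10))  # work split across workers
        print(results)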

Single-User Operating Systems: Single-User Operating Systems are designed to support a single user at a time. Examples include Microsoft Windows for personal computers and Apple macOS.

Multi-User Operating Systems: Multi-User Operating Systems are designed to support multiple users simultaneously. Examples include Linux and Unix.

Embedded Operating Systems: Embedded Operating Systems are designed to run on devices with limited resources, such as smartphones, wearable devices, and household appliances. Examples include Google’s Android and Apple’s iOS.

Cluster Operating Systems: Cluster Operating Systems are designed to run on a group of computers, or a cluster, to work together as a single system. They are used for high-performance computing and for applications that require high availability and reliability. Examples include Rocks Cluster Distribution and OpenMPI.


Advantages of Batch OS

Efficiency is increased: a resident monitor eliminates CPU idle time between jobs and smooths the switching of tasks.

Jobs are processed automatically, so users do not need to intervene while tasks are running or executing.

Resources such as the CPU and memory are kept utilized rather than left idle.

It suits workloads such as payroll, processing jobs in batches without wasted time or manual effort.

It can handle even very large, complex jobs without interruption, even when a task is extremely intensive.

Grouping jobs into batches removes most of the manual setup that had to be done between individual tasks, thereby saving time.

Errors are logged and dealt with once the batch is over, which allows the system to run without interruption.

Disadvantages of Batch OS

1. Starvation

Batch processing suffers from starvation.

2. Not Interactive

Batch Processing is not suitable for jobs that are dependent on the user's input. If a job requires the input of two numbers from the console, then it will never get it in the batch processing scenario since the user is not present at the time of execution.

3. Delayed Output

Since the jobs are submitted in batches, the output is not produced in time. Such a condition can be rather inconvenient for time-critical jobs.

4. Difficult to Debug

An error is found only after the entire batch has been processed, which makes it even harder to locate and fix an issue in real-time.

5. It Requires Knowledge of Job Scheduling

The users or the system administrator should know well about the behavior of the system as well as dependencies among tasks.

6. Large Jobs Cause Delays

If a batch contains a large job, then problems may occur because the processing of all the subsequent jobs is delayed. This, therefore, slows down the overall system performance.


Advantages of Multiprogramming OS

  • Throughput increases, as the CPU always has a program to execute.
  • Response time can also be reduced.
  • Multiprogramming maximizes the utilization of resources like memory, I/O devices, and processing power since more than one program can be kept alive at any time.
  • Since several jobs are being processed in parallel, significantly more tasks could be completed within a certain amount of time, thus enhancing the overall throughput of the system.
  • During times when a program is waiting for I/O operations, the processor does not go idle since it jumps on to another task to continue processing.
  • The system can support both short and long tasks to be executed in parallel, which makes for a more dynamic and productive processing environment.

Disadvantages of Multiprogramming OS

  • Multiprogramming systems use various system resources efficiently, but they do not provide user interaction with the computer system.
  • Multiple programs increase system complexity, as the operating system must manage many processes, memory, and scheduling.
  • More memory is required than in simpler operating systems, because multiple programs reside in memory simultaneously.
  • The operating system must switch between running processes continuously, which introduces scheduling overhead and reduces performance.
  • When several concurrent operations access shared resources at once, the system can experience deadlocks: two or more processes each waiting indefinitely for the other to release a resource.
  • Resource contention, as processes compete for scarce resources, can degrade performance.


Advantages of Multiprocessing Operating System:

  • Increased Reliability: with multiple processors present, if one fails, the others can take over its work, keeping the system running.
  • Increased Throughput: multiple processors can process more jobs at the same time than a single processor, increasing the speed of execution.
  • Efficient Resource Utilization: resources such as the CPU, memory, and I/O devices are used more productively.
  • Parallelism: many processes can run in parallel, which greatly increases execution speed (see the sketch after this list).
  • Scalability: as the workload grows, more processors can be added to improve performance.
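
As a rough illustration of parallelism across processors, the sketch below splits a computation over CPU cores using Python's standard multiprocessing module; the workload and chunk sizes are illustrative.

    import multiprocessing as mp

    def partial_sum(bounds):
        lo, hi = bounds
        return sum(i * i for i in range(lo, hi))

    if __name__ == "__main__":
        n = 4_000_000
        workers = mp.cpu_count()
        step = n // workers
        # Split the range into one chunk per processor core.
        chunks = [(i * step, n if i == workers - 1 else (i + 1) * step)
                  for i in range(workers)]
        with mp.Pool(workers) as pool:
            # Each chunk is computed in a separate process, in parallel.
            total = sum(pool.map(partial_sum, chunks))
        print(total)

Adding cores lets the same code handle a larger n in roughly the same time, which mirrors the scalability point above.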

Disadvantages of Multiprocessing Operating System:

  • Complexity: managing several processors and distributing tasks properly makes the system more complex.
  • Increased Cost: the extra hardware in multiprocessing systems raises their cost.
  • Communication Overhead: communication among processors introduces overhead and slightly reduces efficiency.
  • Software Compatibility Problems: much software is not designed to take advantage of multiple processors.


Advantages of the Multitasking Operating System:

  • The system can handle multiple users or tasks at once, so it is well suited to multi-user environments.
  • Memory is allocated dynamically and efficiently across tasks, making optimal use of system resources.
  • More applications can run at the same time, which increases productivity because tasks execute concurrently.
  • Rapid switching between tasks shortens response time for the user.
  • Because the system keeps running by switching between tasks, CPU time and other system resources are consumed more effectively.

Disadvantages of Multitasking Operating System:

  • A multitasking environment keeps the processor busier; running many tasks at once makes the CPU generate more heat.
  • Managing many tasks together requires more sophisticated scheduling algorithms, which are complex to administer.
  • Running too many applications at the same time can stress the system to the point where performance degrades.
  • Multiple tasks compete for the same resources, which can delay all of them.
  • Multitasking systems usually require more powerful hardware, especially memory and processing power, to run smoothly.


Advantages of Network Operating Systems

  • Since network applications are split between clients and servers, the total amount of communication on the network is reduced, improving performance.
  • Configuring and maintaining a NOS is less expensive than more elaborate systems, since shared resources reduce the duplication needed.
  • A NOS grants centralized control over data, security, and resource management, so administrators can easily manage a large network.
  • The system scales easily as the organization grows: new clients or servers can be added without reconfiguring the system.
  • Resources such as printers, files, and applications are shared, reducing hardware and software redundancy.

Disadvantages of Network Operating System

  • If a central node or server fails, the whole system is affected and network functions are interrupted; reliability is therefore critical.
  • Security must be robust so that unauthorized access is blocked, and complex security measures demand constant monitoring and updating.
  • Skilled network administrators are needed to handle system performance, security configuration, and troubleshooting.
  • When the network grows large and traffic is heavy, performance degrades over time unless the system is constantly monitored and maintained.
  • Centralization is also a risk: an attacker who gains access to one server is close to reaching multiple resources across the entire network.


Advantages of Real-Time Operating System:

  • Real-time applications are quite easy to design, develop, and implement under a real-time operating system.
  • Maximum utilization of devices and systems in a Real-Time Operating System.
  • Fast response to events.
  • High reliability in the performance of time-critical operations.
  • Strict scheduling ensures the predictable execution of tasks.

Disadvantages of Real-Time Operating System:

  • Real-time operating systems are very expensive to design.
  • Real-time operating systems are very resource-intensive and consume critical CPU cycles.
  • Less multitasking support.
  • Lacks adaptability to new functions.
  • At times, it demands specific hardware.


Advantages of Time-Sharing Operating System

  • The time-sharing operating system facilitates effective utilization and sharing of resources.
  • This system helps decrease CPU idle time and response time.
  • It allows various users to access and interact with their programs at the same time, leading to greater productivity.
  • Time-sharing assures better memory management because it swaps programs into and out of main memory efficiently.
  • An interactive computing environment provides users with real-time access to their programs and files.

Disadvantages of Time-Sharing Operating System

  • It demands high data transmission rates compared with other systems.
  • The integrity and security of user programs and data loaded in memory must be ensured, since many users access the system concurrently.
  • Implementing and managing time-sharing systems is more complicated than other systems because of the constant task scheduling and memory management involved.
  • As more and more users connect to the system, performance degrades due to resource contention.
  • Frequent context switches between tasks incur overhead, impacting the overall efficiency of the system.


Advantages of Distributed Operating System

  • The distributed operating system offers resource sharing.
  • This is a fault-tolerant system.
  • Scalability is achieved easily: new nodes can be added to the system.
  • Distributing task execution across nodes improves performance.
  • Parallel processing helps in increasing the speed of job execution and enhances efficiency in getting results.

Disadvantages of Distributed Operating System

  • Protocol overhead can dominate computation costs.
  • Managing the system is complex because it runs across distributed nodes.
  • Enforcing security across multiple nodes can be difficult.
  • The system depends heavily on network stability to run smoothly.


Examples of Operating Systems 

  1. Windows (GUI-based, PC)
  2. GNU/Linux (Personal, Workstations, ISP, File, and print server, Three-tier client/Server)
  3. macOS (Macintosh), used for Apple’s personal computers and workstations (MacBook, iMac).
  4. Android (Google’s Operating System for smartphones/tablets/smartwatches)
  5. iOS (Apple’s OS for iPhone, iPad, and iPod Touch)

Tuesday, December 17, 2024

Database Management System

 DBMS stands for Database Management System, which is a software system that helps users create, manage, and manipulate databases. It acts as an interface between users and databases, allowing users to access, organize, and manipulate data. DBMSs are essential tools for organizations of all sizes, as they help store, organize, and retrieve large amounts of data quickly and efficiently. 

Here are some features of DBMSs: 

  • Data integrity: DBMSs ensure that data is consistently organized and remains easily accessible.
  • Security: DBMSs control access to databases and protect the data they hold.
  • Concurrency: DBMSs coordinate simultaneous access to data by multiple users.
  • Backup and recovery: DBMSs provide backup and recovery functions.
  • Data descriptions: DBMSs maintain metadata that describes the structure of the data.

Some types of DBMSs include: 

  • Centralized systems: All data is stored in a single location.
  • Distributed systems: Data is stored in various nodes.
  • Federated systems: Data can be provisioned without duplicating source data.
  • Blockchain database systems: Manage transactions, financial and otherwise.

Some examples of DBMSs include: 

  • Cloud-based database management systems
  • In-memory database management systems (IMDBMS)
  • Columnar database management systems (CDBMS)
  • NoSQL database management systems


A database management system (DBMS) is a software system for creating and managing databases. A DBMS enables end users to create, protect, read, update and delete data in a database. It also manages security, data integrity and concurrency for databases.


What does a DBMS do?

A DBMS manages three foundational elements: the data itself; the database engine, which allows data to be accessed, locked, and modified; and the database schema, which defines the database's logical structure. Together, these provide concurrency, security, data integrity, and uniform data administration procedures.


The following are common functions that a DBMS performs:

Administration tasks. A DBMS supports many typical database administration tasks, including change management, performance monitoring and tuning, security, and backup and recovery. Most database management systems are also responsible for automated rollbacks and restarts as well as logging and auditing of activity in databases and the applications that access them.

Storage. A DBMS provides efficient data storage and retrieval by ensuring that data is stored in tables, rows and columns.

Concurrency control. In environments where multiple users access and modify the database simultaneously, a DBMS guarantees controlled transaction execution to prevent data corruption or inconsistency.

Centralized view. A DBMS provides a centralized view of data that multiple users can access from multiple locations in a controlled manner. A DBMS can limit what data end users see and how they view the data, providing many views of a single database schema. End users and software programs are free from having to understand where the data is physically located or on what type of storage medium it resides because the DBMS handles all requests.
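
For instance, a view can expose only selected columns of a table to end users. A minimal sketch using Python's built-in sqlite3 module (the table, columns, and data are illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary REAL)")
    conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000), (2, 'Ben', 61000)")

    # The view hides the salary column from end users.
    conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")
    print(conn.execute("SELECT * FROM employee_public").fetchall())
    # [(1, 'Asha'), (2, 'Ben')]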

Data manipulation. A DBMS ensures data integrity and consistency by letting users insert, update, delete and modify data inside a database.

Data independence. A DBMS offers both logical and physical data independence to protect users and applications from having to know where data is stored or from being concerned about changes to the physical structure of data. As long as programs use the application programming interface (API) for the database that the DBMS provides, developers won't have to modify programs just because changes have been made to the database.

Backup and recovery. A DBMS facilitates backup and recovery options by creating backup copies so that data can be restored to a consistent state. This protects against data loss due to hardware failures, software errors or other unforeseen events.

In a relational database management system (RDBMS) -- the most widely used type of DBMS -- the API is structured query language (SQL), a standard programming language for defining, protecting and accessing data.

What are the components of a DBMS?

A DBMS is a sophisticated piece of system software consisting of multiple integrated components that together deliver a consistent, managed environment for creating, accessing and modifying data in databases.


DBMS Architecture

The design of a DBMS depends upon its architecture. A basic client/server architecture is used to deal with a large number of PCs, web servers, database servers and other components connected over networks.

The client/server architecture consists of many PCs and workstations connected via a network.

DBMS architecture depends upon how users connect to the database to get their requests served.

Types of DBMS Architecture


Database architecture can be seen as single-tier or multi-tier. Logically, multi-tier database architecture is of two types: 2-tier architecture and 3-tier architecture.


1-Tier Architecture

In this architecture, the database is directly available to the user: the user sits directly at the DBMS and uses it.

Any changes made here are applied to the database itself. This architecture does not provide a handy tool for end users.

The 1-tier architecture is used for developing local applications, where programmers communicate directly with the database for quick responses.

2-Tier Architecture

The 2-tier architecture is the same as a basic client/server model. In the two-tier architecture, applications on the client end communicate directly with the database on the server side. For this interaction, APIs such as ODBC and JDBC are used.

The user interfaces and application programs run on the client side.

The server side is responsible for providing functionality such as query processing and transaction management.

To communicate with the DBMS, the client-side application establishes a connection with the server side, as the sketch below illustrates.
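
The client-side interaction has roughly the shape of the Python DB-API sketch below. Here the embedded sqlite3 module stands in for the server; with a genuine client/server driver, the connect() call would name a remote host rather than a local database, but the pattern is the same.

    import sqlite3

    # With a client/server DBMS, connect() would address a remote server.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)")
    cur.execute("INSERT INTO account (owner) VALUES (?)", ("Asha",))
    conn.commit()    # query processing and transaction management happen server-side
    print(cur.execute("SELECT id, owner FROM account").fetchall())
    conn.close()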


3-Tier Architecture

The 3-tier architecture contains another layer between the client and the server. In this architecture, the client cannot communicate directly with the server.

The application on the client end interacts with an application server, which in turn communicates with the database system.

The end user is unaware of the database behind the application server, and the database is unaware of any user beyond the application server.

The 3-tier architecture is used for large web applications.

Data Models

A data model describes the data, its semantics, and the consistency constraints that apply to it. It provides the conceptual tools for describing the design of a database at each level of data abstraction. The following four data models are used for understanding the structure of a database:



1) Relational Data Model: This model arranges data in rows and columns within tables. A relational model thus uses tables to represent data and the relationships between them; tables are also called relations. The model was first described by Edgar F. Codd in 1969 and is the most widely used data model, primarily in commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is a logical representation of data as objects and the relationships among them. The objects are known as entities, and a relationship is an association among entities. The model was designed by Peter Chen and published in a 1976 paper, and it is widely used in database design. A set of attributes describes each entity; for example, student_name and student_id describe the 'student' entity. A set of entities of the same type is known as an 'entity set', and a set of relationships of the same type is known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions, encapsulation, and object identity. It supports a rich type system that includes structured and collection types. In the 1980s, various database systems following this object-oriented approach were developed. Here, objects are data items carrying their own properties.

4) Semistructured Data Model: This data model differs from the other three above: it allows individual data items of the same type to have different sets of attributes. The Extensible Markup Language (XML) is widely used for representing semistructured data. Although XML was initially designed for adding markup information to text documents, it gained importance because of its use in data exchange.


Database Languages in DBMS

A DBMS has appropriate languages and interfaces to express database queries and updates.

Database languages can be used to read, store and update the data in the database.

Types of Database Languages




1. Data Definition Language (DDL)

DDL stands for Data Definition Language. It is used to define database structure or pattern.

It is used to create schema, tables, indexes, constraints, etc. in the database.

Using the DDL statements, you can create the skeleton of the database.

The data definition language also records metadata, such as the number of tables and schemas, their names, indexes, the columns in each table, and constraints.

Here are some tasks that come under DDL:

  1. Create: It is used to create objects in the database.
  2. Alter: It is used to alter the structure of the database.
  3. Drop: It is used to delete objects from the database.
  4. Truncate: It is used to remove all records from a table.
  5. Rename: It is used to rename an object.
  6. Comment: It is used to comment on the data dictionary.

These commands update the database schema, which is why they are classed as data definition language.
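
A minimal sketch of DDL statements, run here through Python's sqlite3 module. The names are illustrative; note that SQLite has no TRUNCATE or COMMENT statement, and renaming is spelled ALTER TABLE ... RENAME.

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # CREATE: build the skeleton of a table.
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

    # ALTER: change the structure of an existing object.
    conn.execute("ALTER TABLE student ADD COLUMN grade TEXT")

    # RENAME: rename an object.
    conn.execute("ALTER TABLE student RENAME TO pupil")

    # DROP: delete the object from the database.
    conn.execute("DROP TABLE pupil")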


2. Data Manipulation Language (DML)

DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a database. It handles user requests.

Here are some tasks that come under DML:

  1. Select: It is used to retrieve data from a database.
  2. Insert: It is used to insert data into a table.
  3. Update: It is used to update existing data within a table.
  4. Delete: It is used to delete records from a table.
  5. Merge: It performs an UPSERT operation, i.e., insert or update.
  6. Call: It is used to call a stored PL/SQL or Java subprogram.
  7. Explain Plan: It shows the execution plan the database will use for a statement.
  8. Lock Table: It controls concurrency.
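
A minimal sketch of the core DML statements, executed through Python's sqlite3 module (the table and data are illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")

    # INSERT: add rows to the table.
    conn.executemany("INSERT INTO student (name, grade) VALUES (?, ?)",
                     [("Asha", "B"), ("Ben", "C")])

    # UPDATE: modify existing rows.
    conn.execute("UPDATE student SET grade = 'A' WHERE name = 'Asha'")

    # DELETE: remove rows that match a condition.
    conn.execute("DELETE FROM student WHERE name = 'Ben'")

    # SELECT: retrieve data.
    print(conn.execute("SELECT id, name, grade FROM student").fetchall())
    # [(1, 'Asha', 'A')]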

3. Data Control Language (DCL)

DCL stands for Data Control Language. It is used to control access to the stored data through privileges.

DCL execution is transactional, and it supports rollback parameters.

(In the Oracle database, however, the execution of data control language cannot be rolled back.)

Here are some tasks that come under DCL:

  1. Grant: It is used to give user access privileges to a database.
  2. Revoke: It is used to take back permissions from the user.

4. Transaction Control Language (TCL)

TCL stands for Transaction Control Language. It is used to make permanent or undo the changes made by DML statements; DML statements can be grouped together into a logical transaction.

Here are some tasks that come under TCL:

  1. Commit: It is used to save a transaction permanently in the database.
  2. Rollback: It is used to undo uncommitted changes, restoring the database to its state at the last Commit.
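
A minimal sketch of Commit and Rollback through Python's sqlite3 module (the table and values are illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
    conn.execute("INSERT INTO account (balance) VALUES (100.0)")
    conn.commit()      # COMMIT: the inserted row is now permanent

    conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
    conn.rollback()    # ROLLBACK: undo everything since the last COMMIT

    print(conn.execute("SELECT balance FROM account").fetchone())   # (100.0,)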


ACID Properties in DBMS

A DBMS must keep data intact when changes are made to it, because if the integrity of the data is compromised, the data as a whole becomes unreliable and corrupt. To maintain the integrity of the data, the database management system defines four properties, known as the ACID properties. The ACID properties apply to transactions, which pass through a group of tasks; that is where the ACID properties come into play.

ACID Properties

The term ACID stands for Atomicity, Consistency, Isolation, and Durability:

1) Atomicity

Atomicity means a transaction is treated as a single, indivisible unit: either every operation in it executes completely, or none of it executes at all. An operation must not break off in between or take partial effect; in a transaction, operations execute completely or not at all.
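
A small sketch of atomicity in practice, using Python's sqlite3 module: a money transfer must either debit one account and credit the other, or do neither. The accounts and the simulated failure are illustrative.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance REAL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("alice", 100.0), ("bob", 50.0)])
    conn.commit()

    try:
        conn.execute("UPDATE account SET balance = balance - 70 WHERE name = 'alice'")
        raise RuntimeError("simulated crash between debit and credit")
        conn.execute("UPDATE account SET balance = balance + 70 WHERE name = 'bob'")
        conn.commit()
    except RuntimeError:
        conn.rollback()    # the partial debit is undone: all or nothing

    print(conn.execute("SELECT * FROM account ORDER BY name").fetchall())
    # [('alice', 100.0), ('bob', 50.0)] -- no half-finished transfer remains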

2) Consistency

Consistency means that the database's integrity is preserved: a change must take the data from one valid state to another. In the case of transactions, the integrity of the data is essential, so the database must be consistent both before and after the transaction. The data should always be correct.

3) Isolation

The term 'isolation' means separation. In a DBMS, isolation is the property that concurrently running operations do not affect one another: if two operations run at the same time, each should behave as if the other were not running. When two or more transactions occur simultaneously, consistency must be maintained, and changes made by one transaction are not visible to other transactions until the change is committed.

4) Durability

Durability ensures permanency: once an operation has executed successfully, its data becomes permanent in the database. Durability must hold even if the system fails or crashes; the database should still survive. If data is lost, the recovery manager is responsible for restoring the database and ensuring its durability. To commit values, the COMMIT command is used each time changes are made.


The ACID properties of a DBMS therefore play a vital role in maintaining the consistency and availability of data in the database.

Thursday, December 12, 2024

Data Analytics

 Data analytics is the process of collecting, organizing, and transforming data to draw conclusions, make predictions, and inform decision-making. It's a multidisciplinary field that uses a variety of techniques, including math, statistics, and computer science. 

Data analytics can include: 

Data analysis: Examining data to extract useful information 

Data science: A part of data analytics 

Data engineering: A part of data analytics 

Predictive analytics: Using historical data to predict future actions 

Prescriptive analytics: Using insights from predictive analytics to recommend actions and make data-driven decisions 

Descriptive analysis: Using historical data to review and understand what has occurred in the past 

Data aggregation: Gathering data and presenting it in a summarized format 

Data mining: Discovering patterns in data 



Data analytics, also known as data analysis, is a crucial component of modern business operations. It involves examining datasets to uncover useful information that can be used to make informed decisions. This process is used across industries to optimize performance, improve decision-making, and gain a competitive edge.

Data Analytics is a systematic approach that transforms raw data into valuable insights. This process encompasses a suite of technologies and tools that facilitate data collection, cleaning, transformation, and modelling, ultimately yielding actionable information. This information serves as a robust support system for decision-making. Data analysis plays a pivotal role in business growth and performance optimization. It aids in enhancing decision-making processes, bolstering risk management strategies, and enriching customer experiences. By presenting statistical summaries, data analytics provides a concise overview of quantitative data.

While data analytics finds extensive application in the finance industry, its utility is not confined to this sector alone. It is also leveraged in diverse fields such as agriculture, banking, retail, and government, among others, underscoring its universal relevance and impact. 


Process of Data Analytics

Data analysts, data scientists, and data engineers together create data pipelines, which help set up the model and support further analysis. Data analytics can be carried out in the following steps:


  • Data Collection: The first step, in which raw data is collected for analysis. It can proceed in two ways: if the data comes from different source systems, analysts combine it using data integration routines; if the data needed is a subset of an existing data set, analysts extract the useful subset and transfer it to a separate area of the system.
  • Data Cleansing: After collection, the next step is to improve the quality of the data, since collected data carries many quality problems, such as errors, duplicate entries, and white space, which must be corrected before moving on. These errors are corrected by running data profiling and data cleansing tasks, and the data is then organized to fit the analytical model (see the sketch after this list).
  • Data Analysis and Data Interpretation: Analytical models are created with software and tools that interpret the data; the tools include Python, Excel, R, Scala, and SQL. The model is tested repeatedly until it works as required, and then the data set is run against the model in production mode.
  • Data Visualisation: Data visualisation is the process of creating visual representations of data using plots, charts, and graphs, which helps analysts spot patterns and trends and extract valuable insights. By comparing and analysing datasets, analysts separate useful information from the raw data.
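
As an illustration of the cleansing step, the sketch below trims stray white space and drops duplicate entries using plain Python; the records and field names are made up for the example.

    raw_rows = [
        {"name": "  Asha ", "city": "Pune"},
        {"name": "Ben",     "city": "Leeds"},
        {"name": "Asha",    "city": "Pune"},   # duplicate once trimmed
    ]

    cleaned, seen = [], set()
    for row in raw_rows:
        row = {k: v.strip() for k, v in row.items()}   # remove stray white space
        key = tuple(sorted(row.items()))
        if key not in seen:                            # drop duplicate entries
            seen.add(key)
            cleaned.append(row)

    print(cleaned)   # two unique, trimmed rows remain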


Types of Data Analytics

There are different types of data analytics through which raw data is converted into valuable insights. Some of them are mentioned below:


  1. Descriptive Data Analytics: This type summarises a data set and is used to compare past results, distinguish strengths from weaknesses, and identify anomalies. Companies use descriptive analytics to identify problems in a data set, as it helps to reveal patterns.
  2. Real-time Data Analytics: This type does not use data from past events; it analyses data as soon as it enters the database. Companies use it to identify trends and track competitors' operations.
  3. Diagnostic Data Analytics: This type uses past data sets to analyse the cause of an anomaly. Techniques used in diagnostic analysis include correlation analysis, regression analysis, and analysis of variance. The results help companies find accurate solutions to problems.
  4. Predictive Data Analytics: This type is applied to current data to predict future outcomes. It uses machine learning algorithms and statistical modelling techniques to build predictive models that identify trends and patterns. Predictive analytics is also used in sales forecasting, risk estimation, and predicting customer behaviour.
  5. Prescriptive Data Analytics: This type is concerned with selecting the best solution to a problem. It is used in loan approval, pricing models, machine-repair scheduling, decision analysis, and so on. Companies use prescriptive analytics to automate decision-making.


Methods of Data Analytics

There are two types of methods in data analysis which are mentioned below:


1. Qualitative Data Analytics

Qualitative data analytics does not use statistics; it derives insights from words, pictures, and symbols. Some common qualitative methods are:

  • Narrative analytics is used for working with data acquired from diaries, interviews, and similar sources.
  • Content analytics is used for analysing verbal data and behaviour.
  • Grounded theory is used to explain a given event by studying the data collected about it.

2. Quantitative Data Analysis

Quantitative data analytics collects data and processes it into numerical form. Some of the quantitative methods are mentioned below:

  • Hypothesis testing assesses a given hypothesis against the data set.
  • Sample size determination is the method of choosing a small sample from a large group of people, which is then analysed.
  • The average, or mean, of a set of values is obtained by dividing the sum of the numbers in the list by the number of items in the list (see the sketch after this list).
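
For example, the mean (and other descriptive statistics) can be computed with Python's standard statistics module; the sample values are illustrative.

    import statistics

    scores = [82, 74, 91, 67, 88]

    print(sum(scores) / len(scores))   # 80.4 -- sum of the list / number of items
    print(statistics.mean(scores))     # same result via the standard library
    print(statistics.stdev(scores))    # sample standard deviation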

Skills Required for Data Analytics

There are multiple skills which are required to be a Data analyst. Some of the main skills are mentioned below:

  • Programming languages: R and Python are the most commonly used.
  • SQL (Structured Query Language) is used to work with databases.
  • Machine learning is used in data analysis.
  • Probability and statistics are used to analyse and interpret data more effectively.
  • Data management is used for collecting and organising data.
  • Data visualisation is used to present findings with charts and graphs.

Importance and Usage of Data Analytics

Data analytics has many uses in the finance industry, and it is also used in agriculture, banking, retail, government, and so on. Some of its main benefits are mentioned below:


Data analytics identifies trends and patterns in data sets, helping a business reach its target audience, grow, and optimise its performance.

Data analysis shows where a business needs more resources, products, or money, and where customer engagement is falling short; identifying these problems and then working on them helps the business grow.

Data analysis also supports the marketing and advertising that make a business better known, bringing in more customers.

The valuable information extracted from raw data gives an organisation an advantage by illuminating present situations and predicting future outcomes.

Through data analytics a business can identify the right audience, their disposable income, and their spending habits, which helps the business set prices according to customers' interests and budgets.


AI chatbot

 An AI chatbot is a software application designed to simulate human conversation using artificial intelligence (AI). It can interact with us...