
Archive: 13/09/2023

Data Centres Water Requirement. From Cooling To Energy Consumption, Are They Sustainable?

A data centre is a dedicated space in a building that houses computer systems and related components such as storage and telecommunication systems. It comprises backup components and robust infrastructure for information exchange, power supply, security devices, and environmental control systems such as fire suppression and air conditioning.

How Does It Work?

A data centre consists of virtual or physical servers (or robust computer systems) connected externally and internally through communication and networking equipment to store digital information and transfer it. It contains several components to serve different purposes:

Networking: It refers to the interconnections between a data centre’s components and the outside world. It includes routers, application delivery controllers, firewalls, switches, etc.

Storage: An organization’s data is stored in data centres. The components for storage are tape drives, hard disk drives, solid-state drives (SSDs) with backups, etc.

Compute: It refers to the processing power and memory required to run applications, supplied by powerful computers.

Types of Data Centres

You can come across different types of data centres based on how they are owned, the technologies used, and their energy efficiency. Some of the main types of data centres that organizations use are:

Managed Data Centres

In a managed data centre, a third-party service provider offers computing, data storage, and other related services to organizations directly to help them run and manage their IT operations. The service provider deploys, monitors, and manages this data centre model, offering the features via a managed platform.

You can source the managed data centre services from a colocation facility, cloud data centres, or a fixed hosting site. A managed data centre can either be partially or fully managed. If it’s partially managed, the organization will have administration control over the data centre service and implementation. However, if it’s fully managed, all the back-end data and technical details are administered and controlled by the service provider.

Suitable for: The ideal users of managed data centres are medium to large businesses.

Benefits: You do not have to deal with regular maintenance, security, and other aspects. The data centre provider is responsible for maintaining network services and components, upgrading system-level programs and operating systems, and restoring service if anything goes wrong.

Enterprise Data Centres

An enterprise data centre refers to a private facility that supports the IT operations of a single organization. It can be situated off-premises or on-premises, depending on what suits the organization. This type of data centre may consist of multiple data centres located at different global locations to support an organization’s key functions.

For example, if a business has customers from different global regions, they can set up data centres closer to their customers to enable faster service.

Enterprise data centres can have sub-data centres, such as:

Intranet controls data and applications within the main enterprise data centre. Enterprise uses the data for their research & development, marketing, manufacturing, and other functions.

Extranet performs business-to-business transactions inside the data centre network. The company accesses the services through VPNs or private WANs.

The internet data centre is used to support servers and devices needed to run web applications.

Suitable for: As the name suggests, enterprise data centres are ideal for enterprises with a global presence and distinct network requirements. This is because they have enough revenue to support their data centres at multiple locations.

Benefits: It’s beneficial for businesses as it allows them to track critical parameters like power and bandwidth utilization and helps update their applications and systems. It also helps the companies understand their needs more and scale their capacities accordingly.

However, building enterprise data centre facilities requires heavy investment, ongoing maintenance, time, and effort.

Colocation Data Centres

A colocation data centre or “colo” is a facility that a business can rent from a data centre owner to enable IT operations to support applications, servers, and devices. It is becoming increasingly popular these days, especially for organizations that don’t have enough resources to build and manage a data centre of their own but still need it anyway. In a colo, you may use features and infrastructure such as building, security, bandwidth, equipment, and cooling systems. It helps connect network devices to different network and telecommunication service providers. The popularity of colocation facilities grew around the 2000s when organizations wanted to outsource some operations but with certain controls. Even if you rent some space from a data centre provider, your employees can still work within that space and even connect with other company servers.

Suitable for: Colocation data centres are suitable for medium to large businesses.

Benefits: There are several benefits you can gain from a colocation facility, such as:

Scalability to support your business growth; you can add or remove servers and devices easily without hassles.

You will have the option to host the data centre at different global locations closest to your customers to offer the best experience.

Colocation data centres offer high reliability with powerful servers, computing power, and redundancy.

It also saves you money as you don’t have to build a large data centre from scratch at multiple locations. You can just rent space based on your budget and present needs.

You don’t need to handle the data centre maintenance such as device installation, updates, power management, and other processes.

Cloud Data Centres

One of the most popular types of data centre these days is the cloud data centre. In this type, a cloud service provider runs and manages the data centre to support business applications and systems. It’s like a virtual data centre with even more benefits than colocation data centres.

Popular cloud service providers include Amazon AWS, Google, Microsoft Azure, Salesforce, etc. When data is uploaded to cloud servers, the cloud service provider duplicates and fragments it across multiple locations to guard against loss. Providers also back up your data, so you don’t lose it even if something goes wrong.

Now, cloud data centres can be of two types – public and private.

Public cloud providers like AWS and Azure offer resources through the internet to the public. Private cloud service providers offer customized cloud services. They give you singular access to private clouds (their cloud environment). Example: Salesforce CRM.

Suitable for: Cloud data centres are ideal for almost any organization of any type or scale.

Benefits: There are many benefits of using cloud data centres compared to physical or on-premises data centres, including:

It’s cost-effective as you don’t have to invest heavily in building a data centre from scratch. You just have to pay for the service you utilize and as long as you need it.

You are free from maintenance requirements. They will take care of everything, from installing systems, upgrading software, and maintaining security to backups and cooling.

It offers a flexible pricing plan. You can go for a monthly subscription and be aware of your expenditure in an easier way.

Edge Data Centres

The most recent of all, edge data centres are still in the development stage. They are smaller data centre facilities situated closer to the customers an organization serves. It utilizes the concept of edge computing by bringing the computation closer to systems that generate data to enable faster operations. Edge data centres are characterized by connectivity and size, allowing companies to deliver services and content to their local users at a greater speed and with minimal latency. They are connected to a central, large data centre or other data centres. In the future, edge data centres can support autonomous vehicles and IoT to offer higher processing power and improve the consumer experience.

Suitable for: Small to medium-sized businesses

Benefits: The benefits of using an edge data centre are:

An edge data centre can distribute high traffic loads efficiently.

It can cache requested content and minimize the response time for a user request.

It can also help increase network reliability by distributing traffic loads efficiently.

The data centre offers a superb performance by placing computation closer to the source.

Hyperscale Data Centres

Hyperscale data centres are massive and house thousands of servers. They are designed to be highly scalable by adding more devices and equipment or increasing system power. The demand for hyperscale data centres is increasing with increasing data generation. Businesses now deal with an enormous amount of data, which continues to rise. Hence, to store and manage this sort of data, they need a giant data centre, and hyperscale seems to be the right choice for it.

Suitable for: Hyperscale data centres are best for large enterprises with massive amounts of data to store and manage.

Benefits: Initially, data centre providers designed hyperscale data centres for large public cloud service providers. Although these companies can build one themselves, renting a hyperscale data centre comes with a lot of benefits:

It offers more flexibility; companies can scale up or down based on their current needs without any difficulties.

Increased speed to market so they can delight their customers with the best services.

Freedom from maintenance needs, so they don’t waste time on repetitive work and can dedicate that time to innovation.

Other than these main types of data centres, you may come across others as well. Let’s have a quick look at them.

Carrier hotels are the main internet exchange points for the entire data traffic belonging to a specific area. Carrier hotels focus on more fibre and telecom providers compared to a common colo. They are usually located downtown with a mature fibre infrastructure. However, creating a dense fibre system like this takes a great deal of effort and time, which is why they are rare. For example, One Wilshire in Los Angeles has 200+ carriers in the building to supply connectivity to the entire traffic coming from the US West Coast.

Micro data centre: It’s a condensed version of the edge data centre. It can be as small as an office room and handles the data processing in a specific location.

Traditional data centres: They consist of multiple servers in racks, performing different tasks. If you need more redundancy to manage your critical apps, you can add more servers to the rack. In this infrastructure, which dates from around the 1990s, the service provider acquires, deploys, and maintains each server.

Over time, they add more servers to provide more capabilities. The operating systems need to be monitored with monitoring tools, which requires a certain level of expertise. They also need patching and updating, and the updates must be verified for security. All of this requires heavy investment, with power and cooling costs on top.

Modular data centres: It’s a portable data centre, meaning you can deploy it at a place where you need data capacity. It contains modules and components offering scalability in addition to power and cooling capabilities. You can add modules, combine them with other modules or integrate them into a data centre.

Modular data centres can be of two types:

Containerized or portable: These data centres arrange equipment in a shipping container that is transported to a particular location. The container has its own cooling systems.

Another type of modular data centre arranges equipment or devices in a facility built from prefabricated components. These components are quick to assemble on site and can be added when more capacity is needed.

What Are the Data Centre Tiers?

Another way of classifying data centres based on uptime and reliability is by data centre tiers. The Uptime Institute developed it during the 1990s, and there are 4 data centre tiers. Let us understand them.

Tier 1: A tier one data centre has “basic capacity” and includes a UPS. It has fewer components for redundancy and backup and a single path for cooling and power. It also involves higher downtime and may lack energy efficiency systems. It offers a minimum of 99.671% uptime, which means 28.8 hours of downtime yearly.

Tier 2: A tier two data centre has “redundant capacity” and offers more components for redundancy and backup than tier 1. It also has a single path for cooling and power. Tier 2 facilities are generally private data centres, and they also lack energy efficiency. Tier 2 data centres can offer a minimum of 99.741% uptime, which means roughly 22 hours of downtime yearly.

Tier 3: A tier three data centre is “concurrently maintainable,” ensuring any component is safe to remove without impacting the process. It has different paths for cooling and power to help maintain and update the systems.

Tier 3 data centres have redundant systems to limit operational errors and equipment failure. They utilize UPS systems that supply power continuously to servers and backup generators. They offer a minimum of 99.982% uptime, which means 1.6 hours of downtime yearly, and N+1 redundancy, a higher level than tiers 1 and 2.

Tier 4: A tier four data centre is “fault-tolerant” and allows production capacity to be protected from any type of failure. It requires twice the number of components, equipment, and resources to maintain a continuous flow of service even during disruptions.

Organizations whose critical business operations cannot afford downtime use tier 4 data centres, which offer the highest level of redundancy, uptime, and reliability. A tier 4 data centre provides a minimum of 99.995% uptime, which means 0.4 hours of annual downtime, and 2N redundancy.
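The downtime figures quoted for each tier follow directly from the uptime percentages. As a minimal sketch, assuming a non-leap year of 8760 hours (the function name and the dictionary are illustrative, not part of any standard), the conversion can be written as:

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)

# Minimum uptime percentages as quoted in the tier descriptions above.
TIER_UPTIME_PERCENT = {
    "Tier 1": 99.671,
    "Tier 2": 99.741,
    "Tier 3": 99.982,
    "Tier 4": 99.995,
}


def annual_downtime_hours(uptime_percent: float) -> float:
    """Return the maximum yearly downtime implied by an uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


for tier, uptime in TIER_UPTIME_PERCENT.items():
    hours = annual_downtime_hours(uptime)
    print(f"{tier}: {hours:.1f} hours of downtime per year")
```

The same function can be used to check any uptime figure quoted in a service agreement.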

Data centre water use

Total water consumption in the USA in 2015 was 1218 billion litres per day, of which thermoelectric power used 503 billion litres, irrigation used 446 billion litres and 147 billion litres per day went to supply 87% of the US population with potable water. Data centres consume water across two main categories: indirectly through electricity generation (traditionally thermoelectric power) and directly through cooling. In 2014, a total of 626 billion litres of water use was attributable to US data centres. This is a small proportion in the context of such high national figures; however, data centres compete with other users for access to local resources. A medium-sized data centre (15 megawatts (MW)) uses as much water as three average-sized hospitals, or more than two 18-hole golf courses. Progress has been made with using recycled and non-potable water, but from the limited figures available some data centre operators are drawing more than half of their water from potable sources. This has been the source of considerable controversy in areas of water stress and highlights the importance of understanding how data centres use water.

Water use in data centre cooling.

ICT equipment generates heat and so most devices must have a mechanism to manage their temperature. Drawing cool air over hot metal transfers heat energy to that air, which is then pushed out into the environment. This works because the computer temperature is usually higher than the surrounding air. The same process occurs in data centres, just at a larger scale. ICT equipment is located within a room or hall, heat is ejected from the equipment via an exhaust and that air is then extracted, cooled and recirculated. Data centre rooms are designed to operate within temperature ranges of 20–22 °C, with a lower bound of 12 °C. As temperatures increase, equipment failure rates also increase, although not necessarily linearly.

There are several different mechanisms for data centre cooling, but the general approach involves chillers reducing air temperature by cooling water—typically to 7–10 °C—which is then used as a heat transfer mechanism. Some data centres use cooling towers where external air travels across a wet media so the water evaporates. Fans expel the hot, wet air and the cooled water is recirculated. Other data centres use adiabatic economisers where water sprayed directly into the air flow, or onto a heat exchange surface, cools the air entering the data centre. With both techniques, the evaporation results in water loss. A small 1 MW data centre using one of these types of traditional cooling can use around 25.5 million litres of water per year.
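To put the 25.5 million litres figure in perspective, it can be converted into litres per kilowatt-hour. The sketch below assumes the 1 MW load runs constantly for a full 8760-hour year, which is a simplification; the function and parameter names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def water_litres_per_kwh(annual_litres: float, load_mw: float) -> float:
    """Litres of cooling water per kWh, assuming a constant load all year."""
    annual_kwh = load_mw * 1000 * HOURS_PER_YEAR
    return annual_litres / annual_kwh


# Figures from the text: a 1 MW data centre using 25.5 million litres a year.
intensity = water_litres_per_kwh(annual_litres=25.5e6, load_mw=1.0)
print(f"{intensity:.2f} litres of water per kWh")
```

Under that assumption the result is a little under 3 litres of water for every kWh consumed.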

Cooling the water is the main source of energy consumption. Raising the chiller water temperature from the usual 7–10 °C to 18–20 °C can reduce expenses by 40% due to the reduced temperature difference between the water and the air. Costs depend on the seasonal ambient temperature of the data centre location. In cooler regions, less cooling is required, and instead free air cooling can draw in cold air from the external environment. This also means smaller chillers can be used, reducing capital expenditure by up to 30%. Both Google and Microsoft have built data centres without chillers, but this is difficult in hot regions.

Alternative water sources

Where data centres own and operate the entire facility, there is more flexibility for exploring alternative sources of water, and different techniques for keeping ICT equipment cool.

Google’s Hamina data centre in Finland has used sea water for cooling since it opened in 2011. Using existing pipes from when the facility was a paper mill, the cold sea water is pumped into heat exchangers within the data centre. The sea water is kept separate from the freshwater, which circulates within the heat exchangers. When expelled, the hot water is mixed with cold sea water before being returned to the sea.

Despite Amazon’s poor environmental efforts in comparison to Google and Microsoft, they are expanding their use of non-potable water. Data centre operators have a history of using drinking water for cooling, and most source their water from reservoirs because access to rainfall, grey water and surface water is seen as unreliable. Digital Realty, a large global data centre operator, is one of the few companies publishing a water source breakdown. Reducing the proportion of potable water is important because the processing and filtering requirements of drinking water increase the lifecycle energy footprint. The embodied energy in the manufacturing of any chemicals required for filtering must also be considered. This increases the overall carbon footprint of a data centre.

Amazon claims to be the first data centre operator approved for using recycled water for direct evaporative cooling. It has deployed this approach in its data centres in Northern Virginia and Oregon, and it also plans to retrofit facilities in Northern California. However, Digital Realty faced delays when working with a local utility in Los Angeles because it needed a new pipeline to pump recycled water to its data centres.

Microsoft’s Project Natick is a different attempt to tackle this challenge by submerging a sealed data centre under water. Tests concluded off the Orkney Islands in 2020 showed that 864 servers could run reliably for 2 years with cooling provided by the ambient sea temperature, and electricity from local renewable sources. The potential to make use of natural cooling is encouraging, however, the small scale of these systems could mean higher costs, making them appropriate only for certain high-value use cases.

ICT equipment is deployed in racks, aligned in rows, within a data centre room. Traditional cooling manages the temperature of the room as a whole, however, this is not as efficient as more targeted cooling. Moving from cooling the entire room to focused cooling of a row of servers, or even a specific rack, can achieve energy savings of up to 29%, and is the subject of a Google patent granted in 2012.

This is becoming necessary because of the increase in rack density. Microsoft is deploying new hardware such as the Nvidia DGX-2 Graphics Processing Unit that consumes 10 kW for machine learning workloads, and existing cooling techniques are proving insufficient. Using low-boiling-point liquids is more efficient than using ambient air cooling and past experiments have shown that a super-computing system can transfer 96% of excess heat to water, with 45% less heat transferred to the ambient air. Microsoft is now testing these techniques in its cloud data centres.

These projects show promise for the future, but there are still gains to be had from existing infrastructure. Google has used its AI expertise to reduce energy use from cooling by up to 40% through hourly adjustments to environmental controls based on predicted weather, internal temperatures and pressure within its existing data centres. Another idea is to co-locate data centres and desalination facilities so they can share energy-intensive operations. That most of the innovation is now led by the big three cloud providers demonstrates their scale advantage. By owning, managing and controlling the entire value chain from server design through to the location of the building, cloud vendors have been able to push data centre efficiency to levels impossible for more traditional operators to achieve.

However, only the largest providers build their own data centres, and they often work with other data centre operators in smaller regions. For example, as of the end of 2020, Google lists 21 data centres, publishes PUE for 17, but has over 100 points of presence (PoPs) around the world. These PoPs are used to provide services closer to its users, for example, to provide faster load times when streaming YouTube videos. Whilst Google owns the equipment deployed in the PoP, it does not have the same level of control as it does when it designs and builds its own data centres. Even so, Google has explored efficiency improvements such as optimising air venting, increasing temperature from 22 to 27 °C, deploying plastic curtains to establish cool aisles for more heat-sensitive equipment and improving the design of air conditioning return air flow. In a case study for one of its PoPs, this work was shown to reduce PUE from 2.4 to 1.7 and saved US$67,000 per year in energy for a cost of US$25,000.
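PUE (power usage effectiveness) is the ratio of total facility energy to the energy used by the IT equipment alone, so a lower value means less overhead. The sketch below works through the case-study figures; it assumes the IT load stays the same before and after the changes, and the function names are illustrative.

```python
def total_energy_reduction(pue_before: float, pue_after: float) -> float:
    """Fractional cut in total facility energy for an unchanged IT load."""
    return (pue_before - pue_after) / pue_before


def payback_months(cost: float, annual_saving: float) -> float:
    """Months needed for the yearly saving to repay the one-off cost."""
    return cost / annual_saving * 12


# Figures from the case study in the text.
reduction = total_energy_reduction(pue_before=2.4, pue_after=1.7)
months = payback_months(cost=25_000, annual_saving=67_000)

print(f"Total energy reduction: {reduction:.0%}")
print(f"Payback period: {months:.1f} months")
```

On these numbers the work pays for itself in well under half a year.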


Data Centre Types Explained in 5 Minutes or Less (geekflare.com)

Data centre water consumption | npj Clean Water (nature.com)

Drought-stricken communities push back against data centres (nbcnews.com)

Our commitment to climate-conscious data centre cooling (blog.google)

Water Usage Effectiveness For Data Centre Sustainability – AKCP

Vending Machine – Data Analysis – From study to action and how to improve performance

Data analysis of a vending machine can be very helpful: once the data has been transformed and visualized, the resulting information can improve logistics, prevent losses and improve performance.

Vending machines are installed in shopping malls, offices and stores. They can sell any item stocked inside. Each item is stored in a coil and can be bought at a fixed price.

Newer models collect useful data in CSV format, which can then be manipulated to reveal information such as customer profile, spending and preferences, and to discover correlations between two or more products that are sold together.

This study collects data from a single vending machine and analyses it to look for correlations between the items sold.

The data consists of a single file with 6445 rows and 16 columns. Each row corresponds to a single operation, from January to August. The most important columns for this study are:

  • Name
  • DateofSale: day, month, day number, year
  • Type of Food: Carbonated, Non-Carbonated, Food, Water
  • Type of Payment: credit card, cash
  • RCoil: coil number of the product
  • RPrice: price of the product in the coil
  • QtySold: quantity sold
  • TransTotal: total amount of the transaction. Normally one item is sold and paid for per transaction, but it can happen that more than one item is sold


Data is loaded as follows, removing unnecessary fields from the raw data:
After cleaning and transforming data, the following table shows the entire dataset consisting of 6445 rows and 10 columns.
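The loading and cleaning step described above might look like the following sketch. The sample rows, the header names and the dropped fields (`Status`, `DeviceID`) are hypothetical stand-ins; the real export has 6445 rows and 16 columns, and its exact headers may differ.

```python
import csv
import io

# Hypothetical sample standing in for the raw export (not the real data).
RAW_CSV = """Status,DeviceID,Name,DateofSale,Category,Payment,RCoil,RPrice,QtySold,TransTotal
Processed,VJ001,Cola,2022-01-03,Carbonated,Cash,140,1.50,1,1.50
Processed,VJ001,Crisps,2022-01-03,Food,Credit,112,1.00,2,2.00
Processed,VJ001,Spring Water,2022-01-04,Water,Credit,145,1.25,1,1.25
"""

# Columns kept for the study; everything else is dropped.
KEEP = ["Name", "DateofSale", "Category", "Payment",
        "RCoil", "RPrice", "QtySold", "TransTotal"]


def load_sales(source):
    """Read sales rows, keep the useful columns and convert numeric fields."""
    rows = []
    for raw in csv.DictReader(source):
        row = {column: raw[column] for column in KEEP}
        row["RCoil"] = int(row["RCoil"])
        row["RPrice"] = float(row["RPrice"])
        row["QtySold"] = int(row["QtySold"])
        row["TransTotal"] = float(row["TransTotal"])
        rows.append(row)
    return rows


sales = load_sales(io.StringIO(RAW_CSV))
print(len(sales), "rows,", len(sales[0]), "columns")
```

To read a real file, pass an open file object (for example `open(path, newline="")`) instead of the in-memory sample.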

The first step is a preliminary calculation to see which categories are present in the dataset.

We can see that the two most important categories are food and carbonated drinks, which account for 78% of total transactions over the 8 months of sampling. The following sections look more closely at the data.
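A category breakdown like the one described can be computed by counting transactions per category. This is only a sketch: the labels below are hypothetical and do not reproduce the real 78% figure.

```python
from collections import Counter

# Hypothetical category labels, one per transaction (not the real data).
categories = ["Food", "Carbonated", "Food", "Water", "Carbonated",
              "Non-Carbonated", "Food", "Carbonated", "Food", "Water"]


def category_shares(labels):
    """Return each category's share of transactions as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    # most_common() orders the categories from most to least frequent.
    return {name: 100.0 * count / total for name, count in counts.most_common()}


shares = category_shares(categories)
for name, share in shares.items():
    print(f"{name}: {share:.0f}%")
```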


The following table lists carbonated products and the quantity sold, sorted from highest to lowest:

The top 5 products, which make up 37% of the carbonated drink types, account for a quantity sold of 1431.


The following table lists food products and the quantity sold, sorted from highest to lowest:

In the case of food, the top 5 products cover only 23% of the total quantity sold; in addition, the number of categories/brands is 7 times bigger than for carbonated drinks. This spreads the sales out because the customer has more types to choose from. The short section above showed data extracted from the main dataset that is useful as an indication of trending products. The information is given without any statistical inference; it is merely data that has been extracted, loaded and transformed (ELT).
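The "top 5" shares quoted above can be obtained by ranking products by quantity sold and dividing the sum of the best sellers by the total. A minimal sketch with hypothetical product names and quantities:

```python
def top_n_share(quantities, n=5):
    """Share (%) of total quantity sold accounted for by the n best sellers."""
    ranked = sorted(quantities.values(), reverse=True)
    return 100.0 * sum(ranked[:n]) / sum(ranked)


# Hypothetical quantities sold per product, for illustration only.
food_sales = {"A": 50, "B": 40, "C": 30, "D": 20, "E": 10, "F": 25, "G": 25}

print(f"Top 5 share: {top_n_share(food_sales):.1f}%")
```

A low share for the top products, as reported for food, indicates that sales are spread across many items.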

Monthly sales

If you want to see the overall study and discover whether there is a correlation between carbonated drinks and food being sold together, you can find it below.

Population and housing growth in Switzerland

Switzerland is known for its high standard of living and picturesque landscapes, making it a popular destination for expats, students, and travelers. However, it is also known for its high cost of living, including housing prices. Renting a flat in Switzerland can be expensive, especially in larger cities such as Zurich, Geneva, and Basel.

The aim of this article is to study house prices in francs/m2 and correlate them with population. The data used is provided by opendata.swiss, and the information in this article is free of charge.

Data Mining & Preprocessing

All data used in this study was retrieved from opendata.swiss which is the Swiss public administration’s central portal for open government data.

Several files with CSV and XLS extensions were used and adapted to provide a full dataset of information regarding population growth, building construction and price variation through the years.

The population dataset covers 1950-2020, classified by sex, origin and canton.

The building construction dataset, on the other hand, runs from 2003 to 2020, classified by flat or building and by canton.

The last dataset covers the price per m2 in Swiss francs. It runs from 2012 to 2020, classified by canton and year of construction, from older than 1919 up to 2021. For our purpose, the average value across cantons was used in order to homogenize the data across years and building age.

Population data was truncated to start in 2003 to match the building construction dataset.


For the analysis, a few statistical indicators were used:

  • Arithmetic mean, also known as the average, is a measure of central tendency that represents the typical value of a set of numbers. It is calculated by adding up all the values in a set and then dividing the sum by the number of values in the set. The arithmetic mean is commonly used in statistics to summarize the data and to compare different sets of data. It is a useful measure of central tendency when the data is evenly distributed and does not have any extreme outliers. However, it can be influenced by outliers, and in such cases, other measures of central tendency such as the median or mode may be more appropriate. Defined as: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
  • Standard deviation is a measure of the amount of variation or dispersion in a set of data. It is calculated as the square root of the variance, which is the average of the squared differences of each value from the mean. The standard deviation is commonly used in statistics to describe the spread of a distribution, with a higher standard deviation indicating a wider spread of values and a lower standard deviation indicating a narrower spread of values. It is also used in inferential statistics to calculate confidence intervals and to test hypotheses about the population from which the sample was drawn. Defined as: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
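Both indicators are available in Python's standard `statistics` module. The sketch below uses hypothetical values; `pstdev` divides by the number of values, which matches the "average of the squared differences" wording above, whereas `stdev` would divide by one less (the sample standard deviation).

```python
import statistics

# Hypothetical values, for illustration only.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(values)
# Population standard deviation: square root of the mean squared deviation.
std_dev = statistics.pstdev(values)

print(f"mean = {mean}, standard deviation = {std_dev}")
```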

After calculations, graphs were constructed to visualize data and get information.

Population Data

The data covers the population from 1950 to 2020. After importing the data, it is useful to display the total values visually by both sex and citizenship. The final graph after filtering the data is as follows:

Adding a linear trend suggests that in 2030 the population will be around 9 million.
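A linear trend of this kind can be fitted with ordinary least squares and then extended to a future year. The sketch below is not the calculation used for the graph: the population figures are hypothetical round numbers in millions, not the opendata.swiss series, and `fit_line` is an illustrative helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical population figures in millions (NOT the real dataset).
years = [2000, 2005, 2010, 2015, 2020]
population = [7.2, 7.5, 7.8, 8.1, 8.4]

slope, intercept = fit_line(years, population)
forecast_2030 = slope * 2030 + intercept
print(f"Trend: {slope:.3f} million per year, 2030 forecast: {forecast_2030:.1f} million")
```

A straight-line extrapolation assumes the past growth rate continues unchanged, so the forecast is only indicative.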

For further detail on population, it is possible to look at population change by canton using the standard deviation, to see how the data varies through the years.

Higher values mean high variation in the positive (growing) direction.

Houses Data

Data about construction in Switzerland is imported. This data covers 2003 to 2020.

It is clearly visible that the number of new constructions reaches its peak in 2015 and then declines.

Average rent price (CHF/m2)

Data is categorized by canton and year, from 2012 to 2020, and the value is expressed as the average across the 26 cantons. Due to the falling number of new constructions, one might expect prices to grow. For this reason, this dataset can be useful for studying whether prices vary. Note that these values include both existing and newly constructed buildings. The original dataset covers buildings from older than 1919 up to 2021. For practical purposes, the data was filtered.

For a better understanding, the difference between 2012 and 2020 prices is summarized and plotted as follows:


The highest price deviation is in AI (Appenzell Innerrhoden), followed by BS (Basel-Stadt) in second place and GL (Glarus) in third. Basel-Stadt, for its part, shows a high variation in prices, rising from 16.90 CHF/m2 to 18.20 CHF/m2.

Zurich, which has had the highest population increase during the last 20 years, does not show a proportional increase in price, rising from 18.5 to 19.3 CHF/m2.

A further note from the last graph concerns Zug, where the price does not change over 8 years, while in Grisons and Schwyz prices are lower than before.

It is worth recalling that prices are averages for all houses in the canton and refer only to rent; other expenses such as common heating, waste, cleaning, parking and other amenities are not included.
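The comparison between 2012 and 2020 can be expressed as an absolute and a percentage change. The sketch below uses only the two pairs of prices quoted in the text; the function name is illustrative.

```python
# Rent per m2 in CHF for 2012 and 2020, as quoted in the text above.
rents = {
    "Basel-Stadt": (16.90, 18.20),
    "Zurich": (18.50, 19.30),
}


def rent_change(price_2012, price_2020):
    """Return the absolute change (CHF/m2) and the percentage change."""
    absolute = price_2020 - price_2012
    percent = 100.0 * absolute / price_2012
    return absolute, percent


for canton, (start, end) in rents.items():
    absolute, percent = rent_change(start, end)
    print(f"{canton}: +{absolute:.2f} CHF/m2 ({percent:.1f}%)")
```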


Average rent in Swiss francs according to the number of rooms and the canton | opendata.swiss

Demographic evolution, 1950-2021 | opendata.swiss

Average rent per m2 in Swiss francs according to the age of construction and the canton | opendata.swiss

Hydrogen Generation Due to High Voltage in Cathodic Protection


Cathodic protection is a widely used technique to prevent corrosion of metal structures in various industrial applications. The process involves making the metal structure cathodic with respect to a more easily corroded metal or an inert anode. This results in a flow of current, which causes the metal to be protected from corrosion. However, cathodic protection can also lead to the generation of hydrogen gas, which can cause hydrogen embrittlement.

Technical Background

When cathodic protection is applied, a voltage is applied to the metal structure, which is more negative than the equilibrium potential of the metal in the electrolyte. This negative potential causes a flow of electrons from the anode to the cathode. At the cathode, hydrogen ions are reduced to form hydrogen gas. This is a normal process in cathodic protection, but at high potentials, the amount of hydrogen generated can be excessive and lead to hydrogen embrittlement.
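The amount of hydrogen produced at the cathode can be bounded with Faraday's law: two electrons are consumed for each hydrogen molecule formed. The sketch below is an upper-bound estimate, not a design calculation. The `efficiency` parameter is an assumption introduced here for the fraction of the current that actually goes into hydrogen evolution; in practice part of the current drives other reactions such as oxygen reduction.

```python
FARADAY = 96485.0      # Faraday constant in coulombs per mole of electrons
ELECTRONS_PER_H2 = 2   # 2 H+ + 2 e- -> H2


def hydrogen_moles(current_a: float, seconds: float, efficiency: float = 1.0) -> float:
    """Moles of H2 evolved by a cathodic current over a given time.

    efficiency is the (assumed) fraction of current spent on hydrogen evolution.
    """
    charge = current_a * seconds  # total charge in coulombs
    return efficiency * charge / (ELECTRONS_PER_H2 * FARADAY)


# Example: 1 A flowing for one hour, all of it producing hydrogen.
moles = hydrogen_moles(current_a=1.0, seconds=3600)
print(f"{moles:.4f} mol of H2")
```

Because the hydrogen produced scales with the current, pushing the structure to more negative potentials, and hence higher currents, increases the hydrogen available to enter the metal.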

Hydrogen embrittlement occurs when hydrogen diffuses into the metal and interacts with the metal lattice. This can reduce the ductility and fracture toughness of the metal, making it more susceptible to cracking and failure. The severity of hydrogen embrittlement depends on factors such as the material, the level of hydrogen exposure, and the applied stress.


Hydrogen embrittlement was first observed in the mid-19th century in steel rails used in railway tracks. The rails were observed to fracture suddenly, even though they had not been subjected to excessive loads. It was later discovered that the rails had been exposed to hydrogen gas, which had caused them to become brittle and prone to fracture. Since then, hydrogen embrittlement has been observed in various other metals and alloys.

Mitigation Strategies

To mitigate the risk of hydrogen embrittlement in cathodic protection, several strategies can be employed. One approach is to limit the amount of hydrogen generated at the cathode by using less negative cathodic potentials or adding inhibitors to the electrolyte. Another approach is to select materials that are less susceptible to hydrogen embrittlement. Susceptibility generally increases with strength, so high-strength steels call for particular care, and titanium can absorb hydrogen and form brittle hydrides under cathodic polarization.

Post-processing techniques can also be used to remove or reduce the amount of hydrogen in the material. For example, annealing or heat treatment can be used to diffuse the hydrogen out of the metal. Additionally, hydrogen diffusion barriers can be applied to prevent hydrogen from entering the metal in the first place.


In conclusion, cathodic protection is an effective method to prevent corrosion of metal structures, but it can also lead to the generation of hydrogen gas and subsequent hydrogen embrittlement. To mitigate the risk of hydrogen embrittlement, it is important to limit the amount of hydrogen generated at the cathode and use materials that are less susceptible to hydrogen embrittlement. Regular inspections and proactive corrosion management are crucial to detecting any signs of hydrogen embrittlement or other types of corrosion damage.



Sentiment Analysis, how to sell more and better

Sentiment analysis is a powerful tool that can be used to analyze and understand the emotions, opinions, and attitudes expressed in a text or speech. This technology has gained significant importance in recent years as it helps businesses to understand customer feedback and sentiment, which can ultimately help them to make better decisions.

In this article, we will explore the techniques used in sentiment analysis and how it can help the retail industry.

What is Sentiment Analysis?

Sentiment analysis, also known as opinion mining, is a process that uses natural language processing, machine learning, and other computational techniques to identify and extract subjective information from text or speech data. It involves classifying the sentiment of a piece of text into positive, negative, or neutral categories.

The techniques used in sentiment analysis can vary from simple rule-based methods to more advanced machine learning algorithms. The most common approach is to use a combination of both methods.

Techniques used in Sentiment Analysis

Rule-based Methods
Rule-based methods rely on a set of predefined rules to classify sentiment in text. These rules can be based on specific words, phrases, or patterns that are associated with a particular sentiment. For example, if a sentence contains words like ‘good,’ ‘great,’ or ‘excellent,’ it is likely to be classified as positive.
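
A rule of this kind can be sketched in a few lines. The word lists below are hypothetical and far too small for real use; they only illustrate the mechanism of counting matches against predefined lexicons.

```python
import re

# Hypothetical mini-lexicons; a real system would use a much larger, curated list.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}

def classify(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("The service was great and the food was excellent."))
print(classify("Terrible checkout experience."))
print(classify("The store opens at nine."))
```

Note that this sketch labels "not good" as positive, because it looks at words in isolation and ignores negation.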

While rule-based methods are simple and easy to implement, they can be less accurate than more advanced machine learning algorithms. They also require constant updating as language and expressions change over time.

Machine Learning
Machine learning algorithms use statistical models to learn from data and make predictions. These algorithms require a large dataset of labeled examples to train the model. The labeled data consists of text or speech samples, along with their corresponding sentiment labels.

There are several types of machine learning algorithms used in sentiment analysis, including:

Naive Bayes: This algorithm uses probabilistic models to classify text based on the frequency of words in the document.
Support Vector Machines (SVM): SVM is a supervised learning algorithm that can classify text into two or more categories based on the features extracted from the text.
Deep Learning: Deep learning techniques such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) can be used to learn complex patterns and relationships in text data.
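
As an illustration of the first approach in the list, here is a minimal multinomial Naive Bayes classifier written in plain Python. It is a sketch under simplifying assumptions: the four training sentences are invented, tokenization is a simple whitespace split, and add-one (Laplace) smoothing is our choice. In practice a library implementation would normally be used.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest log-posterior under add-one smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in text.lower().split():
            # Add-one smoothing avoids zero probabilities for unseen words.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy training data, for illustration only.
training = [
    ("great product fast delivery", "positive"),
    ("excellent quality great price", "positive"),
    ("terrible service slow delivery", "negative"),
    ("poor quality terrible support", "negative"),
]
model = train(training)
print(predict("great quality", *model))
print(predict("slow and terrible", *model))
```

The classifier multiplies per-word probabilities (summed here as logarithms for numerical stability), which is why word frequency in the training documents drives the result.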

How Sentiment Analysis Can Help the Retail Industry

The retail industry is one of the most competitive industries in the world, and understanding customer feedback and sentiment is crucial for success. Sentiment analysis can help retailers in several ways:

Customer Feedback Analysis

Sentiment analysis can be used to analyze customer feedback and reviews from various sources such as social media, review sites, and customer surveys. This information can help retailers to identify areas of improvement and make changes accordingly. For example, if customers are complaining about the long checkout lines, retailers can take steps to improve the checkout process and reduce wait times.

Product Development

Sentiment analysis can be used to analyze customer feedback on existing products and services. Retailers can use this information to identify areas for improvement or to develop new products that better meet the needs of their customers.

Brand Management

Sentiment analysis can be used to monitor brand reputation and identify potential issues before they become major problems. By analyzing social media conversations and other online content, retailers can track customer sentiment and respond quickly to any negative feedback.

Customer Service

Sentiment analysis can be used to monitor customer service interactions and identify areas where improvements can be made. By analyzing customer feedback and sentiment, retailers can improve the quality of customer service and enhance the overall customer experience.


Sentiment analysis is a powerful tool that can help the retail industry to understand customer feedback and sentiment. By analyzing customer feedback and sentiment, retailers can identify areas for improvement, develop new products, and enhance the overall customer experience. With the increasing availability of data and the advancements in machine learning algorithms, sentiment analysis has become more accurate and efficient. Retailers can now use sentiment analysis tools to process vast amounts of customer feedback and sentiment data in real-time, enabling them to make better business decisions quickly.

In addition to the benefits discussed above, sentiment analysis can also be used to analyze competitor data, track industry trends, and identify emerging market opportunities. By leveraging sentiment analysis, retailers can gain valuable insights into their customers’ needs and preferences, allowing them to stay ahead of the competition and adapt to changing market conditions.

However, it is important to note that sentiment analysis is not a perfect tool and can be affected by biases and inaccuracies in the data. For example, sarcasm and irony can be challenging to detect, and sentiment analysis tools may struggle to identify subtle nuances in language and context.

To mitigate these challenges, retailers should use a combination of sentiment analysis tools and human analysis to ensure the accuracy and relevance of their insights. By combining the power of technology with the expertise of human analysts, retailers can gain a deeper understanding of customer sentiment and make better decisions that drive business growth.

In conclusion, sentiment analysis is a powerful technology that can help the retail industry to understand customer sentiment and feedback. By leveraging sentiment analysis, retailers can gain valuable insights into their customers’ needs and preferences, allowing them to make better decisions and improve customer satisfaction.

How Machine Learning can save money in tomorrow’s industry

The oil and gas industry is constantly looking for ways to improve the efficiency of its operations and reduce downtime. One way to achieve this is through predictive maintenance, which uses machine learning algorithms to identify potential equipment failures before they occur. This article will explore the advantages and disadvantages of using machine learning for predictive maintenance in the oil and gas industry.

Advantages of Machine Learning for Predictive Maintenance in the Oil and Gas Industry

  1. Improved Equipment Performance

Machine learning algorithms can analyze real-time data to predict equipment failures before they occur. This allows companies to schedule maintenance proactively, resulting in improved equipment performance and reliability. By predicting when maintenance is needed, companies can avoid costly and time-consuming downtime.
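
The simplest form of this idea can be sketched without any trained model at all. The example below is a rolling z-score check, a statistical baseline rather than a machine learning algorithm: each reading is compared against the mean and standard deviation of the readings just before it. The sensor values, window size, and threshold are invented for illustration.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the preceding window
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Invented bearing-temperature readings (degrees C) with a spike at index 7.
temps = [70.1, 70.3, 69.9, 70.2, 70.0, 70.1, 70.2, 78.5, 70.3, 70.1]
print(flag_anomalies(temps))
```

A production system would typically replace the fixed threshold with a model trained on historical failure data, but the flow is the same: stream readings in, score them, and raise a maintenance flag when the score is abnormal.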

  2. Reduced Downtime

Downtime can be costly for any industry, but it is especially significant in the oil and gas industry, where every minute of downtime can result in lost revenue. Predictive maintenance can help reduce downtime by identifying potential equipment failures before they occur. By preventing equipment failures, companies can avoid the costly downtime associated with repairs.

  3. Increased Safety

Predictive maintenance can help improve safety by identifying potential equipment failures before they occur. This can help prevent accidents and improve overall safety in the workplace. By identifying potential safety hazards, companies can take proactive steps to prevent accidents and keep their employees safe.

  4. Reduced Maintenance Costs

Predictive maintenance can help reduce maintenance costs by allowing companies to schedule maintenance only when it is necessary. This can reduce the need for unnecessary maintenance, which can be costly. By predicting when maintenance is needed, companies can avoid costly repairs and reduce their overall maintenance costs.

Disadvantages of Machine Learning for Predictive Maintenance in the Oil and Gas Industry

  1. Cost

Implementing machine learning algorithms can be expensive, especially for small to medium-sized companies. It may require significant investment in new hardware and software, as well as training for employees. However, the cost of implementing machine learning for predictive maintenance must be weighed against the potential cost savings from reduced downtime and maintenance costs.

  2. Data Quality

Predictive maintenance relies on high-quality data to produce accurate results. If the data is incomplete or inaccurate, the algorithms may produce inaccurate predictions, leading to unnecessary maintenance or equipment failure. To overcome this challenge, companies must ensure that they have high-quality data and that their algorithms are properly calibrated.

  3. Complexity

Machine learning algorithms can be complex, and it may be difficult for non-experts to understand how they work. This can make it challenging to implement and maintain the algorithms. To overcome this challenge, companies must ensure that they have skilled data analysts who can develop and maintain the algorithms.

  4. Need for Skilled Data Analysts

Machine learning algorithms require skilled data analysts to develop and maintain them. These experts can be difficult to find and expensive to hire, which can be a significant barrier to implementing predictive maintenance. Companies must ensure that they have the necessary resources to develop and maintain their algorithms.


The advantages of using machine learning for predictive maintenance in the oil and gas industry are clear. Improved equipment performance, reduced downtime, increased safety, and reduced maintenance costs can all be achieved through the use of these algorithms. However, the disadvantages, such as cost, data quality, complexity, and the need for skilled data analysts, must also be considered. Overall, the benefits of using machine learning for predictive maintenance in the oil and gas industry outweigh the drawbacks, making it an attractive option for companies looking to improve their operations. To successfully implement predictive maintenance using machine learning, companies must ensure that they have the necessary resources and expertise to develop and maintain their algorithms.

Data Scientist: Current State and Future Trend, a new role for the future

The field of data science has exploded in recent years, with demand for skilled professionals at an all-time high. Universities around the world now offer a variety of courses and degree programs in data science, including both undergraduate and graduate options. Online learning platforms such as Coursera, Udacity, and edX offer massive open online courses (MOOCs) in data science, allowing individuals to gain valuable knowledge and skills without enrolling in a full-time program.

While many universities and online courses provide a solid foundation in data science, heuristic knowledge gained through practical experience is equally important. Data scientists must have strong programming skills, as well as expertise in statistical analysis, machine learning, and data visualization. Effective communication skills are also essential, as data scientists must be able to explain their findings to both technical and non-technical stakeholders.

Some minimum requirements for a career in data science include a bachelor’s degree in a related field such as computer science, statistics, or mathematics, as well as experience with programming languages such as Python or R. However, many employers now require advanced degrees and significant work experience in the field.

According to the Bureau of Labor Statistics, the demand for data scientists is projected to grow by 16% between 2020 and 2030. This growth is expected to be driven by increasing demand for data-driven decision-making across industries. The field of data science is continually evolving, and professionals must keep up with the latest developments and technologies to stay competitive.

In addition to traditional data science roles, there are also emerging areas of specialization within the field, such as data engineering, data visualization, and data journalism. These specializations offer opportunities for individuals to focus on specific aspects of data science and develop expertise in a particular area.

In conclusion, data science is a rapidly growing field with strong demand for skilled professionals. While universities and online courses provide a foundation, practical experience and heuristic knowledge are equally important. Effective communication and programming skills are essential, and advanced degrees and work experience are increasingly required. With the continued growth of data-driven decision-making, the demand for data science professionals is expected to remain high.

Key Distinctions between Scientists and Engineers, to empower Data Analytics

Data analytics is a growing field in which data scientists and data engineers are both crucial. Both roles involve working with data, but they have distinct responsibilities. Data science is more like research, while data engineering is more like development. Data scientists analyze data to extract insights and make predictions, while data engineers design and maintain the systems that enable data scientists to work with data.

Data scientists ask the right questions and find meaningful insights in data, while data engineers build and maintain the infrastructure. Data engineering involves building the infrastructure to support data science, while data science involves using that infrastructure to extract insights. In short, data engineering makes data usable, while data science makes sense of it.

Both data scientists and data engineers have strong employment prospects. The demand for data scientists is projected to grow by 16% between 2020 and 2030, and for computer and information technology occupations, which include data engineers, by 11%. The increasing importance of data-driven decision making across industries means that the demand for both roles will continue to rise.

If you want to become a data engineer or data scientist, there are various educational paths to take. Many universities offer undergraduate and graduate programs in data science, computer science, or related fields. Additionally, various online courses and bootcamps offer training in data analytics, machine learning, and other relevant skills.

Data science and data engineering have vast and varied applications. In healthcare, data analytics improves patient outcomes and streamlines processes. In finance, data analytics detects fraud and predicts market trends. In retail, data analytics personalizes marketing campaigns and optimizes supply chain operations. Data science and data engineering drive innovation and create value across industries.


In conclusion, data scientists and data engineers are critical for data analytics success, with essential, distinct responsibilities. The demand for both roles will continue to increase, as data-driven decision making becomes more important. Pursuing a career in data analytics offers various educational paths and fields of application to explore.

Further resources

  1. “Python Data Science Handbook” by Jake VanderPlas: https://jakevdp.github.io/PythonDataScienceHandbook/
  2. “Data Science Essentials” by Microsoft: https://docs.microsoft.com/en-us/learn/paths/data-science-essentials/
  3. “Data Engineering Cookbook” by O’Reilly Media: https://www.oreilly.com/library/view/data-engineering-cookbook/9781492071424/
  4. “Data Science for Business” by Foster Provost and Tom Fawcett: https://www.amazon.com/Data-Science-Business-data-analytic-thinking/dp/1449361323
  5. “Data Engineering on Google Cloud Platform” by Google Cloud: https://cloud.google.com/solutions/data-engineering/
  6. “Applied Data Science with Python” by Coursera: https://www.coursera.org/specializations/data-science-python

Supervised, Unsupervised & Reinforcement Learning, a quick intro!

In the field of predictive maintenance for rotating equipment, machine learning algorithms can be classified into three categories: supervised learning, unsupervised learning, and reinforcement learning. Each of these approaches has its strengths and weaknesses, and choosing the right approach depends on the nature of the problem at hand. In this essay, we will explore the differences between these approaches and their applications in the context of predictive maintenance for rotating equipment.

Supervised Learning

Supervised learning involves training a model on labeled data, where both the input data and the desired output are provided. The goal is to learn a function that can predict the output for new, unseen input data. In the context of predictive maintenance for rotating equipment, supervised learning can be used to predict the remaining useful life of a machine or to detect anomalies that may indicate the onset of a fault.

One common application of supervised learning in predictive maintenance is to analyze vibration data from rotating machinery. By training a model on labeled data that indicates when a fault occurred and the corresponding vibration patterns, the algorithm can learn to identify these patterns in real-time data and predict potential faults before they occur.
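
The sketch below shows the shape of such a workflow with one of the simplest supervised methods, a nearest-centroid classifier. The choice of classifier, the two features (RMS and peak amplitude), and the vibration snippets are all our assumptions for illustration; real systems use richer features and far more data.

```python
import math

def features(signal):
    """Two simple vibration features: RMS amplitude and peak absolute amplitude."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    peak = max(abs(x) for x in signal)
    return (rms, peak)

def fit_centroids(labeled_signals):
    """Average the feature vectors of each class (nearest-centroid classifier)."""
    sums, counts = {}, {}
    for signal, label in labeled_signals:
        f = features(signal)
        s = sums.get(label, (0.0, 0.0))
        sums[label] = (s[0] + f[0], s[1] + f[1])
        counts[label] = counts.get(label, 0) + 1
    return {lab: (s[0] / counts[lab], s[1] / counts[lab]) for lab, s in sums.items()}

def predict(signal, centroids):
    """Assign the label whose centroid is closest in feature space."""
    f = features(signal)
    return min(centroids, key=lambda lab: math.dist(f, centroids[lab]))

# Invented labeled vibration snippets (arbitrary units), for illustration only.
training = [
    ([0.1, -0.1, 0.2, -0.2], "healthy"),
    ([0.2, -0.1, 0.1, -0.2], "healthy"),
    ([0.9, -1.2, 1.1, -0.8], "faulty"),
    ([1.0, -0.9, 1.3, -1.1], "faulty"),
]
centroids = fit_centroids(training)
print(predict([0.15, -0.2, 0.1, -0.1], centroids))
print(predict([1.1, -1.0, 0.9, -1.2], centroids))
```

The labels are what make this supervised: the model only knows what "faulty" looks like because past examples were tagged as such.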

Unsupervised Learning

Unsupervised learning involves training a model on unlabeled data, where the input data is provided without any corresponding output. The goal is to find patterns or structures in the data that can be used to make predictions or identify anomalies. In the context of predictive maintenance for rotating equipment, unsupervised learning can be used to identify patterns or clusters in sensor data that may indicate the presence of a fault.

One common application of unsupervised learning in predictive maintenance is to use clustering algorithms to group similar data points together. By analyzing the clusters, it may be possible to identify patterns that are indicative of a specific type of fault or to detect anomalies that may indicate the onset of a fault.
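
As a minimal sketch of clustering, the example below runs plain k-means on a single invented feature (RMS vibration level). No labels are supplied; the algorithm separates the readings into groups on its own. The evenly spaced starting centroids are a simplification we chose to keep the result deterministic, and the function assumes k is at least 2.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalar values (k >= 2).
    Centroids start evenly spaced between the minimum and maximum."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, clusters

# Invented RMS vibration levels (mm/s) from unlabeled sensor data.
rms_levels = [1.1, 1.3, 1.2, 1.0, 4.8, 5.1, 1.2, 4.9]
centroids, clusters = kmeans_1d(rms_levels)
print(centroids)
print(clusters)
```

Here the readings fall into a low group and a high group. Deciding whether the high group actually corresponds to a fault still requires domain knowledge, since the algorithm only finds structure and does not name it.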

Reinforcement Learning

Reinforcement learning involves training a model to make decisions based on feedback from the environment. The goal is to learn a policy that maximizes a reward signal over time. In the context of predictive maintenance for rotating equipment, reinforcement learning can be used to develop maintenance schedules that minimize downtime and reduce costs.

One common application of reinforcement learning in predictive maintenance is to use a model to determine when maintenance should be performed based on the condition of the machine and the cost of downtime. By learning a policy that balances the cost of maintenance with the cost of downtime, it may be possible to develop a more efficient maintenance schedule that reduces costs and increases efficiency.
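
To make the idea of a learned maintenance policy concrete, here is a small tabular Q-learning sketch. Everything about the machine model is hypothetical: three condition states (good, worn, failed), two actions (run, maintain), and invented wear probabilities and costs. It is meant only to show how a policy emerges from rewards.

```python
import random

RUN, MAINTAIN = 0, 1
GOOD, WORN, FAILED = 0, 1, 2

def step(state, action, rng):
    """Hypothetical machine model: returns (reward, next_state)."""
    if action == MAINTAIN:
        # Preventive maintenance is cheap; repairing a failed machine is not.
        return (-15.0 if state == FAILED else -2.0), GOOD
    if state == FAILED:
        return -10.0, FAILED  # downtime: no production until repaired
    wear_prob = 0.3 if state == GOOD else 0.5
    next_state = state + 1 if rng.random() < wear_prob else state
    return 1.0, next_state  # one unit of production revenue

def q_learning(steps=200_000, alpha=0.02, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(3)]
    state = GOOD
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice((RUN, MAINTAIN))  # explore
        else:
            action = max((RUN, MAINTAIN), key=lambda a: q[state][a])  # exploit
        reward, next_state = step(state, action, rng)
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q

q = q_learning()
policy = ["run" if row[RUN] > row[MAINTAIN] else "maintain" for row in q]
print(dict(zip(["good", "worn", "failed"], policy)))
```

With these invented costs, the optimal policy for the model is to keep running a machine in good condition and to maintain it once it is worn, before it fails, and the learned Q-table is expected to agree. Changing the cost of downtime or of preventive maintenance shifts where that trade-off lands.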

Choosing the Right Approach

The choice of machine learning approach depends on the nature of the problem at hand. Supervised learning is best suited for problems where labeled data is available, and the goal is to predict an output for new, unseen data. Unsupervised learning is best suited for problems where the data is not labeled, and the goal is to identify patterns or anomalies in the data. Reinforcement learning is best suited for problems where the goal is to develop a policy that maximizes a reward signal over time.

In the context of predictive maintenance for rotating equipment, a combination of these approaches may be used to develop a comprehensive predictive maintenance strategy. For example, supervised learning can be used to predict the remaining useful life of a machine, unsupervised learning can be used to identify patterns or clusters in sensor data, and reinforcement learning can be used to develop a maintenance schedule that balances the cost of maintenance with the cost of downtime.


In conclusion, machine learning algorithms can be classified into three categories: supervised learning, unsupervised learning, and reinforcement learning. Each of these approaches has its strengths and weaknesses, and choosing the right approach depends on the nature of the problem at hand. In the context of predictive maintenance for rotating equipment, a combination of these approaches may be used to develop a comprehensive predictive maintenance strategy that minimizes downtime and reduces costs.

Hydrogen from Ammonia, a fuel for the future

Green ammonia is an emerging technology that has the potential to revolutionize the production of hydrogen and significantly reduce carbon emissions. In this article, we will discuss the production of hydrogen from green ammonia, key production and money figures, companies involved, and future trends.

Production of Hydrogen from Green Ammonia

Green ammonia is produced by using renewable energy sources such as wind or solar power to generate hydrogen by water electrolysis and to power the Haber-Bosch process, which combines that hydrogen with nitrogen from the air to produce ammonia. Green ammonia can then be used as a feedstock for the production of hydrogen through the process of ammonia cracking. The reaction is endothermic, requiring a reactor heated to a high temperature of around 700-900°C to break down ammonia into its constituent elements, nitrogen and hydrogen.
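
The chemistry described above can be summarized in two reactions and a mass balance. The enthalpy value and molar masses are standard reference figures, not numbers taken from this article, and the yield assumes complete conversion.

```latex
% Synthesis (Haber-Bosch) and cracking are the same reaction run in opposite directions
\begin{align}
  \mathrm{N_2} + 3\,\mathrm{H_2} &\rightarrow 2\,\mathrm{NH_3}
    && \text{(synthesis, exothermic)} \\
  2\,\mathrm{NH_3} &\rightarrow \mathrm{N_2} + 3\,\mathrm{H_2},
    \quad \Delta H^\circ \approx +92\ \mathrm{kJ}
    && \text{(cracking, about 46 kJ per mol NH}_3\text{)}
\end{align}

% Hydrogen mass recovered per unit mass of ammonia at full conversion
\begin{equation}
  \frac{m_{\mathrm{H_2}}}{m_{\mathrm{NH_3}}}
  = \frac{3 \times 2.016}{2 \times 17.031} \approx 0.178
\end{equation}
```

In other words, one tonne of ammonia can yield at most about 178 kg of hydrogen, before accounting for incomplete conversion or purification losses.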

Key Production and Money Figures

The production of hydrogen from green ammonia has several advantages over traditional methods, including zero carbon emissions and lower energy requirements. According to the International Energy Agency (IEA), the production of green ammonia is expected to reach 25 million tonnes by 2030 and 500 million tonnes by 2050. The IEA also estimates that the production of green ammonia could reduce the cost of producing hydrogen by up to 50% compared to traditional methods.

Companies Involved

Several companies are involved in the production of green ammonia, including Yara, the world’s largest producer of ammonia, and Siemens Energy, which has developed an electrolysis-based process for producing green ammonia. Other companies involved in the production of green ammonia include Ørsted, a leading renewable energy company, and Air Liquide, a global leader in industrial gases.

Future Trends

The future of green ammonia production looks bright, with the potential for significant growth and contribution to reducing carbon emissions in the energy and agricultural sectors. The IEA has identified green ammonia as a key technology that could help to reduce carbon emissions. Green ammonia has the added benefit of being used as a fertilizer, further reducing the carbon footprint of agriculture. In addition, the use of green ammonia in the shipping industry as a fuel is being explored as a potential replacement for fossil fuels.


Green ammonia is a promising technology that has the potential to revolutionize the production of hydrogen and significantly reduce carbon emissions. Key production and money figures suggest that the production of green ammonia could increase significantly over the next few decades, with the potential to reduce the cost of producing hydrogen by up to 50%. Several companies are involved in the production of green ammonia, and the future looks bright with the potential for significant growth and contribution to reducing carbon emissions in the energy and agricultural sectors.