
Oil.

For richer, for poorer, for better, for worse, in sickness and in health, until death do us part.

Oil is the Marmite of the technological age. Governments love it; environmentalists hate it. It’s a big business that at face value appears to make the rich richer and the poor poorer; one that caters for today at the expense of tomorrow; one that invests in people at the expense of the environment.

Whatever your views on oil, there is another side to it which I wish to explore as part of this article.

Over the years oil has been discovered in some of the most inhospitable places on the planet, from desert sands to ocean floors to Arctic ice. Drilling for oil is both an art form and about as technologically challenging as you can get without resorting to rocket science; and it is the path of oil, from well to refinery, upon which I base this analogy.

Research and Development

Research and development forms the backbone of industry. It is this that tells you where to drill, what products to develop, what your customers require and which direction the market is leaning. It is a foolish prospector who drills without first identifying that the supply is there.

There is a saying that is often bandied around within software companies: “Build it and they will come.” It is usually dripping with sarcasm at the point of utterance, as it implies that the person for whose ears it is intended hasn’t done their research, or identified that there is indeed a demand for the product they are providing.

In oil, research is about identifying from the geological surround that there is the prospect of discovering oil. It doesn’t categorically state that oil is to be found in that location, but it does raise the potential of discovering it. These are leads, which only turn into prospects once four key geological factors have been identified:

  • A source rock – When organic-rich rock such as oil shale or coal is subjected to high pressure and temperature over an extended period of time, hydrocarbons form.
  • Migration – The hydrocarbons are expelled from the source rock by three density-related mechanisms: the newly matured hydrocarbons are less dense than their precursors, which causes over-pressure; the hydrocarbons are lighter, and so migrate upwards due to buoyancy; and the fluids expand as further burial causes increased heating. Most hydrocarbons migrate to the surface as oil seeps, but some will get trapped.
  • Reservoir – The hydrocarbons are contained in a reservoir rock. This is commonly a porous sandstone or limestone. The oil collects in the pores within the rock although open fractures within non-porous rocks (e.g. fractured granite) may also store hydrocarbons. The reservoir must also be permeable so that the hydrocarbons will flow to surface during production.
  • Trap – The hydrocarbons are buoyant and have to be trapped within a structural (e.g. anticline, fault block) or stratigraphic trap. The hydrocarbon trap has to be covered by an impermeable rock known as a seal or cap-rock in order to prevent hydrocarbons escaping to the surface.

Source: Elements of a petroleum prospect

A similar path is followed within business. Here it is the job of the marketing department to understand whether the geological environment offers the potential to sell the product we wish to develop. The geological surround may look different, but fundamentally it is the same.

Software also requires research. When an application is being designed, the research looks at what applications already exist that serve a similar purpose. Libraries and components need to be sourced for the functionality our product will supply. Designs need to be produced showing how our product is to be assembled and where the integration points are; and research is carried out into which team contains the best match on skills to build the product we wish to provide.

Quality First

Despite the analogy, there is a slight disconnect in that oil, as it comes out of the ground, is raw and unrefined. In fact, crude oil is fairly useless before refinement, so why am I basing part of the analogy here?

The answer lies in the product. What is the product being developed? Is the product the software? Or is it the data which flows through our product?

In any business dealing with the information age, the product is the delivery of data between two points: from customer to data warehouse and from data warehouse to customer. It is the job of software to facilitate this flow, to form the pipeline between our customers and our storage, and to act as a refinement mechanism that can take raw data and turn it into something tangible, in the same way that an oil refinery takes crude oil and turns it into petroleum, plastics and petrochemicals for use within other industries.

The truth is that our oil is both data and software. When building an application, our crude oil is the code, raw from the developer’s head. The refinement process is the mechanism by which we build and test our code, turning it into libraries and applications all designed to service the needs of our customers.
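
To make that concrete, here is a minimal sketch of a software refinery in Python. The stage names and record fields are purely illustrative inventions of this article, not taken from any real product:

```python
from typing import Callable, Iterable

# Each refinement stage takes a raw record and returns a cleaner one:
# crude oil in, something usable out.
Stage = Callable[[dict], dict]

def strip_whitespace(record: dict) -> dict:
    """Remove stray whitespace from every string field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}

def normalise_email(record: dict) -> dict:
    """Lower-case the email field so duplicates can be spotted downstream."""
    if "email" in record:
        record = {**record, "email": record["email"].lower()}
    return record

def refine(records: Iterable[dict], stages: list[Stage]) -> list[dict]:
    """Run every record through each stage of the refinery, in order."""
    refined = []
    for record in records:
        for stage in stages:
            record = stage(record)
        refined.append(record)
    return refined

if __name__ == "__main__":
    crude = [{"name": "  Ada Lovelace ", "email": "ADA@Example.COM"}]
    print(refine(crude, [strip_whitespace, normalise_email]))
    # [{'name': 'Ada Lovelace', 'email': 'ada@example.com'}]
```

Each stage is small and testable on its own, which matters later when we start talking about quality.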

Quality has to be at the heart of everything we produce. Within our analogy we are dealing with crude oil, which has the potential to cause an environmental disaster at enormous scale if even the smallest component of our pipeline fails. Everything is tested, from the smallest spigot to full-size rigs.

Deepwater Horizon drilling rig before the explosion

In April 2010, defective cement in the lining of an oil well in the Gulf of Mexico caused the explosion and sinking of the Deepwater Horizon drilling rig at the cost of 11 lives and the discharge of 210 million gallons of oil into the natural environment.

Similarly, in December 2015 an independent security researcher uncovered a misconfigured database exposing the personal records of 191 million US voters.

Both incidents are considered the largest leaks in the history of their respective industries, and both could have been avoided if appropriate tests had been properly utilised.
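
The voter-database leak was a configuration failure rather than a code failure, and configuration can be tested like anything else. A hedged sketch of such a test follows; the config keys are hypothetical stand-ins, not the settings of any real database:

```python
# A configuration audit, sketched with hypothetical keys. The point is
# that a deployment's settings can be tested just as its code can.

def audit_database_config(config: dict) -> list[str]:
    """Return the findings that would make this database a leak risk."""
    findings = []
    if not config.get("auth_required", False):
        findings.append("database accepts connections without authentication")
    if config.get("bind_address") == "0.0.0.0":
        findings.append("database is listening on all interfaces, not just internal ones")
    if not config.get("tls_enabled", False):
        findings.append("connections are not encrypted in transit")
    return findings

if __name__ == "__main__":
    risky = {"auth_required": False, "bind_address": "0.0.0.0", "tls_enabled": False}
    for finding in audit_database_config(risky):
        print("FAIL:", finding)
```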

Testing comes in many forms and covers not only the products we are providing but the people providing them. The interview is a test to see if a person has the skills we require to build our products. In software, we test a component or library to see if it provides the service we require in the manner we require it. We look at the quality of the code contained within that component and ask questions about it: “Is this component built to a standard which is suitable for our product?” The assembly of an oil rig goes through similar tests, with each part utilised being built to a specific standard: “This valve is built to ISO 14313 and is suitable for use within the petrochemical industry, not for use within sub-sea pipelines.”

If a person doesn’t pass the interview, they are rejected. If a valve doesn’t adhere to the standard, it is also rejected. Likewise, it should be the norm that a software component which does not adhere to the standards laid out for the language it is written in is also rejected.
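
In software, adherence to the standard can be checked mechanically before a component is accepted onto the rig. A minimal sketch, assuming a hypothetical in-house standard that requires every storage component to expose read, write and close:

```python
import inspect

# A hypothetical in-house standard: every storage component must expose
# these methods. The names are illustrative, not a real certification.
REQUIRED_METHODS = {"read", "write", "close"}

def conforms_to_standard(component: object) -> bool:
    """Reject any component that does not expose the full required interface."""
    methods = {name for name, member in inspect.getmembers(component) if callable(member)}
    return REQUIRED_METHODS <= methods

class GoodStore:
    def read(self, key): ...
    def write(self, key, value): ...
    def close(self): ...

class BadStore:
    def read(self, key): ...
    # no write() or close() -- a valve built to the wrong standard

assert conforms_to_standard(GoodStore())
assert not conforms_to_standard(BadStore())
```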

The reality is that in software development this rarely happens. Code is produced in volume and generally under pressure from above. Corners are cut, tests are skipped, and all too often we design our products to fail. Instead of designing for strength and stability, as I previously discussed in my article on Lighthouse Design, we choose to build our product top-down, from poorly chosen materials on shifting sand, putting our faith in prayer that it won’t fall over or collapse under the weight of the data we feed through it.

Extraction

Once the potential for supply has been established, drilling rigs are sent to the location to prove that the supply is there. In software, these are proof-of-concept models. “This is how your data will be presented” and “this is the information you can get out of it” become the key statements we demonstrate to customers. If we have correctly identified our supply, and the raw data to be provided is of sufficient quality, then at this point the lead may turn into a prospect.

A proof of concept (PoC) is the rapid assembly of components in a given configuration, designed to prove the feasibility of extracting a specific set of data. I often hear the argument at this point that PoCs are throwaway code and do not require testing; and in most instances they are indeed disposable.

Does that mean they do not require testing? I disagree. If a drilling rig sent out to sea to test for oil in the Gulf of Mexico can leak 210 million gallons of oil, then your untested proof of concept, executing in a live environment on customer data, can likewise cause an explosion resulting in the loss of thousands, if not millions, of records.

To avoid the accidental loss of data, proofs of concept should always be executed in a lab environment. Whilst we may divert the flow of data from our production servers to run it through our model, proving to ourselves and to our customers that it is safe to drill in this location, checks must be in place to ensure the model does not leak. “Is that third-party component we dragged down off dodgy-site.com sending our customers’ data back to random-hacker.org?”
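
One concrete guard against that question, sketched below on the assumption the PoC is written in Python, is to block outbound network access entirely while the model runs in the lab, so any component that tries to phone home fails loudly:

```python
import contextlib
import socket

@contextlib.contextmanager
def no_network():
    """Make any attempt to open a network connection fail loudly."""
    real_socket = socket.socket

    def guarded(*args, **kwargs):
        raise RuntimeError("outbound network access is forbidden in the lab")

    socket.socket = guarded
    try:
        yield
    finally:
        socket.socket = real_socket  # restore the real implementation

if __name__ == "__main__":
    import urllib.request
    with no_network():
        try:
            # A sneaky third-party dependency trying to phone home.
            urllib.request.urlopen("http://dodgy-site.com")
        except Exception as exc:
            print("caught:", exc)
```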

Testing provides the assurance that the products we build are safe, that the components we use within our models are not going to cause our own version of Deepwater Horizon. Standards give us the assurance that what we build will protect our customers from harm and not lead to our application breaking as the volume of data is scaled up.

The argument over whether a proof of concept should be tested is, to my mind, a moot point. A PoC is a test in itself: we aren’t entirely sure that what we are drilling for really exists, and we don’t really know how it will behave once it’s extracted. Drilling for oil comes with risks.

Oil may sit under a layer of methane or natural gas, volatile and explosive, with the potential to kill. Likewise, within software, the information we retrieve may cause a sudden and unexpected loss of service; and whether this takes out our lab or our production environment, it will come at a cost.

It is with great care that we should design and build our products, even at the PoC level, for 20 developers sitting around twiddling their thumbs whilst a lab is rebuilt is an unforgivable expense. The alternative is to build in isolation, but isolation leads to mistakes, and mistakes lead to leaks.

Completion

Oil has been discovered and there are billions of gallons down there. Our customers love what we have demonstrated to them and are hungry for our product.

Great. Now what?

Completion is the process by which the well is enabled for production. A flow path is created to direct oil into the well; acid and fracturing fluids are pumped in to stimulate the rock to produce oil into the wellbore; and the wellbore is cased off and connected to the surface by smaller tubes.

Now the real work begins. Pipelines have to be built to take the code from the developer, build it, test it and finally deploy it. This pipeline generates the pipelines that take data to and from the customer. Servers need to be built to house the data and provide a means for the customer to access it. Checks and gauges need to be provided to ensure not only that the code is of sufficient quality, but that our systems are capable of sustaining the flow of data without leakage.
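
As a rough sketch of such checks and gauges, here is a toy pipeline in Python. The stage names and thresholds are illustrative assumptions; a real pipeline would live in Jenkins, Bamboo or Travis CI:

```python
from typing import Callable

# A gauge inspects the build and returns True if the product may flow on.
Gauge = Callable[[dict], bool]

def run_pipeline(build: dict, gauges: list[tuple[str, Gauge]]) -> bool:
    """Pump the build through each stage, halting at the first leak."""
    for name, gauge in gauges:
        if not gauge(build):
            print(f"pipeline halted at '{name}' -- do not deploy")
            return False
        print(f"'{name}' passed")
    print("deployed")
    return True

# Illustrative gauges and thresholds; assumptions, not recommendations.
gauges = [
    ("compile", lambda b: b["compiled"]),
    ("unit tests", lambda b: b["tests_failed"] == 0),
    ("coverage", lambda b: b["coverage"] >= 0.80),
]

run_pipeline({"compiled": True, "tests_failed": 0, "coverage": 0.62}, gauges)
```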

In my previous post on the Divergence principle, I proposed an algorithm designed to prove that conflicts in code are inevitable. The truth is that this algorithm is actually more generic, as the output is dependent on the input. In that article, the inputs were Git branches, yet a recent deployment failure highlighted that the algorithm I proposed isn’t just a branch-conflict predictor but a predictor of when a pipeline will fail.

When developing a product, always work on the assumption that your code is going to fail: you are going to lose data, and that loss is going to cause your customers incredible harm. Yes, this is a pessimistic approach, but isn’t pessimism the very mechanism which keeps our products safe? Will our products not benefit from the extra vigilance pessimism encourages?
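
Written down, that pessimism looks something like the sketch below: assume every write can fail, verify that it landed, and roll back anything you cannot prove. The dictionary here is a stand-in for whatever your product really writes to:

```python
class LeakError(Exception):
    """Raised when a write cannot be proven to have landed intact."""

def pessimistic_write(store: dict, key: str, value: str) -> None:
    """Assume the write will fail: write, read back, verify, roll back."""
    store[key] = value
    if store.get(key) != value:  # trust nothing, not even the happy path
        store.pop(key, None)     # roll back the suspect record
        raise LeakError(f"write of {key!r} could not be verified")

records: dict = {}
pessimistic_write(records, "customer-42", "refined data")
print(records)  # {'customer-42': 'refined data'}
```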

Depending on your perspective, completion is not about completing the product but about building the development environment to support our products during production, and putting the pipelines in place to move our product through each stage. Our product is the refinery by which we extract petroleum to service our customers, but that refinery itself needs to be built, which requires its own refinery, pipelines and storage facilities.

Production

In software development, production is the stage whereby a product is developed against the requirements, tested and deployed. In business, production is the stage whereby customers provide data into the system, it is refined and the results returned to them.

On all sides, production is the most important phase of a well’s life. Crude oil is produced from the wellhead and delivered either into a tank or silo for storage pending delivery, or direct to the delivery pipeline, via a Christmas tree which regulates pressure, controls flow and allows access into the wellbore.

Sub-surface Christmas tree

When delivering code, we use similar structures. Version control systems such as Git, SVN and Mercurial are our wellbore. Continuous integration platforms such as Jenkins, Bamboo and Travis CI are our Christmas tree; artefact stores such as Artifactory and Nexus are our silos; development and test environments are our refinery; and our production environments are the pumps from which we fuel our engines.

Refinement

When I started this article, the intent was that it would follow the path of code from developer to production. Instead, it turned out to be more complex than that.

Code as a product becomes the refinery that services customers, a product within a product; but this is the nature of software development. Oil pipelines themselves require refined oil and petroleum to help them operate. Both industries are self-serving in that they rely 100% on their own product to ensure continuous delivery from well to refinery.

In software development, the refinement process is carried out at every phase of the lifecycle. Developers produce tests to ensure the code they write behaves in the manner they desire. Continuous integration environments build the product and execute these tests, producing metrics about the quality of the product being delivered. The product is then packaged and delivered either direct into the refinery or into a silo, from which it can be pulled into the refinery at a later time.
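
Returning to the refinery sketch from earlier, the developer’s contribution to refinement is tests like these, written pytest-style against the hypothetical normalise_email stage:

```python
# Developer-level refinement tests, pytest-style, for the hypothetical
# normalise_email stage sketched earlier in this article.

def normalise_email(record: dict) -> dict:
    if "email" in record:
        record = {**record, "email": record["email"].lower()}
    return record

def test_normalise_email_lowercases_address():
    assert normalise_email({"email": "ADA@Example.COM"})["email"] == "ada@example.com"

def test_normalise_email_leaves_other_fields_alone():
    assert normalise_email({"name": "Ada"}) == {"name": "Ada"}
```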

The well-to-refinery software process

Development and test environments are the mechanism by which quality is ensured and the product is refined to ensure it is capable of servicing our customers. If we want our product to succeed, it is in this environment that we need to concentrate our efforts in order to avoid our own version of Deepwater Horizon.

Delivery

The final stage of the process is the delivery of petrol to the pump, oil to garages and shops, plastics and petrochemicals to industry, and our product to the servers on which it will execute. Here it sits, waiting for customers’ data to flow in so it can begin its refinement process.

Only time will tell if the product is now suitable for our customers, but the choice is ours: whether we build a well-oiled machine, designed and built to standards with rigorous testing and QA, or whether the product we build will overheat and burn out through lack of lubrication.
