Pipelines Configuration with YAML: How and why to configure pipelines as code?

YAML, which stands for "YAML Ain't Markup Language" (originally "Yet Another Markup Language"), is a data serialization language. As of version 1.2 it is a superset of JSON, but its indentation-based syntax and support for comments make it more readable and easier to maintain, which is why it is well suited to configuration files.

YAML was designed to be easily readable by both humans and machines, and its syntax emphasizes simplicity and clarity. YAML files are plain text files, conventionally saved with a .yaml or .yml extension. The extension is only a convention: any plain text that follows the YAML syntax rules is a valid YAML document.
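
To illustrate the superset relationship, here is a small sketch with the same data written twice: once in YAML block style and once in JSON syntax, which a YAML 1.2 parser also accepts. The keys and values are made up for illustration.

```yaml
# Block style: indentation defines the structure
blog:
  name: Pipelines as code
  tags:
    - yaml
    - devops

# The same structure in JSON syntax, which is also valid YAML
blogAsJson: {"name": "Pipelines as code", "tags": ["yaml", "devops"]}
```

Both keys should load into identical data. The block-style version also allows comments, which JSON does not.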

YAML Syntax

YAML uses indentation and a minimal set of punctuation marks to define data structures. Indentation represents the hierarchy of the data and must be done with spaces; tab characters are not allowed. The number of spaces per level is not fixed, but it must be consistent among sibling elements (two spaces per level is a common convention).

In YAML syntax, there are three main types of data structures:

1. Mappings

2. Lists or Arrays

3. Scalars

Mappings are collections of key-value pairs. Each key is associated with a value, and the structure is defined by indentation.

blog:
  name: Configuring Pipelines as Code with YAML

In lists or arrays, each item in the sequence is represented with a hyphen (-).

list:
  - first
  - second
  - third

Scalars represent single atomic values, such as Strings, Numbers, and Booleans. Scalars do not have child nodes.

company: Outecho
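
Because the parser decides the type of an unquoted scalar, it can help to see how quoting changes the result. The snippet below is a small sketch. All keys except company are made up for illustration, and the exact resolution of unquoted values can vary slightly between YAML versions and parsers.

```yaml
company: Outecho       # string
employees: 10          # integer
ratio: 0.5             # floating-point number
active: true           # boolean
version: "1.10"        # quoted, so it stays a string instead of becoming the number 1.1
zipCode: "01234"       # quoted to preserve the leading zero
```

When in doubt, quoting a value is the simplest way to make sure it is read as a string.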

YAML also supports anchors and aliases. An anchor marks a node so that it can be reused, and it is created with the ampersand character (&). An alias references a previously anchored node from another part of the same YAML document, and it is created with the asterisk character (*).

Comments in YAML start with the # symbol. They are ignored by the YAML parser and are used for human readability.

# Define an anchor "&" for a mapping
companyInfo: &companyDetails
  name: Outecho
  employees: 10

# Use an alias "*" to reference the anchor
company:
  details: *companyDetails
  department: IT
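
To see what the alias does, the anchor example should load into the same data as the following version, in which the aliased node is written out in full:

```yaml
companyInfo:
  name: Outecho
  employees: 10

company:
  details:
    name: Outecho
    employees: 10
  department: IT
```

The anchored version avoids repeating the name and employees values, so a later change only has to be made in one place.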

Advantages of using YAML pipelines

In Azure DevOps, YAML is commonly used to define CI/CD (Continuous Integration/Continuous Deployment) pipelines. It allows users to express their build and release configurations as code. This approach provides many advantages; the key ones are:

Version Control Integration:

YAML files can be stored in version control repositories such as Git, together with the application’s source code.
This way, changes to the pipeline can be tracked, reviewed, tested, and deployed like any other code change, which provides transparency and auditability. Versioning and history tracking of configuration changes make it easier to manage the pipeline over time.
Good organization of YAML configuration files contributes to improved maintainability and collaboration among team members.

Cross-Platform Compatibility:

YAML is platform-independent and can be used across various operating systems and environments.
The same YAML configuration file can be used to define pipelines for different stages (e.g., DEV, QA, Staging, and Production environments).
YAML provides a standardized way to define pipelines, promoting consistency across projects.
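
As a rough sketch of how one file can cover several environments, an Azure Pipelines YAML file can declare multiple stages. The stage names, job names, environment name, and echo commands below are placeholders chosen for illustration, not part of the example built later in this post.

```yaml
stages:
- stage: Build
  jobs:
  - job: BuildApi
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    - script: echo "Build the API here"   # placeholder for real build steps
      displayName: 'Build'

- stage: DEV
  dependsOn: Build                        # runs only after the Build stage
  jobs:
  - deployment: DeployToDev
    pool:
      vmImage: 'ubuntu-latest'
    environment: 'DEV'                    # hypothetical environment name
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo "Deploy to DEV here"   # placeholder for real deployment steps
            displayName: 'Deploy to DEV'
```

Further stages such as QA, Staging, and Production would follow the same pattern, each depending on the previous one.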

Modularity and Reusability:

Modular and reusable components can be created using YAML configuration files.
YAML pipelines support the use of templates, allowing users to define reusable components and share them across multiple pipelines. This promotes modularity and reduces duplication of configuration code.

Integration with Source Control Events:

YAML pipelines can be triggered by source control events, such as code pushes or pull requests. This enables automated builds and tests in response to changes, promoting a continuous integration workflow.

Configure API Build with YAML Configuration File

Let’s dive into a concrete example of creating a CI/CD pipeline with a YAML configuration file.

To create a YAML file for building and deploying a simple API, we need to define triggers, paths, a pool, and the steps to be executed. Below is a simple YAML file for building the API, followed by an explanation of each part.

trigger:
  branches:
    include:
    - develop
    - releases/*
    - hotfix/*
  paths:
    exclude:
    - README.md

pool:
  name: 'Azure Pipelines'
  vmImage: 'ubuntu-latest'

variables:
  buildConfiguration: 'Release'

Trigger:

The trigger specifies the conditions in the version control system that should initiate the CI process.
In our example, the pipeline is triggered by changes in the develop branch and in any branch whose name starts with releases/ or hotfix/.

Paths:

Specifies which paths within the repository are considered when deciding whether to trigger the pipeline.
In our example, changes anywhere in the repository trigger the pipeline, except for changes to the README.md file.
Path filters can also be used to exclude projects that should not trigger this build. For example, if the API and Web projects are part of the same solution and Git repository but have separate CI/CD pipelines, this is where each YAML file defines which changes trigger which build.
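
As a sketch of that API/Web scenario, the pipeline for the API could limit its trigger to the API folder. The file name and folder layout below are assumptions made for illustration; the repository structure is not shown in this post.

```yaml
# api-build.yml (hypothetical file name)
trigger:
  branches:
    include:
    - develop
  paths:
    include:
    - src/Blog.API               # hypothetical folder containing the API project
    exclude:
    - src/Blog.API/README.md     # documentation changes do not trigger a build
```

A second YAML file for the Web project would use the same structure with its own folder under include, so a change to one project does not start the other project's build.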

Pool:

This section defines the execution environment or resources used to run the jobs within the pipeline.
vmImage specifies the virtual machine image used for running the jobs.
In our example, the latest Ubuntu virtual machine image (ubuntu-latest) is used.
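
For comparison, here is a sketch of two common ways to select agents in Azure Pipelines: a Microsoft-hosted image chosen by name, or a self-hosted agent pool. The pool name MySelfHostedPool is hypothetical, and only one pool definition can be active at a time, so the second option is commented out.

```yaml
# Option 1: Microsoft-hosted agent, selected by image name
pool:
  vmImage: 'windows-latest'

# Option 2: self-hosted agent pool (use instead of Option 1)
# pool:
#   name: 'MySelfHostedPool'   # hypothetical pool name
```

Switching the operating system of a Microsoft-hosted build is then a one-line change to vmImage.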

steps:
# Make the NuGet command-line tool available on the agent
- task: NuGetToolInstaller@1
  name: 'NuGetInstaller'
  displayName: 'Install NuGet Tool'

# Restore packages; $(solution) must be defined as a pipeline variable
- task: NuGetCommand@2
  name: 'NuGetRestore'
  displayName: 'Restore NuGet Packages'
  inputs:
    restoreSolution: '$(solution)'

# Build the API project using the configured build configuration
- task: DotNetCoreCLI@2
  name: 'BuildBlogAPI'
  displayName: 'Build Blog API'
  inputs:
    command: 'build'
    projects: '**/Blog.API/Blog.API.csproj'
    arguments: '--configuration $(buildConfiguration)'

# Publish the web project as a zipped package to the staging directory
- task: DotNetCoreCLI@2
  displayName: 'Blog API: Publish'
  inputs:
    command: publish
    publishWebProjects: True
    arguments: '--configuration $(buildConfiguration) --output $(build.artifactstagingdirectory)'
    zipAfterPublish: True

# Upload the staged output as a build artifact named Blog.API
- task: PublishBuildArtifacts@1
  displayName: 'Blog API: Publish Build Artifact'
  inputs:
    PathtoPublish: '$(build.artifactstagingdirectory)'
    ArtifactName: Blog.API

Steps:

The steps section defines the sequence of tasks or actions performed as part of the pipeline. Each step represents a unit of work, such as building the code, running tests, or deploying the application.
Steps are specified as a list, and each step is prefixed with a hyphen (-).
The first part of the configuration file builds the API.
- In our example, the API is built using the following three predefined tasks:
  - NuGetToolInstaller to install the NuGet tool
  - NuGetCommand to restore the NuGet packages
  - DotNetCoreCLI to build the project

The second part of the configuration file publishes the artifacts.
- In our example, the API build artifacts are published using the following two predefined tasks:
  - DotNetCoreCLI to publish the project output
  - PublishBuildArtifacts to upload the output as a build artifact
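
The example pipeline has no test step, although running tests is mentioned as a typical unit of work. As a sketch, a test step could be added with the same DotNetCoreCLI task. The project glob below is an assumption about how test projects might be named.

```yaml
steps:
- task: DotNetCoreCLI@2
  displayName: 'Run Unit Tests'
  inputs:
    command: 'test'
    projects: '**/*Tests/*.csproj'    # hypothetical naming pattern for test projects
    arguments: '--configuration $(buildConfiguration)'
```

Placed between the build and publish steps, it would stop the pipeline before any artifact is published if a test fails.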

Reuse YML files to avoid code duplication

In the example above, the .yml file can be used for the CI/CD pipeline that deploys the latest code to the app service. But what if we want a PR build that is triggered as part of the branch policy? We can create a new pipeline for the PR build and point it at a new .yml file. The only difference between the CI build and the PR build is that the PR build skips the publish artifacts steps, because its responsibility is only to confirm that the new code changes do not break the build of the API projects. But we don’t want to duplicate code in the YAML files either, right?

The good news is that .yml files can reference other .yml files. To avoid code duplication, we can create a “build-api.yml” file that contains the steps related to the build. In our example, those steps are: install NuGet, restore the NuGet packages, and build the API project. A .yml file is referenced from another .yml file as shown below.

steps:
- template: build-api.yml
  parameters:
    buildConfiguration: $(buildConfiguration)
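
The contents of build-api.yml are not shown in this post, so the following is only a sketch of what such a template might look like. It assumes the template declares buildConfiguration as a string parameter and reuses the three build tasks from the earlier example. The default value is an assumption, and $(solution) is still expected to be defined by the calling pipeline.

```yaml
# build-api.yml (sketch of a possible step template)
parameters:
- name: buildConfiguration
  type: string
  default: 'Release'            # assumed default, used when the caller passes nothing

steps:
- task: NuGetToolInstaller@1
  displayName: 'Install NuGet Tool'

- task: NuGetCommand@2
  displayName: 'Restore NuGet Packages'
  inputs:
    restoreSolution: '$(solution)'

- task: DotNetCoreCLI@2
  displayName: 'Build Blog API'
  inputs:
    command: 'build'
    projects: '**/Blog.API/Blog.API.csproj'
    # Template parameters are read with the ${{ }} expression syntax
    arguments: '--configuration ${{ parameters.buildConfiguration }}'
```

With a template like this, the PR pipeline can consist of the template reference alone, while the CI pipeline adds the two publish steps after it.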

Conclusion

The simplicity and human-readable nature of YAML syntax make the pipeline configuration easier to understand and maintain, for both developers and stakeholders. It offers a clear representation of the build and deployment steps, allowing the development team to adapt and scale their CI/CD processes as their projects evolve. The advantages described in this post (version control integration, cross-platform compatibility, modularity and reusability, and integration with source control events) make YAML pipelines a valuable tool for delivering software reliably. Ultimately, adopting YAML pipelines is a strategic move toward more agile, efficient, and collaborative development practices.
