The personal website of Scott W Harden
May 1st, 2022

Markdown source code last modified on May 2nd, 2022
---
title: Using DataFrames in C#
description: How to use the DataFrame class from the Microsoft.Data.Analysis package to interact with tabular data
date: 2022-05-01 23:00:00
tags: csharp
---

# Using DataFrames in C# 

**The DataFrame is a data structure designed for manipulation, analysis, and visualization of tabular data, and it is the cornerstone of many data science applications.** One of the most famous implementations of the data frame is provided by the Pandas package for Python. An equivalent data structure is available for C# in Microsoft's `Microsoft.Data.Analysis` package. Although data frames are commonly used in Jupyter notebooks, they can be used in standard .NET applications as well. This article surveys Microsoft's Data Analysis package and introduces how to interact with data frames using C# and the .NET platform.

## DataFrame Quickstart

* A DataFrame is a 2D matrix that stores data values in named columns.
* Each column has a distinct data type.
* Rows represent observations.

Add the [Microsoft.Data.Analysis package](https://www.nuget.org/packages/Microsoft.Data.Analysis/) to your project, then you can create a DataFrame like this:

```cs
using Microsoft.Data.Analysis;

string[] names = { "Oliver", "Charlotte", "Henry", "Amelia", "Owen" };
int[] ages = { 23, 19, 42, 64, 35 };
double[] heights = { 1.91, 1.62, 1.72, 1.57, 1.85 };

DataFrameColumn[] columns = {
    new StringDataFrameColumn("Name", names),
    new PrimitiveDataFrameColumn<int>("Age", ages),
    new PrimitiveDataFrameColumn<double>("Height", heights),
};

DataFrame df = new(columns);
```
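
Once a DataFrame exists, its shape and individual values can be read back. The snippet below is a minimal sketch that rebuilds the quickstart data so it runs on its own, and it assumes a package version that exposes the `df[row, column]` and `df["Name"]` indexers used elsewhere in this article.

```cs
using System;
using Microsoft.Data.Analysis;

// Rebuild the quickstart DataFrame so this snippet is self-contained
DataFrame df = new(
    new StringDataFrameColumn("Name", new[] { "Oliver", "Charlotte", "Henry", "Amelia", "Owen" }),
    new PrimitiveDataFrameColumn<int>("Age", new[] { 23, 19, 42, 64, 35 }),
    new PrimitiveDataFrameColumn<double>("Height", new[] { 1.91, 1.62, 1.72, 1.57, 1.85 }));

long rowCount = df.Rows.Count;       // number of observations
int columnCount = df.Columns.Count;  // number of named columns

// Columns are looked up by name, and each one reports its element type
DataFrameColumn ageColumn = df["Age"];
Console.WriteLine($"{ageColumn.Name} holds {ageColumn.DataType.Name} values");

// Individual cells are addressed by [row, column] and come back as object
object firstName = df[0, 0];
object firstAge = ageColumn[0];
Console.WriteLine($"{rowCount} rows x {columnCount} columns; first row: {firstName}, {firstAge}");
```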

Contents of a DataFrame can be previewed using `Console.WriteLine(df)` but the formatting isn't pretty.

```text
Name  Age   Height
Oliver23    1.91
Charlotte19    1.62
Henry 42    1.72
Amelia64    1.57
Owen  35    1.85
```

## Pretty DataFrame Formatting

**A custom `PrettyPrint()` extension method can improve DataFrame readability.** Implementing this as an extension method allows me to call `df.PrettyPrint()` anywhere in my code.

<details>
<summary>💡 View the full <code>PrettyPrinters.cs</code> source code</summary>

```cs
using Microsoft.Data.Analysis;
using System.Text;

internal static class PrettyPrinters
{
    public static void PrettyPrint(this DataFrame df) => Console.WriteLine(PrettyText(df));
    public static string PrettyText(this DataFrame df) => ToStringArray2D(df).ToFormattedText();

    public static string ToMarkdown(this DataFrame df) => ToStringArray2D(df).ToMarkdown();

    public static void PrettyPrint(this DataFrameRow row) => Console.WriteLine(Pretty(row));
    public static string Pretty(this DataFrameRow row) => row.Select(x => x?.ToString() ?? string.Empty).StringJoin();
    private static string StringJoin(this IEnumerable<string> strings) => string.Join(" ", strings.Select(x => x.ToString()));

    private static string[,] ToStringArray2D(DataFrame df)
    {
        string[,] strings = new string[df.Rows.Count + 1, df.Columns.Count];

        for (int i = 0; i < df.Columns.Count; i++)
            strings[0, i] = df.Columns[i].Name;

        for (int i = 0; i < df.Rows.Count; i++)
            for (int j = 0; j < df.Columns.Count; j++)
                strings[i + 1, j] = df[i, j]?.ToString() ?? string.Empty;

        return strings;
    }

    private static int[] GetMaxLengthsByColumn(this string[,] strings)
    {
        int[] maxLengthsByColumn = new int[strings.GetLength(1)];

        for (int y = 0; y < strings.GetLength(0); y++)
            for (int x = 0; x < strings.GetLength(1); x++)
                maxLengthsByColumn[x] = Math.Max(maxLengthsByColumn[x], strings[y, x].Length);

        return maxLengthsByColumn;
    }

    private static string ToFormattedText(this string[,] strings)
    {
        StringBuilder sb = new();
        int[] maxLengthsByColumn = GetMaxLengthsByColumn(strings);

        for (int y = 0; y < strings.GetLength(0); y++)
        {
            for (int x = 0; x < strings.GetLength(1); x++)
            {
                sb.Append(strings[y, x].PadRight(maxLengthsByColumn[x] + 2));
            }
            sb.AppendLine();
        }

        return sb.ToString();
    }


    private static string ToMarkdown(this string[,] strings)
    {
        StringBuilder sb = new();
        int[] maxLengthsByColumn = GetMaxLengthsByColumn(strings);

        for (int y = 0; y < strings.GetLength(0); y++)
        {
            for (int x = 0; x < strings.GetLength(1); x++)
            {
                sb.Append(strings[y, x].PadRight(maxLengthsByColumn[x]));
                if (x < strings.GetLength(1) - 1)
                    sb.Append(" | ");
            }
            sb.AppendLine();

            if (y == 0)
            {
                for (int i = 0; i < strings.GetLength(1); i++)
                {
                    int bars = maxLengthsByColumn[i] + 2;
                    if (i == 0)
                        bars -= 1;
                    sb.Append(new String('-', bars));

                    if (i < strings.GetLength(1) - 1)
                        sb.Append("|");
                }
                sb.AppendLine();
            }
        }

        return sb.ToString();
    }
}
```

</details>

```text
Name       Age  Height
Oliver     23   1.91
Charlotte  19   1.62
Henry      42   1.72
Amelia     64   1.57
Owen       35   1.85
```

I can create similar methods to format a DataFrame as Markdown or HTML.

```text
Name      | Age | Height
----------|-----|--------
Oliver    | 23  | 1.91
Charlotte | 19  | 1.62
Henry     | 42  | 1.72
Amelia    | 64  | 1.57
Owen      | 35  | 1.85
```


Name      | Age | Height
----------|-----|--------
Oliver    | 23  | 1.91
Charlotte | 19  | 1.62
Henry     | 42  | 1.72
Amelia    | 64  | 1.57
Owen      | 35  | 1.85
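
An HTML formatter could follow the same pattern. The sketch below is not part of `PrettyPrinters.cs`: `ToHtml` is a hypothetical helper written as a local function, and it assumes plain `<table>` markup without styling is sufficient.

```cs
using System;
using System.Net;
using System.Text;
using Microsoft.Data.Analysis;

// Hypothetical helper (not in PrettyPrinters.cs): render a DataFrame as an HTML table
string ToHtml(DataFrame frame)
{
    StringBuilder sb = new();
    sb.AppendLine("<table>");

    // Header row built from the column names
    sb.Append("<tr>");
    foreach (DataFrameColumn col in frame.Columns)
        sb.Append($"<th>{WebUtility.HtmlEncode(col.Name)}</th>");
    sb.AppendLine("</tr>");

    // One table row per DataFrame row; null cells become empty strings
    for (long i = 0; i < frame.Rows.Count; i++)
    {
        sb.Append("<tr>");
        for (int j = 0; j < frame.Columns.Count; j++)
            sb.Append($"<td>{WebUtility.HtmlEncode(frame[i, j]?.ToString() ?? string.Empty)}</td>");
        sb.AppendLine("</tr>");
    }

    sb.AppendLine("</table>");
    return sb.ToString();
}

DataFrame people = new(
    new StringDataFrameColumn("Name", new[] { "Oliver", "Charlotte" }),
    new PrimitiveDataFrameColumn<int>("Age", new[] { 23, 19 }));

string html = ToHtml(people);
Console.WriteLine(html);
```

Encoding each cell with `WebUtility.HtmlEncode` keeps values that contain `<` or `&` from breaking the markup.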

## Using DataFrames in Interactive Notebooks

To get started using .NET workbooks, install the [.NET Interactive Notebooks extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode), create a new `demo.ipynb` file, then add your code.
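
A notebook has no project file, so the package is typically pulled in with a `#r "nuget: ..."` directive in the first cell. This is a sketch of such a cell, assuming the latest published version is acceptable. A specific version can be pinned by adding it after a comma inside the quotes.

```cs
// First cell of demo.ipynb: fetch the package from NuGet, then import its namespace
#r "nuget: Microsoft.Data.Analysis"
using Microsoft.Data.Analysis;
```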

Previously users had to create custom HTML formatters to properly display DataFrames in .NET Interactive Notebooks, but these days it works right out of the box.

> 💡 See [demo.html](demo.html) for a full length demonstration notebook

```cs
// visualize the DataFrame
df
```

![](dataframe-notebook.jpg)

## Append a Row

Build a new row from key/value pairs (column name and value), then append it to the DataFrame.

```cs
List<KeyValuePair<string, object>> newRowData = new()
{
    new KeyValuePair<string, object>("Name", "Scott"),
    new KeyValuePair<string, object>("Age", 36),
    new KeyValuePair<string, object>("Height", 1.65),
};

df.Append(newRowData, inPlace: true);
```
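
The `inPlace` argument decides whether the original DataFrame is modified. The sketch below uses the overload that takes plain values in column order, and assumes (as in the package versions I'm aware of) that omitting `inPlace: true` returns a modified copy and leaves the original alone.

```cs
using System;
using Microsoft.Data.Analysis;

DataFrame df = new(
    new StringDataFrameColumn("Name", new[] { "Oliver", "Charlotte" }),
    new PrimitiveDataFrameColumn<int>("Age", new[] { 23, 19 }),
    new PrimitiveDataFrameColumn<double>("Height", new[] { 1.91, 1.62 }));

// Values are given in column order (Name, Age, Height)
object[] scott = { "Scott", 36, 1.65 };

// Without inPlace: true, Append() returns a new DataFrame and leaves df untouched
DataFrame extended = df.Append(scott);
long rowsBefore = df.Rows.Count;
Console.WriteLine($"original: {rowsBefore} rows, copy: {extended.Rows.Count} rows");

// With inPlace: true, the row is added to df itself
df.Append(scott, inPlace: true);
Console.WriteLine($"original after in-place append: {df.Rows.Count} rows");
```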

## Add a Column

Build a new column, populate it with data, and add it to the DataFrame. The column needs one value for each row already in the DataFrame.

```cs
int[] weights = { 123, 321, 111, 121, 131 };
PrimitiveDataFrameColumn<int> weightCol = new("Weight", weights);
df.Columns.Add(weightCol);
```
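
Because a new column has to line up with the existing rows, it can help to size the data from the DataFrame rather than hard-coding its length. This is a sketch with made-up placeholder weights. The explicit length check is my own guard, not something the package requires you to write.

```cs
using System;
using System.Linq;
using Microsoft.Data.Analysis;

DataFrame df = new(
    new StringDataFrameColumn("Name", new[] { "Oliver", "Charlotte", "Henry" }),
    new PrimitiveDataFrameColumn<int>("Age", new[] { 23, 19, 42 }));

// Placeholder data sized from the DataFrame, so it still fits after rows are appended
int[] weights = Enumerable.Range(0, (int)df.Rows.Count).Select(i => 100 + i).ToArray();
PrimitiveDataFrameColumn<int> weightCol = new("Weight", weights);

// Guard against a length mismatch before handing the column to the DataFrame
if (weightCol.Length != df.Rows.Count)
    throw new InvalidOperationException($"Expected {df.Rows.Count} values but got {weightCol.Length}");

df.Columns.Add(weightCol);
Console.WriteLine(string.Join(", ", df.Columns.Select(c => c.Name)));
```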

## Sort and Filter

**The DataFrame class has numerous operations available** to sort, filter, and analyze data in many different ways. A popular pattern when working with DataFrames is to use _method chaining_ to combine numerous operations together into a single statement. See the [DataFrame Class API](https://docs.microsoft.com/en-us/dotnet/api/microsoft.data.analysis.dataframe) for a full list of available operations.

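```cs
// Filter first: the boolean mask is matched to rows by position,
// so it must be built from the same (unsorted) frame it filters.
df.Filter(df["Age"].ElementwiseGreaterThan(30))
    .OrderBy("Name")
    .PrettyPrint();
```

```text
Name    Age  Height
Amelia  64   1.57
Henry   42   1.72
Owen    35   1.85
```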
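
A filter mask is just a column of booleans that is applied row by row, so it is safest to build the mask from the same frame it filters and to sort afterwards. The sketch below spells the chain out with intermediate variables. The names `over30`, `filtered`, and `sorted` are mine.

```cs
using System;
using Microsoft.Data.Analysis;

DataFrame df = new(
    new StringDataFrameColumn("Name", new[] { "Oliver", "Charlotte", "Henry", "Amelia", "Owen" }),
    new PrimitiveDataFrameColumn<int>("Age", new[] { 23, 19, 42, 64, 35 }),
    new PrimitiveDataFrameColumn<double>("Height", new[] { 1.91, 1.62, 1.72, 1.57, 1.85 }));

// One boolean per row of df, true where Age > 30
PrimitiveDataFrameColumn<bool> over30 = df["Age"].ElementwiseGreaterThan(30);

// Apply the mask to the frame it was built from
DataFrame filtered = df.Filter(over30);

// Sort the already-filtered rows, oldest first
DataFrame sorted = filtered.OrderByDescending("Age");

for (long i = 0; i < sorted.Rows.Count; i++)
    Console.WriteLine($"{sorted["Name"][i]} ({sorted["Age"][i]})");
```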

## Mathematical Operations

It's easy to perform math on columns or across multiple DataFrames. In this example we will perform math using two columns and create a new column to hold the output.

```cs
DataFrameColumn iqCol = df["Age"] * df["Height"] * 1.5;

double[] iqs = Enumerable.Range(0, (int)iqCol.Length)
    .Select(x => (double)iqCol[x])
    .ToArray();

df.Columns.Add(new PrimitiveDataFrameColumn<double>("IQ", iqs));
df.PrettyPrint();
```

```text
Name       Age  Height  IQ
Oliver     23   1.91    65.9
Charlotte  19   1.62    46.17
Henry      42   1.72    108.36
Amelia     64   1.57    150.72
Owen       35   1.85    97.12
```
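
The intermediate `double[]` is not strictly required. As a sketch, assuming `SetName()` is available in your package version, the computed column can be renamed and added directly:

```cs
using System;
using Microsoft.Data.Analysis;

DataFrame df = new(
    new StringDataFrameColumn("Name", new[] { "Oliver", "Charlotte" }),
    new PrimitiveDataFrameColumn<int>("Age", new[] { 23, 19 }),
    new PrimitiveDataFrameColumn<double>("Height", new[] { 1.91, 1.62 }));

// Arithmetic operators work element by element and return a new column
DataFrameColumn iqCol = df["Age"] * df["Height"] * 1.5;

// Give the result its own name, then add the column object itself
iqCol.SetName("IQ");
df.Columns.Add(iqCol);

Console.WriteLine($"{df["Name"][1]}: {df["IQ"][1]}");
```

Renaming before adding matters because column names within a DataFrame must be unique.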

## Statistical Operations

You can iterate across every row of a column to calculate population statistics.

```cs
foreach (DataFrameColumn col in df.Columns.Skip(1))
{
    // warning: additional care must be taken for datasets which contain null
    double[] values = Enumerable.Range(0, (int)col.Length).Select(x => Convert.ToDouble(col[x])).ToArray();
    (double mean, double std) = MeanAndStd(values);
    Console.WriteLine($"{col.Name} = {mean} +/- {std:N3} (n={values.Length})");
}
```

```text
Age = 36.6 +/- 15.982 (n=5)
Height = 1.734 +/- 0.130 (n=5)
```


<details>
<summary>💡 View the full <code>MeanAndStd()</code> source code</summary>

```cs
private static (double mean, double std) MeanAndStd(double[] values)
{
	if (values is null)
		throw new ArgumentNullException(nameof(values));

	if (values.Length == 0)
		throw new ArgumentException($"{nameof(values)} must not be empty");

	double sum = 0;
	for (int i = 0; i < values.Length; i++)
		sum += values[i];

	double mean = sum / values.Length;

	double sumVariancesSquared = 0;
	for (int i = 0; i < values.Length; i++)
	{
		double pointVariance = Math.Abs(mean - values[i]);
		double pointVarianceSquared = Math.Pow(pointVariance, 2);
		sumVariancesSquared += pointVarianceSquared;
	}

	double meanVarianceSquared = sumVariancesSquared / values.Length;
	double std = Math.Sqrt(meanVarianceSquared);

	return (mean, std);
}
```

</details>
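
`MeanAndStd()` divides by n, which gives the population standard deviation. When the rows are a sample drawn from a larger population, the sample standard deviation (dividing by n - 1) is the more common estimate. `SampleStd` below is a hypothetical companion function, shown as a sketch using the ages from this article.

```cs
using System;
using System.Linq;

// Hypothetical companion to MeanAndStd(): sample standard deviation (divides by n - 1)
double SampleStd(double[] values)
{
    if (values is null)
        throw new ArgumentNullException(nameof(values));

    if (values.Length < 2)
        throw new ArgumentException($"{nameof(values)} must contain at least two values");

    double mean = values.Average();
    double sumSquaredDeviations = values.Sum(v => Math.Pow(v - mean, 2));
    return Math.Sqrt(sumSquaredDeviations / (values.Length - 1));
}

double[] sampleAges = { 23, 19, 42, 64, 35 };
double sampleStd = SampleStd(sampleAges);
Console.WriteLine($"Age sample std = {sampleStd:N3} (n={sampleAges.Length})");
```

The sample value is always somewhat larger than the population value for the same data, and the gap shrinks as n grows.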

## Plot Values from a DataFrame

**I use [ScottPlot.NET](https://scottplot.net) to visualize data from DataFrames in .NET applications and .NET Interactive Notebooks.** ScottPlot can generate a variety of plot types and has many options for customization. See [the ScottPlot Cookbook](https://scottplot.net/cookbook/4.1/) for examples and API documentation.

```cs
// Register a custom formatter to display ScottPlot plots as images
using Microsoft.DotNet.Interactive.Formatting;
Formatter.Register(typeof(ScottPlot.Plot), (plt, writer) => 
    writer.Write(((ScottPlot.Plot)plt).GetImageHTML()), HtmlFormatter.MimeType);
```

```cs
// Get the data you wish to display in double arrays
double[] ages = Enumerable.Range(0, (int)df.Rows.Count).Select(x => Convert.ToDouble(df["Age"][x])).ToArray();
double[] heights = Enumerable.Range(0, (int)df.Rows.Count).Select(x => Convert.ToDouble(df["Height"][x])).ToArray();
```

```cs
// Create and display a plot
var plt = new ScottPlot.Plot(400, 300);
plt.AddScatter(ages, heights);
plt.XLabel("Age");
plt.YLabel("Height");
plt
```

![](scottplot-notebook.png)

> 💡 See [demo.html](demo.html) for a full length demonstration notebook

If you are only working inside a Notebook and you want all your plots to be HTML and JavaScript, [XPlot.Plotly](https://towardsdatascience.com/getting-started-with-c-dataframe-and-xplot-ploty-6ea6ce0ce8e3) is a good tool to use.

## Data may contain null

I didn't demonstrate it in the code examples above, but note that all column data types are nullable. While null-containing data requires extra consideration when writing mathematical routines, it's a convenient way to model missing data, which is a common occurrence in the real world.
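
As a small sketch of what that extra consideration can look like, the column below is built from a nullable array, and the missing values are skipped before any math is done. The data is made up for illustration.

```cs
using System;
using System.Linq;
using Microsoft.Data.Analysis;

// A nullable array models missing observations
int?[] rawAges = { 23, null, 42, null, 35 };
PrimitiveDataFrameColumn<int> ageCol = new("Age", rawAges);

Console.WriteLine($"{ageCol.NullCount} of {ageCol.Length} values are missing");

// The typed column enumerates as int?, so missing values can be skipped before doing math
double[] known = ageCol
    .Where(age => age.HasValue)
    .Select(age => (double)age.Value)
    .ToArray();

Console.WriteLine($"mean of known ages = {known.Average():N2} (n={known.Length})");
```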

## Why not just use LINQ?

I see this question asked frequently, often with an aggressive and condescending tone. LINQ (Language-Integrated Query) is fantastic for performing logical operations on simple collections of data. When you have large 2D datasets of _labeled_ data, the advantages of data frames over flat LINQ statements start to become apparent. It is also easy to perform logical operations across multiple data frames, allowing users to write simpler and more readable code than could be achieved with LINQ statements. Data frames also make it much easier to visualize complex data. In the data science world, where complex labeled datasets are routinely compared, manipulated, merged, and visualized, often in an interactive context, data frames are much easier to work with than raw LINQ statements.

## Conclusions

Although I typically reach for Python to perform exploratory data science, it's good to know that C# has a DataFrame available and that it can be used to inspect and manipulate tabular data. DataFrames pair well with [ScottPlot](https://scottplot.net) figures in interactive notebooks and are a great way to inspect and communicate complex data. I look forward to watching Microsoft's Data Analysis namespace continue to evolve as part of their machine learning / ML.NET platform.

## Resources

* [Example notebook for this project](demo.html)

* [Source code for this project](https://github.com/swharden/Csharp-Data-Visualization/tree/main/projects/dataframe)

* [Official `Microsoft.Data.Analysis.DataFrame` Class Documentation](https://docs.microsoft.com/en-us/dotnet/api/microsoft.data.analysis.dataframe)

* [Microsoft.Data.Analysis source code](https://github.com/dotnet/machinelearning/tree/main/src/Microsoft.Data.Analysis) 

* [An Introduction to DataFrame](https://devblogs.microsoft.com/dotnet/an-introduction-to-dataframe/) (.NET Blog)

* [ExtremeOptimization DataFrame Quickstart](https://www.extremeoptimization.com/QuickStart/CSharp/DataFrames.aspx)

* [`Microsoft.Data.Analysis` on NuGet](https://www.nuget.org/packages/Microsoft.Data.Analysis/)

* [Getting Started With C# DataFrame and XPlot.Plotly](https://towardsdatascience.com/getting-started-with-c-dataframe-and-xplot-ploty-6ea6ce0ce8e3)

* [10 minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html)

April 24th, 2022

Markdown source code last modified on April 30th, 2022
---
title: FTP Deploy with GitHub Actions
description: Deploy content over FTP using GitHub Actions and no dependencies
date: 2022-04-24 16:45:00
tags: GitHub
---

# FTP Deploy with GitHub Actions

**This article describes how I use GitHub Actions to deploy content using FTP without any third-party dependencies.** Code executed in continuous deployment pipelines may have access to secrets (like FTP credentials and SSH keys). Supply-chain attacks are becoming more frequent, including self-sabotage by open-source authors. Without 2FA, the code of well-intentioned maintainers is one stolen password away from becoming malicious. For these reasons I find it imperative to eliminate third-party Actions from my CI/CD pipelines wherever possible.

> ⚠️ **WARNING: Third-party Actions in the GitHub Actions Marketplace may be compromised to run malicious code and leak secrets.** There are [dozens of public actions](https://github.com/marketplace?category=&query=ftp+sort%3Apopularity-desc&type=actions) claiming to facilitate FTP deployment. I advise avoiding third-party actions in your CI/CD pipeline whenever possible.

This article assumes you have at least some familiarity with GitHub Actions, but if you've never used them before I recommend taking 5 minutes to work through the [Quickstart for GitHub Actions](https://docs.github.com/en/actions/quickstart).

## FTP Deployment Workflow

**This workflow demonstrates how to use LFTP inside a GitHub Action to transfer files/folders with FTP without requiring a third-party dependency.** Users can copy/paste this workflow and edit it as needed according to the [LFTP manual](https://lftp.yar.ru/lftp-man.html).

```yaml
name: 🚀 FTP Deploy
on: [push, workflow_dispatch]
jobs:
  ftp-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: 🛒 Checkout
        uses: actions/checkout@v2
      - name: 📦 Get LFTP
        run: sudo apt install lftp
      - name: 🛠️ Configure LFTP
        run: mkdir ~/.lftp && echo "set ssl:verify-certificate false;" >> ~/.lftp/rc
      - name: 🔑 Load Secrets
        run: echo "machine ${{ secrets.FTP_HOSTNAME }} login ${{ secrets.FTP_USERNAME }} password ${{ secrets.FTP_PASSWORD }}" > ~/.netrc
      - name: 📄 Upload File
        run: lftp -e "put -O /destination/ ./README.md" ${{ secrets.FTP_HOSTNAME }}
      - name: 📁 Upload Folder
        run: lftp -e "mirror --parallel=100 -R ./ffmpeg/ /ffmpeg/" ${{ secrets.FTP_HOSTNAME }}
```

This workflow uses [GitHub Encrypted Secrets](https://docs.github.com/en/actions/security-guides/encrypted-secrets) to store secret values:

* `FTP_HOSTNAME` - a string like `ftp.example.com`
* `FTP_USERNAME` - a string like `login@example.com`
* `FTP_PASSWORD` - a string like `superSecret123`

<img src="github-actions-ftp.jpg" class="d-block border shadow my-5 mx-auto" />

## How to Verify the Host Certificate

Extra steps can be taken to record the host's public certificate, store it as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets), load it into the GitHub Action runner, and configure LFTP to compare against it at run time.

* 1: Acquire your host's _entire_ certificate chain. The `-showcerts` argument was critically important for me.

```bash
openssl s_client -connect example.com:21 -starttls ftp -showcerts
```

* 2: Copy the _entire_ output, [convert it to a Base64 string](https://emn178.github.io/online-tools/base64_encode.html), and store it as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets) named `FTP_CERTS_BASE64`

* 3: Update your GitHub Action to save the certificate file and configure LFTP to use it:

```yaml
      - name: 🛠️ Configure LFTP
        run: |
          mkdir ~/.lftp
          echo "set ssl:ca-file ~/.lftp/certs.crt;set ssl:check-hostname no;" >> ~/.lftp/rc
          echo "${{ secrets.FTP_CERTS_BASE64 }}" | base64 --decode > ~/.lftp/certs.crt
```

## Notes

**To avoid storing passwords to disk** you can pass them in with each `lftp` command using the `-u` argument. See the [LFTP Documentation](https://lftp.yar.ru/lftp-man.html) for details.

**Although potentially insecure, some GitHub Marketplace Actions offer compelling features:** One of the most popular is [SamKirkland's FTP Deploy Action](https://github.com/SamKirkland/FTP-Deploy-Action) which has advanced features like the use of server-stored JSON files to store file hashes to detect and selectively re-upload changed files. I encourage you to check them out, even though I try to avoid passing my secrets through third-party actions wherever possible.

**Favor SSH and `rsync` over FTP and `lftp` where possible** because `rsync` is faster, more secure, and designed to prevent needless transfer of unchanged files. I recently wrote about [how to safely deploy over SSH using rsync with GitHub Actions](https://swharden.com/blog/2022-03-20-github-actions-hugo/).

## Resources
* [LFTP project on GitHub](https://github.com/lavv17/lftp)
* [LFTP Documentation](https://lftp.yar.ru/lftp-man.html)
* [GitHub Actions: Build and deploy a Hugo site](https://swharden.com/blog/2022-03-20-github-actions-hugo/)
* [GitHub Actions: How to deploy over SSH using rsync](https://swharden.com/blog/2022-03-20-github-actions-hugo/)
* [GitHub Encrypted Secrets](https://docs.github.com/en/actions/security-guides/encrypted-secrets)
* [GNU Manual: The .netrc file](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html)
* [SSL Checker: Certificate Decoder](https://www.sslchecker.com/certdecoder)
April 12th, 2022

GitHub Repository Badge

I created a badge to dynamically display stats for any public GitHub repository using HTML and Vanilla JavaScript. I designed it so anyone can have their own badge by copying two lines of HTML into their website.

I don't write web frontend code often, so after getting this idea I decided to see how far I could take it. I treated this little project as an opportunity to get some experience exploring a stack I don't interact with often, and to see if I could take it all the way to something that would look nice and scale infinitely for free. This article documents what I learned along the way.

<!-- paste anywhere in your site -->
<a href="http://github.com/USER/REPO" id="github-stats-badge">GitHub</a>
<script src="https://swharden.github.io/repo-badge/badge.js" defer></script>

How it Works

  • Because the defer attribute is set on the script element, the JavaScript will not run until the document has been parsed. This ensures all the elements it will interact with are present in memory before it starts editing the DOM. Note that the HTML added by the user is a link to the GitHub project, so even if the JS fails completely this link is still functional and useful.

  • The a with id github-stats-badge is identified and the href is read to determine the user and name of the repository to display on the badge.

  • CSS is assembled in a style element and appended to the head.

  • JavaScript deletes the content of the original a and replaces it with nested div, a, and span elements to build the badge in the DOM dynamically. Each stats block is hidden by setting its opacity to zero, preventing the user from seeing elements before they are filled with real data. This also fills out the dimensions of the badge to prevent the page from shifting as its components are loaded individually.

  • Asynchronous requests are sent to GitHub's RESTful API endpoints using fetch() and the JSON responses are parsed to get the latest release tag, star count, and number of forks.

  • Information from the API is loaded into span elements and the opacity is set to one (with CSS transitions) so it fades in after the HTTP request returns a valid result. The fade-in effect makes the delayed appearance seem intentional, when in reality it's just buying time for the HTTP request to complete its round-trip. Without this fade, the rapid appearance of text (or the replacement of dummy text with real values) is much more jarring.

Example Fetch

I expect the HTTP request to return a JSON document with a tag_name element, but if not I build my own object containing this element (filled with dummy data) and pass it along.

The display code (which sets the text, increases opacity, and sets the link) doesn't actually know whether the request succeeded or failed.

This is how I ensure the badge is always left in a presentable state.

fetch(`https://api.github.com/repos/${user}/${repo}/releases/latest`)
    .then(response => { 
        return response.ok ? response.json() : { "tag_name": "none" };
    })
    .then(data => {
        const tag = document.getElementById('github-stats-badge--tag');
        tag.getElementsByTagName("span")[0].innerText = data.tag_name;
        tag.style.opacity = 1;
        tag.href = repoLinkUrl + "/releases";
    });

Fading

I don't use CSS fading that often, but I found it produced a fantastic result here. Here's the magic bit of CSS that enables fading effects as JavaScript twiddles the opacity:

#github-stats-badge a {
    color: black;
    text-decoration: none;
    opacity: 0;
    transition: opacity .5s ease-in-out;
}

#github-stats-badge a:hover {
    color: #003366;
}

SVG Icons

GitHub has official MIT-licensed icons available as SVG files. These are fantastic because you can view their source and it's plain text! You can copy that plain text directly into an HTML document, or in my case wrap it in JavaScript so I can serve it dynamically.

I store the path attribute contents as a JavaScript string like this:

const githubStatusBadge_tagPath = "M2.5 7.775V2.75a.25.25 0 01.25-.25h5.025a.25.25 0 01.177.073l6.25 \
    6.25a.25.25 0 010 .354l-5.025 5.025a.25.25 0 01-.354 0l-6.25-6.25a.25.25 0 01-.073-.177zm-1.5 0V2.75C1 \
    1.784 1.784 1 2.75 1h5.025c.464 0 .91.184 1.238.513l6.25 6.25a1.75 1.75 0 010 2.474l-5.026 5.026a1.75 \
    1.75 0 01-2.474 0l-6.25-6.25A1.75 1.75 0 011 7.775zM6 5a1 1 0 100 2 1 1 0 000-2z";

Then I create a function to build an SVG image from a path:

function githubStatusBadge_createSVG(svgPath) {
    const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
    svg.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:xlink", "http://www.w3.org/1999/xlink");
    svg.setAttribute('width', '16');
    svg.setAttribute('height', '16');
    svg.setAttribute('viewBox', '0 0 16 16');
    svg.style.verticalAlign = 'bottom';
    svg.style.marginRight = "2px";

    const path = document.createElementNS("http://www.w3.org/2000/svg", 'path');
    path.setAttribute('fill-rule', 'evenodd');
    path.setAttribute('d', svgPath);
    svg.appendChild(path);

    return svg;
}

Note that the NS method and xmlns attribute are critical for SVG elements to work in the browser. For more information check out Mozilla's Namespaces crash course.

Minification

The non-minified plain-text JavaScript file is less than 8 kB. This could be improved by minification and/or gzip compression, but I may continue to choose not to do this.

I appreciate HTML and JS which is human readable, especially when it was written by hand. Perhaps a good compromise would be to offer badge.js and badge.min.js, but even this would add complexity by necessitating a build step which is not currently required.

GitHub Pages

I organized this project so it could be served using GitHub Pages. Basically you just check a box on the GitHub repository settings page, then docs/index.html will be displayed when you go to USER.github.io/REPO in a browser. Building/publishing is performed automatically using GitHub Actions, and it works immediately without having to manually create a workflow yaml file.

Although GitHub Pages supports a fancy markdown-based flat-file static website generation using Jekyll, I chose to create a project page using hand-crafted HTML, CSS, and Vanilla JS with no framework or build system. Web0 for the win!

GitHub stores and serves the content (with edge caching) so I'm protected in the unlikely case where this project goes viral and millions of people start downloading my JavaScript file. GitHub will scale horizontally as needed to infinity to meet the demand from increased traffic, and all the services I'm using are free.

New Website Checklist

Although the project page is simple, I wanted it to look nice. There are so many things to consider when making a new webpage! Here are a few that make my list, and most of them don't apply to this small one-page website but I thought I'd share my whole list anyway.

  • ✔️ Populate title and meta description
  • ✔️ Add metric analysis (Google Analytics)
  • ❌ Add ads where appropriate (Google AdSense)
  • ❌ Add an RSS feed
  • ❌ Add a sitemap
  • ❌ Create a custom 404 page
  • ❌ Place noindex attributes on special pages
  • ✔️ Create a 32x32 transparent favicon.ico
  • ❌ Create additional favicons
  • ✔️ Create a 1200 x 630 px Open Graph image
  • ✔️ Add twitter and facebook cards
  • ✔️ Verify OG previews look good using opengraph.xyz
  • ✔️ Confirm the site looks good on mobile (chrome dev tools)
  • ✔️ Set the meta theme-color to color the mobile address bar
  • ❌ Define 404 and permissions in .htaccess
  • ✔️ Check accessibility and performance in LightHouse

Here's the Open Graph banner I came up with:

Conclusions

Altogether the project page looks great and the badge seems to function as expected! I'll continue to watch the repository so if anyone opens an issue or creates a pull request offering improvements I will be happy to review it.

This little Vanilla JS project touched a lot of interesting corners of web frontend development, and I'm happy I got to explore them today!

If you like this project, give it a star! 🌟

Resources

Markdown source code last modified on April 13th, 2022
---
title: GitHub Repository Badge
description: What I learned creating a github repo stats badge using HTML and Vanilla JS
date: 2022-04-12 23:00:00
tags: JavaScript, GitHub
---

# GitHub Repository Badge

**I created a badge to dynamically display stats for any public GitHub repository using HTML and Vanilla JavaScript.** I designed it so anyone can have their own badge by copying two lines of HTML into their website. 

I don't write web frontend code often, so after getting this idea I decided to see how far I could take it. I treated this little project as an opportunity to get some experience exploring a stack I don't interact with often, and to see if I could take it all the way to something that would look nice and scale infinitely for free. This article documents what I learned along the way.

<div class="text-center my-5">

<a href="http://github.com/ScottPlot/ScottPlot" id="github-stats-badge">GitHub</a>
<script src="https://swharden.github.io/repo-badge/badge.js" defer></script>

<a href='https://swharden.github.io/repo-badge/'>swharden.github.io/repo-badge</a>

</div>

```html
<!-- paste anywhere in your site -->
<a href="http://github.com/USER/REPO" id="github-stats-badge">GitHub</a>
<script src="https://swharden.github.io/repo-badge/badge.js" defer></script>
```

## How it Works

* Because the [`defer` attribute](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script#attr-defer) is set on the `script` element, the JavaScript will not run until the document has been parsed. This ensures all the elements it will interact with are present in memory before it starts editing the DOM. Note that the HTML added by the user is a link to the GitHub project, so even if the JS fails completely this link is still functional and useful.

* The `a` with id `github-stats-badge` is identified and the `href` is read to determine the user and name of the repository to display on the badge.

* CSS is assembled in a `style` element and appended to the `head`.

* JavaScript deletes the content of the original `a` and replaces it with nested `div`, `a`, and `span` elements to build the badge in the DOM dynamically. Each stats block is hidden by setting its `opacity` to zero, preventing the user from seeing elements before they are filled with real data. This also fills out the dimensions of the badge to prevent the page from shifting as its components are loaded individually.

* Asynchronous requests are sent to [GitHub's RESTful API](https://docs.github.com/en/rest) endpoints using [`fetch()`](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch) and the JSON responses are parsed to get the latest release tag, star count, and number of forks:
  * https://api.github.com/repos/USER/REPO
  * https://api.github.com/repos/USER/REPO/releases/latest

* Information from the API is loaded into `span` elements and the `opacity` is set to one (with [CSS transitions](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Transitions/Using_CSS_transitions)) so it fades in _after_ the HTTP request returns a valid result. The fade-in effect makes the delayed appearance seem intentional, when in reality it's just buying time for the HTTP request to complete its round-trip. Without this fade, the rapid appearance of text (or the replacement of dummy text with real values) is much more jarring.
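
The second step in the list above (reading the user and repository name from the `href`) can be sketched as follows. This is a minimal illustration rather than the code in `badge.js`: the function name `parseGitHubRepoUrl` is hypothetical, and it assumes the link has the form `github.com/USER/REPO`.

```js
// Hypothetical helper: extract the user and repository names from a GitHub URL.
function parseGitHubRepoUrl(href) {
    const url = new URL(href);
    // The pathname looks like "/USER/REPO", possibly with a trailing slash.
    const [user, repo] = url.pathname.split("/").filter(part => part.length > 0);
    if (!user || !repo) {
        throw new Error(`Not a repository URL: ${href}`);
    }
    return { user, repo };
}

// Build the two API endpoints listed above from the link's href.
const { user, repo } = parseGitHubRepoUrl("http://github.com/ScottPlot/ScottPlot");
const repoApiUrl = `https://api.github.com/repos/${user}/${repo}`;
const latestReleaseApiUrl = `${repoApiUrl}/releases/latest`;
```

Throwing on an unexpected URL is one possible design choice; because the original link stays in the page, a failure here would still leave visitors with a working link to the project.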

### Example Fetch

I expect the HTTP request to return a JSON document with a `tag_name` element, but if not I build my own object containing this element (filled with dummy data) and pass it along.

The display code (which sets the text, increases opacity, and sets the link) doesn't actually know whether the request succeeded or failed.

This is how I ensure the badge is always left in a presentable state.

```js
fetch(`https://api.github.com/repos/${user}/${repo}/releases/latest`)
    .then(response => { 
        return response.ok ? response.json() : { "tag_name": "none" };
    })
    .then(data => {
        const tag = document.getElementById('github-stats-badge--tag');
        tag.getElementsByTagName("span")[0].innerText = data.tag_name;
        tag.style.opacity = 1;
        tag.href = repoLinkUrl + "/releases";
    });
```
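
One caveat with this pattern: `response.ok` is only checked when a response actually arrives. If the request itself fails (for example, the visitor is offline) the promise returned by `fetch()` rejects and the display code never runs. The sketch below shows one possible way to cover that case too. The name `fetchJsonOrDefault` and its `fetchFn` parameter are invented for this illustration and are not part of `badge.js`.

```js
// Hypothetical helper: resolve to the parsed JSON, or to `fallback` if anything goes wrong.
// `fetchFn` defaults to the global fetch but can be replaced with a stub for testing.
function fetchJsonOrDefault(url, fallback, fetchFn = fetch) {
    return fetchFn(url)
        .then(response => response.ok ? response.json() : fallback)
        .catch(() => fallback); // network failures and invalid JSON both reject
}
```

With a helper like this, the display code could be chained onto `fetchJsonOrDefault(url, { "tag_name": "none" })` and would receive an object with a `tag_name` whether the request succeeded, returned an error status, or never completed.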

### Fading

I don't use CSS fading that often, but I found it produced a fantastic result here. Here's the magic bit of CSS that enables fading effects as JavaScript twiddles the `opacity`:

```css
#github-stats-badge a {
    color: black;
    text-decoration: none;
    opacity: 0;
    transition: opacity .5s ease-in-out;
}

#github-stats-badge a:hover {
    color: #003366;
}
```

## SVG Icons

GitHub has official MIT-licensed icons available as SVG files. These are fantastic because you can view their source and it's plain text! You can copy that plain text directly into an HTML document, or in my case wrap it in JavaScript so I can serve it dynamically.

* https://github.com/primer/octicons/

I store the `path` attribute contents as a JavaScript string like this:

```js
const githubStatusBadge_tagPath = "M2.5 7.775V2.75a.25.25 0 01.25-.25h5.025a.25.25 0 01.177.073l6.25 \
    6.25a.25.25 0 010 .354l-5.025 5.025a.25.25 0 01-.354 0l-6.25-6.25a.25.25 0 01-.073-.177zm-1.5 0V2.75C1 \
    1.784 1.784 1 2.75 1h5.025c.464 0 .91.184 1.238.513l6.25 6.25a1.75 1.75 0 010 2.474l-5.026 5.026a1.75 \
    1.75 0 01-2.474 0l-6.25-6.25A1.75 1.75 0 011 7.775zM6 5a1 1 0 100 2 1 1 0 000-2z";
```

Then I create a function to build an SVG image from a `path`:

```js
function githubStatusBadge_createSVG(svgPath) {
    const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
    svg.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:xlink", "http://www.w3.org/1999/xlink");
    svg.setAttribute('width', '16');
    svg.setAttribute('height', '16');
    svg.setAttribute('viewBox', '0 0 16 16');
    svg.style.verticalAlign = 'bottom';
    svg.style.marginRight = "2px";

    const path = document.createElementNS("http://www.w3.org/2000/svg", 'path');
    path.setAttribute('fill-rule', 'evenodd');
    path.setAttribute('d', svgPath);
    svg.appendChild(path);

    return svg;
}
```

Note that the `NS` method and `xmlns` attribute are critical for SVG elements to work in the browser. For more information check out Mozilla's [Namespaces crash course](https://developer.mozilla.org/en-US/docs/Web/SVG/Namespaces_Crash_Course).

## Minification

The non-minified plain-text JavaScript file is less than 8 kB. This could be improved by minification and/or gzip compression, but I may continue to choose not to do this.

I appreciate HTML and JS which is human readable, especially when it was written by hand. Perhaps a good compromise would be to offer `badge.js` and `badge.min.js`, but even this would add complexity by necessitating a build step which is not currently required.

## GitHub Pages

I organized this project so it could be served using [GitHub Pages](https://pages.github.com/). Basically you just check a box on the GitHub repository settings page, then `docs/index.html` will be displayed when you go to `USER.github.io/REPO` in a browser. Building/publishing is performed automatically using GitHub Actions, and it works immediately without having to manually create a workflow yaml file.

Although GitHub Pages supports a fancy markdown-based flat-file static website generation using [Jekyll](https://jekyllrb.com/), I chose to create a project page using hand-crafted HTML, CSS, and Vanilla JS with no framework or build system. [Web0](https://web0.small-web.org/) for the win!

GitHub stores and serves the content (with edge caching) so I'm protected in the unlikely case where this project goes viral and millions of people start downloading my JavaScript file. GitHub will scale horizontally as needed to infinity to meet the demand from increased traffic, and all the services I'm using are free.

## New Website Checklist

Although the project page is simple, I wanted it to look nice. There are so many things to consider when making a new webpage! Here are a few that make my list, and most of them don't apply to this small one-page website but I thought I'd share my whole list anyway.

* ✔️ Populate `title` and `meta description`
* ✔️ Add metric analysis (Google Analytics)
* ❌ Add ads where appropriate (Google AdSense)
* ❌ Add an RSS feed
* ❌ Add a sitemap
* ❌ Create a custom 404 page
* ❌ Place `noindex` attributes on special pages
* ✔️ Create a 32x32 transparent `favicon.ico`
* ❌ Create [additional favicons](https://evilmartians.com/chronicles/how-to-favicon-in-2021-six-files-that-fit-most-needs)
* ✔️ Create a 1200 x 630 px [Open Graph image](https://ogp.me/)
* ✔️ Add twitter and facebook cards
* ✔️ Verify OG previews look good using [opengraph.xyz](https://www.opengraph.xyz/)
* ✔️ Confirm the site looks good on mobile (chrome dev tools)
* ✔️ Set the meta `theme-color` to color the mobile address bar
* ❌ Define 404 and permissions in `.htaccess`
* ✔️ Check accessibility and performance in LightHouse

Here's the Open Graph banner I came up with:

<img src="banner.png" class="d-inline-block mx-auto">

## Conclusions

**Altogether the project page looks great and the badge seems to function as expected!** I'll continue to watch the repository so if anyone opens an issue or creates a pull request offering improvements I will be happy to review it.

This little Vanilla JS project touched a lot of interesting corners of web frontend development, and I'm happy I got to explore them today!

If you like this project, [give it a star! 🌟](https://github.com/swharden/repo-badge)

## Resources
* [GitHub Repo Badge Website](https://swharden.github.io/repo-badge/)
* [GitHub Repo Badge GitHub Project](https://github.com/swharden/repo-badge)
* This project was inspired by [GitHub Buttons](https://buttons.github.io)
April 4th, 2022

Mystify your Mind with SkiaSharp

This article explores my recreation of the classic screensaver Mystify your Mind implemented using C#. I used SkiaSharp to draw graphics and FFMpegCore to encode frames into high definition video files suitable for YouTube.

The Mystify Sandbox application has advanced options allowing exploration of various configurations outside the capabilities of the original screensaver. Interesting configurations can be exported as video (x264-encoded MP4 or WebM format) or viewed in full-screen mode resembling an actual screensaver.

Download

Programming Strategy

  • Corner - tracks a point that bounces around the edges of the screen
    • Has Position and Velocity fields
    • Has Advance() to move the point and collide with edges
  • Wire - represents a single polygon that moves around the screen
    • Contains List<Corner> and a Color which all change over time
    • Has Advance() which advances all corners and cycles Color
    • Contains List<WireSnapshot> to record history
  • WireSnapshot - represents properties of a Wire at an instant in time
    • Contains Point[] and Color and is intended to be immutable
    • Can draw itself using a Draw() method that accepts a SKCanvas
  • Field - represents the whole animation
    • Contains List<Wire> and has Width and Height
    • Has Advance() which advances all wires
    • Can draw itself using a Draw() method that accepts a SKCanvas

Original Behavior

Close inspection of video from the original Mystify screensaver revealed notable behaviors.

Broken Lines

The original Mystify implementation did not clear the screen between frames. With GDI, large fills (clearing the background) are expensive, and drawing many polygons probably challenged performance in the 90s. Instead only the leading wire was drawn, and the trailing wire was drawn-over using black. This strategy results in lines which appear to have single pixel breaks on a black background (magenta arrow). It may not have been particularly visible on CRT monitors available in the 90s, but it is quite noticeable on LCD screens today.

Bouncing Changes Speed

Observing videos of the classic screensaver I noticed that corners don't bounce symmetrically off edges. After every bounce they change their speed slightly. This can be seen by observing the history of corners which reflect off edges of the screen demonstrating their change in speed (green arrow). I recreated this behavior using a weighted random number generator.

Programming Notes

Color Cycling

I used an HSL-to-RGB method to generate colors from hue (variable), saturation (always 100%), and luminosity (always 50%). By repeatedly ramping hue from 0% to 100% slowly I achieved a rainbow gradient effect. Increasing the color change speed (% change for every new wire) cycles the colors faster, and very high values produce polygons whose visible history spans a gradient of colors. A fade effect is achieved by increasing the alpha of wire snapshots as they are drawn from old to new.

Encoding video with C#

The FFMpegCore package is a C# wrapper for FFMpeg that can encode video from frames piped into it. Using this strategy required creation of a SkiaSharp.SKBitmap wrapper that implements FFMpegCore.Pipes.IVideoFrame. For a full explanation and example code see C# Data Visualization: Render Video with SkiaSharp.

Performance

It's amusing to see retro screensavers running on modern gear! I can run this graphics model simulation at full-screen resolutions using thousands of wires at real-time frame rates. The most natural density of shapes for my 3440x1440 display was 20 wires with a history of 5.

Rendering the 2D image and encoding HD video using the x264 codec occupies all my CPU cores and runs a little above 500 frames per second. Encoding 24 hours of video (over 2 million frames) took this system 1 hour and 12 minutes and produced a 15.3 GB MP4 file. Encoding WebM format is considerably slower, with the same system only achieving an encoding rate of 12 frames per second.

Simulations

Traditional Behavior

The classic screensaver is typically run with two 4-cornered polygons that slowly change color.

Rainbow

Increasing the rate of color transition produces a rainbow effect within the visible history of polygons. The effect is made more striking by increasing the history length and decreasing the speed so the historical lines are closer together.

Solid

If the speed is greatly decreased and the number of historical records is greatly increased the resulting shape has little or no gap between historical traces and appears like a solid object. If fading is enabled (where opacity of older traces fades to transparent) the resulting effect is very interesting.

Chaos

Adding 100 shapes produces a chaotic but interesting effect. This may be the first time the world has seen Mystify like this!

EDIT: All these lines are very stressful on the video encoder and produce large file sizes to achieve high quality (25 MB for 10 seconds). I'm showing this one as a JPEG but click here to view mystify-100.webm if you're on a good internet connection.

YouTube

Resources

Markdown source code last modified on April 9th, 2022
---
title: Mystify your Mind with SkiaSharp
description: My implementation of the classic screensaver using SkiaSharp, OpenGL, and FFMpeg
date: 2022-04-04 18:34:00
tags: csharp, graphics
---

# Mystify your Mind with SkiaSharp

**This article explores my recreation of the classic screensaver _Mystify your Mind_ implemented using C#.** I used [SkiaSharp](https://github.com/mono/SkiaSharp) to draw graphics and [FFMpegCore](https://github.com/rosenbjerg/FFMpegCore) to encode frames into high definition video files suitable for YouTube.

<div class="text-center">

![](mystify.gif)

</div>

**The Mystify Sandbox application has advanced options** allowing exploration of various configurations outside the capabilities of the original screensaver. Interesting configurations can be exported as video (x264-encoded MP4 or WebM format) or viewed in full-screen mode resembling an actual screensaver. 

![](mystify-advanced.jpg)

## Download
* The [Releases page](https://github.com/swharden/Mystify/releases) has a click-to-run EXE for Windows
* [GitHub.com/swharden/Mystify](https://github.com/swharden/Mystify/) contains project source code (C#/.NET6)

## Programming Strategy

* `Corner` - tracks a point that bounces around the edges of the screen
  * Has `Position` and `Velocity` fields
  * Has `Advance()` to move the point and collide with edges
* `Wire` - represents a single polygon that moves around the screen
  * Contains `List<Corner>` and a `Color` which all change over time
  * Has `Advance()` which advances all corners and cycles `Color`
  * Contains `List<WireSnapshot>` to record history
* `WireSnapshot` - represents properties of a `Wire` at an instant in time
  * Contains `Point[]` and `Color` and is intended to be immutable
  * Can draw itself using a `Draw()` method that accepts a `SKCanvas`
* `Field` - represents the whole animation
  * Contains `List<Wire>` and has `Width` and `Height`
  * Has `Advance()` which advances all wires
  * Can draw itself using a `Draw()` method that accepts a `SKCanvas`
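
To make the role of `Corner` more concrete, here is a minimal sketch of the bounce logic. It is written in JavaScript for brevity rather than the project's C#, it is not the project's actual code, and the ±10% speed change with a uniform random number is an assumption standing in for the weighted random generator used in the real application.

```js
// Hypothetical sketch of a corner that bounces around a width x height field.
class Corner {
    constructor(x, y, vx, vy) {
        this.x = x;
        this.y = y;
        this.vx = vx;
        this.vy = vy;
    }

    // Move one step. On hitting an edge, reflect the velocity and rescale it slightly.
    // `random` returns a number in [0, 1) and can be replaced to make runs repeatable.
    advance(width, height, random = Math.random) {
        this.x += this.vx;
        this.y += this.vy;

        if (this.x < 0 || this.x > width) {
            this.x = Math.min(Math.max(this.x, 0), width); // keep the point on screen
            this.vx = -this.vx * (0.9 + 0.2 * random());   // assumed +/-10% speed change
        }
        if (this.y < 0 || this.y > height) {
            this.y = Math.min(Math.max(this.y, 0), height);
            this.vy = -this.vy * (0.9 + 0.2 * random());
        }
    }
}
```

A `Wire` would then hold several of these corners, call `advance()` on each one per frame, and record the resulting positions as a snapshot.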

## Original Behavior

Close inspection of [video from the original](https://youtu.be/SaBvcHHdlGE) Mystify screensaver revealed notable behaviors.

<img src="mystify-inspection.jpg" class="d-block shadow mx-auto my-5">

### Broken Lines
The original Mystify implementation did not clear the screen between frames. With GDI, large fills (clearing the background) are expensive, and drawing many polygons probably challenged performance in the 90s. Instead only the leading wire was drawn, and the trailing wire was drawn-over using black. This strategy results in lines which appear to have single pixel breaks on a black background (magenta arrow). It may not have been particularly visible on CRT monitors available in the 90s, but it is quite noticeable on LCD screens today.

### Bouncing Changes Speed
Observing videos of the classic screensaver I noticed that corners don't bounce symmetrically off edges. After every bounce they change their speed slightly. This can be seen by observing the history of corners which reflect off edges of the screen demonstrating their change in speed (green arrow). I recreated this behavior using a weighted random number generator.

## Programming Notes

### Color Cycling
I used an HSL-to-RGB method to generate colors from hue (variable), saturation (always 100%), and luminosity (always 50%). By repeatedly ramping hue from 0% to 100% slowly I achieved a rainbow gradient effect. Increasing the color change speed (% change for every new wire) cycles the colors faster, and very high values produce polygons whose visible history spans a gradient of colors. A fade effect is achieved by increasing the alpha of wire snapshots as they are drawn from old to new.
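
The color logic can be sketched with the standard HSL-to-RGB conversion. This is an illustration written in JavaScript rather than the project's C#, and the function names (`hslToRgb`, `nextHue`, `snapshotAlpha`) and the linear fade are assumptions, not the application's actual code.

```js
// Standard HSL-to-RGB conversion. h, s, and l are fractions from 0 to 1 (h must not be negative).
function hslToRgb(h, s, l) {
    const chroma = (1 - Math.abs(2 * l - 1)) * s;
    const sector = (h % 1) * 6; // position on the color wheel, 0 to 6
    const x = chroma * (1 - Math.abs((sector % 2) - 1));
    let rgb;
    if (sector < 1) { rgb = [chroma, x, 0]; }
    else if (sector < 2) { rgb = [x, chroma, 0]; }
    else if (sector < 3) { rgb = [0, chroma, x]; }
    else if (sector < 4) { rgb = [0, x, chroma]; }
    else if (sector < 5) { rgb = [x, 0, chroma]; }
    else { rgb = [chroma, 0, x]; }
    const m = l - chroma / 2;
    return rgb.map(v => Math.round((v + m) * 255));
}

// Ramp the hue by a small step and wrap back to zero after reaching 100%.
function nextHue(hue, step) {
    return (hue + step) % 1;
}

// Assumed linear fade: the oldest snapshot (index 0) is faintest, the newest is opaque.
function snapshotAlpha(index, count) {
    return (index + 1) / count;
}
```

For example, calling `hslToRgb(hue, 1, 0.5)` after each `nextHue(hue, step)` produces the slow rainbow cycle described above, and a larger `step` spreads more of the rainbow across the visible history.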

### Encoding video with C#
The FFMpegCore package is a C# wrapper for FFMpeg that can encode video from frames piped into it. Using this strategy required creation of a `SkiaSharp.SKBitmap` wrapper that implements `FFMpegCore.Pipes.IVideoFrame`. For a full explanation and example code see [C# Data Visualization: Render Video with SkiaSharp](https://swharden.com/csdv/skiasharp/video/).

### Performance

**It's amusing to see retro screensavers running on modern gear!** I can run this graphics model simulation at full-screen resolutions using thousands of wires at real-time frame rates. The most natural density of shapes for my 3440x1440 display was 20 wires with a history of 5.

<img src="desk.jpg" class="d-block shadow mx-auto my-5">

Rendering the 2D image and encoding HD video using the x264 codec occupies all my CPU cores and runs a little above 500 frames per second. Encoding 24 hours of video (over 2 million frames) took this system 1 hour and 12 minutes and produced a 15.3 GB MP4 file. Encoding WebM format is considerably slower, with the same system only achieving an encoding rate of 12 frames per second.
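
As a rough sanity check of these numbers, the arithmetic can be written out. The frame rate of the video is not stated here, so the 25 frames per second used below is purely an assumption, and the function names are hypothetical.

```js
// Number of frames in a video of the given length (the frame rate is an assumption).
function frameCount(hours, framesPerSecond) {
    return hours * 60 * 60 * framesPerSecond;
}

// Average encoding throughput in frames per second of wall-clock time.
function averageEncodeRate(frames, encodeSeconds) {
    return frames / encodeSeconds;
}

const frames = frameCount(24, 25);                      // 2,160,000 frames
const mp4Seconds = 1 * 3600 + 12 * 60;                  // 1 hour 12 minutes = 4,320 seconds
const mp4Rate = averageEncodeRate(frames, mp4Seconds);  // 500 frames per second
const webmHours = frames / 12 / 3600;                   // 50 hours at 12 frames per second
```

Under that assumption the MP4 figures are consistent with each other, and encoding the same 24-hour video as WebM at 12 frames per second would take roughly two days.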

<img src="cpu.png" class="d-block mx-auto my-5">


## Simulations

### Traditional Behavior

The classic screensaver is typically run with two 4-cornered polygons that slowly change color.

<video width="759" height="470" controls class="d-block mx-auto my-5 shadow" style="max-width: 100%; height: 100%;">
  <source src="mystify-01-standard.webm" type="video/webm">
</video>

### Rainbow

Increasing the rate of color transition produces a rainbow effect within the visible history of polygons. The effect is made more striking by increasing the history length and decreasing the speed so the historical lines are closer together.

<video width="759" height="470" controls class="d-block mx-auto my-5 shadow" style="max-width: 100%; height: 100%;">
  <source src="mystify-02-rainbow.webm" type="video/webm">
</video>

### Solid

If the speed is greatly decreased and the number of historical records is greatly increased the resulting shape has little or no gap between historical traces and appears like a solid object. If fading is enabled (where opacity of older traces fades to transparent) the resulting effect is very interesting.

<video width="759" height="470" controls class="d-block mx-auto my-5 shadow" style="max-width: 100%; height: 100%;">
  <source src="mystify-03-solid.webm" type="video/webm">
</video>

### Chaos

Adding 100 shapes produces a chaotic but interesting effect. This may be the first time the world has seen Mystify like this!

_EDIT: All these lines are very stressful on the video encoder and produce large file sizes to achieve high quality (25 MB for 10 seconds). I'm showing this one as a JPEG but [click here to view mystify-100.webm](mystify-04-100.webm) if you're on a good internet connection._

<a href='mystify-04-100.webm'><img src="mystify-04-100.jpg" class="d-block mx-auto my-5 shadow"></a>

## YouTube

<div class="text-center">

![](https://youtu.be/queN9r3Leis)

</div>

## Resources
* A click-to-run EXE can be downloaded from the [Releases Page](https://github.com/swharden/Mystify/releases)
* Source Code is available on https://github.com/swharden/Mystify
* Implementation Details: [C# Data Visualization: Mystify](https://swharden.com/csdv/simulations/mystify/)
* [C# Data Visualization: Render Video with SkiaSharp](https://swharden.com/csdv/skiasharp/video/)
* GitHub: [SkiaSharp](https://github.com/mono/SkiaSharp)
* GitHub: [FFMpegCore](https://github.com/rosenbjerg/FFMpegCore) 
* Windows 3.1 Mystify (video): https://youtu.be/osCZyfoScFg?t=370
* Windows 95 Mystify (video): https://youtu.be/SaBvcHHdlGE
March 20th, 2022

Build and Deploy a Hugo Site with GitHub Actions

This article describes how I safely use GitHub Actions to build a static website with Hugo and deploy it using SSH without any third-party dependencies. Code executed in continuous deployment pipelines may have access to secrets (like FTP credentials and SSH keys). Supply-chain attacks are becoming more frequent, including self-sabotage by open-source authors. Without 2FA, the code of well-intentioned maintainers is one stolen password away from becoming malicious. For these reasons I find it imperative to eliminate third-party Actions from my CI/CD pipelines wherever possible.

⚠️ WARNING: Third-party Actions in the GitHub Actions Marketplace may be compromised to run malicious code and leak secrets. There are hundreds of public actions claiming to help with Hugo, SSH, and Rsync execution. I advise avoiding third-party actions in your CI/CD pipeline whenever possible.

This article assumes you have at least some familiarity with GitHub Actions, but if you've never used them before I recommend taking 5 minutes to work through the Quickstart for GitHub Actions.

Example Workflow

This is my cicd-website.yaml workflow for building a Hugo website and deploying it with SSH. Most people can just copy/paste what they need from here, but the rest of the article will discuss the purpose and rationale for each of these sections in more detail.

name: Website

on:
  workflow_dispatch:
  push:

jobs:
  build:
    name: Build and Deploy
    runs-on: ubuntu-latest
    steps:
      - name: 🛒 Checkout
        uses: actions/checkout@v2

      - name: ✨ Setup Hugo
        env:
          HUGO_VERSION: 0.92.2
        run: |
          mkdir ~/hugo
          cd ~/hugo
          curl -L "https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz" --output hugo.tar.gz
          tar -xvzf hugo.tar.gz
          sudo mv hugo /usr/local/bin

      - name: 🛠️ Build
        run: hugo --source website --minify

      - name: 🔑 Install SSH Key
        run: |
          install -m 600 -D /dev/null ~/.ssh/id_rsa
          echo "${{ secrets.PRIVATE_SSH_KEY }}" > ~/.ssh/id_rsa
          echo "${{ secrets.KNOWN_HOSTS }}" > ~/.ssh/known_hosts

      - name: 🚀 Deploy
        run: rsync --archive --delete --stats -e 'ssh -p 18765' 'website/public/' ${{ secrets.REMOTE_DEST }}

Triggers

The on section determines which triggers will initiate this workflow (building/deploying the site). The following will run the workflow after every push to the GitHub repository. The workflow_dispatch allows the workflow to be triggered manually through the GitHub Actions web interface.

on:
  workflow_dispatch:
  push:

I store my hugo site in the subfolder ./website, so if I wanted to only rebuild/redeploy when the website files are changed (and not other files in the repository) I could add a paths filter. If your repository has multiple branches you likely want a branches filter as well.

on:
  workflow_dispatch:
  push:
    paths:
      - "website/**"
    branches:
      - main

Download Hugo

This step defines the Hugo version I want as a temporary environment variable, downloads that version's binary from the Hugo Releases page on GitHub, extracts it, and moves the executable file to /usr/local/bin so it can be subsequently run from any folder.

- name: ✨ Setup Hugo
  env:
    HUGO_VERSION: 0.92.2
  run: |
    mkdir ~/hugo
    cd ~/hugo
    curl -L "https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz" --output hugo.tar.gz
    tar -xvzf hugo.tar.gz
    sudo mv hugo /usr/local/bin

Build the Static Site with Hugo

I store my Hugo site in the subfolder ./website, so when I build the site I must define the source folder. Check out the Hugo build commands page for documentation about all the available options.

- name: 🛠️ Build
  run: hugo --source website --minify

SSH Secrets

This part is likely the most confusing for new users, so I'll keep it as minimal as possible. Before you start, I recommend you follow your hosting provider's guide for setting up SSH. Once you can SSH from your own machine, it will be much easier to set it up in GitHub Actions.

Your Keys

  • Start by creating a private/public key pair
    • ssh-keygen -t ed25519 -C "you@gmail.com"
    • Code here assumes you use an empty passphrase
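    • The public key is one long line that starts with ssh-ed25519
    • The private key is a multi-line text block that starts and ends with a line of dashes (-----BEGIN OPENSSH PRIVATE KEY----- and -----END OPENSSH PRIVATE KEY-----)
  • You give the PUBLIC key to your hosting provider to remember
  • When you log in over SSH, your client uses your PRIVATE key to prove who you are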
  • GitHub Actions will need your PRIVATE key, so store it as a GitHub Encrypted Secret (PRIVATE_SSH_KEY)

The Host's Keys

To protect you from connecting to an impostor posing as your host, you can retrieve your host's public key and check against it later to be sure it does not change. To get the keys for your host, run the following command:

ssh-keyscan example.com

My hosting provider uses a non-standard SSH port, so I must specify it with:

ssh-keyscan -p 12345 example.com

The host's public keys will be a few lines of text. Store them as a GitHub Encrypted Secret (KNOWN_HOSTS).

Loading SSH Secrets in GitHub Actions

These commands will create text files in your .ssh folder containing your private key and the public keys of your host. Later, ssh (which rsync uses for the connection) will refuse to use a private key stored in a file with general read/write access, so the install command is used to create an empty file with user-only read/write access (chmod 600), then an echo command is used to populate that file with your private key information.

- name: 🔑 Install SSH Key
  run: |
    install -m 600 -D /dev/null ~/.ssh/id_rsa
    echo "${{ secrets.PRIVATE_SSH_KEY }}" > ~/.ssh/id_rsa
    echo "${{ secrets.KNOWN_HOSTS }}" > ~/.ssh/known_hosts

Deploy with Rsync

Rsync is an application for synchronizing files over networks which is available on most Linux distributions. It only sends files whose modification time or size differs, so it can be used to efficiently deploy changes to very large websites.

Many people are okay with the defaults:

- name: 🚀 Deploy
  run: rsync --archive public/ username@example.com:~/www/

I use additional arguments (see rsync documentation) to:

  • allow remote deletion of files
  • use a non-standard SSH port (12345)
  • store my remote destination as a GitHub Encrypted Secret - not because it's private, but so I don't accidentally mess it up by incorrectly managing my workflow yaml (which could result in remote data deletion)
  • display a small stats section after finishing (see screenshot)

- name: 🚀 Deploy
  run: rsync --archive --delete --stats -e 'ssh -p 12345' website/public/ ${{ secrets.REMOTE_DEST }}

Conclusions

That's a lot to figure out and set up the first time, but once you have your SSH keys ready and some YAML you can copy/paste across multiple projects it's not that bad.

I find rsync to be extremely fast compared to something like FTP run in GitHub Actions, and I'm very satisfied that I can achieve all these steps using Linux console commands and not depending on any other Actions.

Resources

Markdown source code last modified on March 27th, 2022
---
title: Build and Deploy a Hugo Site with GitHub Actions
description: How I safely use GitHub Actions to build a static website with Hugo and deploy it using SSH without any third-party dependencies
date: 2022-03-20 22:45:00
tags: github, hugo
---
# Build and Deploy a Hugo Site with GitHub Actions

**This article describes how I _safely_ use GitHub Actions to build a static website with Hugo and deploy it using SSH without any third-party dependencies.** Code executed in continuous deployment pipelines may have access to secrets (like FTP credentials and SSH keys). Supply-chain attacks are becoming more frequent, including self-sabotage by open-source authors. Without 2FA, the code of well-intentioned maintainers is one stolen password away from becoming malicious. For these reasons I find it imperative to eliminate third-party Actions from my CI/CD pipelines wherever possible. 

> ⚠️ **WARNING: Third-party Actions in the GitHub Actions Marketplace may be compromised to run malicious code and leak secrets.** There are hundreds of public actions claiming to help with [Hugo](https://github.com/marketplace?type=actions&query=hugo), [SSH](https://github.com/marketplace?type=actions&query=SSH), and [Rsync](https://github.com/marketplace?type=actions&query=rsync) execution. I advise avoiding third-party actions in your CI/CD pipeline whenever possible.

This article assumes you have at least some familiarity with GitHub Actions, but if you've never used them before I recommend taking 5 minutes to work through the [Quickstart for GitHub Actions](https://docs.github.com/en/actions/quickstart).

## Example Workflow

This is my `cicd-website.yaml` workflow for building a Hugo website and deploying it with SSH. Most people can just copy/paste what they need from here, but the rest of the article will discuss the purpose and rationale for each of these sections in more detail.

```yaml
name: Website

on:
  workflow_dispatch:
  push:

jobs:
  build:
    name: Build and Deploy
    runs-on: ubuntu-latest
    steps:
      - name: 🛒 Checkout
        uses: actions/checkout@v2

      - name: ✨ Setup Hugo
        env:
          HUGO_VERSION: 0.92.2
        run: |
          mkdir ~/hugo
          cd ~/hugo
          curl -L "https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz" --output hugo.tar.gz
          tar -xvzf hugo.tar.gz
          sudo mv hugo /usr/local/bin

      - name: 🛠️ Build
        run: hugo --source website --minify

      - name: 🔑 Install SSH Key
        run: |
          install -m 600 -D /dev/null ~/.ssh/id_rsa
          echo "${{ secrets.PRIVATE_SSH_KEY }}" > ~/.ssh/id_rsa
          echo "${{ secrets.KNOWN_HOSTS }}" > ~/.ssh/known_hosts

      - name: 🚀 Deploy
        run: rsync --archive --delete --stats -e 'ssh -p 18765' 'website/public/' ${{ secrets.REMOTE_DEST }}
```

## Triggers

The `on` section determines which triggers will initiate this workflow (building/deploying the site). The following will run the workflow after _every_ push to the GitHub repository. The `workflow_dispatch` allows the workflow to be triggered manually through the GitHub Actions web interface.

```yaml
on:
  workflow_dispatch:
  push:
```

I store my hugo site in the subfolder `./website`, so if I wanted to only rebuild/redeploy when the _website_ files are changed (and not other files in the repository) I could add a `paths` filter. If your repository has multiple branches you likely want a `branches` filter as well.

```yaml
on:
  workflow_dispatch:
  push:
    paths:
      - "website/**"
    branches:
      - main
```

## Download Hugo

This step defines the Hugo version I want as a temporary environment variable, downloads that version's binary from the [Hugo Releases page on GitHub](https://github.com/gohugoio/hugo/releases), extracts it, and moves the executable file to `/usr/local/bin` so it can be subsequently run from any folder.

```yaml
- name: ✨ Setup Hugo
  env:
    HUGO_VERSION: 0.92.2
  run: |
    mkdir ~/hugo
    cd ~/hugo
    curl -L "https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz" --output hugo.tar.gz
    tar -xvzf hugo.tar.gz
    sudo mv hugo /usr/local/bin
```
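
Because this step downloads and runs a binary, it may be worth checking the archive against a published SHA-256 digest before extracting it. The following is a minimal sketch rather than part of the workflow above: the function names `hugo_release_url` and `verify_sha256` and the variable `EXPECTED_SHA256` are hypothetical, and it assumes `sha256sum` is available (as it is on typical Linux runners) and that you obtained the expected digest from a source you trust.

```sh
# Build the download URL for a given Hugo version
# (same URL pattern as the Setup Hugo step).
hugo_release_url() {
  printf 'https://github.com/gohugoio/hugo/releases/download/v%s/hugo_%s_Linux-64bit.tar.gz\n' "$1" "$1"
}

# Succeed only if the SHA-256 digest of file $1 equals the expected hex digest $2.
verify_sha256() {
  actual=$(sha256sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

# Example use inside a step (EXPECTED_SHA256 is a hypothetical variable you would set):
#   curl -L "$(hugo_release_url "$HUGO_VERSION")" --output hugo.tar.gz
#   verify_sha256 hugo.tar.gz "$EXPECTED_SHA256" || exit 1
```

Stopping the step when the digest does not match means a tampered or truncated download never reaches `tar` or `/usr/local/bin`.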

## Build the Static Site with Hugo

I store my Hugo site in the subfolder `./website`, so when I build the site I must define the source folder. Check out the [Hugo build commands](https://gohugo.io/commands/hugo/) page for documentation about all the available options.

```yaml
- name: 🛠️ Build
  run: hugo --source website --minify
```

## SSH Secrets

This part is likely the most confusing for new users, so I'll keep it as minimal as possible. Before you start, I recommend you follow your hosting provider's guide for setting up SSH. Once you can SSH from your own machine, it will be much easier to set it up in GitHub Actions.

### Your Keys

* Start by creating a private/public key pair 
  * `ssh-keygen -t ed25519 -C "you@gmail.com"`
  * Code here assumes you use an empty passphrase
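  * The public key is one long line that starts with `ssh-ed25519`
  * The private key is a multi-line text block that starts and ends with a line of dashes (`-----BEGIN OPENSSH PRIVATE KEY-----` and `-----END OPENSSH PRIVATE KEY-----`)
* You give the PUBLIC key to your hosting provider to remember
* When you log in over SSH, your client uses your PRIVATE key to prove who you are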
* GitHub Actions will need your PRIVATE key, so store it as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets) (`PRIVATE_SSH_KEY`)
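
The key-pair step can be sketched as a small shell function. This is a sketch under assumptions: `make_deploy_key` and the file name `deploy_key` are hypothetical, the key is written to a folder you choose rather than `~/.ssh`, and `-N ""` sets the empty passphrase mentioned in the list.

```sh
# Create a passphrase-less ed25519 key pair at $1/deploy_key (hypothetical file name).
# -q is quiet, -N "" sets an empty passphrase, -C adds a comment, -f sets the output file.
make_deploy_key() {
  mkdir -p "$1" && ssh-keygen -q -t ed25519 -N "" -C "$2" -f "$1/deploy_key"
}

# Example:
#   make_deploy_key ./keys "you@gmail.com"
#   cat ./keys/deploy_key.pub   # PUBLIC key: give this to your hosting provider
#   cat ./keys/deploy_key       # PRIVATE key: store this as PRIVATE_SSH_KEY
```

Writing the pair to its own folder keeps a deploy-only key separate from the key you use for everyday logins, so it can be revoked on the host without affecting anything else.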

### The Host's Keys

To protect you from connecting to an impostor posing as your host, you can retrieve your host's public key and check against it later to be sure it does not change. To get the keys for your host, run the following command:

```sh
ssh-keyscan example.com
```

My hosting provider uses a non-standard SSH port, so I must specify it with:

```sh
ssh-keyscan -p 12345 example.com
```

The host's public keys will be a few lines of text. Store them as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets) (`KNOWN_HOSTS`).
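
Before saving the scan output as a secret, it can help to confirm that it actually contains an entry for your host and port. The following is a minimal sketch, assuming unhashed `ssh-keyscan` output (no `-H` flag) in which the first field of each line is the host name, or `[host]:port` when the port is not 22. The function names and the file name `scanned_hosts.txt` are hypothetical.

```sh
# Print the host pattern used in known_hosts:
# "example.com" for port 22, "[example.com]:12345" for any other port.
known_hosts_pattern() {
  if [ "$2" = "22" ]; then
    printf '%s\n' "$1"
  else
    printf '[%s]:%s\n' "$1" "$2"
  fi
}

# Succeed only if file $1 has a line whose first field is exactly host $2 on port $3.
has_host_entry() {
  pattern=$(known_hosts_pattern "$2" "$3")
  awk -v p="$pattern" '$1 == p { found = 1 } END { exit !found }' "$1"
}

# Example:
#   ssh-keyscan -p 12345 example.com > scanned_hosts.txt
#   has_host_entry scanned_hosts.txt example.com 12345 || echo "no entry found" >&2
```

A mismatch here usually means the scan was run without `-p`, in which case the stored keys would not match the port used later by the deploy step.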

### Loading SSH Secrets in GitHub Actions

These commands will create text files in your `.ssh` folder containing your private key and the public keys of your host. Later, `ssh` (which `rsync` uses for the connection) will refuse to use a private key stored in a file with general read/write access, so the `install` command is used to create an empty file with user-only read/write access (chmod 600), then an `echo` command is used to populate that file with your private key information.

```yaml
- name: 🔑 Install SSH Key
  run: |
    install -m 600 -D /dev/null ~/.ssh/id_rsa
    echo "${{ secrets.PRIVATE_SSH_KEY }}" > ~/.ssh/id_rsa
    echo "${{ secrets.KNOWN_HOSTS }}" > ~/.ssh/known_hosts
```
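
The permission handling can be tried locally without touching your real `.ssh` folder. The following is a minimal sketch, assuming GNU `install` and `stat` (as found on Ubuntu); `write_private_file` and the `/tmp/demo` path are hypothetical.

```sh
# Create an empty file with mode 600 (-D also creates missing parent folders),
# then fill it. Redirecting into an existing file keeps its permissions.
write_private_file() {
  install -m 600 -D /dev/null "$1" && printf '%s\n' "$2" > "$1"
}

# Example:
#   write_private_file /tmp/demo/.ssh/id_rsa "not a real key"
#   stat -c '%a' /tmp/demo/.ssh/id_rsa   # prints 600
```

Creating the file with restrictive permissions first means the key material is never on disk in a world-readable file, even briefly.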

## Deploy with Rsync

[Rsync](https://en.wikipedia.org/wiki/Rsync) is an application for synchronizing files over networks which is available on most Linux distributions. It only sends files whose modification time or size differs, so it can be used to efficiently deploy changes to very large websites.

Many people are okay with the defaults:

```yaml
- name: 🚀 Deploy
  run: rsync --archive public/ username@example.com:~/www/
```

I use additional arguments (see [rsync documentation](https://linux.die.net/man/1/rsync)) to:
* allow remote deletion of files
* use a non-standard SSH port (12345)
* store my remote destination as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets) - not because it's private, but so I don't accidentally mess it up by incorrectly managing my workflow yaml (which could result in remote data deletion)
* display a small stats section after finishing (see screenshot)

```yaml
- name: 🚀 Deploy
  run: rsync --archive --delete --stats -e 'ssh -p 12345' website/public/ ${{ secrets.REMOTE_DEST }}
```

<img src="github-actions-hugo-rsync-deploy.jpg" class="border shadow d-block mx-auto my-4">
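
Since `--delete` removes remote files that are absent locally, syncing an empty or missing build folder could wipe the remote site. The following is a minimal sketch of a guard that could run before the deploy command. `build_output_ready` is a hypothetical name, and treating a non-empty `index.html` as proof of a successful build is an assumption.

```sh
# Succeed only if folder $1 exists and contains a non-empty index.html,
# so an empty build is never synced with --delete.
build_output_ready() {
  [ -d "$1" ] && [ -s "$1/index.html" ]
}

# Example use at the top of the deploy step:
#   build_output_ready website/public || { echo "build output missing" >&2; exit 1; }
```

When changing the rsync arguments, adding `--dry-run` to the command reports what would be transferred or deleted without modifying the remote folder.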

## Conclusions

That's a lot to figure out and set up the first time, but once you have your SSH keys ready and some YAML you can copy/paste across multiple projects it's not that bad.

I find `rsync` to be extremely fast compared to something like FTP run in GitHub Actions, and I'm very satisfied that I can achieve all these steps using Linux console commands and not depending on any other Actions.

## Resources

* This content was written after recently creating [C# Data Visualization](https://swharden.com/csdv/) (a Hugo site built and deployed with GitHub Actions).
  * You can inspect the workflow files in [`.github/workflows/`](https://github.com/swharden/Csharp-Data-Visualization/tree/main/.github/workflows) for full details.
  * My hosting provider is [SiteGround](https://www.siteground.com) (see their [SSH Tutorials](https://www.siteground.com/tutorials/ssh/)).
* The official Hugo [Hosting and Deployment](https://gohugo.io/hosting-and-deployment/) site has information for Google Cloud, AWS, Azure, Netlify, GitHub Pages, KeyCDN, Render CDN, Bitbucket, Firebase, GitLab, and Rsync over SSH.
* A collection of my personal notes related to Hugo is in my [code-notes/Hugo](https://github.com/swharden/code-notes/tree/main/Hugo) repository.
* [Deploying a Hugo site with Github Actions](https://www.yellowduck.be/posts/deploy-hugo-site-with-github-actions/) by Jono Fotografie
* Hugo: [Deployment with Rsync](https://gohugo.io/hosting-and-deployment/deployment-with-rsync/)
* Rsync documentation and argument information: [rsync(1)](https://linux.die.net/man/1/rsync)