The personal website of Scott W Harden
June 23rd, 2022

Resample Time Series Data using Cubic Spline Interpolation

Cubic spline interpolation can be used to modify the sample rate of time series data. This page describes how I achieve signal resampling with spline interpolation in pure C# without any external dependencies. This technique can be used to:

  • Convert unevenly-sampled data to a series of values with a fixed sample rate

  • Convert time series data from one sample rate to another sample rate

  • Fill-in missing values from a collection of measurements

Simulating Data Samples

To simulate unevenly-sampled data I create a theoretical signal then sample it at 20 random time points.

// this function represents the signal being measured
static double f(double x) => Math.Sin(x * 10) + Math.Sin(x * 13f);

// randomly sample values from 20 time points
Random rand = new(123);
double[] sampleXs = Enumerable.Range(0, 20)
    .Select(x => rand.NextDouble())
    .OrderBy(x => x)
    .ToArray();
double[] sampleYs = sampleXs.Select(x => f(x)).ToArray();

Resample for Evenly-Spaced Data

I then generate an interpolated spline using my sampled data points as the input.

  • I can control the sample rate by defining the number of points generated in the output signal.

  • Full source code is at the bottom of this article.

(double[] xs, double[] ys) = Cubic.Interpolate1D(sampleXs, sampleYs, count: 50);

The generated points line-up perfectly with the sampled data.

There is slight deviation from the theoretical signal (and it's larger where there is more missing data) but this is an unsurprising result considering the original samples had large gaps of missing data.

Source Code

Interpolation.cs

public static class Interpolation
{
    public static (double[] xs, double[] ys) Interpolate1D(double[] xs, double[] ys, int count)
    {
        if (xs is null || ys is null || xs.Length != ys.Length)
            throw new ArgumentException($"{nameof(xs)} and {nameof(ys)} must have same length");

        int inputPointCount = xs.Length;
        double[] inputDistances = new double[inputPointCount];
        for (int i = 1; i < inputPointCount; i++)
            inputDistances[i] = inputDistances[i - 1] + xs[i] - xs[i - 1];

        double meanDistance = inputDistances.Last() / (count - 1);
        double[] evenDistances = Enumerable.Range(0, count).Select(x => x * meanDistance).ToArray();
        double[] xsOut = Interpolate(inputDistances, xs, evenDistances);
        double[] ysOut = Interpolate(inputDistances, ys, evenDistances);
        return (xsOut, ysOut);
    }

    private static double[] Interpolate(double[] xOrig, double[] yOrig, double[] xInterp)
    {
        (double[] a, double[] b) = FitMatrix(xOrig, yOrig);

        double[] yInterp = new double[xInterp.Length];
        for (int i = 0; i < yInterp.Length; i++)
        {
            int j;
            for (j = 0; j < xOrig.Length - 2; j++)
                if (xInterp[i] <= xOrig[j + 1])
                    break;

            double dx = xOrig[j + 1] - xOrig[j];
            double t = (xInterp[i] - xOrig[j]) / dx;
            double y = (1 - t) * yOrig[j] + t * yOrig[j + 1] +
                t * (1 - t) * (a[j] * (1 - t) + b[j] * t);
            yInterp[i] = y;
        }

        return yInterp;
    }

    private static (double[] a, double[] b) FitMatrix(double[] x, double[] y)
    {
        int n = x.Length;
        double[] a = new double[n - 1];
        double[] b = new double[n - 1];
        double[] r = new double[n];
        double[] A = new double[n];
        double[] B = new double[n];
        double[] C = new double[n];

        double dx1, dx2, dy1, dy2;

        dx1 = x[1] - x[0];
        C[0] = 1.0f / dx1;
        B[0] = 2.0f * C[0];
        r[0] = 3 * (y[1] - y[0]) / (dx1 * dx1);

        for (int i = 1; i < n - 1; i++)
        {
            dx1 = x[i] - x[i - 1];
            dx2 = x[i + 1] - x[i];
            A[i] = 1.0f / dx1;
            C[i] = 1.0f / dx2;
            B[i] = 2.0f * (A[i] + C[i]);
            dy1 = y[i] - y[i - 1];
            dy2 = y[i + 1] - y[i];
            r[i] = 3 * (dy1 / (dx1 * dx1) + dy2 / (dx2 * dx2));
        }

        dx1 = x[n - 1] - x[n - 2];
        dy1 = y[n - 1] - y[n - 2];
        A[n - 1] = 1.0f / dx1;
        B[n - 1] = 2.0f * A[n - 1];
        r[n - 1] = 3 * (dy1 / (dx1 * dx1));

        double[] cPrime = new double[n];
        cPrime[0] = C[0] / B[0];
        for (int i = 1; i < n; i++)
            cPrime[i] = C[i] / (B[i] - cPrime[i - 1] * A[i]);

        double[] dPrime = new double[n];
        dPrime[0] = r[0] / B[0];
        for (int i = 1; i < n; i++)
            dPrime[i] = (r[i] - dPrime[i - 1] * A[i]) / (B[i] - cPrime[i - 1] * A[i]);

        double[] k = new double[n];
        k[n - 1] = dPrime[n - 1];
        for (int i = n - 2; i >= 0; i--)
            k[i] = dPrime[i] - cPrime[i] * k[i + 1];

        for (int i = 1; i < n; i++)
        {
            dx1 = x[i] - x[i - 1];
            dy1 = y[i] - y[i - 1];
            a[i - 1] = k[i - 1] * dx1 - dy1;
            b[i - 1] = -k[i] * dx1 + dy1;
        }

        return (a, b);
    }
}

Program.cs

This is the source code I used to generate the figures on this page.

Plots were generated using ScottPlot.NET.

// this function represents the signal being measured
static double f(double x) => Math.Sin(x * 10) + Math.Sin(x * 13f);

// create points representing randomly sampled time points of a smooth curve
Random rand = new(123);
double[] sampleXs = Enumerable.Range(0, 20)
    .Select(x => rand.NextDouble())
    .OrderBy(x => x)
    .ToArray();
double[] sampleYs = sampleXs.Select(x => f(x)).ToArray();

// use 1D interpolation to create an evenly sampled curve from unevenly sampled data
(double[] xs, double[] ys) = Interpolation.Interpolate1D(sampleXs, sampleYs, count: 50);

var plt = new ScottPlot.Plot(600, 400);

double[] theoreticalXs = ScottPlot.DataGen.Range(xs.Min(), xs.Max(), .01);
double[] theoreticalYs = theoreticalXs.Select(x => f(x)).ToArray();
var perfectPlot = plt.AddScatterLines(theoreticalXs, theoreticalYs);
perfectPlot.Label = "theoretical signal";
perfectPlot.Color = plt.Palette.GetColor(2);
perfectPlot.LineStyle = ScottPlot.LineStyle.Dash;

var samplePlot = plt.AddScatterPoints(sampleXs, sampleYs);
samplePlot.Label = "sampled points";
samplePlot.Color = plt.Palette.GetColor(0);
samplePlot.MarkerSize = 10;
samplePlot.MarkerShape = ScottPlot.MarkerShape.openCircle;
samplePlot.MarkerLineWidth = 2;

var smoothPlot = plt.AddScatter(xs, ys);
smoothPlot.Label = "interpolated points";
smoothPlot.Color = plt.Palette.GetColor(3);
smoothPlot.MarkerShape = ScottPlot.MarkerShape.filledCircle;

plt.Legend();
plt.SaveFig("output.png");

2D and 3D Spline Interpolation

The interpolation method described above only considered the horizontal axis when generating evenly-spaced time points (1D interpolation). For information and code examples regarding 2D and 3D cubic spline interpolation, see my previous blog post: Spline Interpolation with C#

Resources

Markdown source code last modified on June 24th, 2022
---
Title: Resample Time Series Data using Cubic Spline Interpolation
Description: This page describes how I achieve signal resampling with spline interpolation in pure C# without any external dependencies.
Date: 2022-06-23 21:15PM EST
Tags: csharp, graphics
---

# Resample Time Series Data using Cubic Spline Interpolation

**Cubic spline interpolation can be used to modify the sample rate of time series data.** This page describes how I achieve signal resampling with spline interpolation in pure C# without any external dependencies. This technique can be used to:

* Convert unevenly-sampled data to a series of values with a fixed sample rate

* Convert time series data from one sample rate to another sample rate

* Fill-in missing values from a collection of measurements

### Simulating Data Samples

To simulate unevenly-sampled data I create a theoretical signal then sample it at 20 random time points.

```cs
// this function represents the signal being measured
static double f(double x) => Math.Sin(x * 10) + Math.Sin(x * 13f);

// randomly sample values from 20 time points
Random rand = new(123);
double[] sampleXs = Enumerable.Range(0, 20)
    .Select(x => rand.NextDouble())
    .OrderBy(x => x)
    .ToArray();
double[] sampleYs = sampleXs.Select(x => f(x)).ToArray();
```

<img src="1-samples-only.png" class="mx-auto d-block mb-5">

### Resample for Evenly-Spaced Data

I then generate an interpolated spline using my sampled data points as the input. 

* I can control the sample rate by defining the number of points generated in the output signal. 

* Full source code is at the bottom of this article.

```cs
(double[] xs, double[] ys) = Cubic.Interpolate1D(sampleXs, sampleYs, count: 50);
```

<img src="2-resample-only.png" class="mx-auto d-block mb-5">

The generated points line-up perfectly with the sampled data.

<img src="2-resample.png" class="mx-auto d-block mb-5">

There is slight deviation from the theoretical signal (and it's larger where there is more missing data) but this is an unsurprising result considering the original samples had large gaps of missing data.

<img src="3-comparison.png" class="mx-auto d-block mb-5">

## Source Code

### Interpolation.cs

```cs
public static class Interpolation
{
    public static (double[] xs, double[] ys) Interpolate1D(double[] xs, double[] ys, int count)
    {
        if (xs is null || ys is null || xs.Length != ys.Length)
            throw new ArgumentException($"{nameof(xs)} and {nameof(ys)} must have same length");

        int inputPointCount = xs.Length;
        double[] inputDistances = new double[inputPointCount];
        for (int i = 1; i < inputPointCount; i++)
            inputDistances[i] = inputDistances[i - 1] + xs[i] - xs[i - 1];

        double meanDistance = inputDistances.Last() / (count - 1);
        double[] evenDistances = Enumerable.Range(0, count).Select(x => x * meanDistance).ToArray();
        double[] xsOut = Interpolate(inputDistances, xs, evenDistances);
        double[] ysOut = Interpolate(inputDistances, ys, evenDistances);
        return (xsOut, ysOut);
    }

    private static double[] Interpolate(double[] xOrig, double[] yOrig, double[] xInterp)
    {
        (double[] a, double[] b) = FitMatrix(xOrig, yOrig);

        double[] yInterp = new double[xInterp.Length];
        for (int i = 0; i < yInterp.Length; i++)
        {
            int j;
            for (j = 0; j < xOrig.Length - 2; j++)
                if (xInterp[i] <= xOrig[j + 1])
                    break;

            double dx = xOrig[j + 1] - xOrig[j];
            double t = (xInterp[i] - xOrig[j]) / dx;
            double y = (1 - t) * yOrig[j] + t * yOrig[j + 1] +
                t * (1 - t) * (a[j] * (1 - t) + b[j] * t);
            yInterp[i] = y;
        }

        return yInterp;
    }

    private static (double[] a, double[] b) FitMatrix(double[] x, double[] y)
    {
        int n = x.Length;
        double[] a = new double[n - 1];
        double[] b = new double[n - 1];
        double[] r = new double[n];
        double[] A = new double[n];
        double[] B = new double[n];
        double[] C = new double[n];

        double dx1, dx2, dy1, dy2;

        dx1 = x[1] - x[0];
        C[0] = 1.0f / dx1;
        B[0] = 2.0f * C[0];
        r[0] = 3 * (y[1] - y[0]) / (dx1 * dx1);

        for (int i = 1; i < n - 1; i++)
        {
            dx1 = x[i] - x[i - 1];
            dx2 = x[i + 1] - x[i];
            A[i] = 1.0f / dx1;
            C[i] = 1.0f / dx2;
            B[i] = 2.0f * (A[i] + C[i]);
            dy1 = y[i] - y[i - 1];
            dy2 = y[i + 1] - y[i];
            r[i] = 3 * (dy1 / (dx1 * dx1) + dy2 / (dx2 * dx2));
        }

        dx1 = x[n - 1] - x[n - 2];
        dy1 = y[n - 1] - y[n - 2];
        A[n - 1] = 1.0f / dx1;
        B[n - 1] = 2.0f * A[n - 1];
        r[n - 1] = 3 * (dy1 / (dx1 * dx1));

        double[] cPrime = new double[n];
        cPrime[0] = C[0] / B[0];
        for (int i = 1; i < n; i++)
            cPrime[i] = C[i] / (B[i] - cPrime[i - 1] * A[i]);

        double[] dPrime = new double[n];
        dPrime[0] = r[0] / B[0];
        for (int i = 1; i < n; i++)
            dPrime[i] = (r[i] - dPrime[i - 1] * A[i]) / (B[i] - cPrime[i - 1] * A[i]);

        double[] k = new double[n];
        k[n - 1] = dPrime[n - 1];
        for (int i = n - 2; i >= 0; i--)
            k[i] = dPrime[i] - cPrime[i] * k[i + 1];

        for (int i = 1; i < n; i++)
        {
            dx1 = x[i] - x[i - 1];
            dy1 = y[i] - y[i - 1];
            a[i - 1] = k[i - 1] * dx1 - dy1;
            b[i - 1] = -k[i] * dx1 + dy1;
        }

        return (a, b);
    }
}
```

### Program.cs

This is the source code I used to generate the figures on this page. 

Plots were generated using [ScottPlot.NET](https://scottplot.net).

```cs
// this function represents the signal being measured
static double f(double x) => Math.Sin(x * 10) + Math.Sin(x * 13f);

// create points representing randomly sampled time points of a smooth curve
Random rand = new(123);
double[] sampleXs = Enumerable.Range(0, 20)
    .Select(x => rand.NextDouble())
    .OrderBy(x => x)
    .ToArray();
double[] sampleYs = sampleXs.Select(x => f(x)).ToArray();

// use 1D interpolation to create an evenly sampled curve from unevenly sampled data
(double[] xs, double[] ys) = Interpolation.Interpolate1D(sampleXs, sampleYs, count: 50);

var plt = new ScottPlot.Plot(600, 400);

double[] theoreticalXs = ScottPlot.DataGen.Range(xs.Min(), xs.Max(), .01);
double[] theoreticalYs = theoreticalXs.Select(x => f(x)).ToArray();
var perfectPlot = plt.AddScatterLines(theoreticalXs, theoreticalYs);
perfectPlot.Label = "theoretical signal";
perfectPlot.Color = plt.Palette.GetColor(2);
perfectPlot.LineStyle = ScottPlot.LineStyle.Dash;

var samplePlot = plt.AddScatterPoints(sampleXs, sampleYs);
samplePlot.Label = "sampled points";
samplePlot.Color = plt.Palette.GetColor(0);
samplePlot.MarkerSize = 10;
samplePlot.MarkerShape = ScottPlot.MarkerShape.openCircle;
samplePlot.MarkerLineWidth = 2;

var smoothPlot = plt.AddScatter(xs, ys);
smoothPlot.Label = "interpolated points";
smoothPlot.Color = plt.Palette.GetColor(3);
smoothPlot.MarkerShape = ScottPlot.MarkerShape.filledCircle;

plt.Legend();
plt.SaveFig("output.png");
```

## 2D and 3D Spline Interpolation

**The interpolation method described above only considered the horizontal axis** when generating evenly-spaced time points (1D interpolation). For information and code examples regarding 2D and 3D cubic spline interpolation, see my previous blog post: [Spline Interpolation with C#](https://swharden.com/blog/2022-01-22-spline-interpolation/) 

<a href="https://swharden.com/blog/2022-01-22-spline-interpolation/"><img src="https://swharden.com/blog/2022-01-22-spline-interpolation/screenshot.gif" class="mx-auto d-block mb-5"></a>

## Resources

* [2D and 3D Spline Interpolation with C#](https://swharden.com/blog/2022-01-22-spline-interpolation/) - Blog post from Jan, 2022

* [Spline Interpolation with ScottPlot](https://scottplot.net/cookbook/4.1/category/misc/#spline-interpolation) - Demonstrates additional types of interpolation: Bezier, Catmull-Rom, Chaikin, Cubic, etc. The project is open source under a MIT license.

* [Programmer's guide to polynomials and splines](https://wordsandbuttons.online/programmers_guide_to_polynomials_and_splines.html)
May 29th, 2022

Show A Progress Bar as your Blazor App Loads

Today I created a Blazor WebAssembly app that shows a progress bar while the page loads. This is especially useful for users on slow connections because Blazor apps typically require several megabytes of DLL and DAT files to be downloaded before meaningful content appears on the page.

Live Demo: LJPcalc (source code)

Step 1: Add a progress bar

Edit index.html and identify your app's main div:

<div id="app">Loading...</div>

Add a progress bar inside it:

<div id="app">
    <h2>Loading...</h2>
    <div class="progress mt-2" style="height: 2em;">
        <div id="progressbar" class="progress-bar progress-bar-striped progress-bar-animated"
            style="width: 10%; background-color: #204066;"></div>
    </div>
    <div>
        <span id="progressLabel" class="text-muted">Downloading file list</span>
    </div>
</div>

See Bootstrap's progressbar page for extensive customization and animation options and best practices when working with progress indicators. Also ensure the version of Bootstrap in your Blazor app is consistent with the documentation/HTML you are referencing.

Step 2: Disable Blazor AutoStart

Edit index.html and identify where your app loads Blazor resources:

<script src="_framework/blazor.webassembly.js"></script>

Update that script so it does not download automatically:

<script src="_framework/blazor.webassembly.js" autostart="false"></script>

Step 3: Create a Blazor Startup Script

Add a script to the bottom of the page to start Blazor manually, identifying all the resources needed and incrementally downloading them while updating the progressbar along the way.

<script>
    function StartBlazor() {
        let loadedCount = 0;
        const resourcesToLoad = [];
        Blazor.start({
            loadBootResource:
                function (type, filename, defaultUri, integrity) {
                    if (type == "dotnetjs")
                        return defaultUri;

                    const fetchResources = fetch(defaultUri, {
                        cache: 'no-cache',
                        integrity: integrity,
                        headers: { 'MyCustomHeader': 'My custom value' }
                    });

                    resourcesToLoad.push(fetchResources);

                    fetchResources.then((r) => {
                        loadedCount += 1;
                        if (filename == "blazor.boot.json")
                            return;
                        const totalCount = resourcesToLoad.length;
                        const percentLoaded = 10 + parseInt((loadedCount * 90.0) / totalCount);
                        const progressbar = document.getElementById('progressbar');
                        progressbar.style.width = percentLoaded + '%';
                        const progressLabel = document.getElementById('progressLabel');
                        progressLabel.innerText = `Downloading ${loadedCount}/${totalCount}: ${filename}`;
                    });

                    return fetchResources;
                }
        });
    }

    StartBlazor();
</script>

Resources

Markdown source code last modified on May 29th, 2022
---
title: Show A Progress Bar as your Blazor App Loads
description: How to add a progress bar to your client-side Blazor WebAssembly app to indicate page load progress.
date: 2022-05-29 16:05:00 EST
tags: blazor, csharp
---

# Show A Progress Bar as your Blazor App Loads

**Today I created a Blazor WebAssembly app that shows a progress bar while the page loads.** This is especially useful for users on slow connections because Blazor apps typically require several megabytes of DLL and DAT files to be downloaded before meaningful content appears on the page.

Live Demo: [LJPcalc](https://swharden.com/LJPcalc/) ([source code](https://github.com/swharden/LJPcalc))

<img src="blazor-load-progress-v2.gif" class="mx-auto d-block border shadow my-5">

## Step 1: Add a progress bar

Edit index.html and identify your app's main div:

```html
<div id="app">Loading...</div>
```

Add a progress bar inside it:
```html
<div id="app">
	<h2>Loading...</h2>
	<div class="progress mt-2" style="height: 2em;">
		<div id="progressbar" class="progress-bar progress-bar-striped progress-bar-animated"
			style="width: 10%; background-color: #204066;"></div>
	</div>
	<div>
		<span id="progressLabel" class="text-muted">Downloading file list</span>
	</div>
</div>
```

See [Bootstrap's progressbar page](https://getbootstrap.com/docs/5.2/components/progress/) for extensive customization and animation options and best practices when working with progress indicators. Also ensure the version of Bootstrap in your Blazor app is consistent with the documentation/HTML you are referencing.

## Step 2: Disable Blazor AutoStart

Edit index.html and identify where your app loads Blazor resources:

```html
<script src="_framework/blazor.webassembly.js"></script>
```

Update that script so it does _not_ download automatically:
```html
<script src="_framework/blazor.webassembly.js" autostart="false"></script>
```

## Step 3: Create a Blazor Startup Script

Add a script to the bottom of the page to start Blazor manually, identifying all the resources needed and incrementally downloading them while updating the progressbar along the way.

```html
<script>
	function StartBlazor() {
		let loadedCount = 0;
		const resourcesToLoad = [];
		Blazor.start({
			loadBootResource:
				function (type, filename, defaultUri, integrity) {
					if (type == "dotnetjs")
						return defaultUri;

					const fetchResources = fetch(defaultUri, {
						cache: 'no-cache',
						integrity: integrity,
						headers: { 'MyCustomHeader': 'My custom value' }
					});

					resourcesToLoad.push(fetchResources);

					fetchResources.then((r) => {
						loadedCount += 1;
						if (filename == "blazor.boot.json")
							return;
						const totalCount = resourcesToLoad.length;
						const percentLoaded = 10 + parseInt((loadedCount * 90.0) / totalCount);
						const progressbar = document.getElementById('progressbar');
						progressbar.style.width = percentLoaded + '%';
						const progressLabel = document.getElementById('progressLabel');
						progressLabel.innerText = `Downloading ${loadedCount}/${totalCount}: ${filename}`;
					});

					return fetchResources;
				}
		});
	}

	StartBlazor();
</script>
```

## Resources

* [ASP.NET Core Blazor startup](https://docs.microsoft.com/en-us/aspnet/core/blazor/fundamentals/startup) describes the `Blazor.start()` process and `loadBootResource()`.

* Code here inspired by [@EdmondShtogu's comment](https://github.com/dotnet/aspnetcore/issues/25165#issuecomment-781683925) on [dotnet/aspnetcore #25165](https://github.com/dotnet/aspnetcore/issues/25165) from Feb 18, 2021
May 25th, 2022

Use Maui.Graphics to Draw 2D Graphics in Any .NET Application

This week Microsoft officially released .NET Maui and the new Microsoft.Maui.Graphics library which can draw 2D graphics in any .NET application (not just Maui apps). This page offers a quick look at how to use this new library to draw graphics using SkiaSharp in a .NET 6 console application. The C# Data Visualization site has additional examples for drawing and animating graphics using Microsoft.Maui.Graphics in Windows Forms and WPF applications.

The code below is a full .NET 6 console application demonstrating common graphics tasks (setting colors, drawing shapes, rendering text, etc.) and was used to generate the image above.

// These packages are available on NuGet
using Microsoft.Maui.Graphics;
using Microsoft.Maui.Graphics.Skia;

// Create a bitmap in memory and draw on its Canvas
SkiaBitmapExportContext bmp = new(600, 400, 1.0f);
ICanvas canvas = bmp.Canvas;

// Draw a big blue rectangle with a dark border
Rect backgroundRectangle = new(0, 0, bmp.Width, bmp.Height);
canvas.FillColor = Color.FromArgb("#003366");
canvas.FillRectangle(backgroundRectangle);
canvas.StrokeColor = Colors.Black;
canvas.StrokeSize = 20;
canvas.DrawRectangle(backgroundRectangle);

// Draw circles randomly around the image
for (int i = 0; i < 100; i++)
{
    float x = Random.Shared.Next(bmp.Width);
    float y = Random.Shared.Next(bmp.Height);
    float r = Random.Shared.Next(5, 50);

    Color randomColor = Color.FromRgb(
        red: Random.Shared.Next(255),
        green: Random.Shared.Next(255),
        blue: Random.Shared.Next(255));

    canvas.StrokeSize = r / 3;
    canvas.StrokeColor = randomColor.WithAlpha(.3f);
    canvas.DrawCircle(x, y, r);
}

// Measure a string
string myText = "Hello, Maui.Graphics!";
Font myFont = new Font("Impact");
float myFontSize = 48;
canvas.Font = myFont;
SizeF textSize = canvas.GetStringSize(myText, myFont, myFontSize);

// Draw a rectangle to hold the string
Point point = new(
    x: (bmp.Width - textSize.Width) / 2,
    y: (bmp.Height - textSize.Height) / 2);
Rect myTextRectangle = new(point, textSize);
canvas.FillColor = Colors.Black.WithAlpha(.5f);
canvas.FillRectangle(myTextRectangle);
canvas.StrokeSize = 2;
canvas.StrokeColor = Colors.Yellow;
canvas.DrawRectangle(myTextRectangle);

// Daw the string itself
canvas.FontSize = myFontSize * .9f; // smaller than the rectangle
canvas.FontColor = Colors.White;
canvas.DrawString(myText, myTextRectangle, 
    HorizontalAlignment.Center, VerticalAlignment.Center, TextFlow.OverflowBounds);

// Save the image as a PNG file
bmp.WriteToFile("console2.png");

Multi-Platform Graphics Abstraction

The Microsoft.Maui.Graphics namespace a small collection of interfaces which can be implemented by many different rendering technologies (SkiaSharp, SharpDX, GDI, etc.), making it possible to create drawing routines that are totally abstracted from the underlying graphics rendering system.

I really like that I can now create a .NET Standard 2.0 project that exclusively uses interfaces from Microsoft.Maui.Graphics to write code that draws complex graphics, then reference that code from other projects that use platform-specific graphics libraries to render the images.

When I write scientific simulations or data visualization code I frequently regard my graphics drawing routines as business logic, and drawing with Maui.Graphics lets me write this code to an abstraction that keeps rendering technology dependencies out of my business logic - a big win!

Rough Edges

After working with this library while it was being developed over the last few months, these are the things I find most limiting in my personal projects which made it through the initial release this week. Some of them are open issues so they may get fixed soon, and depending on how the project continues to evolve many of these rough edges may improve with time. I'm listing them here now so I can keep track of them, and I intend to update this list if/as these topics improve:

  • Strings cannot be accurately measured: The size returned by GetStringSize() is inaccurate and does not respect font. There's an issue tracking this (#279), but it's been open for more than three months and the library was released this week in its broken state.

    • EDIT: I concede multi-platform font support is a very hard problem, but this exactly the type of problem that .NET Maui was created to solve.

  • Missing XML documentation: Intellisense can really help people who are new to a library. The roll-out of a whole new application framework is a good example of a time when a lot of people will be exploring a new library. Let's take the Color class for example (which 100% of people will interact with) and consider misunderstandings that could be prevented by XML documentation and intellisense: If new Color() accepts 3 floats, should they be 0-255 or 0-1? I need to make a color from the RGB web value #003366, why does Color.FromHex() tell me to use FromArgb? Web colors are RGBA, should I use FromRrgba()? But wait, that string is RGB, not ARGB or RGBA, so will it throw an exception? What does Color.Parse() do?

    • Edit 1: Some of these answers are documented in source code, but they are not XML docs, so this information is not available to library users.

    • Edit 2: Is it on the open-source community to contribute XML documentation? If so, fair enough, but it is a very extensive effort (to write and to review), so a call should be put out for this job to ensure someone doesn't go through all the effort then have their open PR sit unmerged for months while it falls out of sync with the main branch.

  • The library has signs of being incomplete: There remain a good number of NotImplementedException and // todo in sections of the code base that indicate additional work is still required.

Again, I'm pointing these things out the very first week .NET Maui was released, so there's plenty of time and opportunity for improvements in the coming weeks and months.

I'm optimistic this library will continue to improve, and I am very excited to watch it progress! I'm not aware of the internal pressures and constraints that led to the library being released like it was this week, but I want to end by complimenting the team on their great job so far and encourage everyone (at Microsoft and in the open-source community at large) to keep moving this library forward. The .NET Maui team undertook an ambitious challenge by setting-out to implement cross-platform graphics support, but achieving this goal elegantly will be a huge accomplishment for the .NET community!

Resources

Markdown source code last modified on May 26th, 2022
---
title: Use Maui.Graphics to Draw 2D Graphics in Any .NET Application
description: How to use Microsoft.Maui.Graphics to draw graphics in a .NET console application and save the output as an image file using SkiaSharp
date: 2022-05-25 22:13:00
tags: csharp, graphics, maui
---

# Use Maui.Graphics to Draw 2D Graphics in Any .NET Application

**This week Microsoft officially released .NET Maui and the new `Microsoft.Maui.Graphics` library which can draw 2D graphics in any .NET application (not just Maui apps).** This page offers a quick look at how to use this new library to draw graphics using SkiaSharp in a .NET 6 console application. The [C# Data Visualization](https://swharden.com/csdv/) site has additional examples for drawing and animating graphics using `Microsoft.Maui.Graphics` in Windows Forms and WPF applications.

<img src="maui-graphics-quickstart.png" class="mx-auto my-5 d-block shadow">

The code below is a full .NET 6 console application demonstrating common graphics tasks (setting colors, drawing shapes, rendering text, etc.) and was used to generate the image above.

```cs
// These packages are available on NuGet
using Microsoft.Maui.Graphics;
using Microsoft.Maui.Graphics.Skia;

// Create a bitmap in memory and draw on its Canvas
SkiaBitmapExportContext bmp = new(600, 400, 1.0f);
ICanvas canvas = bmp.Canvas;

// Draw a big blue rectangle with a dark border
Rect backgroundRectangle = new(0, 0, bmp.Width, bmp.Height);
canvas.FillColor = Color.FromArgb("#003366");
canvas.FillRectangle(backgroundRectangle);
canvas.StrokeColor = Colors.Black;
canvas.StrokeSize = 20;
canvas.DrawRectangle(backgroundRectangle);

// Draw circles randomly around the image
for (int i = 0; i < 100; i++)
{
    float x = Random.Shared.Next(bmp.Width);
    float y = Random.Shared.Next(bmp.Height);
    float r = Random.Shared.Next(5, 50);

    Color randomColor = Color.FromRgb(
        red: Random.Shared.Next(255),
        green: Random.Shared.Next(255),
        blue: Random.Shared.Next(255));

    canvas.StrokeSize = r / 3;
    canvas.StrokeColor = randomColor.WithAlpha(.3f);
    canvas.DrawCircle(x, y, r);
}

// Measure a string
string myText = "Hello, Maui.Graphics!";
Font myFont = new Font("Impact");
float myFontSize = 48;
canvas.Font = myFont;
SizeF textSize = canvas.GetStringSize(myText, myFont, myFontSize);

// Draw a rectangle to hold the string
Point point = new(
    x: (bmp.Width - textSize.Width) / 2,
    y: (bmp.Height - textSize.Height) / 2);
Rect myTextRectangle = new(point, textSize);
canvas.FillColor = Colors.Black.WithAlpha(.5f);
canvas.FillRectangle(myTextRectangle);
canvas.StrokeSize = 2;
canvas.StrokeColor = Colors.Yellow;
canvas.DrawRectangle(myTextRectangle);

// Daw the string itself
canvas.FontSize = myFontSize * .9f; // smaller than the rectangle
canvas.FontColor = Colors.White;
canvas.DrawString(myText, myTextRectangle, 
    HorizontalAlignment.Center, VerticalAlignment.Center, TextFlow.OverflowBounds);

// Save the image as a PNG file
bmp.WriteToFile("console2.png");
```

## Multi-Platform Graphics Abstraction

**The `Microsoft.Maui.Graphics` namespace a small collection of interfaces which can be implemented by many different rendering technologies** (SkiaSharp, SharpDX, GDI, etc.), making it possible to create drawing routines that are totally abstracted from the underlying graphics rendering system.

I really like that I can now create a .NET Standard 2.0 project that exclusively uses interfaces from `Microsoft.Maui.Graphics` to write code that draws complex graphics, then reference that code from other projects that use platform-specific graphics libraries to render the images.

When I write scientific simulations or data visualization code I frequently regard my graphics drawing routines as business logic, and drawing with Maui.Graphics lets me write this code to an abstraction that keeps rendering technology dependencies out of my business logic - a big win!

## Rough Edges

After working with this library while it was being developed over the last few months, these are the things I find most limiting in my personal projects which made it through the initial release this week. Some of them are [open issues](https://github.com/dotnet/Microsoft.Maui.Graphics/issues) so they may get fixed soon, and depending on how the project continues to evolve many of these rough edges may improve with time. I'm listing them here now so I can keep track of them, and I intend to update this list if/as these topics improve:

<svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
  <symbol id="check-circle-fill" fill="currentColor" viewBox="0 0 16 16">
    <path d="M16 8A8 8 0 1 1 0 8a8 8 0 0 1 16 0zm-3.97-3.03a.75.75 0 0 0-1.08.022L7.477 9.417 5.384 7.323a.75.75 0 0 0-1.06 1.06L6.97 11.03a.75.75 0 0 0 1.079-.02l3.992-4.99a.75.75 0 0 0-.01-1.05z"/>
  </symbol>
  <symbol id="info-fill" fill="currentColor" viewBox="0 0 16 16">
    <path d="M8 16A8 8 0 1 0 8 0a8 8 0 0 0 0 16zm.93-9.412-1 4.705c-.07.34.029.533.304.533.194 0 .487-.07.686-.246l-.088.416c-.287.346-.92.598-1.465.598-.703 0-1.002-.422-.808-1.319l.738-3.468c.064-.293.006-.399-.287-.47l-.451-.081.082-.381 2.29-.287zM8 5.5a1 1 0 1 1 0-2 1 1 0 0 1 0 2z"/>
  </symbol>
  <symbol id="exclamation-triangle-fill" fill="currentColor" viewBox="0 0 16 16">
    <path d="M8.982 1.566a1.13 1.13 0 0 0-1.96 0L.165 13.233c-.457.778.091 1.767.98 1.767h13.713c.889 0 1.438-.99.98-1.767L8.982 1.566zM8 5c.535 0 .954.462.9.995l-.35 3.507a.552.552 0 0 1-1.1 0L7.1 5.995A.905.905 0 0 1 8 5zm.002 6a1 1 0 1 1 0 2 1 1 0 0 1 0-2z"/>
  </symbol>
</svg>

<div class="alert alert-primary d-flex align-items-center" role="alert">
  <svg class="bi flex-shrink-0 me-2" width="24" height="24" role="img" aria-label="Info:"><use xlink:href="#info-fill"/></svg>
  <div>
    <strong>Note:</strong> This section was last reviewed on April 25, 2022 and improvements may have been made since this text was written.
  </div>
</div>

* **Strings cannot be accurately measured:** The size returned by `GetStringSize()` is inaccurate and does not respect font. There's an issue tracking this ([#279](https://github.com/dotnet/Microsoft.Maui.Graphics/issues/279)), but it's been open for more than three months and the library was released this week in its broken state.

  * EDIT: I concede multi-platform font support is a very hard problem, but this exactly the type of problem that .NET Maui was created to solve.<br><br>

* **Missing XML documentation:** Intellisense can really help people who are new to a library. The roll-out of a whole new application framework is a good example of a time when a lot of people will be exploring a new library. Let's take the [`Color` class](https://github.com/dotnet/Microsoft.Maui.Graphics/blob/main/src/Microsoft.Maui.Graphics/Color.cs) for example (which 100% of people will interact with) and consider misunderstandings that could be prevented by XML documentation and intellisense: If `new Color()` accepts 3 floats, should they be 0-255 or 0-1? I need to make a color from the RGB web value `#003366`, why does `Color.FromHex()` tell me to use `FromArgb`? Web colors are RGBA, should I use `FromRrgba()`? But wait, that string is RGB, not ARGB or RGBA, so will it throw an exception? What does `Color.Parse()` do?

  * Edit 1: Some of these answers are [documented in source code](https://github.com/dotnet/Microsoft.Maui.Graphics/blob/e15f2d552d851c28771e7fe092895e908395f8a4/src/Microsoft.Maui.Graphics/Color.cs#L574-L590), but they are not XML docs, so this information is not available to library users.

  * Edit 2: Is it on the open-source community to contribute XML documentation? If so, fair enough, but it is a very extensive effort (to write _and_ to review), so a call should be put out for this job to ensure someone doesn't go through all the effort then have their open PR sit unmerged for months while it falls out of sync with the main branch.

* **The library has signs of being incomplete:** There remain a good number of [NotImplementedException](https://github.com/dotnet/Microsoft.Maui.Graphics/search?q=NotImplementedException) and [// todo](https://github.com/dotnet/Microsoft.Maui.Graphics/search?q=todo) in sections of the code base that indicate additional work is still required.

Again, I'm pointing these things out the very first week .NET Maui was released, so there's plenty of time and opportunity for improvements in the coming weeks and months.

**I'm optimistic this library will continue to improve, and I am very excited to watch it progress!** I'm not aware of the internal pressures and constraints that led to the library being released like it was this week, but I want to end by complimenting the team on their great job so far and encourage everyone (at Microsoft and in the open-source community at large) to keep moving this library forward. The .NET Maui team undertook an ambitious challenge by setting-out to implement cross-platform graphics support, but achieving this goal elegantly will be a huge accomplishment for the .NET community!

## Resources
* [Source code for this project](https://github.com/swharden/Csharp-Data-Visualization/tree/main/projects/maui-graphics)
* [Maui.Graphics WinForms Quickstart](https://swharden.com/csdv/maui.graphics/quickstart-winforms/)
* [Maui.Graphics WPF Quickstart](https://swharden.com/csdv/maui.graphics/quickstart-wpf/)
* [Maui.Graphics Console Quickstart](https://swharden.com/csdv/maui.graphics/quickstart-console/)
* [Maui.Graphics .NET Maui Quickstart](https://swharden.com/csdv/maui.graphics/quickstart-maui/)
* [https://maui.graphics](https://maui.graphics)
May 1st, 2022

Using DataFrames in C#

The DataFrame is a data structure designed for manipulation, analysis, and visualization of tabular data, and it is the cornerstone of many data science applications. One of the most famous implementations of the data frame is provided by the Pandas package for Python. An equivalent data structure is available for C# using Microsoft's data analysis package. Although data frames are commonly used in Jupyter notebooks, they can be used in standard .NET applications as well. This article surveys Microsoft's Data Analysis package and introduces how to interact with with data frames using C# and the .NET platform.

DataFrame Quickstart

  • A DataFrame is a 2D matrix that stores data values in named columns.
  • Each column has a distinct data type.
  • Rows represent observations.

Add the Microsoft.Data.Analysis package to your project, then you can create a DataFrame like this:

using Microsoft.Data.Analysis;

string[] names = { "Oliver", "Charlotte", "Henry", "Amelia", "Owen" };
int[] ages = { 23, 19, 42, 64, 35 };
double[] heights = { 1.91, 1.62, 1.72, 1.57, 1.85 };

DataFrameColumn[] columns = {
    new StringDataFrameColumn("Name", names),
    new PrimitiveDataFrameColumn<int>("Age", ages),
    new PrimitiveDataFrameColumn<double>("Height", heights),
};

DataFrame df = new(columns);

Contents of a DataFrame can be previewed using Console.WriteLine(df) but the formatting isn't pretty.

Name  Age   Height
Oliver23    1.91
Charlotte19    1.62
Henry 42    1.72
Amelia64    1.57
Owen  35    1.85

Pretty DataFrame Formatting

A custom PrettyPrint() extension method can improve DataFrame readability. Implementing this as an extension method allows me to call df.PrettyPrint() anywhere in my code.

💡 View the full PrettyPrinters.cs source code
using Microsoft.Data.Analysis;
using System.Text;

internal static class PrettyPrinters
{
    public static void PrettyPrint(this DataFrame df) => Console.WriteLine(PrettyText(df));
    public static string PrettyText(this DataFrame df) => ToStringArray2D(df).ToFormattedText();

    public static string ToMarkdown(this DataFrame df) => ToStringArray2D(df).ToMarkdown();

    public static void PrettyPrint(this DataFrameRow row) => Console.WriteLine(Pretty(row));
    public static string Pretty(this DataFrameRow row) => row.Select(x => x?.ToString() ?? string.Empty).StringJoin();
    private static string StringJoin(this IEnumerable<string> strings) => string.Join(" ", strings.Select(x => x.ToString()));

    private static string[,] ToStringArray2D(DataFrame df)
    {
        string[,] strings = new string[df.Rows.Count + 1, df.Columns.Count];

        for (int i = 0; i < df.Columns.Count; i++)
            strings[0, i] = df.Columns[i].Name;

        for (int i = 0; i < df.Rows.Count; i++)
            for (int j = 0; j < df.Columns.Count; j++)
                strings[i + 1, j] = df[i, j]?.ToString() ?? string.Empty;

        return strings;
    }

    private static int[] GetMaxLengthsByColumn(this string[,] strings)
    {
        int[] maxLengthsByColumn = new int[strings.GetLength(1)];

        for (int y = 0; y < strings.GetLength(0); y++)
            for (int x = 0; x < strings.GetLength(1); x++)
                maxLengthsByColumn[x] = Math.Max(maxLengthsByColumn[x], strings[y, x].Length);

        return maxLengthsByColumn;
    }

    private static string ToFormattedText(this string[,] strings)
    {
        StringBuilder sb = new();
        int[] maxLengthsByColumn = GetMaxLengthsByColumn(strings);

        for (int y = 0; y < strings.GetLength(0); y++)
        {
            for (int x = 0; x < strings.GetLength(1); x++)
            {
                sb.Append(strings[y, x].PadRight(maxLengthsByColumn[x] + 2));
            }
            sb.AppendLine();
        }

        return sb.ToString();
    }


    private static string ToMarkdown(this string[,] strings)
    {
        StringBuilder sb = new();
        int[] maxLengthsByColumn = GetMaxLengthsByColumn(strings);

        for (int y = 0; y < strings.GetLength(0); y++)
        {
            for (int x = 0; x < strings.GetLength(1); x++)
            {
                sb.Append(strings[y, x].PadRight(maxLengthsByColumn[x]));
                if (x < strings.GetLength(1) - 1)
                    sb.Append(" | ");
            }
            sb.AppendLine();

            if (y == 0)
            {
                for (int i = 0; i < strings.GetLength(1); i++)
                {
                    int bars = maxLengthsByColumn[i] + 2;
                    if (i == 0)
                        bars -= 1;
                    sb.Append(new String('-', bars));

                    if (i < strings.GetLength(1) - 1)
                        sb.Append("|");
                }
                sb.AppendLine();
            }
        }

        return sb.ToString();
    }
}
Name       Age  Height
Oliver     23   1.91
Charlotte  19   1.62
Henry      42   1.72
Amelia     64   1.57
Owen       35   1.85

I can create similar methods to format a DataFrame as Markdown or HTML.

Name      | Age | Height
----------|-----|--------
Oliver    | 23  | 1.91
Charlotte | 19  | 1.62
Henry     | 42  | 1.72
Amelia    | 64  | 1.57
Owen      | 35  | 1.85
Name Age Height
Oliver 23 1.91
Charlotte 19 1.62
Henry 42 1.72
Amelia 64 1.57
Owen 35 1.85

Using DataFrames in Interactive Notebooks

To get started using .NET workbooks, install the .NET Interactive Notebooks extension for VS Code, create a new demo.ipynb file, then add your code.

Previously users had to create custom HTML formatters to properly display DataFrames in .NET Interactive Notebooks, but these days it works right out of the box.

💡 See demo.html for a full length demonstration notebook

// visualize the DataFrame
df

Append a Row

Build a new row using key/value pair then append it to the DataFrame

List<KeyValuePair<string, object>> newRowData = new()
{
    new KeyValuePair<string, object>("Name", "Scott"),
    new KeyValuePair<string, object>("Age", 36),
    new KeyValuePair<string, object>("Height", 1.65),
};

df.Append(newRowData, inPlace: true);

Add a Column

Build a new column, populate it with data, and add it to the DataFrame

int[] weights = { 123, 321, 111, 121, 131 };
PrimitiveDataFrameColumn<int> weightCol = new("Weight", weights);
df.Columns.Add(weightCol);

Sort and Filter

The DataFrame class has numerous operations available to sort, filter, and analyze data in many different ways. A popular pattern when working with DataFrames is to use method chaining to combine numerous operations together into a single statement. See the DataFrame Class API for a full list of available operations.

df.OrderBy("Name")
    .Filter(df["Age"].ElementwiseGreaterThan(30))
    .PrettyPrint();
Name    Age  Height
Henry   42   1.72
Oliver  23   1.91
Owen    35   1.85

Mathematical Operations

It's easy to perform math on columns or across multiple DataFrames. In this example we will perform math using two columns and create a new column to hold the output.

DataFrameColumn iqCol = df["Age"] * df["Height"] * 1.5;

double[] iqs = Enumerable.Range(0, (int)iqCol.Length)
    .Select(x => (double)iqCol[x])
    .ToArray();

df.Columns.Add(new PrimitiveDataFrameColumn<double>("IQ", iqs));
df.PrettyPrint();
Name       Age  Height  IQ
Oliver     23   1.91    65.9
Charlotte  19   1.62    46.17
Henry      42   1.72    108.36
Amelia     64   1.57    150.72
Owen       35   1.85    97.12

Statistical Operations

You can iterate across every row of a column to calculate population statistics

foreach (DataFrameColumn col in df.Columns.Skip(1))
{
    // warning: additional care must be taken for datasets which contain null
    double[] values = Enumerable.Range(0, (int)col.Length).Select(x => Convert.ToDouble(col[x])).ToArray();
    (double mean, double std) = MeanAndStd(values);
    Console.WriteLine($"{col.Name} = {mean} +/- {std:N3} (n={values.Length})");
}
Age = 36.6 +/- 15.982 (n=5)
Height = 1.734 +/- 0.130 (n=5)
💡 View the full MeanAndStd() source code
private static (double mean, double std) MeanAndStd(double[] values)
{
    if (values is null)
        throw new ArgumentNullException(nameof(values));

    if (values.Length == 0)
        throw new ArgumentException($"{nameof(values)} must not be empty");

    double sum = 0;
    for (int i = 0; i < values.Length; i++)
        sum += values[i];

    double mean = sum / values.Length;

    double sumVariancesSquared = 0;
    for (int i = 0; i < values.Length; i++)
    {
        double pointVariance = Math.Abs(mean - values[i]);
        double pointVarianceSquared = Math.Pow(pointVariance, 2);
        sumVariancesSquared += pointVarianceSquared;
    }

    double meanVarianceSquared = sumVariancesSquared / values.Length;
    double std = Math.Sqrt(meanVarianceSquared);

    return (mean, std);
}

Plot Values from a DataFrame

I use ScottPlot.NET to visualize data from DataFrames in .NET applications and .NET Interactive Notebooks. ScottPlot can generate a variety of plot types and has many options for customization. See the ScottPlot Cookbook for examples and API documentation.

// Register a custom formatter to display ScottPlot plots as images
using Microsoft.DotNet.Interactive.Formatting;
Formatter.Register(typeof(ScottPlot.Plot), (plt, writer) => 
    writer.Write(((ScottPlot.Plot)plt).GetImageHTML()), HtmlFormatter.MimeType);
// Get the data you wish to display in double arrays
double[] ages = Enumerable.Range(0, (int)df.Rows.Count).Select(x => Convert.ToDouble(df["Age"][x])).ToArray();
double[] heights = Enumerable.Range(0, (int)df.Rows.Count).Select(x => Convert.ToDouble(df["Height"][x])).ToArray();
// Create and display a plot
var plt = new ScottPlot.Plot(400, 300);
plt.AddScatter(ages, heights);
plt.XLabel("Age");
plt.YLabel("Height");
plt

💡 See demo.html for a full length demonstration notebook

If you are only working inside a Notebook and you want all your plots to be HTML and JavaScript, XPlot.Plotly is a good tool to use.

Data may contain null

I didn't demonstrate it in the code examples above, but note that all column data types are nullable. While null-containing data requires extra considerations when writing mathematical routes, it's a convenient way to model missing data which is a common occurrence in the real world.

Why not just use LINQ?

I see this question asked frequently, often with an aggressive and condescending tone. LINQ (Language-Integrated Query) is fantastic for performing logical operations on simple collections of data. When you have large 2D datasets of labeled data, advantages of data frames over flat LINQ statements start to become apparent. It is also easy to perform logical operations across multiple data frames, allowing users to write simpler and more readable code than could be achieved with LINQ statements. Data frames also make it much easier to visualize complex data too. In the data science world where complex labeled datasets are routinely compared, manipulated, merged, and visualized, often in an interactive context, the data frames are much easier to work with than raw LINQ statements.

Conclusions

Although I typically reach for Python to perform exploratory data science, it's good to know that C# has a DataFrame available and that it can be used to inspect and manipulate tabular data. DataFrames pair well with ScottPlot figures in interactive notebooks and are a great way to inspect and communicate complex data. I look forward to watching Microsoft's Data Analysis namespace continue to evolve as part of their machine learning / ML.NET platform.

Resources

Markdown source code last modified on May 2nd, 2022
---
title: Using DataFrames in C#
description: How to use the DataFrame class from the Microsoft.Data.Analysis package to interact with tabular data
date: 2022-05-01 23:00:00
tags: csharp
---

# Using DataFrames in C# 

**The DataFrame is a data structure designed for manipulation, analysis, and visualization of tabular data, and it is the cornerstone of many data science applications.** One of the most famous implementations of the data frame is provided by the Pandas package for Python. An equivalent data structure is available for C# using Microsoft's data analysis package. Although data frames are commonly used in Jupyter notebooks, they can be used in standard .NET applications as well. This article surveys Microsoft's Data Analysis package and introduces how to interact with with data frames using C# and the .NET platform.

## DataFrame Quickstart

* A DataFrame is a 2D matrix that stores data values in named columns.
* Each column has a distinct data type.
* Rows represent observations.

Add the [Microsoft.Data.Analysis package](https://www.nuget.org/packages/Microsoft.Data.Analysis/) to your project, then you can create a DataFrame like this:

```cs
using Microsoft.Data.Analysis;

string[] names = { "Oliver", "Charlotte", "Henry", "Amelia", "Owen" };
int[] ages = { 23, 19, 42, 64, 35 };
double[] heights = { 1.91, 1.62, 1.72, 1.57, 1.85 };

DataFrameColumn[] columns = {
    new StringDataFrameColumn("Name", names),
    new PrimitiveDataFrameColumn<int>("Age", ages),
    new PrimitiveDataFrameColumn<double>("Height", heights),
};

DataFrame df = new(columns);
```

Contents of a DataFrame can be previewed using `Console.WriteLine(df)` but the formatting isn't pretty.

```text
Name  Age   Height
Oliver23    1.91
Charlotte19    1.62
Henry 42    1.72
Amelia64    1.57
Owen  35    1.85
```

## Pretty DataFrame Formatting

**A custom `PrettyPrint()` extension method can improve DataFrame readability.** Implementing this as an extension method allows me to call `df.PrettyPrint()` anywhere in my code.

<details>
<summary>💡 View the full <code>PrettyPrinters.cs</code> source code</summary>

```cs
using Microsoft.Data.Analysis;
using System.Text;

internal static class PrettyPrinters
{
    public static void PrettyPrint(this DataFrame df) => Console.WriteLine(PrettyText(df));
    public static string PrettyText(this DataFrame df) => ToStringArray2D(df).ToFormattedText();

    public static string ToMarkdown(this DataFrame df) => ToStringArray2D(df).ToMarkdown();

    public static void PrettyPrint(this DataFrameRow row) => Console.WriteLine(Pretty(row));
    public static string Pretty(this DataFrameRow row) => row.Select(x => x?.ToString() ?? string.Empty).StringJoin();
    private static string StringJoin(this IEnumerable<string> strings) => string.Join(" ", strings.Select(x => x.ToString()));

    private static string[,] ToStringArray2D(DataFrame df)
    {
        string[,] strings = new string[df.Rows.Count + 1, df.Columns.Count];

        for (int i = 0; i < df.Columns.Count; i++)
            strings[0, i] = df.Columns[i].Name;

        for (int i = 0; i < df.Rows.Count; i++)
            for (int j = 0; j < df.Columns.Count; j++)
                strings[i + 1, j] = df[i, j]?.ToString() ?? string.Empty;

        return strings;
    }

    private static int[] GetMaxLengthsByColumn(this string[,] strings)
    {
        int[] maxLengthsByColumn = new int[strings.GetLength(1)];

        for (int y = 0; y < strings.GetLength(0); y++)
            for (int x = 0; x < strings.GetLength(1); x++)
                maxLengthsByColumn[x] = Math.Max(maxLengthsByColumn[x], strings[y, x].Length);

        return maxLengthsByColumn;
    }

    private static string ToFormattedText(this string[,] strings)
    {
        StringBuilder sb = new();
        int[] maxLengthsByColumn = GetMaxLengthsByColumn(strings);

        for (int y = 0; y < strings.GetLength(0); y++)
        {
            for (int x = 0; x < strings.GetLength(1); x++)
            {
                sb.Append(strings[y, x].PadRight(maxLengthsByColumn[x] + 2));
            }
            sb.AppendLine();
        }

        return sb.ToString();
    }


    private static string ToMarkdown(this string[,] strings)
    {
        StringBuilder sb = new();
        int[] maxLengthsByColumn = GetMaxLengthsByColumn(strings);

        for (int y = 0; y < strings.GetLength(0); y++)
        {
            for (int x = 0; x < strings.GetLength(1); x++)
            {
                sb.Append(strings[y, x].PadRight(maxLengthsByColumn[x]));
                if (x < strings.GetLength(1) - 1)
                    sb.Append(" | ");
            }
            sb.AppendLine();

            if (y == 0)
            {
                for (int i = 0; i < strings.GetLength(1); i++)
                {
                    int bars = maxLengthsByColumn[i] + 2;
                    if (i == 0)
                        bars -= 1;
                    sb.Append(new String('-', bars));

                    if (i < strings.GetLength(1) - 1)
                        sb.Append("|");
                }
                sb.AppendLine();
            }
        }

        return sb.ToString();
    }
}
```

</details>

```cs
Name       Age  Height
Oliver     23   1.91
Charlotte  19   1.62
Henry      42   1.72
Amelia     64   1.57
Owen       35   1.85
```

I can create similar methods to format a DataFrame as Markdown or HTML.

```text
Name      | Age | Height
----------|-----|--------
Oliver    | 23  | 1.91
Charlotte | 19  | 1.62
Henry     | 42  | 1.72
Amelia    | 64  | 1.57
Owen      | 35  | 1.85
```


Name      | Age | Height
----------|-----|--------
Oliver    | 23  | 1.91
Charlotte | 19  | 1.62
Henry     | 42  | 1.72
Amelia    | 64  | 1.57
Owen      | 35  | 1.85

## Using DataFrames in Interactive Notebooks

To get started using .NET workbooks, install the [.NET Interactive Notebooks extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode), create a new `demo.ipynb` file, then add your code.

Previously users had to create custom HTML formatters to properly display DataFrames in .NET Interactive Notebooks, but these days it works right out of the box.

> 💡 See [demo.html](demo.html) for a full length demonstration notebook

```cs
// visualize the DataFrame
df
```

![](dataframe-notebook.jpg)

## Append a Row

Build a new row using key/value pair then append it to the DataFrame

```cs
List<KeyValuePair<string, object>> newRowData = new()
{
    new KeyValuePair<string, object>("Name", "Scott"),
    new KeyValuePair<string, object>("Age", 36),
    new KeyValuePair<string, object>("Height", 1.65),
};

df.Append(newRowData, inPlace: true);
```

## Add a Column

Build a new column, populate it with data, and add it to the DataFrame

```cs
int[] weights = { 123, 321, 111, 121, 131 };
PrimitiveDataFrameColumn<int> weightCol = new("Weight", weights);
df.Columns.Add(weightCol);
```

## Sort and Filter

**The DataFrame class has numerous operations available** to sort, filter, and analyze data in many different ways. A popular pattern when working with DataFrames is to use _method chaining_ to combine numerous operations together into a single statement. See the [DataFrame Class API](https://docs.microsoft.com/en-us/dotnet/api/microsoft.data.analysis.dataframe) for a full list of available operations.

```cs
df.OrderBy("Name")
    .Filter(df["Age"].ElementwiseGreaterThan(30))
    .PrettyPrint();
```

```text
Name    Age  Height
Henry   42   1.72
Oliver  23   1.91
Owen    35   1.85
```

## Mathematical Operations

It's easy to perform math on columns or across multiple DataFrames. In this example we will perform math using two columns and create a new column to hold the output.

```cs
DataFrameColumn iqCol = df["Age"] * df["Height"] * 1.5;

double[] iqs = Enumerable.Range(0, (int)iqCol.Length)
    .Select(x => (double)iqCol[x])
    .ToArray();

df.Columns.Add(new PrimitiveDataFrameColumn<double>("IQ", iqs));
df.PrettyPrint();
```

```text
Name       Age  Height  IQ
Oliver     23   1.91    65.9
Charlotte  19   1.62    46.17
Henry      42   1.72    108.36
Amelia     64   1.57    150.72
Owen       35   1.85    97.12
```

## Statistical Operations

You can iterate across every row of a column to calculate population statistics

```cs
foreach (DataFrameColumn col in df.Columns.Skip(1))
{
    // warning: additional care must be taken for datasets which contain null
    double[] values = Enumerable.Range(0, (int)col.Length).Select(x => Convert.ToDouble(col[x])).ToArray();
    (double mean, double std) = MeanAndStd(values);
    Console.WriteLine($"{col.Name} = {mean} +/- {std:N3} (n={values.Length})");
}
```

```text
Age = 36.6 +/- 15.982 (n=5)
Height = 1.734 +/- 0.130 (n=5)
```


<details>
<summary>💡 View the full <code>MeanAndStd()</code> source code</summary>

```cs
private static (double mean, double std) MeanAndStd(double[] values)
{
	if (values is null)
		throw new ArgumentNullException(nameof(values));

	if (values.Length == 0)
		throw new ArgumentException($"{nameof(values)} must not be empty");

	double sum = 0;
	for (int i = 0; i < values.Length; i++)
		sum += values[i];

	double mean = sum / values.Length;

	double sumVariancesSquared = 0;
	for (int i = 0; i < values.Length; i++)
	{
		double pointVariance = Math.Abs(mean - values[i]);
		double pointVarianceSquared = Math.Pow(pointVariance, 2);
		sumVariancesSquared += pointVarianceSquared;
	}

	double meanVarianceSquared = sumVariancesSquared / values.Length;
	double std = Math.Sqrt(meanVarianceSquared);

	return (mean, std);
}
```

</details>

## Plot Values from a DataFrame

**I use [ScottPlot.NET](https://scottplot.net) to visualize data from DataFrames in .NET applications and .NET Interactive Notebooks.** ScottPlot can generate a variety of plot types and has many options for customization. See [the ScottPlot Cookbook](https://scottplot.net/cookbook/4.1/) for examples and API documentation.

```cs
// Register a custom formatter to display ScottPlot plots as images
using Microsoft.DotNet.Interactive.Formatting;
Formatter.Register(typeof(ScottPlot.Plot), (plt, writer) => 
    writer.Write(((ScottPlot.Plot)plt).GetImageHTML()), HtmlFormatter.MimeType);
```

```cs
// Get the data you wish to display in double arrays
double[] ages = Enumerable.Range(0, (int)df.Rows.Count).Select(x => Convert.ToDouble(df["Age"][x])).ToArray();
double[] heights = Enumerable.Range(0, (int)df.Rows.Count).Select(x => Convert.ToDouble(df["Height"][x])).ToArray();
```

```cs
// Create and display a plot
var plt = new ScottPlot.Plot(400, 300);
plt.AddScatter(ages, heights);
plt.XLabel("Age");
plt.YLabel("Height");
plt
```

![](scottplot-notebook.png)

> 💡 See [demo.html](demo.html) for a full length demonstration notebook

If you are only working inside a Notebook and you want all your plots to be HTML and JavaScript, [XPlot.Plotly](https://towardsdatascience.com/getting-started-with-c-dataframe-and-xplot-ploty-6ea6ce0ce8e3) is a good tool to use.

## Data may contain null

I didn't demonstrate it in the code examples above, but note that all column data types are nullable. While null-containing data requires extra considerations when writing mathematical routes, it's a convenient way to model missing data which is a common occurrence in the real world. 

## Why not just use LINQ?

I see this question asked frequently, often with an aggressive and condescending tone. LINQ (Language-Integrated Query) is fantastic for performing logical operations on simple collections of data. When you have large 2D datasets of _labeled_ data, advantages of data frames over flat LINQ statements start to become apparent. It is also easy to perform logical operations across multiple data frames, allowing users to write simpler and more readable code than could be achieved with LINQ statements. Data frames also make it much easier to visualize complex data too. In the data science world where complex labeled datasets are routinely compared, manipulated, merged, and visualized, often in an interactive context, the data frames are much easier to work with than raw LINQ statements.

## Conclusions

Although I typically reach for Python to perform exploratory data science, it's good to know that C# has a DataFrame available and that it can be used to inspect and manipulate tabular data. DataFrames pair well with [ScottPlot](https://scottplot.net) figures in interactive notebooks and are a great way to inspect and communicate complex data. I look forward to watching Microsoft's Data Analysis namespace continue to evolve as part of their machine learning / ML.NET platform.

## Resources

* [Example notebook for this project](demo.html)

* [Source code for this project](https://github.com/swharden/Csharp-Data-Visualization/tree/main/projects/dataframe)

* [Official `Microsoft.Data.Analysis.DataFrame` Class Documentation](https://docs.microsoft.com/en-us/dotnet/api/microsoft.data.analysis.dataframe)

* [Microsoft.Data.Analysis source code](https://github.com/dotnet/machinelearning/tree/main/src/Microsoft.Data.Analysis) 

* [An Introduction to DataFrame](https://devblogs.microsoft.com/dotnet/an-introduction-to-dataframe/) (.NET Blog)

* [ExtremeOptimization DataFrame Quickstart](https://www.extremeoptimization.com/QuickStart/CSharp/DataFrames.aspx)

* [`Microsoft.Data.Analysis` on NuGet](https://www.nuget.org/packages/Microsoft.Data.Analysis/)

* [Getting Started With C# DataFrame and XPlot.Plotly](https://towardsdatascience.com/getting-started-with-c-dataframe-and-xplot-ploty-6ea6ce0ce8e3)

* [10 minutes to pandas](https://pandas.pydata.org/docs/user_guide/10min.html)
April 24th, 2022

FTP Deploy with GitHub Actions

This article describes how I use GitHub Actions to deploy content using FTP without any third-party dependencies. Code executed in continuous deployment pipelines may have access to secrets (like FTP credentials and SSH keys). Supply-chain attacks are becoming more frequent, including self-sabotage by open-source authors. Without 2FA, the code of well-intentioned maintainers is one stolen password away from becoming malicious. For these reasons I find it imperative to eliminate third-party Actions from my CI/CD pipelines wherever possible.

⚠️ WARNING: Third-party Actions in the GitHub Actions Marketplace may be compromised to run malicious code and leak secrets. There are dozens of public actions claiming to facilitate FTP deployment. I advise avoiding third-party actions in your CI/CD pipeline whenever possible.

This article assumes you have at least some familiarity with GitHub Actions, but if you're never used them before I recommend taking 5 minutes to work through the Quickstart for GitHub Actions.

FTP Deployment Workflow

This workflow demonstrates how to use LFTP inside a GitHub Action to transfer files/folders with FTP without requiring a third-party dependency. Users can copy/paste this workflow and edit it as needed according to the LFTP manual.

name: 🚀 FTP Deploy
on: [push, workflow_dispatch]
jobs:
  ftp-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: 🛒 Checkout
        uses: actions/checkout@v2
      - name: 📦 Get LFTP
        run: sudo apt install lftp
      - name: 🛠️ Configure LFTP
        run: mkdir ~/.lftp && echo "set ssl:verify-certificate false;" >> ~/.lftp/rc
      - name: 🔑 Load Secrets
        run: echo "machine ${{ secrets.FTP_HOSTNAME }} login ${{ secrets.FTP_USERNAME }} password ${{ secrets.FTP_PASSWORD }}" > ~/.netrc
      - name: 📄 Upload File
        run: lftp -e "put -O /destination/ ./README.md" ${{ secrets.FTP_HOSTNAME }}
      - name: 📁 Upload Folder
        run: lftp -e "mirror --parallel=100 -R ./ffmpeg/ /ffmpeg/" ${{ secrets.FTP_HOSTNAME }}

This workflow uses GitHub Encrypted Secrets to store secret values:

  • FTP_HOSTNAME - a string like ftp.example.com
  • FTP_USERNAME - a string like login@example.com
  • FTP_PASSWORD - a string like superSecret123

How to Verify the Host Certificate

Extra steps can be taken to record the host's public certificate, store it as a GitHub Encrypted Secret, load it into the GitHub Action runner, and configure LFTP to compare against at run time.

  • 1: Acquire your host's entire certificate chain. The -showcerts argument was critically important for me.
openssl s_client -connect example.com:21 -starttls ftp -showcerts
      - name: 🛠️ Configure LFTP
        run: |
          mkdir ~/.lftp
          echo "set ssl:ca-file ~/.lftp/certs.crt;set ssl:check-hostname no;" >> ~/.lftp/rc
          echo "${{ secrets.FTP_CERTS_BASE64 }}" | base64 --decode > ~/.lftp/certs.crt

Notes

To avoid storing passwords to disk you can pass them in with each lftp command using the -u argument. See the LFTP Documentation for details.

Although potentially insecure, some GitHub Marketplace Actions offer compelling features: One of the most popular is SamKirkland's FTP Deploy Action which has advanced features like the use of server-stored JSON files to store file hashes to detect and selectively re-upload changed files. I encourage you to check them out, even though I try to avoid passing my secrets through third-party actions wherever possible.

Favor SSH and rsync over FTP and lftp where possible because rsync is faster, more secure, and designed to prevent needless transfer of unchanged files. I recently wrote about how to safely deploy over SSH using rsync with GitHub Actions.

Resources

Markdown source code last modified on April 30th, 2022
---
title: FTP Deploy with GitHub Actions
description: Deploy content over FTP using GitHub Actions and no dependencies
date: 2022-04-24 16:45:00
tags: GitHub
---

# FTP Deploy with GitHub Actions

**This article describes how I use GitHub Actions to deploy content using FTP without any third-party dependencies.** Code executed in continuous deployment pipelines may have access to secrets (like FTP credentials and SSH keys). Supply-chain attacks are becoming more frequent, including self-sabotage by open-source authors. Without 2FA, the code of well-intentioned maintainers is one stolen password away from becoming malicious. For these reasons I find it imperative to eliminate third-party Actions from my CI/CD pipelines wherever possible.

> ⚠️ **WARNING: Third-party Actions in the GitHub Actions Marketplace may be compromised to run malicious code and leak secrets.** There are [dozens of public actions](https://github.com/marketplace?category=&query=ftp+sort%3Apopularity-desc&type=actions) claiming to facilitate FTP deployment. I advise avoiding third-party actions in your CI/CD pipeline whenever possible.

This article assumes you have at least some familiarity with GitHub Actions, but if you're never used them before I recommend taking 5 minutes to work through the [Quickstart for GitHub Actions](https://docs.github.com/en/actions/quickstart).

## FTP Deployment Workflow
**This workflow demonstrates how to use LFTP inside a GitHub Action to transfer files/folders with FTP without requiring a third-party dependency.** Users can copy/paste this workflow and edit it as needed according to the [LFTP manual](https://lftp.yar.ru/lftp-man.html).

```yaml
name: 🚀 FTP Deploy
on: [push, workflow_dispatch]
jobs:
  ftp-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: 🛒 Checkout
        uses: actions/checkout@v2
      - name: 📦 Get LFTP
        run: sudo apt install lftp
      - name: 🛠️ Configure LFTP
        run: mkdir ~/.lftp && echo "set ssl:verify-certificate false;" >> ~/.lftp/rc
      - name: 🔑 Load Secrets
        run: echo "machine ${{ secrets.FTP_HOSTNAME }} login ${{ secrets.FTP_USERNAME }} password ${{ secrets.FTP_PASSWORD }}" > ~/.netrc
      - name: 📄 Upload File
        run: lftp -e "put -O /destination/ ./README.md" ${{ secrets.FTP_HOSTNAME }}
      - name: 📁 Upload Folder
        run: lftp -e "mirror --parallel=100 -R ./ffmpeg/ /ffmpeg/" ${{ secrets.FTP_HOSTNAME }}
```

This workflow uses [GitHub Encrypted Secrets](https://docs.github.com/en/actions/security-guides/encrypted-secrets) to store secret values:

* `FTP_HOSTNAME` - a string like `ftp.example.com`
* `FTP_USERNAME` - a string like `login@example.com`
* `FTP_PASSWORD` - a string like `superSecret123`

<img src="github-actions-ftp.jpg" class="d-block border shadow my-5 mx-auto" />

## How to Verify the Host Certificate

Extra steps can be taken to record the host's public certificate, store it as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets), load it into the GitHub Action runner, and configure LFTP to compare against at run time.

* 1: Acquire your host's _entire_ certificate chain. The `-showcerts` argument was critically important for me.

```bash
openssl s_client -connect example.com:21 -starttls ftp -showcerts
```

* 2: Copy the _entire_ output, [convert it to a Base64 string](https://emn178.github.io/online-tools/base64_encode.html), and store it as a [GitHub Encrypted Secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets) named `FTP_CERTS_BASE64`

* 3: Update your GitHub Action to save the certificate file and configure LFTP to use it:

```yaml
      - name: 🛠️ Configure LFTP
        run: |
          mkdir ~/.lftp
          echo "set ssl:ca-file ~/.lftp/certs.crt;set ssl:check-hostname no;" >> ~/.lftp/rc
          echo "${{ secrets.FTP_CERTS_BASE64 }}" | base64 --decode > ~/.lftp/certs.crt
```

## Notes

**To avoid storing passwords to disk** you can pass them in with each `lftp` command using the `-u` argument. See the [LFTP Documentation](https://lftp.yar.ru/lftp-man.html) for details.

**Although potentially insecure, some GitHub Marketplace Actions offer compelling features:** One of the most popular is [SamKirkland's FTP Deploy Action](https://github.com/SamKirkland/FTP-Deploy-Action) which has advanced features like the use of server-stored JSON files to store file hashes to detect and selectively re-upload changed files. I encourage you to check them out, even though I try to avoid passing my secrets through third-party actions wherever possible.

**Favor SSH and `rsync` over FTP and `lftp` where possible** because `rsync` is faster, more secure, and designed to prevent needless transfer of unchanged files. I recently wrote about [how to safely deploy over SSH using rsync with GitHub Actions](https://swharden.com/blog/2022-03-20-github-actions-hugo/).

## Resources
* [LFTP project on GitHub](https://github.com/lavv17/lftp)
* [LFTP Documentation](https://lftp.yar.ru/lftp-man.html)
* [GitHub Actions: Build and deploy a Hugo site](https://swharden.com/blog/2022-03-20-github-actions-hugo/)
* [GitHub Actions: How to deploy over SSH using rsync](https://swharden.com/blog/2022-03-20-github-actions-hugo/)
* [GitHub Encrypted Secrets](https://docs.github.com/en/actions/security-guides/encrypted-secrets)
* [GNU Manual: The .netrc file](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html)
* [SSL Checker: Certificate Decoder](https://www.sslchecker.com/certdecoder)
Pages