The personal website of Scott W Harden

C# Microphone Level Monitor

This page demonstrates how to continuously monitor microphone input using C#. Code here may be a helpful reference for developers interested in working with mono or stereo data captured from an audio device in real time. This project uses NAudio to provide simple access to the microphone on Windows platforms.

Mono Stereo

Full source code is available on GitHub (Program.cs)

Configure the Audio Input Device

This program starts by creating a WaveInEvent with a WaveFormat that specifies the sample rate, bit depth, and number of channels (1 for mono, 2 for stereo).

We can create a function to handle incoming data and add it to the DataAvailable event handler:

var waveIn = new NAudio.Wave.WaveInEvent
{
    DeviceNumber = 0, // customize this to select your microphone device
    WaveFormat = new NAudio.Wave.WaveFormat(rate: 44100, bits: 16, channels: 1),
    BufferMilliseconds = 50
};
waveIn.DataAvailable += ShowPeakMono;
waveIn.StartRecording();

Analyze Mono Audio Data

This method is called when the incoming audio buffer is filled. One of the arguments gives you access to the raw bytes in the buffer, and it's up to you to convert them to the appropriate data format.

This example is suitable for 16-bit (two bytes per sample) mono input.

private static void ShowPeakMono(object sender, NAudio.Wave.WaveInEventArgs args)
{
    float maxValue = 32767;
    int peakValue = 0;
    int bytesPerSample = 2;
    for (int index = 0; index < args.BytesRecorded; index += bytesPerSample)
    {
        int value = BitConverter.ToInt16(args.Buffer, index);
        peakValue = Math.Max(peakValue, value);
    }

    Console.WriteLine("L=" + GetBars(peakValue / maxValue));
}

This method converts a level (fraction) into bars suitable to display in the console:

private static string GetBars(double fraction, int barCount = 35)
{
    int barsOn = (int)(barCount * fraction);
    int barsOff = barCount - barsOn;
    return new string('#', barsOn) + new string('-', barsOff);
}

Analyze Stereo Audio Data

When the WaveFormat is configured for 2 channels, bytes in the incoming audio buffer will have left and right channel values interleaved (2 bytes for left, two bytes for right, then repeat). Left and right channels must be treated separately to display independent levels for stereo audio inputs.

This example is suitable for 16-bit (two bytes per sample) stereo input.

private static void ShowPeakStereo(object sender, NAudio.Wave.WaveInEventArgs args)
{
    float maxValue = 32767;
    int peakL = 0;
    int peakR = 0;
    int bytesPerSample = 4;
    for (int index = 0; index < args.BytesRecorded; index += bytesPerSample)
    {
        int valueL = BitConverter.ToInt16(args.Buffer, index);
        peakL = Math.Max(peakL, valueL);
        int valueR = BitConverter.ToInt16(args.Buffer, index + 2);
        peakR = Math.Max(peakR, valueR);
    }

    Console.Write("L=" + GetBars(peakL / maxValue));
    Console.Write(" ");
    Console.Write("R=" + GetBars(peakR / maxValue));
    Console.Write("\n");
}

Resources

Markdown source code last modified on July 4th, 2021
---
Title: C# Microphone Level Monitor
Description: How to continuously monitor the level of an audio input (mono or stereo) with C#
Date: 2021-07-03 10:42PM EST
---

# C# Microphone Level Monitor

**This page demonstrates how to continuously monitor microphone input using C#.** Code here may be a helpful reference  for developers interested in working with mono or stereo data captured from an audio device in real time. This project uses [NAudio](https://www.nuget.org/packages/NAudio) to provide simple access to the microphone on Windows platforms.

Mono | Stereo
---|---
<img src='microphone-mono.gif'>|<img src='microphone-stereo.gif'>

 Full source code is available on GitHub ([Program.cs](https://github.com/swharden/Csharp-Data-Visualization/blob/master/examples/2021-07-03-console-microphone/Program.cs))



## Configure the Audio Input Device

This program starts by creating a `WaveInEvent` with a `WaveFormat` that specifies the sample rate, bit depth, and number of channels (1 for mono, 2 for stereo).

We can create a function to handle incoming data and add it to the `DataAvailable` event handler:

```cs
var waveIn = new NAudio.Wave.WaveInEvent
{
    DeviceNumber = 0, // customize this to select your microphone device
    WaveFormat = new NAudio.Wave.WaveFormat(rate: 44100, bits: 16, channels: 1),
    BufferMilliseconds = 50
};
waveIn.DataAvailable += ShowPeakMono;
waveIn.StartRecording();
```

## Analyze Mono Audio Data

This method is called when the incoming audio buffer is filled. One of the arguments gives you access to the raw bytes in the buffer, and it's up to you to convert them to the appropriate data format. 

This example is suitable for 16-bit (two bytes per sample) mono input.

```cs
private static void ShowPeakMono(object sender, NAudio.Wave.WaveInEventArgs args)
{
    float maxValue = 32767;
    int peakValue = 0;
    int bytesPerSample = 2;
    for (int index = 0; index < args.BytesRecorded; index += bytesPerSample)
    {
        int value = BitConverter.ToInt16(args.Buffer, index);
        peakValue = Math.Max(peakValue, value);
    }

    Console.WriteLine("L=" + GetBars(peakValue / maxValue));
}
```

This method converts a level (fraction) into bars suitable to display in the console:

```cs
private static string GetBars(double fraction, int barCount = 35)
{
    int barsOn = (int)(barCount * fraction);
    int barsOff = barCount - barsOn;
    return new string('#', barsOn) + new string('-', barsOff);
}
```

<div class="text-center">

![](microphone-mono.gif)

</div>

## Analyze Stereo Audio Data

When the `WaveFormat` is configured for 2 channels, bytes in the incoming audio buffer will have left and right channel values interleaved (2 bytes for left, two bytes for right, then repeat). Left and right channels must be treated separately to display independent levels for stereo audio inputs.

This example is suitable for 16-bit (two bytes per sample) stereo input.

```cs
private static void ShowPeakStereo(object sender, NAudio.Wave.WaveInEventArgs args)
{
    float maxValue = 32767;
    int peakL = 0;
    int peakR = 0;
    int bytesPerSample = 4;
    for (int index = 0; index < args.BytesRecorded; index += bytesPerSample)
    {
        int valueL = BitConverter.ToInt16(args.Buffer, index);
        peakL = Math.Max(peakL, valueL);
        int valueR = BitConverter.ToInt16(args.Buffer, index + 2);
        peakR = Math.Max(peakR, valueR);
    }

    Console.Write("L=" + GetBars(peakL / maxValue));
    Console.Write(" ");
    Console.Write("R=" + GetBars(peakR / maxValue));
    Console.Write("\n");
}
```

<div class="text-center">

![](microphone-stereo.gif)

</div>

## Resources

* [Realtime Audio Visualization in Python](https://swharden.com/blog/2016-07-19-realtime-audio-visualization-in-python/) - a similar project using Python and the [pyaudio](http://people.csail.mit.edu/hubert/pyaudio/) library

* [NuGet: NAudio](https://www.nuget.org/packages/NAudio)

* [GitHub: console-microphone/Program.cs](https://github.com/swharden/Csharp-Data-Visualization/blob/master/examples/2021-07-03-console-microphone/Program.cs)

* [GitHub: C# Data Visualization](https://github.com/swharden/Csharp-Data-Visualization)

Working with 16-bit Images in CSharp

Scientific image analysis frequently involves working with 12-bit and 14-bit sensor data stored in 16-bit TIF files. Images commonly encountered on the internat are 24-bit or 32-bit RGB images (where each pixel is represented by 8 bits each for red, green, blue, and possibly alpha). Typical image analysis libraries and documentation often lack information about how to work with 16-bit image data.

This page summarizes how I work with 16-bit TIF file data in C#. I prefer Magick.NET (an ImageMagick wrapper) when working with many different file formats, and LibTiff.Net whenever I know my source files will all be identically-formatted TIFs or multidimensional TIFs (stacks).

ImageMagick

ImageMagick is a free and open-source cross-platform software suite for displaying, creating, converting, modifying, and editing raster images. Although ImageMagick is commonly used at the command line, .NET wrappers exist to make it easy to use ImageMagick from within C# applications.

ImageMagick has many packages on NuGet and they are described on ImageMagick's documentation GitHub page. TLDR: Install the Q16 package (not HDRI) to allow you to work with 16-bit data without losing precision.

ImageMagick is free and distributed under the Apache 2.0 license, so it can easily be used in commercial projects.

An advantage of loading images with ImageMagick is that it will work easily whether the source file is a JPG, PNG, GIF, TIF, or something different. ImageMagick supports over 100 file formats!

Load a 16-bit TIF File as a Pixel Value Array

// Load pixel values from a 16-bit TIF using ImageMagick (Q16)
MagickImage image = new MagickImage("16bit.tif");
ushort[] pixelValues = image.GetPixels().GetValues();

That's it! The pixelValues array will contain one value per pixel from the original image. The length of this array will equal the image's height times its width.

Load an 8-bit TIF File as a Pixel Value Array

Since the Q16 package was installed, 2 bytes will be allocated for each pixel (16-bit) even if it only requires one byte (8-bit). In this case you must collect just the bytes you are interested in:

MagickImage image = new MagickImage("8bit.tif");
ushort[] pixelValues = image.GetPixels().GetValues();
int[] values8 = Enumerable.Range(0, pixelValues.Length / 2).Select(x => (int)pixelValues[x * 2 + 1]).ToArray();

Load a 32-bit TIF File as a Pixel Value Array

For this you'll have to install the high dynamic range (HDRI) Q16 package, then your GetValues() method will return a float[] instead of a ushort[]. Convert these values to proper pixel intensity values by dividing by 2^16.

MagickImage image = new MagickImage("32bit.tif");
float[] pixels = image.GetPixels().GetValues();
for (int i = 0; i < pixels.Length; i++)
    pixels[i] = (long)pixels[i] / 65535;

LibTiff

LibTiff is a pure C# (.NET Standard) TIF file reader. Although it doesn't support all the image formats that ImageMagick does, it's really good at working with TIFs. It has a more intuitive interface for working with TIF-specific features such as multi-dimensional images (color, Z position, time, etc.).

LibTiff gives you a lower-level access to the bytes that underlie image data, so it's on you to perform the conversion from a byte array to the intended data type. Note that some TIFs are little-endian encoded and others are big-endian encoded, and endianness can be read from the header.

LibTiff is distributed under a BSD 3-clause license, so it too can be easily used in commercial projects.

Load a 16-bit TIF as a Pixel Value Array

// Load pixel values from a 16-bit TIF using LibTiff
using Tiff image = Tiff.Open("16bit.tif", "r");

// get information from the header
int width = image.GetField(TiffTag.IMAGEWIDTH)[0].ToInt();
int height = image.GetField(TiffTag.IMAGELENGTH)[0].ToInt();
int bytesPerPixel = image.GetField(TiffTag.BITSPERSAMPLE)[0].ToInt() / 8;

// read the image data bytes
int numberOfStrips = image.NumberOfStrips();
byte[] bytes = new byte[numberOfStrips * image.StripSize()];
for (int i = 0; i < numberOfStrips; ++i)
    image.ReadRawStrip(i, bytes, i * image.StripSize(), image.StripSize());

// convert the data bytes to a double array
if (bytesPerPixel != 2)
    throw new NotImplementedException("this is only for 16-bit TIFs");
double[] data = new double[bytes.Length / bytesPerPixel];
for (int i = 0; i < data.Length; i++)
{
    if (image.IsBigEndian())
        data[i] = bytes[i * 2 + 1] + (bytes[i * 2] << 8);
    else
        data[i] = bytes[i * 2] + (bytes[i * 2 + 1] << 8);
}

Routines for detecting and converting data from 8-bit, 24-bit, and 32-bit TIF files can be created by inspecting bytesPerPixel. LibTiff has documentation describing how to work with RGB TIF files and multi-frame TIFs.

Convert a Pixel Array to a 2D Array

I often prefer to work with scientific image data as a 2D arrays of double values. I write my analysis routines to pass double[,] between methods so the file I/O can be encapsulated in a static class.

// Load pixel values from a 16-bit TIF using ImageMagick (Q16)
MagickImage image = new MagickImage("16bit.tif");
ushort[] pixelValues = image.GetPixels().GetValues();

// create a 2D array of pixel values
double[,] imageData = new double[image.Height, image.Width];
for (int i = 0; i < image.Height; i++)
    for (int j = 0; j < image.Width; j++)
        imageData[i, j] = pixelValues[i * image.Width + j];

Other Libraries

ImageProcessor

According to ImageProcessor's GitHub page, "ImageProcessor is, and will only ever be supported on the .NET Framework running on a Windows OS" ... it doesn't appear to be actively maintained and is effectively labeled as deprecated, so I won't spend much time looking further into it.

ImageSharp

As of the time of writing, ImageSharp does not support TIF format, but it appears likely to be supported in a future release.

System.Drawing

Although this library can save images at different depths, it can only load image files with 8-bit depths. System.Drawing does not support loading 16-bit TIFs, so another library must be used to work with these file types.

Resources

Markdown source code last modified on June 9th, 2021
---
Title: Working with 16-bit Images in CSharp
Description: A summary of how I work with 16-bit TIF file data in C# (using ImageMagick and LibTiff).
Date: 2021-06-08 11PM EST
---

# Working with 16-bit Images in CSharp

**Scientific image analysis frequently involves working with 12-bit and 14-bit sensor data stored in 16-bit TIF files.** Images commonly encountered on the internat are 24-bit or 32-bit RGB images (where each pixel is represented by 8 bits each for red, green, blue, and possibly alpha). Typical image analysis libraries and documentation often lack information about how to work with 16-bit image data. 

**This page summarizes how I work with 16-bit TIF file data in C#.** I prefer [Magick.NET](https://www.nuget.org/packages?q=magick) (an ImageMagick wrapper) when working with many different file formats, and [LibTiff.Net](https://bitmiracle.com/libtiff) whenever I know my source files will all be identically-formatted TIFs or multidimensional TIFs (stacks).

[](https://github.com/swharden/Csharp-Image-Analysis/blob/main/dev/notes.md)

## ImageMagick

[ImageMagick](https://imagemagick.org/index.php) is a free and open-source cross-platform software suite for displaying, creating, converting, modifying, and editing raster images. Although ImageMagick is commonly used at the command line, .NET wrappers exist to make it easy to use ImageMagick from within C# applications.

ImageMagick has [many packages on NuGet](https://www.nuget.org/packages?q=imagemagick) and they are described on ImageMagick's [documentation](https://github.com/dlemstra/Magick.NET/tree/main/docs) GitHub page. **TLDR: Install the Q16 package (not HDRI)** to allow you to work with 16-bit data without losing precision.

ImageMagick is free and distributed under the Apache 2.0 license, so it can easily be used in commercial projects.

An advantage of loading images with ImageMagick is that it will work easily whether the source file is a JPG, PNG, GIF, TIF, or something different. ImageMagick supports over 100 file formats!

### Load a 16-bit TIF File as a Pixel Value Array

```cs
// Load pixel values from a 16-bit TIF using ImageMagick (Q16)
MagickImage image = new MagickImage("16bit.tif");
ushort[] pixelValues = image.GetPixels().GetValues();
```

That's it! The `pixelValues` array will contain one value per pixel from the original image. The length of this array will equal the image's height times its width.

### Load an 8-bit TIF File as a Pixel Value Array
Since the `Q16` package was installed, 2 bytes will be allocated for each pixel (16-bit) even if it only requires one byte (8-bit). In this case you must collect just the bytes you are interested in:

```cs
MagickImage image = new MagickImage("8bit.tif");
ushort[] pixelValues = image.GetPixels().GetValues();
int[] values8 = Enumerable.Range(0, pixelValues.Length / 2).Select(x => (int)pixelValues[x * 2 + 1]).ToArray();
```

### Load a 32-bit TIF File as a Pixel Value Array

For this you'll have to install the high dynamic range (HDRI) Q16 package, then your `GetValues()` method will return a `float[]` instead of a `ushort[]`. Convert these values to proper pixel intensity values by dividing by 2^16.

```cs
MagickImage image = new MagickImage("32bit.tif");
float[] pixels = image.GetPixels().GetValues();
for (int i = 0; i < pixels.Length; i++)
    pixels[i] = (long)pixels[i] / 65535;
```

## LibTiff

**LibTiff is a pure C# (.NET Standard) TIF file reader.** Although it doesn't support all the image formats that ImageMagick does, it's really good at working with TIFs. It has a more intuitive interface for working with TIF-specific features such as multi-dimensional images (color, Z position, time, etc.). 

LibTiff gives you a lower-level access to the bytes that underlie image data, so it's on you to perform the conversion from a byte array to the intended data type. Note that some TIFs are little-endian encoded and others are big-endian encoded, and endianness can be read from the header.

LibTiff is distributed under a BSD 3-clause license, so it too can be easily used in commercial projects.

### Load a 16-bit TIF as a Pixel Value Array

```cs
// Load pixel values from a 16-bit TIF using LibTiff
using Tiff image = Tiff.Open("16bit.tif", "r");

// get information from the header
int width = image.GetField(TiffTag.IMAGEWIDTH)[0].ToInt();
int height = image.GetField(TiffTag.IMAGELENGTH)[0].ToInt();
int bytesPerPixel = image.GetField(TiffTag.BITSPERSAMPLE)[0].ToInt() / 8;

// read the image data bytes
int numberOfStrips = image.NumberOfStrips();
byte[] bytes = new byte[numberOfStrips * image.StripSize()];
for (int i = 0; i < numberOfStrips; ++i)
    image.ReadRawStrip(i, bytes, i * image.StripSize(), image.StripSize());

// convert the data bytes to a double array
if (bytesPerPixel != 2)
    throw new NotImplementedException("this is only for 16-bit TIFs");
double[] data = new double[bytes.Length / bytesPerPixel];
for (int i = 0; i < data.Length; i++)
{
    if (image.IsBigEndian())
        data[i] = bytes[i * 2 + 1] + (bytes[i * 2] << 8);
    else
        data[i] = bytes[i * 2] + (bytes[i * 2 + 1] << 8);
}
```

Routines for detecting and converting data from 8-bit, 24-bit, and 32-bit TIF files can be created by inspecting `bytesPerPixel`. LibTiff has [documentation](https://bitmiracle.com/libtiff/) describing how to work with RGB TIF files and multi-frame TIFs.

## Convert a Pixel Array to a 2D Array

I often prefer to work with scientific image data as a 2D arrays of `double` values. I write my analysis routines to pass `double[,]` between methods so the file I/O can be encapsulated in a static class.

```cs
// Load pixel values from a 16-bit TIF using ImageMagick (Q16)
MagickImage image = new MagickImage("16bit.tif");
ushort[] pixelValues = image.GetPixels().GetValues();

// create a 2D array of pixel values
double[,] imageData = new double[image.Height, image.Width];
for (int i = 0; i < image.Height; i++)
    for (int j = 0; j < image.Width; j++)
        imageData[i, j] = pixelValues[i * image.Width + j];
```

## Other Libraries

### ImageProcessor

According to [ImageProcessor's GitHub page](https://github.com/JimBobSquarePants/ImageProcessor), "ImageProcessor is, and will **only ever be supported on the .NET Framework running on a Windows OS**" ... it doesn't appear to be actively maintained and is effectively labeled as deprecated, so I won't spend much time looking further into it.

### ImageSharp

As of the time of writing, [ImageSharp](https://docs.sixlabors.com/articles/imagesharp/imageformats.html) does not support TIF format, but it appears likely to be supported in a future release.

### System.Drawing

Although this library can _save_ images at different depths, it can only _load_ image files with 8-bit depths. System.Drawing does not support loading 16-bit TIFs, so another library must be used to work with these file types.

## Resources

* [C# Image Analysis](https://github.com/swharden/Csharp-Image-Analysis) (GitHub) - a collection of code examples for working with image data as 2D arrays
* [LibTiff.Net](https://bitmiracle.com/libtiff/) - The .NET version of original libtiff library
* [dlemstra/Magick.NET](https://github.com/dlemstra/Magick.NET) - The .NET library for ImageMagick
* [ImageSharp](https://docs.sixlabors.com/articles/imagesharp/imageformats.html) - Supported image formats

Representing Images in Memory

This page is a quick reference for programmers interested in working with image data in memory (byte arrays). This topic is straightforward overall, but there are a few traps that aren't necessarily intuitive so I try my best to highlight those here.

This article assumes you have some programming experience working with byte arrays in a C-type language and have an understanding of what is meant by 32-bit, 24-bit, 16-bit, and 8-bit integers.

Pixel Values

An image is composed of a 2D grid of square pixels, and the type of image greatly influences how much memory each pixel occupies and what format its data is in.

Bits per pixel (bpp) is the number of bits it takes to represent the value a single pixel. This is typically a multiple of 8 bits (1 byte).

Common Pixel Formats

  • 8-bit (1 byte) Pixel Formats
    • Gray 8 - Specifies one of 2^8 (256) shades of gray.
    • Indexed 8 - The pixel data contains color-indexed values, which means the values are an index to colors in the system color table, as opposed to individual color values.
  • 16-bit (2-byte) Pixel Formats
    • ARGB 1555 - Specifies one of 2^15 (32,768) shades of color (5 bits each for red, green, and blue) and 1 bit for alpha.
    • Gray 16 - Specifies one of 2^16 (65,536) shades of gray.
    • RGB 555 - 5 bits each are used for the red, green, and blue components. The remaining bit is not used.
    • RGB 565 - 5 bits are used for the red component, 6 bits are used for the green component, and 5 bits are used for the blue component. The additional green bit doubles the number of gradations and improves image perception in most humans.
  • 24-bit (3 byte) Pixel Formats
    • RGB 888 - 8 bits each are used for the red, green, and blue components.
  • 32-bit (4-byte) Pixel Formats
    • ARGB - 8 bits each are used for the alpha, red, green, and blue components. This is the most common pixel format.

There are others (e.g., 64-bit RGB images), but these are the most typically encountered pixel formats.

Endianness

Endianness describes the order of bytes in a multi-byte value uses to store its data:

  • big-endian: the smallest address contains the most significant byte

  • little-endian: the smallest address contains the least significant byte

Assuming array index values ascend from left to right, 32-bit (4-byte) pixel data can be represented using either of these two formats in memory:

  • 4 bpp little-endian: [A, B, G, R] (most common)

  • 4 bpp big-endian: [R, G, B, A]

Bitmap images use little-endian integer format! New programmers may expect the bytes that contain "RGB" values to be sequenced in the order "R, G, B", but this is not the case.

Premultiplied Alpha

Premultiplication refers to the relationship between color (R, G, B) and transparency (alpha). In transparent images the alpha channel may be straight (unassociated) or premultiplied (associated).

With straight alpha, the RGB components represent the full-intensity color of the object or pixel, disregarding its opacity. Later R, G, and B will each be multiplied by the alpha to adjust intensity and transparency.

With premultiplied alpha, the RGB components represent the emission (color and intensity) of each pixel, and the alpha only represents transparency (occlusion of what is behind it). This reduces the computational performance for image processing if transparency isn't actually used.

In C# using System.Drawing premultiplied alpha is not enabled by default. This must be defined when creating new Bitmap as seen here:

var bmp = new Bitmap(400, 300, PixelFormat.Format32bppPArgb);

Benchmarking reveals the performance enhancement of drawing on bitmaps in memory using premultiplied alpha pixel format. In this test I'm using .NET 5 with the System.Drawing.Common NuGet package. Anti-aliasing is off in this example, but similar results were obtained with it enabled.

Random rand = new(0);
int width = 600;
int height = 400;
var bmp = new Bitmap(600, 400, PixelFormat.Format32bppPArgb);
var gfx = Graphics.FromImage(bmp);
var pen = new Pen(Color.Black);
gfx.Clear(Color.Magenta);
var sw = Stopwatch.StartNew();
for (int i = 0; i < 1e6; i++)
{
    pen.Color = Color.FromArgb(rand.Next());
    gfx.DrawLine(pen, rand.Next(width), rand.Next(height), rand.Next(width), rand.Next(height));
}
Console.WriteLine(sw.Elapsed);
bmp.Save("benchmark.png", ImageFormat.Png);

Time to render 1 million frames:

  • Standard ARGB: 6.77 ± 0.02 sec
  • Premultiplied ARGB: 5.83 ± 0.03 sec (14% faster)

At the end you have a beautiful figure:

Pixel Locations in Space and Memory

A 2D image is composed of pixels, but addressing them in memory isn't as trivial as it may seem. The dimensions of bitmaps are stored in their header, and the arrangement of pixels forms rows (left-to-right) then columns (top-to-bottom).

Width and height are the dimensions (in pixels) of the visible image, but...

⚠️ Image size in memory is not just width * height * bytesPerPixel

Because of old hardware limitations, bitmap widths in memory (also called the stride) must be multiplies of 4 bytes. This is effortless when using ARGB formats because each pixel is already 4 bytes, but when working with RGB images it's possible to have images with an odd number of bytes in each row, requiring data to be padded such that the stride length is a multiple of 4.

// calculate stride length of a bitmap row in memory
int stride = 4 * ((imageWidth * bytesPerPixel + 3) / 4);

Working with Bitmap Bytes in C#

This example demonstrates how to convert a 3D array (X, Y, C) into a flat byte array ready for copying into a bitmap. Notice this code adds padding to the image width to ensure the stride is a multiple of 4 bytes. Notice also the integer encoding is little endian.

public static byte[] GetBitmapBytes(byte[,,] input)
{
    int height = input.GetLength(0);
    int width = input.GetLength(1);
    int bpp = input.GetLength(2);
    int stride = 4 * ((width * bpp + 3) / 4);

    byte[] pixelsOutput = new byte[height * stride];
    byte[] output = new byte[height * stride];

    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            for (int z = 0; z < bpp; z++)
                output[y * stride + x * bpp + (bpp - z - 1)] = input[y, x, z];

    return output;
}

For completeness, here's the complimentary code that converts a flat byte array from bitmap memory to a 3D array (assuming we know the image dimensions and bytes per pixel from reading the image header):

public static byte[,,] GetBitmapBytes3D(byte[] input, int width, int height, int bpp)
{
    int stride = 4 * ((width * bpp + 3) / 4);

    byte[,,] output = new byte[height, width, bpp];
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            for (int z = 0; z < bpp; z++)
                output[y, x, z] = input[stride * y + x * bpp + (bpp - z - 1)];

    return output;
}

Marshalling Bytes in and out of Bitmaps

The code examples above are intentionally simple to focus on the location of pixels in memory and the endianness of their values. To actually convert between byte[] and System.Drawing.Bitmap you must use Marshall.Copy as shown:

public static byte[] BitmapToBytes(Bitmap bmp)
{
    Rectangle rect = new(0, 0, bmp.Width, bmp.Height);
    BitmapData bmpData = bmp.LockBits(rect, ImageLockMode.ReadWrite, bmp.PixelFormat);
    int byteCount = Math.Abs(bmpData.Stride) * bmp.Height;
    byte[] bytes = new byte[byteCount];
    Marshal.Copy(bmpData.Scan0, bytes, 0, byteCount);
    bmp.UnlockBits(bmpData);
    return bytes;
}
public static Bitmap BitmapFromBytes(byte[] bytes, PixelFormat bmpFormat)
{
    Bitmap bmp = new(width, height, bmpFormat);
    var rect = new Rectangle(0, 0, width, height);
    BitmapData bmpData = bmp.LockBits(rect, ImageLockMode.ReadOnly, bmpFormat);
    Marshal.Copy(bytes, 0, bmpData.Scan0, bytes.Length);
    bmp.UnlockBits(bmpData);
    return bmp;
}

Reference

Markdown source code last modified on June 8th, 2021
---
Title: Representing Images in Memory
Description: quick reference for programmers interested in working with image data in memory (byte arrays)
Date: 2021-06-03 00:03:00 EST
---

# Representing Images in Memory

**This page is a quick reference for programmers interested in working with image data in memory (byte arrays).** This topic is straightforward overall, but there are a few traps that aren't necessarily intuitive so I try my best to highlight those here. 

This article assumes you have some programming experience working with byte arrays in a C-type language and have an understanding of what is meant by 32-bit, 24-bit, 16-bit, and 8-bit integers.

## Pixel Values

**An image is composed of a 2D grid of square pixels,** and the type of image greatly influences how much memory each pixel occupies and what format its data is in.

**Bits per pixel (bpp)** is the number of bits it takes to represent the value a single pixel. This is typically a multiple of 8 bits (1 byte).

### Common Pixel Formats

* **8-bit (1 byte)  Pixel Formats**
  * **`Gray 8`** - Specifies one of 2^8 (256) shades of gray.
  * **`Indexed 8`** - The pixel data contains color-indexed values, which means the values are an index to colors in the system color table, as opposed to individual color values.
* **16-bit (2-byte) Pixel Formats**
  * **`ARGB 1555`** - Specifies one of 2^15 (32,768) shades of color (5 bits each for red, green, and blue) and 1 bit for alpha.
  * **`Gray 16`** - Specifies one of 2^16 (65,536) shades of gray.
  * **`RGB 555`** - 5 bits each are used for the red, green, and blue components. The remaining bit is not used.
  * **`RGB 565`** - 5 bits are used for the red component, 6 bits are used for the green component, and 5 bits are used for the blue component. The additional green bit doubles the number of gradations and improves image perception in most humans.
* **24-bit (3 byte) Pixel Formats**
  * **`RGB 888`** - 8 bits each are used for the red, green, and blue components.
* **32-bit (4-byte) Pixel Formats**
  * **`ARGB`** - 8 bits each are used for the alpha, red, green, and blue components. This is the most common pixel format.

There are others (e.g., 64-bit RGB images), but these are the most typically encountered pixel formats.

### Endianness

[Endianness](https://en.wikipedia.org/wiki/Endianness) describes the order of bytes in a multi-byte value uses to store its data:

* **big-endian:** the smallest address contains the most significant byte

* **little-endian:** the smallest address contains the least significant byte

Assuming array index values ascend from left to right, 32-bit (4-byte) pixel data can be represented using either of these two formats in memory:

* 4 bpp little-endian: **`[A, B, G, R]`** (most common)

* 4 bpp big-endian: **`[R, G, B, A]`**

**Bitmap images use little-endian integer format!** New programmers may expect the bytes that contain "RGB" values to be sequenced in the order "R, G, B", but this is not the case.

### Premultiplied Alpha

**Premultiplication refers to the relationship between color (R, G, B) and transparency (alpha).** In transparent images the alpha channel may be straight (unassociated) or premultiplied (associated).

With **straight alpha**, the RGB components represent the full-intensity color of the object or pixel, disregarding its opacity. Later R, G, and B will each be multiplied by the alpha to adjust intensity and transparency.

With **premultiplied alpha**, the RGB components represent the emission (color and intensity) of each pixel, and the alpha only represents transparency (occlusion of what is behind it). This reduces the computational performance for image processing if transparency isn't actually used.

**In C# using `System.Drawing` premultiplied alpha is not enabled by default.** This must be defined when creating new `Bitmap` as seen here:

```cs
var bmp = new Bitmap(400, 300, PixelFormat.Format32bppPArgb);
```

**Benchmarking reveals the performance enhancement** of drawing on bitmaps in memory using premultiplied alpha pixel format. In this test I'm using .NET 5 with the [System.Drawing.Common](https://www.nuget.org/packages/System.Drawing.Common) NuGet package. Anti-aliasing is off in this example, but similar results were obtained with it enabled.

```cs
Random rand = new(0);
int width = 600;
int height = 400;
var bmp = new Bitmap(600, 400, PixelFormat.Format32bppPArgb);
var gfx = Graphics.FromImage(bmp);
var pen = new Pen(Color.Black);
gfx.Clear(Color.Magenta);
var sw = Stopwatch.StartNew();
for (int i = 0; i < 1e6; i++)
{
    pen.Color = Color.FromArgb(rand.Next());
    gfx.DrawLine(pen, rand.Next(width), rand.Next(height), rand.Next(width), rand.Next(height));
}
Console.WriteLine(sw.Elapsed);
bmp.Save("benchmark.png", ImageFormat.Png);
```

Time to render 1 million frames:
* Standard ARGB: 6.77 ± 0.02 sec
* Premultiplied ARGB: 5.83 ± 0.03 sec (14% faster)

At the end you have a beautiful figure:

<div class="text-center img-border">

![](benchmark.png)

</div>

## Pixel Locations in Space and Memory

**A 2D image is composed of pixels, but addressing them in memory isn't as trivial as it may seem.** The dimensions of bitmaps are stored in their header, and the arrangement of pixels forms rows (left-to-right) then columns (top-to-bottom). 

**Width** and **height** are the dimensions (in pixels) of the visible image, but...

⚠️ **Image size in memory is _not_ just `width * height * bytesPerPixel`**

Because of old hardware limitations, bitmap widths _in memory_ (also called the **stride**) must be multiplies of 4 bytes. This is effortless when using ARGB formats because each pixel is already 4 bytes, but when working with RGB images it's possible to have images with an odd number of bytes in each row, requiring data to be **padded** such that the stride length is a multiple of 4.

<div class="text-center">

![](image-byte-position.png)

</div>

```cs
// calculate stride length of a bitmap row in memory
int stride = 4 * ((imageWidth * bytesPerPixel + 3) / 4);
```

### Working with Bitmap Bytes in C# 

**This example demonstrates how to convert a 3D array (X, Y, C) into a flat byte array ready for copying into a bitmap.** Notice this code adds padding to the image width to ensure the stride is a multiple of 4 bytes. Notice also the integer encoding is little endian.

```cs
public static byte[] GetBitmapBytes(byte[,,] input)
{
    int height = input.GetLength(0);
    int width = input.GetLength(1);
    int bpp = input.GetLength(2);
    int stride = 4 * ((width * bpp + 3) / 4);

    byte[] pixelsOutput = new byte[height * stride];
    byte[] output = new byte[height * stride];

    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            for (int z = 0; z < bpp; z++)
                output[y * stride + x * bpp + (bpp - z - 1)] = input[y, x, z];

    return output;
}
```

For completeness, here's the complimentary code that converts a flat byte array from bitmap memory to a 3D array (assuming we know the image dimensions and bytes per pixel from reading the image header):

```cs
public static byte[,,] GetBitmapBytes3D(byte[] input, int width, int height, int bpp)
{
    int stride = 4 * ((width * bpp + 3) / 4);

    byte[,,] output = new byte[height, width, bpp];
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            for (int z = 0; z < bpp; z++)
                output[y, x, z] = input[stride * y + x * bpp + (bpp - z - 1)];

    return output;
}
```

### Marshalling Bytes in and out of Bitmaps

The code examples above are intentionally simple to focus on the location of pixels in memory and the endianness of their values. To actually convert between `byte[]` and `System.Drawing.Bitmap` you must use `Marshall.Copy` as shown:

```cs
public static byte[] BitmapToBytes(Bitmap bmp)
{
    Rectangle rect = new(0, 0, bmp.Width, bmp.Height);
    BitmapData bmpData = bmp.LockBits(rect, ImageLockMode.ReadWrite, bmp.PixelFormat);
    int byteCount = Math.Abs(bmpData.Stride) * bmp.Height;
    byte[] bytes = new byte[byteCount];
    Marshal.Copy(bmpData.Scan0, bytes, 0, byteCount);
    bmp.UnlockBits(bmpData);
    return bytes;
}
```

```cs
public static Bitmap BitmapFromBytes(byte[] bytes, PixelFormat bmpFormat)
{
    Bitmap bmp = new(width, height, bmpFormat);
    var rect = new Rectangle(0, 0, width, height);
    BitmapData bmpData = bmp.LockBits(rect, ImageLockMode.ReadOnly, bmpFormat);
    Marshal.Copy(bytes, 0, bmpData.Scan0, bytes.Length);
    bmp.UnlockBits(bmpData);
    return bmp;
}
```

## Reference

* [A programmer's view on digital images: the essentials](https://www.collabora.com/news-and-blog/blog/2016/02/16/a-programmers-view-on-digital-images-the-essentials/)
* [Microsoft: `System.Drawing.PixelFormat` Enumeration](https://docs.microsoft.com/en-us/dotnet/api/system.drawing.imaging.pixelformat?view=net-5.0) - describes pixel formats supported by this common drawing library
* [Creighton University: Tip of the week (June 2014) - ](https://medschool.creighton.edu/fileadmin/user/medicine/Departments/Biomedical_Sciences/CUIBIF/Tip_of_the_week/June_2014/4096_Shades_of_Gray.pdf) inspired the title of this article.
* [8, 12, 14 vs 16-Bit Depth: What Do You Really Need?!](https://petapixel.com/2018/09/19/8-12-14-vs-16-bit-depth-what-do-you-really-need/)
* [Premultiplied alpha](https://shawnhargreaves.com/blog/premultiplied-alpha.html) by Shawn Hargreaves (2009)
* [Straight versus premultiplied alpha](https://en.wikipedia.org/wiki/Alpha_compositing#Straight_versus_premultiplied) on Wikipedia

Deploy a Website with Python and FTPS

Python can be used to securely deploy website content using FTPS. Many people have used a FTP client like FileZilla to drag-and-drop content from their local computer to a web server, but this method requires manual clicking and is error-prone. If you write a script to accomplish this task it lowers the effort barrier for deployment (encouraging smaller iterations) and reduces the risk you'll accidentally do something unintentional (like deleting an important folder by accident).

This post reviews how I use Python, keyring, and TLS to securely manage login credentials and deploy builds from my local computer to a remote server using FTP. The strategy discussed here will be most useful in servers that use the LAMP stack, and it's worth noting that .NET and Node have their own deployment paradigms. I hope you find the code on this page useful, but you should carefully review your deployment script and create something specific to your needs. Just as you could accidentally delete an important folder using a graphical client, an incorrectly written deployment script could cause damage to your website or leak secrets.

Use Keyring to Manage Your Password

I recently wrote about several ways to manage credentials with Python.

In these examples I will use the keyring package to store and recall my FTP password securely.

pip install keyring

Storing Credentials

Store your password using an interactive interpreter to ensure you don't accidentally save it in a plain text file somewhere. This only needs to be done once.

>>> import keyring
>>> keyring.set_password("system", "me@swharden.com", "P455W0RD")

Recalling Credentials

import keyring
hostname = "swharden.com"
username = "me@swharden.com"
password = keyring.get_password("system", username)

FTP vs. FTPS vs. SFTP

FTP was not designed to be a secure - it transfers login credentials in plain text! Traditionally FTP in Python was achieved using ftplib.FTP from the standard library, but logging-in using this protocol allows anyone sniffing traffic on your network to capture your password. In Python 2.7 ftplib.FTP_TLS was added which adds transport layer security to FTP (FTPS), improving protection of your login credentials.

# ⚠️ This is insecure (password transferred in plain text)
from ftplib import FTP
with FTP(hostname, username, password) as ftp:
    print(ftp.nlst())
# 👍 This is better (password is encrypted)
from ftplib import FTP_TLS
with FTP_TLS(hostname, username, password) as ftps:
    print(ftps.nlst())

By default ftplib.FTP_TLS only encrypts the username and password. You can call prot_p() to encrypt all transferred data, but in this post I'm only interested in encrypting my login credentials.

FTP over SSL (FTPS) is different than FTP over SSH (SFTP), but both use encryption to transfer usernames and passwords, making them superior to traditional FTP which transfers these in plain text.

Recursively Delete a Folder with FTP

This method deletes each of the contents of a folder, then deletes the folder itself. If one of the contents is a subfolder, it calls itself. This example uses modern Python practices, favoring pathlib over os.path.

Note that I define the remote path using pathlib.PurePosixPath() to ensure it's formatted as a unix-type path since my remote server is a Linux machine.

import ftplib
import pathlib

def removeRecursively(ftp: ftplib.FTP, remotePath: pathlib.PurePath):
    """
    Remove a folder and all its contents from a FTP server
    """

    def removeFile(remoteFile):
        print(f"DELETING FILE {remoteFile}")
        ftp.delete(str(remoteFile))

    def removeFolder(remoteFolder):
        print(f"DELETING FOLDER {remoteFolder}/")
        ftp.rmd(str(remoteFolder))

    for (name, properties) in ftp.mlsd(remotePath):
        fullpath = remotePath.joinpath(name)
        if name == '.' or name == '..':
            continue
        elif properties['type'] == 'file':
            removeFile(fullpath)
        elif properties['type'] == 'dir':
            removeRecursively(ftp, fullpath)

    removeFolder(remotePath)

if __name__ == "__main__":
    remotePath = pathlib.PurePosixPath('/the/remote/folder')
    with ftplib.FTP_TLS("swharden.com", "scott", "P455W0RD") as ftps:
        removeFolder(ftps, remotePath)

Recursively Upload a Folder with FTP

This method recursively uploads a local folder tree to the FTP server. It first creates the folder tree, then uploads all files individually. This example uses modern Python practices, favoring pathlib over os.walk() and os.path.

Like before I define the remote path using pathlib.PurePosixPath() since the server is running Linux, but I can use pathlib.Path() for the local path and it will auto-detect how to format it based on which system I'm currently running on.

import ftplib
import pathlib

def uploadRecursively(ftp: ftplib.FTP, remoteBase: pathlib.PurePath, localBase: pathlib.PurePath):
    """
    Upload a local folder to a remote path on a FTP server
    """

    def remoteFromLocal(localPath: pathlib.PurePath):
        pathParts = localPath.parts[len(localBase.parts):]
        return remoteBase.joinpath(*pathParts)

    def uploadFile(localFile: pathlib.PurePath):
        remoteFilePath = remoteFromLocal(localFile)
        print(f"UPLOADING FILE {remoteFilePath}")
        with open(localFile, 'rb') as localBinary:
            ftp.storbinary(f"STOR {remoteFilePath}", localBinary)

    def createFolder(localFolder: pathlib.PurePath):
        remoteFolderPath = remoteFromLocal(localFolder)
        print(f"CREATING FOLDER {remoteFolderPath}/")
        ftp.mkd(str(remoteFolderPath))

    createFolder(localBase)
    for localFolder in [x for x in localBase.glob("**/*") if x.is_dir()]:
        createFolder(localFolder)
    for localFile in [x for x in localBase.glob("**/*") if not x.is_dir()]:
        uploadFile(localFile)

if __name__ == "__main__":
    localPath = pathlib.Path(R'C:\my\project\folder')
    remotePath = pathlib.PurePosixPath('/the/remote/folder')
    with ftplib.FTP_TLS("swharden.com", "scott", "P455W0RD") as ftps:
        uploadRecursively(ftps, remotePath, localPath)

Minimize Disruption by Renaming

Because walking remote folder trees deleting and upload files can be slow, this process may be disruptive to a website with live traffic. For low-traffic websites this isn't an issue, but as traffic increases (or the size of the deployment increases) it may be worth considering how to achieve the swap faster.

An improved method of deployment could involve uploading the new website to a temporary folder, switching the names of the folders, then deleting the old folder. There is brief downtime between the two FTP rename calls.

remotePath = "/live"
remotePathNew = "/live-new"
remotePathOld = "/live-old"
localPath = R"C:\dev\site\live"

upload(localPath, remotePathNew)
ftpRename(remotePath, remotePathOld)
ftpRename(remotePathNew, remotePath)
delete(remotePathOld)

Speed could be improved by handling the renaming with a shell script that runs on the server. This would require some coordination to execute though, but is worth considering. It could be executed by a HTTP endpoint.

mv /live /live-old;
mv /live-new /live;
rm -rf /live-old;

Deploy a React App with FTP and Python

You can automate deployment of a React project using Python and FTPS. After creating a new React app add a deploy.py in the project folder that uses FTPS to upload the build folder to the server, then edit your project's package.json to add predeploy and deploy commands.

  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject",
    "predeploy": "npm run build",
    "deploy" : "python deploy.py"
  },

Then you can create a production build and deploy with one command:

npm run deploy

Consider Using Git to Deploy Content

This post focused on how to automate uploading local content to a remote server using FTP, but don't overlook the possibility that this may not be the best method for deployment for your application.

You can maintain a website as a git repository and use git pull on the server to update it. GitHub Actions can be used to trigger the pull step automatically using an HTTP endpoint (e.g., main.yml). If this method is available to you, it should be strongly considered.

This method is very popular, but it (1) requires git to be on the server and (2) requires all the build tools/languages to be available on the server if a build step is required. I'm reminded that only SiteGround's most expensive shared hosting plan even has git available at all.

Resources

Markdown source code last modified on May 16th, 2021
---
Title: Deploy a Website with Python and FTPS
Description: How to use Python, keyring, and TLS to securely deploy website content using FTP (FTPS)
Date: 2021-05-16 11AM EST
---

# Deploy a Website with Python and FTPS

**Python can be used to securely deploy website content using FTPS.** Many people have used a FTP client like [FileZilla](https://en.wikipedia.org/wiki/FileZilla) to drag-and-drop content from their local computer to a web server, but this method requires manual clicking and is error-prone. If you write a script to accomplish this task it lowers the effort barrier for deployment (encouraging smaller iterations) and reduces the risk you'll accidentally do something unintentional (like deleting an important folder by accident).

**This post reviews how I use Python, keyring, and TLS to securely manage login credentials and deploy builds from my local computer to a remote server using FTP.** The strategy discussed here will be most useful in servers that use the [LAMP stack](https://en.wikipedia.org/wiki/LAMP_(software_bundle)), and it's worth noting that .NET and Node have their own deployment paradigms. I hope you find the code on this page useful, but you should carefully review your deployment script and create something specific to your needs. Just as you could accidentally delete an important folder using a graphical client, an incorrectly written deployment script could cause damage to your website or leak secrets.

## Use Keyring to Manage Your Password

I recently wrote about [several ways to manage credentials with Python](https://swharden.com/blog/2021-05-15-python-credentials/). 

In these examples I will use the [keyring package](https://pypi.org/project/keyring/) to store and recall my FTP password securely.

```bash
pip install keyring
```

### Storing Credentials

Store your password using an interactive interpreter to ensure you don't accidentally save it in a plain text file somewhere. This only needs to be done once.

```py
>>> import keyring
>>> keyring.set_password("system", "me@swharden.com", "P455W0RD")
```

### Recalling Credentials

```py
import keyring
hostname = "swharden.com"
username = "me@swharden.com"
password = keyring.get_password("system", username)
```

## FTP vs. FTPS vs. SFTP

**FTP was not designed to be a secure - it transfers login credentials in plain text!** Traditionally FTP in Python was achieved using `ftplib.FTP` from the standard library, but logging-in using this protocol allows anyone sniffing traffic on your network to capture your password. In Python 2.7 `ftplib.FTP_TLS` was added which adds transport layer security to FTP (FTPS), improving protection of your login credentials. 

```py
# ⚠️ This is insecure (password transferred in plain text)
from ftplib import FTP
with FTP(hostname, username, password) as ftp:
    print(ftp.nlst())
```

```py
# 👍 This is better (password is encrypted)
from ftplib import FTP_TLS
with FTP_TLS(hostname, username, password) as ftps:
    print(ftps.nlst())
```

By default `ftplib.FTP_TLS` only encrypts the username and password. You can call `prot_p()` to encrypt all transferred data, but in this post I'm only interested in encrypting my login credentials.

FTP over SSL (FTPS) is different than FTP over SSH (SFTP), but both use encryption to transfer usernames and passwords, making them superior to traditional FTP which transfers these in plain text.

## Recursively Delete a Folder with FTP

This method deletes each of the contents of a folder, then deletes the folder itself. If one of the contents is a subfolder, it calls itself. This example uses modern Python practices, favoring `pathlib` over `os.path`.

Note that I define the remote path using `pathlib.PurePosixPath()` to ensure it's formatted as a unix-type path since my remote server is a Linux machine.

```py
import ftplib
import pathlib

def removeRecursively(ftp: ftplib.FTP, remotePath: pathlib.PurePath):
    """
    Remove a folder and all its contents from a FTP server
    """

    def removeFile(remoteFile):
        print(f"DELETING FILE {remoteFile}")
        ftp.delete(str(remoteFile))

    def removeFolder(remoteFolder):
        print(f"DELETING FOLDER {remoteFolder}/")
        ftp.rmd(str(remoteFolder))

    for (name, properties) in ftp.mlsd(remotePath):
        fullpath = remotePath.joinpath(name)
        if name == '.' or name == '..':
            continue
        elif properties['type'] == 'file':
            removeFile(fullpath)
        elif properties['type'] == 'dir':
            removeRecursively(ftp, fullpath)

    removeFolder(remotePath)

if __name__ == "__main__":
    remotePath = pathlib.PurePosixPath('/the/remote/folder')
    with ftplib.FTP_TLS("swharden.com", "scott", "P455W0RD") as ftps:
        removeFolder(ftps, remotePath)
```

## Recursively Upload a Folder with FTP

This method recursively uploads a local folder tree to the FTP server. It first creates the folder tree, then uploads all files individually. This example uses modern Python practices, favoring `pathlib` over `os.walk()` and `os.path`.

Like before I define the remote path using `pathlib.PurePosixPath()` since the server is running Linux, but I can use `pathlib.Path()` for the local path and it will auto-detect how to format it based on which system I'm currently running on.

```py
import ftplib
import pathlib

def uploadRecursively(ftp: ftplib.FTP, remoteBase: pathlib.PurePath, localBase: pathlib.PurePath):
    """
    Upload a local folder to a remote path on a FTP server
    """

    def remoteFromLocal(localPath: pathlib.PurePath):
        pathParts = localPath.parts[len(localBase.parts):]
        return remoteBase.joinpath(*pathParts)

    def uploadFile(localFile: pathlib.PurePath):
        remoteFilePath = remoteFromLocal(localFile)
        print(f"UPLOADING FILE {remoteFilePath}")
        with open(localFile, 'rb') as localBinary:
            ftp.storbinary(f"STOR {remoteFilePath}", localBinary)

    def createFolder(localFolder: pathlib.PurePath):
        remoteFolderPath = remoteFromLocal(localFolder)
        print(f"CREATING FOLDER {remoteFolderPath}/")
        ftp.mkd(str(remoteFolderPath))

    createFolder(localBase)
    for localFolder in [x for x in localBase.glob("**/*") if x.is_dir()]:
        createFolder(localFolder)
    for localFile in [x for x in localBase.glob("**/*") if not x.is_dir()]:
        uploadFile(localFile)

if __name__ == "__main__":
    localPath = pathlib.Path(R'C:\my\project\folder')
    remotePath = pathlib.PurePosixPath('/the/remote/folder')
    with ftplib.FTP_TLS("swharden.com", "scott", "P455W0RD") as ftps:
        uploadRecursively(ftps, remotePath, localPath)
```

## Minimize Disruption by Renaming

Because walking remote folder trees deleting and upload files can be slow, this process may be disruptive to a website with live traffic. For low-traffic websites this isn't an issue, but as traffic increases (or the size of the deployment increases) it may be worth considering how to achieve the swap faster.

An improved method of deployment could involve uploading the new website to a temporary folder, switching the names of the folders, then deleting the old folder. There is brief downtime between the two FTP rename calls.

```py
remotePath = "/live"
remotePathNew = "/live-new"
remotePathOld = "/live-old"
localPath = R"C:\dev\site\live"

upload(localPath, remotePathNew)
ftpRename(remotePath, remotePathOld)
ftpRename(remotePathNew, remotePath)
delete(remotePathOld)
```

Speed could be improved by handling the renaming with a shell script that runs on the server. This would require some coordination to execute though, but is worth considering. It could be executed by a HTTP endpoint.

```bash
mv /live /live-old;
mv /live-new /live;
rm -rf /live-old;
```

## Deploy a React App with FTP and Python

**You can automate deployment of a React project using Python and FTPS.** After [creating a new React app](https://reactjs.org/docs/create-a-new-react-app.html) add a `deploy.py` in the project folder that uses FTPS to upload the build folder to the server, then edit your project's `package.json` to add `predeploy` and `deploy` commands.

```json
  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject",
    "predeploy": "npm run build",
    "deploy" : "python deploy.py"
  },
```

Then you can create a production build and deploy with one command:

```
npm run deploy
```

## Consider Using Git to Deploy Content

This post focused on how to automate uploading local content to a remote server using FTP, but don't overlook the possibility that this may not be the best method for deployment for your application.

**You can maintain a website as a git repository and use `git pull` on the server to update it.** GitHub Actions can be used to trigger the pull step automatically using an HTTP endpoint (e.g., [main.yml](https://github.com/ScottPlot/Website/blob/main/.github/workflows/main.yml)). If this method is available to you, it should be strongly considered.

This method is very popular, but it (1) requires `git` to be on the server and (2) requires all the build tools/languages to be available on the server if a build step is required. I'm reminded that [only SiteGround's most expensive shared hosting plan](https://www.siteground.com/shared-hosting-features.htm) even has `git` available at all. 

## Resources
* [ftplib.FTP_TLS official documentation](https://docs.python.org/2/library/ftplib.html#ftplib.FTP_TLS)
* [SFTP vs. FTPS: What's the Best Protocol for Secure FTP?](https://www.goanywhere.com/blog/2011/10/20/sftp-ftps-secure-ftp-transfers)
* [Managing Credentials with Python](https://swharden.com/blog/2021-05-15-python-credentials/)
* [Create React App: Deployment](https://create-react-app.dev/docs/deployment/) is useful but never mentions FTP
* [ftp-deploy](https://www.npmjs.com/package/ftp-deploy) is a Node.js package to help with deploying code using FTP

Managing Credentials with Python

I enjoy contributing to open-source, but I'd prefer to keep my passwords to myself! Python is a great glue language for automating tasks, and recently I've been using it to log in to my web server using SFTP and automate log analysis, file management, and software updates. The Python scripts I'm working on need to know my login information, but I want to commit them to source control and share them on GitHub so I have to be careful to use a strategy which minimizes risk of inadvertently leaking these secrets onto the internet.

This post explores various options for managing credentials in Python scripts in public repositories. There are many different ways to manage credentials with Python, and I was surprised to learn of some new ones as I was researching this topic. This post reviews the most common options, starting with the most insecure and working its way up to the most highly regarded methods for managing secrets.

Plain-Text Passwords in Code

⚠️☠️ DANGER: Never do this

You could put a password or API key directly in your python script, but even if you intend to remove it later there's always a chance you'll accidentally commit it to source control without realizing it, posing a security risk forever. This method is to be avoided at all costs!

username = "myUsername"
password = "S3CR3T_P455W0RD"
logIn(username, password)

Obfuscated Passwords in Code

⚠️☠️ DANGER: Never do this

A slightly less terrible idea is to obfuscate plain-text passwords by storing them as base 64 strings. You won't know the password just by seeing it, but anyone who has the string can easily decode it. Websites like https://www.base64decode.org are useful for this.

"""Demonstrate conversion to/from base 64"""

import base64

def obfuscate(plainText):
    plainBytes = plainText.encode('ascii')
    encodedBytes = base64.b64encode(plainBytes)
    encodedText = encodedBytes.decode('ascii')
    return encodedText


def deobfuscate(obfuscatedText):
    obfuscatedBytes = obfuscatedText.encode('ascii')
    decodedBytes = base64.b64decode(obfuscatedBytes)
    decodedText = decodedBytes.decode('ascii')
    return decodedText
original = "S3CR3T_P455W0RD"
obfuscated = obfuscate(original)
deobfuscated = deobfuscate(obfuscated)

print("original: " + original)
print("obfuscated: " + obfuscated)
print("deobfuscated: " + deobfuscated)
original: S3CR3T_P455W0RD
obfuscated: UzNDUjNUX1A0NTVXMFJE
deobfuscated: S3CR3T_P455W0RD

Passwords in Plain Text Files

⚠️ WARNING: This method is prone to mistakes. Ensure the text file is never committed to source control.

You could store username/password on the first two lines of a plain text file, then use python to read it when you need it.

with open("secrets.txt") as f:
    lines = f.readlines()
    username = lines[0].strip()
    password = lines[1].strip()
    print(f"USERNAME={username}, PASSWORD={password}")

If the text file is in the repository directory you should modify .gitignore to ensure it's not tracked by source source control. There is a risk that you may forget to do this, exposing your credentials online! A better idea may be to place the secrets file outside your repository folder entirely.

💡 There are libraries which make this easier. One example is Python Decouple which implements a lot of this logic gracefully and can even combine settings from multiple files (e.g., .ini vs .env files) for environments that can benefit from more advanced configuration options. See the notes below about helper libraries that environment variables and .env files

Passwords in Python Modules

⚠️ WARNING: This method is prone to mistakes. Ensure the secrets module is never committed to source control.

Similar to a plain text file not tracked by source control (ideally outside the repository folder entirely), you could store passwords as variables in a Python module then import it.

from mySecrets import username, password
print(f"USERNAME={username}, PASSWORD={password}")

If your secrets file is in an obscure folder, you will have to add it to your path so the module can be found when importing.

import sys
sys.path.append("C:/path/to/secrets/folder")

from mySecrets import username, password
print(f"USERNAME={username}, PASSWORD={password}")

Don't name your module secrets because the secrets module is part of the standard library and that will likely be imported in stead.

Passwords as Program Arguments

⚠️ WARNING: This method may store plain text passwords in your command history.

This isn't a great idea because passwords are seen in plain text in the console and also may be stored in the command history. However, you're unlikely to accidentally commit passwords to source control.

import sys
username = sys.argv[1]
password = sys.argv[2]
print(f"USERNAME={username}, PASSWORD={password}")
python test.py myUsername S3CR3T_P455W0RD

Type Passwords in the Console

You could request the user to type their password in the console, but the characters would be visible as they're typed.

# ⚠️ This code displays the typed password
password = input("Password: ")

Python has a getpass module in its standard library made for prompting the user for passwords as console input. Unlike input(), characters are not visible as the password is typed.

# 👍 This code hides the typed password
import getpass
password = getpass.getpass('Password: ')

Extract Passwords from the Clipboard

This is an interesting method. It's fast and simple, but a bit quirky. Downsides are (1) it requires the password to be in the clipboard which may expose it to other programs, (2) it requires installation of a nonstandard library, and (3) it won't work easily in server environments.

Note that I trust pyperclip more than clipboard (which is just another developer wrapping pyperclip)

pip install pyperclip

Run after copying a password to the clipboard:

import pyperclip
password = pyperclip.paste()

Request Credentials with Tk

The Tk graphics library is a cross-platform graphical widget toolkit that comes with Python. A login window that collects username and password can be created programmatically and wrapped in a function for easily inclusion in scripts that otherwise don't have a GUI.

I find this technique particularly useful when the username and password are stored in a password manager.

def getCredentials(defaultUser):
    """Request login credentials using a GUI."""
    import tkinter
    root = tkinter.Tk()
    root.eval('tk::PlaceWindow . center')
    root.title('Login')
    uv = tkinter.StringVar(root, value=defaultUser)
    pv = tkinter.StringVar(root, value='')
    userEntry = tkinter.Entry(root, bd=3, width=35, textvariable=uv)
    passEntry = tkinter.Entry(root, bd=3, width=35, show="*", textvariable=pv)
    btnClose = tkinter.Button(root, text="OK", command=root.destroy)
    userEntry.pack(padx=10, pady=5)
    passEntry.pack(padx=10, pady=5)
    btnClose.pack(padx=10, pady=5, side=tkinter.TOP, anchor=tkinter.NE)
    root.mainloop()
    return [uv.get(), pv.get()]
username, password = getCredentials("user@site.com")

Manage Passwords with a Keyring

The keyring package provides an easy way to access the system's keyring service from python. On MacOS it uses Keychain, on Windows it uses the Windows Credential Locker, and on Linux it can use KDE's KWallet or GNOME's Secret Service.

Downsides of keyrings are (1) it requires a nonstandard library, (2) implementation may be OS-specific, (3) it may not function easily in cloud environments.

pip install keyring
# store the password once
import keyring
keyring.set_password("system", "myUsername", "S3CR3T_P455W0RD")
# recall the password at any time
import keyring
password = keyring.get_password("system", "myUsername")

Passwords in Environment Variables

Environment variables are one of the better ways of managing credentials with Python. There are many articles on this topic, including Twilio's How To Set Environment Variables and Working with Environment Variables in Python. Environment variables are one of the preferred methods of credential management when working with cloud providers.

Be sure to restart your console session after editing environment variables before attempting to read them from within python.

import os
password = os.getenv('demoPassword')

There are many helper libraries such as python-dotenv and Python Decouple which can use local .env files to dynamically set environment variables as your program runs. As noted in previous sections, when storing passwords in plain-text in the file structure of your repository be extremely careful not to commit these files to source control!

Example .env file:

demoPassword2=superSecret

The dotenv package can load .env variables as environment variables when a Python script runs:

import dotenv
dotenv.load_dotenv()
password2 = os.getenv('demoPassword2')
print(password2)

Additional Resources

How do you manage credentials in Python? If you wish to share feedback or a creative method you use that I haven't discussed above, send me an email and I can include your suggestions in this document.

Markdown source code last modified on May 16th, 2021
---
Title: Managing Credentials with Python
Description: How to safely work with secret passwords in python scripts that are committed to source control
Date: 2021-05-15 9PM EST
---

# Managing Credentials with Python

**I enjoy contributing to open-source, but I'd prefer to keep my _passwords_ to myself!** Python is a great glue language for automating tasks, and recently I've been using it to log in to my web server using SFTP and automate log analysis, file management, and software updates. The Python scripts I'm working on need to know my login information, but I want to commit them to source control and share them on GitHub so I have to be careful to use a strategy which minimizes risk of inadvertently leaking these secrets onto the internet.

**This post explores various options for managing credentials in Python scripts in public repositories.** There are many different ways to manage credentials with Python, and I was surprised to learn of some new ones as I was researching this topic. This post reviews the most common options, starting with the most insecure and working its way up to the most highly regarded methods for managing secrets.

## Plain-Text Passwords in Code

> **⚠️☠️ DANGER:** Never do this

You could put a password or API key directly in your python script, but even if you intend to remove it later there's always a chance you'll accidentally commit it to source control without realizing it, posing a security risk forever. This method is to be avoided at all costs!

```python
username = "myUsername"
password = "S3CR3T_P455W0RD"
logIn(username, password)
```

## Obfuscated Passwords in Code

> **⚠️☠️ DANGER:** Never do this

A _slightly_ less terrible idea is to obfuscate plain-text passwords by storing them as base 64 strings. You won't know the password just by seeing it, but anyone who has the string can easily decode it. Websites like https://www.base64decode.org are useful for this.

```py
"""Demonstrate conversion to/from base 64"""

import base64

def obfuscate(plainText):
    plainBytes = plainText.encode('ascii')
    encodedBytes = base64.b64encode(plainBytes)
    encodedText = encodedBytes.decode('ascii')
    return encodedText


def deobfuscate(obfuscatedText):
    obfuscatedBytes = obfuscatedText.encode('ascii')
    decodedBytes = base64.b64decode(obfuscatedBytes)
    decodedText = decodedBytes.decode('ascii')
    return decodedText
```

```py
original = "S3CR3T_P455W0RD"
obfuscated = obfuscate(original)
deobfuscated = deobfuscate(obfuscated)

print("original: " + original)
print("obfuscated: " + obfuscated)
print("deobfuscated: " + deobfuscated)
```

```
original: S3CR3T_P455W0RD
obfuscated: UzNDUjNUX1A0NTVXMFJE
deobfuscated: S3CR3T_P455W0RD
```

## Passwords in Plain Text Files

> **⚠️ WARNING:** This method is prone to mistakes. Ensure the text file is never committed to source control.

You could store username/password on the first two lines of a plain text file, then use python to read it when you need it.

```py
with open("secrets.txt") as f:
    lines = f.readlines()
    username = lines[0].strip()
    password = lines[1].strip()
    print(f"USERNAME={username}, PASSWORD={password}")
```


If the text file is in the repository directory you should modify `.gitignore` to ensure it's not tracked by source source control. There is a risk that you may forget to do this, exposing your credentials online! A better idea may be to place the secrets file outside your repository folder entirely.

💡 There are libraries which make this easier. One example is [Python Decouple](https://pypi.org/project/python-decouple/) which implements a lot of this logic gracefully and can even combine settings from multiple files (e.g., `.ini` vs `.env` files) for environments that can benefit from more advanced configuration options. See the notes below about helper libraries that environment variables and `.env` files

## Passwords in Python Modules

> **⚠️ WARNING:** This method is prone to mistakes. Ensure the secrets module is never committed to source control.

Similar to a plain text file not tracked by source control (ideally outside the repository folder entirely), you could store passwords as variables in a Python module then import it.

```py
from mySecrets import username, password
print(f"USERNAME={username}, PASSWORD={password}")
```

If your secrets file is in an obscure folder, you will have to add it to your path so the module can be found when importing.

```py
import sys
sys.path.append("C:/path/to/secrets/folder")

from mySecrets import username, password
print(f"USERNAME={username}, PASSWORD={password}")
```

Don't name your module `secrets` because the [secrets module](https://docs.python.org/3/library/secrets.html) is part of the standard library and that will likely be imported in stead.

## Passwords as Program Arguments

> **⚠️ WARNING:** This method may store plain text passwords in your command history.

This isn't a great idea because passwords are seen in plain text in the console and also may be stored in the command history. However, you're unlikely to accidentally commit passwords to source control.

```py
import sys
username = sys.argv[1]
password = sys.argv[2]
print(f"USERNAME={username}, PASSWORD={password}")
```

```bash
python test.py myUsername S3CR3T_P455W0RD
```

## Type Passwords in the Console

You could request the user to type their password in the console, but the characters would be visible as they're typed.

```py
# ⚠️ This code displays the typed password
password = input("Password: ")
```

Python has a [getpass module](https://docs.python.org/3/library/getpass.html) in its standard library made for prompting the user for passwords as console input. Unlike `input()`, characters are not visible as the password is typed.

```py
# 👍 This code hides the typed password
import getpass
password = getpass.getpass('Password: ')
```

## Extract Passwords from the Clipboard

This is an interesting method. It's fast and simple, but a bit quirky. Downsides are (1) it requires the password to be in the clipboard which may expose it to other programs, (2) it requires installation of a nonstandard library, and (3) it won't work easily in server environments.

Note that I trust [pyperclip](https://pypi.org/project/pyperclip/) more than [clipboard](https://pypi.org/project/clipboard/) (which is just [another developer wrapping pyperclip](https://github.com/terryyin/clipboard/blob/master/clipboard.py))

```bash
pip install pyperclip
```

Run after copying a password to the clipboard:

```py
import pyperclip
password = pyperclip.paste()
```

## Request Credentials with Tk

The [Tk graphics library](https://docs.python.org/3/library/tk.html) is a cross-platform graphical widget toolkit that comes with Python. A login window that collects username and password can be created programmatically and wrapped in a function for easily inclusion in scripts that otherwise don't have a GUI.

I find this technique particularly useful when the username and password are stored in a password manager.

<div class="text-center">

![](tk-password-dialog.png)

</div>

```py
def getCredentials(defaultUser):
    """Request login credentials using a GUI."""
    import tkinter
    root = tkinter.Tk()
    root.eval('tk::PlaceWindow . center')
    root.title('Login')
    uv = tkinter.StringVar(root, value=defaultUser)
    pv = tkinter.StringVar(root, value='')
    userEntry = tkinter.Entry(root, bd=3, width=35, textvariable=uv)
    passEntry = tkinter.Entry(root, bd=3, width=35, show="*", textvariable=pv)
    btnClose = tkinter.Button(root, text="OK", command=root.destroy)
    userEntry.pack(padx=10, pady=5)
    passEntry.pack(padx=10, pady=5)
    btnClose.pack(padx=10, pady=5, side=tkinter.TOP, anchor=tkinter.NE)
    root.mainloop()
    return [uv.get(), pv.get()]
```

```py
username, password = getCredentials("user@site.com")
```

## Manage Passwords with a Keyring

The [keyring package](https://pypi.org/project/keyring/) provides an easy way to access the system's keyring service from python. On MacOS it uses [Keychain](https://en.wikipedia.org/wiki/Keychain_%28software%29), on Windows it uses the [Windows Credential Locker](https://docs.microsoft.com/en-us/windows/uwp/security/credential-locker), and on Linux it can use KDE's [KWallet](https://en.wikipedia.org/wiki/KWallet) or GNOME's [Secret Service](https://specifications.freedesktop.org/secret-service/latest).

Downsides of keyrings are (1) it requires a nonstandard library, (2) implementation may be OS-specific, (3) it may not function easily in cloud environments.

```bash
pip install keyring
```

```py
# store the password once
import keyring
keyring.set_password("system", "myUsername", "S3CR3T_P455W0RD")
```

```py
# recall the password at any time
import keyring
password = keyring.get_password("system", "myUsername")
```

## Passwords in Environment Variables

Environment variables are one of the better ways of managing credentials with Python. There are many articles on this topic, including Twilio's [How To Set Environment Variables](https://www.twilio.com/blog/2017/01/how-to-set-environment-variables.html) and [Working with Environment Variables in Python](https://www.twilio.com/blog/environment-variables-python). Environment variables are one of the preferred methods of credential management when working with cloud providers.

<div class="text-center">

![](environment-variables.png)

</div>

Be sure to restart your console session after editing environment variables before attempting to read them from within python.

```py
import os
password = os.getenv('demoPassword')
```

There are many helper libraries such as [python-dotenv](https://pypi.org/project/python-dotenv/) and [Python Decouple](https://pypi.org/project/python-decouple/) which can use local `.env` files to dynamically set environment variables as your program runs. As noted in previous sections, when storing passwords in plain-text in the file structure of your repository be extremely careful not to commit these files to source control!


Example `.env` file:
```txt
demoPassword2=superSecret
```

The `dotenv` package can load `.env` variables as environment variables when a Python script runs:

```py
import dotenv
dotenv.load_dotenv()
password2 = os.getenv('demoPassword2')
print(password2)
```

## Additional Resources
* [Using .env Files for Environment Variables in Python Applications](https://dev.to/jakewitcher/using-env-files-for-environment-variables-in-python-applications-55a1)
* [Environment Variables vs. Secrets In Python](https://www.activestate.com/blog/python-environment-variables-vs-secrets/)
* [How To Set Environment Variables](https://www.twilio.com/blog/2017/01/how-to-set-environment-variables.html)
* [Working with Environment Variables in Python](https://www.twilio.com/blog/environment-variables-python)
* [How to Set and Get Environment Variables in Python](https://able.bio/rhett/how-to-set-and-get-environment-variables-in-python--274rgt5)
* [python-dotenv](https://pypi.org/project/python-dotenv/) reads key-value pairs from a .env file and can set them as environment variables. It helps in the development of applications following the 12-factor principles.
* [Python Decouple](https://pypi.org/project/python-decouple/) helps you to organize your settings so that you can change parameters without having to redeploy your app.

_How do you manage credentials in Python? If you wish to share feedback or a creative method you use that I haven't discussed above, send me an email and I can include your suggestions in this document._
Pages