Volume forms on the mass-shell

The setting for dynamics is the cotangent bundle \mathcal{T}^*\mathcal{M} of a manifold \mathcal{M} with pseudo-Riemannian metric g_{\mu\nu}; relevant observables can be functions of both position and momentum. For example, the distribution function f(x,p), which is the number density of particles in phase space (x^\mu and p^\mu are coordinates on \mathcal{M} and \mathcal{T}^*_x\mathcal{M} respectively).

The spatial volume form (integral measure) is \mathrm{vol}(\mathcal{N}) \equiv \sqrt{g(x)} dx^1 \wedge \ldots \wedge dx^n, where g(x) is the determinant of the metric evaluated at x \in M; and the volume form for the full phase space is dx^1 \wedge \ldots \wedge dx^n \wedge dp_1 \wedge \ldots \wedge dp_n. If we want to integrate out the momentum-dependence of some observable, we need just the momentum-part of the volume form. From the two expressions above we can see that this is \mathrm{vol}(\mathcal{T}^*_x\mathcal{M}) = \frac{1}{\sqrt{g(x)}} dp^1 \wedge \ldots \wedge dp^n.

However, paths that obey the classical equations of motion are constrained to lie on the mass-shell: in flat Lorentzian spacetime this is a hyperboloid in the momentum cotangent space \left(\mathbb{R}^{1,3}\right)^*, given by -p_0^2 + p_i^2 = -m^2, where p_0 is the energy, p_i the spatial 3-momentum and m the rest mass of a particular particle. However, in fully general curved spacetime the corresponding condition gives a hypersurface \mathcal{N} \equiv \{(x,p) \in \mathcal{T}^*\mathcal{M} \: | \: p^\mu p_\mu = -m^2 \}.

Because the permitted region of phase space has been restricted, we can eliminate one momentum component from the integral, treating it as a function of the other coordinates. Conventionally we pick p_0 to be this unwanted component, writing p_0 = p_0(p_i), and note that the equation for \mathcal{N} can be written p_\mu p^\mu = p_0 p^0 + p_i p^i = -m^2, which allows us to solve for p_0.

We then find the volume form induced on \mathcal{N}. Let n be the unit vector field normal to \mathcal{N}:

n = \left.\frac{d(p_\mu p^\mu)}{\left\|d(p_\mu p^\mu)\right\|}\right|_\mathcal{N}

The denominator works out to be -4m^2. Noting that the metric is a function of x alone, we perform the derivative on the numerator and so find that

n = \frac{2p_\mu dp^\mu}{-4m^2} = \frac{-p_\mu dp^\mu}{2m^2}

(the normal index convention for vectors and covectors is swapped round, as we are working on \mathcal{T}^*\mathcal{M}, so the coordinates p_\mu are covariant to begin with).

In general, the volume form induced from a manifold \mathcal{M} onto a submanifold \mathcal{N} with normal VF n is

\mathrm{vol}(\mathcal{N}) = \left. n \:\lrcorner\: \mathrm{vol}(\mathcal{M}) \right|_\mathcal{N}.

For the present purposes we therefore have

\sqrt{g(x)} \mathrm{vol}(\mathcal{N}) = n \:\lrcorner\: \left( dp_0 \wedge dp_1 \wedge dp_2 \wedge dp_3 \right) \\ \:\: = (n \:\lrcorner\: dp_0) dp_1 \wedge dp_2 \wedge dp_3 - (n \:\lrcorner\: dp_1) dp_0 \wedge dp_2 \wedge dp_3 \\ \:\:\:\:\:\: + (n \:\lrcorner\: dp_2) dp_0 \wedge dp_1 \wedge dp_3 - (n \:\lrcorner\: dp_3) dp_0 \wedge dp_1 \wedge dp_2

We differentiate the condition p^\mu p_\mu = -m^2 with respect to a spatial component p_i and rearrange, giving (note the positions of the indices):

\frac{\partial p_0}{\partial p_i} = \frac{-p^i}{p^0} \\ dp_0 = \frac{\partial p_0}{\partial p_i}dp_i = \frac{-p^i}{p^0}dp_i.

Expressions of the form dp_0 \wedge dp_i \wedge dp_j can be simplified to involve only the k-component of dp_0 ( i \neq k \neq j ), as repetition of a form in a wedge product sets the entire expression to zero. We also have n \:\lrcorner\: dp_\mu = n_\mu = \frac{-p_\mu}{2m^2}.

So, putting it all together,

\sqrt{g(x)} \mathrm{vol}(\mathcal{N}) = \\ \:\:  \left( \frac{-p_0}{2m^2} \right) dp_1 \wedge dp_2 \wedge dp_3 - \left( \frac{-p_1}{2m^2} \frac{-p^1}{p^0} \right) dp_1 \wedge dp_2 \wedge dp_3 \\ \:\:\:\:\:\: + \left( \frac{-p_2}{2m^2} \frac{-p^2}{p^0} \right) dp_2 \wedge dp_1 \wedge dp_3 - \left( \frac{-p_3}{2m^2} \frac{-p^3}{p^0} \right) dp_3 \wedge dp_1 \wedge dp_2 \\ \:\: = \frac{-1}{2 m^2 p^0}\left(p_0p^0 + p_ip^i\right) dp_1 \wedge dp_2 \wedge dp_3 \\ \:\: = \frac{-(-m^2)}{2 m^2 p^0} dp_1 \wedge dp_2 \wedge dp_3

And so the final result is

\mathrm{vol}(\mathcal{N}) = \frac{dp_1 \wedge dp_2 \wedge dp_3}{2 p^0 \sqrt{g(x)}} \\

which has the expected form. Integrals over momentum space therefore look like

\int_\mathcal{N} f(p_i) \frac{dp_1 \wedge dp_2 \wedge dp_3}{2 E(p_i) \sqrt{g(x)}}.

Everything we wrote down was manifestly covariant, so this volume form transforms in the correct way under general coordinate transformations. The rest mass m does not appear in the final volume form, so we are free to set m = 0 if we choose, as is the case with photons.

Posted in physics | Leave a comment

Composing array masks

Let’s say you have an array P, containing N (~millions) points P_i \in \mathbb{R}^3. Perhaps it’s the output of an n-body simulation or something more complicated. Anyway, suppose you also have several other arrays of size N, each listing some quantity that is associated with each point P_i – for example M_i, the mass at each point.

The task: find some small contiguous region \Omega \subset \mathbb{R}^3, and calculate some function at every point within it. For example, approximate the integral of the density by summing over the mass at each point:

\displaystyle  \int_{\Omega} \rho(x) d^3x \approx \sum_{P_i \in \Omega} M_i

Let us introduce an operation A \star B = C, where |A| = |B|, B contains only zeroes or ones, and C contains A_i iff B_i = 1; also |C| \leq |A|, but C maintains the order of elements from A. Assume that this operation can be calculated efficiently (in parallel), as opposed to sequentially traversing the elements of an array, which is much slower (this is the case in the IDL programming language, for example).

The solution to the problem at hand is to produce a mask S containing N elements with values the characteristic function \chi_\Omega at each point – that is, S_i = \chi_\Omega(P_i) = 1 if P_i \in \Omega and 0 otherwise. Then the final calculation is

\displaystyle  \sum_{P_i \in \Omega} M_i = \sum_i \left(M \star S\right)_i

and the advantage is that we can reuse the mask S for additional calculations.

Now suppose that \Omega is itself a very large set, and that we have another, smaller region \Omega' \subset \Omega on which we would also like to perform calculations. Let P' = P \star S and S' = \chi_{\Omega'}(P'). Then we can compose our masks to find the M values that lie within \Omega'.

\displaystyle  \sum_{P_i \in \Omega'} M_i = \sum_i \left((M \star S) \star S'\right)_i.

What are the P-indices of points that are in the region \Omega'? To find the answer we do:

\displaystyle  A = [0,1,2,\ldots,|S|] \\ B = (A \star S) \star S'

and B now contains the desired P-indices.

In the IDL programming language, A = B \star C is written as

A = B[where(C)]

A more general (and also efficient) way of dealing with large, unstructured lists of coordinates is to use an oct-tree.

Posted in maths, programming | Leave a comment

From notated music to audible sounds

This is the second post in a series devoted to music from a mathematical point of view. The first post dealt with written intervals and notes; the moral of that post was that there is some structure (a vector space) hidden inside the way we talk about intervals and notes, which we can (and should) take advantage of.

In this post, I will make the transition from notated music to audible noises, still in a way that is aimed at my hypothetical musically-ignorant mathematician.

Revision of previous ideas

Notated intervals form a two-dimensional vector space. Pitches form a two-dimensional affine space, with intervals as the ‘difference’ vectors. See the previous post for details.

Addition in log-space

I take audible sounds to be the space of frequencies as measured in units of hertz (cycles per second). However, what we’re really interested in are the ratios between these frequencies. The absolute values only come into play when we choose an arbitrary reference point off which to base all our absolute pitches. (Choosing a reference point is different from using a non-standard tuning system – you can have equal temperament, but at Baroque pitch (A = 415), for example).

Pitch ratios are, of course, combined by multiplication, but we can still write the operation as addition provided we understand that they are being `added’ in log-space:

f_1 = f_2f_3 \:\:\: \Longleftrightarrow \:\:\: \log{f_1} = \log{f_2} + \log{f_3}.

In practice, these interval ratios will always be formed by taking rational numbers to rational powers.

Constraints on a tuning system

Different musical instruments are suited to different methods of tuning notes. For example, the human voice can trivially produce pitches at any frequency in a certain range – and the same for string instruments. Wind instruments have a fixed number of ‘holes’, plus some standardised ways of shifting the basic pitches around. Brass instruments are even more restricted, and the notes they can play are closely related to the harmonic series.

Keyboard instruments are a somewhat different beast – in theory you could associate a button/key with any note imaginable, but due to practical limitations a one-dimensional array of keys is used. This obviously causes issues when we try to match up a notation system based on a two-dimensional system of intervals to the keys available. Therefore we’ll need to come up with some way of reducing (“projecting”) our two dimensions down to a single dimension. This is the Fundamental Keyboard Problem.

Intervals with rational coefficients

When defining a tuning system, what is typically given are particular ratios for certain intervals. Suppose we have a tuning system t : \mathcal{I} \longrightarrow \mathbb{R}, i.e. a map that takes intervals to pitch ratios. We fix two intervals, t(i_1) = f_1 and t(i_2) = f_2. Assuming it is not the case that i_1 \propto i_2, these two intervals span \mathcal{I}, so t(i) is now fixed for all i \in \mathcal{I}. This is because any i \in \mathcal{I} can be written in the i_1, i_2 basis,

i = \alpha\cdot i_1 + \beta\cdot i_2

and hence

t(i) = \alpha\cdot t(i_1) + \beta\cdot t(i_2) \equiv f_1^\alpha f_2^\beta

Many well-known tuning systems can be specified this way. They are called syntonic tuning systems, or rank-2 tuning systems. However, in practice there is only one interval ratio that is free to be specified arbitrarily, because the other fixed interval is always t(\mathsf{P8}) = 2, otherwise octaves aren’t pure!

This gives rise to the main problem: two non-octave intervals can’t be simultaneously pure. This is distinct from the problem of designing keyboard instruments. The diatonic scale of Ptolemy specifies pure intervals for all eight steps of the major scale:

degree ratio
P1 1
M2 9/8
M3 5/4
P4 4/3
P5 3/2
M6 5/3
M7 15/8
P8 2

(There exist numerous slight variations of the Ptolemaic scale, as well as the minor scale etc.)

With a syntonic temperament, we can only get a few of these ‘correct’, unless we happen to get lucky with our ratios. P1 and P8 are correct by definition; then i (e.g. P5) can be specified freely; then, perhaps \mathsf{P8} - i (e.g. P4) will come out correct too. After that you’re out of luck.

Syntonic tuning systems

In Pythagorean tuning, the given intervals are 3/2 for the perfect fifth, and 2 for the octave. As indicated above, this completely specifies the tuning. The procedure for general intervals is then as follows:

  • Define a map t : \mathcal{I} \longrightarrow \mathbb{R} that takes intervals to pitch ratios, and define it for the two chosen basis intervals, e.g.

t(\mathsf{P5}) = \frac{3}{2}\\ t(\mathsf{P8}) = 2

  • Write your chosen interval in terms of the new basis and calculate the appropriate ratio, e.g.

\mathsf{M6} = 3\cdot \mathsf{P5} - 1\cdot\mathsf{P8} \\ t(\mathsf{M6}) = \left(\frac{3}{2}\right)^3 \left(2\right)^{-1} = \frac{27}{16}

  • Then, for notes, define a new map T : \mathcal{P} \longrightarrow \mathbb{R}
  • Fix the origin under T, i.e. T(p_0) = f_0 for some note p_0 and pitch f_0; the common choice is p_0 = \mathsf{A}, and f_0 =   440\:\mathrm{Hz}
  • Extend T to all notes by

T(p) = t(p - p_0)\times T(p_0)

For example,

T(\mathsf{F\sharp}) = t(\mathsf{M6}) \times T(\mathsf{A}) = \frac{27}{16} \times 440\:\mathrm{Hz} = 742.5\:\mathrm{Hz}

Here is a table of some common syntonic tuning systems, in each case assuming that the second constrained interval is \mathsf{P8} \longrightarrow 2:

Tuning system Fixed interval
Pythagorean \mathsf{P5} \longrightarrow \frac{2}{3}
Quarter-comma meantone \mathsf{M3} \longrightarrow \frac{5}{4}
Sixth-comma meantone \mathsf{A4} \longrightarrow \frac{45}{32}
Third-comma meantone \mathsf{m3} \longrightarrow \frac{6}{5}
Schismatic 8\cdot\mathsf{P4} \longrightarrow 10

Note that we quickly enter the realm of irrational numbers: for example, under quarter-comma meantone, \mathsf{P5} \longrightarrow \left(\frac{5}{4}\right)^\frac{1}{4}\left(2\right)^\frac{1}{2} \approx 1.495.

You can immediately see that different tuning systems give different trades-off: quarter-comma meantone provides you with sweet-sounding (and narrow) major thirds, while abandoning the pure fifths of Pythagorean tuning.

There is a link here between theory and practice: in Medieval music, for which Pythagorean tuning was used, phrase-endings rarely feature major thirds – normally open fifths and octaves are the only intervals considered ‘pure’ enough to end a phrase. In Renaissance and Baroque music, major thirds are used much more often, and this coincides with the use of quarter-comma meantone tuning.

Keyboard instruments with syntonic temperaments

Let us design a keyboard that will use notes from a syntonic temperament t : \mathcal{I} \longrightarrow \mathbb{R} (with fixed interval i, origin note b, and note-mapping T : \mathcal{P} \longrightarrow \mathbb{R} ); we know that octaves will be pure, so we make our one-dimensional keyboard periodic at the octave, and then place n keys in each octave. Each key (attached to a physical string or pipe) will be tuned to some definite frequency f \in \{T(p) \: | \: p \in \mathcal{P} \}.

Now we’ll attempt to distribute notes from our temperament to the physical keys on the keyboard. Starting at note b (with frequency T(b) ), assign the notes (b\pm k\cdot i) \: \mathrm{mod} \: \mathsf{P8} to their keys (with frequencies T\left((b\pm k\cdot i) \: \mathrm{mod} \: \mathsf{P8}\right) ), ending at k = n (\pm 1 depending on whether n is odd or even). Unfortunately in general the cycle is not closed, as \left(n\cdot i \: \mathrm{mod} \: \mathsf{P8}\right) \neq \mathsf{P1}. This is called a wolf interval, and its existence limits the usefulness of syntonic tuning systems for keyboards.

To minimise disruption, the wolf interval is normally chosen to be one that is little used if playing in keys with a low number of sharps and flats; for example \mathsf{G\sharp} - \mathsf{E\flat} = \mathsf{A3}: under Pythagorean tuning, the A3 is about 1.35, to contrast with the pure P4 which is exactly \frac{4}{3}.

Keyboard instruments with equal temperaments

Returning to the Fundamental Keyboard Problem, we see that the solution is to project the two dimensions of notated intervals down to a one-dimensional subspace. This necessarily involves one interval being set to zero (or to one, multiplicatively speaking). Our search therefore is effectively for syntonic tuning systems where the fixed ratios are \mathsf{P8} \longrightarrow 2 and i \longrightarrow 1 for some interval i.

Before we know what i is, can we say what such a tuning system would look like? Well, if we pick an interval j \in \mathcal{I}, j \neq i, and use i, j as our new basis, then because i \longrightarrow 1, we can generate all intervals with \alpha\cdot j for some rational \alpha. Furthermore, we can pick j carefully so that all intervals can actually be represented by \alpha\cdot j for integral \alpha. Then we can use j as a convenient “unit” with which to construct our notation system or keyboard. If \mathsf{P8} = n\cdot j, then the tuning system is called n-equal temperament.

A bit of experimentation (or suitably clever calculation) results in some promising-looking candidates for i:

i j n
\mathsf{A1} \mathsf{M2} 7
\mathsf{d2} \mathsf{A1}, \mathsf{m2} 12
\mathsf{dd2} \mathsf{d2} 19
\mathsf{d^4 3} \mathsf{d2} 31
\mathsf{d^7 6} \mathsf{d2} 53

(The j interval is non-unique, as various intervals become identified under equal temperaments.)

As you may have guessed already, the favourite choice here is n = 12 and i = \mathsf{d2} \longrightarrow 1. This means that \mathsf{A1} \longrightarrow 2^\frac{1}{12}, and \mathsf{m2} \longrightarrow 2^\frac{1}{12}. So A1 and m2 are identified, and are used as the generator j. They are referred to interchangeably as a “semitone”. The other useful property of 12-equal temperament is that \mathsf{P5} \longrightarrow 2^\frac{7}{12} \approx 1.498, which is extremely close to the Pythagorean value!

Thus the use of 12-equal temperament to resolve the Fundamental Keyboard Problem leads directly to keyboards with 12 keys per octave; seven “white” notes {\mathsf{A},\mathsf{B},\mathsf{C},\mathsf{D},\mathsf{E},\mathsf{F},\mathsf{G}}, and five “black” notes {\mathsf{A\sharp},\mathsf{C\sharp},\mathsf{D\sharp},\mathsf{F\sharp},\mathsf{G\sharp}}. There are no more notes to account for, because the equivalency of A1 and m2 means that notes that differ by these intervals are identified, e.g. \mathsf{B\sharp} \equiv \mathsf{C} and \mathsf{F\sharp} \equiv \mathsf{G\flat}.

Twelve notes per octave is also fairly convenient given the size of human hands, and how difficult the resulting instrument is to play.

Other instruments

Consider an ensemble of dynamically-tunable instruments (string instruments, human voices, etc.). If this ensemble plays a major chord, there’s no reason why the players can’t all agree to tune it totally purely – with ratios of 1, \frac{5}{4}, \frac{3}{2}.

As a general strategy, the ensemble could choose to fix just a few notes overall, and then tweak any chord slightly to maximise harmonicity. Or, locally fix any note that is constant between successive chords, and change all the other notes around it.

These systems of constant readjustment have one big advantage – much nicer-sounding intervals – and several major annoyances, which are:

  • There’s no longer an unambiguous mapping between written notes and sounding frequencies. This may or may not offend you greatly, depending on how you axiomatise musical notation (you can probably guess my position…)
  • A tendency for the pitch of the entire ensemble to drift over time (particularly with the second system).
  • Cannot include certain instruments in the ensemble (any keyboard instruments, certain wind instruments).

Nevertheless, it is hypothesised that certain ensembles (string quartets, unaccompanied choirs) do in fact adjust their intonation in this way.

Posted in maths, music | Leave a comment

Cheap & Easy differential forms

There’s a way of motivating the notions of tangent vectors and covectors that’s hinted at but generally glossed over – at least in the physics courses that I take. This post is a quick overview, serving mostly as a reminder to myself of the topic. Please excuse the lack of rigour.

I will use the Einstein summation convention throughout,

x^iy_iz^j \equiv \sum\limits_i \: x^iy_iz^j,

and hopefully by the end I’ll even have explained why it makes sense.

Tangent vectors

We have an n-dimensional manifold M, which contains points, but not vectors. You cannot subtract two points on a manifold and expect to get something useful; imagine a line drawn between two points on the Earth’s surface. It would go awkwardly underground, and wouldn’t measure any sort of quantity that’s appreciable by inhabitants on the surface.

Let \gamma \: : \: \mathbb{R} \longrightarrow M be a curve on M. It takes some real parameter (lets call it t ) and spits out points in M along a line, as you evolve t. Let’s call the coordinates of these points p^i(t) in some coordinate system, and p'^i(t) in some other coordinate system. Then we can find a ‘velocity’ vector \dot{\gamma}, tangent to the curve, whose coordinates are \left(\frac{dp^1}{dt}, \frac{dp^2}{dt}, \ldots, \frac{dp^n}{dt}\right). The coordinates of \dot{\gamma} in the primed coordinate system are then given by the chain rule,

\frac{dp'^i}{dt} = \frac{dp'^i}{dp^j}\frac{dp^j}{dt}.

This motivates the study of all objects that transform this way, and they are called contravariant vectors, or contravectors, or just vectors.

Now, so far the vectors are just n-tuples of numbers, with no particular geometric significance. I will however write down a vector \mathbf{v} with a basis by pairing up its components v^i with a basis \mathbf{e}_i, as well as the same in the primed coordinate system:

\mathbf{v} = v^i\mathbf{e}_i = v'^i\mathbf{e'}_i,

and for now these basis vectors \mathbf{e}_i are formal placeholders. All we can say is, whatever the choice of \mathbf{e}_i, they will have to transform using the inverse of the transformation matrix used by v^i, in order that the expression above remains true in any coordinate system.

A vector lives at a single point p of M, in a space called ‘the tangent space to M at p ‘, or T_pM for short – imagine a flat plane ( T_pM ) balancing on top of a lumpy surface ( M ), touching it at a point ( p ). If \mathbf{v} varies from point to point, it is strictly a vector field, or in other words a function \mathbf{v} \: : \: M \longrightarrow T_pM; in this case we can just say that it lives in TM, and we have to be careful not to forget about its position-dependence even if we suppress it occasionally for notational convenience.

Differential operators

Let f \: : \: M \longrightarrow \mathbb{R} be a scalar field (just a real-valued function) on our manifold M. We can write differentiation of f along a vector (the directional derivative along \mathbf{v} ) in three ways, all defined to be equivalent

\left( \nabla_\mathbf{v} f \right) (p) \equiv \frac{df(p + t\mathbf{v})}{dt}\Big|_{t = 0} \equiv v^i\frac{\partial f}{\partial x^i},

and note that we’re not really adding the vector \mathbf{v} to the point p, because we’re evaluating the expression at t = 0. The p-dependence is casually suppressed in the last expression.

We might worry that this is coordinate system-dependent, so lets try to write the same quantity down in the primed coordinate system, using the transformation properties of \mathbf{v} that we already know, and the chain rule:

v'^i\frac{\partial f}{\partial x'^i} = \frac{\partial x'^i}{\partial x^j} v^j \frac{\partial f}{\partial x^k}\frac{\partial x^k}{\partial x'^i} = v^j \frac{\partial f}{\partial x^k} \delta^k_j = v^k \frac{\partial f}{\partial x^k},

so our directional derivative is coordinate-invariant after all! Note that multiplying the coordinates of a matrix with those of its inverse (and summing according to the Einstein convention) gives the Kronecker delta, which is why we can swap out j for k in the last expression.

Coordinate-invariance shouldn’t surprise us too much, because the first two ways of writing the directional derivative made no mention of any coordinate system for \mathbf{v}.

Now, recall that first-order differential operators on real functions of n variables all take the form

D f = a^i\frac{\partial f}{\partial x^i},

and so if we just interpret the values a^i as components of a vector, we’ve found a one-to-one correspondence between vectors and first-order differential operators (strictly speaking it’s between vector components and operators, but all the transformation matrices between coordinate systems are one-to-one too so it doesn’t matter).

This correspondence with differential operators hints strongly at what quantities to use as our basis vectors – the individual derivative operators \frac{\partial}{\partial x^i} certainly transform in the correct way. We now make the formal identification

\frac{\partial}{\partial x^i} \equiv \mathbf{e}_i.

I say formal because we will not treat these basis vectors like ‘proper’ derivative symbols, as their ‘true’ meaning will only come into play in certain carefully-defined situations.

Let’s make the following abbreviations: \frac{\partial}{\partial x^i} \equiv \partial_i and \frac{\partial}{\partial x'^i} \equiv \partial'_i when talking about operators; and \frac{\mathbf{\partial}}{\mathbf{\partial} x^i} \equiv \partial_i \equiv \mathbf{e}_i and \frac{\mathbf{\partial}}{\mathbf{\partial} x'^i} \equiv \partial'_i \equiv \mathbf{e'}_i when talking about basis vectors.

Linear functionals

A linear functional is a function \alpha \: : \: TM \longrightarrow \mathbb{R} that satisfies linearity, i.e.

\alpha(\mathbf{u}) = x \in \mathbb{R} \\ (\alpha + \beta)(\mathbf{u}) = \alpha(\mathbf{u}) + \beta(\mathbf{u}) \\ \alpha(\mathbf{u} + \mathbf{v}) = \alpha(\mathbf{u}) + \alpha(\mathbf{v}).

Linear functionals are ‘vector-like’, but live in a space called T^*M, rather than the TM that contains vectors. They are totally determined by their action on the basis vectors of TM, so can be written down in components:

\alpha_i = \alpha(\mathbf{e}_i) \\ \alpha(\mathbf{v}) = \alpha_i v^i \\ \alpha = \alpha_i\eta^i \\ \eta^i(\mathbf{v}) = v^i,

where the \eta^i are some as-yet-mysterious basis for our linear functionals. Note the position of the indices on each quantity: the Einstein summation convention is working correctly, even if we don’t necessarily know yet what sort of quantities we’re dealing with.

The expression \alpha(\mathbf{v}) = \alpha_i v^i must be coordinate independent, as the left hand side makes no reference to any coordinate system; and we already know how to transform v^i. Therefore the components \alpha_i must use the opposite transformation, \alpha'_i = \alpha_j \frac{\partial x^j}{\partial x'^i}. So we have

\alpha'_i v'^i = \alpha_j \frac{\partial x^j}{\partial x'^i} \frac{\partial x'^i}{\partial x^k} v^k = \alpha_j \delta^j_k v^k = \alpha_j v^j = \alpha(\mathbf{v}).

These linear functionals are also called covariant vectors, covectors, differential one-forms, or one-forms. Remember that both \alpha and \mathbf{v} can have p-dependence in general, making them covector fields and vector fields respectively.

Total differential of a function

The following formula for the ‘total’ differential of a function should be familiar:

df = \partial_i f dx^i,

where p-dependence has been suppressed on both sides. However, we don’t currently have a way to make geometric sense of the individual coordinate differentials dx^i. This expression must be coordinate-independent (no mention of coordinates is made on the left side), so the coordinate differentials must transform as

dx'^i = \frac{\partial x'^i}{\partial x^j} dx^j.

This is exactly how the basis for our covectors \eta^i transforms! So we can make the formal identification \eta^i \equiv dx^i, much like how we earlier decided that \mathbf{e}_i \equiv \partial_i.

The full component-wise expression for the action of our covectors on our vectors is

\alpha(\mathbf{v}) = \alpha_i v^j dx^i(\partial_j) = \alpha_i v^j \delta^i_j = \alpha_i v^i.

The only trick here is dx^i(\partial_j) = \delta^i_j, which we have defined to be true.

Our expression for df now comes in handy as a way to generate new covectors. In fact, covectors generated in this way have the following useful property:

df(\mathbf{v}) = \partial_i f v^i = \nabla_\mathbf{v} f.

You may have spotted by now the value of the Einstein summation convention – as long as you keep your up-indices on vector components (and covector bases), and down-indices on covector components (and vector bases), any scalar you end up with will be coordinate-independent. This is a useful ‘type-check’ on any expression; if the indices don’t match, something must be wrong (or you’ve violated relativity by finding a preferred coordinate system).

I finish with three warnings:

  • Covectors generated from functions (like df above) are not the only kind! Any linear combination of the basis covectors dx^i is a covector, and in general the arbitrary covector a_i dx^i will not be the differential of any function at all.
  • The components of vectors transform in the opposite way to the components of covectors. The basis vectors transform oppositely to the vector components, and to the basis covectors. This is confusing! Hence physicists like to pretend that basis vectors don’t exist, and only work with components. This is a convenient way to work for many computations, but you can end up getting confused when your basis vectors change from point-to-point (as they do on most non-trivial manifolds and coordinate systems):

    \frac{d\mathbf{v}}{dt} = \frac{d}{dt}\left(v^i \mathbf{e}_i\right) = \dot{v}^i\mathbf{e}_i + v^i\mathbf{\dot{e}}_i.

    Mathematicians never write any down coordinates, say they are working in a ‘coordinate-free’ way, and act all clever about it.

  • There is one more way to write the directional derivative, which is

    \mathbf{v}\left(df\right) = v^i \frac{\partial f}{\partial x^j} \partial_i(dx^j) = v^i \frac{\partial f}{\partial x^j} \delta^j_i = v^i \frac{\partial f}{\partial x^i} = \nabla_\mathbf{v} f,

    treating \mathbf{v} as a function T^*M \longrightarrow \mathbb{R}. Unfortunately you also see people write the above as

    \mathbf{v}\left(f\right) = v^i \partial_i f  = \nabla_\mathbf{v} f,

    which is very confusing, as it conflicts with our careful definitions of what the basis vectors and covectors mean – such is life.

Posted in maths | Leave a comment

Algebraic structure of musical intervals and pitches

Here’s the first in what will hopefully be a series of related posts about one particular (limited) aspect of the interaction between music and mathematics. In my mind, I’ll be explaining things to a hypothetical musically uneducated mathematician, who should nevertheless end up with an understanding as good as any bona-fide musician’s – in the tradition of certain physics books of which I am a fan.

I begin by revising, in unnecessary rigorous detail, what you already knew about musical pitches and intervals.

Musical intervals are the signed distance between musical notes, as written on a traditional (Western) five-lined musical stave. For completeness, I will first summarise traditional pitch and interval notation.

Pitch syntax

Pitches are a pair, consisting of a letter and an accidental: P = (N, a), P \in \mathcal{P}, where

N \in \{A,B,C,\ldots,A',\ldots\}, \\ a \in \{\natural,\sharp,\flat,\sharp\sharp,\flat\flat,\ldots,\sharp^n,\flat^n,\ldots\}

leading to constructions such as C♮, F♯, B♭ etc. These pitches correspond in a slightly irregular way to the horizontal lines (and gaps between them) on the stave, but I will not go into the details here. All pairs (N, a) correspond to valid pitches. The set of pitches is actually extended upwards beyond that written above with super-prime symbols (A', B' ), and downwards with sub-primes (C_{'},D_{'} ).

The accidentals are pronounced as follows: ♮ is a natural, ♭ is a flat, ♯ is a sharp.

Pitches form an affine space, with intervals as the difference type (subtraction of two pitches). I will not define this subtraction until we have a clearer idea of the algebra of intervals.

Interval syntax

Intervals are also a pair, consisting of a quality and a number: I = (q, n), I \in \mathcal{I}, where

q \in \{\mathsf{P,M,m,A,d,AA,dd},\ldots,\mathsf{A}^n,\mathsf{d}^n,\ldots\}, \\ n \in \{\ldots,\mathsf{-3,-2,1,2,3},\ldots\}

leading to interval names such as P5, M3, m6 etc. Note that the set which n belongs to is not ℤ – the n are simply an arbitrary label (nominal numbers), and their arithmetic is tied up with the overall algebra of musical intervals in a complex way that does not correspond to the conventional notion of integers. Intervals form an associative abelian algebra (generally written as addition), together with the four additional operations of augmentation, diminution, inversion and negation.

The interval qualities listed above are pronounced, respectively, perfect, major, minor, augmented, diminished, doubly augmented etc. The interval numbers are pronounced as ordinal numbers, with the special case that 8 is an octave, and 1 a unison.

Note also that not all combinations (q, n) are permitted; in particular, there are certain rules that allow one to construct valid intervals. Start with one of the eleven valid base intervals:

\mathcal{I}' = \{\mathsf{P1}, \mathsf{m2}, \mathsf{M2}, \mathsf{m3}, \mathsf{M3}, \mathsf{P4}, \mathsf{P5}, \mathsf{m6}, \mathsf{M6}, \mathsf{m7}, \mathsf{M7}\}.

The total set \mathcal{I} of valid intervals is actually periodic: take \mathcal{I}', and in the positive n direction \mathcal{I} has period 7 and the entire inverted set (the result of the map n \longrightarrow -n ) is also in \mathcal{I}, i.e.

(q,n) \in \mathcal{I}' \implies (q,n \pm 7m) \in \mathcal{I}^{\prime\prime},

and then also

(q,n) \in \mathcal{I}^{\prime\prime} \implies (q, -n) \in \mathcal{I} \:\: \mathrm{and} \:\: (q, n) \in \mathcal{I}.

Operations on intervals

I will freely interchange two forms of notation for the same interval: \mathsf{P5} and (P,5) for convenience. I will use + for interval vector addition and addition of intervals to pitches, and - for interval vector subtraction and subtraction of pitches. I will use a dot \cdot for scalar multiplication of interval vectors by integers.

Before I get to the complete decision procedure for intervallic addition below, a limited form of interval addition can be defined on \mathcal{I}',

\mathsf{P1} + I = I\\ (\mathsf{M},n) - (m,n) = \mathsf{m2}\\ (\mathsf{m},n+1) - (\mathsf{M},n) = \mathsf{m2}\\ (\mathsf{m},n+1) - (\mathsf{P},n) = \mathsf{m2},\\

which can be extended in the obvious way to \mathcal{I} (of course, each case is defined only for (q,n) that actually exist in \mathcal{I} ).

Then, you can augment intervals:

aug(q,n) = \begin{cases} (\mathsf{A},n) & \mbox{if } q = \mathsf{P} \\ (\mathsf{M},n) & \mbox{if } q = \mathsf{m} \\ (\mathsf{A},n) & \mbox{if } q = \mathsf{M} \\ (\mathsf{m},n) & \mbox{if } q = \mathsf{d} \mbox{ and } (\mathsf{m},n) \in \mathcal{I} \\ (\mathsf{P},n) & \mbox{if } q = \mathsf{d} \mbox{ and } (\mathsf{P},n) \in \mathcal{I} \\ (\mathsf{A}^{i+1},n) & \mbox{if } q = \mathsf{A}^i \\ (\mathsf{d}^{i-1},n) & \mbox{if } q = \mathsf{d}^i,\\ \end{cases}

and diminish them:

dim(q,n) = \begin{cases} (\mathsf{d},n) & \mbox{if } q = \mathsf{P} \\ (\mathsf{m},n) & \mbox{if } q = \mathsf{M} \\ (\mathsf{d},n) & \mbox{if } q = \mathsf{M} \\ (\mathsf{M},n) & \mbox{if } q = \mathsf{A} \mbox{ and } (\mathsf{M},n) \in \mathcal{I} \\ (\mathsf{P},n) & \mbox{if } q = \mathsf{A} \mbox{ and } (\mathsf{P},n) \in \mathcal{I} \\ (\mathsf{A}^{i-1},n) & \mbox{if } q = \mathsf{A}^i.\\ (\mathsf{d}^{i+1},n) & \mbox{if } q = \mathsf{d}^i \\ \end{cases}

Note that aug(dim(I)) = dim(aug(I)) = I, for all I.

We can now define addition on intervals. Let I_1 = (q_1,n_1) and I_2 = (q_2,n_2). Then,

I_1 + I_2 = I_3 = (q_3, n_3),

finding q_3 and n_3 according to the following procedure: augment or diminish I_1 until you have a new interval I'_1 \in \mathcal{I}, along the way calculating an augmentation index j_1, where you increment j_1 once for each augmentation, and decrement once for each diminution. Repeat for I'_2. Then perform the following interval addition, which is possible because addition is already defined on \mathcal{I}:

I'_1 +  I'_2 = I'_3.

Then, perform the appropriate number (j_1 + j_2 ) of augmentations on I'_3, to give I_3:

aug^{j_1+j_2}(I'_3) = (q_3,n_3).

Incidentally, it is always true that n_3 = n_1 + n_2 - 1.

Free abelian groups

Quick revision of free abelian groups. A free abelian group G of rank n satisfies the following: there exists at least one set B \subset G, with |B| = n, such that every element of G can be written as a linear combination of the elements of B, with integer coefficients, i.e. g = b_1^{i_1}b_2^{i_2}\ldots b_n^{i_n}.

Free abelian groups can be thought of as vector spaces over the integers (ℤ as the field of scalars, G as the group of vectors).

A rank-n free abelian group is isomorphic to the group of n-tuples of integers, with the group operation pairwise addition:

(a_1,a_2,\ldots) + (b_1,b_2,\ldots) = (a_1 + b_1, a_2 + b_2, \ldots),

or in other words, any rank-n free abelian group is isomorphic to the direct sum of n copies of ℤ:

G \cong \mathbb{Z} \oplus \mathbb{Z} \oplus \ldots \oplus \mathbb{Z}.

There may be many different choices of basis set b, much like a vector space with no preferred basis.

Intervals as a free abelian group

As you can probably tell, our previous method of interval addition is a horrible mess. Luckily we can prove that intervals form a rank-2 free abelian group (said proof consists of a pile of tedious case analysis). Hence we can find a two-element basis; after which interval addition proceeds easily, being reduced to element-wise addition of pairs of integers.

We need to find any pair of linearly independent intervals to use as a basis – to decompose any arbitrary interval into a linear combination of the two basis intervals. Once we have one basis, we can use them to easily generate all the other bases. However, it’s not immediately obvious how to find our first pair of linearly independent intervals. Luckily, I have such a pair up my sleeve already: (A1, d2). They must be linearly independent, because

n\cdot \mathsf{A1} = (\mathsf{A}^n, n - n + 1) = (\mathsf{A}^n,1) \neq \mathsf{d2} \:\:\: \forall n.

We will now take the opportunity to simplify our augmentation and diminution operations,

aug(I) = I + \mathsf{A1} \\ dim(I) = I - \mathsf{A1}.

Decomposing arbitrary intervals into a basis requires some more tedious case analysis, so here are just a few examples:

\mathsf{m2} = \mathsf{A1} + \mathsf{d2}\\ \mathsf{P5} = 7\cdot \mathsf{A1} + 4\cdot \mathsf{d2}\\ \mathsf{M6} = 9\cdot \mathsf{A1} + 5\cdot \mathsf{d2}\\ \mathsf{P8} = 12\cdot \mathsf{A1} + 7\cdot \mathsf{d2}.

If we define the map \phi : \mathcal{I} \longrightarrow \mathbb{Z}\times\mathbb{Z} to be the map that gives the decomposition into the (A1, d2) basis, then the above could be written as

\phi(\mathsf{m2}) = (1,1)\\ \phi(\mathsf{P5}) = (7,4)\\ \phi(\mathsf{M6}) = (9,5)\\ \phi(\mathsf{P8}) = (12,7).

The basis (A1, d2) is convenient insofar as A1 matches up with what we think of as semitones, and d2 simply counts an interval’s number.

Now that we can add and subtract intervals easily, I can concisely define the two remaining special operations on intervals, inversion and negation:

inv(I) = \mathsf{P8} - I\\ neg(I) = \mathsf{P1} - I.

Pitch space

Of course, merely being able to add and subtract intervals is pretty useless on its own. What we really want to do is use intervals to hop around the space of pitches. The rules for pitch arithmetic are only barely less irregular than for intervals.

Note: adding an interval to a pitch is called transposition, and it is technically a separate operation from interval addition (it has a different type signature), but we shall use the + symbol for it anyway.

In our notation from the first section, adding an octave (P8) adds a prime symbol to a letter name

(N,a) + \mathsf{P8} = (N',a)\\ (N_{'},a) + \mathsf{P8} = (N,a)

with the obvious extension to multiple octave subtraction, and multiple primes and sub-primes.

Adding and subtracting A1 corresponds to adding and subtracting sharps and flats:

(N,\natural) + \mathsf{A1} = (N,\sharp)\\ (N,\flat) + \mathsf{A1} = (N,\natural)

with the obvious extension to double sharps (♯♯) and double flats (♭♭) and arbitrary numbers of accidentals.

All that remains is to give the intervals between the natural (♮) pitches:

\mathsf{A'\natural} - \mathsf{G\natural} = \mathsf{M2}\\ \mathsf{G\natural} - \mathsf{F\natural} = \mathsf{M2}\\ \mathsf{F\natural} - \mathsf{E\natural} = \mathsf{m2}\\ \mathsf{E\natural} - \mathsf{D\natural} = \mathsf{M2}\\ \mathsf{D\natural} - \mathsf{C\natural} = \mathsf{M2}\\ \mathsf{C\natural} - \mathsf{B\natural} = \mathsf{m2}\\ \mathsf{B\natural} - \mathsf{A\natural} = \mathsf{M2}.

The equivalent of choosing a basis for our intervals is finding a coordinate system for our pitches. To do this we must convert the pitch affine space \mathcal{P} into a vector space (Wikipedia suggests the name Pointed space) by choosing an origin. At this point, any choice of origin is arbitrary and meaningless, but it becomes important when we get to tuning systems, which will be the subject of the next post in this series.

For example, let us define a map \psi : \mathcal{P} \longrightarrow \mathbb{Z}\times\mathbb{Z} with origin

\psi(\mathsf{A\natural}) = O.

Then, to find the coordinates of an arbitrary pitch P, we compute

\psi(P) = O + \phi(P - O).

Of course, it is easiest to simply define O = (0,0).

We now no longer need to worry about how to represent pitches, and will focus on intervals for the purposes of basis changes.

Change of interval basis

Given that we now have one valid basis, the problem of further changes of basis reduces to linear algebra in two dimensions.

Let \phi(I) = (m,n) be our interval in the (A1,d2) representation, and let \phi(I_1) = (a,b) and \phi(I_2) = (c,d) be a different pair of linearly independent basis intervals. Then

x\cdot(a,b) + y\cdot(c,d) = (m,n),

which is simply a system of two linear equations, to be solved by determinants in the usual way:

x = \frac{dm - cn}{ad - bc}, \:\:\:\: y = \frac{an - bm}{ad - bc}.

Clearly the solution will not always be in the integers, so we may sometimes choose to extend our scalar field to the rationals (particularly when we come to tuning systems). Here are the examples from the previous-but-one section, but demonstrating the (P5,P8) basis:

\mathsf{m2} = -5\cdot\mathsf{P5} + 3\cdot\mathsf{P8}\\ \mathsf{P5} = 1\cdot\mathsf{P5} + 0\cdot \mathsf{P8}\\ \mathsf{M6} = 3\cdot \mathsf{P5} - 1\cdot\mathsf{P8}\\ \mathsf{P8} = 0\cdot \mathsf{P5} + 1\cdot\mathsf{P8},

and again with the (M2,m2) basis:

\mathsf{m2} = 0\cdot\mathsf{M2} + 1\cdot\mathsf{m2}\\ \mathsf{P5} = 3\cdot\mathsf{M2} + 1\cdot\mathsf{m2}\\ \mathsf{M6} = 4\cdot \mathsf{M2} + 1\cdot\mathsf{m2}\\ \mathsf{P8} = 5\cdot \mathsf{M2} + 2\cdot \mathsf{m2}.

Finally, here is a diagram showing pitch space, with arrows representing 3 choices of interval basis.


Figure 1: Lattice

The ideas in this post are implemented concretely in two software projects: the Haskell Music Suite (also available on Hackage), and in AbstractMusic (the latter being my own personal research project).

Posted in maths, music | 1 Comment

Blog relaunch

To misquote The Who, “Meet the new blog, same URL as the old blog”. I have a post-New Year’s resolution, which is to write more frequently, and more interestingly. One of the traditional, less-frequent, less-interesting posts is shown below, just for the sake of nostalgia.

The blog now has an obligatory self-referential punning title, so it’s easier to remember and refer to once I become rich and famous through blogging.

Merry 2015 everyone!

Posted in meta | Leave a comment

Piping from GNU Screen to GNU Emacs

I’m continually discovering nifty new features of GNU Screen, and I’ve taken advantage of one of them to fix one of my long-standing annoyances with what’s otherwise a nearly-perfect piece of software – the rather awkward scroll-back buffer.

Here’s a method for essentially using Emacs as the scroll-back buffer (though convenient paste behaviour will have to come later):

First, create a named pipe in your home directory:

mkfifo ~/.screen.fifo

Then place the following shell script somewhere in your $PATH, calling it e.g. read-screen-fifo.sh:


emacsclient -n -t -e '(shell-command "cat ~/.screen.fifo" "*screenfifo*")' &
sleep 0.5
screen -X eval "hardcopy -h $HOME/.screen.fifo"
screen -X screen -t hardcopy 0 emacsclient -t -e '(switch-to-buffer "*screenfifo*")' -e '(goto-char (point-max))'

Then place the following binding in your .screenrc:

bind E exec read-screen-fifo.sh

Now, pressing your screen prefix key (C-a by default, back-tick for me) then E will result in screen opening an emacsclient frame containing the entire contents of the initial window’s scroll-back buffer.

Note that commented-out sleep – if Emacs is a bit slow off the mark reading from the fifo, screen can get very angry and appear to hang (albeit recoverable by reattaching); so you may want to give it a split-second to catch up. Also, I discovered that screen is also very picky about what it lets you run inside an eval, so watch out (hence this all being done with shell scripts). This may be because screen will refuse to write to the fifo until there’s another program already reading from it – this is different from the behaviour of, say, cat.

Bonus screen trick I discovered in the course of this: put bind R source ~/.screenrc in your .screenrc to give yourself a way of reloading screen without the need for a restart (came in very handy when working out the above – I actually managed to completely crash Emacs three times, but Screen only tended to hang, in a recoverable way).

Posted in emacs | Leave a comment