Solving a quartic equation

This is a method of solving the general quartic equation that my father showed me. The result is a very nice, compact and symmetric expression for the roots.


Solve the quartic equation

$$aX^4+bX^3+cX^2+dX+e=0$$

for $X$.


First divide through by $a$ and then substitute $X=x-\frac{b}{4a}$ in order to eliminate the cubic term. This leaves us with

$$x^4+px^2+qx+r=0.\tag{depressed-quartic}$$
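As a concrete check of this step, here is a minimal Python sketch that computes the depressed coefficients. The function name `depress_quartic` and the constant-term name `e` are my own labels, and the formulas for `p` and `r` are what I get by expanding the substitution by hand; only the expression for `q` appears later in this post.

```python
def depress_quartic(a, b, c, d, e):
    """Return (p, q, r, shift) so that X = x - shift turns
    a*X^4 + b*X^3 + c*X^2 + d*X + e = 0 into x^4 + p*x^2 + q*x + r = 0."""
    B, C, D, E = b / a, c / a, d / a, e / a   # divide through by a
    shift = B / 4                             # X = x - b/(4a)
    p = C - 3 * B**2 / 8
    q = D - B * C / 2 + B**3 / 8
    r = E - B * D / 4 + B**2 * C / 16 - 3 * B**4 / 256
    return p, q, r, shift


# Illustrative input: (X-1)(X-2)(X-3)(X-4) = X^4 - 10X^3 + 35X^2 - 50X + 24
p, q, r, shift = depress_quartic(1, -10, 35, -50, 24)
print(p, q, r, shift)
```

For this input the shifted roots are $\pm\frac{1}{2}$ and $\pm\frac{3}{2}$, so the depressed quartic should come out as $(x^2-\frac{1}{4})(x^2-\frac{9}{4})$ with no linear term.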

Now, introduce a new variable $y=x^2+p/2$. Then our quartic is equivalent to the pair of simultaneous equations:

$$\begin{aligned} y^2+qx+\left(r-\frac{p^2}{4}\right)&=0\\ x^2-y+\frac{p}{2}&=0. \end{aligned}\tag{parabolas}$$
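To see why the pair is equivalent to the quartic, it may help to write out the one-line check, which uses nothing beyond the definition of $y$:

$$y^2=\left(x^2+\frac{p}{2}\right)^2=x^4+px^2+\frac{p^2}{4}\quad\Longrightarrow\quad x^4+px^2+qx+r=y^2+qx+\left(r-\frac{p^2}{4}\right).$$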

These two equations each describe parabolas in the xy-plane, one oriented vertically and one oriented horizontally. The intersections (there can be up to four) correspond to roots of our quartic (at least, the x-coordinates of those intersections do). Finding those intersections is still very hard.

There is a great trick to move forward: we consider the problem in a more general light. Define

$$f_m(x,y)=\left(y^2+qx+\left(r-\frac{p^2}{4}\right)\right)+m\left(x^2-y+\frac{p}{2}\right).$$

If we consider a solution $(x,y)$ of our simultaneous equations, then $f_m(x,y)=0$ for every $m$ (of course, $f_m$ will generally have other zeroes in addition to those). In other words, every curve in the family $F=\left\{(x,y)\in\mathbb{R}^2 \mid f_m(x,y)=0\right\}_{m\in\mathbb{R}}$ will contain our points of interest.

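Now, the equation $f_m(x,y)=0$ describes a conic section in the $xy$-plane. Completing the square (for $m\neq 0$), we find

$$\left(y-\frac{m}{2}\right)^2+m\left(x+\frac{q}{2m}\right)^2=\left(\frac{q^2}{4m}+\frac{1}{4}(p-m)^2-r\right).$$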
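For anyone who wants the intermediate step, here is the expansion I get when completing both squares, assuming $m\neq 0$:

$$\begin{aligned} f_m(x,y)&=y^2-my+mx^2+qx+\left(r-\frac{p^2}{4}+\frac{mp}{2}\right)\\ &=\left(y-\frac{m}{2}\right)^2-\frac{m^2}{4}+m\left(x+\frac{q}{2m}\right)^2-\frac{q^2}{4m}+r-\frac{p^2}{4}+\frac{mp}{2}. \end{aligned}$$

Moving the constants to the other side and using $\frac{m^2}{4}-\frac{mp}{2}+\frac{p^2}{4}=\frac{1}{4}(p-m)^2$ gives the completed-square form.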

This conic changes its character as m is varied, from ellipse to hyperbola, etc. However, every conic in this family F contains the points where the parabolas \eqref{parabolas} intersect.

To find these points common to the family of conics, we use another great trick. There are a handful of particularly simple conic sections that can be picked out of the family $F$: when the right-hand side vanishes, the conic sections are simply pairs of lines! Multiplying the right-hand side by $4m$, the condition for this is

$$m^3-2pm^2+(p^2-4r)m+q^2=0,\tag{cubic}$$

and the resulting conics are

$$\left(y-\frac{m}{2}\right)^2=-m\left(x+\frac{q}{2m}\right)^2$$

or

$$y=\frac{m}{2}\pm\sqrt{-m}\left(x+\frac{q}{2m}\right).$$

We will call the three roots of \eqref{cubic} $m_1$, $m_2$ and $m_3$. For each of those values of $m$, the conic is a pair of intersecting lines which, again, always contain the points we care about.
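A small numerical sanity check may help here. The sketch below builds the cubic's coefficients and confirms, for one example of my own choosing, that its roots are exactly the values of $m$ at which the right-hand side of the completed-square form vanishes. The helper names and the example quartic $x^4-25x^2+60x-36$ (roots $1, 2, 3, -6$) are assumptions for illustration only.

```python
def resolvent_cubic(p, q, r):
    """Coefficients, highest power first, of m^3 - 2p m^2 + (p^2 - 4r) m + q^2."""
    return (1, -2 * p, p * p - 4 * r, q * q)


def conic_rhs(m, p, q, r):
    """Right-hand side q^2/(4m) + (p - m)^2/4 - r of the completed-square form."""
    return q * q / (4 * m) + (p - m) ** 2 / 4 - r


def horner(coeffs, m):
    """Evaluate a polynomial given by its coefficients (highest power first)."""
    value = 0
    for coeff in coeffs:
        value = value * m + coeff
    return value


# Hypothetical example: x^4 - 25x^2 + 60x - 36 has roots 1, 2, 3, -6.
p, q, r = -25, 60, -36
coeffs = resolvent_cubic(p, q, r)
for m in (-9, -16, -25):
    # Both numbers should vanish for each of the three special values of m.
    print(m, horner(coeffs, m), conic_rhs(m, p, q, r))
```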

Out of those three special values of $m$, we will take $m_1$ and $m_2$ and consider the intersections of those conics. In a normal, non-degenerate case, we expect four intersections: line 1 of the $m_1$ conic will hit each of the two lines of the $m_2$ conic; line 2 of the $m_1$ conic will do the same. Of course, working with lines, it is easy to find the intersections.
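To make the four intersections concrete, here is a rough Python sketch that intersects the lines directly, without any closed-form simplification. The function names are mine, and the numbers $m_1=-9$, $m_2=-16$, $q=60$ come from my own example quartic $x^4-25x^2+60x-36$, whose roots are $1, 2, 3, -6$. It assumes $m_1$ and $m_2$ are nonzero and that no two of the chosen lines are parallel.

```python
import cmath


def line(m, q, sign):
    """Slope and intercept of y = m/2 + sign*sqrt(-m)*(x + q/(2m))."""
    slope = sign * cmath.sqrt(-m)
    intercept = m / 2 + slope * q / (2 * m)
    return slope, intercept


def intersections(m1, m2, q):
    """x-coordinates where each line of the m1 pair meets each line of the m2 pair."""
    xs = {}
    for sign1 in (+1, -1):
        for sign2 in (+1, -1):
            k1, c1 = line(m1, q, sign1)
            k2, c2 = line(m2, q, sign2)
            # Solve k1*x + c1 = k2*x + c2 for x.
            xs[(sign1, sign2)] = (c2 - c1) / (k1 - k2)
    return xs


for branches, x in intersections(-9, -16, 60).items():
    print(branches, x)
```

The four values printed should be the four roots of the example quartic, one per choice of branches.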

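For example, considering the intersection of the $+$ branch for $m_1$ with the $+$ branch of $m_2$, we write

$$\frac{m_1}{2}+\sqrt{-m_1}\left(x_{++}+\frac{q}{2m_1}\right)=\frac{m_2}{2}+\sqrt{-m_2}\left(x_{++}+\frac{q}{2m_2}\right).$$

Rearranging, and using $\sqrt{-m}/m=-1/\sqrt{-m}$,

$$x_{++}\left(\sqrt{-m_1}-\sqrt{-m_2}\right)=-\frac{1}{2}\left(m_1-m_2\right)-\frac{q}{2}\,\frac{\sqrt{-m_1}-\sqrt{-m_2}}{\sqrt{-m_1}\sqrt{-m_2}}.$$

This simplifies if we expand $m_1-m_2$ as a difference of squares, $m_1-m_2=-\left(\sqrt{-m_1}-\sqrt{-m_2}\right)\left(\sqrt{-m_1}+\sqrt{-m_2}\right)$. Then we get

$$x_{++}=\frac{1}{2}\left(\sqrt{-m_1}+\sqrt{-m_2}-\frac{q}{\sqrt{-m_1}\sqrt{-m_2}}\right).$$

The other three solutions, $x_{+-}$, $x_{-+}$ and $x_{--}$, all look similar.

Another simplification comes if we recall that

$$(m-m_1)(m-m_2)(m-m_3)=m^3-2pm^2+(p^2-4r)m+q^2,$$

so that $q^2=-m_1m_2m_3$. If we assume that $q>0$, then $q=\sqrt{-m_1m_2m_3}=\sqrt{-m_1}\sqrt{-m_2}\sqrt{-m_3}$ (taking the $m_i$ to be real and negative), and we find a very nice form for the four solutions:

$$\begin{gathered} x_{++}=\frac{1}{2}\left(+\sqrt{-m_1}+\sqrt{-m_2}-\sqrt{-m_3}\right),\\ x_{+-}=\frac{1}{2}\left(+\sqrt{-m_1}-\sqrt{-m_2}+\sqrt{-m_3}\right),\\ x_{-+}=\frac{1}{2}\left(-\sqrt{-m_1}+\sqrt{-m_2}+\sqrt{-m_3}\right),\\ x_{--}=\frac{1}{2}\left(-\sqrt{-m_1}-\sqrt{-m_2}-\sqrt{-m_3}\right). \end{gathered}$$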

Thus the roots of the depressed quartic \eqref{depressed-quartic} are expressed as symmetric combinations of the roots of the cubic \eqref{cubic}. They are symmetric in the sense that the arbitrary numbering “$m_1$”, etc. is irrelevant: the root with all three signs negative is independent of any numbering scheme, while the other three solutions each simply single out one of the three $m$ values to have a negative sign.

We can verify that these solutions are correct by checking that

$$(x-x_{++})(x-x_{+-})(x-x_{-+})(x-x_{--})=x^4+px^2+qx+r.$$

Expanding, and making reference to a similar expansion of \eqref{cubic}, we see that

$$\begin{aligned} \sum_i x_i&=0,\\ \sum_{i<j} x_i x_j&=\frac{1}{2}(m_1+m_2+m_3)=p,\\ \sum_{i<j<k} x_i x_j x_k&=-\sqrt{-m_1 m_2 m_3}=-\sqrt{q^2}=-q,\\ \prod_i x_i&=\frac{1}{16}\left(\left(m_1+m_2+m_3\right)^2-4(m_1 m_2+m_1 m_3+m_2 m_3)\right)=r. \end{aligned}$$
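These four identities are easy to spot-check numerically. The sketch below does so for one set of values I picked myself ($m_1,m_2,m_3=-9,-16,-25$, all real and negative, so the $q>0$ formulas apply); `elementary_symmetric` is a helper name of my own, not something from the derivation.

```python
import math
from itertools import combinations


def elementary_symmetric(values):
    """Return [e1, e2, ..., en]: sums of products over all k-element subsets."""
    return [sum(math.prod(combo) for combo in combinations(values, k))
            for k in range(1, len(values) + 1)]


# Assumed example values for the roots of the cubic.
m1, m2, m3 = -9.0, -16.0, -25.0
s1, s2, s3 = (math.sqrt(-m) for m in (m1, m2, m3))
xs = [(-s1 - s2 - s3) / 2, (-s1 + s2 + s3) / 2,
      (s1 - s2 + s3) / 2, (s1 + s2 - s3) / 2]

e1, e2, e3, e4 = elementary_symmetric(xs)
p = (m1 + m2 + m3) / 2
q = math.sqrt(-m1 * m2 * m3)
r = ((m1 + m2 + m3) ** 2 - 4 * (m1 * m2 + m1 * m3 + m2 * m3)) / 16

# Each difference should be zero up to rounding.
print(e1, e2 - p, e3 + q, e4 - r)
```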

Therefore the roots of the quartic may be expressed as

$$\begin{gathered} X_1=-\frac{b}{4a}+\frac{1}{2}\left(-\sqrt{-m_1}-\sqrt{-m_2}-\sqrt{-m_3}\right),\\ X_2=-\frac{b}{4a}+\frac{1}{2}\left(-\sqrt{-m_1}+\sqrt{-m_2}+\sqrt{-m_3}\right),\\ X_3=-\frac{b}{4a}+\frac{1}{2}\left(+\sqrt{-m_1}-\sqrt{-m_2}+\sqrt{-m_3}\right),\\ X_4=-\frac{b}{4a}+\frac{1}{2}\left(+\sqrt{-m_1}+\sqrt{-m_2}-\sqrt{-m_3}\right), \end{gathered}$$

with $m_1,m_2,m_3$ the roots of the cubic \eqref{cubic}.
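Putting the pieces together, here is a sketch of the whole procedure in Python. Several things in it are my own choices rather than part of the derivation: the cubic is solved with Cardano's formula in complex arithmetic, the names `cubic_roots` and `solve_quartic` are made up, the constant term is called `e`, and, instead of assuming a sign for $q$, the third radical is fixed by requiring $\sqrt{-m_1}\sqrt{-m_2}\sqrt{-m_3}=q$.

```python
import cmath


def cubic_roots(c2, c1, c0):
    """Roots of m^3 + c2*m^2 + c1*m + c0 = 0 via Cardano (complex arithmetic)."""
    P = c1 - c2 * c2 / 3
    Q = 2 * c2**3 / 27 - c2 * c1 / 3 + c0
    disc = cmath.sqrt(Q * Q / 4 + P**3 / 27)
    u3 = -Q / 2 + disc
    if abs(u3) < abs(-Q / 2 - disc):
        u3 = -Q / 2 - disc                 # larger magnitude avoids cancellation
    u = u3 ** (1 / 3) if u3 != 0 else 0
    omega = complex(-0.5, 3**0.5 / 2)      # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * omega**k
        t = uk - P / (3 * uk) if uk != 0 else 0
        roots.append(t - c2 / 3)
    return roots


def solve_quartic(a, b, c, d, e):
    """Return the four (complex) roots of a*X^4 + b*X^3 + c*X^2 + d*X + e = 0."""
    B, C, D, E = b / a, c / a, d / a, e / a
    p = C - 3 * B**2 / 8
    q = D - B * C / 2 + B**3 / 8
    r = E - B * D / 4 + B**2 * C / 16 - 3 * B**4 / 256
    # Roots of the cubic, largest magnitude first so any zero root ends up last.
    m = sorted(cubic_roots(-2 * p, p * p - 4 * r, q * q), key=abs, reverse=True)
    s1, s2 = cmath.sqrt(-m[0]), cmath.sqrt(-m[1])
    # Assumption: pick the third radical so that s1*s2*s3 = q.
    s3 = q / (s1 * s2) if s1 * s2 != 0 else cmath.sqrt(-m[2])
    shift = -B / 4
    return [shift + (-s1 - s2 - s3) / 2,
            shift + (-s1 + s2 + s3) / 2,
            shift + (s1 - s2 + s3) / 2,
            shift + (s1 + s2 - s3) / 2]


print(solve_quartic(1, -10, 35, -50, 24))   # (X-1)(X-2)(X-3)(X-4)
```

Fixing the third radical through the product is only one way to handle the sign of $q$, and I have only spot-checked it on a few inputs.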

What if $q<0$?

If we back up a bit, we see that it was important, in the final expressions, that $q>0$. However, the actual expression for $q$, coming from \eqref{depressed-quartic}, is

$$q=\frac{d}{a}+\frac{b^3}{8a^3}-\frac{bc}{2a^2}.$$

This could very easily be negative. Therefore we must consider the case separately.

We now have $q=-\sqrt{-m_1m_2m_3}$ and the solutions become

$$\begin{gathered} x'_{++}=\frac{1}{2}\left(+\sqrt{-m_1}+\sqrt{-m_2}+\sqrt{-m_3}\right),\\ x'_{+-}=\frac{1}{2}\left(+\sqrt{-m_1}-\sqrt{-m_2}-\sqrt{-m_3}\right),\\ x'_{-+}=\frac{1}{2}\left(-\sqrt{-m_1}+\sqrt{-m_2}-\sqrt{-m_3}\right),\\ x'_{--}=\frac{1}{2}\left(-\sqrt{-m_1}-\sqrt{-m_2}+\sqrt{-m_3}\right). \end{gathered}$$

Each of these is the negative of one of the roots from the case where $q>0$; for example, $x'_{++}$ is the negative of the root with all three signs negative. However, they are, indeed, different values from before.

In verifying the expansion for \eqref{depressed-quartic}, we find that the condition for the coefficient of $x$ changes slightly, to $\sum_{i<j<k}x'_ix'_jx'_k=+\sqrt{-m_1m_2m_3}=-q$, so that the $q<0$ solutions check out.
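The two cases can be sketched side by side. The function below is a hypothetical helper of mine that assumes all three $m_i$ are real and negative, builds the $q>0$ roots, and negates them when $q<0$. The test values again come from my own example, $x^4-25x^2\pm 60x-36$.

```python
import math


def quartic_roots_from_m(m1, m2, m3, q):
    """Roots of the depressed quartic, assuming m1, m2, m3 are real and negative."""
    s1, s2, s3 = (math.sqrt(-m) for m in (m1, m2, m3))
    base = [(-s1 - s2 - s3) / 2, (-s1 + s2 + s3) / 2,
            (s1 - s2 + s3) / 2, (s1 + s2 - s3) / 2]
    # For q < 0 every root is the negative of the corresponding q > 0 root.
    sign = 1 if q >= 0 else -1
    return [sign * x for x in base]


def depressed(x, p, q, r):
    """Evaluate x^4 + p*x^2 + q*x + r."""
    return x**4 + p * x**2 + q * x + r


for q in (60, -60):
    roots = quartic_roots_from_m(-9, -16, -25, q)
    # Second list: residuals, which should all be zero.
    print(q, roots, [depressed(x, -25, q, -36) for x in roots])
```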

TODO: numerically evaluating these formulas in Mathematica indicates there are issues. For some parameters the solution is right, but not as often as I'd hope. The issues are probably due to being too loose with radicals of negative quantities: it's standard for some of the $m$'s to be positive, and $\sqrt{-m_1}\sqrt{-m_2}$ is not generally $\sqrt{m_1m_2}$.
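A two-line experiment shows the kind of discrepancy I suspect. It uses Python's `cmath` principal square root, and the values $4$ and $9$ are arbitrary positive stand-ins for $m_1$ and $m_2$.

```python
import cmath

# Positive m's: the product of radicals picks up a sign.
m1, m2 = 4.0, 9.0
product_of_roots = cmath.sqrt(-m1) * cmath.sqrt(-m2)   # (2j) * (3j)
root_of_product = cmath.sqrt(m1 * m2)
print(product_of_roots, root_of_product)

# Negative m's: the two expressions agree.
n1, n2 = -4.0, -9.0
print(cmath.sqrt(-n1) * cmath.sqrt(-n2), cmath.sqrt(n1 * n2))
```

In the first case the two expressions differ by a sign, which is exactly the kind of slip that would flip some of the $\pm$ choices in the final formulas.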