I had to add a character limit to an XhtmlString property in Episerver on a publish event the other day, and knew I had to get rid of the markup to get an accurate count. I was pleased to find that the good old TextIndexer was still in there to clean the HTML for me.
Here is a short reminder on how to use it:
// Markup with encoding as a string instead of fragments.
string htmlText = myPage.HtmlTextProperty.ToString();

// Encoded text with markup removed.
string plainText = EPiServer.Core.Html.TextIndexer
    .StripHtml(htmlText, maxTextLengthToReturn: htmlText.Length);

// Decoded and readable text.
string decodedText = System.Web.HttpUtility
    .HtmlDecode(plainText);
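For the character limit itself, the decoding step matters: an entity such as `&amp;` is five characters encoded but only one when decoded. Below is a minimal sketch of the check, not the exact code I used. `CharacterLimit` and `IsWithinLimit` are hypothetical names, the input is assumed to be markup-free already (for example the output of `StripHtml`), and it uses `System.Net.WebUtility` rather than `System.Web.HttpUtility` so that it compiles without a reference to System.Web.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of Episerver.
public static class CharacterLimit
{
    // plainText is assumed to have had its markup removed already,
    // e.g. by TextIndexer.StripHtml; it may still contain HTML entities.
    public static bool IsWithinLimit(string plainText, int maxCharacters)
    {
        // Decode entities first so "&amp;" counts as one character, not five.
        string decoded = WebUtility.HtmlDecode(plainText ?? string.Empty);

        // String.Length counts UTF-16 code units, which is close enough
        // for a simple editorial limit.
        return decoded.Length <= maxCharacters;
    }
}
```

With this, `CharacterLimit.IsWithinLimit("Fish &amp; chips", 12)` is true, because the decoded text "Fish & chips" is 12 characters long, while the same call with a limit of 11 is false.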