Orphaned translation scanner for EPiServer language files

My current project has undergone some rather large changes over the last couple of months: core functionality has been moved out of the solution, other functionality has had to be cleaned away, and so forth. I suspected that this had left us with a rather large number of stray translations in our EPiServer language files, so I took some time and wrote a small console tool that scans our source code to identify them.

Small console tool for finding unused translations in EPiServer language files

The code for this is fairly straightforward. It uses a NuGet package called NDesk.Options to help with handling command line arguments, and basic string searches throughout the website’s source code. In short, the language files in the specified language directory are scanned to construct paths to each translation leaf node. Then each relevant file in the source code is searched for occurrences of these paths. Findings are presented in a list along with language file and translated content.
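To make the two steps concrete, here is a minimal sketch in Python rather than the tool's own C#. It assumes the usual language file layout (a languages root, one language element per language, nested elements below it). The sample XML and the names leaf_paths and find_orphans are made up for illustration and are not taken from the tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the layout is assumed, not copied from a real project.
SAMPLE_LANG_XML = """<languages>
  <language name="English" id="en">
    <forum>
      <thread>
        <unsticky>Unsticky</unsticky>
      </thread>
    </forum>
    <news>
      <publishdate>Published</publishdate>
    </news>
  </language>
</languages>
"""


def leaf_paths(xml_text):
    """Map each leaf path, e.g. /forum/thread/unsticky, to its translated text."""
    root = ET.fromstring(xml_text)
    found = {}

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        children = list(element)
        if children:
            # Not a leaf yet: keep descending.
            for child in children:
                walk(child, path)
        else:
            found[path] = (element.text or "").strip()

    for language in root.findall("language"):
        for child in language:
            walk(child, "")
    return found


def find_orphans(paths, source_text):
    """Return the paths that never occur as a plain substring of the source."""
    return sorted(path for path in paths if path not in source_text)
```

Feeding find_orphans the leaf paths of the sample and a source text that only mentions /forum/thread/unsticky reports /news/publishdate as an orphan.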

There are limitations, though. For instance, if you build your language paths dynamically in code, those translations will show up as orphans, because the searches use the whole path. So if you want to look for anything that is not of the form /forum/thread/unsticky, you will have to modify the application to suit your needs. Also, you probably will not want to scan for EPiServer’s own translations inside your source code.

StickyButton.Text = Translate("/forum/thread/unsticky");
<a title="<%= EPiServer.Core.LanguageManager.Instance.Translate("/navigation/skipnavigation") %>">

<dt><EPiServer:Translate Text="/news/publishdate" runat="server" /></dt>
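The usages above are found because the full path appears literally in the source. The following Python sketch (not the tool's C# code) shows why a concatenated path is reported as an orphan. The function is_referenced and the dynamic_usage line are hypothetical, written only to illustrate the limitation.

```python
def is_referenced(path, source_text, ignore_case=False):
    """Plain substring search for a whole translation path."""
    if ignore_case:
        return path.lower() in source_text.lower()
    return path in source_text


# The path is written out in full, so a whole-path search finds it.
literal_usage = 'StickyButton.Text = Translate("/forum/thread/unsticky");'

# Hypothetical code that builds the same path at run time.
dynamic_usage = 'StickyButton.Text = Translate("/forum/thread/" + action);'

found_literal = is_referenced("/forum/thread/unsticky", literal_usage)
found_dynamic = is_referenced("/forum/thread/unsticky", dynamic_usage)
```

Here found_literal is True and found_dynamic is False, even though both lines may resolve to the same translation when the site runs.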

The options available on the command line are as follows.

  • rp|rootPath – The path to the directory containing the source code for your site.
  • lp|langPath – The path to the language directory.
  • ee|excludeExtensions – Comma-separated list of file extensions to skip while scanning; the defaults cover binary files and the like.
  • ic|ignoreCase – Makes the scans case-insensitive.
  • dt|distinctTranslations – A single translation path probably occurs once in each language file. With this option each path is scanned for only once, so you will not get all the filenames.
  • h|?|help – Displays the help.
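As a rough illustration of how an extension exclusion list might be applied while walking the root path, here is a Python sketch. It is not the tool's implementation: the default list below is an assumption (the post only says the defaults are binary files), and parse_extensions and files_to_scan are hypothetical names.

```python
import os

# Assumed defaults; the tool's real default list is not given in the post.
DEFAULT_EXCLUDED = {".dll", ".exe", ".pdb", ".png", ".jpg", ".gif"}


def parse_extensions(value):
    """Turn a comma-separated string such as 'dll, .EXE' into {'.dll', '.exe'}."""
    result = set()
    for item in value.split(","):
        item = item.strip().lower()
        if item:
            result.add(item if item.startswith(".") else "." + item)
    return result


def files_to_scan(root_path, excluded=DEFAULT_EXCLUDED):
    """Yield every file under root_path whose extension is not excluded."""
    for directory, _subdirs, filenames in os.walk(root_path):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension not in excluded:
                yield os.path.join(directory, name)
```

Normalising the extensions to lower case with a leading dot means that "DLL", "dll" and ".dll" are all treated the same way.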

Sure, things could be made more sophisticated, for instance in the path department, but as this was initially meant as a quick and dirty scanning tool just to help clean up the project, it was good enough.

Usage examples:

.\LanguageScanner.exe --help

.\LanguageScanner.exe --rp=C:\path\to\wwwroot --lp=C:\path\to\wwwroot\lang --ignoreCase=True

Code available on GitHub.