Check HTML markup in ResX files

So, we had almost a thousand resource files translated for the public part of our project.

Unfortunately, with such a huge amount of data there is a big chance of human error. We broke a few important pages on our site with invalid HTML markup from resx files.

So here is a nice and simple way to check whether there are any errors in the HTML of resource files:

[System.Reflection.Assembly]::LoadFrom((Join-Path $PSScriptRoot -ChildPath 'HtmlAgilityPack.dll')) | Out-Null
$doc = New-Object HtmlAgilityPack.HtmlDocument


$files = Get-ChildItem -Path C:\Rabota.UA\trunk\Version\Rabota2.WebUI -File -Include *.resx -Recurse -ErrorAction SilentlyContinue

$errors = @()
foreach($file in $files) {
    Write-Verbose $file.FullName

    $items = ([xml]$xml = Get-Content $file.FullName -Encoding UTF8).root.data

    foreach($item in $items) {
        $doc.LoadHtml($item.value)
        if($doc.ParseErrors.Count -gt 0) {
            Write-Host $file.FullName -ForegroundColor Yellow -NoNewline
            Write-Host (' ' + $item.name) -ForegroundColor Cyan

            $doc.ParseErrors | ft -AutoSize

            $errors += $doc.ParseErrors
        }
    }

    Write-Progress -Activity 'Checking HTML' -Status $file.FullName -PercentComplete ( [Array]::IndexOf($files, $file) / $files.Count * 100 )
}

if($errors.Count -gt 0) {
    Write-Host ('Found ' + $errors.Count + ' errors') -ForegroundColor Red
} else {
    Write-Host 'All seems to be OK' -ForegroundColor Green
}

and its output:

C:\Rabota.UA\trunk\Version\Rabota2.WebUI\App_GlobalResources\cvbuilder.en.resx FinanceSkillsRightExample

        Code Line LinePosition Reason                      SourceText StreamPosition
        ---- ---- ------------ ------                      ---------- --------------
TagNotClosed    3            1 End tag </ul> was not found                        34
TagNotClosed    7           57 End tag </ul> was not found                       266


C:\Rabota.UA\trunk\Version\Rabota2.WebUI\App_GlobalResources\cvbuilder.en.resx ITSkillsRightExample

        Code Line LinePosition Reason                      SourceText StreamPosition
        ---- ---- ------------ ------                      ---------- --------------
TagNotClosed    3            1 End tag </ul> was not found                        34
TagNotClosed    7           57 End tag </ul> was not found                       371

...

C:\Rabota.UA\trunk\Version\Rabota2.WebUI\Controls\CvBuilder\App_LocalResources\StepThree.ascx.resx Incomplete2

                Code Line LinePosition Reason                      SourceText StreamPosition
                ---- ---- ------------ ------                      ---------- --------------
EndTagNotRequired    1           41 End tag </> is not required <                      40


C:\Rabota.UA\trunk\Version\Rabota2.WebUI\Controls\CvBuilder\App_LocalResources\StepThree.ascx.uk.resx Incomplete2

                Code Line LinePosition Reason                      SourceText StreamPosition
                ---- ---- ------------ ------                      ---------- --------------
EndTagNotRequired    1           37 End tag </> is not required <                      36


Found 52 errors

The reason I am so excited about this stuff: think how such a script could be used to detect broken HTML in ASP.NET WebForms user controls!

Note: for this to work you will need HtmlAgilityPack.dll. Also, do not forget to change the path to the root folder of your project.
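
The script above treats each .resx file as plain XML and checks every root/data/value entry. Here is a minimal sketch of just that step, using a hypothetical in-memory sample instead of a real file (the entry names Greeting and BrokenList are made up for illustration):

```powershell
# Hypothetical, trimmed-down .resx content; real files also carry
# a schema and resheader elements, which the script simply ignores.
$resx = @'
<root>
  <data name="Greeting" xml:space="preserve">
    <value>&lt;b&gt;Hello&lt;/b&gt;</value>
  </data>
  <data name="BrokenList" xml:space="preserve">
    <value>&lt;ul&gt;&lt;li&gt;One&lt;/li&gt;</value>
  </data>
</root>
'@

# Same access pattern as in the script: cast to [xml], then walk root.data
$entries = ([xml]$resx).root.data

foreach ($entry in $entries) {
    # .value is already XML-decoded, so it holds the raw HTML
    # that would be handed to $doc.LoadHtml()
    '{0}: {1}' -f $entry.name, $entry.value
}
```

The HTML in a .resx is stored XML-escaped, and the [xml] cast unescapes it, so the parser sees ordinary markup.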

Check broken HTML in WebForms User Controls

[System.Reflection.Assembly]::LoadFrom((Join-Path $PSScriptRoot -ChildPath 'HtmlAgilityPack.dll')) | Out-Null
$doc = New-Object HtmlAgilityPack.HtmlDocument


$files = Get-ChildItem -Path C:\Rabota.UA\trunk\Version\Rabota2.WebUI -File -Include *.ascx, *.aspx, *.master -Recurse -ErrorAction SilentlyContinue

$errors = @()
foreach($file in $files) {
    Write-Verbose $file.FullName

    # -Raw reads the file as a single string, so line numbers in parse errors stay meaningful
    $doc.LoadHtml((Get-Content $file.FullName -Encoding UTF8 -Raw))

    if($doc.ParseErrors.Count -gt 0) {
        Write-Host $file.FullName -ForegroundColor Yellow

        $doc.ParseErrors | ft -AutoSize

        $errors += $doc.ParseErrors
    }

    Write-Progress -Activity 'Checking HTML' -Status $file.FullName -PercentComplete ( [Array]::IndexOf($files, $file) / $files.Count * 100 )
}

if($errors.Count -gt 0) {
    Write-Host ('Found ' + $errors.Count + ' errors') -ForegroundColor Red
} else {
    Write-Host 'All seems to be OK' -ForegroundColor Green
}

What you should be aware of: it is not a silver bullet, and it cannot find matching tags for things like Repeater > Header(Footer)Template, where a tag is opened in one template and closed in another.
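
To illustrate that limitation, here is a small sketch with made-up markup: the ul opens in HeaderTemplate and closes in FooterTemplate, which is fine for WebForms but will most likely be reported by the parser. The control ID, the Eval field name and the DLL location (current directory) are assumptions for this example only.

```powershell
# Illustrative markup only: <ul> opens in one template and closes in another.
$markup = @'
<asp:Repeater ID="Items" runat="server">
    <HeaderTemplate><ul></HeaderTemplate>
    <ItemTemplate><li><%# Eval("Name") %></li></ItemTemplate>
    <FooterTemplate></ul></FooterTemplate>
</asp:Repeater>
'@

# Assumption: HtmlAgilityPack.dll sits in the current directory.
$dll = Join-Path (Get-Location).Path 'HtmlAgilityPack.dll'

if (Test-Path $dll) {
    [System.Reflection.Assembly]::LoadFrom($dll) | Out-Null
    $doc = New-Object HtmlAgilityPack.HtmlDocument
    $doc.LoadHtml($markup)

    # Anything listed here comes from the template split, not from a real mistake
    $doc.ParseErrors | Format-Table -AutoSize
} else {
    Write-Warning "HtmlAgilityPack.dll not found at $dll"
}
```

Errors reported for files with such templates need a manual look before you treat them as real problems.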